August 9th in Articles, HTML5

Introduction to HTML5


Over the last couple of weeks I have been presenting talks at my day job covering different aspects of HTML5. The information proved useful to my colleagues, so I decided to put the talks on my blog to share them with everyone and, hopefully, help get you started and excited about the future of HTML.

In this article I will talk about where HTML came from, how it has evolved and how we got to where we are now. I will also look at some of the coolest new additions that HTML5 brings, as well as some of the changes it makes to HTML 4.

A Brief History of HTML

  • 1991 : HTML Tags – Tim Berners-Lee
  • 1995 : HTML 2.0 (No HTML 1.0)
  • 1997 : HTML 4.0
  • 1999 : HTML 4.01 (Strict, Transitional, Frameset)

In 1991 Tim Berners-Lee created the first version of HTML, referred to as HTML Tags. This was a small collection of tags mainly focused on marking up data. In 1995 version 2.0 of HTML was released, which also marked the introduction of the img tag. After this things moved pretty quickly over at the W3C, and in 1999 HTML 4.01 was released. This was the most complete version of HTML, with Strict, Transitional and Frameset modes and a whole slew of new elements.

HTML 4.01 is also the current version of HTML in active use today. After the release of HTML 4.01, the W3C decided to change its focus, feeling that the future of document authoring lay in XML. This mindset led to the abandonment of HTML and the first release of XHTML in 2000. XHTML 1.0 added nothing new to what is available in HTML 4.01 and is merely a reformulation of HTML in XML. With that, the syntax got much stricter.

So the following is completely fine in HTML 4.01:

<P><img src="someimage.jpg" WIDTH="100">

In XHTML, however, it has to be:

<p><img src="someimage.jpg" width="100" /></p>

Note that everything needs to be in lower case and all tags need to be closed. In general this is a very good practice and one I follow. However, there are other aspects of XML that brought a quick end to the W3C’s adventure into XHTML. In 2001 XHTML 1.1 was released and introduced one very important change, which I hinted at before. Up to this point people were still serving XHTML with a MIME type of text/html but, as of XHTML 1.1, you were expected to serve your pages with an XML MIME type such as application/xhtml+xml.

This triggers something very important in user agents: if your page has even one error, and remember this can be as small as an unclosed <img> tag, the user agent must stop parsing the document and present an error to the end user. While we all strive to write clean, perfectly formatted code, we do not always have control over all of the code that gets generated, especially when using a CMS, for example. There are also authors out there who do not adhere to these standards, and as such this move made authoring web pages near impossible.

With that, further development of XHTML was halted and thus XHTML 2.0 never happened. After this, things at the W3C went into a bit of a hiatus, that is until 2004. In 2004 Opera and Mozilla, with Apple joining later, formed the Web Hypertext Application Technology Working Group, or WHATWG for short, and approached the W3C with a proposal. They believed that there was life in HTML yet and that, with some enhancements to the current version of HTML and the integration of what was known as Web Forms, HTML had a real future.

At that time the W3C was not interested and felt there were other areas that required more of their focus. This did not stop the WHATWG, who continued working on what they termed HTML5 (note the lack of a space between HTML and 5). This was a combination of HTML 4.01 with extensions, as well as Web Applications 1.0 and Web Forms 2.0. The true turning point in the evolution came in 2006 with an article published by Tim Berners-Lee entitled “Reinventing HTML”, in which he acknowledged that the W3C had made a mistake and that there was indeed still life in HTML.

In 2008 work officially started at the W3C towards HTML 5. They did not start from scratch, however, and instead used the work already done by the WHATWG as the springboard for further development.

When Can I Use HTML 5?

This is the question everyone is asking, and it can be answered in two ways. First, with the immense and growing support for HTML 5 in modern browsers and the promise of Internet Explorer 9, you can start using much of HTML 5 today. Libraries and enablers such as Modernizr and HTML5 Shiv make it dead simple to create new and rich experiences for users with modern browsers, along with fallbacks for users whose browsers do not support HTML 5.
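
To make the idea of feature detection a little more concrete, here is a minimal sketch of the kind of check such libraries build on, applied to the new input types covered later in this article. It is a simplified illustration, not Modernizr's actual code. The names supportsInputType and makeFakeDocument are my own and purely hypothetical; the fake document only exists so the sketch can run outside a browser.

```javascript
// Sketch of input-type feature detection. `doc` is any object with a
// DOM-style createElement method (in a browser, pass `document`).
function supportsInputType(doc, type) {
  var input = doc.createElement("input");
  input.setAttribute("type", type);
  // A browser that does not recognise the type reports "text" instead.
  return input.type === type;
}

// Hypothetical stand-in for a browser that only knows the listed input
// types. It mimics just enough of the DOM for the check above.
function makeFakeDocument(knownTypes) {
  return {
    createElement: function () {
      var el = { type: "text" };
      el.setAttribute = function (name, value) {
        if (name === "type" && knownTypes.indexOf(value) !== -1) {
          el.type = value;
        }
      };
      return el;
    }
  };
}

var oldBrowser = makeFakeDocument(["text", "email"]);
console.log(supportsInputType(oldBrowser, "email")); // true
console.log(supportsInputType(oldBrowser, "date"));  // false
```

In a real page you would call supportsInputType(document, "date") and only load your JavaScript fallback when it returns false.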

If you want a more definitive answer, the expected timeline is as follows:

  • At the W3C it is currently in last call working draft. This means that the language is unlikely to change further and the focus is on bug fixes.
  • 2012: The spec will reach Candidate Recommendation. At this point the spec has been finalized and nailed down, and UA vendors are encouraged to ensure complete support of the spec.
  • 2022: The spec will reach Proposed Recommendation. This means that by then there will be two completely interoperable implementations of HTML 5, i.e. two user agents that implement the spec in its entirety.

Thankfully, as mentioned before, we do not have to wait until 2022 to start using HTML 5. We can start today.

HTML 5 Design Principle

With XHTML the design principles were not very friendly to end users and developers, with theoretical purity given a much higher priority than authors and users. HTML 5 brings a welcome and refreshing change to this. The HTML 5 design principle reads as follows:

Users over authors over implementers over specifiers over theoretical purity

This puts end users and developers in the unique position of being able to directly influence the evolution of HTML 5, where before we were at the back of the line.

The Syntax

HTML 5 brings some welcome additions and changes to the language. Let’s start by looking at the DOCTYPE declaration.

The DocType

HTML 4.01 Strict doctype:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">

XHTML 1.0 doctype:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

Definitely not easy to remember, and this is why we rely on IDEs to generate this for us when we create a new document. Now let’s look at the HTML 5 doctype:

<!DOCTYPE html>

Now that is a doctype I can remember. The doctype is included mainly for backwards compatibility: its presence triggers standards mode in browsers, while leaving it out drops the page into quirks mode. To read more about doctypes and render modes, have a read over the article ‘Activating Browser Modes with Doctype’ by Henri Sivonen.

Character Encoding

Specifying the character encoding for your HTML page is very important especially from an internationalization perspective. In HTML 4.01 and XHTML the following code was required:

For HTML 4.01

<META http-equiv="Content-Type" content="text/html; charset=utf-8">

For XHTML

<?xml version="1.0" encoding="utf-8"?>
<meta http-equiv="Content-type" content="text/html; charset=utf-8" />

Both lines above go into an XHTML document. Thankfully IDEs came to the rescue and generated this for us, or else I bet a lot of authors would have left these off. In HTML 5 the character encoding is set as follows:

<meta charset="utf-8">

The great thing about the character encoding syntax is that you can use it today, although I have heard reports that this does not work on Android based devices.

What’s Valid?

HTML has generally been a very forgiving language: authors could get away with a lot, and the user agent would do its best to figure out what you intended. XHTML brought with it the much stricter syntax of XML. So what is valid in HTML 5? Look at the lines of code below:

<li class="myclass" id="unique">Text</li>
<img src="../path/test.png" alt="Test" />

<li class="myclass" id="unique">Text
<img src="../path/test.png" alt="Test">

<LI CLASS="myclass" ID="unique">Text
<IMG SRC="../path/test.png" ALT="Test">

<li class=myclass id=unique>Text</li>
<img src=../path/test.png alt=Test />

All of the above are completely valid in HTML 5. Which one you choose is completely up to you. The language does not enforce a style, but when it comes to validating your documents, validators will, and should, suggest better coding practices. As stated, my preference definitely lies with the strictness of the XML based format, as demonstrated in the first example above.

Error Handling

The error handling I am referring to here is not error handling within your HTML document but how the user agent handles errors. Before HTML 5 it was up to each user agent to decide how to handle error scenarios, so different browsers handled the same error in different ways, which led to the incredible confusion that currently exists. HTML 5 tells user agents how they should handle errors, and this is a huge step forward. For some interesting reading, see how WebKit went about implementing its HTML 5 parsing algorithm.

What’s Been Removed And Changed?

Before we look at what has been added to HTML with HTML 5, let’s first look at what has been removed and changed. There are no surprises here, as you will see.

  • Presentational elements such as font, big, center etc.
  • Presentational attributes such as border, bgcolor etc.

As I said, no surprises here, as the above should be handled using CSS, something we have all learnt and are implementing already. Some of the other elements that have been removed are:

  • Frames are removed.
  • Acronym is removed; you can now simply use the abbreviation element (abbr).

As for what has changed, anchor elements (<a>) can now be wrapped around any element, including block level elements. So the following is completely valid:

<a href="url" title="Descriptive text">
    <div id="ad">
        <p>Some random content</p>
    </div>
</a>

The interesting thing about this is that you can use it already and it will work, even in browsers that do not support HTML 5. The other, more minor change is to b, i, hr and small. Nothing is really different here: these elements have simply been given new semantic meaning, and whether they will be part of the final spec remains to be seen.

What’s New?

Now this is where things get interesting. The first element we are going to look at is <mark>:

<mark>some text</mark>

One of the common use cases described for the mark tag is to highlight text on a page that contains keywords that the user has searched for. Currently one would do something as follows:

<span class="mark">some text</span>

Then in CSS we would add some style rules to the mark class to give the text wrapped in the span a yellow background, perhaps. With HTML 5 you can do exactly the same but, instead of using a span tag with a class, you simply wrap the text in the mark tag and then apply the CSS style rule to the mark element directly. The next element I am going to look at is the <time> tag.

The time tag allows us to present the user with a human readable date format while providing a more detailed, machine readable format that browsers can use to, for example, easily add appointments directly to your iCal or Outlook calendar. This is done as follows:

<time datetime="2009-09-02T09:30:00">September 2nd, 9:30am</time>

I particularly like this one.
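
If your dates come from a script rather than being typed by hand, the machine readable half can be generated for you. The following is a minimal sketch under my own assumptions: the helper names pad, toDatetimeAttribute and timeTag are hypothetical, the date is formatted in local time without a timezone suffix to match the example above, and the label is assumed to be plain text that is already safe to place in HTML.

```javascript
// Left-pads a number to two digits, e.g. 9 becomes "09".
function pad(n) {
  return (n < 10 ? "0" : "") + n;
}

// Builds a datetime attribute value such as "2009-09-02T09:30:00"
// from a Date object, using its local time components.
function toDatetimeAttribute(date) {
  return date.getFullYear() + "-" + pad(date.getMonth() + 1) + "-" + pad(date.getDate()) +
    "T" + pad(date.getHours()) + ":" + pad(date.getMinutes()) + ":" + pad(date.getSeconds());
}

// Wraps a human readable label in a time tag. The label is not escaped here.
function timeTag(date, label) {
  return '<time datetime="' + toDatetimeAttribute(date) + '">' + label + "</time>";
}

// Months are zero-based in JavaScript, so 8 means September.
var appointment = new Date(2009, 8, 2, 9, 30, 0);
console.log(timeTag(appointment, "September 2nd, 9:30am"));
// <time datetime="2009-09-02T09:30:00">September 2nd, 9:30am</time>
```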

Sectioning Content

This is where most of the new additions are, along with a new way of thinking about how you author your content. With that said, a lot of this will be very familiar to you, as the approach here has definitely been one of paving the cow paths. The new tags for sectioning content can be grouped into three as follows:

  • nav, article, section, aside
  • header, hgroup, footer
  • details, figure

As you can see from the above, most if not all of the new tags will be very familiar to you, and you have more than likely used these names as the class or id of an element. Well, now you can simply use the tag itself. Of course there is a caveat that comes with this: because a lot of browsers do not understand these tags yet, your layout will not behave as you might expect. To get a little closer to the expected behavior you can add this to the start of your CSS file:

nav, article, section, aside, header, hgroup, footer, details, figure
{
    display:block;
}

With the above, all of these elements will be displayed as block level elements by default. Combine this with HTML5 Shiv and you are good to go. I am not going to go into the details of these elements here, as one could write an entire article just on that, especially when you try to make sense of the section and article tags and whether both will make it into the final spec. For now, please read over the HTML 5 specification for more details on these new elements.
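
For the curious, the core trick behind a shiv is small. Older versions of Internet Explorer will not apply CSS to elements they do not know, unless the element name has first been passed to document.createElement. The sketch below shows that idea only and is not the actual HTML5 Shiv source; the names NEW_ELEMENTS and shivElements are hypothetical, and the document is passed in as a parameter so the function can be tried outside a browser.

```javascript
// The sectioning element names listed in this article.
var NEW_ELEMENTS = ["nav", "article", "section", "aside",
                    "header", "hgroup", "footer", "details", "figure"];

// Calls createElement once per name so that older browsers register the
// element and allow it to be styled. The created elements are discarded.
// Returns the number of names processed.
function shivElements(doc, names) {
  for (var i = 0; i < names.length; i++) {
    doc.createElement(names[i]);
  }
  return names.length;
}

// In a browser you would run this in the head, before any of the new
// elements are parsed: shivElements(document, NEW_ELEMENTS);
```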

Forms

This is an area where HTML has been greatly lacking, and JavaScript has stepped in to close the gap between web and desktop applications nicely. However, this is not ideal. With HTML 5 many of the JavaScript solutions coders have struggled with for so long are no longer needed. Let’s look at some of the new form enhancements in HTML 5.

Spinners

Spinners allow users to easily enter numerical data into your form and give you much greater control over what is accepted and what is not. The code for a spinner is as follows:

<input name="foo" type="number" />

This will give you a default spinner control that allows only numerical entries. Besides the default, there are three attributes you can use for greater control over input:

  • Min : The lowest number
  • Max : The highest number
  • Step : The amount the number is incremented by with each click

Example:

<input name="foo" id="number" type="number" min="5" max="72" step="3" />

View Demo (Currently only Opera 10.60 +)
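
For browsers without a native spinner, a fallback script has to apply the same rules itself. Here is a minimal sketch of that check, assuming integer values for min, max and step and assuming that steps are counted from min. The function name isValidNumber is hypothetical.

```javascript
// Returns true when `value` (the raw text of the field) is a number that
// lies between min and max and sits on a step boundary counted from min.
// Intended for integer min, max and step values only.
function isValidNumber(value, min, max, step) {
  if (String(value).trim() === "") return false;
  var n = Number(value);
  if (isNaN(n)) return false;
  if (n < min || n > max) return false;
  // With min=5 and step=3 the allowed values are 5, 8, 11, and so on.
  return (n - min) % step === 0;
}

console.log(isValidNumber("8", 5, 72, 3));  // true
console.log(isValidNumber("9", 5, 72, 3));  // false
console.log(isValidNumber("72", 5, 72, 3)); // false
```

Note the last line: with these attributes 72 itself is not on a step boundary, because 72 minus 5 is not a multiple of 3. The highest value this check accepts is 71.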

Placeholder

How many times have you implemented placeholder text in an input field that would clear out when the field receives focus and appear again once focus is lost, if the user did not fill in any data? This is another JavaScript snippet you can move to the fallback script library. In HTML 5 we can get the exact same behavior with the following code:

<input name="foo2" id="placeholder" type="text" placeholder="Please enter your name">

View Demo (Confirmed in Chrome 5+, Firefox 4+ and Safari 5+)

Email Address Validation

Oh, how many times have you looked for, or tried to write, the perfect regular expression for validating email addresses? Small wars have been fought over this one. But now this is delegated to the user agent, which handles it based on a known standard. To implement email validation do the following:

<input name="email" id="email_addr" type="email">

To test this and see how the browser behaves, open the demo page below, enter an invalid email address and click on the submit button. The behavior in Chrome and Safari is a little strange here, as the browser prevents you from submitting the form and moves focus to the email address input, but no error message is displayed to the user. For the complete behavior open the page in Opera 10.60+.

View Demo (Confirmed in Opera 10.60+)
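
Until every browser handles this natively, you will still want a fallback. The sketch below is a deliberately loose check of my own and not the algorithm user agents follow: it only confirms that there is exactly one @ sign with something other than whitespace on both sides. The name looksLikeEmail is hypothetical.

```javascript
// Loose sanity check: one "@" with at least one non-space,
// non-"@" character before and after it.
function looksLikeEmail(value) {
  return /^[^\s@]+@[^\s@]+$/.test(value);
}

console.log(looksLikeEmail("schalk@example.com")); // true
console.log(looksLikeEmail("schalk.example.com")); // false
```

Keeping the fallback loose is a design choice: it is better to let an unusual but real address through than to turn away a genuine user.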

Date Picker

This is another one that is implemented almost as a standard interface element, but few know of the cross browser nightmares the developers of this little widget have faced, and I do not even have to mention the accessibility aspect of this widget. With HTML 5 this is again moved to the user agent, which can take care of the accessibility of the element on an operating system level, which makes much more sense. Adding a date picker is as simple as this:

<input name="date" id="selected_date" type="date" />

This will provide you with a default date picker supplied by the user agent. As with many of the other HTML 5 form controls, this one also has two attributes you can use to have more control over the data entered. The two attributes are as follows:

  • min – The earliest date to allow
  • max – The furthest date in the future that is allowed

<input name="date" id="selected_date" type="date" min="2010-01-01" max="2012-01-01">

The behavior in Chrome and Safari here is the same as for email validation, with the only browser currently implementing the complete functionality being Opera 10.60+.

View Demo (Confirmed in Opera 10.60+)
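
A fallback for the min and max attributes can be surprisingly short, because zero-padded dates in the YYYY-MM-DD format sort as plain text in the same order as they do on the calendar. The following sketch relies on that. The name isDateInRange is hypothetical, and the function only checks the shape of the value, not whether the day actually exists in that month.

```javascript
// Returns true when `value` is in YYYY-MM-DD form and falls between
// min and max (inclusive). All three are strings in the same format.
function isDateInRange(value, min, max) {
  if (!/^\d{4}-\d{2}-\d{2}$/.test(value)) return false;
  // Plain string comparison works because the format is zero-padded
  // and ordered from the largest unit (year) to the smallest (day).
  return value >= min && value <= max;
}

console.log(isDateInRange("2010-06-15", "2010-01-01", "2012-01-01")); // true
console.log(isDateInRange("2009-12-31", "2010-01-01", "2012-01-01")); // false
```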

Range

This is another input type that has often been implemented using JavaScript but can now be done just with HTML 5. This is basically the very familiar slider control.

<input name="foo" type="range">

The slider control shares the same attributes as spinners:

  • Min : The lowest number
  • Max : The highest number
  • Step : The amount the number is incremented by with each click

Example:

<input name="_range" id="range" min="5" max="72" step="3" type="range">

With the above your slider will start at 5 and move upwards in steps of 3, up to a maximum of 72. The problem with this is that, while users can physically see and interact with the element, they have no way of telling what the current value is. Not to worry, HTML 5 has you covered. Along with the above we can use another new tag, <output>, as follows:

<input name="_range" id="range" min="5" max="72" step="3" type="range" /><br />
<output name="result" onforminput="value = _range.value">38</output><br />

The best current implementation of range by far is Opera 10.60+. While Chrome and Safari do implement this from an interface perspective, steps are not as clearly marked as in Opera and the output tag does not work, so for the best demo of the range type, open the demo in Opera 10.60+.

View Demo (Confirmed in Opera 10.60+, partial in Chrome 5+ and Safari 5+)
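
If you build a script based slider as a fallback, it needs to land on the same values a native one would. The sketch below shows one way to do that, assuming steps are counted from min and that a value between two steps moves to the nearest one. The name snapToStep is hypothetical.

```javascript
// Clamps `value` to the range and moves it to the nearest step
// counted from min, never exceeding max.
function snapToStep(value, min, max, step) {
  var clamped = Math.min(Math.max(value, min), max);
  var steps = Math.round((clamped - min) / step);
  var snapped = min + steps * step;
  // Rounding up can overshoot max, so step back once if it does.
  if (snapped > max) snapped -= step;
  return snapped;
}

// The midpoint of 5 and 72 is 38.5, which snaps to 38.
console.log(snapToStep(38.5, 5, 72, 3)); // 38
console.log(snapToStep(100, 5, 72, 3));  // 71
```

The first call also explains the 38 in the output tag above: if the slider starts at the midpoint of its range, 38 is the nearest value that sits on a step.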

Data List

Ever wanted to present the user with a list of default values that are most often filled into a specific field, while still allowing them to enter their own if none of the defaults fits their needs? This is where data lists come in. The code for a data list is as follows:

<input id="title" list="salutations" type="text" /><br />
<datalist id="salutations">
	<option label="Mr" value="Mr" />
	<option label="Ms" value="Ms" />
	<option label="Prof" value="Prof" />
	<option label="Monkey Business" value="Monkey Business" />
</datalist>

With the above, a list of these values will drop down when the title field receives focus.

View Demo (Confirmed in Opera 10.60+)

Pattern

Say you created a form for a parts store and there is an input field into which a prospective buyer will fill in a part number. Now, you know that all part numbers follow a similar pattern. How can you aid users in entering part numbers correctly and validate their input? The pattern attribute helps you do exactly that and, when used in combination with the placeholder attribute, you have a real win for usability. Below is an example:

<label for="partId">Part ID</label><br />
<input id="partId" name="partId" pattern="[0-9][A-Z]{3}" placeholder="Digit followed by three uppercase letters" /><br />

As you can see from the above, the pattern attribute takes standard regular expression syntax as its value, which must conform to the JavaScript Pattern production [ECMA262], and applies this to the field. With pattern we face an interesting dilemma. While Opera 10.60 implements pattern validation, it does not implement placeholder. The problem is that Opera will tell you that your input does not match the required pattern, but it will not actually tell you what the expected pattern is. This is why I used the placeholder, to tell the user what is expected, but as this does not function in Opera we are limited in our use.

View Demo (Confirmed in Opera 10.60+)
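
One detail worth knowing if you write a fallback for pattern: the expression has to match the whole value, not just part of it. A plain regular expression test would accept "1ABCD" for the part number pattern above, because the first four characters match. The sketch below anchors the pattern at both ends to avoid that. The name matchesPattern is hypothetical.

```javascript
// Tests `value` against `pattern` the way the pattern attribute is
// meant to behave: the whole value must match, so the expression is
// wrapped in a non-capturing group and anchored at both ends.
function matchesPattern(value, pattern) {
  var re = new RegExp("^(?:" + pattern + ")$");
  return re.test(value);
}

var partPattern = "[0-9][A-Z]{3}";
console.log(matchesPattern("1ABC", partPattern));  // true
console.log(matchesPattern("1ABCD", partPattern)); // false
console.log(matchesPattern("1abc", partPattern));  // false
```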

Required

Required: this is something we have to check for over and over again on forms with JavaScript. With HTML 5 you no longer have to use JavaScript for this repetitive task. With one simple attribute you can leave this to the browser:

<input name="username" id="username" required />

Again, here we have the same situation as with some of the other elements. Chrome and Safari will prevent the form from submitting and move focus to the required field, but they will not provide an error message to the user. The only browser that has implemented this completely is again Opera 10.60+.

View Demo (Confirmed in Opera 10.60+)

Of course, one of the big additions that comes with HTML 5 is the canvas element. Over and above anything else, this is the one that has garnered the most attention from developers and thus from browser vendors themselves. With canvas you are limited merely by your imagination. And with browsers such as Internet Explorer 9 leading the way with hardware accelerated rendering, the sky is the limit.

There is a lot more to HTML 5 than what I covered here, such as the video and audio elements, which bring audio and video natively to the browser without the need for any plugins or downloads, but covering all of this would fill a book ;) I am absolutely loving the revolution we are currently involved in with regards to the web, and when you bring CSS3 and ECMAScript 5 into the mix, the web is truly, finally going to reach its promise with regards to web experiences and applications. I, for one, am looking forward to being part of all of this.

Join In

  • http://java.dzone.com James

    Great overview Schalk!

    • http://expansive-derivation.ossreleasefeed.com/ Schalk

      Thanks James, glad you enjoyed it.

  • http://www.texaswebdevelopers.com/blog TexasWebDevelopers

    We put together some boilerplate code for people who wanted some guidance when beginning to experiment with HTML5. The page code has working examples of quite a few tags including header, nav, aside, article, figure, hgroup, time, pubdate, section, datalist, email, and footer. Download here:
    http://www.texaswebdevelopers.com/blog/template_permalink.asp?id=136

  • http://www.eastdevonit.co.uk Dan

    Thanks for this, I found it very interesting…

    • http://expansive-derivation.ossreleasefeed.com/ Schalk

      Glad you enjoyed it, Schalk

  • http://jireck.free.fr Jireck

    Thanks for explain

    I hope all browser take HTML5 syntax

    • http://expansive-derivation.ossreleasefeed.com/ Schalk

      In time browsers will support this, there is already wide spread support but, we will still, for a long time, have to deal and implement alternate solutions for those that do not. It is part of the fun of being a web developer ;) Schalk
