
At a recent conference I attended, the topic of HTML5 came up at a Q&A session with Andy Budd, John Resig, Jonathan Snook, Joe Stump and Dustin Diaz. The conversation quickly moved from the general topic of HTML5 to, more specifically, what is being called HTML5 these days. The term HTML5 is currently being used as an umbrella term for a bunch of new and emerging standards and APIs. Generally, I do not think that is a bad thing.
As was mentioned during the Q&A, it was after all the term Ajax, coined by Jesse James Garrett, that brought JavaScript back into the limelight and sparked the innovation we and end users have been enjoying for a long time now. When the term was coined, it referred to the ability to update portions of a webpage without triggering a complete page reload, by exchanging small chunks of XML with the server. Since then, however, pretty much the entire basis of what became Web 2.0 has been termed Ajax.
Anytime a user could interact with a webpage and the user interface would adapt or change based on these actions, it was called Ajax. Buzzwords such as this often give people the ‘sales’ tools to drive the adoption of a new technology inside companies. In general, people love buzzwords. So in that regard, using HTML5 as the term to describe this new era we are moving into is not a bad thing. What we do need to ensure, however, is that people, and especially developers, are aware of what they are referring to when using the term HTML5.
That, then, is the end goal of this article: to plot, in a sense, the ‘HTML5’ landscape and provide a short overview of what each part of this new platform is all about.
HTML5
On 24 December 1999 the W3C released the last revision of HTML, namely HTML 4.01. Since then this base for everything we create on the web has completely stagnated, even being abandoned at one point by the W3C in favor of an XML-based serialization called XHTML. When HTML was created way back in 1991 or 1995, depending on who you ask, and even up to the release of 4.01 at the end of 1999, the design did not really cater for what the web is used for today.
That, coupled with the fact that the language was not revised for 8 or 9 years, placed a lot of limitations on web developers, and we often had to be very creative to build the interface elements and user experiences we wanted. HTML5 goes a long way towards changing this with a whole array of improvements to forms, native support for audio and video, and a new element called canvas. The canvas element alone has stirred an immense amount of interest, and for good reason. It is basically a playground for JavaScript developers where the only limit is your imagination. Another of the big additions to the HTML5 spec is drag and drop, which I will touch on a little later in the article.
There is a lot to HTML5 and you can read the current draft spec on the W3C website and you can also follow the continued evolution on the WHAT-WG website. In the end when you talk about HTML5 technically, this is what you are referring to, the 5th major revision of the core language of the Web.
W3C Specification WhatWG Working Draft
CSS3
One of the specifications that is placed under the HTML5 umbrella is CSS3. When CSS came along it changed the way we developed websites. We had a complete separation between structure and design, but CSS was not perfect. In fact, Douglas Crockford even went as far as saying that CSS 2.1 is not implementable, and we can see this today. Even with all of the advances in the browser space, CSS 2.1 has not yet been fully implemented by any one browser.
CSS3 does not only bring new features to the table; the CSS working group is also taking a completely new look at CSS. As with a lot of HTML5, the authors of CSS3 have looked at the typical problems web developers face with regards to layout and styling with CSS and built solutions into the specification. The result is features such as border-radius, for creating rounded corners without the need for images, hacks or JavaScript, and box-shadow, which allows you to easily add drop shadows to elements.
There is a new layout module that solves a lot of the problems with regards to equal heights and creating grid-based layouts. Talking about modules, this is one of the big changes in the way the specification is being authored. The CSS3 specification is entirely modular and there are 26 modules in total. I do not see all of these modules coming to fruition though, especially ones such as BECSS, which is an attempt to attach behavior to elements instead of style. I really feel that this should remain with JavaScript and that CSS should remain the style layer in the web stack. CSS3 holds a lot of promise for web designers and, as with HTML5, browser support is growing rapidly.
CSS3 Modules
- Syntax / grammar
- Selectors
- Values & units
- Value assignment / cascade / inheritance
- Box model / vertical
- Positioning
- Color / gamma / color profiles
- Colors and Backgrounds
- Line box model
- Text
- Fonts
- Ruby
- Generated content / markers
- Replaced content
- Paged media
- User interface
- WebFonts
- ACSS
- SMIL
- Tables
- Columns
- SVG
- Math
- BECSS
- Media queries
- Test Suite
Web Workers
Web Workers are an awesome addition for web application developers and are based on work done by Google on the Gears WorkerPool model. As you know, JavaScript is single threaded, which means that while one function is executing all others need to queue up and wait before they can start executing. This also applies to Ajax requests. Web Workers allow individual scripts to run in parallel in the background, allowing for thread-like operations. The most common use for Web Workers is to handle highly intensive scripts without blocking client-side user interactions.
One important aspect to note is that the workers do not have access to the DOM. You communicate with Web Workers using message passing. A simple example would look as follows:
// The page script
var worker = new Worker('worker.js');
worker.onmessage = function (event) {
  // Runs whenever the worker posts a message back to the page
  document.getElementById('result').textContent = event.data;
};

// The worker (worker.js)
postMessage('Hello World');
As you can see from the above, the Worker itself does not have access to the DOM, but using the message passing mechanism the client-side script can add the result of the Worker to the DOM. We create an instance of the Worker object and pass in the Worker script that will be run. The instance is stored in the worker variable, and it is through this variable that we communicate with the Worker.
The onmessage event handler allows our client-side script to receive messages from the Worker. The Worker in turn sends messages back to the client script using postMessage. There is much more to Web Workers than what I have discussed here. Also keep in mind that this is still a work in progress, so keep your eye on the spec and get involved on the mailing list.
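To make the ‘highly intensive scripts’ use case a little more concrete, here is a minimal sketch of a worker script that does some CPU-heavy work (counting primes) and posts the result back. The names countPrimes and handleMessage are hypothetical and chosen purely for illustration, and checking for importScripts is just one common way to detect that a script is running inside a worker.

```javascript
// worker.js (hypothetical file name): CPU-heavy work kept off the main thread.

// Counts the primes below `limit` using simple trial division.
function countPrimes(limit) {
  var count = 0;
  for (var n = 2; n < limit; n++) {
    var isPrime = true;
    for (var d = 2; d * d <= n; d++) {
      if (n % d === 0) {
        isPrime = false;
        break;
      }
    }
    if (isPrime) {
      count++;
    }
  }
  return count;
}

// Turns an incoming message into a reply; `reply` stands in for postMessage.
function handleMessage(event, reply) {
  reply(countPrimes(event.data));
}

// Only wire up the handler inside a worker, where importScripts exists.
if (typeof importScripts === 'function') {
  self.onmessage = function (event) {
    handleMessage(event, function (result) {
      self.postMessage(result);
    });
  };
}
```

The page would start the work with worker.postMessage(1000000) and receive the count in the same kind of onmessage handler shown earlier, while the user interface stays responsive.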
Web Socket API
One of the ways the web browser environment, i.e. the HTTP protocol, differs from other network protocols is that it is not bidirectional. A request is sent and a response is received, and between such exchanges the connection is closed. This means that there is no way for the server to push data to the browser when data changes; instead we have to fall back to some form of polling or long polling using, for example, Ajax.
Web developers have been using techniques such as the above, which later became known as Comet. While the Comet approach gets the job done by jumping through various hoops using hidden iframes and Ajax, there is no true cross-browser, cross-platform implementation that handles this gracefully. What you want to be able to do is open a communication channel to the server, keep that channel open, and simply listen for incoming messages from the server and process them as they arrive.
There should be no need for hidden iframes, for relying on features only available in a single browser, or for polling with Ajax to open, read and close network connections at set intervals, which can quickly become very expensive. This is where the Web Sockets API comes in. It brings bidirectional communication between server and client without the need for any of the above, and thus narrows the gap between desktop and web-based applications even further.
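As a rough sketch of what listening on such an open channel looks like, the snippet below attaches handlers to a WebSocket-style object. The function name listenForUpdates, the 'subscribe' message and the URL in the usage comment are all made up for illustration; only the onopen, onmessage and onclose handlers and the send method come from the API itself.

```javascript
// Wires up handlers on a WebSocket-like object. The socket is passed in so
// the same code works with a real WebSocket or with a stand-in object.
function listenForUpdates(socket, onUpdate, onClosed) {
  socket.onopen = function () {
    // The channel is now open in both directions. 'subscribe' is a
    // made-up message; a real server defines its own protocol.
    socket.send('subscribe');
  };
  socket.onmessage = function (event) {
    // Fires every time the server pushes data; no polling involved.
    onUpdate(event.data);
  };
  socket.onclose = function () {
    if (onClosed) {
      onClosed();
    }
  };
  return socket;
}

// In a browser that supports the API (the URL is a placeholder):
// listenForUpdates(new WebSocket('ws://example.com/updates'), function (data) {
//   console.log('Server says: ' + data);
// });
```

Passing the socket in as a parameter is purely a design choice for this sketch; it keeps the handler logic separate from the connection details.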
Web Storage
If you have been using WordPress for some time you may remember the Turbo button that was introduced. You will also have noticed that services such as Google Docs and Remember the Milk have given you the option to work offline. All of these capabilities were enabled by a browser extension developed by Google called Gears. But in late November 2009 Google announced that they would stop further development of Gears. Why?
Three things: the Web Storage API, Geolocation and offline web applications. The development done on Gears laid the groundwork and gave the application working group the impetus it needed to create a standard for these requirements that can be implemented by user agents. This gives web developers a cross-browser, cross-platform open standard, in this case for storing data on the client, taking yet more load off the server and improving application performance.
Web Storage comes in two flavors for two distinctly different use cases namely sessionStorage and localStorage.
Session Storage
Let’s explore the use case of purchasing an item at an online store. Application developers often use a combination of server sessions and cookies to store state between pages and thus keep track of what the user has added to their basket. If a user makes two separate purchases from the same online store in two separate browser windows, the cookie from one session would leak into the other, because a cookie is tied to the domain rather than to a specific browser window.
With session storage you can store data that will only be available to pages of the same site in the current session running in the current window. Both session and local storage expose their data as attributes on the storage object, which you can set as follows:
<input type="checkbox" value="true" name="isGift" onchange="sessionStorage.isGift = checked" />
When finishing up the order you can check whether this box was ticked as follows:
// Storage values are always strings, so compare against 'true'.
if (sessionStorage.isGift === 'true') {
  // Your code here.
}
Local Storage
The most obvious scenario here is a service such as GMail. Instead of going to the server and downloading your entire inbox, as well as any new messages that might have been received, every time you reload the page or log into the service, Google can opt to offload the entire inbox, or say the messages received in the last month, into local storage. All it then has to transfer over the pipe is any new messages that were received since the last page refresh or the last time you logged in. For mobile applications this ability is priceless. Usage of local storage is, as mentioned before, the same as for session storage.
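Here is a minimal sketch of the inbox idea, assuming the messages are plain objects that can be serialized as JSON. The function names and the 'inbox' key are hypothetical; getItem and setItem are the storage methods themselves. The storage object is passed in as a parameter so the same code works with either localStorage or sessionStorage.

```javascript
// Saves an array of message objects under a single key.
// Web Storage only stores strings, so the data is serialized as JSON.
function saveMessages(storage, messages) {
  storage.setItem('inbox', JSON.stringify(messages));
}

// Returns the cached messages, or an empty array if nothing is stored yet.
function loadMessages(storage) {
  var raw = storage.getItem('inbox'); // getItem returns null for a missing key
  return raw === null ? [] : JSON.parse(raw);
}

// Appends newly fetched messages to the cache and returns the full list.
function addNewMessages(storage, newMessages) {
  var all = loadMessages(storage).concat(newMessages);
  saveMessages(storage, all);
  return all;
}

// In a browser, where messagesFromServer is whatever your server returned:
// var inbox = addNewMessages(localStorage, messagesFromServer);
```

Only the new messages need to come from the server; everything else is read back from the client on the next page load.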
Offline Web Applications
Having touched on Web Storage above one tends to think that is what makes offline web applications possible. That is actually just part of the puzzle, the other part is the offline web application specification that is part of the working draft of the WHAT-WG HTML5 specification for loading web pages.
The reason I say that simply using local storage does not an offline application make is that once the network connection is severed, the data of the application might reside on the client, but not all of the files that the application relies on will have been cached and made available for offline usage. This is where the cache manifest comes in. Say, for example, your web application relies on an HTML document, a CSS document and two JavaScript files to offer its functionality. You would then create a manifest that looks like the following:
CACHE MANIFEST
index.html
default.css
jquery.js
yourapp.js
Save this as app.manifest and reference it from the html element of your page:
<!DOCTYPE HTML>
<html manifest="app.manifest">
<head>
<title>Offline Application</title>
<link rel="stylesheet" href="default.css">
<script src="jquery.js"></script>
<script src="yourapp.js"></script>
</head>
<body>
</body>
</html>
Now when a user accesses this page the browser will cache all of the files in your manifest and make them available for offline usage. When you then combine local storage with your MANIFEST for offline web applications, you not only have a web app that can work when your network connection is unavailable but, due to the local storage your application is going to perform a lot better and a lot of server overhead and latency is removed.
Web SQL Database
So, you have access to local storage, and you can create a manifest that specifies the files that are needed by your web app to work offline and thus need to be cached by the user agent, but there is still something missing. If you think about the use case I used a little earlier with regards to local storage, you are going to have a hard time storing an entire mailbox offline in the format that local storage offers. And what happens if you want to query the data? You will need to go back to the server.
There is however another part to this, and that is the Web SQL Database API. Before I go any further on this topic, there is an important point you need to be aware of. As far as the W3C is concerned, this specification is currently at an impasse: all the interested parties who implemented the Web SQL Database specification have done so using SQLite, and as such there is no route to follow with regards to standardization, as there is only one implementation. So, while the process of standardization is deadlocked at the moment, the ability to use the Web SQL Database specification still exists and is well supported. It is just important to be aware of the current status of the spec and keep an ear to the ground.
To demonstrate how the Web SQL database works I am using a piece of code from the specification that demonstrates the process really well:
function prepareDatabase(ready, error) {
  return openDatabase('documents', '1.0', 'Offline document storage', 5*1024*1024, function (db) {
    db.changeVersion('', '1.0', function (t) {
      t.executeSql('CREATE TABLE docids (id, name)');
    }, error);
  });
}

function showDocCount(db, span) {
  db.readTransaction(function (t) {
    t.executeSql('SELECT COUNT(*) AS c FROM docids', [], function (t, r) {
      span.textContent = r.rows[0].c;
    }, function (t, e) {
      // couldn't read database
      span.textContent = '(unknown: ' + e.message + ')';
    });
  });
}

prepareDatabase(function(db) {
  // got database
  var span = document.getElementById('doc-count');
  showDocCount(db, span);
}, function (e) {
  // error getting database
  alert(e.message);
});
Let’s go over what is happening here. We start off by calling the prepareDatabase function. The purpose of this function is to return an open ‘connection’ to the database so that we can access and manipulate its content. The openDatabase method takes the following parameters:
openDatabase(name, version, displayName, estimatedSize, (optional) callback);
The first thing the code above will do is check whether the database we are specifying already exists. If it does, it will simply return a ‘connection’ to the database without proceeding any further with the code block. If it does not find the database, it will create it and progress into the callback. The callback in this case is an anonymous function that sets the version of the newly created database and executes the SQL to create the desired table.
On success it will return the ‘connection’ to the newly created database or if an error occurs the anonymous error function passed into prepareDatabase() will be executed that will open an alert with the error message. Assuming a successful completion of prepareDatabase our success function will execute. Here we get the DOM element into which we want to write the result and then call showDocCount() passing in the database ‘connection’ handle as well as a reference to the DOM node.
In showDocCount() we create a single new read transaction and execute a SQL statement that will do a count on our table and return the number of documents in the table. If the process was successful, the document count will be written to the DOM node; if an error occurred, the error message will be written to the node. This is incredibly powerful and a major enhancement to the web application platform. I would encourage you to read over the spec and understand exactly how all of this works, and especially to look at how you can secure your Web SQL database interaction against SQL injection attacks and other such hacks, so that you can truly make the most of this awesome capability.
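On the point about SQL injection, one safeguard the API offers is the ‘?’ placeholder: values are handed to executeSql in a separate array rather than being concatenated into the SQL string. Below is a minimal sketch, assuming the docids table from the example above; the addDocument function and its parameter names are hypothetical.

```javascript
// Inserts one row into the docids table using "?" placeholders.
// The values travel separately from the SQL text, so user input is
// never parsed as part of the statement.
function addDocument(db, id, name, onDone, onError) {
  db.transaction(function (t) {
    t.executeSql('INSERT INTO docids (id, name) VALUES (?, ?)', [id, name]);
  }, onError, onDone);
}

// In a browser, with a database handle from openDatabase():
// addDocument(db, 1, userSuppliedName, function () {
//   console.log('Saved');
// }, function (e) {
//   console.log('Failed: ' + e.message);
// });
```

Note that transaction() takes the error callback before the success callback, which is easy to get the wrong way round.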
Geolocation API
This is the second of the APIs being defined as a standard for which we can thank Gears. The name says it all, and on mobile devices this is a feature that can come in extremely handy. The Geolocation API is not limited to mobile devices, however, as the API is completely agnostic to the underlying location information source.
With the Geolocation API you can make a single request for the user’s location, get repeated location updates or query cached locations. With cached locations you can specify the maximum age of a cached copy, force the browser to return only a sufficiently fresh cached copy by setting timeout to 0, or force the user agent to return any cached copy that it has. Below are some simple examples of this:
Single location request
navigator.geolocation.getCurrentPosition(callback);
Repeated location request
var watchId = navigator.geolocation.watchPosition(callback);
// Clearing the 'timer'
navigator.geolocation.clearWatch(watchId);
Location from cache where the cached location is not older than 20 minutes
navigator.geolocation.getCurrentPosition(successCallback,
errorCallback,
{maximumAge:1200000});
Forcing a freshly cached position
// Setting timeout to 0 is key here.
navigator.geolocation.getCurrentPosition(successCallback,
errorCallback,
{maximumAge:600000, timeout:0});
Forcing the user agent to return any cached location it has available
navigator.geolocation.getCurrentPosition(successCallback,
errorCallback,
{maximumAge:Infinity, timeout:0});
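The examples above leave the callbacks themselves undefined. As a rough sketch of what they might look like, the success callback receives a position object whose coords attribute carries latitude, longitude and accuracy, and the error callback receives an error with a numeric code. The function names below are hypothetical, and the geolocation object is passed in as a parameter so that navigator.geolocation can be supplied in the browser.

```javascript
// Formats a position object as readable text.
function describePosition(position) {
  var c = position.coords;
  return 'Latitude ' + c.latitude + ', longitude ' + c.longitude +
    ' (accurate to ' + c.accuracy + ' metres)';
}

// Maps the numeric code of a position error to a short explanation.
function describeError(error) {
  switch (error.code) {
    case 1: return 'Permission denied';
    case 2: return 'Position unavailable';
    case 3: return 'Timed out';
    default: return 'Unknown error';
  }
}

// Requests a single position, accepting a cached one up to ten minutes old,
// and reports the outcome as text through the `report` function.
function reportLocation(geolocation, report) {
  geolocation.getCurrentPosition(
    function (position) { report(describePosition(position)); },
    function (error) { report(describeError(error)); },
    { maximumAge: 600000 }
  );
}

// In a browser:
// reportLocation(navigator.geolocation, function (text) {
//   console.log(text);
// });
```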
File API
If you use GMail, which I suppose is like saying if you brush your teeth every day, then you have no doubt encountered the ability to drag files from your computer onto the GMail interface when composing a new message and have the file added as an attachment. I was baffled when I first saw this and immediately tried it out in Firefox, thinking it was something baked into Chrome, but it worked perfectly fine in Firefox as well. What gives?
This is possible because of two parts: the first is drag and drop, which is part of the HTML5 spec, and the second is the File API from the W3C. The File API provides us with the following objects and interfaces:
- A FileList sequence, which represents an array of individually selected files from the underlying system.
- A Blob interface, which represents raw binary data, and allows access to ranges of bytes within the Blob object.
- A File interface, which includes readonly informational attributes about a file such as its name, its mediatype, and a URL to access its data.
- A FileReader interface, which provides methods to read a File, and an event model to obtain the results of these reads.
- A FileError interface and a FileException exception which define error conditions used by this specification.
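To give a feel for the FileList and File interfaces listed above, here is a small sketch that walks a list of files and summarizes each one. The describeFiles function and the markup in the usage comment are hypothetical; name, size and type are the attribute names the File API specification defines on a File.

```javascript
// Builds a one-line summary for each File in a FileList (or any array-like
// collection of objects with name, size and type properties).
function describeFiles(fileList) {
  var summaries = [];
  for (var i = 0; i < fileList.length; i++) {
    var file = fileList[i];
    // type is an empty string when the media type cannot be determined
    var type = file.type || 'unknown type';
    summaries.push(file.name + ' (' + type + ', ' + file.size + ' bytes)');
  }
  return summaries;
}

// In a browser, assuming <input type="file" id="picker" multiple> on the page:
// document.getElementById('picker').onchange = function () {
//   console.log(describeFiles(this.files).join('\n'));
// };
```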
As far as drag and drop goes, we are interested in the ondragenter and ondrop events and in the dataTransfer property of the event object. First we create a div that will be the drop target of our file reader sample.
<div id="droptarget"></div>
Let’s add some basic styling:
#droptarget
{
  padding: 20px;
  -moz-border-radius: 5px;
  -webkit-border-radius: 5px;
  border-radius: 5px;
}
Next we add our JavaScript to handle the drag and drop events when a user drags a file over the drop target.
$(document).ready(function() {
  var dropTarget = document.getElementById('droptarget');

  // Placeholder; replaced by the File API version further down
  function showFileInfo(file) {
    alert('Getting file info');
  }

  dropTarget.ondragenter = function() {
    dropTarget.innerHTML = "Drop your file here.";
    dropTarget.style.border = '4px solid #17223F';
    dropTarget.style.background = "#77A3EF";
    return false;
  };

  // Returning false cancels the default action so the element accepts drops
  dropTarget.ondragover = function() {
    return false;
  };

  dropTarget.ondragleave = function() {
    return false;
  };

  dropTarget.ondrop = function(event) {
    // Stop the browser from navigating to the dropped file
    event.preventDefault();
    dropTarget.style.border = '4px solid #224F28';
    dropTarget.style.background = "#B5DFBB";
    showFileInfo(event.dataTransfer.files[0]);
  };
});
And finally we use the File API to read and output some information about the file as well as the contents itself:
function showFileInfo(file) {
  // name and size are the attribute names defined by the File API
  document.getElementById('filename').textContent = file.name;
  document.getElementById('filesize').textContent = file.size;

  var reader = new FileReader();
  reader.onload = function(event) {
    // textContent keeps markup inside the file from being interpreted as HTML
    document.getElementById('filecontents').textContent = event.target.result;
  };
  reader.onerror = function() {
    document.getElementById('filecontents').textContent = 'Unable to read ' + file.name;
  };
  // Read the file as text, since the result is displayed on the page
  reader.readAsText(file);
}

// Update your ondrop code block to the following
dropTarget.ondrop = function(event) {
  event.preventDefault();
  dropTarget.style.border = '4px solid #224F28';
  dropTarget.style.background = "#B5DFBB";
  var fileInfoDisplay = "<p>File name: <span id='filename'></span></p><p>File size: <span id='filesize'></span></p><p>File contents: <span id='filecontents'></span></p>";
  dropTarget.innerHTML = fileInfoDisplay;
  showFileInfo(event.dataTransfer.files[0]);
};
So in under 50 lines of JavaScript, which could be a lot less had I used jQuery more, we now have the ability to drag a file from anywhere on your file system, drop it onto the browser window and read not only information about the file but even the file content itself. I believe you can see how incredibly powerful these new features in HTML5 and the File API are, and how they are going to move the process of developing web applications forward by leaps and bounds.
From the above you can see that there is a lot happening in terms of standardization and the creation of new APIs. What you will also notice is that a lot of the new APIs and standards solve problems web developers have been struggling with for a long time and for which they have had to find, sometimes complicated, workarounds. There are also a lot of new ‘features’ that open up tremendous opportunities on the web and on mobile devices and further blur the line between native and web-based applications.
As far as what is or is not HTML5, well, I believe as developers we should be able to draw the distinction and understand that what is being defined and standardized right now is a completely new web platform, of which the HTML5 language is just one part. But I do also believe that the use of an umbrella term drives adoption by giving people a buzzword or acronym, and we all know how popular these are in the IT world, to rally around. And in the end, the more developers we get involved in the entire process, the better the end result will be.
I would love to hear your thoughts and look forward to reading them in the comments.
