August 26th in Articles, HTML5

A Look At Audio and Video in HTML5


Following on from my introduction to HTML5 article, today I want to look at two new tags introduced in HTML5, namely audio and video. While it does not sound like much, these two additions are changing the way we think about multimedia on the web. Previously, if the topic of multimedia on the web came up, a few technologies came to mind, most notably Flash. Flash is the dominant force when it comes to bringing audio and/or video to the web.

Whether you choose Flash, Silverlight or another third-party vendor to embed your content, the one thing they all have in common is that they are executed inside a third-party plugin and are not run natively inside the browser. The audio and video tags change all of that by making it possible to play both natively inside the browser, without any need for third-party plugins. So besides the actual multimedia content, there is nothing additional for the user to download.

Because this media is now part of the browser, there are some additional benefits, but also problems that come with this. On the benefit side, developers have much finer control over the look and feel of the controls through CSS, as well as much greater interaction through JavaScript. The other benefit is that browser vendors can implement and handle the big problems that come with multimedia on the web, and here I am specifically referring to the accessibility of multimedia. I am not going to go into too much detail about this at the moment, as one could dedicate an entire post to the topic.

On the downside, we need to keep pressure on browser vendors so that we do not end up with a split in implementations, where we literally go back in time and have to create a separate implementation for each browser. The other problem we currently face is finding a standard video and audio codec that is open and free from patents, and from which no single entity gains financially or holds any form of stronghold over the future growth and innovation of the platform. In the remainder of this article I will talk about where we currently stand in this regard. At the moment we are at an important crossroads, and we as web developers need to be vigilant and make our voices heard. With that, let’s dive into audio.

Audio

I alluded to the fact that codecs are one of the most important aspects that can shape the future of audio and video on the web. Currently three codecs are supported by browsers, of which one, Ogg Vorbis, is the one most are rooting for. Below is a breakdown of the current support browsers offer for the various codecs.

Browser              Ogg Vorbis   MP3   WAV
Firefox 3.5          x            -     x
Safari 4             -            x     x
Chrome 3             x            x     -
Opera 10.60          x            -     x
Internet Explorer 9  -            x     -

It might come as no surprise that all the browsers except Safari and Internet Explorer 9 support the open Ogg Vorbis format. IE9 also supports Advanced Audio Coding (AAC), which is designed to be the successor to MP3. There is a lot of politics involved in the whole codec debacle so, instead of getting into that, let’s get into the code.
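
To find out which of these formats a particular browser claims to support, the media element exposes a canPlayType() method that returns an empty string, "maybe" or "probably". Below is a minimal sketch of how that could be used. The helper name detectAudioSupport and the list of MIME types are my own choices for illustration, and the element is passed in as a parameter so the function can also be tried outside a browser.

```javascript
// MIME types for the three formats in the table above (illustrative list).
const AUDIO_TYPES = {
  "Ogg Vorbis": 'audio/ogg; codecs="vorbis"',
  "MP3": "audio/mpeg",
  "WAV": "audio/wav",
};

// Ask a media element which formats it thinks it can play.
// canPlayType() returns "", "maybe" or "probably"; "" means no support.
function detectAudioSupport(audioElement) {
  const result = {};
  for (const [name, mime] of Object.entries(AUDIO_TYPES)) {
    // An element without canPlayType() does not support HTML5 audio at all.
    const answer =
      typeof audioElement.canPlayType === "function"
        ? audioElement.canPlayType(mime)
        : "";
    result[name] = answer !== "";
  }
  return result;
}

// In a browser: detectAudioSupport(document.createElement("audio"))
```

Keep in mind that "maybe" and "probably" are only the browser's best guess, so treat the result as a hint rather than a guarantee.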

Audio Tag Attributes

The media elements share the same attributes, so the attributes for audio are shared by video. The attributes are:

  • src
  • preload
  • autoplay
  • loop
  • controls

The above attributes are pretty much self-explanatory, with preload and controls justifying some additional explanation. Let’s start with preload. It seems pretty straightforward, suggesting that the browser will start to download the audio content as soon as it is able to do so, but it is not as cut and dried as this. The attribute is only a hint, and it is implemented, as I previously mentioned, with the user at the top of the chain, closely followed by the author.

What I mean by this is that the author decides what preload should mean for a page, based on what the author believes will provide the end user with the best possible experience. Below are the different ways you can use the preload attribute:

<audio preload></audio>
<audio preload="none"></audio>
<audio preload="metadata"></audio>
<audio preload="auto"></audio>

As you can see from the above, preload can be used as a standalone attribute or in conjunction with a specific value. Using preload on its own is equivalent to specifying ‘auto’. Specifying ‘none’ hints to the user agent that the user may not need the audio data, and that it should therefore not be downloaded until a user action requests it. This can also be used to prevent unnecessary load on the server. Specifying ‘metadata’ hints to the user agent that the actual audio content may not be required, but that it can go ahead and download any related metadata, such as the first keyframe, duration, size etc.

Specifying ‘auto’ would mostly be used on sites such as Last.fm, where the authors can be just about 100% sure that the user accessing the page or service intends to listen to the audio content. To improve the user experience when interacting with the media, the user agent should therefore start downloading the audio content as soon as possible. What is important to note about preload is that it can be overridden by the autoplay attribute.
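
The rules above can be summarised in a small model. This is only a sketch of how the hint is read, not what any particular browser is guaranteed to do, since preload remains a hint. The function name effectivePreload and the plain-object representation of the attributes are assumptions made for the example.

```javascript
// Model of how the preload hint is interpreted (a simplification).
// attrs: plain object of attribute name -> value, e.g. { preload: "" }.
function effectivePreload(attrs) {
  // autoplay overrides preload: the data must be fetched to start playing.
  if ("autoplay" in attrs) return "auto";
  // No preload attribute at all: the browser picks its own default.
  if (!("preload" in attrs)) return "browser default";
  const value = String(attrs.preload).toLowerCase();
  // A bare `preload` (empty value) means the same as "auto".
  if (value === "" || value === "auto") return "auto";
  if (value === "none" || value === "metadata") return value;
  // Unknown values are left to the browser as well.
  return "browser default";
}
```

For example, effectivePreload({ preload: "none", autoplay: "" }) yields "auto", which is the override mentioned above.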

The ‘controls’ attribute is actually straightforward, but the best way to describe it is to show you a demo of the audio tag in action. First let’s look at a complete code example:

<audio src="audio/hamster.ogg" preload="metadata" controls></audio>

Example:

If you are using a modern browser that supports the audio tag but not the Ogg Vorbis format, the above should render a UI for you, but clicking on play will result in no audio being played. So, how do we get around this? The source of an audio element can be specified in two ways: the first as we have seen above, and a second that allows us to address this issue. The audio tag can alternatively contain nested ‘source’ tags.

What this means is that we can have our Ogg Vorbis format and then provide a fallback such as MP3 for any user agent that supports the audio tag but not the codec. The user agent applies a waterfall algorithm, starting from the first source element and falling through to the second, third etc. until it finds a format it supports. So to get a cross-browser version going, we can use Ogg Vorbis as our base and then use MP3 as our fallback as follows:

<audio preload="metadata" controls>
    <source src="audio/hamster.ogg" />
    <source src="audio/hamster.mp3" />
</audio>

Along with the above you can also specify the type such as:

<audio preload="metadata" controls>
    <source src="audio/hamster.ogg" type="audio/ogg" />
    <source src="audio/hamster.mp3" type="audio/mpeg" />
</audio>
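
With a type on each source, the waterfall described earlier can be modelled in a few lines. This is a simplified sketch rather than the exact algorithm browsers run. The function name pickSource and the shape of the source objects are assumptions for the example, and the canPlayType argument stands in for the method of the same name on a real media element.

```javascript
// Pick the first source a user agent can play, walking the list in order.
// sources: array of { src, type } objects, type being optional.
// canPlayType: function(type) returning "", "maybe" or "probably".
function pickSource(sources, canPlayType) {
  for (const source of sources) {
    // A declared type the browser cannot play is skipped without a download.
    // A source with no type cannot be ruled out up front; a real browser
    // would have to start loading it to find out, so it counts as a candidate.
    if (!source.type || canPlayType(source.type) !== "") {
      return source.src;
    }
  }
  return null; // nothing playable
}
```

The practical benefit of the type attribute is visible in the comment: without it, the browser can only discover an unsupported file by fetching part of it.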

You can also go further and specify the codec, which might be required when for example you have Speex audio in an Ogg container:

<audio preload="metadata" controls>
    <source src="audio/hamster.ogg" type="audio/ogg; codecs=vorbis" />
    <source src="audio/hamster.spx" type="audio/ogg; codecs=speex" />
</audio>
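
If you generate source elements from script, the type value can be assembled from the container and its codecs. The helper below is a small sketch; the name mediaType is hypothetical. It wraps the codec list in double quotes, which is needed as soon as more than one codec is listed.

```javascript
// Build a value for the type attribute from a container MIME type and
// an optional list of codec names.
function mediaType(container, codecs = []) {
  if (codecs.length === 0) return container;
  // Several codecs are comma separated, so the list is wrapped in quotes.
  return `${container}; codecs="${codecs.join(", ")}"`;
}
```

Because the result contains double quotes, wrap the attribute in single quotes when writing it into markup by hand, for example type='video/ogg; codecs="theora, vorbis"'.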

Example:

The audio below has a source entry specifying an Ogg Vorbis audio clip and a second one specifying an MP3 audio clip. This means that if you load this page in, say, Firefox 3.5+, you will hear a very rude man insulting your parents. However, if you open it in Safari, you will hear one of Cartman’s famous clips.

The final question that remains is: what if the user agent does not support any of the codecs, or even the tag itself? Yet again we are covered, at least for the second case. You can specify content that will only be shown to user agents that do not support the audio tag, directing them to an alternate version of the content or politely suggesting that they might want to consider upgrading to a modern browser. Note that a user agent that supports the tag but none of the listed codecs will not show this content. Doing this is very simple:

<audio preload="metadata" controls>
    <source src="audio/hamster.ogg" type="audio/ogg; codecs=vorbis" />
    <source src="audio/hamster.spx" type="audio/ogg; codecs=speex" />
    <p>Your browser does not support the HTML5 audio tag, consider upgrading your browser or using an open and standards compliant browser : <a href="http://www.opera.com">Opera</a> : <a href="http://www.google.com/chrome">Chrome</a> : <a href="http://www.getfirefox.com/">Firefox</a></p>
</audio>

Example:

Your browser does not support the HTML5 audio tag, consider upgrading your browser or using an open and standards compliant browser : Opera : Chrome : Firefox

If you open the page in Internet Explorer prior to version 9, you will be presented with the text content specified last, without affecting how the other user agents handle the content. Before we move on to video, just a quick note on the controls. The controls we have seen here are provided by the user agent, but you also have scripting access, which I will look at in a follow-up article, so you are able to create your own controls.
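
As a small taste of that scripting access, here is a sketch of the logic behind a custom play/pause button. It relies only on the paused property and the play() and pause() methods that media elements provide. The function name togglePlayback and the button labels are my own, and the wiring to an actual button is left out.

```javascript
// Toggle playback on a media element.
// Returns the label a custom button should show after the toggle.
function togglePlayback(media) {
  if (media.paused) {
    // In current browsers play() returns a promise; it is ignored here.
    media.play();
    return "Pause";
  }
  media.pause();
  return "Play";
}
```

In a page you would call this from a click handler and write the returned label back to your button.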

Video

You have no doubt seen the headlines countless times about how HTML5, and specifically canvas and video, is the Flash killer. In truth, the potential for that is great but, at the moment, the two are better seen as complementary solutions. HTML5 video especially has to mature a lot more, and a lot more support is required before it can reach its full potential.

With that said, video being a native part of the user agent is absolutely awesome, and there is no reason you should not be looking at it and using it today. As I mentioned earlier, a lot of aspects are shared across the media tags, so instead of repeating myself I will simply refer you back to the section on audio. Let’s start this section as we did the previous one, by showing the support for the various codecs in today’s browsers.

Browser              Ogg Theora   WebM   h.264
Firefox 3.5          x            -      -
Safari 4             -            -      x
Chrome 3             x            x      x
Opera 10.60          x            x      -
Internet Explorer 9  -            -      x

Again, as before, there is a lot of uncertainty and politics involved with regard to the codec but, at least for HTML5 video, we can support all of the above browsers by presenting our content in both Ogg Theora and h.264. For the most part video and audio share the same attributes, but video has a few all its own:

  • poster
  • width
  • height

Specifying ‘poster’ gives you a way to present an image to the end user while the video data is not yet available.

<video src="video/theora/squirel_fight.ogv" poster="video/poster.png" width="320" height="240" controls></video>

Example:

The poster attribute is still buggy, triggering an incorrect sizing of the video in Chrome, and is not supported by most browsers, which will instead show the first frame as soon as it has been loaded. According to the MDC docs on video, Firefox using the Gecko 1.9.2 engine will support this attribute. The other two attributes, width and height, are self-explanatory.

As with audio, creating fallbacks for different codecs works exactly the same way:

<video width="320" height="240" controls>
    <source src="video/webm/princeofpersia.webm" type="video/webm" />
    <source src="video/theora/squirel_fight.ogv" type="video/ogg" />
    <source src="video/h264/MontyPython-Spam.mp4" type="video/mp4" />
    <p>Really, in this day and age? Maybe try one of these.... <a href="http://www.opera.com">Opera</a> : <a href="http://www.google.com/chrome">Chrome</a> : <a href="http://www.getfirefox.com/">Firefox</a></p>
</video>

Example:

Really, in this day and age? Maybe try one of these…. Opera : Chrome : Firefox

And for complete cross-browser support:

<video width="320" height="240" controls>
    <source src="video/webm/princeofpersia.webm" type="video/webm" />
    <source src="video/theora/squirel_fight.ogv" type="video/ogg" />
    <source src="video/h264/MontyPython-Spam.mp4" type="video/mp4" />
    <object width="320" height="240" type="application/x-shockwave-flash" data="video/swf/frog_fly.swf">
        <param name="movie" value="video/swf/frog_fly.swf" />
        <param name="allowfullscreen" value="true" />
        <p>Download video as <a href="video/h264/MontyPython-Spam.mp4">MP4</a>, <a href="video/webm/princeofpersia.webm">WebM</a>, or <a href="video/theora/squirel_fight.ogv">Ogg</a>.</p>
    </object>
</video>

Example:

When we look at the above, especially the last sample, we can start to see why agreement on a single, open codec for video and audio is so important for the growth and adoption of these new elements. As with audio, because video is now native to the browser, we have scripting access to the video as well as the ability to style it using CSS. I will be looking at these in a follow-up post as well.
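
To see how much the codec split matters, the claim made earlier that Ogg Theora plus h.264 reaches every browser in the video table can be checked with a few lines of script. The data below is copied from that table; the function name uncoveredBrowsers and the short codec keys are assumptions for the example.

```javascript
// Video codec support as listed in the table earlier in this article.
const VIDEO_SUPPORT = {
  "Firefox 3.5": ["theora"],
  "Safari 4": ["h264"],
  "Chrome 3": ["theora", "webm", "h264"],
  "Opera 10.60": ["theora", "webm"],
  "Internet Explorer 9": ["h264"],
};

// Return the browsers that cannot play any of the offered formats.
function uncoveredBrowsers(support, offered) {
  return Object.keys(support).filter(
    (browser) => !support[browser].some((codec) => offered.includes(codec))
  );
}
```

Offering ["theora", "h264"] leaves no browser uncovered, while offering WebM alone leaves out Firefox 3.5, Safari 4 and Internet Explorer 9, which is why the samples above carry several sources.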

That covers the bulk of audio and video in HTML5. I hope you enjoyed this post, got excited about these new elements and will start to use them. I look forward to your comments.

  • http://www.yellosoft.us/ Andrew Pennebaker

    Go Ogg!

    Once upon a time, my entire music collection was in Ogg Vorbis. Then I bought an iPod, and now I use MP3s. The only reason I wouldn’t want Ogg to catch on is that I would have to convert my music back into Vorbis or even into FLAC.

  • guix

    Hey,
    the video examples are not showing up for me (Opera 9.27 or Firefox 3.6.3 / Ubuntu), are they functional / right MIME type?
    Careful, the MP4 is 9.6MB, the OGV (squirrels) is 968Kb and the WEBM is 9.7MB .
    Audio examples are working.

    Thanks for the article.
    Schalk, can you provide your contact details somewhere on this website?

    • Anonymous

      Hey Dude,

      I do not believe Opera 9.27 supports HTML5 video, might be wrong here, but the fallback to Flash should work then, will look into that. As far as the sizes, I did not pay that much attention to the size of the files ;D I should have, sorry about that.

      Glad you enjoyed the article. I guess you found my contact info as I received your email. If you need more detailed info contact me on volume4.schalk@gmail.com
