The HTML5 specification contains new elements to allow the embedding of audio and video, similar to the way that images have historically been embedded in HTML. In contrast to today's practice of using the object element, where behavior can vary with both the type of the object and the browser, this allows for consistent attributes, DOM behavior, accessibility management, and so on. It can also handle the time-based nature of audio and video in a consistent way.
However, interoperability at the markup level does not ensure interoperability for the user, unless there are commonly supported formats for the video and audio encodings, and the file format wrapper. For images there is no mandated format, but the widely deployed solutions (PNG, JPEG/JFIF, GIF) mean that interoperability is, in fact, achieved.
The problem is complicated by the IPR (intellectual property rights) situation around audio and video coding, combined with the W3C patent policy <http://www.w3.org/Consortium/Patent-Policy-20040205/>, which states: "W3C seeks to issue Recommendations that can be implemented on a Royalty-Free (RF) basis." Note that much of the rest of the policy may not apply, as it concerns the specifications developed at the W3C, not those that are normatively referenced. However, it is clear that at least royalty-free decoding (RF-decode) is needed.
There are, of course, a number of codecs and formats that can be considered. A non-exhaustive list might include a variety of 'public' codecs as well as proprietary ones:
a) Open-source projects: the Ogg family (Vorbis, Theora), and the BBC Dirac video codec project
b) Current ISO/IEC (MPEG) standard codecs, notably the MPEG-4 family: AVC (14496-10, jointly published with the ITU as H.264), AAC (part of 14496-3)
c) Older MPEG codecs, notably MPEG-1/2 Audio Layer 3 (aka MP3), MPEG-1/2 Audio Layers 1 and 2, and maybe MPEG-4 part 2 video (14496-2)
d) Current standard codecs from other bodies: SMPTE VC-1, for example
e) Older standards from other bodies: ITU recommendations H.263 (with or without its many enhancement annexes) or even H.261
f) Very old standard codecs, formats, or industry practices; notably the common format for video from digital still cameras (Motion JPEG with uncompressed audio in an AVI wrapper)
g) Proprietary codecs, such as Dolby AC-3 audio
There are concerns or issues with all of these:
a) a number of large companies are concerned about the possible unintended entanglements of the open-source codecs; a 'deep pockets' company deploying them may be subject to risk here. Google and other companies have announced plans to ship, or are already shipping, Ogg Vorbis and Theora, so this may not be considered a problem in the future.
b) the current MPEG codecs are licensed on a royalty-bearing basis.
c) this is also true of the older MPEG codecs, though their age suggests examining the lifetime of the patents; MPEG-1 without MPEG-1 Audio Layer 3 might be royalty free right now. Three problems were mentioned with using a subset of MPEG-1. First, some people might author content using the full MPEG-1 and not realize that it would not work in browsers that implement only the subset. Second, even clearing an MPEG-1 subset as royalty free might be expensive. Third, MPEG-1 has lower quality than other codecs. As a reference point, MPEG-1 at 352 pixels x 240 lines and 30 frames a second, with audio, is about 1.9 Mbit/sec.
d) this is also true of SMPTE VC-1.
e) H.263 and H.261 both have patent declarations at the ITU. However, it is probably worth examining the non-assert status of these, which parts of the specifications they apply to (e.g. H.263 baseline or its enhancement annexes), and the age of the patents and their potential expiry. H.261's patent declarations appear to be either expired, for applications only, or made in error. However, H.261 allows only two video sizes (CIF and QCIF), so it is unsuitable. Sun's OMS project is trying to make a video codec based on H.261, which might be worth considering when it is finished.
f) this probably doesn't have significant IPR risk, as its wide deployment should have exposed any risk by now; but it hardly represents competitive compression. Motion JPEG at 320 x 240 pixels and 30 frames a second, with 8-bit PCM audio, is about 5 Mbit/sec.
g) Most proprietary codecs are licensed for payment, as that is the business of the companies who develop them.
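The two bit-rate figures quoted in items c) and f) can be put in context by comparing them with the uncompressed video rate at the same frame size. The following is a rough sketch, not a measurement. It assumes 4:2:0 sampling at 8 bits per sample (12 bits per pixel) and 1 Mbit = 1,000,000 bits, neither of which is stated above, and the function names are purely illustrative. Because the quoted figures include audio, the ratios are lower bounds on the video compression actually achieved.

```python
def raw_video_bitrate(width, height, fps, bits_per_pixel=12):
    """Uncompressed video bit rate in bits per second.

    bits_per_pixel=12 is an assumption (4:2:0 sampling, 8 bits per sample).
    """
    return width * height * bits_per_pixel * fps


def compression_ratio(raw_bps, coded_mbit_per_s):
    """Ratio of the raw video rate to the quoted coded rate (1 Mbit = 10**6 bits)."""
    return raw_bps / (coded_mbit_per_s * 1_000_000)


# (name, width, height, frames per second, quoted rate in Mbit/sec including audio)
cases = [
    ("MPEG-1", 352, 240, 30, 1.9),
    ("Motion JPEG", 320, 240, 30, 5.0),
]

for name, width, height, fps, coded in cases:
    raw = raw_video_bitrate(width, height, fps)
    ratio = compression_ratio(raw, coded)
    print(f"{name}: raw {raw / 1e6:.1f} Mbit/sec, "
          f"coded {coded} Mbit/sec, ratio at least {ratio:.1f}:1")
```

Under these assumptions the ratios come out at roughly 16:1 for MPEG-1 and 5.5:1 for Motion JPEG, which illustrates why the latter "hardly represents competitive compression".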
Other licensing concerns
It's also possible that there are other issues around licensing:
a) variations in licensing depending on filed patents in various geographies
b) restrictions on usage, or fees on usage, other than the fees on implementation (e.g. usage fees on content sold for remuneration).
It is also not entirely clear whether 'implementing' HTML means only the ability to decode and display, or whether encoding is included as well. Including encoding in the equation might significantly complicate matters.
The members of the WG are engineers, not IPR experts. There is general consensus that a solution is desirable, but also that engineers are not well placed to find it:
a) they are not experts in the IPR and licensing field;
b) many of them are discouraged by their employers from reading patents or discussing IPR.
It's clear that the December workshop cannot be silent on this subject. There must be recognition of the issue and evidence of at least efforts to solve it, and preferably signs of progress.
It is probable that this is best handled in parallel with the technical work, and headed by someone 'technically neutral' and qualified, such as W3C technical and legal staff. A good start would be to:
a) examine the declaration, licensing, and patent expiry situation for various codecs;
b) contact the licensing authorities for various codecs to determine their level of interest and flexibility, and possibly invite them to the December workshop;
c) analyze the open-source codecs for their risk level, and possibly seek statements from patent owners if that is deemed prudent.