HTTP Live Streaming (HLS) allows you to stream live and on- demand video and audio to an iPhone, iPad, or iPod Touch. HLS is similar to Smooth Streaming in Microsoft’s Silverlight platform architecture, and can be thought of as a successor to both RTSP and HTTP Progressive Download (HTTP PD), although both of those video options serve a purpose and likely won’t be going away anytime soon.
HLS was originally unveiled by Apple with the introduction of the iPhone 3.0 in mid-2009. Prior to the iPhone 3, no streaming protocols were supported natively on the iPhone, leaving developers to wonder what Apple had in mind for native streaming support. In May of last year, Apple proposed HLS as a standard to the IETF, and the draft is now in its fourth iteration.
As an adaptive streaming protocol, HLS has several advantages that I’ve enumerated in a previous post. Those advantages include multiple bit rate encoding for different devices, HTTP delivery, and segmented stream chunks suitable for delivery of live streams over widely available HTTP CDN infrastructure.
How HLS works
HLS lets you send streaming video and audio to any supported Apple product, including Macs with a Safari browser. HLS works by segmenting video streams into 10-second chunks; the chunks are stored using a standard MPEG-2 transport stream file format. Optionally, chunks are created using several bit rates, allowing a client to dynamically switch between different bit rates depending on network conditions.
How does a stream get into HLS format? There are three main steps:
- An input stream is encoded/transcoded. The input can be a satellite feed or any other typical input. The video and audio source is encoded (or transcoded) to an MPEG-2 transport stream container, with H.264 video and AAC audio, which are the codecs Apple devices currently support.
- Output profiles are created. Typically a single input stream will be transcoded to several output resolutions/bit rates, depending on the types of client devices that the stream is destined for. For example, an input stream of H.264/AAC at 7 Mbps could be transcoded to four different profiles with bit rates of 1.5Mbps, 750K, 500K, and 200K. These would be suitable for devices and network conditions ranging from high-end to low-end, such as an iPad, iPhone 4, iPhone 3, and a low bit rate version for bad network conditions.
- The streams are segmented. The streams contained within the profiles all need to be segmented and made available for delivery to an origin web server or directly to a client device over HTTP. The software or hardware device that does the segmenting (the segmenter) also creates an index file which is used to keep track of the individual video/audio segments.
Optionally, the segmenter might also encrypt the stream (each individual chunk) and create a key file.
What does the client do?
The client downloads the index file via a URL that identifies the stream. The index file tells the client where to get the stream chunks (each with its own URL). For a given stream, the client then fetches each stream chunk in order. Once the client has enough of the stream downloaded and buffered, it displays it to the user. If encryption is used, the URLs for the decryption keys are also given in the index file.
If multiple profiles (that is, bit rates and resolutions) are available, the index file is different in that it contains a specially tagged list of variant index files for the different stream profiles. In that case, the client downloads the primary index file first, and then downloads the index file for the bit rate it wants to play back. The bit rates and resolutions of the variant streams are specified in the main index file, but precise handling of the variants offered is left up to the client implementation.
Typical playback latency for HLS is around 30 seconds. This is caused by the size of the chunks (10 seconds) and the need for the client to buffer a number of chunks before it starts displaying the video.
An odd curiosity about HLS is that it doesn’t make use of Apple’s Quicktime MOV file format, which is the basis for the ISO MPEG file format. Apple thought that the TS format was more widely used and better understood by the broadcasters who would ultimately use HLS, and so they focused on the MPEG-2 TS format. Ironically, Microsoft’s Smooth Streaming protocol does make use of ISO MPEG files (MPEG-4 Part 12).