00001 <?xml version="1.0" standalone="no"?>
00002 <!DOCTYPE appendix PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
00003                 "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [
00005 ]>
00007 <appendix id="vorbis-over-ogg">
00008 <appendixinfo>
00009  <releaseinfo>
00010   $Id: a1-encapsulation_ogg.xml 7186 2004-07-20 07:19:25Z xiphmont $
00011  </releaseinfo>
00012 </appendixinfo>
00013 <title>Embedding Vorbis into an Ogg stream</title>
00015 <section>
00016 <title>Overview</title>
00018 <para>
00019 This document describes using Ogg logical and physical transport
00020 streams to encapsulate Vorbis compressed audio packet data into file
00021 form.</para>
00023 <para>
00024 The <xref linkend="vorbis-spec-intro"/> provides an overview of the construction
00025 of Vorbis audio packets.</para>
00027 <para>
00028 The <ulink url="oggstream.html">Ogg
00029 bitstream overview</ulink> and <ulink url="framing.html">Ogg logical
00030 bitstream and framing spec</ulink> provide detailed descriptions of Ogg
00031 transport streams. This specification document assumes a working
00032 knowledge of the concepts covered in these named background
00033 documents.  Please read them first.</para>
00035 <section><title>Restrictions</title>
00037 <para>
00038 The Ogg/Vorbis I specification currently dictates that Ogg/Vorbis
00039 streams use Ogg transport streams in degenerate, unmultiplexed
00040 form only. That is:
00042 <itemizedlist>
00043  <listitem><simpara>
00044   A meta-headerless Ogg file encapsulates the Vorbis I packets
00045  </simpara></listitem>
00046  <listitem><simpara>
00047   The Ogg stream may be chained, i.e. contain multiple, contiguous logical streams (links).
00048  </simpara></listitem>
00049  <listitem><simpara>
00050   The Ogg stream must be unmultiplexed (only one stream, a Vorbis audio stream, per link)
00051  </simpara></listitem>
00052 </itemizedlist>
00053 </para>
00055 <para>
00056 This does not mean that Vorbis cannot currently be multiplexed
00057 with other media types into a multi-stream Ogg file.  At the
00058 time this document was written, Ogg was becoming a popular container
00059 for low-bitrate movies consisting of DivX video and Vorbis audio.
00060 However, a 'Vorbis I audio file' is taken to imply Vorbis audio
00061 existing alone within a degenerate Ogg stream.  A compliant 'Vorbis
00062 audio player' is not required to implement Ogg support beyond the
00063 specific support of Vorbis within a degenerate Ogg stream (naturally,
00064 application authors are encouraged to support full multiplexed Ogg
00065 handling).
00066 </para>
00068 </section>
00070 <section><title>MIME type</title>
00072 <para>
00073 The correct MIME type of any Ogg file is <literal>application/ogg</literal>.
00074 However, if a file is a Vorbis I audio file (which implies a
00075 degenerate Ogg stream including only unmultiplexed Vorbis audio), the
00076 MIME type <literal>audio/x-vorbis</literal> is also allowed.</para>
00078 </section>
00080 </section>
00082 <section>
00083 <title>Encapsulation</title>
00085 <para>
00086 Ogg encapsulation of a Vorbis packet stream is straightforward.</para>
00088 <itemizedlist>
00090 <listitem><simpara>
00091   The first Vorbis packet (the identification header), which
00092   uniquely identifies a stream as Vorbis audio, is placed alone in the
00093   first page of the logical Ogg stream.  This results in a first Ogg
00094   page of exactly 58 bytes at the very beginning of the logical stream.
00095 </simpara></listitem>
00097 <listitem><simpara>
00098   This first page is marked 'beginning of stream' in the page flags.
00099 </simpara></listitem>
00101 <listitem><simpara>
00102   The second and third Vorbis packets (comment and setup
00103   headers) may span one or more pages beginning on the second page of
00104   the logical stream.  However many pages they span, the third header
00105   packet finishes the page on which it ends.  The next (first audio) packet
00106   must begin on a fresh page.
00107 </simpara></listitem>
00109 <listitem><simpara>
00110   The granule position of these first pages containing only headers is zero.
00111 </simpara></listitem>
00113 <listitem><simpara>
00114   The first audio packet of the logical stream begins a fresh Ogg page.
00115 </simpara></listitem>
00117 <listitem><simpara>
00118   Packets are placed into Ogg pages in order until the end of stream.
00119 </simpara></listitem>
00121 <listitem><simpara>
00122   The last page is marked 'end of stream' in the page flags.
00123 </simpara></listitem>
00125 <listitem><simpara>
00126   Vorbis packets may span page boundaries.
00127 </simpara></listitem>
00129 <listitem><simpara>
00130   The granule position of pages containing Vorbis audio is in units
00131   of PCM audio samples (per channel; a stereo stream's granule position
00132   does not increment at twice the speed of a mono stream).
00133 </simpara></listitem>
00135 <listitem><simpara>
00136   The granule position of a page represents the end PCM sample
00137   position of the last packet <emphasis>completed</emphasis> on that page.
00138   A page that is entirely spanned by a single packet (that completes on a
00139   subsequent page) has no granule position, and the granule position is
00140   set to '-1'.
00141 </simpara></listitem>
00143 <listitem>
00144   <simpara>
00145     The granule (PCM) position of the first page need not indicate
00146     that the stream started at position zero.  Although the granule
00147     position belongs to the last completed packet on the page and a 
00148     valid granule position must be positive, by
00149     inference it may indicate that the PCM position of the beginning
00150     of audio is positive or negative.
00151   </simpara>
00153   <itemizedlist>
00154     <listitem><simpara>
00155         A positive starting value simply indicates that this stream begins at
00156         some positive time offset, potentially within a larger
00157         program. This is a common case when connecting to the middle
00158         of a broadcast stream.
00159     </simpara></listitem>
00160     <listitem><simpara>
00161         A negative value indicates that
00162         output samples preceding time zero should be discarded during
00163         decoding; this technique is used to allow sample-granularity
00164         editing of the stream start time of already-encoded Vorbis
00165         streams.  The number of samples to be discarded must not exceed 
00166         the overlap-add span of the first two audio packets.
00167     </simpara></listitem>
00168   </itemizedlist>
00170   <simpara>
00171     In both of these cases in which the initial audio PCM starting
00172     offset is nonzero, the second finished audio packet must flush the
00173     page on which it appears and the third packet must begin a fresh page.
00174     This allows the decoder to always be able to perform PCM position
00175     adjustments before needing to return any PCM data from synthesis, 
00176     resulting in correct positioning information without any additional
00177     seeking logic.
00178   </simpara>
00180   <note><simpara>
00181     Failure to do so should, at worst, cause a
00182     decoder implementation to return incorrect positioning information
00183     for seeking operations at the very beginning of the stream.
00184   </simpara></note>
00185 </listitem>
00187 <listitem><simpara>
00188   A granule position on the final page in a stream that indicates
00189   less audio data than the final packet would normally return is used to
00190   end the stream on other than even frame boundaries.  The difference
00191   between the actual available data returned and the declared amount
00192   indicates how many trailing samples to discard from the decoding
00193   process.
00194  </simpara></listitem>
00195 </itemizedlist>
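<para>
The size and granule position rules above can be illustrated with a
small Python sketch.  It is not part of the specification and does not
use any libogg or libvorbis API; every function and parameter name is
hypothetical.  It assumes the 27-byte fixed Ogg page header described
in the Ogg framing documents and a 30-byte Vorbis I identification
header packet, and it assumes the caller already knows how many PCM
samples the packets completed on a page return from synthesis.
</para>
<programlisting><![CDATA[
```python
# Illustrative sketch only; all names here are hypothetical helpers,
# not functions from libogg or libvorbis.

OGG_PAGE_HEADER_BYTES = 27    # fixed-size part of an Ogg page header
VORBIS_ID_HEADER_BYTES = 30   # Vorbis I identification header packet


def single_packet_page_size(packet_bytes):
    """Size in bytes of an Ogg page holding exactly one whole packet.

    Assumes the packet is small enough to fit in a single page.
    """
    # Each lacing value covers up to 255 bytes of packet data; a packet
    # whose length is a multiple of 255 needs a terminating value of 0.
    lacing_values = packet_bytes // 255 + 1
    return OGG_PAGE_HEADER_BYTES + lacing_values + packet_bytes


def starting_pcm_offset(first_audio_page_granule, samples_completed_on_page):
    """PCM position at which audio begins, inferred from the first audio page."""
    return first_audio_page_granule - samples_completed_on_page


def leading_samples_to_discard(first_audio_page_granule, samples_completed_on_page):
    """Samples preceding time zero that a decoder would drop (0 if none)."""
    start = starting_pcm_offset(first_audio_page_granule, samples_completed_on_page)
    return max(0, -start)


def trailing_samples_to_discard(previous_granule, final_granule, samples_decoded):
    """Samples to drop at end of stream.

    previous_granule: PCM position reached before the final page's packets.
    final_granule:    granule position declared on the final page.
    samples_decoded:  samples the final page's packets actually return.
    """
    declared = final_granule - previous_granule
    return max(0, samples_decoded - declared)
```
]]></programlisting>
<para>
For example, under these assumptions
<literal>single_packet_page_size(VORBIS_ID_HEADER_BYTES)</literal>
gives the 58-byte first page (27 header bytes, one lacing value and
30 packet bytes), and a final page that declares 500 samples beyond the
previous position while its packets decode to 1024 samples leaves 524
trailing samples to discard.
</para>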
00197 </section>
00199 </appendix>
00201 <!-- end appendix on Vorbis encapsulation in Ogg -->
