Example Applications: examples/sfexamples/oggvorbiscodec/src/libvorbis/doc/xml/07-floor1.xml Source File

00001 <?xml version="1.0" standalone="no"?>
00002 <!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
00003                 "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [
00004 
00005 ]>
00006 
00007 <section id="vorbis-spec-floor1">
00008 <sectioninfo>
00009 <releaseinfo>
00010  $Id: 07-floor1.xml 10466 2005-11-28 00:34:44Z giles $
00011 </releaseinfo>
00012 </sectioninfo>
00013 <title>Floor type 1 setup and decode</title>
00014 
00015 <section>
00016 <title>Overview</title>
00017 
00018 <para>
00019 Vorbis floor type one uses a piecewise straight-line representation to
00020 encode a spectral envelope curve. The representation plots this curve
00021 mechanically on a linear frequency axis and a logarithmic (dB)
00022 amplitude axis. The integer plotting algorithm used is similar to
00023 Bresenham's algorithm.</para>
00024 
00025 </section>
00026 
00027 <section>
00028 <title>Floor 1 format</title>
00029 
00030 <section><title>model</title>
00031 
00032 <para>
00033 Floor type one represents a spectral curve as a series of
00034 line segments.  Synthesis constructs a floor curve using iterative
00035 prediction in a process roughly equivalent to the following simplified
00036 description:</para>
00037 
00038 <para>
00039 <itemizedlist>
00040  <listitem><simpara> the first line segment (base case) is a logical line spanning
00041 from x_0,y_0 to x_1,y_1 where in the base case x_0=0 and x_1=[n], the
00042 full range of the spectral floor to be computed.</simpara></listitem>
00043 
00044 <listitem><simpara>the induction step chooses a point x_new within an existing
00045 logical line segment and produces a y_new value at that point computed
00046 from the existing line's y value at x_new (as plotted by the line) and
00047 a difference value decoded from the bitstream packet.</simpara></listitem>
00048 
00049 <listitem><simpara>floor computation produces two new line segments, one running from
00050 x_0,y_0 to x_new,y_new and from x_new,y_new to x_1,y_1. This step is
00051 performed logically even if y_new represents no change to the
00052 amplitude value at x_new so that later refinement is additionally
00053 bounded at x_new.</simpara></listitem>
00054 
00055 <listitem><simpara>the induction step repeats, using a list of x values specified in
00056 the codec setup header at floor 1 initialization time.  Computation
00057 is completed at the end of the x value list.</simpara></listitem>
00058 
00059 </itemizedlist>
00060 </para>
00061 
00062 <para>
00063 Consider the following example, with values chosen for ease of
00064 understanding rather than representing typical configuration:</para>
00065 
00066 <para>
00067 For the below example, we assume a floor setup with an [n] of 128.
00068 The list of selected X values in increasing order is
00069 0,16,32,48,64,80,96,112 and 128.  In list order, the values interleave
00070 as 0, 128, 64, 32, 96, 16, 48, 80 and 112.  The corresponding
00071 list-order Y values as decoded from an example packet are 110, 20, -5,
00072 -45, 0, -25, -10, 30 and -10.  We compute the floor in the following
00073 way, beginning with the first line:</para>
00074 
00075 <mediaobject>
00076 <imageobject>
00077  <imagedata fileref="floor1-1.png" format="PNG"/>
00078 </imageobject>
00079 <textobject>
00080  <phrase>[graph of example floor]</phrase>
00081 </textobject>
00082 </mediaobject>
00083 
00084 <para>
00085 We now draw new logical lines to reflect the correction to new_Y, and
00086 iterate for X positions 32 and 96:</para>
00087 
00088 <mediaobject>
00089 <imageobject>
00090  <imagedata fileref="floor1-2.png" format="PNG"/> 
00091  </imageobject>
00092  <textobject>
00093   <phrase>[graph of example floor]</phrase>
00094  </textobject>
00095 </mediaobject>
00096   
00097 <para>
00098 Although the new Y value at X position 96 is unchanged, it is still
00099 used later as an endpoint for further refinement.  From here on, the
00100 pattern should be clear; we complete the floor computation as follows:</para>
00101 
00102 <mediaobject>
00103 <imageobject>
00104  <imagedata fileref="floor1-3.png" format="PNG"/> 
00105  </imageobject>
00106  <textobject>
00107   <phrase>[graph of example floor]</phrase>
00108  </textobject>
00109 </mediaobject>
00110   
00111 <mediaobject>
00112 <imageobject>
00113  <imagedata fileref="floor1-4.png" format="PNG"/> 
00114  </imageobject>
00115  <textobject>
00116   <phrase>[graph of example floor]</phrase>
00117  </textobject>
00118 </mediaobject>
00119   
00120 
00121 <para>
00122 A more efficient algorithm with carefully defined integer rounding
00123 behavior is used for actual decode, as described later.  The actual
00124 algorithm splits Y value computation and line plotting into two steps
00125 with modifications to the above algorithm to eliminate noise
00126 accumulation through integer roundoff/truncation. </para>
00127 
00128 </section>
00129 
00130 <section><title>header decode</title>
00131 
00132 <para>
00133 A list of floor X values is stored in the packet header in interleaved
00134 format (used in list order during packet decode and synthesis).  This
00135 list is split into partitions, and each partition is assigned to a
00136 partition class.  X positions 0 and [n] are implicit and do not belong
00137 to an explicit partition or partition class.</para>
00138 
00139 <para>
00140 A partition class consists of a representation vector width (the
00141 number of Y values which the partition class encodes at once), a
00142 'subclass' value representing the number of alternate entropy books
00143 the partition class may use in representing Y values, the list of
00144 [subclass] books and a master book used to encode which alternate
00145 books were chosen for representation in a given packet.  The
00146 master/subclass mechanism is meant to be used as a flexible
00147 representation cascade while still using codebooks only in a scalar
00148 context.</para>
00149 
00150 <screen>
00151 
00152   1) [floor1_partitions] = read 5 bits as unsigned integer
00153   2) [maximum_class] = -1
00154   3) iterate [i] over the range 0 ... [floor1_partitions]-1 {
00155        
00156         4) vector [floor1_partition_class_list] element [i] = read 4 bits as unsigned integer
00157 
00158      }
00159 
00160   5) [maximum_class] = largest integer scalar value in vector [floor1_partition_class_list]
00161   6) iterate [i] over the range 0 ... [maximum_class] {
00162 
00163         7) vector [floor1_class_dimensions] element [i] = read 3 bits as unsigned integer and add 1
00164         8) vector [floor1_class_subclasses] element [i] = read 2 bits as unsigned integer
00165         9) if ( vector [floor1_class_subclasses] element [i] is nonzero ) {
00166             
00167              10) vector [floor1_class_masterbooks] element [i] = read 8 bits as unsigned integer
00168            
00169            }
00170 
00171        11) iterate [j] over the range 0 ... (2 exponent [floor1_class_subclasses] element [i]) - 1  {
00172 
00173              12) array [floor1_subclass_books] element [i],[j] = 
00174                  read 8 bits as unsigned integer and subtract one
00175            }
00176       }
00177 
00178  13) [floor1_multiplier] = read 2 bits as unsigned integer and add one
00179  14) [rangebits] = read 4 bits as unsigned integer
00180  15) vector [floor1_X_list] element [0] = 0
00181  16) vector [floor1_X_list] element [1] = 2 exponent [rangebits];
00182  17) [floor1_values] = 2
00183  18) iterate [i] over the range 0 ... [floor1_partitions]-1 {
00184 
00185        19) [current_class_number] = vector [floor1_partition_class_list] element [i]
00186        20) iterate [j] over the range 0 ... ([floor1_class_dimensions] element [current_class_number])-1 {
00187              21) vector [floor1_X_list] element ([floor1_values]) = 
00188                  read [rangebits] bits as unsigned integer
00189              22) increment [floor1_values] by one
00190            }
00191      }
00192  
00193  23) done
00194 </screen>
00195 
00196 <para>
00197 An end-of-packet condition while reading any aspect of a floor 1
00198 configuration during setup renders a stream undecodable.  In
00199 addition, a <varname>[floor1_class_masterbooks]</varname> or
00200 <varname>[floor1_subclass_books]</varname> scalar element greater than the
00201 highest numbered codebook configured in this stream is an error
00202 condition that renders the stream undecodable.</para>
00203 
00204 <section id="vorbis-spec-floor1-decode">
00205 <title>packet decode</title>
00206 
00207 <para>
00208 Packet decode begins by checking the <varname>[nonzero]</varname> flag:</para>
00209 
00210 <screen>
00211   1) [nonzero] = read 1 bit as boolean
00212 </screen>
00213 
00214 <para>
00215 If <varname>[nonzero]</varname> is unset, that indicates this channel contained
00216 no audio energy in this frame.  Decode immediately returns a status
00217 indicating this floor curve (and thus this channel) is unused this
00218 frame.  (A return status of 'unused' is different from decoding a
00219 floor that has all points set to minimum representation amplitude,
00220 which happens to be approximately -140dB).
00221 </para>
00222 
00223 <para>
00224 Assuming <varname>[nonzero]</varname> is set, decode proceeds as follows:</para>
00225 
00226 <screen>
00227   1) [range] = vector { 256, 128, 86, 64 } element ([floor1_multiplier]-1)
00228   2) vector [floor1_Y] element [0] = read <link linkend="vorbis-spec-ilog">ilog</link>([range]-1) bits as unsigned integer
00229   3) vector [floor1_Y] element [1] = read <link linkend="vorbis-spec-ilog">ilog</link>([range]-1) bits as unsigned integer
00230   4) [offset] = 2;
00231   5) iterate [i] over the range 0 ... [floor1_partitions]-1 {
00232 
00233        6) [class] = vector [floor1_partition_class]  element [i]
00234        7) [cdim]  = vector [floor1_class_dimensions] element [class]
00235        8) [cbits] = vector [floor1_class_subclasses] element [class]
00236        9) [csub]  = (2 exponent [cbits])-1
00237       10) [cval]  = 0
00238       11) if ( [cbits] is greater than zero ) {
00239  
00240              12) [cval] = read from packet using codebook number
00241                  (vector [floor1_class_masterbooks] element [class]) in scalar context
00242           }
00243       
00244       13) iterate [j] over the range 0 ... [cdim]-1 {
00245        
00246              14) [book] = array [floor1_subclass_books] element [class],([cval] bitwise AND [csub])
00247              15) [cval] = [cval] right shifted [cbits] bits
00248              16) if ( [book] is not less than zero ) {
00249              
00250                    17) vector [floor1_Y] element ([j]+[offset]) = read from packet using codebook 
00251                        [book] in scalar context
00252 
00253                  } else [book] is less than zero {
00254 
00255                    18) vector [floor1_Y] element ([j]+[offset]) = 0
00256 
00257                  }
00258           }
00259              
00260       19) [offset] = [offset] + [cdim]
00261          
00262      }
00263   
00264  20) done
00265 </screen>
00266 
00267 <para>
00268 An end-of-packet condition during curve decode should be considered a
00269 nominal occurrence; if end-of-packet is reached during any read
00270 operation above, floor decode is to return 'unused' status as if the
00271 <varname>[nonzero]</varname> flag had been unset at the beginning of decode.
00272 </para>
00273 
00274 <para>
00275 Vector <varname>[floor1_Y]</varname> contains the values from packet decode
00276 needed for floor 1 synthesis.</para>
00277 
00278 </section>
00279 
00280 <section id="vorbis-spec-floor1-synth">
00281 <title>curve computation</title>
00282 
00283 <para>
00284 Curve computation is split into two logical steps; the first step
00285 derives final Y amplitude values from the encoded, wrapped difference
00286 values taken from the bitstream.  The second step plots the curve
00287 lines.  Also, although zero-difference values are used in the
00288 iterative prediction to find final Y values, these points are
00289 conditionally skipped during final line computation in step two.
00290 Skipping zero-difference values allows a smoother line fit.  </para>
00291 
00292 <para>
00293 Although some aspects of the below algorithm look like inconsequential
00294 optimizations, implementors are warned to follow the details closely.
00295 Deviation from implementing a strictly equivalent algorithm can result
00296 in serious decoding errors.</para>
00297 
00298 <section>
00299 <title>step 1: amplitude value synthesis</title>
00300 
00301 <para>
00302 Unwrap the always-positive-or-zero values read from the packet into
00303 +/- difference values, then apply to line prediction.</para>
00304 
00305 <screen>
00306   1) [range] = vector { 256, 128, 86, 64 } element ([floor1_multiplier]-1)
00307   2) vector [floor1_step2_flag] element [0] = set
00308   3) vector [floor1_step2_flag] element [1] = set
00309   4) vector [floor1_final_Y] element [0] = vector [floor1_Y] element [0]
00310   5) vector [floor1_final_Y] element [1] = vector [floor1_Y] element [1]
00311   6) iterate [i] over the range 2 ... [floor1_values]-1 {
00312     
00313        7) [low_neighbor_offset] = <link linkend="vorbis-spec-low_neighbor">low_neighbor</link>([floor1_X_list],[i])
00314        8) [high_neighbor_offset] = <link linkend="vorbis-spec-high_neighbor">high_neighbor</link>([floor1_X_list],[i])
00315 
00316        9) [predicted] = <link linkend="vorbis-spec-render_point">render_point</link>( vector [floor1_X_list] element [low_neighbor_offset],
00317                                       vector [floor1_final_Y] element [low_neighbor_offset],
00318                                       vector [floor1_X_list] element [high_neighbor_offset],
00319                                       vector [floor1_final_Y] element [high_neighbor_offset],
00320                                       vector [floor1_X_list] element [i] )
00321 
00322       10) [val] = vector [floor1_Y] element [i]
00323       11) [highroom] = [range] - [predicted]
00324       12) [lowroom]  = [predicted]
00325       13) if ( [highroom] is less than [lowroom] ) {
00326 
00327             14) [room] = [highroom] * 2
00328          
00329           } else [highroom] is not less than [lowroom] {
00330                       
00331             15) [room] = [lowroom] * 2
00332         
00333           }
00334 
00335       16) if ( [val] is nonzero ) {
00336 
00337             17) vector [floor1_step2_flag] element [low_neighbor_offset] = set
00338             18) vector [floor1_step2_flag] element [high_neighbor_offset] = set
00339             19) vector [floor1_step2_flag] element [i] = set
00340             20) if ( [val] is greater than or equal to [room] ) {
00341  
00342                   21) if ( [highroom] is greater than [lowroom] ) {
00343 
00344                         22) vector [floor1_final_Y] element [i] = [val] - [lowroom] + [predicted]
00345                      
00346                       } else [highroom] is not greater than [lowroom] {
00347               
00348                         23) vector [floor1_final_Y] element [i] = [predicted] - [val] + [highroom] - 1
00349                    
00350                       }
00351                
00352                 } else [val] is less than [room] {
00353                  
00354                   24) if ([val] is odd) {
00355                  
00356                         25) vector [floor1_final_Y] element [i] = 
00357                             [predicted] - (([val] + 1) divided by  2 using integer division)
00358 
00359                       } else [val] is even {
00360 
00361                         26) vector [floor1_final_Y] element [i] = 
00362                             [predicted] + ([val] / 2 using integer division)
00363                           
00364                       }
00365 
00366                 }      
00367 
00368           } else [val] is zero {
00369 
00370             27) vector [floor1_step2_flag] element [i] = unset
00371             28) vector [floor1_final_Y] element [i] = [predicted]
00372 
00373           }
00374 
00375      }
00376 
00377  29) done
00378 
00379 </screen>
00380 
00381 </section>
00382 
00383 <section>
00384 <title>step 2: curve synthesis</title>
00385 
00386 <para>
00387 Curve synthesis generates a return vector <varname>[floor]</varname> of length
00388 <varname>[n]</varname> (where <varname>[n]</varname> is provided by the decode process
00389 calling to floor decode).  Floor 1 curve synthesis makes use of the
00390 <varname>[floor1_X_list]</varname>, <varname>[floor1_final_Y]</varname> and
00391 <varname>[floor1_step2_flag]</varname> vectors, as well as [floor1_multiplier]
00392 and [floor1_values] values.</para>
00393 
00394 <para>
00395 Decode begins by sorting the scalars from vectors
00396 <varname>[floor1_X_list]</varname>, <varname>[floor1_final_Y]</varname> and
00397 <varname>[floor1_step2_flag]</varname> together into new vectors
00398 <varname>[floor1_X_list]'</varname>, <varname>[floor1_final_Y]'</varname> and
00399 <varname>[floor1_step2_flag]'</varname> according to ascending sort order of the
00400 values in <varname>[floor1_X_list]</varname>.  That is, sort the values of
00401 <varname>[floor1_X_list]</varname> and then apply the same permutation to
00402 elements of the other two vectors so that the X, Y and step2_flag
00403 values still match.</para>
00404 
00405 <para>
00406 Then compute the final curve in one pass:</para>
00407 
00408 <screen>
00409   1) [hx] = 0
00410   2) [lx] = 0
00411   3) [ly] = vector [floor1_final_Y]' element [0] * [floor1_multiplier]
00412   4) iterate [i] over the range 1 ... [floor1_values]-1 {
00413 
00414        5) if ( [floor1_step2_flag]' element [i] is set ) {
00415 
00416              6) [hy] = [floor1_final_Y]' element [i] * [floor1_multiplier]
00417              7) [hx] = [floor1_X_list]' element [i]
00418              8) <link linkend="vorbis-spec-render_line">render_line</link>( [lx], [ly], [hx], [hy], [floor] )
00419              9) [lx] = [hx]
00420             10) [ly] = [hy]
00421           }
00422      }
00423  
00424  11) if ( [hx] is less than [n] ) {
00425 
00426         12) <link linkend="vorbis-spec-render_line">render_line</link>( [hx], [hy], [n], [hy], [floor] )
00427 
00428      }
00429 
00430  13) if ( [hx] is greater than [n] ) {
00431 
00432             14) truncate vector [floor] to [n] elements
00433 
00434      }
00435  
00436  15) for each scalar in vector [floor], perform a lookup substitution using 
00437      the scalar value from [floor] as an offset into the vector <link linkend="vorbis-spec-floor1_inverse_dB_table">[floor1_inverse_dB_static_table]</link>
00438 
00439  16) done
00440 
00441 </screen>
00442 
00443 </section>
00444 
00445 </section>
00446 
00447 </section>
00448 </section>
00449 </section>
00450