Chapter 9:
MPEG
“
The basic scheme is to predict motion from frame to frame in the temporal direction,
and then to use DCT's (discrete cosine transforms) to organize the redundancy in the
spatial directions.
The DCT's are done on 8x8 blocks, and the motion prediction is done in the
luminance (Y) channel on 16x16 blocks.
In other words, given the 16x16 block in the current frame
that you are trying to code, you look for a close match to that block in a previous or future frame
(there are backward prediction modes where later frames are sent first to allow interpolating
between frames).
The DCT coefficients (of either the actual data, or the difference between this block and
the close match) are quantized, which means that you divide them by some value to drop bits off the bottom
end. Hopefully, many of the coefficients will then end up being zero.
The quantization can change for every
"macroblock" (a macroblock is 16x16 of Y and the corresponding 8x8's in both U and V).
The results of all of this, which
include the DCT coefficients, the motion vectors, and the quantization parameters (and other stuff) is Huffman coded using
fixed tables.
The DCT coefficients have a special Huffman table that is two-dimensional in that one code specifies
a run-length of zeros and the non-zero value that ended the run. Also, the motion vectors and the DC DCT components are DPCM (subtracted from the last one) coded.
”
--Berkeley Multimedia Research Center MPEG-1 Document
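The "look for a close match" step in the passage above can be illustrated with a small block-matching search. This is only a sketch under stated assumptions: the passage does not say how closeness is measured or how the search is organized, so the sum of absolute differences (SAD) cost, the exhaustive search, the search radius of 4 pixels, and the names `sad` and `find_motion_vector` are all illustrative choices, not anything taken from MPEG-1 itself.

```python
def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between two size x size luminance blocks."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(cur[cy + dy][cx + dx] - ref[ry + dy][rx + dx])
    return total


def find_motion_vector(cur, ref, cx, cy, size=16, search=4):
    """Exhaustively search ref for the block best matching cur's block at (cx, cy).

    Returns ((mvx, mvy), cost). The vector points from the current block's
    position to the matched block's position in the reference frame.
    SAD and the search radius are assumptions made for this sketch.
    """
    height, width = len(ref), len(ref[0])
    best_vec, best_cost = (0, 0), sad(cur, ref, cx, cy, cx, cy, size)
    for mvy in range(-search, search + 1):
        for mvx in range(-search, search + 1):
            rx, ry = cx + mvx, cy + mvy
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue  # candidate block falls outside the reference frame
            cost = sad(cur, ref, cx, cy, rx, ry, size)
            if cost < best_cost:
                best_vec, best_cost = (mvx, mvy), cost
    return best_vec, best_cost


# Toy luminance frames: a textured 48x48 reference, and a current frame whose
# content is the reference moved right by 3 pixels and down by 2.
W = H = 48
ref_frame = [[(x * 7 + y * 13 + x * y) % 256 for x in range(W)] for y in range(H)]
cur_frame = [[ref_frame[max(y - 2, 0)][max(x - 3, 0)] for x in range(W)]
             for y in range(H)]

vector, cost = find_motion_vector(cur_frame, ref_frame, cx=16, cy=16)
print(vector, cost)
```

In this toy case the 16x16 block at (16, 16) in the current frame is found 3 pixels to the left and 2 pixels up in the reference, with a cost of zero, so the difference block that would be passed on to the DCT is all zeros.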
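The remaining steps in the quoted passage (8x8 DCT, quantization, run-length pairs, and DPCM of the DC terms) can be sketched in a few lines. Several details here are assumptions made for illustration: a single step size stands in for the quantization value, where a real coder may vary the divisor per coefficient; the zigzag scan is the customary ordering for reading out an 8x8 block but is not mentioned in the passage; and the function names are hypothetical. The Huffman stage is left out.

```python
import math

N = 8  # DCT block size named in the passage


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block (direct, unoptimized form)."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for v in range(N):
        for u in range(N):
            acc = 0.0
            for y in range(N):
                for x in range(N):
                    acc += (block[y][x]
                            * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                            * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[v][u] = scale(u) * scale(v) * acc
    return out


def quantize(coeffs, step):
    """Divide every coefficient by one step size and round (assumed uniform)."""
    return [[int(round(c / step)) for c in row] for row in coeffs]


def zigzag_order(n=N):
    """(row, col) positions along anti-diagonals, alternating direction."""
    order = []
    for s in range(2 * n - 1):
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        order.extend(diag if s % 2 else reversed(diag))
    return order


def run_level_pairs(values):
    """Encode a sequence as (zero_run, nonzero_value) pairs.

    Trailing zeros produce no pair; a real coder signals them separately.
    """
    pairs, run = [], 0
    for v in values:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    return pairs


def dpcm(values, start=0):
    """Replace each value with its difference from the previous one."""
    out, prev = [], start
    for v in values:
        out.append(v - prev)
        prev = v
    return out


# A smooth gradient block: most of its energy lands in a few low frequencies.
block = [[8 * x + 4 * y for x in range(N)] for y in range(N)]
quantized = quantize(dct_2d(block), step=16)
scan = [quantized[r][c] for r, c in zigzag_order()]
dc, ac = scan[0], scan[1:]
print("DC:", dc)
print("AC (run, level) pairs:", run_level_pairs(ac))
print("DPCM of DC values 50, 52, 49:", dpcm([50, 52, 49]))
```

Each (run, level) pair is the unit that the two-dimensional Huffman table in the passage would assign a single code to, and the DPCM output shows why differencing helps: neighbouring DC values tend to be close, so the differences are small numbers.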