© 2003 - 2005 Sipura Technology, Inc
Proprietary (See Copyright Notice on Page 2)
The table below displays speech quality metrics associated with various audio compression
algorithms:
Algorithm   Bandwidth             Complexity     MOS Score
G.711       64 kbps               Very Low       4.5
G.726       16, 24, 32, 40 kbps   Low            4.1 (32 kbps)
G.729a      8 kbps                Low - Medium   4
G.729       8 kbps                Medium         4
G.723.1     6.3, 5.3 kbps         High           3.8
Please note: The SPA supports all the above voice coding algorithms.
Several factors that contribute to Voice Quality are described below.
Audio compression algorithm – Speech signals are sampled, quantized and compressed before they
are packetized and transmitted to the other end. For IP Telephony, speech signals are usually
sampled at 8000 samples per second with 12-16 bits per sample. The compression algorithm plays a
large role in determining the Voice Quality of the reconstructed speech signal at the other end. The
SPA supports the most popular audio compression algorithms for IP Telephony: G.711 a-law and
µ-law, G.726, G.729a and G.723.1.
The encoder and decoder pair in a compression algorithm is known as a codec. The compression
ratio of a codec is expressed in terms of the bit rate of the compressed speech. The lower the bit rate,
the smaller the bandwidth required to transmit the audio packets. Voice Quality usually decreases as
the bit rate decreases, however. At a given bit rate, Voice Quality is usually higher with a more
complex codec.
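As a rough illustration of how codec bit rate translates into transmission bandwidth, the sketch below computes the speech payload per packet and the resulting bandwidth at the IP layer. The 20 ms packet interval and the 40-byte IPv4/UDP/RTP header (20 + 8 + 12 bytes) are assumptions chosen for illustration, not SPA settings taken from this guide, and link-layer overhead is ignored.

```python
# Assumed header size for illustration: IPv4 + UDP + RTP, no extensions.
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12


def payload_bytes(bit_rate_bps, ptime_ms):
    """Bytes of compressed speech carried in one packet."""
    return bit_rate_bps * ptime_ms / 1000 / 8


def ip_bandwidth_bps(bit_rate_bps, ptime_ms):
    """Bandwidth per direction at the IP layer, including header overhead."""
    packets_per_second = 1000 / ptime_ms
    packet_bytes = payload_bytes(bit_rate_bps, ptime_ms) + IP_UDP_RTP_HEADER_BYTES
    return packet_bytes * 8 * packets_per_second


if __name__ == "__main__":
    # Codec bit rates are taken from the table above; 20 ms is an assumption.
    for name, rate in [("G.711", 64000), ("G.726 (32 kbps)", 32000), ("G.729a", 8000)]:
        print(name, payload_bytes(rate, 20), ip_bandwidth_bps(rate, 20))
```

Under these assumptions G.711 needs 80 kbps per direction at the IP layer and G.729a needs 24 kbps, so packet header overhead narrows the 8:1 difference in codec bit rate to roughly 3:1 on the wire.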
Silence Suppression – The SPA applies silence suppression so that silence packets are not sent to
the other end, which conserves transmission bandwidth. Instead, a noise level measurement can be
sent periodically during silence-suppressed intervals so that the other end can use a comfort noise
generator (CNG) to produce artificial comfort noise that mimics the background noise at the
sending end.
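The idea can be sketched with a simple energy detector. This is only an illustration, not the SPA's algorithm: the threshold, the update interval and the function names are hypothetical, and practical detectors add refinements such as hangover periods that are omitted here.

```python
import math

FULL_SCALE = 32768.0           # 16-bit linear PCM
SILENCE_THRESHOLD_DBOV = -50   # hypothetical threshold
NOISE_UPDATE_INTERVAL = 10     # hypothetical: one noise update per 10 silent frames


def level_dbov(frame):
    """RMS level of a frame relative to digital full scale, in dB."""
    if not frame:
        return -127.0
    mean_square = sum(s * s for s in frame) / len(frame)
    if mean_square == 0:
        return -127.0
    return max(-127.0, 10 * math.log10(mean_square / FULL_SCALE ** 2))


def classify_frames(frames):
    """Yield ("speech", frame), ("noise_level", dB) or ("skip", None) per frame."""
    silent_run = 0
    for frame in frames:
        level = level_dbov(frame)
        if level >= SILENCE_THRESHOLD_DBOV:
            silent_run = 0
            yield ("speech", frame)
        else:
            # The first silent frame and every Nth one after it carry a
            # noise level measurement; the rest send nothing.
            if silent_run % NOISE_UPDATE_INTERVAL == 0:
                yield ("noise_level", round(level))
            else:
                yield ("skip", None)
            silent_run += 1
```

Only frames tagged "speech" would be encoded and sent as audio; a "noise_level" result stands in for the periodic measurement that lets the far end drive its comfort noise generator.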
Packet Loss – Audio packets are transported by UDP, which does not guarantee delivery. Packets
may be lost or contain errors, which can lead to audio sample drop-outs and distortions and lower
the perceived Voice Quality. The SPA applies an error concealment algorithm to alleviate the effect
of packet loss.
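This guide does not describe the SPA's concealment algorithm. As a generic illustration of the technique, the sketch below replaces each lost frame with a copy of the last good frame, attenuated further for each consecutive loss so that long gaps fade out instead of buzzing. The frame size and attenuation factor are assumptions.

```python
FRAME_SAMPLES = 160   # assumed: 20 ms at 8000 samples per second
ATTENUATION = 0.5     # hypothetical gain applied per consecutive lost frame


def conceal(frames):
    """Replace lost frames (None) with a faded copy of the last good frame."""
    output = []
    last_good = [0] * FRAME_SAMPLES   # silence until the first good frame arrives
    gain = 1.0
    for frame in frames:
        if frame is not None:
            last_good = frame
            gain = 1.0
            output.append(list(frame))
        else:
            gain *= ATTENUATION
            output.append([int(s * gain) for s in last_good])
    return output
```

Codecs such as G.729a and G.723.1 include their own frame erasure handling in the decoder, so a waveform-repetition scheme like this one is most relevant to G.711.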
Network Jitter – The IP network can induce varying delay in the received packets. The RTP receiver
in the SPA keeps a reserve of samples in order to absorb the network jitter, instead of playing out all
the samples as soon as they arrive. This reserve is known as a jitter buffer. The bigger the jitter
buffer, the more jitter it can absorb, but the more delay it introduces. The jitter buffer should
therefore be kept relatively small whenever possible. If the jitter buffer is too small, however, many
late packets may be treated as lost, which lowers the Voice Quality. The SPA can dynamically adjust
the size of the jitter buffer according to the network conditions that exist during a call.
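One common way to drive such an adjustment is the running interarrival jitter estimate defined for RTP in RFC 3550. The sketch below uses that estimate (in milliseconds rather than RTP timestamp units, for readability) together with a hypothetical sizing policy. The policy, its bounds and all names are assumptions for illustration; the SPA's actual adaptation rule is not described here.

```python
class JitterEstimator:
    """Running interarrival jitter estimate in the style of RFC 3550."""

    def __init__(self):
        self.jitter = 0.0
        self._prev_transit = None

    def update(self, arrival_ms, rtp_timestamp_ms):
        """Feed one packet's arrival time and sender timestamp, both in ms."""
        transit = arrival_ms - rtp_timestamp_ms
        if self._prev_transit is not None:
            d = abs(transit - self._prev_transit)
            # Exponential smoothing with gain 1/16, as in RFC 3550.
            self.jitter += (d - self.jitter) / 16.0
        self._prev_transit = transit
        return self.jitter


def target_depth_ms(jitter_ms, min_ms=20.0, max_ms=200.0, factor=3.0):
    """Hypothetical policy: hold a few multiples of the jitter, within bounds."""
    return min(max_ms, max(min_ms, factor * jitter_ms))
```

The lower bound keeps at least one packet in reserve, and the upper bound caps the added delay when the network is very bursty.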
Echo – Impedance mismatch between the telephone and the IP Telephony gateway phone port can
lead to near-end echo. The SPA has a near-end echo canceller with at least 8 ms tail length to
compensate for impedance mismatch. The SPA also implements an echo suppressor with comfort
noise generator (CNG) so that any residual echo will not be noticeable.
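At 8000 samples per second, an 8 ms tail corresponds to 64 samples of echo path. The sketch below shows a normalized LMS (NLMS) adaptive filter of that length, a textbook approach to echo cancellation. It is not the SPA's implementation: the step size, the simulated echo path and the function names are assumptions, and there is no near-end speech or double-talk handling.

```python
import random

SAMPLE_RATE_HZ = 8000
TAIL_MS = 8
TAPS = SAMPLE_RATE_HZ * TAIL_MS // 1000   # 64 samples of echo tail


def nlms_cancel(far_end, mic, taps=TAPS, mu=0.5, eps=1e-6):
    """Subtract an adaptively estimated echo of far_end from mic."""
    weights = [0.0] * taps
    history = [0.0] * taps            # most recent far-end sample first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate
        norm = sum(h * h for h in history) + eps
        step = mu * error / norm
        weights = [w + step * h for w, h in zip(weights, history)]
        residual.append(error)
    return residual


def demo(n=3000, seed=0):
    """Return (echo energy, residual energy) over the last 500 samples."""
    rng = random.Random(seed)
    far_end = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    # Hypothetical echo path: two reflections inside the 8 ms tail.
    mic = [0.5 * (far_end[i - 10] if i >= 10 else 0.0)
           - 0.3 * (far_end[i - 25] if i >= 25 else 0.0)
           for i in range(n)]
    residual = nlms_cancel(far_end, mic)
    echo_energy = sum(v * v for v in mic[-500:])
    residual_energy = sum(v * v for v in residual[-500:])
    return echo_energy, residual_energy
```

In this idealized simulation the filter converges and the residual becomes negligible. On a real line, noise and near-end speech leave some residual echo, which is the part an echo suppressor with comfort noise is meant to mask.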