ISD94100 Series Technical Reference Manual, Rev1.09, Sep 9, 2019, Page 881 of 928
Figure 6.20-3 VAD Data Diagram
6.20.5.3 SINC Filter
The SINC filter supports three oversampling rate (OSR) configurations, selected by the register field
SINCOSR (VAD_SINC_CTL[11:8]): OSR48, OSR64 and OSR96. For these three options, the DMIC
bus clock (DMIC_CLK) is (Fs x 48) kHz, (Fs x 64) kHz or (Fs x 96) kHz respectively, where Fs is the
sample rate in kHz. DMIC_CLK is controlled by the VAD module when VADEN (VAD_SINCCTL[31]) is
set. The frequency of DMIC_CLK is given by the following equation.
F_DMIC_CLK = F_DMIC_MCLK / 4
where F_DMIC_CLK is the frequency of DMIC_CLK and F_DMIC_MCLK is the frequency of
DMIC_MCLK.
Equivalently, the DMIC working main clock (DMIC_MCLK) must run at F_DMIC_CLK x 4.
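The clock relationships above can be checked with a short host-side calculation. The following is a minimal sketch, not vendor code: the function name `dmic_clocks` and the 16 kHz sample rate in the usage line are assumptions chosen for illustration; only the OSR values and the divide-by-4 relationship come from the text.

```python
# Host-side helper for the SINC filter clock relationships described above.
# The function name and the example sample rate are illustrative assumptions.

SUPPORTED_OSR = (48, 64, 96)  # OSR48, OSR64, OSR96
MCLK_DIVIDER = 4              # F_DMIC_CLK = F_DMIC_MCLK / 4


def dmic_clocks(sample_rate_hz, osr):
    """Return (F_DMIC_CLK, F_DMIC_MCLK) in Hz for a sample rate and OSR."""
    if osr not in SUPPORTED_OSR:
        raise ValueError(f"unsupported OSR {osr}; expected one of {SUPPORTED_OSR}")
    dmic_clk_hz = sample_rate_hz * osr          # DMIC_CLK = Fs x OSR
    dmic_mclk_hz = dmic_clk_hz * MCLK_DIVIDER   # DMIC_MCLK = DMIC_CLK x 4
    return dmic_clk_hz, dmic_mclk_hz


# Example: Fs = 16 kHz (assumed) with OSR64.
print(dmic_clocks(16_000, 64))  # (1024000, 4096000)
```

With these assumed inputs, DMIC_CLK is 1.024 MHz, so DMIC_MCLK would need to be 4.096 MHz.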
6.20.5.4 Biquad Filter
The biquad filter is a second-order recursive linear filter with two poles and two zeros. Its transfer
function in the Z-domain is the ratio of two quadratic functions:
$$
H(z) = \frac{b_0 + b_1 z^{-1} + b_2 z^{-2}}{1 + a_1 z^{-1} + a_2 z^{-2}}
$$
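For reference, a transfer function with numerator coefficients b0, b1, b2 and denominator 1 + a1 z^-1 + a2 z^-2 corresponds to the following time-domain difference equation, where x[n] is the input sample and y[n] is the output sample. This is the standard textbook form; the internal structure the hardware uses to evaluate it is not stated here.

$$
y[n] = b_0\,x[n] + b_1\,x[n-1] + b_2\,x[n-2] - a_1\,y[n-1] - a_2\,y[n-2]
$$

Note the negative signs on the a1 and a2 terms, which follow from the plus signs in the denominator of H(z).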
Each biquad coefficient (a1, a2, b0, b1 and b2) is a 16-bit value in Sxx.13 format, where
1. S is the sign bit (1 bit)
2. xx are the integer bits (2 bits)
3. 13 is the number of fractional bits (13 bits)
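The conversion between a real-valued coefficient and its 16-bit Sxx.13 word can be sketched as follows. This sketch assumes the sign bit is the top bit of a two's-complement value, which the text does not state explicitly, so it should be confirmed against the register description before use. The function names are hypothetical.

```python
# Sxx.13 fixed-point helpers: 1 sign bit, 2 integer bits, 13 fractional bits.
# ASSUMPTION: two's-complement encoding (not confirmed by the text above).

FRACTIONAL_BITS = 13
SCALE = 1 << FRACTIONAL_BITS   # 8192 steps per unit
MIN_RAW = -(1 << 15)           # -4.0 in Sxx.13
MAX_RAW = (1 << 15) - 1        # just under +4.0 in Sxx.13


def encode_sxx13(value):
    """Quantize a real coefficient to a 16-bit Sxx.13 register word."""
    raw = round(value * SCALE)
    if not MIN_RAW <= raw <= MAX_RAW:
        raise ValueError(f"{value} is outside the Sxx.13 range [-4, 4)")
    return raw & 0xFFFF        # keep the low 16 bits (two's complement)


def decode_sxx13(word):
    """Convert a 16-bit Sxx.13 register word back to a real value."""
    raw = word - 0x10000 if word & 0x8000 else word
    return raw / SCALE


print(hex(encode_sxx13(1.0)))   # 0x2000
print(hex(encode_sxx13(-1.0)))  # 0xe000
print(decode_sxx13(0x1000))     # 0.5
```

Under this assumption the representable range is -4.0 to just under +4.0 in steps of 1/8192, so any coefficient of magnitude 4 or more would have to be rescaled before it can be stored.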
6.20.5.5 VAD Configuration
The VAD analyses the PCM data from DMIC channel 0. To use the VAD function, its parameters
must be set correctly. The first step is to set the attack times. The larger the attack time setting,
the faster the calculated energy responds, so the energy waveform follows the input more closely.
The default value of LTAT (VAD_CTL0[19:16], long term attack time) is 0x7, and the default value of
STAT (VAD_CTL0[7:0], short term attack time) is 0xCC. To calculate the energy faster, increase
these values. STAT (VAD_CTL0[7:0]) should always be larger than LTAT (VAD_CTL0[19:16]).
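The field positions and the STAT/LTAT ordering rule can be captured in a small host-side helper. This is a sketch under stated assumptions: VAD_CTL0 is treated as a 32-bit register, the other bits of the register are left untouched because their meaning is not described here, and the function names are hypothetical rather than part of any vendor driver.

```python
# Packs the attack-time fields of VAD_CTL0 and enforces STAT > LTAT.
# ASSUMPTIONS: 32-bit register width; helper names are illustrative only.

LTAT_SHIFT, LTAT_MASK = 16, 0xF   # LTAT is VAD_CTL0[19:16]
STAT_SHIFT, STAT_MASK = 0, 0xFF   # STAT is VAD_CTL0[7:0]


def attack_fields(ltat=0x7, stat=0xCC):
    """Return the LTAT and STAT bits positioned within VAD_CTL0."""
    if not 0 <= ltat <= LTAT_MASK:
        raise ValueError("LTAT must fit in 4 bits")
    if not 0 <= stat <= STAT_MASK:
        raise ValueError("STAT must fit in 8 bits")
    if stat <= ltat:
        raise ValueError("STAT should always be larger than LTAT")
    return (ltat << LTAT_SHIFT) | (stat << STAT_SHIFT)


def update_vad_ctl0(current, ltat, stat):
    """Read-modify-write: replace only the LTAT and STAT fields."""
    field_bits = (LTAT_MASK << LTAT_SHIFT) | (STAT_MASK << STAT_SHIFT)
    keep = current & ~field_bits & 0xFFFFFFFF
    return keep | attack_fields(ltat, stat)


print(hex(attack_fields()))  # 0x700cc (the default LTAT and STAT values)
```

Rejecting an invalid combination before the register write keeps the ordering rule from being violated silently during tuning.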
Next, set the energy thresholds: the short term energy threshold, the long term energy threshold
and the deviation energy threshold. The thresholds depend on the input level: the higher the
input level, the larger the thresholds should be. If a threshold is set too high, the VAD cannot
detect the voice; if it is set too low, the VAD may produce false detections.
To tune these parameters, the user can test the VAD against representative input, such as
recordings of actual voice.
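The effect of the attack and threshold settings can be explored offline before touching the hardware. The sketch below is a generic energy detector written for illustration only; it is not the ISD94100 decision logic. The `attack` argument is a hypothetical smoothing coefficient between 0 and 1, not the LTAT or STAT register encoding, and the synthetic signal stands in for a real voice recording.

```python
# Generic offline energy detector for exploring threshold trade-offs.
# NOT the device algorithm: `attack` is a hypothetical coefficient in (0, 1].


def smoothed_energy(samples, attack):
    """Track |x| with a one-pole smoother; larger attack reacts faster."""
    energy = 0.0
    track = []
    for x in samples:
        energy += attack * (abs(x) - energy)
        track.append(energy)
    return track


def count_detections(samples, attack, threshold):
    """Count samples whose smoothed energy exceeds the threshold."""
    return sum(1 for e in smoothed_energy(samples, attack) if e > threshold)


# Synthetic stand-in for a recording: background noise, a loud burst, noise.
noise = [10, -10] * 200
voice = [1000, -1000] * 200
signal = noise + voice + noise

for threshold in (5, 100, 5000):
    hits = count_detections(signal, attack=0.1, threshold=threshold)
    print(f"threshold={threshold}: {hits} samples flagged")
```

In this toy setup, a threshold below the noise level flags the background, a threshold above the burst level flags nothing, and a value in between flags the burst. The same sweep could be repeated with samples loaded from an actual voice file.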
6.20.5.6 VAD Decision Tree
Figure 6.20-4 illustrates the operation flow of VAD.