AMBE-2000™ Vocoder Chip Users Manual
Version 4.92, June, 08
DVSI Confidential Proprietary, Subject to Change
The voice coder interface requires the A-to-D and D-to-A converters to operate at an 8 kHz sampling rate (i.e. a sampling
period of 125 microseconds) at the digital input/output reference points. This requirement necessitates the use of analog filters
at both the input and output to eliminate any frequency components above the Nyquist frequency (4 kHz). The recommended
input filter mask is shown in Figure 2 - C, and the recommended output filter mask is shown in Figure 2 - D. For proper
operation, the frequency response of the front-end input and output filters should lie within the shaded zone of the respective figure.
Figure 2 - D. Front End Output Filter Mask
(Figure not reproduced. Horizontal axis: freq (Hz), marked at 0, 200, 3000, 3400, 4000, 4600, and 8000; gain levels marked at +2, -1, -2, -18, -40, and -60 dB.)
This document assumes that the A-to-D converter produces digital samples where the maximum digital input level (+3 dBm0)
is defined to be +/- 32767, and similarly, that the maximum digital output level of the D-to-A converter occurs at the same
digital level of +/- 32767. If a converter is used which does not meet these assumptions then the digital gain elements shown in
Figure 2 should be adjusted appropriately. Note that these assumptions are automatically satisfied if 16 bit linear A-to-D and
D-to-A converters are used, in which case the digital gain elements should be set to unity gain. Also note that the vocoder
requires any companding applied by the A-to-D converter (i.e. A-law or µ-law) to be removed prior to speech
encoding. Similarly, any companding used by the D-to-A converter must be applied after speech decoding.
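The companding removal and digital gain adjustment described above can be sketched as follows. This is a minimal illustration, not DVSI code: it assumes the converter emits standard G.711 µ-law bytes, and the function names (`ulaw_to_linear`, `apply_digital_gain`, `expand_ulaw_frame`) are hypothetical. The gain value needed for a particular converter depends on its full-scale level and is left as a parameter.

```python
def ulaw_to_linear(code):
    """Expand one 8-bit G.711 mu-law code word to a signed linear sample."""
    code = ~code & 0xFF                       # mu-law bytes are transmitted bit-inverted
    magnitude = ((code & 0x0F) << 3) + 0x84   # 4-bit mantissa plus the bias of 132
    magnitude <<= (code & 0x70) >> 4          # shift by the 3-bit segment number
    # After inversion, a set sign bit marks a negative sample.
    return (0x84 - magnitude) if (code & 0x80) else (magnitude - 0x84)


def apply_digital_gain(sample, gain):
    """Scale a linear sample and saturate to the +/-32767 range the manual assumes."""
    scaled = int(round(sample * gain))
    return max(-32767, min(32767, scaled))


def expand_ulaw_frame(codes, gain=1.0):
    """Remove companding from a sequence of mu-law bytes before speech encoding."""
    return [apply_digital_gain(ulaw_to_linear(c), gain) for c in codes]
```

For example, `expand_ulaw_frame(bytes([0xFF, 0x80]))` yields `[0, 32124]`, the zero code and the largest positive µ-law code. The decoder side would apply the inverse (compression) step only after speech decoding.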
An additional recommendation addresses the maximum noise level measured at the output reference points shown in
Figure 2 - B with the corresponding inputs set to zero. DVSI recommends that the noise level in each direction should not
exceed -60 dBm0 under this condition. In addition, the isolation from crosstalk (or echo) from the output to the input
should exceed 45 dB, which can be achieved by either passive (electrical and/or acoustic design) or active (echo cancellation
and/or suppression) means.
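One way to check a captured idle-channel recording against the -60 dBm0 recommendation is sketched below. It rests on an assumption not stated in the manual: that the +3 dBm0 maximum level corresponds to a full-scale sine wave with a peak of 32767, so the reference RMS is 32767/sqrt(2). The names `level_dbm0` and `noise_ok` are hypothetical.

```python
import math

FULL_SCALE_PEAK = 32767   # digital level the manual defines as the maximum (+3 dBm0)
FULL_SCALE_DBM0 = 3.0     # assumed to refer to a full-scale sine wave


def level_dbm0(samples):
    """Estimate the level of 16-bit linear samples in dBm0 from their RMS value."""
    if not samples:
        raise ValueError("need at least one sample")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0:
        return float("-inf")                      # digital silence
    ref_rms = FULL_SCALE_PEAK / math.sqrt(2)      # RMS of the assumed full-scale sine
    return FULL_SCALE_DBM0 + 20 * math.log10(rms / ref_rms)


def noise_ok(samples, limit_dbm0=-60.0):
    """Return True if the measured idle noise does not exceed the recommended limit."""
    return level_dbm0(samples) <= limit_dbm0
```

Under this assumption, -60 dBm0 works out to an RMS of roughly 16 counts on the 16-bit scale, so a recording taken with the input set to zero should stay at or below that level.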
2.2.3 Channel Interface Overview
The channel interface is meant to be flexible to allow for easy integration with the system under design. The basic hardware
unit of the interface is a serial port. The serial port can run in passive or active modes. In passive mode, all of the channel
interface control signals are inputs to the AMBE-2000™ chip. In active mode, only the TX_DATA_STRB is an output from
the AMBE-2000™ chip. All other signals are inputs.
Under normal operation, every 20 ms the encoder outputs a frame of coded bits, and the decoder must be delivered a frame
of coded bits. The data is formatted for both the encoder and the decoder. The primary purpose of the
formatting is to provide alignment information for the encoded bit stream. The data has two formats, Framed and Unframed.
Serial mode can run in either Framed or Unframed mode.
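The frame timing above can be made concrete with a little arithmetic. The sketch below assumes the 8 kHz sampling rate from the voice interface description; the helper names and the 4800 bps figure used in the usage note are illustrative assumptions, not values taken from this section.

```python
SAMPLE_RATE_HZ = 8000      # A-to-D / D-to-A sampling rate
FRAME_MS = 20              # one coded frame per 20 ms of speech

SAMPLES_PER_FRAME = SAMPLE_RATE_HZ * FRAME_MS // 1000   # speech samples per frame
FRAMES_PER_SECOND = 1000 // FRAME_MS                     # coded frames per second


def split_into_frames(pcm):
    """Split PCM samples into complete 20 ms frames plus any leftover samples."""
    n_full = len(pcm) // SAMPLES_PER_FRAME
    frames = [pcm[i * SAMPLES_PER_FRAME:(i + 1) * SAMPLES_PER_FRAME]
              for i in range(n_full)]
    remainder = pcm[n_full * SAMPLES_PER_FRAME:]
    return frames, remainder


def coded_bits_per_frame(bit_rate_bps):
    """Number of coded bits carried in each 20 ms frame at a given channel rate."""
    return bit_rate_bps * FRAME_MS // 1000
```

Each frame therefore covers 160 speech samples, and 50 frames are exchanged per second in each direction. At a hypothetical channel rate of 4800 bps, `coded_bits_per_frame(4800)` gives 96 coded bits per frame, excluding any framing overhead.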
The Framed and Unframed modes are explained in full detail in Section 4, but essentially the two formats serve
the same function: to provide positional information regarding the outgoing and incoming coded data streams. In
Framed mode, each 20 ms of output data from the encoder is preceded by a known structure (each packet corresponds to
20 ms of speech data input into the encoder). This structure also embeds some status flags, meant for local control