![Analog Devices SHARC ADSP-21367 Manual Download Page 4](http://html1.mh-extra.com/html/analog-devices/sharc-adsp-21367/sharc-adsp-21367_manual_2939742004.webp)
Rev. D
|
Page 4 of 56
|
November 2008
ADSP-21367/ADSP-21368/ADSP-21369
GENERAL DESCRIPTION
The ADSP-21367/ADSP-21368/ADSP-21369 SHARC
®
proces-
sors are members of the SIMD SHARC family of DSPs that
feature Analog Devices’ Super Harvard Architecture. These pro-
cessors are source code-compatible with the ADSP-2126x and
ADSP-2116x DSPs as well as with first generation ADSP-2106x
SHARC processors in SISD (single-instruction, single-data)
mode. The processors are 32-bit/40-bit floating-point proces-
sors optimized for high performance automotive audio
applications with its large on-chip SRAM, mask programmable
ROM, multiple internal buses to eliminate I/O bottlenecks, and
an innovative digital applications interface (DAI).
As shown in the functional block diagram
, the
processors use two computational units to deliver a significant
performance increase over the previous SHARC processors on a
range of DSP algorithms. Fabricated in a state-of-the-art, high
speed, CMOS process, the ADSP-21367/ADSP-21368/
ADSP-21369 processors achieve an instruction cycle time of up
to 2.5 ns at 400 MHz. With its SIMD computational hardware,
the processors can perform 2.4 GFLOPS running at 400 MHz.
shows performance benchmarks for these devices.
The ADSP-21367/ADSP-21368/ADSP-21369 processors con-
tinue SHARC’s industry-leading standards of integration for
DSPs, combining a high performance 32-bit DSP core with inte-
grated, on-chip system features.
The block diagram of the ADSP-21368
following architectural features:
• Two processing elements, each of which comprises an
ALU, multiplier, shifter, and data register file
• Data address generators (DAG1, DAG2)
• Program sequencer with instruction cache
• PM and DM buses capable of supporting four 32-bit data
transfers between memory and the core at every core pro-
cessor cycle
• Three programmable interval timers with PWM genera-
tion, PWM capture/pulse width measurement, and
external event counter capabilities
• On-chip SRAM (2M bit)
• On-chip mask-programmable ROM (6M bit)
• JTAG test access port
The block diagram of the ADSP-21368
the following architectural features:
• DMA controller
• Eight full-duplex serial ports
• Digital applications interface that includes four precision
clock generators (PCG), an input data port (IDP), an
S/PDIF receiver/transmitter, eight channels asynchronous
sample rate converters, eight serial ports, a 16-bit parallel
input port (PDAP), a flexible signal routing unit (DAI
SRU).
• Digital peripheral interface that includes three timers, a 2-
wire interface, two UARTs, two serial peripheral interfaces
(SPI), and a flexible signal routing unit (DPI SRU).
SHARC FAMILY CORE ARCHITECTURE
The ADSP-21367/ADSP-21368/ADSP-21369 are code compati-
ble at the assembly level with the ADSP-2126x, ADSP-21160,
and ADSP-21161, and with the first generation ADSP-2106x
SHARC processors. The ADSP-21367/ADSP-21368/
ADSP-21369 processors share architectural features with the
ADSP-2126x and ADSP-2116x SIMD SHARC processors, as
detailed in the following sections.
SIMD Computational Engine
The processors contain two computational processing elements
that operate as a single-instruction, multiple-data (SIMD)
engine. The processing elements are referred to as PEX and PEY
and each contains an ALU, multiplier, shifter, and register file.
PEX is always active, and PEY may be enabled by setting the
PEYEN mode bit in the MODE1 register. When this mode is
enabled, the same instruction is executed in both processing ele-
ments, but each processing element operates on different data.
This architecture is efficient at executing math intensive DSP
algorithms.
Entering SIMD mode also has an effect on the way data is trans-
ferred between memory and the processing elements. When in
SIMD mode, twice the data bandwidth is required to sustain
computational operation in the processing elements. Because of
this requirement, entering SIMD mode also doubles the band-
width between memory and the processing elements. When
using the DAGs to transfer data in SIMD mode, two data values
are transferred with each access of memory or the register file.
Independent, Parallel Computation Units
Within each processing element is a set of computational units.
The computational units consist of an arithmetic/logic unit
(ALU), multiplier, and shifter. These units perform all opera-
tions in a single cycle. The three units within each processing
element are arranged in parallel, maximizing computational
throughput. Single multifunction instructions execute parallel
Table 1. Processor Benchmarks (at 400 MHz)
Benchmark Algorithm
Speed
(at 400 MHz)
1024 Point Complex FFT (Radix 4, with reversal) 23.2
μ
s
FIR Filter (per tap)
1
1
Assumes two files in multichannel SIMD mode.
1.25 ns
IIR Filter (per biquad)
1
5.0 ns
Matrix Multiply (pipelined)
[3×3] × [3×1]
[4×4] × [4×1]
11.25 ns
20.0 ns
Divide (y/x)
8.75 ns
Inverse Square Root
13.5 ns