General Changing
netX 50 to netX 51/52 | Migration Guide
DOC120109MG05EN | Revision 5 | English | 2013-08 | Released | Public
2012-2013
4.4 Improved Memory Access Performance
Improvement
The netX 50 has the following disadvantages:
- The ARM966 has no Level 1 cache
- The Tightly Coupled Memories (8K/8K instruction/data) are often too small for applications
- Only 96 KByte of internal SRAM:
  - 64 KByte used for data exchange between xPEC and ARM966
  - 32 KByte used as Dual-Port Memory to exchange data between ARM966 and external host
- One combined channel for data and instructions of the ARM966 on SDRAM
As a result, the user application has to run uncached from external memory, which leads to poor
access performance (see the benchmark table below).
The netX 51/52 differs from the netX 50 as follows:
- Internal SRAM enlarged from 96 KByte to 672 KByte
- Tightly Coupled Memories removed; the two remaining TCM channels (instruction and data) are connected to the internal SRAM
- Two separate channels for data and instructions of the ARM966 on SDRAM
The advantage of the new ARM integration in the netX 51/52 is that the full internal SRAM can be
reached through the TCM channels. Furthermore, the ARM can now run accesses in parallel:
accesses can be performed on both TCM channels (e.g. instruction fetch and data store) and even
on the ARM AHB channel (e.g. peripheral access) simultaneously. Additionally, some ARM TCM
features (e.g. data buffering) give better performance than the standard AHB interface. This results
in a higher total ARM performance even though the operating frequency is reduced to 100 MHz.
The reduced operating frequency also lowers power consumption. On SDRAM, the ARM
performance benefits from the separate channels for data and instructions.
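The memory figures quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the sizes stated in this section; the variable names are illustrative and not part of any netX software.

```python
# Sizes in KByte, taken from the text of this section.
NETX50_SRAM_XPEC_EXCHANGE = 64  # data exchange between xPEC and ARM966
NETX50_SRAM_DUAL_PORT = 32      # dual-port memory towards the external host
NETX50_ITCM = 8                 # instruction TCM
NETX50_DTCM = 8                 # data TCM
NETX51_SRAM_TOTAL = 672         # internal SRAM of netX 51/52

# On the netX 50 the two SRAM areas add up to the full internal SRAM.
netx50_sram_total = NETX50_SRAM_XPEC_EXCHANGE + NETX50_SRAM_DUAL_PORT

# Memory reachable through the TCM channels:
# netX 50 only has the two small TCMs, netX 51/52 reaches the full SRAM.
netx50_tcm_reachable = NETX50_ITCM + NETX50_DTCM
netx51_tcm_reachable = NETX51_SRAM_TOTAL

sram_growth = NETX51_SRAM_TOTAL / netx50_sram_total
tcm_growth = netx51_tcm_reachable / netx50_tcm_reachable

print(f"netX 50 internal SRAM:       {netx50_sram_total} KByte")
print(f"netX 51/52 internal SRAM:    {NETX51_SRAM_TOTAL} KByte ({sram_growth:.0f}x)")
print(f"Reachable via TCM channels:  {netx50_tcm_reachable} -> "
      f"{netx51_tcm_reachable} KByte ({tcm_growth:.0f}x)")
```

The 64 KByte and 32 KByte areas account for all 96 KByte of the netX 50's internal SRAM, which is why the application ends up in external memory there.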
Benchmark
CoreMark, an open-source benchmark program for embedded processors, was used to illustrate
the improvements.
Instruction code and data are located in different memory regions. The call stack is located within
internal RAM. The data area is static, no heap is used.
The following table shows the CoreMark processing times in clock cycles of 10 ns (smaller values
are better), compiled with ARM compiler optimization -O2. A period in the values is a thousands
separator.
Instruction / Data Memory                        netX 50 (200 MHz ARM966)    netX 51/52 (100 MHz ARM966)
INTRAM / INTRAM                                  152.233                     96.590
ITCM / DTCM                                      61.541                      -
SDRAM 32 bit                                     454.253                     392.966
XiP (Execution in Place), QSPI Clock = 80 MHz    -                           1.240.979
Table 21: Memory Access Performance Results
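To make the table easier to interpret, the sketch below converts the values into milliseconds, into ARM core clock cycles, and into speedup ratios. It rests on two assumptions that the guide does not state explicitly: the period in the table values is a thousands separator (so 152.233 means 152233), and a "clock cycle of 10 ns" is a fixed 10 ns time unit that does not depend on the ARM clock. All function and variable names are illustrative.

```python
TICK_NS = 10  # the table gives times in clock cycles of 10 ns

# ARM core clock per device, in MHz, as given in the table header.
ARM_MHZ = {"netX 50": 200, "netX 51/52": 100}

# Table values in 10 ns ticks; None where the table shows "-".
# Assumption: "." in the table is a thousands separator.
RESULTS = {
    "INTRAM / INTRAM": {"netX 50": 152_233, "netX 51/52": 96_590},
    "ITCM / DTCM": {"netX 50": 61_541, "netX 51/52": None},
    "SDRAM 32 bit": {"netX 50": 454_253, "netX 51/52": 392_966},
    "XiP, QSPI 80 MHz": {"netX 50": None, "netX 51/52": 1_240_979},
}


def ticks_to_ms(ticks):
    """Convert 10 ns ticks to milliseconds."""
    return ticks * TICK_NS / 1_000_000


def ticks_to_arm_cycles(ticks, arm_mhz):
    """ARM core clock cycles that elapse during the given number of ticks."""
    return ticks * TICK_NS * arm_mhz // 1000


def speedup(row):
    """Run-time ratio netX 50 / netX 51/52, or None if a value is missing."""
    old, new = row["netX 50"], row["netX 51/52"]
    if old is None or new is None:
        return None
    return old / new


for name, row in RESULTS.items():
    parts = []
    for chip, ticks in row.items():
        if ticks is None:
            parts.append(f"{chip}: -")
        else:
            cycles = ticks_to_arm_cycles(ticks, ARM_MHZ[chip])
            parts.append(f"{chip}: {ticks_to_ms(ticks):.3f} ms, {cycles} ARM cycles")
    ratio = speedup(row)
    suffix = f" (speedup {ratio:.2f}x)" if ratio is not None else ""
    print(f"{name}: " + "; ".join(parts) + suffix)
```

Under these assumptions, the INTRAM run takes about 1.52 ms on the netX 50 and about 0.97 ms on the netX 51/52, roughly 1.58 times faster at half the ARM clock. The SDRAM case improves by a factor of about 1.16.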