background image

 

DPU IP Product Guide

 

www.xilinx.com

 

43 

PG338 (v1.2) March 26, 2019 

 

Appendix A: Legal Notices 

References 

These documents provide supplemental material useful with this product guide: 
1.

 

DNNDK User Guide

 (

UG1327

) 

2.

 

Zynq Ult MPSoC Embedded Design Tutorial

 (

UG1209

) 

3.

 

PetaLinux Tools Documentation Reference Guide

 (

UG1144

) 

4.

 

ZCU102 Evaluation Board User Guide

 (

UG1182

) 

Please Read: Important Legal Notices 

The information disclosed to you hereunder (the “Materials”) is provided solely for the selection and use of Xilinx products. To the maximum 
extent permitted by applicable law: (1) Materials are made available "AS IS" and with all faults, Xilinx hereby DISCLAIMS ALL WARRANTIES 
AND CONDITIONS, EXPRESS, IMPLIED, OR STATUTORY, INCLUDING BUT NOT LIMITED TO WARRANTIES OF MERCHANTABILITY, 
NON-INFRINGEMENT, OR FITNESS FOR ANY PARTICULAR PURPOSE; and (2) Xilinx shall not be liable (whether in contract or tort, 
including negligence, or under any other theory of liability) for any loss or damage of any kind or nature related to, arising under, or in 
connection with, the Materials (including your use of the Materials), including for any direct, indirect, special, incidental, or consequential loss 
or damage (including loss of data, profits, goodwill, or any type of loss or damage suffered as a result of any action brought by a third party) 
even if such damage or loss was reasonably foreseeable or Xilinx had been advised of the possibility of the same. Xilinx assumes no 
obligation to correct any errors contained in the Materials or to notify you of updates to the Materials or to product specifications. You may not 
reproduce, modify, distribute, or publicly display the Materials without prior written consent. Certain products are subject to the terms and 
conditions of Xilinx’s limited warranty, please refer to Xilinx’s Terms of Sale which can be viewed at

 https://www.xilinx.com/legal.htm#tos

; IP 

cores may be subject to warranty and support terms contained in a license issued to you by Xilinx. Xilinx products are not designed or 
intended to be fail-safe or for use in any application requiring fail-safe performance; you assume sole risk and liability for use of Xilinx products 
in such critical applications, please refer to Xilinx’s Terms of Sale which can be viewed at 

https://www.xilinx.com/legal.htm#tos

AUTOMOTIVE APPLICATIONS DISCLAIMER 

AUTOMOTIVE PRODUCTS (IDENTIFIED AS “XA” IN THE PART NUMBER) ARE NOT WARRANTED FOR USE IN THE DEPLOYMENT OF 
AIRBAGS OR FOR USE IN APPLICATIONS THAT AFFECT CONTROL OF A VEHICLE (“SAFETY APPLICATION”) UNLESS THERE IS A 
SAFETY CONCEPT OR REDUNDANCY FEATURE CONSISTENT WITH THE ISO 26262 AUTOMOTIVE SAFETY STANDARD (“SAFETY 
DESIGN”). CUSTOMER SHALL, PRIOR TO USING OR DISTRIBUTING ANY SYSTEMS THAT INCORPORATE PRODUCTS, 
THOROUGHLY TEST SUCH SYSTEMS FOR SAFETY PURPOSES. USE OF PRODUCTS IN A SAFETY APPLICATION WITHOUT A 
SAFETY DESIGN IS FULLY AT THE RISK OF CUSTOMER, SUBJECT ONLY TO APPLICABLE LAWS AND REGULATIONS GOVERNING 
LIMITATIONS ON PRODUCT LIABILITY. 

© Copyright 2019 Xilinx, Inc. Xilinx, the Xilinx logo, Artix, ISE, Kintex, Spartan, Virtex, Zynq, and other designated brands included herein are 
trademarks of Xilinx in the United States and other countries. AMBA, AMBA Designer, Arm, ARM1176JZ-S, CoreSight, Cortex, PrimeCell, 
Mali, and MPCore are trademarks of Arm Limited in the EU and other countries. PCI, PCIe, and PCI Express are trademarks of PCI-SIG and 
used under license. All other trademarks are the property of their respective owners. 

 

Send Feedback

Summary of Contents for B1024

Page 1: ...DPU for Convolutional Neural Network v1 2 DPU IP Product Guide PG338 v1 2 March 26 2019...

Page 2: ...dated descrption Build the Demo Updated figure Demo Execution Updated code 03 08 2019 Version 1 1 Table 6 Reg_dpu_base_addr Updated descrption Figure 10 DPU Configuration Updated figure Build the Peta...

Page 3: ...are Architecture 10 DSP with Enhanced Utilization DPU_EU 11 Register Space 13 Interrupts 17 Chapter 3 DPU Configuration 18 Introduction 18 Configuration Options 19 DPU Performance on Different Devices...

Page 4: ...U IP Product Guide www xilinx com 4 PG338 v1 2 March 26 2019 Introduction 33 Hardware Design Flow 36 Software Design Flow 39 Appendix A Legal Notices 43 References 43 Please Read Important Legal Notic...

Page 5: ...B512 B800 B1024 B1152 B1600 B2304 B3136 and B4096 o Configurable core number up to three o Convolution and deconvolution o Max pooling o ReLu and Leaky ReLu o Concat o Elementwise o Dilation o Reorg...

Page 6: ...YOLO SSD MobileNet FPN etc The DPU IP can be integrated as a block in the programmable logic PL of the selected Zynq 7000 SoC and Zynq UltraScale MPSoC devices with direct connections to the processi...

Page 7: ...driver which is included in the Xilinx Deep Neural Network Development Kit DNNDK toolchain You can download the free developer resources from the Xilinx website https www xilinx com products design to...

Page 8: ...ntation DPU Camera AXI Interconnect Controller DDR ARM R5 DisplayPort USB3 0 SATA3 1 PCIe Gen2 GigE USB2 0 UART SPI Quad SPI NAND SD demosaic gamma Color_ conversion DMA AXI Interconnect AXI Interconn...

Page 9: ...Loader Operating System Host CPU Deep Learning App DPU accelerated Profiler Libarary DPU Driver DPU User Space Kernel Space Hardware Platform X22331 022019 Figure 5 Application Execution Hierarchy Lic...

Page 10: ...sed to buffer the intermediate data input and output data The data is reused as much as possible to reduce the memory bandwidth Deep pipelined design is used for the computing engine Like other accele...

Page 11: ...performance achieved with the device Therefore two input clocks for DPU is needed one for general logic and the other for DSP slices The difference between DPU and DPU_EU is shown in Figure 7 All DPU...

Page 12: ...tive Low reset for DSP unit m_axi_dpu_aclk Clock 1 I Input clock used for DPU general logic m_axi_dpu_aresetn Reset 1 I Active Low reset for DPU general logic DPUx_M_AXI_INSTR Memory mapped AXI master...

Page 13: ...nals are active High The details of reg_dpu_reset is shown in Table 2 Table 2 Reg_dpu_reset Register Address Offset Width Type Description Reg_dpu_reset 0x004 32 R W 0 reset of DPU core 0 1 reset of D...

Page 14: ...th Type Description Reg_dpu0_instr_addr 0x20c 32 R W 0 The instruction start address in external memory for DPU core0 Reg_dpu1_instr_addr 0x30c 32 R W 0 The instruction start address in external memor...

Page 15: ...ase address4 of DPU core0 Reg_dpu0_base_addr5_l 0x24C 32 R W The lower 32 bits of the base address5 of DPU core0 Reg_dpu0_base_addr5_h 0x250 32 R W The lower 8 bits in the register represent the upper...

Page 16: ...R W The lower 8 bits in the register represent the upper 8 bits of the base address1 of DPU core2 Reg_dpu2_base_addr2_l 0x434 32 R W The lower 32 bits of the base address2 of DPU core2 Reg_dpu2_base_a...

Page 17: ...determined by the number of DPU cores When the parameter of DPU_NUM is set to 2 it means the DPU IP is integrated with two DPU cores and the data width of the dpu_interrupt signal is two bits The lowe...

Page 18: ...ing table Table 7 Deep Neural Network Features and Parameters Supported by DPU Features Description Convolution Kernel Sizes W 1 16 H 1 16 Strides W 1 4 H 1 4 Padding_w 1 kernel_w 1 Padding_h 1 kernel...

Page 19: ...bitrary Notes 1 The parameter channel_parallel is determined by the DPU configuration For example the channel_parallel of DPU B1152 is 12 the channel_parallel of DPU B4096 is 16 Configuration Options...

Page 20: ...ifferent programmable logic resource The larger convolution architecture can achieve higher performance with more resources The parallelism for different convolution architecture is listed in Table 8...

Page 21: ...d low DSP usage is shown in Table 9 The data is tested on the Xilinx ZCU102 platform without Depthwise Conv Average Pooling Relu6 and Leaky Relu features Table 9 Resources of Different DSP Usage High...

Page 22: ...The final utilization is shown in Figure 11 Figure 11 Summary Page of DPU Configuration DPU Performance on Different Devices Table 10 shows the peak performance of the DPU on different devices Table 1...

Page 23: ...dwidth requirements for some neural networks averaged by layer have been tested with one DPU core running at full speed The peak and average I O bandwidth requirements of three different neural networ...

Page 24: ...ure 12 shows the three clock domains PL s_axi_clk DPU Register Configure Data Controller Calculation Unit m_axi_dpu_aclk dpu_2x_aclk X22334 022019 Figure 12 Clock Domain in DPU Register Clock The inpu...

Page 25: ...axi_dpu_aclk and the two clocks must be synchronous to meet the timing closure The recommended circuit design is shown in Figure 13 MMCM RST CLKIN CLKOUT BUFGCE_DIV CE CLR I O BUFGCE_DIV_CLK2_INST dpu...

Page 26: ...be set to Auto Figure 14 Recommended Clocking Options of Clock Wizard Matched Routing Select the Matched Routing for the m_axi_dpu_aclk and dpu_2x_clk in the Output Clocks tab of the Clock Wizard IP W...

Page 27: ...t You must guarantee each pair of clocks and resets is generated in a synchronous clock domain If the related clocks and resets are not matched the DPU might not work properly A recommended solution i...

Page 28: ...ository Add DPU IP into Block Design Configure DPU Parameters Connect DPU with a Processing System in the Xilinx SoC Assign Register Address for DPU Generate Bitstream Generate BOOT BIN Add DPU IP int...

Page 29: ...rch 26 2019 Figure 18 DPU IP in Repository Add DPU IP into Block Design Search DPU IP in the block design interface and add DPU IP into the block design The procedure is shown in Figure 19 and Figure...

Page 30: ...The number of master interfaces in the DPU IP depends on the DPU_NUM parameter You can connect the DPU to a processing system PS with any kind of interconnections You must ensure the DPU can correctly...

Page 31: ...stom system with the pre built Linux environment in the DNNDK package the DPU slave interface must be connected to the M_AXI_HPM0_LPD PS Master and the DPU base address must be set to 0x8F00_000 with...

Page 32: ...itstream in Vivado shown in Figure 24 Figure 24 Generate Bitstream Generate BOOT BIN You can use the Vivado SDK or PetaLinux to generate the BOOT BIN file For boot image creation using the Vivado SDK...

Page 33: ...ures Hardware Design Flow gives an overview of how to use Xilinx Vivado Design Suite to generate the reference hardware design Software Design Flow describes the design flow of project creation in the...

Page 34: ...more information about the DNNDK package refer to the DNNDK User Guide UG1327 Requirements The following summarizes the requirements of the TRD Target platforms ZCU102 evaluation board production sil...

Page 35: ...n DPU IP Product Guide www xilinx com 35 PG338 v1 2 March 26 2019 Design Files Design files are in the following directory structure Figure 26 Directory Structure Note DPU_IP is in the pl srcs dpu_ip...

Page 36: ...file The parameters of DPU IP in the reference design are configured accordingly Both the connections of the DPU interrupt and the assignment addresses for DPU in the reference design should not be m...

Page 37: ...o build the reference Vivado project with Vivado 2018 2 For information about setting up your Vivado environment refer to the Vivado Design Suite User Guide UG910 Building the hardware design consists...

Page 38: ...U IP Product Guide www xilinx com 38 PG338 v1 2 March 26 2019 Figure 29 TRD Block Design 4 In the GUI click Generate Bitstream to generate the bit file as shown in the following figure Figure 30 Gener...

Page 39: ...wing figure Figure 31 DPU Configuration Page Those parameters of DPU can be configured in case of different resource requirements For more information about DPU and its parameters refer to Chapter 3 D...

Page 40: ...ilt path to TRD_HOME pl prj zcu102 sdk Create BOOT BIN Use the following to create the BOOT BIN file cd images linux petalinux package boot fsbl zynqmp_fsbl elf u boot u boot elf pmufw pmufw elf fpga...

Page 41: ...e resnet50 directory on the SD card for example home resnet50 If the directory does not exist create a new directory 4 Insert the SD card into the ZCU102 and boot up the board After the Linux boot run...

Page 42: ...Chapter 6 Example Design DPU IP Product Guide www xilinx com 42 PG338 v1 2 March 26 2019 Figure 32 Running Results Send Feedback...

Page 43: ...cly display the Materials without prior written consent Certain products are subject to the terms and conditions of Xilinx s limited warranty please refer to Xilinx s Terms of Sale which can be viewed...

Reviews: