background image

 

 

 

Chapter 6: Example Design 

 

DPU IP Product Guide

 

www.xilinx.com

 

41 

PG338 (v1.2) March 26, 2019 

 

Demo Execution 

To run the demo: 
1.

 

After generating the BOOT.BIN file, copy BOOT.BIN and image.ub (which is in 

image/linux

 folder) 

to the SD card.  

2.

 

Copy the common directory in 

$TRD_HOME/images

 to the SD card. 

3.

 

Copy the pre-built resnet50 in 

$TRD_HOME/images/resnet50

 or the newly generated resnet50 in 

$TRD_HOME/apu/apps/resnet50/build

 to the resnet50 directory on the SD card (for example, 

/home/resnet50

). If the directory does not exist, create a new directory. 

4.

 

Insert the SD card into the ZCU102 and boot up the board. After the Linux boot, run as follows: 

% cd /home/resnet50/ 
% ./resnet50 

The screenshot is shown in Figure 32. 

The input images name is displayed in each line beginning with “Load image”, followed by the expected 

result of the input image. The predicted results of DPU is below, and the top-5 prediction probability of 

image classification are printed. If the Top-0 predict results are the same as the expected result, then 

the DPU is working properly. 

Send Feedback

Содержание B1024

Страница 1: ...DPU for Convolutional Neural Network v1 2 DPU IP Product Guide PG338 v1 2 March 26 2019...

Страница 2: ...dated descrption Build the Demo Updated figure Demo Execution Updated code 03 08 2019 Version 1 1 Table 6 Reg_dpu_base_addr Updated descrption Figure 10 DPU Configuration Updated figure Build the Peta...

Страница 3: ...are Architecture 10 DSP with Enhanced Utilization DPU_EU 11 Register Space 13 Interrupts 17 Chapter 3 DPU Configuration 18 Introduction 18 Configuration Options 19 DPU Performance on Different Devices...

Страница 4: ...U IP Product Guide www xilinx com 4 PG338 v1 2 March 26 2019 Introduction 33 Hardware Design Flow 36 Software Design Flow 39 Appendix A Legal Notices 43 References 43 Please Read Important Legal Notic...

Страница 5: ...B512 B800 B1024 B1152 B1600 B2304 B3136 and B4096 o Configurable core number up to three o Convolution and deconvolution o Max pooling o ReLu and Leaky ReLu o Concat o Elementwise o Dilation o Reorg...

Страница 6: ...YOLO SSD MobileNet FPN etc The DPU IP can be integrated as a block in the programmable logic PL of the selected Zynq 7000 SoC and Zynq UltraScale MPSoC devices with direct connections to the processi...

Страница 7: ...driver which is included in the Xilinx Deep Neural Network Development Kit DNNDK toolchain You can download the free developer resources from the Xilinx website https www xilinx com products design to...

Страница 8: ...ntation DPU Camera AXI Interconnect Controller DDR ARM R5 DisplayPort USB3 0 SATA3 1 PCIe Gen2 GigE USB2 0 UART SPI Quad SPI NAND SD demosaic gamma Color_ conversion DMA AXI Interconnect AXI Interconn...

Страница 9: ...Loader Operating System Host CPU Deep Learning App DPU accelerated Profiler Libarary DPU Driver DPU User Space Kernel Space Hardware Platform X22331 022019 Figure 5 Application Execution Hierarchy Lic...

Страница 10: ...sed to buffer the intermediate data input and output data The data is reused as much as possible to reduce the memory bandwidth Deep pipelined design is used for the computing engine Like other accele...

Страница 11: ...performance achieved with the device Therefore two input clocks for DPU is needed one for general logic and the other for DSP slices The difference between DPU and DPU_EU is shown in Figure 7 All DPU...

Страница 12: ...tive Low reset for DSP unit m_axi_dpu_aclk Clock 1 I Input clock used for DPU general logic m_axi_dpu_aresetn Reset 1 I Active Low reset for DPU general logic DPUx_M_AXI_INSTR Memory mapped AXI master...

Страница 13: ...nals are active High The details of reg_dpu_reset is shown in Table 2 Table 2 Reg_dpu_reset Register Address Offset Width Type Description Reg_dpu_reset 0x004 32 R W 0 reset of DPU core 0 1 reset of D...

Страница 14: ...th Type Description Reg_dpu0_instr_addr 0x20c 32 R W 0 The instruction start address in external memory for DPU core0 Reg_dpu1_instr_addr 0x30c 32 R W 0 The instruction start address in external memor...

Страница 15: ...ase address4 of DPU core0 Reg_dpu0_base_addr5_l 0x24C 32 R W The lower 32 bits of the base address5 of DPU core0 Reg_dpu0_base_addr5_h 0x250 32 R W The lower 8 bits in the register represent the upper...

Страница 16: ...R W The lower 8 bits in the register represent the upper 8 bits of the base address1 of DPU core2 Reg_dpu2_base_addr2_l 0x434 32 R W The lower 32 bits of the base address2 of DPU core2 Reg_dpu2_base_a...

Страница 17: ...determined by the number of DPU cores When the parameter of DPU_NUM is set to 2 it means the DPU IP is integrated with two DPU cores and the data width of the dpu_interrupt signal is two bits The lowe...

Страница 18: ...ing table Table 7 Deep Neural Network Features and Parameters Supported by DPU Features Description Convolution Kernel Sizes W 1 16 H 1 16 Strides W 1 4 H 1 4 Padding_w 1 kernel_w 1 Padding_h 1 kernel...

Страница 19: ...bitrary Notes 1 The parameter channel_parallel is determined by the DPU configuration For example the channel_parallel of DPU B1152 is 12 the channel_parallel of DPU B4096 is 16 Configuration Options...

Страница 20: ...ifferent programmable logic resource The larger convolution architecture can achieve higher performance with more resources The parallelism for different convolution architecture is listed in Table 8...

Страница 21: ...d low DSP usage is shown in Table 9 The data is tested on the Xilinx ZCU102 platform without Depthwise Conv Average Pooling Relu6 and Leaky Relu features Table 9 Resources of Different DSP Usage High...

Страница 22: ...The final utilization is shown in Figure 11 Figure 11 Summary Page of DPU Configuration DPU Performance on Different Devices Table 10 shows the peak performance of the DPU on different devices Table 1...

Страница 23: ...dwidth requirements for some neural networks averaged by layer have been tested with one DPU core running at full speed The peak and average I O bandwidth requirements of three different neural networ...

Страница 24: ...ure 12 shows the three clock domains PL s_axi_clk DPU Register Configure Data Controller Calculation Unit m_axi_dpu_aclk dpu_2x_aclk X22334 022019 Figure 12 Clock Domain in DPU Register Clock The inpu...

Страница 25: ...axi_dpu_aclk and the two clocks must be synchronous to meet the timing closure The recommended circuit design is shown in Figure 13 MMCM RST CLKIN CLKOUT BUFGCE_DIV CE CLR I O BUFGCE_DIV_CLK2_INST dpu...

Страница 26: ...be set to Auto Figure 14 Recommended Clocking Options of Clock Wizard Matched Routing Select the Matched Routing for the m_axi_dpu_aclk and dpu_2x_clk in the Output Clocks tab of the Clock Wizard IP W...

Страница 27: ...t You must guarantee each pair of clocks and resets is generated in a synchronous clock domain If the related clocks and resets are not matched the DPU might not work properly A recommended solution i...

Страница 28: ...ository Add DPU IP into Block Design Configure DPU Parameters Connect DPU with a Processing System in the Xilinx SoC Assign Register Address for DPU Generate Bitstream Generate BOOT BIN Add DPU IP int...

Страница 29: ...rch 26 2019 Figure 18 DPU IP in Repository Add DPU IP into Block Design Search DPU IP in the block design interface and add DPU IP into the block design The procedure is shown in Figure 19 and Figure...

Страница 30: ...The number of master interfaces in the DPU IP depends on the DPU_NUM parameter You can connect the DPU to a processing system PS with any kind of interconnections You must ensure the DPU can correctly...

Страница 31: ...stom system with the pre built Linux environment in the DNNDK package the DPU slave interface must be connected to the M_AXI_HPM0_LPD PS Master and the DPU base address must be set to 0x8F00_000 with...

Страница 32: ...itstream in Vivado shown in Figure 24 Figure 24 Generate Bitstream Generate BOOT BIN You can use the Vivado SDK or PetaLinux to generate the BOOT BIN file For boot image creation using the Vivado SDK...

Страница 33: ...ures Hardware Design Flow gives an overview of how to use Xilinx Vivado Design Suite to generate the reference hardware design Software Design Flow describes the design flow of project creation in the...

Страница 34: ...more information about the DNNDK package refer to the DNNDK User Guide UG1327 Requirements The following summarizes the requirements of the TRD Target platforms ZCU102 evaluation board production sil...

Страница 35: ...n DPU IP Product Guide www xilinx com 35 PG338 v1 2 March 26 2019 Design Files Design files are in the following directory structure Figure 26 Directory Structure Note DPU_IP is in the pl srcs dpu_ip...

Страница 36: ...file The parameters of DPU IP in the reference design are configured accordingly Both the connections of the DPU interrupt and the assignment addresses for DPU in the reference design should not be m...

Страница 37: ...o build the reference Vivado project with Vivado 2018 2 For information about setting up your Vivado environment refer to the Vivado Design Suite User Guide UG910 Building the hardware design consists...

Страница 38: ...U IP Product Guide www xilinx com 38 PG338 v1 2 March 26 2019 Figure 29 TRD Block Design 4 In the GUI click Generate Bitstream to generate the bit file as shown in the following figure Figure 30 Gener...

Страница 39: ...wing figure Figure 31 DPU Configuration Page Those parameters of DPU can be configured in case of different resource requirements For more information about DPU and its parameters refer to Chapter 3 D...

Страница 40: ...ilt path to TRD_HOME pl prj zcu102 sdk Create BOOT BIN Use the following to create the BOOT BIN file cd images linux petalinux package boot fsbl zynqmp_fsbl elf u boot u boot elf pmufw pmufw elf fpga...

Страница 41: ...e resnet50 directory on the SD card for example home resnet50 If the directory does not exist create a new directory 4 Insert the SD card into the ZCU102 and boot up the board After the Linux boot run...

Страница 42: ...Chapter 6 Example Design DPU IP Product Guide www xilinx com 42 PG338 v1 2 March 26 2019 Figure 32 Running Results Send Feedback...

Страница 43: ...cly display the Materials without prior written consent Certain products are subject to the terms and conditions of Xilinx s limited warranty please refer to Xilinx s Terms of Sale which can be viewed...

Отзывы: