
STB/STH/STW

Store to Memory With a Register Offset or 5-Bit Unsigned Constant Offset


Instruction Type    Store

Pipeline Stage      E1                     E2     E3
Read                baseR, offsetR, src
Written             baseR
Unit in use         .D2

Delay Slots         0

For more information on delay slots for a store, see Chapter 5, TMS320C62x Pipeline, and Chapter 6, TMS320C67x Pipeline.

Example 1           STB .D1 A1,*A10

            Before instruction    1 cycle after instruction    3 cycles after instruction
A1          9A32 7634h            9A32 7634h                   9A32 7634h
A10         0000 0100h            0000 0100h                   0000 0100h
mem 100h    11h                   11h                          34h
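
The timing in Example 1 can be sketched in Python. This is a hypothetical model, not TI code: the function name `stb_view` and the constant `STORE_COMMIT_CYCLE` are invented for illustration. It assumes the memory write becomes visible once the E3 stage has completed, which matches the values Example 1 gives for 1 and 3 cycles after the instruction. The behavior at 2 cycles is an assumption of the sketch.

```python
# Hypothetical model of Example 1 (STB .D1 A1,*A10); not TI code.
# Assumption: the stored byte is visible in memory only after stage E3,
# that is, 3 cycles after the instruction.
STORE_COMMIT_CYCLE = 3


def stb_view(mem, src, addr, cycles_after):
    """Return the byte at addr as seen cycles_after cycles after the STB."""
    if cycles_after >= STORE_COMMIT_CYCLE:
        return src & 0xFF  # STB stores only the 8 LSBs of src
    return mem[addr]       # write not yet committed to memory


mem = {0x100: 0x11}   # mem 100h before the instruction
a1 = 0x9A327634       # src register A1
a10 = 0x00000100      # baseR register A10

for cycles in (0, 1, 3):
    print(cycles, hex(stb_view(mem, a1, a10, cycles)))
```

The source register and the base register are left unchanged, as in the example: no address modification is requested by `*A10`.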

Example 2           STH .D1 A1,*+A10(4)

            Before instruction    1 cycle after instruction    3 cycles after instruction
A1          9A32 7634h            9A32 7634h                   9A32 7634h
A10         0000 0100h            0000 0100h                   0000 0100h
mem 104h    1134h                 1134h                        7634h
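
The address and data arithmetic behind Example 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the CPU's actual implementation: `store_effect` and its parameters are hypothetical names, linear addressing mode is assumed, and only the positive-offset form without base-register update (`*+baseR`) is modeled. The `scaled` flag reflects the assembler convention that an offset in brackets is shifted left by the data size, while an offset in parentheses, as in Example 2, is used as a byte offset.

```python
# Hypothetical sketch of store address generation; names are illustrative.
# Left shift applied to a scaled (bracketed) offset, per data size.
SHIFT = {"STB": 0, "STH": 1, "STW": 2}
# Portion of the source register that is written to memory.
MASK = {"STB": 0xFF, "STH": 0xFFFF, "STW": 0xFFFFFFFF}


def store_effect(mnemonic, src, base, offset=0, scaled=False):
    """Return (address, value) for a *+baseR store in linear addressing mode."""
    if scaled:
        offset <<= SHIFT[mnemonic]          # bracket form: offset in elements
    address = (base + offset) & 0xFFFFFFFF  # 32-bit address arithmetic
    return address, src & MASK[mnemonic]


# Example 2: STH .D1 A1,*+A10(4) with A1 = 9A32 7634h and A10 = 0000 0100h
addr, value = store_effect("STH", 0x9A327634, 0x100, 4)
print(hex(addr), hex(value))  # 0x104 0x7634
```

Under the same assumptions, `store_effect("STH", 0x9A327634, 0x100, 4, scaled=True)` would target address 108h, because the offset of 4 halfwords is shifted left by 1 before it is added to the base.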


Source: TMS320C6000 CPU and Instruction Set Reference Guide, Literature Number SPRU189D, March 1999.

Страница 2: ...ONDUCTOR PRODUCTS MAY INVOLVE POTENTIAL RISKS OF DEATH PERSONAL INJURY OR SEVERE PROPERTY OR ENVIRONMENTAL DAMAGE CRITICAL APPLICATIONS TI SEMICONDUCTOR PRODUCTS ARE NOT DESIGNED AUTHORIZED OR WARRANT...

Страница 3: ...al as a reference for the architecture of the TMS320C6000 CPU First time readers should read Chapter 1 for general information about TI DSPs the features of the C6000 and the applications for which th...

Страница 4: ...S320C67x Floating Point Instruction Set Chapter 5 TMS320C62x Pipeline Chapter 6 TMS320C67x Pipeline General purpose register files Chapter 2 CPU Data Paths and Control Instruction set Chapter 3 TMS320...

Страница 5: ...ypes as defined in Chapter 3 TMS320C62x C67x Fixed Point Instruction Set Although the instruction mnemonic MPY in this example is in capital let ters the C6x assembler is not case sensitive it can ass...

Страница 6: ...ed point DSP and provides pinouts electrical specifications and timings for the de vice TMS320C6701 Digital Signal Processor Data Sheet literature number SPRS067 describes the features of the TMS320C6...

Страница 7: ...de literature number SPRU052 alphabetically lists over 100 third parties that provide various products that serve the family of TMS320 digital signal processors A myriad of products and applications a...

Страница 8: ...22 25 40 Europe Customer Training Helpline Fax 49 81 61 80 40 10 Asia Pacific Literature Response Center 852 2 956 7288 Fax 852 2 956 2200 Hong Kong DSP Hotline 852 2 956 7268 Fax 852 2 956 1002 Korea...

Страница 9: ...Architecture 1 7 1 4 1 Central Processing Unit CPU 1 8 1 4 2 Internal Memory 1 8 1 4 3 Peripherals 1 9 2 CPU Data Paths and Control 2 1 Summarizes the TMS320C62x C67x architecture and describes the p...

Страница 10: ...s on Register Reads 3 19 3 7 6 Constraints on Register Writes 3 19 3 8 Addressing Modes 3 21 3 8 1 Linear Addressing Mode 3 21 3 8 2 Circular Addressing Mode 3 21 3 8 3 Syntax for Load Store Address G...

Страница 11: ...struction Types 6 13 6 3 Functional Unit Hazards 6 20 6 3 1 S Unit Hazards 6 21 6 3 2 M Unit Hazards 6 25 6 3 3 L Unit Hazards 6 30 6 3 4 D Unit Instruction Hazards 6 34 6 3 5 Single Cycle Instruction...

Страница 12: ...pts Interrupt Flag Set and Clear Registers IFR ISR ICR 7 14 7 3 3 Returning From Interrupt Servicing 7 16 7 4 Interrupt Detection and Processing 7 18 7 4 1 Setting the Nonreset Interrupt Flag 7 18 7 4...

Страница 13: ...9 5 1 Fixed Point Pipeline Stages 5 2 5 2 Fetch Phases of the Pipeline 5 3 5 3 Decode Phases of the Pipeline 5 4 5 4 Execute Phases of the Pipeline and Functional Block Diagram of the TMS320C62x 5 5...

Страница 14: ...on Block Diagram 6 43 6 16 Branch Instruction Phases 6 44 6 17 Branch Execution Block Diagram 6 45 6 18 2 Cycle DP Instruction Phases 6 46 6 19 4 Cycle Instruction Phases 6 47 6 20 INTDP Instruction P...

Страница 15: ...Return Pointer NRP 7 16 7 11 Interrupt Return Pointer IRP 7 17 7 12 TMS320C62x Nonreset Interrupt Detection and Processing Pipeline Operation 7 19 7 13 TMS320C67x Nonreset Interrupt Detection and Pro...

Страница 16: ...Unit Latency Summary 3 12 3 6 Registers That Can Be Tested by Conditional Operations 3 16 3 7 Indirect Address Generation for Load Store 3 23 3 8 Relationships Between Operands Operand Size Signed Un...

Страница 17: ...Multiply M Unit Instruction Hazards 6 25 6 8 4 Cycle M Unit Instruction Hazards 6 26 6 9 MPYI M Unit Instruction Hazards 6 27 6 10 MPYID M Unit Instruction Hazards 6 28 6 11 MPYDP M Unit Instruction H...

Страница 18: ...viii 7 1 Interrupt Priorities 7 3 7 2 Interrupt Service Table Pointer ISTP Field Descriptions 7 8 7 3 Interrupt Control Registers 7 10 7 4 Control Status Register CSR Interrupt Control Field Descripti...

Страница 19: ...ble Interrupts Globally 7 12 7 3 Code Sequence to Enable Maskable Interrupts Globally 7 12 7 4 Code Sequence to Enable an Individual Interrupt INT9 7 14 7 5 Code Sequence to Disable an Individual Inte...

Страница 20: ...sts of multiple execution units running in parallel performing multiple instructions during a single clock cycle Parallelism is the key to extremely high performance taking these DSPs well beyond the...

Страница 21: ...ct of the Year Today the TMS320 family consists of many generations C1x C2x C2xx C5x and C54x fixed point DSPs C3x and C4x floating point DSPs and C8x multipro cessor DSPs Now there is a new generatio...

Страница 22: ...control Power line monitoring Robotics Security access Instrumentation Medical Military Digital filtering Function generation Pattern matching Phase locked loops Seismic processing Spectrum analysis T...

Страница 23: ...function applications such as Pooled modems Wireless local loop base stations Beam forming base stations Remote access servers RAS Digital subscriber loop DSL systems Cable modems Multichannel telepho...

Страница 24: ...Access Port and Boundary Scan Architecture Features of the C62x C67x include Advanced VLIW CPU with eight functional units including two multipliers and six arithmetic units J Executes up to eight ins...

Страница 25: ...Peak 688M FLOPS at 167 MHz for multiply and accumulate operations Hardware support for single precision 32 bit and double precision 64 bit IEEE floating point operations 32 32 bit integer multiply wi...

Страница 26: ...U while peripherals such as serial ports and host ports are on only certain devices Check the data sheet for your device to determine the specific peripheral configurations you have Figure 1 1 TMS320C...

Страница 27: ...16 32 bit general purpose registers The data paths are described in more detail in Chapter 2 CPU Data Paths and Control A control register file provides the means to configure and control various proc...

Страница 28: ...ces have a subset of these peripherals but may not have all of them Serial ports Timers External memory interface EMIF that supports synchronous and asynchronous SRAM and synchronous DRAM DMA controll...

Страница 29: ...consist of Two general purpose register files A and B Eight functional units L1 L2 S1 S2 M1 M2 D1 and D2 Two load from memory paths LD1 and LD2 Two store to memory paths ST1 and ST2 Two register file...

Страница 30: ...src1 src1 src1 src1 src1 src1 src1 8 8 8 8 8 8 long dst long dst dst dst dst dst dst dst dst src2 src2 src2 src2 src2 src2 src2 long src Control register file DA1 DA2 ST1 LD1 LD2 ST2 32 32 Data path A...

Страница 31: ...src1 src1 src1 src1 src1 src1 8 8 long dst long dst dst dst dst dst dst dst dst src2 src2 src2 src2 src2 src2 src2 long src Control register file DA1 DA2 ST1 LD1 32 LSB LD2 32 LSB LD2 32 MSB 32 32 Dat...

Страница 32: ...bit data is contained across two registers the 32 LSBs of the data are placed in an even register and the remaining eight MSBs are placed in the eight LSBs of the next upper register which is always...

Страница 33: ...4 MSBs of the odd register Operations producing a long result zero fill the 24 MSBs of the odd register The even register is encoded in the opcode Figure 2 3 Storage Scheme for 40 Bit Data in a Regist...

Страница 34: ...SP DP conversion operations M unit M1 M2 16 16 bit multiply operations 32 32 bit fixed point multiply operations Floating point multiply operations D unit D1 D2 32 bit add subtract linear and circula...

Страница 35: ...le The L1 and L2 units src1 and src2 inputs are also multiplex selectable between the cross path and the same side register file Only two cross paths 1X and 2X exist in the C62x C67x CPUs This limits...

Страница 36: ...ht registers also contains sizes for circular addressing 2 9 CSR Control status register Contains the global interrupt enable bit cache control bits and other miscellaneous control and status bits 2 1...

Страница 37: ...ds and block size fields are shown in Figure 2 4 and the mode select field encoding is shown in Table 2 4 Figure 2 4 Addressing Mode Register AMR 31 26 16 25 21 20 BK0 R W 0 Reserved R 0 R W 0 BK1 Blo...

Страница 38: ...2 5 Block Size Calculations N Block Size N Block Size 00000 2 10000 131 072 00001 4 10001 262 144 00010 8 10010 524 288 00011 16 10011 1 048 576 00100 32 10100 2 097 152 00101 64 10101 4 194 304 00110...

Страница 39: ...1 24 8 CPU ID CPU ID defines which CPU CPU ID 00b indicates C62x CPU ID 10b indicates C67x 23 16 8 Revision ID Revision ID defines silicon revision of the CPU 15 10 6 PWRD Control power down modes the...

Страница 40: ...e PCE1 shown in Figure 2 6 contains the 32 bit address of the execute packet in the E1 pipeline phase Figure 2 6 E1 Phase Program Counter PCE1 31 PCE1 R W x 16 15 PCE1 R W x 0 Legend R Readable by the...

Страница 41: ...ttempted with a NaN source Table 2 7 shows the addi tional registers used by the C67x The OVER UNDER INEX INVAL DENn NANn INFO UNORD and DIV0 bits within these registers will not be modified by a cond...

Страница 42: ...cific to each of the L units L1 and L2 Figure 2 7 shows the layout of FADCR The functions of the fields in the FADCR are shown in Table 2 8 Figure 2 7 Floating Point Adder Configuration Register FADCR...

Страница 43: ...to integer conversion or when infinity is subtracted from infinity 19 1 DEN2 L2 src2 is a denormalized number 18 1 DEN1 L2 src1 is a denormalized number 17 1 NAN2 L2 src2 is NaN 16 1 NAN1 L2 src1 is N...

Страница 44: ...c to each of the S units S1 and S2 Figure 2 8 shows the layout of FAUCR The functions of the fields in the FAUCR are shown in Table 2 9 Figure 2 8 Floating Point Auxiliary Configuration Register FAUCR...

Страница 45: ...point to integer conversion or when infinity is subtracted from infinity 19 1 DEN2 S2 src2 is a denormalized number 18 1 DEN1 S2 src1 is a denormalized number 17 1 NAN2 S2 src2 is NaN 16 1 NAN1 S2 sr...

Страница 46: ...ds specific to each of the M units M1 and M2 Figure 2 9 shows the layout of FMCR The functions of the fields in the FMCR are shown in Table 2 10 Figure 2 9 Floating Point Multiplier Configuration Regi...

Страница 47: ...nt to integer conversion or when infinity is subtracted from infinity 19 1 DEN2 M2 src2 is a denormalized number 18 1 DEN1 M2 src1 is a denormalized number 17 1 NAN2 M2 src2 is NaN 16 1 NAN1 M2 src1 i...

Страница 48: ...C67x digital sig nal processors Also described are parallel operations conditional operations resource constraints and addressing modes Instructions unique to the C67x floating point addition subtract...

Страница 49: ...tring b cond Check for either creg equal to 0 or creg not equal to 0 creg 3 bit field specifying a conditional register cstn n bit constant field for example cst5 int 32 bit integer value lmb0 x Leftm...

Страница 50: ...bit x ext l r Extract and sign extend a field in x specified by l shift left value and r shift right value x extu l r Extract an unsigned field in x specified by l shift left value and r shift right...

Страница 51: ...ADDK SHL ADDAB STH 15 bit offset ADDU MPYUS ADD2 SHR ADDAH STW 15 bit offset AND MPYSU AND SHRU ADDAW SUB CMPEQ MPYH B disp SSHL LDB SUBAB CMPGT MPYHU B IRP SUB LDBU SUBAH CMPGTU MPYHUS B NRP SUBU LD...

Страница 52: ...l Unit to Instruction Mapping C62x C67x Functional Units Instruction L Unit M Unit S Unit D Unit ABS n ADD n n n ADDU n ADDAB n ADDAH n ADDAW n ADDK n ADD2 n AND n n B n B IRP n B NRP n B reg n CLR n...

Страница 53: ...ction D Unit S Unit M Unit L Unit LDW mem n LDB mem 15 bit offset n LDBU mem 15 bit offset n LDH mem 15 bit offset n LDHU mem 15 bit offset n LDW mem 15 bit offset n LMBD n MPY n MPYU n MPYUS n MPYSU...

Страница 54: ...ing Continued C62x C67x Functional Units Instruction D Unit S Unit M Unit L Unit MVKH n MVKLH n NEG n n NOP NORM n NOT n n OR n n SADD n SAT n SET n SHL n SHR n SHRU n SMPY n SMPYH n SMPYHL n SMPYLH n...

Страница 55: ...d Functional Units 3 8 Table 3 3 Functional Unit to Instruction Mapping Continued C62x C67x Functional Units Instruction D Unit S Unit M Unit L Unit SUBU n n SUBAB n SUBAH n SUBAW n SUBC n SUB2 n XOR...

Страница 56: ...baseR base address register creg 3 bit field specifying a conditional register cst constant csta constant a cstb constant b dst destination h MVK or MVKH bit ld st load store opfield mode addressing m...

Страница 57: ...op 0 0 0 s p Operations on the D unit 3 5 5 5 6 7 6 1 0 src2 src1 cst 31 29 28 27 23 22 creg z dst src 4 3 2 1 0 1 1 s p Load store with 15 bit offset on the D unit 3 5 15 6 ld st ucst15 7 8 y 3 Load...

Страница 58: ...5 5 2 Field operations immediate forms on the S unit src2 31 29 28 27 23 22 creg z dst 7 6 5 4 3 2 1 0 1 0 1 0 s p 3 5 16 h cst MVK and MVKH on the S unit Bcond disp on the S unit 31 29 28 27 creg z...

Страница 59: ...are equivalent to an execution or result latency All of the instruc tions that are common to the C62x and C67x have a functional unit latency of 1 This means that a new instruction can be started on...

Страница 60: ...bits are scanned from left to right lower to higher address If the p bit of instruction i is 1 then instruction i 1 is to be executed in parallel with in the the same cycle as instruction i If the p...

Страница 61: ...nce Cycle Execute Packet Instructions 1 A 2 B 3 C 4 D 5 E 6 F 7 G 8 H The eight instructions are executed sequentially Example 3 2 Fully Parallel p Bit Pattern in a Fetch Packet This p bit pattern 1 1...

Страница 62: ...s signify that an instruction is to execute in parallel with the pre vious instruction The code for the fetch packet in Example 3 3 would be rep resented as this instruction A instruction B instructio...

Страница 63: ...Can Be Tested by Conditional Operations Specified C diti l creg z Conditional Register Bit 31 30 29 28 Unconditional 0 0 0 0 Reserved 0 0 0 1 B0 0 0 1 z B1 0 1 0 z B2 0 1 1 z A1 1 0 0 z A2 1 0 1 z Res...

Страница 64: ...per data path per execute packet can read a source operand from its opposite register file via the cross paths 1X and 2X For example S1 can read both of an instruction s operands from the A register f...

Страница 65: ...or storing from the same register file cannot be issued in the same execute packet The following execute packet is invalid LDW D1 A4 A5 Loading to and storing from the STW D2 A6 B4 same register file...

Страница 66: ...ional registers are not included in this count The following code sequences are invalid MPY M1 A1 A1 A4 five reads of register A1 ADD L1 A1 A1 A5 SUB D1 A1 A2 A3 MPY M1 A1 A1 A4 five reads of register...

Страница 67: ...n L2 and L3 might not be detected by the assembler The instructions in L4 do not constitute a write conflict because they are mutually exclusive In con trast because the instructions in L5 may or may...

Страница 68: ...action instructions linear mode simply shifts the src1 cst operand to the left by 2 1 or 0 for word halfword or byte data sizes respectively and then performs the add or subtract specified 3 8 2 Circu...

Страница 69: ...s borrows propagate as usual If you specify src1 greater than the circular buffer size 2 N 1 the effective offsetR cst is modulo the circular buffer size see Example 3 5 The circular buffer size in th...

Страница 70: ...ore In this case you can use the B14 or B15 register as the base register and use a 15 bit constant ucst15 as the offset Table 3 7 Indirect Address Generation for Load Store Addressing Type No Modific...

Страница 71: ...llowing information Assembler syntax Functional units Operands Opcode Description Execution Instruction type Delay slots Functional Unit Latency Examples The ADD instruction is used as an example to f...

Страница 72: ...s situation is documented for the ADD instruction This instruction has three opcode map fields src1 src2 and dst In the seventh row the operands have the types cst5 long and long for src1 src2 and dst...

Страница 73: ...2 dst sint xsint slong L1 L2 0100011 ADD src1 src2 dst uint xuint ulong L1 L2 0101011 ADDU src1 src2 dst xsint slong slong L1 L2 0100001 ADD src1 src2 dst xuint ulong ulong L1 L2 0101001 ADDU src1 src...

Страница 74: ...defined in Table 3 1 on page 3 2 Pipeline This section contains a table that shows the sources read from the destina tions written to and the functional unit used during each execution cycle of the i...

Страница 75: ...olute value of src2 is placed in dst Execution if cond abs src2 dst else nop The absolute value of src2 when src2 is an sint is determined as follows 1 If src2 w 0 then src2 dst 2 If src2 t 0 and src2...

Страница 76: ...BS L1 A1 A5 Before instruction 1 cycle after instruction A1 8000 4E3Dh 2147463619 A1 8000 4E3Dh 2147463619 A5 XXXX XXXXh A5 7FFF B1C3h 2147463619 Example 2 ABS L1 A1 A5 Before instruction 1 cycle afte...

Страница 77: ...1 L2 0000011 src1 src2 dst sint xsint slong L1 L2 0100011 src1 src2 dst uint xuint ulong L1 L2 0101011 src1 src2 dst xsint slong slong L1 L2 0100001 src1 src2 dst xuint ulong ulong L1 L2 0101001 src1...

Страница 78: ...2 Description for L1 L2 and S1 S2 Opcodes src2 is added to src1 The result is placed in dst Execution for L1 L2 and S1 S2 Opcodes if cond src1 src2 dst else nop Opcode D unit 31 29 28 27 23 22 18 17 c...

Страница 79: ...A4 Before instruction 1 cycle after instruction A1 0000 325Ah 12890 A1 0000 325Ah A3 A2 0000 00FFh FFFF FF12h 1099511627538 A3 A2 0000 00FFh FFFF FF12h A5 A4 0000 0000h 0000 0000h 0 A5 A4 0000 0000h...

Страница 80: ...er Addition Without Saturation ADD U 3 33 TMS320C62x C67x Fixed Point Instruction Set Example 6 ADD D1 26 A1 A6 Before instruction 1 cycle after instruction A1 0000 325Ah 12890 A1 0000 325Ah A6 XXXX X...

Страница 81: ...0 0 s p 3 5 5 5 6 7 6 1 0 src2 src1 cst Description src1 is added to src2 using the addressing mode specified for src2 The addi tion defaults to linear mode However if src2 is one of A4 A7 or B4 B7 th...

Страница 82: ...0001h BK0 2 size 8 A4 in circular addressing mode using BK0 Example 2 ADDAH D1 A4 A2 A4 Before instruction 1 cycle after instruction A2 0000 000Bh A2 0000 000Bh A4 0000 0100h A4 0000 0106h AMR 0002 0...

Страница 83: ...s p 31 creg 29 28 27 23 22 7 1 3 1 1 Description A 16 bit signed constant is added to the dst register specified The result is placed in dst Execution if cond cst dst dst else nop Pipeline Stage E1 R...

Страница 84: ...The upper and lower halves of the src1 operand are added to the upper and lower halves of the src2 operand Any carry from the lower half add does not affect the upper half add Execution if cond lsb16...

Страница 85: ...111 src1 src2 dst scst5 xuint uint S1 S2 011110 Opcode L unit form 31 29 28 27 23 22 18 17 creg z dst 13 12 11 5 4 3 2 1 0 x op 1 1 0 s p 3 5 5 5 7 src2 src1 cst S unit form 31 29 28 27 23 22 18 17 cr...

Страница 86: ...use L or S Instruction Type Single cycle Example 1 AND L1X A1 B1 A2 Before instruction 1 cycle after instruction A1 F7A1 302Ah A1 F7A1 302Ah A2 XXXX XXXXh A2 02A0 2020h B1 02B6 E724h B1 02B6 E724h Exa...

Страница 87: ...2 If two branches are in the same execute packet and both are taken behavior is undefined Two conditional branches can be in the same execute packet if one branch uses a displacement and the other use...

Страница 88: ...L1 A1 A2 A3 0000 0008 ADD L2 B1 B2 B3 0000 000C LOOP MPY M1X A3 B3 A4 0000 0010 SUB D1 A5 A6 A6 0000 0014 MPY M1 A3 A6 A5 0000 0018 MPY M1 A6 A7 A8 0000 001C SHR S1 A4 15 A4 0000 0020 ADD D1 A4 A6 A4...

Страница 89: ...n be in the same execute packet if one branch uses a displacement and the other uses a register IRP or NRP As long as onlly one branch has a true condition the code executes in a well defined way Exec...

Страница 90: ...DD L2 B1 B2 B3 1000 000C MPY M1X A3 B3 A4 1000 0010 SUB D1 A5 A6 A6 1000 0014 MPY M1 A3 A6 A5 1000 0018 MPY M1 A6 A7 A8 1000 001C SHR S1 A4 15 A4 1000 0020 ADD D1 A4 A6 A4 Table 3 10 Program Counter V...

Страница 91: ...nches can be in the same execute packet if one branch uses a displacement and the other uses a register IRP or NRP As long as only one branch has a ture condition the code executes in a well defined w...

Страница 92: ...P 0000 1000 0000 0020 B S2 IRP 0000 0024 ADD S1 A0 A2 A1 0000 0028 MPY M1 A1 A0 A1 0000 002C NOP 0000 0030 SHR S1 A1 15 A1 0000 0034 ADD L1 A1 A2 A1 0000 0038 ADD L2 B1 B2 B3 Table 3 11 Program Counte...

Страница 93: ...can be in the same execute packet if one branch uses a displacement and the other uses a register IRP or NRP As long as only one branch has a true condition the code executes in a well defined way Ex...

Страница 94: ...00 1000 0000 0020 B S2 NRP 0000 0024 ADD S1 A0 A2 A1 0000 0028 MPY M1 A1 A0 A1 0000 002C NOP 0000 0030 SHR S1 A1 15 A1 0000 0034 ADD L1 A1 A2 A1 0000 0038 ADD L2 B1 B2 B3 Table 3 12 Program Counter Va...

Страница 95: ...it Opfield src2 csta cstb dst uint ucst5 ucst5 uint S1 S2 11 src2 src1 dst xuint uint uint S1 S2 111111 Opcode Constant form 5 z cstb 6 5 0 dst 0 0 1 0 s p 31 creg 29 28 27 7 1 3 18 17 23 22 src2 5 cs...

Страница 96: ...are valid for the register version of the instruction If any of the 22 MSBs are non zero the result is invalid src2 dst 0 x x x x x x x x x x x x x x x x x x x x x x x 1 1 1 1 1 0 0 0 0 x x x x x x x...

Страница 97: ...CLR Clear a Bit Field 3 50 Example 2 CLR S2 B1 B3 B2 Before instruction 1 cycle after instruction B1 03B6 E7D5h B1 03B6 E7D5h B2 XXXX XXXXh B2 03B0 0001h B3 0000 0052h B3 0000 0052h...

Страница 98: ...2 dst xsint slong uint L1 L2 1010001 src1 src2 dst scst5 slong uint L1 L2 1010000 Opcode 31 29 28 27 23 22 18 17 creg z dst 13 12 11 5 4 3 2 1 0 x op 1 1 0 s p 3 5 5 5 7 src2 src1 cst Description This...

Страница 99: ...0h false B1 0000 4B7h 1207 B1 0000 4B7h Example 2 CMPEQ L1 Ch A1 A2 Before instruction 1 cycle after instruction A1 0000 000Ch 12 A1 0000 000Ch A2 XXXX XXXXh A2 0000 0001h true Example 3 CMPEQ L2X A1...

Страница 100: ...L2 1000111 CMPGT src1 src2 dst scst5 xsint uint L1 L2 1000110 CMPGT src1 src2 dst xsint slong uint L1 L2 1000101 CMPGT src1 src2 dst scst5 slong uint L1 L2 1000100 CMPGT src1 src2 dst uint xuint uint...

Страница 101: ...else 0 dst else nop Pipeline Stage E1 Read src1 src2 Written dst Unit in use L Instruction Type Single cycle Delay Slots 0 Example 1 CMPGT L1X A1 B1 A2 Before instruction 1 cycle after instruction A1...

Страница 102: ...Before instruction 1 cycle after instruction A1 0000 0128h 296 A1 0000 0128h A2 FFFF FFDEh 4294967262 A2 FFFF FFDEh A3 XXXX XXXXh A3 0000 0000h false Example 6 CMPGTU L1 0Ah A1 A2 Before instruction 1...

Страница 103: ...c2 dst scst5 xsint uint L1 L2 1010110 CMPLT src1 src2 dst xsint slong uint L1 L2 1010101 CMPLT src1 src2 dst scst5 slong uint L1 L2 1010100 CMPLT src1 src2 dst uint xuint uint L1 L2 1011111 CMPLTU src...

Страница 104: ...2 Written dst Unit in use L Instruction Type Single cycle Delay Slots 0 Example 1 CMPLT L1 A1 A2 A3 Before instruction 1 cycle after instruction A1 0000 07E2h 2018 A1 0000 07E2h A2 0000 0F6Bh 3947 A2...

Страница 105: ...1h true Unsigned 32 bit integer Example 5 CMPLTU L1 14 A1 A2 Before instruction 1 cycle after instruction A1 0000 000Fh 15 A1 0000 000Fh A2 XXXX XXXXh A2 0000 0001h true Example 6 CMPLTU L1 A1 A5 A4 A...

Страница 106: ...1 1 1 1 Description The field in src2 specified by csta and cstb is extracted and sign extended to 32 bits The extract is performed by a shift left followed by a signed shift right csta and cstb are t...

Страница 107: ...2 3 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 31 30 29 28 27 26 25 24...

Страница 108: ...t Example 1 EXT S1 A1 10 19 A2 Before instruction 1 cycle after instruction A1 07A4 3F2Ah A1 07A4 3F2Ah A2 XXXX XXXXh A2 FFFF F21Fh Example 2 EXT S1 A1 A2 A3 Before instruction 1 cycle after instructi...

Страница 109: ...11 Description The field in src2 specified by csta and cstb is extracted and zero extended to 32 bits The extract is performed by a shift left followed by an unsigned shift right csta and cstb are th...

Страница 110: ...to produce 1 2 3 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 31 30 29 28...

Страница 111: ...10 19 A2 Before instruction 1 cycle after instruction A1 07A4 3F2Ah A1 07A4 3F2Ah A2 XXXX XXXXh A2 0000 121Fh Example 2 EXTU S1 A1 A2 A3 Before instruction 1 cycle after instruction A1 03B6 E7D5h A1...

Страница 112: ...0 0 0 0 0 s p 31 Reserved 18 17 16 14 15 1 14 13 12 11 10 9 8 7 6 0 0 0 0 0 0 0 0 1 1 1 1 1 4 3 2 Description This instruction performs an infinite multicycle NOP that terminates upon servicing an in...

Страница 113: ...unsigned constant ucst5 If an offset is not given the assembler assigns an offset of zero offsetR and baseR must be in the same register file and on the same side as the D unit used The y bit in the...

Страница 114: ...and s 1 indicates dst will be loaded in the B register file The r bit should be set to zero Table 3 13 Data Types Supported by Loads Mnemonic ld st Field Load Data Type SIze Left Shift of Offset LDB 0...

Страница 115: ...ou must type either brackets or parentheses around the specified offset if you use the optional offset parameter Word and halfword addresses must be aligned on word two LSBs are 0 and halfword LSB is...

Страница 116: ...D1 A4 A1 A8 Before LDH 1 cycle after LDH 5 cycles after LDH A1 0000 0002h A1 0000 0002h A1 0000 0002h A4 0000 0020h A4 0000 0024h A4 0000 0024h A8 1103 51FFh A8 1103 51FFh A8 FFFF A21Fh AMR 0000 0000h...

Страница 117: ...gister Offset 3 70 Example 5 LDW D1 A4 1 A6 Before LDW 1 cycle after LDW 5 cycles after LDW A4 0000 0100h A4 0000 0104h A4 0000 0104h A6 1234 5678h A6 1234 5678h A6 0217 6991h AMR 0000 0000h 0000 0000...

Страница 118: ...ucst15 is added to baseR Subtraction is not supported The result of the calculation is the address sent to memory The ad dressing arithmetic is always performed in linear mode For LDH U and LDB U the...

Страница 119: ...ld st Field Load Data Type SIze Left Shift of Offset LDB 0 1 0 Load byte 8 0 bits LDBU 0 0 1 Load byte unsigned 8 0 bits LDH 1 0 0 Load halfword 16 1 bit LDHU 0 0 0 Load halfword unsigned 16 1 bit LD...

Страница 120: ...oint Instruction Set Example LDB D2 B14 36 B1 Before LDB 1 cycle after LDB B1 XXXX XXXXh B1 XXXX XXXXh B14 0000 0100h B14 0000 0100h mem 124 127h 4E7A FF12h mem 124 127h 4E7A FF12h mem 124h 12h mem 12...

Страница 121: ...following diagram illustrates the operation of LMBD for several cases 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 x 0 1 x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x...

Страница 122: ...n Set Pipeline Stage E1 Read src1 src2 Written dst Unit in use L Instruction Type Single cycle Delay Slots 0 Example LMBD L1 A1 A2 A3 Before instruction 1 cycle after instruction A1 0000 0001h A1 0000...

Страница 123: ...16 sint M1 M2 11101 MPYUS src1 src2 dst slsb16 xulsb16 sint M1 M2 11011 MPYSU src1 src2 dst scst5 xslsb16 sint M1 M2 11000 MPY src1 src2 dst scst5 xulsb16 sint M1 M2 11110 MPYSU Opcode 31 29 28 27 23...

Страница 124: ...A1 0000 0123h A2 01E0 FA81h 1407 A2 01E0 FA81h A3 XXXX XXXXh A3 FFF9 C0A3 409437 Example 2 MPYU M1 A1 A2 A3 Before instruction 2 cycles after instruction A1 0000 0123h 291 A1 0000 0123h A2 0F12 FA81h...

Страница 125: ...fore instruction 2 cycles after instruction A1 3497 FFF3h 13 A1 3497 FFF3h A2 XXXX XXXXh A2 FFFF FF57h 163 Example 5 MPYSU M1 13 A1 A2 Before instruction 2 cycles after instruction A1 3497 FFF3h 65523...

Страница 126: ...1 src2 dst umsb16 xumsb16 uint M1 M2 00111 MPYHU src1 src2 dst umsb16 xsmsb16 sint M1 M2 00101 MPYHUS src1 src2 dst smsb16 xumsb16 sint M1 M2 00011 MPYHSU Opcode 31 29 28 27 23 22 18 17 creg z dst src...

Страница 127: ...1234h 89 A2 FFA7 1234h A3 XXXX XXXXh A3 FFFF F3D5h 3115 Example 2 MPYHU M1 A1 A2 A3 Before instruction 2 cycles after instruction A1 0023 0000h 35 A1 0023 0000h A2 FFA7 1234h 65447 A2 FFA7 1234h A3 X...

Страница 128: ...HL src1 src2 dst umsb16 xulsb16 uint M1 M2 01111 MPYHLU src1 src2 dst umsb16 xslsb16 sint M1 M2 01101 MPYHULS src1 src2 dst smsb16 xulsb16 sint M1 M2 01011 MPYHSLU Opcode 31 29 28 27 23 22 18 17 creg...

Страница 129: ...src1 src2 Written dst Unit in use M Instruction Type Multiply 16 16 Delay Slots 1 Example MPYHL M1 A1 A2 A3 Before instruction 2 cycles after instruction A1 008A 003Eh 138 A1 008A 003Eh A2 21FF 00A7h...

Страница 130: ...LH src1 src2 dst ulsb16 xumsb16 uint M1 M2 10111 MPYLHU src1 src2 dst ulsb16 xsmsb16 sint M1 M2 10101 MPYLUHS src1 src2 dst slsb16 xumsb16 sint M1 M2 10011 MPYLSHU Opcode 31 29 28 27 23 22 18 17 creg...

Страница 131: ...d src1 src2 Written dst Unit in use M Instruction Type Multiply 16 16 Delay Slots 1 Example MPYLH M1 A1 A2 A3 Before instruction 2 cycles after instruction A1 0900 000Eh 14 A1 0900 000Eh A2 0029 00A7h...

Страница 132: ...src dst xsint sint L1 L2 0000010 src dst sint sint D1 D2 010010 src dst slong slong L1 L2 0100001 src dst xsint sint S1 S2 000110 Opcode See ADD instruction Description This is a pseudo operation that...

Страница 133: ...scription The src2 register is moved from the control register file to the register file Valid values for src2 are any register listed in the control register file Operands when moving from the regist...

Страница 134: ...nterrupt enable register 00100 R W ISTP Interrupt service table pointer 00101 R W IRP Interrupt return pointer 00110 R W NRP Nonmaskable interrupt return pointer 00111 R W PCE1 Program counter E1 phas...

Страница 135: ...has one delay slot because the results cannot be read by the MVC instruction in the IFR until two cycles after the write to the ISR or ICR Delay Slots 0 Example MVC S2 B1 AMR Before instruction 1 cyc...

Страница 136: ...creg 29 28 27 23 22 7 1 3 1 1 Description The 16 bit constant is sign extended and placed in dst Execution if cond scst16 dst else nop Pipeline Stage E1 Read Written dst Unit in use S Instruction Type...

Страница 137: ...93 A1 Before instruction 1 cycle after instruction A1 XXXX XXXXh A1 0000 0125h 293 Example 2 MVK S2 125h B1 Before instruction 1 cycle after instruction B1 XXXX XXXXh B1 0000 0125h 293 Example 3 MVK S...

Страница 138: ...on The 16 bit constant cst is loaded into the upper 16 bits of dst The 16 LSBs of dst are unchanged The assembler encodes the 16 MSBs of a 32 bit constant into the cst field of the opcode for the MVKH...

Страница 139: ...uctions MVK 0x5678 MVKLH 0x1234 You could also use MVK 0x12345678 MVKH 0x12345678 If you are loading the address of a label use MVK label MVKH label Example 1 MVKH S1 0A329123h A1 Before instruction 1...

Страница 140: ...Opfield src dst xsint sint S1 S2 010110 src dst xsint sint L1 L2 0000110 src dst slong slong L1 L2 0100100 Opcode See SUB instruction Description This is a pseudo operation used to negate src and pla...

Страница 141: ...A multicycle NOP will not finish if a branch is completed first For example if a branch is initiated on cycle n and a NOP 5 instruction is initiated on cycle n 3 the branch is complete on cycle n 6 an...

Страница 142: ...TMS320C62x C67x Fixed Point Instruction Set Example 2 MVK S1 1 A1 MVKLH S1 0 A1 NOP 5 ADD L1 A1 A2 A1 Before NOP 5 1 cycle after ADD instruction 6 cycles after NOP 5 A1 0000 0001h A1 0000 0004h A2 000...

Страница 143: ...x x x x x x x x x x x x x x x x x x x x x x x x In this case NORM returns 3 In this case NORM returns 30 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 31 30 29...

Страница 144: ...E1 Read src2 Written dst Unit in use L Delay Slots 0 Example 1 NORM L1 A1 A2 Before instruction 1 cycle after instruction A1 02A3 469Fh A1 02A3 469Fh A2 XXXX XXXXh A2 0000 0005h 5 Example 2 NORM L1 A1...
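As a rough illustration of what NORM computes for a 32-bit operand, the sketch below counts the redundant sign bits, that is, the bits below bit 31 that repeat the sign bit. The function name is hypothetical and the 40-bit form is not modelled.

```python
def norm32(x: int) -> int:
    """Count the redundant sign bits of a 32-bit value."""
    x &= 0xFFFFFFFF
    sign = x >> 31
    count = 0
    for bit in range(30, -1, -1):      # walk down from bit 30
        if (x >> bit) & 1 != sign:     # first bit that differs from the sign
            break
        count += 1
    return count

print(norm32(0x02A3469F))  # 5, matching Example 1
```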

Страница 145: ...uint uint L1 L2 1101110 src dst xuint uint S1 S2 001010 Opcode See XOR instruction Description This is a pseudo operation used to bitwise NOT the src operand and place the result in dst The assembler...

Страница 146: ...src1 src2 dst uint xuint uint S1 S2 011011 src1 src2 dst scst5 xuint uint S1 S2 011010 Opcode L unit form 31 29 28 27 23 22 18 17 creg z dst 13 12 11 5 4 3 2 1 0 x op 1 1 0 s p 3 5 5 5 7 src2 src1 cs...

Страница 147: ...truction Type Single cycle Delay Slots 0 Example 1 OR L1X A1 B1 A2 Before instruction 1 cycle after instruction A1 08A3 A49Fh A1 08A3 A49Fh A2 XXXX XXXXh A2 08FF B7DFh B1 00FF 375Ah B1 00FF 375Ah Exam...

Страница 148: ...eg z dst 13 12 11 5 4 3 2 1 0 x op 1 1 0 s p 3 5 5 5 7 src2 src1 cst Description src1 is added to src2 and saturated if an overflow occurs according to the following rules 1 If the dst is an int and...

Страница 149: ...A1 5A2E 51A3h A2 012A 3FA2h 19546018 A2 012A 3FA2h A2 012A 3FA2h A3 XXXX XXXXh A3 5B58 9145h 1532531013 A3 5B58 9145h CSR 0001 0100h CSR 0001 0100h CSR 0001 0100h Not saturated Example 2 SADD L1 A1 A...
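The 32-bit saturating addition can be sketched as follows. This is a minimal model of the int form only; the names are hypothetical, and the returned flag stands in for the saturation bit that the instruction sets in the CSR.

```python
INT_MAX = 0x7FFFFFFF
INT_MIN = -0x80000000

def to_signed32(x: int) -> int:
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x

def sadd(src1: int, src2: int):
    """Return (32-bit result pattern, saturated flag)."""
    total = to_signed32(src1) + to_signed32(src2)
    if total > INT_MAX:
        return INT_MAX, True
    if total < INT_MIN:
        return INT_MIN & 0xFFFFFFFF, True
    return total & 0xFFFFFFFF, False

result, saturated = sadd(0x5A2E51A3, 0x012A3FA2)
print(f"{result:08X}", saturated)  # 5B589145 False, as in Example 1
```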

Страница 150: ...instruction A5 A4 0000 0000h 7C83 39B1h 1922644401 A5 A4 0000 0000h 7C83 39B1h A7 A6 XXXX XXXXh XXXX XXXXh A7 A6 0000 0000h 8DAD 7953h 2376956243 B2 112A 3FA2h 287981474 B2 112A 3FA2h CSR 0001 0100h C...

Страница 151: ...ription A 40 bit src2 value is converted to a 32 bit value If the value in src2 is greater than what can be represented in 32 bits src2 is saturated The result is placed in dst If a saturate occurs th...

Страница 152: ...aturated Example 2 SAT L2 B1 B0 B5 Before instruction 1 cycle after instruction 2 cycles after instruction B1 B0 0000 0000h A190 7321h B1 B0 0000 0000h A190 7321h B1 B0 0000 0000h A190 7321h B5 XXXX X...
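The 40-bit to 32-bit saturation can be sketched as below. The register pair is passed as two arguments (the odd register supplies the upper 8 bits); the function name is a hypothetical helper, and the flag models the saturation indication only loosely.

```python
def sat40(high: int, low: int):
    """Saturate a 40-bit signed value (high[7:0]:low[31:0]) to 32 bits."""
    value = ((high & 0xFF) << 32) | (low & 0xFFFFFFFF)
    if value & (1 << 39):                 # sign bit of the 40-bit value
        value -= 1 << 40
    if value > 0x7FFFFFFF:
        return 0x7FFFFFFF, True
    if value < -0x80000000:
        return 0x80000000, True
    return value & 0xFFFFFFFF, False

# Operand of Example 2: B1:B0 = 0000 0000h A190 7321h
print(sat40(0x00000000, 0xA1907321))
```

With the upper bits zero, A190 7321h is a positive 40-bit value larger than the largest 32-bit signed integer, so this model saturates it.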

Страница 153: ...nd type Unit src2 csta cstb dst uint ucst5 ucst5 uint S1 S2 src2 src1 dst xuint uint uint S1 S2 Opcode Constant form 5 5 z dst cstb 6 5 0 src2 1 0 0 0 1 0 s p 31 creg 29 28 27 23 22 7 1 3 18 13 1 1 17...

Страница 154: ...rc2 is 31 In the example below csta is 15 and cstb is 23 Only the ten LSBs are valid for the register version of the instruction If any of the 22 MSBs are non zero the result is invalid src2 dst 0 x x...

Страница 155: ...efore instruction 1 cycle after instruction A0 4B13 4A1Eh A0 4B13 4A1Eh A1 XXXX XXXXh A1 4B3F FF9Eh Example 2 SET S2 B0 B1 B2 Before instruction 1 cycle after instruction B0 9ED3 1A31h B0 9ED3 1A31h B...
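The bit-field operation of SET can be sketched as follows. The operands of Example 1 are cut off in the excerpt above, so the field bounds used here (csta = 7, cstb = 21) are an assumption; they were chosen because they reproduce the result shown. The function name is hypothetical.

```python
def set_field(src2: int, csta: int, cstb: int) -> int:
    """Set bits csta..cstb (inclusive) of src2 to 1; expects csta <= cstb."""
    width = cstb - csta + 1
    mask = ((1 << width) - 1) << csta
    return (src2 | mask) & 0xFFFFFFFF

print(f"{set_field(0x4B134A1E, 7, 21):08X}")  # 4B3FFF9E
```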

Страница 156: ...S2 110000 src2 src1 dst xuint ucst5 ulong S1 S2 010010 Opcode 31 29 28 27 23 22 18 17 creg z dst 13 12 5 4 3 2 1 0 op 0 0 0 s p 3 5 5 5 6 6 1 11 x src1 cst src2 Description The src2 operand is shifte...

Страница 157: ...tion A0 29E3 D31Ch A0 29E3 D31Ch A1 XXXX XXXXh A1 9E3D 31C0h Example 2 SHL S2 B0 B1 B2 Before instruction 1 cycle after instruction B0 4197 51A5h B0 4197 51A5h B1 0000 0009h B1 0000 0009h B2 XXXX XXXX...

Страница 158: ...0 s p 3 5 5 5 6 6 1 11 x src1 cst src2 Description The src2 operand is shifted to the right by the src1 operand The sign extended result is placed in dst When a register is used the six LSBs specify t...

Страница 159: ...Example 2 SHR S2 B0 B1 B2 Before instruction 1 cycle after instruction B0 1492 5A41h B0 1492 5A41h B1 0000 0012h B1 0000 0012h B2 XXXX XXXXh B2 0000 0524h Example 3 SHR S2 B1 B0 B2 B3 B2 Before instr...

Страница 160: ...0 s p 3 5 5 5 6 6 1 11 x src1 cst src2 Description The src2 operand is shifted to the right by the src1 operand The zero extended result is placed in dst When a register is used the six LSBs specify t...

Страница 161: ...SHRU Logical Shift Right 3 114 Delay Slots 0 Example SHRU S1 A0 8 A1 Before instruction 1 cycle after instruction A0 F123 63D1h A0 F123 63D1h A1 XXXX XXXXh A1 00F1 2363h...
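The difference between the arithmetic shift (SHR, sign-extended) and the logical shift (SHRU, zero-extended) can be illustrated with a small model of the 32-bit forms. The function names are hypothetical, and shift amounts are assumed to be in the range 0 to 31.

```python
def shr(src2: int, shift: int) -> int:
    """Arithmetic shift right: the sign bit is replicated."""
    value = src2 & 0xFFFFFFFF
    if value & 0x80000000:
        value -= 0x100000000
    return (value >> shift) & 0xFFFFFFFF

def shru(src2: int, shift: int) -> int:
    """Logical shift right: zeros are shifted in."""
    return (src2 & 0xFFFFFFFF) >> shift

print(f"{shr(0x14925A41, 0x12):08X}")  # 00000524, SHR Example 2
print(f"{shru(0xF12363D1, 8):08X}")    # 00F12363, SHRU example
print(f"{shr(0xF12363D1, 8):08X}")     # FFF12363, same operand with sign extension
```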

Страница 162: ...PY src1 src2 dst smsb16 xslsb16 sint M1 M2 01010 SMPYHL src1 src2 dst slsb16 xsmsb16 sint M1 M2 10010 SMPYLH src1 src2 dst smsb16 xsmsb16 sint M1 M2 00010 SMPYH Opcode 31 29 28 27 23 22 18 17 creg z d...

Страница 163: ...1 SMPY M1 A1 A2 A3 Before instruction 2 cycles after instruction A1 0000 0123h 291 A1 0000 0123h A2 01E0 FA81h -1407 A2 01E0 FA81h A3 XXXX XXXXh A3 FFF3 8146h -818874 CSR 0001 0100h CSR 0001 0100h Not sa...

Страница 164: ...Point Instruction Set Example 3 SMPYLH M1 A1 A2 A3 Before instruction 2 cycles after instruction A1 0000 8000h -32768 A1 0000 8000h A2 8000 0000h -32768 A2 8000 0000h A3 XXXX XXXXh A3 7FFF FFFFh 214748...
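The saturating multiply family can be sketched with one helper that takes the two 16-bit halves already selected by the instruction variant (LSBs for SMPY, a mix of LSBs and MSBs for SMPYLH and SMPYHL, MSBs for SMPYH). This is an illustrative model with hypothetical names: it multiplies the signed halves, shifts the product left by 1, and saturates the single overflow case.

```python
def s16(x: int) -> int:
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x

def smpy16(a16: int, b16: int):
    """Signed 16x16 multiply, left shift by 1, saturate; returns (result, saturated)."""
    product = (s16(a16) * s16(b16)) << 1
    if product == 0x80000000:       # only 8000h x 8000h overflows
        return 0x7FFFFFFF, True
    return product & 0xFFFFFFFF, False

# Example 1 (SMPY): lsb16(A1) = 0123h, lsb16(A2) = FA81h
print(smpy16(0x0123, 0xFA81))   # (0xFFF38146, False) i.e. -818874
# Example 3 (SMPYLH): lsb16(A1) = 8000h, msb16(A2) = 8000h
print(smpy16(0x8000, 0x8000))   # (0x7FFFFFFF, True)
```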

Страница 165: ...a register is used to specify the shift the five least significant bits specify the shift amount Valid values are 0 through 31 and the result of the shift is invalid if the shift amount is greater tha...

Страница 166: ...031Ch A0 02E3 031Ch A1 XXXX XXXXh A1 0B8C 0C70h A1 0B8C 0C70h CSR 0001 0100h CSR 0001 0100h CSR 0001 0100h Not saturated Example 2 SSHL S1 A0 A1 A2 Before instruction 1 cycle after instruction 2 cycl...
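A shift left with saturation can be modelled as below. The sketch treats saturation as ordinary signed overflow of the shifted value, which is an assumption about how to express the rule rather than a transcription of it; the name `sshl` is a hypothetical helper and shift amounts are assumed to be 0 to 31.

```python
def sshl(src2: int, shift: int):
    """Shift a signed 32-bit value left, saturating on overflow."""
    value = src2 & 0xFFFFFFFF
    if value & 0x80000000:
        value -= 0x100000000
    shifted = value << shift
    if shifted > 0x7FFFFFFF:
        return 0x7FFFFFFF, True
    if shifted < -0x80000000:
        return 0x80000000, True
    return shifted & 0xFFFFFFFF, False

print(sshl(0x02E3031C, 2))  # (0x0B8C0C70, False), the unsaturated case of Example 1
```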

Страница 167: ...5 4 3 2 1 0 x op 1 1 0 s p 3 5 5 5 7 src2 src1 cst Description src2 is subtracted from src1 and is saturated to the result size according to the following rules 1 If the result is an int and src1 src2...

Страница 168: ...512984995 B1 5A2E 51A3h B1 5A2E 51A3h B2 802A 3FA2h 2144714846 B2 802A 3FA2h B2 802A 3FA2h B3 XXXX XXXXh B3 7FFF FFFFh 2147483647 B3 7FFF FFFFh CSR 0001 0100h CSR 0001 0100h CSR 0001 0300h Saturated E...

Страница 169: ...selects the D2 unit and baseR and offsetR from the B register file offsetR ucst5 is scaled by a left shift of 0 1 or 2 for STB STH and STW respectively After scaling offsetR ucst5 is added to or subt...

Страница 170: ...R ucst5 Preincrement 1 0 0 0 R ucst5 Predecrement 1 0 1 1 R ucst5 Postincrement 1 0 1 0 R ucst5 Postdecrement Increments and decrements default to 1 and offsets default to zero when no bracketed regi...

Страница 171: ...Chapter 6 TMS320C67x Pipeline Example 1 STB D1 A1 A10 Before instruction 1 cycle after instruction 3 cycles after instruction A1 9A32 7634h A1 9A32 7634h A1 9A32 7634h A10 0000 0100h A10 0000 0100h A1...

Страница 172: ...0000 0100h A10 0000 0104h A10 0000 0104h mem 100h 1111 1134h mem 100h 1111 1134h mem 100h 1111 1134h mem 104h 0000 1111h mem 104h 0000 1111h mem 104h 9A32 7634h Example 4 STH D1 A1 A10 A11 Before inst...
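The offset scaling and data-width selection for STB, STH, and STW can be sketched as follows. This is a simplified model under stated assumptions: positive offsets, linear addressing only, and no modelling of the pre/post update of the base register. The function and table names are hypothetical.

```python
SHIFT = {"STB": 0, "STH": 1, "STW": 2}   # left shift applied to the offset

def store_access(op: str, src: int, base: int, offset: int = 0):
    """Return (byte address, data bits written) for a bracketed offset."""
    addr = (base + (offset << SHIFT[op])) & 0xFFFFFFFF
    nbits = 8 << SHIFT[op]               # 8, 16, or 32 bits are stored
    data = src & ((1 << nbits) - 1)      # only the LSBs of src reach memory
    return addr, data

addr, data = store_access("STW", 0x9A327634, 0x100, 1)
print(f"{addr:X}h <- {data:08X}h")       # 104h <- 9A327634h
addr, data = store_access("STB", 0x9A327634, 0x100)
print(f"{addr:X}h <- {data:02X}h")       # 100h <- 34h
```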

Страница 173: ...dded to baseR The result of the calculation is the address that is sent to memory The addressing arithmetic is always performed in linear mode For STB and STH the 8 and 16 LSBs of the src register ar...

Страница 174: ...6 1 bit STW 1 1 1 Store word 32 2 bits Execution if cond src mem else nop Pipeline Stage E1 E2 E3 Read B14 B15 src Written Unit in use D2 Instruction Type Store Delay Slots 0 Note This instruction exe...

Страница 175: ...nt L1 L2 0000111 SUB src1 src2 dst xsint sint sint L1 L2 0010111 SUB src1 src2 dst sint xsint slong L1 L2 0100111 SUB src1 src2 dst xsint sint slong L1 L2 0110111 SUB src1 src2 dst uint xuint ulong L1...

Страница 176: ...5 5 7 src2 src1 cst S unit form 31 29 28 27 23 22 18 17 creg z dst 13 12 5 4 3 2 1 0 op 0 0 0 s p 3 5 5 5 6 6 1 11 x src1 cst src2 Description for L1 L2 and S1 S2 Opcodes src2 is subtracted from src1...

Страница 177: ...ant ucst5 allows a greater offset for addressing with the D unit Pipeline Stage E1 Read src1 src2 Written dst Unit in use L S or D Instruction Type Single cycle Delay Slots 0 Example 1 SUB L1 A1 A2 A3...

Страница 178: ...creg z dst 13 12 5 4 3 2 1 0 op 0 0 0 s p 3 5 5 5 6 7 6 1 0 src2 src1 cst Description src1 is subtracted from src2 The subtraction defaults to linear mode However if src2 is one of A4 A7 or B4 B7 th...

Страница 179: ...0004h A0 0000 0004h A5 0000 4000h A5 0000 400Ch AMR 0003 0004h AMR 0003 0004h BK0 3 size 16 A5 in circular addressing mode using BK0 Example 2 SUBAW D1 A5 2 A3 Before instruction 1 cycle after instru...

Страница 180: ...1 5 4 3 2 1 0 x 1 0 0 1 0 1 1 1 1 0 s p 3 5 5 5 7 src2 src1 Description Subtract src2 from src1 If the result is greater than or equal to 0 left shift the result by 1 add 1 to it and place it in dst If resul...

Страница 181: ...BC L1 A0 A1 A0 Before instruction 1 cycle after instruction A0 0000 125Ah 4698 A0 0000 24B4h 9396 A1 0000 1F12h 7954 A1 0000 1F12h Example 2 SUBC L1 A0 A1 A0 Before instruction 1 cycle after instruct...
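The conditional subtract step, and the way such a step is commonly repeated to perform division, can be sketched as below. The single step follows the description above; the 16-iteration division loop is an assumption about typical usage, not something stated in this excerpt. The sketch uses non-negative Python integers and does not model signed wraparound; all names are hypothetical.

```python
def subc(src1: int, src2: int) -> int:
    """One SUBC step: subtract, then shift left (adding 1 if no borrow)."""
    diff = src1 - src2
    if diff >= 0:
        return ((diff << 1) + 1) & 0xFFFFFFFF
    return (src1 << 1) & 0xFFFFFFFF

def divide_u16(num: int, den: int):
    """Assumed usage: 16 SUBC steps against the divisor shifted left by 15.

    Expects 0 <= num <= 0xFFFF and 1 <= den <= 0xFFFF.
    Returns (quotient, remainder).
    """
    acc = num
    shifted = den << 15
    for _ in range(16):
        acc = subc(acc, shifted)
    return acc & 0xFFFF, acc >> 16

print(f"{subc(0x125A, 0x1F12):08X}")  # 000024B4, as in Example 1
print(divide_u16(100, 7))             # (14, 2)
```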

Страница 182: ...wer halves of src2 are subtracted from the upper and lower halves of src1 Any borrow from the lower half subtraction does not affect the upper half subtraction Execution if cond lsb16 src1 lsb16 src2...
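The independent halfword subtraction can be illustrated as follows: each 16-bit half is subtracted on its own, so a borrow out of the lower half never reaches the upper half. The function name is a hypothetical helper.

```python
def sub2(src1: int, src2: int) -> int:
    """Subtract upper and lower 16-bit halves independently."""
    low = ((src1 & 0xFFFF) - (src2 & 0xFFFF)) & 0xFFFF
    high = ((src1 >> 16) - (src2 >> 16)) & 0xFFFF
    return (high << 16) | low

a, b = 0x00010000, 0x00000001
print(f"{sub2(a, b):08X}")                 # 0001FFFF: upper half unaffected
print(f"{(a - b) & 0xFFFFFFFF:08X}")       # 0000FFFF: ordinary 32-bit subtract
```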

Страница 183: ...uint uint S1 S2 001011 src1 src2 dst scst5 xuint uint S1 S2 001010 Opcode L unit form 31 29 28 27 23 22 18 17 creg z dst 13 12 11 5 4 3 2 1 0 x op 1 1 0 s p 3 5 5 5 7 src2 src1 cst S unit form 31 29 2...

Страница 184: ...Unit in use L or S Instruction Type Single cycle Delay Slots 0 Example 1 XOR L1 A1 A2 A3 Before instruction 1 cycle after instruction A1 0721 325Ah A1 0721 325Ah A2 0019 0F12h A2 0019 0F12h A3 XXXX X...

Страница 185: ...dst sint S1 S2 010111 dst slong L1 L2 0110111 Description This is a pseudo operation used to fill the dst register with 0s by subtracting the dst from itself and placing the result in the dst The ass...

Страница 186: ...cluding addition subtraction and multiplication This chapter describes these C67x specific instructions Instructions that are common to both the C62x and C67x are described in Chapter 3 Topic Page 4...

Страница 187: ...sion floating point register value dp x Convert x to dp dst_h msb32 of dst dst_l lsb32 of dst int 32 bit integer value int x Convert x to integer lsbn or LSBn n least significant bits for example lsb3...

Страница 188: ...mple ucstn5 uint Unsigned 32 bit integer value dp Double precision floating point register value xsint Signed 32 bit integer value that can optionally use cross path sp Single precision floating point...

Страница 189: ...S Unit D Unit ADDDP MPYDP ABSDP ADDAD ADDSP MPYI ABSSP LDDW DPINT MPYID CMPEQDP DPSP MPYSP CMPEQSP INTDP CMPGTDP INTDPU CMPGTSP INTSP CMPLTDP INTSPU CMPLTSP SPINT RCPDP SPTRUNC RCPSP SUBDP RSQRDP SUB...

Страница 190: ...t S Unit M Unit L Unit CMPLTDP n DP compare CMPLTSP n Single cycle DPINT n 4 cycle DPSP n 4 cycle DPTRUNC n 4 cycle INTDP n INTDP INTDPU n INTDP INTSP n 4 cycle INTSPU n 4 cycle LDDW n Load MPYDP n MP...

Страница 191: ...produce a double precision result write the low 32 bit word one cycle before writing the high 32 bit word If an instruction that writes a DP result is followed by an instruction that uses the result...

Страница 192: ...issa field x Can have value of 0 or 1 don t care NaN Not a Number SNaN or QNaN SNaN Signal NaN QNaN Quiet NaN NaN_out QNaN with all bits in the f field 1 Inf Infinity LFPN Largest floating point numbe...

Страница 193: ...int fields represent floating point numbers within two ranges normalized e is between 0 and 255 and denormalized e is 0 The following formulas define how to translate the s e and f fields into a singl...

Страница 194: ...x0000 0001 1 40129846e 45 Figure 4 2 shows the fields of a double precision floating point number repre sented within a pair of 32 bit registers Figure 4 2 Double Precision Floating Point Fields 31 e...
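The translation of the s, e, and f fields into a single-precision value can be sketched as below. The formulas themselves are cut off in the excerpt above, so this sketch assumes the standard IEEE 754 single-precision interpretation (bias 127, 23-bit fraction); the function name is hypothetical.

```python
import math

def decode_sp(bits: int) -> float:
    """Decode a 32-bit pattern into its single-precision value."""
    s = (bits >> 31) & 1
    e = (bits >> 23) & 0xFF
    f = bits & 0x7FFFFF
    sign = -1.0 if s else 1.0
    if e == 255:                                   # infinity or NaN
        return sign * math.inf if f == 0 else math.nan
    if e == 0:                                     # denormalized: no hidden 1
        return sign * 2.0 ** -126 * (f / 2 ** 23)
    return sign * 2.0 ** (e - 127) * (1 + f / 2 ** 23)   # normalized

print(decode_sp(0x00000001))   # about 1.40129846e-45, the smallest denormal
print(decode_sp(0xC0200000))   # -2.5
```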

Страница 195: ...le 4 8 shows hex and decimal values for some double precision floating point numbers Table 4 8 Hex and Decimal Representation for Selected Double Precision Values Symbol Hex Value Decimal Value NaN_ou...

Страница 196: ...es the functional unit read ports For example the ADDDP instruction has a functional unit latency of 2 Operands are read on cycle i and cycle i 1 Therefore a new instruction cannot begin until cycle i...

Страница 197: ...tional unit on cycles i i 1 i 2 and i 3 If a cross path is used to read a source in an instruction with a multicycle func tional unit latency you must ensure that no other instructions executing on th...

Страница 198: ...on cycle i 3 or i 4 due to a write hazard on cycle i 3 or i 4 respectively An INTDP instruction cannot be scheduled on that func tional unit on cycle i 1 due to a write hazard on cycle i 1 A 4 cycle i...

Страница 199: ...nit on cycle i 2 or i 3 due to a write hazard on cycle i 5 or i 6 respectively All of the above cases deal with double precision floating point instructions or the MPYI or MPYID instructions except fo...

Страница 200: ...idual Instruction Descriptions This section gives detailed information on the floating point instruction set for the C67x Each instruction presents the following information Assembler syntax Functiona...

Страница 201: ...2 port for the 32 MSBs and the src1 port for the 32 LSBs Execution if cond abs src2 dst else nop The absolute value of src2 is determined as follows 1 If src2 ≥ 0 then src2 dst 2 If src2 < 0 then -src2...

Страница 202: ...ay slots can be reduced by one because these instructions read the lower word of the DP source one cycle before the upper word of the DP source Instruction Type 2 cycle DP Delay Slots 1 Functional Uni...

Страница 203: ...abs src2 dst else nop The absolute value of src2 is determined as follows 1 If src2 ≥ 0 then src2 dst 2 If src2 < 0 then -src2 dst Notes 1 If src2 is SNaN NaN_out is placed in dst and the INVAL and NA...

Страница 204: ...Absolute Value ABSSP 4 19 TMS320C67x Floating Point Instruction Set Functional Unit Latency 1 Example ABSSP S1X B1 A5 Before instruction 1 cycle after instruction B1 c020 0000h 2 5 B1 c020 0000h 2 5...

Страница 205: ...leword addressing mode specified for src2 The addition defaults to linear mode However if src2 is one of A4 A7 or B4 B7 the mode can be changed to circular mode by writing the appropriate value to th...

Страница 206: ...DDAD 4 21 TMS320C67x Floating Point Instruction Set Functional Unit Latency 1 Example ADDAD D1 A1 A2 A3 Before instruction 1 cycle after instruction A1 0000 1234h 4660 A1 0000 1234h 4660 A2 0000 0002h...

Страница 207: ...or L2 Opcode map field used For operand type Unit src1 src2 dst dp xdp dp L1 L2 Opcode 31 29 28 27 23 22 18 17 creg z dst 13 12 11 5 4 3 2 1 0 x 0 0 1 1 0 0 0 1 1 0 s p 3 5 5 5 7 src2 src1 Description...

Страница 208: ...int number Overflow Output Rounding Mode Result Sign Nearest Even Zero Infinity Infinity infinity LFPN infinity LFPN infinity LFPN LFPN infinity 6 If underflow occurs the INEX and UNDER bits are set a...

Страница 209: ...be reduced by one because these instructions read the lower word of the DP source one cycle before the upper word of the DP source Instruction Type ADDDP SUBDP Delay Slots 6 Functional Unit Latency 2...

Страница 210: ...rc1 src2 dst unit L1 or L2 Opcode map field used For operand type Unit src1 src2 dst sp xsp sp L1 L2 Opcode 31 29 28 27 23 22 18 17 creg z dst 13 12 11 5 4 3 2 1 0 x 0 0 1 0 0 0 0 1 1 0 s p 3 5 5 5 7...

Страница 211: ...put Rounding Mode Result Sign Nearest Even Zero Infinity Infinity infinity LFPN infinity LFPN infinity LFPN LFPN infinity 6 If underflow occurs the INEX and UNDER bits are set and the results are roun...

Страница 212: ...ge E1 E2 E3 E4 Read src1 src2 Written dst Unit in use L Instruction Type 4 cycle Delay Slots 3 Functional Unit Latency 1 Example ADDSP L1 A1 A2 A3 Before instruction 4 cycles after instruction A1 C020...

Страница 213: ...src1 src2 Description This instruction compares src1 to src2 If src1 equals src2 1 is written to dst Otherwise 0 is written to dst Execution if cond if src1 src2 1 dst else 0 dst else nop Special cas...

Страница 214: ...xcept the NaNn and DENn bits when appropriate Pipeline Stage E1 E2 Read src1_l src2_l src1_h src2_h Written dst Unit in use S S Instruction Type DP compare Delay Slots 1 Functional Unit Latency 2 Exam...

Страница 215: ...src2 Description This instruction compares src1 to src2 If src1 equals src2 1 is written to dst Otherwise 0 is written to dst Execution if cond if src1 src2 1 dst else 0 dst else nop Special cases of...

Страница 216: ...hose shown in the preceding table are set except for the NaNn and DENn bits when appropriate Pipeline Stage E1 Read src1 src2 Written dst Unit in use S Instruction Type Single cycle Delay Slots 0 Func...

Страница 217: ...c2 Description This instruction compares src1 to src2 If src1 is greater than src2 1 is written to dst Otherwise 0 is written to dst Execution if cond if src1 src2 1 dst else 0 dst else nop Special ca...

Страница 218: ...s when appropriate Pipeline Stage E1 E2 Read src1_l src2_l src1_h src2_h Written dst Unit in use S S Instruction Type DP compare Delay Slots 1 Functional Unit Latency 2 Example CMPGTDP S1 A1 A0 A3 A2...

Страница 219: ...c2 Description This instruction compares src1 to src2 If src1 is greater than src2 1 is written to dst Otherwise 0 is written to dst Execution if cond if src1 src2 1 dst else 0 dst else nop Special ca...

Страница 220: ...e are set except for the NaNn and DENn bits when appropriate Pipeline Stage E1 Read src1 src2 Written dst Unit in use S Instruction Type Single cycle Delay Slots 0 Functional Unit Latency 1 Example C...

Страница 221: ...2 Description This instruction compares src1 to src2 If src1 is less than src2 1 is written to dst Otherwise 0 is written to dst Execution if cond if src1 < src2 1 dst else 0 dst else nop Special case...

Страница 222: ...s when appropriate Pipeline Stage E1 E2 Read src1_l src2_l src1_h src2_h Written dst Unit in use S S Instruction Type DP compare Delay Slots 1 Functional Unit Latency 2 Example CMPLTDP S1X A1 A0 B3 B2...

Страница 223: ...2 Description This instruction compares src1 to src2 If src1 is less than src2 1 is written to dst Otherwise 0 is written to dst Execution if cond if src1 < src2 1 dst else 0 dst else nop Special case...

Страница 224: ...e are set except for the NaNn and DENn bits when appropriate Pipeline Stage E1 Read src1 src2 Written dst Unit in use S Instruction Type Single cycle Delay Slots 0 Functional Unit Latency 1 Example C...

Страница 225: ...MSBs and the src1 port for the 32 LSBs Execution if cond int src2 dst else nop Notes 1 If src2 is NaN the maximum signed integer 7FFF FFFFh or 8000 0000h is placed in dst and the INVAL bit is set 2 If...

Страница 226: ...PINT 4 41 TMS320C67x Floating Point Instruction Set Delay Slots 3 Functional Unit Latency 1 Example DPINT L1 A1 A0 A4 Before instruction 4 cycles after instruction A1 A0 4021 3333h 3333 3333h 8 6 A1 A...

Страница 227: ...st dp sp L1 L2 Opcode 31 29 28 27 23 22 18 17 creg z dst 13 12 11 5 4 3 2 1 0 x 0 0 0 1 0 0 1 1 1 0 s p 3 5 5 5 7 src2 0 0 0 0 0 Description The double precision 64 bit value in src2 is converted to a...

Страница 228: ...the INEX and DEN2 bits are set 5 If src2 is signed infinity the result is signed infinity and the INFO bit is set 6 If overflow occurs the INEX and OVER bits are set and the results are set as follows...

Страница 229: ...Stage E1 E2 E3 E4 Read src2_l src2_h Written dst Unit in use L Instruction Type 4 cycle Delay Slots 3 Functional Unit Latency 1 Example DPSP L1 A1 A0 A4 Before instruction 4 cycles after instruction...

Страница 230: ...ero truncate is always used The 64 bit operand is read in one cycle by using the src2 port for the 32 MSBs and the src1 port for the 32 LSBs Execution if cond int src2 dst else nop Notes 1 If src2 is...

Страница 231: ...Value to Integer With Truncation 4 46 Delay Slots 3 Functional Unit Latency 1 Example DPTRUNC L1 A1 A0 A4 Before instruction 4 cycles after instruction A1 A0 4021 3333h 3333 3333h 8 6 A1 A0 4021 3333...

Страница 232: ...0 Description The integer value in src2 is converted to a double precision value and placed in dst Execution if cond dp src2 dst else nop You cannot set configuration bits with this instruction Pipeli...

Страница 233: ...ter instruction B4 1965 1127h 426053927 B4 1965 1127h 426053927 A1 A0 XXXX XXXXh XXXX XXXXh A1 A0 41B9 6511h 2700 0000h 4 2605393 E08 Example 2 INTDPU L1 A4 A1 A0 Before instruction 5 cycles after ins...

Страница 234: ...L2 1001001 Opcode 31 29 28 27 23 22 18 17 creg z dst 13 12 11 5 4 3 2 1 0 x op 1 1 0 s p 3 5 5 5 7 src2 0 0 0 0 0 Description The integer value in src2 is converted to single precision value and plac...

Страница 235: ...ore instruction 4 cycles after instruction A1 1965 1127h 426053927 A1 1965 1127h 426053927 A2 XXXX XXXXh A2 4DCB 2889h 4 2605393 E08 Example 2 INTSPU L1X B1 A2 Before instruction 4 cycles after instru...
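The integer to single-precision conversion in Example 1 can be checked with a short sketch. It relies on Python's `struct` module to round the value to the nearest single-precision number, so it assumes round-to-nearest behaviour and does not model the rounding-mode configuration of the device. The function name is a hypothetical helper.

```python
import struct

def intsp_bits(src2: int) -> int:
    """Convert a signed 32-bit integer to its single-precision bit pattern."""
    value = src2 & 0xFFFFFFFF
    if value & 0x80000000:
        value -= 0x100000000
    packed = struct.pack(">f", float(value))   # rounds to nearest float32
    return struct.unpack(">I", packed)[0]

print(f"{intsp_bits(0x19651127):08X}")  # 4DCB2889, as in Example 1
```

The 32-bit integer 426053927 needs 29 significant bits, more than the 24 a single-precision value can hold, which is why the converted value is shown as 4.2605393 E08 rather than the exact integer.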

Страница 236: ...other load and store instructions The dst field must always be an even value because LDDW loads register pairs Therefore bit 23 is always zero Furthermore the value of the ld st field is 110 The brac...

Страница 237: ...pending on the mode selected When LDDW is used to load two 32 bit single precision floating point values or two 32 bit integer values the order is dependent on the endian mode used In little endian m...

Страница 238: ...0 XXXX XXXXh XXXX XXXXh A1 A0 4021 3333h 3333 3333h 8 6 B10 0000 0010h 16 B10 0000 0010h 16 mem 0x18 3333 3333h 4021 3333h 8 6 mem 0x18 3333 3333h 4021 3333h 8 6 Little endian mode Example 2 LDDW D1 A...

Страница 239: ...ut signs 2 Signed infinity multiplied by signed infinity or a normalized number other than signed 0 returns signed infinity Signed infinity multiplied by signed 0 returns a signed NaN_out and sets the...

Страница 240: ...BDP instruction the number of delay slots can be reduced by one because these instructions read the lower word of the DP source one cycle before the upper word of the DP source Instruction Type MPYDP...

Страница 241: ...Description The src1 operand is multiplied by the src2 operand The lower 32 bits of the result are placed in dst Execution if cond lsb32 src1 src2 dst else nop Pipeline Stage E1 E2 E3 E4 E5 E6 E7 E8...

Страница 242: ...7 creg z dst src2 13 12 11 5 4 3 2 1 0 x op 0 0 0 s p 3 5 5 5 5 7 6 0 0 src1 cst Description The src1 operand is multiplied by the src2 operand The 64 bit result is placed in the dst register pair Exe...

Страница 243: ...Bits 4 58 Example MPYID M1 A1 A2 A5 A4 Before instruction 10 cycles after instruction A1 0034 5678h 3430008 A1 0034 5678h 3430008 A2 0011 2765h 1124197 A2 0011 2765h 1124197 A5 A4 XXXX XXXXh XXXX XXXX...
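How a 32 x 32-bit multiply yields a 64-bit result split across a register pair can be sketched as follows, using the operands of the MPYID example. The sketch assumes signed operands and a pair laid out as odd register (upper 32 bits) and even register (lower 32 bits); the function name is hypothetical. MPYI would correspond to keeping only the lower word.

```python
def s32(x: int) -> int:
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x

def mpyid(src1: int, src2: int):
    """Return (upper 32 bits, lower 32 bits) of the signed 64-bit product."""
    product = (s32(src1) * s32(src2)) & 0xFFFFFFFFFFFFFFFF
    return product >> 32, product & 0xFFFFFFFF

high, low = mpyid(0x00345678, 0x00112765)   # 3430008 x 1124197
print(f"{high:08X} {low:08X}")
```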

Страница 244: ...xclusive or of the input signs 2 Signed infinity multiplied by signed infinity or a normalized number other than signed 0 returns signed infinity Signed infinity multiplied by signed 0 returns a signe...

Страница 245: ...the number of delay slots can be reduced by one because these instructions read the lower word of the DP source one cycle before the upper word of the DP source Instruction Type 4 cycle Delay Slots 3...

Страница 246: ...ort for the 32 MSBs The RCPDP instruction provides the correct exponent and the mantissa is accurate to the eighth binary position therefore mantissa error is less than 2 8 This estimate can be used a...

Страница 247: ...ws signed 0 is placed in dst and the INEX and UNDER bits are set Underflow occurs when 2^1022 < src2 < infinity Pipeline Stage E1 E2 Read src2_l src2_h Written dst_l dst_h Unit in use S If dst is used...

Страница 248: ...instruction provides the correct exponent and the mantissa is accurate to the eighth binary position therefore mantissa error is less than 2 8 This estimate can be used as a seed value for an algorith...

Страница 249: ...If src2 is signed 0 signed infinity is placed in dst and the DIV0 and INFO bits are set 5 If src2 is signed infinity signed 0 is placed in dst 6 If the result underflows signed 0 is placed in dst and...
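Using an 8-bit-accurate estimate as a seed for a higher-precision reciprocal can be sketched with a Newton-Raphson iteration. The update x = x * (2 - v * x) is the usual form for a reciprocal and is given here as an assumption, since the excerpt above is cut off before the algorithm. The seed is simulated in software with a relative error of 2^-9; it is a stand-in for, not a model of, the hardware estimate.

```python
def refine_reciprocal(v: float, seed: float, iterations: int = 2) -> float:
    """Refine an approximate 1/v; each step roughly doubles the accurate bits."""
    x = seed
    for _ in range(iterations):
        x = x * (2.0 - v * x)
    return x

v = 8.6
seed = (1.0 / v) * (1.0 + 2.0 ** -9)    # stand-in for an 8-bit-accurate estimate
for n in range(3):
    x = refine_reciprocal(v, seed, n)
    print(n, abs(x * v - 1.0))           # relative error after n iterations
```

Because the relative error is squared on every step, two iterations take an estimate accurate to about 8 bits past single-precision accuracy.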

Страница 250: ...the 32 MSBs The RSQRDP instruction provides the correct exponent and the mantissa is accurate to the eighth binary position therefore mantissa error is less than 2 8 This estimate can be used as a se...

Страница 251: ...igned 0 signed infinity is placed in dst and the DIV0 and INFO bits are set The Newton-Raphson approximation cannot be used to calculate the square root of 0 because infinity multiplied by 0 is inval...

Страница 252: ...imation RSQRDP 4 67 TMS320C67x Floating Point Instruction Set Example RCPDP S1 A1 A0 A3 A2 Before instruction 2 cycles after instruction A1 A0 4010 0000h 0000 0000h 4 0 A1 A0 4010 0000h 0000 0000h 4 0...

Страница 253: ...e correct exponent and the mantissa is accurate to the eighth binary position therefore mantissa error is less than 2 8 This estimate can be used as a seed value for an algorithm to compute the recipr...

Страница 254: ...signed infinity is placed in dst and the DIV0 INEX and DEN2 bits are set 5 If src2 is signed 0 signed infinity is placed in dst and the DIV0 and INFO bits are set The Newton-Raphson approximation can...

Страница 255: ...Precision Floating Point Square Root Reciprocal Approximation 4 70 Example 2 RSQRSP S2X A1 B2 Before instruction 1 cycle after instruction A1 4109 999Ah 8 6 A1 4109 999Ah 8 6 B2 XXXX XXXXh B2 3EAE 800...
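Refining a square-root reciprocal estimate works the same way as for the reciprocal. The sketch below uses the common Newton-Raphson update x = x * (1.5 - (v / 2) * x * x), stated here as an assumption rather than quoted from the guide, and again simulates the seed in software with a relative error of 2^-9. The square root itself can then be obtained as v * x, which is also why an input of 0 needs separate handling.

```python
import math

def refine_rsqrt(v: float, seed: float, iterations: int = 2) -> float:
    """Refine an approximate 1/sqrt(v) by Newton-Raphson iteration."""
    x = seed
    for _ in range(iterations):
        x = x * (1.5 - 0.5 * v * x * x)
    return x

v = 8.6
seed = (1.0 / math.sqrt(v)) * (1.0 + 2.0 ** -9)   # stand-in for the hardware estimate
x = refine_rsqrt(v, seed)
print(x, v * x)   # approximations of 1/sqrt(8.6) and sqrt(8.6)
```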

Страница 256: ...p src2 dst else nop Notes 1 If src2 is SNaN NaN_out is placed in dst and the INVAL and NAN2 bits are set 2 If src2 is QNaN NaN_out is placed in dst and the NAN2 bit is set 3 If src2 is a signed denorm...

Страница 257: ...sion Floating Point Value 4 72 Instruction Type 2 cycle DP Delay Slots 1 Functional Unit Latency 1 Example SPDP S1X B2 A1 A0 Before instruction 2 cycles after instruction B2 4109 999Ah 8 6 B2 4109 999...

Страница 258: ...if cond int src2 dst else nop Notes 1 If src2 is NaN the maximum signed integer 7FFF FFFFh or 8000 0000h is placed in dst and the INVAL bit is set 2 If src2 is signed infinity or if overflow occurs t...

Страница 259: ...INT Convert Single Precision Floating Point Value to Integer 4 74 Example SPINT L1 A1 A2 Before instruction 4 cycles after instruction A1 4109 999Ah 8 6 A1 4109 999Ah 8 6 A2 XXXX XXXXh A2 0000 0009h...

Страница 260: ...nding modes in the FADCR are ignored and round toward zero truncate is always used Execution if cond int src2 dst else nop Notes 1 If src2 is NaN the maximum signed integer 7FFF FFFFh or 8000 0000h is...

Страница 261: ...ecision Floating Point Value to Integer With Truncation 4 76 Functional Unit Latency 1 Example SPTRUNC L1X B1 A2 Before instruction 4 cycles after instruction B1 4109 999Ah 8 6 B1 4109 999Ah 8 6 A2 X...
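The difference between a rounding conversion (SPINT, DPINT) and a truncating one (SPTRUNC, DPTRUNC) can be illustrated for in-range values. This sketch assumes the rounding mode is round-to-nearest-even, which is what Python's built-in `round` implements, and it ignores NaN, infinity, and overflow handling. The function names are hypothetical.

```python
import math

def to_int_rounded(x: float) -> int:
    """Rounding conversion, assuming round-to-nearest-even."""
    return round(x)

def to_int_truncated(x: float) -> int:
    """Truncating conversion: always rounds toward zero."""
    return math.trunc(x)

print(to_int_rounded(8.6), to_int_truncated(8.6))     # 9 8
print(to_int_rounded(-8.6), to_int_truncated(-8.6))   # -9 -8
```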

Страница 262: ...BDP unit src1 src2 dst unit L1 or L2 Opcode map field used For operand type Unit Opfield src1 src2 dst dp xdp dp L1 L2 0011001 src1 src2 dst xdp dp dp L1 L2 0011101 Opcode 31 29 28 27 23 22 18 17 creg...

Страница 263: ...verflow Output Rounding Mode Result Sign Nearest Even Zero Infinity Infinity infinity LFPN infinity LFPN infinity LFPN LFPN infinity 6 If underflow occurs the INEX and UNDER bits are set and the resul...

Страница 264: ...er of delay slots can be reduced by one because these instructions read the lower word of the DP source one cycle before the upper word of the DP source Instruction Type ADDDP SUBDP Delay Slots 6 Func...

Страница 265: ...t unit L1 or L2 Opcode map field used For operand type Unit Opfield src1 src2 dst sp xsp sp L1 L2 0010001 src1 src2 dst xsp sp sp L1 L2 0010101 Opcode 31 29 28 27 23 22 18 17 creg z dst 13 12 11 5 4 3...

Страница 266: ...oating point number Overflow Output Rounding Mode Result Sign Nearest Even Zero Infinity Infinity infinity LFPN infinity LFPN infinity LFPN LFPN infinity 6 If underflow occurs the INEX and UNDER bits...

Страница 267: ...d src1 src2 Written dst Unit in use L Instruction Type 4 cycle Delay Slots 3 Functional Unit Latency 1 Example SUBSP L1X A2 B1 A3 Before instruction 4 cycles after instruction A2 4109 999Ah A2 4109 99...

Страница 268: ...y during the same pipeline phase eliminating read after write memory conflicts All instructions require the same number of pipeline phases for fetch and decode but require a varying number of execute...

Страница 269: ...PR Program fetch packet receive The C62x uses a fetch packet FP of eight instructions All eight of the instructions proceed through fetch processing together through the PG PS PW and PR phases Figur...

Страница 270: ...Fetch Phases of the Pipeline PR PW PS PG PW Memory PS PR PG Registers units Functional a b CPU PR PW PS PG 256 MVK LDW LDW SHL ADD MVK LDW LDW NOP MVK MV B SADD SMPYH SADD SHR SMPY SHR SMPYH LDW LDW...

Страница 271: ...e pipeline The last six instructions of the fetch packet FP are parallel and form an execute packet EP This EP is in the dispatch phase DP of the decode stage The arrows indicate each instruction s a...

Страница 272: ...peline Execution of Instruction Types Figure 5 4 a shows the execute phases of the pipeline in sequential order from left to right Figure 5 4 b shows the portion of the functional block diagram in whi...

Страница 273: ...e pipeline For example examine cycle 7 in Figure 5 6 When the instructions from FP n reach E1 the instructions in the execute packet from FPn 1 are being decoded FP n 2 is in dispatch while FPs n 3 n...

Страница 274: ...ddress generation is performed and address modifications are written to a register file For branch instructions branch fetch packet in PG phase is affected For single cycle instructions results are w...

Страница 275: ...DD SADD STH LDW STH LDW B SUB SMPY SMPYH SADD SADD STH STH B SUB SMPY SMPYH SADD SADD STH STH Register file A Register file B Data 2 Data 1 32 32 32 32 byte addressable Internal data memory Data addre...

Страница 276: ...instruction execute packet is in decode The arrows between DP and DC correspond to the functional units identified in the code in Example 5 1 Example 5 1 Execute Packet in Figure 5 7 SADD L1 A2 A7 A2...

Страница 277: ...nstructions in E1 are shaded in Figure 5 7 The multiplexers used for the input operands to the functional units are also shaded in the figure The bold crosspaths are used by the MPY instructions Most...

Страница 278: ...ack to CPU E5 Write data into register Delay slots 0 1 0 4 5 See section 5 2 3 and 5 2 4 for more information on execution and delay slots for stores and loads See section 5 2 5 for more information o...

Страница 279: ...execution diagram The operands are read the operation is performed and the results are written to a register all during E1 Single cycle instructions have no delay slots Figure 5 9 Single Cycle Execut...

Страница 280: ...ctions Store instructions require phases E1 through E3 to complete their operations Figure 5 12 shows the pipeline phases the store instructions use Figure 5 12 Store Instruction Phases PG PS PW PR DP...

Страница 281: ...ycle When a load is executed before a store the old value is loaded and the new value is stored i LDW i 1 STW When a store is executed before a load the new value is stored and the new value is loaded...

Страница 282: ...e data address pointer is modified in its register In the E2 phase the data address is sent to data memory In the E3 phase a memory read at that address is performed Figure 5 15 Load Execution Block D...

Страница 283: ...load following a store accesses the value placed in memory by that store in the cycle after the store is completed This is why the store is considered to have zero delay slots 5 2 5 Branch Instructio...

Страница 284: ...Because the branch target has to wait until it reaches the E1 phase to begin execution the branch takes five delay slots before the branch target code executes Figure 5 17 Branch Execution Block Diagr...

Страница 285: ...struction has a fixed number of execute cycles that determines when this instruction s operations are complete Section 5 3 2 covers the effect of including a multicycle NOP in an individual EP Finally...

Страница 286: ...es 1 4 During these cycles a program fetch phase is started for each of the fetch packets that follow In cycle 5 the program dispatch DP phase the CPU scans the p bits and detects that there are three...
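The scan of the p-bits during dispatch can be sketched as follows. The model assumes that the p-bit is bit 0 of each 32-bit instruction word (the position shown in the opcode maps) and that a p-bit of 1 means the next instruction executes in parallel with the current one. The function name and the instruction words are hypothetical; the words carry only a p-bit and are not real encodings.

```python
def split_execute_packets(fetch_packet):
    """Group the instruction words of a fetch packet into execute packets."""
    packets, current = [], []
    for word in fetch_packet:
        current.append(word)
        if word & 1 == 0:          # p = 0: this instruction ends the execute packet
            packets.append(current)
            current = []
    if current:                    # defensive: trailing words with p = 1
        packets.append(current)
    return packets

# Eight placeholder words with p-bits 1,1,0,1,1,0,1,0
fetch_packet = [0x1, 0x3, 0x2, 0x5, 0x7, 0x4, 0x9, 0x8]
packets = split_execute_packets(fetch_packet)
print([len(p) for p in packets])   # [3, 3, 2]: three execute packets
```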

Страница 287: ...packet in parallel with other code The results of the LD ADD and MPY will all be available during the proper cycle for each instruction Hence NOP has no effect on the execute packet Figure 5 19 b show...

Страница 288: ...e NOPs [Figure residue: pipeline diagram over cycles 1 to 11 showing execute packets EP1 to EP7 (LD, MPY, ADD, NOP 5, and a branch B), the branch reaching E1, and the branch target passing through PG, PS, PW, PR, DP, DC, and E1]...

Страница 289: ...loads and instruction fetches/dispatches. The comparison is valid because data loads and program fetches operate on internal memories of the same speed on the C62x and perform the same types of operat...

Страница 290: ...ata memory access. The memory stall causes all of the pipeline phases to lengthen beyond a single clock cycle, causing execution to take additional clock cycles to finish. The results of the program exe...

Страница 291: ...ycle result in a memory stall that halts all pipeline operation for one cycle while the second value is read from memory. Two memory operations per cycle are allowed without any stall, as long as they d...

Страница 292: ...re with an access to bank 0 in another memory space, and no pipeline stall occurs. Figure 5-24. 4-Bank Interleaved Memory With Two Memory Spaces [figure residue: Bank 3 holds bytes 6, 7, 14, 15, 8N+6, 8N+7; Bank 2 holds bytes 4, 5, 12, 13, 8N+4, 8N+5; 2 3...]
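The bank rule for the 4-bank layout can be sketched as a small address calculation. This is an illustrative model rather than a description of the hardware: it assumes each bank is 16 bits wide (so bytes 8N+6 and 8N+7 fall in bank 3, as the figure indicates), that both accesses target the same memory space, and that a stall occurs whenever the two accesses share a bank. All names are hypothetical.

```python
NUM_BANKS = 4          # 4-bank interleaved memory
BANK_WIDTH_BYTES = 2   # assumption: each bank is one halfword wide


def banks_touched(byte_addr, size_bytes):
    """Return the set of banks used by an access of size_bytes at byte_addr."""
    first = byte_addr // BANK_WIDTH_BYTES
    last = (byte_addr + size_bytes - 1) // BANK_WIDTH_BYTES
    return {halfword % NUM_BANKS for halfword in range(first, last + 1)}


def same_space_stall(addr_a, size_a, addr_b, size_b):
    """True if two same-cycle accesses in one memory space share a bank."""
    return bool(banks_touched(addr_a, size_a) & banks_touched(addr_b, size_b))
```

In this model a halfword access at byte 0 and another at byte 8 both land in bank 0 and would stall, whereas word accesses at bytes 0 and 4 use banks {0, 1} and {2, 3} and proceed together.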

Страница 293: ...me pipeline phase, eliminating read-after-write memory conflicts. All instructions require the same number of pipeline phases for fetch and decode, but require a varying number of execute phases. This cha...

Страница 294: ...gram fetch packet receive. The C67x uses a fetch packet (FP) of eight instructions. All eight of the instructions proceed through fetch processing together, through the PG, PS, PW, and PR phases. Figure 6-2(a...

Страница 295: ...Fetch Phases of the Pipeline PR PW PS PG PW Memory PS PR PG Registers units Functional a b CPU PR PW PS PG 256 MVK LDW LDW SHL ADD MVK LDW LDW NOP MVK MV B SADD SMPYH SADD SHR SMPY SHR SMPYH LDW LDW...

Страница 296: ...e pipeline. The last six instructions of the fetch packet (FP) are parallel and form an execute packet (EP). This EP is in the dispatch phase (DP) of the decode stage. The arrows indicate each instruction's a...

Страница 297: ...ed in section 6.2, Pipeline Execution of Instruction Types. Figure 6-4(a) shows the execute phases of the pipeline in sequential order, from left to right. Figure 6-4(b) shows the portion of the functional...

Страница 298: ...ructions from FP n reach E1, the instructions in the execute packet from FP n+1 are being decoded. FP n+2 is in dispatch, while FPs n+3, n+4, n+5, and n+6 are each in one of four phases of program fetch. See...

Страница 299: ...Decode DC Instructions are decoded in functional units Execute Execute 1 E1 For all instruction types the conditions for the instructions are evaluated and operands are read For load and store instruc...

Страница 300: ...instruction that saturates results sets the SAT bit in the CSR if saturation occurs. For MPYDP instruction, the upper 32 bits of src1 and the lower 32 bits of src2 are read. For MPYI and MPYID instruct...

Страница 301: ...ritten to a register file ADDDP SUBDP Execute 8 E8 Nothing is read or written Execute 9 E9 For the MPYI instruction the result is written to a register file For MPYDP and MPYID instructions the lower...

Страница 302: ...SP ADDSP MV LDDW B MPYSP SUBSP LDDW Register file A Register file B Data 2 Data 1 32 32 32 32 byte addressable Internal data memory Data address 2 Data address 1 9 8 7 6 5 4 3 2 1 0 16 16 16 16 Data m...

Страница 303: ...he code in Example 6 1 In the DC phase portion of Figure 6 7 one box is empty because a NOP was the eighth instruction in the fetch packet in DC and no functional unit is needed for a NOP Finally the...

Страница 304: ...A15 A8 A1 ABSSP S2 B12 B15 LOOP B2 LDDW D1 A0 2 A5 A4 DP and PS Phases B2 ZERO D2 B0 SUBSP L1 A12 A2 A12 ADDSP L2 B9 B12 B12 MPYSP M1X A5 B7 A10 MPYSP M2 B4 B7 B10 B0 B S1 LOOP B1 CMPLTSP S2 B15 B8 B1...

Страница 305: ...ases E1 Compute result and write to register Read operands and start computations Compute address Compute address Target code in PG E2 Compute result and write to register Send address and data to mem...

Страница 306: ...utation Read upper sources, finish computation, and write results to register E3 Continue computation Continue computation E4 Complete computation and write results to register Continue computation a...

Страница 307: ...ontinue computation E5 Continue computation Continue computation Continue computation Continue computation E6 Compute the lower results and write to register Continue computation Continue computation...

Страница 308: ...ew instruction dispatched to that functional unit during this locking period causes undefined results. If an instruction with a multicycle functional unit latency has a condition that is evaluated as...

Страница 309: ...on the same functional unit attempt to read or write, respectively, to the register file on the same cycle. An instruction scheduled on cycle i has the following constraints: 2-cycle DP: A single-cycle in...

Страница 310: ...same functional unit on cycle i + 4, i + 5, or i + 6. A MPYI instruction cannot be scheduled on the same functional unit on cycle i + 4, i + 5, or i + 6. A MPYID instruction cannot be scheduled on the same functional...

Страница 311: ...point instructions. The .S and .L units share their long write port with the load port for the 32 most significant bits of an LDDW load. Therefore, the LDDW instruction and the .S or .L unit writing a long r...

Страница 312: ...en differently for each instruction. If you analyze these differences, you can make further optimization improvements by considering what happens during the execution phases of instructions that use th...

Страница 313: ...RW Instruction Type Subsequent Same Unit Instruction Executable Single cycle n DP compare n 2 cycle DP n Branch n Instruction Type Same Side Different Unit Both Using Cross Path Executable Single cycl...

Страница 314: ...it Both Using Cross Path Executable Single cycle Xr n Load Xr n Store Xr n INTDP Xr n ADDDP SUBDP Xr n 16 16 multiply Xr n 4 cycle Xr n MPYI Xr n MPYID Xr n MPYDP Xr n Legend E1 phase of the single cy...

Страница 315: ...ble Single cycle Xw n DP compare n n 2 cycle DP Xw n Branch n n Instruction Type Same Side Different Unit Both Using Cross Path Executable Single cycle n n Load n n Store n n INTDP n n ADDDP SUBDP n n...

Страница 316: ...n Branch n n n n n n n Instruction Type Same Side Different Unit Both Using Cross Path Executable Single cycle n n n n n n n Load n n n n n n n Store n n n n n n n INTDP n n n n n n n ADDDP SUBDP n n...

Страница 317: ...uction Type Subsequent Same Unit Instruction Executable 16 16 multiply n n 4 cycle n n MPYI n n MPYID n n MPYDP n n Instruction Type Same Side Different Unit Both Using Cross Path Executable Single cy...

Страница 318: ...n n n n MPYI n n n n MPYID n n n n MPYDP n n n n Instruction Type Same Side Different Unit Both Using Cross Path Executable Single cycle n n n n Load n n n n Store n n n n DP compare n n n n 2 cycle D...

Страница 319: ...erent Unit Both Using Cross Path Executable Single cycle Xr Xr Xr n n n n n n Load Xr Xr Xr n n n n n n Store Xr Xr Xr n n n n n n DP compare Xr Xr Xr n n n n n n 2 cycle DP Xr Xr Xr n n n n n n Branc...

Страница 320: ...it Both Using Cross Path Executable Single cycle Xr Xr Xr n n n n n n n Load Xr Xr Xr n n n n n n n Store Xr Xr Xr n n n n n n n DP compare Xr Xr Xr n n n n n n n 2 cycle DP Xr Xr Xr n n n n n n n Bra...

Страница 321: ...ifferent Unit Both Using Cross Path Executable Single cycle Xr Xr Xr n n n n n n n Load Xr Xr Xr n n n n n n n Store Xr Xr Xr n n n n n n n DP compare Xr Xr Xr n n n n n n n 2 cycle DP Xr Xr Xr n n n...

Страница 322: ...Type Subsequent Same Unit Instruction Executable Single cycle n 4 cycle n INTDP n ADDDP SUBDP n Instruction Type Same Side Different Unit Both Using Cross Path Executable Single cycle n DP compare n 2...

Страница 323: ...n n n INTDP n n n n ADDDP SUBDP n n n n Instruction Type Same Side Different Unit Both Using Cross Path Executable Single cycle n n n n DP compare n n n n 2 cycle DP n n n n 4 cycle n n n n Load n n n...

Страница 324: ...n ADDDP SUBDP n n n n n Instruction Type Same Side Different Unit Both Using Cross Path Executable Single cycle n n n n n DP compare n n n n n 2 cycle DP n n n n n 4 cycle n n n n n Load n n n n n Sto...

Страница 325: ...Both Using Cross Path Executable Single cycle Xr n n n n n n DP compare Xr n n n n n n 2 cycle DP Xr n n n n n n 4 cycle Xr n n n n n n Load Xr n n n n n n Store Xr n n n n n n Branch Xr n n n n n n 1...

Страница 326: ...gle cycle n n n n n Load n n n n n Store n n n n n Instruction Type Same Side Different Unit Both Using Cross Path Executable 16 16 multiply n n n n n MPYI n n n n n MPYID n n n n n MPYDP n n n n n Si...

Страница 327: ...ction Executable Single cycle n n n Load n n n Store n n n Instruction Type Same Side Different Unit Both Using Cross Path Executable 16 16 multiply n n n MPYI n n n MPYID n n n MPYDP n n n Single cyc...

Страница 328: ...Subsequent Same Unit Instruction Executable Single cycle n Load n Store n Instruction Type Same Side Different Unit Both Using Cross Path Executable 16 16 multiply n MPYI n MPYID n MPYDP n Single cycl...

Страница 329: ...Hazards Instruction Execution Cycle 1 2 3 4 5 6 LDDW RW W Instruction Type Subsequent Same Unit Instruction Executable Instruction with long result n n n Xw n Legend: E1 phase of the single-cycle instr...

Страница 330: ...s the single-cycle execution diagram. The operands are read, the operation is performed, and the results are written to a register, all during E1. Single-cycle instructions have no delay slots. Table 6-20...

Страница 331: ...ing in the pipeline for a multiply In the E1 phase the operands are read and the multiply begins In the E2 phase the multiply finishes and the result is written to the destination register Multiply in...

Страница 332: ...s of the data to be stored is computed. In the E2 phase, the data and destination addresses are sent to data memory. In the E3 phase, a memory write is performed. The address modification is performed in...

Страница 333: ...cycle. When a load is executed before a store, the old value is loaded and the new value is stored (cycle i: LDW; cycle i+1: STW). When a store is executed before a load, the new value is stored and the new value is load...

Страница 334: ...ipeline Stage E1 E2 E3 E4 E5 Read baseR offsetR Written baseR dst Unit in use D Figure 6 14 Load Instruction Phases PG PS PW PR DP DC E1 E2 E3 E4 E5 4 delay slots Address modification Figure 6 15 show...

Страница 335: ...inter results are written to the register in E1 there are no delay slots associated with the address modification In the following code pointer results are written to the A4 register in the first exec...

Страница 336: ...of the target code see Table 6 24 Figure 6 16 shows the pipeline phases used by the branch instruction and branch target code The delay slots are shaded Table 6 24 Branch Execution Pipeline Stage E1 P...

Страница 337: ...the branch target has to wait until it reaches the E1 phase to begin execution the branch takes five delay slots before the branch target code executes Figure 6 17 Branch Execution Block Diagram DP PR...

Страница 338: ...E1 using the src1 and src2 ports respectively The lower 32 bits of the DP source are written on E1 and the upper 32 bits of the DP source are written on E2 The 2 cycle DP instructions are executed on...

Страница 339: ...Figure 6-19 shows the pipeline phases the 4-cycle instructions use. Table 6-26. 4-Cycle Execution: Pipeline Stage E1 E2 E3 E4; Read src1 src2; Written dst; Unit in use .L or .M. Figure 6-19. 4-Cycle Instructi...

Страница 340: ...e read on E1 the upper 32 bits of the sources are read on E2 and the results are written on E2 The following instructions are DP compare instructions CMPEQDP CMPLTDP CMPGTDP The DP compare instruction...

Страница 341: ...t are written on E7 The ADDDP SUBDP instructions are executed on the L unit The functional unit latency for ADDDP SUBDP instructions is 2 The status is written to the FADCR on E6 Figure 6 22 shows the...

Страница 342: ...Unit in use M M M M Figure 6 23 MPYI Instruction Phases PG PS PW PR DP DC E1 E2 E3 E4 E5 E6 E7 E8 E9 8 delay slots 6 3 16 MPYID Instructions The MPYID instruction uses the E1 through E10 phases of th...

Страница 343: ...the upper 32 bits of src2 are read on E2 and E4 The lower 32 bits of the result are written on E9 and the upper 32 bits of the result are written on E10 The MPYDP instruction is executed on the M unit...

Страница 344: ...struction has a fixed number of execute cycles that determines when this instruction's operations are complete. Section 6.4.2 covers the effect of including a multicycle NOP in an individual EP. Finally...

Страница 345: ...etch packet n goes through the program fetch phases during cycles 1-4. During these cycles, a program fetch phase is started for each of the fetch packets that follow. In cycle 5, the program dispatch (DP)...

Страница 346: ...packet in parallel with other code. The results of the LD, ADD, and MPY will all be available during the proper cycle for each instruction. Hence, NOP has no effect on the execute packet. Figure 6-27(b) sho...

Страница 347: ...e NOPs [Figure residue: pipeline diagram over cycles 1 to 11 showing execute packets EP1 to EP7 (LD, MPY, ADD, NOP 5, and a branch B), the branch reaching E1, and the branch target passing through PG, PS, PW, PR, DP, DC, and E1]...

Страница 348: ...loads and instruction fetches/dispatches. The comparison is valid because data loads and program fetches operate on internal memories of the same speed on the C67x and perform the same types of opera...

Страница 349: ...data memory access The memory stall causes all of the pipeline phases to lengthen beyond a single clock cycle causing execution to take additional clock cycles to finish The results of the program exe...

Страница 350: ...ch of these banks is single-ported memory, only one access to each bank is allowed per cycle. Two accesses to a single bank in a given cycle result in a memory stall that halts all pipeline operation fo...

Страница 351: ...m device to device See the TMS320C62x C67x Peripherals Reference Guide to determine the memory spaces in your particular device Figure 6 32 8 Bank Interleaved Memory With Two Memory Spaces Bank 7 Bank...

Страница 352: ...automatically the presence of interrupts and divert program execution flow to your interrupt service code Finally the chapter describes the programming implications of interrupts Topic Page 7 1 Overv...

Страница 353: ...ets the pending status of the interrupt within the interrupt flag register IFR If the interrupt is properly enabled the CPU begins processing the interrupt and redirecting program flow to the interrup...

Страница 354: ...errupt service fetch packet must be located at address 0. RESET is not affected by branches. 7.1.1.2 Nonmaskable Interrupt (NMI): NMI is the second-highest priority interrupt and is generally used to alert...

Страница 355: ...register (CSR) is set to 1. The NMIE bit in the interrupt enable register (IER) is set to 1. The corresponding interrupt enable (IE) bit in the IER is set to 1. The corresponding interrupt occurs, which sets the...
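The conditions visible in this excerpt can be combined into a single check. The sketch below is a hedged illustration that covers only those four conditions (the list in the excerpt is cut off and may continue); it assumes GIE is bit 0 of the CSR, NMIE is bit 1 of the IER, and that maskable interrupts are numbered 4 through 15. The function name is hypothetical.

```python
GIE_BIT = 0    # assumption: GIE is bit 0 of the CSR
NMIE_BIT = 1   # assumption: NMIE is bit 1 of the IER


def maskable_interrupt_ready(csr, ier, ifr, n):
    """Check the four listed conditions for maskable interrupt INTn."""
    if not 4 <= n <= 15:
        raise ValueError("maskable interrupts are assumed to be INT4-INT15")
    gie = (csr >> GIE_BIT) & 1    # global interrupt enable
    nmie = (ier >> NMIE_BIT) & 1  # nonmaskable interrupt enable
    ie = (ier >> n) & 1           # individual enable bit for INTn
    flag = (ifr >> n) & 1         # pending flag for INTn
    return bool(gie and nmie and ie and flag)
```

For instance, with GIE and NMIE set, IE4 set, and IF4 pending, the check passes for n = 4; clearing GIE alone makes it fail.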

Страница 356: ...outine may fit in an individual fetch packet. The addresses and contents of the IST are shown in Figure 7-1. Because each fetch packet contains eight 32-bit instruction words (or 32 bytes), each address in...
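Because every interrupt service fetch packet occupies 32 bytes, the table addresses follow from a single multiplication. The following Python sketch illustrates the arithmetic; the function name and the `istb` parameter (the base address of the table) are illustrative names, and interrupt numbers are assumed to run from 0 to 15.

```python
FETCH_PACKET_BYTES = 8 * 4  # eight 32-bit instruction words


def isfp_address(int_number, istb=0x000):
    """Address of the interrupt service fetch packet for INT<int_number>."""
    if not 0 <= int_number <= 15:
        raise ValueError("interrupt number is assumed to be 0-15")
    return istb + int_number * FETCH_PACKET_BYTES
```

With the table at address 0, INT4 maps to 080h and INT6 to 0C0h; with the table relocated to 800h, INT4 maps to 880h.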

Страница 357: ...nstr3 Interrupt service table IST Instr2 Instr4 Instr5 Instr6 B IRP NOP 5 ISFP for INT6 000h 020h 040h 060h 080h 0A0h 0C0h 0E0h 100h 120h 140h 160h 180h 1A0h 1C0h 1E0h 0C0h 0C4h 0C8h 0CCh 0D0h 0D4h 0D...

Страница 358: ...nterrupt ISFP Instr1 Instr2 B 1234h Instr4 Instr5 Instr6 Instr7 Instr8 ISFP for INT4 080h 084h 088h 08Ch 090h 094h 098h 09Ch Program memory Instr9 Instr11 1224h 1228h 122Ch 1230h 1234h 1238h 123Ch B I...

Страница 359: ...ns. Bits / Field Name / Description. Bits 0-4: set to 0 (fetch packets must be aligned on 8-word (32-byte) boundaries). Bits 5-9, HPEINT: highest priority enabled interrupt. This field gives the number (related bit position in...
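The field layout can be illustrated by packing and unpacking a register value. This is a sketch under stated assumptions: bits 0-4 are zero and bits 5-9 hold HPEINT as described above, while placing the table base (called `istb` here) in bits 10-31 is an assumption not visible in this excerpt. The function names are hypothetical.

```python
def encode_istp(istb, hpeint):
    """Build an ISTP value from a table base and an interrupt number."""
    if istb & 0x3FF:
        raise ValueError("table base must have bits 0-9 clear")
    if not 0 <= hpeint <= 0x1F:
        raise ValueError("HPEINT is a 5-bit field")
    return (istb & 0xFFFFFC00) | (hpeint << 5)


def decode_istp(istp):
    """Split an ISTP value into (istb, hpeint)."""
    istb = istp & 0xFFFFFC00     # assumed bits 10-31
    hpeint = (istp >> 5) & 0x1F  # bits 5-9
    return istb, hpeint
```

Because HPEINT sits at bit 5 and each fetch packet is 32 bytes long, the packed value for a base of 800h and interrupt 4 is 880h, which is also the address of that interrupt's fetch packet in a table starting at 800h.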

Страница 360: ...880h 8A0h 8C0h 8E0h 900h 920h 940h 960h 980h 9A0h 9C0h 9E0h Program memory 800h RESET ISFP 1. Copy the IST, located between 0h and 200h, to the memory location between 800h and A00h. 2. Write 800h to the...

Страница 361: ...ted in the table Table 7 3 Interrupt Control Registers Abbreviation Name Description Page Number CSR Control status register Allows you to globally set or disable interrupts 7 11 IER Interrupt enable...

Страница 362: ...rrupts globally enabled; GIE = 0: maskable interrupts globally disabled. Bit 1, PGIE: Previous GIE; saves the value of GIE when an interrupt is taken. This value is used on return from an interrupt. The global inte...

Страница 363: ...ck to GIE, resulting in GIE being cleared as directed by your code. Example 7-2 shows how to disable maskable interrupts globally and Example 7-3 shows how to enable maskable interrupts globally. Exampl...
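The interplay of GIE and PGIE can be modeled with a few bit operations. The sketch below is a simplified Python model, not the manual's assembly examples: it assumes GIE is bit 0 and PGIE is bit 1 of the CSR, that taking an interrupt copies GIE to PGIE and then clears GIE, and that returning with B IRP copies PGIE back to GIE. All function names are hypothetical.

```python
GIE = 1 << 0   # global interrupt enable (assumed CSR bit 0)
PGIE = 1 << 1  # previous GIE (assumed CSR bit 1)


def disable_maskable(csr):
    """Clear GIE, leaving every other CSR bit unchanged."""
    return csr & ~GIE


def enable_maskable(csr):
    """Set GIE, leaving every other CSR bit unchanged."""
    return csr | GIE


def take_interrupt(csr):
    """Model interrupt entry: PGIE <- GIE, then GIE <- 0."""
    saved = PGIE if csr & GIE else 0
    return (csr & ~(GIE | PGIE)) | saved


def return_from_interrupt(csr):
    """Model the return: GIE <- PGIE."""
    restored = GIE if csr & PGIE else 0
    return (csr & ~GIE) | restored
```

In this model, code that clears GIE just as an interrupt is taken still ends up with GIE cleared after the return, because the return restores whatever PGIE holds at that point.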

Страница 364: ...ot writeable and is always read as 1, so the reset interrupt is always enabled. You cannot disable the reset interrupt. Bits IE4-IE15 can be written as 1 or 0, enabling or disabling the associated interr...

Страница 365: ...check the status of interrupts use the MVC instruction to read the IFR Figure 7 7 shows the IFR Figure 7 7 Interrupt Flag Register IFR 31 16 Reserved R 0 15 0 IF15 IF14 IF13 IF12 IF11 IF10 IF9 IF8 IF7...

Страница 366: ...rrupts Figure 7 8 Interrupt Set Register ISR 31 16 Reserved 15 0 IS15 IS14 IS13 IS12 IS11 IS10 IS9 IS8 IS7 IS6 IS5 IS4 W Rsv Rsv Rsv Rsv Legend W Writeable by the MVC instruction Rsv Reserved Figure 7...

Страница 367: ...I return pointer register NRP contains the return pointer that directs the CPU to the proper location to continue program execution after NMI processing A branch using the address in the NRP B NRP in...

Страница 368: ...ing is complete. Example 7-9 shows how to return from a maskable interrupt. Example 7-9. Code to Return from a Maskable Interrupt: B IRP ; return, moves PGIE to GIE. NOP 5 ; delay slots. The IRP contains the 32...

Страница 369: ...pt signal enters the CPU, it has been detected (cycle 4). Two clock cycles after detection, the interrupt's corresponding flag bit in the IFR is set (cycle 6). In Figure 7-12 and Figure 7-13, IFm is set dur...

Страница 370: ...E5 E4 E5 E3 E4 E5 DC E1 E2 E3 E4 DP DC E1 E2 E3 PR DP DC E1 E2 PW PR DP DC E1 PS PW PR DP DC E5 E4 E3 E2 E1 n 5 n 4 n 3 n 2 n 1 n Execute packet INUM IACK IFm External INTm Clock cycle 0 0 0 0 0 0 0...

Страница 371: ...External INTm at pin 0 0 IACK INUM 0 E2 E1 DC E1 DC DP DP PR PW PS PR PW PS PG n n 1 n 2 n 3 n 4 n 5 n 6 DC DP PR PW PS PG Execute packet PR 12 11 PW PS 10 9 8 PG DP PW PR PS PG PR PS PW PS PG PG PW P...

Страница 372: ...ction to be annulled in future pipeline stages. The address of the first annulled execute packet (n+5) is loaded into the NRP (in the case of NMI) or IRP (for all other interrupts). A branch to the address h...

Страница 373: ...le Figure 7 14 RESET Interrupt Detection and Processing Pipeline Operation Reset ISFP n 7 n 6 Pipeline flush E1 DC DP PR PW PS PG PG PS PW PR DP DC E1 n 5 n 4 n 3 n 2 n 1 n Execute packet INUM IACK IF...

Страница 374: ...3.3.1 on page 7-16. During CPU cycles 15-21 of Figure 7-14, the following reset processing actions occur: Processing of subsequent nonreset interrupts is disabled because GIE and NMIE are cleared. A bran...

Страница 375: ...occur only every second cycle. However, the frequency of interrupt processing depends on the time required for interrupt service and whether you reenable interrupts during processing, thereby allowing n...

Страница 376: ...ar to the instructions after the interrupt to have fewer delay slots than they actually have For example suppose that register A1 contains 0 and register A0 points to a memory location containing a va...

Страница 377: ...bit which would reenable interrupts inside the interrupt service routine 7 6 3 Manual Interrupt Processing You can poll the IFR and IER to detect interrupts manually and then branch to the value held...

Страница 378: ...a branch using a displacement, the MVKH instructions could be eliminated, thus shortening the code sequence. The trap is processed with the code located at the address pointed to by the label TRAP_HAND...

Страница 379: ...er up C clock cycles Cycles based on the input from the external clock code A set of instructions written to perform a task a computer program or part of a program CPU cycle The period during which a...

Страница 380: ...ce fetch packet ISFP See also fetch packet FP A fetch packet used to service interrupts If eight instructions are insufficient the user must branch out of this block for additional interrupt service I...

Страница 381: ...upt: A higher-priority interrupt that must be serviced before completion of the current interrupt service routine. nonmaskable interrupt: An interrupt that can be neither masked nor manually disabled. O...

Страница 382: ...mber with the sign bit. W wait state: A period of time that the CPU must wait for external program, data, or I/O memory to respond when reading from or writing to that external memory. The CPU waits one e...

Страница 383: ...for load store 3 23 address paths 2 7 addressing mode circular mode 3 21 definition A 1 linear mode 3 21 addressing mode register AMR 2 8 2 9 field encoding table 2 9 figure 2 9 ADDSP instruction 4 25...

Страница 384: ...upts 7 13 of interrupts 7 11 control register file extension C67x 2 13 interrupt 7 10 list of 2 8 register addresses for accessing 3 87 control status register CSR 7 10 description 2 8 2 11 figure 2 1...

Страница 385: ...derations C67x 6 52 pipeline operation 5 18 execute phases of the pipeline 5 22 6 56 figure 5 5 6 5 execution notations fixed point instructions 3 2 floating point instructions 4 2 execution table ADD...

Страница 386: ...ts 4 12 instruction operation fixed point notations for 3 2 floating point notations for 4 2 instruction to functional unit mapping 3 4 4 4 instruction types 2 cycle DP instructions 6 46 4 cycle instr...

Страница 387: ...10 performance considerations 7 24 priorities 7 3 processing 7 18 to 7 23 programming considerations 7 25 to 7 28 setting 7 14 signals used 7 2 traps 7 27 types of 7 2 INTSP instruction 4 49 to 4 50 I...

Страница 388: ...ing functional unit to instruction 3 5 4 4 instruction to functional unit 3 4 4 4 maskable interrupt description 7 4 return from 7 17 memory considerations 5 22 internal 1 8 paths 2 7 pipeline phases...

Страница 389: ...code example 3 15 parallel fetch packets 3 14 parallel operations 3 13 partially serial fetch packets 3 15 PCC field CSR 2 11 PCE1 See program counter PCE1 performance considerations pipeline 5 18 6 5...

Страница 390: ...ter IRP ISR See interrupt set register ISR ISTP See interrupt service table pointer ISTP NRP See nonmaskable interrupt return pointer NRP PCE1 See program counter PCE1 read constraints 3 19 write cons...

Страница 391: ...uction 15 bit offset 3 126 to 3 127 register offset or 5 bit unsigned constant offset 3 122 to 3 125 using circular addressing 3 21 SUB instruction 3 128 to 3 130 SUB2 instruction 3 135 SUBAB instruct...
