
Glossary 

ARM DDI 0301H

Copyright © 2004-2009 ARM Limited. All rights reserved.

Glossary-17

ID012310

Non-Confidential, Unrestricted Access

Short vector operation

A VFP coprocessor operation involving more than one destination register and perhaps more 
than one source register in the generation of the result for each destination.

Should Be One (SBO)

Should be written as 1 (or all 1s for bit fields) by software. Writing a 0 produces Unpredictable 
results.

Should Be Zero (SBZ)

Should be written as 0 (or all 0s for bit fields) by software. Writing a 1 produces Unpredictable 
results.

Should Be Zero or Preserved (SBZP)

Should be written as 0 (or all 0s for bit fields) by software, or preserved by writing the same 
value back that has been previously read from the same field on the same processor.
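
The SBZ and SBZP rules above amount to a read-modify-write discipline. The following is a minimal sketch, assuming a hypothetical 32-bit register in which bits [7:0] are writable and bits [31:8] are SBZP. The mask and the function name are illustrative only and are not taken from any ARM register definition.

```python
# Hypothetical layout (assumption): bits [7:0] writable, bits [31:8] SBZP.
WRITABLE_MASK = 0x000000FF


def compose_write(previous_read: int, new_field: int, preserve: bool = True) -> int:
    """Build the value to write to a register that contains SBZP bits.

    preserve=True  -> SBZP bits keep the value previously read from the register.
    preserve=False -> SBZP bits are written as 0.
    Either choice satisfies the SBZP rule described in the entry above.
    """
    if preserve:
        sbzp_bits = previous_read & ~WRITABLE_MASK & 0xFFFFFFFF
    else:
        sbzp_bits = 0
    # Only the writable bits of the new value are allowed through.
    return sbzp_bits | (new_field & WRITABLE_MASK)
```

For a field that is SBZ rather than SBZP, only the preserve=False path applies, because the entry for SBZ requires the bits to be written as 0.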

Significand

The component of a binary floating-point number that consists of an explicit or implicit leading 
bit to the left of the implied binary point and a fraction field to the right.

SPSR

See Saved Program Status Register.

Stride

In the VFP extension, specifies the increment applied to register addresses in short vector 
operations. A stride of 00, specifying an increment of +1, causes a short vector operation to 
increment each vector register by +1 for each iteration, while a stride of 11 specifies an 
increment of +2. 
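
The effect of the stride encoding on register selection can be sketched as follows. The two encodings come from the entry above. The wrap-around inside a bank of eight registers is an assumption added for illustration, and the function name is hypothetical.

```python
# Encodings from the glossary entry: 0b00 -> increment of +1, 0b11 -> increment of +2.
STRIDE_INCREMENT = {0b00: 1, 0b11: 2}


def vector_registers(first: int, length: int, stride_field: int, bank_size: int = 8) -> list:
    """List the register numbers used by one short vector operand.

    Assumption (not stated in the entry): register numbers wrap around
    inside a bank of `bank_size` registers.
    """
    if stride_field not in STRIDE_INCREMENT:
        raise ValueError("stride encoding not covered by the glossary entry")
    step = STRIDE_INCREMENT[stride_field]
    bank_base = first - first % bank_size
    return [
        bank_base + (first - bank_base + i * step) % bank_size
        for i in range(length)
    ]
```

With these assumptions, a four-element vector starting at register 8 uses registers 8, 9, 10, 11 for a stride of 00 and registers 8, 10, 12, 14 for a stride of 11.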

Subnormal value

A value in the range ($-2^{E_{min}} < x < 2^{E_{min}}$), except for 0. In the IEEE 754 standard format for 
single-precision and double-precision operands, a subnormal value has a zero exponent and a 
nonzero fraction field. The IEEE 754 standard requires that the generation and manipulation of 
subnormal operands be performed with the same precision as normal operands.
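
The zero-exponent, nonzero-fraction test can be illustrated on the single-precision format. This is a minimal sketch that inspects the IEEE 754 bit pattern on a host machine; it is not a model of the VFP hardware, and the function name is illustrative.

```python
import struct


def classify_single(x: float) -> str:
    """Classify x after conversion to IEEE 754 single precision."""
    # Reinterpret the 32-bit float encoding as an unsigned integer.
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    exponent = (bits >> 23) & 0xFF   # 8-bit biased exponent field
    fraction = bits & 0x7FFFFF       # 23-bit fraction field
    if exponent == 0:
        return "zero" if fraction == 0 else "subnormal"
    if exponent == 0xFF:
        return "infinity" if fraction == 0 else "NaN"
    return "normal"
```

For single precision, Emin is -126, so 2.0**-126 is the smallest positive normal value and any nonzero value of smaller magnitude is subnormal.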

Support code

Software that must be used to complement the hardware to provide compatibility with the 
IEEE 754 standard. The support code has a library of routines that performs supported 
functions, such as divide with unsupported inputs or inputs that might generate an exception, in 
addition to operations beyond the scope of the hardware. The support code also has a set of 
exception handlers to process exceptional conditions in compliance with the IEEE 754 standard.

Synchronization primitive

The memory synchronization primitive instructions are the instructions used to ensure 
memory synchronization, that is, the LDREX, STREX, SWP, and SWPB instructions.

Tag

The upper portion of a block address used to identify a cache line within a cache. The block 
address from the CPU is compared with each tag in a set in parallel to determine if the 
corresponding line is in the cache. If it is, it is said to be a cache hit and the line can be fetched 
from cache. If the block address does not correspond to any of the tags, it is said to be a cache 
miss and the line must be fetched from the next level of memory.

See also Cache terminology diagram on the last page of this glossary.
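
The tag comparison described in the Tag entry can be sketched in software. The geometry below (32-byte lines, 128 sets, four ways) is an assumption chosen for illustration, and the function names are hypothetical. Hardware compares all tags in a set in parallel, whereas this sketch checks them one after another.

```python
# Assumed cache geometry, for illustration only.
LINE_BYTES = 32
NUM_SETS = 128
OFFSET_BITS = LINE_BYTES.bit_length() - 1   # 5 bits of byte offset
INDEX_BITS = NUM_SETS.bit_length() - 1      # 7 bits of set index


def split_address(addr: int):
    """Split a block address into (tag, set index, byte offset)."""
    offset = addr & (LINE_BYTES - 1)
    index = (addr >> OFFSET_BITS) & (NUM_SETS - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


def lookup(cache_sets, addr: int):
    """Return the way that hits, or None on a cache miss.

    cache_sets[index] holds one stored tag per way (None if the way is invalid).
    """
    tag, index, _ = split_address(addr)
    for way, stored_tag in enumerate(cache_sets[index]):
        if stored_tag == tag:
            return way      # cache hit: the line can be fetched from this way
    return None             # cache miss: fetch from the next level of memory
```

A cache can be modeled as `[[None] * 4 for _ in range(NUM_SETS)]`, with a tag stored in a way whenever a line is filled.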

TAP

See Test access port.

TCM

See Tightly coupled memory.

Test Access Port (TAP)

The collection of four mandatory and one optional terminals that form the input/output and 
control interface to a JTAG boundary-scan architecture. The mandatory terminals are TDI, TDO, 
TMS, and TCK. The optional terminal is TRST. This signal is mandatory in ARM cores 
because it is used to reset the debug logic.

Thumb instruction

A halfword that specifies an operation for an ARM processor in Thumb state to perform. Thumb 
instructions must be halfword-aligned.

Summary of Contents for ARM1176JZF-S

Page 1: ...Copyright 2004 2009 ARM Limited All rights reserved ARM DDI 0301H ID012310 ARM1176JZF S Revision r0p7 Technical Reference Manual ...

Page 2: ...d it means ARM or any of its subsidiaries as appropriate Figure 14 1 on page 14 2 reprinted with permission from IEEE Std 1149 1 2001 IEEE Standard Test Access Port and Boundary Scan Architecture by IEEE Std The IEEE disclaims any responsibility or liability resulting from the placement and use in the described manner Some material in this document is based on IEEE Standard for Binary Floating Poi...

Page 3: ...pyright 2004 2009 ARM Limited All rights reserved iii ID012310 Non Confidential Unrestricted Access Product Status The information in this document is final that is for a developed product Web Address http www arm com ...

Page 4: ...processor 1 8 1 6 Power management 1 23 1 7 Configurable options 1 25 1 8 Pipeline stages 1 26 1 9 Typical pipeline operations 1 28 1 10 ARM1176JZF S instruction set summary 1 32 1 11 Product revisions 1 47 Chapter 2 Programmer s Model 2 1 About the programmer s model 2 2 2 2 Secure world and Non secure world operation with TrustZone 2 3 2 3 Processor operating states 2 12 2 4 Instruction length 2...

Page 5: ... 10 Chapter 6 Memory Management Unit 6 1 About the MMU 6 2 6 2 TLB organization 6 4 6 3 Memory access sequence 6 7 6 4 Enabling and disabling the MMU 6 9 6 5 Memory access control 6 11 6 6 Memory region attributes 6 14 6 7 Memory attributes and types 6 20 6 8 MMU aborts 6 27 6 9 MMU fault checking 6 29 6 10 Fault status and address 6 34 6 11 Hardware page table translation 6 36 6 12 MMU descriptor...

Page 6: ...hapter 13 Debug 13 1 Debug systems 13 2 13 2 About the debug unit 13 3 13 3 Debug registers 13 5 13 4 CP14 registers reset 13 25 13 5 CP14 debug instructions 13 26 13 6 External debug interface 13 28 13 7 Changing the debug enable signals 13 31 13 8 Debug events 13 32 13 9 Debug exception 13 35 13 10 Debug state 13 37 13 11 Debug communications channel 13 42 13 12 Debugging in a cached system 13 4...

Page 7: ...ction to the VFP coprocessor 18 1 About the VFP11 coprocessor 18 2 18 2 Applications 18 3 18 3 Coprocessor interface 18 4 18 4 VFP11 coprocessor pipelines 18 5 18 5 Modes of operation 18 11 18 6 Short vector instructions 18 13 18 7 Parallel execution of instructions 18 14 18 8 VFP11 treatment of branch instructions 18 15 18 9 Writing optimal VFP11 code 18 16 18 10 VFP11 revision information 18 17 ...

Page 8: ...2 15 22 8 Overflow exception 22 16 22 9 Underflow exception 22 17 22 10 Inexact exception 22 18 22 11 Input exceptions 22 19 22 12 Arithmetic exceptions 22 20 Appendix A Signal Descriptions A 1 Global signals A 2 A 2 Static configuration signals A 4 A 3 TrustZone internal signals A 5 A 4 Interrupt signals including VIC interface A 6 A 5 AXI interface signals A 7 A 6 Coprocessor interface signals A...

Page 9: ...mode 3 1 42 Table 1 11 Addressing mode 4 1 42 Table 1 12 Addressing mode 5 1 42 Table 1 13 Operand2 1 43 Table 1 14 Fields 1 43 Table 1 15 Condition codes 1 43 Table 1 16 Thumb instruction set summary 1 44 Table 2 1 Write access behavior for system control processor registers 2 9 Table 2 2 Secure Monitor bus signals 2 11 Table 2 3 Address types in the processor system 2 16 Table 2 4 Mode structure...

Page 10: ... Results of access to the Instruction Set Attributes Register 2 3 40 Table 3 34 Instruction Set Attributes Register 3 bit functions 3 41 Table 3 35 Results of access to the Instruction Set Attributes Register 3 3 41 Table 3 36 Instruction Set Attributes Register 4 bit functions 3 42 Table 3 37 Results of access to the Instruction Set Attributes Register 4 3 43 Table 3 38 Results of access to the I...

Page 11: ...bit functions 3 96 Table 3 94 Results of access to the TCM Selection Register 3 97 Table 3 95 Cache Behavior Override Register bit functions 3 98 Table 3 96 Results of access to the Cache Behavior Override Register 3 98 Table 3 97 TLB Lockdown Register bit functions 3 100 Table 3 98 Results of access to the TLB Lockdown Register 3 100 Table 3 99 Primary Region Remap Register bit functions 3 102 Ta...

Page 12: ...ter 3 146 Table 3 148 TLB Lockdown Index Register bit functions 3 149 Table 3 149 TLB Lockdown VA Register bit functions 3 150 Table 3 150 TLB Lockdown PA Register bit functions 3 150 Table 3 151 Access permissions APX and AP bit fields encoding 3 151 Table 3 152 TLB Lockdown Attributes Register bit functions 3 151 Table 3 153 Results of access to the TLB lockdown access registers 3 152 Table 4 1 ...

Page 13: ...ble 8 33 Noncacheable LDM8 from word 0 8 22 Table 8 34 Noncacheable LDM8 from word 1 2 3 4 5 6 or 7 8 22 Table 8 35 Noncacheable LDM9 8 22 Table 8 36 Noncacheable LDM10 8 23 Table 8 37 Noncacheable LDM11 8 23 Table 8 38 Noncacheable LDM12 8 24 Table 8 39 Noncacheable LDM13 8 24 Table 8 40 Noncacheable LDM14 8 24 Table 8 41 Noncacheable LDM15 8 25 Table 8 42 Noncacheable LDM16 8 25 Table 8 43 Half ...

Page 14: ...3 Processor Watchpoint Value Registers 13 20 Table 13 14 Watchpoint Value Registers bit field definitions 13 21 Table 13 15 Processor Watchpoint Control Registers 13 21 Table 13 16 Watchpoint Control Registers bit field definitions 13 21 Table 13 17 Debug State Cache Control Register bit functions 13 23 Table 13 18 Debug State MMU Control Register bit functions 13 24 Table 13 19 CP14 debug instruc...

Page 15: ... 19 1 VFP11 MCR instructions 19 6 Table 19 2 VFP11 MRC instructions 19 6 Table 19 3 VFP11 MCRR instructions 19 6 Table 19 4 VFP11 MRRC instructions 19 7 Table 19 5 Single precision data memory images and byte addresses 19 9 Table 19 6 Double precision data memory images and byte addresses 19 9 Table 19 7 Single precision three operand register usage 19 13 Table 19 8 Single precision two operand re...

Page 16: ...lds 22 23 Table 22 11 FCVTSD bounce thresholds 22 24 Table 22 12 Single precision float to integer bounce thresholds and stored results 22 25 Table 22 13 Double precision float to integer bounce thresholds and stored results 22 26 Table A 1 Global signals A 2 Table A 2 Static configuration signals A 4 Table A 3 TrustZone internal signals A 5 Table A 4 Interrupt signals A 6 Table A 5 Port signal na...

Page 17: ...partition in the Secure and Non secure worlds 2 7 Figure 2 4 Big endian addresses of bytes within words 2 15 Figure 2 5 Little endian addresses of bytes within words 2 15 Figure 2 6 Register organization in ARM state 2 20 Figure 2 7 Processor core register set showing banked registers 2 21 Figure 2 8 Register organization in Thumb state 2 22 Figure 2 9 ARM state and Thumb state registers relations...

Page 18: ...rol Register format 3 56 Figure 3 32 Translation Table Base Register 0 format 3 57 Figure 3 33 Translation Table Base Register 1 format 3 59 Figure 3 34 Translation Table Base Control Register format 3 61 Figure 3 35 Domain Access Control Register format 3 63 Figure 3 36 Data Fault Status Register format 3 64 Figure 3 37 Instruction Fault Status Register format 3 66 Figure 3 38 Cache operations 3 ...

Page 19: ...ore word big endian 4 12 Figure 6 1 Memory ordering restrictions 6 24 Figure 6 2 Translation table managed TLB fault checking sequence part 1 6 30 Figure 6 3 Translation table managed TLB fault checking sequence part 2 6 31 Figure 6 4 Backwards compatible first level descriptor format 6 37 Figure 6 5 Backwards compatible second level descriptor format 6 38 Figure 6 6 Backwards compatible section s...

Page 20: ...gure 14 3 Bypass register bit order 14 8 Figure 14 4 Device ID code register bit order 14 9 Figure 14 5 Instruction register bit order 14 9 Figure 14 6 Scan chain select register bit order 14 10 Figure 14 7 Scan chain 0 bit order 14 11 Figure 14 8 Scan chain 1 bit order 14 11 Figure 14 9 Scan chain 4 bit order 14 13 Figure 14 10 Scan chain 5 bit order EXTEST selected 14 15 Figure 14 11 Scan chain ...

Page 21: ...mited All rights reserved xxi ID012310 Non Confidential Unrestricted Access Preface This preface introduces the ARM1176JZF S Technical Reference Manual TRM It contains the following sections About this book on page xxii Feedback on page xxvi ...

Page 22: ...ter 2 Programmer s Model Read this for a description of the processor registers and programming details Chapter 3 System Control Coprocessor Read this for a description of the processor s system control coprocessor CP15 registers and programming details Chapter 4 Unaligned and Mixed endian Data Access Support Read this for a description of the processor support for unaligned and mixed endian data ...

Page 23: ... description of the timing parameters applicable to the processor Chapter 18 Introduction to the VFP coprocessor Read this to get an overview of the VFP11 coprocessor Chapter 19 The VFP Register File Read this to learn about the structure and operation of the VFP11 register file Chapter 20 VFP Programmer s Model Read this to learn about the VFPv2 programmer s model including the ARMv5TE coprocesso...

Page 24: ... You can enter the underlined text instead of the full command or option name monospace italic Denotes arguments to monospace text where the argument is to be replaced by a specific value monospace bold Denotes language keywords when used outside example code and Enclose replaceable terms for assembler syntax where they appear in code or code fragments For example MRC p15 0 Rd CRn CRm Opcode_2 Tim...

Page 25: ...chitecture Reference Manual ARM DDI 0225 AMBA AXI Protocol V1 0 Specification ARM IHI 0022 Embedded Trace Macrocell Architecture Specification ARM IHI 0014 ARM1136J S Technical Reference Manual ARM DDI 0211 ARM11 Memory Built In Self Test Controller Technical Reference Manual ARM DDI 0289 ARM1176JZF S and ARM1176JZ S Implementation Guide ARM DII 0081 CoreSight ETM11 Technical Reference Manual ARM ...

Page 26: ...contact your supplier and give The product name The product revision or version An explanation with as much information as you can provide Include symptoms and diagnostic procedures if appropriate Feedback on content If you have comments on content then send an e mail to errata arm com Give the title the number ARM DDI 0301H the page numbers to which your comments apply a concise explanation of yo...

Page 27: ...g sections About the processor on page 1 2 Extensions to ARMv6 on page 1 3 TrustZone security extensions on page 1 4 ARM1176JZF S architecture with Jazelle technology on page 1 6 Components of the processor on page 1 8 Power management on page 1 23 Configurable options on page 1 25 Pipeline stages on page 1 26 Typical pipeline operations on page 1 28 ARM1176JZF S instruction set summary on page 1 ...

Page 28: ...terfaces supporting prioritized multiprocessor implementations an integer core with integral EmbeddedICE RT logic an eight stage pipeline branch prediction with return stack low interrupt latency configuration internal coprocessors CP14 and CP15 Vector Floating Point VFP coprocessor support external coprocessor interface Instruction and Data Memory Management Units MMUs managed using MicroTLB stru...

Page 29: ...ge number of bits to describe all of the options for inner and outer cachability In reality it is believed that no application requires all of these options simultaneously Therefore it is possible to configure the ARM1176JZF S processor to support only a small number of options by means of the TEX remap mechanism This implies a level of indirection in the page table mappings The TEX CB encoding ta...

Page 30: ...n be either Secure or Non-secure; Secure Monitor mode, that is always Secure. Except when the processor is in Secure Monitor mode, the NS bit in the Secure Configuration Register determines whether the processor runs code in the Secure or Non-secure worlds. The Secure Configuration Register is in CP15 register c1, see c1, Secure Configuration Register on page 3-52. Secure Monitor mode is used to switch op...
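
The rule in the Page 30 excerpt can be written as a small decision function. This is a sketch only: the function name is hypothetical, and the mode encoding and the meaning of the NS bit values (0 for Secure, 1 for Non-secure) are stated here as assumptions because the excerpt does not give them.

```python
# Assumption: CPSR mode field value for Secure Monitor mode.
MONITOR_MODE = 0b10110


def is_secure(cpsr_mode: int, ns_bit: int) -> bool:
    """Return True if the processor is running in the Secure world."""
    if cpsr_mode == MONITOR_MODE:
        # Secure Monitor mode is always Secure, whatever the NS bit holds.
        return True
    # In every other mode the NS bit decides (assumed: 0 = Secure, 1 = Non-secure).
    return ns_bit == 0
```

The early return for Monitor mode reflects the "except when the processor is in Secure Monitor mode" clause in the excerpt.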

Page 31: ...rusted peripherals through these signals AxPROT 1 Protection type signal see AxPROT 2 0 on page 8 12 RRESP 1 0 Read response signal see AXI interface signals on page A 7 BRESP 1 0 Write response signal see AXI interface signals on page A 7 ETMIASECCTL 1 0 and ETMCPSECCTL 1 0 TrustZone information for tracing see Secure control bus on page 15 4 ...

Page 32: ...it architecture with higher code density than a 32 bit architecture The ARM1176JZ S processor can easily switch between running in ARM state and running in Thumb state This enables you to optimize both code density and performance to best suit your application requirements 1 4 2 The Thumb instruction set The Thumb instruction set is a subset of the most commonly used 32 bit ARM instructions Thumb ...

Page 33: ... that are too complex to execute directly in hardware are executed in software An ARM register is used to access a table of exception handlers to handle these particular bytecodes A complete list of the ARM1176JZF S processor supported Java bytecodes and their corresponding hardware or software instructions is in the Jazelle V1 Architecture Reference Manual ...

Page 34: ...lock diagram 1 5 1 Integer core The ARM1176JZF S processor is built around the ARM11 integer core It is an implementation of the ARMv6 architecture that runs the ARM Thumb and Java instruction sets The processor contains EmbeddedICE RT logic and a JTAG debug interface to enable hardware debuggers to communicate with the processor The following sections describe the core in more detail Instruction ...

Page 35: ...ap instructions can access data from memory Conditional execution The processor conditionally executes nearly all ARM instructions You can decide if the condition code flags Negative Zero Carry and Overflow are updated according to their result Registers The ARM1176JZF S core contains 33 general purpose 32 bit registers 7 dedicated 32 bit registers Note At any one time 16 general purpose registers...

Page 36: ...nstruction that improves the performance and size of code for multi word unsigned multiplications Single Instruction Multiple Data SIMD Instructions to perform operations on pairs of 16 bit values held in a single register or on sets of four 8 bit values held in a single register The main operations supplied are addition and subtraction selection pack and saturation Instructions to extract bytes a...

Page 37: ...d Return stack: The processor includes a three-entry return stack to accelerate returns from procedure calls. For each procedure call, the processor pushes the return address onto a hardware stack. When the processor recognizes a procedure return, the processor pops the address held in the return stack that the prefetch unit uses as the predicted return address. Note: See Pipeline stages on page 1-26 for...
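
The push and pop behavior described in the Page 37 excerpt can be modeled as below. The class and method names are hypothetical. The excerpt does not say what happens when more than three calls are nested or when the stack is empty, so dropping the oldest entry and returning None are assumptions of this sketch.

```python
from collections import deque


class ReturnStack:
    """Software model of a small hardware return stack."""

    def __init__(self, entries: int = 3):
        # Assumption: when full, the oldest entry is discarded.
        self._stack = deque(maxlen=entries)

    def on_call(self, return_address: int) -> None:
        """Push the return address when a procedure call is seen."""
        self._stack.append(return_address)

    def on_return(self):
        """Pop the predicted return address, or None if nothing is stored."""
        return self._stack.pop() if self._stack else None
```

With four nested calls, the three innermost returns are predicted from the stack and the outermost one is not, which is the cost of the assumed overflow policy.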

Page 38: ...ss controls and virtual memory management support for four sizes of memory page two channel DMA into TCMs I fetch D read write interface compatible with multi layer AMBA AXI 32 bit dedicated peripheral interface export of memory attributes for second level memory system The following sections describe the memory system in more detail Instruction and data caches Cache power management on page 1 13 ...

Page 39: ... not respond well to caching configurable memory blocks are provided for Instruction and Data Tightly Coupled Memories TCMs These ensure high speed access to code or data An Instruction TCM typically holds an interrupt or exception code that the processor must access at high speed without any potential delay resulting from a cache miss A Data TCM typically holds a block of data for intensive proce...

Page 40: ...active at any time DMA features The DMA controller has the following features runs in background of CPU operations enables CPU priority access to TCM during DMA programmed with Virtual Addresses controls DMA to either the instruction or data TCM allocated by a privileged process OS software can check and monitor DMA progress interrupts on DMA event ability to configure each channel to transfer dat...

Page 41: ...ections 64KB large pages 4KB small pages Domains Sixteen access domains are supported TLB A two level TLB structure is implemented Eight entries in the main TLB are lockable Hardware TLB loading is supported and is backwards compatible with previous versions of the ARM architecture ASIDs TLB entries can be global or can be associated with particular processes or applications using Application Spac...

Page 42: ...aves as a single bidirectional port These ports enable several simultaneous outstanding transactions providing high performance from second level memory systems that support parallelism high use of pipelined and multi page memories such as SDRAM The following sections describe the AMBA AXI interface in more detail Bus clock speeds Unaligned accesses Mixed endian support Write buffer on page 1 17 P...

Page 43: ... data for all loads to coprocessors in the order of the accesses in the program The processor suppresses HUM operation of the cache for coprocessor instructions The external coprocessor interface relies on the coprocessor executing all its instructions in order Externally connected coprocessors follow the early stages of the core pipeline to permit the exchange of instructions and data between the...

Page 44: ...ull speed of the core The ETM interface connects directly to the external ETM unit without any additional glue logic You can disable the ETM interface for power saving For more information see the Embedded Trace Macrocell Architecture Specification Chapter 15 Trace Interface Port Appendix A Signal Descriptions for details of ETM related signals ETM trace buffer You can extend the functionality of ...

Page 45: ...ception entry activates a debug monitor program that performs critical interrupt service routines to debug the processor The debug monitor program communicates with the debug host over the DCC Debug and trace Environment Several external hardware and software tools are available for you to enable real time debugging using the EmbeddedICE RT logic execution trace using the ETM 1 5 8 Instruction cyc...

Page 46: ...basic single and basic double formats are used For full compliance the VFP requires support code to handle arithmetic where operands or results are de norms This support code is normally installed on the Undefined instruction exception handler Flush to zero mode A flush to zero mode is provided where a default treatment of de norms is applied Table 1 3 lists the default behavior in flush to zero m...

Page 47: ... signal This provides faster interrupt entry but you can disable it for compatibility with earlier interrupt controllers Low interrupt latency configuration This mode minimizes the worst case interrupt latency of the processor with a small reduction in peak performance or instructions per cycle You can tune the behavior of the core to suit the requirements of the application The low interrupt late...

Page 48: ...ol coprocessor To ensure that a change between normal and low interrupt latency configurations is synchronized correctly you must use software systems that only change the configuration while interrupts are disabled Exception processing enhancements The ARMv6 architecture contains several enhancements to exception processing to reduce interrupt handler entry and exit time SRS Save return state to ...

Page 49: ...N and FREECLKIN; four optional IEM Register Slices to have an asynchronous interface between the Level 2 ports, powered by VCore and clocked by CLKIN, and the AXI system, powered by VSoc and clocked by ACLK clocks, one for each port. The ARM1176JZF-S processor supports four levels of power management: Run mode: This mode is the normal mode of operation when the processor can use all its functions. Standby m...

Page 50: ...wered up and maintaining their state The valid bits remain visible to software to enable you to implement dormant mode For full implementation of dormant mode you must modify the RAM blocks to include an input clamp implement separate power domains For full implementation of dormant mode see ARM1176JZF S and ARM1176JZ S Implementation Guide For more details of power management features see Chapter...

Page 51: ...en the processor is implemented For details see the ARM11 Memory Built In Self Test Controller Technical Reference Manual Table 1 5 lists the default configuration of ARM1176JZF S processor Table 1 4 Configurable options Feature Range of options IEM support Yes or No Cache way size 1KB 2KB 4KB 8KB or 16KB Number of cache ways 4 not configurable TCM block size 4KB 8KB 16KB or 32KB Number of TCM blo...

Page 52: ...f integer results WBex Write back of data from the multiply or main execution pipelines MAC1 First stage of the multiply accumulate pipeline MAC2 Second stage of the multiply accumulate pipeline MAC3 Third stage of the multiply accumulate pipeline ADD Address generation stage DC1 First stage of data cache access DC2 Second stage of data cache access WBls Write back of data from the Load Store Unit...

Page 53: ...ions where branch prediction is performed on instructions ahead of execution of earlier instructions The Issue and Decode stages can contain any instruction in parallel with a predicted branch The Execute Memory and Write stages can contain a predicted branch an ALU or multiply instruction a load store multiple instruction and a coprocessor instruction in parallel execution ...

Page 54: ...ond half of the array once to produce the final result MAC1 1st multiply stage Sh Shifter operation Ex1 1st fetch stage Fe1 Fe2 De Iss WBex DC1 DC2 2nd fetch stage Instruction decode Register read and instruction issue Base register writeback Data address calculation First stage of data cache access Second stage of data cache access Writeback from LSU Load miss waits ADD WBls Common decode pipelin...

Page 55: ... stage Sat Not used Ex3 MAC2 2nd multiply stage ALU Not used Ex2 MAC1 1st multiply stage Sh Not used Ex1 1st fetch stage Fe1 Fe2 De Iss 2nd fetch stage Instruction decode Register read and instruction issue Not used Common decode pipeline WBex Base register writeback Not used ADD DC1 Not used DC2 Not used Not used WBls ALU pipeline Load store pipeline Hit under miss Multiply pipeline ...

Page 56: ...ration Ex3 MAC2 Not used ALU Calculate writeback value Ex2 MAC1 Not used Sh Shifter operation Ex1 1st fetch stage Fe1 Fe2 De Iss 2nd fetch stage Instruction decode Register read and instruction issue Not used Common decode pipeline Data address calculation ADD DC1 First stage of data cache access DC2 Second stage of data cache access Writeback from LSU WBls WBex Base register writeback ALU pipelin...

Page 57: ... Saturation Ex3 MAC2 Not used ALU Calculate writeback value Ex2 MAC1 Not used Sh Shifter operation Ex1 1st fetch stage Fe1 Fe2 De Iss WBex DC1 DC2 2nd fetch stage Instruction decode Register read and instruction issue Base register writeback Data address calculation Writeback from LSU ADD WBls ALU pipeline Load store pipeline Hit under miss Common decode pipeline Multiply pipeline First stage of d...

Page 58: ...s that the CPSR is loaded from the SPSR B Byte operation H Halfword operation T Forces execution to be handled as having User mode privilege Cannot be used with pre indexed addresses x Selects HIGH or LOW 16 bits of register Rm T selects the HIGH 16 bits T top and B selects the LOW 16 bits B bottom y Selects HIGH or LOW 16 bits of register Rs T selects the HIGH 16 bits T top and B selects the LOW ...

Page 59: ... coprocessor operation to perform operand2 See Table 1 13 on page 1 43 option Specifies additional instruction options to the coprocessor An integer in the range 0 to 255 surrounded by and reglist A comma separated list of registers enclosed in braces and rotation One of ROR 8 ROR 16 or ROR 24 Rm Specifies the register the value of which is the instruction operand Rn Specifies the address of the b...

Page 60: ...ltiply 32x16 SMULWy cond Rd Rm Rs Multiply accumulate 32x16 32 SMLAWy cond Rd Rm Rs Rn Multiply signed accumulate long 16x16 64 SMLALxy cond RdLo RdHi Rm Rs Count leading zeros CLZ cond Rd Rm Compare Compare CMP cond Rn operand2 Compare negative CMN cond Rn operand2 Logical Move MOV cond S Rd operand2 Move NOT MVN cond S Rd operand2 Test TST cond Rn operand2 Test equivalence TEQ cond Rn operand2 A...

Page 61: ...mode3 Halfword LDR cond H Rd a_mode3 Halfword signed LDR cond SH Rd a_mode3 Doubleword LDR cond D Rd a_mode3 Return from exception RFE a_mode4 Rn Load multiple Stack operations LDM cond a_mode4L Rn reglist Increment before LDM cond IB Rn reglist Increment after LDM cond IA Rn reglist Decrement before LDM cond DB Rn reglist Decrement after LDM cond DA Rn reglist Stack operations and restore CPSR LD...

Page 62: ... REV16 cond Rd Rm Byte reverse signed halfword REVSH cond Rd Rm Synchronization primitives Load exclusive LDREX cond Rd Rn Store exclusive STREX cond Rd Rm Rn Load Byte Exclusive LDREXB cond Rxf Rbase Load Halfword Exclusive LDREXH cond Rd Rn Load Doubleword Exclusive LDREXD cond Rd Rn Store Byte Exclusive STREXB cond Rd Rm Rn Store Halfword Exclusive STREXH cond Rd Rm Rn Store Doubleword Exclusiv...

Page 63: ...DD16 cond Rd Rn Rm Saturated add high 16 16 low 16 16 QADD16 cond Rd Rn Rm Signed high 16 16 low 16 16 halved SHADD16 cond Rd Rn Rm Unsigned high 16 16 low 16 16 set GE flags UADD16 cond Rd Rn Rm Saturated unsigned high 16 16 low 16 16 UQADD16 cond Rd Rn Rm Unsigned high 16 16 low 16 16 halved UHADD16 cond Rd Rn Rm Signed high 16 low 16 low 16 high 16 set GE flags SADDSUBX cond Rd Rn Rm Saturated ...

Page 64: ... set GE flags USUB16 cond Rd Rn Rm Saturated unsigned high 16 16 low 16 16 UQSUB16 cond Rd Rn Rm Unsigned high 16 16 low 16 16 halved UHSUB16 cond Rd Rn Rm Four signed 8 8 set GE flags SADD8 cond Rd Rn Rm Four saturated 8 8 QADD8 cond Rd Rn Rm Four signed 8 8 halved SHADD8 cond Rd Rn Rm Four unsigned 8 8 set GE flags UADD8 cond Rd Rn Rm Four saturated unsigned 8 8 UQADD8 cond Rd Rn Rm Four unsigne...

Page 65: ...m rotation Low 8 zero extend to 32 UXTB cond Rd Rm rotation Low 16 zero extend to 32 UXTH cond Rd Rm rotation Signed multiply and multiply accumulate Signed high 16 x 16 low 16 x 16 32 and set Q flag SMLAD cond Rd Rm Rs Rn As SMLAD but high x low low x high and set Q flag SMLADX cond Rd Rm Rs Rn Signed high 16 x 16 low 16 x 16 32 SMLSD cond Rd Rm Rs Rn As SMLSD but high x low low x high SMLSDX con...

Page 66: ...ate select and pack Signed saturation at bit position n SSAT cond Rd immed_5 Rm shift Unsigned saturation at bit position n USAT cond Rd immed_5 Rm shift Two 16 signed saturation at bit position n SSAT16 cond Rd immed_4 Rm Two 16 unsigned saturation at bit position n USAT16 cond Rd immed_4 Rm Select bytes from Rn Rm based on GE flags SEL cond Rd Rn Rm Pack low 16 32 high 16 32 PKHBT cond Rd Rn Rm ...

Page 67: ...med_5 Rn Rm ROR immed_5 Rn Rm RRX Post indexed offset Immediate Rn immed_12 Zero offset Rn Register offset Rn Rm Scaled register offset Rn Rm LSL immed_5 Rn Rm LSR immed_5 Rn Rm ASR immed_5 Rn Rm ROR immed_5 Rn Rm RRX Table 1 9 Addressing mode 2P post indexed only Addressing mode Assembler Post indexed offset Immediate offset Rn immed_12 Zero offset Rn Register offset Rn Rm Scaled register offset ...

Page 68: ...ndexed Rn Rm Table 1 11 Addressing mode 4 Addressing mode Stack type Block load Stack pop LDM RFE IA Increment after FD Full descending IB Increment before E D Empty descending DA Decrement after FA Full ascending DB Decrement before E A Empty ascending Block store Stack push STM SRS IA IA Increment after E A Empty ascending IB IB Increment before FA Full ascending DA DA Decrement after E D Empty ...

Page 69: ...tate right Rm ROR immed_5 Register Rm Logical shift left Rm LSL Rs Logical shift right Rm LSR Rs Arithmetic shift right Rm ASR Rs Rotate right Rm ROR Rs Rotate right extended Rm RRX Table 1 14 Fields Suffix Sets this bit in the MSR field_mask MSR instruction bit number c Control field mask bit bit 0 16 x Extension field mask bit bit 1 17 s Status field mask bit bit 2 18 f Flags field mask bit bit ...

Page 70: ...gs MOV Rd immed_8 LowReg to LowReg update flags MOV Rd Rm HighReg to LowReg MOV Rd Rm LowReg to HighReg MOV Rd Rm HighReg to HighReg MOV Rd Rm Copy CPY Rd Rm Arithmetic Add ADD Rd Rn immed_3 Add immediate ADD Rd immed_8 Add LowReg and LowReg update flags ADD Rd Rn Rm Add HighReg to LowReg ADD Rd Rm Add LowReg to HighReg ADD Rd Rm Add HighReg to HighReg ADD Rd Rm Add immediate to PC ADD Rd PC immed...

Page 71: ...Rm OR ORR Rd Rm Bit clear BIC Rd Rm Move NOT MVN Rd Rm Test bits TST Rd Rm Shift Rotate Logical shift left LSL Rd Rm immed_5 LSL Rd Rs Logical shift right LSR Rd Rm immed_5 LSR Rd Rs Arithmetic shift right ASR Rd Rm immed_5 ASR Rd Rs Rotate right ROR Rd Rs Branch Conditional B cond label Unconditional B label Branch with link BL label Branch link and exchange BLX label Branch link and exchange BLX...

Page 72: ... immed_8 4 Multiple STMIA Rn reglist Push Pop Push registers onto stack PUSH reglist Push LR and registers onto stack PUSH reglist LR Pop registers from stack POP reglist Pop registers and PC from stack POP reglist PC Change state Change processor state CPS effect iflags Change endianness SETEND endian_specifier Byte reverse Byte reverse word REV Rd Rm Byte reverse halfword REV16 Rd Rm Byte revers...
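The byte-reverse entries can be illustrated with a short Python model. This is a sketch of the architectural behaviour of REV and REV16 only, with hypothetical helper names; the third byte-reverse instruction is truncated in the excerpt and is not modelled.

```python
def rev(x):
    """REV: reverse the byte order of a 32-bit word."""
    return int.from_bytes((x & 0xFFFFFFFF).to_bytes(4, "little"), "big")


def rev16(x):
    """REV16: reverse the byte order inside each 16-bit halfword."""
    x &= 0xFFFFFFFF
    return ((x & 0x00FF00FF) << 8) | ((x & 0xFF00FF00) >> 8)
```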

Page 73: ... RAMs has been changed For more information see the description of the RAM interface implementation in the ARM1176JZF S and ARM1176JZ S Implementation Guide r0p1 r0p2 There are no functional differences between r0p1 and r0p2 r0p2 r0p4 There are no functional differences between r0p2 and r0p4 r0p4 r0p6 Between r0p4 and r0p6 there are no differences in the functionality described in this Technical R...

Page 74: ...ections About the programmer s model on page 2 2 Secure world and Non secure world operation with TrustZone on page 2 3 Processor operating states on page 2 12 Instruction length on page 2 13 Data types on page 2 14 Memory formats on page 2 15 Addresses in a processor system on page 2 16 Operating modes on page 2 17 Registers on page 2 18 The program status registers on page 2 24 Additional instru...

Page 75: ...cture includes the 32 bit ARM instruction set 16 bit Thumb instruction set and the 8 bit Java instruction set For details of both the ARM and Thumb instruction sets see the ARM Architecture Reference Manual For the Java instruction set see the Jazelle V1 Architecture Reference Manual TrustZone provides Secure and Non secure worlds for software to operate in For more details see Secure world and No...

Page 76: ...S that includes the Secure kernel and the Non secure OS that includes the Non secure kernel For details on modes of operation see Operating modes on page 2 17 Figure 2 1 Secure and Non secure worlds In normal Non secure operation the OS runs tasks in the usual way When a User process requires Secure execution it makes a request to the Non secure kernel that operates in privileged mode and this cal...

Page 77: ...s switches to and from the Secure world The overall security of the software relies on the security of this code along with the Secure boot code When the Secure Monitor transfers control from one world to the other it must save the processor context that includes register banks from one world and restore those for the other world The processor hardware automatically shadows and changes context inf...

Page 78: ...rongly recommended that you do not set the NS bit in Privileged modes other than in Secure Monitor mode. If you do so, you face the same problem as a return to the Non-secure world with the MSR instruction. Note: To avoid leakage after an MSR instruction, use an IMB sequence. To enter the Secure Monitor, the processor executes: SMC{<cond>} <imm16> where <cond> is the condition when the processor executes the SMC...

Page 79: ...ed appropriately For external accesses AxPROT 1 indicates whether the access is Secure or Non secure The TrustZone security extensions are completely compatible with existing software This means that existing applications and operating systems access memory without change Where a system employs Secure functionality the Non secure world is effectively blind to Secure memory This means that Secure a...

Page 80: ...emory and can target both Secure and Non secure memory Figure 2 3 Memory partition in the Secure and Non secure worlds Non secure Virtual memory 32KB on chip RAM Non secure translation table base address NS attribute Secure Virtual memory Physical memory Non secure level 1 descriptors 4KB non secure 4KB non secure 4KB non secure 4KB non secure 4KB non secure 4KB secure 4KB secure 4KB secure Secure...

Page 81: ...is includes such actions as a Allocate TCM memory for the Secure Monitor code b Allocate scratch work space c Set up the Secure Monitor stack pointer and initialize its state block 3 Program the partition checker to allocate physical memory available to the Non secure OS 4 Yield control to the Non secure OS The Non secure OS boots after this The overall security of the software relies on the secur...

Page 82: ...ically possible. Software must perform a Prefetch Flush CP15 operation after a change to this pin on the boundary of the macrocell, to ensure that its effect is recognized for following instructions. It is expected that: control of the CP15SDISABLE pin remains within the SoC that embodies the macrocell; the CP15SDISABLE pin is set to logic 0 by the SoC hardware at reset. You can use the CP15SDISABLE ...

Page 83: ...emory Remap Register MCR p15 0 Rd c10 c2 1 Secure Monitor or Privileged when NS 0 Secure Vector Base Register MCR p15 0 Rd c12 c0 0 Secure Monitor or Privileged when NS 0 Monitor Vector Base Register MCR p15 0 Rd c12 c0 1 Secure Monitor or Privileged when NS 0 Secure FCSE Register MCR p15 0 Rd c13 c0 0 Secure Monitor or Privileged when NS 0 Peripheral Port remap Register MCR p15 0 Rd c15 c2 4 Secu...

Page 84: ...ecure Control Register I bit bit 12 14 Secure Control Register C bit bit 2 13 Secure Control Register M bit bit 0 12 Secure Configuration Register NS bit bit 0 11 CPSR A bit bit 8 taken from the core pipeline writeback stage 10 CPSR I bit bit 7 taken from the core pipeline writeback stage 9 CPSR F bit bit 6 taken from the core pipeline writeback stage 8 5 CPSR mode bits bits 3 0 taken from the cor...

Page 85: ...ails on entering and exiting Jazelle state, see Jazelle V1 Architecture Reference Manual. 2.3.1 Switching state. You can switch the operating state of the processor between: ARM state and Thumb state, using the BX and BLX instructions and loads to the PC (the ARM Architecture Reference Manual describes the switching state); ARM state and Jazelle state, using the BXJ instruction. All exceptions are entered h...

Page 86: ...2.4 Instruction length. Instructions are one of: 32 bits long, in ARM state; 16 bits long, in Thumb state; variable length, multiples of 8 bits, in Jazelle state ...

Page 87: ...of these types are described as signed, the N-bit data value represents an integer in the range $-2^{N-1}$ to $+2^{N-1}-1$, using two's complement format. For best performance you must align these as follows: word quantities must be aligned to four-byte boundaries; halfword quantities must be aligned to two-byte boundaries; byte quantities can be placed on any byte boundary. The processor provides mixed endian and u...
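The signed range and the alignment rules on this page are easy to check numerically. The Python sketch below uses hypothetical helper names and assumes only the standard two's complement definition, with natural alignment for words (4 bytes) and halfwords (2 bytes).

```python
def signed_range(n):
    """Return (min, max) of an n-bit two's complement integer."""
    return -(1 << (n - 1)), (1 << (n - 1)) - 1


def to_signed(raw, n):
    """Interpret the low n bits of raw as a two's complement value."""
    raw &= (1 << n) - 1
    return raw - (1 << n) if raw >> (n - 1) else raw


def is_aligned(address, size):
    """True if address is a multiple of size (4 for words, 2 for halfwords)."""
    return address % size == 0
```

For instance, an 8-bit signed value spans -128 to 127, and address 0x1002 is halfword-aligned but not word-aligned.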

Page 88: ...red byte and the least significant byte at the highest numbered byte Therefore byte 0 of the memory system connects to data lines 31 24 Figure 2 4 shows this Figure 2 4 Big endian addresses of bytes within words 2 6 2 Little endian format In little endian format the lowest numbered byte in a word is the least significant byte of the word and the highest numbered byte is the most significant Theref...
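The two byte orderings can be compared with a small Python model. It is a sketch with hypothetical helper names. The big-endian lane assignment (byte 0 on data lines 31 to 24) comes from the excerpt; the little-endian assignment (byte 0 on data lines 7 to 0) is cut off in the excerpt and is assumed from the standard definition.

```python
def byte_lanes(byte_index, big_endian):
    """Return (high, low) data-line numbers carrying byte_index (0..3) of a word."""
    lane = 3 - byte_index if big_endian else byte_index
    return 8 * lane + 7, 8 * lane


def word_from_bytes(data, big_endian):
    """Assemble a 32-bit word from four bytes listed in ascending address order."""
    return int.from_bytes(bytes(data), "big" if big_endian else "little")
```

The same four bytes in memory therefore read back as different word values depending on the selected format.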

Page 89: ...rld where the core is 2 The Instruction Cache is indexed by the lower bits of the VA The VA is translated using the ProcID Secure or Non secure one to the MVA and then to PA in the Translation Lookaside Buffer TLB The TLB performs the translation in parallel with the Cache lookup The translation uses Secure descriptors if the core is in Secure world Otherwise it uses the Non secure ones 3 If the p...
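The first translation step, from VA to MVA using the ProcID, can be sketched as follows. This is a hedged Python model, not the processor's implementation: it assumes the usual FCSE rule, in which a VA whose top seven bits are zero (that is, in the lowest 32MB) is relocated by placing the 7-bit ProcID in bits [31:25], and any other VA passes through unchanged. The function name is hypothetical, and the later MVA to PA step in the TLB is not modelled.

```python
def va_to_mva(va, proc_id):
    """Form the Modified Virtual Address from a VA and a 7-bit ProcID."""
    va &= 0xFFFFFFFF
    proc_id &= 0x7F
    if va >> 25 == 0:          # VA lies in the lowest 32MB
        return va | (proc_id << 25)
    return va                  # other addresses are not modified
```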

Page 90: ...OS Undefined mode is entered when an undefined instruction exception occurs Secure Monitor mode is a Secure mode for the TrustZone Secure Monitor code Note Secure Monitor mode is not the same as monitor debug mode Modes other than User mode are collectively known as privileged modes Privileged modes are used to service interrupts or exceptions or to access protected resources Table 2 4 lists the m...

Page 91: ... Stack Pointer Register R13 is used as the Stack Pointer SP R13 is banked for the exception modes This means that an exception handler can use a different stack to the one in use when the exception occurred In many instructions you can use R13 as a general purpose register but the architecture deprecates this use of R13 in most instructions For more information see the ARM Architecture Reference M...

Page 92: ...ecure Monitor, Supervisor, Abort, IRQ, and Undefined modes each have alternative mode-specific registers mapped to R13 and R14, permitting a private stack pointer and link register for each mode. Figure 2-6 on page 2-20 shows the ARM state registers. Table 2-5 Register mode identifiers. Mode: User; Mode identifier: usr (note a). Note a: The usr identifier is usually omitted from register names. It is only used in descriptio...

Page 93: ...IRQ Undefined R0 R1 R2 R3 R4 R5 R6 R7 R8 R9 R10 R11 R12 R13 R14 R15 FIQ R0 R1 R2 R3 R4 R5 R6 R7 R8_fiq R9_fiq R10_fiq R11_fiq R12_fiq R13_fiq R14_fiq R15 PC R0 R1 R2 R3 R4 R5 R6 R7 R8 R9 R10 R11 R12 R13_svc R14_svc R0 R1 R2 R3 R4 R5 R6 R7 R8 R9 R10 R11 R12 R13_abt R14_abt R0 R1 R2 R3 R4 R5 R6 R7 R8 R9 R10 R11 R12 R13_irq R14_irq R0 R1 R2 R3 R4 R5 R6 R7 R8 R9 R10 R11 R12 R13_und R14_und CPSR CPSR C...

Page 94: ...Thumb state on page 2 22 the PC a stack pointer SP ARM R13 an LR ARM R14 the CPSR There are banked SPs LRs and SPSRs for each privileged mode Figure 2 8 on page 2 22 shows the Thumb state core register set R0 R1 R2 R3 R4 R5 R6 R7 R8 R9 R10 R11 R12 R13 R14 R15 PC R8_fiq R9_fiq R10_fiq R11_fiq R12_fiq R13_fiq R14_fiq R13_svc R14_svc R13_abt R14_abt R13_irq R14_irq R13_und R14_und CPSR SPSR_fiq SPSR_...

Page 95: ...egister values For more details see the ARM Architecture Reference Manual 2 9 4 ARM state and Thumb state registers relationship Figure 2 9 on page 2 23 shows the relationships between the Thumb state and ARM state registers See the Jazelle V1 Architecture Reference Manual for details of Jazelle state registers Thumb state general registers and program counter System and User Thumb state program s...

Page 96: ...e registers relationship Note Registers R0 R7 are known as the low registers Registers R8 R15 are known as the high registers Thumb state ARM State R0 R1 R2 R3 R4 R5 R6 R7 R8 R9 R10 R11 R12 Stack Pointer R13 Link Register R14 Program Counter R15 CPSR SPSR Stack pointer SP Link register LR Program counter PC CPSR SPSR R0 R1 R2 R3 R4 R5 R6 R7 Low registers High registers ...

Page 97: ...eserved for example during process context switches Writable to enable the processor state to be restored To maintain compatibility with future ARM processors and as good practice you are strongly advised to use a read modify write strategy when changing the CPSR 2 10 1 The condition code flags The N Z C and V bits are the condition code flags You can set them by arithmetic and logical operations ...
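To make the condition code flags concrete, the following Python sketch computes N, Z, C, and V for a 32-bit addition. It assumes the standard ARM flag definitions for a flag-setting ADD; the function name is hypothetical and no other instruction is modelled.

```python
MASK32 = 0xFFFFFFFF


def add_with_flags(a, b):
    """Return (result, (N, Z, C, V)) for a 32-bit flag-setting addition."""
    a &= MASK32
    b &= MASK32
    total = a + b
    result = total & MASK32
    n = result >> 31                                  # sign bit of the result
    z = int(result == 0)                              # result is zero
    c = int(total > MASK32)                           # unsigned carry out
    v = (((a ^ result) & (b ^ result)) >> 31) & 1     # signed overflow
    return result, (n, z, c, v)
```

Adding 1 to 0x7FFFFFFF sets N and V (signed overflow), while adding 1 to 0xFFFFFFFF sets Z and C (unsigned wrap to zero).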

Page 98: ... on the status of the Q flag. To determine the status of the Q flag you must read the PSR into a register and extract the Q flag from this. For details of how the Q flag is set and cleared, see individual instruction definitions in the ARM Architecture Reference Manual. 2.10.3 The J bit. The J bit in the CPSR indicates when the processor is in Jazelle state. When J = 0: The processor is in ARM or Thumb sta...

Page 99: ...ractions and so are carry out bits For signed operations the rules for setting the GE bits are chosen so that they have the same sort of greater than or equal functionality as for unsigned operations Table 2 6 GE 3 0 settings GE 3 GE 2 GE 1 GE 0 Instruction A op B C A op B C A op B C A op B C Signed SADD16 31 16 31 16 0 31 16 31 16 0 15 0 15 0 0 15 0 15 0 0 SSUB16 31 16 31 16 0 31 16 31 16 0 15 0 ...
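The SADD16 row of Table 2-6 can be read as: GE[3:2] are set when the sum of the top halfwords is greater than or equal to zero, and GE[1:0] when the sum of the bottom halfwords is. A minimal Python sketch of that rule follows, with hypothetical helper names; it models SADD16 only and none of the other rows in the table.

```python
def _s16(x):
    """Interpret the low 16 bits of x as a signed halfword."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def sadd16(a, b):
    """Return (result, GE[3:0]) for a dual 16-bit signed addition."""
    lo = _s16(a) + _s16(b)
    hi = _s16(a >> 16) + _s16(b >> 16)
    result = ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)
    ge = (0b0011 if lo >= 0 else 0) | (0b1100 if hi >= 0 else 0)
    return result, ge
```

A following SEL instruction would then pick bytes according to the returned GE value.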

Page 100: ... bits They are the Interrupt disable bits T bit Mode bits on page 2 28 The control bits change when an exception occurs When the processor is operating in a privileged mode software can manipulate these bits Interrupt disable bits The I and F bits are the interrupt disable bits When the I bit is set IRQ interrupts are disabled When the F bit is set FIQ interrupts are disabled FIQ can be non maskab...
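The control bits described here can be pulled out of a CPSR value with simple masks. The Python sketch below is illustrative: the function name is hypothetical, and it assumes the architectural bit positions (A = bit 8, I = bit 7, F = bit 6, T = bit 5, mode = bits [4:0]) and the standard mode encodings.

```python
MODES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor",
    0b10110: "Secure Monitor",
    0b10111: "Abort",
    0b11011: "Undefined",
    0b11111: "System",
}


def decode_cpsr_control(cpsr):
    """Split a CPSR value into its mask bits, T bit, and mode name."""
    return {
        "A": (cpsr >> 8) & 1,   # imprecise abort disable
        "I": (cpsr >> 7) & 1,   # IRQ disable
        "F": (cpsr >> 6) & 1,   # FIQ disable
        "T": (cpsr >> 5) & 1,   # Thumb state
        "mode": MODES.get(cpsr & 0x1F, "reserved"),
    }
```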

Page 101: ...cted Bits in Figure 2 10 on page 2 24 that are in this category are J and T Bits that can only be modified from privileged modes and that are completely protected from modification by instructions while the processor is in User mode The only way that these bits can be modified while the processor is in User mode is by entering a processor exception as Exceptions on page 2 36 describes Bits in Figu...

Page 102: ... privileged modes it ignores changes to the CPSR to enter the Secure Monitor The core does not copy mode bits in the SPSR changed in the Non secure world across to the CPSR 2 10 9 Reserved bits The remaining bits in the PSRs are unused but are reserved When changing a PSR flag or control bits make sure that these reserved bits are not altered You must ensure that your program does not rely on rese...

Page 103: ...nt restrictions apply to the addresses of these instructions The LDREXB and STREXB instructions share the same data monitors as the LDREX and STREX instructions a local and a global monitor for each processor for shared memory support LDREXB Figure 2 11 shows the format of the Load Register Byte Exclusive LDREXB instruction Figure 2 11 LDREXB instruction Syntax LDREXB cond Rxf Rbase Operation if C...

Page 104: ...ent faults if this condition is not met For more information see Operation of unaligned accesses on page 4 13 The transaction must be a single access or indivisible burst on bus widths 16 bits For AXI based systems the exclusive access signal AxPROT 4 must remain asserted throughout the burst where AxSIZE 0x1 The LDREXH and STREXH instructions share the same data monitors as the LDREX and STREX in...

Page 105: ... as two words that load or store to consecutive word addressed locations in memory Register restrictions are the same as LDRD and STRD For STRD in ARM state the registers Rm and R m 1 provide the value that is stored where m is an even number The address in memory must be 64 bit aligned address 2 0 b000 When A U 0 1 1 0 or 1 1 in CP15 register 1 the instruction generates alignment faults if this c...

Page 106: ...clusiveLocal processor_id STREXD Figure 2 16 shows the format of the Store Register Doubleword Exclusive STREXD instruction Figure 2 16 STREXD instruction Syntax STREXD cond Rd Rm Rn Operation if ConditionPassed cond then processor_id ExecutingProcessor if IsExclusiveLocal processor_id then if Shared Rn 1 then physical_address TLB Rn if IsExclusiveGlobal physical_address processor_id 8 then Memory...
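The load-exclusive and store-exclusive pairing can be pictured with a deliberately simplified model. The Python sketch below is an assumption-laden illustration: the class and method names are hypothetical, it tracks only a single local monitor flag, and it ignores the global monitor, shared memory attributes, address tagging, and the doubleword register pairing described in the pseudocode. It keeps the architectural convention that a store-exclusive returns 0 on success and 1 on failure.

```python
class LocalMonitor:
    """Toy model of one processor's local exclusive monitor."""

    def __init__(self):
        self.memory = {}
        self.exclusive = False

    def ldrex(self, address):
        """Load a value and mark the processor as holding exclusive access."""
        self.exclusive = True
        return self.memory.get(address, 0)

    def strex(self, address, value):
        """Store only if exclusive access is still held; return 0 or 1."""
        if not self.exclusive:
            return 1            # store not performed
        self.memory[address] = value
        self.exclusive = False  # the monitor is cleared by the store
        return 0

    def clrex(self):
        """Drop exclusive access without storing."""
        self.exclusive = False
```

In this model an atomic increment is a loop that repeats `ldrex` and `strex` until `strex` returns 0.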

Page 107: ...on Syntax cond Is the condition when the instruction executes It produces no useful change in functionality but is provided to ensure disassembly followed by reassembly always regenerates the original code hint defaults to zero hint 0x0 the instruction is NOP hint 0x1 the instruction is YIELD For all other values RESERVED the instruction behaves like NOP The true NOP for ARM state is equivalent to...

Page 108: ...The hint indicates that the current activity of the thread is not important for example sitting in a spin lock and so can yield On a uniprocessor system this instruction behaves as a NOP OSs can use the yielding NOP in those places that require the yield hint and the non yielding NOP in other cases Operation The instruction acts as a NOP irrespective of whether the condition passes or fails effect...

Page 109: ...access load store instructions ARM LDC LDM LDRD STC STM and STRD and Thumb LDMIA POP PUSH and STMIA can be interrupted and then restarted after the interrupt has been processed Support for an imprecise Data Abort that behaves as an interrupt rather than as an abort in that it occurs asynchronously relative to the instruction execution Support involves the masking of a pending imprecise Data Abort ...

Page 110: ...s at their start and re enabling interrupts at their end A similar Thumb instruction is also provided However the Thumb instruction can only change the interrupt masks not the processor mode as well to avoid using too much instruction set space 2 12 2 Exception entry and exit summary Table 2 8 summarizes the PC value preserved in the relevant R14 on exception entry and the recommended instruction ...

Page 111: ... the CPSR into the appropriate SPSR 3 Forces the CPSR mode bits to a value that depends on the exception 4 Forces the PC to fetch the next instruction from the relevant exception vector The processor can also set the interrupt and imprecise abort disable flags to prevent otherwise unmanageable nesting of exceptions Note Exceptions are always entered handled and exited in ARM state When the process...

Page 112: ...ate or Jazelle state, an FIQ handler returns from the interrupt by executing: SUBS PC, R14_fiq, #4. You can disable FIQ exceptions within a privileged mode by setting the CPSR F flag. When the F flag is clear, the processor checks for a LOW level on the output of the nFIQ register at the end of each instruction. The FW bit and FIQ bit in the SCR register configure the FIQ as non-maskable in Non-secure worl...

Page 113: ...are fully restartable In particular they must not be used on memory locations that produce non idempotent side effects for the type of memory access concerned This enables but does not require implementations to make these instructions interruptible when in low interrupt latency configuration If the instruction is interrupted before it is complete the result might be that one or more of the words ...

Page 114: ...ty interrupts These interrupts can be prioritized and are assumed to be signaled to the processor core by means of the FIQ interrupt Their handlers do not use the facilities supplied by the other two layers This means that all memory they use must be locked down in the TLBs and caches It is possible to use additional code to make access to nonlocked memory possible but this example does not descri...

Page 115: ...h All IRQ and SVC handlers use the stack pointed to by R13_svc This stack does not have to be locked down in memory The stack pointed to by R13_usr is used by the current process This process can be privileged or unprivileged and uses System or User mode accordingly 5 Timings are roughly consistent with ARM10 timings with the pipeline reload penalty being three cycles It is assumed that pipeline r...

Page 116: ...of time that FIQs are disabled at the start of the lower priority FIQs The worst case interrupt latency for the FIQ1 interrupt occurs if a lower priority FIQ2 has fetched its handler address and is approximately 3 cycles for the pipeline refill after the LDR PC instruction fetches the handler address 24 cycles to get to and execute the MSR instruction that re enables FIQs 3 cycles to re enter the ...

Page 117: ...n the FIQ is detected was only significant because the memory system was able to stretch its cycles considerably Otherwise it was dwarfed by the number of cycles lost because of FIQs being disabled at the start of a lower priority interrupt handler In ARMv6 this is still the case but it is a lot closer Alternatives to the example system Two alternatives to the design in FIQs in the example system ...

Page 118: ...letion layer re enable IRQs but they also use additional VIC facilities to place a lower limit on the priority of IRQs that is taken This permits IRQs at that priority or higher to be treated as being in the real time layer and IRQs at lower priorities to be treated as being in the non real time layer The price paid is some additional complexity in the software and in the VIC hardware Note For eit...

Page 119: ...e precise or imprecise. Two separate FSR encodings indicate if the external abort is precise or imprecise: all external aborts to loads when the CP15 Register 1 FI bit, bit [21], is set are precise; all external aborts to loads or stores to Strongly Ordered memory are precise; all external aborts to loads to the Program Counter or the CPSR are precise; all external aborts on the load part of a SWP are prec...

Page 120: ...st be held by the system until a time when the Abort mode can safely be entered A mask is added into the CPSR to indicate that an imprecise Data Abort can be accepted This bit is referred to as the A bit The imprecise Data Abort causes a Data Abort to be taken when imprecise Data Aborts are not masked When imprecise Data Aborts are masked then the implementation is responsible for holding the pres...

Page 121: ...tion does not cause the processor to take the Prefetch Abort exception until the instruction reaches the Execute stage of the pipeline If the instruction is not executed for example because a branch occurs while it is in the pipeline the breakpoint does not take place After dealing with the breakpoint the handler executes the following instruction irrespective of the processor operating state SUBS...

Page 122: ...it store value of Secure Control Register bit 25 CPSR 24 0 Clear J bit if high vectors configured then PC 0xFFFF0000 else PC 0x00000000 Undefined instruction On an undefined instruction Non secure state is unchanged R14_und address of the next instruction after the undefined instruction SPSR_und CPSR CPSR 4 0 0b11011 Enter undefined Instruction mode CPSR 5 0 Execute in ARM state CPSR 7 1 Disable i...
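The visible steps of the undefined instruction entry sequence can be expressed as bit operations on the CPSR. The Python sketch below covers only the steps shown in the excerpt (saving R14_und and SPSR_und, setting the mode bits to 0b11011, clearing bit 5, and setting bit 7); the remaining steps are truncated in the excerpt and are intentionally not modelled. The function name and return shape are hypothetical.

```python
def enter_undefined(cpsr, next_instruction_address):
    """Return (new_cpsr, banked_registers) for the visible entry steps."""
    banked = {
        "R14_und": next_instruction_address,  # return address
        "SPSR_und": cpsr,                     # saved status
    }
    new_cpsr = (cpsr & ~0x1F) | 0b11011       # CPSR[4:0]: Undefined mode
    new_cpsr &= ~(1 << 5)                     # CPSR[5] = 0: ARM state
    new_cpsr |= 1 << 7                        # CPSR[7] = 1: disable interrupts
    return new_cpsr & 0xFFFFFFFF, banked
```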

Page 123: ...it AW CPSR 8 1 Disable imprecise aborts Else CPSR 8 UNCHANGED CPSR 9 Non secure EE bit store value of NS Control Reg 25 CPSR 24 0 Clear J bit if high vectors configured then PC 0xFFFF000C else PC Non_Secure_Base_Address 0x0000000C Internal Prefetch Abort On an internal prefetch abort Non secure state is unchanged R14_abt address of the aborted instruction 4 SPSR_abt CPSR CPSR 4 0 0b10111 Enter abo...

Page 124: ...aborts on L1 memory management occurring when a fault is detected in MMU Non secure state is unchanged R14_abt address of the aborted instruction 8 SPSR_abt CPSR CPSR 4 0 0b10111 Enter abort mode CPSR 5 0 Execute in ARM state CPSR 7 1 Disable interrupts If SCR 5 1 bit AW CPSR 8 1 Disable imprecise aborts Else CPSR 8 UNCHANGED CPSR 9 Non secure EE bit store value of NS Control Reg 25 CPSR 24 0 Clea...

Page 125: ... interrupts CPSR 7 1 Disable interrupts CPSR 8 1 Disable imprecise aborts CPSR 9 Secure EE bit store value of secure Ctrl Reg bit 25 CPSR 24 0 Clear J bit PC Monitor_Base_Address 0x0000001C Else SCR 4 bit FW must be set to avoid infinite loop until FIQ is asserted R14_fiq address of the next instruction to be executed 4 SPSR_fiq CPSR CPSR 4 0 0b10001 Enter FIQ mode CPSR 5 0 Execute in ARM state CP...

Page 126: ...y other exception to occur in Secure Monitor mode However if an exception occurs in Secure Monitor mode the NS bit in SCR register is automatically reset and the core branches either to the exception handler in Secure world or in Secure Monitor mode Secure Monitor mode for IRQ FIQ or external aborts with the corresponding bit set in SCR 3 1 The following exceptions occur in the Secure world Reset ...

Page 127: ...s CPSR 9 Secure EE bit store value of secure Control Reg 25 CPSR 24 0 Clear J bit if high vectors configured then PC 0xFFFF0008 else PC Secure_Base_Address 0x00000008 External Prefetch Abort On an external prefetch abort secure state is unchanged if SCR 3 1 external prefetch aborts trapped to Secure Monitor mode R14_mon address of the aborted instruction 4 SPSR_mon CPSR CPSR 4 0 0b10110 Enter Secu...

Page 128: ...PSR 6 1 Disable fast interrupts CPSR 7 1 Disable interrupts CPSR 8 1 Disable imprecise aborts CPSR 9 Secure EE bit store value of secure Control Reg 25 CPSR 24 0 Clear J bit PC Monitor_Base_Address 0x00000010 Else external Aborts trapped in abort mode R14_abt address of the aborted instruction 8 SPSR_abt CPSR CPSR 4 0 0b10111 Enter abort mode CPSR 5 0 Execute in ARM state CPSR 7 1 Disable interrup...

Page 129: ...cise aborts CPSR 9 Secure EE bit store value of secure Control Reg 25 CPSR 24 0 Clear J bit if VE 0 Core with VIC port only if high vectors configured then PC 0xFFFF0018 else PC Secure_Base_Address 0x00000018 else PC IRQADDR Fast Interrupt Request FIQ exception On a Fast Interrupt Request and CPSR 6 0 F bit secure state is unchanged if SCR 2 1 FIQ trapped in Secure Monitor mode R14_mon address of ...
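The vector selection at the end of the Secure IRQ entry sequence reduces to a three-way choice. A minimal Python sketch, assuming the reading of the pseudocode on this page (VE = 0 uses the fixed vector at offset 0x18 from either the high vectors base or the Secure base address, and VE = 1 uses the address supplied on the VIC port), is shown below. The function and parameter names are hypothetical.

```python
def secure_irq_vector(ve, high_vectors, secure_base_address, irqaddr):
    """Return the PC value loaded on IRQ entry in the Secure world."""
    if ve == 0:
        if high_vectors:
            return 0xFFFF0018
        return (secure_base_address + 0x18) & 0xFFFFFFFF
    return irqaddr  # vectored interrupt: address comes from the VIC port
```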

Page 130: ... CPSR 4 0 0b10110 Enter Secure Monitor mode CPSR 5 0 Execute in ARM state CPSR 6 1 Disable fast interrupts CPSR 7 1 Disable interrupts CPSR 8 1 Disable imprecise aborts CPSR 9 Secure EE bit store value of secure Control Reg 25 CPSR 24 0 Clear J bit PC Monitor_Base_Address 0x00000008 SMC vectored to the conventional SVC vector 2 12 17 Exception priorities When multiple exceptions arise at the same ...

Page 131: ... priority than FIQs to ensure that the transfer error does not escape detection You must add the time for this exception entry to the worst case FIQ latency calculations in a system that uses aborts to support virtual memory The FIQ handler must not access any memory that can generate a Data Abort because the initial Data Abort exception condition is lost if this happens Note If the data abort is ...

Page 132: ... for the generation of an interrupt by the DMA indicating the completion of a transfer between external memory and an Instruction TCM the prioritization between core requests from a tight loop and the DMA can mean the DMA is locked out from writing the TCM so freezing the system To avoid this two mechanisms are recommended 1 The use of the WFI operation in the wait loop to freeze core execution wh...

Page 133: ...Chapter 3 System Control Coprocessor. This chapter describes the purpose of the system control coprocessor, its structure, operation, and how to use it. It contains the following sections: About the system control coprocessor on page 3-2; System control processor registers on page 3-13 ...

Page 134: ...ers are System control and configuration on page 3 5 MMU control and configuration on page 3 6 Cache control and configuration on page 3 7 TCM control and configuration on page 3 8 Cache Master Valid Registers on page 3 8 DMA control on page 3 9 System performance monitor on page 3 10 System validation on page 3 10 The system control coprocessor controls the TrustZone operation of the processor so...

Page 135: ...cheme c0 CPUID registers on page 3 26 MMU control and configuration TLB Type c0 TLB Type Register on page 3 25 Translation Table Base 0 c2 Translation Table Base Register 0 on page 3 57 Translation Table Base 1 c2 Translation Table Base Register 1 on page 3 59 Translation Table Base Control c2 Translation Table Base Control Register on page 3 60 Domain Access Control c3 Domain Access Control Registe...

Page 136: ...lid c15 Instruction Cache Master Valid Register on page 3 147 Data Cache Master Valid c15 Data Cache Master Valid Register on page 3 148 DMA control DMA Identification and Status c11 DMA identification and status registers on page 3 106 DMA User Accessibility c11 DMA User Accessibility Register on page 3 107 DMA Channel Number c11 DMA Channel Number Register on page 3 109 DMA enable c11 DMA enable...

Page 137: ...tion Secure User and Non secure Access Validation Control c15 Secure User and Non secure Access Validation Control Register on page 3 132 System Validation Counter c15 System Validation Counter Register on page 3 140 System Validation Operations c15 System Validation Operations Register on page 3 142 System Validation Cache Size Mask c15 System Validation Cache Size Mask Register on page 3 145 a R...

Page 138: ...l and configuration The purpose of the MMU control and configuration registers is to allocate physical address locations from the Virtual Addresses VAs that the processor generates control program access to memory designate areas of memory as either noncacheable unbufferable noncacheable and unbufferable detect MMU faults and external aborts hold thread and process IDs provide direct access to the...

Page 139: ...nd configuration registers consist of one 32 bit read only register and four 32 bit read write registers Figure 3 3 on page 3 8 shows the arrangement of the registers in this functional group c0 3 c0 0 c2 1 2 0 c0 0 c5 1 0 c0 0 c6 1 0 c0 0 c8 0 c10 c3 0 c0 0 0 c15 2 4 0 c2 c0 TLB Type Register Translation Table Base Control Register Translation Table Base Register 1 Translation Table Base Register...

Page 140: ...r write individual registers that make up the group see Use of the system control coprocessor on page 3 12 TCM control and configuration behaves in three ways as a set of numbers values that describe aspects of the TCMs as a set of bits that enable specific TCM functionality as a set of addresses that define the memory locations of data stored in the TCMs 3 1 6 Cache Master Valid Registers The pur...

Page 141: ... Figure 3 6 shows the arrangement of registers Figure 3 6 DMA control and configuration registers To use the DMA control and configuration registers you read or write the individual registers that make up the group see Use of the system control coprocessor on page 3 12 Code can execute several DMA operations while in User mode if these operations are enabled by the DMA User Accessibility Register ...

Page 142: ...stem control coprocessor on page 3 12 Note The counters are only enabled when the SPNIDEN input and the SUNIDEN bit see c1 Secure Debug Enable Register on page 3 54 are appropriately set When the core is in a mode where non invasive debug is not permitted events are not counted but the cycle count register CCNT continues to count You can not use the system performance monitor registers at the same...

Page 143: ...rivileged modes but a Secure User and Non secure Access Validation Control Register is provided to permit access to the System Validation Registers from User modes and Non secure modes The System Validation Cache Size Mask Register masks the physical size of the caches and TCMs to make their size appear different to the processor You can use this in validation by simulation but you must not use it...

Page 144: ...code_1 Rd CRn CRm Opcode_2 MRC cond P15 Opcode_1 Rd CRn CRm Opcode_2 Figure 3 9 shows the instruction bit pattern of MRC and MCR instructions Figure 3 9 CP15 MRC and MCR bit pattern The CRn field of MRC and MCR instructions specifies the coprocessor register to access The CRm field and Opcode_2 fields specify a particular operation when addressing registers The L bit distinguishes between an MRC L...
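The bit pattern of the CP15 MRC and MCR instructions can be assembled field by field. The Python sketch below assumes the standard ARM coprocessor register transfer encoding: cond in bits [31:28], 0b1110 in [27:24], Opcode_1 in [23:21], the L bit in [20], CRn in [19:16], Rd in [15:12], the coprocessor number 15 in [11:8], Opcode_2 in [7:5], bit [4] set, and CRm in [3:0]. The function name is hypothetical, and the default condition 0xE (always) is an illustrative choice.

```python
def encode_cp15(is_read, opcode_1, rd, crn, crm, opcode_2, cond=0xE):
    """Encode MRC (is_read=True) or MCR (is_read=False) for coprocessor 15."""
    if not (0 <= opcode_1 < 8 and 0 <= opcode_2 < 8):
        raise ValueError("Opcode_1 and Opcode_2 are 3-bit fields")
    if not all(0 <= r < 16 for r in (rd, crn, crm, cond)):
        raise ValueError("Rd, CRn, CRm and cond are 4-bit fields")
    return (
        (cond << 28)
        | (0b1110 << 24)
        | (opcode_1 << 21)
        | (int(is_read) << 20)   # L bit: 1 for MRC, 0 for MCR
        | (crn << 16)
        | (rd << 12)
        | (15 << 8)              # coprocessor number, P15
        | (opcode_2 << 5)
        | (1 << 4)
        | crm
    )
```

Under these assumptions, `encode_cp15(True, 0, 0, 0, 0, 0)` gives the word for `MRC p15, 0, r0, c0, c0, 0`, which reads the Main ID Register.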

Page 145: ...Table 3 2 on page 3 14 lists the allocation and reset values of the registers of the system control coprocessor where CRn is the register number within CP15 Op1 is the Opcode_1 value for the register CRm is the operational register Op2 is the Opcode_2 value for the register Type applies to the Secure S or the Non secure NS world and is B registers banked in Secure and Non secure worlds If the regi...

Page 146: ... Model Feature 1 RO RO 0x10030302 page 3 32 6 Memory Model Feature 2 RO RO 0x01222100 page 3 33 7 Memory Model Feature 3 RO RO 0x00000000 page 3 35 c2 0 Instruction Set Feature Attribute 0 RO RO 0x00140011 page 3 36 1 Instruction Set Feature Attribute 1 RO RO 0x12002111 page 3 37 2 Instruction Set Feature Attribute 2 RO RO 0x11231121 page 3 39 3 Instruction Set Feature Attribute 3 RO RO 0x01102131...

Page 147: ...000000 page 3 69 c7 0 c0 4 Wait For Interrupt WO WO page 3 85 c4 0 PA R W B R W 0x00000000 page 3 80 c5 0 Invalidate Entire Instruction Cache WO WO X page 3 71 1 Invalidate Instruction Cache Line by MVA WO WO page 3 71 2 Invalidate Instruction Cache Line by Index WO WO page 3 71 4 Flush Prefetch Buffer WO WO page 3 79 6 Flush Entire Branch Target Cache WO WO page 3 79 7 Flush Branch Target Cache E...

Page 148: ...e Line by MVA WO WO page 3 71 2 Clean and Invalidate Data Cache Line by Index WO WO page 3 71 c8 0 c5 0 Invalidate Instruction TLB unlocked entries WO B WO page 3 86 1 Invalidate Instruction TLB entry by MVA WO B WO page 3 86 2 Invalidate Instruction TLB entry on ASID match WO B WO page 3 86 c8 0 c6 0 Invalidate Data TLB unlocked entries WO B WO page 3 86 1 Invalidate Data TLB entry by MVA WO B WO...

Page 149: ...page 3 101 1 Normal Memory Region Remap Register R W B X R W 0x44E048E0 page 3 101 c11 0 c0 0 3 DMA identification and status RO RO X 0x0000000Bi page 3 106 c1 0 DMA User Accessibility R W R W X 0x00000000 page 3 107 c2 0 DMA Channel Number R W X R W X 0x00000000 page 3 109 c3 0 2 DMA enable WO X WO X page 3 110 c4 0 DMA Control R W X R W X 0x08000000 page 3 112 c5 0 DMA Internal Start Address R W...

Page 150: ...00000000 page 3 142 c14 0 System Validation Cache Size Mask R W X R W X 0x00006655l page 3 145 c15 1 c13 0 7 System Validation Operations R W X R W X 0x00000000 page 3 142 c15 2 c13 1 7 System Validation Operations R W X R W X 0x00000000 page 3 142 c15 3 c8 0 7 Instruction Cache Master Valid R W X NA 0x00000000 page 3 147 c12 0 7 Data Cache Master Valid R W X NA 0x00000000 page 3 148 c13 0 7 Syste...

Page 151: ...depends on the TCM sizes implemented and on the value of the INITRAM static configuration signal The value here is for 16KB TCM banks with INITRAM tied LOW h Some bits in this register are common and some Secure modify only i Reset value depends on the number of DMA channels implemented and the presence of TCMs j Reset value depends on external signals k This register is read write in Privileged m...

Page 152: ...equal to c0 and Opcode_1 0 is encountered the system control coprocessor returns the value of the main ID register Table 3 5 lists the results of attempted access for each mode Variant number Implementor 31 24 23 20 19 16 15 4 3 0 Architecture Primary part number Revision Table 3 4 Main ID Register bit functions Bits Field name Function 31 24 Implementor Indicates implementor ARM Limited 0x41 23 2...
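The Main ID Register field boundaries listed above (Implementor [31:24], Variant number [23:20], Architecture [19:16], Primary part number [15:4], Revision [3:0]) can be unpacked as follows. This is an illustrative sketch; `decode_main_id` and the sample input value are assumptions chosen for the example, not values quoted from the excerpt.

```python
def decode_main_id(value):
    """Split a 32-bit Main ID Register value into its five fields."""
    return {
        "implementor": (value >> 24) & 0xFF,   # bits [31:24]
        "variant": (value >> 20) & 0xF,        # bits [23:20]
        "architecture": (value >> 16) & 0xF,   # bits [19:16]
        "part_number": (value >> 4) & 0xFFF,   # bits [15:4]
        "revision": value & 0xF,               # bits [3:0]
    }


# Sample input only: implementor 0x41 is ARM Limited per the table above.
fields = decode_main_id(0x410FB767)
print({name: hex(v) for name, v in fields.items()})
```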

Page 153: ...systems The Cache Type Register is in CP15 c0 a 32 bit read only register common to Secure and Non secure worlds accessible in privileged modes only All ARMv4T and later cached processors contain this register Figure 3 11 shows the arrangement of bits in the Cache Type Register Figure 3 11 Cache Type Register format Table 3 6 lists how the bit values correspond with the Cache Type Register functio...

Page 154: ...che not supported b0010 2KB cache not supported b0011 4KB cache b0100 8KB cache b0101 16KB cache b0110 32KB cache b0111 64KB cache b1000 128KB cache not supported 17 15 Assoc b010 indicates that the ARM1176JZF S processor has 4 way associativity All other values for Assoc are reserved 14 M bit Indicates the cache size and cache associativity values in conjunction with the Size and Assoc fields In ...
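The Size encodings listed above can be turned into a small lookup for the data cache half of the Cache Type Register. This is a sketch under stated assumptions: Assoc in bits [17:15] and the M bit in bit 14 come from the excerpt, while the position of the Size field in bits [21:18] is assumed from the usual ARMv6 layout. Only the supported encodings shown above are mapped, and `decode_dsize` is a hypothetical name.

```python
# Supported Size encodings from the table above (value in KB).
SIZE_KB = {0b0011: 4, 0b0100: 8, 0b0101: 16, 0b0110: 32, 0b0111: 64}


def decode_dsize(cache_type):
    """Extract data cache size and associativity from a Cache Type Register value."""
    size_code = (cache_type >> 18) & 0xF   # assumed position of Size, bits [21:18]
    assoc_code = (cache_type >> 15) & 0x7  # Assoc, bits [17:15]
    m_bit = (cache_type >> 14) & 0x1       # M bit, bit 14
    return {
        "size_kb": SIZE_KB.get(size_code),          # None if not a supported code
        "ways": 4 if assoc_code == 0b010 else None,  # other Assoc values are reserved
        "m_bit": m_bit,
    }


# Constructed input: Size = b0101, Assoc = b010, M = 0.
sample = (0b0101 << 18) | (0b010 << 15)
print(decode_dsize(sample))
```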

Page 155: ...ache size 16KB associativity 4 way line length eight words caches use write back CP15 c7 for cache cleaning and Format C for cache lockdown Table 3 7 Results of access to the Cache Type Register Secure Privileged Non secure Privileged User Read Write Read Write Data Undefined exception Data Undefined exception Undefined exception Table 3 8 Example Cache Type Register format Bits Field name Value B...

Page 156: ...t arrangement for the TCM Status Register Figure 3 12 TCM Status Register format Table 3 9 lists how the bit values correspond with the TCM Status Register functions Attempts to write the TCM Status Register or read it in User modes result in Undefined exceptions To use the TCM Status Register read CP15 with Opcode_1 set to 0 CRn set to c0 CRm set to c0 Opcode_2 set to 2 0 31 30 29 28 19 18 16 15 ...

Page 157: ...unctions Table 3 11 lists the results of attempted access for each mode To use the TLB Type Register read CP15 with Opcode_1 set to 0 CRn set to c0 CRm set to c0 Opcode_2 set to 3 U SBZ UNP 31 24 23 16 15 8 7 1 0 ILsize DLsize SBZ UNP Table 3 10 TLB Type Register bit functions Bits Field name Function 31 24 UNP SBZ 23 16 ILsize Instruction lockable size specifies the number of instruction TLB lock...

Page 158: ...butes Register 3 on page 3 40 c0 Instruction Set Attributes Register 4 on page 3 42 c0 Instruction Set Attributes Register 5 on page 3 43 Note The CPUID registers are sometimes described as the Core Feature ID registers c0 Processor Feature Register 0 The purpose of the Processor Feature Register 0 is to provide information about the execution state support and programmer s model for the processor...

Page 159: ...e in privileged modes only Figure 3 15 on page 3 28 shows the bit arrangement for Processor Feature Register 1 19 16 Reserved RAZ 15 12 State3 Indicates support for Thumb 2 execution environment 0x0 ARM1176JZF S processors do not support Thumb 2 11 8 State2 Indicates support for Java extension interface 0x1 ARM1176JZF S processors support Java 7 4 State1 Indicates type of Thumb encoding that the p...

Page 160: ...l Security extension Programmer s model Table 3 14 Processor Feature Register 1 bit functions Bits Field name Function 31 28 Reserved RAZ 27 24 Reserved RAZ 23 20 Reserved RAZ 19 16 Reserved RAZ 15 12 Reserved RAZ 11 8 Microcontroller programmer s model Indicates support for the ARM microcontroller programmer s model 0x0 Not supported by ARM1176JZF S processors 7 4 Security extension Indicates sup...

Page 161: ...7 24 Reserved RAZ 23 20 Indicates the type of memory mapped microcontroller debug model that the processor supports 0x0 ARM1176JZF S processors do not support this debug model 19 16 Indicates the type of memory mapped Trace debug model that the processor supports 0x0 ARM1176JZF S processors do not support this debug model 15 12 Indicates the type of coprocessor based Trace debug model that the pro...

Page 162: ...he Auxiliary Feature Register 0 31 16 are Reserved The contents of the Auxiliary Feature Register 0 15 0 are Implementation Defined In the ARM1176JZF S processor the Auxiliary Feature Register 0 reads as 0x00000000 Table 3 19 lists the results of attempted access for each mode To use the Auxiliary Feature Register 0 read CP15 with Opcode_1 set to 0 CRn set to c0 CRm set to c1 Opcode_2 set to 3 For...

Page 163: ...bit functions Bits Field name Function 31 28 Reserved RAZ 27 24 Indicates support for FCSE 0x1 ARM1176JZF S processors support FCSE 23 20 Indicates support for the ARMv6 Auxiliary Control Register 0x1 ARM1176JZF S processors support the Auxiliary Control Register 19 16 Indicates support for TCM and associated DMA 0x3 ARM1176JZF S processors support ARMv6 TCM and DMA 15 12 Indicates support for cac...

Page 164: ...ister 1 functions 31 28 27 24 23 20 19 16 15 12 11 8 7 4 3 0 Table 3 22 Memory Model Feature Register 1 bit functions Bits Field name Function 31 28 Indicates support for branch target buffer 0x1 ARM1176JZF S processors require flushing of branch predictor on VA change 27 24 Indicates support for test and clean operations on data cache Harvard or unified architecture 0x0 no support in ARM1176JZF S...

Page 165: ...icates support for level one cache line maintenance operations by Set Way Harvard architecture 0x3 ARM1176JZF S processors support clean data cache line by Set Way clean and invalidate data cache line by Set Way invalidate data cache line by Set Way invalidate instruction cache line by Set Way 7 4 Indicates support for level one cache line maintenance operations by MVA unified architecture 0 no su...

Page 166: ...essors support invalidate all entries invalidate TLB entry by MVA invalidate TLB entries by ASID match 15 12 Indicates support for TLB maintenance operations Harvard architecture 0x2 ARM1176JZF S processors support invalidate instruction and data TLB all entries invalidate instruction TLB all entries invalidate data TLB all entries invalidate instruction TLB by MVA invalidate data TLB by MVA inval...

Page 167: ...n secure worlds accessible in privileged modes only Figure 3 20 shows the bit arrangement for Memory Model Feature Register 3 Figure 3 20 Memory Model Feature Register 3 format Table 3 26 lists how the bit values correspond with the Memory Model Feature Register 3 functions Table 3 25 Results of access to the Memory Model Feature Register 2 Secure Privileged Non secure Privileged User Read Write R...

Page 168: ...y register common to the Secure and Non secure worlds accessible in privileged modes only Figure 3 21 shows the bit arrangement for Instruction Set Attributes Register 0 Figure 3 21 Instruction Set Attributes Register 0 format Table 3 28 lists how the bit values correspond with the Instruction Set Attributes Register 0 functions Table 3 27 Results of access to the Memory Model Feature Register 3 S...

Page 169: ...s only Figure 3 22 on page 3 38 shows the bit arrangement for Instruction Set Attributes Register 1 19 16 Indicates support for coprocessor instructions 0x4 ARM1176JZF S processors support CDP LDC MCR MRC STC CDP2 LDC2 MCR2 MRC2 STC2 MCRR MRRC MCRR2 MRRC2 15 12 Indicates support for combined compare and branch instructions 0x0 no support in ARM1176JZF S processors 11 8 Indicates support for bitfie...

Page 170: ...RM1176JZF S processors support BX and T bit in PSRs BLX and PC loads have BX behavior 23 20 Indicates support for immediate instructions 0x0 no support in ARM1176JZF S processors 19 16 Indicates support for if then instructions 0x0 no support in ARM1176JZF S processors 15 12 Indicates support for sign or zero extend instructions 0x2 ARM1176JZF S processors support SXTB SXTB16 SXTH UXTB UXTB16 and ...

Page 171: ...utes Register 2 functions 31 28 27 24 23 20 19 16 15 12 11 8 7 4 3 0 Table 3 32 Instruction Set Attributes Register 2 bit functions Bits Field name Function 31 28 Indicates support for reversal instructions 0x1 ARM1176JZF S processors support REV REV16 and REVSH 27 24 Indicates support for PSR instructions 0x1 ARM1176JZF S processors support MRS and MSR exception return instructions for data proce...

Page 172: ...ers common to the Secure and Non secure worlds accessible in privileged modes only Figure 3 24 shows the bit arrangement for Instruction Set Attributes Register 3 Figure 3 24 Instruction Set Attributes Register 3 format 11 8 Indicates support for multi access interruptible instructions 0x1 ARM1176JZF S processors support restartable LDM and STM 7 4 Indicates support for memory hint instructions 0x...

Page 173: ...s support for table branch instructions 0x0 no support in ARM1176JZF S processors 15 12 Indicates support for synchronization primitive instructions 0x2 ARM1176JZF S processors support LDREX and STREX LDREXB LDREXH LDREXD STREXB STREXH STREXD and CLREX 11 8 Indicates support for SVC instructions 0x1 ARM1176JZF S processors support SVC 7 4 Indicates support for Single Instruction Multiple Data SIMD...

Page 174: ...ctions Reserved Reserved 31 16 15 12 11 8 7 4 3 0 20 19 28 27 24 23 Table 3 36 Instruction Set Attributes Register 4 bit functions Bits Field name Function 31 28 Reserved RAZ 27 24 Reserved RAZ 23 20 Indicates fractional support for synchronization primitive instructions 0x0 ARM1176JZF S processors support all synchronization primitive instructions See Table 3 34 on page 3 41 19 16 Indicates suppo...

Page 175: ... worlds, accessible in privileged modes only. The contents of the Instruction Set Attributes Register 5 are implementation defined. In the ARM1176JZF-S processor, Instruction Set Attributes Register 5 is read as 0x00000000. Table 3-38 lists the results of attempted access for each mode. To use the Instruction Set Attributes Register 5, read CP15 with Opcode_1 set to 0, CRn set to c0, CRm set to c2, Opcode_2 ...

Page 176: ...nd cache replacement strategy interrupts and the behavior of interrupt latency the location for exception vectors program flow prediction Table 3 39 on page 3 45 lists the purposes of the individual bits in the Control Register Structure of the Control Register The Control Register is in CP15 c1 a 32 bit register Table 3 39 on page 3 45 lists read and write access to individual bits for the Secure...

Page 177: ...ception 24 VE bit Banked Enables the VIC interface to determine interrupt vectors See the description of the V bit bit 13 0 Interrupt vectors are fixed reset value 1 Interrupt vectors are defined by the VIC interface 23 XP bit Banked Enables the extended page tables to be configured for the hardware page translation mechanism 0 Subpage AP bits enabled reset value 1 Subpage AP bits disabled 22 U bi...

Page 178: ...F0000 0xFFFF001C 12 I bit Banked Enables level one instruction cache 0 Instruction Cache disabled reset value 1 Instruction Cache enabled 11 Z bit Banked Enables branch prediction 0 Program flow prediction disabled reset value 1 Program flow prediction enabled 10 F bit Should Be Zero 9 R bit Banked Deprecated Enables ROM protection If you modify the R bit this does not affect the access permission...

Page 179: ...0, Opcode_2 set to 0. For example:
    MRC p15, 0, Rd, c1, c0, 0 ; Read Control Register configuration data
    MCR p15, 0, Rd, c1, c0, 0 ; Write Control Register configuration data
Normally, to set the V bit and the B, EE, and U bits, you configure signals at reset. The V bit depends on VINITHI at reset: VINITHI LOW sets V to 0, VINITHI HIGH sets V to 1. Bit 2, C bit (Banked): Enables level one data cache. 0 = Data cache disabled, reset val...
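A typical use of the read and write instructions above is a read-modify-write of individual enable bits. The following sketch models only the bit arithmetic on the value that would travel through Rd, using bit positions named in the excerpts (VE 24, XP 23, I 12, Z 11, C 2). The helper `update_control` is hypothetical, and it does not model banking or access restrictions.

```python
# Bit positions taken from the Control Register bit descriptions above.
CONTROL_BITS = {"VE": 24, "XP": 23, "I": 12, "Z": 11, "C": 2}


def update_control(value, set_bits=(), clear_bits=()):
    """Return a new 32-bit Control Register value with named bits set or cleared."""
    for name in set_bits:
        value |= 1 << CONTROL_BITS[name]
    for name in clear_bits:
        value &= ~(1 << CONTROL_BITS[name])
    return value & 0xFFFFFFFF


# Enable the instruction cache, data cache and program flow prediction.
print(hex(update_control(0x00000000, set_bits=("I", "C", "Z"))))
```

On hardware the input would come from the MRC and the result would be written back with the MCR.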

Page 180: ...to enable the Instruction TCM In ARMv6 the TCM blocks have individual enables that apply to each block As a result this bit is now redundant and Should Be One See c9 Instruction TCM Region Register on page 3 91 for a description of the ARM1176JZF S TCM enables R bit Modifying the R bit does not affect the access permissions of entries already in the TLB See MMU software accessible registers on pag...

Page 181: ...orts to loads are precise 30 FSD Provides additional level of control for speculative operations see c1 Control Register on page 3 44 Force speculative operations force the PC to a new value because of static speculative branch prediction 0 Enable force speculative operations reset value 1 Disable force speculative operations 29 BFD Disables branch folding This behavior also depends on the SB and ...

Page 182: ...acement is Round Robin reset value 1 MicroTLB replacement is Random if cache replacement is also Random 2 SB Enables static branch prediction This depends on program flow prediction that the Z bit enables see c1 Control Register on page 3 44 0 Static branch prediction disabled 1 Static branch prediction enabled if the Z bit is set The reset value is 1 1 DB Enables dynamic branch prediction This de...

Page 183: ... Attempts to read or write the Coprocessor Access Control Register access bits depend on the corresponding bit for each coprocessor in c1 Non Secure Access Control Register on page 3 55 Table 3 45 lists the results of attempted access to coprocessor access bits for each mode To use the Coprocessor Access Control Register read or write CP15 with Opcode_1 set to 0 SBZ UNP 31 28 27 26 25 24 23 22 21 ...

Page 184: ... as Secure or Non secure the world in which the core executes exceptions the ability to modify the A and I bits in the CPSR in the Non secure world The Secure Configuration Register is in CP15 c1 a 32 bit read write register accessible in Secure privileged modes only Figure 3 29 shows the arrangement of bits in the register Figure 3 29 Secure Configuration Register format Table 3 46 lists how the ...

Page 185: ...exception 2 FIQ Determines FIQ behavior for Secure and Non secure worlds 0 Branch to FIQ mode on an FIQ exception reset value 1 Branch to Secure Monitor mode on an FIQ exception 1 IRQ Determines IRQ behavior for Secure and Non secure worlds 0 Branch to IRQ mode on an IRQ exception reset value 1 Branch to Secure Monitor mode on an IRQ exception 0 NS bit Defines the world for the processor 0 Secure ...

Page 186: ...le 3 49 lists the purposes of the individual bits in the Secure Debug Enable Register The Secure Debug Enable Register is in CP15 c1 a 32 bit register in the Secure world only accessible in Secure privileged modes only Figure 3 30 shows the arrangement of bits in the register Figure 3 30 Secure Debug Enable Register format Table 3 49 lists how the bit values correspond with the Secure Debug Enable...

Page 187: ...Non Secure Access Control Register is to define the Non secure access permission for coprocessors cache lockdown registers TLB lockdown registers internal DMA Note This register has no effect on Non secure access permissions for the debug control coprocessor CP14 or the system control coprocessor CP15 The Non Secure Access Control Register is in CP15 c1 a 32 bit register read write in the Secure w...

Page 188: ...are used for DMA transfers reset value 1 DMA can be used by the Non secure world and the Non secure page tables are used for DMA transfers 17 TL Prevents operations in the Non secure world from locking page tables in TLB lockdown entries The Invalidate Single Entry or Invalidate ASID match operations can match a TLB lockdown entry but an Invalidate All operation only applies to unlocked entries 0 ...

Page 189: ...ster 0 for process specific addresses where each process maintains a separate first level page table On a context switch you must modify both Translation Table Base Register 0 and the Translation Table Base Control Register if appropriate Table 3 53 on page 3 58 lists the purposes of the individual bits in the Translation Table Base Register 0 The Translation Table Base Register 0 is in CP15 c2 a ...

Page 190: ...ble Base Register 0 read or write CP15 c2 with Opcode_1 set to 0 CRn set to c2 CRm set to c0 Table 3 53 Translation Table Base Register 0 bit functions Bits Field name Function 31 14 N a Translation table base 0 Holds the translation table base address the physical address of the first level translation table The reset value is 0 13 N 5 a UNP SBZ 4 3 RGN Indicates the Outer cacheable attributes fo...

Page 191: ...ister 1 is for OS and I O addresses Table 3 55 lists the purposes of the individual bits in the Translation Table Base Register 1 The Translation Table Base Register 1 is in CP15 c2 a 32 bit read write register banked for Secure and Non secure worlds accessible in privileged modes only Figure 3 33 shows the bit arrangement for the Translation Table Base Register 1 Figure 3 33 Translation Table Bas...

Page 192: ...ication so that the mechanism for the hardware page table walks sees them 3 2 15 c2 Translation Table Base Control Register The purpose of the Translation Table Base Control Register is to determine if a page table miss for a specific VA uses for its page table walk either Translation Table Base Register 0 The recommended use is for task specific addresses Translation Table Base Register 1 The rec...

Page 193: ...es occurrence of a page table walk on a TLB miss when using Translation Table Base Register 1 When page table walk is disabled a Section Fault occurs instead on a TLB miss 0 The processor performs a page table walk on a TLB miss with Secure or Non secure privilege appropriate to the current world This is the reset value 1 The processor does not perform a page table walk If a TLB miss occurs with T...

Page 194: ... default case at reset. It is backwards compatible with ARMv5 and earlier processors. If N is set greater than 0 and bits [31:32-N] of the VA are all 0, use Translation Table Base Register 0; otherwise use Translation Table Base Register 1. N must be in the range 0-7. Note: The ARM1176JZF-S processor cannot page table walk from level one cache. Therefore, if C is set to 1, to ensure coherency you must either ...
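The selection rule above can be written as a short function. This is a sketch of the rule as quoted: with N equal to 0, Translation Table Base Register 0 is always used; with N greater than 0, the top N bits of the VA decide. The name `select_ttbr` is hypothetical.

```python
def select_ttbr(va, n):
    """Return 0 or 1: which Translation Table Base Register a VA uses for a given N."""
    if not 0 <= n <= 7:
        raise ValueError("N must be in the range 0-7")
    if n == 0:
        return 0  # default case at reset: always Translation Table Base Register 0
    top_bits = (va & 0xFFFFFFFF) >> (32 - n)  # bits [31:32-N] of the VA
    return 0 if top_bits == 0 else 1


# With N = 2, addresses below 0x40000000 use register 0, the rest use register 1.
print(select_ttbr(0x3FFFFFFF, 2), select_ttbr(0x40000000, 2))
```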

Page 195: ... access for each mode To use the Domain Access Control Register read or write CP15 c3 with Opcode_1 set to 0 CRn set to c3 CRm set to c0 Opcode_2 set to 0 For example D15 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 D14 D13 D12 D11 D10 D9 D8 D7 D6 D5 D4 D3 D2 D1 D0 Table 3 59 Domain Access Control Register bit functions Bits Field name Function D n a The pu...
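The figure above places D15 in bits [31:30] down to D0 in bits [1:0], so domain n occupies bits [2n+1:2n]. The sketch below reads and updates one two-bit field. The helper names are hypothetical, and the meaning of each two-bit value is left to the table in the manual.

```python
def get_domain(dacr, n):
    """Return the two-bit access field for domain n (0-15)."""
    if not 0 <= n <= 15:
        raise ValueError("domain must be in the range 0-15")
    return (dacr >> (2 * n)) & 0b11


def set_domain(dacr, n, field):
    """Return a new register value with the field for domain n replaced."""
    if not 0 <= n <= 15 or not 0 <= field <= 3:
        raise ValueError("domain or field out of range")
    shift = 2 * n
    return ((dacr & ~(0b11 << shift)) | (field << shift)) & 0xFFFFFFFF


print(hex(set_domain(0x00000000, 15, 0b11)))
```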

Page 196: ...ngement in the Data Fault Status Register Figure 3 36 Data Fault Status Register format Table 3 61 shows how the bit values correspond with the Data Fault Status Register functions S D 0 UNP SBZ 31 8 7 4 3 0 Domain Status 9 0 S 10 11 R W 12 13 Table 3 61 Data Fault Status Register bit functions Bits Field name Function 31 13 UNP SBZ 12 SD Indicates if an AXI Decode or Slave error caused an abort T...

Page 197: ...Bit fault on Page b0111 Translation Page fault b1000 Precise external abort b1001 Domain Section fault b1010 no function b1011 Domain Page fault b1100 External abort on translation first level b1101 Permission Section fault b1110 External abort on translation second level b1111 Permission Page fault 3 0 with bit 10 1 Status Indicates type of fault generated See Fault status and address on page 6 3...
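The status encodings listed above lend themselves to a lookup table. The sketch below decodes only the codes visible in this excerpt (valid when bit 10 is 0) and returns `None` for anything else. Field positions follow the Data Fault Status Register figure (Status [3:0], Domain [7:4], bit 10, bit 11, bit 12); `decode_dfsr` is a hypothetical helper.

```python
# Status codes shown in the excerpt above, valid when bit 10 is 0.
DFSR_STATUS = {
    0b0111: "Translation Page fault",
    0b1000: "Precise external abort",
    0b1001: "Domain Section fault",
    0b1011: "Domain Page fault",
    0b1100: "External abort on translation, first level",
    0b1101: "Permission Section fault",
    0b1110: "External abort on translation, second level",
    0b1111: "Permission Page fault",
}


def decode_dfsr(dfsr):
    """Split a Data Fault Status Register value into raw fields and a description."""
    status = dfsr & 0xF
    s_bit = (dfsr >> 10) & 1
    return {
        "status": status,
        "domain": (dfsr >> 4) & 0xF,
        "s_bit": s_bit,
        "rw_bit": (dfsr >> 11) & 1,
        "sd_bit": (dfsr >> 12) & 1,
        # Codes not listed in the excerpt, or any code with bit 10 set, map to None.
        "description": DFSR_STATUS.get(status) if s_bit == 0 else None,
    }


print(decode_dfsr(0x0000005F)["description"])
```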

Page 198: ... 0 Write Data Fault Status Register 3 2 18 c5 Instruction Fault Status Register The purpose of the Instruction Fault Status Register IFSR is to hold the source of the last instruction fault Table 3 63 on page 3 67 lists the purposes of the individual bits in IFSR The Instruction Fault Status Register is in CP15 c5 a 32 bit read write register banked for Secure and Non secure worlds accessible in p...

Page 199: ...atus field see bits 3 0 in this table Always 0 9 4 UNP SBZ 3 0 with bit 10 0 Status Indicates type of fault generated See Fault status and address on page 6 34 for full details of Domain and FAR validity and priorities b0000 no function reset value b0001 Alignment fault b0010 Instruction debug event fault b0011 Access Bit fault on Section b0100 no function b0101 Translation Section fault b0110 Acc...

Page 200: ...d/write register, banked for Secure and Non-secure worlds, accessible in privileged modes only. The Fault Address Register bits [31:0] contain the MVA on which the precise abort occurred. The reset value is 0. Table 3-65 lists the results of attempted access for each mode. To use the FAR, read or write CP15 with Opcode_1 set to 0, CRn set to c6, CRm set to c0, Opcode_2 set to 0. For example: MRC p15, 0, Rd, c6, c0, 0...

Page 201: ...ted access for each mode. To use the IFAR, read or write CP15 with Opcode_1 set to 0, CRn set to c6, CRm set to c0, Opcode_2 set to 2. For example:
    MRC p15, 0, Rd, c6, c0, 2 ; Read Instruction Fault Address Register
    MCR p15, 0, Rd, c6, c0, 2 ; Write Instruction Fault Address Register
A write to this register sets the IFAR to the value of the data written. This is useful for a debugger to restore the value of the IFAR. 3...

Page 202: ...MVA Using Set and Index Write only c7 c6 c7 c0 4 c5 0 1 2 4 6 7 0 1 2 0 c10 c13 c14 0 1 2 4 5 6 1 0 1 2 SBZ SBZ MVA Index SBZ SBZ MVA SBZ MVA Index SBZ SBZ MVA Index SBZ SBZ MVA SBZ MVA Index 0 Invalidate Data Cache Line using Index Invalidate Both Caches Invalidate Data Cache Line using MVA Invalidate Entire Data Cache Flush Entire Branch Target Cache Wait For Interrupt WFI Flush Prefetch Buffer ...

Page 203: ...r Data Memory Barrier and Clean Data Cache Range These can be operated in User mode Attempting to execute a privileged instruction in User mode results in the Undefined instruction trap being taken There are three ways to use c7 For the Cache Dirty Status Register read c7 with the MRC instruction For range operations use the MCRR instruction with the value of CRm to select the required operation F...

Page 204: ...ion Set and Index format MVA VA SBZ Set and Index format Figure 3 40 shows the Set and Index format for invalidate and clean operations Figure 3 40 c7 format for Set and Index Table 3 67 lists how the bit values correspond with the Cache Operation functions for Set and Index format operations The value of S in Table 3 68 depends on the cache size Table 3 68 lists the relationship of cache sizes an...

Page 205: ...e no effect if they miss in the cache If the corresponding entry is not in the TLB these instructions can cause a TLB miss exception or hardware page table walk depending on the miss handling mechanism For the cache control operations the MVAs that are passed to the cache are not translated by the FCSE extension VA format Figure 3 42 shows the VA format for invalidate and clean operations All VA f...

Page 206: ... To determine if the cache is dirty use the Cache Dirty Status Register see Cache Dirty Status Register on page 3 78 Entire cache Table 3 71 lists the instructions and operations that you can use to clean and invalidate the entire cache Register c7 specifies operations for cleaning the entire Data Cache and also for performing a clean and invalidate of the entire Data Cache These are blocking oper...

Page 207: ...sumed to be enabled at this point
Loop1    MOV   R1, #0
         MCR   CP15, 0, R1, C7, C10, 0   ; Clean (or Clean and Invalidate) Cache
         MRS   R2, CPSR
         CPSID iaf                       ; Disable interrupts
         MRC   CP15, 0, R1, C7, C10, 6   ; Read Cache Dirty Status Register
         ANDS  R1, R1, #1                ; Check if it is clean
         BEQ   UseClean
         MSR   CPSR, R2                  ; Re-enable interrupts
         B     Loop1                     ; clean the cache again
UseClean Do_Clean_Operations             ; Perform whatever operation relies on the cache being clean i...

Page 208: ...73 can only be performed using an MCRR or MCRR2 instruction and all other operations to these registers are ignored The End Address and Start Address in Table 3 73 is the true VA before any modification by the Fast Context Switch Extension FCSE This address is translated by the FCSE logic Each of the range operations operates between cache lines containing the Start Address and the End Address inc...

Page 209: ...he fault TrustZone behavior TrustZone affects cache operations as follows Secure world operations In the Secure world cache operations can affect both Secure and Non secure cache lines Clean invalidate and clean and invalidate operations affect all cache lines regardless of their status as locked or unlocked For clean invalidate and clean and invalidate operations with the Set and Index format the...

Page 210: ...an and invalidate operations of the whole cache in the Secure world clear both the Secure and Non secure Cache Dirty Status Registers if the core is in the Non secure world or targets Non secure data from the Secure world stores that write a dirty bit in the cache set both the Secure and the Non secure Cache Dirty Status Register all stores that write a dirty bit in the cache set the Secure Cache ...

Page 211: ... See Explicit Memory Barriers on page 6 25 VA to PA translation operations The purpose of the VA to PA translation operations is to provide a Secure means to determine address translation in the Secure and Non secure worlds and for address translation between the Secure and Non secure worlds VA to PA translations operate through PA Register on page 3 80 Table 3 75 Cache operations flush functions ...

Page 212: ...ister banked in Secure and Non secure worlds accessible in privileged modes only Figure 3 45 shows the format of the PA Register for successful translations Figure 3 45 PA Register format for successful translation Figure 3 46 shows the format of the PA register for aborted translations Figure 3 46 PA Register format for aborted translation Table 3 77 lists the functional bits of the PA Register f...

Page 213: ...ister is not set the processor updates the Secure or Non secure versions of the two registers depending on the Secure or Non secure state of the core when the operation was issued 6 4 INNER Indicates the inner attributes from the page table b000 Noncacheable b001 Strongly Ordered b010 Reserved b011 Device b100 Reserved b101 Reserved b110 Inner Write through no allocate on write b111 Inner Write ba...

Page 214: ...A to PA translation in the current world operations use CP15 c7: four 32-bit write-only operations, common to the Secure and Non-secure worlds, accessible in privileged modes only. The operations work for privileged or User access permissions and return information in the PA Register: abort information when the translation is unsuccessful, or page table information when the translation succeeds. Att...

Page 215: ...ranslation in the other world, write CP15 c7 with Opcode_1 set to 0, CRn set to c7, CRm set to c8, and Opcode_2 set to: 4 for privileged read permission, 5 for privileged write permission, 6 for User read permission, 7 for User write permission. General register Rn contains the VA for translation. The result returns in the PA Register, for example: MCR p15, 0, Rn, c7, c8, 4 ; get VA = Rn and run Non-secure translation wit...

Page 216: ...his instruction completes Therefore no explicit memory transactions occurring in program order after this instruction are started until this instruction completes See Explicit Memory Barriers on page 6 25 It can be used instead of Strongly Ordered memory when the timing of specific stores to the memory system has to be controlled For example when a store to an interrupt acknowledge location must b...

Page 217: ...e CP15 with Rd SBZ and Opcode_1 set to 0, CRn set to c7, CRm set to c0, Opcode_2 set to 4. For example: MCR p15, 0, Rd, c7, c0, 4 ; Wait For Interrupt. This puts the processor into a low-power state and stops it executing the following instructions until an interrupt, an imprecise external abort, or a debug request occurs, regardless of whether the interrupts or external imprecise aborts are disabled by the masks in ...

Page 218: ...attempted access for each mode. To access the TLB Operations Register, write CP15 with Opcode_1 set to 0, CRn set to c8, CRm set to c5 (Instruction TLB), c6 (Data TLB), or c7 (Unified TLB), and Opcode_2 set to 0 (Invalidate TLB unlocked entries), 1 (Invalidate TLB Entry by MVA), or 2 (Invalidate TLB Entry on ASID Match). For example, to invalidate all the unlocked entries in the Instruction TLB: MCR p15, 0, Rd, c8, c5, 0 ; Write TLB Oper...
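The CRm and Opcode_2 choices above form a small two-way table. The sketch below builds the assembly text for a chosen TLB and operation, for instance when generating code or checking a listing. The helper name `tlb_operation` and its string keys are hypothetical labels for the options listed in the excerpt.

```python
# CRm selects the TLB, Opcode_2 selects the operation (values from the text above).
TLB_CRM = {"instruction": 5, "data": 6, "unified": 7}
TLB_OPCODE_2 = {"unlocked": 0, "mva": 1, "asid": 2}


def tlb_operation(tlb, operation, rd="Rd"):
    """Return the MCR instruction text for a TLB invalidate operation."""
    crm = TLB_CRM[tlb]
    opcode_2 = TLB_OPCODE_2[operation]
    return f"MCR p15, 0, {rd}, c8, c{crm}, {opcode_2}"


# Invalidate all unlocked entries in the Instruction TLB.
print(tlb_operation("instruction", "unlocked"))
```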

Page 219: ...a single interruptible operation that invalidates all TLB entries that match the provided ASID value This function invalidates locked entries but does not invalidate entries marked as global In this processor this operation takes several cycles to complete and the instruction is interruptible When interrupted the R14 state is set to indicate that the MCR instruction has not executed Therefore R14 ...

Page 220: ...cess that cache way For details of the RR bit that controls the selection of Random or Round Robin cache policy see c1 Control Register on page 3 44 ARM1176JZF S processors have an associativity of 4 With all ways locked the ARM1176JZF S processor behaves as if only ways 3 to 1 are locked and way 0 is unlocked SBO 31 4 3 0 L bit for each cache way Table 3 83 Instruction and data cache lockdown reg...
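The lockdown register format above (bits [31:4] Should Be One, one L bit per cache way in bits [3:0]) and the all-ways-locked rule can be modelled as follows. This is a sketch of the described behavior only; both function names are hypothetical.

```python
def lockdown_value(locked_ways):
    """Build a cache lockdown register value from an iterable of way numbers (0-3)."""
    mask = 0
    for way in locked_ways:
        if not 0 <= way <= 3:
            raise ValueError("the cache has ways 0 to 3")
        mask |= 1 << way
    return 0xFFFFFFF0 | mask  # bits [31:4] are written as ones (SBO)


def effective_lock_mask(value):
    """Return the L bits the processor acts on for a given register value."""
    mask = value & 0xF
    # With all four ways locked, way 0 is treated as unlocked.
    return 0b1110 if mask == 0b1111 else mask


print(hex(lockdown_value({0, 1})), bin(effective_lock_mask(0xFFFFFFFF)))
```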

Page 221: ...ons used by any exception handlers that can be called must meet the conditions specified in step 2. 2. Ensure that all data or instructions used by the following code, apart from the data or instructions that are to be locked down, are either in a noncacheable area of memory, including the TCM, or in an already locked cache way. 3. Ensure that the data or instructions to be locked down are in a Cacheable ar...

Page 222: ...Undefined exception see TrustZone write access disable on page 2 9 Note When the NS access bit is 0 for Data TCM see c9 Data TCM Non secure Control Access Register on page 3 93 attempts to access the Data TCM Region Register from the Non secure world cause an Undefined exception E n Base address physical address 31 12 11 7 6 2 1 0 SBZ UNP Size S B Z Table 3 85 Data TCM Region Register bit function...

Page 223: ...ion TCM region and to provide a mechanism to enable it. Table 3-87 on page 3-92 lists the purposes of the individual bits of the Instruction TCM Region Register. The Instruction TCM Region Register is in CP15 c9, a 32-bit read/write register common to Secure and Non-secure worlds, accessible in privileged modes only. If the processor is configured to have 2 Instruction TCMs, each TCM has a separate Ins...

Page 224: ... for Instruction TCM see c9 Instruction TCM Non secure Control Access Register on page 3 94 attempts to access the Instruction TCM Region Register from the Non secure world cause an Undefined exception Table 3 87 Instruction TCM Region Register bit functions Bits Field name Function 31 12 Base address Contains the physical base address of the TCM The base address must be aligned to the size of the...
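The TCM Region Register layout shown in the figure residue on these pages (Base address in bits [31:12], Size in bits [6:2], En in bit 0) can be unpacked with a few shifts. This is an illustrative sketch: `decode_tcm_region` is a hypothetical helper, the sample value is constructed for the example, and the Size field is returned as a raw code because its encoding table is not part of this excerpt.

```python
def decode_tcm_region(value):
    """Split a TCM Region Register value into base address, raw size code, and enable."""
    return {
        "base_address": value & 0xFFFFF000,  # bits [31:12], physical base address
        "size_code": (value >> 2) & 0x1F,    # bits [6:2], raw encoding
        "enabled": bool(value & 0x1),        # bit 0
    }


# Constructed sample: base 0x00010000, size code 6, enabled.
print(decode_tcm_region(0x00010019))
```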

Page 225: ...ecure Access Register is to set access permission to the Data TCM Region Register define data in the Data TCM as Secure or Non secure The Data TCM Non secure Control Access Register is in CP15 c9 a 32 bit read write register in the Secure world only accessible in privileged modes only If the processor is configured to have 2 Data TCMs each TCM has a separate Data TCM Non secure Control Access Regi...

Page 226: ... c9 Instruction TCM Non secure Control Access Register The purpose of the Instruction TCM Non secure Control Access Register is to set access permission to the Instruction TCM Region Register define instructions in the Instruction TCM as Secure or Non secure Table 3 89 Data TCM Non secure Control Access Register bit functions Bits Field name Function 31 1 UNP SBZ 0 NS access Makes Data TCM invisib...

Page 227: ... in Secure Privileged mode when CP15SDISABLE is HIGH result in an Undefined exception see TrustZone write access disable on page 2 9 31 1 0 SBZ NS access Table 3 91 Instruction TCM Non secure Control Access Register bit functions Bits Field name Function 31 1 UNP SBZ 0 NS access Makes Instruction TCM invisible to the Non secure world and makes TCM data Secure 0 Instruction TCM Region Register only...

Page 228: ...a TCM Region Register on page 3 89 c9 Instruction TCM Region Register on page 3 91 c9 Data TCM Non secure Control Access Register on page 3 93 c9 Instruction TCM Non secure Control Access Register on page 3 94 The TCM Selection Register is in CP15 c9 a 32 bit read write register banked in the Secure and Non secure worlds accessible in privileged modes only Figure 3 54 shows the bit arrangement for...

Page 229: ... Selection register MCR p15 0 Rd c9 c2 0 Write TCM Selection register 3 2 30 c9 Cache Behavior Override Register The purpose of the Cache Behavior Override Register is to control cache write through and line fill behavior for interruptible cache operations or during debug The register enables you to ensure that the contents of caches do not change for example in debug The Cache Behavior Override R...

Page 230: ...es write through behavior for regions marked as Secure write back 0 Do not force write through normal operation reset value 1 Force write through 4 S_IL Secure only Defines Instruction Cache linefill behavior for Secure regions 0 Instruction Cache linefill enabled normal operation reset value 1 Instruction Cache linefill disabled 3 S_DL Secure only Defines Data Cache linefill behavior for Secure r...

Page 231: ...ate Non secure entries into the cache and must treat all writes to Non secure regions that hit in the cache as write through Note Three bits nWT nIL and nDL are also defined for Debug state in CP14 see CP14 c10 Debug State Cache Control Register on page 13 23 and apply to all Secure and Non secure regions The CP14 register has precedence over the CP15 register when the core is in Debug state and th...

Page 232: ... TL bit is not set the Lockdown entries are reserved for the Secure world Table 3 98 lists the results of attempted access for each mode The lockdown region of the TLB contains eight entries TLB organization on page 6 4 describes the structure of the TLB P SBZ 31 29 28 26 25 1 0 Victim SBZ UNP Table 3 97 TLB Lockdown Register bit functions Bits Field name Function 31 29 UNP SBZ 28 26 Victim Specif...
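The field layout quoted above (Victim in bits 28 to 26, the preserve bit P in bit 0, everything else UNP/SBZ) can be modelled with a small helper. This is only an illustrative sketch in Python; the function names `pack_tlb_lockdown` and `unpack_tlb_lockdown` are hypothetical and are not part of any ARM tool or library.

```python
# Field positions taken from the TLB Lockdown Register bit-function listing:
# Victim = bits [28:26], P (preserve) = bit [0]. All other bits are written as 0.
VICTIM_SHIFT = 26
VICTIM_MASK = 0x7 << VICTIM_SHIFT
P_BIT = 1 << 0


def pack_tlb_lockdown(victim, preserve):
    """Build a register value from a victim index (0-7) and the preserve flag."""
    if not 0 <= victim <= 7:
        raise ValueError("victim must fit in three bits (eight lockdown entries)")
    return (victim << VICTIM_SHIFT) | (P_BIT if preserve else 0)


def unpack_tlb_lockdown(value):
    """Return (victim, preserve) extracted from a register value."""
    return (value & VICTIM_MASK) >> VICTIM_SHIFT, bool(value & P_BIT)
```

For instance, `pack_tlb_lockdown(3, True)` yields the value that selects lockdown entry 3 with the preserve bit set.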

Page 233: ...lue of the address to be locked down MCR p15 0 r1 c8 c7 1 invalidate TLB single entry to ensure that LockAddr is not already in the TLB MRC p15 0 R0 c10 c0 0 read the lockdown register ORR R0 R0 1 set the preserve bit MCR p15 0 R0 c10 c0 0 write to the lockdown register LDR r1 r1 TLB misses and entry is loaded MRC p15 0 R0 c10 c0 0 read the lockdown register victim increments BIC R0 R0 1 clear pre...

Page 234: ... Remap Register bit functions Bits Field name Function [a] ([a] The reset values ensure that no remapping occurs at reset) 31 20 UNP SBZ 19 Remaps shareable attribute when S 1 for Normal regions [b] 1 reset value 18 Remaps shareable attribute when S 0 for Normal regions [b] 0 reset value 17 Remaps shareable attribute when S 1 for Device regions [b] 0 reset value 16 Remaps shareable attribute when S 0 for Device r...

Page 235: ...Shared attributes of this register Table 3 100 Encoding for the remapping of the primary memory type Encoding Memory type b00 Strongly ordered b01 Device b10 Normal b11 UNP normal 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 Table 3 101 Normal Memory Remap Register bit functions Bits Field name Functiona 31 30 Remaps Outer attribute for TEX 0 C B b111 b01 r...

Page 236: ... b10 reset value 9 8 Remaps Inner attribute for TEX 0 C B b100 b00 reset value 7 6 Remaps Inner attribute for TEX 0 C B b011 b11 reset value 5 4 Remaps Inner attribute for TEX 0 C B b010 b10 reset value 3 2 Remaps Inner attribute for TEX 0 C B b001 b00 reset value 1 0 Remaps Inner attribute for TEX 0 C B b000 b00 reset value a The reset values ensure that no remapping occurs at reset Table 3 102 R...
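The Inner attribute fields listed above follow a regular pattern: the two-bit field for TEX[0], C, B index n sits at bits [2n+1:2n] (bits 1:0 for b000, bits 9:8 for b100). The sketch below extracts a field by index. It assumes the Outer fields follow the same pattern starting at bit 16, which matches the one Outer entry visible in the listing (bits 31:30 for b111) but is otherwise an extrapolation; the function names are hypothetical.

```python
def inner_attr(nmrr, index):
    """Inner cacheable attribute for a TEX[0],C,B index (0-7): bits [2n+1:2n]."""
    if not 0 <= index <= 7:
        raise ValueError("index is the 3-bit TEX[0],C,B value")
    return (nmrr >> (2 * index)) & 0x3


def outer_attr(nmrr, index):
    """Outer cacheable attribute, ASSUMED to sit at bits [2n+17:2n+16]."""
    if not 0 <= index <= 7:
        raise ValueError("index is the 3-bit TEX[0],C,B value")
    return (nmrr >> (16 + 2 * index)) & 0x3
```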

Page 237: ... uses the Normal Memory Remap Register to remap the inner and outer cacheable attributes The behavior of the memory region remap registers depends on the TEX remap bit see c1 Control Register on page 3 44 If the TEX remap bit is set the entries in the memory region remap registers remap each possible value of the TEX 0 C and B bits in the page tables You can therefore set your own definitions for ...

Page 238: ...igure 3 59 DMA identification and status registers format Table 3 104 lists how the bit values correspond with the DMA identification and status registers Table 3 105 lists the Opcode_2 values used to select the DMA channel function C H 0 UNP 31 2 1 0 C H 1 Table 3 104 DMA identification and status register bit functions Bits Field name Function 31 2 UNP SBZ 1 CH1 Provides information on DMA Chann...

Page 239: ...pting 3 2 34 c11 DMA User Accessibility Register The purpose of the DMA User Accessibility Register is to determine if a User mode process can access the registers for each channel The DMA User Accessibility Register is in CP15 c11 a 32 bit read write register common to the Secure and Non secure worlds accessible in privileged modes only Figure 3 60 on page 3 108 shows the bit arrangement for the ...

Page 240: ...nable registers on page 3 110 c11 DMA Control Register on page 3 112 c11 DMA Internal Start Address Register on page 3 114 c11 DMA External Start Address Register on page 3 115 c11 DMA Internal End Address Register on page 3 116 c11 DMA Channel Status Register on page 3 117 U 0 SBZ UNP 31 2 1 0 U 1 Table 3 107 DMA User Accessibility Register bit functions Bits Field name Function 31 2 UNP SBZ 1 U1...

Page 241: ...arrangement for the DMA Channel Number Register Figure 3 61 DMA Channel Number Register format Table 3 109 lists how the bit values correspond with the DMA Channel Number Register Access in the Non secure world depends on the DMA bit see c1 Non Secure Access Control Register on page 3 55 The processor can access this register in User mode if the U bit see c11 DMA User Accessibility Register on pag...

Page 242: ...l cycles to stop after the DMA issues a Stop instruction The channel status remains at Running until the DMA channel stops The channel status is set to Complete or Error at the point that all outstanding memory accesses complete The Start Address Registers contain the addresses the DMA requires to restart the operation when the channel stops If the Stop command occurs when the channel status is Qu...

Page 243: ...rt DMA Enable Register MCR p15 0 Rd c11 c3 2 Clear DMA Enable Register Debug implications for the DMA The level one DMA behaves as a separate engine from the processor core and when started works autonomously When the level one DMA has channels with the status of Running or Queued these channels continue to run or start running even if a debug mechanism stops the processor This can cause the conte...

Page 244: ... direction of transfer 0 Transfer from level two memory to the TCM reset value 1 Transfer from the TCM to the level two memory 29 IC Indicates whether the DMA channel must assert an interrupt on completion of the DMA transfer or if the DMA is stopped by a Stop command see c11 DMA enable registers on page 3 110 The interrupt is deasserted from this source if the processor performs a Clear operation...

Page 245: ...DMA A Stride of zero reset value indicates that the external address is not to be incremented This is designed to facilitate the accessing of volatile locations such as a FIFO The Stride is interpreted as a positive number or zero The internal address increment is not affected by the Stride but is fixed at the transaction size The stride value is in bytes The value of the Stride must be aligned to...

Page 246: ...l Start VA Access in the Non secure world depends on the DMA bit see c1 Non Secure Access Control Register on page 3 55 The processor can access this register in User mode if the U bit see c11 DMA User Accessibility Register on page 3 107 for the currently selected channel is set to 1 Table 3 114 lists the results of attempted access for each mode To access the DMA Internal Start Address Register ...

Page 247: ...ransfers go to or from The DMA External Start Address Register is in CP15 c11 one 32 bit read write register for each DMA channel common to Secure and Non secure worlds accessible in user and privileged modes The DMA External Start Address Register bits 31 0 contain the External Start VA Access in the Non secure world depends on the DMA bit see c1 Non Secure Access Control Register on page 3 55 Th...

Page 248: ...or that channel This is the end address of the data transfer The DMA Internal End Address Register is in CP15 c11 one 32 bit read write register for each DMA channel common to Secure and Non secure worlds accessible in user and privileged modes The DMA Internal End Address Register bits 31 0 contain the Internal End VA Access in the Non secure world depends on the DMA bit see c1 Non Secure Access ...

Page 249: ...be affected The Internal End Address must be aligned to the transaction size set in the DMA Control Register or the processor generates a bad parameter error 3 2 41 c11 DMA Channel Status Register The purpose of the DMA Channel Status Register for each channel is to define the status of the most recently started DMA operation on that channel The DMA Channel Status Register is in CP15 c11 one 32 bi...

Page 250: ...ection b11011 Domain fault page b11101 Permission fault section b11111 Permission fault page 6 2 IS Indicates the status of the Internal Address Error All other encodings are Reserved b00000 No error reset value b00xxx No error b01000 TCM out of range b11100 External Abort on translation of first level page table b11110 External Abort on translation of second level page table b10011 Access Bit fau...

Page 251: ... before the external accesses from the transfer complete If the processor attempts to access memory locations that are not marked as shared then the ES bits signal an Unshared error for either a DMA transfer in User mode a DMA transfer that has the UM bit set in the DMA Control Register A DMA transfer where the external address is within the range of the TCM also results in an Unshared data error ...

Page 252: ...Register on page 3 55 Table 3 120 lists the results of attempted access for each mode To access the DMA Context ID register in a privileged mode set the DMA Channel Number Register to the appropriate DMA channel and read or write CP15 with Opcode_1 set to 0 CRn set to c11 CRm set to c15 Opcode_2 set to 0 MRC p15 0 Rd c11 c15 0 Read DMA Context ID Register MCR p15 0 Rd c11 c15 0 Write DMA Context I...

Page 253: ...ure or Non secure Vector Base Address Register The purpose of the Secure or Non secure Vector Base Address Register is to hold the base address for exception vectors in the Secure and Non secure worlds For more information see Exceptions on page 2 36 The Secure or Non secure Vector Base Address Register is in CP15 c12 a 32 bit read write register banked in Secure and Non secure worlds accessible i...

Page 254: ...or Base Address Register read or write CP15 with Opcode_1 set to 0 CRn set to c12 CRm set to c0 Opcode_2 set to 0 For example MRC p15 0 Rd c12 c0 0 Read Secure or Non secure Vector Base Address Register MCR p15 0 Rd c12 c0 0 Write Secure or Non secure Vector Base Address Register 3 2 44 c12 Monitor Vector Base Address Register The purpose of the Monitor Vector Base Address Register is to hold the ...

Page 255: ...te value for the Secure Monitor Attempts to write to this register in Secure Privileged mode when CP15SDISABLE is HIGH result in an Undefined exception see TrustZone write access disable on page 2 9 Table 3 124 lists the results of attempted access for each mode To use the Monitor Vector Base Address Register read or write CP15 with Opcode_1 set to 0 CRn set to c12 CRm set to c0 Opcode_2 set to 1 ...

Page 256: ...rnal abort occurs and automatically clears when the abort is taken Table 3 126 lists the results of attempted access for each mode The A I and F bits map to the same format as the CPSR so that you can use the same mask for these bits SBZ 31 9 8 7 6 5 0 A I F SBZ Table 3 125 Interrupt Status Register bit functions Bits Field name Functiona a The reset values depend on external signals 31 9 SBZ 8 A ...

Page 257: ...tricted Access The Secure Monitor can poll these bits to detect the exceptions before it completes context switches This can reduce interrupt latency To use the Interrupt Status Register read CP15 with Opcode_1 set to 0 CRn set to c12 CRm set to c1 Opcode_2 set to 0 For example MRC p15 0 Rd c12 c1 0 Read Interrupt Status Register ...
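As a rough illustration of how software might interpret a value read from the Interrupt Status Register, the sketch below decodes the A, I and F bits using the positions given in the register format (A in bit 8, I in bit 7, F in bit 6, the same positions as in the CPSR). The function name `decode_isr` is hypothetical.

```python
# Bit positions from the Interrupt Status Register format: A = 8, I = 7, F = 6.
ISR_A = 1 << 8  # external abort pending
ISR_I = 1 << 7  # interrupt pending
ISR_F = 1 << 6  # fast interrupt pending


def decode_isr(value):
    """Return which pending flags are set in an Interrupt Status Register value."""
    return {
        "A": bool(value & ISR_A),
        "I": bool(value & ISR_I),
        "F": bool(value & ISR_F),
    }
```

Because the bits share the CPSR layout, the same three masks could also be applied to a CPSR value.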

Page 258: ... an Undefined exception see TrustZone write access disable on page 2 9 Table 3 128 lists the results of attempted access for each mode To use the FCSE PID Register read or write CP15 with Opcode_1 set to 0 CRn set to c13 CRm set to c0 Opcode_2 set to 0 For example MRC p15 0 Rd c13 c0 0 Read FCSE PID Register MCR p15 0 Rd c13 c0 0 Write FCSE PID Register FCSE PID 31 25 24 0 SBZ Table 3 127 FCSE PID...

Page 259: ...ProcID 0 1 A5 any instruction Fetched with ProcID 0 1 A6 any instruction Fetched with ProcID 1 Note You must not rely on this behavior for future compatibility An IMB must be executed between changing the ProcID and fetching from locations that are translated by the ProcID Addresses issued by the ARM1176JZF S processor in the range 0 32MB are translated by the ProcID Address A becomes A ProcID x 3...
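The address translation described above can be sketched as follows. The model assumes the standard FCSE behavior: a virtual address below 32MB is relocated by adding ProcID multiplied by 32MB, and any other address passes through unchanged. ProcID is treated as a 7-bit value, matching the FCSE PID field in bits 31 to 25. The function name `fcse_translate` is hypothetical.

```python
FCSE_WINDOW = 32 * 1024 * 1024  # 32MB, the range that the ProcID relocates


def fcse_translate(va, proc_id):
    """Return the modified virtual address for a given 7-bit ProcID."""
    if not 0 <= proc_id < 128:
        raise ValueError("ProcID is a 7-bit field")
    if va < FCSE_WINDOW:
        return va + proc_id * FCSE_WINDOW
    return va  # addresses at or above 32MB are not translated
```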

Page 260: ...text ID Register format Table 3 129 lists how the bit values correspond with the Context ID Register functions Table 3 130 lists the results of attempted access for each mode The current ASID value in the Context ID Register is exported to the MMU To use the Context ID Register read or write CP15 with Opcode_1 set to 0 CRn set to c13 CRm set to c0 Opcode_2 set to 1 For example MRC p15 0 Rd c13 c0 ...

Page 261: ...r Read Write Thread and Process ID Register User Read Only Thread and Process ID Register Privileged Only Thread and Process ID Register each accessible in different modes User Read Write read write in User and privileged modes User Read Only read only in User mode read write in privileged modes Privileged Only read write in privileged modes only Table 3 131 lists the results of attempted access t...

Page 262: ...rocess ID registers on process switches to prevent data leaking from one process to another This is important to ensure the security of secure data The reset value of these registers is 0 3 2 49 c15 Peripheral Port Memory Remap Register The purpose of the Peripheral Port Memory Remap Register is to remap the memory attributes to Non Shared Device This forces access to the peripheral port and overr...

Page 263: ...r while the MMU is disabled the virtual base address is equal to the physical base address that is used The assumption is that the Base Address is aligned to the size of the remapped region Any bits in the range log2 Region size 1 12 are ignored The value is the base address The reset value is 0 11 5 UNP SBZ 4 0 Size Indicates the size of the memory region that the peripheral port is remapped to A...

Page 264: ...it read write register in the Secure world only accessible in privileged modes only Figure 3 72 shows the bit arrangement for the Secure User and Non secure Access Validation Control Register Figure 3 72 Secure User and Non secure Access Validation Control Register format Table 3 134 lists how the bit values correspond with the Secure User and Non secure Access Validation Control Register function...

Page 265: ...ce Monitor Control Register is to control the operation of the Cycle Counter Register the Count Register 0 the Count Register 1 Table 3 136 on page 3 134 lists the purpose of the individual bits in the register The Performance Monitor Control Register is in CP15 c15 a 32 bit read write register common to Secure and Non secure worlds accessible in User and Privileged modes Figure 3 73 shows the bit...

Page 266: ... disabled EVNTBUS held at 0x0 reset value 1 Export enabled EVNTBUS driven by the events 10 CCR Cycle Counter Register overflow flag 0 For reads No overflow reset value For writes No effect 1 For reads overflow occurred For writes Clear this bit 9 CR1 Count Register 1 overflow flag 0 For reads No overflow reset value For writes No effect 1 For reads overflow occurred For writes Clear this bit 8 CR0...

Page 267: ...d Table 3 136 Performance Monitor Control Register bit functions continued Bits Field name Function Table 3 137 Performance monitoring events EVNTBUS bit position Event number Event definition 0xFF An increment each cycle 0x26 Procedure return instruction executed and return address predicted incorrectly The procedure return address was restored to the return stack following the prediction being i...

Page 268: ...the location is cacheable 10 0x9 Data cache access Does not include Cache Operations This event occurs for each nonsequential access to a cache line for cacheable locations 9 8 0x7 Instruction executed If EVNTBUS bit 9 is HIGH two instructions were executed in this clock cycle and the count is incremented by two 7 0x6 Branch mispredicted 6 Reserved 5 0x5 Branch instruction executed branch might or...

Page 269: ...ken from various pipeline stages This means that the absolute counts recorded might vary because of pipeline effects This has negligible effect except in cases where the counters are enabled for a very short time In addition to the two counters within the processor most of the events that Table 3 137 on page 3 135 lists are available on an external bus EVNTBUS You can connect this bus to the ETM u...

Page 270: ...Register to count every 64th clock cycle 3 2 53 c15 Count Register 0 The purpose of the Count Register 0 is to count instances of an event that the Performance Monitor Control Register selects The Count Register 0 is in CP15 c15, is a 32 bit read write register common to Secure and Non secure worlds, counts up, and can trigger an interrupt on overflow Count Register 0 bits 31 0 contain the count valu...

Page 271: ...Count Register 1 is to count instances of an event that the Performance Monitor Control Register selects The Count Register 1 is in CP15 c15, is a 32 bit read write register common to Secure and Non secure worlds, counts up, and can trigger an interrupt on overflow Count Register 1 bits 31 0 contain the count value The reset value is 0 You can use this register in conjunction with the Performance Mon...

Page 272: ...to count core clock cycles to trigger a system validation event The System Validation Counter Register is in CP15 c15 a 32 bit read write register common to the Secure and Non secure worlds accessible in User and Privileged modes The System Validation Counter Register consists of one 32 bit register that performs four functions Table 3 142 lists the arrangement of the functions in this group The r...

Page 273: ...ounter MRC p15 0 Rd c15 c12 2 Read interrupt counter MCR p15 0 Rd c15 c12 2 Write interrupt counter MRC p15 0 Rd c15 c12 3 Read fast interrupt counter MCR p15 0 Rd c15 c12 3 Write fast interrupt counter MCR p15 0 Rd c15 c12 7 Write external debug request counter A read or write to the System Validation Counter Register with a value of Opcode_2 other than 1 2 3 or 7 has no effect When the system st...

Page 274: ...lds accessible in user and privileged modes The System Validation Operations Register consists of one 32 bit register that performs 16 functions Table 3 144 lists the arrangement of the functions in this group A write to the System Validation Operations Register with a combination of Opcode_1 and Opcode_2 that Table 3 144 does not list has no effect A read from the System Validation Operations Reg...

Page 275: ...ue External debug request counter For example MCR p15 0 Rd c15 c13 1 Start reset counter MCR p15 0 Rd c15 c13 2 Start interrupt counter MCR p15 0 Rd c15 c13 3 Start reset and interrupt counters MCR p15 0 Rd c15 c13 4 Start fast interrupt counter MCR p15 0 Rd c15 c13 5 Start reset and fast interrupt counters MCR p15 0 Rd c15 c13 6 Start interrupt and fast interrupt counters MCR p15 0 Rd c15 c13 7 S...

Page 276: ...ar the counters to return them to their System performance monitoring function To do this set bits in Rn and write to the Performance Monitor Control Register to clear the relevant overflow flags bit 10 to clear the reset counter bit 9 to clear the fast interrupt counter bit 8 to clear the interrupt counter You must carry out this operation with a read modify write sequence to avoid changes to oth...
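The read-modify-write step mentioned above needs some care, because the overflow flags in the Performance Monitor Control Register are cleared by writing 1 and are unaffected by writing 0. Writing back an unmodified read value would therefore clear every flag that happens to be set. The sketch below computes a write value that clears only the requested flags; it is a model of the bit manipulation only, and the function name `value_to_clear_flags` is hypothetical.

```python
# Overflow flag positions in the Performance Monitor Control Register.
FLAG_CCR = 1 << 10  # Cycle Counter Register overflow flag
FLAG_CR1 = 1 << 9   # Count Register 1 overflow flag
FLAG_CR0 = 1 << 8   # Count Register 0 overflow flag
FLAG_MASK = FLAG_CCR | FLAG_CR1 | FLAG_CR0


def value_to_clear_flags(current, flags_to_clear):
    """Return the value to write so that only flags_to_clear are cleared.

    Other control bits keep the value that was read; flag bits that must be
    left alone are written as 0, which has no effect on them.
    """
    if flags_to_clear & ~FLAG_MASK:
        raise ValueError("flags_to_clear may only contain overflow flag bits")
    return (current & ~FLAG_MASK & 0xFFFFFFFF) | flags_to_clear
```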

Page 277: ...Mask Register functions SBZ 31 15 14 12 11 10 8 7 6 4 3 2 0 DTCM S B Z ITCM S B Z DCache S B Z ICache Write enable Table 3 146 System Validation Cache Size Mask Register bit functions Bits Field name Function 31 Write enable Enables the update of the Cache and TCM sizes 0 The Cache and TCM sizes are not changed reset value 1 The Cache and TCM sizes take the new values that the DTCM ITCM DCache and...

Page 278: ...write CP15 with Opcode_1 set to 0 CRn set to c15 CRm set to c14 Opcode_2 set to 0 For example MRC p15 0 Rd c15 c14 0 Read System Validation Cache Size Mask Register MCR p15 0 Rd c15 c14 0 Write System Validation Cache Size Mask Register 7 SBZ UNP SBZ 6 4 DCache Specifies apparent size of Data Cache as it appears to the processor All other values are reserved b011 4KB b100 8KB b101 16KB b110 32KB b...

Page 279: ...A also appears not to be present Note You must not modify the System Validation Cache Size Mask Register in a manufactured device Physical RAMs do not support cache and TCM size masking Therefore any attempt to mask cache and TCM sizes using this register causes address aliasing effects and problems with cache master valid bits that result in incorrect operation and Unpredictable effects 3 2 58 c1...

Page 280: ...only The number of Master Valid bits in the register is a function of the cache size There is one Master Valid bit for each 8 cache lines For instance there are 64 Master Valid bits for a 16KB cache You can access Master Valid bits through 32 bit registers indexed using Opcode_2 The maximum number of 32 bit registers required for the largest cache size 64KB is 8 The Master Valid bits fill the regi...
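The arithmetic behind these figures can be checked with a short calculation. The sketch assumes a 32-byte cache line, which is consistent with the quoted numbers (64 Master Valid bits for a 16KB cache, 8 registers for a 64KB cache) but is not stated in this excerpt; the function names are hypothetical.

```python
LINE_BYTES = 32     # ASSUMPTION: cache line length in bytes
LINES_PER_BIT = 8   # one Master Valid bit covers 8 cache lines
BITS_PER_REG = 32   # Master Valid bits are read through 32-bit registers


def master_valid_bits(cache_bytes):
    """Number of Master Valid bits for a cache of the given size."""
    return cache_bytes // LINE_BYTES // LINES_PER_BIT


def master_valid_registers(cache_bytes):
    """Number of 32-bit registers needed to hold those bits (rounded up)."""
    bits = master_valid_bits(cache_bytes)
    return -(-bits // BITS_PER_REG)
```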

Page 281: ...down VA Register TLB Lockdown PA Register TLB Lockdown Attributes Register accessible in privileged modes only The four registers have different bit arrangements and functions Figure 3 76 shows the arrangement of bits in the TLB Lockdown Index Register Figure 3 76 TLB Lockdown Index Register format Table 3 148 lists how the bit values correspond with the TLB Lockdown Index Register functions Figur...

Page 282: ...le entries For global entries this field Should Be Zero V PA 31 12 11 10 9 8 7 6 5 4 3 2 1 0 SBZ N S A Size SBZ A P X AP NSTID Table 3 150 TLB Lockdown PA Register bit functions Bits Field name Function 31 12 PA Holds the PA of this page table entry 11 10 UNP SBZ 9 NSA Defines whether memory accesses in the memory region that this page table entry describes are Secure or Non secure accesses This m...

Page 283: ...t fields encoding APX AP Supervisor permissions User permissions Access type 0 b00 No access No access All accesses generate a permission fault 0 b01 Read write No access Supervisor access only 0 b10 Read write Read only Writes in user mode generate permission faults 0 b11 Read write Read write Full access 1 b00 No access No access Domain fault encoded field 1 b01 Read only No access Supervisor re...
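The rows of the encoding table that are visible above can be captured as a lookup keyed on (APX, AP). This is a sketch covering only those rows; encodings that fall in the truncated part of the table are deliberately left out and return `None`. The names `ACCESS_PERMISSIONS` and `permissions` are hypothetical.

```python
# (APX, AP) -> (supervisor permissions, user permissions).
# Only the rows visible in the excerpt are included.
ACCESS_PERMISSIONS = {
    (0, 0b00): ("No access", "No access"),
    (0, 0b01): ("Read/write", "No access"),
    (0, 0b10): ("Read/write", "Read-only"),
    (0, 0b11): ("Read/write", "Read/write"),
    (1, 0b00): ("No access", "No access"),
    (1, 0b01): ("Read-only", "No access"),
}


def permissions(apx, ap):
    """Return (supervisor, user) permissions, or None if the row is not listed."""
    return ACCESS_PERMISSIONS.get((apx, ap))
```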

Page 284: ...y supports sub pages Page table entries that support sub pages must be marked as Global see c15 TLB lockdown access registers on page 3 149 0 Sub pages are not valid 1 Sub pages are valid 24 11 SBZ UNP SBZ 10 7 Domain Specifies the Domain number for the page table entry 6 XN Specifies Execute Never attribute when set the contents of the memory region that this page table entry describes cannot be ...

Page 285: ...o memory and later restores them to the TLB Lockdown region You might use sequences similar to this for entry into Dormant mode Example 3 3 Save and restore all TLB Lockdown entries ADR r1 TLBLockAddr Set r1 to save address MOV R0 0 Initialize counter CPSID aif Disable interrupts TLBLockSave MCR p15 5 R0 c15 c4 2 Set TLB Lockdown Index MRC p15 5 R2 c15 c5 2 Read TLB Lockdown VA MRC p15 5 R3 c15 c7...

Page 286: ...mixed endianness data access support for the processor It contains the following sections About unaligned and mixed endian support on page 4 2 Unaligned access support on page 4 3 Endian support on page 4 6 Operation of unaligned accesses on page 4 13 Mixed endian access support on page 4 17 Instructions to reverse bytes in a general purpose register on page 4 20 Instructions to change the CPSR E ...

Page 287: ...nd stored back out of the register file In previous architectures this Program Status Register bit was specified as zero It is not set in legacy code written to conform to architectures prior to ARMv6 ARM and Thumb instructions to set and clear the E bit explicitly A byte invariant addressing scheme to support fine grain big endian and little endian shared data structures to conform to a shared me...

Page 288: ...ied for processors with architecturally compliant Memory Management Units MMUs under control of CP15 Register c1 A control bit bit 1 When a transfer is not naturally aligned to the size of data transferred a Data Abort is signaled with an Alignment fault status code see ARM Architecture Reference Manual for more details 4 2 2 ARMv6 extensions ARMv6 adds unaligned word and halfword load and store d...

Page 289: ... this right by the byte offset denoted by Address 1 0 see the ARM Architecture Reference Manual ARM and Thumb load multiple accesses always treated as aligned No rotation of read data ARM and Thumb store word and store multiple treated as aligned No rotation of write data ARM load and store doubleword operations treated as 64 bit aligned For more information see Operation of unaligned accesses on ...

Page 290: ...um performance Accesses can abort on either or both halves of an access where this occurs over a page boundary The Data Abort handler must handle restartable aborts carefully after an Alignment Fault status code is signaled As a result shared memory schemes must not rely on seeing monotonic updates of non aligned data of loads stores and swaps for data items greater than byte width Unaligned acces...

Page 291: ...treated as pointing to the most significant byte of the addressed data 4 3 1 Load unsigned byte endian independent The addressed byte is loaded from memory into the low eight bits of the general purpose register and the upper 24 bits are zeroed as Figure 4 1 shows Figure 4 1 Load unsigned byte 4 3 2 Load signed byte endian independent The addressed byte is loaded from the memory into the low eight...

Page 292: ... 4 shows Figure 4 4 Load unsigned halfword little endian If strict alignment fault checking is enabled and Address bit 0 is not zero then a Data Abort is generated and the MMU returns a Misaligned fault in the Fault Status Register 4 3 5 Load unsigned halfword big endian The addressed byte pair is loaded from memory into the low 16 bits of the general purpose register and the upper 16 bits are zer...

Page 293: ...nd the upper 16 bits are sign extended from bit 15 as Figure 4 6 shows Figure 4 6 Load signed halfword little endian In Figure 4 6 se1 means bit 15 b1 bit 7 sign extended If strict alignment fault checking is enabled and Address bit 0 is not zero then a Data Abort is generated and the MMU returns a Misaligned fault in the Fault Status Register 4 3 7 Load signed halfword big endian The addressed by...
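The halfword load variants described in this section can be modelled in a few lines. The sketch below reads two bytes from a byte string standing in for memory, assembles them in little-endian or big-endian order, and either zero-extends or sign-extends the result to 32 bits. It ignores alignment checking, and the function name `load_halfword` is hypothetical.

```python
def load_halfword(mem, addr, big_endian=False, signed=False):
    """Model a halfword load into a 32-bit register.

    mem is a bytes-like object; the result is returned as an unsigned
    32-bit integer, so a sign-extended value has its upper 16 bits set.
    """
    raw = bytes(mem[addr:addr + 2])
    if len(raw) != 2:
        raise IndexError("halfword access runs past the end of memory")
    order = "big" if big_endian else "little"
    return int.from_bytes(raw, order, signed=signed) & 0xFFFFFFFF
```

With memory holding the bytes 0x34 then 0x80, the little-endian halfword has bit 15 set and sign-extends, while the big-endian halfword does not.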

Page 294: ...th bits 7 0 written to the addressed byte in memory bits 15 8 to the incremental byte address in memory as Figure 4 8 shows Figure 4 8 Store halfword little endian If strict alignment fault checking is enabled and Address bit 0 is not zero then a Data Abort is generated and the MMU returns a Misaligned fault in the Fault Status Register 4 3 9 Store halfword big endian The low 16 bits of the genera...

Page 295: ...ant addressed byte in memory appears in bits 7 0 of the ARM register as Figure 4 10 shows Figure 4 10 Load word little endian If strict alignment fault checking is enabled and Address bits 1 0 are not zero then a Data Abort is generated and the MMU returns a Misaligned fault in the Fault Status Register 4 3 11 Load word big endian The addressed byte quad is loaded from memory into the 32 bit gener...

Page 296: ... transferred to the least significant addressed byte in memory as Figure 4 12 shows Figure 4 12 Store word little endian If strict alignment fault checking is enabled and Address bits 1 0 are not zero then a Data Abort is generated and the MMU returns a Misaligned fault in the Fault Status Register 4 3 13 Store word big endian The 32 bit general purpose register is stored to four bytes in memory w...

Page 297: ...n page 4 11 where the lowest two address bits are zeroed If strict alignment fault checking is enabled and effective Address bits 1 0 are not zero then a Data Abort is generated and the MMU returns an Alignment fault in the Fault Status Register 4 3 16 Store double store multiple store coprocessor little endian E 0 The access is treated as a series of incrementing aligned word stores to memory The...

Page 298: ...ed in Table 4 3 on page 4 14 are determined from the load store instruction that Table 4 2 lists The following terminology is used to describe the memory locations accessed Byte X This means the byte whose address is X in the current endianness model The correspondence between the endianness models is that Byte A in the LE endianness model Byte A in the BE 8 endianness model and Byte A EOR 3 in th...
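The byte correspondence between the endianness models can be expressed as a one-line address transformation. The sketch assumes that the third model in the truncated sentence is the legacy BE-32 scheme, where the byte seen at address A in the LE and BE-8 models appears at A EOR 3; the function name `be32_byte_address` is hypothetical.

```python
def be32_byte_address(addr):
    """Address in the (assumed) BE-32 model of the byte at addr in LE or BE-8."""
    return addr ^ 3  # flips the byte position within the aligned word
```

Applying the function twice returns the original address, because the mapping only swaps byte lanes within each aligned word.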

Page 299: ...e endianness model as the lowest word Table 4 3 Unalignment fault occurrence when access behavior is architecturally unpredictable A U Addr 2 0 Access types Architectural Behavior Memory accessed Note 0 0 Legacy no alignment 0 0 bxxx Byte BSync Normal Byte Addr 0 0 bxx0 Halfword Normal Halfword Addr 0 0 bxx1 Halfword Unpredictable Halfword Align16 Addr Operation unaffected by Addr 0 0 0 bxx0 HWSyn...

Page 300: ...r 0 1 bxx1 bx1x WSync Multi word Two word Alignment fault 0 1 b000 DWSync Normal Word Addr 0 1 bxx1 bx1x b1xx DWSync Alignment fault 1 x Full alignment faulting 1 x bxxx Byte BSync Normal Byte Addr 1 x bxx0 Halfword HWSync Normal Halfword Addr 1 x bxx1 Halfword HWSync Alignment fault 1 x bx00 WLoad WStore WSync Multi word Normal Word Addr 1 x bxx1 bx1x WLoad WStore WSync Multi word Alignment fault...
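For the full alignment faulting rows above (A set to 1), the fault condition reduces to a check of the low address bits against the access size: byte accesses never fault, halfword accesses fault when address bit 0 is set, and word accesses fault when either of address bits 1 to 0 is set. The sketch below models only those three sizes; doubleword and multi-word cases are not covered, and the function name `strict_alignment_fault` is hypothetical.

```python
def strict_alignment_fault(addr, size_bytes):
    """Return True if an access would raise an Alignment fault when A = 1.

    Only byte (1), halfword (2) and word (4) accesses are modelled.
    """
    if size_bytes not in (1, 2, 4):
        raise ValueError("only byte, halfword and word accesses are modelled")
    return (addr & (size_bytes - 1)) != 0
```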

Page 301: ...hem is Addr 1 0 b00 the effective address of the transfer has its two least significant bits forced to 0 if A is set to 0 and U is set to 0 Otherwise the behavior specified in Table 4 3 on page 4 14 is either Unpredictable or Alignment Fault regardless of the destination register Any WLoad WStore WSync Two word or Multi word instruction that accesses device memory has Addr 1 0 b00 and Table 4 3 on pa...

Page 302: ...ites the same word of data in memory when configured as either big endian or little endian For more information see Endianness on page 8 38 This behavior is still provided for legacy software when the U bit in CP15 Register c1 is zero as Table 4 4 lists 4 5 2 ARMv6 support for mixed endian data In ARMv6 the instruction and data endianness are separated instructions are fixed little endian data acc...

Page 303: ...cribes signed byte load as Load signed byte endian independent on page 4 6 describes byte store as Store byte endian independent on page 4 6 describes Halfword data access The same two physical bytes in memory are accessed whether big endian BE 8 or little endian Big endian halfword load data is byte reversed as read into the processor register to ensure little endian internal representation and s...

Page 304: ...NDINIT and UBITINIT pins that determine the values of the U B and EE bits at reset The pins determine the reset value of the B bit and both the Secure and Non secure reset values of the U and EE bits Table 4 5 Mixed endian configuration U B E Instruction endianness Data endianness Description 1 0 0 LE LE LE instructions little endian data load store Unaligned data access permitted 1 0 1 LE BE 8 LE...

Page 305: ... required The following new instructions are added to the ARM and Thumb instruction sets to provide this functionality reverse word 4 bytes register for transforming big and little endian 32 bit representations reverse halfword and sign extend for transforming signed 16 bit representations Reverse packed halfwords in a register for transforming big and little endian 16 bit representations ARM1176J...
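The three byte-reversal operations listed above can be modelled in portable C. This is a sketch of the data transformations only, not of the instructions themselves; the function names are illustrative and do not come from the manual.

```c
#include <stdint.h>

/* Reverse all four bytes of a 32-bit word
 * (big-endian <-> little-endian 32-bit representation). */
uint32_t rev_word(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) | (x << 24);
}

/* Reverse the two bytes inside each packed 16-bit halfword independently. */
uint32_t rev_packed_halfwords(uint32_t x)
{
    return ((x >> 8) & 0x00FF00FFu) | ((x << 8) & 0xFF00FF00u);
}

/* Reverse the bytes of the low halfword, then sign-extend to 32 bits. */
int32_t rev_halfword_sign_extend(uint32_t x)
{
    uint16_t swapped = (uint16_t)(((x & 0xFFu) << 8) | ((x >> 8) & 0xFFu));
    return (int32_t)(int16_t)swapped;
}
```

For example, `rev_word(0x11223344)` yields `0x44332211`, while `rev_packed_halfwords(0x11223344)` yields `0x22114433`.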

Page 306: ...cess 4.7 Instructions to change the CPSR E bit. ARM and Thumb instructions are provided to set and clear the E bit efficiently: SETEND BE sets the CPSR E bit; SETEND LE resets the CPSR E bit. These are specified as unconditional operations to minimize pipelined implementation complexity. ARM1176JZF-S instruction set summary on page 1-32 describes these instructions ...

Page 307: ...he strategies used for determining if a branch is likely to be taken or not It also describes the two architecturally defined SVC functions required for backwards compatibility with earlier architectures for flushing the Prefetch Unit PU buffers It contains the following sections About program flow prediction on page 5 2 Branch prediction on page 5 4 Return stack on page 5 7 Memory Barriers on pag...

Page 308: ...edures up to three deep The integer core includes a Static Branch Predictor SBP a Return Stack RS branch resolution logic a BTAC update interface to the PU a BTAC allocate interface to the PU The processor PU is responsible for fetching instructions from the memory system as required by the integer core and coprocessors The PU buffers up to seven instructions in its FIFO to detect branch instructi...

Page 309: ... of the integer core That is it performs prefetches in ARM state Thumb state and Jazelle state However the rate at which the PU is drained is state dependent and the functioning of the branch prediction hardware is a function of the state Branch prediction is performed in all three states but branch folding operates only in ARM and Thumb states The PU is responsible for fetching the instruction st...

Page 310: ...e stack pointer and Moves or ALU operations to the PC derived from R14 the Link Register In these cases if the calling operation can also be identified the likely return address can be stored in a hardware implemented stack termed a Return Stack RS Typical calling operations are BL and BLX instructions In addition Moves or ALU operations to the Link Register from the PC are often preludes to a bra...

Page 311: ...on The scheme used in the ARM1176JZF-S processor predicts that all forward conditional branches are not taken and all backward branches are taken. Around 65% of all branches are preceded by enough non-branch cycles to be completely predicted. Branch prediction is performed only when the Z bit in CP15 Register c1 is set to 1. See c1, Control Register on page 3-44 for details of this register. Dynamic pre...
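The static rule described above (forward conditional branches predicted not taken, backward branches predicted taken) reduces to a comparison of addresses. The sketch below is an illustration with a hypothetical function name; treating a branch to its own address as backward is an assumption, not something the snippet states.

```c
#include <stdbool.h>
#include <stdint.h>

/* Static prediction for a conditional branch:
 * backward (target at or below the branch address) -> predicted taken,
 * forward  (target above the branch address)       -> predicted not taken.
 * Counting a branch-to-self as backward is an assumption. */
bool static_predict_taken(uint32_t branch_addr, uint32_t target_addr)
{
    return target_addr <= branch_addr;
}
```

Loop-closing branches jump backward, so this rule predicts them taken, which is correct on every iteration except the last.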

Page 312: ... stream to be fetched If branch folding is implemented the failure of the condition codes of a folded branch causes the instruction that follows the folded branch to fail Whenever a potentially incorrect prediction is made the following information necessary for recovering from the error is stored a fall through address in the case of a predicted taken branch instruction the branch target address ...

Page 313: ...pipeline and pushed onto the return stack The instructions recognized as procedure calls are BL dest BLX dest BLX reg The first two instructions are predicted by the BTAC unless they result in a BTAC miss The third instruction is not predicted The SBP predicts unconditional procedure calls as taken and conditional procedure calls as not taken When a procedure return instruction is predicted an ins...

Page 314: ...xecute reliably This sequence is called an Instruction Memory Barrier IMB and might depend both on the ARM processor implementation and on the memory system implementation The IMB sequence must be executed after the new instructions have been stored to memory and before they are executed for example after a program has been loaded and before its entry point is branched to Any self modifying code s...

Page 315: ...art_addr inclusive to end_addr exclusive When the standard ARM Procedure Call Standard is used this means that start_addr is passed in R0 and end_addr in R1 The execution time cost of an IMB can be very large many thousands of clock cycles even when a small address range is specified For small scale uses of self modifying code this is likely to lead to a major loss of performance It is therefore r...

Page 316: ...ed area of code is small even if there is no distinction between it and the IMB instruction on ARM1176JZF S processors Future processors might implement the IMBRange instruction in a more efficient and faster manner and code migrated from the ARM1176JZF S core is likely to benefit when executed on these processors ARM1176JZF S processors implement a Flush Prefetch Buffer operation that is user acc...

Page 317: ...g BitBlt code IMBRange EQU 0xF00001 code that constructs loop code load R0 with the start address of the constructed loop load R1 with the end address of the constructed loop SVC IMBRange branch to IMBRange service routine read registers R0 and R1 to set up address range parameters perform processor specific operations to execute IMBRange within address range return to code start of loop code When...

Page 318: ...bout the MMU on page 6 2 TLB organization on page 6 4 Memory access sequence on page 6 7 Enabling and disabling the MMU on page 6 9 Memory access control on page 6 11 Memory region attributes on page 6 14 Memory attributes and types on page 6 20 MMU aborts on page 6 27 MMU fault checking on page 6 29 Fault status and address on page 6 34 Hardware page table translation on page 6 36 MMU descriptors...

Page 319: ... can specify access permissions for 64KB large pages and 4KB small pages separately for each quarter of the page these quarters are called subpages 16 domains one 64 entry unified TLB and a lockdown region of eight entries you can mark entries as a global mapping or associated with a specific application space identifier to eliminate the requirement for TLB flushes on most context switches access ...

Page 320: ...cess control on page 6 11 for more details Memory region attributes These describe properties of a memory region Examples include Strongly Ordered Device cacheable Write Through and cacheable Write Back If an entry for a virtual address is not found in a TLB then a set of translation tables in memory are automatically searched by hardware to create a TLB entry This process is known as a translatio...

Page 321: ...hatever the value of the NS bit in the CP15 SCR register or in any Secure mode NS bit in CP15 SCR 0 The MicroTLB returns the physical address to the cache for the address comparison and also checks the protection attributes in sufficient time to signal a Data Abort in the DC2 cycle An additional set of attributes to be used by the cache line miss handler are provided by the MicroTLB The timing req...

Page 322: ...s the creation of new NS entries in the Lockdown region The TL bit has no influence on the Read Write Lockdown entry operations VA PA or Attributes in the system control coprocessor see c15 TLB lockdown access registers on page 3 149 When the TL bit is set the processor can write an NS entry in the Lockdown region with the Write Lockdown operation of the system control coprocessor A low associativ...

Page 323: ...iptor in the page tables, similar to the way a Section is defined. Because each first-level page table entry covers a 1MB region of virtual memory, the 16MB supersections require that 16 identical copies of the first-level descriptor of the supersection exist in the first-level page table. Every supersection is defined to have its Domain as 0. Supersections can be specified regardless of whether subpag...
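The requirement for 16 identical first-level entries per 16MB supersection can be sketched as follows. The function name and the in-memory table layout (a 4096-entry array of 32-bit descriptors, one per 1MB of virtual address space) are assumptions for illustration; the descriptor value is passed in opaque, because its bit layout is not given in this snippet.

```c
#include <stddef.h>
#include <stdint.h>

#define SUPERSECTION_COPIES 16u   /* 16MB supersection / 1MB per entry */

/* Hypothetical helper: write the same supersection descriptor into the
 * 16 consecutive first-level entries that cover the 16MB region
 * containing `va`. `l1_table` is assumed to hold 4096 entries. */
void install_supersection(uint32_t *l1_table, uint32_t va, uint32_t descriptor)
{
    /* First-level index is VA[31:20]; round down to a 16-entry boundary. */
    size_t first = (size_t)(va >> 20) & ~(size_t)(SUPERSECTION_COPIES - 1u);

    for (size_t i = 0; i < SUPERSECTION_COPIES; i++)
        l1_table[first + i] = descriptor;
}
```

Writing all 16 copies means a lookup on any 1MB slice of the region finds the same descriptor.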

Page 324: ...and if it is shared as Memory region attributes on page 6 14 describes 3 The physical address is used for any access to external or tightly coupled memory to perform Tag matching for cache entries 6 3 1 TLB match process Each TLB entry contains a virtual address a page size a physical address and a set of memory properties Each is marked as being associated with a particular application space or a...

Page 325: ... bits in the TTB Control register and a mapping is placed in the TLB See Hardware page table translation on page 6 36 for more details 6 3 2 Virtual to physical translation mapping restrictions You can use the processor MMU architecture in conjunction with virtually indexed physically tagged caches For details of any mapping page table restrictions for virtual to physical addresses see Restriction...

Page 326: ...esponding world before or at the same time as disabling the MMU Note If the MMU is enabled then disabled and subsequently re enabled in the same world the contents of the TLBs for this world are preserved If these are now invalid you must invalidate the TLBs in the corresponding world before you re enable the MMU see c8 TLB Operations Register on page 3 86 2 Clear bit 0 to 0 in the CP15 Control Re...

Page 327: ...med and no aborts are generated by the MMU The physical address for every access is equal to its virtual address This is known as a flat address mapping The NS attribute for the target memory region is equal to the state Secure or Non secure of the request that is Secure requests are considered to target Secure memory The FCSE PID Should Be Zero when the MMU is disabled This is the reset value of ...

Page 328: ...upported Clients Clients are users of domains in that they execute programs and access data They are guarded by the access permissions of the TLB entries for that domain A client is a domain user and each access has to be checked against the access permission settings for each memory block and the system protection bit the S bit and the ROM protection bit the R bit in CP15 Control Register c1 Tabl...

Page 329: ...el access In this case the AP 0 bit provides Access bit information so that software can optimize the memory management algorithm The Access bit behaves in this way except in the deprecated case that uses the S and R bits that is when the S and R bits have opposite values and when APX and AP 1 0 b000 6 5 3 Execute never bits in the TLB entry Each memory region can be tagged as not containing execu...

Page 330: ...descriptors do not contain the XN bit and all pages are executable. In ARMv6 mode (XP bit = 1), the descriptors specify the XN attribute, see Figure 6-7 on page 6-39 and Figure 6-8 on page 6-40 ...

Page 331: ...onfiguration the Shared bit can be remapped too For TrustZone support the TEX remap bit is duplicated as Secure and Non secure versions so it is possible to configure in each world the options that are available to the core The TLB does not cache the effect of the TEX remap bit on page tables As a result there is no requirement for the processor to invalidate the TLB on a change of the TEX remap b...

Page 332: ...s used in page table formats Page table encodings Description Memory type Page shareable TEX C B b000 0 0 Strongly Ordered Strongly Ordered Shareda a Shared regardless of the value of the S bit in the page table b000 0 1 Shared Device Device Shareda b000 1 0 Outer and Inner Write Through No Allocate on Write Normal sb b s is Shared if the value of the S bit in the page table is 1 or Non shared if ...

Page 333: ... Instruction Cache is off see Behavior with MMU disabled on page 6 9 TexRemap 1 configuration Only three bits TEX 0 C and B are relevant in this configuration The OS can use the TEX 2 1 bits to manage the page tables In this configuration the processor provides the OS with a remap capability for the memory attribute Two CP15 registers the Primary Region Remap Register PRRR and the Normal Memory Re...

Page 334: ...ty has two levels 1 The first level the Primary Region Remap enables remap of the primary memory type Normal Device or Strongly Ordered See Table 6 5 2 After primary remapping any region remapped as Normal memory has the Inner and Outer cacheable attributes remapped by the Normal Memory Region Remap register See Table 6 5 To provide maximum flexibility this level of remapping permits regions that ...

Page 335: ...s 1 prior to remapping This behavior takes place regardless of whether or not the instruction cache is enabled Note The reset value for each field of the PRRR and NMRR makes the MMU behave as if no remapping occurs that is Strongly Ordered regions are remapped as Strongly Ordered and so on For security reasons the NS Attribute bit has no remap capability Table 6 6 Values that remap the shareable a...

Page 336: ...figured as Non secure If the access goes external to the core then it is marked as Non secure with AxPROT 1 Non secure The NS Attribute is specified in the L1 descriptors in position 19 for sections and supersections and in position 3 for coarse pages The bit contained in the NS descriptors is always ignored so that all NS entries in the TLB that is entries with NSTID 1 Non secure have the NS Attr...

Page 337: ...can be writable or read only For writable normal memory unless there is a change to the physical address mapping a load from a specific location returns the most recently stored data at that location for the same processor two loads from a specific location without a store in between return the same data for each load For read only normal memory two loads from a specific location return the same d...

Page 338: ...The processor does not cache shareable locations at level one In systems that implement a TCM the regions of memory covered by the TCM must not be marked as Shared The attributes for these regions are remapped to Inner and Outer Write Back Non Shared Writes to Shared Normal memory might not be atomic That is all observers might not see the writes occurring at the same time To preserve coherence wh...

Page 339: ...errupt to enable the interrupt to abandon a slow access You must ensure these optimizations are not performed on regions of memory marked as Device If a memory operation that causes multiple transactions such as an LDM or an unaligned memory access crosses a 4KB address boundary then it can perform more accesses than are specified by the program regardless of one or both of the areas being marked ...

Page 340: ...d in ARMv6 Programs must not rely on this behavior but instead include an explicit Memory Barrier between the memory access and the following instruction See Explicit Memory Barriers on page 6 25 The processor does not require an explicit memory barrier in this situation but for future compatibility it is recommended that programmers insert a memory barrier Explicit accesses from the processor to ...

Page 341: ...es The program order of instruction execution is defined as the order of the instructions in the control flow trace Two explicit memory accesses in an execution can either be Ordered Denoted by If the accesses are Ordered then they must occur strictly in order Weakly Ordered Denoted by If the accesses are Weakly Ordered then they must occur in order or simultaneously The rules for determining this...

Page 342: ...7 For details on how to use this register see c7 Cache operations on page 3 69 For more information on explicit memory barriers see the ARM Architecture Reference Manual Data Memory Barrier This memory barrier ensures that all explicit memory transactions occurring in program order before this instruction are completed No explicit memory transactions occurring in program order after this instructi...

Page 343: ...predictor 6 7 6 Backwards compatibility The ARMv6 memory attributes are significantly different from those in previous versions of the architecture Table 6 10 lists the interpretation of the earlier memory types in the light of this definition Table 6 10 Memory region backwards compatibility Previous architectures ARMv6 attribute NCNB Noncacheable Non Bufferable Strongly Ordered NCB Noncacheable B...

Page 344: ...on imprecise Data Aborts For all prefetch aborts the processor updates the Instruction Fault Address Register IFAR with the address of the instruction that causes the abort When the EA bit is set see c1 Secure Configuration Register on page 3 52 all external aborts are trapped to the Secure Monitor mode and only the Secure versions of the FSR and FAR registers are updated In all other cases the FA...

Page 345: ...s of an AXI burst External abort on VA to PA translation operation For VA to PA translation operations the only case when an external abort can be asserted is during the page table walk In this case the external abort is precise and both the DFSR and the FAR are updated in the world Secure or Non secure that generated the VA to PA translation operation This is in addition to the standard abort mec...

Page 346: ...This bit is duplicated in the Secure and Non secure worlds for the support of TrustZone Alignment fault checking is independent of the MMU being enabled Translation Access bit domain and permission faults are only generated when the MMU is enabled The access control mechanisms of the MMU detect the conditions that produce these faults If a fault is detected as the result of a memory access the MMU...

Page 347: ...nslation table managed TLB modes Figure 6 2 Translation table managed TLB fault checking sequence part 1 Virtual address Check address alignment No Yes No Alignment fault Yes Checking alignment Misaligned Get first level descriptor PTW disabled Section translation fault Yes No A No Translation external abort first level Section Page access flag fault Section Page translation abort External abort D...

Page 348: ...cess Client Access type Manager Condition is MMU on Strongly ordered or Device Unaligned access Section A Section or page Get second level descriptor Check domain Check domain Translation external abort 2nd level No External abort Yes Page translation fault Page access bit fault Yes Yes No Page Alignment fault Yes Check access permissions No Condition true Alignment fault Yes No Condition true Phy...

Page 349: ...bit fault When the Force AP bit see c1 Control Register on page 3 44 bit 29 is set then AP 0 indicates if there is an Access Bit Fault This bit is only taken into account when the MMU is in ARMv6 mode that is XP 1 bit 23 in the CP15 Control register In the configuration XP 1 and ForceAP 1 the OS uses only bits APX and AP 1 as Access Permission bits and AP 0 becomes an Access Bit see Access permiss...

Page 350: ...ain in CP15 c3, the Domain Access Control Register, to select. If the selected domain has bit 0 set to 0, indicating either no access or reserved, then a domain fault occurs. 6.9.6 Permission fault: If the two-bit domain field returns Client, the access permission check is performed on the access permission field in the TLB entry. A permission fault occurs if the access permission check fails. 6.9.7 Debug e...
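The domain check described above can be sketched in C. The helper names are hypothetical. The sketch assumes the usual ARMv6 layout of the Domain Access Control Register, 16 two-bit fields with domain 0 in bits [1:0], and the encodings b00 No access, b01 Client, b10 Reserved, b11 Manager.

```c
#include <stdbool.h>
#include <stdint.h>

enum domain_access {
    DOMAIN_NO_ACCESS = 0,  /* b00 */
    DOMAIN_CLIENT    = 1,  /* b01 */
    DOMAIN_RESERVED  = 2,  /* b10 */
    DOMAIN_MANAGER   = 3   /* b11 */
};

/* Extract the two-bit field for `domain` (0..15) from a DACR value. */
unsigned domain_field(uint32_t dacr, unsigned domain)
{
    return (dacr >> (2u * (domain & 15u))) & 3u;
}

/* A domain fault occurs when bit 0 of the field is 0 (no access or reserved). */
bool domain_fault(uint32_t dacr, unsigned domain)
{
    return (domain_field(dacr, domain) & 1u) == 0u;
}

/* Only Client accesses go on to the access permission check. */
bool needs_permission_check(uint32_t dacr, unsigned domain)
{
    return domain_field(dacr, domain) == DOMAIN_CLIENT;
}
```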

Page 351: ...d to determine the faulting address You can determine the domain information by performing a TLB lookup for the faulting address and extracting the domain field Table 6 12 on page 6 35 lists a summary of the abort vector that is taken and the Fault Status and Fault Address Registers that are updated for each abort type Table 6 11 Fault Status Register encoding Priority Sources FSR 10 3 0 Domain FS...

Page 352: ...refetch Abort Yesa Yesa Yes No No No Instruction external abort Prefetch Abort Yesa Yesa Yes No No No Instruction cache maintenance operation Data Abort Yes Yes No Yes Yes No Data MMU fault Data Abort Yes No No Yes Yes No Data debug abort Data Abort No No No Yes Yes Yes Data external abort on translation Data Abort Yesa No No Yesa Yesa Noa Data external abort Data Abort Nob No No Yesa Yes No Data ...

Page 353: ...e walks do not cause a read from the level one Unified Data Cache or the TCM The P RGN S and C bits in the Translation Table Base Registers determine the memory region attributes for the page table walk Two formats of page tables are supported A backwards compatible format supporting subpage access permissions These have been extended so that certain page table entries support extended region type...

Page 354: ...arge and small pages there can be four subpages defined with different access permissions. For a large page, the subpage size is 16KB and is accessed using bits [15:14] of the page index of the virtual address. For a small page, the subpage size is 1KB and is accessed using bits [11:10] of the page index of the virtual address. The use of subpage AP bits where AP3, AP2, AP1, and AP0 contain different values i...
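Selecting a subpage from the virtual address, as described above, is a two-bit field extraction. The function names below are illustrative only; the returned value 0..3 would select AP0..AP3.

```c
#include <stdint.h>

/* 64KB large page: four 16KB subpages, selected by VA bits [15:14]. */
unsigned large_page_subpage(uint32_t va)
{
    return (va >> 14) & 3u;
}

/* 4KB small page: four 1KB subpages, selected by VA bits [11:10]. */
unsigned small_page_subpage(uint32_t va)
{
    return (va >> 10) & 3u;
}
```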

Page 355: ...compatible descriptors Figure 6 6 Backwards compatible section supersection and page translation SBZ TEX AP3 B B 1 Large page base address AP2 AP1 AP0 C 0 0 Small page base address AP3 AP2 AP1 AP0 C B 1 1 Extended small page base address TEX AP C 1 Translation fault Large page 64KB Small page 4KB 0 Ignored 31 16 15 12 11 10 9 8 7 6 5 4 3 2 1 0 0 Extended small page 4KB 16KB subpage Invalid Invalid...

Page 356: ...ss permission bit All ARMv6 page table mappings support the TEX field ARMv6 page table format With the sub pages enabled or not all first level descriptors have been enhanced with the addition of the NS Attribute bit to enable the support of TrustZone Figure 6 7 shows the format of an ARMv6 first level descriptor when subpages are disabled Figure 6 7 ARMv6 first level descriptor formats with subpa...

Page 357: ...igure 6 8 shows the format of an ARMv6 second level descriptors Figure 6 8 ARMv6 second level descriptor format As shown in Figure 6 8 bits 1 0 of a second level descriptor determine the type of the descriptor Bits 1 0 b00 Translation fault Bits 1 0 b01 The entry points to a 64KB Large page in memory Note You must repeat any Large page description in 16 consecutive page table locations with the fi...

Page 358: ...s problems For pages marked as Non Shared if bit 11 or bit 23 of the Cache Type Register is set the restriction applies to pages that remap virtual address bits 13 12 and might cause aliasing problems when 4KB pages are used To prevent this you must ensure the following restrictions are applied 1 If multiple virtual addresses are mapped onto the same physical address then for all mappings of bits ...

Page 359: ...is no restriction on the more significant bits in the virtual address equalling those in the physical address Avoiding the page coloring restriction The processor provides the ability to restrict the cache size to 16KB so that software does not have to support the page coloring restriction on mapping see CZ bit in c1 Auxiliary Control Register on page 3 48 Note Setting the CZ flag in the CP15 Auxi...

Page 360: ...del the virtual address space is divided into two regions: 0x0 to 1<<(32-N), that TTBR0 controls, and 1<<(32-N) to 4GB, that TTBR1 controls. The value of N is set in the TTBCR. If N is zero then TTBR0 is used for all addresses, and that gives legacy v5 behavior. If N is not zero, the OS and memory-mapped I/O are located in the upper part of the memory map (TTBR1), and the tasks or processes all occupy the same virtual address...
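The split of the virtual address space between the two translation table base registers can be sketched as below. The function name is hypothetical, and `n` stands for the N value held in the TTBCR (assumed here to be in the range 0 to 7).

```c
#include <stdint.h>

/* Returns 0 if TTBR0 translates `va`, 1 if TTBR1 does.
 * With N == 0, TTBR0 covers the whole address space. */
int ttbr_for_address(uint32_t va, unsigned n)
{
    if (n == 0u)
        return 0;

    /* Boundary is 1 << (32 - N); computed in 64 bits to avoid overflow. */
    uint64_t boundary = (uint64_t)1 << (32u - n);
    return ((uint64_t)va < boundary) ? 0 : 1;
}
```

With N = 1 the boundary is 0x80000000, so the lower 2GB uses TTBR0 and the upper 2GB uses TTBR1.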

Page 361: ... can be used to prevent pagetable walks from either TTBR In particular disabling walks from TTBR1 and setting TTBR0 to the address of a truncated translation table can minimize the overhead otherwise incurred in unused translation table entries Figure 6 10 Creating a first level descriptor address Translation base 31 14 N 13 N 3 2 1 0 P S C First level table index 32 N 20 19 0 Translation table ba...
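The first-level descriptor address construction in Figure 6-10 can be sketched in C. This is a reading of the flattened figure text and should be treated as an assumption. It takes the translation base from TTBR0 bits [31:14-N], places the first-level table index, VA[31-N:20], in bits [13-N:2], and sets bits [1:0] to zero. The function name is hypothetical.

```c
#include <stdint.h>

/* Assumed layout (from the flattened figure):
 *   address[31:14-N] = TTBR0[31:14-N]   translation base
 *   address[13-N:2]  = VA[31-N:20]      first-level table index
 *   address[1:0]     = 0
 * `n` is the TTBCR N value, assumed 0..7. */
uint32_t l1_descriptor_addr(uint32_t ttbr0, uint32_t va, unsigned n)
{
    uint32_t base_mask = ~((1u << (14u - n)) - 1u);  /* keep bits [31:14-N] */
    uint32_t index     = (va << n) >> (n + 20u);     /* VA[31-N:20]         */

    return (ttbr0 & base_mask) | (index << 2);
}
```

A larger N shortens the index field, which is why a truncated translation table can be used with TTBR0.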

Page 362: ...e see MMU fault checking on page 6 29 If the first level descriptor describes a section or supersection when the Force AP bit is set and the MMU is in ARMv6 mode Access bit faults might be generated if AP 0 0 First level page table address If bits 1 0 of the first level descriptor are b01 then a page table walk is required Second level page table walk on page 6 47 describes this process First leve...

Page 363: ...ation for a 1MB section backwards compatible format 0 S n G A P X TEX 0 Section base address 31 20 19 12 11 10 9 8 5 4 3 2 1 0 N S AP P Domain X N C B 1 First level table index 31 20 19 0 Section index Translation base 31 14 13 0 0 Translation base 31 14 13 0 First level table index 0 2 1 Physical address First level descriptor First level descriptor address Modified virtual address Translation ta...

Page 364: ...00 then a translation fault is generated This generates an abort to the processor either a Prefetch Abort for the instruction side or a Data Abort for the data side see MMU fault checking on page 6 29 N S 1 Coarse page table base address 31 10 9 8 5 4 2 1 0 P Domain 0 First level table index 31 20 19 12 11 0 Second level table index Translation base 31 14 13 0 0 Coarse page table base address 31 1...

Page 365: ...k ARMv6 format Figure 6 15 on page 6 49 shows the translation process for a 64KB large page or a 16KB large page subpage using backwards compatible format AP bits enabled N S X N S TEX 1 Coarse page table base address 31 10 9 8 5 4 2 1 0 P Domain 0 First level table index 31 20 19 12 11 0 Page index Translation base 31 14 13 0 1 Page base address 31 12 11 10 9 8 6 5 4 3 2 1 0 n G A P X SBZ AP C B ...

Page 366: ...atible format then a small page table walk is required Figure 6 16 on page 6 50 shows the translation process for a 4KB small page or a 1KB small page subpage using backwards compatible format descriptors AP bits enabled N S 0 TEX 1 Coarse page table base address 31 10 9 8 5 4 2 1 0 P Domain 0 First level table index 31 20 19 12 11 0 Page index Translation base 31 14 13 0 1 Page base address 31 12...

Page 367: ...Mv6 format descriptors or b11 for backwards compatible descriptors then an extended small page table walk is required Figure 6 17 on page 6 51 shows the translation process for a 4KB extended small page using ARMv6 format descriptors AP bits disabled N S 1 Coarse page table base address 31 10 9 8 5 4 2 1 0 P Domain 0 First level table index 31 20 19 12 11 0 Second level table index Page index Tran...

Page 368: ...e table base address 31 10 9 8 5 4 3 2 1 0 P Domain 0 First level table index 31 20 19 12 11 0 Second level table index Page index Translation base 31 14 13 0 X N Extended small page base address 31 12 11 10 9 8 6 5 4 3 2 1 0 n G A P X TEX AP C B 1 0 Coarse page table base address 31 10 9 2 1 0 Second level table index 0 0 Translation base 31 14 13 0 First level table index 0 2 1 Page index Page b...

Page 369: ...l page subpages The subpage access permission bits are chosen using the virtual address bits 11 10 N S 1 Coarse page table base address 31 10 9 8 5 4 2 1 0 P Domain 0 First level table index 31 20 19 12 11 0 Second level table index Page index Translation base 31 14 13 0 1 Extended small page base address 31 12 11 9 8 6 5 4 3 2 1 0 SBZ TEX AP C B 1 0 Coarse page table base address 31 10 9 2 1 0 Se...

Page 370: ...Control Register c2 Translation Table Base Control Register on page 3 60 Domain Access Control Register c3 Domain Access Control Register on page 3 63 Data Fault Status Register DFSR c5 Data Fault Status Register on page 3 64 Instruction Fault Status Register IFSR c5 Instruction Fault Status Register on page 3 66 Fault Address Register FAR c6 Fault Address Register on page 3 68 and MMU fault check...

Page 371: ...ol coprocessor, CP14, also influences the MMU when in Debug state. Table 6-17 lists the registers that affect the MMU. Table 6-17 CP14 register functions (Register: Cross reference). Debug State MMU Control Register: CP14 c11, Debug State MMU Control Register on page 13-23. Debug State Cache Control Register: CP14 c10, Debug State Cache Control Register on page 13-23 ...

Page 372: ...pter 7 Level One Memory System. This chapter describes the processor level one memory system. It contains the following sections: About the level one memory system on page 7-2; Cache organization on page 7-3; Tightly-coupled memory on page 7-7; DMA on page 7-10; TCM and cache interactions on page 7-12; Write buffer on page 7-16 ...

Page 373: ...ce anywhere in the physical address map and does not have to be backed by memory implemented externally The Instruction and Data TCMs have separate base addresses A DMA mechanism can access TCMs and this enables loads from or stores to another location in memory while the processor core is running The MMU provides the facilities required by sophisticated operating systems to deliver protected virt...

Page 374: ...ty bits for the Data Cache are updated when the Data Cache Write Buffer data is written to the RAM This requires the dirty bits to be held as a separate storage array Significantly the Tag arrays cannot be written because the arrays are not accessed during the data RAM writes but permits the dirty bits to be implemented as a small RAM The other main operations performed by the cache are cache line...

Page 375: ...he replacement algorithm when all cache lines are valid If one or more lines is invalid then the invalid cache line with the lowest way number is allocated to in preference to replacing a valid cache line This mechanism does not allocate to locked cache ways unless all cache ways are locked See Cache miss handling when all ways are locked down on page 7 6 Cache lines can contain either Secure or N...

Page 376: ...ress from the MicroTLB The processor also compares the NS Tag that the processor stores in the Tag RAMs along with the physical address with the NS attribute from the MicroTLB Both comparisons form hit signals for each of the cache ways 4 The hit signals are used to select the data from the cache way that has a hit Any bytes contained in both the data RAMs and the Write Buffer entries are taken fr...

Page 377: ...t to the Data Caches is eight bytes per cycle and supports streaming. Cache miss handling when all ways are locked down: The ARM architecture describes the behavior of the cache as being Unpredictable when all ways in the cache are locked down. However, for ARM1176JZF-S processors a cache miss is serviced as if Way 0 is not locked. 7.2.5 Cache disabled behavior: If the cache is disabled then the cache i...
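The allocation rules from the two preceding snippets can be combined into one sketch: prefer the lowest-numbered invalid way, skip locked ways, and treat Way 0 as unlocked when every way is locked. The function name, the bitmask representation, and the `replacement_choice` parameter are assumptions. That parameter stands in for the replacement algorithm used when all lines are valid, which is not modelled here.

```c
#include <stdint.h>

/* Bit w of `valid_mask` / `locked_mask` describes way w (ways <= 32).
 * `replacement_choice` is whatever way the replacement algorithm would
 * pick among unlocked ways when no invalid unlocked way exists. */
int choose_allocation_way(unsigned ways, uint32_t valid_mask,
                          uint32_t locked_mask, int replacement_choice)
{
    uint32_t all = (ways >= 32u) ? 0xFFFFFFFFu : ((1u << ways) - 1u);

    /* All ways locked: the miss is serviced as if Way 0 is not locked,
     * so Way 0 is the only candidate. */
    if ((locked_mask & all) == all)
        return 0;

    /* Lowest-numbered way that is both invalid and unlocked. */
    for (unsigned w = 0; w < ways; w++) {
        uint32_t bit = 1u << w;
        if (!(valid_mask & bit) && !(locked_mask & bit))
            return (int)w;
    }

    return replacement_choice;
}
```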

Page 378: ... one single RAM This RAM then has a size in the 0 64 KB range The lower part of the RAM corresponds to the TCM called TCM0 and the upper part corresponds to TCM1 You can also configure each individual TCM to contain Secure or Non secure data You make this configuration in CP15 register c9 accessible in Secure state only See c9 Data TCM Non secure Control Access Register on page 3 93 and c9 Instruc...

Page 379: ...d The TCM is used as part of the physical memory map of the system and is not backed by a level of external memory with the same physical addresses For this reason the TCM behaves differently from the caches for regions of memory that are marked as being Write Through Cacheable In such regions no external writes occur in the event of a write to memory locations contained in the TCM 7 3 2 Restricti...

Page 380: ...n on page table attributes. The page table entries that describe areas of memory that are handled by the TCM are remapped to normal, non-cacheable, non-shared type. If the page table entry covers a region larger than the size of the TCM, then the attributes are ignored for the TCM region but still apply to the rest of the region covered by the page table entry ...

Page 381: ...ccess is reset, the DMA channel accesses external Secure memory. If the NS attribute is set, the DMA channel accesses external Non-secure memory. For internal access, the page descriptor selects the TCM, and the DMA performs a security permission check before accessing the TCM. The process specifies the internal start and end addresses and external start address, together with the direction of the DMA. The...

Page 382: ...ling Shared memory regions must be used if the external addresses being accessed by the level one DMA system are also accessed by the rest of the processor. Memory attributes and types on page 6-20 describes these. If a User mode DMA transfer is performed using an external address that is not marked as Shared, an error is signaled by the DMA channel. There is no ordering requirement of memory accesses...

Page 383: ...s no security consideration is necessary because there cannot be a conflict between accesses targeting Secure and Non-secure memory. Any cache line or TCM data is marked as being Secure or Non-secure, and no Unpredictable situations can result from this. 7.5.1 Overlapping between TCM regions: Where TCM regions overlap, the access priority is worked out using these rules, starting with the highest priori...

Page 384: ...riting the data. This delay enables the Instruction TCM access to be scheduled to take place only when the presence of a hit to the Instruction TCM is known. This saves power and avoids unnecessary delays being inserted into the instruction fetch side. This delay is applied to all accesses in a multiple operation, in the case of an LDM, an LDCL, an STM, or an STCL. Literal pool accesses: It can take 5 12 c...

Page 385: ... the same base address as a Data TCM and, if the RAMs are of different sizes, the regions in physical memory of the two RAMs must not be overlapped because the resulting behavior is architecturally Unpredictable. If an access is made to a location that is covered by both an Instruction TCM and a Data TCM, the access is only to the Data TCM. Table 7-4 summarizes the results of data accesses to TCM and t...

Page 386: ...che fill even if marked Cacheable Write to Instruction TCM No write to level two even if marked as Write Through Miss Miss Miss If Cacheable and cache enabled cache linefill If Noncacheable or cache disabled read to level two Write to level two a Excludes unexpected hit Table 7 4 Summary of data accesses to TCM and caches continued Data TCM Data cache Instruction TCMa Read behavior Write behavior ...

Page 387: ...e multiple. This reduces the number of address entries that must be stored in the Write buffer. In addition to this, a separate FIFO of Write-Back addresses and data words is implemented. Having a separate structure avoids complications associated with performing an external write while the write-through is being handled. The address of a new read access is compared against the addresses in the Write bu...

Page 388: ...terface to memory and peripherals. This chapter describes the features of the level two interface not covered in the AMBA AXI Protocol Specification. The chapter contains the following sections: About the level two interface on page 8-2; Synchronization primitives on page 8-6; AXI control signals in the processor on page 8-8; Instruction Fetch Interface transfers on page 8-14; Data Read/Write Interface t...

Page 389: ...tions, giving the potential for high performance from level two memory systems that support parallelism, and also for high utilization of pipelined memories such as SDRAM. No outstanding accesses are issued on the DMA port. The DMA port can issue bursts of 32-bit or 64-bit data when the address is correctly aligned. The data read/write port can issue outstanding accesses. The maximum number of outstandi...

Page 390: ...on side controller contains some buffering. Instruction Fetch Interface: The Instruction Fetch Interface is a read-only interface that services the Instruction Cache on cache misses, including the fetching of instructions for the PU that are held in memory marked as Noncacheable. The interface is optimized for cache linefills rather than individual requests. 8.1.3 Level two data side controller: The lev...

Page 391: ... Watchdog Timer. Accesses to regions of memory that are marked as Device and Non-Shared are routed to the Peripheral Interface in preference to the Data Read/Write Interface. Instruction and DMA accesses are not routed to the Peripheral port. Unaligned accesses and exclusive accesses are not supported by the peripheral port because they are not supported in Device memory. The order that accesses are p...

Page 392: ...tem for writing and reading the TCMs. Although the DMA Interface is bidirectional, it is able to produce a stream of successive accesses that are in the same direction, followed by either an extra stream in the same direction or a stream in the opposite direction. Correspondingly, the direction turnaround is not significantly optimized. The size of the transfer is given in the parameters of the transfer...

Page 393: ...clusive, this does not clear the tag. However, if the region has been marked by another processor, an STR clears the tag. Other events might cause the tag to be cleared. In particular, for memory regions that are not shared, it is system dependent whether a store by another processor to a tagged physical address causes the tag to be cleared. An external abort on either a load-exclusive or store-exclusive ...

Page 394: ...return value is 0, otherwise it is 1. In both cases, the physical address is no longer tagged as exclusive access for any processor. 8.2.3 Example of LDREX and STREX usage: This is an example of typical usage. Suppose you are trying to claim a lock. Lock address: LockAddr; Lock free: 0x00; Lock taken: 0xFF. MOV R1, #0xFF ; load the lock taken value. try LDREX R0, [LockAddr] ; load the lock value. CMP R0, #0 ; is the lock fre...
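The tagging behavior described above can be sketched as a small software model. This is only an illustration under stated assumptions: the class `ExclusiveMonitor` and the helper `try_claim` are hypothetical names, the model tracks tags per exact address rather than per tagged region, and it makes a single claim attempt instead of reproducing the truncated assembly loop. The lock values 0x00 and 0xFF and the STREX return values (0 on success, 1 on failure) come from the excerpt.

```python
LOCK_FREE = 0x00   # value from the excerpt: lock free
LOCK_TAKEN = 0xFF  # value from the excerpt: lock taken


class ExclusiveMonitor:
    """Toy model of exclusive-access tagging (hypothetical, not the hardware)."""

    def __init__(self):
        self.memory = {}  # address -> stored value
        self.tags = {}    # address -> set of processor ids holding a tag

    def ldrex(self, cpu, addr):
        # A load-exclusive tags the address for the requesting processor.
        self.tags.setdefault(addr, set()).add(cpu)
        return self.memory.get(addr, LOCK_FREE)

    def strex(self, cpu, addr, value):
        # The store succeeds only if this processor still holds the tag.
        tagged = cpu in self.tags.get(addr, set())
        if tagged:
            self.memory[addr] = value
        # In both cases the address is no longer tagged for any processor.
        self.tags.pop(addr, None)
        return 0 if tagged else 1


def try_claim(monitor, cpu, addr):
    """One attempt to claim the lock; True means the lock was taken."""
    if monitor.ldrex(cpu, addr) != LOCK_FREE:
        return False
    return monitor.strex(cpu, addr, LOCK_TAKEN) == 0
```

When two processors both load-exclusive the same address, only the first store-exclusive returns 0 in this model, because that store clears every tag on the address.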

Page 395: ... has an additional write response channel to enable the slave to signal to the master the completion of the write transaction. The AXI protocol permits address information to be issued ahead of the actual data transfer and enables support for multiple outstanding transactions in addition to out-of-order completion of transactions. Figure 8-2 shows how a read transaction uses the read address and rea...

Page 396: ...l The write address channel is used in every transaction and carries all the required write address and control information for that transaction. The AXI supports the following mechanisms: variable-length bursts, from 1 to 16 data transfers per burst; bursts with a transfer size of eight bits up to the maximum data bus width; wrapping, incrementing, and fixed-address bursts; atomic operations using exclus...

Page 397: ...The second character in the signal name indicates if the data direction is a read, R, or write, W. For example, AxSIZE[2:0] is called ARSIZEI[2:0] for reads in the Instruction Fetch Interface. 8.3.3 Address channel signals: The address channel control signals in the processor are: AxLEN[3:0]; AxSIZE[2:0] on page 8-11; AxBURST[1:0] on page 8-11; AxLOCK[1:0] on page 8-11; AxCACHE[3:0] on page 8-12; AxPROT[2:0] on page 8...

Page 398: ...K[1:0] signal indicates the lock type of access. The processor supports all locked type accesses. The instruction port only generates Normal access types. The DMA port only generates Normal access types. The Data Read/Write port generates all access types: Normal, exclusive, and locked access. Table 8-5 shows the values of AxLOCK that the processor supports. Table 8-3 AxSIZE[2:0] encoding: AxSIZE[2:0], Bytes in...
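As a rough illustration of how the AxSIZE[2:0] and AxLEN[3:0] fields are read, the sketch below uses the standard AXI encodings: a transfer carries 2^AxSIZE bytes, and a burst contains AxLEN + 1 transfers, which matches the 1 to 16 transfers per burst stated earlier. The function names are hypothetical, and the table excerpt itself is truncated, so treat the size mapping as taken from the AXI protocol rather than from this page.

```python
def bytes_per_transfer(axsize):
    """Decode AxSIZE[2:0]: each transfer carries 2**AxSIZE bytes."""
    if not 0 <= axsize <= 0b111:
        raise ValueError("AxSIZE is a 3-bit field")
    return 1 << axsize


def transfers_in_burst(axlen):
    """Decode AxLEN[3:0]: the burst contains AxLEN + 1 transfers (1 to 16)."""
    if not 0 <= axlen <= 0b1111:
        raise ValueError("AxLEN is a 4-bit field")
    return axlen + 1


def burst_bytes(axsize, axlen):
    """Total bytes moved by one burst."""
    return bytes_per_transfer(axsize) * transfers_in_burst(axlen)
```

For example, a burst of four 64-bit transfers (AxSIZE = b011, AxLEN = b0011) moves 32 bytes.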

Page 399: ...struction port are marked as instruction accesses, ARPROTI[2] = 1. Transactions from the DMA port are marked as instruction accesses, AxPROTD[2] = 1, if the transaction is to or from the Instruction TCM, and as data accesses, AxPROTD[2] = 0, for transfers to or from the Data TCM. Transactions on the peripheral and data read/write ports are marked as data accesses. Table 8-7 shows the supported values for AxPROT[2:0]...

Page 400: ...respondence between the ARSIDEBANDI[4:1] encoding and the TLB cacheable attributes for the Instruction port. These signals are not part of the AXI protocol and are added for additional information. Table 8-8 AxSIDEBAND[4:1] encoding (AxSIDEBAND[4:1]: Transaction attributes): b0000: Strongly ordered; b0001: Shared device or non-shared device; b0010: Inner noncacheable; b0110: Inner write-through, no allocate on wri...

Page 401: ...es Table 8-10 AXI signals for Cacheable fetches (Address[4:0]: ARADDRI, ARBURSTI, ARSIZEI, ARLENI): 0x00 (word 0): 0x00, Incr, 64-bit, 4 data transfers; 0x04 (word 1): 0x00, Incr, 64-bit, 4 data transfers; 0x08 (word 2): 0x08, Wrap, 64-bit, 4 data transfers; 0x0C (word 3): 0x08, Wrap, 64-bit, 4 data transfers; 0x10 (word 4): 0x10, Wrap, 64-bit, 4 data transfers; 0x14 (word 5): 0x10, Wrap, 64-bit, 4 data transfers; 0x18 (word 6): 0x18, Wrap, 64-bit, 4 d...
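The visible rows of Table 8-10 follow a simple pattern: the fetch address is rounded down to a doubleword boundary, and the burst is incrementing only when it starts at the beginning of the line. A minimal sketch of that pattern follows. The function name is hypothetical, and the rule is inferred from the rows shown (words 0 to 6); the row for word 7 is cut off in the excerpt, so its result here is an extrapolation.

```python
def cacheable_fetch_signals(address):
    """Return (ARADDRI[4:0], ARBURSTI, ARSIZEI, ARLENI) for a cacheable fetch.

    Inferred from the visible table rows, not taken from a stated rule.
    """
    offset = address & 0x1F         # Address[4:0] within the 32-byte line
    araddr = offset & ~0x7          # align down to a 64-bit doubleword
    burst = "Incr" if araddr == 0 else "Wrap"
    return araddr, burst, "64-bit", 4   # always four 64-bit data transfers
```

A fetch from word 3 (0x0C), for instance, yields a wrapping burst starting at 0x08, so the doubleword containing the requested word arrives first.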

Page 402: ...doubleword Write data doubleword Write Valid 1 Dirty 0 and data doubleword The linefill can only progress to attempt to write a doubleword if it does not contain dirty data This is determined in one of two ways if the victim cache line is not valid then there is no danger and the linefill progresses if the victim line is valid a signal encodes the doublewords that are clean either because they wer...

Page 403: ...fer 0x02 byte 2 0x02 Incr 8 bit 1 data transfer 0x03 byte 3 0x03 Incr 8 bit 1 data transfer 0x04 byte 4 0x04 Incr 8 bit 1 data transfer 0x05 byte 5 0x05 Incr 8 bit 1 data transfer 0x06 byte 6 0x06 Incr 8 bit 1 data transfer 0x07 byte 7 0x07 Incr 8 bit 1 data transfer Table 8 14 Noncacheable LDRH Address 4 0 ARADDRRW ARBURSTRW ARSIZERW ARLENRW 0x00 byte 0 0x00 Incr 16 bit 1 data transfer 0x01 byte ...

Page 404: ...2 bit 1 data transfer 0x04 Incr 8 bit 1 data transfer 0x02 byte 2 0x02 Incr 16 bit 1 data transfer 0x04 Incr 16 bit 1 data transfer 0x03 byte 3 0x03 Incr 8 bit 1 data transfer 0x04 Incr 32 bit 1 data transfer 0x04 byte 4 word 1 0x04 Incr 32 bit 1 data transfer 0x05 byte 5 0x05 Incr 32 bit 1 data transfer 0x08 Incr 8 bit 1 data transfer 0x06 byte 6 0x06 Incr 16 bit 1 data transfer 0x08 Incr 16 bit ...

Page 405: ...0 Operations 0x1C word 7 LDR from 0x1C LDR from 0x00 Table 8 18 Noncacheable LDM3 Strongly Ordered or Device memory Address 4 0 ARADDRRW ARBURSTRW ARSIZERW ARLENRW 0x00 word 0 0x00 Incr 32 bit 3 data transfers 0x04 word 1 0x04 Incr 32 bit 3 data transfers 0x08 word 2 0x08 Incr 32 bit 3 data transfers 0x0C word 3 0x0C Incr 32 bit 3 data transfers 0x10 word 4 0x10 Incr 32 bit 3 data transfers 0x14 w...

Page 406: ...ngly Ordered or Device memory Address 4 0 ARADDRRW ARBURSTRW ARSIZERW ARLENRW 0x00 word 0 0x00 Incr 64 bit 2 data transfers 0x04 word 1 0x04 Incr 32 bit 4 data transfers 0x08 word 2 0x08 Incr 64 bit 2 data transfers 0x0C word 3 0x0C Incr 32 bit 4 data transfers 0x10 word 4 0x10 Incr 64 bit 2 data transfers Table 8 22 Noncacheable LDM4 Noncacheable memory or cache disabled Address 4 0 ARADDRRW ARBU...

Page 407: ...x00 Incr 32 bit 5 data transfers 0x04 word 1 0x04 Incr 32 bit 5 data transfers 0x08 word 2 0x08 Incr 32 bit 5 data transfers 0x0C word 3 0x0C Incr 32 bit 5 data transfers Table 8 25 Noncacheable LDM5 Noncacheable memory or cache disabled Address 4 0 ARADDRRW ARBURSTRW ARSIZERW ARLENRW 0x00 word 0 0x00 Incr 64 bit 3 data transfers 0x04 word 1 0x04 Incr 64 bit 3 data transfers 0x08 word 2 0x08 Incr ...

Page 408: ... 0x08 Incr 64 bit 3 data transfers Table 8 29 Noncacheable LDM6 from word 3 4 5 6 or 7 Address 4 0 Operations 0x0C word 3 LDM5 from 0x0C LDR from 0x00 0x10 word 4 LDM4 from 0x10 LDM2 from 0x00 0x14 word 5 LDM3 from 0x14 LDM3 from 0x00 0x18 word 6 LDM2 from 0x18 LDM4 from 0x00 0x1C word 7 LDR from 0x1C LDM5 from 0x00 Table 8 30 Noncacheable LDM7 Strongly Ordered or Device memory Address 4 0 ARADDRR...

Page 409: ...from word 2 3 4 5 6 or 7 continued Address 4 0 Operations Table 8 33 Noncacheable LDM8 from word 0 Address 4 0 ARADDRRW ARBURSTRW ARSIZERW ARLENRW 0x00 word 0 0x00 Incr 64 bit 4 data transfers Table 8 34 Noncacheable LDM8 from word 1 2 3 4 5 6 or 7 Address 4 0 Operations 0x04 word 1 LDM7 from 0x04 LDR from 0x00 0x08 word 2 LDM6 from 0x08 LDM2 from 0x00 0x0C word 3 LDM5 from 0x0C LDM3 from 0x00 0x1...

Page 410: ...4 0 Operations: 0x00 (word 0): LDM8 from 0x00, LDM2 from 0x00; 0x04 (word 1): LDM7 from 0x04, LDM3 from 0x00; 0x08 (word 2): LDM6 from 0x08, LDM4 from 0x00; 0x0C (word 3): LDM5 from 0x0C, LDM5 from 0x00; 0x10 (word 4): LDM4 from 0x10, LDM6 from 0x00; 0x14 (word 5): LDM3 from 0x14, LDM7 from 0x00; 0x18 (word 6): LDM2 from 0x18, LDM8 from 0x00; 0x1C (word 7): LDR from 0x1C, LDM8 from 0x00, LDR from 0x00. Table 8-37 Noncacheable LDM11 Addres...
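The Operations column in these tables appears to follow one rule: a long load multiple is cut at each eight-word (32-byte) boundary, and a single-word piece is shown as LDR. The sketch below reproduces that column for the rows visible above. The function name `split_ldm` is hypothetical, the rule is inferred from the table rows rather than stated in the excerpt, and addresses are reported as Address[4:0], which is why every piece after the first starts "from 0x00".

```python
WORDS_PER_LINE = 8  # eight 32-bit words per 32-byte boundary (inferred)


def split_ldm(num_words, start_word):
    """Split an LDM of num_words, starting at word start_word (0 to 7),
    into the pieces listed in the Operations column.

    Returns a list of (operation, Address[4:0]) pairs.
    """
    if num_words < 1 or not 0 <= start_word < WORDS_PER_LINE:
        raise ValueError("invalid LDM description")
    operations = []
    remaining = num_words
    word = start_word
    while remaining > 0:
        # Take as many words as fit before the next 32-byte boundary.
        count = min(remaining, WORDS_PER_LINE - word)
        name = "LDR" if count == 1 else f"LDM{count}"
        operations.append((name, word * 4))
        remaining -= count
        word = 0  # every later piece starts on a boundary
    return operations
```

Each piece, such as LDM5, is then issued as described by its own table, so this sketch covers only the decomposition step.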

Page 411: ... word 4 LDM4 from 0x10 LDM8 from 0x00 0x14 word 5 LDM3 from 0x14 LDM8 from 0x00 LDR from 0x00 0x18 word 6 LDM2 from 0x18 LDM8 from 0x00 LDM2 from 0x00 0x1C word 7 LDR from 0x1C LDM8 from 0x00 LDM3 from 0x00 Table 8 39 Noncacheable LDM13 Address 4 0 Operations 0x00 word 0 LDM8 from 0x00 LDM5 from 0x00 0x04 word 1 LDM7 from 0x04 LDM6 from 0x00 0x08 word 2 LDM6 from 0x08 LDM7 from 0x00 0x0C word 3 LD...

Page 412: ... 0 LDM8 from 0x00 LDM7 from 0x00 0x04 word 1 LDM7 from 0x04 LDM8 from 0x00 0x08 word 2 LDM6 from 0x08 LDM8 from 0x00 LDR from 0x00 0x0C word 3 LDM5 from 0x0C LDM8 from 0x00 LDM2 from 0x00 0x10 word 4 LDM4 from 0x10 LDM8 from 0x00 LDM3 from 0x00 0x14 word 5 LDM3 from 0x14 LDM8 from 0x00 LDM4 from 0x00 0x18 word 6 LDM2 from 0x18 LDM8 from 0x00 LDM5 from 0x00 0x1C word 7 LDR from 0x1C LDM8 from 0x00 ...

Page 413: ...bit 2 data transfers Evicted cache line valid and upper half dirty 0x10 Incr 64 bit 2 data transfers 0x08 0x0F Evicted cache line valid and lower half dirty 0x08 Wrap 64 bit 2 data transfers Evicted cache line valid and upper half dirty 0x10 Incr 64 bit 2 data transfers 0x10 0x17 Evicted cache line valid and lower half dirty 0x00 Incr 64 bit 2 data transfers Evicted cache line valid and upper half...

Page 414: ...nsfer, b0000 0100; 0x03 (byte 3): 0x03, Incr, 8-bit, 1 data transfer, b0000 1000; 0x04 (byte 4): 0x04, Incr, 8-bit, 1 data transfer, b0001 0000; 0x05 (byte 5): 0x05, Incr, 8-bit, 1 data transfer, b0010 0000; 0x06 (byte 6): 0x06, Incr, 8-bit, 1 data transfer, b0100 0000; 0x07 (byte 7): 0x07, Incr, 8-bit, 1 data transfer, b1000 0000. Table 8-46 Cacheable Write-Through or Noncacheable STRH (Address[4:0]: AWADDRRW, AWBURSTRW, AWSIZERW, AWLENRW, WSTR...
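In the byte-store rows above, the write strobe has a single bit set at the position of the addressed byte lane on the 64-bit bus. A minimal sketch of that calculation, generalized to naturally aligned stores of 1, 2, 4, or 8 bytes, is shown below. The function names are hypothetical, and the generalization beyond byte stores is an assumption; unaligned stores, which the manual splits into several transfers, are deliberately rejected.

```python
def write_strobes(address, size_bytes):
    """Return the 8-bit WSTRB value for a naturally aligned store."""
    if size_bytes not in (1, 2, 4, 8):
        raise ValueError("unsupported store size")
    lane = address & 0x7            # byte lane on the 64-bit data bus
    if lane % size_bytes:
        raise ValueError("unaligned store: not modeled in this sketch")
    return ((1 << size_bytes) - 1) << lane


def format_strobes(strobes):
    """Format strobes the way the tables print them, for example b0000 1000."""
    return f"b{strobes >> 4:04b} {strobes & 0xF:04b}"
```

For a byte store to address 0x03, `format_strobes(write_strobes(0x03, 1))` gives `b0000 1000`, matching the table row.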

Page 415: ...b0000 1110 0x04 Incr 8 bit 1 data transfer b0001 0000 0x02 byte 2 0x02 Incr 16 bit 1 data transfer b0000 1100 0x04 Incr 16 bit 1 data transfer b0011 0000 0x03 byte 3 0x03 Incr 8 bit 1 data transfer b0000 1000 0x04 Incr 32 bit 1 data transfer b0111 0000 0x04 byte 4 word 1 0x04 Incr 32 bit 1 data transfer b1111 0000 0x05 byte 5 0x04 Incr 32 bit 1 data transfer b1110 0000 0x08 Incr 8 bit 1 data trans...

Page 416: ... word 3 0x0C Incr 32 bit 2 data transfers b1111 0000 0x10 word 4 0x10 Incr 64 bit 1 data transfer b1111 1111 0x14 word 5 0x14 Incr 32 bit 2 data transfers b1111 0000 0x18 word 6 0x18 Incr 64 bit 1 data transfer b1111 1111 Table 8 49 Cacheable Write Through or Noncacheable STM2 to word 7 Address 4 0 Operations 0x1C STR to 0x1C STR to 0x00 Table 8 50 Cacheable Write Through or Noncacheable STM3 to w...

Page 417: ...RBRW 0x00 word 0 0x00 Incr 64 bit 2 data transfers b1111 1111 0x04 word 1 0x04 Incr 32 bit 4 data transfers b11110000 0x08 word 2 0x08 Incr 64 bit 2 data transfers b11111111 0x0C word 3 0x0C Incr 32 bit 4 data transfers b11110000 0x10 word 4 0x10 Incr 64 bit 2 data transfers b11111111 Table 8 53 Cacheable Write Through or Noncacheable STM4 to word 5 6 or 7 Address 4 0 Operations 0x14 word 5 STM3 t...

Page 418: ...to 0x1C STM4 to 0x00 Table 8 55 Cacheable Write Through or Noncacheable STM5 to word 4 5 6 or 7 Address 4 0 Operations Table 8 56 Cacheable Write Through or Noncacheable STM6 to word 0 1 or 2 Address 4 0 AWADDRR W AWBURSTR W AWSIZERW AWLENRW First WSTRBRW 0x00 word 0 0x00 Incr 64 bit 3 data transfers b1111 1111 0x04 word 1 0x04 Incr 32 bit 6 data transfers b1111 0000 0x08 word 2 0x08 Incr 64 bit 3...

Page 419: ...M2 to 0x00 0x10 word 4 STM4 to 0x10 STM3 to 0x00 0x14 word 5 STM3 to 0x14 STM4 to 0x00 0x18 word 6 STM2 to 0x18 STM5 to 0x00 0x1C word 7 STR to 0x1C STM6 to 0x00 Table 8 60 Cacheable Write Through or Noncacheable STM8 to word 0 Address 4 0 AWADDRR W AWBURSTR W AWSIZERW AWLENRW First WSTRBRW 0x00 word 0 0x00 Incr 64 bit 4 data transfers b1111 1111 Table 8 61 Cacheable Write Through or Noncacheable ...

Page 420: ...te Through or Noncacheable STM9 continued Address 4 0 Operations Table 8 63 Cacheable Write Through or Noncacheable STM10 Address 4 0 Operations 0x00 word 0 STM8 to 0x00 STM2 to 0x00 0x04 word 1 STM7 to 0x04 STM3 to 0x00 0x08 word 2 STM6 to 0x08 STM4 to 0x00 0x0C word 3 STM5 to 0x0C STM5 to 0x00 0x10 word 4 STM4 to 0x10 STM6 to 0x00 0x14 word 5 STM3 to 0x14 STM7 to 0x00 0x18 word 6 STM2 to 0x18 ST...

Page 421: ...0x00 STM4 to 0x00 0x04 word 1 STM7 to 0x04 STM5 to 0x00 0x08 word 2 STM6 to 0x08 STM6 to 0x00 0x0C word 3 STM5 to 0x0C STM7 to 0x00 0x10 word 4 STM4 to 0x10 STM8 to 0x00 0x14 word 5 STM3 to 0x14 STM8 to 0x00 STR to 0x00 0x18 word 6 STM2 to 0x18 STM8 to 0x00 STM2 to 0x00 0x1C word 7 STR to 0x1C STM8 to 0x00 STM3 to 0x00 Table 8 66 Cacheable Write Through or Noncacheable STM13 Address 4 0 Operations...

Page 422: ...d 1 STM7 to 0x04 STM7 to 0x00 0x08 word 2 STM6 to 0x08 STM8 to 0x00 0x0C word 3 STM5 to 0x0C STM8 to 0x00 STR to 0x00 0x10 word 4 STM4 to 0x10 STM8 to 0x00 STM2 to 0x00 0x14 word 5 STM3 to 0x14 STM8 to 0x00 STM3 to 0x00 0x18 word 6 STM2 to 0x18 STM8 to 0x00 STM4 to 0x00 0x1C word 7 STR to 0x1C STM8 to 0x00 STM5 to 0x00 Table 8 68 Cacheable Write Through or Noncacheable STM15 Address 4 0 Operations...

Page 423: ... Table 8 69 Table 8 69 Cacheable Write Through or Noncacheable STM16 Address 4 0 Operations 0x00 word 0 STM8 to 0x00 STM8 to 0x00 0x04 word 1 STM7 to 0x04 STM8 to 0x00 STR to 0x00 0x08 word 2 STM6 to 0x08 STM8 to 0x00 STM2 to 0x00 0x0C word 3 STM5 to 0x0C STM8 to 0x00 STM3 to 0x00 0x10 word 4 STM4 to 0x10 STM8 to 0x00 STM4 to 0x00 0x14 word 5 STM3 to 0x14 STM8 to 0x00 STM5 to 0x00 0x18 word 6 STM2...

Page 424: ...mum It does not support unaligned accesses Table 8 70 Example Peripheral Interface reads and writes Example transfer read or write AxADDRP AxBURSTP AxSIZEP AxLENP WSTRBP Words 0 7 0x00 Incr 32 bit 2 data transfers b1111 0x04 b1111 0x08 Incr 32 bit 2 data transfers b1111 0x0C b1111 0x10 Incr 32 bit 2 data transfers b1111 0x14 b1111 0x18 Incr 32 bit 2 data transfers b1111 0x1C b1111 Words 0 3 0x00 I...

Page 425: ...the two 32-bit words forming these 64-bit data. The AXI protocol does not support 32-bit word-invariant big-endian (BE-32) accesses. Therefore, in this configuration, the ARM1176JZF-S processor issues byte-invariant big-endian (BE-8) accesses on the four ports by swizzling the byte lanes and the byte strobes, as Figure 8-4 shows. Figure 8-4 Swizzling of data and strobes in BE-32 big-endian configuration. Not...

Page 426: ...action occurs, the master must follow the locked transaction with an unlocked transaction to remove the lock of the interconnect. For ARM1176JZF-S processors, this implies that, in the case of an abort received on the read part of a SWP instruction, the Peripheral port or Data port issues a dummy write access with all byte strobes LOW at the same address as the read access and with AWLOCK = 00, normal tra...

Page 427: ...Chapter 9 Clocking and Resets. This chapter describes the clocking and reset options available for the processor. It contains the following sections: About clocking and resets on page 9-2; Clocking and resets with no IEM on page 9-3; Clocking and resets with IEM on page 9-5; Reset modes on page 9-10 ...

Page 428: ...9.1 About clocking and resets: The processor clocking and reset schemes depend on the optional implementation of IEM. This chapter gives details of the way that clocking and resets work for processors that implement IEM and for those that do not ...

Page 429: ...DEACK signals are not used and must be left unconnected. All clocks can be stopped indefinitely without loss of state. Figure 9-1 shows the clocks for the processor with no IEM. Figure 9-1 Processor clocks with no IEM. Read latency penalty with no IEM: The Nonsequential Noncacheable read latency with zero-wait-state AXI is a six-cycle penalty over a cache hit, where data is returned in the DC2 cycle, on ...

Page 430: ...TIN signal is the main processor reset that initializes the majority of the processor logic. DBGnTRST: The DBGnTRST signal is the DBGTAP reset. nPORESETIN: The nPORESETIN signal is the power-on reset that initializes the CP14 debug logic. See CP14 registers reset on page 13-25 for details. nVFPRESETIN: The nVFPRESETIN signal is the reset for the VFP block. All of these are active LOW signals that reset lo...

Page 431: ...W, ACLKP, and ACLKD. Because of the signals SYNCMODEREQI, SYNCMODEREQRW, SYNCMODEREQP, SYNCMODEREQD, SYNCMODEACKI, SYNCMODEACKRW, SYNCMODEACKP, and SYNCMODEACKD, it is possible to configure each IEM register slice to operate synchronously or asynchronously. The four level two interfaces and the VCore part of the IEM register slices use dedicated clock enables: ACLKENI, ACLKENRW, ACLKENP, and ACLKEND. If you config...

Page 432: ...ws the processor synchronization with such a system. Figure 9-4 Processor synchronization with IEM. [Figure: block diagram; labels include RAMs, Core, the Instruction, DMA, Data read/write, and Peripheral level 2 interfaces, level shift and clamp, clock enables, CLKIN, ACLK clocks, VIC interface, Debug interface, and VCoreSliceI]...

Page 433: ...slices take the same number of cycles, so the SYNCMODEACK signals all deassert at the same time. Alternatively, if necessary, you can daisy-chain the IEM register slices together so that each slice in the chain only closes its inputs when the previous slice has been multiplexed out. Read latency penalty for synchronous operation with IEM: When the IEM register slices are instantiated but are synchronous...

Page 434: ...SD2 and passes through a buffer cycle before finally passing to the level two interfaces in cycle RDC. When the level two interfaces of the core receive the data, they then pass it back to the LSU or PU in two cycles, see Figure 9-2 on page 9-4. Each of the IEM register slices, except the peripheral port slice, can store multiple items of read and write data. This means that a burst of data can typically...

Page 435: ...nVFPRESETIN: The nVFPRESETIN signal is the reset for the VFP block. ARESETIn, ARESETRWn, ARESETPn, ARESETDn: Reset signals for the SoC part of the IEM register slices. All of these are active LOW signals that reset logic in the processor ...

Page 436: ...be synchronous to CLKIN. Because the nRESETIN and nPORESETIN signals are synchronized within the processor, you do not have to synchronize these signals. Figure 9-6 shows the application of power-on reset. Figure 9-6 Power-on reset. It is recommended that you assert the reset signals for at least three CLKIN cycles to ensure correct reset behavior. Adopting a three-cycle reset eases the integration of o...

Page 437: ...ynchronized within the processor, you do not have to synchronize this signal. 9.4.4 DBGTAP reset: DBGTAP reset initializes the state of the processor DBGTAP controller. DBGTAP reset is typically used by the RealView ICE module for hot connection of a debugger to a system. DBGTAP reset enables initialization of the DBGTAP controller without affecting the normal operation of the processor. Because the DBG...

Page 438: ...Chapter 10 Power Control. This chapter describes the processor power control functions. It contains the following sections: About power control on page 10-2; Power management on page 10-3; VFP shutdown on page 10-6; Intelligent Energy Management on page 10-7 ...

Page 439: ...etch and decode operations; use of physically addressed caches to reduce the number of cache flushes and refills, saving energy in the system; the use of MicroTLBs reduces the power consumed in translation and protection look-ups each cycle; the caches use sequential access information to reduce the number of accesses to the TagRAMs and to unwanted Data RAMs. In the processor, extensive use is also made...

Page 440: ...ormed. A Data Synchronization Barrier operation ensures that all explicit memory accesses occurring in program order before the Wait For Interrupt have completed. This avoids any possible deadlocks that might be caused in a system where memory access triggers or enables an interrupt that the core is waiting for. This might require some TLB page table walks to take place as well. The DMA continues runn...

Page 441: ...ynthesizable flow. The RAM blocks that are to remain powered up must be implemented on a separate power domain, and there is a requirement to clamp all of the inputs to the RAMs to a known logic level, with the chip enable being held inactive. This clamping is not implemented in gates as part of the default synthesis flow because it contributes to a critical path. The RAMCLAMP input is provided to driv...

Page 442: ...ontrol mechanism. Transition from Dormant state to Run state is triggered by the external power controller asserting Reset to the processor until the power to the processor is restored. When power has been restored, the core leaves reset and, by interrogating the external power controller, can determine that the saved state must be restored. 10.2.5 Communication to the Power Management Controller: Your P...

Page 443: ...s not in use. There is a clamping placeholder between the VFP and the rest of the logic, but this block is not implemented in gates because it contributes to a critical path. You must add clamps to the placeholder, either as explicit gates in the Core power domain or as pull-down transistors that clamp the values when the VFP is powered down. To shut down the VFP: 1. Save all VFP registers if the VFP cont...

Page 444: ... 9-5. 10.4.1 Purpose of IEM: The purpose of IEM technology is to provide a dynamic optimization between processor performance and power consumption. 10.4.2 Structure of IEM: The ARM1176JZF-S processor provides a number of features that enable the processor voltage to vary relative to the voltage of the rest of the system. For this purpose, the processor optionally implements: Placeholders for level shift...

Page 445: ...lligent Energy Controller (IEC). For example systems, see the Intelligent Energy Controller Technical Overview. IEM is functionally transparent to the user. [Figure: block diagram; labels include RAMs, Core, the Instruction, DMA, Data read/write, and Peripheral level 2 interfaces, up and down level shift and clamp, clock enables, CLKIN, ACLK clocks, and VCoreSliceI]...

Page 446: ...ace. This chapter describes the coprocessor interface of the ARM1176JZF-S processor. It contains the following sections: About the coprocessor interface on page 11-2; Coprocessor pipeline on page 11-3; Token queue management on page 11-9; Token queues on page 11-12; Data transfer on page 11-15; Operations on page 11-19; Multiple coprocessors on page 11-22 ...

Page 447: ...r instructions can be canceled by the core if a condition code fails, or the entire coprocessor pipeline can be flushed in the event of a mispredicted branch. Load and store data are also required to pass between the core Load Store Unit (LSU) and the coprocessor pipeline. The coprocessor interface operates over a two-cycle delay. Any signal passing from the core to the coprocessor, or from the coproces...

Page 448: ...all possible coprocessor instructions. Coprocessors reject those instructions they cannot handle. Table 11-1 lists all the coprocessor instructions supported by the processor and gives a brief description of each. For more details of coprocessor instructions, see the ARM Architecture Reference Manual. The coprocessor instructions fall into three groups: loads, stores, and processing instructions. The load and ...

Page 449: ...on Figure 11-1 on page 11-5 shows an outline of the core and coprocessor pipelines and the synchronizing queues that communicate between them. Each queue is implemented as a very short First In First Out (FIFO) buffer. No explicit flow control is required for the queues because the pipeline lengths between the queues limit the number of items any queue can hold at any time. The geometry used means tha...

Page 450: ...the length queue that is maintained by the core. The coprocessor I stage sends a token to the core Ex2 stage through the accept queue that is also maintained by the core. This token indicates to the core if the coprocessor is accepting the instruction in its I stage or bouncing it. [Figure: core pipeline stages Fe2, De, Iss, Ex1, Ex2, Ex3, Wb and coprocessor pipeline stages D, I, Ex1, Ex2, Ex3, Ex4, Ex5, Ex6, with queues labeled Instruction, Length, Cance...

Page 451: ...g coprocessor number. The length of any vectored data transfer is also decided at this point and sent back to the core. The decoded instruction then passes into the issue (I) stage. This stage decides if this particular instance of the instruction can be accepted. If it cannot, because it addresses a non-existent register, the instruction is bounced, informing the core that it cannot be accepted. If the ins...

Page 452: ...ow the new state of the pipeline stage is derived. The Enable input comes from the next stage in the pipeline and indicates if data can be passed on. In general, if this signal is unasserted, the pipeline stage cannot receive new data or pass on its own contents. However, if the pipeline stage is empty, it can receive new data without passing any data on to the next stage. This is known as bubble closing ...
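The Enable and bubble-closing rules above can be captured in a few lines. The sketch below is an illustrative software model with hypothetical names (`step_stage`, and `None` standing for an empty stage); it only mirrors the three cases the paragraph describes and is not the hardware implementation.

```python
def step_stage(contents, incoming, enable_from_next):
    """Advance one pipeline stage by one cycle.

    contents:          the item held in the stage, or None if it is empty
    incoming:          the item offered by the previous stage, or None
    enable_from_next:  True if the next stage can take data this cycle

    Returns (new_contents, passed_on).
    """
    if contents is None:
        # Bubble closing: an empty stage accepts new data even when
        # nothing can be passed on to the next stage.
        return incoming, None
    if enable_from_next:
        # Normal flow: hand the contents on and take the incoming item.
        return incoming, contents
    # Stalled: keep the contents and accept nothing new.
    return contents, None
```

In the stalled case the previous stage has to keep holding `incoming`, which is the job of its own Enable input.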

Page 453: ...ruction it is. When the coprocessor Decode stage removes the non-coprocessor instructions, it is left with an instruction stream carrying contiguous tags. The tags can also be used to verify that the sequence of tokens moving down the queues matches the sequence of instructions moving down the core and coprocessor pipelines. 11.2.6 Flush broadcast: If a branch has been mispredicted, it might be necessar...

Page 454: ... become empty. If the queue is full, the oldest data, and therefore the first to be read from the queue, occupies buffer C and the newest occupies buffer A. The multiplexors also select the current flag, which then indicates whether the selected output is valid. 11.3.2 Queue modification: The queue is written to on each cycle. Buffer A accepts the data arriving at the interface, and the buffer A flag accepts...

Page 455: ... of the selected stage, making it available for input. Figure 11-5 shows reading and writing a queue. Figure 11-5 Queue reading and writing. Four valid inputs, labeled One, Two, Three, and Four, are written into the queue and are clocked into buffer A as they arrive. Figure 11-5 shows how these inputs are clocked from buffer to buffer until the first input reaches buffer C. At this point a read from the queu...

Page 456: ...nstruction. If the queue is to be flushed from a selected buffer, the buffer is chosen by looking for a matching tag. When this is found, the flag associated with that buffer is cleared, and every flag newer than the selected one is also cleared. Figure 11-6 shows queue flushing. Figure 11-6 Queue flushing. Each buffer in the queue has a tag comparator associated with it. The flush tag is presented to each...
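The flush-by-tag behavior can be sketched with a small three-slot queue model. The class and method names here are hypothetical, and the model keeps entries in a list (oldest first) instead of modeling buffers A, B, and C and their flags cycle by cycle. What it takes from the text is the three-entry depth, oldest-first reads, and the rule that a flush clears the matching entry and every newer one.

```python
class TokenQueue:
    """Toy model of a three-entry token queue with tag-based flushing."""

    DEPTH = 3  # buffers A, B, and C in the text

    def __init__(self):
        self.entries = []  # (tag, data) pairs, oldest first

    def write(self, tag, data):
        # The text says the pipeline geometry prevents overflow, so a
        # full queue is treated as a modeling error here.
        if len(self.entries) >= self.DEPTH:
            raise OverflowError("queue model overflow")
        self.entries.append((tag, data))

    def read(self):
        """Remove and return the oldest entry, or None if empty."""
        return self.entries.pop(0) if self.entries else None

    def flush_from(self, tag):
        """Clear the entry with a matching tag and all newer entries."""
        for index, (entry_tag, _) in enumerate(self.entries):
            if entry_tag == tag:
                del self.entries[index:]
                return True
        return False  # no matching tag: nothing selected in this model
```

With entries tagged 1, 2, and 3 queued, `flush_from(2)` leaves only the entry tagged 1, since entries 2 and 3 are the selected one and the newer one.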

Page 457: ...oder. Figure 11-7 shows an instruction queue implementation. Figure 11-7 Instruction queue. The decoder decodes the instruction written into buffer A as soon as it arrives. The subsequent buffers, B and C, receive the decoded version of the instruction in buffer A. The A flag now indicates that the data in buffer A are valid and represent a coprocessor instruction. This means that non-coprocessor or unrec...

Page 458: ...is copied from the queue buffer supplying the instruction. CPALENGTHHOLD: This is deasserted when the instruction queue is providing valid information to the core length queue. Otherwise, the signal is asserted to indicate that no valid data are available. 11.4.3 Accept queue: The coprocessor must decide in the issue stage if it can accept an otherwise valid coprocessor instruction. It passes this inform...

Page 459: ...sor from the core and must be clocked into buffer A ACPCANCELT 3 0 This is the flush tag associated with the cancel command and must be clocked into the tag associated with buffer A The coprocessor Ex1 stage reads the cancel queue that then acts on the value of the queued ACPCANCEL signal by removing the instruction from the Ex1 stage if the signal is set and not passing it on to the Ex2 stage 11 ...

Page 460: ...rcase are the tails In the example shown the vector length is four so there is one head and three tails At the first iteration of the instruction the tail flag is set so that subsequent iterations send tail instructions down the pipeline In the example shown in Figure 11 9 on page 11 16 instruction B has stalled in the Ex1 stage that might be caused by the cancel queue being empty so that instruct...

Page 461: ...stage of the core LSU and are received by the coprocessor Ex6 stage Each item in a vectored load is picked up by one instance of the iterated load instruction The pipeline timing means that a load instruction is always ready or arrived a short time ago in Ex6 to pick up each data item If a load instruction has arrived in Ex6 but the load information has not yet appeared the load instruction must s...

Page 462: ...t trigger a flush Any coprocessor load instructions behind the flush point find themselves stalled if they get as far as the Ex6 stage for the lack of a finish token so no data transfers can have taken place Any data in the load data buffers expires naturally during the flush dead period while the pipeline reloads Loads and cancels If a load instruction is canceled both the head and any tails must...

Page 463: ... stopped at any time by the LSU a store data queue is required Additionally because store data vectors can be of arbitrary length flow control is required A queue length of three slots is sufficient to enable flow control to be used without loss of data Stores and flushes When a store instruction is involved in a flush the store data queue must be flushed by the core Because the queue continues to...

Page 464: ...on reaches the Ex1 stage it looks for a token in the cancel queue If the token indicates that the instruction is to be cancelled it is removed from the pipeline and does not pass to Ex2 Any tail instruction in the I stage is also removed 11 6 3 Bounce operations The coprocessor can reject an instruction by bouncing it when it reaches the issue stage This can happen to an instruction that has been ...

Page 465: ...tion and cancel queues cannot be performing any other operation This means that flushing is not required to be combined with queue updates for these queues There is a single cycle following a flush where nothing happens that affects the flushed queues and this provides a good opportunity to carry out the queue flushing operation The following signals provide the flush broadcast signal from the cor...

Page 466: ... all load instructions must pick up data from the load pipeline phantom load instructions retire unconditionally ...

Page 467: ...every coprocessor holds its outputs to zero when it is inactive 11 7 2 Coprocessor selection Coprocessors are enabled by a signal ACPENABLE from the core There are 12 of these signals one for each coprocessor Only one can be active at any time In addition instructions to the coprocessor include the coprocessor number enabling coprocessors to reject instructions that do not match their own number C...

Page 468: ...12 Vectored Interrupt Controller Port This chapter describes the vectored interrupt controller port of the processor It contains the following sections About the PL192 Vectored Interrupt Controller on page 12 2 About the processor VIC port on page 12 3 Timing of the VIC port on page 12 5 Interrupt entry flowchart on page 12 7 ...

Page 469: ...ith an interrupt controller having the above features software is still required to determine the interrupt source that is requesting service determine where the service routine for that interrupt source is loaded A Vectored Interrupt Controller VIC does both things in hardware It supplies the starting address vector address of the service routine corresponding to the highest priority requesting i...

Page 470: ...clock This capability ensures that the controller can be used in systems that have either a synchronous or asynchronous interface between the core clock and the AXI clock The VIC port consists of the signals that Table 12 1 lists IRQACK is driven by the processor to indicate to an external VIC that the processor wants to read the IRQADDR input nFIQ nIRQ IRQADDRV IRQADDR 31 2 VICVECTADDROUT 31 2 VI...

Page 471: ...ernally for the case of asynchronous sources The Synchronous Interrupt Enable port INTSYNCEN is also provided to enable SoC designers to bypass the synchronizers if required Similarly a synchronizer is provided inside the processor for the IRQADDRV signal If this signal is known to be synchronous the synchronizer can be bypassed by pulling IRQADDRVSYNCEN HIGH These signals enable SoC designers to ...

Page 472: ...ween B3 and B4 the processor decides that the pending interrupt is an IRQ rather than a FIQ and asserts the IRQACK signal 5 At B4 the VIC samples IRQACK HIGH and starts generating IRQADDRV The VIC can still change IRQADDR to the IRQB vector address while IRQADDRV is LOW 6 At B6 the VIC asserts IRQADDRV while IRQADDR is set to the IRQB vector address IRQADDR is held until the processor acknowledges...

Page 473: ...HIGH 4 Clears IRQADDRV so the processor can recognize another interrupt If nIRQ is also to be deasserted at this point because there are no higher priority interrupts pending it is deasserted before or at the same time as IRQADDRV to ensure that the processor does not take the same interrupt again 12 3 2 Core timing As its part of the handshake mechanism the core 1 Starts an interrupt entry sequen...

Page 474: ...PSR 5 ARM state CPSR 7 IRQs disabled VE 1 FALSE V 1 LR_fiq RA 4 CPSR 4 0 FIQ mode CPSR 5 ARM state CPSR 7 FIQs and IRQs disabled SPSR_fiq CPSR V 1 PC 31 0 0xFFFF0018 TRUE PC 31 0 IRQADDR 31 2 0b00 PC 31 0 NSBA 0x1C FALSE PC 31 0 0xFFFF001C TRUE IRQADDRV VE IRQ ADDRV 1 TRUE TRUE FALSE FIQ 1 in SCR LR_mon RA 4 CPSR 4 0 MON mode CPSR 5 ARM state CPSR 7 FIQs and IRQs disabled SPSR_mon CPSR TRUE PC 31 ...

Page 475: ...g unit on page 13 3 Debug registers on page 13 5 CP14 registers reset on page 13 25 CP14 debug instructions on page 13 26 External debug interface on page 13 28 Changing the debug enable signals on page 13 31 Debug events on page 13 32 Debug exception on page 13 35 Debug state on page 13 37 Debug communications channel on page 13 42 Debugging in a cached system on page 13 43 Debugging in a system ...

Page 476: ...ommands such as set breakpoint at location XX or examine the contents of memory from 0x0 0x100 13 1 2 The protocol converter The debug host is connected to the processor development system using an interface for example an RS232 The messages broadcast over this connection must be converted to the interface signals of the processor This function is performed by a protocol converter for example Real...

Page 477: ...nd input output locations through the DBGTAP This mode is intentionally invasive to program execution Halting debug mode debugging requires external hardware to control the DBGTAP a software debugger to provide the user interface to the debug hardware See CP14 c1 Debug Status and Control Register DSCR on page 13 7 to learn how to set the processor debug unit into Halting debug mode 13 2 2 Monitor ...

Page 478: ...u to halt the processor and examine and modify registers and memory SPIDEN and SUIDEN control invasive debug permissions Non invasive debug Non invasive is debug where the system can only be observed but not affected The ETM interface the System Performance Monitor and the DBGTAP program counter sample register provide non invasive debug SPNIDEN and SUNIDEN control non invasive debug permissions 1...

Page 479: ...riptions Term Description R Read only Written values are ignored However it is written as 0 or preserved by writing the same value previously read from the same fields on the same processor W Write only This bit cannot be read Reads return an Unpredictable value RW Read or write C Cleared on read This bit is cleared whenever the register is read UNP SBZP Unpredictable or Should Be Zero or Preserve...

Page 480: ... c7 Vector Catch Register VCR b000 b1000 b1001 c8 c9 Reserved b000 b1010 c10 Debug State Cache Control Register DSCCR b000 b1011 c11 Debug State MMU Control Register DSMCR b000 b1100 b1111 c12 c15 Reserved b001 b011 b0000 b1111 c16 c63 Reserved b100 b0000 b0101 c64 c69 Breakpoint Value Registers BVRya b0110 b111 c70 c79 Reserved b101 b0000 b0101 c80 c85 Breakpoint Control Registers BCRya b0110 b11...

Page 481: ... b0000 1 WRP b0001 2 WRPs b1111 16 WRPs For the ARM1176JZF S processor these bits are b0001 2 WRPs 27 24 BRP R Number of Breakpoint Register Pairs b0000 Reserved The minimum number of BRPs is 2 b0001 2 BRPs b0010 3 BRPs b1111 16 BRPs For the ARM1176JZF S processor these bits are b0101 6 BRPs 23 20 Context R Number of Breakpoint Register Pairs with context ID comparison capability b0000 1 BRP has c...
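
The count fields in the Page 481 excerpt are encoded as the number of register pairs minus one (b0001 means 2, b1111 means 16). The sketch below decodes them from a 32-bit ID register value. The BRP position [27:24] and Context position [23:20] come from the excerpt; the WRP position [31:28] is an assumption, because the excerpt is truncated before that field, and the function name is hypothetical.

```python
def decode_didr_counts(didr):
    """Decode register-pair counts from a Debug ID Register value.

    Assumes WRP is in bits [31:28]; BRP [27:24] and Context [23:20]
    follow the excerpt. Each field holds (count - 1).
    """
    wrp_field = (didr >> 28) & 0xF
    brp_field = (didr >> 24) & 0xF
    ctx_field = (didr >> 20) & 0xF
    if brp_field == 0:
        # The excerpt marks b0000 as reserved: the minimum number of BRPs is 2.
        raise ValueError("BRP field b0000 is reserved")
    return {
        "wrps": wrp_field + 1,
        "brps": brp_field + 1,
        "context_brps": ctx_field + 1,
    }


# WRP = b0001 and BRP = b0101, the values the excerpt gives for this processor.
example_counts = decode_didr_counts((0b0001 << 28) | (0b0101 << 24))
```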

Page 482: ...isable Sticky precise Data Abort flag Table 13 4 Debug Status and Control Register bit field definitions Bits Core view External view Reset value Description 31 UNP SBZP UNP SBZP Reserved 30 R R 0 The rDTRfull flag 0 rDTR empty 1 rDTR full This flag is automatically set on writes by the DBGTAP debugger to the rDTR and is cleared on reads by the core of the same register No writes to the rDTR are e...

Page 483: ... Debug state using the Debug Test Access Port If this bit is set when the core is not in Debug state the behavior of the processor is architecturally Unpredictable For ARM1176JZF S processors it has no effect 12 RW R 0 User mode access to comms channel control bit 0 User mode access to comms channel enabled 1 User mode access to comms channel disabled If this bit is set and a User mode process tri...

Page 484: ...imprecise data aborts occurring when in Debug state Note In previous versions of the debug architecture the sticky imprecise data abort was set when the processor took an imprecise data abort In version 6 1 it is set when an imprecise data abort is detected 6 R RC 0 Sticky precise Data Abort flag 0 No precise Data Abort occurred since the last time this bit was cleared 1 A precise Data Abort has o...

Page 485: ...he cause for entering Debug state A Prefetch Abort or Data Abort handler must first check the IFSR or DFSR register to determine whether a debug exception has occurred before checking the DSCR to find the cause These bits are not set on any events in Debug state 1 R R 1 Core restarted bit 0 the processor is exiting Debug state 1 the processor has exited Debug state The DBGTAP debugger can poll this bit to...

Page 486: ...at caused the watchpoint The register WFAR is in CP14 c6 a 32 bit read write register accessible in privileged modes only When a watchpoint occurs in ARM state the WFAR contains the address of the instruction causing it plus 0x8 Thumb state the WFAR contains the address of the instruction causing it plus 0x4 Jazelle state the WFAR contains the address of the instruction causing it The contents of ...
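
Because the WFAR holds the instruction address plus a state-dependent offset (0x8 in ARM state, 0x4 in Thumb state, 0 in Jazelle state), a handler has to subtract that offset to recover the instruction address. A minimal sketch, with hypothetical names and a 32-bit wraparound assumed:

```python
# Offsets added to the instruction address, per the Page 486 excerpt.
WFAR_OFFSET = {"arm": 0x8, "thumb": 0x4, "jazelle": 0x0}


def watchpointed_instruction_address(wfar, state):
    """Return the address of the instruction that caused the watchpoint."""
    return (wfar - WFAR_OFFSET[state]) & 0xFFFFFFFF


example_arm_address = watchpointed_instruction_address(0x00008008, "arm")
```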

Page 487: ...hes related to bits 15 0 are only triggered by fetches in a Secure world Catches related to bits 31 25 are only triggered in the Non secure world There are three groups of bits one each to catch exceptions relative to the three vector base address registers for Non secure Secure and Secure Monitor modes The update of the VCR might occur several instructions after the corresponding MCR instruction I...

Page 488: ...A Vector Catch Enable Undefined Instruction in Non secure world 24 16 DNM RAZ 0 Reserved 15 RW 0 MBA Vector Catch Enable FIQ in Secure world 14 RW 0 MBA Vector Catch Enable IRQ in Secure world 13 DNM RAZ 0 Reserved 12 RW 0 MBA Vector Catch Enable Data Abort in Secure world 11 RW 0 MBA Vector Catch Enable Prefetch Abort in Secure World 10 RW 0 MBA Vector Catch Enable SMC in Secure world 9 8 DNM RAZ...

Page 489: ...onitor X 0 SBA 0x0000001C 1 0xFFFF001C VCR 10 1 NS bit 0 or Mode Secure Monitor X X MBA 0x00000008 VCR 11 1 NS bit 0 or Mode Secure Monitor X X MBA 0x0000000C VCR 12 1 NS bit 0 or Mode Secure Monitor X X MBA 0x00000010 VCR 14 1 NS bit 0 or Mode Secure Monitor X X MBA 0x00000018 VCR 15 1 NS bit 0 or Mode Secure Monitor X X MBA 0x0000001C VCR 25 1 NS bit 1 and mode Secure Monitor X 0 NSBA 0x00000004...

Page 490: ... BCR can be configured so this BVR value is compared against the CP15 Context ID Register c13 instead of the IMVA bus Another register pair loaded with an IMVA or DMVA can then be linked with the context ID holding BRP A breakpoint or watchpoint debug event is only generated if both the address and the context ID match at the same time This means that unnecessary hits can be avoided when debugging...

Page 491: ...implements Figure 13 6 shows the format of the Breakpoint Control Registers Figure 13 6 Breakpoint Control Registers format Table 13 9 Breakpoint Value Registers bit field definition Context ID capable Bits Read write attributes Description No 31 2 RW Breakpoint address Yes 31 0 RW Breakpoint address or context ID Table 13 10 Processor Breakpoint Control Registers Binary address Register number CP...

Page 492: ... linking BCR 22 20 is set b011 then the BCR 15 14 field of the IMVA holding BRP takes precedence and it is Undefined whether this field is included in the comparison or not Therefore it must be set to b00 The WCR 15 14 field of a WRP linked with this BRP also takes precedence over this field 13 9 UNP SBZP Reserved 8 5 RW Byte address select The BVR is programmed with a word address You can use thi...

Page 493: ...ead write attributes Reset value Description Table 13 12 Meaning of BCR 22 20 bits BCR 22 20 Meaning b000 The corresponding BVR is compared against the IMVA bus This BRP is not linked with any other one It generates a breakpoint debug event on an IMVA match b001 The corresponding BVR is compared against the IMVA bus This BRP is linked with the one indicated by BCR 19 16 linked BRP field They gener...

Page 494: ...the breakpoint debug event is not generated BCR 22 20 fields of the second BRP must be set to b011 If a BRP holding an IMVA is linked with one that is not implemented it is architecturally Unpredictable if a breakpoint debug event is generated or not For ARM1176JZF S processors the breakpoint debug event is not generated If a BRP is linked with itself it is architecturally Unpredictable if a break...

Page 495: ...gisters Table 13 14 Watchpoint Value Registers bit field definitions Bits Read write attributes Reset value Description 31 2 RW Watchpoint address 1 0 UNP SBZP Table 13 15 Processor Watchpoint Control Registers Binary address Register number CP14 debug register name Abbreviation ContextID capable Opcode_2 CRm b111 b0000 b0001 c112 c113 Watchpoint Control Registers 0 1 WCR0 1 W UNP SBZP 31 21 20 19...

Page 496: ...point hits bx1xx If the byte at address WVR 31 2 b00 2 is accessed the watchpoint hits b1xxx If the byte at address WVR 31 2 b00 3 is accessed the watchpoint hits Note These are little endian byte addresses This ensures that a watchpoint is triggered regardless of the way it is accessed For example if a watchpoint is set on a certain byte in memory by doing WCR 8 5 b0001 LDRB R0 0x0 it triggers the ...
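
The byte address select behaviour in the Page 496 excerpt can be modelled as a four-bit mask over the bytes of the watched word. The sketch below assumes that bit n of WCR[8:5] selects byte offset n (the excerpt shows this for offsets 2 and 3; offsets 0 and 1 are inferred). It ignores the enable bit and every other WCR field, and the function name is hypothetical.

```python
def watchpoint_hits(wvr, wcr, access_addr, size=1):
    """Return True if an access of `size` bytes touches a selected byte.

    wvr[31:2] holds the word address; wcr[8:5] is the byte address select.
    Only the address comparison is modelled here.
    """
    byte_select = (wcr >> 5) & 0xF
    word_base = wvr & ~0x3
    for addr in range(access_addr, access_addr + size):
        same_word = (addr & ~0x3) == word_base
        if same_word and byte_select & (1 << (addr & 0x3)):
            return True
    return False


# Watch byte offset 2 of the word at 0x1000 (byte select bx1xx).
example_wcr = 0b0100 << 5
example_byte_hit = watchpoint_hits(0x1000, example_wcr, 0x1002)
```

A word access to 0x1000 also hits in this model because it covers the selected byte, which matches the excerpt's point that the watchpoint triggers regardless of how the byte is accessed.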

Page 497: ...ster controls cache behavior in Debug state MRC p14 0 Rd c0 c10 0 MCR p14 0 Rd c0 c10 0 Table 13 17 lists the functional bits in the register The effect of these bits only applies in Debug state The operation under control only occurs if it is enabled in both this register and by the corresponding bit in the Cache Behavior Override Register 13 3 12 CP14 c11 Debug State MMU Control Register The Deb...

Page 498: ...B loading in Debug state 0 Main TLB load disabled in Debug state 3 0 nIUM 1 Normal operation of Instruction Micro TLB matching in Debug state 0 Instruction Micro TLB match disabled in Debug state 2 0 nDUM 1 Normal operation of Data Micro TLB matching in Debug state 0 Data Micro TLB match disabled in Debug state 1 0 nIUL 1 Normal operation of Instruction Micro TLB loading and flushing in Debug stat...

Page 499: ...xternal interface are all reset by the processor power on reset signal nPORESETIN see Reset with no IEM on page 9 4 or Reset with IEM on page 9 8 This ensures that a vector catch set on the reset vector is taken when nRESETIN is deasserted It also ensures that the DBGTAP debugger can be connected when the processor is running without clearing CP14 debug settings because DBGnTRST does not reset these...

Page 500: ...egister number Abbreviation Legal instructions Opcode_2 CRm b000 b0000 0 DIDR MRC p14 0 Rd c0 c0 0a b000 b0001 1 DSCR MRC p14 0 Rd c0 c1 0a MRC p14 0 R15 c0 c1 0 MCR p14 0 Rd c0 c1 0a b000 b0101 5 DTR rDTR wDTR MRC p14 0 Rd c0 c5 0a MCR p14 0 Rd c0 c5 0a STC p14 c5 addressing mode LDC p14 c5 addressing mode b000 b0110 6 WFAR MRC p14 0 Rd c0 c6 0a MCR p14 0 Rd c0 c6 0a b000 b0111 7 VCR MRC p14 0 Rd...

Page 501: ...ses to CP14 debug registers generate an Undefined instruction exception When DSCR bit 14 is set Halting debug mode selected and enabled if the software running on the processor tries to access any register other than the DIDR the DSCR or the DTR the core takes the Undefined instruction exception The same thing happens if the core is not in any Debug mode DSCR 15 14 b00 This lockout mechanism ensur...
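
The lockout rule in the Page 501 excerpt can be written as a small predicate. This is a simplified sketch for privileged software only: it looks at DSCR[15:14] alone, ignores external enable signals and the direction of the access, and uses hypothetical register-name strings.

```python
# Registers the excerpt lists as exempt from the lockout.
ALWAYS_ACCESSIBLE = {"DIDR", "DSCR", "DTR"}


def privileged_cp14_access_undefined(dscr, register):
    """Return True if the access would take the Undefined instruction exception."""
    if register in ALWAYS_ACCESSIBLE:
        return False
    halting_selected = bool(dscr & (1 << 14))
    no_debug_mode = ((dscr >> 14) & 0x3) == 0b00
    return halting_selected or no_debug_mode


# With only DSCR[15] set, the other debug registers are reachable.
example_locked_out = privileged_cp14_access_undefined(1 << 15, "BVR0")
```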

Page 502: ...SUNIDEN bit If this input signal is LOW non invasive debug is not permitted in all Secure privileged modes Non invasive debug is permitted in Secure User mode according to the SUNIDEN bit Note You must control access to the SPIDEN and SPNIDEN pins as they represent a significant security risk For example it must not be possible to set these pins through the boundary scan in a final device For soft...

Page 503: ...ebug not permittedb Not permitted in privileged modes in Secure state 1 10 0 1 0 not User Debug not permittedb Not permitted in privileged modes in Secure state 1 10 0 1 0 User Monitor debug mode Permitted in User mode in Secure state c 1 X1 1 X X X Halting debug mode Permitted in Non secure state and in all modes in Secure state 1 X1 0 0 1 not Secure Monitor Halting debug mode Permitted in Non se...

Page 504: ...d Only the BKPT instruction external debug request signal and Halt DBGTAP instructions have an effect when no debug mode is selected All other debug events are ignored b Behavior of the processor on debug events on page 13 33 describes the behavior marked as not permitted Logically the processor is still configured for either Halting debug mode or Monitor debug mode as appropriate c Debug exceptio...

Page 505: ...ht have to write a value to a control register in a system peripheral 2 Perform a Data Memory Barrier operation This stage can be omitted if the previous stage does not involve any memory operations 3 Poll debug registers for the view that the processor has of the signal values This stage is required because system specific issues might result in the processor not receiving a signal change until s...

Page 506: ... the context ID in CP15 c13 the instruction is now committed for execution A breakpoint debug event also occurs when an instruction was fetched and the CP15 Context ID register 13 matched the breakpoint value at the same time the instruction was fetched all the conditions of the BCR matched the breakpoint was enabled the instruction is now committed for execution A software breakpoint debug event ...

Page 507: ...cessor enters Debug state regardless of any debug mode selected by DSCR 15 14 When a debug event occurs and Halting debug mode is selected and enabled and the core is in a state that debug is permitted then the processor enters Debug state All software debug events other than the BKPT instruction that is register breakpoints watchpoints and vector catches when no debug mode is selected and enabled...

Page 508: ...s set to the VA of the instruction that caused the Watchpoint debug event plus an offset dependent on the processor state These offsets are the same as the ones that Table 13 25 on page 13 39 lists Table 13 23 lists the setting of CP15 registers on debug events You must take care when setting a breakpoint or software breakpoint debug event inside the Prefetch Abort or Data Abort exception handlers...

Page 509: ...y 1 It must first check for the presence of a debug monitor target 2 If present the handler must disable the active watchpoints This is necessary to prevent corruption of the FAR because of an unexpected watchpoint debug event when servicing a Data Abort exception 3 If the cause is a Debug exception the Data Abort handler branches to the debug monitor target Note the watchpointed address can be fo...

Page 510: ...t the watchpoint BKPT instruction RA 4 RA 4 RA 4 BKPT instruction address Vector catch RA 4 RA 4 RA 4 Vector address Prefetch Abort RA 4 RA 4 RA 4 Address of the instruction where the execution resumes Data Abort RA 8 RA 8 RA 8 Address of the instruction where the execution resumes a This is the address of the instruction that the processor first executes on Debug state exit Watchpoints can be imp...

Page 511: ...state entry request commands are ignored There is a mechanism using the Debug Test Access Port where the core is forced to execute an ARM state instruction This mechanism is enabled using DSCR 13 execute ARM instruction enable bit The core executes the instruction as if it is in ARM state regardless of the actual value of the T and J bits of the CPSR Any instruction issued in Debug state that puts...

Page 512: ...bugger The debugger can write to the CPSR mode bits to switch to Secure Monitor mode and thereby set or clear the NS bit to read or write all CP15 registers in either bank If debug is permitted only in Non secure state and in Secure User mode then if the processor is stopped in Secure User mode it has no privileged access to any CP15 registers If the processor is stopped in any Non secure mode inc...

Page 513: ...regardless of the value of the I and F bits of the CPSR although these bits are not changed because of the Debug state entry 13 10 3 Exceptions Exceptions are handled as follows while in Debug state Reset This exception is taken as in a normal processor state ARM Thumb or Jazelle This means the processor leaves Debug state as a result of the system reset Prefetch Abort This exception cannot occur ...

Page 514: ...e setting of the CPSR A bit PC CPSR SPSR_abt R14_abt and DSCR 5 2 method of entry bits are unchanged The processor remains in Debug state DSCR 7 sticky imprecise data abort bit is set The imprecise Data Abort is not taken so DFSR is not set and the FAR is not updated Note The DFSR and FAR that are updated depends on if the core is in a Secure or Non secure state The registers that can be read in D...

Page 515: ...ot set If the A bit in the CPSR is set it is pended until the A bit in the CPSR is cleared as for normal operation Table 13 26 lists an example sequence of a memory operation executed in normal operation that eventually causes an imprecise abort when the processor is in Debug state In addition a memory operation issued by the debugger in Debug state causes a second imprecise abort that is ignored ...

Page 516: ...R comprises both a read rDTR and a write portion wDTR a data item written by the core can be held in this register at the same time as one written by the DBGTAP debugger Some flags and control bits of CP14 Debug Register c1 DSCR User mode access to comms channel disable DSCR 12 If this bit is set only privileged software is able to access the debug communications channel That is access the DSCR an...

Page 517: ... 3 69 These instructions enable you to reset the processor memory system to a known safe state and are accessible from both the core and the DBGTAP debugger side When the processor is in Secure User mode and SPIDEN is not asserted only the User mode CP15 registers are accessible with the exception of Invalidate Instruction Cache Range and Flush Entire BTAC that are always accessible in Debug state...

Page 518: ...ing in a system with TLBs Debugging in a system with TLBs has to be as non invasive as possible There has to be a way to put the TLBs in a state where their contents are not affected by the debugging process The processor enables you to put the TLBs in this mode using CP14 c11 See CP14 c11 Debug State MMU Control Register on page 13 23 ...

Page 519: ...Data Abort vector catch debug events are ignored Debug exception on page 13 35 describes debug exception entry The Prefetch Abort handler can check the IFSR and the Data Abort handler can check the DFSR to find out the cause of the exception If the cause was a Debug exception the handler branches to the debug monitor target When the debug monitor target is running it can determine and modify the ...

Page 520: ...word and write it back to the BCR Now the breakpoint is disabled 3 Write the context ID value to the BVR register 4 Write to the BCR with its fields set as follows BCR 22 21 meaning of BVR bit set to b01 to indicate that the value loaded into BVR is to be compared against the CP15 Context Id Register c13 BCR 20 enable linking bit cleared to indicate that this breakpoint is not to be linked BCR 15 ...

Page 521: ...a 19 16 BRPb in this example BCR 15 14 Secure access as required binary representation of b into BCR 9 6 linked BRP field BCRa 8 5 byte address select field as required BCRa 2 1 supervisor access field as required BCRa 0 enable breakpoint set Setting a simple watchpoint You can set a simple watchpoint as follows 1 Read the WCR 2 Clear the WCR 0 enable watchpoint bit in the read word and write it b...

Page 522: ...ue loaded into BVRb is to be compared against the CP15 Context ID Register BCRb 20 enable linking bit set BCR 15 14 Secure access set to b00 BCRb 8 5 byte address select set to b1111 BCRb 2 1 supervisor access set to b11 BCRb 0 enable breakpoint bit set 13 14 3 Setting software breakpoint debug events BKPT To set a software breakpoint on a particular virtual address the debug monitor target must p...
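
The BCRb settings listed in the Page 522 excerpt can be packed into a single register word as a cross-check. The sketch below uses the bit positions named in the excerpts. The value b01 for BCR[22:21] is taken from the Page 520 excerpt because the start of this one is truncated. The function name and its defaults are hypothetical, and reserved bits are simply left at zero.

```python
def make_bcr(meaning, linking, secure_access, byte_select, supervisor, enable,
             linked_brp=0):
    """Pack Breakpoint Control Register fields into a 32-bit word.

    Field positions: [22:21] meaning of BVR, [20] enable linking,
    [19:16] linked BRP, [15:14] Secure access, [8:5] byte address select,
    [2:1] supervisor access, [0] enable breakpoint.
    """
    return (
        ((meaning & 0x3) << 21)
        | ((linking & 0x1) << 20)
        | ((linked_brp & 0xF) << 16)
        | ((secure_access & 0x3) << 14)
        | ((byte_select & 0xF) << 5)
        | ((supervisor & 0x3) << 1)
        | (enable & 0x1)
    )


# Context ID holding BRP, linking enabled, as listed in the excerpt.
example_bcrb = make_bcr(meaning=0b01, linking=1, secure_access=0b00,
                        byte_select=0b1111, supervisor=0b11, enable=1)
```

With these settings the packed word is 0x003001E7.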

Page 523: ... 2 If DSCR 29 wDTRfull flag is set then go to 1 3 Write the word to the wDTR CP14 Debug Register c5 ...
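
The polling sequence in the Page 523 excerpt (check the wDTRfull flag in DSCR[29], retry while it is set, then write the word) can be sketched as a loop. Register access is injected through two hypothetical callables, `read_dscr` and `write_wdtr`, so that the sketch runs without hardware. The `max_polls` bound is an addition for illustration and is not part of the manual's sequence.

```python
WDTR_FULL = 1 << 29  # DSCR[29], the wDTRfull flag


def send_word(read_dscr, write_wdtr, word, max_polls=1000):
    """Write `word` to the wDTR once the wDTRfull flag is clear.

    Returns True if the word was written, False if the flag stayed set
    for `max_polls` consecutive reads.
    """
    for _ in range(max_polls):
        if not (read_dscr() & WDTR_FULL):
            write_wdtr(word & 0xFFFFFFFF)
            return True
    return False


# Simulated channel: the flag is set for two reads, then clears.
_example_dscr_reads = iter([WDTR_FULL, WDTR_FULL, 0])
example_sent = []
example_ok = send_word(lambda: next(_example_dscr_reads),
                       example_sent.append, 0x12345678)
```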

Page 524: ...BGTAP A DBGTAP Restart instruction restarts the integer core 13 15 1 Entering Debug state When a debug event occurs and Halting debug mode is selected and enabled and the core is in a state when debug is permitted then the processor enters Debug state as defined in Debug state on page 13 37 When the core is in Debug state the DBGTAP debugger can determine and modify the processor state and new deb...

Page 525: ... or Jazelle state Setting software breakpoints BKPT To set a software breakpoint the DBGTAP debugger must perform the same steps as the debug monitor target Setting breakpoints watchpoints and vector catch debug events on page 13 45 describes this The difference is that CP14 debug registers are accessed using the DBGTAP scan chains See Chapter 14 Debug Test Access Port Reading and writing to memor...

Page 526: ...equest signal As External debug request signal on page 13 32 describes this input signal forces the core into Debug state if the Debug logic is enabled by DBGEN and debug is permitted DBGNOPWRDWN Powerdown disable signal generated from DSCR 9 When this signal is HIGH the system power controller is forced into Emulate mode This is to avoid losing CP14 Debug state that can only be written through th...

Page 527: ...contains the following sections Debug Test Access Port and Debug state on page 14 2 Synchronizing RealView ICE on page 14 3 Entering Debug state on page 14 4 Exiting Debug state on page 14 5 The DBGTAP port and debug registers on page 14 6 Debug registers on page 14 8 Using the Debug Test Access Port on page 14 21 Debug sequences on page 14 29 Programming debug events on page 14 40 Monitor debug m...

Page 528: ...DBGTAP state Machine DBGTAPSM is illustrated in Figure 14 1 Figure 14 1 JTAG DBGTAP state machine diagram1 1 From IEEE Std 1149 1 2001 Copyright 2001 IEEE All rights reserved tms 1 tms 0 tms 1 tms 1 tms 1 tms 0 tms 1 tms 0 tms 1 tms 1 tms 0 Run Test Idle Test Logic Reset Select DR Scan Select IR Scan tms 1 Capture DR tms 0 tms 0 tms 0 Capture IR tms 0 Shift IR Exit1 IR tms 1 Pause IR tms 0 Exit2 I...
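
The state diagram referenced in the Page 528 excerpt is the standard IEEE 1149.1 TAP controller, which can be written as a transition table indexed by the TMS value sampled on each TCK edge. The sketch below follows the standard state names. The table layout and the `step` and `walk` helpers are illustrative only.

```python
# next_state[state] = (state if TMS == 0, state if TMS == 1)
TAP_NEXT = {
    "Test-Logic-Reset": ("Run-Test/Idle", "Test-Logic-Reset"),
    "Run-Test/Idle": ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-DR-Scan": ("Capture-DR", "Select-IR-Scan"),
    "Capture-DR": ("Shift-DR", "Exit1-DR"),
    "Shift-DR": ("Shift-DR", "Exit1-DR"),
    "Exit1-DR": ("Pause-DR", "Update-DR"),
    "Pause-DR": ("Pause-DR", "Exit2-DR"),
    "Exit2-DR": ("Shift-DR", "Update-DR"),
    "Update-DR": ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-IR-Scan": ("Capture-IR", "Test-Logic-Reset"),
    "Capture-IR": ("Shift-IR", "Exit1-IR"),
    "Shift-IR": ("Shift-IR", "Exit1-IR"),
    "Exit1-IR": ("Pause-IR", "Update-IR"),
    "Pause-IR": ("Pause-IR", "Exit2-IR"),
    "Exit2-IR": ("Shift-IR", "Update-IR"),
    "Update-IR": ("Run-Test/Idle", "Select-DR-Scan"),
}


def step(state, tms):
    """Advance the TAP controller by one TCK edge."""
    return TAP_NEXT[state][1 if tms else 0]


def walk(state, tms_sequence):
    """Apply a sequence of TMS values and return the final state."""
    for tms in tms_sequence:
        state = step(state, tms)
    return state


# From reset, TMS = 0, 1, 0, 0 reaches Shift-DR.
example_state = walk("Test-Logic-Reset", [0, 1, 0, 0])
```

One useful property of this table is that five consecutive TCK edges with TMS high return the controller to Test-Logic-Reset from any state.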

Page 529: ... a TCK signal and waits for the RTCK Returned TCK signal to come back Synchronization is maintained because the off chip device does not progress to the next TCK edge until after an RTCK edge is received Figure 14 2 shows this synchronization Figure 14 2 RealView ICE clock synchronization Note All of the D type flip flops are reset by DBGnTRST D Q D Q D Q D Q TMS TDI CLKIN Input sample and hold RT...

Page 530: ...P controller must pass through Run Test Idle to issue the Halt command to the processor EDBGRQ is asserted If debug is enabled by DBGEN scanning a Halt instruction in through the DBGTAP or asserting EDBGRQ halts the processor and causes it to enter Debug state regardless of the selection of a debug state in DSCR 15 14 This means that a debugger can halt the processor immediately after reset in a s...

Page 531: ...PC before restarting depending on the way the integer core entered Debug state When the state machine enters the Run Test Idle state normal operations resume The delay waiting until the state machine is in Run Test Idle enables conditions to be set up in other devices in a multiprocessor system without taking immediate effect When Run Test Idle state is entered all the processors resume operation ...

Page 532: ... and DBGTDO When the instruction register is loaded with the EXTEST instruction the debug scan chains can be written See Scan chains on page 14 10 b00001 Reserved b00010 Scan_N Selects the Scan Chain Select Register SCREG This instruction connects SCREG between DBGTDI and DBGTDO See Scan chain select register SCREG on page 14 9 b00011 Reserved b00100 Restart Forces the processor to leave Debug sta...

Page 533: ...ruction on page 14 22 for the effects of using this instruction b11110 IDcode See IEEE 1149 1 Selects the DBGTAP controller device ID code register The IDcode instruction connects the device identification register or ID register between DBGTDI and DBGTDO The ID register is a 32 bit register that enables you to determine the manufacturer part number and version of a component using the DBGTAP See ...

Page 534: ...erial data is transferred from DBGTDI to DBGTDO in the Shift DR state with a delay of one TCK cycle There is no parallel output from the bypass register A logic 0 is loaded from the parallel input of the bypass register in the Capture DR state Nothing happens at the Update DR state Order Figure 14 3 shows the order of bits in the bypass register Figure 14 3 Bypass register bit order 14 6 2 Device ...

Page 535: ...ter Figure 14 4 Device ID code register bit order 14 6 3 Instruction register Purpose Holds the current DBGTAP controller instruction Length 5 bits Operating mode When in Shift IR state the shift section of the instruction register is selected as the serial path between DBGTDI and DBGTDO At the Capture IR state the binary value b00001 is loaded into this shift section This is shifted out during Sh...

Page 536: ...e scan chain select register Figure 14 6 Scan chain select register bit order 14 6 5 Scan chains To access the debug scan chains you must 1 Load the Scan_N instruction into the IR Now SCREG is selected between DBGTDI and DBGTDO 2 Load the number of the required scan chain For example load b00101 to access scan chain 5 3 Load either INTEST or EXTEST into the IR 4 Go through the DR leg of the DBGTAP...

Page 537: ...M Architecture Reference Manual This register is read only Therefore EXTEST has the same effect as INTEST Order Figure 14 7 shows the order of bits in scan chain 0 Figure 14 7 Scan chain 0 bit order Scan chain 1 Debug Status and Control Register DSCR Purpose Debug Length 32 bits Description This scan chain accesses CP14 register 1 the DSCR This is mostly a read write register although certain bits...

Page 538: ...truction enable bit This bit enables the mechanism used for executing instructions in Debug state It changes the behavior of the rDTR and wDTR registers the sticky precise Data Abort bit rDTRempty wDTRfull and InstCompl flags See Scan chain 5 on page 14 15 DSCR 6 Sticky precise Data Abort flag If the core is in Debug state and the DSCR 13 execute ARM instruction enable bit is HIGH then this flag i...

Page 539: ...e processor must be in Debug state The DSCR 13 execute ARM instruction enable bit must be set For details of the DSCR see CP14 c1 Debug Status and Control Register DSCR on page 13 7 Scan chain 4 or 5 must be selected INTEST or EXTEST must be selected Ready flag must be captured set That is the last time the DBGTAPSM went through Capture DR the InstCompl flag must have been set The DSCR 6 sticky pr...

Page 540: ...can chain 4 When an instruction is issued to the core in Debug state the PC is not incremented It is only changed if the instruction being executed explicitly writes to the PC For example branch instructions and move to PC instructions If CP14 debug register c5 is a source register for the instruction to be executed the DBGTAP debugger must set up the data in the rDTR before issuing the coprocesso...

Page 541: ...EXTEST selects the rDTR Additionally scan chain 5 contains some status flags These are nRetry Valid and Ready They are the captured versions of the rDTRempty wDTRfull and InstCompl flags respectively All are captured at the Capture DR state Order Figure 14 10 shows the order of bits in scan chain 5 with EXTEST selected Figure 14 11 shows the order of bits in scan chain 5 with INTEST selected Figur...

Page 542: ... is sampled clear meaning that the rDTR is full when going through the Capture DR state then the rDTR is not updated at the Update DR state The InstCompl flag is always set The sticky precise Data Abort flag is Unpredictable See CP14 c1 Debug Status and Control Register DSCR on page 13 7 DSCR 13 1 The wDTR Full flag behaves as if DSCR 13 is clear However the Ready flag can be used for handshaking ...

Page 543: ...chain 6 the use of INTEST and EXTEST differs from their standard use that the start of this section describes Order Figure 14 12 shows the order of bits in scan chain 6 Figure 14 12 Scan chain 6 bit order Scan chain 7 Purpose Debug Length 7 + 32 + 1 = 40 bits Description Scan chain 7 accesses the VCR PC BRPs and WRPs The accesses are performed with the help of read or write request commands A read reque...
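The 40-bit length of scan chain 7 is the sum of a 7-bit address field, a 32-bit data field, and a 1-bit request/ready flag. A minimal packing sketch follows. The field placement (address in the top bits, flag in bit 0) is an assumption, since the bit-order figure is not reproduced in this excerpt, and the function names are hypothetical.

```python
ADDR_BITS, DATA_BITS, FLAG_BITS = 7, 32, 1
CHAIN7_BITS = ADDR_BITS + DATA_BITS + FLAG_BITS  # 40


def pack_chain7(address, data, flag):
    """Pack the three fields into one 40-bit integer (assumed layout)."""
    if not 0 <= address < (1 << ADDR_BITS):
        raise ValueError("address must fit in 7 bits")
    if not 0 <= data < (1 << DATA_BITS):
        raise ValueError("data must fit in 32 bits")
    if flag not in (0, 1):
        raise ValueError("flag must be 0 or 1")
    return (address << (DATA_BITS + FLAG_BITS)) | (data << FLAG_BITS) | flag


def unpack_chain7(value):
    """Split a 40-bit integer back into (address, data, flag)."""
    flag = value & 1
    data = (value >> FLAG_BITS) & ((1 << DATA_BITS) - 1)
    address = (value >> (DATA_BITS + FLAG_BITS)) & ((1 << ADDR_BITS) - 1)
    return address, data, flag
```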

Page 544: ...quested write has completed successfully If the Address field is all 0s address of the NULL register at the Update DR state then no request is generated A request to a reserved register generates Unpredictable behavior Order Figure 14 13 shows the order of bits in scan chain 7 Figure 14 13 Scan chain 7 bit order A typical sequence for writing registers is as follows 1 Scan in the address of a firs...

Page 545: ...ode See Interpreting the PC samples on page 14 20 for details of how to interpret the sampled value The external program counter sample register always reads 0xFFFFFFFF in Debug state or when the core is in a mode when Non invasive debug is not permitted When accessing registers using scan chain 7 the processor can be either in Debug state or in normal state This implies that breakpoints watchpoin...

Page 546: ...ponds to a Thumb state instruction whose 31 most significant bits of the offset address instruction address 4 are given in Data 31 1 If a read request to the PC completes and Data 1 0 equals b10 the read value corresponds to a Jazelle state instruction whose 30 most significant bits of its address are given in Data 31 2 the offset is 0 Because of the state encoding the lower two bits of the Java a...
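The PC sample decoding can be sketched as follows. The Thumb and Jazelle cases follow the text above. The ARM case (Data[1:0] equal to b00, offset of 8) and the test of Data[0] for Thumb state are assumptions, because that part of the description is cut off in this excerpt. The function name is hypothetical.

```python
def decode_pc_sample(data):
    """Return (state, instruction_address) for a 32-bit PC sample.

    A sample of 0xFFFFFFFF is reported as unavailable, because the sample
    register reads that value in Debug state or when non-invasive debug
    is not permitted.
    """
    data &= 0xFFFFFFFF
    if data == 0xFFFFFFFF:
        return "Unavailable", None
    if data & 0b1:
        # Thumb: Data[31:1] holds instruction address + 4
        return "Thumb", ((data & 0xFFFFFFFE) - 4) & 0xFFFFFFFF
    if (data & 0b11) == 0b10:
        # Jazelle: Data[31:2] holds the address, offset is 0
        return "Jazelle", data & 0xFFFFFFFC
    # Assumed ARM case: Data[31:2] holds instruction address + 8
    return "ARM", ((data & 0xFFFFFFFC) - 8) & 0xFFFFFFFF
```

Note that the low two bits of a Jazelle address are not recoverable from the sample, so the returned address is word-aligned.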

Page 547: ...5 is selected the instruction can be issued to the core by making the DBGTAPSM go through the Run Test Idle state provided certain conditions that this section describes are met This mechanism enables re executing the same instruction over and over without having to reload it The DTR can be used in conjunction with the ITR to transfer data in and out of the core For example to read out the value o...

Page 548: ...to the IR at the Update IR state the DBGTAP controller behaves as if EXTEST and scan chain 4 are selected but SCREG retains its value It can be used to speed up certain debug sequences Figure 14 14 shows the effect of the ITRsel IR instruction Figure 14 14 Behavior of the ITRsel IR instruction Consider for example the preceding sequence to store out the contents of ARM register R0 This is the same...

Page 549: ...be this Using the debug communications channel Target to host debug communications channel sequence on page 14 24 Host to target debug communications channel on page 14 24 Transferring data in Debug state on page 14 25 Example sequences on page 14 26 14 7 5 Using the debug communications channel The DCC is defined as the set of resources that the external DBGTAP debugger uses to communicate with a...

Page 550: ...When the DBGTAPSM goes through the Capture DR state with INTEST and scan chain 5 selected the contents of the wDTR are loaded into the Data field of the scan chain This is how the DBGTAP debugger reads the data sent by the software running on the core Valid flag When set this flag indicates to the DBGTAP debugger that the contents of the wDTR that it captured a short time ago are valid nRetry flag...

Page 551: ...n chain 5 See Scan chain 5 on page 14 15 It is used for writing in or reading out the data and for monitoring the state of the execution rDTR When the DBGTAPSM goes through the Update DR state with EXTEST and scan chain 5 selected and the Ready flag set the contents of the Data field are loaded into the rDTR wDTR When the DBGTAPSM goes through the Capture DR state with INTEST or EXTEST selected th...

Page 552: ...contents of the DSCR This clears the sticky precise Data Abort sticky imprecise Data Abort flags and sticky Undefined flags 5 Scan_N into the IR 6 4 into the SCREG 7 EXTEST into the IR 8 Scan in the LDC p14 c5 R0 4 instruction into the ITR 9 Scan_N into the IR 10 5 into the SCREG 11 INTEST into the IR 12 Go through Run Test Idle state The instruction loaded into the ITR is issued to the processor ...

Page 553: ...the SCREG 11 EXTEST into the IR 12 Scan in 34 bits the least significant 32 holding the word to be sent At the same time 34 bits are scanned out If the Ready flag is clear repeat this step 13 Go through Run Test Idle state 14 Go to step 12 again for writing in more data 15 Scan in 34 bits All the values are don t care At the same time 34 bits are scanned out If the Ready flag is clear repeat this ...

Page 554: ... Note If the sticky imprecise Data Abort flag is set an imprecise Data Abort has occurred and the sequence restarts at step 1 after the cause of the abort is fixed and c0 is reloaded ...

Page 555: ...n Halting debug mode Monitor debug mode debugging on page 14 42 describes the monitor debug mode debugging 14 8 1 Debug macros The debug code sequences in this section are written using a fixed set of macros The mapping of each macro into a debug scan chain sequence is given in this section Scan_N n Select scan chain register number n 1 Scan the Scan_N instruction into the IR 2 Scan the number n i...

Page 556: ...is set the value of the Ready flag is stored in stateout If the DSCR 13 execute ARM instruction enable bit is clear the nRetry or Valid flag depending on whether EXTEST or INTEST is selected is stored in stateout 3 If scan chain 1 is selected scan in 32 bit datain value for DSCR 31 0 Stateout and dataout fields are not used in this case DATAOUT dataout 1 Scan out a data value DSCR scan chain 1 and...

Page 557: ...reakpoints watchpoints and vector catches reset disabled on power up 14 8 3 Forcing the processor to halt Scan the Halt instruction into the DBGTAP controller IR and go through Run Test Idle 14 8 4 Entering Debug state To enter Debug state you must 1 Check whether the core has entered Debug state as follows SCAN_N 1 select DSCR INTEST LOOP DATAOUT readDSCR UNTIL readDSCR 0 1 until Core Halted bit ...
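The polling loop in step 1 can be sketched in Python. The `read_dscr` callable is a hypothetical stand-in for whatever routine scans the DSCR out through scan chain 1 with INTEST selected, and the poll limit is an addition for safety that the original sequence does not specify.

```python
def wait_for_core_halted(read_dscr, max_polls=1000):
    """Poll the DSCR until bit 0 (Core Halted) reads 1.

    read_dscr -- hypothetical callable returning the 32-bit DSCR value
    Returns the number of polls that were needed.
    """
    for polls in range(1, max_polls + 1):
        if read_dscr() & 0x1:  # DSCR[0] is the Core Halted bit
            return polls
    raise TimeoutError("core did not enter Debug state")
```

A quick check with a fake register source: `wait_for_core_halted(iter([0, 0, 1]).__next__)` returns after the third read.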

Page 558: ...on ends c Read R0 using the standard sequence of Reading a current mode ARM register in the range R0 R14 on page 14 34 8 Store out CPSR using the standard sequence of Reading the CPSR SPSR on page 14 35 9 Store out PC using the standard sequence of Reading the PC on page 14 36 10 Adjust the PC to enable you to resume execution later subtract 0x8 from the stored value if the processor was in ARM st...

Page 559: ...0 c5 0 instruction to copy R0 into CP14 debug register c5 RTI LOOP INST 0x00000000 Ready UNTIL Ready 1 wait until the instruction ends 5 Restore CPSR using the standard CPSR writing sequence that Writing the CPSR SPSR on page 14 35 describes 6 Restore the PC using the standard sequence of Writing the PC on page 14 36 7 Restore R0 using the standard sequence of Writing a current mode ARM register i...

Page 560: ...y 1 wait until the instruction ends Save value in readData Note Register R15 cannot be read in this way because the effect of the required MCR is to take an Undefined exception 14 8 7 Writing a current mode ARM register in the range R0 R14 Use the following sequence to write a current mode ARM register in the range R0 R14 SCAN_N 5 select DTR ITRSEL select the ITR and EXTEST INST MRC p14 0 Rd c0 c5...

Page 561: ...0 is used as a temporary register 1 Load the required value into R0 using the standard sequence that Writing a current mode ARM register in the range R0 R14 on page 14 34 describes Now scan chain 5 and EXTEST are selected 2 Move the contents of R0 to CPSR SPSR ITRSEL select the ITR and EXTEST INST MSR CPSR R0 or SPSR RTI LOOP INST 0x00000000 Ready UNTIL Ready 1 wait until the instruction ends This...

Page 562: ...based read and write sequences are substantially more efficient than the halfword and byte sequences This is because the ARM LDC and STC instructions always perform word accesses and this can be used for efficient access to word width memory Halfword and byte accesses must be done with a combination of loads or stores and coprocessor register transfers This is much less efficient When writing data...

Page 563: ...ns caused a precise Data Abort All the instructions that follow are not executed Register R0 points to the next word to be written and after the cause for the abort has been fixed the sequence resumes at step 1 Note If the sticky imprecise Data Abort flag is set an imprecise Data Abort has occurred and the sequence restarts at step 1 after the cause of the abort is fixed and R0 is reloaded 14 8 ...

Page 564: ...n page 14 34 describes on register R1 Now scan chain 5 and INTEST are selected 3 If there are more halfwords or bytes to be read go to 1 4 Check for aborts as Reading memory as words on page 14 36 describes 14 8 16 Writing memory as halfwords bytes This sequence assumes that R0 has been set to the address to store data to prior to running this sequence Register R0 is post incremented so that it ca...

Page 565: ...struction ends 2 Use the standard sequence that Reading a current mode ARM register in the range R0 R14 on page 14 34 describes 14 8 19 Writing coprocessor registers 1 Write the value onto R0 using the standard sequence See Writing a current mode ARM register in the range R0 R14 on page 14 34 for more details Scan chain 5 and EXTEST are selected 2 Transfer the contents of R0 to a coprocessor regis...

Page 566: ...gisters using scan chain 7 A typical sequence for writing to a register using scan chain 7 is as follows SCAN_N 7 select ITR EXTEST REQ 1stAddr2Wr 1stData2Wr 0b1 write request for register 1stAddr2write FOR i 2 i Words2Write i DO LOOP REQ ithAddr2Wr ithData2Wr 1 Ready ith write request while waiting UNTIL Ready 1 wait until the previous request completes ENDFOR LOOP REQ 0 0 0 Ready null request wh...

Page 567: ...CP14 debug registers using scan chain 7 14 9 4 Setting software breakpoints To set a software breakpoint on a certain Virtual Address a debugger must go through the following steps 1 Read memory location and save actual instruction 2 Write the BKPT instruction to the memory location 3 Read memory location again to check that the BKPT instruction got written 4 If it is not written determine the rea...

Page 568: ... exception is taken the handler uses the DCC to transmit status information to and receive commands from the host using a DBGTAP debugger Monitor debug mode is essential in real time systems when the core cannot be halted to collect information 14 10 1 Receiving data from the core SCAN_N 5 select DTR INTEST FOREACH Data2Read LOOP DATA 0x00000000 Valid readData UNTIL Valid 1 wait until instruction ...

Page 569: ... Chapter 15 Trace Interface Port This chapter describes the Embedded Trace Macrocell ETM support for the processor It contains the following section About the ETM interface on page 15 2 ...

Page 570: ...efore any data transfers associated with them as required by the ETM protocol Table 15 1 lists the instruction interface signals ETMIA is used for branch target address calculation Other than this the ETM must know for each cycle the current address of the instruction in execute and the address of any branch phantom progressing through the pipeline The processor does not maintain the address of br...

Page 571: ...fied by 17 IASlotKill Kill outstanding slots IAException 16 IADAbort Data Abort IAException 15 IAExCancel Exception canceled previous instruction IAException 12 14 IAExInt b001 IRQb101 FIQb100 Java exception b110 Precise Data Abortb000 Other exception IAException 11 IAException Instruction is an exception vector Nonea 10 IABounce Kill the data slot associated with this instruction There is only ev...

Page 572: ...tion For more information on the ETM protocol see the Embedded Trace Macrocell Architecture Specification 15 1 2 Secure control bus The Secure control bus ETMIASECCTL indicates when the processor is in Secure state and when the data trace is prohibited Table 15 3 lists the signals in the Secure control bus ETMIASECCTL 15 1 3 Data address interface Data addresses are sampled at the ADD stage becaus...

Page 573: ...ers is not one word greater than the previous transfer and therefore the transfer must have its address re output During an unaligned access this signal is only valid on the first transfer of the access DASlot 00 16 DALast The data transfer is the last for this data instruction This signal is asserted for both halves of an unaligned access A related signal DAFirst can be implied from this signal b...

Page 574: ...ocessor interface This interface enables an ETM to monitor a sub set of CP14 and CP15 operations Rather than using the external coprocessor interface the core provides a dedicated cut down coprocessor interface similar to that used by the debug logic Table 15 6 Data value interface signals Signal name Description Qualified by ETMDDCTL 3 0 Data value interface control signals ETMDD 63 0 Contains th...

Page 575: ...MCPSECCTL 1 0 signals Figure 15 1 shows the format of the ETMCPADDRESS 14 0 signals Figure 15 1 ETMCPADDRESS format Table 15 9 Coprocessor interface signals Signal name Direction Description Qualified by Reg bound ETMCPENABLE Output Interface enable ETMCPWRITE and ETMCPADDRESS are valid this cycle and the remaining signals are valid two cycles later None No latea ETMCPCOMMIT Output Commit If this ...

Page 576: ...at Table 15 11 lists are also connected to the core Table 15 11 Other connections Signal name Direction Description EVNTBUS 19 0 Output Gives the status of the performance monitoring events See c15 Performance Monitor Control Register on page 3 133 ETMEXTOUT 1 0 Input Provides feedback to the core of the EVNTBUS signals after being passed through ETM triggering facilities and comparators This enab...

Page 577: ... page 16 7 QADD QDADD QSUB and QDSUB instructions on page 16 9 ARMv6 media data processing on page 16 10 ARMv6 Sum of Absolute Differences SAD on page 16 11 Multiplies on page 16 12 Branches on page 16 14 Processor state updating instructions on page 16 15 Single load and store instructions on page 16 16 Load and Store Double instructions on page 16 19 Load and Store Multiple Instructions on page ...

Page 578: ...rms on page 16 5 16 1 1 Changes in instruction flow overview To minimize the number of cycles because of changes in instruction flow the processor includes a dynamic branch predictor static branch predictor return stack The dynamic branch predictor is a 128 entry direct mapped branch predictor using VA bits 9 3 The prediction scheme uses a two bit saturating counter for predictions that are Strong...
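The indexing and counter behavior of the dynamic branch predictor can be sketched as below. The index calculation follows the text (128 entries, VA bits [9:3]). The numeric encoding of the four counter states (0 = strongly not taken up to 3 = strongly taken) is an assumption, and the function names are hypothetical.

```python
ENTRIES = 128  # direct-mapped predictor size, per the text


def predictor_index(virtual_address):
    """Select one of 128 entries using VA bits [9:3]."""
    return (virtual_address >> 3) & (ENTRIES - 1)


def update_counter(counter, taken):
    """Two-bit saturating counter update.

    Assumed encoding: 0 strongly not taken, 1 weakly not taken,
    2 weakly taken, 3 strongly taken.
    """
    if taken:
        return min(counter + 1, 3)
    return max(counter - 1, 0)


def predict_taken(counter):
    """Predict taken for the two 'taken' states."""
    return counter >= 2
```

Because only bits [9:3] are used, two branches whose addresses differ by a multiple of 1KB map to the same entry.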

Page 579: ...lete underneath outstanding loads Extensive forwarding to the Sh MAC1 ADD ALU MAC2 and DC1 stages enables many dependent instruction sequences to run without pipeline stalls General forwarding occurs from the ALU Sat WBex and WBls pipeline stages In addition the multiplier contains an internal multiply accumulate forwarding path Most instructions do not require a register until the ALU stage All r...

Page 580: ...variable number of cycles to execute The number of cycles is dependent on the length of the operation the number of cycles between the setting of the flags and the start of the dependent instruction The worst case number of cycles for a condition code failing multicycle instruction is five The following algorithm describes the number of cycles taken for multi cycle instructions that condition code...

Page 581: ...cycles before the result of this instruction is available for a following instruction requiring the result at the start of the ALU MAC2 and DC1 stage This is the normal case Exceptions to this mark the register as an Early Reg Note The result latency is the number of cycles from the first cycle of an instruction Register Lock Latency For STM and STRD instructions only This is the number of cycles ...

Page 582: ...ns take one cycle and have a result latency of one Table 16 3 Register interlock examples Instruction sequence Behavior LDR R1 R2 ADD R6 R5 R4 Takes two cycles because there are no register dependencies ADD R1 R2 R3 ADD R9 R6 R1 Takes two cycles because ADD instructions have a result latency of one LDR R1 R2 ADD R6 R5 R1 Takes four cycles because of the result latency of R1 ADD R1 R5 R6 LDR R2 R1 ...
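The interlock examples in Table 16-3 can be reproduced with a very small in-order model. This is a simplified sketch: it ignores Early Reg and Late Reg effects, and the LDR result latency of three used in the usage note is an assumption chosen to match the four-cycle example. The tuple layout and function name are hypothetical.

```python
def sequence_cycles(instructions):
    """Estimate total cycles for a straight-line sequence.

    Each instruction is (dest, sources, issue_cycles, result_latency).
    An instruction cannot start until all its source registers are ready.
    """
    ready_at = {}  # register name -> cycle at which its value is available
    cycle = 0
    for dest, sources, issue_cycles, result_latency in instructions:
        start = max([cycle] + [ready_at.get(reg, 0) for reg in sources])
        if dest is not None:
            ready_at[dest] = start + result_latency
        cycle = start + issue_cycles
    return cycle
```

For example, `sequence_cycles([("R1", ["R2"], 1, 3), ("R6", ["R5", "R1"], 1, 1)])` models `LDR R1,[R2]` followed by `ADD R6,R5,R1` and evaluates to four cycles, while the same pair without the dependency evaluates to two.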

Page 583: ...ven separately in the table For condition code failing cycle counts the cycles for the non PC destination variants must be used Table 16 4 Data Processing Instruction cycle timing behavior if destination is not PC Example Instruction Cycle s Earl y Reg Late Reg Result latency Comment ADD Rd Rn Rm 1 1 Normal case ADD Rd Rn Rm LSL immed 1 Rm 1 Requires a shifted source register ADD Rd Rn Rm LSL Rs 2...

Page 584: ...r controlled shifts take two cycles to execute the register containing the shift distance is read in the first cycle the shift is performed in the second cycle The final operand is not required until the ALU stage for the second cycle Because a shift distance is required the register containing the shift distance is an Early Reg and incurs an extra interlock penalty For example the following seque...

Page 585: ...g arithmetic Their result is produced during the Sat stage consequently they have a result latency of two The QDADD and QDSUB instructions must double and saturate the register Rn before the addition This operation occurs in the Sh stage of the pipeline consequently this register is an Early Reg Table 16 6 lists the cycle timing behavior for QADD QDADD QSUB and QDSUB instructions Table 16 6 QADD Q...
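The arithmetic performed by the saturating instructions can be sketched in Python. The sketch models only the 32-bit signed result values. It does not model the Q flag or any pipeline timing, and the function names are illustrative.

```python
INT32_MIN = -(2 ** 31)
INT32_MAX = 2 ** 31 - 1


def sat32(value):
    """Clamp an integer to the signed 32-bit range."""
    return max(INT32_MIN, min(INT32_MAX, value))


def qadd(rm, rn):
    """QADD: saturating add."""
    return sat32(rm + rn)


def qsub(rm, rn):
    """QSUB: saturating subtract."""
    return sat32(rm - rn)


def qdadd(rm, rn):
    """QDADD: double and saturate Rn first, then saturating add."""
    return sat32(rm + sat32(2 * rn))


def qdsub(rm, rn):
    """QDSUB: double and saturate Rn first, then saturating subtract."""
    return sat32(rm - sat32(2 * rn))
```

The two saturation steps in `qdadd` and `qdsub` mirror the text: the doubling of Rn saturates before the addition or subtraction takes place.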

Page 586: ...fore are marked as requiring an Early Reg Table 16 7 ARMv6 media data processing instructions cycle timing behavior Instructions Cycle s Early Reg Result latency SADD16 SSUB16 SADD8 SSUB8 1 1 USAD8 USADA8 1 Rm Rs 3 UADD16 USUB16 UADD8 USUB8 1 1 SEL 1 1 QADD16 QSUB16 QADD8 QSUB8 1 2 SHADD16 SHSUB16 SHADD8 SHSUB8 1 2 UQADD16 UQSUB16 UQADD8 UQSUB8 1 2 UHADD16 UHSUB16 UHADD8 UHSUB8 1 2 SSAT16 USAT16 1...

Page 587: ...rly Reg Result latency USAD8 1 Rm Rs 3a a Result latency is one less If the destination is the accumulate for a subsequent USADA8 USADA8 1 Rm Rs 3 Table 16 9 Example interlocks Instruction sequence Behavior USAD8 R1 R2 R3 ADD R5 R6 R1 Takes four cycles because USAD8 has a Result latency of three and the ADD requires the result of the USAD8 instruction USAD8 R1 R2 R3 MOV R9 R9 MOV R9 R9 ADD R5 R6 R...
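For reference, the value computed by the SAD instructions can be sketched as follows. This models only the result, not the timing, and the Python function names are illustrative.

```python
def usad8(rm, rs):
    """USAD8: sum of absolute differences of the four unsigned bytes."""
    total = 0
    for shift in (0, 8, 16, 24):
        byte_m = (rm >> shift) & 0xFF
        byte_s = (rs >> shift) & 0xFF
        total += abs(byte_m - byte_s)
    return total


def usada8(rm, rs, rn):
    """USADA8: USAD8 result accumulated into Rn, wrapped to 32 bits."""
    return (usad8(rm, rs) + rn) & 0xFFFFFFFF
```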

Page 588: ... low half of the result always available first The multiplicand and multiplier are required as Early Regs because they are both required at the start of MAC1 Table 16 10 lists the cycle timing behavior of example multiply instructions Table 16 10 Example multiply instruction cycle timing behavior Example Instruction Cycle s Cycles if sets flags Early Reg Late Reg Result latency MUL S 2 5 Rm Rs 4 M...

Page 589: ... Note Result latency is one less if the result is used as the accumulate register for a subsequent multiply accumulate ...

Page 590: ...tion B immed BL immed BLX immed 1 Not folded dynamic prediction B immed BL immed BLX immed 1 Correct not taken static prediction B immed BL immed BLX immed 4 Correct taken static prediction B immed BL immed BLX immed 5 7a a Mispredicted branches including taken unpredicted branches take a varying number of cycles to execute depending on their distance from a flag setting instruction The timing be...

Page 591: ... MSR MRS CPS and SETEND instructions Table 16 12 lists processor state updating instructions and their cycle timing behavior Table 16 12 Processor state updating instructions cycle timing behavior instruction Cycles Comments MRS 1 All MRS instructions MSR CPSR_f s fs 2 MSRs to CPSR flags and or status MSR 4 All other MSRs to the CPSR MSR SPSR 5 All MSRs to the SPSR CPS effect iflags 1 Interrupt ma...

Page 592: ...Mv6 unaligned support is enabled and the final access address is unaligned there is an extra cycle of result latency PLD data preload hint instructions have cycle timing behavior as for load instructions Because they have no destination register the result latency is not applicable for such instructions Because a PLD instruction is treated as any other load instruction by all levels of cache stand...

Page 593: ... 14 use Table 16 14 Cycle timing behavior for loads to the PC Example instruction Cycle s Memory cycles Result latency Comments LDR pc sp cns 4 1 Correctly return stack predicted LDR pc sp cns 4 1 Correctly return stack predicted LDR pc sp cns 9 1 Return stack mispredicted LDR pc sp cns 9 1 Return stack mispredicted LDR cond pc sp cns 8 1 Conditional return or empty return stack LDR cond pc sp cns...

Page 594: ...tions reusing the same base register there is a local forwarding path to recycle the updated base register around the ADD stage For example the following instruction sequence takes three cycles to execute LDR R5 R2 4 LDR R6 R2 0x10 LDR R7 R2 0x20 LDR Rd Rn Rm Rm If negative register offset or shift other than LSL 2 then two issue cycles LDR Rd Rm Rm shf cns Rm LDR Rd Rn Rm Rm LDR Rd Rn Rm shf cns R...

Page 595: ...base is available to the following load store instruction with a result latency of 0 To prevent instructions after a STRD from writing to a register before it has stored that register the STRD registers have a lock latency that determines how many cycles it is before a subsequent instruction that writes to that register can start Table 16 16 lists the cycle timing behavior for LDRD and STRD instru...

Page 596: ...ift LSL 2 then one issue cycle LDRD Rd Rn Rm Rn Rm LDRD Rd Rn Rm LSL 2 Rn Rm LDRD Rd Rn cns Rn LDRD Rd Rn Rm Rn Rm LDRD Rd Rn Rm LSL 2 Rn Rm addr_md_2cycle LDRD Rd Rn Rm Rm If negative register offset or shift other than LSL 2 then two issue cycles LDRD Rd Rm Rm shf cns Rm LDRD Rd Rn Rm Rm LDRD Rd Rn Rm shf cns Rm Table 16 17 addr_md_1cycle and addr_md_2cycle LDRD example instruction explanation c...

Page 597: ... a store multiple has stored that register the register list has a lock latency that determines how many cycles it is before a subsequent instruction that writes to that register can start 16 12 1 Load and Store Multiples other than load multiples including the PC In all cases the base register Rx is an Early Reg Table 16 18 lists the cycle timing behavior of load and store multiples including the...

Page 598: ...ently a condition code failing LDM to the PC takes one cycle In all cases the base register Rx is an Early Reg and requires an extra cycle of result latency to provide its value Table 16 19 lists the cycle timing behavior of Load Multiples where the PC is in the register list 16 12 3 Example Interlocks The following sequence that has an LDM instruction takes five cycles because R3 has a result late...

Page 599: ...y cycles It first loads the SPSR value from the stack and then the return address The SRS instruction takes one or two memory cycles depending on double word alignment first address location In all cases the base register is an Early Reg and requires an extra cycle of result latency to provide its value Table 16 20 lists the cycle timing behavior for RFE and SRS instructions Table 16 20 RFE and SR...

Page 600: ...n extra cycle of result latency to provide its value Table 16 21 lists the synchronization instructions cycle timing behavior CLREX instructions have cycle timing behavior as for load instructions Because they have no destination register the result latency is not applicable for such instructions Table 16 21 Synchronization Instructions cycle timing behavior Instruction Cycle s Memory Cycles Resul...

Page 601: ...ecise timing of coprocessor instructions is tightly linked with the behavior of the relevant coprocessor The numbers in Table 16 22 are best case numbers For LDC STC instructions the coprocessor can determine how many words are required Table 16 22 lists the coprocessor instructions cycle timing behavior Table 16 22 Coprocessor Instructions cycle timing behavior Instruction Cycle s Memory cycles R...

Page 602: ...etch Abort In all cases the exception is taken in the WBex stage of the pipeline SVC SMC and most Undefined instructions that fail their condition codes take one cycle A small number of undefined instructions that fail their condition codes take two cycles Table 16 23 lists the SVC SMC BKPT undefined prefetch aborted instructions cycle timing behavior Table 16 23 SVC BKPT undefined prefetch aborte...

Page 603: ... 16 17 No operation The no operation instruction NOP takes two cycles ...

Page 604: ... 16 18 Thumb instructions The cycle timing behavior for Thumb instructions follows the ARM equivalent instruction cycle timing behavior Thumb BL instructions that are encoded as two Thumb instructions can be dynamically predicted The prediction occurs on the second part of the BL pair consequently a correct prediction takes two cycles ...

Page 605: ... Chapter 17 AC Characteristics This chapter gives the timing diagrams and timing parameters for the processor This chapter contains the following sections Processor timing diagrams on page 17 2 Processor timing parameters on page 17 3 ...

Page 606: ... 17 1 Processor timing diagrams The AMBA AXI bus interface of the processor conforms to the AMBA Specification See this document for the relevant timing diagrams ...

Page 607: ...bal signal timing parameters Table 17 2 lists the AXI interface timing parameters Table 17 1 Global signals Name Minimum input delay Maximum input delay ACLKEND Clock uncertainty 40 ACLKENI Clock uncertainty 40 ACLKENP Clock uncertainty 40 ACLKENRW Clock uncertainty 40 ARESETDn Clock uncertainty 20 ARESETIn Clock uncertainty 20 ARESETPn Clock uncertainty 20 ARESETRWn Clock uncertainty 20 nPORESETI...

Page 608: ...ncertainty 70 RDATAP 31 0 Clock uncertainty 70 RDATARW 63 0 Clock uncertainty 70 RLASTD Clock uncertainty 70 RLASTI Clock uncertainty 70 RLASTP Clock uncertainty 70 RLASTRW Clock uncertainty 70 RRESPD 1 0 Clock uncertainty 70 RRESPI 1 0 Clock uncertainty 70 RRESPP 1 0 Clock uncertainty 70 RRESPRW 1 0 Clock uncertainty 70 RVALIDD Clock uncertainty 50 RVALIDI Clock uncertainty 50 RVALIDP Clock uncer...

Page 609: ...k uncertainty 70 CPALENGTHHOLD Clock uncertainty 70 CPALENGTHT 3 0 Clock uncertainty 70 CPAPRESENT 11 0 Clock uncertainty 70 CPASTDATA 63 0 Clock uncertainty 70 CPASTDATAT 3 0 Clock uncertainty 70 CPASTDATAV Clock uncertainty 70 Table 17 4 ETM interface signals Name Minimum input delay Maximum input delay ETMEXTOUT 1 0 Clock uncertainty 60 ETMPWRUP Clock uncertainty 60 nETMWFIREADY Clock uncertain...

Page 610: ...y 20 EDBGRQ Clock uncertainty 60 DBGEN Clock uncertainty 60 DBGVERSION 3 0 Clock uncertainty 50 DBGMANID 10 0 Clock uncertainty 50 SPIDEN Clock uncertainty 60 SPNIDEN Clock uncertainty 60 Table 17 7 Test signals Name Minimum input delay Maximum input delay SE Clock uncertainty 20 RSTBYPASS Clock uncertainty 20 MTESTON Clock uncertainty 60 MBISTDIN 63 0 Clock uncertainty 60 MBISTADDR 12 0 Clock unc...

Page 611: ... Table 17 9 lists the internal TrustZone signal port timing parameters Table 17 9 TrustZone internal signals Name Minimum input delay Maximum input delay CP15SDISABLE Clock uncertainty 60 ...

Page 612: ... the following sections About the VFP11 coprocessor on page 18 2 Applications on page 18 3 Coprocessor interface on page 18 4 VFP11 coprocessor pipelines on page 18 5 Modes of operation on page 18 11 Short vector instructions on page 18 13 Parallel execution of instructions on page 18 14 VFP11 treatment of branch instructions on page 18 15 Writing optimal VFP11 code on page 18 16 VFP11 revision in...

Page 613: ...parallel with other arithmetic operations to reduce the impact of long latency operations near IEEE 754 standard compatibility in RunFast mode without support code assistance providing determinable run time calculations for all input data low power consumption small die size and reduced kernel code The VFP11 coprocessor is an ARM enhanced numeric coprocessor that provides operations that are compa...

Page 614: ...as personal digital assistants and smartphones for graphics voice compression and decompression user interfaces Java interpretation and Just In Time JIT compilation games machines for three dimensional graphics and digital audio printers and MultiFunction Peripheral MFP controllers for high definition color rendering set top boxes for digital audio and digital video and three dimensional user inte...

Page 615: ...D number 10 for single precision instructions and coprocessor ID number 11 for double precision instructions In some cases such as mixed precision instructions the coprocessor ID represents the destination precision In a system containing a VFP11 coprocessor these coprocessor ID numbers must not be used by another coprocessor Access to the VFP11 coprocessor is controlled by the ARM11 Coprocessor A...

Page 616: ...lines can operate in parallel enabling more than one instruction to be completed per cycle Instructions issued to the FMAC pipeline can complete out of order with respect to operations in the LS and DS pipelines This out of order completion might be visible to you when a short vector FMAC or DS operation generates an exception and an LS operation begins before the exception is detected The destina...

Page 617: ...oat FTOUI Convert float to unsigned integer FSITO Convert signed integer to float FTOSI Convert float to signed integer FTOUIZ Convert float to unsigned integer with forced round towards zero mode FTOSIZ Convert float to signed integer with forced round towards zero mode FCMP Compare To register file E3 E5 E2 Issue E6 Read port Fm Read port Fd Read port Fn Read port Fm Read port Fn Load forward DS...

Page 618: ...n and checked for exceptions The final result is identical to the equivalent sequence of operations executed in sequence Exception processing and status reporting also reflect the independence of the components of the chained operations As an example the FMAC instruction performs a chained multiply and add operation with the following sequence of operations 1 The product of the operands in the Fn ...

Page 619: ...olve data transfer to and from the ARM11 processor including loads stores moves to coprocessor system registers and moves from coprocessor system registers It remains synchronized with the ARM11 LS pipeline for the duration of the instruction Data written to the ARM11 processor is read from the VFP11 coprocessor register file in the Issue stage and transferred to the ARM11 processor in the next cy...

Page 620: ...11 register file to memory FSTM Store up to 32 single precision or integer values or 16 double precision values from the VFP11 register file to memory FMSR Move a single precision or integer value from an ARM11 register to a VFP11 single precision register FMRS Move a single precision or integer value from a VFP11 single precision register to an ARM11 register FMDHR Move an ARM11 register value to...

Page 621: ...s to a VFP11 double precision register FMRRD Move a double precision VFP11 register value to two ARM11 registers FMSRR Move two ARM11 register values to two consecutively numbered VFP11 single precision registers FMRRS Move two consecutively numbered VFP11 single precision register values to two ARM11 registers FMXR Move an ARM11 register value to a VFP11 control register FMRX Move a VFP11 control...

Page 622: ...rt code for assistance. The operations requiring support code are: any CDP operation involving a subnormal input when not in flush-to-zero mode (enable flush-to-zero mode by setting the FZ bit, FPSCR[24]); any CDP operation involving a NaN input when not in default NaN mode (enable default NaN mode by setting the DN bit, FPSCR[25]); any CDP operation that has the potential of generating an underflow condition...

Page 623: ... and FCPY operations, all other CDP operations ignore any information in the fraction bits of an input NaN. See NaN handling on page 20-4 for a description of default NaNs. 18.5.4 RunFast mode. RunFast mode is the combination of the following conditions: the VFP11 coprocessor is in flush-to-zero mode; the VFP11 coprocessor is in default NaN mode; all exception enable bits are cleared. In RunFast mode the ...
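The three RunFast conditions can be checked against an FPSCR value with a few bit masks. This is a minimal sketch, not ARM code: the function names are hypothetical, FZ, DN, IOE, DZE, IXE and IDE use the bit positions quoted elsewhere in this manual, and OFE (bit 10) and UFE (bit 11) are assumed from the VFPv2 FPSCR layout.

```python
# FPSCR bit masks. FZ = FPSCR[24] and DN = FPSCR[25] are quoted in the manual.
FZ = 1 << 24
DN = 1 << 25

# Exception enable bits: IOE[8], DZE[9], OFE[10], UFE[11], IXE[12], IDE[15].
# OFE and UFE positions are an assumption taken from the VFPv2 layout.
EXCEPTION_ENABLES = (1 << 8) | (1 << 9) | (1 << 10) | (1 << 11) | (1 << 12) | (1 << 15)


def is_runfast(fpscr: int) -> bool:
    """Return True when FZ and DN are set and every exception enable is clear."""
    return bool(fpscr & FZ) and bool(fpscr & DN) and (fpscr & EXCEPTION_ENABLES) == 0


def to_runfast(fpscr: int) -> int:
    """Return a copy of fpscr with the RunFast conditions applied."""
    return (fpscr | FZ | DN) & ~EXCEPTION_ENABLES & 0xFFFFFFFF
```

On hardware, the resulting value would be written back with an FMXR instruction; that step is outside this sketch.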

Page 624: ...in graphics and signal processing applications They reduce code size increase speed of execution by supporting parallel operations and multiple transfers and simplify algorithms with high data throughput Short vector operations issue the individual operations specified in the instruction in a serial fashion To eliminate data hazards short vector operations begin execution only after all source reg...

Page 625: ...her when initial processing is completed This makes it possible to issue a short vector operation and a load or store multiple operation in the next cycle and have both executing at the same time provided no data hazards exist between the two instructions With this mechanism algorithms that can be double buffered can be written to hide much of the time to transfer data to and from the VFP11 coproc...

Page 626: ...TAT instruction. This enables you to use the ARM11 branch instructions and conditional execution capabilities to execute conditional floating-point code. In some cases, full IEEE 754 standard comparisons are not required. Simple comparisons of single-precision data, such as comparisons to zero or to a constant, can be done using an FMRS transfer and the ARM11 CMP and CMN instructions. This method is fa...

Page 627: ...of the short vector DS operation issues from the Execute 1 stage If the short vector DS operation can be separated other VFP11 instructions can be issued in the cycles immediately following the divide or square root See Parallel execution on page 21 20 The best performance for data intensive applications requires double buffering looped short vector instructions The register banks can be divided t...

Page 628: ... 18.10 VFP11 revision information. This manual describes the fifth version of the VFP11 coprocessor. Updates in the fifth version of the VFP11 coprocessor are: corrections for errata; an update to the FPSID register to reflect the fifth version. There are no other functional differences between the VFP11 fourth and fifth versions ...

Page 629: ...rocessor that are useful to programmers It contains the following sections About the register file on page 19 2 Register file internal formats on page 19 3 Decoding the register file on page 19 5 Loading operands from ARM11 registers on page 19 6 Maintaining consistency in register precision on page 19 8 Data transfer between memory and VFP11 registers on page 19 9 Access to register banks in CDP ...

Page 630: ...le can be configured as four circular buffers for use by short vector instructions in applications requiring high data throughput such as filtering and graphics transforms For short vector instructions register addressing is circular within each bank Because load and store operations do not circulate you can load or store multiple banks up to the entire register file with a single instruction Shor...

Page 631: ...ccessing a register that has not been initialized or loaded with valid data is Unpredictable A way to detect access to an uninitialized register is to load all registers with Signaling NaNs SNaNs in the precision of the initial access of the register and enable the Invalid Operation exception 19 2 1 Integer data format The VFP11 coprocessor supports signed and unsigned 32 bit integers Signed integ...

Page 632: ...the exponent bits 30 20 the upper 20 bits of the fraction bits 19 0 The LSW contains the lower 32 bits of the fraction The IEEE 754 standard defines the double precision data format used in the VFP11 coprocessor See the IEEE 754 standard for details about exponent bias special formats and numerical ranges 31 Exponent Fraction upper 20 bits S 30 20 19 0 Fraction lower 32 bits Double precision MSW D...

Page 633: ...nd the least significant bit is the M N or D bit for each instruction format For instructions with double precision operands or destinations the M N and D bit corresponding to a double precision access must be zero Figure 19 3 shows the register file See the ARM Architecture Reference Manual for instruction formats and the positions of these bits Figure 19 3 Register file access 31 0 S1 S3 S7 S5 S...

Page 634: ... FPINST2 FMDLR Dn 31 0 Rd Move from ARM11 register Rd to lower half of VFP11 double precision register Dn FMDHR Dn 63 32 Rd Move from ARM11 register Rd to upper half of VFP11 double precision register Dn FMSR Sn Rd Move from ARM11 register Rd to VFP11 single precision or integer register Sn a Writing to the FPSID register does not change the contents of the FPSID but might be used as a serializing...

Page 635: ... Table 19-4 VFP11 MRRC instructions
Instruction | Operation | Description
FMRRD | Rd = Dm[31:0], Rn = Dm[63:32] | Move from lower and upper halves of VFP11 double-precision register Dm to ARM11 registers Rd and Rn
FMRRS | Rd = Sm, Rn = S(m+1) | Move from single-precision VFP11 registers Sm and S(m+1) to ARM11 registers Rd and Rn ...

Page 636: ...The usable format of the register or registers depends on the last load or arithmetic instruction that wrote to the register or registers The VFP11 hardware does not check the register format to see if it is consistent with the precision of the current operation Inconsistent use of the registers is possible but Unpredictable The hardware interprets the data in the format required by the instructio...

Page 637: ...are stored in different locations for little endian and big endian data access formats Table 19 6 lists the data storage in memory and the address to access each byte in little endian and big endian access modes In this example the target address is 0x40000000 The memory image for the data is identical for both little endian and big endian within data words The ARM11 hardware performs the address ...

Page 638: ...C for more information on VFP addressing modes Figure 19 4 Register banks A short vector CDP operation that has a source or destination vector crossing a bank boundary wraps around and accesses the first register in the bank Example 19 1 shows the iterations of the following short vector add instruction FADDS S11 S22 S31 In this instruction the LEN field contains b101 selecting a vector length of ...
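The wrap-around within a bank can be sketched as modular arithmetic on the register number. This is an illustrative model only: the function name is hypothetical, single-precision banks are assumed to hold eight registers (S0-S7, S8-S15, S16-S23, S24-S31), and the vector length of six used in the usage note is assumed from LEN = b101 read as LEN + 1, in line with the other examples in this manual (b011 for four iterations, b001 for two).

```python
def short_vector_registers(first: int, length: int, stride: int = 1, bank_size: int = 8):
    """List the register numbers a short vector operand touches.

    Addressing is circular inside the bank that contains `first`:
    stepping past the last register of the bank wraps to its first one.
    """
    bank_base = (first // bank_size) * bank_size
    offset = first - bank_base
    return [bank_base + (offset + i * stride) % bank_size for i in range(length)]
```

For FADDS S11, S22, S31 with six iterations and a stride of one, the destination operand visits S11 to S15 and then wraps to S8, while the Fm operand starts at S31 and wraps immediately to S24.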

Page 639: ... if the LEN field contains b000 then the following operation writes the sum of the single precision values in S21 and S22 to S12 FADDS S12 S21 S22 Some instructions can operate only on scalar data regardless of the value in the LEN field These instructions are Compare operations FCMPS D FCMPZS D FCMPES D and FCMPEZS D Integer conversions FTOUIS D FTOUIZS D FTOSIS D FTOSIZS D FUITOS D and FSITOS D ...

Page 640: ...wo source registers D8 and D9 by the value in D2 and writes the new values to D12 and D13 Scalar instructions in short vector mode You can mix scalar and short vector operations by carefully selecting the source and destination registers If the destination is in bank 0 the operation is scalar only regardless of the value in the LEN field You do not have to change the LEN field from a nonzero value...

Page 641: ...rand register usage Table 19 7 Single precision three operand register usage LEN field Fd Fn Fm Operation type b000 Any Any Any S S op S OR S S S S Nonzero 0 7 Any Any S S op S OR S S S S Nonzero 8 31 Any 0 7 V V op S OR V V V S Nonzero 8 31 Any 8 31 V V op V OR V V V V Table 19 8 Single precision two operand register usage LEN field Fd Fm Operation type b000 Any Any S op S Nonzero 0 7 Any S op S ...
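The rows of Table 19-7 amount to a small decision rule on the LEN field and the bank of Fd and Fm. The following sketch restates that rule for single-precision three-operand instructions only; the function name and the returned strings are illustrative and do not come from the manual.

```python
def three_operand_kind(len_field: int, fd: int, fm: int) -> str:
    """Classify a single-precision three-operand CDP as scalar or short vector.

    len_field is the 3-bit FPSCR LEN value; fd and fm are register numbers 0-31.
    """
    if len_field == 0 or fd <= 7:
        # LEN of b000, or a destination in bank 0, always gives a scalar operation.
        return "S = S op S"
    if fm <= 7:
        # Vector destination with Fm in bank 0: Fm is treated as a scalar.
        return "V = V op S"
    return "V = V op V"
```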

Page 642: ...grammer s Model This chapter describes implementation specific features of the VFP11 coprocessor that are useful to programmers It contains the following sections About the programmer s model on page 20 2 Compliance with the IEEE 754 standard on page 20 3 ARMv5TE coprocessor extensions on page 20 8 VFP11 system registers on page 20 12 ...

Page 643: ... the NaNs involved in the operation This mode is compatible with the IEEE 754 standard but not with current handling of NaNs by industry Addition of the input subnormal flag IDC FPSCR 7 IDC is set whenever the VFP11 coprocessor is in flush to zero mode and a subnormal input operand is replaced by a positive zero It remains set until cleared by writing to the FPSCR register A new Input Subnormal ex...

Page 644: ...ber to integer valued floating point number binary to decimal conversions decimal to binary conversions direct comparison of single precision and double precision values For complete implementation of the IEEE 754 standard the VFP11 coprocessor and support code must be augmented with library functions that implement these operations See Application Note 98 VFP Support Code for details of support c...

Page 645: ...significant fraction bit of zero indicates a Signaling NaN SNaN A one indicates a Quiet NaN QNaN Two NaN values are treated as different NaNs if they differ in any bit Table 20 1 lists the default NaN values in both single and double precision Any SNaN passed as input to an operation causes an Invalid Operation exception and sets the IOC flag FPSCR 0 If the IOE bit FPSCR 8 is set control passes to...
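The SNaN/QNaN rule above can be illustrated on raw single-precision bit patterns. This is a minimal sketch assuming the IEEE 754 single-precision layout (sign in bit 31, exponent in bits 30:23, fraction in bits 22:0); the function name is hypothetical.

```python
def classify_single(bits: int) -> str:
    """Classify a 32-bit single-precision pattern as SNaN, QNaN, or not a NaN."""
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent != 0xFF or fraction == 0:
        # A maximum exponent with a zero fraction is an infinity, not a NaN.
        return "not a NaN"
    # The most significant fraction bit distinguishes quiet from signaling NaNs.
    return "QNaN" if fraction & (1 << 22) else "SNaN"
```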

Page 646: ...whether the result is less than equal to or greater than When the VFP11 coprocessor is not in flush to zero mode comparisons involving subnormal operands bounce to support code Table 20 2 QNaN and SNaN handling Instruction type Default NaN mode With QNaN operand With SNaN operand Arithmetic CDP Off INVa set Bounce to support code to process operation INV set Bounce to support code to process opera...

Page 647: ...port code Some simple comparisons on single precision data can be computed directly by the ARM11 processor If only equality or comparison to zero is required and NaNs are not an issue performing the comparison in ARM11 registers using CMP or CMN instructions can be faster If branching on the state of the Z flag is required you can use the following instructions for positive values FMRS Rx Sn CMP R...

Page 648: ...s Exceptions are taken in the VFP11 coprocessor in an imprecise manner When exception processing begins the states of the ARM11 processor and the VFP11 coprocessor might not be the same as when the exception occurred Exceptional instructions cause the VFP11 coprocessor to enter the exceptional state and the next VFP11 instruction triggers exception processing After the issue of the exceptional ins...

Page 649: ...sion register The ARM11 registers do not have to be contiguous Figure 20 1 shows the format of the FMDRR instruction Figure 20 1 FMDRR instruction format Syntax FMDRR cond Dm Rd Rn where cond Is the condition under which the instruction is executed If cond is omitted the AL always condition is used Dm Specifies the destination double precision VFP11 coprocessor register Rd Specifies the source ARM...

Page 650: ...ch the instruction is executed If cond is omitted the AL always condition is used Rd Specifies the destination ARM11 register for the lower 32 bits of the operand Rn Specifies the destination ARM11 register for the upper 32 bits of the operand Dm Specifies the source double precision VFP11 coprocessor register Architecture version D variants only Exceptions None Operation if ConditionPassed cond t...

Page 651: ...ist is encoded in the instruction by setting Sm to the top four bits of m and M to the bottom bit of m For example if registers is S1 S2 the Sm field of the instruction is b0000 and the M bit is 1 Rd Specifies the source ARM11 register for the Sm VFP11 single precision register Rn Specifies the source ARM11 register for the S m 1 VFP11 single precision register Architecture version All Exceptions ...

Page 652: ...ion VFP11 source registers separated by a comma and surrounded by brackets If m is the number of the first register in the list the list is encoded in the instruction by setting Sm to the top four bits of m and M to the bottom bit of m For example if registers is S16 S17 the Sm field of the instruction is b1000 and the M bit is 0 Architecture version All Exceptions None Operation If ConditionPasse...
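The split of a single-precision register number m into the Sm field and the M bit is a shift and a mask. The sketch below reproduces the two examples given in the text (S1 and S16); the function name is hypothetical.

```python
def encode_single_register(m: int) -> tuple[int, int]:
    """Return (Sm, M) for single-precision register number m.

    Sm is the top four bits of the 5-bit register number, M is the bottom bit.
    """
    if not 0 <= m <= 31:
        raise ValueError("single-precision register number must be 0-31")
    return m >> 1, m & 1
```

For a register list starting at S1 this gives Sm = b0000 and M = 1; for a list starting at S16 it gives Sm = b1000 and M = 0.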

Page 653: ...ional bits to support exceptional conditions These registers are designed to be used with the support code software available from ARM Limited As a result this document does not fully specify exception handling in all cases The coprocessor also provides two feature registers Media and VFP Feature Register 0 on page 20 19 MVFR0 Media and VFP Feature Register 1 on page 20 20 MVFR1 Table 20 3 lists t...

Page 654: ...ction registers FPINST and FPINST2 on page 20 18 20 4 1 Floating Point System ID Register FPSID FPSID is a read only register that identifies the VFP11 coprocessor Figure 20 5 shows the FPSID bit fields Figure 20 5 Floating Point System ID Register Table 20 4 Accessing VFP11 system registers FMXR FMRX reg field ARM11 processor mode Register VFP11 coprocessor enabled VFP11 coprocessor disabled FPSI...

Page 655: ...e systems. Figure 20-6 shows the FPSCR bit fields. Figure 20-6 Floating-Point Status and Control Register. Table 20-5 FPSID bit fields (Bit; Meaning; Value): [31:24]; Implementer; 0x41 = A (ARM Limited). [23]; Hardware/software; 0 = Hardware implementation. [22:21]; FSTMX/FLDMX format; b00 = Format 1. [20]; Precisions supported; 0 = Both single-precision and double-precision data supported. [19:16]; Architecture version; b0001 = VFPv2 archi...
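The upper FPSID fields listed in Table 20-5 can be extracted with shifts and masks. This sketch decodes only the fields visible in the excerpt, bits [31:16]; the function name and dictionary keys are illustrative, and the lower fields are deliberately left undecoded.

```python
def decode_fpsid_upper(fpsid: int) -> dict:
    """Decode FPSID bits [31:16] as described in Table 20-5."""
    return {
        "implementer": (fpsid >> 24) & 0xFF,    # 0x41 is 'A', ARM Limited
        "software": bool((fpsid >> 23) & 1),    # 0 means hardware implementation
        "fstmx_format": (fpsid >> 21) & 0x3,    # b00 means Format 1
        "single_only": bool((fpsid >> 20) & 1),  # 0 means single and double supported
        "architecture": (fpsid >> 16) & 0xF,    # b0001 means VFPv2
    }
```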

Page 656: ...2 Rmode: Rounding mode control field: b00 = Round to nearest (RN) mode; b01 = Round towards plus infinity (RP) mode; b10 = Round towards minus infinity (RM) mode; b11 = Round towards zero (RZ) mode. [21:20] Stride: See Vector length and stride control on page 20-16. [19]: Should Be Zero. [18:16] LEN: See Vector length and stride control on page 20-16. [15] IDE: Input Subnormal exception trap enable bit. [14:13]: Should Be Zero. [12] IXE: Ine...
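The rounding mode, stride and LEN fields can be read out of an FPSCR value as follows. This is a sketch under stated assumptions: the Rmode field is taken to occupy bits [23:22] (the bit range is cut off in the excerpt, so this comes from the VFPv2 FPSCR layout), and the function name and dictionary keys are hypothetical.

```python
# Rounding mode encodings as listed for the Rmode field.
ROUNDING_MODES = {0b00: "RN", 0b01: "RP", 0b10: "RM", 0b11: "RZ"}


def decode_fpscr_control(fpscr: int) -> dict:
    """Extract the Rmode, Stride and LEN control fields from an FPSCR value."""
    return {
        "rmode": ROUNDING_MODES[(fpscr >> 22) & 0x3],  # assumed bits [23:22]
        "stride": (fpscr >> 20) & 0x3,                 # bits [21:20], raw encoding
        "len": (fpscr >> 16) & 0x7,                    # bits [18:16], raw encoding
    }
```

The stride and LEN values are returned as raw field encodings; mapping them to an increment or an iteration count is left to the caller.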

Page 657: ...trap handler or a user trap handler You must save and restore the FPEXC register whenever changing the context If the EX flag FPEXC 31 is set then the VFP11 coprocessor is in the exceptional state and you must also save and restore the FPINST and FPINST2 registers You can write the context switch code to determine from the EX flag the registers to save and restore or to save all three The EN bit F...

Page 658: ...set whenever an operation has the potential to generate a result that cannot be represented or is not defined Note To prevent an infinite loop of exceptions the support code must clear the EX flag FPEXC 31 immediately on entry to the exception code All exception flags must be cleared before returning from exception code to user code Figure 20 7 shows the FPEXC bit fields Figure 20 7 Floating Point...

Page 659: ... The instruction in the FPINST2 register is in the same format as the issued instruction but is modified by forcing the condition code flags FPINST2 31 28 to b1110 the AL always condition 10 8 VECITR Vector iteration count field VECITR contains the number of remaining short vector iterations after a potential exception was detected in one of the iterations b000 1 iteration b001 2 iterations b010 3...

Page 660: ... 31 16 15 8 7 3 0 Table 20 9 Media and VFP Feature Register 0 bit functions Bits Name Function 31 28 Indicates the VFP hardware support level when user traps are disabled 0x1 In ARM1176JZF S processors when Flush to Zero and Default_NaN and Round to Nearest are all selected in FPSCR the coprocessor does not require support code Otherwise floating point support code is required 27 24 Indicates supp...

Page 661: ...igure 20 9 shows the bit arrangement for Media and VFP Feature Register 1 Figure 20 9 Media and VFP Feature Register 1 format Table 20 10 shows how the bit values correspond with the Media and VFP Feature Register 1 functions The values in the Media and VFP Feature Register 1 are implementation defined 31 8 7 3 0 4 11 12 Table 20 10 Media and VFP Feature Register 1 bit functions Bits Name Function...

Page 662: ...struction pipeline It contains the following sections About instruction execution on page 21 2 Serializing instructions on page 21 3 Interrupting the VFP11 coprocessor on page 21 4 Forwarding on page 21 5 Hazards on page 21 6 Operation of the scoreboards on page 21 7 Data hazards in full compliance mode on page 21 13 Data hazards in RunFast mode on page 21 16 Resource hazards on page 21 17 Paralle...

Page 663: ...interrupt service LDM and STM instructions detect exceptional conditions after the first transfer and restart after interrupt service if reissued See Interrupting the VFP11 coprocessor on page 21 4 To reduce stall time the VFP11 coprocessor forwards data from load instructions to CDP instructions from CDP instructions to CDP instructions See Forwarding on page 21 5 In full compliance mode the VFP1...

Page 664: ...re completed and the data to be written by the VFP11 coprocessor is valid For example a compare operation updates the FPSCR register condition codes in the Writeback stage of the compare An FMXR instruction stalls until all prior floating point operations are past the point of being affected by the instruction For example writing to the FPSCR register stalls until the point when changing the contr...

Page 665: ...d no operations that depend on the load or store data can execute until the load or store operation is complete When interrupt processing begins there can be a delay before the VFP11 coprocessor is available to the interrupt routine Any prior short vector instruction that passes the ARM11 Execute 2 stage also passes the VFP11 Execute 1 stage and executes to completion uninterrupted The maximum del...

Page 666: ...tions are valid the VFP11 coprocessor operates at its highest performance When these assumptions are not valid load and store operations are affected by the delay required to access data Example 21 1 Example 21 2 and Example 21 3 illustrate the capabilities of the VFP11 coprocessor in ideal conditions In Example 21 1 the second FADDS instruction depends on the result of the first FADDS instruction...

Page 667: ... hazard detection mechanism enabling instructions to begin execution earlier than in full compliance mode There are two VFP11 pipeline hazards A data hazard is a combination of instructions that creates the potential for operands to be accessed in the wrong order A Read After Write RAW data hazard occurs when the pipeline creates the potential for an instruction to read an operand before a prior i...

Page 668: ...tore multiple instructions the source scoreboard clears source register locks in the Execute stage where the instruction writes the store data to the ARM11 processor The destination scoreboard clears the destination register lock in the cycle before the result data is written back to the register file or is available for forwarding Execute 7 in the FMAC pipeline Execute 4 in the DS pipeline In a l...

Page 669: ... FADDS instruction performs the following operations FADDS S8 S16 S24 FADDS S9 S17 S25 FADDS S10 S18 S26 FADDS S11 S19 S27 FADDS S12 S20 S28 In full compliance mode the source scoreboard locks S16 S20 and S24 S28 in the Issue stage of the instruction In RunFast mode the source scoreboard locks only the fifth iteration source registers S20 and S28 Table 21 1 Single precision source register locking...

Page 670: ...ns FADDS S8 S16 S24 The FADDS instruction performs the following operations FADDS S8 S16 S24 FADDS S9 S17 S25 FADDS S10 S18 S26 FADDS S11 S19 S27 FADDS S12 S20 S28 In full compliance mode the source scoreboard clears the source registers of each iteration in the Execute 1 cycle of the iteration In RunFast mode the source scoreboard locks only the fifth iteration source registers S20 and S28 It cle...

Page 671: ...14 FADDD D7 D11 D15 In full compliance mode the source scoreboard locks D8 D11 and D12 D15 in the Issue stage of the instruction In RunFast mode the source scoreboard locks only the third iteration source registers D10 and D14 and the fourth iteration source registers D11 and D15 21 6 5 Double precision source register clearing The number of Execute 1 cycles required to clear the source registers ...

Page 672: ...registers D10 and D14 and the fourth iteration source registers D11 and D15 It clears D10 and D14 in the first Execute 1 cycle of the instruction and clears D11 and D15 in the second Execute 1 cycle Instructions with two cycle throughput In full compliance mode the source scoreboard clears the source registers of each iteration in the first Execute 1 cycle of the iteration In RunFast mode the sour...

Page 673: ...ce mode the source scoreboard clears the source registers of each iteration in the first Execute 1 cycle of the iteration In RunFast mode only the third iteration source registers D10 and D14 and the fourth iteration source registers D11 and D15 are locked The source scoreboard clears D10 and D14 in the first Execute 1 cycle and clears D11 and D15 in the third Execute 1 cycle of the instruction 4 ...

Page 674: ... the FMSTAT is stalled for four cycles in the Decode stage until the FCMPS updates the condition codes in the FPSCR register Two cycles later the FMSTAT writes the condition codes to the ARM11 processor Example 21 4 FCMPS FMSTAT RAW hazard FCMPS S1 S2 FMSTAT Table 21 6 lists the VFP11 pipeline stages for Example 21 4 21 7 2 Load multiple CDP RAW hazard example In Example 21 5 the FADDS is stalled ...

Page 675: ...FADDS source register loaded by the FLDM is S7 This example is based on the assumption that the remaining source and destination registers are available to the FADDS in cycle 6 Example 21 6 FLDM short vector FADDS RAW hazard FLDM R2 S7 S14 FADDS S16 S7 S25 Table 21 8 lists the VFP11 pipeline stages of the FLDM and the first iteration of the short vector FADDS for Example 21 6 21 7 4 CDP CDP RAW ha...

Page 676: ...ting a vector stride of one The VFP11 coprocessor stalls the FLDMS until the FMULS clears the scoreboard locks for all the source registers S16 S19 and S24 S27 Example 21 8 Short vector FMULS FLDMS WAR hazard FMULS S8 S16 S24 FLDMS R2 S16 S27 Table 21 10 lists the VFP11 pipeline stages for the first iteration of Example 21 8 Table 21 9 FMULS FADDS RAW hazard Instruction cycle number Instruction 1 ...

Page 677: ...load multiple WAR hazard example Example 21 9 is the same as Example 21 8 on page 21 15 The LEN field contains b011 selecting a vector length of four iterations and the STRIDE field contains b00 selecting a vector stride of one Executing these instructions in RunFast mode reduces the cycle count of the FLDMS by four cycles Example 21 9 Short vector FMULS FLDMS WAR hazard in RunFast mode FMULS S8 S...

Page 678: ...precision divide or square root operation stalls subsequent DS operations for 15 cycles A double precision divide or square root operation stalls subsequent DS operations for 29 cycles A short vector divide or square root operation requires the FMAC pipeline for the first cycle of each iteration and stalls any following CDP operation The following CDP operation stalls until the final iteration of ...

Page 679: ...2 a short vector divide is followed by a FADDS instruction The short vector divide has b001 in the LEN field selecting a vector length of two iterations It requires the Execute 1 stage of the FMAC pipeline for the first cycle of each iteration of the divide resulting in a stall of the FADDS until the final iteration of the divide completes the first Execute 1 cycle The divide iterates for 14 cycle...

Page 680: ... 14 lists the pipeline stages for Example 21-12 on page 21-18. Table 21-14 Short vector FDIVS-FADDS resource hazard
Instruction cycle number: 1 2 3 4 16 17 18 19 20 21 22 23 24 25 26 30 31 32 33 34 35 36
FDIVS: D I E1 E1 E1 E1 E1 E1 E1 E1 E1 E1 E1 E1 E1 E1 E1 E1 E2 E3 E4 W
FADDS: D D D D I E1 E2 E3 E4 E5 E6 E7 W ...

Page 681: ...ny currently executing operations the FMAC pipeline is available that is no short vector CDP is executing and no double precision multiply is in the first cycle of the multiply operation no short vector operation with unissued iterations is currently executing in either the FMAC or DS pipeline A divide or square root instruction can be issued to the DS pipeline if no data hazards exist with any cu...

Page 682: ...tion and requires one cycle in the FMAC pipeline E1 stage If the FDIVS were a short vector operation the FADDS might not begin execution until the last FDIVS iteration passed the FMAC E1 pipeline stage The FADDS is a short vector operation and requires the FMAC pipeline E1 stage for cycles 5 8 Note E1 is the first cycle in E1 and is in both FMAC and DS blocks Subsequent E1 cycles represent the ite...

Page 683: ... Latency FABS FNEG FCVT FCPY 1 4 1 4 FCMP FCMPE FCMPZ FCMPEZ 1 4 1 4 FSITO FUITO FTOSI FTOUI FTOUIZ FTOSIZ 1 8 1 8 FADD FSUB 1 8 1 8 FMUL FNMUL 1 8 2 9 FMAC FNMAC FMSC FNMSC 1 8 2 9 FDIV FSQRT 15 19 29 33 FLDa 1 4 1 4 FSTa 1a System dependent 1 System dependent FLDMa Xb Xb 3 Xb Xb 3 FSTMa Xb System dependent Xb System dependent FMSTAT 1 2 FMSR FMSRRc 1 4 FMDHR FMDHC FMDRRc 1 4 FMRS FMRRSc 1 2 FMRD...

Page 684: ...ections About exception processing on page 22 2 Bounced instructions on page 22 3 Support code on page 22 5 Exception processing on page 22 8 Input Subnormal exception on page 22 12 Invalid Operation exception on page 22 13 Division by Zero exception on page 22 15 Overflow exception on page 22 16 Underflow exception on page 22 17 Inexact exception on page 22 18 Input exceptions on page 22 19 Arith...

Page 685: ...pending on sequence of instructions that follow the bounce can occur several instructions later The VFP11 coprocessor can generate exceptions only on arithmetic operations Data transfer operations between the ARM11 processor and the VFP11 coprocessor and instructions that copy data between VFP11 registers FCPY FABS and FNEG cannot produce exceptions In full compliance mode the VFP11 hardware and s...

Page 686: ... that might underflow when the VFP11 coprocessor is not in flush to zero mode an operation involving a subnormal operand when the VFP11 coprocessor is not in flush to zero mode an operation involving a NaN when the VFP11 coprocessor is not in default NaN mode For these conditions the VFP11 coprocessor relies on support code to process the operation See Underflow exception on page 22 17 and Input e...

Page 687: ...tential for an exception Support code performs the multiply operation and determines the exception status If the multiply operation results in an overflow the processor jumps to the Overflow user trap handler If the operation does not result in an overflow it writes the computed result to the destination sets the appropriate flags in the FPSCR and returns to user code ...

Page 688: ... If IXE FPSCR 12 is set an inexact exception has occurred that takes priority over other exceptions and is precise Other exceptions are imprecise 4 The support code reads either the FPINST register or the instruction pointed to by R14 4 depending on whether the exception is precise or not to determine the instruction that caused the potential exception 5 The support code decodes the instruction in...

Page 689: ...ction the current instruction can still be bounced if it is architecturally Undefined in some way When this happens the EX flag FPEXC 31 is not set The instruction that caused the bounce is contained in the memory word pointed to by R14_undef 4 It is possible that both conditions for an instruction to be bounced occur simultaneously This happens when an illegal instruction is encountered and there...

Page 690: ... a short vector instruction with overlapping source and destination register addresses that are not exactly the same ...

Page 691: ...tion that accesses the FPEXC FPINST or FPINST2 register in a privileged mode is not a trigger instruction An instruction that accesses the FPSID register in any mode is not a trigger instruction A data processing instruction that reaches the LS pipeline Execute stage or a CDP instruction that reaches the FMAC or DS pipeline E1 stage is not the trigger instruction There can be several of these if t...

Page 692: ...on multiply of length 4 FLDD D0 R5 Load of 1 double precision register FSTMS R3 S2 S9 Store multiple of 8 single precision registers FLDS S8 R9 Load of 1 single precision register A double precision multiply requires two cycles in the Execute 2 stage The exception on the third iteration is detected in cycle 8 Before the FMULD exception is detected the FLDD enters the Decode stage in cycle 2 and th...

Page 693: ...rocessing Example 22-2 Exceptional short vector FADDS with a FADDS in the pretrigger slot: FADDS S24, S26, S28 (vector single-precision add of length 2); FADDS S3, S4, S5 (scalar single-precision add); FMULS S12, S16, S16 (short vector single-precision multiply). Table 22-2 lists the pipeline stages for Example 22-2. After exception processing begins, the FPEXC register fields contain the following: EX 1 The VFP11 ...

Page 694: ...2 Scalar single precision mac Table 22 3 lists the pipeline stages for Example 22 3 After exception processing begins the FPEXC register fields contain the following EX 1 The VFP11 coprocessor is in the exceptional state EN 1 FP2V 0 FPINST2 does not contain a valid instruction VECITR 010 Three iterations remain INV 0 UFC 0 OFC 1 Exception detected is a potential overflow IOC 0 The FPINST register ...

Page 695: ...es on the presence of a subnormal input. If FZ is set, the IDE bit, FPSCR[15], determines whether a bounce occurs. 22.5.1 Exception enabled: Setting the IDE bit enables Input Subnormal exceptions. An Input Subnormal exception sets the EX flag, FPEXC[31], and the INV flag, FPEXC[7], and calls the Input Subnormal user trap handler. The source and destination registers for the instruction are unchanged in the VFP11 reg...

Page 696: ...than round towards zero The impact of rounding is unknown in the Execute 1 stage An FMAC family operation with an infinity in the A operand and a potential product overflow when an infinity with the sign of the product would result in an invalid condition Table 22 4 Possible Invalid Operation exceptions Instruction Invalid Operation exceptions FADD infinity infinity or infinity infinity FSUB infin...

Page 697: ...hat is outside the range of the destination integer is an invalid condition rather than an overflow condition. When an invalid condition exists for a float-to-integer conversion, the VFP11 coprocessor delivers a default result to the destination register and sets the IOC flag, FPSCR[0]. Table 22-5 lists the default results for input values after rounding. If the VFP11 coprocessor is not in default NaN m...
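The out-of-range behaviour described above can be modelled as a saturating conversion that also reports the invalid condition. This is a rough sketch under stated assumptions, not the manual's Table 22-5: it uses round-towards-zero only, takes the saturation values 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF and 0x00000000 from the conversion tables excerpted later in this listing, and deliberately rejects NaN and infinity because their default results are not visible here. The function names are hypothetical.

```python
import math


def _saturating_convert(x: float, lo: int, hi: int) -> tuple:
    """Return (32-bit result pattern, invalid flag) for a finite input."""
    if not math.isfinite(x):
        # NaN and infinity defaults are not shown in the excerpt.
        raise ValueError("non-finite inputs are outside this sketch")
    n = math.trunc(x)  # round towards zero (assumed rounding mode)
    if n > hi:
        return hi & 0xFFFFFFFF, True   # saturate high, IOC would be set
    if n < lo:
        return lo & 0xFFFFFFFF, True   # saturate low, IOC would be set
    return n & 0xFFFFFFFF, False


def to_signed32(x: float) -> tuple:
    """Model of a float-to-signed-integer conversion."""
    return _saturating_convert(x, -(2 ** 31), 2 ** 31 - 1)


def to_unsigned32(x: float) -> tuple:
    """Model of a float-to-unsigned-integer conversion."""
    return _saturating_convert(x, 0, 2 ** 32 - 1)
```

For example, `to_signed32(3e9)` saturates to 0x7FFFFFFF with the invalid flag set, while `to_unsigned32(-0.5)` truncates to a valid zero.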

Page 698: ... as a positive zero for detection of a division by zero. What happens depends on whether or not the Division by Zero exception is enabled. 22.7.1 Exception enabled: If the DZE bit, FPSCR[9], is set, the Division by Zero user trap handler is called. The source and destination registers for the instruction are unchanged in the VFP11 register file. 22.7.2 Exception disabled: Clearing the DZE bit disables Divi...

Page 699: ...nation register, OFC is set, and the Overflow user trap handler is called. The support code sets or clears the IXC flag, FPSCR[4], as appropriate. When the VFP11 coprocessor detects a potential overflow condition, the EX flag, FPEXC[31], and the OFC flag, FPEXC[2], are set. The OFC flag in the FPSCR register, FPSCR[2], is not set by the hardware and must be set by the support code before calling the user trap handl...

Page 700: ...ination register and returns without setting the UFC flag, FPSCR[3]. If there is underflow, regardless of any accuracy loss, the intermediate result is written to the destination register, UFC is set, and the Underflow user trap handler is called. The support code sets or clears the IXC flag, FPSCR[4], as appropriate. When the VFP11 coprocessor detects a potential underflow condition, the EX flag, FPEXC[31], and ...

Page 701: ...ion differently from the other floating-point exceptions. It has no mechanism for reporting inexact results to the software, but can handle the exception without software intervention as long as the IXE bit, FPSCR[12], is cleared, disabling Inexact exceptions. 22.10.1 Exception enabled: If the IXE bit, FPSCR[12], is set, all CDP instructions are bounced to the support code without any attempt to perform the c...

Page 702: ...uction An arithmetic operation bounces with an Input exception when it has either of the following: a NaN operand or operands, and default NaN mode is not enabled; a subnormal operand or operands, and flush-to-zero mode is not enabled. Note: In default NaN mode, an SNaN input to an arithmetic operation causes an Invalid Operation exception. When the IOE bit, FPSCR[8], is set, the instruction bounces to the In...

Page 703: ...TOSIZ on page 22-24. 22.12.1 FADD and FSUB: In an addition or subtraction, the exponent is initially the larger of the two input exponents. For clarity, we define the operation as a Like-Signed Addition (LSA) or an Unlike-Signed Addition (USA). Table 22-7 specifies how this distinction is made. In the table, + indicates a positive operand and - indicates a negative operand. Because it is possible for an LSA operat...
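The LSA/USA distinction can be expressed as a small classification function. Table 22-7 itself is not visible in this excerpt, so the sketch below assumes the natural rule: an FSUB flips the effective sign of its second operand, and the operation is an LSA when the effective signs match. The function name and its arguments are hypothetical.

```python
def classify_add(op: str, a_negative: bool, b_negative: bool) -> str:
    """Classify an FADD or FSUB as "LSA" or "USA" from the operand signs.

    Assumed rule (Table 22-7 is not shown in the excerpt): subtraction
    negates the second operand before the signs are compared.
    """
    if op not in ("FADD", "FSUB"):
        raise ValueError("only FADD and FSUB are classified")
    effective_b_negative = b_negative if op == "FADD" else not b_negative
    return "LSA" if a_negative == effective_b_negative else "USA"
```

Under this rule, `classify_add("FADD", False, False)` and `classify_add("FSUB", False, True)` are both like-signed additions, because (+a) - (-b) adds two positive magnitudes.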

Page 704: ...F DP overflow NaN or infinity Bounce 0x7FE DP overflow Bounce 0x7FD DP overflow Bounce 0x7FC DP normal Normal 0x47F 0xFF SP overflow Bounce Normal 0x47F 0xFF SP NaN or infinity Bounce Normal 0x47E 0xFE SP overflow Bounce Normal 0x47D 0xFD SP overflow Bounce Normal 0x47C 0xFC SP normal Normal Normal 0x3FF 0x7F e 0 bias value Normal Normal 0x3A0 0x20 SP normal LSA Minimum USA Normal 0x39F 0x1F SP un...

Page 705: ... to but not including four. In this case, it is possible for the final exponent to require incrementing by two to normalize the significand. The bounce thresholds for the FADD family in Table 22-8 on page 22-21 and for the FMUL family in Table 22-9 incorporate this additional factor. Those ranges are used to detect potential exceptions for the FMAC family. Table 22-9 FMUL family bounce thresholds: Initi...

Page 706: ...ble 22 10 lists the FDIV bounce thresholds The exponent values shown in Table 22 10 are in biased format 22 12 6 FSQRT It is not possible for FSQRT to overflow or underflow Table 22 10 FDIV bounce thresholds Initial quotient exponent value Float value Condition in full compliance mode DPa a DP double precision SPb b SP single precision SP DP 0x7FF DP overflow Bounce 0x7FF DP NaN or infinity Bounce...

Page 707: ...r C, C++, and Java compiled code, the thresholds for pessimistic bouncing are different for the various rounding modes. Table 22-12 on page 22-25 and Table 22-13 on page 22-26 use the following notation. In the VFP Response column, the response notations are: all: These input values are bounced for all rounding modes. S: These input values are bounced for signed conversions in all rounding modes. SnZ: These inp...

Page 708: ...FF Invalid 0x7FFFFFFF Invalid Bounce all 0x7F7FFFFF to 0x4F800000 maximum SPa to 232 0xFFFFFFFF Invalid 0x7FFFFFFF Invalid Bounce all 0x4F7FFFFF to 0x4F000000 232 28 to 231 0xFFFFFF00 to 0x80000000 Valid 0x7FFFFFFF Invalid Bounce S UnZ 0x4EFFFFFF to 0x4E800000 231 27 to 230 0x7FFFFF80 to 0x40000000 Valid 0x7FFFFF80 to 0x40000000 Valid Bounce SnZ 0x4E7FFFFF to 0x00000000 230 26 to 0 0x3FFFFFC0 to 0...

Page 709: ...FFF FFE00001 232 2 1 221 to 232 20 2 21 0xFFFFFFFF P 0xFFFFFFFF N Z M Invalid Valid 0x7FFFFFFF Invalid Bounce S UnZ 0x41EFFFFF FFE00000 to 0x41E00000 00000000 232 20 to 231 0xFFFFFFFF to 0x80000000 Valid 0x7FFFFFFF Invalid Bounce S UnZ 0x41DFFFFF FFFFFFFF to 0x41DFFFFF FFE00000 231 222 to 231 2 1 0x80000000 N P 0x7FFFFFFF Z M Valid Valid 0x7FFFFFFF N P 0x7FFFFFFF Z M Invalid Valid Bounce SnZ 0x41D...

Page 710: ...x80000000 M Valid Invalid Bounce U SnZ 0xC1E00000 00100001 to 0xC1E00000 001FFFFF 231 2 1 2 21 to 231 20 2 21 0x00000000 Invalid 0x80000000 Z P 0x80000000 N M Valid Invalid Bounce U SnZ 0xC1E00000 00200000 to 0xFFEFFFFF FFFFFFFF 231 20 to maximum DP 0x00000000 Invalid 0x80000000 Invalid Bounce all 0xFFF00000 00000000 infinity 0x00000000 Invalid 0x00000000 Invalid Bounce all a DP double precision b...

Page 711: ...ge A-2; Static configuration signals on page A-4; TrustZone internal signals on page A-5; Interrupt signals, including VIC interface on page A-6; AXI interface signals on page A-7; Coprocessor interface signals on page A-12; Debug interface signals, including JTAG on page A-14; ETM interface signals on page A-15; Test signals on page A-16. Note: The output signals that Table A-1 on page A-2 to Table A-14 on p...

Page 712: ...ed rate ACLKEND Input Clock enable for the DMA port to enable it to be clocked at a reduced rate ACLKENI Input Clock enable for the instruction port to enable it to be clocked at a reduced rate ACLKENRW Input Clock enable for the data port to enable it to be clocked at a reduced rate ARESETIn Input AXI reset for Instruction IEM Register Slice ARESETRWn Input AXI reset for Data IEM Register Slice A...

Page 713: ...CKRW Output Acknowledge for synchronous or asynchronous mode of Data IEM Register Slice SYNCMODEACKP Output Acknowledge for synchronous or asynchronous mode of Peripheral IEM Register Slice SYNCMODEACKD Output Acknowledge for synchronous or asynchronous mode of DMA IEM Register Slice Table A 1 Global signals continued Name Direction Description ...

Page 714: ...ion Description BIGENDINIT Input When HIGH indicates v5 Big endian mode CFGBIGEND Output Current state of CP15 Bigend bit INITRAM Input Determines the reset value of the En bit bit 0 of the Instruction TCM Region Register When HIGH this bit resets to 1 and the Instruction TCM is enabled on reset For more information see c9 Instruction TCM Region Register on page 3 91 UBITINIT Input When HIGH indic...

Page 715: ...Table A 3 lists the processor TrustZone internal signals Depending on the implementation these signals do not appear at the chip level Table A 3 TrustZone internal signals Name Direction Description CP15SDISABLE Input Disables write access to some system control processor registers SECMONBUS 24 0 Output Monitors the state of some of the key signals in the processor ...

Page 716: ...nizers are bypassed and the interface is synchronous IRQACK Output Interrupt acknowledge IRQADDR 31 2 Input Address of IRQ IRQADDRV Input Indicates IRQADDR is valid IRQADDRVSYNCEN Input When HIGH indicates that IRQADDRV synchronizer is bypassed and the interface is synchronous nFIQa a Because this signal is level sensitive to generate an interrupt you must ensure it is held LOW until the processor...

Page 717: ...rocessor. The AXI signal names have a one or two letter suffix that indicates the port, as shown in Table A-5. A.5.1 Instruction read port signals: The instruction read port is a 64-bit wide read-only AXI port. The standard AXI read channel signal names are suffixed with I, and the implementation details of the port are: ARID[3:0] and RID[3:0] signals are not implemented; the read data bus is implemented as ...

Page 718: ...d and inner cacheable accesses the WRITEBACK output signal is implemented to indicate cache line evictions Table A 7 on page A 9 gives more information about the data port AXI implementation See the AMBA AXI Protocol V1 0 Specification for details of the other signals on this port Table A 6 Instruction read port AXI signal implementation Name Direction Type Description ARLENI 3 0 Output Read Burst...

Page 719: ... AWBURSTRW 1 0 Output Write Write burst type 01 INCR Incrementing burst 10 WRAP Wrapping burst AWLOCKRW 1 0 Output Write Write lock type 00 Normal access 01 Exclusive access ARLENRW 3 0 Output Read Burst length that gives the exact number of transfer b0000 1 data transfer b0001 2 data transfers b0010 3 data transfers b0011 4 data transfers b0100 5 data transfers b0101 6 data transfers b0110 7 data...

Page 720: ...tion Name Direction Type Description AWSIZEP 2 0 Output Write Write burst size b000 8 bit transfers b001 16 bit transfers b010 32 bit transfers maximum for the peripheral port AWBURSTP 1 0 Output Write Write burst type always set to b01 INCR Incrementing burst AWLOCKP 1 0 Output Write Write lock type always set to b00 Normal access AWCACHEP 3 0 Output Write Cache type giving additional information...

Page 721: ...t size b000 indicating 8 bit transfer b001 indicating 16 bit transfer b010 indicating 32 bit transfer b011 indicating 64 bit transfer AWBURSTD 1 0 Output Write Write burst type b00 FIXED fixed burst b01 INCR incrementing burst AWLOCKD 1 0 Output Write Write lock type always set to b00 indicating normal access ARLEND 3 0 Output Read Burst length that gives the exact number of transfer b0000 1 data ...
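The burst size and burst length encodings listed in the AXI signal tables above follow a simple pattern: the size code doubles the transfer width at each step starting from 8 bits, and the length code is the number of transfers minus one. A minimal sketch of that decoding follows. The function names are hypothetical, and the size range is limited to the four encodings (b000 to b011) that appear in the excerpts.

```python
def axi_transfer_bits(axsize: int) -> int:
    """Decode a burst size code: b000=8, b001=16, b010=32, b011=64 bits."""
    if not 0b000 <= axsize <= 0b011:
        raise ValueError("size code not listed in the excerpted tables")
    return 8 << axsize


def axi_burst_transfers(axlen: int) -> int:
    """Decode a 4-bit burst length code: b0000 is 1 transfer, b0001 is 2, ..."""
    if not 0b0000 <= axlen <= 0b1111:
        raise ValueError("burst length is a 4-bit field")
    return axlen + 1


def axi_burst_bytes(axsize: int, axlen: int) -> int:
    """Total bytes moved by one burst with the given size and length codes."""
    return (axi_transfer_bits(axsize) // 8) * axi_burst_transfers(axlen)
```

For instance, a burst of four 64-bit transfers (size b011, length b0011) moves 32 bytes.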

Page 722: ...e tag to be flushed from ACPINSTR 31 0 Output The instruction passed from the core Fe2 stage to the coprocessor Decode stage ACPINSTRT 3 0 Output The tag accompanying the instruction in ACPINSTR ACPINSTRV Output Asserted to indicate that ACPINSTR carries a valid instruction ACPLDDATA 63 0 Output The load data from the core to the coprocessor ACPLDVALID Output Asserted to indicate that the data in ...

Page 723: ...CPASTDATA[63:0], Input: The store data passing from the coprocessor to the core. CPASTDATAT[3:0], Input: The tag accompanying the store data in CPASTDATA. CPASTDATAV, Input: Indicates that the store data to the core is valid. Table A-11 Coprocessor to core signals (continued): Name, Direction, Description ...

Page 724: ...TDI Input JTAG TDI TMS Input JTAG TMS DBGTDI Output Synchronized TDI DBGTMS Output Synchronized TMS EDBGRQ Input External debug request DBGEN Input Debug enable DBGVERSION 3 0 Input JTAG ID Version field See Device ID code register on page 14 8 DBGMANID 10 0 Input JTAG manufacturer ID field See Device ID code register on page 14 8 DBGTDO Output Debug TDO DBGnTDOEN Output Debug nTDOEN COMMTX Output...

Page 725: ...rustZone trace information ETMIARET 31 0 Output ETM return instruction address ETMPADV 2 0 Output ETM pipeline advance ETMPWRUP Input When HIGH indicates that the ETM is powered up When LOW logic supporting the ETM must be clock gated to conserve power nETMWFIREADY Input When LOW indicates ETM can accept Wait For Interrupt ETMCPADDRESS 14 0 Output Coprocessor address ETMCPSECCTL 1 0 Output Coproce...

Page 726: ... Input Scan enable RSTBYPASS Input Bypass of reset repeaters MTESTON Input BIST enable MBISTDIN 63 0 Input MBIST data in MBISTADDR 12 0 Input MBIST address MBISTCE 19 0 Input MBIST chip enable MBISTWE 7 0 Input MBIST write enable MBISTDOUT 63 0 Output MBIST data out nVALIRQ Output Request for an interrupt nVALFIQ Output Request for a fast interrupt nVALRESET Output Request for a reset VALEDBGRQ Ou...

Page 727: ...Appendix B Summary of ARM1136JF-S and ARM1176JZF-S Processor Differences. This appendix describes the main differences between the ARM1136JF-S and ARM1176JZF-S processors. It contains these sections: About the differences between the ARM1136JF-S and ARM1176JZF-S processors on page B-2; Summary of differences on page B-3 ...

Page 728: ...-S and ARM1176JZF-S processors. These have: an integer core; a level one memory system that comprises caches, write buffers, TCM, and MMU; level two interfaces; integrated VFP units; high-performance coprocessor interfaces; debug and trace support. The ARM1176JZF-S processor adds: the TrustZone architecture for enhanced OS security; level two interfaces that use AXI buses compatible with AMBA 3.0; support for ...

Page 729: ...1 Memory System see also Debug on page B 10 The ARM1176JZF S processor embodies for TrustZone operation in Secure or Non secure states a new exception model a new mode Secure Monitor mode a new instruction SMC to switch to Secure Monitor mode new CP15 registers to support the TrustZone architecture some CP15 registers that are only accessible in Secure Privileged mode duplicated banked between Sec...

Page 730: ...ly. Therefore, it is possible to configure the ARM1176JZF-S processor to support only a small number of options by means of the TEX remap mechanism. This implies a level of indirection in the page table mappings. Recent cores that include ARM1136JF-S processors support this mapping with the MMU remap capability, originally designed for debug of the hardware, in CP15 register 15. By moving one en...

Page 731: ...t in the ARM1136JF-S processor, and if implemented you can use the IEM Register Slices to provide the asynchronous interface in the Level 2 ports of the ARM1136JF-S processor. VFP: The power domains in the ARM1176JZF-S processor are divided for: the VFP; all other logic outside the VFP; a placeholder for clamping logic between these two blocks. With this hierarchy you can switch off the VFP power to save...

Page 732: ...8KB 16KB 32KB 64KB The ARM1176JZF-S processor implements zero, one, or two Tightly-Coupled Memories on each side. For each side, the two TCMs are physically located within one RAM. Table B-1 lists the possible configurations for ARM1176JZF-S Tightly-Coupled Memories for each side. B.2.8 Fault Address Register: ARM1136JF-S processors include an Instruction Fault Address Register in the system control cop...

Page 733: ...egisters in the ARM1176JZF-S processor now use bit 12 to determine if the external aborts are SLVERR or DECERR. B.2.10 Prefetch Unit: In ARM1136JF-S processors, the Prefetch Unit has a three-stage instruction buffer. In ARM1176JZF-S processors, the Prefetch Unit has a seven-stage instruction buffer. This improves the performance of branch folding. B.2.11 System control coprocessor operations: The CP15 c15...

Page 734: ...M1136JF S processors Table B 2 CP15 c15 features common to ARM1136JF S and ARM1176JZF S processors CRn Opcode_1 CRm Opcode_2 Register Function c15 0 c2 4 Peripheral Memory Remap c12 0 Performance Monitor Control 1 Cycle Counter 2 Count Register 0 3 Count Register 1 3 c8 0 Instruction Cache Master Valid c12 0 Data Cache Master Valid 5a a Only applies for Lockdown entries c4 2 TLB Lockdown Index c5 ...

Page 735: ...code_1 CRm Opcode_2 Register Function c15 0 c2 0 Data Memory Remap Register 1 Instruction Memory Remap Register 2 DMA Memory Remap Register 3 C0 0 Data Debug Cache 1 Instruction Debug Cache C2 0 Data TAG RAM Read Operation 1 Instruction TAG RAM Read Operation C4 1 Instruction Cache RAM Data Read Operation 5 C4 0 Data MicroTLB Entry Operation 1 Instruction MicroTLB Entry Operation 2 Read Main TLB E...

Page 736: ...here: SU stands for Secure User; SP for Secure Privileged; I for Invasive, for example watchpoints and breakpoints; NI for Non-invasive, for example trace and performance monitoring; DEN for Debug Enable. EDBGRQ: In the ARM1176JZF-S processor, Halting debug mode is entered when EDBGRQ is asserted, regardless of the selection of Debug state in DSCR[15:14]. Debug test access port: The ARM1136JF-S processor requir...

Page 737: ... Write, DMA. It has one 32-bit AHB-Lite Peripheral interface. The ARM1176JZF-S processor has three 64-bit AXI interfaces: Instruction, Data Read/Write, DMA. It has one 32-bit AXI Peripheral interface. B.2.15 Memory BIST: MBISTWE from the ARM1136JF-S processor is extended to 8 bits, MBISTWE[7:0], in ARM1176JZF-S processors to enable control of individual write enables for bit and byte write RAMs ...

Page 738: ... Register on page 3 52 All revisions Improve description of MVA alignment for L1 operations Table 3 69 on page 3 73 All revisions Improve description of DMA user access bits Table 3 107 on page 3 108 All revisions Correct B and C bit descriptions for the TLB Lockdown Attributes Register Table 3 152 on page 3 151 All revisions Correct user permissions for memory regions Table 6 1 on page 6 12 All r...

Page 739: ... instruction for entering debug state Entering Debug state on page 14 31 All revisions Deselect DTR in debug sequence Writing memory as words on page 14 37 All revisions Correct description of nETMWFIREADY signal Table A 13 on page A 15 All revisions Table C 1 Differences between issue G and issue H continued Change Location Affects ...

Page 740: ... that specify base register write-back. Addressing modes: A mechanism, shared by many different instructions, for generating values used by the instructions. For four of the ARM addressing modes, the values generated are memory addresses (the traditional role of an addressing mode). A fifth addressing mode generates values to be used as operands by data-processing instructions. Advanced eXtensible Interface...

Page 741: ...B AP An optional component of the DAP that provides an AHB interface to a SoC AHB AP See AHB Access Port AHB Lite A subset of the full AMBA AHB protocol specification It provides all of the basic functions required by the majority of AMBA AHB slave and master designs particularly when used with a multi layer AMBA interconnect In most cases the extra facilities provided by a full AMBA AHB interface...

Page 742: ... Automatic Test Pattern Generation. Automatic Test Pattern Generation (ATPG): The process of automatically generating manufacturing test vectors for an ASIC design, using a specialized software tool. AXI: See Advanced eXtensible Interface. AXI channel order and interfaces: The block diagram shows: the order in which AXI channel signals are described; the master and slave interface conventions for AXI compone...

Page 743: ...y for master interfaces that use a combined storage for active write and read transactions Read ID capability The maximum number of different ARID values that a master interface can generate for all active read transactions at any one time Read ID width The number of bits in the ARID bus Read issuing capability The maximum number of active read transactions that a master interface can generate Wri...

Page 744: ... registers are R8 to R14. Base register: A register specified by a load or store instruction that is used to hold the base value for the instruction's address calculation. Depending on the instruction and its addressing mode, an offset can be added to or subtracted from the base register value to form the virtual address that is sent to memory. Base register write-back: Updating the contents of the base...

Page 745: ... where, on the prediction of most branches, the branch instruction is completely removed from the instruction stream presented to the execution pipeline. Branch folding can significantly improve the performance of branches, taking the CPI for branches lower than one. Branch phantom: The condition codes of a predicted taken branch. Branch prediction: The process of predicting if conditional branches are to...

Page 746: ...set associativity of the cache. In this case, main memory activity increases and performance decreases. Cache hit: A memory access that can be processed at high speed because the instruction or data that it addresses is already held in the cache. Cache line: The basic unit of storage in a cache. It is always a power of two words in size (usually four or eight words), and is required to be aligned to a suita...
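Because a cache line is a power of two words in size, the line that holds a given address can be found by masking off the low address bits. The sketch below assumes 4-byte words and assumes that a line is aligned to its own size (the excerpt is cut off before it states the alignment). The function names are hypothetical.

```python
WORD_BYTES = 4  # assumed word size


def line_base(address: int, words_per_line: int = 8) -> int:
    """Return the address of the first byte of the line containing address."""
    # A power of two has exactly one bit set, so n & (n - 1) is zero.
    if words_per_line <= 0 or words_per_line & (words_per_line - 1):
        raise ValueError("line size must be a power of two words")
    line_bytes = words_per_line * WORD_BYTES
    return address & ~(line_bytes - 1)


def line_offset(address: int, words_per_line: int = 8) -> int:
    """Return the byte offset of address within its cache line."""
    return address - line_base(address, words_per_line)
```

With eight-word (32-byte) lines, address 0x1234 falls in the line starting at 0x1220, at byte offset 0x14.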

Page 747: ... communication is for debug purposes, it is called the Debug Comms Channel. In an ARMv6 compliant core, the communications channel includes the Data Transfer Register, some bits of the Data Status and Control Register, and the external debug interface controller, such as the DBGTAP controller in the case of the JTAG interface. Condition field: A four-bit field in an instruction that specifies a condition ...

Page 748: ...ptional terminals that form the input/output and control interface to a JTAG boundary-scan architecture. The mandatory terminals are DBGTDI, DBGTDO, DBGTMS, and TCK. The optional terminal is TRST. This signal is mandatory in ARM cores because it is used to reset the debug logic. Default NaN mode: A mode in which all operations that result in a NaN return the default NaN, regardless of the cause of the NaN ...

Page 749: ...ption enable bit in the FPSCR is set. When an enabled exception occurs, a trap to the user handler is taken. An operation that generates an exception condition might bounce to the support code to produce the result defined by the IEEE 754 standard. The exception is then reported to the user trap handler. Endianness: Byte ordering. The scheme that determines the order in which successive bytes of a data w...

Page 750: ...sed to the memory system as data. In these cases no address modification takes place. See also Fast Context Switch Extension. Fast Context Switch Extension (FCSE): An extension to the ARM architecture that enables cached processors with an MMU to present different addresses to the rest of the memory system for different software processes, even when those processes are using identical addresses. See also ...
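The address modification that the FCSE entry refers to can be sketched as follows. The translation rule used here is not stated in the excerpt: it is the commonly documented ARM behaviour, assumed for illustration, in which a virtual address in the bottom 32MB (bits [31:25] all zero) is relocated by a 7-bit process ID and any other address passes through unchanged. The function name is hypothetical.

```python
def fcse_modified_va(va: int, pid: int) -> int:
    """Return the modified virtual address for a 32-bit VA and 7-bit PID.

    Assumed rule: if VA[31:25] is zero, the PID replaces those bits;
    otherwise the address is left unmodified.
    """
    if not 0 <= va <= 0xFFFFFFFF:
        raise ValueError("va must be a 32-bit address")
    if not 0 <= pid <= 0x7F:
        raise ValueError("pid must fit in 7 bits")
    if (va >> 25) == 0:
        return (pid << 25) | va
    return va
```

Two processes that both use address 0x00001000 are therefore presented to the rest of the memory system at different addresses, for example 0x06001000 for PID 3 and 0x08001000 for PID 4.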

Page 751: ...ly Undefined. IMB: See Instruction Memory Barrier. Implementation-defined: Means that the behavior is not architecturally defined, but should be defined and documented by individual implementations. Implementation-specific: Means that the behavior is not architecturally defined, and does not have to be documented by individual implementations. Used when there are a number of implementation options availabl...

Page 752: ...internal nodes of the device and export the resulting values. Interrupt handler: A program that control of the processor is passed to when an interrupt occurs. Interrupt vector: One of a number of fixed addresses in low memory, or in high memory if high vectors are configured, that contains the first instruction of the corresponding interrupt handler. Invalidate: To mark a cache line as being not valid by...

Page 753: ...value read by a data read or instruction fetch is the value that was most recently written to that location. Memory coherency is made difficult when there are multiple possible physical locations that are involved, such as a system that has main memory, a write buffer, and a cache. Memory Management Unit (MMU): Hardware that controls caches and access permissions to blocks of memory, and translates virtual...

Page 754: ...emory access. A Prefetch Abort can be caused by the external or internal memory system as a result of attempting to access invalid instruction memory. See also Data Abort, External Abort, and Abort. Processor: A processor is the circuitry in a computer system required to process data using the computer instructions. It is an abbreviation of microprocessor. A clock source, power supplies, and main memory are...

Page 755: ... default NaN and flush-to-zero modes and disabling all exceptions. In RunFast mode, the VFP11 coprocessor does not bounce to the support code for any legal operation or any operand, but supplies a result to the destination. For all inexact and overflow results, and all invalid operations that result from operations not involving NaNs, the result is as specified by the IEEE 754 standard. For operations in...
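The RunFast conditions named above (default NaN mode on, flush-to-zero mode on, all exceptions disabled) amount to a check on a handful of FPSCR bits. The sketch below is illustrative only. The enable bit positions IDE = 15, IXE = 12, DZE = 9 and IOE = 8 are quoted in these excerpts; DN = 25, FZ = 24, UFE = 11 and OFE = 10 are assumptions. The function name is hypothetical.

```python
# Mode bits (positions assumed, not shown in the excerpts).
FPSCR_DN = 1 << 25  # default NaN mode
FPSCR_FZ = 1 << 24  # flush-to-zero mode

# Exception enable bits. IDE, IXE, DZE and IOE positions come from the
# excerpts; UFE and OFE positions are assumptions.
FPSCR_ENABLES = (
    (1 << 15)    # IDE, Input Subnormal
    | (1 << 12)  # IXE, Inexact
    | (1 << 11)  # UFE, Underflow (assumed position)
    | (1 << 10)  # OFE, Overflow (assumed position)
    | (1 << 9)   # DZE, Division by Zero
    | (1 << 8)   # IOE, Invalid Operation
)


def is_runfast(fpscr: int) -> bool:
    """True when DN and FZ are set and every exception enable is clear."""
    modes_on = (fpscr & (FPSCR_DN | FPSCR_FZ)) == (FPSCR_DN | FPSCR_FZ)
    return modes_on and (fpscr & FPSCR_ENABLES) == 0
```

A value with only DN and FZ set satisfies the check; setting any enable bit, such as DZE, takes the configuration out of RunFast mode.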

Page 756: ...f subnormal operands be performed with the same precision as normal operands. Support code: Software that must be used to complement the hardware to provide compatibility with the IEEE 754 standard. The support code has a library of routines that performs supported functions, such as divide with unsupported inputs or inputs that might generate an exception, in addition to operations beyond the scope of...

Page 757: ...t in the FPSCR register. The user trap handler is executed. Trigger instruction: The VFP coprocessor instruction that causes a bounce at the time it is issued. A potentially exceptional instruction causes the VFP11 coprocessor to enter the exceptional state. A subsequent instruction, unless it is an FMXR or FMRX instruction accessing the FPEXC, FPINST, or FPSID register, causes a bounce, beginning exception...

Page 758: ...of each byte of memory changes when switching between little-endian and big-endian operation, in such a way that the byte with address A in one endianness has address A EOR 3 in the other endianness. As a result, each aligned word of memory always consists of the same four bytes of memory in the same order, regardless of endianness. The change of endianness occurs because of the change to the byte addr...
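The "A EOR 3" rule can be checked with a short sketch. Exclusive-OR with 3 changes only the two low address bits, so a byte never leaves its aligned word, and applying the mapping twice returns the original address. The function names are hypothetical.

```python
def other_endianness_address(a: int) -> int:
    """Address of the same byte after switching endianness (A EOR 3)."""
    return a ^ 3


def word_addresses_after_switch(word_base: int) -> list:
    """Map the four byte addresses of an aligned word through the switch."""
    if word_base % 4 != 0:
        raise ValueError("word_base must be word-aligned")
    return [other_endianness_address(word_base + i) for i in range(4)]
```

For the aligned word at address 4, the byte addresses 4, 5, 6, 7 map to 7, 6, 5, 4: the same four bytes, addressed in the opposite order.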

Page 759: ...inguish between having the effect of the write visible and having the state of target updated. This stricter requirement for some types of memory ensures that any side-effects of the memory access can be guaranteed by the processor to have taken place. You can use this to prevent the starting of a subsequent operation in the program order until the side-effects are visible. Write-through (WT): In a writ...
