
Related Information
Nios II Core Implementation Details on page 121

2.6.2.3.3. Peripheral Region

Nios II cores optionally support a peripheral region mechanism to indicate cacheability. This mechanism allows you to specify, at Platform Designer generation time, a region of address space that is treated as non-cacheable. The size of the peripheral region must be an integer power of 2 bytes, from a minimum of 4096 bytes up to a maximum of 2 GBytes, and the region must be located at a base address aligned to its size. The peripheral region is available only when an MMU is not present.
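The size and alignment rules for the peripheral region can be checked mechanically. The following is a minimal sketch in C, not part of any Intel tool or API: the helper names periph_region_is_valid and periph_region_contains, and the macro names, are hypothetical and exist only to illustrate the constraints stated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Size limits quoted in the text above. */
#define PERIPH_REGION_MIN_SIZE UINT32_C(4096)        /* 4096 bytes */
#define PERIPH_REGION_MAX_SIZE UINT32_C(0x80000000)  /* 2 GBytes   */

/* True when size is an integer power of 2 (exactly one bit set). */
static bool is_power_of_two(uint32_t size)
{
    return size != 0 && (size & (size - 1)) == 0;
}

/* Hypothetical helper: checks a candidate base/size pair against the rules. */
static bool periph_region_is_valid(uint32_t base, uint32_t size)
{
    if (!is_power_of_two(size))
        return false;
    if (size < PERIPH_REGION_MIN_SIZE || size > PERIPH_REGION_MAX_SIZE)
        return false;
    /* The base address must be aligned to the region size. */
    return (base & (size - 1)) == 0;
}

/* Hypothetical helper: true when addr falls inside a valid region. */
static bool periph_region_contains(uint32_t base, uint32_t size, uint32_t addr)
{
    assert(periph_region_is_valid(base, size));
    /* Because base is size-aligned, masking off the low bits recovers it. */
    return (addr & ~(size - 1)) == base;
}
```

Because the base is aligned to a power-of-2 size, membership reduces to a single mask and compare, which is presumably why the alignment rule exists; the hardware implementation itself is not described in this section.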

2.6.3. Tightly-Coupled Memory

Tightly-coupled memory provides guaranteed low-latency memory access for performance-critical applications. Compared to cache memory, tightly-coupled memory provides the following benefits:

• Performance similar to cache memory
• Software can guarantee that performance-critical code or data is located in tightly-coupled memory
• No real-time caching overhead, such as loading, invalidating, or flushing memory

Physically, a tightly-coupled memory port is a separate master port on the Nios II processor core, similar to the instruction or data master port. A Nios II core can have zero, one, or multiple tightly-coupled memories. The Nios II architecture supports tightly-coupled memory for both instruction and data access. Each tightly-coupled memory port connects directly to exactly one memory with guaranteed low, fixed latency. The memory is external to the Nios II core and is located on chip.

2.6.3.1. Accessing Tightly-Coupled Memory

Tightly-coupled memories occupy normal address space, the same as other memory devices connected via system interconnect fabric. The address ranges for tightly-coupled memories (if any) are determined at system generation time.

Software accesses tightly-coupled memory using regular load and store instructions. From the software's perspective, there is no difference between accessing tightly-coupled memory and accessing other memory.
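As a minimal sketch of what "regular load and store instructions" means in C, the routines below use plain pointer accesses and do not know or care whether the pointer refers to tightly-coupled memory. The macro TCM_DATA_BASE_EXAMPLE and its value are hypothetical placeholders; the real address range is whatever your system assigns at generation time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical base address of a tightly-coupled data memory. The real value
 * is fixed at system generation time; this one is only a placeholder.
 */
#define TCM_DATA_BASE_EXAMPLE UINT32_C(0x04000000)

/* Fills n words with a value using ordinary store instructions. */
static void fill_words(volatile uint32_t *mem, size_t n, uint32_t value)
{
    assert(mem != NULL);
    for (size_t i = 0; i < n; i++)
        mem[i] = value;          /* plain store, same as for any memory */
}

/* Sums n words using ordinary load instructions. */
static uint32_t sum_words(const volatile uint32_t *mem, size_t n)
{
    assert(mem != NULL);
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += mem[i];           /* plain load, same as for any memory */
    return sum;
}
```

On a target system, a caller might pass (volatile uint32_t *)TCM_DATA_BASE_EXAMPLE; on a host machine, the same functions work on any ordinary array, which is the point of the paragraph above.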

Note: The tightly-coupled master requires a fixed memory latency of 1 cycle. A transaction with a slave in a different clock domain might therefore fail, because the transfer would take more than 1 cycle.

2.6.3.2. Effective Use of Tightly-Coupled Memory

A system can use tightly-coupled memory to achieve maximum performance for accessing a specific section of code or data. For example, interrupt-intensive applications can place exception handler code into a tightly-coupled memory to minimize interrupt latency. Similarly, compute-intensive digital signal processing (DSP) applications can place data buffers into tightly-coupled memory for the fastest possible data access.
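One common way to place DSP data buffers in a particular memory is a GCC section attribute combined with a matching linker-script region. The sketch below illustrates only the data-buffer case and rests on stated assumptions: the section name ".tcm_data" is hypothetical and must match whatever your linker script maps to the tightly-coupled memory, and the __NIOS2__ macro is assumed to be predefined by the Nios II GCC port so that host builds skip the attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * ".tcm_data" is a hypothetical section name; the real name depends on the
 * linker script of your system. __NIOS2__ is assumed to be predefined by the
 * Nios II GCC port, so host builds simply skip the attribute.
 */
#if defined(__NIOS2__)
#define IN_TCM_DATA __attribute__((section(".tcm_data")))
#else
#define IN_TCM_DATA
#endif

#define DSP_TAPS 4

/* Coefficients and sample window intended to live in tightly-coupled memory. */
static int32_t dsp_coeffs[DSP_TAPS] IN_TCM_DATA = { 1, 2, 3, 4 };
static int32_t dsp_window[DSP_TAPS] IN_TCM_DATA;

/* Multiply-accumulate over two equal-length buffers. */
static int32_t dot_product(const int32_t *a, const int32_t *b, size_t n)
{
    assert(a != NULL && b != NULL);
    int32_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += a[i] * b[i];
    return acc;
}

/* Shifts one sample into the window and returns the filter output. */
static int32_t dsp_fir_step(int32_t sample)
{
    for (size_t i = DSP_TAPS - 1; i > 0; i--)
        dsp_window[i] = dsp_window[i - 1];
    dsp_window[0] = sample;
    return dot_product(dsp_coeffs, dsp_window, DSP_TAPS);
}
```

The filter code itself is unchanged by the placement; only the declaration of the buffers differs. Placing exception handler code would follow the same idea with a code section, subject to how your system's vectors and linker script are configured.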

2. Processor Architecture

NII-PRG | 2018.04.18

Nios II Processor Reference Guide

29

Summary of Contents for NIOS II

Page 1: ...Nios II Processor Reference Guide Subscribe Send Feedback NII PRG 2018 04 18 Latest document on the web PDF HTML...

Page 2: ...5 Exception and Interrupt Controllers 22 2 5 1 Exception Controller 22 2 5 2 EIC Interface 23 2 5 3 Internal Interrupt Controller 23 2 6 Memory and I O Organization 24 2 6 1 Instruction and Data Buses...

Page 3: ...9 Exception Processing Flow 88 3 7 10 Determining the Cause of Interrupt and Instruction Related Exceptions 93 3 7 11 Handling Nested Exceptions 94 3 7 12 Handling Nonmaskable Interrupts 96 3 7 13 Ma...

Page 4: ...Prediction 120 4 7 9 RAM Memory Protection 120 4 8 The Quartus Prime IP File 120 4 9 Instantiating the Nios II Processor Revision History 120 5 Nios II Core Implementation Details 121 5 1 Device Famil...

Page 5: ...imination 149 7 4 2 Call Saved Registers 149 7 4 3 Further Examples of Stacks 149 7 4 4 Function Prologues 151 7 5 Arguments and Return Values 152 7 5 1 Arguments 153 7 5 2 Return Values 153 7 6 DWARF...

Page 6: ...85 8 5 21 cmpeq 185 8 5 22 cmpeqi 186 8 5 23 cmpge 186 8 5 24 cmpgei 187 8 5 25 cmpgeu 188 8 5 26 cmpgeui 188 8 5 27 cmpgt 189 8 5 28 cmpgti 189 8 5 29 cmpgtu 189 8 5 30 cmpgtui 190 8 5 31 cmple 190 8...

Page 7: ...70 nop 217 8 5 71 nor 217 8 5 72 or 217 8 5 73 orhi 218 8 5 74 ori 218 8 5 75 rdctl 219 8 5 76 rdprs 219 8 5 77 ret 220 8 5 78 rol 220 8 5 79 roli 221 8 5 80 ror 221 8 5 81 sll 222 8 5 82 slli 222 8 5...

Page 8: ...d processors Related Information Nios II Processor webpage 1 1 Nios II Processor System Basics The Nios II processor is a general purpose RISC processor core with the following features Full 32 bit in...

Page 9: ...core a set of on chip peripherals on chip memory and interfaces to off chip memory all implemented on a single Intel FPGA device Like a microcontroller family all Nios II processor systems use a cons...

Page 10: ...provided reference design you can copy the reference design and use it without modification in the final hardware platform Otherwise you can customize the Nios II processor system until it meets cost...

Page 11: ...ice goals Soft means the processor core is not fixed in silicon and can be targeted to any Intel FPGA family You are not required to create a new Nios II processor configuration for every new design I...

Page 12: ...a result to a destination register Because the processor is implemented on reprogrammable Intel FPGAs software and hardware engineers can work together to iteratively optimize the hardware and test th...

Page 13: ...design in hardware You only need to purchase a license for the Nios II processor when you are completely satisfied with its functionality and performance and want to take your design to production Rel...

Page 14: ...ler Instruction bus Data bus Memory management unit MMU Memory protection unit MPU Instruction and data cache memories Tightly coupled memory interfaces for instructions and data JTAG debug module NII...

Page 15: ...ic_port_valid Shadow Register Sets Required Module Optional Module Key 2 1 Processor Implementation The functional units of the Nios II architecture form the foundation for the Nios II instruction set...

Page 16: ...d Information Instantiating the Nios II Processor on page 106 Nios II Core Implementation Details on page 121 Instruction Set Reference on page 169 2 2 Register File The Nios II architecture supports...

Page 17: ...er instruction The ALU supports arithmetic shift right and logical shift right left The ALU supports rotate left right 2 3 1 Unimplemented Instructions Some Nios II processor core implementations do n...

Page 18: ...n to the basic instruction set Note For optimum performance and device footprint Intel recommends using FPH2 rather than FPH1 These floating point instructions are implemented as custom instructions T...

Page 19: ...e and absolute operations support subnormal numbers The add subtract multiply divide square root and float to integer operations do NOT support subnormal numbers Subnormal operands are treated as sign...

Page 20: ...ne a b fcmples 230 1 a b 1 0 Supported None a b fcmpgts 229 1 a b 1 0 Supported None a b fcmpges 228 1 a b 1 0 Supported None a b fcmpeqs 227 1 a b 1 0 Supported None a b fcmpnes 226 1 a b 1 0 Support...

Page 21: ...codes to use the custom instructions for floating point operations 2 3 3 2 Nios II Floating Point Hardware FPH1 Component The FPH1 component supports addition subtraction multiplication and optionall...

Page 22: ...ner as when a breakpoint is encountered transfers execution to the routine located at the break address and asserts a debugack signal Asserting the debugreq signal when the processor is already paused...

Page 23: ...n interrupt level The Nios II processor uses the interrupt level in determining when to service the interrupt Any external interrupt can be configured as an NMI NMIs are not masked by the status PIE b...

Page 24: ...d I O organization are the most notable difference between Nios II processor systems and traditional microcontrollers Because Nios II processor systems are configurable the memories and peripherals va...

Page 25: ...ction and Data Buses The Nios II architecture supports separate instruction and data buses classifying it as a Harvard architecture Both the instruction and data buses are implemented as Avalon MM mas...

Page 26: ...read requests before data has returned from prior requests The Nios II processor can prefetch sequential instructions and perform branch prediction to keep the instruction pipe as active as possible T...

Page 27: ...and the structure of the system interconnect fabric The data and instruction master ports never cause a gridlock condition in which one port starves the other For highest performance assign the data...

Page 28: ...uction cache may degrade performance in this situation If an application always requires certain data or sections of code to be located in cache memory for performance reasons the tightly coupled memo...

Page 29: ...architecture supports tightly coupled memory for both instruction and data access Each tightly coupled memory port connects directly to exactly one memory with guaranteed low fixed latency The memory...

Page 30: ...II MMU provides the following features and functionality Virtual to physical address mapping Memory protection 32 bit virtual and physical addresses mapping a 4 GB virtual address space into as much a...

Page 31: ...gions Variable instruction and data region sizes Amount of region memory defined by size or upper address limit Read and write access permissions for data regions Execute access permissions for instru...

Page 32: ...o the routine located at the break address The break address is specified with the Nios II Processor parameter editor in Platform Designer Soft processor cores such as the Nios II processor offer uniq...

Page 33: ...g action based on conditions on the instruction or data bus during real time program execution Triggers can do more than halt processor execution For example a trigger can be used to enable trace data...

Page 34: ...on a range of values within a specified range 2 7 6 Trace Capture Trace capture refers to ability to record the instruction by instruction execution of the processor as it executes code in real time...

Page 35: ...ce with the processor executing in real time execution trace is optimized to store only selected addresses such as branches calls traps and interrupts From these addresses host side debug software can...

Page 36: ...ng sections define the modes their relationship to your system software and application code and their relationship to the Nios II MMU and Nios II MPU 3 1 1 Supervisor Mode Supervisor mode allows unre...

Page 37: ...to make requests to the operating system to perform I O operations manage memory and access other system functionality in the supervisor memory The Nios II MMU statically divides the 32 bit virtual a...

Page 38: ...al address is the address that software uses A physical address is the address which the hardware outputs on the address lines of the Avalon bus The Nios II MMU divides virtual memory into 4 KB pages...

Page 39: ...is allowed to read data from write data to or execute instructions on each particular page The MMU also controls whether accesses to each data page are cacheable or uncacheable by default Whenever an...

Page 40: ...0000 Accessed directly or viaTLB Accessed only viaTLB High physical memory can only be accessed through the TLB Any physical address in low memory 29 bits or less can be accessed through the TLB or by...

Page 41: ...III Stratix IV 256 entries requiring one M9K RAM For more information refer to the Instantiating the Nios II Processor chapter of the Nios II Processor Reference Handbook The operating system software...

Page 42: ...ual addresses into each TLB entry Related Information Programming Model on page 36 Nios II Core Implementation Details on page 121 3 2 5 TLB Lookups A TLB lookup attempts to convert a virtual address...

Page 43: ...de although all software can run in supervisor mode if desired System software defines which MPU regions belong to supervisor mode and which belong to user mode MPU protects user application Therefore...

Page 44: ...region_limit The region limit uses a less than instead of a less than or equal to comparison because less than provides a more efficient implementation The limit is one bit larger than the address so...

Page 45: ...nformation Related Information Working with the MPU on page 68 3 4 Registers The Nios II register set includes general purpose registers and control registers In addition the Nios II f core can option...

Page 46: ...The Status Register section For more information refer to the Application Binary Interface chapter of the Nios II Processor Reference Handbook Related Information Application Binary Interface on page...

Page 47: ...efer to The config Register on page 58 Available only when the MPU or ECC is present Otherwise reserved 14 mpubase Refer to The mpubase Register Available only when the MPU is present Otherwise reserv...

Page 48: ...bmisc Refer to The tlbmisc Register Available only when the MMU is present Otherwise reserved 11 eccinj Refer to The eccinj Register Available only when ECC is present 12 badaddr Refer to The badaddr...

Page 49: ...e n is the number of implemented register sets The processor core implements the number of significant bits needed to represent n 1 Unused high order bits are always read as 0 and must be written as 0...

Page 50: ...interrupt exceptions are unaffected by PIE Read Write 0 Always Related Information External Interrupt Controller Interface on page 80 3 4 2 2 The estatus Register The estatus register holds a saved co...

Page 51: ...Field Description table describes the details of the fields defined in the bstatus register When a break occurs the value of the status register is copied into bstatus Using bstatus the debugger can r...

Page 52: ...ore information refer to the Nios II Core Implementation Details chapter of this document For information about controlling the extra exception information option refer to the Instantiating the Nios I...

Page 53: ...olation exception or on a TLB read operation The VPN field is not written on any exceptions taken when an exception is already active that is when status EH is already one 3 4 2 9 The tlbacc Register...

Page 54: ...uctions are allowed to access memory Read Write 0 Only with MMU X X is the executable flag When X 0 instructions are not allowed to execute When X 1 instructions are allowed to execute Read Write 0 On...

Page 55: ...0 Only with MMU DBL DBL is the double TLB miss exception flag Read 0 Only with MMU BAD BAD is the bad virtual address exception flag Read 0 Only with MMU PERM PERM is the TLB permission violation exc...

Page 56: ...que identifier for the current process that effectively extends the virtual address The process identifier can be less than 14 bits Any unused upper bits must be zero tlbmisc PID contains the PID fiel...

Page 57: ...the exception is related to a data access and clears D to zero for all other nonbreak exceptions The following exceptions set the D flag to one Fast TLB miss data Double TLB miss data TLB permission...

Page 58: ...es are always full 32 bit values When an MMU is present the BADDR field contains the virtual address If there is no MMU or MPU and the Nios II address space is less than 32 bits unused high order bits...

Page 59: ...ble bit When PE 1 the MPU is enabled When PE 0 the MPU is disabled In systems without an MPU PE is always zero Read Write 0 Only with MPU 3 4 2 13 The mpubase Register The mpubase register works in co...

Page 60: ...nd are read as zero Refer to the MPU Region Read and Write Operations section for more information on MPU region read and write operations Related Information MPU Region Read and Write Operations on p...

Page 61: ...ite 0 Only with MPU The MASK and LIMIT fields are mutually exclusive Refer to mpucc Control Register Field for MASK Variation Table and mpuacc Control Register Field for LIMIT Variation Table The foll...

Page 62: ...size is in bytes MASK 0xFFFFFF log2 region_size 8 3 4 2 14 2 The LIMIT Field When the amount of memory reserved for a region is defined by an upper address limit the LIMIT field specifies the upper ad...

Page 63: ...n for more information on cache bypass 3 4 2 14 4 The PERM Field The PERM field specifies the allowed access permissions Table 36 Instruction Region Permission Values Value Supervisor Permissions User...

Page 64: ...9 8 7 6 5 4 3 2 1 0 DTCM 1 DTCM 0 TLB DC DAT DC TAG ICDAT ICTAG RF Software writes 0x1 to inject a 1 bit ECC error or 0x2 to inject a 2 bit ECC error to the RAM field Hardware sets the value of the i...

Page 65: ...n have up to 63 shadow register sets If n is the configured number of shadow register sets the shadow register sets are numbered from 1 to n Register set 0 is the normal register set A shadow register...

Page 66: ...hen an external interrupt occurs if the interrupt required the processor to switch to a different register set Read Write Undefined EIC interface and shadow register sets only RSIE RSIE is the registe...

Page 67: ...execute eret If the processor is currently running in a shadow register set insert the new register set number in sstatus CRS and execute eret Before executing eret to change the register set system s...

Page 68: ...he mpubase register to read the loaded the mpubase register value Execute a rdctl instruction to the mpuacc register to read the loaded the mpuacc register value The MPU region read operation retrieve...

Page 69: ...ro When using limit set the mpubase BASE to a nonzero value and clear mpuacc LIMIT to zero Note You must enable at least one instruction and one data region otherwise unpredictable behavior might occu...

Page 70: ...bit error because the TLB is a software managed cache of the operating system page tables stored in the main memory e g SDRAM Software can invalid the TLB entry return to the instruction that took the...

Page 71: ...CINJ ICDAT to INJS or INJD This setting causes an ECC error to occur on the start of the next line fill 5 Use a JMP instruction to jump to an instruction address in the flushed line 6 The ECC error is...

Page 72: ...ion to ensure the value of ECCINJ DCTAG is NOINJ Before the RDCTL use a FLUSHP instruction to avoid the RAW hazard on ECCINJ 6 Do another LOAD or STORE instruction to the same line 7 The ECC error sho...

Page 73: ...RAW hazard on ECCINJ 5 Either use a LOAD instruction from the same address or trigger a writeback of the dirty line e g FLUSHDA instruction 6 The ECC error should be triggered on the LOAD instruction...

Page 74: ...licit request signal from an external device also hardware interrupt Interrupt controller hardware that interfaces the processor to interrupt request signals from external devices Internal interrupt c...

Page 75: ...specify in the Nios II processor IP core setup parameters The following table columns specify information for the exceptions Exception Gives the name of the exception Type Specifies the exception typ...

Page 76: ...ion related MMU or MPU 10 ea 4 16 General exception Trap instruction Instruction related Always 3 ea 4 16 General exception Illegal instruction Instruction related Illegal instruction detection on MMU...

Page 77: ...ption the software and hardware configuration and the processor state 3 7 3 1 Interrupt Latency The interrupt controller can mask individual interrupts Each interrupt can have a different maximum mask...

Page 78: ...tly zero Control registers except for status status RSIE is reset to 1 and the remaining fields are reset to 0 Instruction and data memory Cache memory except for the instruction cache line associated...

Page 79: ...editor Note All noninterrupt exception handlers including the break handler must run in the normal register set 3 7 5 2 Understanding Register Usage The bstatus control register and general purpose r...

Page 80: ...RNMI mode Refer to the Requested NMI Mode section of this chapter The Nios II processor EIC interface connects to a single EIC but an EIC can support a daisy chained configuration In a daisy chained c...

Page 81: ...d restores them on exit Typically the Nios II processor is configured so that when it takes an interrupt other interrupts in the same register set are disabled If interrupt preemption within a registe...

Page 82: ...rqn is asserted The corresponding bit n of the ienable control register is one Upon hardware interrupt the processor clears the PIE bit to zero disabling further interrupts and performs the other step...

Page 83: ...ing Flow on page 88 3 7 7 Instruction Related Exceptions Instruction related exceptions occur during execution of Nios II instructions When they occur the processor perform the steps outlined in the E...

Page 84: ...lated Information Break Exceptions on page 78 3 7 7 3 Unimplemented Instruction When the processor issues a valid instruction that is not implemented in hardware an unimplemented instruction exception...

Page 85: ...Model on page 36 Nios II Core Implementation Details on page 121 3 7 7 5 Supervisor Only Instruction When your system contains an MMU or MPU and the processor is in user mode status U 1 executing a s...

Page 86: ...e width of the load or store instruction data width four bytes for word two bytes for half word Byte load and store instructions are always aligned so never take a misaligned address exception Related...

Page 87: ...etch can cause this exception Fast TLB miss data Load store initda and flushda instructions can cause this exception The fast TLB miss exception handler can inspect the tlbmisc D field to determine wh...

Page 88: ...for that region did not allow the action to complete An instruction fetch or data address did not match any region The general exception handler reads the MPU region attributes to determine if the ad...

Page 89: ...the fast TLB miss exception It is built for speed to process TLB misses quickly The fast TLB miss exception handler address specified with the Nios II Processor parameter editor in Platform Designer...

Page 90: ...ts as described in the Shadow Register Set section of this chapter Keeping status PIE set allows higher level interrupts to be taken immediate without requiring the interrupt handler to set status PIE...

Page 91: ...errupt Exception status EH 1 34 status EH 0 status EH 1 status EH 0 TLB Miss 36 No TLB Miss RRS 0 35 RRS 0 RRS 0 RRS 0 TLB Permission Violation 36 No TLB Permission Violation pteaddr VPN 20 No change...

Page 92: ...ermission violation set to 0 otherwise 30 Set to 1 on a bad virtual address exception set to 0 otherwise 31 Disables exceptions and nonmaskable interrupts 32 If the MMU is implemented indicates that t...

Page 93: ...ck for interrupt exceptions With an external interrupt controller ipending is always 0 and this check can be omitted if estatus PIE 1 and ipending 0 handle interrupt Decode exception from instruction...

Page 94: ...itten to prevent each handler from corrupting the context in which a pre empted handler runs If an exception handler issues a trap instruction an optional instruction or an instruction which could gen...

Page 95: ...o another register set The following tables demonstrate the validity of register set assignments when preemption within a register set is enabled Table 45 Example of Illegal RIL Assignment RIL Registe...

Page 96: ...intact the processor state associated with maskable interrupts and other exceptions as well as normal nonexception processing when each NMI is assigned to a dedicated shadow register set Therefore NMI...

Page 97: ...ation Handling Nested Exceptions on page 94 Exception Processing on page 74 3 7 13 4 Returning From Interrupt and Instruction Related Exceptions The eret instruction is used to resume execution at the...

Page 98: ...ted exceptions execution must resume from the instruction following the instruction where the exception occurred Therefore ea contains the correct return address On the other hand hardware interrupt e...

Page 99: ...ch processor cores implement bit 31 cache bypass Refer to Instruction Set Reference chapter of the Nios II Processor Reference Handbook for details of the cache bypass instructions Code written for a...

Page 100: ...o select the line and are translated bits bits 12 and up are known as the color of the address An operating system avoids illegal virtual address aliases by ensuring that if multiple virtual addresses...

Page 101: ...ri xori These operations are immediate versions of the and or and xor instructions The 16 bit immediate value is zero extended to 32 bits and then combined with a register value to form the result and...

Page 102: ...ions perform all the equality and relational operators of the C programming language Table 51 Comparison Instructions Instruction Description cmpeq cmpne cmpge signed cmpgeu unsigned cmpgt signed cmpg...

Page 103: ...a register and stores the return address in register ra This instruction serves the roll of dereferencing a C function pointer ret The ret instruction is used to return from subroutines called by call...

Page 104: ...s from the pipeline This is necessary before jumping to recently modified instruction memory sync This instruction ensures that all previously issued operations have completed before allowing executio...

Page 105: ...erate an unimplemented instruction exception An exception routine must exercise caution if it uses these instructions because they could generate another exception before the previous exception is pro...

Page 106: ...elect the processor core The core you select on this tab affects other options available on this and other tabs Figure 6 Nios II Platform Designer Main Tab NII PRG 2018 04 18 Intel Corporation All rig...

Page 107: ...er to the Nios II Core Implementation Details chapter of the Nios II Processor Reference Handbook Related Information Nios II Core Implementation Details on page 121 4 2 Vectors Tab Figure 7 Nios II P...

Page 108: ...n a typical system select a low latency memory module for the exception code Exception vector offset specifies the location of the exception vector relative to the memory module s base address Platfor...

Page 109: ...displays the readonly calculated address The address is always a physical address even when an MMU is present Note The Nios II MMU is optional and mutually exclusive from the Nios II MPU Nios II syste...

Page 110: ...th but might consume additional FPGA resources Be aware that when bursts are enabled accesses to slaves might go through additional hardware called burst adapters which might decrease your fMAX When t...

Page 111: ...memory bandwidth but might consume additional FPGA resources Be aware that when bursts are enabled accesses to slaves might go through additional hardware called burst adapters which might decrease y...

Page 112: ...best option to balance embedded multiplier usage logic element LE usage and performance Figure 9 Nios II Platform Designer Arithmetic Instructions Tab 4 4 1 Arithmetic Instructions Multiply Shift Rot...

Page 113: ...s achieved by using the 32 bit multiplier along with the multiply extended instructions mulxss mulxsu mulxuu which can be found in the Instruction Set Reference chapter of this manual Table 58 64 bit...

Page 114: ...te based on the device family of the target hardware and disables TLB entries TLB entries Specifies the number of entries in the translation lookaside buffer TLB TLB Set Associativity Specifies the nu...

Page 115: ...ction region size Specifies the minimum instruction region size Allowed values range from 256 bytes to 1 MB and must be a power of two Note The maximum region size is the size of the Nios II instructi...

Page 116: ...akpoint on instructions residing in RAM Hardware Breakpoints Sets a breakpoint on instructions residing in nonvolatile memory such as flash memory Data Triggers Triggers based on address value data va...

Page 117: ...AM blocks Every M4K RAM block can store up to 128 trace frames Related Information Processor Architecture on page 14 4 7 Advanced Features Tab Figure 12 Nios II Platform Designer Advanced Features Tab...

Page 118: ...e configured with up to 63 shadow register sets Shadow register sets are available only on the Nios II f core Note When the EIC interface and shadow register sets are implemented on the Nios II core y...

Page 119: ...e on There are two misaligned memory address exceptions Misaligned data address Data addresses of load and store instructions are checked for misalignment A data address is considered misaligned if th...

Page 120: ...o add this qip file to the current project at the time of Quartus Prime file generation In most cases the qip file contains all of the necessary assignments and information required to process the cor...

Page 121: ...nal Cyclone III Final Cyclone III LS Final Cyclone IV GX Final Cyclone IV E Final Cyclone V Final Intel Cyclone 10 LP Final Stratix II Final Stratix II GX Final Stratix III Final continued NII PRG 201...

Page 122: ...atency Maximize fMAX performance of the processor core The resulting core is optimal for performance critical applications as well as for applications with large amounts of code and or data such as sy...

Page 123: ...le on the target device This option is available only on Intel FPGAs that have a hardware multiplier that supports 32 bit multiplication Embedded Multipliers Includes dedicated embedded multipliers av...

Page 124: ...an add operation that uses the result of the multiply On the Nios II f core the addi instruction like most ALU instructions executes in a single cycle However in this code example execution of the ad...

Page 125: ...port Note Although the Nios II processor can operate entirely out of tightly coupled memory without the need for Avalon MM instruction or data masters software debug is not possible when either the A...

Page 126: ...he is optional However excluding instruction cache from the Nios II f core requires that the core include at least one tightly coupled instruction memory 5 2 3 2 2 Data Cache Direct mapped cache imple...

Page 127: ...ote Avoid mixing cached and uncached accesses to the same cache line regardless whether you are reading from or writing to the cache line If it is necessary to mix cached and uncached data accesses fl...

Page 128: ...sts of one main TLB stored in on chip RAM and two separate micro TLBs TLB for instructions ITLB and data DTLB stored in LE based registers The TLBs have a configurable number of entries and are fully...

Page 129: ...ad multiply shift 5 2 7 1 Pipeline Stalls The pipeline is set up so that if a stage stalls no new values enter that stage or any earlier stages No catching up of pipeline stages is allowed even if a p...

Page 130: ...ty and an execution time of four cycles Instructions that require Avalon MM transfers are stalled until any required Avalon MM transfers up to one write and one read are completed Table 68 Instruction...

Page 131: ...tions are present they require five clock cycles MMU MPU Division exception Misaligned load store address exception EIC port Shadow register sets Related Information Data Cache on page 126 Instruction...

Page 132: ...sted interrupt level RIL The six bit interrupt level If RIL is 0 no interrupt is requested Requested nonmaskable interrupt RNMI flag A one bit flag indicating whether the interrupt is to be treated as...

Page 133: ...Nios II includes the ECC encoder decoder logic for each TCM and the TCM master port data width is increased to allow the Nios II to read and write the ECC parity bits The TCM must be a RAM and must s...

Page 134: ...line None Data cache present 18 DCWB_UE Unrecoverable 2 bit ECC error in data cache data RAM or victim line buffer RAM during writeback of a dirty line Likely fatal Data cache present 19 TLB_RE Recove...

Page 135: ...rview The Nios II s core Has an instruction cache but no data cache Can access up to 2 GB of external address space Supports optional tightly coupled memory for instructions Employs a 5 stage pipeline...

Page 136: ...the Nios II s Core ALU Option Hardware Details Cycles per instruction Supported Instructions No hardware multiply or divide Multiply and divide instructions generate an exception None LE based multipl...

Page 137: ...ast one tightly-coupled memory to take the place of the missing master port. Note: Although the Nios II processor can operate entirely out of tightly-coupled memory without the need for Avalon-MM instru...

Page 138: ...reside in tightly-coupled memory. If the address resides in tightly-coupled memory, the Nios II core fetches the instruction through the tightly-coupled memory interface. Software does not require aware...

Page 139: ...erforming its operation when using the multicycle shift circuitry i e when the hardware multiplier is not available An M stage shift rotate multiply instruction is still performing its operation when...

Page 140: ...5 3 8 JTAG Debug Module The Nios II s core supports the JTAG debug module to provide a JTAG interface to software debugging tools The Nios II s core supports an optional enhanced interface that allows...

Page 141: ...shift circuitry achieves one bit per cycle shift and rotate operations 5 4 3 Memory Access The Nios II e core does not provide instruction cache or data cache All memory and peripheral accesses genera...

Page 142: ...ructions e g add cmplt 6 All branch jmp jmpi ret call callr 6 trap break eret bret flushp wrctl rdctl unimplemented 6 All load word 6 Duration of Avalon MM read transfer All load halfword 9 Duration o...

Page 143: ...rsion Changes 2018 04 18 Implemented editorial enhancements 2017 05 08 Added link to Nios II Performance Benchmarks 2016 10 28 Maintenance release 2015 04 02 Initial release 5 Nios II Core Implementat...

Page 144: ...hitecture Revisions Architecture revisions augment the fundamental capabilities of the Nios II architecture and affect all Nios II cores A change in the architecture mandates a revision to all Nios II...

Page 145: ...e Nios II s core 6 3 3 Nios II e Core Table 80 Nios II e Core Revisions Version Release Date Notes 14 0 January 2015 Initial release of the Nios II e core 6 4 JTAG Debug Module Revisions JTAG debug mo...

Page 146: ...only be aligned to a 32 bit boundary Structures unions and strings must be aligned to a minimum of 32 bits Bit fields inside structures are always 32 bit aligned NII PRG 2018 04 18 Intel Corporation A...

Page 147: ...second 32 bits r6 v Register arguments third 32 bits r7 v Register arguments fourth 32 bits r8 v Caller saved general purpose registers r9 v r10 v r11 v r12 v r13 v r14 v r15 v r16 v v Callee saved g...

Page 148: ...points to the last used slot The frame pointer points to the saved frame pointer near the top of the stack frame The figure below shows an example of the structure of a current frame In this case func...

Page 149: ...support. If you are not using a debugger, you can optimize your code by eliminating the frame pointer, using the -fomit-frame-pointer compiler option. When the frame pointer is eliminated, register fp is a...

Page 150: ...[Figure: stack frame before and after calling alloca, drawn from higher addresses to lower addresses, showing sp, the space for outgoing stack arguments, and the memory allocated by alloca]...

Page 151: ...ure is passed using registers the function might need to copy the register contents back to the stack This operation is similar to that required in the variable arguments case as shown in the figure a...

Page 152: ...er Usage on page 147. 7.4.4.1. Prologue Variations: The following variations can occur in a prologue. If the function's frame size is greater than 32,767 bytes, extra temporary registers are used in the ca...

Page 153: ...th Variable Arguments Related Information Stack Frame for a Function with Variable Arguments on page 150 7 5 2 Return Values Return values of types up to 8 bytes are returned in r2 and r3 For return v...

Page 154: ...type specifies how to calculate the relocated address The bit mask specifies where the address is found in the instruction Table 85 Nios II Relocation Calculation Name Value Overflow check 45 Relocat...

Page 155: ...n a n a R_NIOS2_UJMP 18 No S A 16 0xFFFF S A 4 0xFFFF 0x003FFFC0 6 R_NIOS2_CJMP 19 No S A 16 0xFFFF S A 4 0xFFFF 0x003FFFC0 6 R_NIOS2_CALLR 20 No S A 16 0xFFFF S A 4 0xFFFF 0x003FFFC0 6 R_NIOS2_ALIGN...

Page 156: ...cation section 0xFFFFFFFF 0 R_NIOS2_RELATIVE 39 47 No BA A 0xFFFFFFFF 0 R_NIOS2_GOTOFF 40 47 No S A 0xFFFFFFFF 0 R_NIOS2_GOT_LO 42 47 No G 0xFFFF 0x003FFFC0 6 R_NIOS2_GOT_HA 43 47 No Adj G 0x003FFFC0...

Page 157: ...eyond the Linux specific information in Nios II ABI Register Usage Table and the Nios II Relocation Calculation Table Related Information Relocation on page 154 Register Usage on page 147 7 9 1 Linux...

Page 158: ..._NIOS2_GOTOFF_HA gotoff_lo R_NIOS2_PCREL_LO hiadj R_NIOS2_PCREL_HA lo R_NIOS2_TLS_GD16 tls_gd R_NIOS2_TLS_LDM16 tls_ldm R_NIOS2_TLS_LDO16 tls_ldo R_NIOS2_TLS_IE16 tls_ie R_NIOS2_TLS_LE16 tls_le R_NIOS...

Page 159: ...eturn value is always returned in r2 Calls to __tls_get_addr must use the normal position independent code PIC calling convention in PIC code these sequences are for example only and the compiler migh...

Page 160: ...odified. On non-Linux systems, r23 is a general-purpose, callee-saved register. The global pointer, r26 or gp, is globally fixed. It is initialized in startup code and always valid on entry to a function. Thi...

Page 161: ...n SIGILL Illegal instruction SIGILL Break instruction SIGTRAP Supervisor only data address SIGSEGV Misaligned data address SIGBUS Misaligned destination address SIGBUS Division error SIGFPE TLB Permis...

Page 162: ...of pairs of 32 bit tag and 32 bit value terminated by an AT_NULL tag 7 9 5 Linux Position Independent Code Every position independent code PIC function which uses global data or global functions must...

Page 163: ...3 r22 GOT n R_NIOS2_RELATIVE x The call and jmpi instructions are not available in position independent code Instead all calls are made through the GOT Function addresses may be loaded with call which...

Page 164: ...otoff Label1 word gotoff Label2 word gotoff Label3 Related Information Procedure Linkage Table on page 165 7 9 6 Linux Program Loading and Dynamic Linking 7 9 6 1 Global Offset Table Because shared li...

Page 165: ...for lazy binding The link editor fills in an initial value pointing to the lazy binding stubs at the start of the PLT section Each PLT entry appears as shown in the example below Example 24 PLT Entry...

Page 166: ...unctions outside the current shared object must pass through the GOT The program loads function addresses using call and the link editor may arrange for such entries to be lazily bound Because PLT ent...

Page 167: ...ap instruction with immediate operand 31 (all ones). The OS must distinguish this instruction from a trap 0 system call and generate a trap signal. 7.9.7.3. Atomic Operations: The Nios II architecture does...

Page 168: ...I R2 ISA 7 11 Application Binary Interface Revision History Document Version Changes 2018 04 18 Implemented editorial enhancements Updated the information about object macros in Development Environmen...

Page 169: ...4 3 2 1 0 IMM16 OP 8 1 2 R Type The defining characteristic of the R type instruction word format is that all arguments and results are specified as registers R type instructions contain A 6 bit opco...

Page 170: ...and jmpi transfer execution anywhere within a 256 MB range Table 92 J Type Instruction Format Bit Fields 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 IMM26 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 IMM...

Page 171: ...Table 94 OPX Encodings for R Type Instructions OPX Instruction OPX Instruction OPX Instruction OPX Instruction 0x00 0x10 cmplt 0x20 cmpeq 0x30 cmpltu 0x01 eret 0x11 0x21 0x31 add 0x02 roli 0x12 slli...

Page 172: ...(IMMED+1); cmple rC, rA, rB = cmpge rC, rB, rA; cmplei rB, rA, IMMED = cmplti rB, rA, (IMMED+1); cmpleu rC, rA, rB = cmpgeu rC, rB, rA; cmpleui rB, rA, IMMED = cmpltui rB, rA, (IMMED+1); mov rC, rA = add rC, rA, r0; movhi rB, IMMED = orhi rB, r0...

Page 173: ...all Nios II instruction mnemonics in alphabetical order Table 97 Notation Conventions Notation Meaning X Y X is written with Y PC X The program counter PC is written with address X the instruction at...

Page 174: ...tion about these and all Nios II exceptions refer to the Programming Model chapter of the Nios II Processor Reference Handbook Related Information Programming Model on page 36 8 5 1 add Instruction ad...

Page 175: ...21 20 19 18 17 16 A B C 0x31 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 0x31 0 0x3A 8 5 2 addi Instruction addi Operation rB rA IMM16 Assembler Syntax addi rB rA IMM16 Example addi r6 r7 100 Description Si...
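The bit-field row for add on this page (A, B, C, 0x31, 0, 0x3A) can be turned into a small encoder. The following is a minimal sketch in C, not an official tool: the exact bit positions (A in bits 31..27, B in 26..22, C in 21..17, OPX in 16..11, IMM5 in 10..6, OP in 5..0) are assumptions of this sketch, and the names encode_rtype and encode_add are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Pack an R-type instruction word.
 * Assumed layout (not stated in the snippet above):
 *   A[31:27] B[26:22] C[21:17] OPX[16:11] IMM5[10:6] OP[5:0]
 * The OP field of every R-type instruction is taken to be 0x3A. */
static uint32_t encode_rtype(uint32_t a, uint32_t b, uint32_t c,
                             uint32_t opx, uint32_t imm5)
{
    const uint32_t OP_RTYPE = 0x3Au;

    /* Each field must fit in its width. */
    assert(a < 32 && b < 32 && c < 32 && opx < 64 && imm5 < 32);

    return (a << 27) | (b << 22) | (c << 17) |
           (opx << 11) | (imm5 << 6) | OP_RTYPE;
}

/* add rC, rA, rB: OPX 0x31 with the IMM5 field set to 0. */
static uint32_t encode_add(uint32_t rc, uint32_t ra, uint32_t rb)
{
    return encode_rtype(ra, rb, rc, 0x31u, 0u);
}
```

Under these assumptions, encode_add(6, 7, 8), which corresponds to add r6, r7, r8, yields 0x3A0D883A.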

Page 176: ...ptions None Instruction Type I Instruction Fields A Register index of operand rA B Register index of operand rB IMM16 16 bit signed immediate value Bit Fields 31 30 29 28 27 26 25 24 23 22 21 20 19 18...

Page 177: ...B Register index of operand rB IMM16 16 bit unsigned immediate value Bit Fields 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 A B IMM16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 IMM16 0x2c 8 5 5 andi I...

Page 178: ...ptions Misaligned destination address Instruction Type I Instruction Fields A Register index of operand rA B Register index of operand rB IMM16 16 bit signed immediate value Bit Fields 31 30 29 28 27...

Page 179: ...on: If (unsigned) rA >= (unsigned) rB, then bgeu transfers program control to the instruction at label. In the instruction encoding, the offset given by IMM16 is treated as a signed number of bytes relative to t...
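The offset rule for conditional branches (IMM16 is a signed byte count relative to the instruction after the branch) can be sketched as follows. This is an illustrative C helper, assuming 4-byte instructions; the name branch_target is hypothetical.

```c
#include <stdint.h>

/* Compute the destination of a taken branch located at address pc.
 * imm16 holds the raw 16-bit field from the instruction word. */
static uint32_t branch_target(uint32_t pc, uint16_t imm16)
{
    /* Sign-extend the 16-bit field to 32 bits without relying on
     * implementation-defined narrowing conversions. */
    int32_t offset = (int32_t)(imm16 ^ 0x8000u) - 0x8000;

    /* The offset is relative to the instruction following the branch. */
    return pc + 4u + (uint32_t)offset;
}
```

For example, a field value of 0xFFFC (that is, -4) makes the branch jump to itself, because the -4 cancels the +4 for the following instruction.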

Page 180: ...nstruction at label. Pseudo-instruction: bgtu is implemented with the bltu instruction by swapping the register operands. 8.5.11. ble. Instruction: branch if less than or equal signed. Operation: if ((signed) rA...

Page 181: ...word aligned Exceptions Misaligned destination address Instruction Type I Instruction Fields A Register index of operand rA B Register index of operand rB IMM16 16 bit signed immediate value Bit Fiel...

Page 182: ...uction encoding, the offset given by IMM16 is treated as a signed number of bytes relative to the instruction immediately following bne. The two least significant bits of IMM16 are always zero, because i...

Page 183: ...m execution and transfers control to the debugger break processing routine Saves the address of the next instruction in register ba and saves the contents of the status register in bstatus Disables in...

Page 184: ...tion Instruction Type R Instruction Fields None Bit Fields 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 0x1e 0 0x1e 0x09 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 0x09 0 0x3a 8 5 19 call Instruction ca...

Page 185: ...eference C language function pointers Exceptions Misaligned destination address Instruction Type R Instruction Fields A Register index of operand rA Bit Fields 31 30 29 28 27 26 25 24 23 22 21 20 19 1...

Page 186: ...s the 16-bit immediate value IMM16 to 32 bits and compares it to the value of rA. If rA == σ(IMM16), cmpeqi stores 1 to rB; otherwise stores 0 to rB. Usage: cmpeqi performs the == operation of the C programming lan...

Page 187: ...gned immediate. Operation: if ((signed) rA >= (signed) σ(IMM16)) then rB ← 1 else rB ← 0. Assembler Syntax: cmpgei rB, rA, IMM16. Example: cmpgei r6, r7, 100. Description: Sign-extends the 16-bit immediate value IMM16 to 32 bits...

Page 188: ...13 12 11 10 9 8 7 6 5 4 3 2 1 0 0x28 0 0x3a 8.5.26. cmpgeui. Instruction: compare greater than or equal unsigned immediate. Operation: if ((unsigned) rA >= (unsigned) (0x0000 : IMM16)) then rB ← 1 else rB ← 0. Assembler Sy...

Page 189: ...e. Operation: if ((signed) rA > (signed) IMMED) then rB ← 1 else rB ← 0. Assembler Syntax: cmpgti rB, rA, IMMED. Example: cmpgti r6, r7, 100. Description: Sign-extends the immediate value IMMED to 32 bits and compares it to...

Page 190: ...C programming language. The maximum allowed value of IMMED is 65534. The minimum allowed value is 0. Pseudo-instruction: cmpgtui is implemented using a cmpgeui instruction with an IMM16 immediate value o...

Page 191: ...programming language Pseudo instruction cmpleu is implemented with the cmpgeu instruction by swapping its rA and rB operands 8 5 34 cmpleui Instruction compare less than or equal unsigned immediate Op...

Page 192: ...ate. Operation: if ((signed) rA < (signed) σ(IMM16)) then rB ← 1 else rB ← 0. Assembler Syntax: cmplti rB, rA, IMM16. Example: cmplti r6, r7, 100. Description: Sign-extends the 16-bit immediate value IMM16 to 32 bits and compar...

Page 193: ...13 12 11 10 9 8 7 6 5 4 3 2 1 0 0x30 0 0x3a 8.5.38. cmpltui. Instruction: compare less than unsigned immediate. Operation: if ((unsigned) rA < (unsigned) (0x0000 : IMM16)) then rB ← 1 else rB ← 0. Assembler Syntax: cmpltui...

Page 194: ...A B Register index of operand rB C Register index of operand rC Bit Fields 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 A B C 0x18 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 0x18 0 0x3a 8 5 40 cmpnei In...

Page 195: ...s which custom instruction to use Custom instructions can use up to two parameters xA and xB and can optionally write the result to a register xC Usage To access a custom register inside the custom in...

Page 196: ...resentable in 32 bits There is no overflow exception Nios II processors that do not implement the div instruction cause an unimplemented instruction exception Usage Remainder of Division If the result...

Page 197: ...The original divu operation rD remainder Exceptions Division error Unimplemented instruction Instruction Type R Instruction Fields A Register index of operand rA B Register index of operand rB C Regi...

Page 198: ...entifying the data cache line flushd ignores the tag field and only uses the line field to select the data cache line to clear Skip comparing the cache line tag with the effective address to determine...

Page 199: ...the cache This process comprises the following steps Compute the effective address specified by the sum of rA and the signed 16 bit immediate value Identify the data cache line associated with the co...

Page 200: ...9 18 17 16 A 0 IMM16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 IMM16 0x1b Related Information Cache and Tightly Coupled Memory initda on page 203 initd on page 201 flushd on page 198 8 5 47 flushi Instruc...

Page 201: ...uction memory Exceptions None Instruction Type R Instruction Fields None Bit Fields 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 A 0 0 0x04 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 0x04 0 0x3a 8 5 49...

Page 202: ...r the valid bit for the line If the Nios II processor core does not have a data cache the initd instruction performs no operation Usage Use initd after processor reset and before accessing data memory...

Page 203: ...ss is not currently cached so the instruction does nothing Skip checking if the data cache line is dirty Because initd skips the dirty cache line check data that has been modified by the processor but...

Page 204: ...essor core does not have an instruction cache the initi instruction performs no operation Usage This instruction is used to initialize the processor s instruction cache Immediately after processor res...

Page 205: ...x0d 0 0x3a 8.5.53. jmpi. Instruction: jump immediate. Operation: PC ← (PC31..28 : IMM26 x 4). Assembler Syntax: jmpi label. Example: jmpi write_char. Description: Transfers execution to the instruction at address (PC31...
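The jmpi operation keeps the top four bits of the program counter and replaces the rest with IMM26 multiplied by 4, which is why the reachable range is 256 MB. A minimal C sketch of that address calculation follows; the function name jmpi_target is hypothetical, and which PC value the hardware samples is left as an assumption.

```c
#include <stdint.h>

/* Form the jmpi destination: PC[31..28] concatenated with IMM26 * 4. */
static uint32_t jmpi_target(uint32_t pc, uint32_t imm26)
{
    uint32_t segment = pc & 0xF0000000u;             /* keep PC[31..28]   */
    uint32_t offset  = (imm26 & 0x03FFFFFFu) << 2;   /* 26-bit word index */

    return segment | offset;
}
```

Because the low 28 bits come entirely from the immediate, two jmpi instructions in the same 256 MB segment with the same IMM26 always reach the same address.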

Page 206: ...er In processors without a data cache ldbio acts like ldb For more information on data cache refer to the Cache and Tightly Coupled Memory chapter of the Nios II Software Developer s Handbook Exceptio...

Page 207: ...nsfer In processors without a data cache ldbuio acts like ldbu For more information on data cache refer to the Cache and Tightly Coupled Memory chapter of the Nios II Software Developer s Handbook Exc...

Page 208: ...rs with a data cache ldhio bypasses the cache and is guaranteed to generate an Avalon MM data transfer In processors without a data cache ldhio acts like ldh For more information on data cache refer t...

Page 209: ...d data from the cache instead of from memory Use the ldhuio instruction for peripheral I O In processors with a data cache ldhuio bypasses the cache and is guaranteed to generate an Avalon MM data tra...

Page 210: ...ache this instruction may retrieve the desired data from the cache instead of from memory Use the ldwio instruction for peripheral I O In processors with a data cache ldwio bypasses the cache and memo...

Page 211: ...high halfword. Operation: rB ← (IMMED : 0x0000). Assembler Syntax: movhi rB, IMMED. Example: movhi r6, 0x8000. Description: Writes the immediate value IMMED into the high halfword of rB, and clears the lower halfword...

Page 212: ...i is implemented as addi rB r0 IMMED 8 5 62 movia Instruction move immediate address into word Operation rB label Assembler Syntax movia rB label Example movia r6 function_address Description Writes t...

Page 213: ...enerates a carry unsigned overflow If a 0 1 result is desired follow the mulxuu with the cmpne instruction Overflow Detection signed operands After the multiply operation overflow can be detected usin...

Page 214: ...21 20 19 18 17 16 A B IMM16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 IMM16 0x24 8 5 66 mulxss Instruction multiply extended signed signed Operation rC signed rA x signed rB 63 32 Assembler Syntax mulxss...

Page 215: ...unimplemented instruction exception Usage mulxsu can be used as part of the calculation of a 128 bit product of two 64 bit signed integers Given two 64 bit integers each contained in a pair of 32 bit...

Page 216: ...t unsigned integers, each contained in a pair of 32-bit registers, (T1,U1) and (T2,U2), their 128-bit product is (U1 x U2) + ((U1 x T2) << 32) + ((T1 x U2) << 32) + ((T1 x T2) << 64). The mulxuu and mul instructions are used to calculate...
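The four partial products in the formula above can be accumulated word by word. The sketch below is an illustration in C rather than Nios II assembly: mul_lo models the low 32 bits produced by mul, mulxuu_hi models the high 32 bits produced by mulxuu, and all names (u128_words, add_at, mul64x64) are hypothetical.

```c
#include <stdint.h>

/* 128-bit result as four 32-bit words, w[0] least significant. */
typedef struct { uint32_t w[4]; } u128_words;

/* Low 32 bits of a 32x32 product (what mul returns). */
static uint32_t mul_lo(uint32_t a, uint32_t b)
{
    return a * b;
}

/* High 32 bits of an unsigned 32x32 product (what mulxuu returns). */
static uint32_t mulxuu_hi(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * b) >> 32);
}

/* Add v into word idx and ripple any carry into the higher words. */
static void add_at(u128_words *r, int idx, uint32_t v)
{
    while (idx < 4) {
        uint32_t old = r->w[idx];
        r->w[idx] = old + v;
        if (r->w[idx] >= old)   /* no unsigned wrap, so no carry */
            break;
        v = 1u;                 /* carry one into the next word */
        idx++;
    }
}

/* Multiply (T1:U1) by (T2:U2), where T is the high word and U the low. */
static u128_words mul64x64(uint32_t t1, uint32_t u1,
                           uint32_t t2, uint32_t u2)
{
    u128_words r = { { 0u, 0u, 0u, 0u } };
    const uint32_t a[2] = { u1, t1 };
    const uint32_t b[2] = { u2, t2 };

    /* Partial product a[i] * b[j] is shifted left by 32 * (i + j) bits. */
    for (int i = 0; i < 2; i++) {
        for (int j = 0; j < 2; j++) {
            add_at(&r, i + j,     mul_lo(a[i], b[j]));
            add_at(&r, i + j + 1, mulxuu_hi(a[i], b[j]));
        }
    }
    return r;
}
```

Each partial product contributes one mul result and one mulxuu result, so the full 128-bit product takes four of each plus the carry-propagating additions.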

Page 217: ...bitwise logical NOR of rA and rB and stores the result in rC Exceptions None Instruction Type R Instruction Fields A Register index of operand rA B Register index of operand rB C Register index of op...

Page 218: ...logical OR of rA and (IMM16 : 0x0000) and stores the result in rB. Exceptions: None. Instruction Type: I. Instruction Fields: A = Register index of operand rA; B = Register index of operand rB; IMM16 = 16-bit signed im...

Page 219: ...operand rC N Control register index of operand ctlN Bit Fields 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 0 0 C 0x26 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 0x26 N 0x3a 8 5 76 rdprs Instruction re...

Page 220: ...13 12 11 10 9 8 7 6 5 4 3 2 1 0 IMM16 0x38 8 5 77 ret Instruction return from subroutine Operation PC ra Assembler Syntax ret Example ret Description Transfers execution to the address in ra Usage Any...

Page 221: ...Rotates rA left by the number of bits specified in IMM5 and stores the result in rC The bits that shift out of the register rotate into the least significant bit positions Usage In addition to the ro...

Page 222: ...left logical. Operation: rC ← rA << (rB4..0). Assembler Syntax: sll rC, rA, rB. Example: sll r6, r7, r8. Description: Shifts rA left by the number of bits specified in rB4..0 (inserting zeroes), and then stores the result in...

Page 223: ...rA unsigned rB4 0 Assembler Syntax sra rC rA rB Example sra r6 r7 r8 Description Shifts rA right by the number of bits specified in rB4 0 duplicating the sign bit and then stores the result in rC Bits...

Page 224: ...x3a IMM5 0x3a 8.5.85. srl. Instruction: shift right logical. Operation: rC ← (unsigned) rA >> ((unsigned) rB4..0). Assembler Syntax: srl rC, rA, rB. Example: srl r6, r7, r8. Description: Shifts rA right by the number of bits sp...

Page 225: ...on: Mem8[rA + σ(IMM16)] ← rB7..0. Assembler Syntax: stb rB, byte_offset(rA); stbio rB, byte_offset(rA). Example: stb r6, 100(r5). Description: Computes the effective byte address specified by the sum of rA and the instructi...

Page 226: ...ruction s signed 16 bit immediate value Stores the low halfword of rB to the memory location specified by the effective byte address The effective byte address must be halfword aligned If the byte add...

Page 227: ...cation specified by the effective byte address The effective byte address must be word aligned If the byte address is not a multiple of 4 the operation is undefined Usage In processors with a data cac...

Page 228: ...e written to a register, or a conditional branch can be taken based on the carry condition. Both cases are shown in the following code: sub rC, rA, rB; cmpltu rD, rA, rB; sub rC, rA, rB; bltu rA, rB, label. The orig...
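The carry (borrow) test for sub reduces to an unsigned comparison of the original operands, which is what cmpltu computes. A minimal C sketch of the register-result variant follows; the function name sub_with_borrow is hypothetical.

```c
#include <stdint.h>

/* Subtract b from a and report whether an unsigned borrow occurred.
 * Mirrors the pair: sub rC, rA, rB / cmpltu rD, rA, rB. */
static uint32_t sub_with_borrow(uint32_t a, uint32_t b, uint32_t *borrow)
{
    *borrow = (a < b) ? 1u : 0u;   /* cmpltu rD, rA, rB */
    return a - b;                  /* sub rC, rA, rB (wraps modulo 2^32) */
}
```

The comparison must use the operands as they were before the subtraction, so if rC aliases rA or rB the comparison has to be done first.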

Page 229: ...xtends the immediate value IMMED to 32 bits, subtracts it from the value of rA, and then stores the result in rB. Usage: The maximum allowed value of IMMED is 32768. The minimum allowed value is -32767. Pseu...

Page 230: ...ified with the Nios_II Processor parameter editor in Platform Designer The 5 bit immediate field imm5 is ignored by the processor but it can be used by the debugger trap with no argument is the same a...

Page 231: ...is specified by status PRS By default status PRS indicates the register set in use before an exception such as an external interrupt caused a register set change To write to an arbitrary register set...

Page 232: ...5 4 3 2 1 0 0x1e 0 0x3a 8.5.97. xorhi. Instruction: bitwise logical exclusive or immediate into high halfword. Operation: rB ← rA ^ (IMM16 : 0x0000). Assembler Syntax: xorhi rB, rA, IMM16. Example: xorhi r6, r7, 100. Desc...

Page 233: ...truction Fields A Register index of operand rA B Register index of operand rB IMM16 16 bit unsigned immediate value Bit Fields 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 A B IMM16 15 14 13 12 11...
