Digital Equipment Alpha 21164PC Hardware Reference Manual


29 September 1997 – Subject To Change

Internal Architecture

2–25

Scheduling and Issuing Rules

[1] The multiplier is unable to receive data from IEU bypass paths. The instruction issues at the expected time,
but its latency is increased by the time it takes for the input data to become available to the multiplier. For
example, an IMULL instruction issued one cycle later than an ADDL instruction, which produced one of its
operands, has a latency of 10 (8 + 2). If the IMULL instruction is issued two cycles later than the ADDL
instruction, the latency is 9 (8 + 1).
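The arithmetic in this footnote can be sketched as a small Python model. This is an illustration only: the function name, the OPERAND_READY_DELAY constant, and the behavior for an issue gap of three or more cycles (no added latency) are assumptions extrapolated from the two cases the footnote gives, not statements from the manual.

```python
# Hypothetical model of the IMULL example in the footnote above.
IMULL_BASE_LATENCY = 8    # base IMULL latency used in the footnote's example
OPERAND_READY_DELAY = 3   # assumed: cycles after ADDL issue before the
                          # multiplier can read the ADDL result


def imull_latency_after_addl(issue_gap: int) -> int:
    """Return the IMULL latency when it issues `issue_gap` cycles after the
    ADDL instruction that produced one of its operands."""
    if issue_gap < 1:
        raise ValueError("a dependent IMULL issues at least one cycle after the ADDL")
    # The IMULL issues on time, but its latency grows by however long it
    # still has to wait for the operand to reach the multiplier.
    added = max(0, OPERAND_READY_DELAY - issue_gap)
    return IMULL_BASE_LATENCY + added
```

With these assumptions, a gap of one cycle gives 8 + 2 and a gap of two cycles gives 8 + 1, matching the footnote.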

[2] When idle, Bcache arbitration predicts a load miss in E0. If a load actually does miss in E0, it is sent to the
Bcache immediately. If it hits in the Bcache, and no other event in the CBU affects the operation, the requested
data is available for use in 10 or more cycles. Otherwise, the request takes longer (possibly much longer,
depending on the state of the CBU and memory). It should be possible to schedule some unrolled code loops
for Bcache by prefetching data into the Dcache using LDQ R31, x(Rx).
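As a rough aid for the loop scheduling mentioned in this footnote, the following Python sketch estimates how many loop iterations ahead a prefetch (such as LDQ R31, x(Rx)) would need to be issued to cover a best-case Bcache hit. The function name and the cycles-per-iteration input are hypothetical; only the 10-cycle figure comes from the footnote, and real latencies may be much longer.

```python
import math

BCACHE_HIT_MIN_CYCLES = 10  # from the footnote: best-case Bcache hit latency


def prefetch_distance(cycles_per_iteration: int,
                      load_latency: int = BCACHE_HIT_MIN_CYCLES) -> int:
    """Return how many iterations ahead to prefetch so the data has arrived
    by the time the loop body needs it (hypothetical helper)."""
    if cycles_per_iteration < 1:
        raise ValueError("cycles_per_iteration must be at least 1")
    # Round up: a partial iteration of slack is not enough to hide the load.
    return math.ceil(load_latency / cycles_per_iteration)
```

For example, a loop body assumed to take 4 cycles per iteration would need to prefetch 3 iterations ahead to hide a 10-cycle Bcache hit.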

[3] A special bypass provides an effective latency of 0 (zero) cycles for an ICMP or ILOG instruction producing
the test operand of an IBR or CMOV instruction. This is true only when the IBR or CMOV instruction issues
in the same cycle as the ICMP or ILOG instruction that produced its test operand. In all other cases, the
effective latency of ICMP and ILOG instructions is 1 cycle.
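The rule in this footnote can be written as a small decision function. This is a sketch under stated assumptions: the function and set names are hypothetical, and the consumer is assumed to use the produced value as its test operand.

```python
PRODUCER_CLASSES = {"ICMP", "ILOG"}       # classes the special bypass applies to
ZERO_LATENCY_CONSUMERS = {"IBR", "CMOV"}  # consumers that can use the bypass


def effective_latency(producer: str, consumer: str, same_cycle_issue: bool) -> int:
    """Return the effective latency, in cycles, of an ICMP or ILOG result
    as seen by `consumer` (hypothetical helper)."""
    if producer not in PRODUCER_CLASSES:
        raise ValueError("rule only covers ICMP and ILOG producers")
    # The 0-cycle case requires both a qualifying consumer and same-cycle issue.
    if consumer in ZERO_LATENCY_CONSUMERS and same_cycle_issue:
        return 0
    return 1
```

Any other combination, such as an IADD consumer or an IBR that issues in a later cycle, falls through to the 1-cycle case.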

Table 2–9 Instruction Latencies (Sheet 2 of 2)

| Class | Latency | Additional Time Before Result Available to Integer Multiply Unit [1] |
|-------|---------|------|
| IMULQ | Latency=12, plus up to 2 cycles of added latency, depending on the source of the data. Latency until next IMULL, IMULQ, or IMULH instruction can issue (if there are no data dependencies) is 8 cycles plus the number of cycles added to the latency. | 1 cycle |
| IMULH | Latency=14, plus up to 2 cycles of added latency, depending on the source of the data. Latency until next IMULL, IMULQ, or IMULH instruction can issue (if there are no data dependencies) is 8 cycles plus the number of cycles added to the latency. | 1 cycle |
| MVI | Latency=2. | 1 cycle |
| FADD | Latency=4. | — |
| FDIV | Data-dependent latency: 15 to 31 single precision, 22 to 60 double precision. Next floating divide can be issued in the same cycle the result of the previous divide is available, regardless of data dependencies. | — |
| FMUL | Latency=4. | — |
| FCPYS | Latency=4. | — |
| MISC | RPCC, latency=2. TRAPB produces no result. | 1 cycle |
| UNOP | UNOP produces no result. | — |
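The IMULQ and IMULH rows of the table can be captured in a small lookup, which makes the relationship between result latency and multiplier issue spacing easier to check. This is a sketch only: the names are hypothetical, and the number of added cycles is taken as an input because the table states only that it depends on the source of the data.

```python
from typing import NamedTuple, Tuple


class MultiplyLatency(NamedTuple):
    base: int       # latency in cycles with no added delay
    max_added: int  # upper bound on added cycles ("up to 2" in the table)


# Values copied from the IMULQ and IMULH rows of the table.
MULTIPLY_LATENCY = {
    "IMULQ": MultiplyLatency(base=12, max_added=2),
    "IMULH": MultiplyLatency(base=14, max_added=2),
}
MULTIPLY_ISSUE_SPACING = 8  # cycles until the next IMULL/IMULQ/IMULH can issue


def multiply_timing(instr_class: str, added_cycles: int) -> Tuple[int, int]:
    """Return (result latency, cycles until the next multiply can issue)
    for a multiply with `added_cycles` of operand delay, assuming no data
    dependencies between the two multiplies."""
    entry = MULTIPLY_LATENCY[instr_class]
    if not 0 <= added_cycles <= entry.max_added:
        raise ValueError("added_cycles outside the range given in the table")
    # The same added cycles extend both the result latency and the spacing.
    return entry.base + added_cycles, MULTIPLY_ISSUE_SPACING + added_cycles
```

For instance, an IMULQ with 2 added cycles has a result latency of 14 cycles and holds off the next multiply for 10 cycles.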


Summary of Contents for Alpha 21164PC

Page 1: ... Maynard Massachusetts http www digital com semiconductor Digital Semiconductor Alpha 21164PC Microprocessor Hardware Reference Manual Order Number EC R2W0A TE Revision Update Information This is a preliminary document Preliminary ...

Page 2: ... or software in accordance with the description Digital Equipment Corporation 1997 All rights reserved Printed in U S A DIGITAL Digital Semiconductor OpenVMS VAX the AlphaGeneration design mark and the DIGITAL logo are trademarks of Digital Equipment Corporation Digital Semiconductor is a Digital Equipment Corporation business 29 September 1997 Subject to Change GRAFOIL is a registered trademark o...

Page 3: ...Prefetch 2 4 2 1 1 3 Branch Execution 2 5 2 1 1 4 Instruction Translation Buffer 2 7 2 1 1 5 Interrupts 2 8 2 1 2 Integer Execution Unit 2 9 2 1 3 Floating Point Execution Unit 2 9 2 1 4 Memory Address Translation Unit 2 10 2 1 4 1 Data Translation Buffer 2 10 2 1 4 2 Load Instruction and the Miss Address File 2 11 2 1 4 3 Dcache Control and Store Instructions 2 11 2 1 4 4 Write Buffer 2 12 2 1 5 ...

Page 4: ... 2 31 2 5 4 Fill Operation 2 31 2 6 MTU Store Instruction Execution 2 32 2 7 Write Buffer and the WMB Instruction 2 33 2 7 1 The Write Buffer 2 34 2 7 2 The Write Memory Barrier WMB Instruction 2 34 2 7 3 Entry Pointer Queues 2 34 2 7 4 Write Buffer Entry Processing 2 35 2 7 5 Ordering of Noncacheable Space Write Instructions 2 36 2 8 Performance Measurement Support Performance Counters 2 36 2 8 1...

Page 5: ... Probes 4 26 4 6 6 Selecting Bcache Options 4 27 4 7 21164PC Initiated System Transactions 4 27 4 7 1 READ MISS Clean No Victim 4 30 4 7 2 FILL 4 32 4 7 3 READ MISS with Victim 4 33 4 7 4 WRITE BLOCK 4 37 4 8 System Initiated Transactions 4 38 4 8 1 Sending Commands to the 21164PC 4 38 4 8 2 Write Invalidate Protocol Commands 4 40 4 8 2 1 21164PC Responses to Flush Based Protocol Commands 4 41 4 8...

Page 6: ...03 5 7 5 1 4 Instruction Translation Buffer Page Table Entry Temporary ITB_PTE_TEMP Register 104 5 7 5 1 5 Instruction Translation Buffer Invalidate All Process ITB_IAP Register 106 5 7 5 1 6 Instruction Translation Buffer Invalidate All ITB_IA Register 105 5 8 5 1 7 Instruction Translation Buffer IS ITB_IS Register 107 5 8 5 1 8 Formatted Faulting Virtual Address IFAULT_VA_FORM Register 112 5 9 5...

Page 7: ...ocess DTB_IAP Register 209 5 40 5 2 12 Dstream Translation Buffer Invalidate All DTB_IA Register 20A 5 40 5 2 13 Dstream Translation Buffer Invalidate Single DTB_IS Register 20B 5 41 5 2 14 MTU Control MCSR Register 20F 5 42 5 2 15 Dcache Mode DC_MODE Register 216 5 44 5 2 16 Miss Address File Mode MAF_MODE Register 217 5 46 5 2 17 Dcache Flush DC_FLUSH Register 210 5 49 5 2 18 Alternate Mode ALT_...

Page 8: ...al Instruction Cache Load Operation 7 7 7 5 Serial Terminal Port 7 8 7 6 Cache Initialization 7 8 7 6 1 Icache Initialization 7 9 7 6 2 Flushing Dirty Blocks 7 9 7 7 External Interface Initialization 7 9 7 8 Internal Processor Register Reset State 7 10 7 9 Timeout Reset 7 12 7 10 IEEE 1149 1 Test Port Reset 7 13 8 Error Detection and Error Handling 8 1 Error Flows 8 1 8 1 1 Icache Data or Tag Pari...

Page 9: ... 4 4 Timing of Test Features 9 17 9 4 4 1 Icache BiSt Operation Timing 9 17 9 4 4 2 Automatic SROM Load Timing 9 19 9 4 5 Clock Test Modes 9 20 9 4 5 1 Normal 1 Clock Mode 9 20 9 4 5 2 Clock Test Reset Mode 9 20 9 4 6 IEEE 1149 1 JTAG Performance 9 21 9 5 Power Supply Considerations 9 21 9 5 1 Decoupling 9 22 9 5 1 1 Vdd Decoupling 9 22 9 5 1 2 Vddi Decoupling 9 22 9 5 2 Power Supply Sequencing 9 ...

Page 10: ...nstruction Summary A 1 A 1 1 Opcodes Reserved for DIGITAL A 9 A 1 2 Opcodes Reserved for PALcode A 10 A 2 IEEE Floating Point Instructions A 10 A 3 VAX Floating Point Instructions A 12 A 4 Opcode Summary A 13 A 5 Required PALcode Function Codes A 15 A 6 21164PC Microprocessor IEEE Floating Point Conformance A 15 B 21164PC Microprocessor Specifications C Serial Icache Load Predecode Values D Errata...

Page 11: ...thm for System Sending Commands to the 21164PC 4 39 4 19 FLUSH Timing Diagram Bcache Hit Flow Through SSRAM 4 42 4 20 INVALIDATE Timing Diagram Bcache Hit 4 43 4 21 READ Timing Diagram Bcache Hit Flow Through SSRAM 4 44 4 22 Driving the Command Address Bus 4 45 4 23 Using data_bus_req_h 4 48 4 24 System READ to FILL Spacing 4 49 4 25 FILL to Private READ or WRITE Operation 4 50 4 26 READ MISS with...

Page 12: ...er 5 32 5 30 Dstream Translation Buffer Page Table Entry DTB_PTE Register Write Format 5 33 5 31 Dstream Translation Buffer Page Table Entry Temporary DTB_PTE_TEMP Register 5 34 5 32 Dstream Memory Management Fault Status MM_STAT Register 5 35 5 33 Faulting Virtual Address VA Register 5 36 5 34 Formatted Virtual Address VA_FORM Register NT_Mode 1 5 37 5 35 Formatted Virtual Address VA_FORM Registe...

Page 13: ...Input Output Pin Timing 9 9 9 4 Bcache Timing 9 12 9 5 sys_clk System Timing 9 14 9 6 BiSt Timing Event Timeline 9 18 9 7 SROM Load Timing Event Timeline 9 19 9 8 Serial ROM Load Timing 9 20 10 1 Heat Sink 1 10 3 11 1 Package Dimensions 11 2 11 2 21164PC Top View Pin Down 11 8 11 3 21164PC Bottom View Pin Up 11 9 12 1 IEEE 1149 1 Test Access Port 12 3 12 2 TAP Controller State Machine 12 4 ...

Page 14: ...idate Protocol 4 40 4 10 21164PC Responses to Flush Based Protocol Commands 4 41 4 11 Interrupt Priority Level Effect 4 59 5 1 IDU MTU Dcache and PALtemp IPR Encodings 5 1 5 2 Granularity Hint Bits in ITB_PTE_TEMP Read Format 5 7 5 3 Icache Parity Error Status Register Fields 5 11 5 4 Exception Summary Register Fields 5 13 5 5 IDU Control and Status Register Fields 5 16 5 6 Software Interrupt Requ...

Page 15: ...ut Clock Specification 9 8 9 5 Bcache Loop Timing 9 10 9 6 Normal Output Driver Characteristics 9 11 9 7 Big Output Driver Characteristics 9 11 9 8 21164PC System Clock Output Timing sysclk Tø 9 13 9 9 Input Timing for sys_clk_out Based Systems 9 15 9 10 Output Timing for sys_clk_out Based Systems 9 16 9 11 Bcache Control Signal Timing 9 17 9 12 BiSt Timing for Some System Clock Ratios Port Mode N...

Page 16: ...ALcode A 10 A 5 IEEE Floating Point Instruction Function Codes A 10 A 6 VAX Floating Point Instruction Function Codes A 12 A 7 Opcode Summary A 14 A 8 Required PALcode Function Codes A 15 B 1 21164PC Microprocessor Specifications B 1 D 1 Document Revision History D 1 ...

Page 17: ...164PC and provides an overview of the Alpha architecture Chapter 2 Internal Architecture describes the major hardware functions and the internal chip architecture It describes performance measurement facilities cod ing rules and design examples Chapter 3 Hardware Interface lists and describes the external hardware inter face signals Chapter 4 Clocks Cache and External Interface describes the exter...

Page 18: ...ics describes chip and system testability features Appendix A Alpha Instruction Set summarizes the Alpha instruction set Appendix B 21164PC Microprocessor Specifications summarizes the 21164PC specifications Appendix C Serial Icache Load Predecode Values provides a C code example that calculates the predecode values of a serial Icache load Appendix D Errata Sheet lists changes and revisions to thi...

Page 19: ... and bits have the following definitions IGN Ignore Register bits specified as IGN are ignored when written and are UNPRE DICTABLE when read if not otherwise specified MBZ Must Be Zero Software must never place a nonzero value in bits and fields specified as MBZ Reads return unpredictable values Such fields are reserved for future use RAO Read As One Register bits specified as RAO return a 1 when ...

Page 20: ... Writing a zero clears these bits for the duration of the write writing a one has no effect W1C Write One to Clear Bits and fields specified as W1C can be read Writing a one clears these bits for the duration of the write writing a zero has no effect WO Write Only Bits and fields specified as WO can be written but not read Addresses Unless otherwise noted all addresses and offsets are hexadecimal ...

Page 21: ...s not contained in the 21164PC Numbering All numbers are decimal or hexadecimal unless otherwise indicated The prefix 0x indicates a hexadecimal number For example 19 is decimal but 0x19 and 0x19A are hexadecimal also see Addresses Otherwise the base is indicated by a sub script for example 1002 is a binary number Ranges and Extents Ranges are specified by a pair of numbers separated by two period...

Page 22: ...ms UNPREDICTABLE and UNDEFINED are used Their meanings are quite different and must be carefully distinguished In particular only privileged software that is software running in kernel mode can trigger UNDEFINED operations Unprivileged software cannot trigger UNDE FINED operations However either privileged or unprivileged software can trigger UNPREDICTABLE results or occurrences UNPREDICTABLE resu...

Page 23: ...EDICTABLE results must not Write or modify the contents of memory locations or registers to which the current process in the current access mode does not have access Halt or hang the system or any of its components For example a security hole would exist if some UNPREDICTABLE result depended on the value of a register in another process on the contents of processor temporary registers left behind ...

Page 24: ......

Page 25: ...architecture designed with particular emphasis on speed multiple instruction issue multiple processors and software migration from many operating systems All registers are 64 bits long and all operations are performed between 64 bit regis ters All instructions are 32 bits long Memory operations are either load or store operations All data manipulation is done between registers The Alpha architectu...

Page 26: ...ard machine code with some implementation specific extensions to provide direct access to low level hardware functions PALcode sup ports optimizations for multiple operating systems flexible memory management implementations and multi instruction atomic sequences The Alpha architecture performs byte shifting and masking with normal 64 bit reg ister to register instructions it does not include sing...

Page 27: ...dressable byte boundary A byte is an 8 bit value A byte is supported in Alpha architecture by the EXTRACT INSERT LDBU MASK SEXTB STB ZAP PACK UNPACK MIN MAX and PERR instructions Word A word is two contiguous bytes that start at an arbitrary byte boundary A word is a 16 bit value A word is supported in Alpha architecture by the EXTRACT INSERT LDWU MASK SEXTW STW PACK UNPACK MIN and MAX instruction...

Page 28: ...VAX F_floating and G_floating data types and supports longword 32 bit and quadword 64 bit integers Byte 8 bit and word 16 bit support is provided by byte manipulation instructions Limited hardware support is provided for the VAX D_floating data type Other 21164PC features include A peak instruction execution rate of four times the CPU clock frequency The ability to issue up to four instructions du...

Page 29: ...nternal clock generator providing a high speed clock used by the 21164PC and a pair of programmable system clocks for use by the CPU module Onchip performance counters to measure and analyze CPU and system perfor mance Chip and module level test support including an instruction cache test interface to support chip and module level testing A 3 3 V external interface and 2 5 V internal interface Ref...

Page 30: ......

Page 31: ...a certain piece of hardware seems to be architecturally incomplete the missing func tionality is implemented in PALcode Chapter 6 provides more information on PAL code This chapter describes the major functional hardware units and is not intended to be a detailed hardware description of the chip It is organized as follows 21164PC microarchitecture Pipeline organization Scheduling and issuing rules...

Page 32: ...rogram Logic 0 1 Instruction Translation Buffer 48 Entry Associative Instruction Buffer Pipe Stages Slot Logic Issue Scoreboard Logic Integer Register File S0 S1 S2 S3 S4 S5 S6 S7 S8 Floating Point Register File Integer Multiplier Integer Pipe 0 Integer Pipe 1 ADD LOG SHIFT LD ST IMUL CMP SEXT CMOV BYTE WORD ADD LOG LD BR CMP CMOV Floating Point Divider Floating Point Add Pipe and Divider Floating...

Page 33: ...t CBU with interface to external cache Section 2 1 5 Data cache Dcache Section 2 1 6 1 Instruction cache Icache Section 2 1 6 2 Serial read only memory SROM interface Section 2 1 7 2 1 1 Instruction Fetch Decode Unit and Branch Unit The primary function of the instruction fetch decode unit and branch unit IDU is to manage and issue instructions to the IEU MTU and FEU It also manages the instructio...

Page 34: ...ch target to the end of the current INT16 the IDU then proceeds to the next INT16 of instructions after all the instructions in the target INT16 are issued Thus achieving maximum issue rate and optimal performance requires that code be be scheduled properly and that floating or integer NOP instructions be used to fill empty slots in the scheduled instruction stream For more information on instruct...

Page 35: ...request misses the CBU drives a main memory request If there is an Icache hit at this time the Icache returns to access mode and the prefetcher stops sending fetches to the MTU When a new program counter PC is loaded that is taken branches the Icache returns to access mode until the first miss The refill buffer receives and holds instruction data from fetches initiated before the Icache returned t...

Page 36: ...turn stack that is controlled by decoding the opcode BSR HW_REI and JMP JSR RET JSR_COROUTINE and DISP 15 14 in JMP JSR RET JSR_COROUTINE The stack stores an Icache index in each entry The stack is implemented as a circular queue that wraps around in the overflow and underflow cases Table 2 1 lists the effect each of these instructions has on the state of the branch pre diction stack The 21164PC u...

Page 37: ...ntains the ITB Each entry supports all four granularity hint bit combinations so that any single ITB entry can provide translation for up to 512 con tiguously mapped 8KB pages The operating system using PALcode must ensure that virtual addresses can only be mapped through a single ITB entry or superpage mapping at one time Multiple simultaneous mapping can cause UNDEFINED results While not executi...

Page 38: ...kernel mode Superpage mapping allows the operating system to map all physical memory to a privileged virtual memory region 2 1 1 5 Interrupts The IDU exception logic supports three sources of interrupts Hardware interrupts There are 7 level sensitive hardware interrupt sources supplied by the following signals irq_h 3 0 mch_hlt_irq_h pwr_fail_irq_h sys_mch_chk_irq_h Software interrupts There are 1...

Page 39: ...tion logic An integer multiplier A motion video instruction unit The IEU also includes the 40 entry 64 bit integer register file IRF that contains the 32 integer registers defined by the Alpha architecture and 8 PAL shadow registers The register file has four read ports and two write ports that provide operands to both integer execution pipelines and accept results from both pipes The register fil...

Page 40: ... generates the corresponding physical addresses and access control information for each virtual address The 21164PC implements a 43 bit virtual address a 40 bit noncacheable physical address and a 33 bit cacheable physical address Cacheable addresses consist of bits 32 0 when bit 39 0 Physical addresses that set bits 38 33 are not supported by the 21164PC These addresses are not checked by the 211...

Page 41: ...sing the data cache Dcache Translation and Dcache tag read operations occur in parallel If the addressed location is found in the Dcache a hit then the data from the Dcache is formatted and written to either the integer register file IRF or floating point register file FRF The formatting required depends on the particular load instruction executed If the data is not found in the Dcache a miss then...

Page 42: ...which holds the data from one or more store instructions that access the same 32 byte block in memory until the data is written into the Bcache The write buffer provides a finite high bandwidth resource for receiving store data to minimize the number of CPU stall cycles The write buffer and associated WMB instruction are described in Sec tion 2 7 2 1 5 Cache Control and Bus Interface Unit The cach...

Page 43: ...ks The 21164PC supports board level cache sizes of 512KB 1MB 2MB and 4MB 2 1 7 Serial Read Only Memory Interface The serial read only memory SROM interface provides the initialization data load path from a system SROM to the Icache Chapter 7 provides information about the SROM interface 2 2 Pipeline Organization The 21164PC has a 7 stage or 7 cycle pipeline for integer operate and memory ref erenc...

Page 44: ...Register File Arithmetic logical shift and compare instructions complete in pipeline stage 4 1 cycle latency CMOV completes in stage 5 2 cycle latency IMULL has an 8 cycle or 9 cycle latency CMOV or BR can issue in parallel 0 cycle latency with a dependent CMP instruction Floating Point Register File Access First Floating Point Operate Stage Write Floating Point Register File Last Floating Point O...

Page 45: ...e and that no write write hazards exist Read the IRF Stall preced ing stages if any instruction cannot be issued All source operands must be available at the end of this stage for the instruction to issue Table 2 3 Pipeline Examples Integer Add Pipeline Stage Events 4 Perform the add operation 5 Result is available for use by an operate function in this cycle 6 Write the IRF Result is available fo...

Page 46: ...ction in this cycle Pipeline Stage1 Events 4 Calculate the effective address Begin the Dcache data and tag store access 5 Finish the Dcache data and tag store access Detect Dcache miss Bcache arbitration defaults to pipe E0 in anticipation of a possible miss If there are load instructions in both E0 and E1 the load instruction in E1 would be delayed at least one more cycle because default arbitrat...

Page 47: ...esponsible for ensuring that all resource conflicts are resolved before an instruction is allowed to continue The only means of stopping instructions after the issue stage is an abort condition The term abort as used here is different from its use in the Alpha AXP Architecture Reference Manual 2 2 2 Aborts and Exceptions Aborts result from a number of causes In general they can be grouped into two...

Page 48: ...e last instruction executed was also executed For machine check and interrupts EXC_ADDR points to the instruction immediately following the last instruction exe cuted For the remaining cases EXC_ADDR points to the exceptional instruction where in all cases its execution should naturally restart When the pipeline is fully drained the processor begins instruction execution at the address given by th...

Page 49: ...ashing involves the ability of the first four pipeline stages to advance whenever a bubble or buffer slot is detected in the pipeline stage immedi ately ahead of it while the pipeline is otherwise stalled 2 3 Scheduling and Issuing Rules The following sections define the classes of instructions and provide rules for instruction slotting instruction issuing and latency 2 3 1 Instruction Class Defin...

Page 50: ...EXTB SEXTW SHIFT E0 SLL SRL SRA EXTQL EXTLL EXTWL EXTBL EXTQH EXTLH EXTWH MSKQL MSKLL MSKWL MSKBL MSKQH MSKLH MSKWH INSQL INSLL INSWL INSBL INSQH INSLH INSWH ZAP ZAPNOT CMOV E0 or E1 CMOVEQ CMOVNE CMOVLT CMOVLE CMOVGT CMOVGE CMOVLBS CMOVLBC ICMP E0 or E1 CMPEQ CMPLT CMPLE CMPULT CMPULE CMPBGE IMULL E0 MULL MULL V IMULQ E0 MULQ MULQ V IMULH E0 UMULH MVI E0 PERR UNPKBW UNPKBL PKWB PKLB MINSB8 MINSB4...

Page 51: ...hat can issue in either FA or FM assign it to FA unless FA is not free If it is an FA only instruction it must be assigned to FA If it is an FM only instruction it must be assigned to FM Mark the pipeline selected by this process as taken and resume with the next sequential instruction Stop when an instruction cannot be allocated in an execution pipeline because any pipeline it can use is already ...

Page 52: ... both of E0 or E1 F instruction is any instruction that can issue in one or both of FA or FM From lowest address to highest within an INT16 with the following arrange ment F instruction I instruction I instruction I instruction When this type of case is detected the first two instructions are forwarded to the issue point in one cycle The second two are sent only when the first two have both issued...

Page 53: ...ue is governed by the availability of registers for read or write operations and the availability of the floating divide unit and the integer multi ply unit There are producer consumer dependencies producer producer dependen cies also known as write after write conflicts and dynamic function unit availability dependencies integer multiply and floating divide The IDU logic in stage 3 of the 21164PC...

Page 54: ...les MXPR HW_MFPR latency 1 2 or longer depending on the IPR HW_MTPR produces no result 1 or 2 cycles IBR Produces no result Taken branch issue latency minimum 1 cycle branch mispredict penalty 5 cycles FBR Produces no result Taken branch issue latency minimum 1 cycle branch mispredict penalty 5 cycles JSR All but HW_REI latency 1 HW_REI produces no result Issue latency minimum 1 cycle 2 cycles SEX...

Page 55: ... ILOG instruction producing the test operand of an IBR or CMOV instruction This is true only when the IBR or CMOV instruction issues in the same cycle as the ICMP or ILOG instruction that produced the test operand of the IBR or CMOV instruction In all other cases the effective latency of ICMP and ILOG instructions is 1 cycle IMULQ Latency 12 plus up to 2 cycles of added latency depending on the so...

Page 56: ...atency An example of this case is shown in the following code LDQ R2 0 R0 R2 destination ADDQ R2 R3 R4 wr rd conflict stalls execution waiting for R2 LDQ R2 D R1 wr wr conflict may dual issue when ADDQ issues Producer producer latency is generally determined by applying the rule that register file write operations must occur in the correct order enforced by IDU hardware Two IADD or ILOG class inst...

Page 57: ...wn the load would miss in time to prevent issue An instruction of class LD cannot be issued in the second cycle after an instruc tion of class ST is issued No LD ST MXPR to an MTU register or MBX class instructions can be issued after an MB instruction has been issued until the MB instruction has been acknowledged by the CBU No LD ST MXPR to an MTU register or MBX class instructions can be issued ...

Page 58: ...ere are no stalls after the instruction issue point in the pipeline In some situations an MTU instruction cannot be executed because of insufficient resources or some other reason These instructions trap and the IDU restarts their execution from the beginning of the pipeline This is called a replay trap Replay traps occur in the fol lowing cases The write buffer is full when a store instruction is...

Page 59: ...ules The following sections describe the miss address file MAF and its load merging function and the load merging rules that apply after a load miss 2 5 1 Merging Rules When a load miss occurs each MAF entry is checked to see if it contains a load miss that addresses the same 32 byte Dcache block If it does and certain merging rules3 are satisfied then the new load miss is merged with an existing ...

Page 60: ...e trapping load hits in the Dcache Only quadwords can merge with other quadwords provided they are not in the same INT8 Bytes words and longwords cannot merge Merging stops for a load instruction to noncacheable space as soon as the CBU accepts the reference This permits the system environment to access only those INT8s that are actually requested by load instructions All accesses that could not m...

Page 61: ...n 2 5 4 Fill Operation Eventually the CBU provides the data requested for a given MAF entry a fill The CBU requests that the IDU allocate up to three consecutive bubble cycles in the IEU pipelines The first bubble prevents any store instruction from issuing The sec ond bubble prevents any instructions from issuing The third bubble prevents only MTU instructions particularly load and store instruct...

Page 62: ...sue any more MTU instructions until the MTU has successfully sent the LDL_L or LDQ_L instruction to the CBU This guarantees correct ordering between an LDL_L or LDQ_L instruction and a subse quent STL_C or STQ_C instruction even if they access different addresses 2 6 MTU Store Instruction Execution Store instructions execute in the MTU by 1 Reading the Dcache tag store in the pipeline stage in whi...

Page 63: ... store instruc tion it will be issue stalled for one cycle This is not an optimal solution but is pre ferred over incurring a replay trap on the load instruction For each store instruction a search of the MAF is done to detect load before store hazards If a store instruction is executed and a load of the same address is present in the MAF two things happen 1 Bits are set in each conflicting MAF en...

Page 64: ...et in every write buffer entry containing valid store data that will prevent future store instructions from merging with any of the entries Also the next entry to be allocated is marked with a WMB flag At this point the entry marked with the WMB flag does not yet have valid data in it When an entry marked with a WMB flag is ready to issue to the CBU the entry is not issued until every previously i...

Page 65: ...e entry from the pending request queue without placing it in the free entry queue When the CBU has completely processed the write buffer entry it notifies the MTU and the now invalid write buffer entry is placed in the free entry queue The MTU may request that up to five additional write buffer entries be processed while waiting for the CBU to finish the first The write buffer entries are invalida...

Page 66: ... Load misses are checked in the write buffer for conflicts The granularity of this check is an INT32 Any load instruction matching any write buffer entry s address is considered a hit even if it does not access a byte marked for update in that write buffer entry If a load hits in the write buffer a conflict bit is set in the load instruc tion s MAF entry which prevents the load instruction from be...

Page 67: ...unter control refer to the following IPR descriptions Hardware interrupt clear HWINT_CLR register see Section 5 1 23 Interrupt summary register ISR see Section 5 1 24 Performance counter PMCTR register see Section 5 1 27 CBU configuration control CBOX_CONFIG2 register bits 13 08 see Section 5 3 4 2 8 1 CBU Performance Counters The counters in the CBU counters 0 and 1 are used to count Bcache and s...

Page 68: ...he 21164PC detects either a read miss or write miss in the Bcache A BCACHE VICTIM command is also generated along with the READ MISS command if the block the request misses on is valid and dirty in the cache In this case the 64 byte Bcache block is read from the Bcache and sent to the system System to CBU Requests The system can issue the following requests to the 21164PC FILL commands READ comman...

Page 69: ... does a write request merge with a read request Using the Counters The two counters work in parallel so they can be used to determine simple ratios like Bcache miss rate or more complex statistics like Dstream read merging in the CBU by running several tests and normalizing the results For example Bcache miss rate 1 Bcache read hits Total read requests Counter 0 selects 0x0 and counter 1 selects 0...

Page 70: ...57 56 55 54 53 52 INED 62 Inexact disable Suppress INE trap and place correct IEEE nontrap ping result in the destination register if the 21164PC is capable of producing correct IEEE nontrapping result UNFD 61 Underflow disable Subset support Suppress UNF trap if UNDZ is also set and the S qualifier is set on the instruction UNDZ 60 Underflow to zero When set together with UNFD on underflow the ha...

Page 71: ...that differed from the mathematically exact result UNF 55 Underflow A floating arithmetic or conversion operation under flowed the destination exponent OVF 54 Overflow A floating arithmetic or conversion operation overflowed the destination exponent DZE 53 Division by zero An attempt was made to perform a floating divide operation with a divisor of zero INV 52 Invalid operation An attempt was made...

Page 72: ...e configuration This configuration employs additional system memory controller chipsets Figure 2 4 shows a typical uniprocessor system with a board level cache This sys tem configuration could be used in standalone or networked workstations Figure 2 4 Typical Uniprocessor Configuration PCA019 21164PC Memory and External Cache Tag External Cache Data I O Interface DRAM Bank DRAM Bank Main Memory I ...

Page 73: ...Interface 3 1 3 Hardware Interface This chapter contains the 21164PC microprocessor logic symbol and provides a list of signal names and their functions 3 1 21164PC Microprocessor Logic Symbol Figure 3 1 shows the logic symbol for the 21164PC chip ...

Page 74: ...rq_h 3 0 mch_hlt_irq_h osc_clk_in_l port_mode_h 1 0 pwr_fail_irq_h srom_data_h sys_mch_chk_irq_h sys_reset_l tck_h tdi_h temp_sense Vss Vddi osc_clk_in_h tms_h data_h 127 0 addr_res_h 1 0 cmd_h 3 0 data_ram_oe_l data_ram_we_l 3 0 index_h 21 4 int4_valid_h 3 0 tag_data_h 32 19 tag_data_par_h tag_dirty_h tag_ram_oe_l tag_ram_we_l tag_valid_h victim_pending_h cpu_clk_out_h srom_clk_h srom_oe_l srom_p...

Page 75: ... pin interstitial pin grid array (IPGA) package. There are 264 functional signal pins, 2 spare signal pins (unused), 5 voltage reference pins (unused), 46 external power (Vdd) pins, 22 internal power (Vddi) pins, and 74 ground (Vss) pins. The following table defines the 21164PC signal types referred to in this section. Signal Type: Definition. B: Bidirectional. I: Input only. O: Output only ...

Page 76: ...y space. When the byte/word instructions are used and addr_h<39> is asserted, six additional bits of information are communicated over the pin bus. Two of the new bits are driven over addr_h<38:37>, becoming transfer_size<1:0>, with the following values: 00 = Size 8 bytes; 01 = Size 4 bytes; 10 = Size 2 bytes; 11 = Size 1 byte. addr_bus_req_h (I, 1): Address bus request. The system interface uses this signal to gain contr...
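The transfer_size encoding above can be sketched as a small decoder. This is only an illustration: it assumes the address bus is modelled as a Python integer whose bit n corresponds to addr_h bit n, and the function name is hypothetical.

```python
# transfer_size<1:0> values as listed in the text (bits addr_h<38:37>).
TRANSFER_SIZE_BYTES = {0b00: 8, 0b01: 4, 0b10: 2, 0b11: 1}


def transfer_size_bytes(addr_h: int) -> int:
    """Decode the transfer size carried on addr_h bits 38:37.

    Assumption: addr_h is an int with bit n representing addr_h<n>.
    The field is only decoded when addr_h<39> is asserted, as the
    text describes for byte/word instructions.
    """
    if not (addr_h >> 39) & 1:
        raise ValueError("addr_h<39> is not asserted; transfer_size is not driven")
    field = (addr_h >> 37) & 0b11
    return TRANSFER_SIZE_BYTES[field]


if __name__ == "__main__":
    example = (1 << 39) | (0b10 << 37)  # noncached access, 2-byte transfer
    print(transfer_size_bytes(example))
```

Note that the listed sizes halve with each increment of the field, so the table is equivalent to `8 >> field`.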

Page 77: ... 1 21164PC Signal Descriptions (Sheet 2 of 10). Signal, Type, Count, Description. Bits: Description. 00: CPU clock frequency is equal to the input clock frequency. 01: CPU clock frequency is equal to the input clock frequency with the onchip duty-cycle equalizer enabled. 10: Initialize the CPU clock, allowing the system clock to be synchronized to a stable reference clock. 11: Initialize the CPU clock allowing t...

Page 78: ...ional information, refer to Section 4.1.1.1. Table 3-1 21164PC Signal Descriptions (Sheet 3 of 10). Signal, Type, Count, Description. 21164PC Commands to System (cmd_h<3:0>: Command, Meaning): 0000: NOP, Nothing. 0001: Reserved. 0010: Reserved. 0011: Reserved. 0100: Reserved. 0101: Reserved. 0110: WRITE BLOCK, Request to write a block. 0111: Reserved. 1000: READ MISS0, Request for data. 1001: READ MISS1, Request for data. 1010: Reserved...
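The command encodings listed in this excerpt lend themselves to a lookup table. The sketch below covers only the codes visible above; the excerpt is truncated after 1010, so any other code is reported as not listed rather than guessed. The names `CMD_H_COMMANDS` and `decode_cmd` are hypothetical.

```python
# Codes and names exactly as they appear in the excerpt of Table 3-1.
CMD_H_COMMANDS = {
    0b0000: "NOP",
    0b0110: "WRITE BLOCK",
    0b1000: "READ MISS0",
    0b1001: "READ MISS1",
}
# Codes the excerpt explicitly marks as Reserved.
CMD_H_RESERVED = {0b0001, 0b0010, 0b0011, 0b0100, 0b0101, 0b0111, 0b1010}


def decode_cmd(bits: int) -> str:
    """Map a 4-bit cmd_h<3:0> value to the name shown in the excerpt."""
    if not 0 <= bits <= 0b1111:
        raise ValueError("cmd_h is a 4-bit field")
    if bits in CMD_H_COMMANDS:
        return CMD_H_COMMANDS[bits]
    if bits in CMD_H_RESERVED:
        return "Reserved"
    # The table continues past this excerpt; do not invent entries.
    return "not listed in this excerpt"


if __name__ == "__main__":
    for code in (0b0000, 0b0110, 0b1000, 0b0011):
        print(f"{code:04b} -> {decode_cmd(code)}")
```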

Page 79: ...ing edge of sysclk n then the 21164PC does not drive the data bus on the rising edge of sysclk n 1 Before asserting this signal the system should assert idle_bc_h for the correct number of cycles If the 21164PC samples this signal deas serted on the rising edge of sysclk n then the 21164PC drives the data bus on the rising edge of sysclk n 1 For timing details refer to Section 4 9 4 data_ram_oe_l ...

Page 80: ...s to the 21164PC that the system has detected an invalid address or hard error The system still provides an apparently normal read sequence with correct ECC parity though the data is not valid The 21164PC traps to the machine check MCHK PALcode entry point and indicates a serious hardware error fill_error_h should be asserted when the data is returned Each assertion produces a MCHK trap fill_id_h ...

Page 81: ...als indi cate which INT8 bytes of a 32 byte block need to be read and returned to the processor This is useful for read operations to noncached memory Note For both read and write operations multiple int4_valid_h 3 0 bits can be set simultaneously Table 3 1 21164PC Signal Descriptions Sheet 6 of 10 Signal Type Count Description int4_valid_h 3 0 Write Meaning xxx1 data_h 31 0 valid xx1x data_h 63 3...

Page 82: ...sactions For write transactions Table 3 1 21164PC Signal Descriptions Sheet 7 of 10 Signal Type Count Description addr_h 38 37 int4_valid_h 3 0 Value 00 Valid INT8 mask 01 addr_h 3 2 valid on int4_valid_h 3 2 int4_valid 1 0 undefined 10 addr_h 3 1 valid on int4_valid_h 3 1 int4_valid 0 undefined 11 addr_h 3 0 valid on int4_valid_h 3 0 addr_h 38 37 int4_valid_h 3 0 Value 00 Valid INT4 mask 01 Valid...

Page 83: ...is signal is used to set up sys_clk_out2_ h delay see Table 4 3 During normal opera tion it is used to signal a halt request osc_clk_in_h osc_clk_in_l I I 1 1 Oscillator clock inputs These signals provide the differential clock input that is the fundamental timing of the 21164PC These signals are driven at the same frequency as the internal clock frequency clk_mode_h 1 0 01 port_mode_h 1 0 I 2 Sel...

Page 84: ...ations and with sys_clk_out1_h during read and fill operations st_clk2_h O 1 This signal is a duplicate of st_clk1_h to increase the fanout capability of the signal st_clk3_h O 1 This signal is another duplicate of st_clk1_h to increase the fanout capability of the signal sys_clk_out1_h O 1 System clock output Programmable system clock cpu_clk_out_h divided by a value of 3 to 15 is used for board ...

Page 85: ...id bit During fills this signal is asserted to indicate that the block has valid data See Table 4 5 for information about Bcache protocol tck_h B 1 JTAG boundary scan clock tdi_h I 1 JTAG serial boundary scan data in signal tdo_h O 1 JTAG serial boundary scan data out signal temp_sense I 1 Temperature sense This signal is used to measure the die tem perature and is for manufacturing use only For n...

Page 86: ...ock output st_clk3_h O 1 Bcache STRAM clock output sys_clk_out1_h O 1 System clock output sys_clk_out2_h O 1 System clock output sys_reset_l I 1 System reset Bcache data_h 127 0 B 128 Data bus data_adsc_l O 1 Data RAM address load enable data_adv_l O 1 Data RAM address advance enable data_ram_oe_l O 1 Data RAM output enable data_ram_we_l 3 0 O 4 Data RAM write enable bits index_h 21 4 O 18 Index l...

Page 87: ... Fill identification idle_bc_h I 1 Idle Bcache int4_valid_h 3 0 O 4 INT4 data valid victim_pending_h O 1 Victim pending Interrupts irq_h 3 0 I 4 System interrupt requests mch_hlt_irq_h I 1 Machine halt interrupt request pwr_fail_irq_h I 1 Power failure interrupt request sys_mch_chk_irq_h I 1 System machine check interrupt request Test Modes and Miscellaneous dc_ok_h I 1 dc voltage OK port_mode_h 1...

Page 88: ...srom_oe_l O 1 Serial ROM output enable srom_present_l1 B 1 Serial ROM present tck_h B 1 JTAG boundary scan clock tdi_h I 1 JTAG serial boundary scan data in tdo_h O 1 JTAG serial boundary scan data out temp_sense I 1 Temperature sense test_status_h 1 O 1 Icache test status or timeout reset tms_h I 1 JTAG test mode select trst_l1 B 1 JTAG test access port TAP reset Table 3 2 21164PC Signal Descript...

Page 89: ...ized as follows Introduction to the external interface Clocks Physical address considerations Bcache structure and operation Cache coherency 21164PC to Bcache transactions 21164PC initiated system transactions System initiated transactions Data bus and command address bus contention 21164PC interface restrictions 21164PC system race conditions Data integrity and Bcache errors Interrupts Chapter 3 ...

Page 90: ... system supports a 64-byte block size to the external Bcache. Figure 4-1 shows a simplified view of the external interface. The function and purpose of each signal is described in Chapter 3. 4.1.1 System Interface. This section describes the system, or external bus, interface. The system interface is made up of bidirectional address and command buses, a data bus that is shared with the Bcache interface, a...

Page 91: ... 21164PC performs the task as soon as the Bcache becomes free The 21164PC acknowledges receiving the command at the start of the Bcache transaction MK5504B Tag State V D SRAM System Memory and I O 21164PC addr_h 39 4 index_h 21 4 data_h 127 0 addr_bus_req_h cack_h cmd_h 3 0 dack_h data_bus_req_h fill_h fill_error_h fill_id_h idle_bc_h int4_valid_h 3 0 victim_pending_h tag_data_h 32 19 p tag_valid_...

Page 92: ...e system interface 4 8 GB s peak data transfer rate Programmable Bcache clock rate up to 300 MHz operation 4 1 2 1 Bcache Interface Enhancements With the advent of commodity SSRAMs offchip high speed caches can now be built at low cost to take advantage of the same performance techniques that until now had been restricted to onchip caches The SSRAMs contain an address register a self incrementing ...

Page 93: ...rite hit dirty bandwidth The Bcache interface decouples the tag and data store control to allow tag write probes to be interleaved with data writes Figure 4 3 shows an example of write interleaving and its ability to keep the data bus at 100 utilization PCA002 index data A1 A1 index data A2 A3 A2 A3 A4 A5 A6 A8 A7 D10 D11 D20 D21 Nonpipelined Cache Pipelined Cache D10 D11 D20 D30 D21 D31 D40 D50 D...

Page 94: ...ction 7 1 Signal Description cpu_clk_out_h A 21164PC internal clock that may or may not drive the system clock sys_clk_out1_h A clock of programmable speed supplied to the external interface sys_clk_out2_h A delayed copy of sys_clk_out1_h The delay is programmable and is an integer number of cpu_clk_out_h periods PCA003 tag A1 index data A2 A3 A4 A5 A6 D10 D11 D20 D30 D21 D31 D40 D41 latency 1 Int...

Page 95: ...k_mode_h 1 is set the internal CPU clock is reset to a known state When it is clear the CPU clock is driven at the same frequency as the osc_clk_h l differential input Caution A clock source should always be provided on osc_clk_ in_h l when sig nal dc_ok_h is asserted Table 4 1 CPU Clock Generation Control Mode clk_mode_h 1 0 Description Normal 0 0 CPU clock frequency is the same as the input cloc...

Page 96: ... into the CPU clock frequency. Refer to Section 7.2 for information on sysclk behavior during reset. The value is also latched into the SYS_CLK_RATIO<3:0> field of the CBOX_STATUS IPR (bits <7:4>) for read-only purposes. Table 4-2 System Clock Divisor (Sheet 1 of 2). Columns: irq_h<3>, irq_h<2>, irq_h<1>, irq_h<0>: Ratio. Low, High, Low, Low: 4. Low, High, Low, High: 5. Low, High, High, Low: 6. Low, High, High, High: 7. High, Low, Low, Low: 8. MK5502...

Page 97: ...exible timing for system use. The delay unit, from 0 to 7 CPU CLK cycles, is obtained from the three interrupt signals (mch_hlt_irq_h, pwr_fail_irq_h, and sys_mch_chk_irq_h) at power-up, as listed in Table 4-3. The output of this programmable divider is symmetric if the divisor is [Table 4-2 rows, continued] High, Low, Low, High: 9. High, Low, High, Low: 10. High, Low, High, High: 11. High, High, Low, Low: 12. High, High, Low, High: 13. High, High, High, Low: 14...
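In every Table 4-2 row shown in these excerpts, the ratio equals the binary value formed by irq_h<3:0> with High read as 1 and Low as 0. The sketch below relies on that observation. It only accepts the ratios 4 through 14 that actually appear in the excerpted rows, since the table is truncated on both sides; the function and constant names are hypothetical.

```python
LEVEL_BITS = {"Low": 0, "High": 1}
# Ratios visible in the excerpted rows of Table 4-2 (an assumption about
# coverage: the full table may list more rows than the excerpt shows).
LISTED_RATIOS = range(4, 15)


def sysclk_ratio(irq3: str, irq2: str, irq1: str, irq0: str) -> int:
    """Return the system clock divisor selected by irq_h<3:0> levels."""
    value = 0
    for level in (irq3, irq2, irq1, irq0):
        value = (value << 1) | LEVEL_BITS[level]
    if value not in LISTED_RATIOS:
        raise ValueError(f"ratio {value} is not among the excerpted rows")
    return value


if __name__ == "__main__":
    print(sysclk_ratio("Low", "High", "Low", "Low"))
    print(sysclk_ratio("High", "High", "High", "Low"))
```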

Page 98: ...mory like 2 The second region is the second half of the physical address space except for a 1MB region reserved for CBU IPRs It is treated by the 21164PC as noncache able 3 The third region is the 1MB region reserved for CBU IPRs In the first region write merging and load merging are permitted All 21164PC accesses in this region are 64 byte the Bcache block size This memory like region is limited ...

Page 99: ...NT16 boundaries. READ and FLUSH commands are all wrapped on INT16 boundaries as described here. The valid wrap orders for 64-byte blocks are selected by addr_h<5:4>. They are: 0, 1, 2, 3; 1, 0, 3, 2; 2, 3, 0, 1; 3, 2, 1, 0. Similarly, when the system interface supplies a command that returns data from the 21164PC caches, the values that the system drives on addr_h<5:4> determine the order in which data is supplied by th...
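The four wrap orders listed above follow one pattern: the i-th INT16 returned is the starting index XORed with i. That is an observation about the listed sequences rather than a statement from the manual, and the sketch below simply reproduces them. The function name is hypothetical.

```python
def wrap_order(start: int) -> list[int]:
    """Return the INT16 order for a 64-byte block.

    `start` is the value of addr_h<5:4> (0 to 3). XORing it with the
    beat number reproduces the four orders listed in the text.
    """
    if not 0 <= start <= 3:
        raise ValueError("addr_h<5:4> is a 2-bit field")
    return [start ^ beat for beat in range(4)]


if __name__ == "__main__":
    for start in range(4):
        print(start, wrap_order(start))
```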

Page 100: ...ster file or Icache Note A special case using int4_valid_h 3 0 occurs during an Icache fill In this case the entire returned block is valid although int4_valid_h 3 0 indicates zero 4 3 4 Noncached Write Operations Write operations to physical addresses that have addr_h 39 asserted are not writ ten to any of the caches These write operations are merged in the write buffer before being sent to the s...

Page 101: ... thus private Bcache transactions require two data cycles and system Bcache transactions require four data cycles Longword write enables are provided to the data store for Bcache write operations To support byte and word write transactions the 21164PC performs a read modify write sequence at the Bcache interface 4 4 1 Bcache Victim Buffers A Bcache victim is generated when the 21164PC deallocates ...

Page 102: ... each transaction to determine if the block is present. If the block is present, the requested action is taken. If the block is not present, the command is still acknowledged, but no other action is taken. The Flush protocol for the 21164PC does not support a duplicate tag store. Section 4.5.1 provides a more detailed description of flush cache coherency protocol. The system commands that are used to mai...

Page 103: ...TE system command The 21164PC invalidates the Bcache block if the block was found Figure 4 6 shows the 21164PC cache state transitions that can occur as a result of transactions with the system Figure 4 7 shows the 21164PC cache state transitions maintained by the 21164PC as a result of transactions by other nodes on the system bus These two figures both represent the same state machine They show ...

Page 104: ...l be able to process read and write hits to the Bcache without assistance from the system When system logic writes to or reads from the Bcache it transfers data to and from the Bcache but only under the direct control of the 21164PC 4 6 1 Synchronous Burst Mode Cache Support The 21164PC supports both pipelined and flow through SSRAMs These SSRAMs provide several new control functions that are capi...

Page 105: ...W_L ADSC_L LW0 ADSP_L BWE_L 3 0 LW1 OE_L MODE LW2 CE_L LW3 CLK GW_L GW_L GW_L A X 0 A X 0 A X 0 A X 0 Store TAG BWE_L 3 0 OE_L MODE CE_L CLK ADV_L A X 0 DATA DATA 128 32 Vss Vdd Vss Vss Vdd Vdd Vdd Vdd Vdd tag_ram_we_l tag_ram_oe_l st_clkx_h tag_data_h 32 19 data_ram_we_l 3 data_ram_we_l 2 data_ram_we_l 1 data_ram_we_l 0 data_adsc_l data_adv_l data_ram_oe_l data_h 127 0 32 32 32 Vdd Store DATA X 3...

Page 106: ...and data buses (refer to Section 4.9.2 for more details). For Bcache data writes, the 21164PC supports the early write protocol using the ADSC pin; the ADSP late write is not supported. During data transfers, the 21164PC drives longword write enables (data_ram_we_l<3:0>) to the SSRAMs that correspond to the appropriate longword lanes within the 128-bit data bus. For byte and word granularity data writes, the...

Page 107: ...nclude memory fills Bcache victims and system commands that require data movement System transactions read and write the Bcache in the sys_clk regime see Section 4 2 2 System Bcache read or write operations start rel ative to a sysclk edge It is the responsibility of the system to control the rate of Bcache transactions by using the dack_h signal Private transactions include CPU initiated read and...

Page 108: ...h a speculative 32 byte data store read Figure 4 9 shows an example of the timing for a private read operation to Bcache by the 21164PC CBOX_CONFIG BC_LATENCY_OFF 0 which represents a minimum read latency value of five cpu_clk cycles The index is launched from an arbitrary internal cpu_clk edge t 0 The data store address strobe data_adsc_l is also asserted at this time and is deasserted one bc_clk...

Page 109: ...er. For a bc_clk_ratio greater than 5, the st_clk remains asserted for 3 cpu_clk cycles. For a bc_clk_ratio from 3 to 5, the st_clk remains asserted for 2 cpu_clk cycles. For a bc_clk_ratio of 2, the st_clk remains asserted for 1 cpu_clk cycle. FM 05560 AI4 index_h 21 4 tag_ram_oe_l st_clkx_h tag_ram_we_l data_adsc_l tag_data 32 19 data_adv_l data_ram_oe_l data_ram_we_l 3 0 data 127 0 cpu_clk 0 1 2 3 4 5...
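The three st_clk cases above can be written as a single function. This is a minimal sketch of the stated rule only; the function name is hypothetical, and ratios below 2 are rejected because the text does not describe them.

```python
def st_clk_asserted_cycles(bc_clk_ratio: int) -> int:
    """cpu_clk cycles for which st_clk stays asserted, per the text."""
    if bc_clk_ratio < 2:
        raise ValueError("the text only covers bc_clk_ratio of 2 or more")
    if bc_clk_ratio == 2:
        return 1
    if bc_clk_ratio <= 5:  # ratios 3, 4, and 5
        return 2
    return 3               # ratios greater than 5


if __name__ == "__main__":
    for ratio in range(2, 9):
        print(ratio, st_clk_asserted_cycles(ratio))
```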

Page 110: ...he write probe operation hits dirty, then the tag store will not be updated. The two suboperations that make up a CPU-initiated write operation are not atomic and are pipelined with other read and write operations. Up to three Bcache probe operations can be in flight at any given time to increase overall Bcache performance. 4.6.5.1 Bcache Private Write Probe Operation. Figure 4-10 shows an example of ...

Page 111: ... hits dirty then the data write operation does not update the tag store If the CPU initiated write command misses in the Bcache the data write is sched uled after the fill data from memory has returned During the fill operation the Bcache tag store is updated to reflect the new tag and control state modified There FM 05561 AI4 index_h 21 4 tag_ram_oe_l st_clkx_h tag_ram_we_l data_adsc_l tag_data 3...

Page 112: ...cpu_clk cycles Figure 4 11 Bcache Private Data Write Hit Clean The index is launched from an arbitrary internal cpu_clk edge t 0 The data store address strobe data_adsc_l is also asserted at this time and is deasserted one bc_clk cycle later The tag store address strobe is tied at the module level to always be asserted so that a new address is latched every bc_clk cycle The data store address is a...

Page 113: ...quent bc_clk cycle Figure 4 12 shows an example of the timing for a data write operation that hits dirty to the Bcache during the write probe CBOX_CONFIG BC_CLK_RATIO is set to three cpu_clk cycles Note that the tag update is not required for a write hit to a dirty block Figure 4 12 Bcache Private Data Write Hit Dirty FM 05563 AI4 index_h 21 4 tag_ram_oe_l st_clkx_h tag_ram_we_l data_adsc_l tag_da...

Page 114: ...rations access different stores tag and data This technique is used to fully saturate the data bus during write hit streams as is shown in Figure 4 13 Figure 4 13 Bcache Interleaving 0 1 2 3 4 5 6 7 8 FM 05564 AI4 index_h 21 4 tag_ram_oe_l st_clkx_h tag_ram_we_l data_adsc_l tag_data 32 19 data_adv_l data_ram_oe_l data_ram_we_l 3 0 data 127 0 cpu_clk bc_rd_latency 5 A3 T3 D00 A0 A4 A1 bc_rd_latency...

Page 115: ...transaction when It encounters a miss The CPU addresses a noncached region of memory For example the sequence for a 21164PC initiated transaction caused by a Bcache miss is At the start of a Bcache transaction the 21164PC checks the tag and tag control status of the target block Table 4 7 Bcache Options Parameter Selection sysclk ratio 4 15 ____ CPU cycles Cache protocol flush or flush invalidate ...

Page 116: ...actions to use the Bcache while a miss is being serviced Prior to the fill data arriving the system asserts idle_bc_h back to the 21164PC to arbitrate for the shared 128 bit data bus Any private read or write operations in progress are allowed to complete before the fill data arrives from the system At a later time the system asserts fill_h The 21164PC asserts the tag and tag control bits and cont...

Page 117: ... indicates that the 21164PC has probed its caches and that the addressed block is not present READ MISS1 1001 Request for data This command indicates that the 21164PC has probed its caches and that the addressed block is not present 1010 Reserved 1011 Reserved BCACHE VICTIM 1100 Bcache victim should be removed If there is a victim buffer in the system this command is used to pass the address of th...

Page 118: ...ess and command can be immediately issued to the system interface on the next sysclk edge However if a miss is detected to a dirty block the CBU will require the data bus to process victims and must wait for any in flight probes to complete Figure 4 14 shows the timing of several Bcache reads and the resulting READ MISS Clean request The system immediately asserts cack_h to acknowledge the com man...

Page 119: ...ata_h 127 0 st_clkx_h dack_h sys_clk 0 1 2 3 4 5 6 7 8 A0 9 10 11 12 14 13 bc_clk_delay fill_offset index_h 21 4 idle_bc_h 15 16 17 18 19 20 F03 D04 21 22 23 24 25 A1 RM0 RM1 NOP NOP NOP F02 F01 F00 D31 D30 D21 D20 Db1 Db0 Da1 Da0 A0 A1 A2 A3 A0 A1 tag_data_ 32 19 T0 T1 T2 T3 T0 T0 tag_dirty_h D D D D D d tag_valid_h V V V V V V tag_ram_oe_l tag_ram_we_l data_adsc_l data_adv_l data_ram_oe_l data_r...

Page 120: ...lso to Section 4 9 3 for more information on using signals idle_bc_h and fill_h If fill_h is asserted at the rising edge of sysclk N the 21164PC samples fill_id_h then ensures that data_h 127 0 are tristated at the rising edge of sysclk N 1 Also at sysclk N 1 the 21164PC asserts the Bcache index and begins a Bcache write operation The system should drive the data onto the data bus and assert dack_...

Page 121: ...g cycle the BCACHE VICTIM command is driven along with the victim address Each assertion of dack_h causes the Bcache index to advance to the next part of the block Figures 4 15 and 4 16 show the timing of a READ MISS command with a victim The 21164PC and system must treat a READ MISS BCACHE VICTIM as an atomic transaction pair Once the system has acknowledged the READ MISS command it must guarante...

Page 122: ...be confused with the sampling of data When using the pipelined SSRAMs the data output register delays the data an additional sysclk cycle When the CBOX_CONFIG BC_REG_REG bit is set the data_ram_oe_l deassertion is delayed an additional sysclk cycle to allow the system ample time to sample the delayed Bcache read data System designers must also maintain the proper read to write spacing when going f...

Page 123: ...addr_h 39 4 cmd_h 3 0 A0 FM 05567 AI4 data_h 127 0 index_h 21 4 tag_ram_oe_l tag_ram_we_l V00 dack_h st_clk x _h bc_clk_delay A0 A0 tag_data 32 19 T1 tag_dirty_h D D tag_valid_h V V data_adsc_l data_adv_l data_ram_oe_l data_ram_we_l 3 0 F victim_pending_h RM0 NOP BCTVM NOP V0 cack_h fill_h fill_id_h Da0 Da1 Db0 Db1 V01 V02 V03 F00 F01 bc_clk_delay fill_offset A1 T0 D V T0 0 0 A0 miss0 miss1 deasse...

Page 124: ...25 addr_h 39 4 cmd_h 3 0 A0 FM 05566 AI4 data_h 127 0 index_h 21 4 tag_ram_oe_l tag_ram_we_l V01 dack_h st_clk x _h bc_clk_delay A0 A0 tag_data 32 19 T1 tag_dirty_h D D tag_valid_h V V data_adsc_l data_adv_l data_ram_oe_l data_ram_we_l 3 0 F victim_pending_h RM0 NOP BCTVM NOP V0 cack_h fill_h fill_id_h idle_bc_h Da0 Da1 Db0 Db1 V02 V03 F01 F02 bc_clk_delay fill_offset A1 T0 D V T0 0 0 A0 miss0 mis...

Page 125: ...the 21164PC retains the WRITE command and waits for bus ownership to be returned When the system takes the first part of the data it asserts dack_h This causes the 21164PC to drive the next 16 bytes of data on the same sysclk edge plus one cpu_clk cycle delay If the system asserts cack_h the 21164PC outputs the next command in the next sysclk Receipt of cack_h indicates to the 21164PC that the wri...

Page 126: ...n Section 4.9.1. The algorithm used by the 21164PC for accepting system commands to be processed in parallel by the 21164PC is presented in Section 4.8.1. Note: Timing diagrams do not explicitly show tristated buses. For examples of tristate timing, refer to Section 4.9. 4.8.1 Sending Commands to the 21164PC. The rules used by the CBU BIU to process commands sent by the system to the 21164PC are listed...

Page 127: ...ime. The algorithm used by the system to send commands to the 21164PC without overflowing the two CBU BIU command buffers is shown in Figure 4-18. Figure 4-18 Algorithm for System Sending Commands to the 21164PC. Flowchart PCA016, boxes as extracted: Init/Start; Set count to zero; Is CMD not NOP and count 2 (Yes/No); Increment count; Send command; CPU response equals ACK Bcache or NACK (Yes/No); Decrement count ...
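The flowchart in Figure 4-18 survives here only as box labels, so the following Python sketch is one plausible reading of it, not a transcription. It assumes the lost comparison is "count less than 2" (consistent with the two CBU BIU command buffers mentioned in the text) and that a response of ACK/Bcache or NOACK frees a buffer. The class name, method names, and response strings are hypothetical.

```python
class CommandThrottle:
    """Track outstanding system commands so at most two are in flight."""

    MAX_OUTSTANDING = 2  # two CBU BIU command buffers, per the text
    # Assumed spellings of the responses that retire a command.
    RETIRING_RESPONSES = {"ACK/Bcache", "NOACK"}

    def __init__(self) -> None:
        self.count = 0  # "Set count to zero"

    def try_send(self, cmd: str) -> bool:
        """Return True if the command may be sent now.

        A NOP occupies no buffer, so it never changes the count and
        this method reports False for it (nothing needs tracking).
        """
        if cmd != "NOP" and self.count < self.MAX_OUTSTANDING:
            self.count += 1  # "Increment count", then "Send command"
            return True
        return False

    def on_response(self, response: str) -> None:
        """Process a CPU response; retire one command if applicable."""
        if response in self.RETIRING_RESPONSES and self.count > 0:
            self.count -= 1  # "Decrement count"


if __name__ == "__main__":
    throttle = CommandThrottle()
    print(throttle.try_send("READ"), throttle.try_send("FLUSH"))
    print(throttle.try_send("INVALIDATE"))  # blocked: two outstanding
    throttle.on_response("NOACK")
    print(throttle.try_send("INVALIDATE"))  # allowed again
```

Keeping the counter on the system side means the system never has to query the 21164PC for buffer occupancy; it only observes responses.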

Page 128: ...h NOACK If the block is found and is clean the 21164PC responds with NOACK The block is invalidated in the Dcache and Bcache If the block is found and is dirty the 21164PC responds with ACK Bcache and the Bcache read operation begins in the same sysclk cycle as the ACK The block is invalidated in the Dcache and Bcache INVALIDATE 0010 Remove the block When the system issues the INVALIDATE command t...

Page 129: ...D When the block state changes to VALID the state of DIRTY does not matter If the block is clean the 21164PC invalidates both the Dcache and Bcache and responds to the system with a NOACK If the block is not found the 21164PC responds to the system with a NOACK The system probe is performed in private mode and if data is found dirty in the Bcache the subsequent tag invalidate and data movement are...

Page 130: ...AM FM 05569 AI4 data_h 127 0 index_h 21 4 tag_ram_oe_l tag_ram_we_l sys_clk 0 1 2 3 4 5 6 7 8 9 10 11 12 14 13 15 16 17 18 19 20 addr_bus_req_h addr_h 39 4 A0 cmd_h 3 0 FLSH addr_res_h 1 0 ACKBC D00 D01 D02 D03 dack_h st_clk x _h bc_clk_delay A0 bc_rd_latency A0 tag_data 32 19 T0 tag_dirty_h D V tag_valid_h V D data_adsc_l data_adv_l data_ram_oe_l data_ram_we_l 3 0 F ...

Page 131: ...e system probe and the invalidate are performed in private mode to reduce overall latency Figure 4 20 INVALIDATE Timing Diagram Bcache Hit 4 8 2 4 READ The READ command is used by the system to read dirty data from the 21164PC The tag control status does not change Figure 4 21 shows the timing and tag control sta tus of a read transaction FM 05571 AI4 addr_h 39 4 cmd_h 3 0 addr_bus_req_h dack_h st...

Page 132: ...ystem ample time to sample the delayed Bcache read data Figure 4 21 READ Timing Diagram Bcache Hit Flow Through SSRAM FM 05570 AI4 data_h 127 0 index_h 21 4 tag_ram_oe_l tag_ram_we_l sys_clk 0 1 2 3 4 5 6 7 8 9 10 11 12 14 13 15 16 17 18 19 20 addr_bus_req_h addr_h 39 4 A0 cmd_h 3 0 READ addr_res_h 1 0 ACKBC D00 D01 D02 D03 dack_h st_clk x _h bc_clk_delay A0 bc_rd_latency A0 tag_data_h 32 19 T0 ta...

Page 133: ... to the system. The 21164PC turns off its drivers at the rising edge of sysclk N. While the system must turn on its drivers between sysclk N and sysclk N+1, it must ensure that the drivers do not turn on before the 21164PC drivers turn off. The 21164PC samples the state of the command/address bus at the end of sysclk N+1. If addr_bus_req_h remains asserted, the system should continue to drive the comm...

Page 134: ...nough to ensure that the 21164PC completes any private Bcache transaction it might have started while waiting for the fill data. Signal fill_h is asserted a fixed number of sysclk cycles before the start of a fill transaction. At the end of the fill, the 21164PC waits five cpu_clk cycles before starting a read or write operation. This time should allow the system to turn off its drivers. If, in practic...

Page 135: ...C and the Bcache from driving the data bus. In general, the system should not need to use this feature, but it may be useful if the system places other devices on the data bus. Figure 4-23 shows an example of this timing. idle_cpu_cycles = 4 + bc_rd_latency + bc_clk_ratio + tristate_ram_turn_off = 4 + 5 + 3 + 2 = 14 CPU cycles. idle_bc_time (sysclk) = ROUNDUP(14/7) = 2 sysclk cycles. This requires idle_bc_h to be sampled two sys...
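The worked example above can be reproduced with a short calculation. The sketch assumes the garbled formula reads as a sum of four terms followed by a round-up division, and that the divisor 7 in the example is the sysclk ratio (CPU cycles per sysclk); both readings are assumptions. Function names are hypothetical.

```python
def idle_cpu_cycles(bc_rd_latency: int, bc_clk_ratio: int,
                    tristate_ram_turn_off: int) -> int:
    """CPU cycles the Bcache must be idle: 4 plus the three parameters."""
    return 4 + bc_rd_latency + bc_clk_ratio + tristate_ram_turn_off


def idle_bc_sysclk_cycles(bc_rd_latency: int, bc_clk_ratio: int,
                          tristate_ram_turn_off: int,
                          sysclk_ratio: int) -> int:
    """Round the idle time up to whole sysclk cycles (ROUNDUP)."""
    if sysclk_ratio <= 0:
        raise ValueError("sysclk_ratio must be positive")
    cpu_cycles = idle_cpu_cycles(bc_rd_latency, bc_clk_ratio,
                                 tristate_ram_turn_off)
    # Integer ceiling division avoids floating-point rounding.
    return -(-cpu_cycles // sysclk_ratio)


if __name__ == "__main__":
    # Values from the example in the text: 4 + 5 + 3 + 2 = 14, 14 / 7 = 2.
    print(idle_cpu_cycles(5, 3, 2))
    print(idle_bc_sysclk_cycles(5, 3, 2, sysclk_ratio=7))
```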

Page 136: ...ta_h 32 19 buses must be operated in such a way that no more than one driver may drive the bus at a time This section describes particular cases where tristate overlap may be a problem that needs to be corrected using features described in previous sections The owner of each bus must drive the bus to some value for each cycle Tristate drivers in the 21164PC turn on and off very fast in the 0 5 ns ...

Page 137: ...AMs, data_ram_oe_l is deasserted one cpu_clk cycle after the detection of the final dack_h. The system must allow time for data_ram_oe_l to turn off and the RAMs to stop driving the bus before the system drives fill data, to avoid data bus contention. Figure 4-24 System READ to FILL Spacing. FM 05572 AI4 signals: data_ram_oe_l, data_h, dack_h, fill_h, sys_clk; annotations: final dack detected, earliest fill sam...

Page 138: ... Figure 4 25 FILL to Private READ or WRITE Operation 4 10 21164PC Interface Restrictions This section lists restrictions on the use of 21164PC interface features 4 10 1 Fill Operations After Other Transactions For a system Bcache read operation Bcache victim or a system initiated data move ment followed by a fill operation the earliest assertion of fill_h by the system is dependent upon the CBOX_C...

Page 139: ...ce conditions to be avoided are described and illustrated in Section 4 11 2 through Section 4 11 6 4 11 1 Rules for 21164PC and System Use of External Interface This section lists the rules for determining the order in which 21164PC and system requests are allowed by the CBU BIU In general the order allowed is determined by use of cmd_h 3 0 idle_bc_h and fill_h 1 If idle_bc_h is not asserted and t...

Page 140: ...ith a Bcache victim transaction is treated as an atomic pair. If the READ MISS command is acknowledged with cack_h, then the BCACHE VICTIM command must be acknowledged with cack_h and all the data acknowledged with dack_h before the 21164PC responds to any other request. The system must also guarantee that once the read miss operation has been cacked, system commands or fill transactions are not sta...

Page 141: ...ceived from the Bcache and then asserts idle_bc_h. This causes the 21164PC to remove the READ MISS command with victim pending. The 21164PC reasserts the READ MISS and BCACHE VICTIM commands, if needed, at a later time. Figure 4-26 READ MISS with Victim Aborted by FILL Example. PCA010 signals: victim_pending_h, addr_bus_req_h, idle_bc_h, cack_h, dack_h, index_h<21:4>, sys_clk_out1_h, cmd_h<3:0>, addr_h<39:4>, data_h<127:0>...

Page 142: ... the READ MISS and BCACHE VICTIM commands before doing any thing else The last dack_h meets the requirement that the cack_h arrive before or with the last dack_h Figure 4 27 idle_bc_h and cack_h Race Examples HLO PCA011 victim_pending_h addr_bus_req_h idle_bc_h cack_h dack_h index_h 21 4 sys_clk_out1_h cmd_h 3 0 addr_h 39 4 data_h 127 0 data_ram_oe_l READ MISS NOP D0 D1 D2 I0 I1 I2 0 3 12 1 2 4 5 ...

Page 143: ...ation that misses The signal idle_bc_h is asserted but no victim was created so the read miss request is loaded into the pad ring The system then takes the request Figure 4 28 READ MISS with idle_bc_h Asserted Example PCA012 victim_pending_h addr_bus_req_h idle_bc_h cack_h dack_h index_h 21 4 sys_clk_out1_h cmd_h 3 0 addr_h 39 4 data_h 127 0 data_ram_oe_l READ MISS NOP D0 D1 I0 I1 0 3 12 1 2 4 5 6...

Page 144: ...es the request The 21164PC then responds to the FLUSH command and drives index_h 21 4 to read the Bcache The 21164PC restarting the Bcache read operation requesting the read miss with victim is not shown in the timing diagram If the victim block was invalidated by the system request the 21164PC produces a clean read miss transaction Figure 4 29 READ MISS with Victim Abort Example PCA013 victim_pen...

Page 145: ...at errors on data received by the 21164PC from the Bcache the system or both are described in this section Tag data errors are also described 4 12 1 Data Parity The 21164PC supports INT4 parity protection on the data bus for the external Bcache and memory system When the 21164PC drives data to memory it generates longword parity and places it on lw_parity_h 3 0 for write operations Parity is check...

Page 146: ...t is not expected such as a small system with fixed access time it is likely that the 21164PC internal IDU timeout logic would detect a stall if the system fails to complete a fill transaction Systems in which a fill error timeout could occur should contain logic to detect fill timeouts and cleanly terminate the transaction with the 21164PC To properly terminate a fill in an error case the fill_er...

Page 147: ...ntrollers. 4.13.3 Interrupt Priority Level. Table 4-11 shows which interrupts are enabled for a given interrupt priority level (IPL). An interrupt is enabled if the current IPL is less than the target IPL of the interrupt. Table 4-11 Interrupt Priority Level Effect (Sheet 1 of 2). Columns: Interrupt Source, Target IPL, Source. Software Interrupt Request 1: 1, Internal. Software Interrupt Request 2: 2, Internal. Software Int...
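The enabling rule stated above reduces to a single comparison. The sketch below applies it to the two table rows visible in this excerpt; the dictionary and function names are hypothetical, and the remaining rows of Table 4-11 are not reproduced because the excerpt is truncated.

```python
# Target IPLs for the two rows visible in the excerpt of Table 4-11.
TARGET_IPL = {
    "Software Interrupt Request 1": 1,
    "Software Interrupt Request 2": 2,
}


def interrupt_enabled(current_ipl: int, target_ipl: int) -> bool:
    """An interrupt is enabled if the current IPL is below its target IPL."""
    return current_ipl < target_ipl


if __name__ == "__main__":
    current = 1  # example current IPL, chosen only for illustration
    for source, target in TARGET_IPL.items():
        print(source, interrupt_enabled(current, target))
```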

Page 148: ...ated in INTID is to be serviced at an IPL higher than the current IPL If it is not PALcode should ignore the spurious interrupt 1 These interrupts are from external sources In some cases the system environment provides the logic OR of multiple interrupt sources at the same IPL to a particular pin 2 The external interrupts 20 23 are separately maskable by setting the appropriate bits in the ICSR re...

Page 149: ...ALtemp IPRs are accessible to PALcode by means of the HW_MTPR and HW_MFPR instructions. Table 5-1 lists the IPR numbers for these instructions. CBU and backup cache (Bcache) IPRs are accessible in the physical address region FF FFF0 0000 to FF FFFF FFFF. Table 5-25 summarizes the CBU and Bcache IPRs. Table 5-31 lists restrictions on the IPRs. Note: Unless explicitly stated, IPRs are not cleared or set by ...

Page 150: ...0C E1 EXC_MASK R 10D E1 PAL_BASE R W 10E E1 ICM R W 10F E1 IPLR R W 110 E1 INTID R 111 E1 IFAULT_VA_FORM R 112 E1 IVPTBR R W 113 E1 HWINT_CLR W 115 E1 SL_XMIT W 116 E1 SL_RCV R 117 E1 ICSR R W 118 E1 IC_FLUSH_CTL W 119 E1 ICPERR_STAT R W1C 11A E1 PMCTR R W 11C E1 PALtemp_IPRs PALtemp0 R W 140 E1 PALtemp1 R W 141 E1 Table 5 1 IDU MTU Dcache and PALtemp IPR Encodings Sheet 2 of 4 IPR Mnemonic Access...

Page 151: ...A E1 PALtemp11 R W 14B E1 PALtemp12 R W 14C E1 PALtemp13 R W 14D E1 PALtemp14 R W 14E E1 PALtemp15 R W 14F E1 PALtemp16 R W 150 E1 PALtemp17 R W 151 E1 PALtemp18 R W 152 E1 PALtemp19 R W 153 E1 PALtemp20 R W 154 E1 PALtemp21 R W 155 E1 PALtemp22 R W 156 E1 PALtemp23 R W 157 E1 MTU_IPRs DTB_ASN W 200 E0 DTB_CM W 201 E0 DTB_TAG W 202 E0 Table 5 1 IDU MTU Dcache and PALtemp IPR Encodings Sheet 3 of 4...

Page 152: ...B_IAP W 209 E0 DTB_IA W 20A E0 DTB_IS W 20B E0 ALT_MODE W 20C E0 CC W 20D E0 CC_CTL W 20E E0 MCSR R W 20F E0 DC_FLUSH W 210 E0 DC_PERR_STAT R W1C 212 E0 DC_TEST_CTL R W 213 E0 DC_TEST_TAG R W 214 E0 DC_TEST_TAG_TEMP R W 215 E0 DC_MODE R W 216 E0 MAF_MODE R W 217 E0 Table 5 1 IDU MTU Dcache and PALtemp IPR Encodings Sheet 4 of 4 IPR Mnemonic Access Index16 IDU Slots to Pipe ...

Page 153: ... which is determined by a not last used replacement algorithm The PTE field is obtained from the HW_MTPR ITB_PTE instruction Figure 5 1 shows the ITB_TAG register format Figure 5 1 Istream Translation Buffer Tag ITB_TAG Register 5 1 2 Instruction Translation Buffer Page Table Entry ITB_PTE Register 102 ITB_PTE is a read write register Write Format A write operation to this register writes both the...

Page 154: ...ter returns the PTE pointed to by the NLU pointer to the ITB_PTE_TEMP register and increments the NLU pointer. If the HW_MFPR ITB_PTE instruction falls in the shadow of a trapping instruction, the NLU pointer may be incremented multiple times. A zero value is returned to the integer register file. A second read of the ITB_PTE_TEMP register returns the PTE to the general-purpose integer register file I...

Page 155: ...TB_PTE register returns data to this register A second read of the ITB_PTE_TEMP register returns data to the general purpose integer register file IRF Figure 5 3 shows the ITB_PTE register format Table 5 2 shows the GHD settings for the ITB_PTE_TEMP register 5 1 5 Instruction Translation Buffer Invalidate All Process ITB_IAP Register 106 ITB_IAP is a write only register Any write operation to this...

Page 156: ...instruction in order to initialize the NLU pointer 5 1 7 Instruction Translation Buffer IS ITB_IS Register 107 ITB_IS is a write only register Writing a virtual address to this register invalidates the ITB entry that meets either of the following criteria An ITB entry whose virtual address VA field matches ITB_IS 42 13 and whose ASN field matches ITB_ASN 10 04 An ITB entry whose VA field matches I...

Page 157: ... The formatted faulting address generated depends on whether NT superpage mapping is enabled through ICSR bit SPE 0 Figure 5 6 shows the IFAULT_VA_FORM register format in non NT mode Figure 5 6 Formatted Faulting Virtual Address IFAULT_VA_FORM Register NT_Mode 0 Figure 5 7 shows the IFAULT_VA_FORM register format in NT mode Figure 5 7 Formatted Faulting Virtual Address IFAULT_VA_FORM Register NT_M...

Page 158: ...R register format in non NT mode Figure 5 8 Virtual Page Table Base IVPTBR Register NT_Mode 0 Figure 5 9 shows the IVPTBR register format in NT mode Figure 5 9 Virtual Page Table Base IVPTBR Register NT_Mode 1 IGN I G N 31 62 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 63 VPTB 63 33 MA0602 AI4 30 00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18...

Page 159: ...ble 5 3 describe the ICPERR_STAT register format Figure 5 10 Icache Parity Error Status ICPERR_STAT Register 5 1 11 Icache Flush Control IC_FLUSH_CTL Register 119 IC_FLUSH_CTL is a write only register Writing any value to this register flushes the entire Icache Table 5 3 Icache Parity Error Status Register Fields Name Extent Type Description DPE 11 W1C Data parity error TPE 12 W1C Tag parity error...

Page 160: ...case of precise exceptions, this is the PC value of the instruction that caused the exception. In case of imprecise exceptions or interrupts, this is the PC value of the next instruction that would have issued if the exception or interrupt was not reported. In case of a CALL_PAL instruction, the PC value of the next instruction after the CALL_PAL is written to EXC_ADDR. Bit 00 of this register is used to indi...

Page 161: ... SWC bit is cleared whenever a floating-point instruction without the S modifier completes with an arithmetic trap. The bit remains cleared regardless of additional arithmetic traps until the register is written by an HW_MTPR instruction. The bit is always cleared upon any HW_MTPR write operation to the EXC_SUM register. INV; 11; WA; Indicates invalid operation. DZE; 12; WA; Indicates divide by zero. FOV 1...

Page 162: ...s the destinations of instructions that have caused an arithmetic trap between EXC_MASK write operations The destina tion is recorded as a single bit mask in the 64 bit IPR representing F0 F31 and I0 I31 A write operation to EXC_ SUM clears the EXC_MASK register Figure 5 13 shows the EXC_MASK register format Figure 5 13 Exception Mask EXC_MASK Register 00 31 32 63 LJ 03485 AI4 I1 I0 I31 I30 I29 F1...

Page 163: ...L_BASE register format Figure 5 14 PAL Base Address PAL_BASE Register 5 1 16 IDU Current Mode ICM Register 10F ICM is a read write register containing the current mode bits of the architecturally defined processor status as described in the Alpha AXP Architecture Reference Manual Figure 5 15 shows the ICM register format Figure 5 15 IDU Current Mode ICM Register 00 13 14 31 32 39 40 63 LJ 03486 AI...

Page 164: ...gister Fields Sheet 1 of 3 Name Extent Type Description PME 1 0 09 08 RW 0 Performance counter master enable bits If both PME 1 and PME 0 are clear all perfor mance counters in the PMCTR IPR are disabled If either PME 1 or PME 0 are set the counter is enabled according to the settings of the PMCTR CTL fields RSV 17 RW 0 Reserved to DIGITAL MBZ 18 RW 0 Reserved to DIGITAL Must be zero HLO001B 31 00...

Page 165: ...ptions. HWE <27> RW,0: If set, allows PALRES instructions to be issued in kernel mode. SPE<1:0> <29:28> RW,0: If SPE<1> is set, it enables superpage mapping of Istream virtual address VA<39:13> directly to physical address PA<39:13>, assuming VA<42:41> = 10 (binary). Virtual address bit VA<40> is ignored in this translation. Access is allowed only in kernel mode. If SPE<0> is set (NT mode), it enables superpage mapping of Istream ...

Page 166: ...Figure 5 17 shows the IPLR register format Refer to Table 4 11 for a description of which inter rupts are enabled for a given IPL Figure 5 17 Interrupt Priority Level IPLR Register FBD 36 RW 0 If set forces bad Icache data parity MBZ in nor mal operation MBO 37 RW 1 Reserved to DIGITAL Must be one ISTA 38 RO Reading this bit indicates ICACHE BIST status If set ICACHE BIST was successful TST 39 RW ...

Page 167: ...must ensure that the IPL in INTID is greater than the IPL specified by IPLR. This restriction is required because a level-sensitive hardware interrupt may disappear before the interrupt service routine is entered (passive release). The contents of INTID are not correct on a HALT interrupt because this particular interrupt does not have a target IPL at which it can be masked. When a HALT interrupt occu...

Page 168: ... processor mode given in the ICM 04 03 should be equal to or higher than the mode associated with the AST request Figure 5 19 shows the ASTRR register format Figure 5 19 Asynchronous System Trap Request ASTRR Register 5 1 21 Asynchronous System Trap Enable ASTER Register 10A ASTER is a read write register containing bits to enable corresponding asynchronous system trap AST interrupt requests Figur...

Page 169: ...interrupt requests A software request for a particular IPL may be requested by setting the appropriate bit in SIRR 15 01 Figure 5 21 and Table 5 6 describe the SIRR register format Figure 5 21 Software Interrupt Request SIRR Register Table 5 6 Software Interrupt Request Register Fields Name Extent Type Description SIRR 15 1 18 04 RW Request software interrupts 00 03 04 18 19 31 32 63 RAZ IGN LJ 03...
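As a rough sketch of the field layout quoted above (SIRR<15:1> occupying register bits <18:04>), the helper below builds a write mask for a set of software interrupt levels. It assumes that level n maps linearly to register bit n + 3, which the excerpt implies but does not state outright; the function name is hypothetical.

```python
def sirr_mask(levels):
    """Build a SIRR write value requesting the given software interrupt levels.

    Assumption: level 1 sits at register bit 4 and level 15 at bit 18,
    with the levels in between laid out in order.
    """
    mask = 0
    for level in levels:
        if not 1 <= level <= 15:
            raise ValueError(f"software interrupt level out of range: {level}")
        mask |= 1 << (level + 3)
    return mask


print(hex(sirr_mask([1, 15])))
```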

Page 170: ...register format Figure 5 22 Hardware Interrupt Clear HWINT_CLR Register Table 5 7 Hardware Interrupt Clear Register Fields Name Extent Type Description PC0C 27 W1C Clears performance counter 0 interrupt requests PC1C 28 W1C Clears performance counter 1 interrupt requests PC2C 29 W1C Clears performance counter 2 interrupt requests CRDC 32 W1C Clears correctable read data interrupt requests SLC 33 W...

Page 171: ... ISR Register Table 5 8 Interrupt Summary Register Fields Sheet 1 of 2 Name Extent Type Description ASTRR 3 0 and ASTER 3 0 03 00 RO Boolean AND of ASTRR USEK with ASTER USEK used to indicate enabled AST requests SISR 15 1 18 04 RO 0 Software interrupt requests 15 through 1 corre sponding to IPL 15 through 1 ATR 19 RO Set if any AST request and corresponding enable bit is set and if the processor ...

Page 172: ...ter 0 IPL 29 PC1 28 RO External hardware interrupt performance counter 1 IPL 29 PC2 29 RO External hardware interrupt performance counter 2 IPL 29 PFL 30 RO External hardware interrupt power failure IPL 30 MCK 31 RO External hardware interrupt system machine check IPL 31 CRD 32 RO Correctable ECC errors IPL 31 SLI 33 RO Serial line interrupt HLT 34 RO External hardware interrupt halt Table 5 8 Int...
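The bit positions listed in this excerpt can be turned into a small decoder. The sketch below covers only the fields whose bit numbers appear on this page (bits 28 through 34); the dictionary and function names are illustrative, not taken from the manual.

```python
# Bit positions as listed in the Table 5-8 excerpt above.
ISR_BITS = {
    28: "PC1",  # performance counter 1
    29: "PC2",  # performance counter 2
    30: "PFL",  # power failure
    31: "MCK",  # system machine check
    32: "CRD",  # correctable ECC errors
    33: "SLI",  # serial line interrupt
    34: "HLT",  # halt
}


def pending_sources(isr_value):
    """Return the names of the listed ISR bits that are set, lowest bit first."""
    return [name for bit, name in sorted(ISR_BITS.items()) if (isr_value >> bit) & 1]


print(pending_sources((1 << 31) | (1 << 34)))
```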

Page 173: ...p The value of the TMT bit is transmitted offchip on the srom_clk_h signal In normal operation mode not in debugging mode the srom_clk_h signal serves both the serial line transmission and the Icache SROM interface see Sections 7 4 and 7 5 Figure 5 24 and Table 5 9 describe the SL_XMIT register format Figure 5 24 Serial Line Transmit SL_XMIT Register Table 5 9 Serial Line Transmit Register Fields ...

Page 174: ...al. A serial line interrupt is requested whenever a transition is detected on the srom_data_h signal and the SLE bit in the ICSR is set. During normal operations (not in test mode), the srom_data_h signal serves both the serial line reception and the Icache SROM interface (see Sections 7.4 and 7.5). Figure 5-25 and Table 5-10 describe the SL_RCV register format. Figure 5-25 Serial Line Receive (SL_RCV) Regis...

Page 175: ...r select options are described in the PM0_ MUX 2 0 and PM1_ MUX 2 0 fields of the CBOX_CONFIG2 IPR see Table 5 29 Section 2 8 describes the per formance measurement support features Note The arrangement of the select option tables is not meant to imply any restrictions on permitted combinations of selections The only cases in which the selection for one counter influences another s count is SEL1 8...

Page 176: ...ion 5.1.23 and Section 5.1.24. 11 = counter enable, interrupt at count 256. CTL1<1:0> <13:12> RW,0: CTR1 counter control. 00 = counter disable, interrupt disable; 01 = counter enable, interrupt disable; 10 = counter enable, interrupt at count 65536; 11 = counter enable, interrupt at count 256. CTL2<1:0> <11:10> RW,0: CTR2 counter control. 00 = counter disable, interrupt disable; 01 = counter enable, interrupt disable; 10 = counter enable...
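The CTL1 encoding listed above is fully visible in this excerpt, so it can be sketched as a lookup. The names below are hypothetical; only the bit position <13:12> and the four encodings come from the text.

```python
# CTL1<1:0> encodings: (counter enabled, interrupt threshold or None).
CTL1_ENCODING = {
    0b00: (False, None),   # counter disable, interrupt disable
    0b01: (True, None),    # counter enable, interrupt disable
    0b10: (True, 65536),   # counter enable, interrupt at count 65536
    0b11: (True, 256),     # counter enable, interrupt at count 256
}


def decode_ctl1(pmctr_value):
    """Extract PMCTR bits <13:12> and return (enabled, interrupt_count)."""
    return CTL1_ENCODING[(pmctr_value >> 12) & 0b11]


print(decode_ctl1(0b10 << 12))
```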

Page 177: ...lay trap occurred 0x4 single issue cycles Exactly one instruction issued 0x5 dual issue cycles Exactly two instructions issued 0x6 triple issue cycles Exactly three instructions issued 0x7 quad issue cycles Exactly four instructions issued 1 Instructions 0x8 jsr ret if sel2 PC M Instruction issued if sel2 is PC M 0x2 PC mispredicts 0x8 cond branch if sel2 BR M Instruction issued if sel2 is BR M 0x...

Page 178: ...ereas user PAL mode measures the events during the PAL calls made by the OS 1 In this instance Kk means kill kernel only The combination Ku 1 Kp 1 and Kk 1 is used to gather events for the executive and supervisor modes only 0xB Reserved 0xC CPU cycles 0xD MB stall cycles 0xE LDxL instructions issued 0xF pick CBU 0 input 0xF pick CBU 1 input Table 5 13 Measurement Mode Control Kill Bit Settings Me...

Page 179: ...Figure 5 27 shows the DTB_ASN register format Figure 5 27 Dstream Translation Buffer Address Space Number DTB_ASN Register 5 2 2 Dstream Translation Buffer Current Mode DTB_CM Register 201 DTB_CM is a write only register that must be written with an exact duplicate of the IDU current mode ICM register CM field These bits indicate the current mode of the machine as described in the Alpha AXP Archit...

Page 180: ...hip reset but not on timeout reset Figure 5 29 shows the DTB_TAG register format Figure 5 29 Dstream Translation Buffer Tag DTB_TAG Register 5 2 4 Dstream Translation Buffer Page Table Entry DTB_PTE Register 203 DTB_PTE is a read write register representing the 64 entry DTB page table entries PTEs The entry to be written is chosen by a not last used replacement algorithm implemented in hardware Wr...

Page 181: ...PTE_TEMP register returns the PTE entry to the register file Reading the DTB_PTE register increments the TB entry pointer of the DTB which allows reading the entire set of DTB PTE entries Figure 5 30 shows the DTB_PTE register format Note The Alpha AXP Architecture Reference Manual provides descriptions of the fields of the PTE Figure 5 30 Dstream Translation Buffer Page Table Entry DTB_PTE Regist...

Page 182: ...ions to return the PTE data to the register file The first reads the DTB_PTE register to the DTB_PTE_TEMP register and returns zero to the register file The second returns the DTB_PTE_TEMP regis ter to the integer register file IRF Figure 5 31 shows the DTB_PTE_TEMP regis ter format Figure 5 31 Dstream Translation Buffer Page Table Entry Temporary DTB_PTE_TEMP Register 00 01 02 03 04 05 06 07 08 0...

Page 183: ... register is not unlocked or cleared on reset Figure 5 32 and Table 5 14 describe the MM_STAT register format Figure 5 32 Dstream Memory Management Fault Status MM_STAT Register Table 5 14 Dstream Memory Management Fault Status Register Fields Sheet 1 of 2 Name Extent Type Description WR 00 RO Set if reference that caused error was a write operation ACV 01 RO Set if reference caused an access viol...

Page 184: ..._FORM and MM_STAT registers are locked against further updates until software reads the VA register The VA register is not unlocked on reset Figure 5 33 shows the VA register format Figure 5 33 Faulting Virtual Address VA Register BAD_VA 05 RO Set if reference had a bad virtual address RA 10 06 RO RA field of the faulting instruction OPCODE 16 11 RO Opcode field of the faulting instruction Table 5...

Page 185: ...is formatted as a 32 bit PTE when the NT_Mode bit MCSR 01 is set see Figure 5 34 VA_ FORM is locked on any Dstream fault DTB miss or Dcache parity error The VA VA_FORM and MM_STAT registers are locked against further updates until software reads the VA register The VA_FORM register is not unlocked on reset Figure 5 35 shows the VA_FORM reg ister format when MCSR 01 is clear Figure 5 34 Formatted V...

Page 186: ...ster the MVPTBR is not locked against further updates when a Dstream fault DTB miss or Dcache parity error occurs Figure 5 36 shows the MVPTBR register format Figure 5 36 MTU Virtual Page Table Base MVPTBR Register Table 5 15 Formatted Virtual Address Register Fields Name Extent Type Description NT_Mode 0 VPTB 63 33 RO Virtual page table base address as stored in MVPTBR VA 42 13 32 03 RO Subset of...

Page 187: ...the SEO bit until software writes a 1 to clear the LOCK bit The SEO bit is set when a Dcache parity error occurs while the Dcache parity error status register is locked Once the SEO bit is set it is locked against further updates until the software writes a 1 to DC_PERR_STAT 00 to unlock and clear the bit The SEO bit is not set when Dcache parity errors are detected on both pipes within the same c...

Page 188: ...ries and resets the DTB not last used NLU pointer to its initial state Table 5 16 Dcache Parity Error Status Register Fields Name Extent Type Description SEO 00 W1C Set if second Dcache parity error occurred in a cycle after the register was locked The SEO bit is not set as a result of a second parity error that occurs within the same cycle as the first LOCK 01 W1C Set if parity error is detected ...
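The write-one-to-clear (W1C) behavior of the SEO and LOCK bits can be modeled as below. This is a software model for illustration only, assuming that writing a 1 clears the corresponding bit and writing a 0 leaves it unchanged; the constant and function names are not from the manual.

```python
SEO = 1 << 0   # DC_PERR_STAT<00>
LOCK = 1 << 1  # DC_PERR_STAT<01>


def w1c_write(current, written):
    """Model a W1C write: every bit written as 1 is cleared, others are kept."""
    return current & ~written


status = SEO | LOCK
status = w1c_write(status, SEO)   # clears SEO only
print(bin(status))
```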

Page 189: ... matches DTB_IS<42:13> and whose ASN field matches DTB_ASN<63:57>. A DTB entry whose VA field matches DTB_IS<42:13> and whose ASM bit is set. Figure 5-38 shows the DTB_IS register format. Figure 5-38 Dstream Translation Buffer Invalidate Single (DTB_IS) Register. Note: The DTB_IS register is written before the normal IDU trap point. The DTB invalidate single operation is aborted by the IDU only for the foll...
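The two invalidation criteria above can be expressed as a single predicate. The sketch below is a software model under stated assumptions: the `TbEntry` class and its field names are hypothetical, and an entry is represented by its VA<42:13> page number, its ASN, and its ASM bit.

```python
from dataclasses import dataclass


@dataclass
class TbEntry:
    vpn: int    # VA<42:13> of the entry (hypothetical representation)
    asn: int    # address space number of the entry
    asm: bool   # address space match bit


def invalidated_by_dtb_is(entry, dtb_is, dtb_asn):
    """True if a write of dtb_is would invalidate this entry."""
    vpn = (dtb_is >> 13) & ((1 << 30) - 1)   # DTB_IS<42:13>
    asn = (dtb_asn >> 57) & 0x7F             # DTB_ASN<63:57>
    return entry.vpn == vpn and (entry.asm or entry.asn == asn)


entry = TbEntry(vpn=0x1234, asn=5, asm=False)
print(invalidated_by_dtb_is(entry, 0x1234 << 13, 5 << 57))
```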

Page 190: ...F MCSR is a read write register that controls features and records status in the MTU This register is cleared on chip reset but not on timeout reset Figure 5 39 and Table 5 17 describe the MCSR register format Figure 5 39 MTU Control MCSR Register 00 01 02 03 04 05 06 31 32 63 RAZ IGN RAZ IGN M_BIG_ENDIAN LJ 03511 AI4 SP 1 0 MBZ E_BIG_ENDIAN MBZ ...

Page 191: ...address bit VA<40> is ignored in this translation. SP<0> enables one-to-one superpage mapping of Dstream virtual addresses with VA<42:30> = 1FFE (hexadecimal). In this mode, virtual addresses VA<29:13> are mapped directly to physical addresses PA<29:13>, with bits <39:30> of the physical address set to 0. SP<0> is the NT_Mode bit that is used to control virtual address formatting on a read operation from the VA_FORM register. Re...
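The SP<0> mapping described above can be sketched as a small translation helper. It assumes that the byte-within-page bits VA<12:0> pass through unchanged, which the excerpt does not state explicitly; the function name is hypothetical.

```python
def sp0_superpage_pa(va):
    """Return the physical address for an SP<0> superpage hit, or None.

    The mapping applies only when VA<42:30> equals 0x1FFE. VA<29:13> maps
    directly to PA<29:13> and PA<39:30> is zero. Passing VA<12:0> through
    as the page offset is an assumption made for this sketch.
    """
    if (va >> 30) & 0x1FFF != 0x1FFE:
        return None
    return va & ((1 << 30) - 1)


va = (0x1FFE << 30) | 0x2468
print(hex(sp0_superpage_pa(va)))
```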

Page 192: ...d test modes in the Dcache This register is cleared on chip reset but not on timeout reset Figure 5 40 and Table 5 18 describe the DC_MODE register format Note The following bit settings are required for normal operation DC_ENA 1 DC_FHIT 0 DC_BAD_PARITY 0 DC_PERR_DISABLE 0 Figure 5 40 Dcache Mode DC_MODE Register 00 01 02 03 04 31 DC_ENA DC_FHIT DC_BAD_PARITY DC_PERR_DISABLE 32 63 LJ 03512 AI4 RAZ...

Page 193: ...the data parity inputs to the Dcache on integer stores This has the effect of putting bad data parity into the Dcache on integer stores that hit in the Dcache This bit has no effect on the tag parity written to the Dcache during FILL opera tions or the data parity written to the CBU write data buffer on integer store instructions Floating point store instructions should not be issued when this bit...

Page 194: ...cleared on timeout reset Figure 5 41 and Table 5 19 describe the MAF_MODE register format Note The following bit settings are required for normal operation DREAD_NOMERGE 0 WB_FLUSH_ALWAYS 0 WB_NOMERGE 0 MAF_ARB_DISABLE 0 WB_CNT_DISABLE 0 Figure 5 41 Miss Address File Mode MAF_MODE Register PCA008 63 32 31 00 01 02 03 04 05 06 07 08 RAZ IGN RAZ IGN 09 10 11 12 WB_CLR_LO_THRESH 1 0 WB_SET_LO_THRESH ...

Page 195: ...o allocate a new entry Subsequent merging to that entry is not allowed even if WB_ NOMERGE is cleared Must be zero MBZ in normal operation IO_NMERGE 03 RW 0 When set this bit prevents loads from I O space address bit 39 1 from merging in the MAF Should be zero SBZ in typical operation WB_CNT_ DISABLE 04 RW 0 When set this bit disables the 256 cycle WB counter in the MAF arbiter The top entry of th...

Page 196: ... the threshold at which the WB begins arbitration at low priority. The thresholds are as follows: 00 = 3 entries; 01 = 4 entries; 10 = 5 entries; 11 = 2 entries (21164 mode). WB_SET_LO_THRESH must be greater than WB_CLR_LO_THRESH. WB_CLR_LO_THRESH<1:0> <11:10> RW,0: These bits set the threshold at which the WB stops arbitration. The thresholds are as follows: 00 = 0 entries; 01 = 1 entry (21164 mode); 10 = 2 entries; 11 = 3 entries...
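The two threshold encodings and the ordering constraint can be checked with a short helper. The sketch assumes the constraint "WB_SET_LO_THRESH must be greater than WB_CLR_LO_THRESH" compares the decoded entry counts rather than the raw 2-bit codes; the names are illustrative.

```python
# Decoded entry counts for each 2-bit field encoding, as listed above.
WB_SET_LO_ENTRIES = {0b00: 3, 0b01: 4, 0b10: 5, 0b11: 2}
WB_CLR_LO_ENTRIES = {0b00: 0, 0b01: 1, 0b10: 2, 0b11: 3}


def thresholds_valid(set_code, clr_code):
    """Check that the set threshold exceeds the clear threshold (in entries)."""
    return WB_SET_LO_ENTRIES[set_code] > WB_CLR_LO_ENTRIES[clr_code]


print(thresholds_valid(0b11, 0b01))  # 2 entries vs 1 entry
print(thresholds_valid(0b11, 0b10))  # 2 entries vs 2 entries
```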

Page 197: ...banks of the Dcache 5 2 18 Alternate Mode ALT_MODE Register 20C ALT_MODE is a write only register that specifies the alternate processor mode used by some HW_LD and HW_ST instructions Figure 5 42 and Table 5 20 describe the ALT_MODE register format Figure 5 42 Alternate Mode ALT_MODE Register Table 5 20 Alternate Mode Register Settings ALT_MODE 04 03 Mode 0 0 Kernel 0 1 Executive 1 0 Supervisor 1 ...

Page 198: ...ble the cycle counter. The CC<31:00> is written to CC_CTL by an HW_MTPR instruction. The CC register is read by the RPCC instruction as defined in the Alpha AXP Architecture Reference Manual. The RPCC instruction returns a 64-bit value. The cycle counter is enabled to increment only three cycles after the MTPR CC_CTL (with CC_CTL<32> set) instruction is issued. This means that an RPCC instruction issued ...

Page 199: ... not changed If CC_CTL 32 is set then the counter is enabled otherwise the counter is disabled Figure 5 44 and Table 5 21 describe the CC_CTL register format Figure 5 44 Cycle Counter Control CC_CTL Register Table 5 21 Cycle Counter Control Register Fields Name Extent Type Description COUNT 31 04 31 04 WO Cycle count This value is loaded into CC 31 04 CC_ENA 32 WO Cycle Counter enable When set thi...
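The CC_CTL layout in Table 5-21 (COUNT in bits <31:04>, CC_ENA in bit <32>) suggests how a write value could be assembled. The helper below is a sketch with a hypothetical name; it simply masks off the low four bits of the count, since only CC<31:04> is loaded.

```python
def make_cc_ctl(count, enable):
    """Assemble a CC_CTL write value from a 32-bit count and an enable flag."""
    value = count & 0xFFFFFFF0      # COUNT<31:04>; low four bits are not loaded
    if enable:
        value |= 1 << 32            # CC_ENA
    return value


print(hex(make_cc_ctl(0x12345678, enable=True)))
```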

Page 200: ...set Figure 5 45 Dcache Test Tag Control DC_TEST_CTL Register Table 5 22 Dcache Test Tag Control Register Fields Sheet 1 of 2 Name Extent Type Description BANK0 00 RW Dcache Bank0 enable When set reads from DC_TEST_TAG return the tag from Dcache bank0 writes to DC_TEST_TAG write to Dcache bank0 When clear reads from DC_TEST_TAG return the tag from Dcache bank1 BANK1 01 RW Dcache Bank1 enable When s...

Page 201: ...sent to the Dcache enabling it to shift the data from one scan latch to the next Consecutively setting this bit has the effect of shifting soft repair data into the Dcache programmable soft repair logic A write to this location should be followed by an MB instruction LOAD 15 RW 0 Load signal for Dcache soft repair When set the data shifted into the soft repair scan chain is selected thus enabling ...

Page 202: ...eration is from Dcache bank0 Otherwise the read operation is from Dcache bank1 When DC_TEST_TAG is written the value written to DC_TEST_ TAG is written to the Dcache index referenced by the value in the DC_TEST_CTL register The tag tag parity and valid bits are affected by this write operation Data parity bits are not affected by this write operation use DC_MODE 02 and force hit modes If BANK0 is ...

Page 203: ...e Dcache tag parity bit that covers tag bits 32 through 13 valid bits not covered OW0_VALID 11 WO Octaword valid bit 0 This bit refers to the Dcache valid bit for the low order octaword within a Dcache 32 byte block OW1_VALID 12 WO Octaword valid bit 1 This bit refers to the Dcache valid bit for the high order octaword within a Dcache 32 byte block TAG 32 13 32 13 WO TAG 32 13 These bits refer to ...

Page 204: ...DC_TEST_TAG reads the tag array and data parity bits and loads them into the DC_ TEST_TAG_TEMP register An UNDEFINED value is returned to the integer register file IRF 2 The second read operation of the DC_TEST_TAG_TEMP register returns the Dcache test data to the integer register file IRF Figure 5 47 and Table 5 24 describe the DC_TEST_TAG_TEMP register format Figure 5 47 Dcache Test Tag Temporar...

Page 205: ...ered DATA_PAR 7 0 10 03 RO Data parity When any of these bits are set it indi cates a parity error occurred in a read of DC_TEST_TAG in the bank specified in DC_TEST_CTL OW0_VALID 11 RO Octaword valid bit 0 This bit refers to the Dcache valid bit for the low order octaword within a Dcache 32 byte block OW1_VALID 12 RO Octaword valid bit 1 This bit refers to the Dcache valid bit for the high order ...

Page 206: ...R in this address space produces UNDEFINED behavior The operating system should not map any address in this region as writable in any mode The CBU internal processor registers are described in Section 5 3 1 through Section 5 3 4 Table 5 25 CBU Internal Processor Register Descriptions Register Address Type Description CBOX_CONFIG FF FFF0 0008 RW Contains Bcache configuration parameters CBOX_ADDR FF...

Page 207: ...riod (st_clk) in number of CPU cycles. At power-up, the st_clk remains 0 until the Bcache is enabled. The supported range of values is 2 to 10. BC_LATENCY_OFF<3:0> <11:08> RW,0: This offset field determines the number of CPU cycles to wait from the CPU clock edge that launches the index until the data is latched into the 21164PC. Total Latency = 5 + BC_LATENCY_OFF<3:0>. At power-up, this field is initialized to 0 ...

Page 208: ...acing when switching from private Bcache reads to private Bcache writes. Total read-to-write spacing = 1 + BC_RW_OFF<3:0>. At power-up, this field is initialized to 2, which represents a total read-to-write spacing of three CPU cycles. The supported range of values for this field is 2 to 7, which provides a total read-to-write spacing of three to eight CPU cycles. For other data movement commands such as FLUS...
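The spacing arithmetic above (a field value of 2 gives three CPU cycles, and 7 gives eight) corresponds to spacing = 1 + BC_RW_OFF. A minimal sketch, with a hypothetical function name and a range check taken from the supported values of 2 to 7:

```python
def read_to_write_spacing(bc_rw_off):
    """Total Bcache read-to-write spacing in CPU cycles for a BC_RW_OFF value."""
    if not 2 <= bc_rw_off <= 7:
        raise ValueError("BC_RW_OFF supported range is 2 to 7")
    return 1 + bc_rw_off


for off in (2, 7):
    print(off, read_to_write_spacing(off))
```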

Page 209: ... set all read and write operations with PA 39 0 hit in the Bcache This is useful when ini tializing the Bcache on power up BC_FORCE_ ERR 26 RW 0 When set bit zero of each longword written into the Bcache is inverted BC_BIG_ DRV 27 RW 0 When set this bit enables 50 more drive on the fol lowing pins index_h 21 4 data_ram_oe_l data_ram_we_l 3 0 st_clk1_h st_clk2_h st_clk3_h data_adsc_l data_adv_l BC_...

Page 210: ...s set A read of CBOX_STATUS unlocks the CBOX_ADDR register Figure 5 49 and Table 5 27 describe the CBOX_ADDR register format Figure 5 49 CBU Address CBOX_ADDR Register Table 5 27 CBU Address Register Fields Name Extent Type Description Reserved 03 00 RO Reserved to DIGITAL Must be zero MBZ ADDRESS 36 04 36 04 RO Error address Reserved 38 37 RO Reserved to DIGITAL Must be zero MBZ ADDRESS 39 39 RO ...

Page 211: ...RO 0 Reserved to DIGITAL Must be zero MBZ SYS_CLK_ RATIO 3 0 07 04 RO 0 The sysclk period in CPU cycles The sysclk ratio is loaded from the IRQ pins on reset Note that this field is read only CHIP_REV 3 0 11 08 RO 0 This field displays 0001 the current revision of the chip Future update revisions of the chip will return different unique values DATA_PAR_ ERR 3 0 15 12 RO 0 If set this field indicat...

Page 212: ...r the failing address If set the data had been modified and not written to memory MEMORY 18 RO 0 If set the error was detected during a fill from memory MULTI_ERR 19 RO 0 If set another error was detected after the register was locked Reserved 31 20 RO 0 Reserved to DIGITAL Must be zero MBZ Table 5 28 CBU Status Register Fields Sheet 2 of 2 Name Extent Type Description ...

Page 213: ...Reserved 03 00 RW 0 Reserved to DIGITAL Must be zero MBZ BC_REG_REG 04 RW 1 When set this bit indicates that the Bcache is built from REG REG SSRAM When clear it indicates that the Bcache is built from REG FT SSRAM This bit is used to delay the deassertion of data_ram_oe_l during system Bcache read transactions for example Bcache victims or system probes that require data movement DBG_SEL 5 RW 0 S...

Page 214: ...ad requests the total number of read requests from the MTU 0x1 Bcache Dstream read hits total number of Dstream read requests that hit in the Bcache 0x2 Bcache Dstream read fills the total number of Dstream read fill requests to the Bcache 0x3 Bcache write operations the total number of write requests from the MTU 0x4 Undefined 0x5 Bcache clean write hits the total number of write operations that ...

Page 215: ...e total number of write fill operations in the Bcache 0x5 System read flush Bcache hits the total num ber of system READ or FLUSH hits in the Bcache 0x6 System read flush Bcache misses the total number of system READ or FLUSH requests 0x7 Read miss 3 launched the number of times a third READ MISS request is sent to the system while there are already two READ MISSes outstanding SYSRD_DCLK_EN 14 RW ...

Page 216: ..._MFPR instructions. The latency from a PALtemp read operation to availability is one cycle. 5.5 Restrictions. The following sections list all known register access restrictions. A software tool called the PALcode violation checker (PVC) is available. This tool can be used to verify adherence to many of the PALcode restrictions. 5.5.1 CBU IPR PALcode Restrictions. Table 5-30 describes the CBU IPR PALcode re...

Page 217: ...e space Clearing of BC_FORCE_HIT in CBOX_CONFIG Must be followed by MB read operation of CBOX_STATUS then MB prior to subsequent store Table 5 31 PALcode Restrictions Table Sheet 1 of 5 The following in cycle 0 Restrictions Note Numbers refer to cycle number Y if checked by PVC1 CALL_PAL entry No HW_REI or HW_REI_STALL in cycle 0 No HW_MFPR EXC_ADDR in cycle 0 1 Y Y PALshadow write instruc tion No...

Page 218: ...cycle ARITH trap entry No HW_MFPR EXC_SUM or EXC_MASK in cycle 0 1 Y Machine check trap entry No register file read or write access in 0 1 2 3 4 5 6 7 No HW_MFPR EXC_SUM or EXC_MASK in cycle 0 1 Y HW_MTPR any IDU IPR including PALtemp regis ters No HW_MFPR same IPR in cycle 1 2 No floating point conditional branch in 0 No FEN or OPCDEC instruction in 0 Y HW_MTPR ASTRR ASTER No HW_MFPR INTID in 0 1...

Page 219: ...R ITB_PTE Must be followed by HW_REI_STALL HW_MTPR ITB_IAP ITB_IS ITB_IA Must be followed by HW_REI_STALL HW_MTPR ITB_IS HW_REI_STALL must be in the same Istream octaword HW_MTPR IVPTBR No HW_MFPR IFAULT_VA_FORM in 0 1 2 Y HW_MTPR PAL_BASE No CALL_PAL in 0 1 2 3 4 5 6 7 No HW_REI in 0 1 2 3 4 5 6 Y Y HW_MTPR ICM No HW_REI in 0 1 2 No private CALL_PAL in 0 1 2 3 Y HW_MTPR CC CC_CTL No RPCC in 0 1 2...

Page 220: ...E DC_ PERR_STAT DC_TEST_CTL DC_TEST_TAG in 2 Y Y HW_MTPR DTB_TAG No virtual MTU instructions in 1 2 3 No HW_MTPR DTB_TAG in 1 No HW_MFPR DTB_PTE in 1 2 No HW_MTPR DTB_IS in 1 2 No HW_REI in 0 1 2 Y Y Y Y Y HW_MTPR DTB_IAP DTB_IA No virtual MTU instructions in 1 2 3 No HW_MTPR DTB_IS in 0 1 2 No HW_REI in 0 1 2 Y Y Y HW_MTPR DTB_IA No HW_MFPR DTB_PTE in 1 Y HW_MTPR MAF_MODE No MTU instructions in 1...

Page 221: ...een HW_ MFPR DC_TEST_TAG and HW_MFPR DC_TEST_ TAG_TEMP HW_MFPR DTB_PTE No MTU instructions in 0 1 No HW_MTPR DC_TEST_CTL DC_TEST_TAG in 0 1 No HW_MFPR DTB_PTE_TEMP issued or slotted in 1 2 3 No HW_MFPR DTB_PTE in 1 No virtual MTU instructions in 0 1 2 Y Y Y Y HW_MFPR VA Must be done in ARITH MACHINE CHECK DTBMISS_SINGLE UNALIGN DFAULT traps and ITBMISS flow after the VPTE load Table 5 31 PALcode R...

Page 223: ...rchitecture library code (PALcode) is macrocode that provides an architecturally defined, operating-system-specific programming interface that is common across all Alpha microprocessors. The actual implementation of PALcode differs for each operating system. PALcode runs with privileges enabled, instruction stream mapping disabled, and interrupts disabled. PALcode has privilege to use five special opcode...

Page 224: ...s is the sequence that returns from an exception or interrupt. There are some instructions that are necessary for backward compatibility or ease of programming; however, these are not used often enough to dedicate them to hardware, or are so complex that they would jeopardize the overall performance of the computer. For example, an instruction that does a VAX-style interlocked memory access might be ...

Page 225: ... invoking the normal memory management routines HW_LD HW_ST Return from an exception or interrupt HW_REI When executing in PALmode there are certain restrictions for using the privileged instructions because PALmode gives the programmer complete access to many of the internal details of the 21164PC Refer to Section 6 6 for information on these special PALmode instructions Caution It is possible to...

Page 226: ... of PALcode. When the 21164PC is reset, it enters PALmode and executes the RESET PALcode. The system will remain in PALmode until a HW_REI instruction is executed and EXC_ADDR<00> is cleared. It then continues execution in non-PALmode (native mode), as just described. It is during this initial RESET PALcode execution that the rest of the low-level system initialization is performed, including any modifica...

Page 227: ...on immediately following the CALL_PAL is loaded into EXC_ADDR and is pushed onto the return prediction stack The IDU contains special hardware to minimize the number of cycles in the TRAPB at the start of a CALL_PAL Software can benefit from this by scheduling CALL_PALs such that they do not fall in the shadow of IMUL Any floating point operate especially FDIV Each CALL_PAL instruction includes a ...

Page 228: ... priority Prioritization among the Dstream traps works because DTBMISS is suppressed when there is a sign check error The priority of ITBMISS and interrupt is reversed if there is an Icache miss Number of Cycles Description 1 Minimum TRAPB for empty pipe Typically this will be four cycles 1 Issue the CALL_PAL instruction 2 The minimum length of a PAL flow However in most cases there will be more t...

Page 229: ...lists the opcodes reserved by the Alpha architecture for implementation specific use These opcodes are privileged and are only available in PALmode MCHK 0400 Uncorrected hardware error OPCDEC 0480 Illegal opcode ARITH 0500 Arithmetic exception FEN 0580 Floating point operation attempted with Table 6 2 Required PALcode Function Codes Mnemonic Type Function Code DRAINA Privileged 00 0002 HALT Privil...

Page 230: ...s Note: Explicit software timing is required for accessing the hardware-specific IPRs and the PAL_TEMP registers. These constraints are described in Table 5–31. 6.6.1 HW_LD Instruction: PALcode uses the HW_LD instruction to access memory outside of the realm of normal Alpha memory management and to do special forms of Dstream loads. Figure 6–1 and Table 6–4 describe the format and fields of the HW_LD ...

Page 231: ... inhibited ALT 0 1 Memory management checks use MTU IPR DTB_CM for access checks Memory management checks use MTU IPR ALT_MODE for access checks WRTCK 0 1 Memory management checks fault on read FOR and read access violations Memory management checks FOR fault on write FOW read and write access violations QUAD 0 1 Length is longword Length is quadword VPTE 1 Flags a virtual PTE fetch Used by trap l...

Page 232: ...ormat Description Field Value Description OPCODE 1F16 The OPCODE field contains 1F16 RA Write data register number RB Base register for memory address PHYS 0 1 The effective address for the HW_ST is virtual The effective address for the HW_ST is physical Translation and memory management access checks are inhibited ALT 0 1 Memory management checks use MTU IPR DTB_CM for access checks Memory manage...

Page 233: ...that is normally used. Stall prefetch: This encoding of HW_REI inhibits Istream fetch until the HW_REI itself is issued. Thus, this is the method used to synchronize IDU changes (such as ITB write instructions) with the HW_REI. There is a rule that PALcode can have only one such HW_REI in an aligned block of four instructions. Figure 6–3 and Table 6–6 describe the format and fields of the HW_REI instruc...

Page 234: ...cycles IDU hardware slots each type of MXPR to the correct IEU pipe refer to Table 5 1 Figure 6 4 and Table 6 7 describe the format and fields of the HW_MFPR and HW_MTPR instructions Figure 6 4 HW_MFPR and HW_MTPR Instruction Format Table 6 7 HW_MFPR and HW_MTPR Format Description Field Value Description OPCODE 1916 1D16 The OPCODE field contains 1916 for HW_MFPR The OPCODE field contains 1D16 for...

Page 235: ...ion; External interface initialization; Internal processor register (IPR) reset state; Timeout reset; IEEE 1149.1 test port reset. 7.1 Input Signals sys_reset_l and dc_ok_h and Booting: The 21164PC reset sequence uses two input signals: sys_reset_l and dc_ok_h. When transitioning from a powered-down state to a powered-up state, signal dc_ok_h must be deasserted and signal sys_reset_l must be asserted until p...

Page 236: ...ch misses in the Icache and produces an offchip read command. The external system implementation must be compatible with the default configuration of the 21164PC after reset (refer to Section 7.8). The code that is executed at this point should complete the 21164PC configuration as necessary. 4. After configuring the 21164PC, control can be transferred to code anywhere in memory, including the noncacheab...

Page 237: ...lock output osc_clk_in_h l Must be clocking st_clk1_h Deasserted st_clk2_h Deasserted st_clk3_h Deasserted sys_clk_out1_h Clock output sys_clk_out2_h Clock output sys_reset_l NA input Bcache data_h 127 0 Tristated data_adsc_l Deasserted data_adv_l Deasserted data_ram_oe_l Deasserted data_ram_we_l 3 0 Deasserted index_h 21 4 Unspecified lw_parity_h 3 0 Tristated tag_data_h 32 19 Tristated tag_data_...

Page 238: ...most recent sysclk edge If driven the command is NOP dack_h Must be deasserted data_bus_req_h NA input fill_h Must be deasserted fill_dirty_h NA input fill_error_h Must be deasserted fill_id_h Must be deasserted idle_bc_h Must be deasserted int4_valid_h 3 0 Unspecified victim_pending_h Unspecified Interrupts irq_h 3 0 sysclk divisor ratio input mch_hlt_irq_h sysclk delay input pwr_fail_irq_h syscl...

Page 239: ... chip to chip. The sysclk divisor and sys_clk_out2_h delay are determined by input pins while signal sys_reset_l remains asserted. Refer to Section 4.2.2 and Section 4.2.3 for ratio and delay values. 7.1.1 Pin State with dc_ok_h Not Asserted: While dc_ok_h is deasserted and sys_reset_l is asserted, every output and bidirectional 21164PC pin is tristated and pulled weakly to ground by a small pull down...

Page 240: ...omatically tested and the result is made available in the ICSR IPR and on signal test_status_h<1>. Internally, the CPU reset continues to be asserted throughout the BiSt process. For additional information, refer to Section 9.4.4.1. 7.4 Serial Read-Only Memory Interface Port: The serial read-only memory (SROM) interface provides the initialization data load path from a system SROM to the instruction cache...

Page 241: ...he cannot be loaded serially. The tag valid bits for this bank should reflect this. The automatic serial Icache fill invoked by the chip reset sequence operates internally at a frequency of 126 CPU clock period. However, due to the synchronization with the system clocks, consecutive access cycles to SROM may shrink or stretch by a system cycle. For example, for a system with a system clock ratio of 15 ...

Page 242: ...ter the SROM data is loaded into the Icache, the three SROM interface signals can be used as a software UART, and the pins become parallel I/O pins that can drive a diagnostic terminal by using an interface such as RS-232 or RS-423. 7.6 Cache Initialization: Regardless of whether the Icache BiSt is executed, the Icache is flushed during the reset sequence prior to the SROM load. If the SROM load is bypa...

Page 243: ...instructions, if necessary. 7.6.2 Flushing Dirty Blocks: During a power failure recovery, dirty blocks must be flushed out of the backup cache (Bcache). To flush out dirty blocks from the Bcache on power failure, the following sequence must be used to guarantee that all the dirty blocks have been written back to main memory: Perform loads at a stride of Bcache block size, 2 × size of the Bcache. 7.7 External I...
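
The flush sequence on page 243 (loads at a stride of one Bcache block) can be sketched as an address generator. This is only an illustration under stated assumptions: the snippet is read as covering a region twice the Bcache size, and the base address, cache size, and block size below are hypothetical values, not figures taken from the manual. Real flush code would be PALcode or operating system code issuing actual loads.

```python
def flush_addresses(base: int, bcache_bytes: int, block_bytes: int,
                    span_factor: int = 2) -> range:
    """Addresses to load so that every Bcache index is touched.

    Assumption: the region covered is span_factor * bcache_bytes,
    walked at a stride of one Bcache block.
    """
    return range(base, base + span_factor * bcache_bytes, block_bytes)


# Hypothetical configuration: 1 MB Bcache, 64-byte blocks, region at address 0.
addrs = flush_addresses(base=0, bcache_bytes=1 << 20, block_bytes=64)
print(len(addrs), hex(addrs[0]), hex(addrs[-1]))
```

Each generated address would be the target of one load; the loads displace the dirty blocks, forcing them to be written back.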

Page 244: ...et State Sheet 1 of 3 IPR Reset State Comments IDU Registers ITB_TAG UNDEFINED ITB_PTE UNDEFINED ITB_ASN UNDEFINED PALcode must initialize ITB_PTE_TEMP UNDEFINED ITB_IAP UNDEFINED ITB_IA UNDEFINED PALcode must initialize ITB_IS UNDEFINED IFAULT_VA_FORM UNDEFINED IVPTBR UNDEFINED PALcode must initialize ICPERR_STAT UNDEFINED PALcode must initialize IC_FLUSH_CTL UNDEFINED EXC_ADDR UNDEFINED EXC_SUM ...

Page 245: ... MTU Registers DTB_ASN UNDEFINED PALcode must initialize DTB_CM UNDEFINED PALcode must initialize DTB_TAG Cleared Valid bits are cleared on chip reset but not on timeout reset DTB_PTE UNDEFINED DTB_PTE_TEMP UNDEFINED MM_STAT UNDEFINED Must be unlocked by PALcode by reading VA register VA UNDEFINED Must be unlocked by PALcode by reading VA register VA_FORM UNDEFINED Must be unlocked by PALcode by r...

Page 246: ...red on chip reset but not on timeout reset DC_MODE Cleared Cleared on chip reset but not on timeout reset MAF_MODE Cleared Cleared on chip reset MAF_MODE 05 cleared on timeout reset DC_FLUSH UNDEFINED PALcode must write this register to clear Dcache valid bits ALT_MODE UNDEFINED CC UNDEFINED CC is disabled on chip reset CC_CTL UNDEFINED DC_TEST_CTL 15 cleared Cleared on chip reset but not on timeo...

Page 247: ...149.1 Test Port Reset. 7.10 IEEE 1149.1 Test Port Reset: Signal trst_l must be asserted when sys_reset_l is asserted or when dc_ok_h is deasserted. Continuous trst_l assertion during normal operation is used to guarantee that the IEEE 1149.1 test port does not affect 21164PC operation ...

Page 249: ...r interrupts. Where possible, the address of affected data is latched in an onchip IPR. Most of the Istream errors can be retried by the operating system because the machine check occurs before any part of the instruction causing the error is executed. In some other cases, the system may be able to recover from an error by terminating all processes that had access to the affected memory location. 8.1 Er...

Page 250: ...ame cycle, the SEO bit is not set, but more than one error bit will be set. VA: Contains the virtual address of the quadword with the error. MM_STAT locked: Contents contain information about instruction causing parity error. Note: Fault information on another instruction in same cycle may be lost. 8.1.3 Dcache Tag Parity Error: Machine check occurs. Machine state may have changed. DCPERR_STAT: TP0 or TP1 is ...

Page 251: ...DR: Contains the physical address bits <39:04> of the octaword associated with the error. Note: If the Istream parity error occurs early in the PALcode routine at the machine check entry point, an infinite loop may result. Recommendation: On data parity errors, it may be feasible for the operating system to flush the block of data out of the Bcache by requesting a block of data with the same Bcache index b...

Page 252: ...e hit is determined based on the tag alone, not the parity bit. The victim is processed according to the status bits in the tag, ignoring the control field parity. PALcode can distinguish fatal from nonfatal occurrences by checking for the case in which a potentially dirty block is replaced without the victim being properly written back, and the case of false hit when the tag parity is incorrect. 8.1.7 ...

Page 253: ...n is asserted for one cycle and the normal fill sequence involving the fill_h, fill_id_h, and dack_h pins is generated by the system environment. A fill_error_h assertion forces a PALcode trap to the MCHK entry point but has no other effect. Note: No internal status is saved to show that this happened. If necessary, systems must save this status and include read operations of the appropriate status regi...

Page 254: ...s to fill the refill buffer with new data (32 instructions). Then flush the Icache again. Read EXC_ADDR. If EXC_ADDR PAL, then halt. Issue MB to clear out MTU/CBU before reading CBU registers or issuing DC_FLUSH. Flush Dcache to remove bad data on Dstream errors. Read ICSR. Read ICPERR_STAT. Read DCPERR_STAT. Read CBOX_ADDR. Use an MB instruction to ensure that read operations of CBOX_ADDR occur before subsequ...

Page 255: ... DATA_PAR_ERR ICPERR_STAT TPE ICPERR_STAT DPE Unlock the following IPRs ICPERR_STAT write 0x1800 DCPERR_STAT write 0x03 VA and CBOX_STATUS are already unlocked Check for arithmetic exceptions Read EXC_SUM Check for arithmetic errors and handle according to operating system spe cific requirements Clear EXC_SUM unlocks EXC_MASK Report the processor uncorrectable MCHK according to operating system sp...

Page 257: ... dc characteristics; Clocking scheme; ac characteristics; Power supply considerations. 9.1 Electrical Characteristics: Table 9–1 lists the maximum ratings for the 21164PC and Table 9–2 lists the operating voltages. Table 9–1 21164PC Absolute Maximum Ratings (Sheet 1 of 2). Characteristics / Ratings. Storage temperature: -55°C to 125°C (-67°F to 257°F). Junction temperature: 15°C to 85°C (59°F to 185°F). Supply voltage...

Page 258: ...inary CMOS inputs with standard TTL levels see Table 9 3 See Section 9 3 1 for a description of an exception osc_clk_in_h l After power has been applied input and bidirectional pins can be driven to a maxi mum dc voltage of Vclamp at a maximum current of Iclamp without harming the 21164PC Refer to Table 9 3 for Vclamp and Iclamp values Inputs greater than Signal input or output applied 0 5 V to 4 ...

Page 259: ...are ordinary 3 3 V CMOS outputs Table 9 3 shows the CMOS dc input and output pins Table 9 3 CMOS DC Input Output Characteristics Sheet 1 of 2 Parameter Requirements Symbol Description Min Max Units Test Conditions Vih High level input voltage 2 0 V Vil Low level input voltage 0 8 V Voh High level output voltage 2 4 V Ioh 6 0 mA Vol Low level output voltage 0 4 V Iol 6 0 mA Iil_pd Input with pull d...

Page 260: ... Iclamp 100 mA Idd Peak power supply current for Vdd power supply 1 01 A Vdd 3 465 V Frequency 400 MHz Idd Peak power supply current for Vdd power supply 1 01 A Vdd 3 465 V Frequency 466 MHz Idd Peak power supply current for Vdd power supply 1 31 A Vdd 3 465 V Frequency 533 MHz Iddi Peak power supply current for Vddi power supply 11 25 A Vddi 2 6 V Frequency 400 MHz Iddi Peak power supply current ...

Page 261: ...ected ratio of the internal clock frequency. There is a small clock skew between the internal clock and sys_clk_out1_h. Refer to Section 4.2 for more information on clock functions. 9.3.1 Input Clocks: The differential input clocks osc_clk_in_h,l provide the time base for the chip when dc_ok_h is asserted. These pins are self-biasing and must be capacitively coupled to the clock source on the module. No...

Page 262: ...d to approximate a 50-Ω termination for the purpose of impedance matching for those systems that drive input clocks across long traces. The clock input pins appear as a 50-Ω series termination resistor connected to a high-impedance voltage source. The voltage source produces a nominal voltage value of Vdd/2. The source has an impedance of between 130 Ω and 600 Ω. This voltage is called the self-bias ...

Page 263: ...clocks may be driven by testers. In any case, the oscillator should be ac-coupled to the osc_clk_in_h,l inputs by 47-pF through 220-pF capacitors. Figure 9–2 shows a plot of the simulated impedance versus the clock input frequency. Figure 9–1 is a simplified circuit of the complex model used to create Figure 9–2. Figure 9–2 Impedance vs Clock Input Frequency: Differential Impedance osc_clk_in_h to osc_...

Page 264: ...ns is not attenuated below the 600-mV peak-to-peak lower limit. For sine waves or oscillators producing nearly sinusoidal (pseudo square wave) outputs, 220 pF is recommended at 433 MHz. A high-quality dielectric such as NPO is required to avoid dielectric losses. Table 9–4 shows the input clock specification. 9.4 AC Characteristics: This section describes the ac timing specifications for the 21164PC. 9.4.1...

Page 265: ...inted circuit board (PCB) etch has a characteristic impedance of approximately 75 Ω. This may vary from 60 Ω to 90 Ω with tolerances. If the line is driven in the electrical center, the load could be as low as 30 Ω. Therefore, a characteristic impedance range of 30 Ω to 90 Ω could be experienced. The 21164PC output drivers are designed with typical printed circuit board applications in mind rather than ...

Page 266: ... support an offchip backup cache (Bcache). Private Bcache read or write transactions initiated by the 21164PC are independent of the system clocking scheme. Bcache loop timing must be an integer multiple of the 21164PC cycle time. Table 9–5 lists the Bcache loop timing. (1) The value 0.2 ns accounts for onchip driver and clock skew. (3) For private Bcache write operations, 21164PC drives data_h<127:0> coincid...

Page 267: ...3 0 st_clk1_h st_clk2_h st_clk3_h data_adsc_l data_adv_l If any of the previous pins are connected to lightly loaded lines less than 40 pF additional drive should not be enabled or the lines should be properly terminated to avoid transmission line ringing 1 NA Not applicable Table 9 6 Normal Output Driver Characteristics Specification 40 pF Load 10 pF Load Name Maximum driver delay 2 7 ns 1 4 ns T...

Page 268: ...etermine the maximum capacitance that can be safely driven by each pin. For normal output drivers: Cmax (in pF) = 5t, where t is the waveform period (measured from rising to rising or falling to falling edge) in nanoseconds. For big output drivers: Cmax (in pF) = 7t, where t is the waveform period (measured from rising to rising or falling to falling edge) in nanoseconds. For example, if the waveform appearing on a ...
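
The two capacitance rules on page 268 can be evaluated with a small helper. This is a minimal sketch that assumes the snippet reads Cmax = 5t for normal drivers and Cmax = 7t for big drivers, with t in nanoseconds and Cmax in picofarads. The function name and the sample period are illustrative, not taken from the manual.

```python
def max_load_capacitance_pf(period_ns: float, big_driver: bool = False) -> float:
    """Maximum capacitance (pF) a pin can safely drive.

    period_ns is the waveform period, measured rising-to-rising or
    falling-to-falling edge. Assumed rule: 5t (normal), 7t (big drivers).
    """
    factor = 7.0 if big_driver else 5.0
    return factor * period_ns


# Illustrative period only: a waveform that repeats every 10 ns.
print(max_load_capacitance_pf(10.0))
print(max_load_capacitance_pf(10.0, big_driver=True))
```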

Page 269: ...n For all private write transactions data is driven coincident with Tcycle 0 cpu_clk the driving of index_h 21 4 Table 9 8 21164PC System Clock Output Timing sysclk Tø Signal Specification Value Name sys_clk_out1_h Output delay Tdd Tsysd sys_clk_out1_h Minimum output delay Tmdd Tsysdm data_bus_ req_h data_h 127 0 addr_h 39 4 Input setup 1 1 ns Tdsu data_bus_ req_h data_h 127 0 addr_h 39 4 Input ho...

Page 270: ...ock sys_clk_out1 Relationship of CPU Clock and sys_clk_out1 LJ 03410 AI4 CPU Clock Address Command Out dack Memory Read Pipe_Latch Mode Tsysd sys_clk_out1 Data In CPU Clock Address Command Out dack Memory Read Non Pipe_Latch Mode sys_clk_out1 Data In Taod Tdsu Taoh Tsysd Tsysd Tsysd Taod Tntacksu Tdsu Taoh Tsysd Tsysd Tsysd Tntackh Ttacksu cack Tntcacksu ...

Page 271: ...nds 1 These signals can also be used synchronously clk_mode_h 1 0 dc_ok_h sys_reset_l irq_h 3 0 1 mch_hlt_irq_h1 pwr_fail_irq_h1 sys_mch_chk_irq_h1 Table 9 9 Input Timing for sys_clk_out Based Systems Signal Specification Value Name fill_h fill_error_h fill_id_h idle_bc_h irq_h 3 0 mch_hlt_irq_h pwr_fail_irq_h sys_mch_chk_irq_h Testability pins port_mode_h srom_data_h srom_present_l Input setup 1 ...

Page 272: ...ut delay Tdd 0 2 ns Taod addr_res_h int4_valid_h 1 srom_clk_h srom_oe_l victim_pending_h Output hold Tmdd Taoh int4_valid_h2 Output delay Tdd Tcycle 0 2 ns Tdod int4_valid_h2 Output hold Tmdd Tcycle Tdoh Bidirectional Signals Input mode cmd_h lw_parity_h 1 tag_dirty_h3 Input setup 1 1 ns Tdsu cmd_h lw_parity_h 1 tag_dirty_h3 Input hold 0 ns Tdh Output mode cmd_h tag_dirty_h 4 tag_valid_h4 Output d...

Page 273: ...g the external reset signal sys_reset_l Figure 9 6 shows the timing between various events relevant to BiSt operations 1 The value 0 2 ns accounts for onchip driver and clock skew 2 For big drive enabled or big drive disabled respectively See Table 9 7 Table 9 11 Bcache Control Signal Timing Signal Specification Value Name Input mode tag_data_h tag_data_par_h tag_valid_h Input setup 1 1 ns Tdsu ta...

Page 274: ...onnects to the beginning of the timeline shown in Figure 9 7 Table 9 12 and Table 9 13 list timing shown in Figure 9 6 for some of the system clock ratios Time t1 is measured starting from the rising edge of sysclk following the deassertion of the sys_reset_l signal Table 9 12 BiSt Timing for Some System Clock Ratios Port Mode Normal System Cycles System Cycles Sysclk Ratio t1 t2 t3 4 7 28569 3 28...

Page 275: ...ted in Table 9 14 and Table 9 15 Figure 9 7 SROM Load Timing Event Timeline 1 Measured in sysclk cycles where n refers to an additional n CPU cycles Table 9 14 SROM Load Timing for Some System Clock Ratios System Cycles System Cycles1 Sysclk Ratio t1 t2 t3 t4 t5 4 3 48 4209267 4209361 3 4209362 15 3 13 1122472 1122496 14 1122497 Table 9 15 SROM Load Timing for Some System Clock Ratios CPU Cycles C...

Page 276: ..._mode_h<0> signal is used to enable/disable a clock equalizing circuit called a symmetrator. The symmetrator equalizes the duty cycle of the input clock for use onchip. The osc_clk_in_h,l signals must have a duty cycle of at least 60/40 for the symmetrator to work properly. Normal clock mode with the symmetrator enabled is the preferred clocking mode of the 21164PC. 9.4.5.2 Clock Test Reset Mode: When ...

Page 277: ...t conditions at the 21164PC pins and not just at the PCB edge Table 9 16 Clock Test Modes clk_mode_h Mode 1 0 Notes Normal 1 clock mode 0 0 Normal 1 clock mode 0 1 Symmetrator is enabled Clock reset 1 0 Clock reset 1 1 Symmetrator is enabled Table 9 17 IEEE 1149 1 Circuit Performance Specifications Item Specification trst_l is asynchronous Minimum pulse width 4 ns trst_l setup time for deassertion...

Page 278: ...d between Vdd and Vss should be roughly equal to 10 times the amount of capacitive load that 21164PC is required to drive at any one time. This should guarantee a voltage drop of no more than 10% on Vdd during heavy drive conditions. Use capacitors that are as physically small as possible. Connect the capacitors directly to the 21164PC Vdd and Vss pins by short surface etch (0.64 cm [0.25 in] or less). The...

Page 279: ... must not be allowed to exceed Vclamp during the application or removal of power. Refer to Table 9–3 for the value of Vclamp. Note that it is acceptable for the signal voltage either to be held at zero or to follow Vdd during the application or removal of power. Rule 3 means that if the signal voltage follows Vdd, the signal voltage must never be greater than 2.4 V above the value of Vddi. This applies...

Page 280: ...ctional signals are diode clamped to Vdd and Vss. A current greater than Iclamp on an individual pin could damage the 21164PC. Designers must take care that currents greater than Iclamp will not be achieved during power supply sequencing. While currents less than Iclamp will not damage the 21164PC, other source drivers connected to the 21164PC could be damaged by the clamp. Designers must verify that t...

Page 281: ...GRAFOIL pad is the interface material between the package and the heat sink Table 10 1 lists the values for the center of heat sink to ambient Θca for the 413 pin grid array Table 10 2 shows the allowable Ta without exceeding Tc at various airflows Note DIGITAL recommends using the heat sink because it greatly improves the ambient temperature requirement 1 With the heat sink fan performance does n...

Page 282: ...uency 400 MHz Power 26 5 W Vdd 3 3 V and Vddi 2 5 V Ta with heat sink 1 C 26 8 46 6 51 9 54 6 57 2 Ta with heat sink 2 C includes 52 10 mm fan 51 91 Frequency 466 MHz Power 30 5 W Vdd 3 3 V and Vddi 2 5 V Ta with heat sink 1 C 18 0 40 8 46 9 50 0 53 0 Ta with heat sink 2 C includes 52 10 mm fan 46 91 Frequency 533 MHz Power 35 W Vdd 3 3 V and Vddi 2 5 V Ta with heat sink 1 C 34 3 41 3 44 8 48 3 Ta...

Page 283: ... Sink Specifications: Figure 10–1 describes the specifications of heat sink 1. Heat sink 2 has the exact same specifications plus an added 52 × 10 mm fan. Figure 10–1 Heat Sink 1 [figure PCA030; dimension labels: 4.75 cm (1.870 in), 2.16 cm (.850 in), 4.20 cm (1.655 in), 3.18 cm (1.25 in), 3.56 cm (1.40 in)] ...

Page 284: ...CB with the heat sink fins aligned with the airflow direction. Avoid preheating ambient air: Place the 21164PC on the PCB so that inlet air is not preheated by any other PCB components. Do not place other high-power devices in the vicinity of the 21164PC. Do not restrict the airflow across the 21164PC heat sink. Placement of other devices must allow for maximum system airflow in order to maximize the p...

Page 285: ...ackaging Information. This chapter describes the 21164PC mechanical packaging, including chip package physical specifications and a signal pin list. For heat sink dimensions, refer to Chapter 10. 11.1 Mechanical Specifications: Figure 11–1 shows the package physical dimensions without a heat sink ...

Page 286: ...18 in Lid 1 75 mm 069 in 250 in 49 78 mm square 1 960 in 24 89 mm 980 in 24 89 mm 980 in 21 59 mm 850 in 31 75 mm square 1 250 in 1 27 mm typ 050 in 22 86 mm 900 in 02 19 03 05 07 09 11 13 15 17 01 21 23 25 27 29 31 33 35 37 04 06 08 10 12 14 16 18 20 22 24 26 28 30 32 34 36 AN AM AK AH AF AD AB Z X V T R P M K H F D B AL AJ AG AE AC AA Y W U S Q N L J G E C A 22 86 mm 900 in 2 54 mm typ 100 in 4X...

Page 287: ...or a total of 413 pins in the array Table 11 1 Alphabetic Signal Pin List Sheet 1 of 4 Signal PGA Location Signal PGA Location Signal PGA Location addr_h 4 AH12 addr_h 5 AN11 addr_h 6 AJ13 addr_h 7 AL9 addr_h 8 AK8 addr_h 9 AJ11 addr_h 10 AN9 addr_h 11 AJ7 addr_h 12 AJ9 addr_h 13 AL5 addr_h 14 AK6 addr_h 15 AH6 addr_h 16 AG7 addr_h 17 AK4 addr_h 18 AJ3 addr_h 19 AH4 addr_h 20 AJ1 addr_h 21 AF6 add...

Page 288: ...31 data_h 35 U37 data_h 36 V36 data_h 37 V34 data_h 38 V30 data_h 39 W35 data_h 40 V32 data_h 41 X34 data_h 42 W37 data_h 43 W31 data_h 44 Y37 data_h 45 Y35 data_h 46 Z34 data_h 47 X32 data_h 48 Y33 data_h 49 AB36 data_h 50 Y31 data_h 51 AC35 data_h 52 AA31 data_h 53 Z32 data_h 54 AD34 data_h 55 AE35 data_h 56 AA37 data_h 57 AB34 data_h 58 AG37 data_h 59 AB32 data_h 60 AG35 data_h 61 AH36 data_h 6...

Page 289: ...B4 data_h 114 AB2 data_h 115 AC7 data_h 116 AD6 data_h 117 AB6 data_h 118 AC3 data_h 119 AE5 data_h 120 AD4 data_h 121 AE1 data_h 122 AF4 data_h 123 AG3 data_h 124 AE3 data_h 125 AE7 data_h 126 AG1 data_h 127 AH2 data_adsc_l D32 data_adv_l C33 data_bus_req_h E21 data_ram_oe_l D18 data_ram_we_l 0 B18 data_ram_we_l 1 A19 data_ram_we_l 2 B20 data_ram_we_l 3 D20 dc_ok_h AL23 fill_h D4 fill_dirty_h AL1...

Page 290: ... AJ17 srom_present_l AK18 st_clk1_h E7 st_clk2_h E31 st_clk3_h G31 sys_clk_out1_h AK22 sys_clk_out2_h AJ23 sys_mch_chk_ irq_h AN25 sys_reset_l AN23 tag_data_h 19 A11 tag_data_h 20 D6 tag_data_h 21 E9 tag_data_h 22 D8 tag_data_h 23 C7 tag_data_h 24 F12 tag_data_h 25 B6 tag_data_h 26 E11 tag_data_h 27 C9 tag_data_h 28 A9 tag_data_h 29 C13 tag_data_h 30 C11 tag_data_h 31 E15 tag_data_h 32 A15 tag_dat...

Page 291: ...F28 H2 H36 K8 K30 L5 L33 P2 P8 P30 P36 S5 S33 W5 W33 Z2 Z8 Z30 Z36 AC5 AC33 AD8 AD30 AF2 AF36 AH10 AH16 AH22 AH28 AJ5 AJ33 AK12 AK20 AK26 AL1 AL19 AL21 AL37 AM2 AM8 AM14 AM24 AM30 AM36 AN3 AN5 AN7 AN21 AN31 AN33 AN35 Vdd Metal plane 7 B4 B10 B16 B22 B28 B34 C3 C35 D2 D36 F8 F14 F24 F30 K2 K36 M8 M30 R2 R8 R30 R36 X2 X8 X30 X36 AB8 AB30 AD2 AD36 AH8 AH14 AH24 AH30 AK2 AK36 AL3 AL35 AM4 AM6 AM10 AM1...

Page 292: ...re 11 2 shows the 21164PC pinout from the top view with pins facing down Figure 11 2 21164PC Top View Pin Down PCA028 AN AL AJ AG AE AC AA Y W U S Q N L J G E C A 21164PC Top View Pin Down AM AK AH AF AD AB Z X V T R P M K H F D B 37 35 33 19 03 05 07 09 11 13 15 17 01 21 23 25 27 29 31 36 34 32 30 28 26 24 22 20 18 16 14 12 10 08 06 04 02 ...

Page 293: ... the 21164PC pinout from the bottom view with pins facing up Figure 11 3 21164PC Bottom View Pin Up PCA029 02 19 03 05 07 09 11 13 15 17 01 21 23 25 27 29 31 33 35 37 04 06 08 10 12 14 16 18 20 22 24 26 28 30 32 34 36 AN AM AK AH AF AD AB Z X V T R P M K H F D B AL AJ AG AE AC AA Y W U S Q N L J G E C A Bottom View Pin Up 21164PC ...

Page 295: ...ins and their functions Table 12 1 21164PC Test Port Pins Pin Name Type Function port_mode_h 1 I Must be false port_mode_h 0 I Must be false srom_present_l I Tied low if serial ROMs SROMs are present in system srom_data_h Rx I Receives SROM or serial terminal data srom_clk_h Tx O Supplies clock to SROMs or transmits serial terminal data srom_oe_l O SROM enable tdi_h I IEEE 1149 1 TDI port tdo_h O ...

Page 296: ...test. The port also allows access to factory manufacturing features not described in this document. The port is compliant with most requirements of IEEE 1149.1 test access port. Compliance Enable Inputs: Table 12–2 shows the compliance enable inputs and the pattern that must be driven to those inputs in order to activate the 21164PC IEEE 1149.1 circuits. Exceptions to Compliance: The 21164PC is complian...

Page 297: ... at pin dc_ok_h. This cell captures the output of a clock sniffer circuit. It captures a 1 when the oscillator is connected and captures a 0 if the chip's oscillator connections are broken. This exception to the standard is made to permit a meaningful test of the oscillator input pins. Refer to IEEE Standard 1149.1-1993, A Test Access Port and Boundary Scan Architecture, for a full description of the...

Page 298: ...achine Instruction Register The 5 bit wide instruction register IR supports IEEE 1149 1 mandated public instructions EXTEST SAMPLE BYPASS HIGHZ and a number of optional instructions for public and private factory use Table 12 3 summarizes the public instructions and their functions Test Logic Reset Run Test Idle Select IR Scan 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 Select ...

Page 299: ...rt chip Boundary Scan Register The 261 bit boundary scan register is accessed during SAMPLE EXTEST and CLAMP instructions Refer to Section 12 3 for the organization of this register Table 12 3 Instruction Register IR 4 0 Name Selected Scan Register Operation 00000 EXTEST BSR BSR drives pins Interconnect test mode 00010 SAMPLE PRELOAD BSR Preloads BSR 00010 Private BSR Private 00011 Private BSR Pri...

Page 300: ...gister. The 21164PC boundary-scan register (BSR) is 261 bits long. Table 12–4 provides the boundary-scan register organization. The BSR is connected between the tdi_h and tdo_h pins whenever an instruction selects it (Table 12–3). The scan register runs clockwise, beginning at the upper-left corner of the chip. There are six groups of bidirectional pins, each group controlled from a group control cell. Loadi...

Page 301: ...I None tck_h I None tms_h I None tdo_h O None tdi_h I None srom_oe_l O 250 io_bcell srom_clk_h O 249 io_bcell srom_data_h I 248 in_bcell srom_present_l I 247 in_bcell port_mode_h 0 1 I None Compliance enable pins clk_mode_h 0 I 246 in_bcell osc_clk_in_h l I None Analog pins clk_mode_h 1 I 245 in_bcell sys_clk_out1_h O 244 io_bcell sys_clk_out2_h O 243 io_bcell sys_reset_l I 242 in_bcell dc_ok_h I ...

Page 302: ...2 io_bcell gr_2 data_h 63 0 B 211 148 io_bcell gr_2 lw_parity_h 0 1 B 147 146 io_bcell gr_2 int4_valid_h 1 0 O 145 144 io_bcell addr_res_h 1 0 O 143 142 io_bcell st_clk3_h O 141 io_bcell data_adsc_l O 140 io_bcell data_adv_l O 139 io_bcell st_clk2_h O 138 io_bcell Lower right corner index_h 21 4 O 137 120 io_bcell data_bus_req_h I 119 in_bcell dack_h I 118 in_bcell addr_bus_req_h I 117 in_bcell da...

Page 303: ...o_bcell gr_3 st_clk1_h O 89 io_bcell Lower left corner idle_bc_h I 88 in_bcell fill_error_h I 87 in_bcell fill_id_h I 86 in_bcell fill_h I 85 in_bcell TTAG1 Control 84 io_bcell gr_4 cmd_h 0 3 B 83 80 io_bcell gr_4 int4_valid_h 2 3 O 79 78 io_bcell TTAG2 Control 77 io_bcell gr_5 lw_parity_h 3 2 B 76 75 io_bcell gr_5 data_h 64 127 B 74 11 io_bcell gr_5 TR_DDL Control 10 io_bcell gr_6 addr_h 21 12 B ...

Page 305: ...ng. Branch (Bra): oo, where oo is the 6-bit opcode field. Floating-point (F-P): oo.fff, where oo is the 6-bit opcode field and fff is the 11-bit function code field. Memory (Mem): oo, where oo is the 6-bit opcode field. Memory/function code (Mfc): oo.ffff, where oo is the 6-bit opcode field and ffff is the 16-bit function code in the displacement field. Memory/branch (Mbr): oo.h, where oo is the 6-bit opcode field and h is the high-order 2 bits of the displacemen...
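
The format notation on page 305 (oo, oo.fff, oo.ffff, oo.h) maps onto fixed bit fields of a 32-bit instruction word. The sketch below assumes the standard Alpha field positions, which come from the Alpha architecture rather than from this snippet: opcode in bits <31:26>, floating-point function in bits <15:5>, and the displacement field in bits <15:0>. The function and key names are hypothetical.

```python
def decode_fields(word: int) -> dict:
    """Split a 32-bit Alpha instruction word into candidate fields.

    Which field is meaningful depends on the instruction format; this
    helper extracts all of them and leaves the choice to the caller.
    """
    return {
        "opcode": (word >> 26) & 0x3F,       # oo: 6-bit opcode, bits <31:26>
        "fp_function": (word >> 5) & 0x7FF,  # fff: 11-bit function, bits <15:5>
        "mfc_function": word & 0xFFFF,       # ffff: 16-bit displacement field
        "mbr_hint": (word >> 14) & 0x3,      # h: high-order 2 displacement bits
    }


# Encodings built from the tables that follow: ADDT is F-P 16.0A0, MB is Mfc 18.4000.
addt = (0x16 << 26) | (0x0A0 << 5)
mb = (0x18 << 26) | 0x4000
print(hex(decode_fields(addt)["fp_function"]), hex(decode_fields(mb)["mfc_function"]))
```

Register fields are left at zero in these sample words, so they are not complete, assembler-accurate encodings.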

Page 306: ... 10.40 Add longword; ADDQ Opr 10.20 Add quadword; ADDQ/V Opr 10.60 Add quadword; ADDS F-P 16.080 Add S_floating; ADDT F-P 16.0A0 Add T_floating; AMASK Opr 11.61 Determine byte/word instruction implementation; AND Opr 11.00 Logical product; BEQ Bra 39 Branch if = zero; BGE Bra 3E Branch if ≥ zero; BGT Bra 3F Branch if > zero; BIC Opr 11.08 Bit clear; BIS Opr 11.20 Logical sum; BLBC Bra 38 Branch if low bit clear; BLB...

Page 307: ...PLE Opr 10 6D Compare signed quadword less than or equal CMPLT Opr 10 4D Compare signed quadword less than CMPTEQ F P 16 0A5 Compare T_floating equal CMPTLE F P 16 0A7 Compare T_floating less than or equal CMPTLT F P 16 0A6 Compare T_floating less than CMPTUN F P 16 0A4 Compare T_floating unordered CMPULE Opr 10 3D Compare unsigned quadword less than or equal CMPULT Opr 10 1D Compare unsigned quad...

Page 308: ...TST F P 16 2AC Convert S_floating to T_floating CVTTQ F P 16 0AF Convert T_floating to quadword CVTTS F P 16 0AC Convert T_floating to S_floating DIVF F P 15 083 Divide F_floating DIVG F P 15 0A3 Divide G_floating DIVS F P 16 083 Divide S_floating DIVT F P 16 0A3 Divide T_floating EQV Opr 11 48 Logical equivalence EXCB Mfc 18 0400 Exception barrier EXTBL Opr 12 06 Extract byte low EXTLH Opr 12 6A ...

Page 309: ...f zero FCMOVNE F P 17 02B FCMOVE if zero FETCH Mfc 18 80 Prefetch data FETCH_M Mfc 18 A0 Prefetch data modify intent IMPLVER Opr 11 6C Determine CPU type INSBL Opr 12 0B Insert byte low INSLH Opr 12 67 Insert longword high INSLL Opr 12 2B Insert longword low INSQH Opr 12 77 Insert quadword high INSQL Opr 12 3B Insert quadword low INSWH Opr 12 57 Insert word high INSWL Opr 12 1B Insert word low JMP...

Page 310: ...igned byte maximum; MAXSW4 Opr 1C.3F Vector signed word maximum; MAXUB8 Opr 1C.3C Vector unsigned byte maximum; MAXUW4 Opr 1C.3D Vector unsigned word maximum; MB Mfc 18.4000 Memory barrier; MF_FPCR F-P 17.025 Move from floating-point control register; MINSB8 Opr 1C.38 Vector signed byte minimum; MINSW4 Opr 1C.39 Vector signed word minimum; MINUB8 Opr 1C.3A Vector unsigned byte minimum; MINUW4 Opr 1C.3B Vec...

Page 311: ...floating ORNOT Opr 11 28 Logical sum with complement PERR Opr 1C 31 Pixel error PKLB Opr 1C 37 Pack longwords to bytes PKWB Opr 1C 36 Pack words to bytes RC Mfc 18 E0 Read and clear RET Mbr 1A 2 Return from subroutine RPCC Mfc 18 C0 Read process cycle counter RS Mfc 18 F000 Read and set S4ADDL Opr 10 02 Scaled add longword by 4 S4ADDQ Opr 10 22 Scaled add quadword by 4 S4SUBL Opr 10 0B Scaled subt...

Page 312: ...tore S_floating STL Mem 2C Store longword STL_C Mem 2E Store longword conditional STQ Mem 2D Store quadword STQ_C Mem 2F Store quadword conditional STQ_U Mem 0F Store unaligned quadword STT Mem 27 Store T_floating STW Mem 0D Store word SUBF F P 15 081 Subtract F_floating SUBG F P 15 0A1 Subtract G_floating SUBL Opr 10 09 Subtract longword SUBL V 10 49 SUBQ Opr 10 29 Subtract quadword SUBQ V 10 69 ...

Page 313: ...pr 1C 35 Unpack bytes to longwords UNPKBW Opr 1C 34 Unpack bytes to words WMB Mfc 18 44 Write memory barrier XOR Opr 11 40 Logical difference ZAP Opr 12 30 Zero bytes ZAPNOT Opr 12 31 Zero bytes not Table A 3 Opcodes Reserved for DIGITAL Mnemonic Opcode Mnemonic Opcode Mnemonic Opcode OPC01 01 OPC05 05 OPC0B 0B OPC02 02 OPC06 06 OPC0C 0C1 OPC03 03 OPC07 07 OPC0D 0D1 OPC04 04 OPC0A 0A1 OPC0E 0E1 Ta...

Page 314: ...Architecture Mnemonic Function HW_LD 1B PAL1B Performs Dstream load instructions HW_ST 1F PAL1F Performs Dstream store instructions HW_REI 1E PAL1E Returns instruction flow to the program counter (PC) pointed to by the EXC_ADDR internal processor register (IPR) HW_MFPR 19 PAL19 Accesses the IDU MTU and Dcache IPRs HW_MTPR 1D PAL1D Accesses the IDU MTU and Dcache IPRs Table A 5 IEEE Floating Point Ins...

Page 315: ...UI SUIC SUIM SUID ADDS 580 500 540 5C0 780 700 740 7C0 ADDT 5A0 520 560 5E0 7A0 720 760 7E0 CMPTEQ 5A5 CMPTLT 5A6 CMPTLE 5A7 CMPTUN 5A4 CVTQS 7BC 73C 77C 7FC CVTQT 7BE 73E 77E 7F3 CVTTS 5AC 52C 56C 5EC 7AC 72C 76C 7EC DIVS 583 503 543 5C3 783 703 743 7C3 DIVT 5A3 523 563 5E3 7A3 723 763 7E3 MULS 582 502 542 5C2 782 702 742 7C2 MULT 5A2 522 562 5E2 7A2 722 762 7E2 SUBS 581 501 541 5C1 781 701 741 7...

Page 316: ... the hexadecimal value of the 11-bit function code field for the VAX floating point instructions. The opcode for these instructions is 15₁₆. Mnemonic None C V VC SV SVC SVI SVIC CVTTQ 0AF 02F 1AF 12F 5AF 52F 7AF 72F Mnemonic D VD SVD SVID M VM SVM SVIM CVTTQ 0EF 1EF 5EF 7EF 06F 16F 56F 76F Table A 6 VAX Floating Point Instruction Function Codes Sheet 1 of 2 Mnemonic None C U UC S SC SU SUC ADDF 080 ...

Page 317: ...he Offset column. For example, the third row (2/A) under the 10₁₆ column contains the symbol INTS, representing all integer shift instructions. The opcode for those instructions would then be 12₁₆, because the 0 in 10 is replaced by the 2 in the Offset column. Likewise, the third row under the 18₁₆ column contains the symbol JSR, representing all jump instructions. The opcode for those instructions is 1...
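The row and column lookup described above amounts to replacing the low hexadecimal digit of the column heading with the row's offset digit. The following is a minimal sketch in C, assuming each row carries two offset digits (one for columns ending in 0 and one for columns ending in 8, as in the 2/A row quoted above). The function name `opcode_from_table` is hypothetical and does not come from the manual.

```c
#include <stdint.h>

/* Hypothetical helper: combine a column heading (for example 0x10 or 0x18)
 * with a row's pair of offset digits (for example 0x2 and 0xA) to form the
 * opcode. Columns ending in 0 take the first digit; columns ending in 8
 * take the second. This pairing rule is an assumption based on the 2/A row. */
static uint8_t opcode_from_table(uint8_t column, uint8_t off_lo, uint8_t off_hi)
{
    uint8_t digit = (column & 0x08) ? off_hi : off_lo;
    return (uint8_t)((column & 0xF0) | digit);
}
```

With the values quoted above, `opcode_from_table(0x10, 0x2, 0xA)` yields 0x12 (INTS), and `opcode_from_table(0x18, 0x2, 0xA)` yields 0x1A, which is the opcode the predecode listing on later pages tests for jumps.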

Page 318: ...D Res STW mem FLTV op PAL STG mem STQ mem FBNE br BNE br 6 E Res STB mem FLTI op PAL STS mem STL_C mem FBGE br BGE br 7 F Res STQ_U mem FLTL op PAL STT mem STQ_C mem FBGT br BGT br Symbol FLTI FLTL FLTV INTA INTL INTM INTS JSR MISC PAL PAL Res SEXT MVI Meaning IEEE floating point instruction opcodes Floating point operate instruction opcodes VAX floating point instruction opcodes Integer arithmeti...

Page 319: ...upport precise exception handling necessary for complete conformance to the standard is in the Alpha AXP Architecture Reference Manual. The following information is specific to the 21164PC. Invalid operation (INV): The invalid operation trap is always enabled. If the trap occurs, then the destination register is UNPREDICTABLE. This exception is signaled if any VAX architecture operand is nonfinite reser...

Page 320: ...full 64 bits of zero This is done even if the proper IEEE result would have been 0 The exception is signaled if the rounded result is smaller in magnitude than the smallest finite number that can be represented by the destination format If the exception occurs then FPCR UNF is set If the trap is enabled then the trap is signaled to the IDU The 21164PC never produces a denormal number underflow occ...

Page 321: ...n integer overflow occurs if the rounded result is outside the range -2^63 to 2^63-1. In conversions from quadword integer to longword integer, an integer overflow occurs if the result is outside the range -2^31 to 2^31-1. If the exception occurs, then the appropriate bit in the FPCR is set. If the trap is enabled, then the trap is signaled to the IDU. Software completion (SWC): The software completion signal is not re...
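The longword range check described above can be illustrated in C. This is only a sketch of the arithmetic condition (a result outside -2^31 to 2^31-1), not a model of how the 21164PC sets FPCR bits or signals traps. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when a quadword value cannot be represented as a longword,
 * that is, when it lies outside the range -2^31 .. 2^31 - 1. */
static bool quad_to_long_overflows(int64_t q)
{
    return q < INT32_MIN || q > INT32_MAX;
}
```

For example, 2147483647 fits in a longword, while 2147483648 does not.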

Page 322: ......

Page 323: ...lion Die size 8.65 × 16.28 mm Package 413-pin IPGA (interstitial pin grid array) Number of signal pins 264 Typical worst-case power (Vdd = 3.3 V, Vddi = 2.5 V): 24 W int and 2.5 W ext at 2.50 ns cycle time (400 MHz); 32 W int and 3.0 W ext at 1.87 ns cycle time (533 MHz) Power supply 3.3 V dc, 2.5 V dc Clocking input One times the internal clock speed Virtual address size 43 bits Physical address size 33 bits Page size 8...

Page 324: ...y associative not last used replacement 8K pages 128 ASNs MAX_ASN 127 full granularity hint support Onchip instruction translation buffer 48 entry fully associative not last used replacement 128 ASNs MAX_ASN 127 full granularity hint support Floating point unit Onchip FPU supports both IEEE and DIGITAL floating point Bus Separate data and address bus 128 bit 64 bit data bus Serial ROM interface Al...

Page 325: ...dd in 54 in main fillmap 0 127 maps data 127 0 etc fillmap n is bit position in output vector bit 0 of this vector is first in bit 201 is last int dfillmap 128 data 0 127 fillmap 0 127 44 46 48 50 52 54 56 58 0 7 60 62 64 66 68 70 72 74 8 15 76 78 80 82 84 86 88 90 16 23 92 94 96 98 100 102 104 106 24 31 45 47 49 51 53 55 57 59 32 39 61 63 65 67 69 71 73 75 40 47 77 79 81 83 85 87 89 91 48 55 93 9...

Page 326: ... 118 int tagfillmap 29 tag bits 13 42 tagfillmap 0 29 29 28 27 26 25 24 23 22 21 13 22 20 19 18 17 16 15 14 13 12 11 23 32 10 9 8 7 6 5 4 3 2 1 33 42 int asnfillmap 7 asn 0 6 asnfillmap 0 6 37 36 35 34 33 32 31 0 6 int asmfillmap asm asmfillmap 30 int tagphysfillmap tagphysical address tagphysfillmap 38 int tagvalfillmap 4 tag valid bits 0 3 tagvalfillmap 42 41 40 39 0 3 int tagparfillmap tag pari...

Page 327: ... charptr int chksum int instr 4 outvector DATA_BYTES_PER_REC 4 strcpy filename loadfile dxe default file names strcpy ofilename loadfile srom strcpy hfilename loadfile hex base 0 tag 0 asn 0 asm 1 tphysical 1 bhtvector 0 offset 0 for PCA I added 55 bits of padding One of those bits is reflected in the above numbers for i 0 i 128 i dfillmap i 54 for i 0 i 8 i BHTfillmap i 54 for i 0 i 20 i predfill...

Page 328: ...gment addr record tparity eparity tag eparity tphysical eparity asn tvalids 15 instatus 0 instr_count 0 there are 1024 full 32 byte records MAX_INSTR instructions for lines_written 0 lines_written 1024 lines_written build_vector instr outvector instatus instr_count build the vector fwrite outvector 0 1 DATA_BYTES_PER_REC outfile print it to a binary file fprintf hexfile 19 04X00 offset print it to...

Page 329: ...r for j 0 j 4 j Put 4 bytes of FF at the end fprintf hexfile 02X 0xff chksum 0xff fprintf hexfile 02X n chksum 0xff fprintf hexfile 00000001FF n end of file record printf Total intructions processed d t d free n instr_count MAX_INSTR instr_count fclose infile fclose outfile fclose hexfile exit 0 void build_vector int instr int outvector int instatus int instr_count int j k t int status for j 0 j 4...
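The listing above appears to write Intel HEX style records and ends with the standard end-of-file record `00000001FF`. Assuming the usual Intel HEX rule (the checksum byte is the two's complement of the low 8 bits of the sum of all preceding record bytes), the checksum can be sketched as follows. The function name `ihex_checksum` is hypothetical, and the punctuation of the original `chksum` expression is lost in this extract, so this is not a reconstruction of it.

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch of the standard Intel HEX checksum: add up the record bytes
 * (byte count, address, record type, data) modulo 256 and return the
 * two's complement of that sum. */
static uint8_t ihex_checksum(const uint8_t *bytes, size_t len)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum = (uint8_t)(sum + bytes[i]);
    return (uint8_t)(0x100u - sum);
}
```

Applied to the end-of-file record bytes 00 00 00 01, the function returns 0xFF, which matches the trailing `FF` in the record the listing prints.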

Page 330: ... j k 32 outvector t 5 instr k j 1 t 0x1f predecodes for j 0 j 20 j t predfillmap j outvector t 5 predecodes j 1 t 0x1f owparity outvector octawpfillmap 5 owparity octawpfillmap 0x1f pdparity outvector predpfillmap 5 pdparity predpfillmap 0x1f tparity outvector tagparfillmap 5 tparity tagparfillmap 0x1f tvalids for j 0 j 4 j t tagvalfillmap j outvector t 5 tvalids j 1 t 0x1f tphysical outvector tag...
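The repeated pattern in the listing above (`outvector t 5 ... t 0x1f`) looks like the common idiom for setting bit `t` of a bit vector stored as 32-bit words: the word index is `t >> 5` and the bit position within that word is `t & 0x1f`. The following sketch assumes that reading, since the operators are missing from the extracted text. The helper names `put_bit` and `scatter_word` are hypothetical.

```c
#include <stdint.h>

/* Set (OR in) one bit at absolute position t of a vector of 32-bit words. */
static void put_bit(uint32_t *vec, unsigned t, unsigned bit)
{
    vec[t >> 5] |= (uint32_t)(bit & 1u) << (t & 0x1fu);
}

/* Scatter the low nbits of value into vec: bit j of value goes to the
 * output position given by fillmap[j], in the way the fill maps in the
 * listing appear to be used. */
static void scatter_word(uint32_t *vec, uint32_t value,
                         const int *fillmap, int nbits)
{
    for (int j = 0; j < nbits; j++)
        put_bit(vec, (unsigned)fillmap[j], (value >> j) & 1u);
}
```

The caller is expected to zero the output vector first, because bits are only ORed in and never cleared.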

Page 331: ... 1 return x 1 define EXT data bit data unsigned 1 bit 0 define EXTV data hbit lbit data lbit hbit lbit 1 32 unsigned 0xffffffff unsigned 0xffffffff hbit lbit 1 define INS name bit data name name unsigned 1 bit unsigned data bit unsigned 1 bit int instrpredecode int inst int result int opcode int func int jsr_type int ra int out0 int out1 int out2 int out3 int out4 int e0_only int e1_only int ee in...
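The `EXT`, `EXTV`, and `INS` macros above have lost their punctuation in this extract. As a hedged illustration of what such bit-field helpers typically do (extract one bit, extract a bit range, and insert one bit), here is a sketch written as C functions. The exact expressions of the original macros cannot be recovered from the text, so the bodies below are assumptions. The lowercase names are hypothetical.

```c
#include <stdint.h>

/* Extract a single bit: 1 if the bit is set in data, else 0. */
static unsigned ext(uint32_t data, unsigned bit)
{
    return (data >> bit) & 1u;
}

/* Extract the field data<hbit:lbit>. A full 32-bit width is handled
 * separately because shifting a 32-bit value by 32 is undefined in C. */
static uint32_t extv(uint32_t data, unsigned hbit, unsigned lbit)
{
    unsigned width = hbit - lbit + 1;
    uint32_t mask = (width >= 32) ? 0xffffffffu : ((1u << width) - 1u);
    return (data >> lbit) & mask;
}

/* Return word with the given bit replaced by the low bit of value. */
static uint32_t ins(uint32_t word, unsigned bit, unsigned value)
{
    uint32_t m = (uint32_t)1 << bit;
    return (word & ~m) | (((uint32_t)value << bit) & m);
}
```

For instance, `extv(inst, 31, 26)` returns the 6-bit opcode field of an Alpha instruction word, which is the kind of field the predecode routine declares as `opcode`.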

Page 332: ...de 0x2E STL_C opcode 0x2F STQ_C opcode 0x1F HW_ST opcode 0x18 MISC mem format FETCH _M RS RC RPCC TRAPB MB opcode 0x12 EXT MSK INS SRX SLX ZAP opcode 0x13 MULX opcode 0x1D EXT inst 8 0 MBOX HW_MTPR opcode 0x19 EXT inst 8 0 MBOX HW_MFPR opcode 0x01 VR might change this later RESDEC s opcode 0x02 RESDEC s opcode 0x03 RESDEC s opcode 0x04 RESDEC s opcode 0x05 RESDEC s opcode 0x06 RESDEC s opcode 0x07...

Page 333: ...de 0x20 LDF opcode 0x21 LDG opcode 0x22 LDS opcode 0x23 LDT opcode 0x1B HW_LD lnoop opcode 0x0B ra 0x1F LDQ_U R31 x y NOOP fadd opcode 0x17 func 0x20 Flt datatype indep excl CPYS opcode 0x15 func 0xf 0x2 VAX excl MUL s opcode 0x16 func 0xf 0x2 IEEE excl MUL s opcode 0x31 FBEQ opcode 0x32 FBLT opcode 0x33 FBLE opcode 0x35 FBNE opcode 0x36 FBGE opcode 0x37 FBGT fmul opcode 0x15 func 0xf 0x2 VAX MUL ...

Page 334: ...Misc TRAPB MB RS RC RPCC etc opcode 0x1F HW_ST opcode 0x2A LDL_L opcode 0x2B LDQ_L br opcode 0x30 all branches call_pal opcode 0x00 call PAL bsr opcode 0x34 ret_rei opcode 0x1A jsr_type 0x2 opcode 0x1E jsr_type 0x3 jmp opcode 0x1A jsr_type 0x0 jsr_cor opcode 0x1A jsr_type 0x3 jsr opcode 0x1A jsr_type 0x1 cond_br opcode 0x31 opcode 0x32 opcode 0x33 opcode 0x35 opcode 0x36 opcode 0x37 opcode 0x38 op...
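The jump classification in the listing above (opcode 0x1A with `jsr_type` 0x0 for JMP, 0x1 for JSR, 0x2 for RET, and 0x3 for JSR_COROUTINE) can be sketched as follows. The sketch assumes the standard Alpha field positions: opcode in instruction bits <31:26> and the jump type in bits <15:14>. The enum and function names are hypothetical, and the HW_REI case (opcode 0x1E) is left out.

```c
#include <stdint.h>

enum jump_kind {
    NOT_A_JUMP,
    JUMP_JMP,
    JUMP_JSR,
    JUMP_RET,
    JUMP_JSR_COROUTINE
};

/* Classify an instruction word using the opcode 0x1A and jsr_type
 * values that appear in the predecode listing. */
static enum jump_kind classify_jump(uint32_t inst)
{
    unsigned opcode   = (inst >> 26) & 0x3fu;  /* bits <31:26> */
    unsigned jsr_type = (inst >> 14) & 0x3u;   /* bits <15:14> */

    if (opcode != 0x1A)
        return NOT_A_JUMP;

    switch (jsr_type) {
    case 0:  return JUMP_JMP;
    case 1:  return JUMP_JSR;
    case 2:  return JUMP_RET;
    default: return JUMP_JSR_COROUTINE;
    }
}
```

An instruction word such as 0x6BFA8001 (a typical encoding of `RET R31,(R26),1`) has opcode 0x1A and jump type 2, so it classifies as a return.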

Page 335: ...ore out1 ret_rei e1_only br_type jmp jsr_cor jsr lnoop fadd br_type fe out2 call_pal bsr jsr_cor e0_only jsr fmul fe out3 e1_only cond_br e1_only br_type fadd fmul fe out4 ee lnoop e0_only fadd fmul fe result 0 INS result 0 out0 INS result 1 out1 INS result 2 out2 INS result 3 out3 INS result 4 out4 return result ...

Page 336: ......

Page 337: ...er 1997 Subject To Change Errata Sheet D 1 D Errata Sheet Table D 1 lists the revision history for this document Table D 1 Document Revision History Date Revision September 29 1997 Preliminary version EC R2W0A TE ...

Page 338: ......

Page 339: ...lso call the Digital Semiconductor Information Line or the Digital Semiconductor Customer Technology Center Please use the following information lines for support For documentation and general information Digital Semiconductor Information Line United States and Canada 1 800 332 2717 Outside North America 1 510 490 4753 Electronic mail address semiconductor digital com For technical support Digital...

Page 340: ...tion Chips Order Number Digital Semiconductor Alpha 21164PC 400 MHz microprocessor 211PC 01 Digital Semiconductor Alpha 21164PC 466 MHz microprocessor 211PC 02 Digital Semiconductor Alpha 21164PC 533 MHz microprocessor 211PC 03 Title Order Number Alpha AXP Architecture Reference Manual1 EY T132E DP Alpha Architecture Handbook2 EC QD2KB TE Digital Semiconductor Alpha 21164PC Microprocessor Data She...

Page 341: ...ication Note EC QA4XE TE Alpha Microprocessors SROM Mini Debugger User s Guide EC QHUXC TE Alpha Microprocessors Motherboard Debug Monitor User s Guide EC QHUVE TE Alpha Microprocessors Motherboard Software Design Tools User s Guide EC QHUWC TE Title Vendor PCI Local Bus Specification Revision 2 1 PCI System Design Guide PCI Special Interest Group U S 1 800 433 5177 International 1 503 797 4207 Fa...

Page 342: ......

Page 343: ...ed address translations for process specific addresses when a context switch occurs. ASNs are processor specific; the hardware makes no attempt to maintain coherency across multiple processors. address translation The process of mapping addresses from one address space to another. ALIGNED A datum of size 2^N is stored in memory at a byte address that is a multiple of 2^N (that is, one that has N low order...
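The ALIGNED definition above translates directly into a mask test: an address is a multiple of 2^N exactly when its N low-order bits are zero. The following is a minimal sketch in C. The function name is hypothetical, and N is assumed to be less than 64.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when addr is a multiple of 2^n, that is, when its n low-order
 * bits are all zero. Assumes n < 64. */
static bool is_aligned(uint64_t addr, unsigned n)
{
    uint64_t mask = ((uint64_t)1 << n) - 1;
    return (addr & mask) == 0;
}
```

A quadword (8 bytes, N = 3) at address 0x1000 is aligned, while one at 0x1004 is not, although a longword (N = 2) at 0x1004 is.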

Page 344: ... execution of the process at the point where it was interrupted backmap A memory unit that is used to note addresses of valid entries within a cache bandwidth Bandwidth is often used to express high rate of data transfer in a bus or an I O channel This usage assumes that a wide bandwidth may contain a high frequency which can accommodate a high rate of data transfer barrier transaction A transacti...

Page 345: ...che boot Short for bootstrap Loading an operating system into memory is called booting BSR Boundary scan register buffer An internal memory area used for temporary storage of data records during input or output operations bugcheck A software condition usually the response to software s detection of an internal inconsistency which results in the execution of the system bugcheck code bus A group of ...

Page 346: ...a and when cached data is modified all other processors that access that data receive modified data Schemes for maintaining consistency can be implemented in hardware or software Also called cache consistency cache fill An operation that loads an entire cache block by using multiple read cycles from main memory cache flush An operation that marks all cache blocks as invalid cache hit The status r...

Page 347: ...ite back cache cache miss The status returned when cache memory is probed with no valid cache entry at the probed address CALL_PAL Instructions Special instructions used to invoke PALcode CBU Cache control and bus interface unit The logic unit within the 21164PC microprocessor that provides an interface to the external data bus and board level Bcache central processing unit CPU The unit of the co...

Page 348: ...I O space The CSR ini tiates device activity and records its status CPLD Complex programmable logic device CPU See central processing unit CSR See control and status register cycle One clock interval data bus The bus used to carry data between the 21164PC and external devices Also called the pin bus Dcache Data cache A cache reserved for storage of data The Dcache does not contain instructions DIP...

Page 349: ...ite memory that must be refreshed read from or written to periodically to maintain the storage of information DTL Diode transistor logic dual issue Two instructions are issued in parallel during the same microprocessor cycle The instructions use different resources and so do not conflict ECC Error correction code Code and algorithms used by logic to facilitate error detection and correction See al...

Page 350: ...ank or bulk erased Contrast with EEPROM FET Field effect transistor firmware Machine instructions stored in hardware floating point A number system in which the position of the radix point is indicated by the exponent part and another part represents the significant digits or fractional part flush See cache flush FPGA Field programmable gate array FPLA Field programmable logic array FPU Floating ...

Page 351: ...of the two areas of primary cache located on the 21164PC used to store instructions The Icache contains 16KB of memory space It is a direct mapped cache Icache blocks or lines contain 64 bytes of instruction stream data with associated tag as well as a 6 bit ASM field and an 8 bit branch history field per block Icache does not contain hardware for maintaining cache coherency with memory and is un...

Page 352: ...pha 21164PC microprocessor IPGA Interstitial pin grid array JFET Junction field effect transistor latency The amount of time it takes the system to respond to an event LCC Leadless chip carrier LFSR Linear feedback shift register load store architecture A characteristic of a machine architecture where data items are first loaded into a processor register operated on and then stored back to memory ...

Page 353: ...lding most instruction code and data Usually built from cost effective DRAM memory chips May be used in connection with the microprocessor s internal caches and an optional external cache masked write A write cycle that only updates a subset of a nominal data block MBO See must be one MBZ See must be zero MESI protocol A cache consistency protocol with full support for multiprocessing The MESI pro...

Page 354: ...address translation unit The logic unit within the 21164PC microprocessor that performs address translation interfaces to the Dcache and performs several other functions multiprocessing A processing method that replicates the sequential computer and interconnects the collection so that each processor can execute the same or a different program at the same time Must be one MBO A field that must be ...

Page 355: ...te boundary The bits are numbered from right to left 0 through 127 OpenVMS Alpha operating system The open version of the DIGITAL VMS operating system which runs on Alpha platforms operand The data or register upon which an operation is performed PAL Privileged architecture library software See also PALcode Programmable array logic hardware See programmable array logic PALcode Alpha privileged ar...

Page 356: ...rogrammable logic array PLCC Plastic leadless chip carrier or plastic leaded chip carrier PLD Programmable logic device PLL Phase locked loop PMOS P type metal oxide semiconductor PQFP Plastic quad flat pack primary cache The cache that is the fastest and closest to the processor The first level caches located on the CPU chip composed of the Dcache and Icache program counter That portion of the CP...

Page 357: ...dword Eight contiguous bytes starting on an arbitrary byte boundary The bits are numbered from right to left 0 through 63 RAM Random access memory READ BLOCK A transaction where the 21164PC requests that an external logic unit fetch read data read data wrapping System feature that reduces apparent memory latency by allowing read data cycles to differ from the usual low to high sequence Requires coopera...

Page 358: ...plex least frequently used instructions by breaking them down into simpler instructions This approach allows the RISC architecture to implement a small hardware assisted instruction set thus eliminating the need for microcode ROM Read only memory RTL Register transfer logic SAM Serial access memory SBO Should be one SBZ Should be zero scheduling The process of ordering instruction execution to ob...

Page 359: ...IPP Single inline pin package SMD Surface mount device SRAM Static random access memory SROM Serial read only memory SSI Small scale integration SSRAM Synchronous static random access memory stack An area of memory set aside for temporary data storage or for procedure and interrupt service linkages A stack uses the last in first out concept As items are added to pushed on the stack the stack poin...

Page 360: ...at has three states high low and high impedance TTL Transistor transistor logic UART Universal asynchronous receiver transmitter UNALIGNED A datum of size 2^N stored at a byte address that is not a multiple of 2^N unconditional branch instructions Instructions that write a return address into a register UNDEFINED An operation that may halt the processor or cause it to lose information Only privileg...

Page 361: ...er making cache hit times faster VHSIC Very high speed integrated circuit VLSI Very large scale integration VRAM Video random access memory word Two contiguous bytes 16 bits starting on an arbitrary byte boundary The bits are numbered from right to left 0 through 15 write back A cache management technique in which write operation data is written into cache but is not written into main memory in th...

Page 362: ...o differ from the usual low to high sequence Requires cooperation between the 21164PC and external hardware write through A cache management technique in which a write operation to cache also causes the same data to be written in main memory during the same operation write through cache Copies are kept of any data in the region read operations may use the copies but write operations update the actual d...

Page 363: ... translation 2 10 Addressing 1 2 Aligned convention xx Alpha documentation E 2 ALT_MODE register 5 49 Architecture 1 1 to 1 4 Associated documentation E 2 AST 2 8 ASTER register 5 20 ASTRR register 5 20 B Bcache 2 13 errors 4 57 hit under READ MISS example 4 57 interface 4 4 introduction 4 2 to 4 6 selecting options 4 27 structure 4 12 victim buffers 4 13 BCACHE VICTIM command 4 29 BIU 4 2 4 12 4 ...

Page 364: ...4 4 28 4 38 4 40 4 48 4 51 7 4 9 13 9 16 Coherency caches 4 13 Command address driving bus 4 45 errors 4 57 Commands 21164PC initiated 4 28 BCACHE VICTIM 4 29 INVALIDATE 4 40 NOP 4 28 4 40 READ MISS0 4 29 READ MISS1 4 29 WRITE BLOCK 4 29 Commands sending to 21164PC 4 38 Conventions xix xix to xxiii abbreviations xix address xx aligned xx data units xxi numbering xxi signal names xxii unaligned xx ...

Page 365: ... E 2 Digital Semiconductor information line E 1 Digital Semiconductor WWW site E 1 Documentation E 2 E 3 DTB 2 10 DTB_ASN register 5 31 DTB_CM register 5 31 DTB_IA register 5 40 DTB_IAP register 5 40 DTB_IS register 5 41 DTB_PTE register 5 32 DTB_PTE_TEMP register 5 34 DTB_TAG register 5 32 E Entry pointer queues 2 34 EXC_ADDR register 5 12 EXC_MASK register 5 14 EXC_SUM register 5 12 Exceptions 2...

Page 366: ...x_h 21 4 description 3 8 operation 4 13 4 46 4 56 7 3 9 10 Initialization role of interrupt signals 4 59 Input clock ac coupling 9 8 impedance levels 9 6 termination 9 6 Input clocks 9 5 Instruction decode 2 4 issue 2 4 prefetch 2 4 Instruction issue 1 4 2 17 Instruction translation buffer 2 7 Instructions classes 2 19 issue rules 2 27 latencies 2 23 slotting 2 19 2 21 WMB 2 12 2 34 int4_valid_h 3...

Page 367: ...23 ITB_ASN 5 7 ITB_IA 5 8 ITB_IAP 5 7 ITB_IS 5 8 ITB_PTE 5 5 ITB_PTE_TEMP 5 7 ITB_TAG 5 5 IVPTBR 5 10 MAF_MODE 5 46 MCSR 5 42 MM_STAT 5 35 MVPTBR 5 38 PAL_BASE 5 15 PMCTR 5 27 SIRR 5 21 SL_RCV 5 26 SL_XMIT 5 25 VA 5 36 VA_FORM 5 37 IRF 2 9 irq_h 3 0 description 3 11 operation 2 8 4 8 4 60 5 23 7 4 9 15 ISR register 5 23 Issue rules 2 27 Issuing rules 2 19 to 2 28 ITB 2 7 ITB_ASN register 5 7 ITB_I...

Page 368: ..._h l operation 3 5 4 7 9 2 9 4 9 5 9 7 9 8 9 20 12 3 P PAL restrictions 5 69 PAL_BASE register 5 15 PALcode 1 2 PALshadow registers 5 68 PALtemp IPRs 5 68 encoding 5 2 Pending request queue 2 34 Performance counters 2 36 Physical address considerations 4 10 Physical address regions 4 10 Physical memory regions 4 11 Pipeline organization 2 13 to 2 19 Pipelines 2 9 bubbles 2 19 examples 2 15 floatin...

Page 369: ...e 2 18 Resource conflict 2 19 Restrictions interface 4 50 S Scheduling rules 2 19 to 2 28 Signal descriptions 3 3 to 3 16 Signal name convention xxii SIRR register 5 21 SL_RCV register 5 26 SL_XMIT register 5 25 Slotting 2 21 Specifications mechanical 11 1 SROM 2 13 srom_clk_h operation 5 25 9 16 9 19 9 20 12 1 srom_data_h operation 5 26 9 15 12 1 srom_oe_l operation 9 16 12 1 srom_present_l opera...

Page 370: ...10 1 Third party documentation E 3 Timing diagrams Bcache hit under READ MISS 4 57 bus contention 4 45 FILL to private read or write 4 50 idle_bc_h and cack_h 4 54 READ MISS with idle_bc_h asserted 4 55 READ MISS with victim 4 53 READ MISS with victim abort 4 56 using data_bus_req_h 4 48 tms_h operation 9 4 9 21 12 1 12 2 12 4 Transactions FILL 4 32 READ MISS no Bcache 4 31 READ MISS with victim 4...

Page 371: ...x 9 WRITE BLOCK command 4 29 WRITE BLOCK command acknowledge 4 51 WRITE BLOCK LOCK transaction 4 37 WRITE BLOCK transaction 4 37 Write buffer 2 12 2 33 to 2 36 entry processing 2 35 Write invalidate protocol commands 4 40 Write ordering 2 36 ...

Page 372: ......
