Sun Microsystems UltraSPARC-I User's Manual, page 282 | Manualshive

Sun Microelectronics

16. Code Generation Guidelines

• The instruction buffer almost always contains several instructions when an
I-Cache miss occurs (an average of about 6.6).

• The instruction buffer is filled faster (up to 4 instructions per cycle) than it is
emptied.

All these factors contribute to reducing the apparent I-Cache miss latency from 6
cycles (assuming an E-Cache hit) to 0.14 cycles on average for fpppp; that is, on
average, the pipeline is stalled for 0.14 cycles when an I-Cache miss occurs.
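As a rough illustration of these figures, the sketch below computes the fraction of the raw miss latency that is hidden and the total stall for a given number of misses. Only the 6-cycle and 0.14-cycle values come from the text; the function names and the miss count in the usage lines are hypothetical.

```python
RAW_MISS_LATENCY = 6.0        # cycles, I-Cache miss with an E-Cache hit (from the text)
APPARENT_MISS_LATENCY = 0.14  # cycles, average observed on fpppp (from the text)


def hidden_fraction(raw=RAW_MISS_LATENCY, apparent=APPARENT_MISS_LATENCY):
    """Fraction of the raw miss latency hidden by the buffer and prefetcher."""
    return 1.0 - apparent / raw


def stall_cycles(misses, apparent=APPARENT_MISS_LATENCY):
    """Total pipeline stall cycles attributed to I-Cache misses."""
    return misses * apparent


if __name__ == "__main__":
    # Hypothetical usage: 1000 misses is an arbitrary illustrative count.
    print(f"latency hidden: {hidden_fraction():.1%}")
    print(f"stall for 1000 misses: {stall_cycles(1000):.0f} cycles")
```

Under these numbers nearly all of the miss latency is hidden, which is why large sequential code blocks remain cheap on fpppp.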

The effectiveness of the instruction buffer and the prefetcher on fpppp demonstrates
that techniques that create large sequential blocks of code (such as loop unrolling)
can be used efficiently on UltraSPARC, even if these blocks do not fit in
the I-Cache. On the other hand, for code properly scheduled to take advantage of
the four issue slots on UltraSPARC, the rate of instruction “consumption” may
easily exceed the rate of instruction fetching, thus making I-Cache misses more
apparent.

16.2.5 uTLB and iTLB Misses

The one-entry uTLB contains the virtual page number and the associated physical
page number of the line accessed last. If the line currently accessed is to the same
page, the instructions from that line are simply forwarded to the next stage. If the
line is from a different virtual page, the translation is obtained from the iTLB a
cycle later. The cost of crossing a page boundary is thus one cycle (the smallest
possible page size, 8 Kbytes, is assumed). This may or may not translate into a
one-cycle penalty for the whole processor. For a tight loop whose code spans
two pages, this cost may be significant, especially if the instruction buffer is
empty at the time of the page crossing. For this reason, it is desirable to position
short loops within a single page (that is, to avoid page crossings).
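The placement rule above can be checked mechanically. The following is a minimal sketch, assuming the 8-Kbyte page size mentioned in the text; the function names and example addresses are hypothetical and do not correspond to any real assembler or linker interface.

```python
PAGE_SIZE = 8 * 1024  # smallest page size, in bytes (from the text)


def crosses_page(start, size, page_size=PAGE_SIZE):
    """True if the byte range [start, start + size) spans more than one page."""
    if size <= 0:
        return False
    return start // page_size != (start + size - 1) // page_size


def padding_to_avoid_crossing(start, size, page_size=PAGE_SIZE):
    """Bytes of padding to place before the loop so that it stays in one page.

    Returns 0 if the loop already fits within a page. Raises ValueError if
    the loop is larger than a page, since a crossing is then unavoidable.
    """
    if size > page_size:
        raise ValueError("loop is larger than one page")
    if not crosses_page(start, size, page_size):
        return 0
    # Move the loop start up to the next page boundary.
    return page_size - (start % page_size)


if __name__ == "__main__":
    # Hypothetical 32-byte loop starting 16 bytes before a page boundary.
    print(crosses_page(0x1FF0, 0x20))
    print(padding_to_avoid_crossing(0x1FF0, 0x20))
```

Because the page size is a multiple of the 4-byte instruction size, the computed padding keeps a word-aligned loop word-aligned.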

An iTLB miss is handled by software through the use of the TSB, and takes about
32 cycles. Consequently, an iTLB miss may be very costly in terms of idle processor
cycles. In order to minimize the frequency of iTLB misses, UltraSPARC provides
a large number of entries (64) in the iTLB and allows pages as large as
4 Mbytes to be used. Nonetheless, techniques that allocate pages based on profiling
are encouraged to further decrease the iTLB miss cost.
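To make the trade-off concrete, the sketch below estimates how much code the iTLB can map at once (its "reach") and the share of run time lost to misses. The 64 entries, the 32-cycle cost, and the 8-Kbyte and 4-Mbyte page sizes come from the text; the function names and the miss and cycle counts in the usage lines are hypothetical.

```python
KB = 1024
MB = 1024 * KB

ITLB_ENTRIES = 64      # from the text
ITLB_MISS_CYCLES = 32  # approximate cost of the software miss handler (from the text)


def itlb_reach(page_size, entries=ITLB_ENTRIES):
    """Bytes of code mapped when every iTLB entry holds a page of page_size."""
    return entries * page_size


def itlb_miss_overhead(misses, total_cycles, miss_cycles=ITLB_MISS_CYCLES):
    """Fraction of total_cycles spent handling iTLB misses."""
    return misses * miss_cycles / total_cycles


if __name__ == "__main__":
    print(itlb_reach(8 * KB) // KB, "Kbytes reach with 8-Kbyte pages")
    print(itlb_reach(4 * MB) // MB, "Mbytes reach with 4-Mbyte pages")
    # Hypothetical profile: 1000 misses over one million cycles.
    print(f"{itlb_miss_overhead(1000, 1_000_000):.1%} of cycles in miss handling")
```

With only 8-Kbyte pages the iTLB covers 512 Kbytes of code, so programs with larger instruction footprints benefit most from large pages or profile-guided page allocation.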

16.2.6 Branch Prediction

UltraSPARC predicts the outcome of branches and fetches the next instructions
likely to be executed based on that outcome. While this is all done dynamically in
hardware, the compiler has an impact on the initialization of the state machine.
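The interaction between a compiler hint and a dynamic predictor can be sketched with a generic two-bit saturating counter. This is an illustrative model only: the state names, the choice of initial state from the hint, and the update rule are assumptions, not the exact state machine implemented by UltraSPARC.

```python
# Generic two-bit saturating counter (an assumption, not taken from the manual):
# 0 = strongly not taken, 1 = weakly not taken, 2 = weakly taken, 3 = strongly taken.
STRONG_NOT_TAKEN, WEAK_NOT_TAKEN, WEAK_TAKEN, STRONG_TAKEN = range(4)


def initial_state(compiler_hint_taken):
    """Hypothetical initialization: start in the weak state matching the hint."""
    return WEAK_TAKEN if compiler_hint_taken else WEAK_NOT_TAKEN


def predict(state):
    """Predict taken when the counter is in either of the two taken states."""
    return state >= WEAK_TAKEN


def update(state, taken):
    """Move one step toward the observed outcome, saturating at both ends."""
    if taken:
        return min(state + 1, STRONG_TAKEN)
    return max(state - 1, STRONG_NOT_TAKEN)


def mispredictions(outcomes, compiler_hint_taken):
    """Count wrong predictions over a sequence of branch outcomes."""
    state = initial_state(compiler_hint_taken)
    wrong = 0
    for taken in outcomes:
        if predict(state) != taken:
            wrong += 1
        state = update(state, taken)
    return wrong


if __name__ == "__main__":
    # Loop-closing branch: taken nine times, then falls through once.
    loop_branch = [True] * 9 + [False]
    print(mispredictions(loop_branch, compiler_hint_taken=True))
    print(mispredictions(loop_branch, compiler_hint_taken=False))
```

In this model a correct hint removes the misprediction on the first encounter of the branch, which is the kind of effect the initial state can have on short-running loops.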

Содержание UltraSPARC-I

Страница 1: ...service in house repair center WE BUY USED EQUIPMENT Sell your excess underutilized and idle used equipment We also offer credit for buy backs and trade ins www artisantg com WeBuyEquipment REMOTE IN...

Страница 2: ...02 This July 1997 02 Revision is only available on line The only changes made were to support hypertext links in the pdf file UltraSPARC User sManual UltraSPARC I UltraSPARC II July 1997 Artisan Techn...

Страница 3: ...ms Inc Sun Sun Microsystems and the Sun logo are trademarks or registered trademarks of Sun Microsystems Inc in the United States and other countries All SPARC trademarks are used under license and ar...

Страница 4: ...ent Overview 5 1 4 UltraSPARC Subsystem 10 2 Processor Pipeline 11 2 1 Introductions 11 2 2 Pipeline Stages 12 3 Cache Organization 17 3 1 Introduction 17 4 Overview of the MMU 21 4 1 Introduction 21...

Страница 5: ...Interfaces 73 7 1 Introduction 73 7 2 Overview of UltraSPARC External Interfaces 73 7 3 Interaction Between E Cache and UDB 76 7 4 SYSADDR Bus Arbitration Protocol 84 7 5 UltraSPARC Interconnect Tran...

Страница 6: ...C Data Buffer UDB Control Register 185 11 5 Overwrite Policy 185 Section III UltraSPARC and SPARC V9 12 Instruction Set Summary 189 13 UltraSPARC Extended Instructions 195 13 1 Introduction 195 13 2 S...

Страница 7: ...ons 290 17 8 Floating Point and Graphic Instructions 295 Appendixes Debug and Diagnostics Support 303 A 1 Overview 303 A 2 Diagnostics Control and Accesses 303 A 3 Dispatch Control Register 303 A 4 Fl...

Страница 8: ...on 337 E 2 Pin Descriptions 337 E 3 Signal Descriptions 341 F ASI Names 345 F 1 Introduction 345 G Differences Between UltraSPARC Models 351 G 1 Introduction 351 G 2 Summary 351 G 3 References to Mode...

Страница 9: ...Sun Microelectronics viii UltraSPARC User s Manual Artisan Technology Group Quality Instrumentation Guaranteed 888 88 SOURCE www artisantg com...

Страница 10: ...rs Extensions to and implementation dependencies of the SPARC V9 architecture Techniques for managing the pipeline and for producing optimized code A Brief History of SPARC SPARC stands for Scalable P...

Страница 11: ...dependencies are introduced in The SPARC Architecture Manual Version 9 they are numbered throughout the body of the text and are cross referenced in Appendix C that book This book the UltraSPARC User...

Страница 12: ...he following notational conventions are used Square brackets indicate a numbered register in a register file Angle brackets indicate a bit number or colon separated range of bit numbers within a field...

Страница 13: ...ter 9 Interrupt Handling describes how UltraSPARC processes interrupts Chapter 10 Reset and RED_state describes how UltraSPARC handles the various SPARC V9 reset conditions and how it implements RED_s...

Страница 14: ...low level technical material or information not needed for a general understanding of the architecture The manual contains the following ap pendixes Appendix A Debug and Diagnostics Support describes...

Страница 15: ...Sun Microelectronics 14 UltraSPARC User s Manual Artisan Technology Group Quality Instrumentation Guaranteed 888 88 SOURCE www artisantg com...

Страница 16: ...IntroducingUltraSPARC 1 UltraSPARC Basics 3 2 Processor Pipeline 11 3 Cache Organization 17 4 Overview of the MMU 21 Artisan Technology Group Quality Instrumentation Guaranteed 888 88 SOURCE www artis...

Страница 17: ...Sun Microelectronics 2 UltraSPARC User s Manual Artisan Technology Group Quality Instrumentation Guaranteed 888 88 SOURCE www artisantg com...

Страница 18: ...is a full implementation of the 64 bit SPARC V9 architecture It sup ports a 44 bit virtual address space and a 41 bit physical address space The core instruction set has been extended to include grap...

Страница 19: ...e data cache or 8 bytes per cycle into the register files To reduce instruction dependency stalls UltraSPARC has short latency opera tions and provides direct bypassing between units or within the sam...

Страница 20: ...struction Translation Lookaside Buffer iTLB and a 64 entry Data Translation Ext Cache RAM Prefetch and Dispatch Unit PDU Integer Execution Unit IEU Floating Point Unit FPU Graphics Unit GRU Instructio...

Страница 21: ...refetch across conditional branches a dynamic branch prediction scheme is implemented in hardware The outcome of a branch is based on a two bit history of the branch A next field associated with every...

Страница 22: ...ns are not pipelined and take 12 22 cycles single double to execute but they do not stall the processor Other in structions following the divide square root can be issued executed and retired to the r...

Страница 23: ...g 16Kb direct mapped cache with two 16 byte sub blocks per line It is virtually indexed and physically tagged VIPT The tag array is dual ported so tag updates due to line fills do not collide with tag...

Страница 24: ...supports The modes are described below 1 1 1 Pipelined Mode The E Cache SRAMS have a cycle time equal to the processor cycle time The name 1 1 1 indicates that it takes one processor clock to send th...

Страница 25: ...subsystem which consists of the UltraSPARC processor synchronous SRAM components for the E Cache tags and data and two UltraSPARC Data Buffer UDB chips The UDBs isolate the E Cache from the system pro...

Страница 26: ...line This simplifies pipeline synchronization and ex ception handling It also eliminates the need to implement a floating point queue Floating point instructions with a latency greater than three divi...

Страница 27: ...e Stages Detail X1 IU Register File E C N1 N2 G D Cache TLB FP add FP RF 32 x 64 IST_data Icc FPST_data Annex D FPU IEU DU G ALU FP mul G mul GRU address bus data bus instruction bus LSU Tag Tag Check...

Страница 28: ...nd then sent to the Instruction Buffer The pre decoded bits generated during this stage accompany the instruc tions during their stay in the Instruction Buffer Upon reaching the next stage where the g...

Страница 29: ...that the data can be forwarded to dependent instruc tions in the pipeline as soon as possible ALU operations executed in the E Stage generate condition codes in the C Stage The condition codes are se...

Страница 30: ...d to the data portion of the Store Buffer All loads that have entered the Load Buffer in N1 continue their progress through the buffer they will reappear in the pipeline only when the data comes back...

Страница 31: ...Sun Microelectronics 16 UltraSPARC User s Manual Artisan Technology Group Quality Instrumentation Guaranteed 888 88 SOURCE www artisantg com...

Страница 32: ...are used to index into the I Cache tag and data arrays while accessing the I MMU that is the iTLB The resulting tag is compared against the translated physical address to determine I Cache hits 3 1 1...

Страница 33: ...16 byte sub blocks per line Data accesses bypass the data cache when the D Cache enable bit in the LSU_Control_Register is clear see Section A 6 LSU_Control_Register on page 306 Load misses will not a...

Страница 34: ...vide a noncacheable ECC less scratch memory for use of the booting code until the MMUs are enabled The E Cache is a unified write back allocating direct mapped cache The E Cache always includes the co...

Страница 35: ...Sun Microelectronics 20 UltraSPARC User s Manual Artisan Technology Group Quality Instrumentation Guaranteed 888 88 SOURCE www artisantg com...

Страница 36: ...om features required of SPARC V8 Reference MMUs 4 2 Virtual Address Translation The UltraSPARC MMU supports four page sizes 8 Kb 64 Kb 512 Kb and 4 Mb It supports a 44 bit virtual address space with 4...

Страница 37: ...either all zeros or all ones Figure 4 2 on page 23 illustrates the UltraSPARC virtual address space 0 0 12 12 13 13 63 40 8K byte Virtual Page Number 8K byte Physical Page Number Page Offset Page Offs...

Страница 38: ...Translation Storage Buffer TSB which acts like a direct mapped cache is the in terface between the two The TSB can be shared by all processes running on a processor or it can be process specific The...

Страница 39: ...se when multiple mappings from one VA context to multiple PAs produce a multi ple TLB match is not detected in hardware it produces undefined results Note The hardware ensures the physical reliability...

Страница 40: ...ternal Architecture 41 7 UltraSPARC External Interfaces 73 8 Address Spaces ASIs ASRs and Traps 145 9 Interrupt Handling 161 10 Reset and RED_state 169 11 Error Handling 175 Artisan Technology Group Q...

Страница 41: ...Sun Microelectronics 26 UltraSPARC User s Manual Artisan Technology Group Quality Instrumentation Guaranteed 888 88 SOURCE www artisantg com...

Страница 42: ...ization of memory accesses Accesses to addresses that cause side effects I O accesses Non faulting loads Instruction prefetching Load and store buffers This chapter only address coherence in a uniproc...

Страница 43: ...locks from the I and D Caches because UltraSPARC main tains inclusion between the external and internal caches See Section 5 2 2 Com mitting Block Store Flushing on page 29 2 1 Address Aliasing Flushi...

Страница 44: ...ection 13 6 4 Block Load and Store Instructions on page 230 5 2 3 Displacement Flushing Cache flushing also can be accomplished by a displacement flush This is done by reading a range of read only add...

Страница 45: ...C a MEMBAR Lookaside executes more efficiently than a MEMBAR StoreLoad 3 1 1 Cacheable Accesses Accesses that fall within the coherence domain are called cacheable accesses They are implemented in Ult...

Страница 46: ...correct ordering between the cacheable and noncacheable domains explicit memory synchronization is needed in the form of MEMBARs or atomic instructions Code Example 5 1 illustrates the issues in volve...

Страница 47: ...EMBARs at both 1 and 2 are needed 3 2 Memory Synchronization MEMBAR and FLUSH The MEMBAR STBAR in SPARC V8 and FLUSH instructions are provide for ex plicit control of memory ordering in program execut...

Страница 48: ...emIssue Forces all outstanding memory accesses to be completed before any memory ac cess instruction after the MEMBAR is issued It must be used to guarantee order ing of cacheable accesses following n...

Страница 49: ...on immediately af ter the FLUSH 3 3 Atomic Operations SPARC V9 provides three atomic instructions to support mutual exclusion These instructions behave like both a load and a store but the operations...

Страница 50: ...CASX Instruction Compare and swap combines a load compare and store into a single atomic in struction It compares the value in an integer register to a value in memory if they are equal the value in m...

Страница 51: ...ng loads allow the null pointer to be accessed safely in a read ahead fashion if the OS can ensure that the page at virtual address 016 is accessed with no penalty The NFO non fault access only bit in...

Страница 52: ...fined in The SPARC Architecture Manual Version 9 A data_access_MMU_miss exception D MMU disabled For PREFETCHA any ASI other than the following 0416 0C16 1016 1116 1816 1916 8016 8316 8816 8B16 Attemp...

Страница 53: ...ads or stores to pages that have this bit set have the following behavior Noncacheable accesses are strongly ordered with respect to each other Noncacheable loads with the E bit set will not be issued...

Страница 54: ...quires the following A MEMBAR Sync is needed after an internal ASI store other than MMU ASIs before the point that side effects must be visible This MEMBAR must precede the next load or noninternal st...

Страница 55: ...on instructions MEMBAR and STBAR are entered into the Store Buffer 5 1 Stores Delayed by Loads The store buffer normally has lower priority than the load buffer when arbitrat ing for the D Cache or E...

Страница 56: ...ache the tag is used to determine whether there is a hit in the TSB If there is a hit the data is fetched by software Figure 6 1 Translation Table Entry TTE from TSB G Global If the Global bit is set...

Страница 57: ...LT _LITTLE ASI_SECONDARY_NO_FAULT _LITTLE are translated Any other access will trap with a data_access_exception trap FT 1016 The NFO bit in the I MMU is read as zero and ignored when written If this...

Страница 58: ...tware must ensure that at least one entry is not locked when replacing a TLB entry otherwise the last TLB entry will be replaced CP CV The cacheable in physically indexed cache and cacheable in virtua...

Страница 59: ...ftware The Global Privileged and Writable fields replace the 3 bit ACC field of the SPARC V8 Reference MMU Page Translation Entry 3 Translation Storage Buffer TSB The TSB is an array of TTEs managed e...

Страница 60: ...amic sharing of the level 2 cache resource should provide a better overall solution than that provided by a fixed partitioning Figure 6 2 shows both the common and shared TSB organization The constant...

Страница 61: ...e 55 Note that there are no separate physical registers in UltraSPARC hardware for the Pointer registers but rather they are implemented through a dynamic re ordering of the data stored in the Tag Acc...

Страница 62: ...ling of TLB misses For the following traps the trap handler is presented with a special set of MMU globals fast_ instruction da ta _access_MMU_miss instruction data _access_exception and fast_data_acc...

Страница 63: ...l Address Space on page 237 Note that the case of JMPL RETURN and branch CALL sequential are handled differently The contents of the I Tag Access Register are undefined in this case but are not needed...

Страница 64: ...S_BYPASS_EC_WITH_EBIT _LITTLE ASIs In this case SFSR FT 0416 6 4 5 Data_access_protection Trap This trap occurs when the MMU detects a protection violation for a data access A protection violation is...

Страница 65: ...r atomic Also access to UltraSPARC internal registers other than LDXA LDFA STDFA or STXA except for I Cache diagnostic accesses other than LDDA STDFA or STXA See Section 8 3 2 UltraSPARC Non SPARC V9...

Страница 66: ...t The MMU signals a data_access_exception trap FT 2016 for this case Table 6 4 D MMU Operations for Normal ASIs Condition Behavior Opcode PRIV Mode ASI W TLB Miss E 0 P 0 E 0 P 1 E 1 P 0 E 1 P 1 Load...

Страница 67: ...e Primary Context identifier there is no I MMU Primary Context register Note The endianness of a data access is specified by three conditions the ASI specified in the opcode or ASI register the PSTATE...

Страница 68: ...ASI_NUCLEUS Table 6 7 ASI Mapping for Data Accesses Condition for Data Access Access Processed with Opcode PSTATE TL PSTATE CLE D MMU IE Endianness ASI Value Recorded in SFSR LD ST Atomic FLUSH 0 0 0...

Страница 69: ...gs can be found in Section 6 10 MMU Bypass Mode on page 68 However if a bypass ASI is used while the D MMU is disabled the bypass operation behaves as it does when the D MMU is enabled that is the acc...

Страница 70: ...e conditions 6 9 MMU Internal Registers and ASI Operations 6 9 1 Accessing MMU Registers All internal MMU registers can be accessed directly by the CPU through UltraSPARC defined ASIs Several of the r...

Страница 71: ...are must guarantee that the VA is within range Writes to the TSB register Tag Access register and PA and VA Watchpoint Ad dress Registers are not checked for out of range VA No matter what is written...

Страница 72: ...ts of the missing virtual address 6 9 3 Context Registers The context registers are shared by the I and D MMUs The Primary Context Register is defined as follows Figure 6 4 D MMU Primary Context Regis...

Страница 73: ...field records the 8 bit ASI associated with the faulting instruction This field is valid for both D MMU and I MMU SFSRs and for all traps in which the FV bit is set JMPL and RETURN mem_address_not_ali...

Страница 74: ...ys reads as 0 in the I MMU SFSR OW Overwrite Set to one when the MMU detects a fault if the Fault Valid bit Table 6 11 MMU Synchronous Fault Status Register FT Fault Type Field FT 6 0 Fault Type 0116...

Страница 75: ...ns the virtual address that was not found in the I MMU TLB For instruction_access_exception traps privilege violation fault type TPC con tains the virtual address of the instruction in the privileged...

Страница 76: ...emble the trapping instruction 6 9 6 I D Translation Storage Buffer TSB Registers The TSB registers provide information for the hardware formation of TSB point ers and tag target to assist software in...

Страница 77: ...a it may load an incorrect TTE I D TSB_Size The Size field provides the size of the TSB according to the following Number of entries in the TSB or each TSB if split 512 2TSB_Size Number of entries in...

Страница 78: ...e location of the missing or trapping TTE in the software maintained TSB The TSB 8 Kb and 64 Kb Pointer registers provide the possible locations of the 8 Kb and 64 Kb TTE re spectively The Direct Poin...

Страница 79: ...d stores on the Tag Access register and the TLB Table 6 13 Effect of Loads and Stores on MMU Registers Software Operation Effect on MMU Physical Registers ad Store Register TLB tag TLB data Tag Access...

Страница 80: ...Entry The TLB Entry number to be accessed in the range 0 63 The format for the Tag Read register is as follows Figure 6 14 I D MMU TLB Tag Read Registers I D VA 63 13 The 51 bit virtual page number P...

Страница 81: ...d TLB entry ASI loads from the TLB Data In register are not supported 9 10 I D MMU Demap Demap is an MMU operation as opposed to a register as described above The purpose of Demap is to remove zero on...

Страница 82: ...e data demap registers requires either a MEMBAR Sync FLUSH DONE or RETRY before the point that the effect must be visible to data accesses A STXA to the I MMU demap registers requires a FLUSH DONE or...

Страница 83: ...from the specified TLB If the TTE Global bit is set the TTE is not removed 10 MMU Bypass Mode In a bypass access the D MMU sets the physical address equal to the truncated virtual address that is PA 4...

Страница 84: ...In Data Access Tag Read Registers on page 64 Write operation The TLB simultaneously writes the CAM and RAM portion of the specified entry or the entry given by the replacement policy described in Sec...

Страница 85: ...ardware Description The hardware diagram in Figure 6 16 on page 70 and the code fragment in Code Example 6 1 on page 71 describe the generation of the 8 Kb and 64 Kb pointers in more detail Figure 6 1...

Страница 86: ...Mask marks the bits from TSB Base Reg TSBBaseMask 0xffffffffffffe000 split TSBSize 1 TSBSize Shift va towards lsb appropriately and zero out the original va page offset vaPortion va type 8K_POINTER 9...

Страница 87: ...Sun Microelectronics 72 UltraSPARC User s Manual Artisan Technology Group Quality Instrumentation Guaranteed 888 88 SOURCE www artisantg com...

Страница 88: ...how to obtain the data sheet 7 2 Overview of UltraSPARC External Interfaces Figure 7 1 on page 74 shows the UltraSPARC s main interfaces Model dependent interface lengths are labeled in italics inste...

Страница 89: ...an interconnect master and an interconnect slave As an interconnect master UltraSPARC issues read write transactions to the interconnect using part of the transaction set Section 7 5 As a master it al...

Страница 90: ...s paper discussing this algorithm is documented in the Bibliography The UDBs generate ECC when sending data and check the ECC when receiving data The SYSDATA transaction set supports both 64 byte blo...

Страница 91: ...UltraSPARC to the system by hiding system latency for example for Writebacks and noncacheable stores The UDB supports multiple outstanding transactions to increase overall bandwidth The UDB also handl...

Страница 92: ...the E Cache Store buffer All cacheable stores go to the E Cache because the D Cache is write through the order of stores with respect to loads is determined by the memory ordering model Prefetch unit...

Страница 93: ...tisfy copyback requests from the system Table 7 5 shows the number of Writeback buffer entries for each UltraSPARC model Note Models that support more than one Writeback buffer entry can be restricted...

Страница 94: ...ments Notice that the reads are fully pipelined thus full throughput is achieved Three requests are made before the data of the first request comes back and the latency of each re quest is three cycle...

Страница 95: ...The data address is presented on the ECAD pins in the cycle after the request cycle 6 for W0 and the data is sent in the following cycle cycle 7 Separating the ad dress and the data by one cycle redu...

Страница 96: ...ated to Modified M state at the same time that the data is written as shown in Figure 7 7 on page 82 1 1 1 CLK CYCLE 0 1 2 3 4 5 6 7 8 9 TSYN_WR_L R0 R1 R2 TOE_L R0 R1 R2 ECAT A0_tag A1_tag A2_tag TDA...

Страница 97: ...1 1 1 Mode overlap of tag and data accesses The data for three previous writes W0 W1 and W2 is written while three tag accesses reads are made for three younger stores R3 R4 and R5 Figure 7 8 Timing O...

Страница 98: ...force an extra dead cycle while the E Cache data bus driver is switched from the SRAMs to the UltraSPARC UltraSPARC uses a one deep write buffer in the data SRAMs to reduce the read to write turn aro...

Страница 99: ...poten tial drivers the same enable logic can and should be used for both Holding am plifiers in the System Controller must maintain the last state of Addr_Valid whenever UltraSPARC or the SC stop driv...

Страница 100: ...e inside the SC or UltraSPARC All tristate output enables for the SYSADDR bus and Addr_Valid are registered This requires the protocol to be described as a pipeline where only the state of the request...

Страница 101: ...lowing the de assertion of RESET_L 3 The UltraSPARC for which LAST PORT DRIVER port_ID 1 0 can take advantage of a rule that allows request then drive Otherwise the UltraSPARC will minimally see a req...

Страница 102: ...DRIVER can drive without being dependent on possible simultaneously asserted requests Fairness is provided by the release request in presence of another request rule for example a request from anothe...

Страница 103: ...DRIVER Addr_Valid tells the SC when the CURRENT DRIVER is driving a valid packet it is needed because the CURRENT DRIVER may keep its request asserted for longer than the minimum time required to deli...

Страница 104: ...13 4 cycles if the CURRENT DRIVER must be forced off Figure 7 14 Figure 7 12 shows the timing in a uniprocessor system with the UltraSPARC driving back to back packets in the absence of a request from...

Страница 105: ...ycle however and Port1 becomes CURRENT DRIVER as a result Figure 7 14 Arbitration CURRENT DRIVER Loses Ownership While Asserting Request Figure 7 15 on page 91 shows the timing when the SC takes owner...

Страница 106: ...is al lowed to drive its packet s after one arbitration cycle 0 0 0 0 0 SC drives SYSADDR Addr_Valid 0 Undriven Addr_Valid 0 SYSADDR Port0 drives LAST PORT DRIVER Req 0 SC Request SYSADDR Addr_Valid 0...

Страница 107: ...ed only to cacheable memory UltraSPARC splits P_REQ transactions into two independent classes Class 0 contains read transactions due to cacheable misses and block loads Class 1 contains Writeback requ...

Страница 108: ...receives an S_REPLY for that line The SC must not issue an S_REPLY for a request with the same cache index that is for each coherent read or Writeback during the window between an S_REQ and P_REPLY f...

Страница 109: ...cache coherence protocol operates on Physically Indexed Physically Tagged PIPT writeback caches The E Cache maintains inclusion for both the I Cache and the D Cache that is all lines in the internal c...

Страница 110: ...g are invariants for the state transitions 1 Only one cache in the system can ever have the line in E or M state while a line is in E or M state no other cache can have a copy of that line 2 Only one...

Страница 111: ...line that is already in its cache this includes P_RDD_REQ Figure 7 20 on page 95 shows that some transitions are caused by the PREFETCH A instructions which are not supported by all UltraSPARC models...

Страница 112: ...KD S M Store hit atomic hit to Shared Clean line PREFETCH P_RDO_REQ S_OAK S I i A Shared Clean line is victimized by UltraSPARC I Cache miss Write hit on shared line P_RDS_REQ or P_RDSA_REQ or P_RDO_R...

Страница 113: ...making the read with Writeback complete atomically this is described later Figure 7 21 illustrates a system that uses Dtags to maintain cache coherence the system contains multiple UltraSPARCs one Dt...

Страница 114: ...mize block A for block B then block B will simply overwrite block A in the Etags and the Dtags for UltraSPARCk In this case the writeback buffer and DtagTB would not be used for this transaction since...

Страница 115: ...h it sent an S_REQ before S_REPLYing to the original requesting UltraSPARC In general the SC does not complete the original transaction until all of the related S_REQs are P_REPLYed Implementations ma...

Страница 116: ...hat UltraSPARC was initiating or had an outstanding P_WRB_REQ to the same address 40 6 Since some other writer has ownership this Writeback should not complete to memory because the other writer s mod...

Страница 117: ...he has this datum that is if this is the first read of the datum then Etag transitions to E This gives exclusive access to the requesting UltraSPARC to later write this datum without generating anothe...

Страница 118: ...a store hit or atomic hit on a shared line Etag transitions to M For a store miss or atomic miss SC gets data from memory or another processor and provides it to UltraSPARC with the S_RBU reply after...

Страница 119: ...that each UltraSPARC model supports 7 4 1 Error Handling The system can reply with S_RTO time out typically if the address is for unim plemented memory or S_ERR bus error typically if the access is il...

Страница 120: ...pts to report write failures 7 7 6 WriteInvalidate P_WRI_REQ Coherent Write and Invalidate request Generated by UltraSPARC for a block store to an S O or I state line or a block store commit to a line...

Страница 121: ...P_SACKD if the block has been victimized from the E Cache but not yet written back P_SNACK if the block is not present in the E Cache or the writeback buffer UltraSPARC responds more quickly if NDP 0...

Страница 122: ...undefined data in response to the S_CRAB UltraSPARC responds more quickly if NDP 0 SC should assert NDP only in sys tems that do not support Dtags Section 7 10 S_REQ on page 111 for more tim ing info...

Страница 123: ...st SC can send its next coherent request on the cycle after the S_CRAB reply 7 10 CopybackToDiscard S_CPD_REQ Non destructive copyback request from SC to UltraSPARC Generated by SC to service a ReadTo...

Страница 124: ...traSPARC does not cache data associated with these transactions 7 8 1 NonCachedRead P_NCRD_REQ Noncached Read Generated by an UltraSPARC by a load or instruction fetch from a noncached address space o...

Страница 125: ...from SYSDATA Table 7 13 shows the number of outstanding NonCachedBlockRead transactions that each UltraSPARC model supports 8 3 NonCachedWrite P_NCWR_REQ Noncached Write Generated by UltraSPARC to wri...

Страница 126: ...es UltraSPARC I also imposes the following restrictions on back to back S_REQs If the previous S_REQ requires a data transfer the earliest that SC can send the next S_REQ both S_INV_REQ and S_CP _REQ...

Страница 127: ...with P_SNACK In systems with Dtags SC sets NDP 0 in all S_REQs This allows UltraSPARC to reply P_SACK D without searching its tag store which is a significant optimi zation All other effects are the s...

Страница 128: ...ust receive its S_REPLY before UltraSPARC II issues a third read with DVP 1 UltraSPARC delays issue of a coherent read to any address that has an outstand ing Writeback UltraSPARC inhibits its own int...

Страница 129: ...ust like for clean victims 2 UltraSPARC keeps the dirty victimized block in the coherence domain for copyback invalidate requests from SC until it receives the S_REPLYs for both the read and Writeback...

Страница 130: ...h P_SACK if the requested line is in the E Cache P_SACKD if there is a pending Writeback for the line and P_SNACK if the line is not present Some special cases to this are described below The only dif...

Страница 131: ...lock See the discussion accompanying Figure 7 19 on page 93 for more information 12 Interrupts P_INT_REQ UltraSPARC can both send and receive interrupt requests Interrupt requests are used to report i...

Страница 132: ...try later after some backoff period 7 12 1 Extended Interrupt Target ID During an interrupt send UltraSPARC also passes PA 20 19 to create an extend ed MID 6 5 field See Chapter 9 Interrupt Handling T...

Страница 133: ...s all P_REPLYs as an ac knowledgment to a previous SC request UltraSPARC can assert P_FERR at any time to indicate a fatal error requiring system reset upon seeing P_FERR from Table 7 17 P_REPLY Encod...

Страница 134: ...be sent P_IAK Interrupt Acknowledge Reply to a P_INT_REQ from SC UltraSPARC acknowledges that the interrupt transaction has been serviced SC can send the next P_INT_REQ request and its data P_SACK Coh...

Page 135: ...andshake for delivering data to UltraSPARC. 4. Figure 7-27 on page 124 shows the timing for back-to-back S_REQs for Copyback. The earliest that SC can send another S_REQ to the same Table 7-19 S_REPLY En...

Page 136: ...SDATA bus can be kept continually busy without any dead cycles, as long as the same source is driving the data. If sources are switched, one dead cycle is required on SYSDATA; this allows the first source...

Page 137: ...SC commands the output data queue of the UltraSPARC that contains the block to drive 64 bytes of copyback data on SYSDATA. Issued in response to a P_SACK or P_SACKD reply from UltraSPARC containing th...

Page 138: ...alls. Figure 7-24: S_REPLY Timing: UltraSPARC Sourcing Block Write, No Data Stall. Figure 7-25: S_REPLY Timing: UltraSPARC Receiving Block Write, No Data Stall. S_REPLY Data on Bus S_WAB D 0 D 1 D 2 D 3 2 cloc...

Page 139: ...lifies the S_REPLY signal accompanying a data transfer. The following rules govern the assertion of Data_Stall: 1. When UltraSPARC is sourcing data, the earliest that SC can assert Data_Stall is one syste...

Page 140: ...g of any quadword, including the first quadword, at the sink UltraSPARC can be delayed for an arbitrary number of clock cycles by keeping Data_Stall asserted for that many clock cycles. Figure 7-30 shows...

Page 141: ...raSPARC-II will not issue any other request. Finally, UltraSPARC-II will not issue a P_NCRD_REQ if any Class 0 transaction is outstanding. UltraSPARC issues all other transactions in Class 1 and can hav...

Page 142: ...ictim read miss before its corresponding Writeback. If the E-Cache data bus is busy, or if the assertion of an external request takes away SYSADDR, the Writeback can be delayed. A Writeback is not issued...

Page 143: ...k. Typical systems will, however, since they complete all Class 1 transactions in order. Additionally, UltraSPARC-I restricts the issue of a read with Writeback until any prior read with Writeback has com...

Page 144: ...summarizes the requests and replies generated by the SC. Table 7-21: Requests and Replies Generated by UltraSPARC (Requests | Replies): P_RDS_REQ | P_IDLE; P_RDSA_REQ | P_RERR; P_RDO_REQ | P_RAS; P_RDD_REQ | P_SACK; P_W...

Page 145: ...ror and data is to be transferred to/from UltraSPARC. Table 7-23: Valid Request and Reply Types, UltraSPARC to SC (UltraSPARC Request | Reply from SC): P_RDS_REQ | S_RBU or S_RBS or S_ERR2 or S_RTO2; P_RDSA_REQ...

Page 146: ...how the transfer of control between the processors and the SC. Thus, each table row may represent zero or more clock ticks. 7.16.1 ReadToShare Block. Condition: Load miss on Processor 1; no other processor...

Page 147: ...Etag I P_RDS_REQ to System Initial state Etag E Initial state Etag I S_CPB_REQ to P2 P2 copies block to copyback buffer P2 updates Etag E → S P_SACK reply to System S_CRAB reply to P2 S_RBS reply to P1...

Page 148: ...block. When the miss victimizes a clean block instead of an invalid block, the sequence is the same. When Processor 2's initial state is Etag M or O, the sequence is the same. 7.16.6 ReadToOwn Block. Condi...

Page 149: ...ag S P_RDO_REQ to System Initial state Etag O Initial state Etag S S_INV_REQ to P2 S_INV_REQ to P3 P2 updates Etag O → I P_SACK to System P3 updates Etag S → I P_SACK to System S_OAK to P1 no data is tran...

Page 150: ...stem Processor 2 Processor 3 Initial victim state Etag1 M Initial missed state Etag2 I P1 copies the victim block into the Writeback buffer P_RDS_REQ to System DVP bit set Initial state Etag2 I Initia...

Page 151: ...rty Victimized Block Processor 1 System Processor 2 Processor 3 ial victim state g1 M ial missed state g2 I copies the victimized block into the teback buffer RDS_REQ to System VP bit set Initial stat...

Page 152: ...missed state Etag2 I P1 copies the victimized block into the writeback buffer P_RDS_REQ to System DVP bit set Initial state Etag1 I Initial state Etag2 I Initial state Etag2 I S_RBU reply to P1 P1 re...

Page 153: ...uest packets are carried over SYSADDR. Table 7-36: Copyback Invalidate Dirty Victimized Block in Owned State. Processor 1 System Processor 2 Processor 3 ial victim state g1 O ial missed state g2 I copies...

Page 154: ...7 31 Transaction Types. Figures 7-32, 7-33, and 7-34 show the transaction request packet formats. Packet Type Initiated by UltraSPARC Cache Coherent P_RDS_REQ P_RDSA_REQ P_RDO_REQ Non Cached P_NCWR_REQ In...

Page 155: ...s 16 4 Reserved 22 13 NDP 33 Physical Address 8 6 Class 0 Reserved Second Cycle Master ID 35 29 Parity Physical Address 16 4 12 ByteMask 15 0 33 Class 0 34 28 13 Transaction Type Physical Address 38 1...

Page 156: ...ass bit identifies which of the two master Class queues the request has been issued from. The system must maintain strong ordering between transac Table 7-37 Interconnect Transaction Type Encoding Tran...

Page 157: ...riteback bit. This bit is set when a coherent read victimized a dirty line. The system uses this bit for victim handling. 7.17.2.7 IVA: Invalidate me Advisory bit, in P_WRI_REQ transaction only. UltraSPAR...

Page 158: ...without Dtags, however, SC must send the requesting UltraSPARC an S_INV_REQ if IVA = 1 in a P_WRI_REQ. 7.18.1 Using the IVA bit in a P_WRI_REQ. UltraSPARC can issue a cache-coherent block store that will g...

Page 159: ...a subsequent coherent miss to the same address might complete first. Systems with Dtags ignore the IVA bit, so this is not an issue. Note: This hazard occurs only in uniprocessor systems without Dtags. In...

Page 160: ...32 bits of the 64-bit address to zero when the address mask (AM) bit in the PSTATE register is set. Both big- and little-endian byte orderings are supported in UltraSPARC. The default data access byte ord...
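
The address-mask behavior described in this preview can be sketched as a small model. This is an illustration only: the function name `mask_address` and its Boolean `pstate_am` parameter are hypothetical, and the sketch assumes the truncated text refers to the upper 32 bits of the address.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


def mask_address(address: int, pstate_am: bool) -> int:
    """Model of PSTATE.AM masking (hypothetical helper, not a hardware API).

    When the address mask bit is set, the upper 32 bits of the 64-bit
    address are forced to zero; otherwise the address is left unchanged.
    """
    address &= MASK64  # keep the model within 64 bits
    return address & MASK32 if pstate_am else address
```

With `pstate_am` set, `mask_address(0x123456789ABCDEF0, True)` keeps only the low word, `0x9ABCDEF0`.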

Page 161: ...through their virtual addresses as physical addresses. Accesses made using these ASIs are always made in big-endian mode, regardless of the setting of the D-MMU's IE bit. Accesses to Internal ASIs with i...

Page 162: ...address space user privilege V9 1116 ASI_AS_IF_USER_SECONDARY ASI_AIUS RW2 Secondary address space user privilege V9 1816 ASI_AS_IF_USER_PRIMARY_LITTLE ASI_AIUPL RW2 Primary address space user privile...

Page 163: ...nostics access A 8 2 16 ASI_INTR_DISPATCH_STATUS ASI_INTR_DISPATCH_STATUS 016 R1 Interrupt vector dispatch status 9 3 3 16 ASI_INTR_RECEIVE ASI_INTR_RECEIVE 016 RW Interrupt vector receive status 9 3...

Page 164: ...U TSB 8K Pointer Register 6 9 8 5A16 ASI_DMMU_TSB_64KB_PTR_REG ASI_DMMU_TSB_64KB_PTR_REG 016 R1 D MMU TSB 64K Pointer Register 6 9 8 5B16 ASI_DMMU_TSB_DIRECT_PTR_REG ASI_DMMU_TSB_DIRECT_PTR_REG 016 R...

Page 165: ...ASI_UDB_INTR_W 5016 W1 Outgoing interrupt vector data register 1 9 3 1 16 ASI_UDB_INTR_W ASI_UDB_INTR_W 6016 W1 Outgoing interrupt vector data register 2 9 3 1 16 ASI_BLOCK_AS_IF_USER_PRIMARY_LI TTLE...

Page 166: ...ry address space 4 16 bit partial store little endian 13 6 1 CB16 ASI_PST16_SECONDARY_LITTLE ASI_PST16_SL W1 4 Secondary address space 4 16 bit partial store little endian 13 6 1 CC16 ASI_PST32_PRIMAR...

Page 167: ...ster s ID field A16 ASI_FL16_PRIMARY_LITTLE ASI_FL16_PL RW 4 Primary address space one 16 bit floating point load store little endian 13 6 2 B16 ASI_FL16_SECONDARY_LITTLE ASI_FL16_SL RW 4 Secondary ad...

Page 168: ...DQ: Set to zero, since incoming slave data writes are not supported by UltraSPARC. PREQ_RQ: Set to one, since one incoming P_REQ request may be outstanding at one time. Two types of incoming requests are su...

Page 169: ...RC-I. Figure 8-3: UPA_CONFIG Register (UltraSPARC-II). MCAP (UltraSPARC-II): Implementation-dependent module capability bits. Software can use these bits to determine the processor module speed capability. Thes...

Page 170: ...nsaction. WB 10 (UltraSPARC-II): Maximum number of outstanding Writebacks. SCIQ0 9 8 (UltraSPARC-II): Maximum number of outstanding Class 0 transactions. BST 7: Maximum number of outstanding block stores. NCST 6...

Page 171: ...for use by an implementation. 4 2 SPARC V9-Defined ASRs. Table 8-3 defines the SPARC V9 ASRs that must be supported by a conforming processor implementation. 1 An attempt to read this register by non pri...

Page 172: ...m asi rd tick regrd rd pc regrd rd fprs regrd wr regrs1 reg_or_imm fprs Table 8-4: Non-SPARC V9 ASRs. ASR Value ASR Name Syntax Access Description Section 1016 PERF_CONTROL_REG RW3 Performance Control R...

Page 173: ...1 tick_cmpr rd dcr regrd wr regrs1 dcr Table 8-5: Other UltraSPARC Registers. Register Name Access Description Section INTERRUPT_GLOBAL_REG RW 8 Interrupt handler globals 14 5 9 MMU_GLOBAL_REG RW 8 MMU...

Page 174: ...is generated instead of a data_access_protection trap 9 AG alternate globals MG MMU globals IG interrupt globals illegal_instruction AG 01016 710 privileged_opcode AG 01116 6 fp_disabled AG 02016 8 fp...

Page 175: ...these ASIs are used with incorrect opcodes, they do not take mem_address_not_aligned or illegal_instruction traps for memory and register alignment required by the ASI. For example, block ASIs require 6...

Page 176: ...cross call, a processor or I/O device first writes to the Outgoing Interrupt Vector Data Registers according to an established software convention, described below. A subsequent write to the Interrupt Ve...

Page 177: ...ence PSTATE IE 1 DONE if NACK Retry after random delay if NACKED Until DONE. Note: In order to avoid deadlocks, interrupts must be enabled for some period before retrying the atomic sequence. Alternativel...

Page 178: ...n Section 9.1.2, Interrupt Vector Receive, on page 162, the processor takes an implementation-dependent interrupt_vector trap after receiving an interrupt packet. Software uses a number of scratch regist...

Page 179: ...70₁₆. A write to this ASI triggers an interrupt vector dispatch to the target CPU residing at slot MID (Module ID), along with the contents of the three Interrupt Vector Data Registers. A read from this...

Page 180: ...rrupt Vector Data Registers (Privileged). ASI_UDB_INTR_R (data 0): ASI 7F₁₆, VA<63:0> = 40₁₆. ASI_UDB_INTR_R (data 1): ASI 7F₁₆, VA<63:0> = 50₁₆. ASI_UDB_INTR_R (data 2): ASI 7F₁₆, VA<63:0> = 60₁₆. Data: Interrupt data. A read fr...

Page 181: ...ield matches the TICK Register's counter field, the TICK_INT field is set and a software interrupt is generated. See also Section 14.1.7, "TICK Register" on page 239 and Section 14.5.1, Per-Processor TICK C...

Page 182: ...d for service at level n have been serviced, the kernel will write to the CLEAR_SOFTINT register (ASR 15₁₆) with bit n set, in order to clear that interrupt. Note that the complement of the value written t...
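
The effect of a CLEAR_SOFTINT write can be sketched as a bit operation. This is a minimal model under stated assumptions: the truncated sentence is taken to mean that the complement of the written value is ANDed into the pending-interrupt register, the 16-bit register width is assumed, and the names `clear_softint` and `level_bit` are hypothetical.

```python
SOFTINT_MASK = 0xFFFF  # assumed register width for this model


def level_bit(n: int) -> int:
    """Bit mask for interrupt level n (hypothetical helper)."""
    return 1 << n


def clear_softint(softint: int, written: int) -> int:
    """Model a write to CLEAR_SOFTINT.

    Every bit set in `written` is cleared in the pending register;
    all other pending bits are left untouched.
    """
    return softint & ~written & SOFTINT_MASK
```

For example, with levels 2 and 4 pending, `clear_softint(level_bit(2) | level_bit(4), level_bit(2))` leaves only level 4 pending.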

Page 184: ...e bits in the LSU_Control_Register. When PSTATE.RED is explicitly set by a software write, there are no side effects other than disabling the I-MMU. Software must create the appropriate state itself. Trap...

Page 185: ...E or RETRY instruction in RED_state. Note that the RAS is cleared after Power-on Reset. Section 16.2.10, "Return Address Stack (RAS)" on page 272 discusses the RAS in detail. The following code fragment fills...

Page 186: ...s reset affects only one processor, not the entire system. 10.1.4 Watchdog Reset (WDR) and error_state. A SPARC V9 processor enters error_state when a trap occurs and TL = MAXTL. The processor signals itself...

Page 187: ...Unchanged Y Unknown Unchanged PIL Unknown Unchanged CWP Unknown Unchanged except for register window traps TT TL 1 trap type 3 4 trap type CCR Unknown Unchanged ASI Unknown Unchanged TL MAXTL min TL...

Page 188: ...dep impl dep impl dep 0 0 0 0 0 0 slot ID 1 0 1 1B16 Unchanged Unchanged Unchanged Unchanged Unchanged Unchanged Unchanged Unchanged Unchanged slot ID 1 0 1 1B16 LSU_CONTROL all 0 off 0 off VA_WATCHPO...

Page 189: ...changed MID Unknown Unchanged ESTATE_ERR_EN ISAPEN sys addr err NCEEN non CE CEEN CE 0 off 0 off 0 off Unchanged Unchanged Unchanged AFAR PA Unknown Unchanged AFSR all Unchanged Unchanged Other UltraS...

Page 190: ...Fault Status Register and the UDB Error Register; see Section 11.3.3, "Asynchronous Fault Address Register" on page 182, Section 11.3.2, "Asynchronous Fault Status Register" on page 180, and Section 11.3.4 U...

Page 191: ...the oldest non-executed instruction and its next PC. As a result, execution cannot normally be resumed from the point that the trap is taken. Instruction access errors are reported before executing the...

Page 192: ...ter flushing to remove the corrupted data. In case of an instruction error, the instruction returned to the CPU is marked for termination (to be aborted). This means that a bad instruction will not create...

Page 193: ...aSPARC will take a disrupting data_access_error trap with priority 33 instead of a deferred trap. This avoids panics when the system displaces corrupted user data from the cache. Note: To prevent multipl...

Page 194: ...ced before installing in the E-Cache. This prevents using the bad data, or having the bad data written back to memory with good ECC bits. Uncorrectable ECC errors on cache fills will be reported for any...

Page 195: ...olicy described in Table 11-6, "Error Detection and Reporting in AFAR and AFSR" on page 183. The AFSR is logically divided into four fields: Bit 32, the accumulating multiple-error (ME) bit, is set when multi...

Page 196: ...e clear will be performed before logging the new error status. The syndrome field is read-only, and writes to this field are ignored. Refer to Table 10-1, Machine State After Reset and in RED_state, on pa...

Page 197: ...ll corresponding error bits in AFSR. If software attempts to write to these bits at the same time as an error that captures address occurs, the error ad Table 11-3 E-Cache Data Parity Syndrome Bit Order...

Page 198: ...ress Register (Bits | Field | Use | RW): 63:41 | Reserved | | R; 40:4 | PA<40:4> | Physical address of faulting transaction | RW; 3:0 | Reserved | | R. Table 11-6: Error Detection and Reporting in AFAR and AFSR. Error Type PA SYNDROM...

Page 199: ...for correctable errors from system. In case of multiple outstanding errors, only the first is recorded. Bits <9:8> are sticky error bits that record the most recently detected errors. These bits accumulate...

Page 200: ...ultiple errors conditions have occurred. Errors are captured in the order that they are detected, not necessarily in program order. If an error occurs at the same time as error bits are cleared by softwa...

Page 201: ...e P_SYND field. 11.5.3 AFSR E-Cache Tag Parity (ETS) Overwrite Policy. Parity information for the first occurrence of any error is captured in the ETS field of the AFSR register. Error logging in this fiel...

Page 202: ...V9 12. Instruction Set Summary 189; 13. UltraSPARC Extended Instructions 195; 14. Implementation Dependencies 235; 15. SPARC V9 Memory Models 255...

Page 204: ...cates a SPARC V9 core instruction. The "Ref" column lists the section number that contains the instruction documentation. SPARC V9 core instructions are documented in The SPARC Architecture Manual, Version...

Page 205: ...add A 12 LIGNDATA Perform data alignment for misaligned data 13 5 5 NDNOT1 s Negated src1 AND src2 single precision 13 5 6 NDNOT2 s src1 AND negated src2 single precision 13 5 6 ND s Logical AND sing...

Page 206: ...bit fixed pack 13 5 3 FPACK 16 32 Four 16 bit two 32 bit pixel pack 13 5 3 FPADD 16 32 s Four 16 bit two 32 bit partitioned add single precision 13 5 2 FPMERGE Two 32 bit pixel to 64 bit pixel merge 1...

Page 207: ...7 UWA Load unsigned word from alternate space A 28 X Load extended A 27 XA Load extended from alternate space A 28 XFSR Load extended floating point state register A 25 MBAR Memory barrier A 31 OVcc M...

Page 208: ...ht logical extended A 31 STB Store byte A 53 STBA Store byte into alternate space A 54 STBAR Store barrier A 50 STD Store doubleword A 53 STDA Store doubleword into alternate space A 54 STDF Store dou...

Page 209: ...teger divide and modify condition codes A 10 IVX 64 bit unsigned integer divide A 36 MUL UMULcc Unsigned integer multiply and modify condition codes A 37 RASI Write ASI register A 62 RASR Write ancill...

Page 210: ...emory accesses, see Section 13.6, "Memory Access Instructions." 13 2 SHUTDOWN. Format (3). Description: The SHUTDOWN instruction waits for all outstanding transactions to be completed. This leaves the system an...

Page 211: ...tain the data sheet. This is a privileged instruction; an attempt to execute it while in non-privileged mode causes a privileged_opcode trap. Traps: privileged_opcode. Note: Privileged software should save...

Page 212: ...value. Conversion from 32-bit fixed to 16-bit fixed is also supported with the FPACKFIX instruction. Rounding can be performed by adding 1 to the round bit position. Complex calculations needing more dy...

Page 213: ...S_LITTLE instruction. See Section 13.5.5, "Alignment Instructions" on page 214. Traps: fp_disabled. 5 Graphics Instructions. All instruction operands are in floating-point registers, unless otherwise specifie...

Page 214: ...01 0001 Two 16-bit add; FPADD32 0 0101 0010 Two 32-bit add; FPADD32S 0 0101 0011 One 32-bit add; FPSUB16 0 0101 0100 Four 16-bit subtract; FPSUB16S 0 0101 0101 Two 16-bit subtract; FPSUB32 0 0101 0110 Two...

Page 215: ...raphics instruction source operand in the next instruction group. Similarly, do not use the result of a standard FPADD as a 32-bit graphics instruction source operand in the next instruction group. Traps...

Page 216: ...o not use the result of an FPACK as part of a 64-bit graphics instruction source operand in the next three instruction groups. Do not use the result of FEXPAND or FPMERGE as a 32-bit graphics instructi...

Page 217: ...on is performed to convert the scaled value into a signed integer (that is, round toward negative infinity). If the resulting value is negative (that is, the MSB is set), zero is delivered as the clipped valu...

Page 218: ...ing clipping information. 2. For each 32-bit value, truncate and clip to an 8-bit unsigned integer starting at the bit immediately to the left of the implicit binary point (i.e., between bits 23 and 22 of e...

Page 219: ...s the result in the 32-bit rd register. This operation, illustrated in Figure 13-5, is carried out as follows: 1. Left shift each 32-bit value in rs2 by the number of bits in the 3 rs2 rd 7 2 0 5 implicit...

Page 220: ...igned integer (i.e., rounds toward negative infinity). If the resulting value is less than -32768, -32768 is delivered as the clipped value. If the value is greater than 32767, 32767 is delivered. Otherwise, the...
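
The clipping rule in this preview is ordinary saturation to the signed 16-bit range. A minimal sketch, assuming the lost minus signs in the extracted text (a lower bound of -32768); the name `clip_int16` is hypothetical and the function models only the final clipping step, not the preceding shift and truncation.

```python
INT16_MIN = -32768
INT16_MAX = 32767


def clip_int16(value: int) -> int:
    """Saturate an integer to the signed 16-bit range.

    Values below -32768 deliver -32768, values above 32767 deliver 32767,
    and everything in between is passed through unchanged.
    """
    return max(INT16_MIN, min(INT16_MAX, value))
```

For instance, `clip_int16(40000)` delivers 32767 and `clip_int16(-40000)` delivers -32768.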

Page 221: ...s to a 16-bit fixed value. 2. Stores the results in the rd register. Figure 13-6: FEXPAND Operation. 5 3 5 FPMERGE. FPMERGE interleaves four corresponding 8-bit unsigned values in rs1 and rs2 to produce a 6...

Page 222: ...ed when it is applied twice in succession, for example: R1R2R3R4 B1B2B3B4 R1B1R2B2R3B3R4B4 R1G1B1A1R2G2B2A2. Figure 13-7: FPMERGE Operation...
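
The byte interleaving that FPMERGE performs, and the effect of applying it twice, can be sketched as follows. This is a behavioral model under assumptions: the operands are treated as 32-bit integers with byte 1 in the most significant position, the rs1 byte precedes the rs2 byte in each pair, and the function name `fpmerge` is used only for illustration.

```python
def fpmerge(rs1: int, rs2: int) -> int:
    """Interleave the four bytes of rs1 with the four bytes of rs2.

    With rs1 = R1R2R3R4 and rs2 = B1B2B3B4 (most significant byte first),
    the 64-bit result is R1B1R2B2R3B3R4B4.
    """
    result = 0
    for i in range(4):  # i = 0 selects the most significant byte
        shift = 24 - 8 * i
        a = (rs1 >> shift) & 0xFF
        b = (rs2 >> shift) & 0xFF
        result = (result << 16) | (a << 8) | b
    return result


# Applying the merge twice converts planar R, B and G, A data into RGBA order.
R, G, B, A = 0x01020304, 0x21222324, 0x11121314, 0x31323334
rb = fpmerge(R, B)                     # R1B1R2B2R3B3R4B4
ga = fpmerge(G, A)                     # G1A1G2A2G3A3G4A4
rgba_hi = fpmerge(rb >> 32, ga >> 32)  # R1G1B1A1R2G2B2A2
```

The second pass works on the upper 32 bits of each intermediate result; the lower halves would yield pixels 3 and 4 in the same way.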

Page 223: ...duct FMUL8x16AU 0 0011 0011 8 16 bit upper partitioned product FMUL8x16AL 0 0011 0101 8 16 bit lower partitioned product FMUL8SUx16 0 0011 0110 upper 8 16 bit partitioned product FMUL8ULx16 0 0011 011...

Page 224: ...e most significant bit. Typically, this operation is used with filter coefficients as the fixed-point rs2 value and image data as the rs1 pixel value. Appropriate scaling of the coefficient allows variou...

Page 225: ...L is the same as FMUL8x16AU, except that the least significant 16 bits of the 32-bit rs2 register are used for the value. Figure 13-10: FMUL8x16AL Operation...

Page 226: ...4 5 FMUL8ULx16. FMUL8ULx16 multiplies the unsigned lower 8 bits of each 16-bit value in rs1 by the corresponding fixed-point signed integer in rs2. Each 24-bit product is sign-extended to 32 bits. The u...

Page 227: ...ed left by 8 bits to make up a 32-bit result. The result is stored in the corresponding 32 bits of the destination (rd) register. The operation is illustrated in Figure 13-13...

Page 228: ...duct is sign-extended to 32 bits and stored in the rd register. The operation is illustrated in Figure 13-14. Figure 13-14: FMULD8ULx16 Operation. Code Example 13-2: 16-bit x 16-bit → 32-bit Multiply. fmuld8s...

Page 229: ...lf of the concatenated value. Bytes in this value are numbered from most significant to least significant, with the most significant byte being byte 0. Eight bytes are extracted from this value, where th...
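
The extraction described here (eight bytes taken from a 16-byte concatenation, with byte 0 as the most significant byte) can be sketched as below. This is a model under assumptions: rs1 is taken to form the upper half of the concatenated value, and the byte offset is passed in as a plain `offset` parameter because the preview is cut off before it names the source of the offset. The function name is illustrative.

```python
MASK64 = (1 << 64) - 1


def faligndata(rs1: int, rs2: int, offset: int) -> int:
    """Extract 8 bytes from the 16-byte value rs1:rs2, starting at `offset`.

    Bytes are numbered from most significant (byte 0) to least
    significant (byte 15); `offset` must be in the range 0..7.
    """
    if not 0 <= offset <= 7:
        raise ValueError("offset must be between 0 and 7")
    combined = ((rs1 & MASK64) << 64) | (rs2 & MASK64)
    # Drop the bytes that follow the selected window, then keep 8 bytes.
    return (combined >> (8 * (8 - offset))) & MASK64
```

With `rs1 = 0x0001020304050607` and `rs2 = 0x08090A0B0C0D0E0F`, an offset of 3 selects bytes 3 through 10.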

Page 230: ...c2 single precision; FOR 0 0111 1100 Logical OR; FORS 0 0111 1101 Logical OR, single precision; FNOR 0 0110 0010 Logical NOR; FNORS 0 0110 0011 Logical NOR, single precision; FAND 0 0111 0000 Logical AND; FAN...

Page 231: ...fregrs1 fregrs2 fregrd fands fregrs1 fregrs2 fregrd fnand fregrs1 fregrs2 fregrd fnands fregrs1 fregrs2 fregrd fxor fregrs1 fregrs2 fregrd fxors fregrs1 fregrs2 fregrd fxnor fregrs1 fregrs2 fregrd fx...

Page 232: ...Pixel Compare Instructions. Format (3). opcode | opf | operation: FCMPGT16 | 0 0010 1000 | Four 16-bit compare, set rd if src1 > src2; FCMPGT32 | 0 0010 1100 | Two 32-bit compare, set rd if src1 > src2; FCMPLE16 | 0 0010 0000...

Page 233: ...For FCMPLE, each bit in the result is set if the corresponding value in rs1 is less than or equal to the value in rs2. Greater-than-or-equal comparisons are made by swapping the operands. For FCMPEQ, each...
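
A partitioned less-than-or-equal compare of four 16-bit values can be modeled as below. This is a sketch, not the instruction definition: it assumes the lanes are compared as signed values and that result bit i corresponds to lane i counted from the least significant end; the helper names `pack16`, `to_signed16`, and `fcmple16` are hypothetical.

```python
def pack16(values):
    """Pack four 16-bit values into one 64-bit word, first value in the top lane."""
    word = 0
    for v in values:
        word = (word << 16) | (v & 0xFFFF)
    return word


def to_signed16(x: int) -> int:
    """Interpret a 16-bit pattern as a signed integer."""
    return x - 0x10000 if x & 0x8000 else x


def fcmple16(rs1: int, rs2: int) -> int:
    """Return a 4-bit mask; bit i is set when lane i of rs1 <= lane i of rs2."""
    mask = 0
    for lane in range(4):  # lane 0 is the least significant 16 bits
        a = to_signed16((rs1 >> (16 * lane)) & 0xFFFF)
        b = to_signed16((rs2 >> (16 * lane)) & 0xFFFF)
        if a <= b:
            mask |= 1 << lane
    return mask
```

Swapping the operands, as the text notes, turns the same routine into a greater-than-or-equal test: `fcmple16(b, a)` sets bit i when lane i of `a` is greater than or equal to lane i of `b`.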

Page 234: ...mask is computed from left and right edge masks as follows: 1. The left edge mask is computed from the 3 least significant bits (LSBs) of rs1 and the right edge mask is computed from the 3 LSBs of rs2, a...

Page 235: ...left edge mask. The integer condition codes are set the same as a SUBCC instruction with the same operands. End-of-scan-line comparison tests may be performed using edge with an appropriate conditional...

Page 236: ...ce the result of a non-PDIST instruction in the previous two instruction groups. Table 13-2: Edge Mask Specification (Little-Endian) (Edge Size | A2-A0 | Left Edge | Right Edge): 8 | 000 | 1111 1111 | 0000 0001; 8 | 001 | 111...

Page 237: ...8, 16 (ARRAY16), or 32 bits (ARRAY32). The rs2 operand specifies the power-of-two size of the X and Y dimensions of a 3D image array. The legal values for rs2 and their meanings are shown in the following ta...

Page 238: ...zero. The number of zeros in the least significant bits is determined by the element size. An element size of eight bits has no zeros, an element size of 16 bits has one zero, and an element size of 32 b...
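
The relationship between element size and trailing zero bits amounts to a left shift of the byte-element address. A minimal sketch, assuming the truncated sentence ends with two zeros for 32-bit elements; the function name `scale_array_address` and its parameters are hypothetical.

```python
# Number of zero bits appended for each element size, per the text;
# the 32-bit entry is an assumption since the preview is cut off.
TRAILING_ZEROS = {8: 0, 16: 1, 32: 2}


def scale_array_address(array8_address: int, element_bits: int) -> int:
    """Scale an 8-bit-element array address for wider elements.

    The address is shifted left so that its least significant bits are
    zero: no zeros for 8-bit, one for 16-bit, two for 32-bit elements.
    """
    return array8_address << TRAILING_ZEROS[element_bits]
```

For example, an 8-bit-element address of `0b1011` becomes `0b10110` for 16-bit elements.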

Page 239: ...block. The following code fragment shows assembly of components along an interpolated line at the rate of one component per clock on UltraSPARC. Code Example 13-4: Assembly of Components Along an Inter...

Page 240: ...8-bit conditional stores to secondary address space, little-endian. STDFA ASI_PST16_P C2₁₆ Four 16-bit conditional stores to primary address space. STDFA ASI_PST16_S C3₁₆ Four 16-bit conditional stores...

Page 241: ...t is big-endian. Note: If the byte ordering is little-endian, the byte enables generated by this instruction are swapped with respect to big-endian. Traps: fp_disabled, mem_address_not_aligned, data_access_e...

Page 242: ...condary address space, little-endian. LDDFA STDFA ASI_FL16_P D2₁₆ 16-bit load/store from/to primary address space. LDDFA STDFA ASI_FL16_S D3₁₆ 16-bit load/store from/to secondary address space. LDDFA STDF...

Page 243: ...the low-order 8 or 16 bits of the register. Little-endian ASIs transfer data in little-endian format in memory; otherwise, memory is assumed to be big-endian. Short loads and stores typically are used with...

Page 244: ...access_exception trap will be taken for a noncacheable access, or use with any instruction other than LDDA. A mem_address_not_aligned trap will be taken if the access is not aligned on a 128-bit boundar...

Page 245: ...STDFA ASI_BLK_S F1₁₆ 64-byte block load/store from/to secondary address space. LDDFA STDFA ASI_BLK_PL F8₁₆ 64-byte block load/store from/to primary address space, little-endian. LDDFA STDFA ASI_BLK_SL F9...

Page 246: ...from a 64-byte aligned memory area into eight double-precision floating-point registers specified by fregrd. The lowest addressed eight bytes in memory are loaded into the lowest numbered double-preci...

Page 247: ...rules, data from before or after the load may be used. UltraSPARC continues execution before all of the store data has been transferred. If store data registers are overwritten before the next block sto...

Page 248: ...MEMBAR #StoreLoad instruction, the contents of the block are undefined. If the BST overlaps a later store or flush and there is no intervening trap or MEMBAR #StoreStore instruction, the contents of the b...

Page 249: ...a f6 f8 f40 faligndata f8 f10 f42 faligndata f10 f12 f44 faligndata f12 f14 f46 addcc l0 1 l0 bg pt l1 fmovd f14 f48 end of loop handling l1 ldda regaddr ASI_BLK_P f0 stda f32 regaddr ASI_BLK_P falign...

Page 250: ...P opcodes and instructions with invalid values in reserved fields, other than reserved FPops or fields in graphics instructions that reference floating-point registers and the reserved field in the T...

Page 251: ...erroneous condition. Upon completion of trap processing, the state of the CPU is restored before returning to the offending code or terminating the process. This time-consuming operation is necessary be...

Page 252: ...4 1 5 SIGM Support (Impdep #116). UltraSPARC initiates a Software-Initiated Reset (SIR) by executing a SIGM instruction while in privileged mode. When in non-privileged mode, SIGM behaves as a NOP. See also S...

Page 253: ...struction_access_exception trap if PSTATE.AM is not set. If the target address of a JMPL or RETURN instruction is an out-of-range address and PSTATE.AM is not set, a trap is generated with the PC the ad...

Page 254: ...e the D-MMU SFAR contains only 44 bits, the trap handler must decode the load or store instruction if the full 64-bit virtual address is needed. See also Section 6.9.4, I-/D-MMU Synchronous Fault Status R...

Page 255: ...ight-window, 64-bit integer register file; that is, NWINDOWS = 8. UltraSPARC truncates values stored in the CWP, CANSAVE, CANRESTORE, CLEANWIN, and OTHERWIN registers to three bits. This includes implicit updat...
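
Truncating a window register to three bits is equivalent to arithmetic modulo NWINDOWS = 8. A minimal sketch of that behavior; the helper name `truncate_window_register` is hypothetical, and the model covers only the truncation itself, not any trap conditions.

```python
NWINDOWS = 8  # stated in the text; gives a 3-bit register field


def truncate_window_register(value: int) -> int:
    """Keep only the low three bits of a window-register value.

    This models both explicit writes and implicit updates (for example,
    incrementing or decrementing the current window pointer).
    """
    return value & (NWINDOWS - 1)
```

Under this model, incrementing a window pointer of 7 wraps to 0, and decrementing 0 wraps to 7.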

Page 256: ...manufacturer code (0017₁₆, TI JEDEC number) that identifies the manufacturer of an UltraSPARC CPU. impl: 16-bit implementation code (0010₁₆) that uniquely identifies an UltraSPARC class CPU. Table 14-3 shows...

Page 257: ...hed_FPop trap is signalled and these operations are handled in system software. The unfinished trapping cases are listed in Table 14-4 and Table 14-5. Because trapping on subnormal operands and results...

Page 258: ...n handling. Underflow is detected before rounding. Prediction of overflow, underflow, and inexact traps for divide and square root is used to simplify the hardware. For divide, pessimistic prediction occurs...

Page 259: ...t register file modifying instructions include floating-point Table 14-6: Unimplemented Quad-Precision Floating-Point Instructions (Instruction | Description): F<s|d>TOq | Convert single/double to quad precis...

Page 260: ...instructions. The FBfcc, FMOVcc, and MOVcc instructions use one of these condition code sets to determine conditional control transfers and conditional register moves. Note: fcc0 is the same as the fcc in...

Page 261: ...mal Result Trapping Cases (NS = 0) on page 243. ver: This field identifies a particular implementation of the UltraSPARC FPU architecture. ftt: The 3-bit floating-point trap type field is set whenever a floa...

Page 262: ...54 exceptions. 14.4 SPARC V9 Memory-Related Operations. 14.4.1 Load/Store Alternate Address Space (Impdep #5, 29, 30). Supported ASI accesses are listed in Section 8.3, "Alternate Address Spaces" on page 146. 14...

Page 263: ...FLUSH sequence is performed, UltraSPARC guarantees that earlier code modifications will be visible across the whole system. 14.4.5 PREFETCH{A} (Impdep #103, 117). For UltraSPARC-I, PREFETCH{A} instructions with...

Page 264: ...mpdep #113, 121). UltraSPARC supports all three memory models (TSO, PSO, RMO). See Section 15.2, "Supported Memory Models" on page 256. 14.4.10 I/O Operations (Impdep #118, 123). I/O spaces and their accesses are speci...

Page 265: ...e is described in Chapter 3, "Cache Organization." 5 3 Memory Management Unit. UltraSPARC implements a multi-level memory management scheme. The MMU architecture is described in Chapter 4, Overview of the MM...

Page 266: ...the trap globals. Two 1-bit fields, PSTATE.IG and PSTATE.MG, have been added to the PSTATE register to select which set of global registers to use. The PSTATE.IG and PSTATE.MG bits are also stored with th...

Page 267: ...ong with the rest of the PSTATE register. When an interrupt_vector trap (trap type 60₁₆) is taken, UltraSPARC selects the Interrupt Global registers by setting IG and clearing AG and MG. When a fast_instr...

Page 268: ...power requirements during idle periods. A privileged instruction, SHUTDOWN, has been added to facilitate a software-controlled power-down of the CPU and system. Power-down support is described in Appendix...

Page 269: ...g and Diagnostics Support. UltraSPARC support for debug and diagnostics is described in Appendix A, "Debug and Diagnostics Support" on page 303...

Page 270: ...s to function correctly if data is shared. MEMBAR is a SPARC V9 memory synchronization primitive that enables a programmer to explicitly control the ordering in a sequence of memory operations. Process...

Page 271: ...ual, Version 9. The definitions in the following sections apply to system behavior as seen by the programmer. A description of MEMBAR can be found in Section 5.3.2, Memory Synchronization: MEMBAR and FLUS...

Page 272: ...must check (snoop) the store buffer for the most recent store to that address. For SPARC V9 compatibility, a MEMBAR #Lookaside should be used between a store and a subsequent load to the same non-cacheabl...

Page 273: ...accesses with the E-bit set (that is, those having side effects) are all strongly ordered with respect to each other. A MEMBAR must be used between cacheable memory references if stronger order is desired...

Page 274: ...Section IV: Producing Optimized Code. 16. Code Generation Guidelines 261; 17. Grouping Rules and Stalls 281...

Page 276: ...ntage of by using modern compiler technology. This technology was not available previously, mainly because the hardware support was not sufficient to justify its development. 16.2 Instruction Stream Issu...

Page 277: ...cted for simplicity and timing considerations; hardware support for getting instructions from two adjacent lines was not included. Consequently, on average for random accesses, 3.25 instructions are fet...
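
The 3.25 figure can be reproduced with a short calculation. The sketch below assumes a 32-byte line holding eight 4-byte instructions, a fetch of at most four instructions per access, and a uniformly random starting position within the line; these parameters and the function names are assumptions made for illustration, since the preview does not state them in full.

```python
from fractions import Fraction

LINE_INSTRUCTIONS = 8  # assumed: 32-byte line, 4-byte instructions
FETCH_WIDTH = 4        # assumed: at most four instructions per access


def instructions_fetched(slot: int) -> int:
    """Instructions available when fetching starts at `slot` within a line.

    A fetch cannot continue into the adjacent line, so starting near the
    end of a line returns fewer than FETCH_WIDTH instructions.
    """
    return min(FETCH_WIDTH, LINE_INSTRUCTIONS - slot)


def average_fetch() -> Fraction:
    """Average instructions per fetch over all equally likely start slots."""
    total = sum(instructions_fetched(s) for s in range(LINE_INSTRUCTIONS))
    return Fraction(total, LINE_INSTRUCTIONS)
```

Under these assumptions, start slots 0 through 4 deliver four instructions and slots 5, 6, and 7 deliver three, two, and one, so the average is 26/8 = 3.25.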

Page 278: ...he first three instructions in a group occupy slots that in most cases are interchangeable with respect to resources. Only special cases of instructions that can only be executed in IEU1 followed by I...

Page 279: ...taken branch is among the four instructions, the next field contains the index of the target of the branch. The following cases represent situations when the prediction bits and/or the next field do not...

Page 280: ...al branch is forced after address 31 and there is already a branch in the group. Figure 16-5: Artificial Branch Inserted after a 32-byte Boundary. 16.2.3 I-Cache Timing. If accesses to the I-Cache hit, the...

Page 281: ...s fit in the I-Cache and avoid hot spots (collisions). UltraSPARC provides instrumentation to profile a program and detect if instruction accesses generate a cache miss or a cache hit. For example, one ca...

Page 282: ...ns from that line are simply forwarded to the next stage. If the line is from a different virtual page, the translation is obtained from the iTLB a cycle later. The cost of crossing a page boundary is th...
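
Page 282 recommends positioning short loops within a single page so that the one-cycle uTLB page-crossing cost is avoided. The following is a minimal sketch of how a tool might check this, assuming the 8-Kbyte minimum page size the manual mentions; `crosses_page` and `padding_to_avoid_crossing` are hypothetical helper names, not part of any Sun toolchain.

```python
PAGE_SIZE = 8 * 1024  # smallest page size assumed by the manual (8 Kbytes)


def crosses_page(start_addr: int, size_bytes: int,
                 page_size: int = PAGE_SIZE) -> bool:
    """Return True if the code range [start, start + size) spans two pages."""
    if size_bytes <= 0:
        raise ValueError("size_bytes must be positive")
    first_page = start_addr // page_size
    last_page = (start_addr + size_bytes - 1) // page_size
    return first_page != last_page


def padding_to_avoid_crossing(start_addr: int, size_bytes: int,
                              page_size: int = PAGE_SIZE) -> int:
    """Bytes of padding needed before the loop so it fits in one page."""
    if size_bytes > page_size:
        raise ValueError("loop is larger than a page; it cannot fit in one")
    if not crosses_page(start_addr, size_bytes, page_size):
        return 0
    # Push the loop start up to the next page boundary.
    return page_size - (start_addr % page_size)


if __name__ == "__main__":
    # A 32-byte loop starting 16 bytes before a page boundary crosses it.
    print(crosses_page(0x1FF0, 32))               # True
    print(padding_to_avoid_crossing(0x1FF0, 32))  # 16
```

Padding is only one option; a linker or compiler could equally reorder code so that the loop starts earlier in the page.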

Page 283: ...gorithm used for branch prediction is represented in Figure 16-6. Note: This figure is identical to Figure A-15. Figure 16-6: Dynamic Branch Prediction State Diagram. For loops in steady state, the algorith...

Page 284: ...gain any performance. The penalty for a mispredicted branch is always 4 cycles. SETcc, Bicc, and the delay slot can be grouped together (Figure 16-7). Figure 16-7: Handling of Conditional Branches. Conditiona...

Page 285: ...the second CTI. It processes the first CTI, executes instructions until the second CTI reaches the N3 stage, squashes all instructions executed after the first CTI, and executes instructions starting wi...

Page 286: ...edictions for hard-to-predict branches. For example, in Figure 16-10, if the outcome of branch A (which is executed before branch B) has an impact on the direction of branch B, then it is desirable to split...

Page 287: ...L or RETURN with rs1 equal to %o7 (normal subroutine) or %i7 (leaf subroutine). The RAS provides a guess for the target address so that prefetching can continue even though the address calculation has not ye...

Page 288: ...on between two loads returning data. As soon as a cycle without a load appears in the pipeline, the latency of loads is brought back to two cycles. Note: The SPARC V8 LD instruction is replaced with LDUW...

Page 289: ...s possible to organize data at compile time so that collisions are minimized, however. For frequently executed loops, the compiler should organize the data so that all accesses within the loop are mappe...

Page 290: ...Loads that miss the D-Cache do not necessarily stall the pipeline (non-blocking loads). Instead, they are sent to the load buffer, where they wait for the data to be returned from the E-Cache. The pipeline...

Page 291: ...In this case, the younger load also must enter the load buffer; it will access the D-Cache array only after the older load (D-Cache miss) does so. If the load buffer is not empty, the D-Cache array access i...

Page 292: ...to the same 16-byte sub-block, the entering load is marked as a hit, since by the time it accesses the D-Cache array the sub-block will be present (Code Example 16-2). The detection of a hit eliminates a...

Page 293: ...this in more detail. Code Example 16-3: Avoiding Bus Turnaround Penalties (1-1-1 mode only). 3.6.5 Using LDDF to Load Two Single-Precision Operands/Cycle. UltraSPARC supports single-cycle 8-byte data trans...

Page 294: ...hat there could be a match between them. In order to simplify the hardware, the full 40 physical address bits are not used when comparing the address of the memory location requested by the load with th...

Page 295: ...s non-faulting loads (equivalent to silent loads used for Multiflow TRACE and Cydrome Cydra-5) so that loads can be moved ahead of conditional control structures that guard their use. Non-faulting loads...

Page 296: ...ten in Mixed Case BODY FONT. Examples are: FdMULq, Floating-point multiply double to quad (SPARC V9); LDDF, Load Double Floating-Point Register (SPARC V9); SHUTDOWN, Power Down Support (UltraSPARC). Instruction Fam...

Page 297: ...uctions are shown with offsets between their stages to indicate the amount of latency that normally occurs between the instructions. The following instruction pair has one cycle of latency. This instru...

Page 298: ...are added for each I-Cache miss. The next fetch from the I-Cache will not add instructions to the instruction buffer for one to two clocks after the E-Cache instructions are added. Back-to-back I-Cache...

Page 299: ...UBcc TV, ADDcc, ANDcc, ANDNcc, ORcc, ORNcc, SUBcc, XORcc, XNORcc, EDGE, and ARRAY. CALL, JMPL, BPr, PST, and FCMP{LE,NE,GT,EQ}{16,32} also require the IEU1 data path, besides counting as CTI, store, or floating-point in...

Page 300: ...s to the TICK, PSTATE, and TL registers, and FLUSH{W} instructions cause a pipeline flush when they reach the W Stage, effectively inserting nine bubbles. 17.5.2 IEU Dependencies. Instructions that have the...

Page 301: ...To avoid this, software should explicitly force the use instruction to be in the third group or later after the FCMP{LE,NE,GT,EQ}{16,32}. MULX, {U,S}MUL{cc}, MULScc, {U,S}DIV{X}, {U,S}DIVcc, and STD cannot be in t...

Page 302: ...erting nine bubbles into the pipe. The pipeline is flushed even if the second DCTI is annulled. 17.6.1 Control Transfer Dependencies. UltraSPARC can group instructions following a control transfer with...

Page 303: ...ing the delay slot. For example: ... When a control transfer is mispredicted, the instruction buffer and instructions younger than the delay slot in the pipe are flushed, effectively inserting four bubbles i...

Page 304: ...ruction is stalled in issue until the FDIV instruction completes. A predicted annulled load does not affect dependency checking after it is dispatched. For example: ... 1. The W1 Stage is a virtual stage th...

Page 305: ...7 Load/Store Instructions. Load/store instructions can be dispatched only if they are in the first three instruction slots. One load/store instruction can be dispatched per group. Load/store instruction...

Page 306: ...r example: ... When an instruction referencing a load result enters the E Stage and the data is not yet returned, all instructions in the E Stage and earlier will be stalled. If there are multiple load uses...

Page 307: ...six clocks after the load reaches the C Stage, otherwise. Because load data is returned in order, a D-Cache load hit that reaches the C Stage one clock after a D-Cache miss also returns data seven clocks...

Page 308: ...e previous store is dispatched. 17.7.1.5 Other Timing Issues. Additional clocks are added to the time a load returns data for E-Cache misses and arbitration for the D- and E-Caches. An E-Cache miss adds a...

Page 309: ...after the data leaves the store buffer. Data leaves the store buffer when the write is issued to the E-Cache SRAM (for cacheable accesses), UDB (for non-cacheable accesses), and internal register (for intern...

Page 310: ...latency to the first word of read data is at least 18 processor clocks. Noncacheable stores are removed from the store buffer with the same timing as if the store were an E-Cache hit, provided that the...

Page 311: ...ns that have the same destination register in the same register file cannot be grouped together. For example: ... FBfcc cannot be grouped with an older FCMP{E}{s,d}, even if they reference different floating...

Page 312: ...IV or FSQRT dependent instruction would be released will be held for one clock, regardless of data dependency. FDIV and FSQRT use the floating-point multiplier for final rounding, so an M Class operatio...

Page 313: ...les. Floating-point loads and stores are independent of these mixed-precision rules. 1. A floating-point or graphics instruction that follows an FMOV/FABS/FNEG of different precision breaks the group even...

Page 314: ...ixed-precision floating-point instruction. To avoid the pipe flush overhead, software should explicitly force the use instruction to be at least the latency number of groups after the source instruction...

Page 315: ...Vcc{s,d}, FMOV{s,d}, FABS{s,d}, FNEG{s,d}, FPADD{16,32}{s}, FPSUB{16,32}{s}, FALIGNDATA, FPMERGE, FEXPAND, FPACK{16,32,FIX}, FMUL8x16{AL,AU}, FMUL{d}8ULx16, FMUL{d}8SUx16, PDIST rs1 rs2, FCMPLE{16,32}, FCMPNE{16,32}, FCMPGT{16,3...

Page 316: ...ort, 303; B. Performance Instrumentation, 319; C. Power Management, 327; D. IEEE 1149.1 Scan Interface, 329; E. Pin and Signal Descriptions, 337; F. ASI Names, 345...

Page 318: ...rough system calls to these facilities. See Section 6.9.4, I-/D-MMU Synchronous Fault Status Registers (SFSR), on page 58 for SFSR details. Caution: A STXA to any internal debug or diagnostic register require...

Page 319: ...tructions that read or update the Graphic Status Register (GSR) are treated as floating-point instructions. They cause an fp_disabled trap if either PSTATE.PEF or FPRS.FEF is cleared. See Section 13.5, Gra...

Page 320: ...ority than the physical address watchpoint trap. Separate 8-bit byte masks allow watchpoints to be set for a range of addresses. Zero bits in the byte mask cause the comparison to ignore the correspond...

Page 321: ...ed 64-bit address into the watchpoint register. 6 LSU_Control_Register. ASI 45₁₆, VA 00₁₆. Name: ASI_LSU_CONTROL_REGISTER. The LSU_Control_Register contains fields that control several memory-related hardw...

Page 322: ...le with side effects. A.6.3 Parity Control. FM<15:0>: LSU.parity_mask. If set, UltraSPARC writes will generate incorrect parity on the E-Cache data bus for bytes corresponding to this mask. The parity_mask c...

Page 323: ...in the watchpoint mask, a virtual watchpoint trap is generated. A.6.4.3 Physical Address Data Watchpoint Enable. PR, PW: LSU.physical_address_data_watchpoint_enable. If PR (PW) is set, a data read (write) that m...

Page 324: ...nto four fields per entry. The instruction field contains eight 32-bit instructions. The tag field contains a 28-bit physical tag and a valid bit. The pre-decode field contains eight 4-bit information pa...

Page 325: ...Instruction Access Address Format (ASI 66₁₆). IC_set: This 1-bit field selects a set (2-way associative). IC_addr: This 10-bit index (<12:3>) selects an aligned pair of 32-bit instructions. Figure A-7: I-Cache Ins...

Page 326: ...This 8-bit index (i.e., addr<12:5>) selects an IC_Line. IC_line: For LDDA accesses, this 2-bit field selects a pair of pre-decode fields in a 64-bit-aligned instruction pair. For STXA accesses, the least signi...

Page 327: ...ng. 7.4 I-Cache LRU/BRPD/SP/NFA Fields. ASI 6F₁₆, VA<63:14>=0, VA<13>=IC_set, VA<12:3>=IC_addr, VA<2:0>=0. Name: ASI_ICACHE_PRE_NEXT_FIELD. Figure A-13: I-Cache LRU/BRPD/SP/NFA Field Access Address Format (ASI 6F₁₆)...
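
The address layout in this excerpt (bit 13 selects the set, bits 12 to 3 carry the 10-bit index, all other bits zero) can be packed and unpacked with plain shifts and masks. This is an illustrative sketch only; the function names are hypothetical and the code does not issue any ASI access.

```python
IC_SET_SHIFT = 13    # VA<13>   = IC_set (1 bit, selects one of two sets)
IC_ADDR_SHIFT = 3    # VA<12:3> = IC_addr (10-bit index)
IC_ADDR_BITS = 10
IC_ADDR_MASK = (1 << IC_ADDR_BITS) - 1


def icache_diag_va(ic_set: int, ic_addr: int) -> int:
    """Build the diagnostic virtual address from the two fields."""
    if ic_set not in (0, 1):
        raise ValueError("ic_set is a 1-bit field")
    if not 0 <= ic_addr <= IC_ADDR_MASK:
        raise ValueError("ic_addr is a 10-bit field")
    # VA<63:14> and VA<2:0> stay zero, as the format requires.
    return (ic_set << IC_SET_SHIFT) | (ic_addr << IC_ADDR_SHIFT)


def decode_icache_diag_va(va: int) -> tuple[int, int]:
    """Split a diagnostic virtual address back into (ic_set, ic_addr)."""
    return (va >> IC_SET_SHIFT) & 1, (va >> IC_ADDR_SHIFT) & IC_ADDR_MASK


if __name__ == "__main__":
    va = icache_diag_va(1, 0x3FF)
    print(hex(va))                     # 0x3ff8
    print(decode_icache_diag_va(va))   # (1, 1023)
```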

Page 328: ...f either of the corresponding instructions is a branch with static prediction bit set; otherwise, IC_brpd is set to likely not taken. The prediction bits are subsequently updated according to the dynam...

Page 329: ...Cache ASI accesses are supported: data (ASI 46₁₆) and tag/valid (ASI 47₁₆). A.8.1 D-Cache Data Field. ASI 46₁₆, VA<63:14>=0, VA<13:3>=DC_addr, VA<2:0>=0. Name: ASI_DCACHE_DATA. Figure A-16: D-Cache Data Access Address...

Page 330: ...o PA without page mapping. To prevent interference from instruction prefetching modifying the E-Cache state, LDXA/STXA instructions that use these ASIs should be on non-physical-cacheable pages. A.9.1 E...

Page 331: ...READING. VA<63:41>=0, VA<40:39>=2. 0.5 Mb: VA<38:19>=0, VA<18:6>=EC_addr, VA<5:0>=0. 1 Mb: VA<38:20>=0, VA<19:6>=EC_addr, VA<5:0>=0. 2 Mb: VA<38:21>=0, VA<20:6>=EC_addr, VA<5:0>=0. 4 Mb: VA<38:22>=0, VA<21:6>=EC_addr, VA<5:0>=0. VA...

Page 332: ...ASI_ECACHE_TAG are ignored, but the contents of the E-Cache_tag_data_register are written to the selected E-Cache line. A.9.3 E-Cache Tag/State/Parity Data Accesses. ASI 4E₁₆, VA<63:0>=0. Name: ASI_ECACHE_TA...

Page 334: ...vileged software to access the PICs causes a privileged_action trap. Event measurements in non-privileged and/or privileged modes can be controlled by setting the PCR.UT and PCR.ST fields. Two 32-bit P...

Page 335: ...er_trace. If set, events in non-privileged (user) mode are counted. This may be set along with PCR.ST to count all selected events. ST: System_trace. If set, events in privileged (system) mode are counted. This m...

Page 336: ...cle counting is controlled by the PCR.UT and PCR.ST fields. Instr_cnt [PIC0, PIC1]: The number of instructions completed. Annulled, mispredicted, or trapped start set up PCR end sel PCR sel accumulate stat PI...

Page 337: ...t instruction in the group. Dispatch0_FP_use [PIC1]: First instruction in the group depends on an earlier floating-point result that is not yet available, but only while the earlier instruction is not stal...

Page 338: ...regardless of whether the access will be used. IC_ref [PIC0]: I-Cache references. I-Cache references are fetches of up to four instructions from an aligned block of eight instructions. I-Cache references ar...

Page 339: ...wnership UPA transaction. EC_wb [PIC1]: E-Cache misses that do writebacks. EC_snoop_inv [PIC0]: E-Cache invalidates from the following UPA transactions: S_INV_REQ, S_CPI_REQ...

Page 340: ...tch0_IC_miss; 0011 Dispatch0_storeBuf; 1000 IC_ref; 1001 DC_rd; 1010 DC_wr; 1011 Load_use; 1100 EC_ref; 1101 EC_write_hit_RDO; 1110 EC_snoop_inv; 1111 EC_rd_hit. Table B-2: PIC.S1 Selection Bit Field Encoding. S1...
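
The S0 selection codes visible in this excerpt map naturally onto a lookup table. The sketch below includes only the codes that can be read in full here; the entry that precedes `0011` is cut off in the excerpt, so its code is deliberately left out, and the dictionary and function names are illustrative rather than taken from the manual.

```python
# S0 selection codes readable in the excerpt above.
# Codes lost to truncation are intentionally not guessed.
PIC_S0_EVENTS = {
    0b0011: "Dispatch0_storeBuf",
    0b1000: "IC_ref",
    0b1001: "DC_rd",
    0b1010: "DC_wr",
    0b1011: "Load_use",
    0b1100: "EC_ref",
    0b1101: "EC_write_hit_RDO",
    0b1110: "EC_snoop_inv",
    0b1111: "EC_rd_hit",
}

NOT_LISTED = "not listed in this excerpt"


def s0_event_name(code: int) -> str:
    """Return the event name for a 4-bit S0 code, if the excerpt lists it."""
    if not 0 <= code <= 0b1111:
        raise ValueError("S0 is a 4-bit field")
    return PIC_S0_EVENTS.get(code, NOT_LISTED)


def s0_code_for(event: str) -> int:
    """Reverse lookup: event name to 4-bit S0 code."""
    for code, name in PIC_S0_EVENTS.items():
        if name == event:
            return code
    raise KeyError(event)


if __name__ == "__main__":
    print(s0_event_name(0b1011))                 # Load_use
    print(format(s0_code_for("EC_ref"), "04b"))  # 1100
```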

Page 342: ...flushed to memory by software. This flush should be done by displacement flush if other masters are doing coherent accesses while the flush is being performed. Cache flushing is described in Section 5.2...

Page 343: ...a synchronous wake-up signal eliminates the problems of warm-switching the PLL loops and sampling the wake-up signal without a clock. When the reset pin is deasserted, UltraSPARC begins RED_state reset...

Page 344: ...parts: a test access port controller; an instruction register; numerous public and private test data registers. For information about how to obtain a copy of IEEE Std 1149.1-1990, see the Bibliography. D.2...

Page 345: ...isabled, the instruction register is initialized to select the Device ID register. 3.2 RUN-TEST/IDLE: An intermediate controller state between scan operations. If no instruction is selected, all test data...

Page 346: ...[Figure: TAP controller state diagram. States shown: TEST-LOGIC-RESET, RUN-TEST/IDLE, SELECT-DR-SCAN, SELECT-IR-SCAN, CAPTURE-IR/DR, SHIFT-IR/DR, EXIT1-IR/DR, PAUSE-IR/DR, EXIT2-IR/DR, UPDATE-IR/DR. The 0/1 transition labels are not recoverable from this excerpt.]...
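
Because the state diagram on this page did not survive extraction, the sketch below writes out the standard IEEE 1149.1 TAP controller transitions as a table, keyed on the TMS value sampled at each TCK rising edge. It follows the IEEE standard rather than anything specific recoverable from this excerpt, and the helper names `step` and `run` are illustrative.

```python
# Standard IEEE 1149.1 TAP transitions:
# state -> (next state if TMS = 0, next state if TMS = 1)
TAP_NEXT = {
    "TEST-LOGIC-RESET": ("RUN-TEST/IDLE", "TEST-LOGIC-RESET"),
    "RUN-TEST/IDLE":    ("RUN-TEST/IDLE", "SELECT-DR-SCAN"),
    "SELECT-DR-SCAN":   ("CAPTURE-DR", "SELECT-IR-SCAN"),
    "CAPTURE-DR":       ("SHIFT-DR", "EXIT1-DR"),
    "SHIFT-DR":         ("SHIFT-DR", "EXIT1-DR"),
    "EXIT1-DR":         ("PAUSE-DR", "UPDATE-DR"),
    "PAUSE-DR":         ("PAUSE-DR", "EXIT2-DR"),
    "EXIT2-DR":         ("SHIFT-DR", "UPDATE-DR"),
    "UPDATE-DR":        ("RUN-TEST/IDLE", "SELECT-DR-SCAN"),
    "SELECT-IR-SCAN":   ("CAPTURE-IR", "TEST-LOGIC-RESET"),
    "CAPTURE-IR":       ("SHIFT-IR", "EXIT1-IR"),
    "SHIFT-IR":         ("SHIFT-IR", "EXIT1-IR"),
    "EXIT1-IR":         ("PAUSE-IR", "UPDATE-IR"),
    "PAUSE-IR":         ("PAUSE-IR", "EXIT2-IR"),
    "EXIT2-IR":         ("SHIFT-IR", "UPDATE-IR"),
    "UPDATE-IR":        ("RUN-TEST/IDLE", "SELECT-DR-SCAN"),
}


def step(state: str, tms: int) -> str:
    """Advance the controller by one TCK edge with the given TMS value."""
    if tms not in (0, 1):
        raise ValueError("tms must be 0 or 1")
    return TAP_NEXT[state][tms]


def run(state: str, tms_bits) -> str:
    """Apply a sequence of TMS values and return the final state."""
    for tms in tms_bits:
        state = step(state, tms)
    return state


if __name__ == "__main__":
    # From reset, TMS = 0,1,0,0 reaches SHIFT-DR.
    print(run("TEST-LOGIC-RESET", [0, 1, 0, 0]))
    # Holding TMS high for five clocks resets the controller from any state.
    print(all(run(s, [1] * 5) == "TEST-LOGIC-RESET" for s in TAP_NEXT))
```

A table-driven form keeps the sixteen states easy to audit against the standard's diagram, which matters more here than compactness.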

Page 347: ...orary controller state in which the IR/DR retain their previous state. D.3.8 PAUSE-IR/DR: A temporary controller state in which the IR/DR retain their previous state. This state is provided so that the s...

Page 348: ...5 Instructions. The UltraSPARC 8-bit instruction register (IR) implements numerous public and private instructions. There are 75 valid instructions out of the 256 possible encodings; all invalid encoding...

Page 349: ...ter as the active test data register. Used to perform board-level interconnect testing. When active, the boundary scan chain drives the processor pins. Therefore, UltraSPARC cannot operate in its normal fun...

Page 350: ...vice inoperative. D.6 Public Test Data Registers. D.6.1 Device ID Register: A 32-bit register that is loaded with the UltraSPARC ID upon entering the CAPTURE-DR TAP state when the ID instruction is acti...

Page 351: ...etween register bits and the pin signals is described in a Boundary Scan Description Language (BSDL) file, available from your SPARC sales representative. Note: It is recommended that transitions from th...

Page 352: ...system clock. UDB_UEL (I): Asserted when the Low UDB is driving EDATA[63:0] and it has detected an uncorrectable ECC error in that data. Synchronous to system clock. UDB_CEH (I): Asserted when the High UDB is...

Page 353: ...These pins are also used to transfer data to control/status registers on the UDB chip. EDPAR[7:0] (I/O): Byte parity for EDATA. Odd parity is driven for all EDATA transfers from the UDB and checked if UDB...

Page 354: ...s arbitration protocol. Synchronous to system clock. S_REPLY[3:0] (I): System Reply packet, from the system to UltraSPARC. Used by UltraSPARC for flow control and initiating data transfers between the system...

Page 355: ...ynchronous to processor clock. TOE_L (O): Active low operation enable for all E-Cache tag SRAM reads and writes. Active low. Synchronous to processor clock. Table E-5: Clock Interface Pins. Symbol, Type, Name an...

Page 356: ...nterface Pins. Symbol, Type, Name and Function. RESET_L (I): Asserted asynchronously for POR (power-on) resets. Deasserted synchronous to system clock. Active low. XIR_L (I): Asserted to signal XIR resets. Acts like...

Page 357: ...ntial Clock Input A, CLKA, 1, I; Differential Clock Input B, CLKB, 1, I; PLL loop filter connection, LOOP_CAP3, 1, I; Low Frequency D.C. signal, DC_SPARE, 1, I; UDB Clock A copy, SDBCLKA, 1, I; UDB Clock B copy, SDBCLKB, 1...

Page 358: ...m Reply, S_REPLY[3:0], 4, I; System Identification, SYSID[4:0], 5, I; System Clock Input A, SYSCLKA, 1, I; System Clock Input B, SYSCLKB, 1, I; External Event, EXT_EVENT, 1, I; Phase Lock Loop Bypass, PLL_BYPASS, 1, I; Reset...

Page 360: ..._USER_SECONDARY: Secondary address space, user privilege (11₁₆); ASI_AS_IF_USER_SECONDARY_LITTLE: Secondary address space, user privilege, little-endian (19₁₆); ASI_BLK_AIUP: Primary address space, block load/stor...

Page 361: ...ss (47₁₆); ASI_DMMU: D-MMU PA Data Watchpoint Register (58₁₆); ASI_DMMU: D-MMU Secondary Context Register (58₁₆); ASI_DMMU: D-MMU Synch. Fault Address Register (58₁₆); ASI_DMMU: D-MMU Synch. Fault Status Register (58₁₆)...

Page 362: ...imary address space, one 8-bit floating-point load/store (D0₁₆); ASI_FL8_PL: Primary address space, one 8-bit floating-point load/store, little-endian (D8₁₆); ASI_FL8_PRIMARY: Primary address space, one 8-bit flo...

Page 363: ...e (TL>0), little-endian (0C₁₆); ASI_NUCLEUS_QUAD_LDD: Cacheable, 128-bit atomic LDDA (24₁₆); ASI_NUCLEUS_QUAD_LDD_L: Cacheable, 128-bit atomic LDDA, little-endian (2C₁₆); ASI_NUCLEUS_QUAD_LDD_LITTLE: Cacheable, 128-bit...

Page 364: ...space, 8 8-bit partial store, little-endian (C8₁₆); ASI_PST8_PRIMARY: Primary address space, 8 8-bit partial store (C0₁₆); ASI_PST8_PRIMARY_LITTLE: Primary address space, 8 8-bit partial store, little-endian (C8₁₆)...

Page 365: ...rnal UDB Control Register, write low (77₁₆); ASI_UDB_ERROR_W: External UDB Error Register, write high (77₁₆); ASI_UDB_ERROR_W: External UDB Error Register, write low (77₁₆); ASI_UDB_INTR_R: Incoming interrupt vector...

Page 366: ...cements: reduced gate dimensions (0.35) and faster cycle times (4 ns); 8 Mb and 16 Mb E-Cache sizes; additional processor:system clock ratios; use of reduced-cost, increased-density E-Cache SRAMs; support for...

Page 367: ...transition (1-1-1 Mode), 82; Timing overlap for tag read/data write for coherent write (1-1-1 Mode), 83; Read-to-write bus turnaround penalty (1-1-1 Mode), 96; Support for the PREFETCH{A} instructions, 102; Number...

Page 368: ...SRAM mode, 275; Load buffer depth optimized for 1-1-1 mode, 277; E-Cache accessed every other cycle in 2-2 mode, 278; Read-to-write bus turnaround penalty in 1-1-1 mode only, 284; CTI at end of cache line not...

Page 370: ...Back Matter: Glossary, 357; Bibliography, 363; Index, 367...

Page 372: ...indow is one in which all of the registers contain either zero, or a valid address from the current address space, or valid data from the current address space. coherence: A set of protocols guaranteeing...

Page 373: ..._register and IEEE_754_exception. floating-point IEEE-754 exception: A floating-point exception, as specified by IEEE Std 754-1985. floating-point trap type: The specific type of a floating-point exception...

Page 374: ...3 an instruction that can be executed when the processor is in either privileged mode or non-privileged mode. non-privileged mode: The mode in which the processor is operating when PSTATE.PRIV = 0. See also pr...

Page 375: ...ions within an instruction field or a register field that is reserved for definition by future versions of the architecture. A reserved field should only be written to zero by software. A reserved regis...

Page 376: ...faulting load that is carried out before it is known whether the result of the operation is required. These accesses typically are used to speed program execution. An implementation, through a combinat...

Page 377: ...mplementation. unimplemented: An architectural feature that is not directly executed in hardware because it is optional or is emulated in software. unpredictable: Synonymous with undefined. unrestricted: An...

Page 378: ...rs. Boney, Joel. "SPARC Version 9 Points the Way to the Next Generation RISC," SunWorld, October 1992, pp. 100-105. Greenley, D., et al. "UltraSPARC: The Next Generation Superscalar 64-bit SPARC," 40th Annual CompCon...

Page 379: ...t/Interrupt/Clock Controller Data Sheet (STP2210QFP); UltraSPARC-I Uniprocessor System Controller Data Sheet (STP2200BGA); UltraSPARC-I UPA Modules Data Sheet (STP5110); UltraSPARC-II Data Sheet (STP1031); UltraS...

Page 380: ...standing Requests Whitepaper (STB0117). How to Contact SME: Sun Microelectronics (SME) is a division of Sun Microsystems, Inc., 2550 Garcia Avenue, Mountain View, CA, U.S.A. 94043. Phone: (408) 774-8545. FAX: (408) 774-85...

Page 382: ...ter 48 to 49, 51, 145, 167, 220, 238 to 239; Address Space Identifier (ASI) 145 to 146, 255; address translation, virtual-to-physical 21 to 22; ADR_VLD signal 342; alias 357; address 17, 28; boundary 28; boundary mini...

Page 383: ..._SDB_INTR 164 to 165; _SDBH_CONTROL_RE 185; _SDBH_ERROR_REG 184; ASI_SDBL_ERROR_REG 184; ASI_SECONDARY 34; ASI_SECONDARY_LITTLE 34; ASI_SECONDARY_NO_FAULT 36, 42, 49 to 51; ASI_SECONDARY_NO_FAULT_LITTLE 36, 42...

Page 384: ...305; byte granularity 279; Byte Mask 110, 142; BYTE_WE_L signals 341; Bytemask field 142. C: C Stage 276, 290, 292; C stage 269; cache: direct mapped 274, external 18, flushing 28, inclusion 28, level-1 27, level-2 27...

Page 385: ...rency transactions in power-down mode 327; coherent P_REQ 92; Coherent P_REQ transaction packet format, illustrated 140; coherent read hit timing 79; coherent read hit timing, illustrated 79; Coherent S_REQ...

Page 386: ...D0, see Data 0 (D0) field of PIC register; D1, see Data 1 (D1) field of PIC register; Data 0 (D0) field of PIC register 320; data alignment 7, 273; data byte addresses within quadword, illustrated 76; Data Cache (D...

Page 387: ...map Context operation 67; dependency, load use 269; dependency checking 289; destination register 360; Diag, see Diagnostics (Diag) field of TTE; Diagnostic (Diag) field of TTE 43; diagnostic accesses, I-Cache 50...

Page 388: ...raction 76; E-Cache client transactions, relative priorities 77; E-Cache coherence states, defined 94; E-Cache coherency, system responsibility 94; E-Cache Data Access Address, illustrated 315; E-Cache Data Ac...

Page 389: ...nded Interrupt Target ID 117; external cache 4, 18; External Cache (E-Cache) 8, 14; External Cache Unit (ECU) 8, illustrated 5; external power-down (EPD) signal 196, 328; External Reset pin 169; Externally Initiated...

Page 390: ...3 (fcc3) field of FSR register 245; floating-point condition codes 296; floating-point deferred-trap queue (FQ) 247; floating-point exception 358; floating-point IEEE-754 exception 358; floating-point multipli...

Page 391: ...MERGE instruction 200; MERGE operation, illustrated 207; RS Register 285; UB16 instruction 199; FPSUB32 instruction 199; FPSUB32S instruction 199 to 200; FPU Enabled (FEF) field of FPRS register 198, 304; FQ, see...

Page 392: ...Cache Predecode Field Access Address 311, illustrated 311; I-Cache Predecode Field Access Data 311; I-Cache Predecode Field LDDA Access Data, illustrated 311; I-Cache Predecode Field STXA Access Data, illu...

Page 393: ...nation 15; ruction Translation Lookaside Buffer (iTLB) 5, 8, 170; instruction Translation Lookaside Buffer (iTLB) 17; Instruction Translation Lookaside Buffer (iTLB) misses 267; instruction_access_error exception...

Page 394: ...t Vector Receive Register 117; interrupt vector receive register 165; interrupt vector transmission 180; Interrupt Vector Uncorrectable Error (IVUE) field of AFSR 181; interrupt vectors in power-down mode 3...

Page 395: ...instructions 296; machine state after reset 171; machine state in RED_state 171; mandatory SPARC V9 ASRs 156; manuf field of VER register 241; manuf, see Manufacturer (manuf) field of VER register; mask field...

Page 396: ...tate 54; MMU behavior during reset 54; MMU demap 66; MMU demap context operation 66, 68; MMU demap operation format, illustrated 66; MMU demap page operation 66, 68; MMU dTLB Tag Access Register, illustrated 63...

Page 397: ...n branches; next program counter 359; NFO bit in MMU 36; NFO page attribute bit 280; NFO, see No Fault Only (NFO) field of TTE; No Dual Tag Present (NDP) option 93; no dual tag present (NDP) bit 106 to 108; NO_FAUL...

Page 398: ...Writes (PREQ_DQ) field of UPA_PORT_ID register 153; Number of Noncacheable Stores (NCST) field of; Number of Slave Reads (ONEREAD) field of UPA_PORT_ID register 153; Number of Writebacks (WB) field of UPA_CON...

Page 399: ...to 109, 115, 118 to 120, 122, 137 to 138; NACK 101, 106 to 109, 111 to 112, 115, 118 to; P_SNACK transaction 93; P_WRB_REQ 95 to 97, 101, 104, 113, 115, 120, 122, 128, 135, 138, 141; P_WRI_REQ 95 to 96, 101, 105 to 106, 122, 1...

Page 400: ...62; Physical Address (PA) field of TTE 43; physical address data watchpoint 306; Physical Address Data Watchpoint Read Enable; Physical Address Data Watchpoint Write Enable (PW) field of LSU_Control_Register...

Page 401: ...ry Context Register 57; PRIV, see Privileged (PRIV) field of PCR register; Privilege (PRIV) field of AFSR 177; privilege (PRIV) field of PSTATE register 180; privilege violation 60; privileged 47, 360; Privileged (P...

Page 402: ...er; RED_state 17, 19, 39, 54 to 55, 169 to 171, 177, 236, 252, 328, 360: default memory model 255, exiting 39, 170, 252, MMU behavior 54; RED_state_exception trap 158; Reference MMU 24; Register (R) Stage 14; register fil...

Page 403: ...05, 113, 115, 117, 120, 122, 129, 135; S_WAS 110 to 111, 120, 122, 129; S_WBCAN 97, 101, 105, 113, 115, 120 to 122, 125, 129, 137 to 138; S0, see Select Code 0 (S0) field of PCR register; S1, see Select Code 1 (S1) field of PCR...

Page 404: ...ed in ECU 9; snooping 33, 361; store buffer 256; Soft, see Software Defined (Soft) field of TTE; Soft2, see Software Defined (Soft2) field of TTE; SOFTINT Register 161, 166; SOFTINT register 250; SOFTINT_REG Ancilla...

Page 405: ...r (SFAR) 61; chronous Fault Status Register (SFSR) 58, illustrated 58, in E-Cache 77; SYSADDR pins 339; SYSADDR bus 85, 87, 92, 116, 119, 138 to 139, 143: arbitration protocol 84, current driver 84, dead cycle when swi...

Page 406: ..._INT field of SOFTINT register 166; TICK Register 285; TICK_CMPR, see Tick Compare (TICK_CMPR) field of TICK_compare register; TICK_CMPR_REG register 157; TICK_INT 167, 250; TICK_REG Ancillary State Register A...

Page 407: ...ddress (TSB_Base) field of TSB register; TSB_Size, see TSB Size (TSB_Size) field of TSB register; TSO 295: mode 30, 32, ordering 30; TSO memory model 249; TSTATE 253; TSYN_WR_L pin 340; TSYN_WR_L signal 341; turn ar...

Page 408: ...lustrated 154; UPA_PORT_ID Register 152, shadowed 156; UPA_Slave_Int_L signal, unused in UltraSPARC-I 153; UPACAP, see UPA Capabilities (UPACAP) field of UPA_PORT_ID register; UPACA...

Page 409: ...R register; Stage (virtual stage) 289; chdog Reset (WDR) 169, 171, 236; chdog_reset trap 158; chpoint trap 49, 304; see Number of Writebacks (WB) subfield of UPA_CONFIG register; dow_fill trap 238; table (W) field of T...
