Contents: i860

Page 1: ......

Page 2: ...DED CONTROLLERS 16-BIT EMBEDDED CONTROLLERS 16-/32-BIT EMBEDDED PROCESSORS MEMORY MICROCOMMUNICATIONS (2 volume set) MICROCOMPUTER SYSTEMS MICROPROCESSORS PERIPHERALS PRODUCT GUIDE: Overview of Intel's co...

Page 3: ...ocal Sales Tax ______ Postage: add 10% of subtotal. Postage _______ Total _____ Pay by check, money order, or include company purchase order with this form ($100 minimum). We also accept VISA, MasterCard, or A...

Page 4: ...[order form: blank quantity x price lines] Subtotal _____ Must Ad...

Page 5: ...i860 64-BIT MICROPROCESSOR PROGRAMMER'S REFERENCE MANUAL 1990...

Page 6: ...K iRMX iSBC iSBX iSDM iSXM Library Manager MAPNET MCS Megachassis MICROMAINFRAME MULTIBUS MULTICHANNEL MULTIMODULE MultiSERVER ONCE OpenNET OTP PR0750 PROMPT Promware QUEST QueX Quick Erase Quick Puls...

Page 7: ...by the instructions of the i860 microprocessor. Chapter 3, Registers, presents the processor's database. A detailed knowledge of the registers is important to programmers, but this chapter may be skimmed...

Page 8: ...ed to A. Compound statements are enclosed between the keywords of the if statement (IF, THEN, ELSE, FI) or of the do statement (DO, OD). The operator ++ indicates autoincrement addressing. Register names and instru...

Page 9: ...from the same register. NOTE: Depending upon the values of reserved or undefined bits makes software dependent upon the unspecified manner in which the i860 microprocessor handles these bits. Depending...

Page 10: ...tandard support includes TIPS (Technical Information Phone Service), updates and subscription service, product-specific troubleshooting guides, and COMMENTS Magazine. Basic support consists of updates and...

Page 11: ...REGISTER, 3-2; 3.4 EXTENDED PROCESSOR STATUS REGISTER, 3-4; 3.5 DATA BREAKPOINT REGISTER, 3-6; 3.6 DIRECTORY BASE REGISTER, 3-6; 3.7 FAULT INSTRUCTION REGISTER, 3-8; 3.8 FLOATING-POINT STATUS REGISTER, 3-8; 3.9...

Page 12: ...TIONS, 6-5; 6.3.1 Floating-Point Multiply, 6-7; 6.3.2 Floating-Point Multiply Low, 6-8; 6.3.3 Floating-Point Reciprocals, 6-9; 6.4 ADDER INSTRUCTIONS, 6-9; 6.4.1 Floating-Point Add and Subtract, 6-10; 6.4.2 Float...

Page 13: ...sts, 8-3; 8.2 DATA ALIGNMENT, 8-4; 8.3 IMPLEMENTING A STACK, 8-4; 8.3.1 Stack Entry and Exit Code, 8-5; 8.3.2 Dynamic Memory Allocation on the Stack, 8-6; 8.4 MEMORY ORGANIZATION, 8-6; CHAPTER 9 PROGRAMMING EXAMP...

Page 14: ...Address Translation, 4-4; 4-5 Format of a Page Table Entry, 4-5; 4-6 Invalid Page Table Entry, 4-5; 6-1 Pipelined Instruction Execution, 6-3; 6-2 Dual-Operation Data Paths, 6-16; 6-3 Data Paths by Instruction...

Page 15: ...DP MERGE Update, A-5; Register Encoding, B-1. Examples (Title): Example of bla Usage; Cache Flush Procedure; Examples of lock and unlock Usage; Saving Pipeline States; Restoring Pipeline States (1 of 2); Reading Mi...

Page 16: ...TABLE OF CONTENTS. Examples (Example, Title, Page): 9-18 Construction of Color Interpolants, 9-28; 9-19 Z-Mask Procedure, 9-28; 9-20 Accumulator Initialization, 9-30; 9-21 3-D Rendering (1 of 2), 9-31...

Page 17: ...Architectural Overview 1...

Page 18: ......

Page 19: ...r programmers can explicitly use the data cache as if it were a large block of vector registers. To sustain high performance, the i860 microprocessor incorporates wide information paths that include 64...

Page 20: ...Software can switch between scalar and pipelined modes. Large register set: 32 general-purpose integer registers, each 32 bits wide; 32 floating-point registers, each 32 bits wide, which can also be configu...

Page 21: ...ntains the integer register file and decodes and executes load, store, integer, bit, and control-transfer operations. Its pipelined organization with extensive bypassing and scoreboarding maximizes perform...

Page 22: ...0 microprocessor does not have high-level math macro-instructions. High-level math and other functions are implemented in software macros and libraries. For example, the i860 microprocessor does not hav...

Page 23: ...ecially useful for high-resolution distance interpolation. In addition to the special support provided by the graphics unit, many 3-D graphics applications directly benefit from the parallelism of the c...

Page 24: ...n also begins immediately. Compilers designed for the vector model can treat the i860 microprocessor as a vector machine. New instruction-scheduling technology for compilers can compare the processing r...

Page 25: ...orm that can be utilized by a variety of compilers. Simulator and debugger. 1.8.1 Multiprocessing for High Performance with Compatibility. Memory organization of the i860 microprocessor is compatible wit...

Page 26: ......

Page 27: ...Data Types 2...

Page 28: ......

Page 29: ...form. A 32-bit integer can represent a value in the range -2,147,483,648 (-2^31) to 2,147,483,647 (2^31 - 1). Arithmetic operations on 8- and 16-bit integers can be performed by sign-extending the 8- or 16-bit va...

Page 30: ...results. Refer to Table 2-2 for encoding of these special values. 2.4 DOUBLE-PRECISION REAL [Figure: 64-bit layout with SIGN, EXPONENT, and FRACTION fields] A double-precision real (also called double real) data type is...

Page 31: ...defines only the field sizes, not the specific use of each field. Other ways of using the fields of pixels are possible. 2.6 REAL-NUMBER ENCODING. Table 2-2 presents the complete range of values that can...

Page 32: ...ENSITY; R = RED INTENSITY; G = GREEN INTENSITY; B = BLUE INTENSITY; C = COLOR; T = TEXTURE. 3. THESE ASSIGNMENTS OF SPECIFIC MEANINGS TO THE FIELDS OF PIXELS ARE FOR ILLUSTRATION PURPOSES ONLY. ONLY THE FIELD SIZES ARE...

Page 33: ...[Flattened encoding table; each row appears to list sign / biased exponent / fraction] 0 / 11..10 / 11..11 through 0 / 00..01 / 00..00 (Normals); 0 / 00..00 / 11..11 through 0 / 00..00 / 00..01 (Denormals); 0 / 00..00 / 00..00 (Zero); 1 / 00..00 / 00..00 (Zero); 1 / 00..00 / 00..01 through 1 / 00..00 / 11..11 (Denormals); 1 / 00..01 / 00..00 (Normals)...

Page 34: ......

Page 35: ...Registers 3...

Page 36: ......

Page 37: ...dirbase, fir, and fsr. Four special-purpose registers (KR, KI, T, and MERGE). [Figure residue: register-set diagram showing the 32-bit integer registers (r0, r1, ...) beside the 64-bit floating-point register pairs (f0, f2, ...)]...

Page 38: ...read independently of what is stored in them. The floating-point registers are also used by a set of integer operations, primarily for graphics computations. The floating-point registers act as buffer re...

Page 39: ...status bits are changed when a trap occurs. They are restored into their corresponding status bits when returning from a trap handler with a branch indirect instruction when a trap flag is set in the...

Page 40: ...when the instruction has been successfully bypassed. It is possible that the core instruction may cause a trap when the floating-point instruction is suppressed. In this case KNF remains set, permitting...

Page 41: ...nterrupt is the value of the INT input pin. DCS (Data Cache Size) is a read-only field that tells the size of the on-chip data cache. The number of bytes actually available is 2^(12+DCS); therefore, a value of...
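
The DCS size rule on this page can be sketched in a few lines. This is a minimal illustration, assuming the size is 2^(12+DCS) bytes; the function name data_cache_bytes is hypothetical and not part of any Intel tool.

```python
def data_cache_bytes(dcs: int) -> int:
    """Return the data cache size in bytes implied by a DCS field value.

    Assumption: size = 2 ** (12 + DCS). The name is illustrative only.
    """
    if dcs < 0:
        raise ValueError("DCS is an unsigned field")
    return 1 << (12 + dcs)


if __name__ == "__main__":
    for dcs in range(3):
        print(f"DCS={dcs}: {data_cache_bytes(dcs)} bytes")
```

Under that assumption each increment of DCS doubles the reported cache size, starting from 4096 bytes at DCS = 0.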

Page 42: ...omparing to db, a 32-bit access ignores the low-order two bits. This ensures that any access that overlaps the address contained in the register will generate a trap. The trap occurs before the register...

Page 43: ...bit in a PTE that is not itself in the data cache. When CS8 (Code Size 8-Bit) is set, instruction cache misses are processed as 8-bit bus cycles. When this bit is clear, instruction cache misses are proces...

Page 44: ...mode status for the current process. Figure 3-5 shows its format. If FZ (Flush Zero) is clear and underflow occurs, a result-exception trap is generated. When FZ is set and underflow occurs, the result is se...

Page 45: ...Rounding Mode / Rounding Action: Round to nearest or even: closer to b of a or c; if equally close, select even number (the one whose least significant bit is zero). Round down (toward -∞): a. Round up (toward +∞): c...
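
The three rounding actions visible on this page can be illustrated with a small model. This is a sketch only: the integers stand in for the representable values a and c that bracket the exact result b, and the function name and mode strings are hypothetical.

```python
import math
from fractions import Fraction


def round_to_grid(b, mode: str) -> int:
    """Round exact value b to a neighbouring integer grid point.

    a = floor(b) and c = ceil(b) play the role of the two representable
    neighbours of b. Only the modes visible in the page preview are modelled.
    """
    b = Fraction(b)
    a, c = math.floor(b), math.ceil(b)
    if mode == "down":      # toward minus infinity
        return a
    if mode == "up":        # toward plus infinity
        return c
    if mode == "nearest":   # nearest, ties go to the even neighbour
        if b - a < c - b:
            return a
        if c - b < b - a:
            return c
        return a if a % 2 == 0 else c
    raise ValueError(f"unknown mode: {mode}")
```

For example, 5/2 lies exactly between 2 and 3, so "nearest" picks the even neighbour 2, while "up" returns 3.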

Page 46: ...aced into the first stage of the adder and multiplier pipelines. When the processor executes pipelined operations, it propagates the result-status bits of a particular unit (multiplier or adder) one stag...

Page 47: ...nd the T (Temporary) register are special-purpose registers used by the dual-operation floating-point instructions described in Chapter 6. The MERGE register is used only by the graphics instructions, als...

Page 48: ......

Page 49: ...Addressing 4...

Page 50: ......

Page 51: ...ry page table accesses are always done with little endian addressing. Figure 4-1 shows the difference between the two storage modes. Figure 4-2 defines by example how data is transferred from memory ove...

Page 52: ...or a data access trap occurs. A 32-bit value is aligned to an address divisible by four when referenced in memory (i.e., the two least significant address bits must be zero) or a data access trap occurs. A...
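
The alignment rule quoted on this page (a 32-bit value must sit at an address whose two least significant bits are zero) generalizes to any power-of-two operand size. A minimal sketch follows; the helper name is_aligned is hypothetical.

```python
def is_aligned(address: int, size_bytes: int) -> bool:
    """Return True if address is a multiple of size_bytes.

    size_bytes must be a power of two, so the check reduces to testing
    that the low-order address bits are all zero.
    """
    if size_bytes <= 0 or size_bytes & (size_bytes - 1):
        raise ValueError("size_bytes must be a power of two")
    return (address & (size_bytes - 1)) == 0
```

In this model, an access for which is_aligned returns False corresponds to the case where the page says a data access trap occurs.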

Page 53: ...address into the physical address by consulting two levels of page tables. The addressing mechanism uses the DIR field as an index into a page directory, uses the PAGE field as an index into the page ta...
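
The two-level lookup described on this page can be sketched as below. The field widths (10-bit DIR, 10-bit PAGE, 12-bit OFFSET, i.e. 4-Kbyte pages) are an assumption, since the preview does not show them, and the dictionary-based memory model and function names are purely illustrative. Present bits and protection checks are omitted.

```python
OFFSET_BITS = 12   # assumed: 4-Kbyte pages
INDEX_BITS = 10    # assumed: 1024 entries per table


def split_virtual_address(va: int):
    """Split a 32-bit virtual address into (DIR, PAGE, OFFSET)."""
    offset = va & ((1 << OFFSET_BITS) - 1)
    page = (va >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    directory = (va >> (OFFSET_BITS + INDEX_BITS)) & ((1 << INDEX_BITS) - 1)
    return directory, page, offset


def translate(va: int, dtb: int, tables: dict) -> int:
    """Walk two levels of tables and return the physical address.

    tables maps a table's base address to a dict {index: frame address}.
    A missing entry raises KeyError, standing in for a page fault.
    """
    directory, page, offset = split_virtual_address(va)
    pfa1 = tables[dtb][directory]   # first level: page directory entry
    pfa2 = tables[pfa1][page]       # second level: page table entry
    return pfa2 | offset
```

A dictionary lookup replaces the real memory reads so that the index arithmetic stays visible.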

Page 54: ...for each process, or some combination of the two. 4.2.4 Page Table Entries. Page table entries (PTEs) in either level of page tables have the same format. Figure 4-5 illustrates this format. 4.2.4.1 PAGE FR...

Page 55: ...y is not valid for address translation, and the rest of the entry is available for software use; none of the other bits in the entry is tested by the hardware. Figure 4-6 illustrates the format of a page...

Page 56: ...irectory entries is not referenced by the processor, but is reserved. To control external caches, the PTB output pin reflects either CD or WT, depending on the PBM bit of epsr (refer to Chapter 3). 4.2.4.5 A...

Page 57: ...sor clears the psr U bit to indicate supervisor level when a trap occurs (including when the trap instruction causes the trap). The prior value of U is copied into PU. The trap mechanism is described in C...

Page 58: ...ess. Let DIR, PAGE, and OFFSET be the fields of the virtual address; let PFA1 and PFA2 be the page frame address fields of the first- and second-level page tables respectively; DTB is the page directory tab...

Page 59: ...is zero, and if the TLB miss occurred while the bus was not locked, assert LOCK, refetch the PTE, set A, and store the PTE, deasserting LOCK during the store. 7. Locate the PTE at the physical address formed...

Page 60: ...Flushing Instruction and Address-Translation Caches. Storing to the dirbase register with the ITI bit set invalidates the contents of the instruction and address-translation caches. This bit should be...

Page 61: ...Core Instructions 5...

Page 62: ......

Page 63: ...address offset. The immediate value is zero-extended for logical operations and is sign-extended for add and subtract operations (including addu and subu) and for all addressing calculations. Same as src...

Page 64: ...rent instruction pointer plus four. The resulting target address may lie anywhere within the address space. The contents of the memory location indicated by address with a size of x. The comments regardi...

Page 65: ...urce operand by the next instruction. 2. A load instruction should not directly follow a store that is expected to hit in the data cache. Even though immediate address offsets are limited to 16 bits, load...

Page 66: ...formance, a load instruction should not directly follow a store that is expected to hit in the data cache. Even though immediate address offsets are limited to 16 bits, a store using a 32-bit immediate...

Page 67: ...rc1ni Transfer Integer to F-P Register. The ixfr instruction transfers a 32-bit value from an integer register to a floating-point register. Programming Notes: For best performance, the destination of an...

Page 68: ...pipeline has three stages. A pfld returns the data from the address calculated by the third previous pfld, thereby allowing three loads to be outstanding on the external bus. When the data is already in...

Page 69: ...o hit in the data cache. There is no performance impact for a pfld following a store instruction. 3. A string of successive pfld instructions causes internal delays due to the fact that the bandwidth of the...

Page 70: ...c2 Traps: If the operand is misaligned, a data-access trap results. Programming Notes: For the autoincrementing form of the instruction, the register coded as isrc1 must not be the same register as isrc2. F...

Page 71: ...to be updated are selected by the low-order bits of the PM field in the psr. Each bit of PM corresponds to one pixel, with bit 0 corresponding to the pixel at the lowest address. This instruction is typ...

Page 72: ...he add and subtract instructions are also used to implement comparisons. For this use, r0 is specified as the destination, so that the result is effectively discarded. Equal and not-equal comparisons are...

Page 73: ...6. Note that the only difference between the signed and the unsigned forms is in the setting of the condition code CC and the overflow flag OF. The various forms of comparison between variables and cons...

Page 74: ...d isrc2 the low-order 32 bits. The shift count for shrd is taken from the shift count of the last shr instruction, which is saved in the SC field of the psr. Shift left is identical for integers and ordi...

Page 75: ...trap as described in Chapter 7. The trap instruction can be used to implement supervisor calls and code breakpoints. The idest should be zero, because its contents are undefined after the operation. The i...

Page 76: ...isrc2; CC set if result is zero, cleared otherwise. xor isrc1, isrc2, idest (Logical XOR): idest <- isrc1 XOR isrc2; CC set if result is zero, cleared otherwise. xorh const, isrc2, idest (Logical XOR high): idest <- const...

Page 77: ...CORE INSTRUCTIONS. Bit Operation / Equivalent Logical Operation: Set bit = or; Clear bit = andnot; Complement bit = xor; Test bit = and (CC set if bit is clear)...
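
The bit-operation equivalences in this table can be written out directly. The sketch below models 32-bit registers with Python integers; the function names are hypothetical, and the boolean returned by the test function stands in for the CC flag.

```python
MASK32 = 0xFFFFFFFF


def set_bit(value: int, n: int) -> int:
    """Set bit n: logical OR with a one-bit mask."""
    return (value | (1 << n)) & MASK32


def clear_bit(value: int, n: int) -> int:
    """Clear bit n: AND with the complement of the mask (andnot)."""
    return value & ~(1 << n) & MASK32


def complement_bit(value: int, n: int) -> int:
    """Complement bit n: XOR with the mask."""
    return (value ^ (1 << n)) & MASK32


def bit_is_clear(value: int, n: int) -> bool:
    """Test bit n: AND with the mask; True models 'CC set if bit is clear'."""
    return (value & (1 << n)) == 0
```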

Page 78: ...ntrol-transfer instruction before actually transferring control. During the time used to execute the additional instruction, the i860 microprocessor refills the instruction pipeline by fetching instruct...

Page 79: ...ontinue execution at brx sbroff FI Branch on not CC taken Branch if equal Branch if not equal bla isrc1ni isrc2 sbroff Branch on LCC and add LCC_temp clear if isrc2 comp2 isrc1ni signed LCC_temp set...

Page 80: ...rations is the value of isrc2 before the first bla instruction, plus one. Example 5-1 illustrates this use of bla. Programmers should avoid calling subroutines from within a bla loop, because a subroutin...

Page 81: ...dual-instruction mode ELSE IF DIM is set FI THEN enter dual-instruction mode for next instruction pair ELSE enter single-instruction mode for next instruction pair FI Continue execution at address in...

Page 82: ...ccurs saves the address of the ld.c instruction. After a scalar floating-point operation, a st.c to fsr should not change the value of RR, RM, or FZ until the point at which result exceptions are reported...

Page 83: ...lush is suppressed; use it only in supervisor mode. Example 5-2 shows how to use the flush instruction. The addresses used by the flush instruction refer to a reserved 4-Kbyte memory area that is not use...

Page 84: ...ear RC and RB // Change DTB, ATE, or ITI fields here, if necessary st.c Rz, dirbase D_FLUSH: orh FLUSH_P_H, r0, Rw // Rw <- address minus 32 or FLUSH_P_L, Rw, Rw // of flush area or 127, r0, Ry // Ry <- loop count ld...

Page 85: ...d not span a page boundary. After a lock instruction, the location is not locked until the first data access that misses the data cache. Software in a multiprocessing system should ensure that the first...

Page 86: ...Notes: In a locked sequence, a transition to or from dual-instruction mode is not permitted. // LOCKED TEST AND SET // Value to put in semaphore is in r23 lock // ld.b semaphore, r22 // Put current value...

Page 87: ...Floating Point Instructions 6...

Page 88: ......

Page 89: ...perand is to be placed. src1: The first of the two source-register designators. src2: The second of the two source-register designators. dest: The destination-register designator. Thus the operand specifier...

Page 90: ...e holds status information pertaining to those results. The figure assumes that the instruction stream consists of a series of consecutive floating-point instructions, all of one type (i.e., all adder inst...

Page 91: ...STRUCTIONS [Figure residue: results and status held in each pipeline stage (Stage 1, Stage 2, ...) at successive clocks, Clock m through Clock m+5, for instructions i, i+1, ...]...

Page 92: ...e normal case. 2. It is propagated from the first stage of the pipeline. This method is used when restoring the state of the pipeline after a preemption. When a store instruction updates the fsr and the t...

Page 93: ...nstored results from the affected pipeline. After a scalar operation, the values of all pipeline stages of the affected unit (except the last) are undefined. No spurious result-exception traps result when...

Page 94: ...w the last stage. The second previous operation (old second stage) is discarded. The next pipelined multiplier operation stores the single-precision result. Double to Single Transitions: When a pipelined m...

Page 95: ...Multiply Pipelined Floating-Point Multiply Three-Stage Pipelined Multiply k These instructions perform a standard multiply operation. Programming Notes: Fsrc1 must not be the same as fdest for pipelined...

Page 96: ...es the low-order bits of its operands. It operates only on double-precision operands. The high-order 10 bits of the result are undefined. An fmlow can perform 32-bit integer multiplies. Two 64-bit values...

Page 97: ...d compilers must encode fsrc1 as f0. A Newton-Raphson approximation may produce a result that is different from the IEEE standard in the two least significant bits of the mantissa. A library routine sup...

Page 98: ...ese instructions perform standard addition and subtraction operations. The famov and pfamov instructions send fsrc1 through the floating-point adder, preserving the value of -0 (minus zero) when fsrc1 is...

Page 99: ...mov.sd. In assembly language, this conversion can be specified by the fmov or pfmov pseudo-operation with the .sd suffix: fmov.sd fsrc1, fdest (equivalent to famov.sd fsrc1, fdest); pfmov.sd fsrc1, fdest (Equiva...

Page 100: ...re instructions. The pipelined instructions can be used either within a sequence of pipelined instructions or within a sequence of nonpipelined (scalar) instructions. pfgt.p should be used for A B and A B...

Page 101: ...dest <- last-stage adder result; advance A pipeline one stage; A pipeline first stage <- 64-bit value with low-order 32 bits equal to integer part of fsrc1. The instructions fix, pfix, ftrunc, and pftrunc must sp...

Page 102: ...e first stage <- M op1 x M op2. pfmsm.p fsrc1, fsrc2, fdest (Pipelined Floating-Point Multiply with Subtract): fdest <- last-stage multiplier result; advance A and M pipeline one stage (operands accessed before adv...

Page 103: ...perand 1 of the adder can be fsrc1, the T register, the last-stage result of the multiplier pipeline, or the last-stage result of the adder pipeline. 4. Operand 2 of the adder can be fsrc2, the last-stage r...

Page 104: ...ID ID instructions and 1 i2apt.ss ID ID ID Because single-precision values are stored in these 64-bit registers in a format which does not conform to the standard for double-precision numbers, leaving...

Page 105: ...1 | src2 | No | No; 1111 | m12tpa | m12tsa | src1 | src2 | T | A result | No | No. [Table columns: OPC | PFMAM Mnemonic | PFMSM Mnemonic | M-Unit op1 | M-Unit op2 | A-Unit op1 | A-Unit op2 | T Load | K Load] 0000 | mr2p1 | mr2s1 | KR | src2 | src1 | M result | No | No; 0001...

Page 106: ...[Figure residue: data-path diagrams (fsrc1, fsrc2, fdest through the multiplier and adder units) for r2p1/r2s1 and r2pt/r2st]...

Page 107: ...[Figure residue: data-path diagrams (fsrc1, fsrc2, fdest through the multiplier and adder units) for i2s1, i2ap1/i2as1, and i2pt/i2st]...

Page 108: ...[Figure residue: data-path diagrams (fsrc1, fsrc2, fdest through the multiplier and adder units) for m12apm/m12asm]...

Page 109: ...[Figure residue: data-path diagrams (fsrc1, fsrc2, fdest through the multiplier and adder units) for m12tpm/m12tsm and one illegible mnemonic pair]...

Page 110: ...[Figure residue: data-path diagrams (fsrc1, fsrc2, fdest through the multiplier and adder units) for mr2s1, mr2mp1/mr2ms1, and mr2pt/mr2st]...

Page 111: ...[Figure residue: data-path diagrams (fsrc1, fsrc2, fdest through the multiplier and adder units) for mi2s1, mi2mp1/mi2ms1, and mi2pt/mi2st]...

Page 112: ...[Figure residue: data-path diagrams (fsrc1, fsrc2, fdest through the multiplier and adder units) for mrmt1p2/mrmt1s2 and mm12mpm/mm12msm]...

Page 113: ...[Figure residue: data-path diagrams (fsrc1, fsrc2, fdest through the multiplier and adder units) for mimt1p2/mimt1s2, mm12tpm/mm12tsm, and mim1p2]...

Page 114: ...[Figure residue: legend for dual-operation mnemonics covering M-unit operands (KI, KR, M result, A result), T load, A-unit op1 and op2, and add (p) or subtract (s) selection]...

Page 115: ...aphics operation if fdest is not f0, then fdest must not be the same as fsrc1 or fsrc2. For best performance, the result of a scalar operation should not be a source operand in the next instruction unles...

Page 116: ...for floating-point operations. These instructions do not set CC, nor do they cause floating-point traps due to overflow. Programming Notes: In assembly language, fiadd and pfiadd are used to implement the...

Page 117: ...nd an OR instruction use the MERGE register. The addition instructions are designed to add interpolation values to each color-intensity field in an array of pixels or to each distance value in a Z-buff...

Page 118: ...fsrc2(i) and fsrc1(i) OD MERGE <- 0. fzchkl fsrc1, fsrc2, fdest (32-Bit Z-Buffer Check): Consider fsrc1, fsrc2, and fdest as arrays of two 32-bit fields, fsrc1(0)..fsrc1(1), fsrc2(0)..fsrc2(1), and fdest(0)..fdest(1), where ze...

Page 119: ...he instructions compare the distances of the points to be drawn against the values in the Z-buffer and set bits of PM to indicate which distances are smaller than those in the Z-buffer. Previously calc...

Page 120: ...ion implements interpolation of color intensities. The 8- and 16-bit pixel formats use 16-bit intensity interpolation. Being a 64-bit instruction, faddp does four 16-bit interpolations at a time. The 32-bi...

Page 121: ...h faddp instruction, the MERGE register is shifted right by 8 bits. Two faddp instructions should be executed consecutively, one to interpolate for even-numbered pixels, the next to interpolate for odd-nu...

Page 122: ...bits. Normally, three faddp instructions are executed consecutively, one for each color represented in a pixel. The shifting of MERGE causes the results of consecutive faddp instructions to be accumulate...

Page 123: ...en they are loaded into the MERGE register. With each faddp, the MERGE register is shifted right by 8 bits. Normally, three faddp instructions are executed consecutively, one for each color represented in...

Page 124: ...hose that form a Z-buffer. With faddz, 16-bit Z-buffers can use 32-bit distance interpolation, as Figure 6-9 illustrates. Since faddz adds 32-bit values, each value can be treated as a fixed-point real num...

Page 125: ...struction. The fact that data is carried from the low-order 32 bits into the high-order 32 bits may introduce an insignificant distortion into the interpolation. For 32-bit Z-buffers, 64-bit distance int...

Page 126: ...els from the MERGE register, sets any additional bits that may be needed in the pixels (e.g., texture values), and loads the result into a floating-point register. Fsrc1 (when a register) and fdest are floatin...

Page 127: ...O Programming Notes: This scalar instruction is performed by the graphics unit. When it is executed, the result in the graphics-unit pipeline is lost. However, executing this instruction does not impact pe...

Page 128: ...ion mode and encounters a floating-point instruction with the D-bit set, one more 32-bit instruction is executed before dual-mode execution begins. If the i860 microprocessor is executing in dual-instru...

Page 129: ...ot reported on fnop. Because it is a core instruction, d.fnop cannot be used to initiate entry into dual-instruction mode. 6.8.1 Core and Floating-Point Instruction Interaction. 1. If one of the branch an...

Page 130: ...tion is fst or pst, the store should not reference the result register of the floating-point operation. When the core operation is pst, the floating-point instruction cannot be (p)fzchks or (p)fzchkl. 4. Whe...

Page 131: ...anion floating-point instruction, unless the destination is f0 or f1. No overlap of register destinations is permitted; for example, the following instructions must not be paired: d.fmul.ss f9 fl f5 fld.q...

Page 132: ......

Page 133: ...Traps and Interrupts 7...

Page 134: ......

Page 135: ...ts U to zero (supervisor mode). Table 7-1. Types of Traps [flattened table; columns: Type | Indication (psr, epsr, fsr) | Caused by (Condition, Instruction)] Instruction Fault: IT, OF; software traps; trap, intovr. IL; missing unlock; any. SE Floating...

Page 136: ...n mode, when a data-access fault occurs in the absence of other trap conditions, the floating-point half of the dual instruction will already have been executed. 9. Clears the BL bit of dirbase and deasse...

Page 137: ...of the next trap. 7.2.3 Returning from the Trap Handler. Returning from a trap handler involves the following steps: 1. Restoring the pipeline states, including the fsr, KR, KI, T, and MERGE registers, where n...

Page 138: ...curred. To implement the IEEE standard for unordered compares, the trap handler may need to change the value of CC. In this case, it cannot resume at fir + 4, because the new value of CC might cause an inco...

Page 139: ...the intovr instruction. The trap occurs only if OF in epsr is set when intovr is executed. The trap handler should clear OF before returning. Refer to the intovr instruction in Chapter 5. 3. By the lack o...

Page 140: ...source operands are stored in and inspect all four source operands to see if one or both operations need to be fixed up. It can then compute the appropriate result and store the result in des in the ca...

Page 141: ...been lost. The point at which a result exception is reported depends upon whether pipelined operations are being used. Scalar (nonpipelined) operations: Result exceptions are reported on the next floating...

Page 142: ...spect the result, compute the result appropriate for that instruction (a NaN or an infinity, for example), and store the correct result. The result is either stored in the register specified by RR (if nonzer...

Page 143: ...zed by the value at the INT pin just before the end of RESET. The read-only fields of the epsr are set to identify the processor, while the IL, WP, PBM, and BE bits are cleared. The bits U, IM, BR, and BW in p...

Page 144: ...he following items: 1. The current contents of the floating-point status register fsr, including the third-stage result status. 2. Unstored results from the first, second, and third stages. The number of stag...

Page 145: ...how to restore the pipeline state. Trap handlers manipulate the result-status bits in the floating-point pipelines while preparing for pipeline resumption. When storing to fsr with the U-bit set, the res...

Page 146: ...[garbled assembly listing: operand columns (f0, Dummy, Lres) and result names Ares3, Mres2, Ares2, Mres1, Ares1, Ires1, each followed by a // comment marker] // T and MERGE // results get double-precision 1.0 // save third-stage result status // clear...

Page 147: ...[garbled assembly listing: labels L1 and L2; operands Lres3m, r31, Fsr3, Mres3, f2, f4, Temp, fsr] // clear FTE // move low 16 bits to high 16 // move low 16 bits to high 16 [operands f4, f5] fill...

Page 148: ...age andh xl Fsr1 r // test multiplier result precision MRP bc.t Lb // skip next if double pfmul.ss Mres1 f2 f // insert single result pfmul3.dd Mres1 f4 f // insert double result Lb andh x2 Fsr1 r //...

Page 149: ...Programming Model 8...

Page 150: ......

Page 151: ...g point registers is now set at 8. Earlier software used a dividing point at 16. Table 8-1. Register Allocation (Register | Purpose | Left Unchanged by a Subroutine): r0 | Always zero | Yes; r1 | Return address | No; r2...

Page 152: ...n integer, the rest in successively higher-numbered registers. If fewer parameters are required, the remaining registers can be used for temporary variables. If more than 12 parameters are required, the o...

Page 153: ...int value or 64-bit integer. A subroutine may need to save the first parameter to make room for the return value. 8.1.3 Passing Mixed Integer and Floating-Point Parameters in Registers. Integer and float...

Page 154: ...ENTING A STACK. In general, compilers and programmers have to maintain a software stack. Register r2 (called sp in assembly language) is the suggested stack pointer. Register r2 is set by the operating syst...

Page 155: ...point to a 16-byte boundary, as long as the compiler keeps data correctly aligned when assigning positions relative to fp. Figure 8-2 shows the stack-frame format. A fixed format is necessary to allow s...

Page 156: ...nter. Languages such as Pascal that need to maintain activation records on the stack can put them below the frame pointer in the program-specific area. The frame pointer is optional. All stack references...

Page 157: ...I Set return value to allocated space. Example 8-4. Possible Implementation of alloca. [Figure: memory layout] 0xFFFFFFFF; OPERATING SYSTEM CODE AREA; EMPTY; USER CODE AREA; 0xF0400000 FIXED SUBROUTINE ENTRIES; 0xF0000000 OPERATING...

Page 158: ...space for shared-memory areas with other tasks. (UNIX System V allows such shared-memory areas.) The empty areas on the diagram of Figure 8-3 would normally be marked as not present in the page table ent...

Page 159: ...4 even in case a trap occurs on the first instruction of a section. The memory-mapped I/O devices should also be placed in the upper operating-system data space. The paging hardware allows logical addr...

Page 160: ......

Page 161: ...Programming Examples 9...

Page 162: ......

Page 163: ...ly not loaded from memory. Example 9-1 shows how // SIGN-EXTEND 8-BIT INTEGER TO 32 BITS // Assume the operand is already in r16 shl 24, r16, r16 // left justify shra 24, r16, r16 // right justify all but...
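
The shift-left, shift-right-arithmetic pair in this example can be mirrored in a few lines. This is a sketch that models a 32-bit register with Python integers; the function name is hypothetical, and the signed Python result stands in for the two's-complement register contents.

```python
def sign_extend_8_to_32(value: int) -> int:
    """Sign-extend the low 8 bits of value to a signed 32-bit result."""
    # shl 24: left-justify the byte within a 32-bit word.
    left = (value << 24) & 0xFFFFFFFF
    # Reinterpret the 32-bit pattern as a signed quantity.
    if left & 0x80000000:
        left -= 1 << 32
    # shra 24: Python's >> on a negative int is an arithmetic shift.
    return left >> 24
```

Masking the result with 0xFFFFFFFF recovers the raw 32-bit register pattern if that is what a caller needs.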

Page 164: ...s algorithm is optimized for high performance and does not produce results that are rounded according to the IEEE standard. Worst-case error is about two least significant bits. If the result is referen...

Page 165: ...rform the divide. // DOUBLE-PRECISION DIVIDE // The dividend X is in f2 // The divisor Y is in f4 // The result Z is left in f8 frcp.dd f4 f6 fmul.dd f4 f6 fld.d flttwo f1 // The fld.d is free It fsub...
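
The general shape of a reciprocal-based divide like this one (a coarse reciprocal guess refined with multiplies and subtractions from 2.0) can be sketched as a Newton-Raphson iteration. The code below is not a transcription of the listing: the 8-bit seed accuracy, the iteration count, and all names are assumptions chosen for illustration.

```python
import math


def coarse_reciprocal(y: float, bits: int = 8) -> float:
    """Return a low-precision guess for 1/y (a stand-in for a hardware
    reciprocal seed; the 8-bit accuracy is an assumption)."""
    mantissa, exponent = math.frexp(y)          # y = mantissa * 2**exponent
    scale = 1 << bits
    guess = round((1.0 / mantissa) * scale) / scale
    return math.ldexp(guess, -exponent)


def newton_divide(x: float, y: float, iterations: int = 3) -> float:
    """Compute x / y as x * (1/y), refining 1/y by Newton-Raphson."""
    r = coarse_reciprocal(y)
    for _ in range(iterations):
        r = r * (2.0 - y * r)   # each step roughly doubles the correct bits
    return x * r
```

As with the routine described on the preceding page, the result of such an iteration is not guaranteed to be correctly rounded in the last bits.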

Page 166: ...ocks can be overlapped with other operations. // INTEGER MULTIPLY // The multiplier is in r4 // The multiplicand is in r5 // The product is left in r6 // The registers f2, f4, and f6 are used as temporar...

Page 167: ...ision format properly normalized by the i860 microprocessor. The value of Be BN is 2^52 + 2^31 (0x4330_0000_8000_0000). The conversion requires 7 clocks, if the result is referenced in the next instruction. Thr...
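
The constant on this page is consistent with a well-known integer-to-double conversion trick, sketched below. The page preview truncates the actual instruction sequence, so the steps here are an assumption about how such a constant is typically used, and the function name is hypothetical.

```python
import struct

MAGIC_BITS = 0x4330_0000_8000_0000   # bit pattern of 2**52 + 2**31


def bits_to_double(bits: int) -> float:
    """Reinterpret a 64-bit pattern as an IEEE double."""
    return struct.unpack(">d", struct.pack(">Q", bits))[0]


def int32_to_double(x: int) -> float:
    """Convert a signed 32-bit integer to double without an int-to-float op.

    Flipping the sign bit maps x to x + 2**31 as an unsigned value. Placing
    that in the low word under the high word 0x43300000 yields the double
    2**52 + (x + 2**31); subtracting the magic constant leaves x exactly.
    """
    low = (x & 0xFFFFFFFF) ^ 0x80000000
    biased = bits_to_double((0x43300000 << 32) | low)
    return biased - bits_to_double(MAGIC_BITS)
```

The subtraction is exact because both operands are integers below 2^53, which a double represents without rounding.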

Page 168: ...dd f4, f6 fsub.dd f10, f8 fmul.dd f6, f8 fmul.dd f4, f6 fsub.dd f10, f8 fmul.dd f6, f8 fmul.dd f4, f6 fsub.dd f10, f8 fmul.dd f6, f2 fmul.dd f8, f6 // Convert Quotient to fld.d onepluseps fmul.dd f8, f10 ixfr r...

Page 169: ...nknown // End of string indicated by NUL // r17 address of source string // r16 address of destination string copy_string: ld.b 0(r17), r26 // Load one character bte 0, r26, done // Test for NUL character...
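The load-test-store loop of the string copy can be sketched over a byte array standing in for memory. The excerpt is truncated before the store, so whether the terminating NUL is copied is not visible; this sketch assumes it is, and the names `copy_string`, `memory`, `src`, and `dst` are illustrative.

```python
def copy_string(memory, src, dst):
    """Copy a NUL-terminated string inside a bytearray (illustrative)."""
    while True:
        ch = memory[src]          # load one character
        if ch == 0:               # test for the NUL terminator
            break
        memory[dst] = ch          # store the character
        src += 1
        dst += 1
    memory[dst] = 0               # assumption: terminator is copied too
    return memory
```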

Page 170: ...iscards them by specifying register f0 as the destination of the first three instructions. After performing the intended calculations, it flushes the pipeline by executing three dummy addition instructi...
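The priming and flushing pattern described here can be illustrated with a small software model of a three-stage adder pipeline. This is a sketch, not a model of the real hardware: the class `AdderPipeline`, the function `pipelined_sums`, and the choice of 0.0 + 0.0 as the dummy operation are assumptions.

```python
from collections import deque

class AdderPipeline:
    """Toy model: each pfadd starts one add and returns the oldest result."""
    def __init__(self, stages=3):
        self.stages = deque([None] * stages)

    def pfadd(self, a, b):
        self.stages.appendleft(a + b)   # new operation enters stage 1
        return self.stages.pop()        # last-stage result leaves the pipe

def pipelined_sums(pairs, stages=3):
    pipe = AdderPipeline(stages)
    results = []
    for i, (a, b) in enumerate(pairs):
        out = pipe.pfadd(a, b)
        if i >= stages:                 # first `stages` outputs are stale
            results.append(out)
    # Flush: dummy adds push the remaining real results out of the pipe
    flushed = [pipe.pfadd(0.0, 0.0) for _ in range(stages)]
    results.extend(flushed[stages - min(len(pairs), stages):])
    return results
```

Discarding the first three outputs corresponds to naming f0 as the destination, and the three dummy adds at the end correspond to the flushing instructions.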

Page 171: ...s 1.0 8.0 2.0 7.0 8.0 1.0, a series of multiplications followed by additions. The dual-operation instructions are designed precisely to execute this type of calculation efficiently by using the adder a...

Page 172: ...f0 // 6 3 5 4 20 18 0 14 0 8 Discard m12apm.ss f10, f18, f0 // 7 2 6 3 20 20 8 18 0 14 Discard m12apm.ss f11, f19, f0 // 8 1 7 2 18 20 14 20 8 18 Discard // For larger matrices, include more instructions h...

Page 173: ...nes assume that the actual matrices to be multiplied have the following values: A = 1.0 2.0 3.0 4.0 5.0 6.0, B = 6.0 5.0 4.0 3.0 2.0 1.0. Assume further that the two matrices are already loaded into register...
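A rough way to see what a pipelined multiply-accumulate produces for these operands is to note that an adder whose output feeds back into itself through a multi-stage pipeline accumulates several interleaved partial sums, which must be combined at the end. The sketch below assumes a three-stage adder and treats A and B as flat sequences; the function name `interleaved_dot` and both assumptions are illustrative, not taken from the excerpt.

```python
def interleaved_dot(a, b, adder_stages=3):
    """Sum of products accumulated as interleaved partial sums (sketch)."""
    partial = [0.0] * adder_stages
    for i, (x, y) in enumerate(zip(a, b)):
        # Each product joins the partial sum that is leaving the adder
        partial[i % adder_stages] += x * y
    # A final reduction combines the partial sums into one result
    return partial, sum(partial)

A = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
B = [6.0, 5.0, 4.0, 3.0, 2.0, 1.0]
partials, total = interleaved_dot(A, B)
```

With the values above, the partial sums are 18.0, 20.0, and 18.0, and their total is 56.0.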

Page 174: ...10 0 6 0 0 Discard m12apm.dd f12, f24, f0 // 5 2 4 3 12 0 10 0 6 Discard m12apm.dd f14, f26, f0 // 6 1 5 2 12 6 12 0 10 Discard // For larger vectors, include more instructions here // Flushing phase m12a...

Page 175: ...rocedure uses dual-instruction mode to overlap loading, decision making, and branching with the basic pipelined floating-point add instruction pfadd.ss. To make obvious the pairing of core and floating...

Page 176: ...f20, f30, f30 br S d.pfadd.ss f21, f31 nop d.pfadd.ss f22, f30 bla r21, r17 d.pfadd.ss f23, f31 fld.d 8(r16) // If we reach this point // r17 is either 4 or 3 // Exit loop after adding f31 // f21 to the pipe...

Page 177: ...ough straightforward programming techniques. Each example uses dual-instruction mode to perform the loading and loop control operations in parallel with the basic floating-point calculations. The exam...

Page 178: ...8 88 f19 // matrix A row values // matrix B column vals // temporary results T1 f20 T2 f21 T3 f22 shl 2 adds 8 adds 8 adds 4 d.fiadd.dd f0 adds 1 d.fnop M r0 M C f0 L SIZ DEC RC C f0 Ar bla d.fnop subs...

Page 179: ...f f T1 // adds 8 M RC // Reinitialize row/column counter d.m12apm.ss f f T2 // nop // d.pfadd.ss f f T3 // bla DEC RC inner_loop // Won't branch; initializes LCC d.pfadd.ss f f T1 // fld.q 16 A A5 // Lo...

Page 180: ...s from matrix B and the loop control with the eight m12apm instructions in the inner loop. The strategy of Example 9-14 is suitable for larger matrices than the strategy in Example 9-13 because even in...

Page 181: ...adds 8, r0, DEC // Set decrementor for bla adds 8, M, RC // Initialize row/column counter d.fiadd.dd f0, f0, f0 // Initiate dual-instruction mode adds 4, C, C // Start C index one entry low d.fnop // First d...

Page 182: ...ch; initializes LCC d.pfadd.ss f0, f0, T2 // nop // d.fadd.ss T1, T3, T3 // nop // d.fadd.ss T2, T3, T3 // adds 1, Bc, Bc // Decrement column counter d.pfadd.ss f0, f0, f0 // fst.l T3, 4(C) // Store row/column pro...

Page 183: ...color intensities are determined by higher-level graphics software. The points represent the intersection of the scan line with two edges of the projected image of a polygon. For a given scan line, the r...

Page 184: ...iZl iZlh iZ3 iZ3h oldz newz newzh newi iR iRh aR aRh iG iGh aG aGh iB iBh aB aBh lZmask lZmaskh rZmask rZmaskh f2 // Accumulated Z values f3 // f4 // Z interpolant coefficient 1 0 f5 // f6 // Z inter...

Page 185: ...r all scan lines that intersect the polygon; therefore, mZ needs to be calculated only once for each polygon. Example 9-21 assumes that dX and mZ have already been calculated and all that remains is to...

Page 186: ...e way of constructing the operands before starting the distance interpolations. The initial value given to fsrc1 depends on the alignment of the first pixel. Table 9-1 helps to visualize the process. Aft...

Page 187: ...tine, the numbers shown here are the values of the coefficient N, where the actual operands have the values Z1 + N × mZ. For each execution of faddz, fsrc1 is the same as fdest of the prior faddz. After every...

Page 188: ...X1 + N: C1 + mC, C1 + 2 × mC, ..., C1 + N × mC; C(X1 + dX) = C1 + dX × mC = C(X2). Figure 9-3 illustrates Gouraud shading of a triangle. The faddp instruction performs the above calculations 64 bits at a time. Because a pixel is 16 bits...
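The interpolation that faddp carries out in bulk is ordinary linear interpolation along a scan line: each step adds the per-pixel slope mC to the previous value. A minimal scalar sketch follows; the function name `interpolate_span` is hypothetical, and floating point is used here for readability even though the pixel code works on packed fixed-point fields.

```python
def interpolate_span(c1, c2, dx):
    """Return C(X1 + N) = C1 + N * mC for N = 0..dX (scalar sketch)."""
    m_c = (c2 - c1) / dx          # slope, computed once per span
    return [c1 + n * m_c for n in range(dx + 1)]
```

For example, interpolating from an intensity of 10.0 to 20.0 over dX = 5 yields steps of 2.0, ending exactly at C(X2) = 20.0.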

Page 189: ...IAL SRC1 SRC2 The i860 microprocessor operates on 64-bit quantities that are aligned on 8-byte boundaries. The code in this example takes full advantage of this design, handling four 16-bit pix...

Page 190: ...shift by 16 to put the significant shl 16, mB, Rc // bits into the high-order half shr 16, Ra, mR // Return significant 16 bits shr 16, Rb, mG // to low-order half. Any sign bits shr 16, Rc, mB // in high ord...

Page 191: ...The left and right ends of the line segment go through different logic paths so that the Z-buffer masks can be applied by the form instruction. All the interior points are handled by the tight inner l...

Page 192: ...Rtab // 4 5 6 7 shl 5, Lalign, Lalign // Multiply by row width 1 2 3 2 3 4 3 4 5 4 5 6 adds Lalign, Rtab, Rtab // Index row corresponding to alignment fld.d aZi(Rtab), aZ // Z ixfr Z1, Fx // Z fld.d aRi(Rtab...

Page 193: ...rm f0, newi // Move 4 new pixels to 64-bit reg adds 5, dX, r0 // Are there any whole sets (dX 5 L1: d.fzchks oldz, newz, newz // Mark closer points in PM[7..4] bc short_segment // Get out now if no whole set...

Page 194: ...aB // Interpolate 4 blue intensities 8 FBP // Store pixels indicated by PM 3 iG aG // Interpolate 4 green intensities iR // aR // Interpolate // red intensities // No special boundary conditions f ne...

Page 195: ...Instruction Set Summary A...

Page 196: ......

Page 197: ...nd subu and for all addressing calculations. Same as src1 except that no immediate constant or address offset value is permitted. Same as src1 except that the immediate constant is a 5-bit value that is...

Page 198: ...ts), s (16 bits), or l (32 bits); l (32 bits); d (64 bits) or q (128 bits); l (32 bits) or d (64 bits). mem.x(address): The contents of the memory location indicated by address with a size of x. PM: The pixel mask, which is co...

Page 199: ...CC. IF CC = 1 THEN continue execution at brx(lbroff) FI. bc.t lbroff: Branch on CC, Taken. IF CC = 1 THEN execute one more sequential instruction; continue execution at brx(lbroff) ELSE skip next sequential instr...

Page 200: ...e for next instructions pair. Continue execution at address in isrc1ni. (The original contents of isrc1ni is used even if the next instruction modifies isrc1ni. Does not trap if isrc1ni is misaligned.) bte...

Page 201: ...Subtract: fdest ← fsrc1 - fsrc2. fix.p fsrc1, fdest: Floating-Point to Integer Conversion. fdest ← 64-bit value with low-order 32 bits equal to integer part of fsrc1, rounded. Floating-Point Load: fld.y isrc1 isr...

Page 202: ...Operation. Assembler pseudo-operation: fnop = shrd r0, r0, r0. form fsrc1, fdest: OR with MERGE Register. fdest ← fsrc1 OR MERGE; MERGE ← 0. frcp.p fsrc2, fdest: Floating-Point Reciprocal. fdest ← 1 / fsrc2 with maximum mantis...

Page 203: ...3, where zero denotes the least significant field. PM ← PM shifted right by 4 bits. FOR i = 0 to 3 DO PM[i + 4] ← fsrc2(i) ≤ fsrc1(i) (unsigned); fdest(i) ← smaller of fsrc2(i) and fsrc1(i) OD. MERGE ← 0. intovr: Software Trap...
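The 16-bit Z-check pseudocode can be modelled on plain integers. The sketch below follows the reading of the garbled pseudocode given here (the comparison is taken to be unsigned "less than or equal"); the function name `fzchks` simply mirrors the mnemonic, and treating the pixel mask as an 8-bit value is an assumption.

```python
def fzchks(fsrc1, fsrc2, pm):
    """Model of a 16-bit Z-buffer check on four packed fields (sketch)."""
    pm = (pm >> 4) & 0xFF                 # PM shifted right by 4 bits
    fdest = 0
    for i in range(4):                    # field 0 is least significant
        z1 = (fsrc1 >> (16 * i)) & 0xFFFF
        z2 = (fsrc2 >> (16 * i)) & 0xFFFF
        if z2 <= z1:                      # unsigned compare
            pm |= 1 << (i + 4)            # mark the closer new point
        fdest |= min(z1, z2) << (16 * i)  # keep the smaller Z value
    return fdest, pm
```

Calling it with old Z values in `fsrc1` and new Z values in `fsrc2` returns the updated Z fields together with the new pixel mask.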

Page 204: ...fsrc1 + fsrc2. Shift MERGE right 16 and load fields 31..16 and 63..48 from fsrc1 + fsrc2. pfam.p fsrc1, fsrc2, fdest: Pipelined Floating-Point Add and Multiply. fdest ← last-stage adder result. Advance A and M pipel...

Page 205: ...Identical to pfgt.p except that assembler sets R bit of instruction. fdest ← last-stage adder result. CC clear if fsrc1 ≤ fsrc2, else set. Advance A pipeline one stage. A pipeline first stage is undefined bu...

Page 206: ...ipeline. A pipeline first stage ← A op1 A op2. M pipeline first stage ← M op1 × M op2. pfsub.p fsrc1, fsrc2, fdest: Pipelined Floating-Point Subtract. fdest ← last-stage adder result. Advance A pipeline one stage. A...

Page 207: ...nst fdest. Shift PM right by 8/pixel size (in bytes) bits. IF autoincrement THEN isrc2 ← const + isrc2 FI. shl isrc1, isrc2, idest: Shift Left. idest ← isrc2 shifted left by isrc1 bits. shr isrc1, isrc2, idest: Shift Rig...

Page 208: ...st Software Trap: Generate trap with IT set in psr. unlock: End Interlocked Sequence. Clear BL in dirbase. The next load or store unlocks the bus. Interrupts are enabled. xor isrc1, isrc2, idest: Logical Exclus...

Page 209: ...Instruction Format and Encoding B...

Page 210: ......

Page 211: ...own in Table B-1 are used. Among the core instructions, there are two general formats: REG format and CTRL format. Within the REG format are several variations. Table B-1. Register Encoding. Register Encodin...

Page 212: ...pst ixfr. For instructions where src1 is optionally an immediate constant or address offset, bit 26 of the opcode (I-bit) indicates whether src1 is immediate. If bit 26 is clear, an integer register is use...

Page 213: ...it 0 selects autoincrement addressing if set. Bits one and two select the operand size as follows (Bit 1, Bit 2: Operand Size): 0, 0: 64 bits; 0, 1: 128 bits; 1, 0: 32 bits; 1, 1: 32 bits. When src1 is immediate, bits z...
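The bit assignments listed above translate directly into a small decoder. This sketch assumes the three bits are bits 0 to 2 of the instruction word and covers only what the table states; the function name `decode_fp_ldst_bits` is hypothetical.

```python
# (bit 1, bit 2) -> operand size in bits, as listed in the table
OPERAND_SIZE = {(0, 0): 64, (0, 1): 128, (1, 0): 32, (1, 1): 32}

def decode_fp_ldst_bits(word):
    """Return (autoincrement, operand size in bits) for a load/store word."""
    autoincrement = bool(word & 1)        # bit 0 selects autoincrement
    bit1 = (word >> 1) & 1
    bit2 = (word >> 2) & 1
    return autoincrement, OPERAND_SIZE[(bit1, bit2)]
```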

Page 214: ...nch LCC Set and Add, Arithmetic, Shift, AND, ANDNOT, OR, XOR, reserved. 1: 16 or 32 bits (selected by bit 0). LS (Load/Store): 0 = Load, 1 = Store. SO (Signed/Ordinal): 0 = Ordinal, 1 = Signed. H (High): 0 = and, or, andnot, xor; 1 = andh, or...

Page 215: ...cape Opcodes (bits 4 3 2 1 0): reserved 0 0 0 0 0; lock (Begin Interlocked Sequence) 0 0 0 0 1; calli (Indirect Subroutine Call) 0 0 0 1 0; reserved 0 0 0 1 1; intovr (Trap on Integer Overflow) 0 0 1 0 0; reserved 0 0 1 0...

Page 216: ...ODING CTRL Format Instructions: bits 31, 28, 25, 0; BROFFSET. CTRL Format Opcodes (bits 28 27 26): br (Branch Direct) 0 1 0; call (Call) 0 1 1; bc[.t] (Branch on CC Set) 1 0 T; bnc[.t] (Branch on CC Clear) 1 1 T. T (Taken): 0 = bc o...
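The CTRL-format opcode table maps onto a simple field extraction. In the sketch below, the three opcode bits are taken from bits 28..26 as listed, and the branch offset from bits 25..0; treating the offset as a signed two's-complement value and mapping T = 1 to the `.t` forms are assumptions, as is the function name `decode_ctrl`.

```python
# Bits 28..26 -> mnemonic (T = bit 26 for the conditional branches)
CTRL_OPS = {
    0b010: "br", 0b011: "call",
    0b100: "bc", 0b101: "bc.t",
    0b110: "bnc", 0b111: "bnc.t",
}

def decode_ctrl(word):
    """Return (mnemonic or None, signed branch offset) for a CTRL word."""
    op = (word >> 26) & 0b111
    broffset = word & 0x03FFFFFF          # low 26 bits
    if broffset & 0x02000000:             # assumed sign bit of the offset
        broffset -= 1 << 26
    return CTRL_OPS.get(op), broffset
```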

Page 217: ...nstructions other than fxfr: one of 32 floating-point registers; fxfr: one of 32 integer registers. Pipelining: 1 = Pipelined instruction mode, 0 = Scalar instruction mode. Dual-Instruction Mode: 1 = Dual-instructi...

Page 218: ...Equal 0 1 p ftrunc Truncate 0 1 fxfr Transfer to Integer Register 1 0 p fiadd Long Integer Add 1 0 p fisub Long Integer Subtract 1 0 p fzchkl Z Check Long 1 0 p fzchks Z Check Short 1 0 p faddp Add wi...

Page 219: ...Instruction Timings C...

Page 220: ......

Page 221: ...eturned. ld, st, pfld, fld, fst, or ixfr and data cache load miss processing in progress: One plus number of clocks until last READY returned. Reference to dest of ld, call, calli, fxfr, or ld.c in the...: One clock...

Page 222: ...e full and ld fld dress can be issued (i.e., an address which is pfld st fst not the 2nd-4th cycle of a cache fill or the 2nd-8th cycle of a CS8 mode instruction fetch or the 2nd cycle of a 128-bit write...

Page 223: ...Instruction Characteristics D...

Page 224: ......

Page 225: ...fault is reported on the subsequent floating-point instruction, plus pst, fst, and sometimes fld, pfld, and ixfr. See Section 7.4.2 for more information on result exception reporting. The instruction access...

Page 226: ...trol transfer instruction nor a trap instruction nor the target of a control transfer instruction. b. When using a bri to return from a trap handler, programmers should take care to prevent traps from oc...

Page 227: ...DAT 5 f fsub.p A SE RE ftrunc.p A SE RE fxfr G 6 8 fzchkl G 8 fzchks G 8 intovr E IT ixfr E 2 ld.c E ld.x E DAT 6 lock E or E CC orh E CC pfadd.p A P SE RE pfaddp G P 8 e pfaddz G P 8 e pfamov.r A P S...

Page 228: ...ipelined Sets Faults Performance Programming Unit Delayed CC Notes Restrictions pftrunc.p A P SE RE pfzchkl G P 8 pfzchks G P 8 pst.d E DAT f shl E shr E shra E shrd E st.c E st.x E DAT subs E CC 1 su...

Page 229: ...OWA Tel 716 425 2750 TWX 510 253 7391 Intel Corp FAX 716 223 2561 1930 St Andrews Drive N E †Intel Corp 2nd Floor Cedar Rapids 52402 2950 Expressway Dr South Tel 319 393 1294 Suite 130 Islandia 11722...

Page 230: ...onics Tel 313 522 4700 10824 Hope Street Rancho Cordova 95670 TWX 810 863 0374 8208 Melrose Dr Suite 210 TWX 810 282 8775 Cypress 90630 Tel 916 638 5282 †Hamilton Avnet Electronics Lenexa 66214 †Pione...

Page 231: ...berty Ave Pittsburgh 15238 Tel 412 281 4150 Pioneer Electronics 259 Kappa Drive Pittsburgh 15238 Tel 412 782 2300 TWX 710 795 3122 †Pioneer Technologies Group Inc Delaware Valley 261 Gibralter Road Ho...

Page 232: ...SA TLX 95142 Tel 32 02 216 01 60 In Multikomponent GmbH Telcom S.r.l. Calle Miguel Angel 21 3 MMD TLX 64475 or 22090 Postfach 1265 Via M Civitali 75 28010 Madrid Unit 8 Southview Park Bahnhofstrasse 44...

Page 233: ...ago el 56 2 225 8139 LX 240 846 RUD HINA HONG KONG 11 P c 7 f I Ltd hase 26 Kwai Hei Street I T Kowloon long Kong el 852 0 4223222 WX 39114 JINMI HX AX 852 0 4261602 ield Application Location INDIA Mi...

Page 234: ...464 2736 3280 Pointe Pkwy Ste 200 Norcross 30092 MISSOURI Tel 404 449 0541 OREGON Intel Corp HAWAII 4203 Earth City Exp Ste 131 Intel Corp Earth City 63045 15254 NW Greenbrier Parkway Intel Corp Tel 3...

Page 235: ......

Page 236: ......

Page 237: ......

Page 238: ......

Page 239: ......

Page 240: ......

Page 241: ......
