Volume 4: IA-32 SSE Instruction Reference
The PMULHUW (Unsigned high packed integer word multiply in MMX technology
register) instruction performs an unsigned multiply on each word field of the two source
MMX technology registers, returning the high word of each result to an MMX technology
register.
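As an illustration of the description above, the following is a minimal portable C sketch of the PMULHUW data flow. It assumes a 64-bit MMX technology register is modeled as a `uint64_t` holding four 16-bit words; the function name `pmulhuw_model` is hypothetical and is not an Intel intrinsic.

```c
#include <stdint.h>

/* Hypothetical model of PMULHUW: multiply each pair of unsigned 16-bit
   words and keep the high 16 bits of every 32-bit product. */
uint64_t pmulhuw_model(uint64_t a, uint64_t b)
{
    uint64_t result = 0;
    for (int i = 0; i < 4; i++) {
        uint32_t wa = (uint32_t)((a >> (16 * i)) & 0xFFFF);
        uint32_t wb = (uint32_t)((b >> (16 * i)) & 0xFFFF);
        uint32_t product = wa * wb;  /* full unsigned 32-bit product */
        result |= (uint64_t)(product >> 16) << (16 * i);
    }
    return result;
}
```

For example, a word pair of 0xFFFF and 0xFFFF gives the product 0xFFFE0001, so the corresponding result word is 0xFFFE.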
The PSADBW (Sum of absolute differences) instruction computes the absolute
difference for each pair of corresponding source bytes and then sums the 8
differences into a single 16-bit result.
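The sum-of-absolute-differences computation can be sketched in portable C as follows. This is a model under stated assumptions rather than the hardware definition: each 64-bit operand is a `uint64_t` holding eight unsigned bytes, the 16-bit sum is returned in the low word with the upper bits zero, and the name `psadbw_model` is hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of PSADBW: sum of |a[i] - b[i]| over eight
   unsigned bytes. The maximum sum is 8 * 255 = 2040, so it fits in 16 bits. */
uint64_t psadbw_model(uint64_t a, uint64_t b)
{
    uint16_t sum = 0;
    for (int i = 0; i < 8; i++) {
        int byte_a = (int)((a >> (8 * i)) & 0xFF);
        int byte_b = (int)((b >> (8 * i)) & 0xFF);
        int diff = byte_a - byte_b;
        sum = (uint16_t)(sum + (diff < 0 ? -diff : diff));
    }
    return (uint64_t)sum;  /* low word holds the sum, upper bits are zero */
}
```

This operation is commonly used in block-matching code such as video motion estimation, where a small sum indicates that two 8-byte rows of pixels are similar.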
The PSHUFW (Shuffle packed integer word in MMX technology register) instruction
performs a full shuffle of any source word field to any result word field, using an 8-bit
immediate operand.
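The role of the 8-bit immediate can be illustrated with a small portable C model. The sketch assumes the usual encoding, in which each 2-bit field of the immediate selects the source word for one result word (bits 1:0 for result word 0, up to bits 7:6 for result word 3); the name `pshufw_model` is hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of PSHUFW: result word i is the source word
   selected by bits [2i+1 : 2i] of the immediate. */
uint64_t pshufw_model(uint64_t src, uint8_t imm8)
{
    uint64_t result = 0;
    for (int i = 0; i < 4; i++) {
        unsigned sel = (imm8 >> (2 * i)) & 0x3u;        /* source word index */
        uint64_t word = (src >> (16 * sel)) & 0xFFFF;   /* extract that word */
        result |= word << (16 * i);
    }
    return result;
}
```

Under this encoding, an immediate of 0xE4 leaves the words in place, 0x1B reverses their order, and 0x00 broadcasts source word 0 to all four result words.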
4.6.1.9 Cacheability Control Instructions
Data referenced by a program can have temporal locality (the data will be used again)
or spatial locality (the data will be in adjacent locations, e.g. the same cache line).
Some multimedia data types, such as the display list in a 3D graphics application, are
referenced once and not reused in the immediate future. Such data is referred to here
as non-temporal data. The programmer does not want the application’s cached code and
data to be overwritten by this non-temporal data. The cacheability control
instructions enable the programmer to control caching so that non-temporal accesses
cause minimal cache pollution.
In addition, the execution engine needs to be fed so that it does not stall
waiting for data. SSE instructions allow the programmer to prefetch data long before
its final use. These instructions are not architectural, since they do not update any
architectural state, and their behavior is specific to each implementation. The
programmer may have to tune the application for each implementation to take advantage
of these instructions. These instructions merely provide a hint to the hardware, and
they will not generate exceptions or faults. Excessive use of prefetch instructions may
be throttled by the processor.
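A minimal sketch of prefetching data ahead of its use is shown below. It assumes a GCC- or Clang-compatible compiler, which provides the `__builtin_prefetch(addr, rw, locality)` builtin; on SSE-capable IA-32 targets such compilers typically lower it to one of the prefetch instructions, but that mapping is compiler-specific. The function name and the `PREFETCH_DISTANCE` value are hypothetical, and the distance is exactly the kind of parameter that may need tuning per implementation.

```c
#include <stddef.h>

/* Hypothetical tuning value: how many elements ahead to prefetch. */
#define PREFETCH_DISTANCE 16

/* Sums an array while hinting that data a fixed distance ahead will be
   read soon. The prefetch is only a hint and does not change the result. */
long sum_with_prefetch(const int *data, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n) {
            /* rw = 0: prefetch for read; locality = 0: no temporal locality */
            __builtin_prefetch(&data[i + PREFETCH_DISTANCE], 0, 0);
        }
        total += data[i];
    }
    return total;
}
```

Because the prefetch is only a hint, the function returns the same sum with or without it; only the timing may differ, and a badly chosen distance can make performance worse rather than better.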
The following four instructions provide hints to the cache hierarchy that enable
data to be prefetched to different levels of the cache hierarchy and avoid polluting
the cache with non-temporal data.
The MASKMOVQ (Non-temporal byte mask store of packed integer in an MMX technology
register) instruction stores data from an MMX technology register to the location
specified by the EDI register. The most significant bit in each byte of the second MMX
technology mask register is used to selectively write the data of the first register on a
per-byte basis. The instruction is implicitly weakly-ordered, with all of the
characteristics of the WC memory type: successive non-temporal stores may not write
memory in program order, do not write-allocate (i.e. the processor will not fetch the
corresponding cache line into the cache hierarchy prior to performing the store), may
write combine/collapse, and minimize cache pollution.
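The per-byte selection can be illustrated with a portable C model. The sketch covers only which bytes are written; it does not model the non-temporal, weakly-ordered behavior of the actual store. The function name `maskmovq_model` is hypothetical, and the `dest` pointer stands in for the address held in EDI.

```c
#include <stdint.h>

/* Hypothetical model of the MASKMOVQ byte selection: byte i of `data`
   is written to dest[i] only if the most significant bit of byte i of
   `mask` is set. Other destination bytes are left unchanged. */
void maskmovq_model(uint64_t data, uint64_t mask, uint8_t *dest)
{
    for (int i = 0; i < 8; i++) {
        uint8_t mask_byte = (uint8_t)(mask >> (8 * i));
        if (mask_byte & 0x80) {
            dest[i] = (uint8_t)(data >> (8 * i));
        }
    }
}
```

With a mask of 0x80000000000000FF, for instance, only bytes 0 and 7 of the destination are overwritten.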
The MOVNTQ (Non-temporal store of packed integer in an MMX technology register)
instruction stores data from an MMX technology register to memory. The instruction is
implicitly weakly-ordered, does not write-allocate and minimizes cache pollution.