POWER7 and POWER7+ Optimization and Tuning Guide
Bits 61:63 – DPFD – Default Prefetch Depth
Supplies a prefetch depth for hardware-detected streams and for software-defined
streams for which a depth of zero is specified, or for which dcbt or dcbtst with
TH=1010 is not used in their description.
Bits 55:57 – URG – Depth Attainment Urgency
This field is new in the POWER7+ processor. It indicates how quickly the prefetch
depth should be reached for hardware-detected streams. Values and their
meanings are as follows:
– 0: Default
– 1: Not urgent
– 2: Least urgent
– 3: Less urgent
– 4: Medium
– 5: Urgent
– 6: More urgent
– 7: Most urgent
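The two field definitions above can be turned into shifts and masks. The following is a minimal sketch, assuming the usual IBM bit numbering in which bit 0 is the most significant bit of the 64-bit register and bit 63 the least significant. The macro, enum, and function names are hypothetical and do not come from any system header; only the bit positions and the urgency values are taken from the text.

```c
#include <stdint.h>

/* IBM bit numbering: bit 63 is the least significant bit, so a field
 * whose last bit is `last_bit` sits (63 - last_bit) bits above the LSB. */
#define DSCR_FIELD_SHIFT(last_bit) (63u - (last_bit))

#define DSCR_DPFD_SHIFT DSCR_FIELD_SHIFT(63u) /* bits 61:63 -> shift 0 */
#define DSCR_URG_SHIFT  DSCR_FIELD_SHIFT(57u) /* bits 55:57 -> shift 6 */
#define DSCR_FIELD_MASK UINT64_C(0x7)         /* both fields are 3 bits wide */

/* URG values as listed in the text (hypothetical names). */
enum dscr_urg {
    DSCR_URG_DEFAULT = 0, DSCR_URG_NOT_URGENT = 1, DSCR_URG_LEAST = 2,
    DSCR_URG_LESS = 3,    DSCR_URG_MEDIUM = 4,     DSCR_URG_URGENT = 5,
    DSCR_URG_MORE = 6,    DSCR_URG_MOST = 7
};

/* Return `dscr` with DPFD replaced by the low 3 bits of `depth`. */
uint64_t dscr_with_dpfd(uint64_t dscr, unsigned depth)
{
    dscr &= ~(DSCR_FIELD_MASK << DSCR_DPFD_SHIFT);
    return dscr | (((uint64_t)depth & DSCR_FIELD_MASK) << DSCR_DPFD_SHIFT);
}

/* Return `dscr` with URG replaced by the low 3 bits of `urg`. */
uint64_t dscr_with_urg(uint64_t dscr, unsigned urg)
{
    dscr &= ~(DSCR_FIELD_MASK << DSCR_URG_SHIFT);
    return dscr | (((uint64_t)urg & DSCR_FIELD_MASK) << DSCR_URG_SHIFT);
}

/* Extract the DPFD field (0..7) from a DSCR value. */
unsigned dscr_get_dpfd(uint64_t dscr)
{
    return (unsigned)((dscr >> DSCR_DPFD_SHIFT) & DSCR_FIELD_MASK);
}

/* Extract the URG field (0..7) from a DSCR value. */
unsigned dscr_get_urg(uint64_t dscr)
{
    return (unsigned)((dscr >> DSCR_URG_SHIFT) & DSCR_FIELD_MASK);
}
```

For example, dscr_with_urg(dscr_with_dpfd(0, 7), DSCR_URG_URGENT) yields 0x147 under these assumptions. The helpers only compute a value; writing it to the register is a separate, platform-specific step.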
The ability to enable or disable the three types of streams that the hardware can detect (load
streams, store streams, or stride-N streams), or to set the default prefetch depth, allows
empirical testing of any application. There are no simple rules for determining which settings
are optimal overall for an application: the performance of prefetching depends on many
different characteristics of the application in addition to the characteristics of the specific
system and its configuration. Data prefetches are purely speculative, meaning they can
improve performance greatly when the data that is prefetched is, in fact, referenced by the
application later, but can also degrade performance by expending bandwidth on cache lines
that are not later referenced, or by displacing cache lines that are later referenced by
the program.
Similarly, setting DPFD to a deeper depth tends to improve performance for data streams that
are predominately sourced from memory because the longer the latency to overcome, the
deeper the prefetching must be to maximize performance. But deeper prefetching also
increases the possibility of stream overshoot, that is, prefetching lines beyond the end of the
stream that are not later referenced. Prefetching in multi-core processor implementations has
implications for other threads or processes that are sharing cache (in SMT mode) or the same
system bandwidth.
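The trade-off between latency coverage and overshoot can be illustrated with a back-of-the-envelope model. The sketch below is not a description of the hardware prefetcher: it assumes a stream that is consumed at a constant rate, it counts depth in cache lines rather than in DPFD encodings, and both function names are hypothetical.

```c
#include <stdint.h>

/* Smallest number of lines that must be in flight to hide `latency_ns`
 * when the program consumes one cache line every `ns_per_line` ns
 * (ceiling division). Returns 0 if `ns_per_line` is 0. */
uint64_t lines_to_cover_latency(uint64_t latency_ns, uint64_t ns_per_line)
{
    if (ns_per_line == 0)
        return 0;
    return (latency_ns + ns_per_line - 1) / ns_per_line;
}

/* Worst-case fraction of fetched lines that are wasted when the
 * prefetcher runs `depth_lines` past the end of a stream that is
 * `stream_lines` long. */
double overshoot_fraction(uint64_t stream_lines, uint64_t depth_lines)
{
    uint64_t total = stream_lines + depth_lines;
    return total == 0 ? 0.0 : (double)depth_lines / (double)total;
}
```

Under this model, a longer latency raises the depth needed to keep the stream fed, while the same depth applied to a short stream wastes a larger share of bandwidth: a 16-line stream with 8 lines of run-ahead can waste up to 8 of 24 fetched lines.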
Controlling DSCR under Linux
DSCR settings on Linux are controlled with the ppc64_cpu command. Controlling DSCR
settings for an application is generally considered advanced and specific tuning.
Currently, setting the DSCR value is a cross-LPAR setting.
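To inspect the current default from a program rather than from the command line, a small reader can be sketched as follows. This assumes that the kernel exposes the default DSCR value as a hexadecimal string in the sysfs file /sys/devices/system/cpu/dscr_default; that path, and the function and macro names, are assumptions that are not stated in this guide and should be checked against the kernel documentation for the distribution in use.

```c
#include <stdio.h>

/* Assumed sysfs location of the system-wide default DSCR value. */
#define DSCR_DEFAULT_PATH "/sys/devices/system/cpu/dscr_default"

/* Read a hexadecimal DSCR value from `path` into `*value`.
 * Returns 0 on success, or -1 if the file cannot be opened or parsed
 * (for example, on a system that is not POWER-based). */
int read_dscr_default(const char *path, unsigned long *value)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    int parsed = (fscanf(f, "%lx", value) == 1);
    fclose(f);
    return parsed ? 0 : -1;
}
```

A caller would pass DSCR_DEFAULT_PATH and treat a return value of -1 as "not available on this system". Changing the value is better left to ppc64_cpu, which requires root authority.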
Controlling DSCR under AIX
Under AIX, DSCR settings can be controlled both by programming API and from the
command line by running the following commands (see footnotes 62 and 63):
– dscr_ctl() API
#include <sys/machine.h>
int dscr_ctl(int op, void *buf_p, int size)
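A call to this subroutine might look like the sketch below, which reads the current DSCR value. It assumes that DSCR_READ is a valid value for op, that the buffer for a read is a 64-bit integer, and that the subroutine returns 0 on success. The footnoted dscr_ctl reference is the authority on all three points. The wrapper name read_dscr_aix is hypothetical, and the non-AIX branch exists only so that the sketch compiles on other systems.

```c
#if defined(_AIX)
#include <sys/machine.h>
#endif

/* Read the current DSCR value into `*value`.
 * Returns 0 on success, or -1 on failure or when not built on AIX. */
int read_dscr_aix(long long *value)
{
#if defined(_AIX)
    /* Assumption: DSCR_READ copies the current 64-bit DSCR value
     * into the buffer and dscr_ctl() returns 0 on success. */
    return dscr_ctl(DSCR_READ, value, (int)sizeof(*value)) == 0 ? 0 : -1;
#else
    (void)value; /* dscr_ctl() is not available on this platform */
    return -1;
#endif
}
```

Keeping the platform check inside one wrapper lets the rest of an application call it unconditionally and fall back to default prefetch behavior when the call reports failure.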
62 dscr_ctl subroutine, available at:
http://pic.dhe.ibm.com/infocenter/aix/v7r1/topic/com.ibm.aix.basetechref/doc/basetrf1/dscr_ctl.htm
63 dscrctl command, available at:
http://pic.dhe.ibm.com/infocenter/aix/v7r1/topic/com.ibm.aix.cmds/doc/aixcmds2/dscrctl.htm