20.4 Hardware Multi-threading (HMT)
Hardware multi-threading is a facility present in several iSeries processors. The eServer i5 models
instead have the Simultaneous Multi-threading (SMT) facility, which is discussed in the SMT white
paper at the following website:
http://www-1.ibm.com/servers/eserver/iseries/perfmgmt/pdf/SMT.pdf
HMT is mentioned here primarily to compare and contrast it with SMT. Moreover, several system
facilities operate slightly differently on HMT machines than on SMT machines, and these differences
need some highlighting.
HMT Described
Broadly, HMT exploited the concept that modern processors are often quite fast relative to certain
memory accesses.
Without HMT, a modern CPU might spend a lot of time stalled on things like cache misses. In modern
machines, the memory can be a considerable distance from the CPU, which translates to more cycles per
fetch when a cache miss occurs. The CPU idles during such accesses.
Since many OS/400 applications feature database activity, cache misses often figured noticeably in the
execution profile. Could we keep the CPU busy with something else during these misses?
HMT created two securely segregated streams of execution on one physical CPU, both controlled by
hardware. It was created by replicating key registers including another instruction counter. Generally,
there is a distinction between the one physical processor and its two logical processors. However, for
HMT, the customer seldom sees any of this as the various performance facilities of the system continue to
report on a physical CPU basis.
Unlike SMT, HMT allows only one instruction stream to execute at a time. But if one instruction stream
takes a cache miss, the hardware switches to the other instruction stream (hence "hardware
multi-threading" or, as some say, "hardware multi-tasking"). There are, of course, times when both
streams are waiting on cache misses and, conversely, applications that hardly ever miss. Yet, on the
whole, the facility works well for OS/400 applications.
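The switch-on-miss behavior described above can be illustrated with a toy model. The sketch below is
only an assumption-laden illustration, not a description of the actual hardware: the function names
(run_single, run_hmt), the burst representation, and the cycle counts are all hypothetical, and the
model ignores switch overhead and cache contention between the two streams.

```python
def run_single(threads):
    """Without HMT: streams run one after another and the CPU idles on every miss.

    Each thread is a list of (compute_cycles, miss_stall_cycles) bursts.
    """
    return sum(compute + stall for thread in threads for compute, stall in thread)


def run_hmt(threads):
    """Toy switch-on-miss model: one stream executes at a time, and a miss
    lets the hardware switch to another stream whose data has arrived."""
    pending = [list(t) for t in threads]   # private copies of each stream's bursts
    ready_at = [0] * len(threads)          # cycle at which each stream's miss resolves
    clock = 0
    while any(pending):
        live = [i for i, bursts in enumerate(pending) if bursts]
        runnable = [i for i in live if ready_at[i] <= clock]
        if not runnable:
            # Every remaining stream is waiting on memory, so the CPU idles.
            clock = min(ready_at[i] for i in live)
            continue
        i = runnable[0]
        compute, stall = pending[i].pop(0)
        clock += compute                   # the stream computes until its next miss
        ready_at[i] = clock + stall        # it cannot run again until the miss resolves
    return max([clock] + ready_at)         # include the final outstanding stall


if __name__ == "__main__":
    # Hypothetical workload: 100 bursts of 80 compute cycles, each followed by a 20-cycle miss.
    workload = [[(80, 20)] * 100, [(80, 20)] * 100]
    print("without HMT:", run_single(workload), "cycles")
    print("with HMT:   ", run_hmt(workload), "cycles")
```

With these made-up numbers the second stream's computation hides the first stream's stalls almost
completely. If the stall length is set to zero, the two functions return the same total, which
corresponds to an application that hardly ever misses and so gains nothing from the switch.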
The system value QPRCMLTTSK was introduced to turn HMT on or off. A change takes effect only
when the whole system is IPLed, so (for clarity) one should change the system value
shortly before a full system IPL. The default is to have it set on ('1').
In most commercial workloads, HMT enabled ('1') gives throughput gains of between 10 and
25 percent, often without impact to response time.
In rare cases, HMT results in losses rather than gains.
IBM i 6.1 Performance Capabilities Reference - January/April/October 2008
© Copyright IBM Corp. 2008
Chapter 20 - General Tips and Techniques