Chapter 1. Optimization and tuning on IBM POWER7 and IBM
POWER7 and POWER6 processors are dissimilar in some respects, and some simple steps
can be taken to ensure good performance of a single binary running on either system. In
particular, see the information in “C, C++, and Fortran compiler options” on page 8.
Performance test beds must be sized and configured for performance and scalability testing.
Choose your scalability goals based on the requirements that are placed on an application,
and the test bed must accommodate at least the minimum requirements. For example, when
you target a multi-threaded application to scale up to four cores on POWER7, it is important
that the test bed be at least a 4-core system and that tests are configured to run in various
configurations (1-core, 2-core, and 4-core). Measuring performance across these
configurations allows the scalability to be computed. Ideally, a 4-core
system delivers four times the performance of a 1-core system, but in practice, the scalability
is generally less than ideal. Scalability bottlenecks might not be clearly visible if the only
testing done for this example were in a 4-core configuration.
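The scalability computation described above can be sketched as follows. This is a minimal Python illustration, assuming that a throughput figure (for example, transactions/second) was collected for each core configuration. The function name `scalability` and the sample numbers are hypothetical and are not measurements from any POWER7 system.

```python
def scalability(throughput_by_cores):
    """Return {cores: (speedup, efficiency)} relative to the smallest configuration.

    throughput_by_cores maps a core count to a measured throughput
    (higher is better), for example transactions/second.
    """
    base_cores = min(throughput_by_cores)
    base = throughput_by_cores[base_cores]
    result = {}
    for cores, throughput in sorted(throughput_by_cores.items()):
        speedup = throughput / base      # measured gain over the baseline
        ideal = cores / base_cores       # gain under perfect (linear) scaling
        result[cores] = (speedup, speedup / ideal)
    return result


# Hypothetical measurements, for illustration only.
measured = {1: 1000.0, 2: 1900.0, 4: 3400.0}
report = scalability(measured)
for cores, (speedup, efficiency) in report.items():
    print(f"{cores}-core: speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

An efficiency that drops noticeably between the 2-core and 4-core rows points to a scalability bottleneck that a 4-core-only test would not expose.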
With the multi-threaded POWER7 cores (see 2.2, “Multi-core and multi-thread scalability” on
page 23), each processor core can be instantiated with one, two, or four logical CPUs within
the operating system, so a 4-core server in SMT4 mode (four hardware threads per core)
presents 16 logical CPUs to the operating system. Also, servers with larger core counts are
becoming more pervasive, with scaling considerations well beyond 4-core servers.
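The arithmetic behind the logical CPU count can be written out as a short sketch. The helper `logical_cpus` and the constant `POWER7_SMT_MODES` are hypothetical names that are used only for illustration. The call to `os.cpu_count()` is the standard Python way to ask how many logical CPUs the operating system reports, and it might return `None` if the count cannot be determined.

```python
import os

# SMT modes described in the text: one, two, or four hardware threads per core.
POWER7_SMT_MODES = (1, 2, 4)


def logical_cpus(cores, smt_threads):
    """Return the number of logical CPUs for a core count and SMT mode."""
    if smt_threads not in POWER7_SMT_MODES:
        raise ValueError(f"unsupported SMT mode: {smt_threads}")
    return cores * smt_threads


# A 4-core server in SMT4 mode.
print(logical_cpus(4, 4))

# Logical CPUs that the operating system reports on the machine running this script.
print(os.cpu_count())
```

Comparing the computed value with the reported value is a quick check that a test bed is running in the SMT mode that the test plan assumes.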
The performance test bed must be a dedicated logical partition (LPAR). You must ensure that
there is no other activity on the system (including on other LPARs, if any, configured on the
system) when performance tests are run. Performance testing initially should be done in a
non-virtualized environment to minimize the factors that affect performance. Ensure that the
LPAR is running an up-to-date version of the operating system, at the level that is expected for
the typical usage of the application. Keep the test bed in place after any performance effort so
that performance can be monitored periodically, which helps ensure that later maintenance of an
application does not introduce a performance regression.
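One possible way to automate this kind of periodic monitoring is to compare each new measurement against a stored baseline. The following Python sketch assumes that run time is the performance measure. The function name `is_regression` and the 5% tolerance are assumptions chosen for illustration, not values from this guide.

```python
def is_regression(baseline_seconds, current_seconds, tolerance=0.05):
    """Return True if the current run time exceeds the baseline by more than
    the given fractional tolerance (0.05 means 5%)."""
    return current_seconds > baseline_seconds * (1.0 + tolerance)


# Hypothetical run times in seconds, for illustration only.
print(is_regression(100.0, 110.0))  # 10% slower than the baseline
print(is_regression(100.0, 103.0))  # within the 5% tolerance
```

The tolerance should be wider than the normal run-to-run variability of the workload, otherwise noise is reported as a regression.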
Choosing the appropriate workloads for performance work is also important. Ideally, a
workload has the following characteristics:
- It is representative of the expected actual usage of the application.
- It has simple measures of performance that are easily collected and compared, such as
  run time or transactions/second.
- It is easy to set up and run in an automated environment, with a fairly short run time for a
  fast turnaround in performance experiments.
- It has a low run-to-run variability across duplicated runs, such that extensive tests are not
  required to obtain a statistically significant measure of performance.
- It produces a result that is easily tested for correctness.
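Run-to-run variability can be quantified in several ways. The following Python sketch uses the coefficient of variation (sample standard deviation divided by the mean), which is one common choice and is not prescribed by this guide. The function name `run_to_run_variability` and the sample run times are hypothetical.

```python
import statistics


def run_to_run_variability(run_times):
    """Return the coefficient of variation of repeated run times.

    Requires at least two runs; statistics.stdev raises StatisticsError otherwise.
    """
    mean = statistics.mean(run_times)
    stdev = statistics.stdev(run_times)  # sample standard deviation
    return stdev / mean


# Hypothetical run times in seconds from five duplicated runs.
runs = [12.1, 11.9, 12.0, 12.2, 11.8]
cv = run_to_run_variability(runs)
print(f"coefficient of variation: {cv:.2%}")
```

A workload with a small coefficient of variation needs fewer repeated runs before a measured difference between two builds can be trusted.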
When an application is being optimized for both the AIX and Linux operating systems, much
of the performance work can be undertaken on just one of the operating systems. However,
some performance characteristics are operating system-dependent, so some analysis must
be performed on both operating systems. In particular, perform profiling and lock analysis
separately for both operating systems to account for differences in system libraries and
kernels. Each operating system also has unique scalability considerations. More operating
system-specific optimizations are detailed in Chapter 4, “AIX” on page 67 and Chapter 5,
“Linux” on page 97.
Build environment and build tools
The build environment, if separate from the performance test bed, must be running an
up-to-date operating system. Only recent operating system levels include Application Binary
Interface (ABI) extensions to use or control newer hardware features.