178 POWER7 and POWER7+ Optimization and Tuning Guide
Some tools and techniques for this analysis include:
AIX tprof profiling. For more information, see tprof Command, available at:
http://publib.boulder.ibm.com/infocenter/aix/v7r1/index.jsp?topic=/com.ibm.aix.cmds/doc/aixcmds5/tprof.htm
Linux Oprofile profiling. For more information, see the following resources:
– Taking advantage of oprofile, available at:
https://www.ibm.com/developerworks/wikis/display/LinuxP/adof+oprofile
– Oprofile with Java Support, available at:
https://www.ibm.com/developerworks/wikis/display/LinuxP/Java+Perfon+POWER7#JavaPerformanceonPOWER7-4.5OprofilewithJavaSupport
– OProfile manual, available at:
http://oprofile.sourceforge.net/doc/index.html
General information about running the profiler and interpreting the results is contained in the
sections on profiling in “AIX” on page 162 and “Linux” on page 171. For Java profiling,
additional Java options are required to profile the machine code that the JIT compiler
generates for methods:
AIX 32-bit:
-agentlib:jpa=instructions=1
AIX 64-bit:
-agentlib:jpa64=instructions=1
Linux Oprofile:
-agentlib:jvmti_oprofile
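The option list above can be captured in a small launcher helper. The following is a minimal sketch, not part of the guide: the class name ProfilerAgentOption and the method forPlatform are hypothetical, and the caller is assumed to know whether the JVM is 32-bit or 64-bit. The option strings are the ones listed above, and the operating system names are the values that the os.name system property reports.

```java
// Hypothetical helper: maps a platform to the profiling agent option listed above.
class ProfilerAgentOption {

    // osName is expected to be the value of the "os.name" system property.
    // is64Bit is supplied by the caller (an assumption of this sketch).
    static String forPlatform(String osName, boolean is64Bit) {
        if (osName.equals("AIX")) {
            return is64Bit ? "-agentlib:jpa64=instructions=1"
                           : "-agentlib:jpa=instructions=1";
        }
        if (osName.equals("Linux")) {
            // The list above gives a single option for Linux Oprofile.
            return "-agentlib:jvmti_oprofile";
        }
        throw new IllegalArgumentException("No profiling agent listed for " + osName);
    }

    public static void main(String[] args) {
        // Pass "64" as the first argument to request the 64-bit AIX option.
        boolean is64Bit = args.length > 0 && args[0].equals("64");
        System.out.println(forPlatform(System.getProperty("os.name"), is64Bit));
    }
}
```

The returned string is meant to be placed on the java command line ahead of the main class name.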
The entire execution of a Java program can be profiled, for example on AIX by running the
following command:
tprof -ujeskzl -A -I -E -x java …
However, it is more common to profile Java after a warm-up period so that JIT compilation
activity has generally completed. To profile after a warm-up, start Java and wait an
appropriate interval until steady-state performance is reached, which is anywhere from a few
seconds to a few minutes for large applications. Then, invoke the profiler, for example, on AIX,
by running the following command:
tprof -ujeskzl -A -I -E -x sleep 60
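One way to judge when steady state is reached is to time repeated rounds of the same work and watch for the round times to level off. The following is a minimal sketch under stated assumptions, not a method from the guide: the class WarmupSketch, the methods timeRounds and workload, and the round count of 20 are all hypothetical, and the workload is a stand-in for real application code.

```java
// Hypothetical sketch: time repeated rounds of a workload to see when
// per-round times stop improving, which suggests JIT warm-up is largely done.
class WarmupSketch {

    // Written by the workload so that the JIT cannot discard the computation.
    static volatile double sink;

    // Runs the work 'rounds' times and returns the elapsed nanoseconds per round.
    static long[] timeRounds(Runnable work, int rounds) {
        long[] elapsed = new long[rounds];
        for (int r = 0; r < rounds; r++) {
            long start = System.nanoTime();
            work.run();
            elapsed[r] = System.nanoTime() - start;
        }
        return elapsed;
    }

    // Stand-in workload (an assumption of this sketch).
    static void workload() {
        double sum = 0;
        for (int i = 1; i <= 2_000_000; i++) {
            sum += Math.sqrt(i);
        }
        sink = sum;
    }

    public static void main(String[] args) {
        long[] times = timeRounds(WarmupSketch::workload, 20);
        for (int r = 0; r < times.length; r++) {
            System.out.printf("round %2d: %d us%n", r, times[r] / 1000);
        }
    }
}
```

Early rounds typically run slower than later ones; once consecutive rounds report similar times, the profiler can be started against the running process.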
On Linux, Oprofile can be used in a similar fashion; for more information, see “Java profiling
example”, and follow the appropriate documentation in the resources included in this section.
Java profiling example
Example B-11 contains a sample Java program that is profiled on AIX and Linux. This
program does some meaningless work and is purposely poorly written to illustrate lock
contention and GC impact in the profile. The program creates three threads but serializes
their execution by having them attempt to lock the same object. One thread at a time acquires
the lock, forcing the other two threads to wait until they can get the lock and run the code that
is protected by the synchronized statement in the doWork method. While they wait to acquire
the lock, the threads initially use spin locking, repeatedly checking if the lock is free. After a
suitable amount of spinning, the threads block rather than continuing to use CPU resources.
Example B-11 Sample Java program
public class ProfileTest extends Thread {
static Object o; /* used for locking to serialize threads */
static Double A[], B[], C[];
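
The contention pattern described for this example can also be sketched independently. The following is a minimal illustration, not the original Example B-11 listing: the class ContentionSketch, the method runWorkers, the shared counter, and the iteration counts are assumptions. Only the ideas of three threads, one shared lock object, and a synchronized block in doWork come from the description above.

```java
// Hypothetical sketch: several threads serialize on one shared monitor,
// so only one of them makes progress at a time.
class ContentionSketch {

    // Single lock object shared by all workers.
    private static final Object LOCK = new Object();
    private static long counter;

    static void doWork(int iterations) {
        // Every worker must hold LOCK for its whole work loop; the others wait.
        synchronized (LOCK) {
            for (int i = 0; i < iterations; i++) {
                // Boxing can create a short-lived object per iteration, which
                // adds GC pressure (the JIT is free to optimize this away).
                Double boxed = Double.valueOf(i);
                counter += boxed.isNaN() ? 0 : 1;
            }
        }
    }

    // Starts the workers, waits for them, and returns the final counter value.
    static long runWorkers(int threads, int iterations) {
        counter = 0;
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> doWork(iterations));
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return counter;
    }

    public static void main(String[] args) {
        System.out.println("total = " + runWorkers(3, 1_000_000));
    }
}
```

Because each worker holds the lock for its entire loop, the three threads run one after another rather than in parallel, which is the behavior that a profile of such a program is expected to expose.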