POWER7 and POWER7+ Optimization and Tuning Guide
gcc -fprofile-generate -o program a.o b.o
program < sample1
program < sample2
program < sample3
gcc -fprofile-use -O3 -c a.c
gcc -fprofile-use -O3 -c b.c
gcc -fprofile-use -o program a.o b.o
Additional options that are related to GCC PDF include:
-fprofile-correction
    Corrects for missing counter samples from multi-threaded applications.
-fprofile-dir=PATH
    Specifies the directory for generating and using profile data.
-fprofile-generate=PATH
    Combines -fprofile-generate and -fprofile-dir.
-fprofile-use=PATH
    Combines -fprofile-use and -fprofile-dir.
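The options above can be combined so that the instrumented build and the optimized rebuild share one profile directory. The following is a minimal sketch, not a definitive build script: the names PROFILE_DIR, a.c, b.c, program, and sample1 through sample3 are illustrative assumptions, and the function only prints the command plan so that it can be reviewed before it is run.

```shell
#!/bin/sh
# Hypothetical names: PROFILE_DIR, a.c, b.c, program, sample1..sample3.
PROFILE_DIR=${PROFILE_DIR:-./pdf-data}

# Print the PDF build plan, one command per line.
pdf_commands() {
    # 1. Instrumented build: profile counters are written under
    #    $PROFILE_DIR when the instrumented program exits.
    for src in a.c b.c; do
        printf 'gcc -fprofile-generate=%s -O3 -c %s\n' "$PROFILE_DIR" "$src"
    done
    printf 'gcc -fprofile-generate=%s -o program a.o b.o\n' "$PROFILE_DIR"

    # 2. Training runs on representative inputs.
    for input in sample1 sample2 sample3; do
        printf './program < %s\n' "$input"
    done

    # 3. Optimized rebuild that reads the same directory.
    #    -fprofile-correction compensates for missing counter samples
    #    from multi-threaded training runs.
    for src in a.c b.c; do
        printf 'gcc -fprofile-use=%s -fprofile-correction -O3 -c %s\n' "$PROFILE_DIR" "$src"
    done
    printf 'gcc -fprofile-use=%s -o program a.o b.o\n' "$PROFILE_DIR"
}

pdf_commands
```

To execute the plan rather than print it, the output can be piped to a shell, for example `pdf_commands | sh -e`. This sketch assumes that PROFILE_DIR contains no spaces or shell metacharacters.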
Detailed descriptions of -fprofile-generate and its related options can be found in Options
That Control Optimization, available at:
http://gcc.gnu.org/onlinedocs/gcc-4.6.3/gcc/Optimize-Options.html#Optimize-Options
For more information about this topic, see 6.4, “Related publications” on page 123.
6.3 IBM Feedback Directed Program Restructuring
Feedback Directed Program Restructuring (FDPR) is a feedback-directed, post-link
optimization tool.
6.3.1 Introduction
FDPR optimizes the executable binary file of a program by collecting information about the
behavior of the program while it runs a typical workload, and then creating a new version of
the program that is optimized for that workload. Both main executables and dynamically
linked shared libraries (DLLs) are supported.
FDPR performs global optimizations at the level of the entire executable or library, including
statically linked library code. Because the executable or library that FDPR optimizes is not
relinked, the compiler and linker conventions do not need to be preserved, which allows
aggressive optimizations that are not available to optimizing compilers.
The main advantage that is provided by FDPR is the reduced footprint of both code and data,
resulting in more effective cache usage. The principal optimizations of FDPR include global
code reordering, global data reordering, function inlining, and loop unrolling, along with
various tuning options tailored for the specific Power target. The effectiveness of the
optimization depends largely on how well the collected profile represents the true workload.
FDPR runs on both Linux and AIX and produces optimized code for all versions of the Power
Architecture. POWER7 is its default target architecture.