Chapter 1. Optimization and tuning on IBM POWER7 and IBM
Fortunately, AIX offers a number of different memory allocation packages that are appropriate
for different scenarios. These different packages are chosen by setting environment variables
and do not require any code modification or rebuilding of an application.
Choosing the best malloc package requires some understanding of how an application uses
the memory allocation routines. Appendix A, “Analyzing malloc usage under AIX” on
page 151 shows how to easily collect the required information. Following the data collection,
experiment with various alternatives, alone or in combination. Some alternatives that deliver
high performance include:
Pool malloc: The pool front end to the malloc subsystem optimizes the allocation of
memory blocks of 512 bytes or less. It is common for applications to allocate many small
blocks, and pools are particularly space- and time-efficient for that allocation pattern.
Thread-specific pools are used for multi-threaded applications. The pool malloc is a good
choice for both single-threaded and multi-threaded applications.
Multiheap malloc: The multiheap malloc package uses up to 32 separate heaps, reducing
contention when multiple threads attempt to allocate memory. It is a good choice for
multi-threaded applications.
Using the pool front end and multiheap malloc in combination is a good alternative for
multi-threaded applications. Small memory block allocations, typically the most common, are
handled with high efficiency by the pool front end. Larger allocations are handled with good
scalability by the multiheap malloc. A simple example of specifying the pool and multiheap
combination is by using the environment variable setting:
MALLOCOPTIONS=pool,multiheap
For more information about malloc alternatives, see Chapter 4, “AIX” on page 67 and “Malloc” on
page 68.
Linux Advance Toolchain libraries
The Linux Advance Toolchain contains replacements for various standard system libraries.
These replacement libraries are optimized for specific processor chips, including POWER5,
POWER6, and POWER7. After you install the Linux Advance Toolchain, the dynamic linker
automatically selects, for each program, the library that is optimized for the processor chip
type in the system.
The libraries in Linux Advance Toolchain Version 5.0 and later are optimized to use the
multi-core facilities in POWER7.
Mathematical Acceleration Subsystem Library and Engineering and Scientific Subroutine Library
The Mathematical Acceleration Subsystem (MASS) library contains both optimized and
vectorized versions of some basic mathematical functions and runs on AIX and Linux. The
MASS library is included with the XL compilers and is automatically used by the compilers
when the -O3 -qhot=level=1 compilation options are used. The MASS routines can be used
automatically with the Advance Toolchain GNU Compiler Collection (GCC) by using the
-mveclibabi=mass option, but the library is not included with the compiler and must be
separately installed. Explore the use of MASS for applications that use basic mathematical
functions. Good results occur when you use the vector versions of the functions. The MASS
routines do not necessarily provide the same precision of results as do standard libraries.