Chapter 2. The POWER7 processor
Here are examples of vector initialization using initializer lists:
vector unsigned int v1 = {1};        // initialize the first 4 bytes of v1 with 1
                                     // and the remaining 12 bytes with zeros
vector unsigned int v2 = {1,2};      // initialize the first 8 bytes of v2 with 1 and 2
                                     // and the remaining 8 bytes with zeros
vector unsigned int v3 = {1,2,3,4};  // equivalent to the vector literal
                                     // (vector unsigned int) (1,2,3,4)
How to use vector capability in POWER7
When you target a POWER processor that supports VMX or VSX, you can request the
compiler to transform code into VMX or VSX instructions. These machine instructions can run
up to 16 operations in parallel. This transformation mostly applies to loops that iterate over
contiguous array data and perform calculations on each element. You can use the NOSIMD
directive to prevent the transformation of a particular loop (see footnote 56).
You can use the vector capability in the following ways:
Using a compiler: Compiler versions that recognize the POWER7 architecture are XL
C/C++ 11.1 and XL Fortran 13.1, or recent versions of GCC, including the Advance
Toolchain and the SLES 11 SP1 or Red Hat RHEL 6 GCC compilers:
– For C:
• xlc -qarch=pwr7 -qtune=pwr7 -O3 -qhot -qsimd
• gcc -mcpu=power7 -mtune=power7 -O3
– For Fortran:
• xlf -qarch=pwr7 -qtune=pwr7 -O3 -qhot -qsimd
• gfortran -mcpu=power7 -mtune=power7 -O3
Using the Engineering and Scientific Subroutine Library (ESSL) with vectorization support:
– Select routines have vector analogs in the library
– Key FFT and BLAS routines
Vector capability support in AIX
A program can determine whether a system supports the vector extension by reading the
vmx_version field of the _system_configuration structure. If this field is non-zero, then the
system processor chips and operating system contain support for the vector extension. A
__power_vmx() macro is provided in /usr/include/sys/systemcfg.h for performing this test.
A value of 2 means that the processor chip is both VMX and VSX capable.
The AIX Application Binary Interface (ABI) is extended to support the addition of vector
register state and conventions. AIX supports the AltiVec programming interface specification.
AIX provides a set of malloc subroutines (vec_malloc, vec_free, vec_realloc, and vec_calloc)
that give 16-byte aligned allocations. Vector-enabled compilation, with _VEC_ implicitly
defined by the compiler, results in any calls to the older malloc and calloc being redirected to
their vector-safe counterparts, vec_malloc and vec_calloc. Non-vector code can also be
compiled to pick up these same malloc and calloc redirections by explicitly defining
__AIXVEC.
56 Ibid.