Intel® Xeon Phi™ Coprocessor DEVELOPER’S QUICK START GUIDE
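float reduction(float *data, int size)
{
    float ret = 0.f;
    // Copy data[0..size-1] to the coprocessor and run the loop there;
    // the scalar ret is copied in and back out by default.
    #pragma offload target(mic) in(data:length(size))
    for (int i=0; i<size; ++i)
    {
        ret += data[i];
    }
    return ret;
}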
Code Example 2: Serial Reduction with Offload
Vector Reduction with Offload
Each core on the Intel® Xeon Phi™ Coprocessor has a vector processing unit (VPU). Auto-vectorization is
enabled by default in the offload compiler. Alternatively, as in the example below, the programmer can use
the Intel® Cilk™ Plus Extended Array Notation to maximize vectorization and take advantage of the 32 512-bit
vector registers in each Intel® MIC Architecture core. The offloaded code is executed by a single thread on a
single core. The thread calls the built-in reduction function __sec_reduce_add(), which uses those registers
to reduce the single-precision elements of the array sixteen at a time.
float reduction(float *data, int size)
{
    float ret = 0;
    #pragma offload target(mic) in(data:length(size))
    ret = __sec_reduce_add(data[0:size]); // Intel® Cilk™ Plus
                                          // Extended Array Notation
    return ret;
}
Code Example 3: Vector Reduction with Offload in C/C++
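To make "sixteen at a time" concrete, the following is a plain C sketch of what a 16-lane reduction does: a 512-bit register holds sixteen 32-bit floats, so each step adds sixteen consecutive elements into sixteen partial sums, which are combined at the end. This is only a scalar model under that assumption, not the code the compiler generates; the name reduction_lanes and the LANES constant are hypothetical and introduced purely for illustration.

```c
#define LANES 16  /* 512-bit register / 32-bit float = 16 lanes */

/* Hypothetical scalar model of a 16-wide vector sum; not compiler output. */
float reduction_lanes(const float *data, int size)
{
    float lane[LANES] = {0.f};  /* one partial sum per vector lane */
    int i = 0;

    /* Main loop: models one vector add per 16 consecutive elements. */
    for (; i + LANES <= size; i += LANES)
        for (int l = 0; l < LANES; ++l)
            lane[l] += data[i + l];

    /* Horizontal add: collapse the 16 partial sums into one value. */
    float ret = 0.f;
    for (int l = 0; l < LANES; ++l)
        ret += lane[l];

    /* Remainder loop for sizes that are not a multiple of 16. */
    for (; i < size; ++i)
        ret += data[i];

    return ret;
}
```

Because the elements are added in a different order than in the serial loop of Code Example 2, the floating-point result of a vectorized reduction can differ from the serial result in the last few bits.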
Asynchronous Offload and Data Transfer
Asynchronous offload and data transfer between the host and the Intel® Xeon Phi™ Coprocessor are available.
For details see the “About Asynchronous Computation” and “About Asynchronous Data Transfer” sections in
the Intel® C++ Compiler User and Reference Guide (under “Key Features/Programming for the Intel® MIC
Architecture”).
For an example showing the use of asynchronous offload and transfer, refer to
/opt/intel/composerxe/Samples/en_US/C++/mic_samples/intro_sampleC/sampleC13.c
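As a rough illustration of the idea (and not a copy of the sample referenced above), an asynchronous version of the earlier reduction might look like the sketch below. It assumes the signal clause and the offload_wait pragma described in the compiler guide; the function name async_reduction and the tag variable are hypothetical. The exact rules for when results become visible on the host should be checked in the "About Asynchronous Computation" section. A compiler without offload support ignores the pragmas and runs the loop on the host.

```c
/* Sketch of an asynchronous offload. Without offload support the pragmas
   are ignored and the loop simply runs serially on the host. */
float async_reduction(float *data, int size)
{
    float ret = 0.f;
    char tag;  /* only its address is used, as a unique signal tag */

    /* Start the offload; signal() lets the host continue immediately. */
    #pragma offload target(mic:0) signal(&tag) in(data:length(size))
    {
        for (int i = 0; i < size; ++i)
            ret += data[i];
    }

    /* Independent host work would go here. It must not read ret or
       modify data, because the offload may still be in flight. */

    /* Block until the offload associated with &tag has finished. */
    #pragma offload_wait target(mic:0) wait(&tag)

    return ret;
}
```

The benefit comes from the host work placed between the two pragmas; if there is nothing useful to overlap, the synchronous form in Code Example 2 is simpler.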
Note that when using the Explicit Memory Copy Model in C/C++, arrays are supported provided the array
element type is a scalar or a bitwise-copyable struct or class; arrays of pointers are therefore not
supported. For complex C/C++ data structures, use the Implicit Memory Copy Model. For more information,
consult the section “Restrictions on Offload Code Using a Pragma” in the Intel® C++ Compiler User and
Reference Guide.
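The restriction can be illustrated with a short sketch. An array of a plain struct with no pointer members can be copied byte for byte, so it fits the Explicit Memory Copy Model; a two-dimensional array stored as an array of row pointers does not, and one common workaround is to store it in a single contiguous buffer instead. The type point3 and the functions sum_x and sum_matrix_flat are hypothetical names used only for this example.

```c
/* Hypothetical bitwise-copyable element type: no pointer members. */
typedef struct { float x, y, z; } point3;

/* An array of point3 can be transferred with an in() clause. */
float sum_x(point3 *pts, int n)
{
    float ret = 0.f;
    #pragma offload target(mic) in(pts:length(n))
    for (int i = 0; i < n; ++i)
        ret += pts[i].x;
    return ret;
}

/* A float** (array of row pointers) is not supported by this model.
   Storing the matrix contiguously in row-major order avoids pointers. */
float sum_matrix_flat(float *flat, int rows, int cols)
{
    float ret = 0.f;
    #pragma offload target(mic) in(flat:length(rows * cols))
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c)
            ret += flat[r * cols + c];
    return ret;
}
```

Flattening works well for rectangular data; for linked or irregular structures, the Implicit Memory Copy Model is the better fit.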
Using the Offload Compiler – Implicit Memory Copy Model
Intel Composer XE 2013 SP1 includes two additional keyword extensions for C and C++ (but not Fortran) that
provide a “shared memory” offload programming model appropriate for dealing with complex, pointer-based