Intel® Xeon Phi™ Coprocessor DEVELOPER'S QUICK START GUIDE
    int nthreads = omp_get_max_threads();
    int ElementsPerThread = size/nthreads;
    // Each iteration reduces one contiguous chunk; OpenMP combines the partial sums.
    #pragma omp parallel for reduction(+:ret)
    for(int i=0;i<nthreads;i++)
    {
      ret += __sec_reduce_add(
        data[i*ElementsPerThread:ElementsPerThread]);
    }
    //rest of the array
    for(int i=nthreads*ElementsPerThread; i<size; i++)
    {
      ret+=data[i];
    }
  }
  return ret;
}
Code Example 7: Array Reduction Using OpenMP and Intel® Cilk™ Plus in C/C++
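The chunking arithmetic in Code Example 7 can be checked on any host with a portable sketch such as the one below. It is not the guide's code: the name ReduceChunked and the chunks parameter are hypothetical, chunks stands in for the value returned by omp_get_max_threads(), and std::accumulate stands in for __sec_reduce_add over an array section. The chunk loop runs serially here, whereas the original distributes it across threads.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>

// Portable, serial sketch of the chunked reduction (hypothetical name).
// `chunks` plays the role of omp_get_max_threads() in the original listing.
float ReduceChunked(const float* data, std::size_t size, std::size_t chunks)
{
    assert(chunks > 0);
    const std::size_t elementsPerChunk = size / chunks;
    float ret = 0.0f;

    // One partial sum per chunk. In the original, these iterations run in
    // parallel and are combined by the OpenMP reduction(+:ret) clause.
    for (std::size_t i = 0; i < chunks; ++i) {
        const float* first = data + i * elementsPerChunk;
        ret += std::accumulate(first, first + elementsPerChunk, 0.0f);
    }

    // Rest of the array: the size % chunks elements left over by the
    // integer division above.
    for (std::size_t i = chunks * elementsPerChunk; i < size; ++i) {
        ret += data[i];
    }
    return ret;
}
```

When chunks is larger than size, elementsPerChunk is zero and the trailing loop sums the whole array, so the result stays correct even though no work is divided.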
Parallel Programming on the Intel® Xeon Phi™ Coprocessor: Intel® Cilk™ Plus
Intel Cilk Plus header files are not available on the target environment by default. To make the header files available to an application built for the Intel® MIC Architecture using Intel Cilk Plus, wrap the header files with #pragma offload_attribute(push,target(mic)) and #pragma offload_attribute(pop) as follows:
#pragma offload_attribute(push,target(mic))
#include <cilk/cilk.h>
#include <cilk/reducer_opadd.h>
#pragma offload_attribute(pop)
Code Example 8: Wrapping the Header Files in C/C++
In the following example, the compiler converts the cilk_for loop into a recursively called function using an efficient divide-and-conquer strategy.
float ReduceCilk(float *data, int size)
{
  float ret = 0;
  #pragma offload target(mic) in(data:length(size))
  {
    // The reducer type matches the float elements being summed.
    cilk::reducer_opadd<float> total;
    cilk_for (int i=0; i<size; ++i)
    {
      total += data[i];
    }
    ret = total.get_value();
  }
  return ret;
}
Code Example 9: Creating a Recursively Called Function by Converting the “cilk_for” Loop
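To make the divide-and-conquer idea concrete, the following standard C++ sketch shows roughly the shape of recursion that a loop like the one in Code Example 9 can be turned into. It is an illustration only, not the code the compiler generates: SumRange, ReduceRecursive, and the grain size of 4 are hypothetical choices, and the two halves run one after the other here instead of being scheduled in parallel.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: sums data[begin, end) by halving the range until it
// holds at most `grain` elements, then falling back to a serial loop.
float SumRange(const float* data, std::size_t begin, std::size_t end,
               std::size_t grain)
{
    assert(grain > 0 && begin <= end);
    if (end - begin <= grain) {
        float partial = 0.0f;
        for (std::size_t i = begin; i < end; ++i) {
            partial += data[i];
        }
        return partial;
    }
    const std::size_t mid = begin + (end - begin) / 2;
    // A parallel runtime could execute these two calls concurrently;
    // this sketch simply runs them in sequence.
    const float left = SumRange(data, begin, mid, grain);
    const float right = SumRange(data, mid, end, grain);
    return left + right;
}

// Hypothetical wrapper mirroring the signature style of ReduceCilk.
float ReduceRecursive(const float* data, std::size_t size)
{
    return SumRange(data, 0, size, 4);  // grain of 4 is an arbitrary choice
}
```

Because each level halves the range, the recursion depth grows with the logarithm of size, and every leaf does at most grain additions. Note that the order of floating-point additions differs from a plain left-to-right loop, so results on general data may differ in the last bits.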