Intel® Xeon Phi™ Coprocessor DEVELOPER'S QUICK START GUIDE 20
data structures such as linked lists, binary trees, and the like (_Cilk_shared and _Cilk_offload). This
model places variables to be shared between the host and coprocessor (marked with the _Cilk_shared
keyword) at the same virtual addresses on both machines, and synchronizes their values at the beginning and
end of offload function calls marked with the _Cilk_offload keyword. Data to be synchronized can also
be dynamically allocated using special allocation and free calls that ensure the allocated memory exists at the
same virtual addresses on both machines.
APIs for dynamic shared memory allocation:
void *_Offload_shared_malloc(size_t size);
void _Offload_shared_free(void *p);
APIs for dynamic aligned shared memory allocation:
void *_Offload_shared_aligned_malloc(size_t size, size_t alignment);
void _Offload_shared_aligned_free(void *p);
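As a rough illustration of how the allocation and free calls pair up, the sketch below wraps them in two macros. It rests on several assumptions: that the `__INTEL_OFFLOAD` macro is defined when the Intel compiler builds with offload support, that `offload.h` declares the calls listed above, and that falling back to `malloc`/`free` is acceptable when neither holds. The helper names (`make_shared_array`, `sum_array`, `release_shared_array`) are hypothetical. The `_Cilk_shared` qualifiers that a real offload build would need on the pointer are deliberately left out, so treat this as a host-side sketch rather than a tested offload program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumption: __INTEL_OFFLOAD marks an Intel offload-enabled build.
   Anywhere else we fall back to malloc/free so the sketch still runs. */
#ifdef __INTEL_OFFLOAD
#include <offload.h>
#define SHARED_ALLOC(n) _Offload_shared_malloc(n)
#define SHARED_FREE(p)  _Offload_shared_free(p)
#else
#define SHARED_ALLOC(n) malloc(n)
#define SHARED_FREE(p)  free(p)
#endif

/* Hypothetical helper: allocate `count` floats from the "shared" allocator
   and set each one to `value`. Returns NULL if the allocation fails. */
float *make_shared_array(size_t count, float value)
{
    assert(count > 0);
    float *a = (float *)SHARED_ALLOC(count * sizeof *a);
    if (a == NULL)
        return NULL;
    for (size_t i = 0; i < count; ++i)
        a[i] = value;
    return a;
}

/* Hypothetical helper: serial sum, standing in for work done on the data. */
float sum_array(const float *a, size_t count)
{
    float total = 0.0f;
    for (size_t i = 0; i < count; ++i)
        total += a[i];
    return total;
}

/* Memory from the shared allocator must go back through the matching
   free call, never through plain free() on an offload build. */
void release_shared_array(float *a)
{
    SHARED_FREE(a);
}
```

The point of routing both calls through one pair of macros is that allocation and release can never be mismatched: whatever allocator produced the block is the one that frees it.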
It should be noted that this is not actually “shared memory”: there is no hardware that maps some portion of
the memory on the Intel® Xeon Phi™ Coprocessor to the host system. The memory subsystems on the
coprocessor and host are completely independent, and this programming model is just a different way of
copying data between these memory subsystems at well-defined synchronization points. The copying is
implicit, in that these synchronization points (offload calls marked with _Cilk_offload) do not specify
what data to copy. Rather, the runtime determines what data has changed between the host and coprocessor,
and copies only the deltas at the beginning and end of the offload function call.
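The idea of copying only the deltas between two independent memories can be sketched in plain C. This is a conceptual model only, not a description of how the Intel runtime detects changes: the two buffers stand in for host and coprocessor memory, the `shadow` buffer (an assumption of this sketch) records the state at the last synchronization point, and `CHUNK_BYTES` and `sync_deltas` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical granularity at which changes are detected and copied. */
#define CHUNK_BYTES 64

/* Copy from `src` to `dst` only those chunks that differ from `shadow`,
   the snapshot taken at the previous synchronization point. The snapshot
   is updated for each copied chunk. Returns the number of bytes copied. */
size_t sync_deltas(const unsigned char *src, unsigned char *dst,
                   unsigned char *shadow, size_t n)
{
    assert(src != NULL && dst != NULL && shadow != NULL);
    size_t copied = 0;
    for (size_t off = 0; off < n; off += CHUNK_BYTES) {
        size_t len = (n - off < CHUNK_BYTES) ? (n - off) : CHUNK_BYTES;
        if (memcmp(src + off, shadow + off, len) != 0) {
            memcpy(dst + off, src + off, len);    /* transfer the changed chunk */
            memcpy(shadow + off, src + off, len); /* remember the synced state  */
            copied += len;
        }
    }
    return copied;
}
```

In this model, calling `sync_deltas(host, device, shadow, n)` corresponds to the start of an offload call and `sync_deltas(device, host, shadow, n)` to its end. If a single byte of a 256-byte buffer changes, one 64-byte chunk is transferred, and an immediately repeated call transfers nothing.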
The following code sample demonstrates the use of the _Cilk_shared and _Cilk_offload keywords
and the dynamic allocation of “shared” memory.
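#include <stdio.h>
#include <omp.h>

float * _Cilk_shared data; // pointer to "shared" memory

_Cilk_shared float MIC_OMPReduction(int size)
{
#ifdef __MIC__
    float Result = 0.0f; // the reduction accumulates into this, so start from zero
    int nThreads = 32;
    omp_set_num_threads(nThreads);
    // Each thread sums into a private copy of Result; the copies are combined at the end
    #pragma omp parallel for reduction(+:Result)
    for (int i = 0; i < size; ++i)
    {
        Result += data[i];
    }
    return Result;
#else
    printf("Intel(R) Xeon Phi(TM) Coprocessor not available\n");
#endif
    return 0.0f;
}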
int main()
{