22007E/0—November 1999
AMD Athlon™ Processor x86 Code Optimization
which might inhibit certain optimizations with some
compilers (for example, aggressive inlining).
Dynamic Memory Allocation Consideration
Dynamic memory allocation (‘malloc’ in the C language) should
always return a pointer that is suitably aligned for the largest
base type (quadword alignment, that is, 8 bytes). Where this
aligned pointer cannot be guaranteed, use the technique shown in
the following code to make the pointer quadword aligned, if
needed. This code assumes the pointer can be cast to a long.
Example:
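double* p;
double* np;
/* Allocate 7 extra bytes so the pointer can be rounded up. */
p = (double *)malloc(sizeof(double)*number_of_doubles+7L);
/* Round up to the next multiple of 8 (quadword boundary). */
np = (double *)((((long)(p))+7L) & (-8L));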
Then use ‘np’ instead of ‘p’ to access the data. ‘p’ is still needed
in order to deallocate the storage.
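On platforms where a long is narrower than a pointer, the cast in the example above would truncate the address. The following is a minimal sketch of the same round-up technique using the C99 type uintptr_t instead. The helper name alloc_doubles_aligned8 and its raw out-parameter are hypothetical, chosen only for illustration, and the sketch does no overflow checking on the size computation.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not from the guide): returns a quadword-aligned
   pointer inside a block obtained from malloc. *raw receives the
   original pointer, which is the one that must be passed to free(). */
static double *alloc_doubles_aligned8(size_t count, void **raw)
{
    /* Allocate 7 extra bytes so rounding up stays inside the block. */
    *raw = malloc(count * sizeof(double) + 7);
    if (*raw == NULL)
        return NULL;

    /* uintptr_t is an integer type wide enough to hold a pointer.
       Adding 7 and clearing the low three bits rounds the address
       up to the next multiple of 8. */
    uintptr_t addr = ((uintptr_t)*raw + 7u) & ~(uintptr_t)7;
    return (double *)addr;
}
```

A caller would keep both pointers: the returned pointer to access the data and the one stored through raw to deallocate the storage, mirroring the roles of ‘np’ and ‘p’ above.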
Introduce Explicit Parallelism into Code
Where possible, long dependency chains should be broken into
several independent dependency chains, which can then be
executed in parallel, exploiting the pipelined execution units.
This is especially important for floating-point code, whether it
is mapped to x87 or 3DNow! instructions, because of the longer
latency of floating-point operations. Since most languages,
including ANSI C, guarantee that floating-point expressions are
not re-ordered, compilers cannot usually perform such
optimizations unless they offer a switch to allow ANSI
non-compliant reordering of floating-point expressions according
to algebraic rules.
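As an illustration of breaking one dependency chain into several, the following sketch sums an array first with a single accumulator and then with four independent accumulators. The function names and the choice of four chains are assumptions made for this example, not values taken from the guide. For general floating-point inputs the two functions can differ in the low-order bits of the result, because the additions are performed in a different order.

```c
#include <stddef.h>

/* One long dependency chain: every add waits for the previous add. */
static double sum_serial(const double *a, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Four independent chains: the adds into s0..s3 do not depend on
   one another, so they can overlap in the floating-point pipeline. */
static double sum_four_chains(const double *a, size_t n)
{
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    /* Add the remaining 0 to 3 elements. */
    for (; i < n; i++)
        s0 += a[i];

    /* Combine the partial sums at the end. */
    return (s0 + s1) + (s2 + s3);
}
```

The reordering is written out by hand here precisely because a compiler that follows the language rules is not free to make this change on its own.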
Note that re-ordered code that is algebraically identical to the
original code does not necessarily deliver identical
computational results due to the lack of associativity of
floating-point operations. There are well-known numerical
considerations in applying these optimizations (consult a book
on numerical analysis). In some cases, these optimizations may