
Cryptographic Performance on the 2nd Generation Intel® Core™ processor family

processors based on Intel® microarchitecture code named Westmere, we maximize performance with 4 buffers.

Performance 

The performance results provided in this section were measured on two processors supporting Intel® AES-NI:

• Intel® Core™ i5-650 processor at a frequency of 3.20 GHz, based on the Intel® microarchitecture code named Westmere

• Intel® Core™ i7-2600 processor at a frequency of 3.40 GHz, based on the 2nd Generation Intel® Core™ processor family

The fact that the two processors have different clock frequencies does not affect the comparison, as we have normalized the performance results to cycles.

The tests, conducted by Intel, were run with Intel® Turbo Boost Technology off, and represent the performance without Intel® Hyper-Threading Technology (Intel® HT Technology) on a single core.

The multi-buffer code bases for AES, MD5, SHA1, and SHA256 were measured on fixed-size 64-byte data buffers without a scheduler. The AES-128 CBC Encrypt implementation used pre-expanded keys. AES with 128-bit keys is a common usage; the results will be similar and scale for other key sizes such as 192-bit and 256-bit keys.

The modular exponentiation code bases were measured on 512-bit and 1024-bit keys. These algorithms form the basis of one of the most critical server workloads, the RSA signing algorithm. RSA Sign performs a decryption process that can be implemented efficiently with the Chinese Remainder Theorem (CRT). With CRT, the exponentiation is carried out separately modulo each of the two prime factors, which are half the size of the RSA modulus. As a result, RSA2048 requires 1024-bit modular exponentiation and RSA1024 requires 512-bit modular exponentiation.
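The way CRT halves the operand size can be illustrated with a toy example. The following is a minimal sketch, not the measured implementation: it uses 64-bit integers and tiny primes instead of multi-precision arithmetic, and the names `mul_mod`, `mod_exp`, `toy_rsa_crt_key`, and `toy_rsa_sign_crt` are hypothetical names introduced here for illustration. The recombination step uses Garner's formula, which is one common choice.

```c
#include <stdint.h>

/* Modular multiply; assumes m < 2^32 so the product fits in 64 bits. */
static uint64_t mul_mod(uint64_t a, uint64_t b, uint64_t m) {
    return ((a % m) * (b % m)) % m;
}

/* Square-and-multiply modular exponentiation: base^exp mod m. */
static uint64_t mod_exp(uint64_t base, uint64_t exp, uint64_t m) {
    uint64_t result = 1 % m;
    base %= m;
    while (exp > 0) {
        if (exp & 1) {
            result = mul_mod(result, base, m);
        }
        base = mul_mod(base, base, m);
        exp >>= 1;
    }
    return result;
}

/* Hypothetical toy key layout: private key in CRT form. */
typedef struct {
    uint64_t p, q;   /* secret primes, modulus n = p * q      */
    uint64_t dp, dq; /* d mod (p - 1) and d mod (q - 1)       */
    uint64_t qinv;   /* inverse of q modulo p                 */
} toy_rsa_crt_key;

/* Private-key operation msg^d mod n via CRT: two exponentiations with
 * half-size moduli (p and q), then Garner recombination. */
static uint64_t toy_rsa_sign_crt(const toy_rsa_crt_key *k, uint64_t msg) {
    uint64_t sp = mod_exp(msg, k->dp, k->p); /* msg^d mod p */
    uint64_t sq = mod_exp(msg, k->dq, k->q); /* msg^d mod q */

    /* h = qinv * (sp - sq) mod p, keeping the difference non-negative. */
    uint64_t diff = (sp + k->p - (sq % k->p)) % k->p;
    uint64_t h = mul_mod(k->qinv, diff, k->p);

    return sq + h * k->q; /* unique value in [0, n) matching sp and sq */
}
```

With the toy key p = 61, q = 53 (so n = 3233), e = 17, and d = 2753, the CRT parameters are dp = 53, dq = 49, and qinv = 38. In the real case the same structure applies with p and q each 1024 bits long for RSA2048, which is why the 1024-bit modular exponentiation is the routine that dominates the cost.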

We present results here with warm data. When a warm-data test is called, the routine under test is first run numerous times to warm up the cache.

The timing is measured using the rdtsc() function, which returns the processor time stamp counter (TSC). The TSC is the number of clock cycles since the last reset. ‘TSC_initial’ is the TSC recorded before the function is called. The function is then called the specified number of times on data buffers of a given size. After the runs are complete, rdtsc() is called again to record the new cycle count, ‘TSC_final’. The effective cycle count for the called routine is computed using

# of cycles = (TSC_final - TSC_initial) / (number of iterations).
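The measurement procedure described above (warm-up runs, one TSC read before and one after the timed loop, then division by the iteration count) can be sketched as follows. This is an illustrative harness under stated assumptions, not the code used for the published results: `cycles_per_call`, `read_tsc`, `read_sim`, and `sim_routine` are hypothetical names, the counter is passed in as a function pointer so the arithmetic can be checked without a real TSC, and `__rdtsc()` is the compiler intrinsic provided by `<x86intrin.h>` on GCC and Clang for x86 targets.

```c
#include <stdint.h>

typedef uint64_t (*counter_fn)(void);   /* returns the current tick count */
typedef void (*routine_fn)(void *ctx);  /* routine under test             */

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
/* Reads the processor time stamp counter via the __rdtsc() intrinsic. */
static uint64_t read_tsc(void) {
    return __rdtsc();
}
#endif

/* Returns (TSC_final - TSC_initial) / iterations for the given routine.
 * The routine is first run warmup_runs times so that code and data are
 * in the cache (the "warm data" case) before timing starts. */
static uint64_t cycles_per_call(counter_fn now, routine_fn routine, void *ctx,
                                uint64_t warmup_runs, uint64_t iterations) {
    if (iterations == 0) {
        return 0; /* nothing to measure; avoid dividing by zero */
    }
    for (uint64_t i = 0; i < warmup_runs; i++) {
        routine(ctx);
    }
    uint64_t tsc_initial = now();
    for (uint64_t i = 0; i < iterations; i++) {
        routine(ctx);
    }
    uint64_t tsc_final = now();
    return (tsc_final - tsc_initial) / iterations;
}

/* Simulated counter (an assumption for illustration only): each call to
 * sim_routine advances the counter by the cost stored in ctx, which makes
 * the result of cycles_per_call deterministic. */
static uint64_t sim_ticks;

static uint64_t read_sim(void) {
    return sim_ticks;
}

static void sim_routine(void *ctx) {
    sim_ticks += *(const uint64_t *)ctx;
}
```

On an x86 machine, passing `read_tsc` as the counter together with the routine to be measured (for example, a hash over a 64-byte buffer) gives the per-call cycle count in the sense of the formula above. Note that rdtsc is not a serializing instruction, so a more careful harness may add fencing around the two counter reads; with many iterations the effect on the average is usually small.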

White Paper, January 2011. Vinodh Gopal, Jim Guilford, Wajdi Feghali, Erdinc Ozturk, Gil Wolrich, Kirk Yap, Sean Gulley, Martin Dixon, IA Architect...
