
at the right balance of frame buffer bandwidth required to keep the texture units 
fully utilized and not starved for data. 
General frame buffer efficiency has been improved for GeForce GTX 200 GPUs. 
We reworked the critical paths in the frame buffer to allow higher speed memory 
operation, up to 1.1 GHz GDDR3 stock speed. Memory bank access patterns and 
caching algorithms have also been improved. Additional compression hardware in 
GeForce GTX 200 GPUs effectively increases frame buffer bandwidth by permitting 
more data to traverse the interface per unit time, enabling better performance at 
higher resolutions. 
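
The arithmetic behind the memory speed figure can be sketched as follows. This is a rough illustration, not an NVIDIA formula: it assumes a 512-bit memory interface (a width not stated in this section) and two transfers per clock for GDDR3, and the function name is hypothetical. It gives the theoretical peak only and does not model the gain from compression.

```python
def peak_bandwidth_gb_per_s(memory_clock_ghz: float,
                            bus_width_bits: int,
                            transfers_per_clock: int = 2) -> float:
    """Theoretical peak frame buffer bandwidth in GB/s (1 GB = 1e9 bytes).

    transfers_per_clock defaults to 2 because GDDR3 is a double data rate
    memory; bus_width_bits is an assumption supplied by the caller.
    """
    bytes_per_transfer = bus_width_bits / 8
    return memory_clock_ghz * transfers_per_clock * bytes_per_transfer


if __name__ == "__main__":
    # 1.1 GHz stock speed from the text; 512-bit width is an assumption.
    print(f"{peak_bandwidth_gb_per_s(1.1, 512):.1f} GB/s")  # prints "140.8 GB/s"
```
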

Power Management Enhancements 

GeForce GTX 200 GPUs include a more dynamic and flexible power management 
architecture than previous-generation NVIDIA GPUs. Four different performance / 
power modes are employed: 

• Idle/2D power mode (approx. 25 W) 
• Blu-ray DVD playback mode (approx. 35 W) 
• Full 3D performance mode (varies; worst-case TDP 236 W) 
• HybridPower™ mode (effectively 0 W) 

 
Using a HybridPower-capable nForce motherboard, such as those based on the 
nForce 780a chipset, a GeForce GTX 200 GPU can be fully powered off when not 
performing intensive graphics operations and graphics output can be handled by the 
motherboard GPU (mGPU).  
For 3D graphics-intensive applications, the NVIDIA driver can seamlessly switch 
between the power modes based on utilization of the GPU. Each of the new 
GeForce GTX 200 GPUs integrates utilization monitors (“digital watchdogs”) that 
constantly check the amount of traffic occurring inside of the GPU. Based on the 
level of utilization reported by these monitors, the GPU driver can dynamically set 
the appropriate performance mode (i.e., a defined clock and voltage level) that 
minimizes the power draw of the graphics card—all fully transparent to the end 
user.  
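
The utilization-driven mode selection described above can be sketched roughly as follows. This is an illustration only, not the actual driver logic: the thresholds `low` and `high`, the function name, and the mapping of moderate utilization to the video playback mode are all hypothetical. Only the mode names and approximate power figures come from the list of power modes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerMode:
    name: str
    approx_watts: float


# Mode names and power figures as listed in this brief.
HYBRID_POWER = PowerMode("HybridPower", 0.0)
IDLE_2D = PowerMode("Idle/2D", 25.0)
VIDEO = PowerMode("Blu-ray DVD playback", 35.0)
FULL_3D = PowerMode("Full 3D performance", 236.0)  # worst-case TDP


def select_mode(utilization: float,
                hybrid_power_board: bool = False,
                low: float = 0.10,
                high: float = 0.40) -> PowerMode:
    """Pick a power mode from a utilization reading in the range 0.0 to 1.0.

    The thresholds are hypothetical; the real driver policy is not
    documented in this section.
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0.0 and 1.0")
    if utilization < low:
        # On a HybridPower-capable board the discrete GPU can be powered
        # off and the motherboard GPU drives the display instead.
        return HYBRID_POWER if hybrid_power_board else IDLE_2D
    if utilization < high:
        return VIDEO
    return FULL_3D


if __name__ == "__main__":
    for reading in (0.02, 0.25, 0.90):
        mode = select_mode(reading)
        print(f"utilization {reading:.2f} -> {mode.name} (~{mode.approx_watts:.0f} W)")
```

A real policy would likely add hysteresis so that the mode does not oscillate when utilization hovers near a threshold; that is omitted here for brevity.
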
The GPU also has clock-gating circuitry, which effectively “shuts down” blocks of 
the GPU that are not in use at a particular time (where time is measured in 
milliseconds), further reducing power during periods of non-peak GPU utilization.  

All this enables GeForce GTX 200 graphics cards to deliver idle power that is nearly 
1/10th of their maximum power (approximately 25 W on GeForce GTX 280 GPUs). 
This dynamic power range gives you incredible power efficiency across a full range 
of applications (gaming, video playback, surfing the web, etc.). 
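
The "nearly 1/10th" figure follows directly from the mode list: 25 W idle against a 236 W worst-case TDP. The short sketch below checks that ratio and estimates average power over a usage mix. The usage mix and the function names are hypothetical, and because full 3D power varies, using the worst-case TDP gives an upper bound rather than a measurement.

```python
IDLE_W = 25.0      # Idle/2D mode, approximate
VIDEO_W = 35.0     # Blu-ray DVD playback mode, approximate
FULL_3D_W = 236.0  # Full 3D mode, worst-case TDP


def idle_to_peak_ratio(idle_w: float = IDLE_W, peak_w: float = FULL_3D_W) -> float:
    """Ratio of idle power to worst-case power."""
    return idle_w / peak_w


def average_power_w(usage: list[tuple[float, float]]) -> float:
    """Time-weighted average power for (watts, hours) pairs."""
    total_hours = sum(hours for _, hours in usage)
    if total_hours <= 0:
        raise ValueError("total hours must be positive")
    energy_wh = sum(watts * hours for watts, hours in usage)
    return energy_wh / total_hours


if __name__ == "__main__":
    print(f"idle/peak ratio: {idle_to_peak_ratio():.3f}")  # prints 0.106
    # Hypothetical day: 2 h gaming, 1 h video playback, 5 h idle desktop.
    mix = [(FULL_3D_W, 2), (VIDEO_W, 1), (IDLE_W, 5)]
    print(f"average power: {average_power_w(mix):.1f} W")  # prints 79.0 W
```
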
Many other areas of the GeForce GTX 200 GPU pipeline have been reworked to 
improve performance and reduce various processing bottlenecks.  

Additional Pipeline and Architecture Enhancements 

Starting from the top of the GeForce GTX 200 GPUs, the front-end unit 
communicates with the graphics driver running on the host system to accept 
commands and data. The communication protocol and certain software classes have 


   

May, 2008  |  TB-04044-001_v01 

 
