Based on traditional processing core designs that can perform integer and floating-
point math, memory operations, and logic operations, each processing core is a
hardware-multithreaded processor with multiple pipeline stages that execute an
instruction for each thread every clock.
Various types of threads exist, including pixel, vertex, geometry, and compute. For
graphics processing, threads execute a shader program and many related threads
often simultaneously execute the same shader program for greater efficiency.
All GeForce GTX 200 GPUs include a substantial portion of die area dedicated to
processing, unlike CPUs where a majority of die area is dedicated to onboard cache
memory. Rough estimates show 20% of the transistors of a CPU are dedicated to
computation, compared to 80% of GPU transistors. GPU processing is centered on
computation and throughput, where CPUs focus heavily on reducing latency and
keeping their pipelines busy (high cache hit rates and efficient branch prediction).
Graphics Processing Architecture
As mentioned earlier, the GeForce GTX 200 GPUs include two different
architectural personalities—graphics and computing. Figure 4 represents the
GeForce 280 GTX in graphics mode. You can see the shader thread dispatch logic
at the top, in addition to setup and raster units. The ten TPCs each include three
SMs, and each SM has 24 processing cores for a total of 240 scalar processing cores.
ROP (raster operations processors) and memory interface units are located at the
bottom.
Figure 4: GeForce GTX 280 GPU Graphics Processing Architecture
10
May, 2008 | TB-04044-001_v01