Chip                    Theoretical          Measured Rate        Measured Performance /
                        Bilinear Fillrate    (3DMark multitex)    Theoretical Performance
GeForce 9 Series        33,600               25,600               76.2%
GeForce GTX 200 GPUs    51,840               48,266               93.1%
Table 4: Theoretical vs Measured Texture Filtering Rates
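The last column of Table 4 is the measured rate divided by the theoretical rate. As a minimal sketch, the Python below recomputes that ratio from the two rate columns; the function name `filtering_efficiency` and the dictionary layout are illustrative choices, not part of the original brief.

```python
def filtering_efficiency(theoretical_rate, measured_rate):
    """Return measured / theoretical as a fraction between 0 and 1."""
    return measured_rate / theoretical_rate


# (theoretical bilinear fillrate, measured 3DMark multitex rate) from Table 4
table_4 = {
    "GeForce 9 Series": (33_600, 25_600),
    "GeForce GTX 200 GPUs": (51_840, 48_266),
}

for chip, (theoretical, measured) in table_4.items():
    ratio = filtering_efficiency(theoretical, measured)
    print(f"{chip}: {ratio:.1%}")
```

The recomputed percentages agree with the table when rounded to one decimal place.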
Higher Shader to Texture Ratio
Because games and other visual applications employ increasingly complex shaders,
the GeForce GTX 200 GPU design shifts the balance toward a higher
shader-to-texture ratio. Adding one more SM to each TPC (going from two to three)
while keeping the texturing hardware constant increases the shader-to-texture
ratio by 50%. This shift allows the GeForce GTX 200 GPUs to perform efficiently
in both today’s and tomorrow’s games.
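The 50% figure follows from the SM count alone, since the texturing hardware per TPC is unchanged. The sketch below works through that arithmetic under two assumptions that are not stated in this section: that the previous design had two SMs per TPC (implied by "one more SM" giving +50%), and a placeholder texture unit count, which cancels out of the result.

```python
from fractions import Fraction


def shader_to_texture_increase(old_sms_per_tpc, new_sms_per_tpc, tex_units_per_tpc):
    """Relative increase in the shader-to-texture ratio when only the SM count changes."""
    old_ratio = Fraction(old_sms_per_tpc, tex_units_per_tpc)
    new_ratio = Fraction(new_sms_per_tpc, tex_units_per_tpc)
    return new_ratio / old_ratio - 1


# Assumed: 2 SMs per TPC before, 3 after; 8 texture units is a placeholder value.
increase = shader_to_texture_increase(2, 3, tex_units_per_tpc=8)
print(f"Shader-to-texture ratio increase: {float(increase):.0%}")
```

Because the texture unit count appears in both ratios, any positive placeholder value gives the same result.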
ROP Improvements
The previous-generation GeForce 8 series ROP subsystem supported multisampled,
supersampled, transparency adaptive, and coverage sampling antialiasing. It also
supported frame buffer (FB) blending of floating-point (FP16 and FP32) render
target surfaces, and either type of FP surface could be used in conjunction with
multisampled antialiasing for outstanding HDR rendering quality.
The new GeForce GTX 200 GPU ROP subsystem supports all of the previous-generation
features and delivers a maximum output of 32 pixels per clock, equating
to 4 pixels/clock per ROP partition × 8 partitions. Each ROP partition supports up
to 32 color and Z samples per clock for 8× MSAA. Pixels using the U8 (8-bit
unsigned integer) data format can be blended at twice the rate per TPC of the
older-generation GPUs. Because the prior-generation GPU had six ROP partitions, it
could output 24 pixels/clock and blend 12 pixels/clock. In contrast, the GeForce
GTX 280 can both output and blend 32 pixels/clock.
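The ROP figures above can be cross-checked with simple multiplication. The following is a sketch only: the class `RopConfig` and its field names are hypothetical, and the prior generation's 4 pixels/clock per partition is inferred from the stated 24 pixels/clock across six partitions.

```python
from dataclasses import dataclass


@dataclass
class RopConfig:
    partitions: int                 # number of ROP partitions
    output_px_per_partition: int    # pixels written per clock, per partition
    blend_px_per_clock: int         # total blended pixels per clock (from the text)

    @property
    def output_px_per_clock(self):
        return self.partitions * self.output_px_per_partition

    @property
    def blend_px_per_partition(self):
        return self.blend_px_per_clock / self.partitions


prior_gen = RopConfig(partitions=6, output_px_per_partition=4, blend_px_per_clock=12)
gtx_280 = RopConfig(partitions=8, output_px_per_partition=4, blend_px_per_clock=32)

# 8x MSAA: 4 pixels/clock x 8 samples per pixel = 32 samples/clock per partition
samples_per_partition_8x = gtx_280.output_px_per_partition * 8

for name, cfg in (("Prior generation", prior_gen), ("GeForce GTX 280", gtx_280)):
    print(f"{name}: output {cfg.output_px_per_clock} px/clk, "
          f"blend {cfg.blend_px_per_clock} px/clk")
```

Under these figures, the blend rate per partition doubles from 2 to 4 pixels/clock, which is consistent with the "twice the rate" statement.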
1 GB Framebuffer
Today’s 3D games use a variety of textures to attain realism. Normal maps are used
to enhance surface realism, cubemaps for reflections, and high-resolution
perspective shadow maps for soft shadows. A single scene therefore needs far more
memory than it did under classic rendering, which relied mainly on the
base texture. Deferred rendering engines also make extensive use of multiple render
targets, where attributes of the image are rendered off screen before the final image
is composed. These techniques consume an immense amount of video memory and
memory bandwidth, especially when used in conjunction with antialiasing.
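To give a feel for how multiple render targets and antialiasing multiply memory use, here is a rough back-of-the-envelope sketch. Every number in it is an assumption chosen for illustration (the 1920 × 1200 resolution, four FP16 RGBA targets, a 32-bit depth buffer, 4× MSAA), and it assumes every sample is stored in full with no compression; it does not describe any particular game or the GPU's actual allocation strategy.

```python
def render_target_bytes(width, height, bytes_per_sample_per_target, msaa_samples):
    """Estimate memory for a set of render targets, storing every MSAA sample in full."""
    bytes_per_sample = sum(bytes_per_sample_per_target)
    return width * height * msaa_samples * bytes_per_sample


# Hypothetical deferred-rendering setup (all values are illustrative assumptions):
# four FP16 RGBA targets (8 bytes each) plus one 32-bit depth buffer (4 bytes).
targets = [8, 8, 8, 8, 4]

no_aa = render_target_bytes(1920, 1200, targets, msaa_samples=1)
msaa_4x = render_target_bytes(1920, 1200, targets, msaa_samples=4)

print(f"No antialiasing: {no_aa / 2**20:.1f} MiB")
print(f"4x MSAA:         {msaa_4x / 2**20:.1f} MiB")
```

In this simplified model the render targets alone grow linearly with the sample count, before any textures, shadow maps, or geometry are counted.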
May, 2008 | TB-04044-001_v01