Reference Number: 327043-001
Intel® Xeon® Processor E5-2600 Product Family Uncore Performance Monitoring
• Definition:
Counts the number of cycles that the Intel® QPI RxQ was not empty. Generally, when
data is transmitted across Intel® QPI, it will bypass the RxQ and pass directly to the ring interface.
If traffic backs up on its way onto the ring, however, the data may need to allocate into this buffer,
thus increasing the latency. This event can be used in conjunction with the Flit Buffer Occupancy
Accumulator event to calculate the average occupancy.
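The occupancy calculation mentioned above can be sketched as follows. This is a minimal illustration, assuming the two counter values have already been read over the same sampling interval; the function and parameter names are hypothetical and do not come from this guide. Dividing by the not-empty cycle count gives the average occupancy over the cycles in which the RxQ held at least one entry.

```python
def average_rxq_occupancy(occupancy_accumulated: int, cycles_not_empty: int) -> float:
    """Average RxQ occupancy over the cycles in which the queue was not empty.

    occupancy_accumulated: value of the Flit Buffer Occupancy Accumulator event
    cycles_not_empty:      value of the RxQ-not-empty cycle count described above
    Both names are illustrative assumptions, not register names.
    """
    if cycles_not_empty == 0:
        # The queue was never occupied in this interval (everything bypassed it).
        return 0.0
    return occupancy_accumulated / cycles_not_empty


if __name__ == "__main__":
    # Made-up sample counts, for illustration only.
    print(average_rxq_occupancy(12_000, 4_000))
```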
RxL_FLITS_G0
• Title:
Flits Received - Group 0
• Category:
FLITS_RX Events
• Event Code:
0x01
• Max. Inc/Cyc:
2
• Register Restrictions:
0-3
• Definition:
Counts the number of flits received from the Intel® QPI Link. It includes filters for
Idle, protocol, and Data Flits. Each "flit" is made up of 80 bits of information (in addition to some
ECC data). In full-width (L0) mode, flits are made up of four "fits", each of which contains 20 bits
of data (along with some additional ECC data). In half-width (L0p) mode, the fits are only 10 bits,
and therefore it takes twice as many fits to transmit a flit. When one talks about Intel® QPI
"speed" (for example, 8.0 GT/s), the "transfers" here refer to "fits". Therefore, in L0, the system
will transfer 1 "flit" at the rate of 1/4th the Intel® QPI speed. One can calculate the bandwidth of
the link by taking: flits*80b/time. Note that this is not the same as "data" bandwidth. For example,
when transferring a 64B cacheline across Intel® QPI, the link breaks it into 9 flits: 1 with
header information and 8 that each carry 64 bits of actual "data" plus an additional 16 bits of other
information. To calculate "data" bandwidth, one should therefore compute: data flits * 8B / time (for L0),
or use 4B instead of 8B for L0p.
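The formulas in this definition can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, the counter values are assumed to have been sampled over a known time interval, and the constants are taken directly from the text above (80 bits per flit, 8B of data per data flit in L0, 4B in L0p, four fits per flit in L0).

```python
FLIT_BITS = 80            # bits of information per flit (excluding ECC)
DATA_BYTES_PER_FLIT_L0 = 8    # data bytes credited per data flit in L0
DATA_BYTES_PER_FLIT_L0P = 4   # the text says to use 4B instead of 8B for L0p
FITS_PER_FLIT_L0 = 4      # four 20-bit fits per flit in full-width mode


def flits_per_second_l0(transfers_per_second: float) -> float:
    """Flit rate in L0: one flit per four transfers ("fits")."""
    return transfers_per_second / FITS_PER_FLIT_L0


def link_bandwidth_bits_per_s(flits: int, seconds: float) -> float:
    """Raw link bandwidth: flits * 80b / time."""
    return flits * FLIT_BITS / seconds


def data_bandwidth_bytes_per_s(data_flits: int, seconds: float,
                               half_width: bool = False) -> float:
    """Data bandwidth: data flits * 8B / time (L0), or 4B per flit for L0p."""
    per_flit = DATA_BYTES_PER_FLIT_L0P if half_width else DATA_BYTES_PER_FLIT_L0
    return data_flits * per_flit / seconds


if __name__ == "__main__":
    # At 8.0 GT/s in L0 the link moves 2.0e9 flits per second.
    rate = flits_per_second_l0(8.0e9)
    print(rate, link_bandwidth_bits_per_s(int(rate), 1.0))
    # One 64B cacheline: 9 flits on the wire, 8 of them data flits.
    print(data_bandwidth_bytes_per_s(8, 1.0))
```

Comparing the two results for the same interval shows how much of the raw link bandwidth is spent on headers and other non-data information.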
RxL_FLITS_G1
• Title:
Flits Received - Group 1
• Category:
FLITS_RX Events
• Event Code:
0x02
• Extra Select Bit:
Y
• Max. Inc/Cyc:
2
• Register Restrictions:
0-3
• Definition:
Counts the number of flits received from the Intel® QPI Link. This is one of three
"groups" that allow us to track flits. It includes filters for SNP, HOM, and DRS message classes.
Each "flit" is made up of 80 bits of information (in addition to some ECC data). In full-width (L0)
mode, flits are made up of four "fits", each of which contains 20 bits of data (along with some additional
ECC data). In half-width (L0p) mode, the fits are only 10 bits, and therefore it takes twice as
many fits to transmit a flit. When one talks about Intel® QPI "speed" (for example, 8.0 GT/s), the
Table 2-98. Unit Masks for RxL_FLITS_G0

Extension   umask [15:8]   Description
IDLE        bxxxxxxx1      Idle and Null Flits: Number of flits received over Intel® QPI
                           that do not hold protocol payload. When Intel® QPI is not in a
                           power saving state, it continuously transmits flits across the
                           link. When there are no protocol flits to send, it will send IDLE
                           and NULL flits across. These flits sometimes do carry a payload,
                           such as credit returns, but are generally not considered part of
                           the Intel® QPI bandwidth.
DATA        bxxxxxx1x      Data Rx Flits: Number of data flits received over Intel® QPI.
                           Each flit contains 64b of data. This includes both DRS and NCB
                           data flits (coherent and non-coherent). This can be used to
                           calculate the data bandwidth of the Intel® QPI link. One can get
                           a good picture of the Intel® QPI-link characteristics by
                           evaluating the protocol flits, data flits, and idle/null flits.
                           This does not include the header flits that go in data packets.
NON_DATA    bxxxxx1xx      Non-Data protocol Rx Flits: Number of non-NULL non-data flits
                           received across Intel® QPI. This basically tracks the protocol
                           overhead on the Intel® QPI link. One can get a good picture of
                           the Intel® QPI-link characteristics by evaluating the protocol
                           flits, data flits, and idle/null flits. This includes the header
                           flits for data packets.
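As a rough illustration of how the unit masks in Table 2-98 combine, the sketch below packs an event code and umask into a single value and then computes the share of idle, data, and non-data flits. The umask position in bits [15:8] comes from the table; placing the event code in bits [7:0] is an assumption, all other control-register fields are left out, and every name here is hypothetical.

```python
RXL_FLITS_G0_EVENT = 0x01   # event code listed for RxL_FLITS_G0

# Unit mask bits from Table 2-98.
UMASK_IDLE = 0b001
UMASK_DATA = 0b010
UMASK_NON_DATA = 0b100


def encode_event(event_code: int, umask: int) -> int:
    """Pack the umask into bits [15:8]; event code in bits [7:0] is assumed."""
    if not 0 <= event_code <= 0xFF or not 0 <= umask <= 0xFF:
        raise ValueError("event code and umask must each fit in 8 bits")
    return (umask << 8) | event_code


def flit_mix(idle: int, data: int, non_data: int) -> dict:
    """Fraction of received flits in each category of Table 2-98."""
    total = idle + data + non_data
    if total == 0:
        return {"idle": 0.0, "data": 0.0, "non_data": 0.0}
    return {"idle": idle / total, "data": data / total, "non_data": non_data / total}


if __name__ == "__main__":
    # Count data and non-data protocol flits together in one counter.
    print(hex(encode_event(RXL_FLITS_G0_EVENT, UMASK_DATA | UMASK_NON_DATA)))
    # Made-up counts from three separately programmed counters.
    print(flit_mix(idle=2, data=1, non_data=1))
```

Programming one counter per umask bit, rather than OR-ing the bits together, is what makes the per-category breakdown possible.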