Chapter 12: Flow Control
November 2012
Altera Corporation
Arria V GZ Hard IP for PCI Express
6. After an FC Update DLLP is created, it arbitrates for access to the PCI Express link.
The FC Update DLLPs are typically scheduled with a low priority; consequently, a
continuous stream of Application Layer TLPs or other DLLPs (such as ACKs) can
delay the FC Update DLLP for a long time. To prevent starving the attached
transmitter, FC Update DLLPs are raised to a high priority under the following
three circumstances:
a. When the last sent credit allocated counter minus the amount of received data is
less than MAX_PAYLOAD and the current credit allocated counter is greater than the
last sent credit counter. Essentially, this means the data sink knows the data
source has less than a full MAX_PAYLOAD worth of credits, and therefore is starving.
b. When an internal timer, started when the last FC Update DLLP was sent, expires.
The timer is configured to 30 µs to meet the PCI Express Base Specification
requirement for resending FC Update DLLPs.
c. When the credit allocated counter minus the last sent credit allocated counter is
greater than or equal to 25% of the total credits available in the RX buffer.
Once the FC Update DLLP wins arbitration, it is transmitted as the next item. In the
worst case, the FC Update DLLP may need to wait for a maximum-sized TLP that is
currently being transmitted to complete before it can be sent.
7. The FC Update DLLP is received back at the original write requester and the
credit limit value is updated. If packets are stalled waiting for credits, they can
now be transmitted.
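The three escalation conditions in step 6 can be sketched as a single predicate. This is an illustrative model only, not the Hard IP's actual logic: the class and field names are hypothetical, all quantities are assumed to be expressed in credit units (including MAX_PAYLOAD), and counter wraparound is ignored.

```python
from dataclasses import dataclass

FC_UPDATE_TIMEOUT_US = 30.0  # timer value stated in condition (b)


@dataclass
class RxCreditState:
    """Hypothetical snapshot of the data sink's flow control counters."""
    credits_allocated: int     # current credit allocated counter
    last_sent_allocated: int   # credit allocated value in the last FC Update DLLP
    credits_received: int      # amount of received data, in credits
    max_payload_credits: int   # MAX_PAYLOAD expressed in credits (assumption)
    total_rx_credits: int      # total credits available in the RX buffer
    us_since_last_update: float  # time since the last FC Update DLLP was sent


def fc_update_high_priority(s: RxCreditState) -> bool:
    """Return True if any of the three escalation conditions holds."""
    # (a) The source has less than one MAX_PAYLOAD of credits left,
    #     and there are newly freed credits it has not been told about.
    starving = (
        s.last_sent_allocated - s.credits_received < s.max_payload_credits
        and s.credits_allocated > s.last_sent_allocated
    )
    # (b) The resend timer has expired.
    timer_expired = s.us_since_last_update >= FC_UPDATE_TIMEOUT_US
    # (c) Unadvertised credits amount to at least 25% of the RX buffer.
    backlog = (s.credits_allocated - s.last_sent_allocated) * 4 >= s.total_rx_credits
    return starving or timer_expired or backlog
```

For example, with 100 credits last advertised, 90 consumed, and a MAX_PAYLOAD of 16 credits, the predicate becomes true as soon as any additional credit is freed, because the source can no longer send a full-sized payload.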
To allow the write requester to transmit packets continuously, the credit allocated
and the credit limit counters must be initialized with sufficient credits to allow
multiple TLPs to be transmitted while waiting for the FC Update DLLP that
corresponds to the freeing of credits from the very first TLP transmitted.
You can use the RX Buffer space allocation - Desired performance for received
requests parameter to configure the RX buffer with enough space to meet the credit
requirements of your system.
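As a rough illustration of what "sufficient credits" means, the following sketch estimates how many posted header and data credits cover the FC Update loop. It assumes the loop latency is measured from the moment the first TLP finishes transmission until its credits are returned, ignores TLP header and framing overhead on the link, and uses the standard PCI Express accounting of one header credit per TLP and one data credit per 16 bytes of payload. The function name and the example figures are hypothetical, not values from this document.

```python
import math

CREDIT_BYTES = 16  # one PCI Express data credit covers 16 bytes (4 DWORDs)


def posted_credits_for_continuous_writes(loop_latency_ns, link_bytes_per_ns, payload_bytes):
    """Estimate (header_credits, data_credits) needed to keep writes flowing.

    loop_latency_ns   : assumed delay from the end of the first TLP until the
                        FC Update DLLP that frees its credits is processed
    link_bytes_per_ns : assumed usable link bandwidth
    payload_bytes     : payload size of each write TLP
    """
    data_credits_per_tlp = math.ceil(payload_bytes / CREDIT_BYTES)
    tlp_time_ns = payload_bytes / link_bytes_per_ns  # header overhead ignored
    # The first TLP plus every TLP sent while its credits are still in the loop.
    tlps_in_flight = math.ceil(loop_latency_ns / tlp_time_ns) + 1
    return tlps_in_flight, tlps_in_flight * data_credits_per_tlp


# Illustrative figures only: 1 µs loop, 2 bytes/ns link, 256-byte payloads.
headers, data = posted_credits_for_continuous_writes(1000, 2.0, 256)
print(headers, data)  # 9 144
```

If the RX buffer advertises fewer credits than this estimate, the write requester stalls periodically while it waits for FC Update DLLPs.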
Throughput of Non-Posted Reads
To support a high throughput for read data, you must analyze the overall delay from
the time the Application Layer issues the read request until all of the completion data
is returned. The Application Layer must be able to issue enough read requests, and
the read completer must be capable of processing these read requests quickly enough
(or at least offering enough non-posted header credits) to cover this delay.
However, much of the delay encountered in this loop is well outside the Arria V GZ
Hard IP for PCI Express and is very difficult to estimate. PCI Express switches can be
inserted in this loop, which makes determining a bound on the delay more difficult.
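Once a round-trip delay has been measured or estimated for a particular system, the number of read requests that must be outstanding follows from the amount of data in flight. The sketch below shows that arithmetic; the function name and the example numbers are assumptions for illustration, and the real delay depends on the completer and on any switches in the path.

```python
import math


def outstanding_reads_needed(round_trip_ns, target_bytes_per_ns, read_request_bytes):
    """Estimate how many read requests must be in flight to sustain a target rate.

    round_trip_ns       : assumed delay from issuing a read until all of its
                          completion data has returned
    target_bytes_per_ns : desired read throughput
    read_request_bytes  : amount of data requested by each read
    """
    bytes_in_flight = round_trip_ns * target_bytes_per_ns
    return math.ceil(bytes_in_flight / read_request_bytes)


# Illustrative figures only: 2 µs round trip, 2 bytes/ns target, 512-byte reads.
print(outstanding_reads_needed(2000, 2.0, 512))  # 8
```

Because each read request consumes one non-posted header credit at the completer, the same figure is a lower bound on the non-posted header credits the completer must offer.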