PCI Express® 6.0 Specification Functionality Updates – Part 2
Author’s Note: This blog discusses new functionality introduced in the PCIe 6.0 specification, but please note that subsequent revisions have been published. Developers should always work from the latest revision to ensure they see all specification errata.
The PCI Express® (PCIe®) 6.0 specification, in addition to doubling the raw bandwidth compared to the PCIe 5.0 specification, includes many functional enhancements. This blog introduces new functionality impacting both software and hardware. In Part 1, we introduced Flit Mode as a concept. In Part 2, we’ll dig into more of the new capabilities enabled by Flit Mode.
Updates to Flow Control to Support Implementation of Multiple Virtual Channels
Shared Flow Control credits enable multiple Virtual Channels (VCs) to be implemented using a shared pool of Flow Control credits, reducing the incremental cost of supporting additional VCs beyond the one that is required, VC0. All Transmitters are required to support the Link protocol mechanisms for Shared Flow Control, and each Receiver indicates its implementation choices when the Link is negotiated. Software can limit the percentage of shared receive buffering that a given VC can consume (the default is 100%; oversubscription is expected, but this mechanism can be used to effectively “disable” sharing). Merged Flow Control enables further hardware optimization by using a shared buffer pool for both Memory Writes and Memory Read Completions.
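To illustrate the credit-accounting idea (not the specification’s actual register interface), here is a minimal C sketch of a shared receive-credit pool with a software-programmed per-VC cap; all names, widths, and the percentage mechanism are assumptions made only for illustration.

```c
/* Illustrative model of a shared Flow Control credit pool with per-VC caps.
 * This is NOT the PCIe 6.0 register interface; names, widths, and the
 * percentage-cap mechanism are simplified assumptions used only to show
 * the accounting concept described above. */
#include <stdbool.h>
#include <stdint.h>

#define NUM_VCS 4

struct shared_fc_pool {
    uint32_t total_credits;            /* size of the shared receive buffer, in credits */
    uint32_t used_credits;             /* credits currently consumed across all VCs */
    uint32_t used_per_vc[NUM_VCS];     /* credits consumed by each VC */
    uint8_t  max_share_pct[NUM_VCS];   /* software-programmed cap, default 100 (%) */
};

/* Return true if a TLP needing 'credits' on 'vc' may consume shared credits. */
static bool shared_fc_admit(struct shared_fc_pool *p, unsigned vc, uint32_t credits)
{
    uint32_t vc_cap = (p->total_credits * p->max_share_pct[vc]) / 100;

    if (p->used_credits + credits > p->total_credits)
        return false;                  /* shared pool exhausted */
    if (p->used_per_vc[vc] + credits > vc_cap)
        return false;                  /* this VC hit its software cap */

    p->used_credits += credits;
    p->used_per_vc[vc] += credits;
    return true;
}
```

In this model, programming a VC’s cap to a small value keeps it from drawing on the shared pool, which is one way to picture the “disable sharing” behavior mentioned above.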
Addition of Segment Numbers to TLP Headers
Transaction Layer Packets (TLPs) can now explicitly indicate Segment Numbers in Flit Mode. Segment Numbers (also known as Hierarchy IDs) identify unique Configuration Address (BDF) spaces and have existed at the system level since before PCIe technology. Before the PCIe 6.0 specification, the Segment Number for a PCIe hierarchy domain (the part of a Hierarchy originating from a single Root Port) was only explicitly understood by the Root Complex (RC) and system software, not by the PCIe switches or devices below the Root Port. This added significant complexity to the RC by requiring it to track Non-Posted Requests that route peer-to-peer across segments through the RC. In this case, the RC must “take ownership” of the transaction, replacing the Transaction ID of the request and restoring it for the associated Completion as they flow through the RC. This increases implementation cost and reduces performance for peer-to-peer traffic. For Flit Mode TLPs, the segment can be indicated explicitly in the header, making it unnecessary for the RC to take ownership of those TLPs. To ensure correct functionality in systems with Non-Flit Mode hardware, Root Ports are required by default to assume that taking ownership of peer-to-peer transactions is still required. To achieve optimized peer-to-peer performance, system software must configure Root Ports to indicate that all Links below a Root Port are operating in Flit Mode.
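As an illustration of why carrying the Segment Number in the TLP matters, the C sketch below models a segment-qualified routing ID; the field names and widths are assumptions for illustration, not the actual Flit Mode header layout defined in the specification.

```c
/* Illustrative sketch of a segment-qualified routing ID. The real Flit Mode
 * TLP header layout and field widths are defined by the PCIe 6.0 specification;
 * this struct is an assumption used only to show why carrying the Segment
 * Number end-to-end lets peer-to-peer TLPs be routed without the RC rewriting
 * the Transaction ID. */
#include <stdbool.h>
#include <stdint.h>

struct routing_id {
    uint16_t segment;   /* Segment Number (Hierarchy ID): identifies the BDF space */
    uint8_t  bus;
    uint8_t  device;    /* 5 bits used */
    uint8_t  function;  /* 3 bits used */
};

/* With the segment carried in the TLP, a routing element can tell whether a
 * peer-to-peer Request crosses segments and forward it unchanged; without it,
 * the RC had to substitute its own Transaction ID on the Request and restore
 * the original on the Completion. */
static bool crosses_segments(const struct routing_id *src, const struct routing_id *dst)
{
    return src->segment != dst->segment;
}
```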
New Link Level Functionality
Some new functional elements apply at the Link level, providing headroom for future capabilities and optimizing Link efficiency. Flit Mode uses completely new TLP Headers, overcoming limitations of the previous TLP Headers, such as the lack of space for 14-bit Tags and for explicit Segment Number indication. The new Flit Mode TLP Headers are also designed to allow hardware implementations to parse them more cleanly than was possible with the Non-Flit Mode TLP Headers. Additionally, in Flit Mode all TLPs use a fully decoded Type field, for which all values are either explicitly defined in the PCIe 6.0 specification or “earmarked” for specific uses in future revisions. This enables hardware to implement proper framing and routing of all Flit Mode TLP types, even those not yet defined. Because the TLP Header formats differ, routing elements perform TLP translation when forwarding between Flit Mode and Non-Flit Mode Links. Per-TLP framing overhead now becomes per-Flit overhead, improving Link utilization efficiency for small TLPs.
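As a rough illustration of that last point, the C sketch below compares the fixed per-Flit overhead of Flit Mode with an approximate per-TLP framing overhead in Non-Flit Mode for small TLPs. The 256-byte Flit layout (236 bytes of TLP data, 6 bytes DLP, 8 bytes CRC, 6 bytes FEC) follows PCI-SIG material on the PCIe 6.0 specification; the per-TLP overhead and TLP size used for Non-Flit Mode are assumptions chosen only to show the trend.

```c
/* Back-of-the-envelope comparison: fixed per-Flit overhead in Flit Mode vs
 * per-TLP framing overhead in Non-Flit Mode. The Non-Flit Mode numbers below
 * (~8 bytes of framing + LCRC per TLP, 20-byte small TLP) are illustrative
 * assumptions, not specification values. */
#include <stdio.h>

int main(void)
{
    const double flit_bytes     = 256.0;   /* total Flit size */
    const double flit_tlp_bytes = 236.0;   /* TLP bytes carried per Flit */
    const double nfm_overhead   = 8.0;     /* assumed per-TLP framing + LCRC bytes */
    const double small_tlp      = 20.0;    /* assumed small TLP size in bytes */

    double flit_eff = flit_tlp_bytes / flit_bytes;            /* fixed, ~92% */
    double nfm_eff  = small_tlp / (small_tlp + nfm_overhead); /* ~71% for 20-byte TLPs */

    printf("Flit Mode link efficiency (any TLP mix): %.1f%%\n", flit_eff * 100.0);
    printf("Non-Flit Mode efficiency for %.0f-byte TLPs: %.1f%%\n",
           small_tlp, nfm_eff * 100.0);
    return 0;
}
```

Because the Flit overhead is constant regardless of how many TLPs a Flit carries, the efficiency advantage grows as TLPs get smaller.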
Introducing L0p
L0p, a new power reduction mechanism, was introduced to further optimize Link power. In L0p, some Lanes of a Link sleep while others remain active. L0p is negotiated dynamically and without the Link passing through the Recovery state; the Link remains up during transitions between L0 and L0p, enabling aggressive use of L0p. Control of L0p is automatic: System Software/Firmware configures the Exit Latency (how long an idle Lane needs to become operational), taking Retimers on the Link into consideration, and hardware takes it from there. Link Width overrides are available when needed. L0p support is optional in Flit Mode and, because it makes use of new mechanisms, is not supported in Non-Flit Mode. Retimers support L0p, so all Link segments transparently use the selected Link width. L0s is not supported in Flit Mode but remains optional for Links operating in Non-Flit Mode. (Note: L0s is incompatible with Retimers, and as Link speeds increase, Retimers become more important in the PCIe ecosystem.)
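As a purely hypothetical sketch of the kind of automatic decision hardware could make, the C fragment below picks an active-Lane count from current bandwidth demand, subject to a software width override; the heuristic, names, and thresholds are assumptions for illustration, not the specification’s negotiation rules.

```c
/* Hypothetical L0p width-selection heuristic: choose the narrowest active-Lane
 * count that covers current bandwidth demand on a x16 Link, honoring a
 * software-forced minimum width. Illustrative only; the PCIe 6.0 specification
 * defines the actual L0p negotiation, which occurs without passing through
 * Recovery. */
#include <stdint.h>

static const uint8_t widths[] = { 1, 2, 4, 8, 16 };  /* candidate widths, narrowest first */

/* demand_pct: current utilization as a percentage of full x16 bandwidth.
 * min_width_override: software-forced minimum width (0 = no override). */
static uint8_t l0p_target_width(uint8_t demand_pct, uint8_t min_width_override)
{
    uint8_t target = 16;

    /* Pick the narrowest width whose share of full bandwidth covers demand. */
    for (unsigned i = 0; i < sizeof(widths) / sizeof(widths[0]); i++) {
        if ((unsigned)widths[i] * 100 / 16 >= demand_pct) {
            target = widths[i];
            break;
        }
    }

    if (min_width_override > target)
        target = min_width_override;
    return target;
}
```

In a real implementation the configured Exit Latency also matters: idle Lanes must be able to wake within that budget, so the power saved by narrowing the Link is weighed against the time needed to bring Lanes back.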
Learn More About the PCIe 6.0 Specification and Subsequent Revisions
To learn more about the PCIe 6.0 specification, visit our website to view the latest blogs, infographics, webinars and more. Follow us on X (formerly Twitter) and LinkedIn for the latest PCI-SIG updates. Finally, read Part 1 of this blog series on the PCI-SIG website.