AMD’s Innovative Chiplet GPU Patent: A Game Changer

    AMD, a leading player in the tech industry, has recently filed a patent for a chiplet architecture for its Radeon GPUs. This approach is a significant shift from the traditional monolithic design, in which all of the GPU's processing blocks sit on a single die.

    The patent makes for intriguing reading, and I learned about it through the RedGamingTech YouTube channel. Despite its title, “Distributed Geometry,” the document is really about distributed rendering. Consider AMD’s most powerful graphics processor to date, the Navi 31 GPU found in the Radeon RX 7900 series. It does use chiplets, but each of them only houses two VRAM interfaces and a slice of L3 Infinity Cache; the remainder of the GPU sits in a single block known as the graphics compute die, or GCD.

    The monolithic design, although widely used, has its limitations. As dies get larger for high-performance GPU configurations, they become more expensive to manufacture and do not scale well. Stuffing more transistors into smaller spaces brings additional challenges, such as diminishing die yields and growing R&D costs. As a result, existing monolithic GPU designs are not sustainable in the long term, which has led AMD to explore new design methodologies.

    Monolithic architecture also lacks flexibility, because it is constrained by the technology already used within the structure. The software world offers a comparison: monolithic apps are harder to rely on, since an error in any one module can affect the entire application’s availability, and the large size of the application may slow down its start-up time. TL;DR: you basically get memes like the RTX 4090, with absurd power consumption and questionable scaling. Or does it scale? Is it limited by pure bandwidth? Ah, that’s a story for another day!

    AMD’s new design splits the work across multiple GPU chiplets. Multi-GPU configurations were previously considered inefficient due to limited software support. However, AMD’s patent proposes to overcome these limitations by using high-bandwidth passive crosslinks for chiplet-to-chiplet communication.

    In this design, each GPU in the chiplet array would be coupled to the first GPU in the array, and all communication would go through an active interposer containing many layers of high-bandwidth passive crosslinks.


    Given that this seems likely to complicate things further, what are the possible advantages of doing it? AMD’s successful transition to chiplets for its CPUs shows that the main goal is to lower production costs for high-end products. A silicon wafer yields fewer working dies when the dies are large than when they are small, so really massive GPUs are more expensive to make.

    The Navi 31’s memory chiplets are tiny, so a standard 12-inch (300 mm) wafer can produce over a thousand of them. Even if many of them are faulty, you still end up with a sizeable pile of functional chips.
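
    To put rough numbers on the yield argument, here is a minimal Python sketch. It uses a common dies-per-wafer approximation and a simple Poisson yield model. The die areas (about 37 mm² for a Navi 31 memory chiplet, about 300 mm² for its GCD) are approximate public figures; the 600 mm² die and the defect density of 0.1 defects per cm² are purely illustrative assumptions, not AMD or foundry data.

```python
import math


def dies_per_wafer(die_area_mm2, wafer_diameter_mm=300.0):
    """Approximate gross dies per wafer.

    Wafer area divided by die area, minus a correction for partial dies
    at the wafer edge. Ignores scribe lines and edge exclusion.
    """
    radius = wafer_diameter_mm / 2.0
    gross = math.pi * radius ** 2 / die_area_mm2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2.0 * die_area_mm2)
    return int(gross - edge_loss)


def poisson_yield(die_area_mm2, defects_per_cm2):
    """Poisson yield model: probability that a die has zero defects."""
    die_area_cm2 = die_area_mm2 / 100.0
    return math.exp(-defects_per_cm2 * die_area_cm2)


def good_dies(die_area_mm2, defects_per_cm2=0.1):
    """Estimated working dies per wafer (defect density is an assumption)."""
    count = dies_per_wafer(die_area_mm2)
    return int(count * poisson_yield(die_area_mm2, defects_per_cm2))


if __name__ == "__main__":
    # Areas in mm^2: approximate or hypothetical, for illustration only.
    examples = [
        ("memory chiplet (~37 mm^2)", 37.0),
        ("GCD (~300 mm^2)", 300.0),
        ("hypothetical big die (600 mm^2)", 600.0),
    ]
    for name, area in examples:
        print(f"{name}: {dies_per_wafer(area)} gross, "
              f"{poisson_yield(area, 0.1):.0%} yield, "
              f"{good_dies(area)} good")
```

    Under these assumptions, the small die wins twice: far more candidates fit on a wafer, and a larger share of them work, because a single defect ruins a much smaller piece of silicon.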

    It seems evident that AMD wishes to handle the remainder of the GPU similarly. Because cutting-edge manufacturing nodes are very expensive, the cost of producing a high-end GPU could be decreased if it could be built by simply combining tiny chiplets in a single package.

    That being said, significant obstacles must be overcome before this approach to GPU construction can be considered successful. Internal bandwidth and latency requirements come first. A typical graphics processor reads and writes several terabytes of data per second between its caches, with each transaction completing in a matter of nanoseconds. To move to chiplets, the system that connects everything to the shared cache and memory controllers must be extremely capable. The current-generation RDNA 3 flagships, such as the RX 7900 XTX, already tend to fall short of their Nvidia counterparts when rendering games at very high resolutions.

    AMD’s experience with the Infinity Fanout Links used in the RX 7900 series, which connect the GCD to the memory chiplets, is noteworthy here. These links provide substantial bandwidth, and the latency is not significantly worse than that seen in a full-die GPU, such as the Navi 21 (RX 6900 XT). This technology could be a stepping stone towards the distributed geometry design.

    However, there are challenges that need to be addressed. One of them is keeping all of the chiplets as busy as possible. With each one determining its own workload, there’s a risk that some units are left idle because the others work through what’s needed quickly enough. There’s also the issue of processing stalls, where a chiplet can’t complete a task because it requires geometry information from a neighbor.
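
    The idle-chiplet risk can be illustrated with a toy model. The Python sketch below is not AMD's method and is not taken from the patent; the tile costs, the four-chiplet count, and both scheduling schemes are hypothetical, chosen only to show how uneven workloads leave hardware waiting. It compares a fixed split of screen tiles with a scheme in which each tile goes to whichever chiplet frees up first.

```python
import heapq
import math

# Hypothetical per-tile costs: 48 cheap tiles followed by 16 tiles with
# dense geometry. The numbers are made up for illustration.
TILE_COSTS = [1.0] * 48 + [8.0] * 16


def static_split(costs, n_chiplets):
    """Give each chiplet a fixed, contiguous block of tiles.

    Returns the total busy time of each chiplet.
    """
    size = math.ceil(len(costs) / n_chiplets)
    return [sum(costs[i * size:(i + 1) * size]) for i in range(n_chiplets)]


def dynamic_queue(costs, n_chiplets):
    """Hand each tile to the chiplet that becomes free first.

    Greedy list scheduling using a min-heap keyed on finish time.
    Returns the total busy time of each chiplet.
    """
    heap = [(0.0, i) for i in range(n_chiplets)]
    busy = [0.0] * n_chiplets
    for cost in costs:
        finish, idx = heapq.heappop(heap)
        busy[idx] = finish + cost
        heapq.heappush(heap, (busy[idx], idx))
    return busy


def utilization(busy):
    """Fraction of total chiplet time spent working until the last one finishes."""
    return sum(busy) / (len(busy) * max(busy))


if __name__ == "__main__":
    for name, scheme in (("static", static_split), ("dynamic", dynamic_queue)):
        busy = scheme(TILE_COSTS, 4)
        print(f"{name}: busy times {busy}, utilization {utilization(busy):.0%}")
```

    In this toy case the fixed split leaves three chiplets waiting on the one that drew the dense tiles, while the queue keeps all four occupied. A real design would also have to handle the stalls described above, which this model ignores.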

    These challenges are not discussed in the patent, leaving us to speculate about when AMD will announce the technology, if it ever does. It’s suspected that this is being planned for RDNA 5 rather than the next iteration, though there’s a small chance it arrives sooner. The last radical technology patent from AMD was for its ray-tracing texture units.

    That patent was published almost two years after it was submitted, and the feature was implemented in RDNA 2. AMD started promoting that architecture in 2020, and the first products to sport the new RT-texture processors launched in November of the same year. So there is a possibility, albeit a rather small one, that AMD could introduce a whole new world of GPU chiplets next year, with RDNA 4.
