Close Menu
    Trending
    • Royal Critics Suspicious Of Meghan Markle’s Reported UK Return
    • Trump say repairs to begin ‘immediately’ for Washington pool renovation
    • Mourners gather to remember Lebanese conservationist killed by Israel | Israel attacks Lebanon News
    • Angel Reese, Paige Bueckers Shine Bright In Coach Campaign
    • Commentary: We liked remote work. Then we looked at the data
    • Belgium see red in goalless World Cup draw with Iran in Los Angeles | World Cup 2026
    • Jason Bateman Reveals Family Relied On His Childhood Earnings
    • Vance hopes US, Iran can turn ‘new leaf’ with talks
    Ironside News
    • Home
    • World News
    • Latest News
    • Politics
    • Opinions
    • Tech News
    • World Economy
    Ironside News
    Home»Tech News»HBM on GPU: Thermal Challenges and Solutions
    Tech News

    HBM on GPU: Thermal Challenges and Solutions

    Ironside NewsBy Ironside NewsJanuary 14, 2026No Comments6 Mins Read
    Share Facebook Twitter Pinterest LinkedIn Tumblr Reddit Telegram Email
    Share
    Facebook Twitter LinkedIn Pinterest Email

    Peek contained in the package deal of AMD’s or Nvidia’s most advanced AI products and also you’ll discover a acquainted association: The GPU is flanked on two sides by high-bandwidth memory (HBM), probably the most superior reminiscence chips out there. These reminiscence chips are positioned as shut as doable to the computing chips they serve with the intention to minimize down on the most important bottleneck in AI computing—the energy and delay in getting billions of bits per second from reminiscence into logic. However what when you may convey computing and reminiscence even nearer collectively by stacking the HBM on high of the GPU?

    Imec just lately explored this situation utilizing superior thermal simulations, and the reply—delivered in December on the 2025 IEEE International Electron Device Meeting (IEDM)—was a bit grim. 3D stacking doubles the working temperature contained in the GPU, rendering it inoperable. However the workforce, led by Imec’s James Myers, didn’t simply surrender. They recognized a number of engineering optimizations that in the end may whittle down the temperature distinction to almost zero.

    Imec began with a thermal simulation of a GPU and 4 HBM dies as you’d discover them right this moment, inside what’s known as a 2.5D package deal. That’s, each the GPU and the HBM sit on substrate known as an interposer, with minimal distance between them. The 2 kinds of chips are linked by hundreds of micrometer-scale copper interconnects constructed into the interposer’s floor. On this configuration, the mannequin GPU consumes 414 watts and reaches a peak temperature of slightly below 70 °C—typical for a processor. The reminiscence chips eat an extra 40 W or so and get considerably much less scorching. The warmth is faraway from the highest of the package deal by the sort of liquid cooling that’s change into frequent in new AI data centers.

    RELATED: Future Chips Will Be Hotter Than Ever

    “Whereas this strategy is presently used, it doesn’t scale effectively for the longer term—particularly because it blocks two sides of the GPU, limiting future GPU-to-GPU connections contained in the package deal,” Yukai Chen, a senior researcher at Imec informed engineers at IEDM. In distinction, “the 3D strategy results in increased bandwidth, decrease latency… crucial enchancment is the package deal footprint.”

    Sadly, as Chen and his colleagues discovered, probably the most easy model of stacking, merely placing the HBM chips on high of the GPU and including a block of clean silicon to fill in a niche on the middle, shot temperatures within the GPU as much as a scorching 140 °C—effectively previous a typical GPU’s 80 °C restrict.

    System Expertise Co-optimization

    The Imec workforce set about attempting plenty of know-how and system optimizations aimed toward reducing the temperature. The very first thing they tried was to throw out a layer of silicon that was now redundant. To know why, it’s a must to first get a grip on what HBM actually is.

    This type of reminiscence is a stack of as many as 12 high-density DRAM dies. Every has been thinned right down to tens of micrometers and is shot by with vertical connections. These thinned dies are stacked one atop one other and related by tiny balls of solder, and this stack of reminiscence is vertically related to a different piece of silicon, known as the bottom die. The bottom die is a logic chip designed to multiplex the info—pack it into the restricted variety of wires that may match throughout the millimeter-scale hole to the GPU.

    However with the HBM now on high of the GPU, there’s no want for such an information pump. Bits can movement immediately into the processor with out regard for what number of wires occur to suit alongside the aspect of the chip. In fact, this alteration means shifting the reminiscence management circuits from the bottom die into the GPU and due to this fact altering the processor’s floorplan, says Myers. However there needs to be ample room, he suggests, as a result of the GPU will now not want the circuits used to demultiplex incoming reminiscence knowledge.

    RELATED: The Hot, Hot Future of Chips

    Reducing out this middle-man of reminiscence cooled issues down by solely rather less than 4 °C. However, importantly, it ought to massively increase the bandwidth between the reminiscence and the processor, which is necessary for an additional optimization the workforce tried—slowing down the GPU.

    That may appear opposite to the entire goal of higher AI computing, however on this case it’s a bonus. Large language models are what are known as “reminiscence sure” issues. That’s, reminiscence bandwidth is the primary limiting issue. However Myers’ workforce estimated 3D stacking HBM on the GPU would increase bandwidth fourfold. With that added headroom, even slowing the GPU’s clock by 50 p.c nonetheless results in a efficiency win, whereas cooling the whole lot down by greater than 20 °C. In follow, the processor may not must be slowed down fairly that a lot. Growing the clock frequency to 70 p.c led to a GPU that was only one.7 °C hotter, Myers says.

    Optimized HBM

    One other massive drop in temperature got here from making the HBM stack and the world round it extra conductive. That included merging the 4 stacks into two wider stacks, thereby eliminating a heat-trapping area; scaling down the highest—normally thicker—die of the stack; and filling in additional of the area across the HBM with clean items of silicon to conduct extra warmth.

    With all of that, the stack now operated at about 88 °C. One closing optimization introduced issues again to close 70 °C. Typically, some 95 p.c of a chip’s warmth is faraway from the highest of the package deal, the place on this case water carries the warmth away. However including related cooling to the underside as effectively drove the stacked chips down a closing 17 °C.

    Though the analysis introduced at IEDM reveals it is likely to be doable, HBM-on-GPU isn’t essentially the only option, Myers says. “We’re simulating different system configurations to assist construct confidence that that is or isn’t the only option,” he says. “GPU-on-HBM is of curiosity to some in trade,” as a result of it places the GPU nearer to the cooling. However it will seemingly be a extra advanced design, as a result of the GPU’s energy and knowledge must movement vertically by the HBM to succeed in it.

    From Your Web site Articles

    Associated Articles Across the Net



    Source link

    Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
    Previous ArticleZoe Saldaña Rewrites Hollywood’s Record Books
    Next Article For safety’s sake, bring city block back to life
    Ironside News
    • Website

    Related Posts

    Tech News

    Founder Shares Value of Resilience in Entrepreneurship

    June 20, 2026
    Tech News

    IEEE Rolls Out Large Language Models Training Course

    June 19, 2026
    Tech News

    Amazon’s Astro Robot Sound Turns Motion Into Story

    June 19, 2026
    Add A Comment
    Leave A Reply Cancel Reply

    Top Posts

    Energy demand: ‘Comprehensive permitting reform’

    December 28, 2025

    US Supreme Court rejects Colorado ban on LGBTQ child ‘conversion therapy’ | Courts News

    March 31, 2026

    Joby Air Taxi to Transform Urban Mobility in Dubai

    December 28, 2025

    What does the resilience of Gaza’s people tell the world? | Israel-Palestine conflict

    January 20, 2025

    UnAmerican! Stanislaus County California Democrat Party Rejects Invitation To Vigil For Charlie Kirk | The Gateway Pundit

    September 15, 2025
    Categories
    • Entertainment News
    • Latest News
    • Opinions
    • Politics
    • Tech News
    • Trending News
    • World Economy
    • World News
    Most Popular

    How to stop Israel from starving Gaza | Gaza

    September 4, 2025

    Venezuela’s Maduro declares national emergency after Caracas hit by explosions, smoke seen in capital

    January 3, 2026

    Bravo Pulls ‘Wife Swap’ Episode Featuring Wendy Osefo Following Arrest

    October 11, 2025
    Our Picks

    Royal Critics Suspicious Of Meghan Markle’s Reported UK Return

    June 22, 2026

    Trump say repairs to begin ‘immediately’ for Washington pool renovation

    June 22, 2026

    Mourners gather to remember Lebanese conservationist killed by Israel | Israel attacks Lebanon News

    June 22, 2026
    Categories
    • Entertainment News
    • Latest News
    • Opinions
    • Politics
    • Tech News
    • Trending News
    • World Economy
    • World News
    • Privacy Policy
    • Disclaimer
    • Terms and Conditions
    • About us
    • Contact us
    Copyright Ironsidenews.comAll Rights Reserved.

    Type above and press Enter to search. Press Esc to cancel.