The AI boom has a memory problem

High-bandwidth memory keeps powerful AI chips fed with data, and demand for it helped Boise-based Micron briefly top $1 trillion

By Ramin Skibba, edited by Eric Sullivan

Equipment inside a Micron Technology facility. The company’s memory chips have become increasingly important to the AI hardware boom.

Kyle Green/Bloomberg via Getty Images

For decades, Micron Technology made one of computing’s less glamorous essentials: memory chips. Then the artificial intelligence boom made that hardware one of the industry’s most sought-after components. Technology companies are now scrambling for high-bandwidth memory, or HBM; Micron specializes in it. This week, the Boise-based company became the first U.S. memory-chip company to briefly top $1 trillion in market value—a milestone that points to a larger shift in the AI supply chain.

AI systems depend on fast processors, but also on how quickly data can reach them and remain accessible. HBM is designed to do just that. “The reason HBMs are in such high demand is that they have pretty good storage, and they’re extremely, extremely fast,” says Keren Bergman, an electrical engineering professor at Columbia University.

HBM chips are built differently from the memory inside a laptop or phone. Instead of spreading memory chips across a board, HBM stacks layers of memory vertically and places them close to the processor. The arrangement gives AI accelerators a much wider path to the data they need. Micron says its HBM4 chips can reach more than 2.8 terabytes per second of bandwidth and are designed for Nvidia’s next-generation Vera Rubin GPUs.

Within a computer, memory chips and processors are like buildings connected with highways. There are only so many ways to widen those roads. Engineers can make memory faster only to a point, and they can add only so many physical connections between memory and processors, says Hadi Esmaeilzadeh, a computer architecture researcher at UC San Diego. The innovation of high-bandwidth memory is to stack the buildings 12 or even 16 layers high, with the layers connected by through-silicon vias, or TSVs, so that GPU processors and other accelerators can reach more memory in a given time. “Now there’s higher connectivity between the two, providing higher bandwidth. It’s like adding more lanes on highways,” Esmaeilzadeh says.
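The highway analogy can be put into rough numbers. Peak bandwidth is approximately the number of data lines (the lanes) multiplied by how fast each line signals (the speed limit). The short Python sketch below works through that arithmetic. The interface widths and per-pin rates in it are illustrative assumptions chosen for the example, not figures confirmed by Micron or by the researchers quoted here, and the function name is hypothetical.

```python
def peak_bandwidth_tb_per_s(interface_bits: int, gbit_per_s_per_pin: float) -> float:
    """Estimate the peak bandwidth of one memory stack in terabytes per second.

    interface_bits: number of parallel data lines (the "lanes").
    gbit_per_s_per_pin: signaling rate of each line, in gigabits per second.
    """
    gbits_per_s = interface_bits * gbit_per_s_per_pin  # total gigabits per second
    gbytes_per_s = gbits_per_s / 8                     # 8 bits per byte
    return gbytes_per_s / 1000                         # gigabytes to terabytes


# Assumed, illustrative configurations (not vendor specifications):
narrower_stack = peak_bandwidth_tb_per_s(interface_bits=1024, gbit_per_s_per_pin=6.4)
wider_stack = peak_bandwidth_tb_per_s(interface_bits=2048, gbit_per_s_per_pin=11.0)

print(f"1,024 lanes at 6.4 Gb/s each: {narrower_stack:.2f} TB/s")
print(f"2,048 lanes at 11 Gb/s each:  {wider_stack:.2f} TB/s")
```

Under these assumptions, doubling the lane count and raising the per-lane speed together push a single stack past the 2.8-terabyte-per-second mark. Widening the road and raising the speed limit multiply rather than add.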

The demand is coming from both sides of the AI business. Training large models requires huge clusters of accelerators. Running those models for users, whether in chatbots, coding tools, or future AI agents, also requires moving enormous amounts of data, again and again. And a GPU waiting for data is wasted hardware.
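A back-of-the-envelope calculation shows why a waiting GPU is so costly. When a language model generates text, it typically has to read its weights from memory for each new token, so memory bandwidth sets a ceiling on speed no matter how fast the processor is. The Python sketch below estimates that ceiling. The model size and bytes per parameter are hypothetical values chosen for illustration, the bandwidth figure is rounded from the 2.8 terabytes per second cited above, and the estimate ignores caching, batching and other real-world optimizations.

```python
def seconds_per_weight_pass(model_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Time needed to stream every model weight from memory once."""
    return model_bytes / bandwidth_bytes_per_s


# Hypothetical model, chosen only for illustration.
parameters = 70e9          # assumed: 70 billion parameters
bytes_per_parameter = 2    # assumed: 16-bit weights
bandwidth = 2.8e12         # 2.8 terabytes per second, in bytes per second

model_bytes = parameters * bytes_per_parameter
pass_time = seconds_per_weight_pass(model_bytes, bandwidth)

# If each generated token requires one full pass over the weights,
# bandwidth alone caps the token rate for a single request.
max_tokens_per_s = 1 / pass_time

print(f"Model size: {model_bytes / 1e9:.0f} GB")
print(f"Time per pass over the weights: {pass_time * 1000:.0f} ms")
print(f"Bandwidth-limited ceiling: {max_tokens_per_s:.0f} tokens per second")
```

In this simplified picture, halving the bandwidth halves the ceiling even if the processor is unchanged, which is why faster memory can matter as much as faster chips.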

Bandwidth is only part of the problem. As large language models expand, capacity becomes a challenge too, even with top-of-the-line HBM chips. “Because of the growing size of AI models, the available memory capacity you have close by is one or two orders of magnitude less than what you need,” Bergman says. Memory has become one of the central limits on advanced AI hardware. (Micron declined Scientific American’s requests for comment.)

That has made memory a strategic concern as well. Many leading memory suppliers, like SK Hynix and Samsung, are based in Asia, while Micron is the largest in North America. “It’s in the national security interest that we bring chip manufacturing back to the United States,” Esmaeilzadeh says. “Our dependence on AI systems is growing, and our supply chain is somewhere else.”

Not every AI bet will pay off. Some industry leaders, including Google’s Sundar Pichai and OpenAI’s Sam Altman, have warned of a possible bubble, and the buildout faces constraints beyond chips. Data center construction has stalled, and banks have grown wary of the glut of debt piling up behind it.

The demand for memory, though, shows no sign of slowing. “It’s very clear that we’re not even close to meeting the compute demand that’s out there,” Bergman says. Bubble or not, the hardware undergirding AI must keep moving data—and right now, it can’t move fast enough.
