Nvidia AI: Challengers Are Coming for Nvidia’s Crown


It’s hard to overstate Nvidia’s AI dominance. Founded in 1993, Nvidia first made its mark in the then-new field of graphics processing units (GPUs) for personal computers. But it’s the company’s AI chips, not PC graphics hardware, that vaulted Nvidia into the ranks of the world’s most valuable companies. It turns out that Nvidia’s GPUs are also excellent for AI. As a result, its stock is more than 15 times as valuable as it was at the start of 2020; revenues have ballooned from roughly US $12 billion in its 2019 fiscal year to $60 billion in 2024; and the AI powerhouse’s leading-edge chips are as scarce, and desired, as water in a desert.

Access to GPUs “has become so much of a worry for AI researchers, that the researchers think about this on a day-to-day basis. Because otherwise they can’t have fun, even if they have the best model,” says Jennifer Prendki, head of AI data at Google DeepMind. Prendki is less reliant on Nvidia than most, as Google has its own homespun AI infrastructure. But other tech giants, like Microsoft and Amazon, are among Nvidia’s biggest customers, and continue to buy its GPUs as quickly as they’re produced. Exactly who gets them and why is the subject of an antitrust investigation by the U.S. Department of Justice, according to press reports.

Nvidia’s AI dominance, like the explosion of machine learning itself, is a recent turn of events. But it’s rooted in the company’s decades-long effort to establish GPUs as general computing hardware that’s useful for many tasks besides rendering graphics. That effort spans not only the company’s GPU architecture, which evolved to include “tensor cores” adept at accelerating AI workloads, but also, critically, its software platform, called CUDA, to help developers take advantage of the hardware.

“They made sure every computer-science major coming out of university is trained up and knows how to program CUDA,” says Matt Kimball, principal data-center analyst at Moor Insights & Strategy. “They provide the tooling and the training, and they spend a lot of money on research.”

Released in 2006, CUDA helps developers use an Nvidia GPU’s many cores. That’s proved essential for accelerating highly parallelized compute tasks, including modern generative AI. Nvidia’s success in building the CUDA ecosystem makes its hardware the path of least resistance for AI development. Nvidia chips might be in short supply, but the only thing more difficult to find than AI hardware is experienced AI developers—and many are familiar with CUDA.
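CUDA’s core abstraction is launching one lightweight thread per data element. A toy Python sketch of that idea (illustrative only—this is a serial simulation, not real CUDA, and the function and variable names are invented for the example):

```python
# Toy simulation of the CUDA programming model: one "thread" per
# array element. A real GPU runs these kernel bodies in parallel
# across thousands of cores; here a plain loop stands in for them.
def vector_add_kernel(thread_idx, a, b, out):
    # thread_idx plays the role of CUDA's global index,
    # blockIdx.x * blockDim.x + threadIdx.x.
    if thread_idx < len(a):  # guard against out-of-range threads
        out[thread_idx] = a[thread_idx] + b[thread_idx]

a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * len(a)

for i in range(len(a)):  # a GPU would launch all of these at once
    vector_add_kernel(i, a, b, out)

print(out)  # [11.0, 22.0, 33.0, 44.0]
```

The payoff comes when thousands of cores execute such kernels simultaneously—exactly the kind of highly parallel arithmetic that neural-network workloads demand.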

That gives Nvidia a deep, broad moat with which to defend its business, but that doesn’t mean it lacks competitors ready to storm the castle, and their tactics vary widely. While decades-old companies like Advanced Micro Devices (AMD) and Intel are looking to use their own GPUs to rival Nvidia, upstarts like Cerebras and SambaNova have developed radical chip architectures that drastically improve the efficiency of generative AI training and inference. These are the competitors most likely to challenge Nvidia.

AMD: The other GPU maker

Pro: AMD GPUs are convincing Nvidia alternatives

Con: Software ecosystem can’t rival Nvidia’s CUDA

AMD has battled Nvidia in the graphics-chip arena for nearly two decades. It’s been, at times, a lopsided fight. When it comes to graphics, AMD’s GPUs have rarely beaten Nvidia’s in sales or mindshare. Still, AMD’s hardware has its strengths. The company’s broad GPU portfolio extends from integrated graphics for laptops to AI-focused data-center GPUs with over 150 billion transistors. The company was also an early supporter and adopter of high-bandwidth memory (HBM), a form of memory that’s now essential to the world’s most advanced GPUs.

“If you look at the hardware…it stacks up favorably” to Nvidia, says Kimball, referring to AMD’s Instinct MI325X, a competitor of Nvidia’s H100. “AMD did a fantastic job laying that chip out.”

The MI325X, slated to launch by the end of the year, has over 150 billion transistors and 288 gigabytes of high-bandwidth memory, though real-world results remain to be seen. The MI325X’s predecessor, the MI300X, earned praise from Microsoft, which deploys AMD hardware, including the MI300X, to handle some ChatGPT 3.5 and 4 services. Meta and Dell have also deployed the MI300X, and Meta used the chips in parts of the development of its latest large language model, Llama 3.1.

There’s still a hurdle for AMD to leap: software. AMD offers an open-source platform, ROCm, to help developers program its GPUs, but it’s less popular than CUDA. AMD is aware of this weakness, and in July 2024, it agreed to buy Europe’s largest private AI lab, Silo AI, which has experience doing large-scale AI training using ROCm and AMD hardware. AMD also plans to purchase ZT Systems, a company with expertise in data-center infrastructure, to help the company serve customers looking to deploy its hardware at scale. Building a rival to CUDA is no small feat, but AMD is certainly trying.

Intel: Software success

Pro: Gaudi 3 AI accelerator shows strong performance

Con: Next big AI chip doesn’t arrive until late 2025

Intel’s challenge is the opposite of AMD’s.

While Intel lacks an exact match for Nvidia’s CUDA and AMD’s ROCm, it launched an open-source unified programming platform, OneAPI, in 2018. Unlike CUDA and ROCm, OneAPI spans multiple categories of hardware, including CPUs, GPUs, and FPGAs. So it can help developers accelerate AI tasks (and many others) on any Intel hardware. “Intel’s got a heck of a software ecosystem it can turn on pretty easily,” says Kimball.

Hardware, on the other hand, is a weakness, at least when compared to Nvidia and AMD. Intel’s Gaudi AI accelerators, the fruit of Intel’s 2019 acquisition of AI hardware startup Habana Labs, have made headway, and the latest, Gaudi 3, offers performance that’s competitive with Nvidia’s H100.

However, it’s unclear precisely what Intel’s next hardware release will look like, which has caused some concern. “Gaudi 3 is very capable,” says Patrick Moorhead, founder of Moor Insights & Strategy. But as of July 2024 “there is no Gaudi 4,” he says.

Intel instead plans to pivot to an ambitious chip, code-named Falcon Shores, with a tile-based modular architecture that combines Intel x86 CPU cores and Xe GPU cores; the latter are part of Intel’s recent push into graphics hardware. Intel has yet to reveal details about Falcon Shores’ architecture and performance, though, and it’s not slated for release until late 2025.

Cerebras: Bigger is better

Pro: Wafer-scale chips offer strong performance and memory per chip

Con: Applications are niche due to size and cost

Make no mistake: AMD and Intel are by far the most credible challengers to Nvidia. They share a history of designing successful chips and building programming platforms to go alongside them. But among the smaller, less proven players, one stands out: Cerebras.

The company, which specializes in AI for supercomputers, made waves in 2019 with the Wafer Scale Engine, a gigantic, wafer-size piece of silicon packed with 1.2 trillion transistors. The most recent iteration, Wafer Scale Engine 3, ups the ante to 4 trillion transistors. For comparison, Nvidia’s largest and newest GPU, the B200, has “just” 208 billion transistors. The computer built around this wafer-scale monster, Cerebras’s CS-3, is at the heart of the Condor Galaxy 3, which will be an 8-exaflop AI supercomputer made up of 64 CS-3s. G42, an Abu Dhabi–based conglomerate that hopes to train tomorrow’s leading-edge large language models, will own the system.
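A back-of-envelope comparison of the transistor counts quoted above (figures from this article; the rounding is mine):

```python
# Transistor counts cited in the article.
wse3_transistors = 4_000_000_000_000  # Cerebras Wafer Scale Engine 3
b200_transistors = 208_000_000_000    # Nvidia B200

ratio = wse3_transistors / b200_transistors
print(f"WSE-3 holds roughly {ratio:.0f}x the transistors of a B200")
```

That nearly 20-fold gap is only possible because Cerebras builds its chip from an entire wafer rather than a single die.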

“It’s a little more niche, not as general purpose,” says Stacy Rasgon, senior analyst at Bernstein Research. “Not everyone is going to buy [these computers]. But they’ve got customers, like the [United States] Department of Defense, and [the Condor Galaxy 3] supercomputer.”

Cerebras’s WSE-3 isn’t going to challenge Nvidia, AMD, or Intel hardware in most situations; it’s too large, too costly, and too specialized. But it could give Cerebras a unique edge in supercomputers, because no other company designs chips on the scale of the WSE.

SambaNova: A transformer for transformers

Pro: Configurable architecture helps developers squeeze efficiency from AI models

Con: Hardware still has to prove relevance to mass market

SambaNova, founded in 2017, is another chip-design company tackling AI training with an unconventional chip architecture. Its flagship, the SN40L, has what the company calls a “reconfigurable dataflow architecture” composed of tiles of memory and compute resources. The links between these tiles can be altered on the fly to facilitate the quick movement of data for large neural networks.

Prendki believes such customizable silicon could prove useful for training large language models, because AI developers can optimize the hardware for different models. No other company offers that capability, she says.

SambaNova is also scoring wins with SambaFlow, the software stack used alongside the SN40L. “At the infrastructure level, SambaNova is doing a good job with the platform,” says Moorhead. SambaFlow can analyze machine learning models and help developers reconfigure the SN40L to accelerate the model’s performance. SambaNova still has a lot to prove, but its customers include SoftBank and Analog Devices.

Groq: Form for function

Pro: Excellent AI inference performance

Con: Application currently limited to inference

Yet another company with a unique spin on AI hardware is Groq. Groq’s approach is focused on tightly pairing memory and compute resources to accelerate the speed with which a large language model can respond to prompts.

“Their architecture is very memory based. The memory is tightly coupled to the processor. You need more nodes, but the price per token and the performance is nuts,” says Moorhead. The “token” is the basic unit of data a model processes; in an LLM, it’s typically a word or portion of a word. Groq’s performance is even more impressive, he says, given that its chip, called the Language Processing Unit Inference Engine, is made using GlobalFoundries’ 14-nanometer technology, several generations behind the TSMC technology that makes the Nvidia H100.

In July, Groq posted a demonstration of its chip’s inference speed, which can exceed 1,250 tokens per second running Meta’s Llama 3 8-billion parameter LLM. That beats even SambaNova’s demo, which can exceed 1,000 tokens per second.
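Those throughput figures translate directly into user-facing latency. A quick sketch using the demo numbers above (the 500-token response length is a hypothetical, chosen only for illustration):

```python
# Decode throughput from the demos cited above, in tokens per second.
groq_tps = 1250       # Groq demo, Llama 3 8B
sambanova_tps = 1000  # SambaNova demo

answer_tokens = 500   # hypothetical length of one model response

def generation_seconds(tokens, tokens_per_second):
    # Time to stream a full response at a steady decode rate.
    return tokens / tokens_per_second

print(f"Groq:      {generation_seconds(answer_tokens, groq_tps):.2f} s")
print(f"SambaNova: {generation_seconds(answer_tokens, sambanova_tps):.2f} s")
```

At these rates a long answer arrives in well under a second, which is why inference speed has become a selling point of its own.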

Qualcomm: Power is everything

Pro: Broad range of chips with AI capabilities

Con: Lacks large, leading-edge chips for AI training

Qualcomm, well known for the Snapdragon system-on-a-chip that powers popular Android phones like the Samsung Galaxy S24 Ultra and OnePlus 12, is a giant that can stand toe-to-toe with AMD, Intel, and Nvidia.

But unlike those peers, the company is focusing its AI strategy more on AI inference and energy efficiency for specific tasks. Anton Lokhmotov, a founding member of the AI benchmarking organization MLCommons and CEO of Krai, a company that specializes in AI optimization, says Qualcomm has significantly improved the inference performance of the Qualcomm Cloud AI 100 servers in an important benchmark test. The servers’ performance increased from 180 to 240 samples-per-watt in ResNet-50, an image-classification benchmark, using “essentially the same server hardware,” Lokhmotov notes.

Efficient AI inference is also a boon on devices that need to handle AI tasks locally without reaching out to the cloud, says Lokhmotov. Case in point: Microsoft’s Copilot Plus PCs. Microsoft and Qualcomm partnered with laptop makers, including Dell, HP, and Lenovo, and the first Copilot Plus laptops with Qualcomm chips hit store shelves in July. Qualcomm also has a strong presence in smartphones and tablets, where its Snapdragon chips power devices from Samsung, OnePlus, and Motorola, among others.

Qualcomm is an important player in AI for driver-assist and self-driving platforms, too. In early 2024, Hyundai’s Mobis division announced a partnership to use the Snapdragon Ride platform, a rival to Nvidia’s Drive platform, for advanced driver-assist systems.

The Hyperscalers: Custom brains for brawn

Pro: Vertical integration focuses design

Con: Hyperscalers may prioritize their own needs and uses first

Hyperscalers—cloud-computing giants that deploy hardware at vast scales—are synonymous with Big Tech. Amazon, Apple, Google, Meta, and Microsoft all want to deploy AI hardware as quickly as possible, both for their own use and for their cloud-computing customers. To accelerate that, they’re all designing chips in-،use.

Google began investing in AI processors much earlier than its competitors: The search giant’s Tensor Processing Units, first announced in 2015, now power most of its AI infrastructure. The sixth generation of TPUs, Trillium, was announced in May and is part of Google’s AI Hypercomputer, a cloud-based service for companies looking to handle AI tasks.

Prendki says Google’s TPUs give the company an advantage in pursuing AI opportunities. “I’m lucky that I don’t have to think too hard about where I get my chips,” she says. Access to TPUs doesn’t entirely eliminate the supply crunch, t،ugh, as different Google divisions still need to share resources.

And Google is no longer alone. Amazon has two in-،use chips, Trainium and Inferentia, for training and inference, respectively. Microsoft has Maia, Meta has MTIA, and Apple is supposedly developing silicon to handle AI tasks in its cloud infrastructure.

None of these compete directly with Nvidia, as hyperscalers don’t sell hardware to customers. But they do sell access to their hardware through cloud services, like Google’s AI Hypercomputer, Amazon’s AWS, and Microsoft’s Azure. In many cases, hyperscalers offer services running on their own in-house hardware as an option right alongside services running on hardware from Nvidia, AMD, and Intel; Microsoft is thought to be Nvidia’s largest customer.

[An illustration of a knight holding a crown surrounded by arrows. Illustration: David Plunkert]

Chinese chips: An opaque future

Another category of competitor is born not of technical needs but of geopolitical realities. The United States has imposed restrictions on the export of AI hardware that prevent chipmakers from selling their latest, most-capable chips to Chinese companies. In response, Chinese companies are designing homegrown AI chips.

Huawei is a leader. The company’s Ascend 910B AI accelerator, designed as an alternative to Nvidia’s H100, is in production at Semiconductor Manufacturing International Corp., a Shanghai-based foundry partially owned by the Chinese government. However, yield issues at SMIC have reportedly constrained supply. Huawei is also selling an “AI-in-a-box” solution, meant for Chinese companies looking to build their own AI infrastructure on-premises.

To get around the U.S. export control rules, Chinese industry could turn to alternative technologies. For example, Chinese researchers have made headway in photonic chips that use light, instead of electric charge, to perform calculations. “The advantage of a beam of light is you can cross one [beam with] another,” says Prendki. “So it reduces constraints you’d normally have on a silicon chip, where you can’t cross paths. You can make the circuits more complex, for less money.” It’s still very early days for photonic chips, but Chinese investment in the area could accelerate the technology’s development.

Room for more

It’s clear that Nvidia has no shortage of competitors. It’s equally clear that none of them will challenge—never mind defeat—Nvidia in the next few years. Everyone interviewed for this article agreed that Nvidia’s dominance is currently unparalleled, but that doesn’t mean it will crowd out competitors forever.

“Listen, the market wants choice,” says Moorhead. “I can’t imagine AMD not having 10 or 20 percent market share, Intel the same, if we go to 2026. Typically, the market likes three, and there we have three reasonable competitors.” Kimball says the hyperscalers, meanwhile, could challenge Nvidia as they transition more AI services to in-house hardware.

And then there are the wild cards. Cerebras, SambaNova, and Groq are the leaders in a very long list of startups looking to nibble away at Nvidia with novel solutions. They’re joined by dozens of others, including d-Matrix, Untether, Tenstorrent, and Etched, all pinning their hopes on new chip architectures optimized for generative AI. It’s likely many of these startups will falter, but perhaps the next Nvidia will emerge from the survivors.


Source: https://spectrum.ieee.org/nvidia-ai