AMD AI Chip Unveiling - Assembly - Salesforce Research
AMD unveils MI300x AI chip as 'generative AI accelerator'
zdnet.com - None - Read On Original Website
AMD's Instinct MI300X GPU features multiple GPU "chiplets" plus 192 gigabytes of HBM3 DRAM memory, and 5.2 terabytes per second of memory bandwidth. The company said it is the only chip that can handle large language models of up to 80 billion parameters in memory. AMD
More Context
keyboard_arrow_down keyboard_arrow_right What are the technical specifications of the chip?
anandtech.com from 16 cores to 96 cores
neowin.net 128-core
wccftech.com 6 XCDs (Up To 228 CUs), 3 CCDs (Up To 24 Zen 4 Cores), 8 HBM3 Stacks
servethehome.com 128GB of HBM3, 5.2TB/s of memory bandwidth 896GB/s of Infinity Fabric bandwidth
crn.com higher core design and greater energy efficiency
finance.yahoo.com it can use up to 192GB of memory
zdnet.com enormous memory and data throughput
videocardz.com up to 128 cores, deliver up to 3.7x throughput performance for key cloud native workloads compared to Ampere1
prnewswire.com unmatched core count in 1U and 2U density
Advanced Micro Devices CEO Lisa Su on Tuesday in San Francisco unveiled a chip that is a centerpiece in the company's strategy for artificial intelligence computing, boasting its enormous memory and data throughput for so-called generative AI tasks such as large language models.
More Context
keyboard_arrow_down keyboard_arrow_right What are the other applications that the chip could be used for?
crn.com AI, Cloud Expansion
neowin.net business and cloud
videocardz.com cloud native and technical computing
servethehome.com NICs, storage, and even memory
wepc.com high-performance computing (HPC) and AI workloads
finance.yahoo.com large language models and generative AI
latestly.com Software Enablement for Generative AI (Artificial Intelligence)
benzinga.com health care to 5G networks and data centers
wccftech.com various core IPs, memory interfaces, interconnects
beststocks.com high-performance computing, graphics, and visualization technologies
prnewswire.com technical computing
The Instinct MI300X, as the part is known, is a follow-on to the previously announced MI300A. The chip is really a combination of multiple "chiplets," individual chips that are joined together in a single package by shared memory and networking links.
Su, onstage for an invite-only audience at the Fairmont Hotel in downtown San Francisco, referred to the part as a "generative AI accelerator," and said the GPU chiplets contained in it, a family known as CDNA 3, are "designed specifically for AI and HPC [high-performance computing] workloads."
The MI300X is a "GPU-only" version of the part. The MI300A is a combination of three Zen4 CPU chiplets with multiple GPU chiplets. But in the MI300X, the CPUs are swapped out for two additional CDNA 3 chiplets.
Also: Nvidia unveils new kind of Ethernet for AI, Grace Hopper 'Superchip' in full production
The MI300X increases the transistor count from 146 billion transistors to 153 billion, and the shared DRAM memory is boosted from 128 gigabytes in the MI300A to 192 gigabytes.
The memory bandwidth is boosted from 800 gigabytes per second to 5.2 terabytes per second.
"Our use of chiplets in this product is very, very strategic," said Su, because of the ability to mix and match different kinds of compute, swapping out CPU or GPU.
Su said the MI300X will offer 2.4 times the memory density of Nvidia's H100 "Hopper" GPU, and 1.6 times the memory bandwidth.
More Context
keyboard_arrow_down keyboard_arrow_right How does the chip compare to Nvidia's chips?
marketwatch.com closest competitor
crn.com fewer GPUs
wepc.com the MI300X has up to 2.4X the HBM density and 1.6X the HBM bandwidth
finance.yahoo.com a purported rival
"The generative AI, large language models have changed the landscape," said Su. "The need for more compute is growing exponentially, whether you're talking about training or about inference."
To demonstrate the need for powerful computing, Sue showed the part working on what she said is the most popular large language model at the moment, the open source Falcon-40B. Language models require more compute as they are built with greater and greater numbers of what are called neural network "parameters." The Falcon-40B consists of 40 billion parameters.
The MI300X, she said, is the first chip that is powerful enough to run a neural network of that size, entirely in memory, rather than having to move data, back-and-forth to and from external memory.
Su demonstrated the MI300X creating a poem about San Francisco using Falcon-40B.
"A single MI300X can run models up to approximately 80 billion parameters" in memory, she said.
"When you compare MI300X to the competition, MI300X offers 2.4 times more memory, and 1.6 times more memory bandwidth, and with all of that additional memory capacity, we actually have an advantage for large language models because we can run larger models directly in memory."
To be able to run the entire model in memory, said Su, means that, "for the largest models, that actually reduces the number of GPUs you need, significantly speeding up the performance, especially for inference, as well as reducing the total cost of ownership."
"I love this chip, by the way," enthused Su. "We love this chip."
"With MI300X, you can reduce the number of GPUs, and as model sizes keep growing, this will become even more important."
"With more memory, more memory bandwidth, and fewer GPUs needed, we can run more inference jobs per GPU than you could before," said Su. That will reduce the total cost of ownership for large language models, she said, making the technology more accessible.
Also: For AI's 'iPhone moment', Nvidia unveils a large language model chip
To compete with Nvidia's DGX systems, Su unveiled a family of AI computers, the "AMD Instinct Platform." The first instance of that will combine eight of the MI300X with 1.5 terabytes of HMB3 memory. The server conforms to the industry standard Open Compute Platform spec.
More Context
keyboard_arrow_down keyboard_arrow_right Who are the potential competitors for AMD's chip?
crn.com Amazon Web Services, Microsoft Azure, Google Cloud and others
benzinga.com NVIDIA Corp. NVDA, Meta Platforms Inc. META, and Tesla Inc. TSLA
seekingalpha.com Nvidia Corporation
fool.com Nvidia's H100
"For customers, they can use all this AI compute capability in memory in an industry-standard platform that drops right into their existing infrastructure," said Su.
Unlike MI300X, which is only a GPU, the existing MI300A is going up against Nvidia's Grace Hopper combo chip, which uses Nvidia's Grace CPU and its Hopper GPU, which the company announced last month is in full production.
MI300A is being built into the El Capitan supercomputer under construction at the Department of Energy's Lawrence Livermore National Laboratories, noted Su.
The MI300A is being shown as a sample currently to AMD customers, and the MI300X will begin sampling to customers in the third quarter of this year, said Su. Both will be in volume production in the fourth quarter, she said.
You can watch a replay of the presentation on the Website set up by AMD for the news.