GDDR7 Memory Supercharges AI Inference
GDDR7 memory enhances AI inference with a roadmap of up to 48 GT/s and 192 GB/s of per-device throughput, using PAM3 encoding for higher data rates alongside advanced reliability features for memory-hungry applications.
GDDR7 memory has emerged as a cutting-edge solution for enhancing AI inference capabilities, with a performance roadmap of up to 48 Gigatransfers per second (GT/s) and a memory throughput of 192 GB/s per device. This new generation of graphics memory addresses the increasing demands of AI workloads, particularly inference, where low latency and high bandwidth are critical. GDDR7 significantly outperforms its predecessor, GDDR6, by using three-level pulse amplitude modulation (PAM3) encoding, which transmits 50% more data per clock cycle. At launch data rates, the memory delivers 128 GB/s of bandwidth, more than double that of LPDDR5T.

GDDR7 also incorporates advanced reliability features, including on-die error correction and other data integrity enhancements, and moves from GDDR6's two 16-bit channels to four 10-bit channels to further optimize performance. The Rambus GDDR7 Controller IP is designed to support this high memory throughput with programmability, easing the integration of GDDR7 into various applications. As AI inference models grow in size and complexity, high-performance memory solutions like GDDR7 become increasingly vital for AI accelerators and GPUs deployed in edge computing environments.
- GDDR7 memory offers up to 48 GT/s and 192 GB/s throughput, enhancing AI inference.
- It utilizes PAM3 encoding for a 50% increase in data transmission compared to GDDR6.
- GDDR7 provides 128 GB/s bandwidth, significantly outperforming LPDDR5T.
- Advanced reliability features include on-die error correction and data integrity enhancements.
- The Rambus GDDR7 Controller IP supports high-memory throughput and programmability.
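The bandwidth figures above are easy to sanity-check. A minimal sketch of the arithmetic, assuming a standard 32-bit-wide GDDR device interface (the article does not state the interface width, so that is an assumption here):

# Sanity-check of the GDDR7 per-device bandwidth figures quoted above.
# ASSUMPTION: a 32-bit-wide (x32) device interface, typical for GDDR
# parts but not stated explicitly in the article.

BITS_PER_BYTE = 8
DEVICE_WIDTH_BITS = 32  # assumed x32 GDDR device

def device_bandwidth_gbs(data_rate_gts: float) -> float:
    """Per-device bandwidth in GB/s for a given per-pin data rate in GT/s."""
    return data_rate_gts * DEVICE_WIDTH_BITS / BITS_PER_BYTE

print(device_bandwidth_gbs(32))  # 128.0 GB/s -- the launch figure cited above
print(device_bandwidth_gbs(48))  # 192.0 GB/s -- top of the stated roadmap

# PAM3 carries 3 bits per 2 symbols versus NRZ's 1 bit per symbol,
# which is where the 50% per-clock gain over GDDR6 comes from:
nrz_bits_per_symbol = 1.0
pam3_bits_per_symbol = 3 / 2
print(pam3_bits_per_symbol / nrz_bits_per_symbol - 1)  # 0.5 -> +50%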
Related
Nvidia launches a new RTX 4070
Nvidia has introduced an updated RTX 4070 graphics card with GDDR6 memory to enhance supply and availability, maintaining similar performance to the previous model, available globally from September 2024.
Cerebras Inference: AI at Instant Speed
Cerebras launched its AI inference solution, claiming to process 1,800 tokens per second, outperforming NVIDIA by 20 times, with competitive pricing and plans for future model support.
Cerebras Launches the Fastest AI Inference
Cerebras Systems launched Cerebras Inference, the fastest AI inference solution, outperforming NVIDIA GPUs by 20 times, processing up to 1,800 tokens per second, with significant cost advantages and multiple service tiers.
The Memory Wall: Past, Present, and Future of DRAM
The article highlights the challenges facing DRAM, including slowed scaling, rising AI-driven memory demand, and high costs of HBM, while emphasizing the need for innovation and new memory technologies.
AMD Instinct MI325X to Feature 256GB HBM3E Memory, CDNA4-Based MI355X with 288GB
AMD announced updates to its Instinct GPUs, introducing the MI325X with 256GB memory and 6 TB/s bandwidth, and the MI355X with 288GB memory and 8 TB/s bandwidth, launching in 2025.
90% of the article is just finding new ways to work "AI" into pure fluff sentences.
I didn't think it was.
Sounds pretty awesome. I would think that it's going to be much harder to achieve the same clock speeds.
Anyone heard anything about memristors being in a real large scale memory/compute product?
Any bets on when it gets renamed AIDDR? Only partly joking