July 16th, 2024

AMD MI300X vs. Nvidia H100 LLM Benchmarks

In inference benchmarks on MistralAI's Mixtral 8x7B, the AMD MI300X outperforms the Nvidia H100 SXM at very small and very large batch sizes, with its 192 GB of VRAM giving it the edge on large workloads, while the H100 SXM delivers higher throughput at small-to-medium batch sizes. The MI300X is also the more cost-effective option at those batch-size extremes. Choosing between the two GPUs is workload-specific, balancing throughput, latency, and cost efficiency.

In a performance comparison between the AMD MI300X and Nvidia H100 SXM GPUs on inference with MistralAI's Mixtral 8x7B model, the MI300X outperforms the H100 SXM at small and large batch sizes but lags behind at medium batch sizes. The MI300X's larger VRAM (192 GB versus the H100's 80 GB) is the advantage at higher batch sizes, letting a single GPU handle workloads that would otherwise have to be split across several. It is also more cost-effective than the H100 SXM at those small and large batch sizes.

Serving benchmarks show the same split: the H100 SXM offers higher throughput at small-to-medium batch sizes, while the MI300X provides lower latency and more consistent performance at larger ones. The choice between the two GPUs therefore depends on the workload, balancing throughput, latency, and cost efficiency. Further real-world tests are planned to benchmark other popular open-source models and to explore the impact of the MI300X's 192 GB of VRAM.
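For readers who want to run a sweep like this themselves, the sketch below shows one way to measure offline throughput and latency across batch sizes. It is a minimal sketch assuming vLLM's offline API; the article does not specify its benchmark harness, so the prompt, token budget, and batch-size list here are illustrative, not the article's settings.

```python
# Minimal batch-size sweep for Mixtral 8x7B inference (sketch, assumes vLLM).
import time

from vllm import LLM, SamplingParams

llm = LLM(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",
    tensor_parallel_size=1,  # a 192 GB MI300X can hold the fp16 weights alone
)
params = SamplingParams(temperature=0.0, max_tokens=128)

for batch_size in (1, 4, 16, 64, 256):
    prompts = ["Summarize the history of the GPU."] * batch_size
    start = time.perf_counter()
    outputs = llm.generate(prompts, params)
    elapsed = time.perf_counter() - start
    # Count generated tokens across the batch to get aggregate throughput.
    generated = sum(len(o.outputs[0].token_ids) for o in outputs)
    print(f"batch={batch_size:4d}  "
          f"throughput={generated / elapsed:8.1f} tok/s  "
          f"batch latency={elapsed:6.2f} s")
```

On an 80 GB H100 the same script would need quantized weights or tensor_parallel_size=2, which is exactly the single-GPU-capacity difference the benchmarks highlight.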

3 comments
By @samspenc - 6 months
Fascinating: despite the significantly better specs (and VRAM) on the AMD MI300X, the Nvidia H100 seems to match its performance at lower batch sizes and only loses out slightly at larger batches. I'm guessing the differentiator is mostly VRAM (192 GB in the MI300X vs. 80 GB in the Nvidia chip).

Does anyone know if this is just due to ROCm vs CUDA implementations? Or something else?
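A back-of-the-envelope check (mine, not from the thread) suggests capacity, not just software, is in play: Mixtral 8x7B's published total parameter count is roughly 46.7B, so its fp16 weights alone come to about 93 GB, overflowing a single 80 GB H100 before any KV cache is allocated, while fitting comfortably in the MI300X's 192 GB. The figures below are approximate.

```python
# Back-of-the-envelope VRAM check: do fp16 Mixtral 8x7B weights fit on one GPU?
PARAMS = 46.7e9      # Mixtral 8x7B total parameters (published figure, approx.)
BYTES_PER_PARAM = 2  # fp16/bf16

weights_gb = PARAMS * BYTES_PER_PARAM / 1e9
print(f"fp16 weights: ~{weights_gb:.0f} GB")  # ~93 GB

for name, vram_gb in (("H100 SXM", 80), ("MI300X", 192)):
    headroom = vram_gb - weights_gb
    verdict = "fits" if headroom > 0 else "does NOT fit"
    print(f"{name}: {vram_gb} GB -> {verdict} ({headroom:+.0f} GB for KV cache)")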