dstack (K8s alternative) adds support for AMD accelerators on RunPod
dstack has introduced support for AMD accelerators on RunPod, enabling AI container orchestration on MI300X GPUs, whose higher VRAM and memory bandwidth make it easier to deploy large models.
dstack has announced support for AMD accelerators on RunPod, the first cloud provider to offer AMD GPUs through dstack's platform. The goal is greater vendor independence and portability for AI container orchestration. The MI300X GPU benchmarks favorably against the H100 SXM, offering higher VRAM (192 GB versus 80 GB) and higher memory bandwidth (5.3 TB/s versus 3.4 TB/s), which allows large models to be deployed more efficiently; the FP16 version of Llama 3.1, for example, fits within a single node of eight MI300X GPUs. Users can now specify AMD GPUs in their configurations, with examples provided for both services and dev environments, and must use images that include ROCm drivers. Support for AMD accelerators is expected to expand to other cloud providers and on-premise servers, and dstack plans to add more examples of using AMD accelerators with various frameworks.
- dstack supports AMD accelerators on RunPod, enhancing AI container orchestration.
- MI300X GPU offers higher VRAM and memory bandwidth compared to H100 SXM.
- Users can deploy large models more efficiently with MI300X.
- Configuration examples for services and development environments are provided (see the sketch after this list).
- Support for AMD accelerators is expected to expand to additional cloud providers and on-premise servers.
Related
Testing AMD's Giant MI300X
AMD introduces the Instinct MI300X to challenge NVIDIA in the GPU compute market. The MI300X features a chiplet design, Infinity Cache, and the CDNA 3 architecture, performs competitively against NVIDIA's H100, and excels in local memory bandwidth tests.
AMD MI300X performance compared with Nvidia H100
The AMD MI300X outperforms Nvidia's H100 in cache and latency benchmarks and delivers strong compute throughput, but AI inference performance varies by workload. Real-world performance and ecosystem support remain the deciding factors.
AMD MI300X vs. Nvidia H100 LLM Benchmarks
The AMD MI300X outperforms the Nvidia H100 SXM on MistralAI's Mixtral 8x7B at very small and very large batch sizes, where its larger VRAM pays off and it is more cost-effective, while the H100 SXM delivers higher throughput at small-to-medium batch sizes. Choosing between the two is workload-specific, balancing throughput, latency, and cost efficiency.
AMD's Long and Winding Road to the Hybrid CPU-GPU Instinct MI300A
AMD's journey, beginning in 2012, led to the powerful Instinct MI300A compute engine used in the "El Capitan" supercomputer. Key researchers detailed AMD's evolution, funding, and technology advances, which will shape future server offerings.
Show HN: We made glhf.chat – run almost any open-source LLM, including 405B
The platform runs a wide range of open-source large language models from Hugging Face repo links using vLLM and a GPU scheduler. It offers free beta access, with competitive post-beta pricing planned via multi-tenant model serving.
Maybe it’s a k8s alternative for AI/LLM!