July 27th, 2024

Ask HN: Would you use a shared GPU cloud tier?

A new cloud instance model allows users to pay only for active GPU usage, but tasks may take about 25% longer than on dedicated GPU instances due to shared resources.

A new cloud instance model behaves like a standard GPU instance but runs on shared GPU resources. Users pay only for the time the GPU is actively in use, not for the entire time the instance is running. The trade-off is speed: tasks on the shared GPU instance may take approximately 25% longer to complete than on a traditional dedicated GPU instance.
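
To make the trade-off concrete, here is a minimal sketch of the break-even arithmetic. The post gives no prices, so the hourly rates below are hypothetical placeholders. The sketch also assumes that the 25% slowdown lengthens the billed active GPU time on the shared tier, which the post does not state.

```python
SLOWDOWN = 1.25  # from the post: tasks take about 25% longer on the shared tier


def dedicated_cost(wall_hours: float, rate_per_hour: float) -> float:
    """Dedicated instance: billed for every hour the instance is running."""
    return wall_hours * rate_per_hour


def shared_cost(active_hours: float, rate_per_active_hour: float,
                slowdown: float = SLOWDOWN) -> float:
    """Shared tier: billed only for active GPU time.

    active_hours is the GPU-busy time the job would need on a dedicated GPU.
    Assumption: the slowdown stretches the billed active time by the same factor.
    """
    return active_hours * slowdown * rate_per_active_hour


def break_even_utilization(dedicated_rate: float, shared_rate: float,
                           slowdown: float = SLOWDOWN) -> float:
    """GPU utilization (active time / instance uptime) below which shared is cheaper."""
    return dedicated_rate / (shared_rate * slowdown)


# Hypothetical rates: $2.00/h dedicated, $3.00 per active hour shared.
# The instance is up for 8 hours but the GPU is busy for only 2 of them.
print(dedicated_cost(8, 2.00))
print(shared_cost(2, 3.00))
print(break_even_utilization(2.00, 3.00))
```

Under these assumed rates, the shared tier is cheaper whenever the GPU is busy for less than roughly half of the instance's uptime. With different prices the threshold moves accordingly.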

Related

20x Faster Background Removal in the Browser Using ONNX Runtime with WebGPU

Using ONNX Runtime with WebGPU and WebAssembly in browsers achieves 20x speedup for background removal, reducing server load, enhancing scalability, and improving data security. ONNX models run efficiently with WebGPU support, offering near real-time performance. Leveraging modern technology, IMG.LY aims to enhance design tools' accessibility and efficiency.

Intel's Gaudi 3 will cost half the price of Nvidia's H100

Intel's Gaudi 3 AI processor is priced at $15,650, about half the price of Nvidia's H100. Intel aims to compete in an AI market dominated by Nvidia, while also facing competition from cloud providers' custom AI processors.

So you want to rent an NVIDIA H100 cluster? 2024 Consumer Guide

Considerations when renting an NVIDIA H100 cluster include price, reliability, efficient interconnects, spare nodes, storage, support, and cluster management. The guide recommends testing before committing, monitoring GPU usage, and understanding the provider's electricity sources for sustainability.

gpu.cpp: A lightweight library for portable low-level GPU computation

The GitHub repository features gpu.cpp, a lightweight C++ library for portable GPU compute using WebGPU. It offers fast cycles, minimal dependencies, and examples such as a GELU kernel and matrix multiplication for easy integration.

AMD's Long and Winding Road to the Hybrid CPU-GPU Instinct MI300A

AMD's journey from 2012 led to the development of the powerful Instinct MI300A compute engine, used in the "El Capitan" supercomputer. Key researchers detailed AMD's evolution, funding, and technology advancements, impacting future server offerings.

8 comments
By @cpeterson42 - 6 months
For anyone curious, here is an early prototype of this tech in action:

https://imgur.com/a/2qPN4ru

Would love to hear your thoughts on how we can make this most useful for you!

By @cheptsov - 6 months
Sounds like an extremely complex technical problem. I'd also suggest looking at the use cases where this is needed. One of the problems is that loading weights into the GPU will be so slow that it will be really hard to share the GPU between different processes, since offloading and reloading will take a long time. Would love to learn more about what you do.
By @lbhdc - 6 months
Yes, I have wanted something like this for a while. I try to avoid using GPUs where possible because of the expense and the ephemeral nature of my use.
By @einsteinx2 - 6 months
Depending on the price difference from a standard GPU instance, absolutely.
By @JSDevOps - 6 months
Isn’t this just the cloud? You pay for what you use.
By @billconan - 6 months
Yes, I would use it if the price is affordable.
By @DamonHD - 6 months
Maybe prefix with "Ask HN:"?
By @42lux - 6 months
Depends on your workload...