July 13th, 2024

gpu.cpp: A lightweight library for portable low-level GPU computation

The GitHub repository features gpu.cpp, a lightweight C++ library for portable GPU compute built on WebGPU. It offers fast compile/run cycles, minimal dependencies, and examples such as a GELU kernel and matrix multiplication for easy integration.


The GitHub repository hosts gpu.cpp, a lightweight library for portable GPU compute in C++. It builds on the WebGPU specification as a low-level GPU interface, aiming for a high power-to-weight API with fast compile/run cycles and minimal dependencies. Developers can integrate GPU computation into their projects using standard C++ compilers, benefiting from a small API surface area and a prebuilt binary of the Dawn native WebGPU implementation. The library includes examples such as a GELU kernel, matrix multiplication, a physics simulation, and signed distance function rendering, catering to projects that need portable on-device GPU computation with low implementation complexity.

Related

20x Faster Background Removal in the Browser Using ONNX Runtime with WebGPU

Using ONNX Runtime with WebGPU and WebAssembly in the browser achieves a 20x speedup for background removal, reducing server load, improving scalability, and strengthening data security. ONNX models run efficiently with WebGPU support, offering near real-time performance. IMG.LY aims to use these technologies to make its design tools more accessible and efficient.

A portable lightweight C FFI for Lua, based on libffi

A GitHub repository offers a portable, lightweight C FFI for Lua based on libffi. It aims for LuaJIT FFI compatibility and is written in C. The README covers features, examples, basic types, build instructions, testing, and acknowledgements.

Show HN: UNet diffusion model in pure CUDA

The GitHub repository details optimizing a UNet diffusion model in C++/CUDA to match PyTorch's performance. It covers custom convolution kernels, forward pass improvements, backward pass challenges, and future optimization plans.

GPU profiling for WebGPU workloads on Windows with Chrome

The post addresses the challenges of GPU profiling for WebGPU workloads in Chrome on Windows. A workaround using a custom DLL enables GPU profiling with tools like AMD's Radeon GPU Profiler and Nvidia's Nsight, improving the performance metrics available for WebGPU applications.

Karpathy: Let's reproduce GPT-2 (1.6B): one 8XH100 node 24h $672 in llm.c

The GitHub repository hosts the "llm.c" project by Andrej Karpathy, which implements large language models in C/CUDA without extensive libraries, with an emphasis on pretraining GPT-2 and GPT-3 models.

16 comments
By @pavlov - 3 months
Lovely! I like how the API is in a single header file that you can read through and understand in one sitting.

I've worked with OpenGL and Direct3D and Metal in the past, but the pure compute side of GPUs is mostly foreign to me. Learning CUDA always felt like a big time investment when I never had an obvious need at hand.

So I'm definitely going to play with this library and try to get up to speed. Thanks for publishing it.

By @0xf00ff00f - 3 months
This is cool, but they should have just used Vulkan. Dawn is a massive dependency (and a PITA to build, in my experience) to get what's basically a wrapper around Vulkan. Vulkan has a reputation for being difficult to work with, but if you just want to use a compute queue it's not that horrible. Also, since Vulkan uses SPIR-V, the user would have more choices for shading languages. Additionally, with RenderDoc you get source-level shader debugging.

Shameless plug: in case anyone wants to see what doing just compute with Vulkan looks like, I wrote a similar library to compete on SHAllenge [0], which was posted here on HN a few days ago. My library is here: https://github.com/0xf00ff00f/vulkan-compute-playground/

[0] https://shallenge.quirino.net/

By @austinvhuang - 3 months
Hi, author here! Agh I was intending for the project to fly under the radar for a few more days before making the announcement and blog post (please look/upvote that when you see it haha :)

But since this is starting I'm happy to chat. Nice to see the interest here!

By @almostgotcaught - 3 months
TIL you can run the WebGPU runtime without a browser.
By @jph00 - 3 months
We just published an article introducing gpu.cpp, what it's for, and how it works:

https://www.answer.ai/posts/2024-07-11--gpu-cpp.html

By @soci - 3 months
I watched the video mentioned in the post [1], but now I’m more confused than before…

What are the benefits, if any, of using gpu.cpp instead of just webgpu.h (webgpu native) directly? Maybe each is tailored for different use cases?

[1] https://youtu.be/qHrx41aOTUQ?si=CehJnYQWCg3XklHj

By @uLogMicheal - 3 months
This is awesome! I was looking at creating something similar, inspired by the miniaudio approach. Will likely contribute a Dart wrapper soon.
By @hpen - 3 months
Any performance metrics vs Vulkan, Metal, etc.?
By @captaincrowbar - 3 months
This looks useful but I'm worried about portability. Are there any plans for native Windows support?
By @Arech - 3 months
Very interesting... I wonder how its performance compares to raw Vulkan?
By @coffeeaddict1 - 3 months
Is this intended to integrate well in an existing WebGPU project?
By @apatheticonion - 3 months
Oh nice! Would love to see a Rust crate wrapping bindings for this
By @01HNNWZ0MV43FF - 3 months
> The only library dependency of gpu.cpp is a WebGPU implementation.

Noo

By @kookamamie - 3 months
Portable, as in Windows native is not supported?
By @byefruit - 3 months
This looks great. Is there an equivalent project in Rust?