September 3rd, 2024

Make Your Own CDN with NetBSD

The article outlines setting up a self-hosted CDN using NetBSD, Varnish, and nginx, detailing installation, SSL management, configuration, and benefits like control, device compatibility, and geo-replication options.

This article provides a comprehensive guide on setting up a self-hosted Content Delivery Network (CDN) using NetBSD, a lightweight and secure operating system. It emphasizes NetBSD's compatibility with various hardware, including older devices, making it suitable for a caching reverse proxy. The installation process involves enabling binary package management and using the pkgin tool to install necessary packages like Varnish and nginx. Two methods for SSL certificate management are discussed: using acme.sh, which is recommended for its simplicity, and compiling the lego tool manually. The article details the configuration of Varnish and nginx, including creating a VCL configuration file for Varnish and modifying the nginx configuration to set up a reverse proxy. Finally, it outlines the steps to start both services, ensuring they are ready to handle incoming connections. The conclusion highlights the benefits of this setup, including control over the CDN and the ability to run on various devices, while also suggesting options for geo-replication and resilience through DNS management.
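
As a rough sketch of the flow the article walks through (package and rc.d script names are assumptions here, and the domain is a placeholder; the article itself covers the exact NetBSD-specific paths and the lego alternative):

```sh
# Install the caching proxy and the TLS-terminating web server from pkgsrc binaries
pkgin update
pkgin install varnish nginx

# Issue a certificate with acme.sh (webroot mode assumed)
acme.sh --issue -d cdn.example.com -w /var/www/acme

# Enable the rc.d scripts shipped with the packages, then start both services
cp /usr/pkg/share/examples/rc.d/varnishd /usr/pkg/share/examples/rc.d/nginx /etc/rc.d/
echo 'varnishd=YES' >> /etc/rc.conf
echo 'nginx=YES'    >> /etc/rc.conf
/etc/rc.d/varnishd start
/etc/rc.d/nginx start
```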

- The guide focuses on creating a self-hosted CDN using NetBSD, Varnish, and nginx.

- It provides two methods for SSL certificate management: acme.sh and lego.

- Configuration steps for Varnish and nginx are detailed, including VCL file creation.

- The setup is suitable for a variety of hardware, including older devices.

- The article suggests options for enhancing resilience and geo-replication.

13 comments
By @rahkiin - 5 months
This is not a CDN (Content Delivery Network). The value is in the networking bit: storage all around the world for resiliency, bandwidth cost, scalability, and low latency.

Having 1 server with some static file storage is called a web server.

By @scrapheap - 5 months
Varnish is one of those tools that has a very specific purpose (a highly configurable caching reverse proxy that is crazily fast). Most of the time I don't need it - but in the places where I have had to use it, it's made the difference between working services and failing services.

One example of where it made the difference was where we had two commercial systems, let's call them System A and System B. System A was acting as a front end for System B, but System A was making so many API calls to System B that it was grinding it to a halt. System B's responses would only change when System A called a few specific APIs - so we put Varnish between System A and System B, caching the common API responses. We also set it up so that when a request hit the handful of APIs that would change the other APIs' responses for an account, we'd invalidate all the cache entries for that one specific account. Once System A was talking to the Varnish cache, the performance of both systems improved drastically.
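
That invalidation pattern maps fairly directly onto VCL. A rough sketch of the idea, with made-up endpoint paths and a hypothetical `X-Account-Id` header standing in for however the real systems identified the account:

```vcl
vcl 4.1;

backend system_b {
    .host = "system-b.internal";   # assumed backend address
    .port = "8080";
}

acl writers {
    "10.0.0.0"/8;                  # only internal callers may trigger invalidation
}

sub vcl_recv {
    # Cache the read-only endpoints System A hammers most.
    if (req.method == "GET" && req.url ~ "^/api/(account|catalog)/") {
        return (hash);
    }

    # Writes that change an account's data: ban every cached object
    # tagged with that account, then pass the request to System B.
    if (req.method == "POST" && req.url ~ "^/api/account/" && client.ip ~ writers) {
        ban("obj.http.X-Account-Id == " + req.http.X-Account-Id);
        return (pass);
    }

    return (pass);
}

sub vcl_backend_response {
    # Tag each cached object with its account so the ban expression can find it.
    set beresp.http.X-Account-Id = bereq.http.X-Account-Id;
    set beresp.ttl = 10m;
}
```

The `ban()` call marks every cached object whose stored `X-Account-Id` matches, so the next read for that account goes back to System B.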

By @sirn - 5 months
Some comments:

- You don't really need to repeat the built-in VCLs in default.vcl. In the article, you can omit `vcl_hit`, `vcl_miss`, `vcl_purge`, `vcl_synth`, `vcl_hash`, etc. If you want to modify the behavior of the built-in VCL, e.g. adding extra logs in vcl_purge, just add a `std.log` line and don't `return` (it will fall through to the built-in VCL). You can read more about built-in VCL on the Varnish Developer Portal[1] and in the Varnish Cache documentation[2].

- Related to the above built-in VCL comment: `vcl_recv` currently lacks all the guards provided by Varnish's default VCL, so it's recommended to drop the `return (hash)` line at the end and let the built-in VCL handle invalid requests and skip caching when a Cookie or Authorization header is present. You may also want to use vmod_cookie[3] to keep only the cookies you care about.

- Since Varnish is sitting behind another reverse proxy, it makes more sense to enable the PROXY protocol, so client IPs are passed to Varnish as part of the PROXY protocol rather than X-Forwarded-For (so `client.ip`, etc. work). This means using `-a /var/run/varnish.sock,user=nginx,group=varnish,mode=660,PROXY`, and configuring `proxy_protocol on;` in Nginx (a combined sketch of these points follows the links below).

[1]: https://www.varnish-software.com/developers/tutorials/varnis...

[2]: https://varnish-cache.org/docs/7.4/users-guide/vcl-built-in-...

[3]: https://varnish-cache.org/docs/trunk/reference/vmod_cookie.h...
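
Pulling those points together, a trimmed configuration might look roughly like this. The backend address, file paths, and certificate locations are assumptions rather than anything from the article, and the nginx side uses the stream module, since that is where nginx can emit the PROXY protocol toward an upstream:

```vcl
# default.vcl -- lean on the built-in VCL instead of restating it
vcl 4.1;

import std;

backend origin {
    .host = "127.0.0.1";   # assumed origin address
    .port = "8080";
}

sub vcl_recv {
    # No "return (hash)" here: falling through to the built-in vcl_recv keeps
    # its guards (invalid requests, Cookie/Authorization handling) intact.
    std.log("recv: " + req.method + " " + req.url);
}

sub vcl_purge {
    # Extra logging only; no return, so the built-in vcl_purge still runs.
    std.log("purge: " + req.url);
}
```

And the listener plus the nginx side of the socket:

```sh
# varnishd listening on a UNIX socket with PROXY protocol, as suggested above
# (VCL path and cache size are assumptions)
varnishd -f /usr/pkg/etc/varnish/default.vcl \
    -a /var/run/varnish.sock,user=nginx,group=varnish,mode=660,PROXY \
    -s malloc,256m
```

```nginx
# nginx terminates TLS in a stream block and forwards PROXY protocol to Varnish
stream {
    server {
        listen 443 ssl;
        ssl_certificate     /usr/pkg/etc/nginx/fullchain.pem;   # paths assumed
        ssl_certificate_key /usr/pkg/etc/nginx/privkey.pem;
        proxy_pass unix:/var/run/varnish.sock;
        proxy_protocol on;
    }
}
```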

By @daniel_iversen - 5 months
I’ve heard good things about Varnish and believe I used it for a few things back in the day. Squid was also good when I used it in the mid-2000s (not sure where it is today), and I think I heard that Akamai was originally just Squid on NetBSD or something like that!! Can anyone confirm or deny?
By @benterix - 5 months
The first article in the series offers a better explanation of what and why:

https://it-notes.dragas.net/2024/08/26/building-a-self-hoste...

By @draga79 - 5 months
This article is part of a series, and the goal is to create content caching nodes on hosts scattered around the world. When a user connects, the DNS will return the closest active host to them. On a larger scale, it's not much different from what commercial CDNs do.
By @systems_glitch - 5 months
Always nice to see a project choosing NetBSD! It's pretty easy to manage with Ansible too, so we sometimes rotate it in on "this could be any *NIX" projects and services.
By @jhdias - 5 months
By @justmarc - 5 months
NetBSD is just leet. FTW.
By @kaycey2022 - 5 months
What is the point of this? Isn’t a CDN’s primary purpose to cache content close to the client?
By @eqballhejri - 5 months
Electric hybrid
By @opentokix - 5 months
Useless

Varnish is not better in any shape or form than nginx for static content. Varnish has one single use case: PHP sites. For everything else it will just add a layer of complexity that gives no gains. And since Varnish is essentially built on Apache, there are some issues with how it handles connections above about 50k/sec, where it gets complicated to configure - something that nginx does not have.
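
For context on the static-content point, serving static files with long-lived caching straight from nginx needs very little configuration; a minimal sketch, with the domain, paths, and cache lifetime picked arbitrarily:

```nginx
# Plain static serving with client-side caching; no Varnish layer involved.
server {
    listen 80;
    server_name cdn.example.com;

    root /var/www/static;

    location / {
        expires 7d;                        # let browsers and downstream caches hold assets
        add_header Cache-Control "public";
        try_files $uri =404;
    }
}
```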