We survived 10k requests/second: Switching to signed asset URLs in an emergency
Hardcover experienced a surge in Google Cloud expenses due to unauthorized access to their public storage. They implemented signed URLs via a Ruby on Rails proxy, reducing costs and enhancing security.
In response to an unexpected surge in Google Cloud expenses, Adam Fortuna detailed how his company, Hardcover, managed a crisis involving 10,000 requests per second to their public Google Cloud Storage bucket. Fortuna first noticed unusual charges on his debit card, which led him to investigate the spike in Cloud Storage costs. He discovered that the bucket's public-access configuration allowed unauthorized users to download images at an alarming rate, resulting in significant data transfer costs. To mitigate the issue, Fortuna switched to signed URLs, which restrict access to images while still allowing legitimate users to retrieve them. The solution involved a Ruby on Rails proxy that generates signed URLs for images and caches them for efficiency. This approach protected the storage bucket while ensuring that API users could still access the necessary images without disruption. Since the switch, the system has stabilized, costs have dropped, and unauthorized bulk access has stopped.
- Hardcover faced a significant spike in Google Cloud expenses due to unauthorized access to public storage.
- The company switched to signed URLs to secure access to images while maintaining usability for legitimate users.
- A Ruby on Rails proxy was implemented to generate and cache signed URLs efficiently.
- The transition has successfully reduced costs and improved security against bulk data downloads.
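The post doesn't reproduce the proxy code, but as a rough, hypothetical sketch of the pattern it describes (signed URLs generated in a Rails action and cached), using the google-cloud-storage gem; the bucket name, cache window, and redirect behaviour here are assumptions, not Hardcover's actual implementation:

```ruby
# Hypothetical sketch only: a Rails action that hands out short-lived
# signed URLs for objects in a now-private GCS bucket.
require "google/cloud/storage"

class ImagesController < ApplicationController
  BUCKET = "example-assets" # placeholder bucket name

  def show
    path = params[:path] # e.g. "covers/123.jpg" (glob route assumed)

    # Cache the signed URL a little shorter than its validity so we never
    # serve an already-expired link; signing then costs almost nothing per request.
    signed = Rails.cache.fetch("signed-url/#{path}", expires_in: 50.minutes) do
      storage = Google::Cloud::Storage.new
      file = storage.bucket(BUCKET).file(path)
      file&.signed_url(method: "GET", expires: 3600) # valid for one hour
    end

    return head :not_found if signed.nil?
    redirect_to signed, allow_other_host: true # needed on Rails 7+ for external redirects
  end
end
```

The short expiry is what blunts bulk scraping: a harvested URL stops working within the hour, and the bucket itself no longer answers anonymous requests.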
Related
Show HN: Turning my 10y old tablet into digital photo frame
Pankaj Tanwar transformed a 10-year-old tablet into a digital photo frame using a custom nginx server and a bash script to display images from a Google Photos album.
Is Cloudflare overcharging us for their images service?
Jérôme Petazzoni reported unexpectedly high charges for Cloudflare's Images service, exceeding $400 instead of the anticipated $110, due to confusing billing practices. He is considering alternatives like Amazon S3.
How to save $13.27 on your SaaS bill
The author discusses managing costs with Vercel's analytics, converting images to reduce charges, and building a custom API using SQLite. They faced deployment challenges but plan future enhancements.
How HashiCorp evolved its cloud infrastructure
Michael Galloway discusses HashiCorp's cloud infrastructure evolution, emphasizing the need for clear objectives, deadlines, and executive buy-in to successfully redesign and expand their services amid growing demands.
Tell HN: Server error (5xx) in Google Search Console may not be 5xx at all
The website next-episode.net faced indexing issues as Google misreported "429 Too Many Requests" as "5xx" errors. Whitelisting Google Crawlers' IPs resolved the issue, with no new errors reported since.
- Many commenters express confusion about the lack of a CDN in front of the public Google Cloud Storage bucket, suggesting it as a standard practice for security and cost management.
- There are warnings about potential security vulnerabilities associated with using signed URLs without proper validation and checks.
- Several users share personal experiences with similar issues, highlighting the prevalence of bot attacks and the need for effective rate limiting.
- Commenters suggest alternative solutions, such as using different cloud providers or services that may offer lower costs and better performance.
- There is a general consensus on the importance of implementing robust security measures and considering the architecture of cloud solutions to avoid unnecessary expenses.
This post goes over what happened, how we put a solution in place in hours, and how we landed on the route we took.
I'm curious to hear how others have solved this same problem – generating authenticated URLs when you have a public API.
Reading your blog post, I don't fully get how the current signing implementation can halt massive downloads – wouldn't the "attacker"(?) just adapt their methods to fetch the signed URLs first and then download whatever they're after anyway?
On AWS you'd put CloudFront in front of the (now-private) bucket as a CDN, then use WAF for rate limiting, bot control, etc. In my experience GCP's services work similarly to AWS, so...is this not possible with GCP, or why wasn't this the setup from the get-go? That's the proper way to do things IMO.
Signed URLs are something I only think of for things like paid content or other "semi-public" content.
It wasn't unusual, for first-time victims at least, that we'd a) waive the fees and b) schedule a solution architect to talk them through using signed URLs or some other mitigation. I have no visibility into current practice either at AWS or GCP but I'd encourage OP to seek billing relief nevertheless, it can't hurt to ask. Sustainable customer growth is the public cloud business model, of which billing surprises are the antithesis.
In all seriousness, the devil is in the details around this kind of stuff, but I do worry that doing something not even clever, but just nonstandard, introduces a larger maintenance effort than necessary.
Interesting problem, and an interesting solution, but I'd probably rather just throw money at it until it gets to a scale that merits further bot prevention measures.
I ended up using the exact same code for sharding, and later to move to a static site with Azure Storage (which lets me use SAS tokens for timed expiry if I want to).
I had a look at the site - why does this need to run on a major cloud provider at all? Why use VERY expensive cloud storage at 9 cents per gigabyte? Why use very expensive image conversion at $50/month when you can run sharp on a Linux server?
I shouldn't be surprised - the world is all in on very expensive cloud computing.
There's another way though, assuming you are running something fairly "normal" (whatever that means) - run your own Linux servers and serve data from those machines. Use Cloudflare R2 to serve your files - it's free. You probably don't need most of your fancy architecture - run a fast server on Ionos or Hetzner or something and stop angsting about budget alerts from Google for things that should be free and running on your own computers - simple, straightforward, and without IAM spaghetti and all that garbage.
EDIT: I just had a look at the architecture diagram - this is overarchitected. This is a single-server application that has almost no architecture - Caddy as a web server, a local queue, images served from R2 - it should be running on a single machine on a host that charges nothing or a trivial amount for data.
Does anybody here have a success story where AWS was either much cheaper to operate or to develop for (ideally both) than the normal alternatives?
Every important service always eventually gets rate limiting. The more of it you have, the more problems you can solve. Put in the rate limits you think you need (based on performance testing) and only raise them when you need to. It's one of those features nobody adds until it's too late. If you're designing a system from scratch, add rate limiting early on. (you'll want to control the limit per session/identity, as well as in bulk)
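For a Rails app like the one in the post, one low-effort way to get those limits in early is a middleware throttle. A hypothetical sketch with the rack-attack gem follows; the paths, limits, and cookie name are placeholders to tune against your own performance testing:

```ruby
# config/initializers/rack_attack.rb
# Hypothetical per-IP and per-identity throttles (values are placeholders).
class Rack::Attack
  # Bulk limit: at most 300 asset requests per IP per minute.
  throttle("assets/ip", limit: 300, period: 60) do |req|
    req.ip if req.path.start_with?("/images")
  end

  # Tighter per-session limit when an identity cookie is present.
  throttle("assets/session", limit: 60, period: 60) do |req|
    req.cookies["session_id"] if req.path.start_with?("/images")
  end
end
# Recent rack-attack versions insert themselves into the middleware stack;
# otherwise add `config.middleware.use Rack::Attack` in application.rb.
```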
For everything high-traffic and/or concurrency-related, my go-to solution is dedicated sockets. Sockets are inherently session-oriented, which makes everything related to security and routing simpler. If there is something about a request you don't like, just destroy the socket. If you believe there is a DoS flood attack, keep the socket open and discard its messaging. If there are too many simultaneous sockets, jitter traffic processing via the load balancer as resources become available.
You can roll/host your own anything. Except CDN, if you care about uptime.
Google's documentation is inconsistent, but you do not need to make your bucket public, you can instead grant read access only to Cloud CDN: https://cloud.google.com/cdn/docs/using-signed-cookies#confi...
Dangerously incorrect documentation claiming the bucket must be public: https://cloud.google.com/cdn/docs/setting-up-cdn-with-bucket...
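If memory serves, the private-bucket setup boils down to granting Cloud CDN's cache-fill service account read access on the bucket instead of allUsers; something along these lines, where the project number and bucket are placeholders and the exact service-account format should be checked against the first doc linked above:

```sh
# Assumed approach (verify against the Cloud CDN docs): let Cloud CDN fill
# its cache from a private bucket rather than making the bucket public.
gsutil iam ch \
  serviceAccount:service-PROJECT_NUMBER@cloud-cdn-fill.iam.gserviceaccount.com:roles/storage.objectViewer \
  gs://YOUR_BUCKET
```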
Bots these days are out of control and have lost their minds!
Back in the old days when everyone operated their own server, another thing you could do was just set up per-IP traffic throttling with iptables (`-m recent` or `-m hashlimit`). Just something to consider in case one day you grow tired of Google Cloud Storage too ;)
Edit: I see this is discussed in other threads.
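A rough sketch of that iptables approach, with thresholds and chain placement as assumptions to adapt to your own traffic:

```sh
# Hypothetical per-IP throttle with hashlimit: drop new HTTPS connections
# from any single source exceeding ~50/second (burst of 100).
iptables -A INPUT -p tcp --dport 443 -m conntrack --ctstate NEW \
  -m hashlimit --hashlimit-name https-per-ip --hashlimit-mode srcip \
  --hashlimit-above 50/second --hashlimit-burst 100 -j DROP

# Same idea with the recent match: drop sources opening more than
# 20 new connections within 10 seconds.
iptables -A INPUT -p tcp --dport 443 -m conntrack --ctstate NEW \
  -m recent --name HTTPS --set
iptables -A INPUT -p tcp --dport 443 -m conntrack --ctstate NEW \
  -m recent --name HTTPS --update --seconds 10 --hitcount 20 -j DROP
```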
Wouldn't this be solved by using Cloudflare R2 though?
For example, the value of session ID cookies should actually be signed with an HMAC and checked at the edge by the CDN. Session cookies that represent an authenticated session should also look different from unauthenticated ones. The checks should all happen at the edge, at your reverse proxy, without doing any I/O or calling your "fastcgi" process manager.
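As a small illustration of the signing half of that idea (not from the linked write-up; the secret source and cookie layout are assumptions):

```ruby
# Hypothetical HMAC-signed session cookie: the value carries the session id
# plus a signature the edge can verify without any I/O or backend call.
require "openssl"

SECRET = ENV.fetch("SESSION_SIGNING_SECRET") # also configured at the edge

def sign_session(session_id)
  sig = OpenSSL::HMAC.hexdigest("SHA256", SECRET, session_id)
  "#{session_id}--#{sig}"
end

def verify_session(cookie_value)
  session_id, sig = cookie_value.to_s.split("--", 2)
  return nil unless session_id && sig
  expected = OpenSSL::HMAC.hexdigest("SHA256", SECRET, session_id)
  OpenSSL.secure_compare(expected, sig) ? session_id : nil
end
```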
But let's get to the juicy part... hosting files. Ideally, you shouldn't have "secret URLs" for files, because then they can be shared and even (gasp) hotlinked from websites. Instead, you should use features like X-Accel-Redirect in NGINX to let your app server determine access to these gated resources. Apache has similar things.
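A bare-bones version of the X-Accel-Redirect pattern, with paths and ports made up for illustration:

```nginx
# The /protected/ prefix is only reachable via X-Accel-Redirect from the
# app server, never directly by clients.
location /protected/ {
    internal;
    alias /var/www/files/;
}

location /download/ {
    # The app checks authorization, then responds with a header like
    # "X-Accel-Redirect: /protected/report.pdf" and nginx serves the file.
    proxy_pass http://127.0.0.1:3000;
}
```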
Anyway, here is a write-up which goes into much more detail: https://community.qbix.com/t/files-and-storage/286
I am sorry, but who sees a sudden $100 charge, assumes misconfiguration, and just goes about their day without digging deeper right away?