August 30th, 2024

I'm blocking connections from AWS to my on-prem services

The article discusses the balkanization of the internet due to large cloud providers, emphasizing commercialization, isolated ecosystems, and the need for accountability and improved management practices to prevent fragmentation.


The article discusses the emergence of a balkanized internet, largely influenced by the dominance of large cloud providers. It traces the history of the internet from its inception, highlighting the transition from a government-controlled network to a commercialized one. The author notes that the commercialization led to a model based on advertising and the sale of user data, which has been exacerbated by the rise of AI technologies that utilize vast amounts of user-generated content. The piece critiques the current state of cloud services, suggesting that they create isolated ecosystems that view the broader internet as an external resource. The author shares personal experiences of blocking AWS access to their servers as a response to excessive traffic and abuse, emphasizing the need for better accountability and transparency from large cloud providers. The article concludes with a call for improved practices in managing internet resources to prevent further fragmentation and to ensure that users are aware of the implications of relying on these dominant services.

- The rise of large cloud providers is contributing to a fragmented, balkanized internet.

- The commercialization of the internet has shifted focus to advertising and user data monetization.

- Cloud services create isolated ecosystems that limit interaction with the broader internet.

- Personal experiences highlight the need for better accountability from cloud providers.

- Improved management practices are necessary to prevent further fragmentation of internet resources.

22 comments
By @ThePhysicist - 5 months
It's crazy how fast people start attacking your infrastructure on the Internet these days. I recently started announcing one of my /23 subnets (512 addresses) over BGP for an anycast setup. Once the route was announced and traffic was flowing to the router, tcpdump blew up with port scan activity on all IPs of the range. Of course that doesn't have anything to do with the route itself; it's just that tons of people seem to indiscriminately and continuously scan for open ports on all IP ranges (my range had been unannounced for many years before, so it wasn't on someone's list of active servers).

I find it shocking that people still expose internal web services (e.g. GitLab) openly to the Internet. In my opinion you should at least have one additional layer of protection through a VPN or similar mechanism so that your services aren't discoverable from the public Internet.

I only expose SSH from a single bastion host, which is the only host that's publicly reachable, something that I'd like to get rid of in the future as well by adding a VPN layer on top.

By @immibis - 5 months
That's a shame, really. You are risking that none of your stuff will show up on search engines, not even Marginalia. You are risking that none of your stuff will be saved in the Wayback Machine. Maybe you want that, in which case you should block all the clouds and data centers, just to be sure. You might even be blocking your site from some small ISPs based on where they run their CGNAT gateway (I doubt this, but it's possible).

As far as I noticed, ping with a spoofed source address is the only actual abuse mentioned in the article. It should go without saying that you can't tell if a spoofed ping packet came from AWS, because the source address is the address the spoofer wants you to send a reply to, not the spoofer's address. And a much less invasive mitigation would be rate-limiting pings to, say, 10 per second.

While the Internet is becoming balkanized this is mostly because of social media siloing itself to generate advertising and data revenue and to extract profit from AI training data (e.g. the Reddit/Google exclusivity deal) rather than because of providers blocking IP ranges.

I certainly don't understand the rational mindset behind blocking certain providers over some pings and then complaining about IP connectivity becoming balkanized. The balkanization is caused by the ones doing the blocking.

By @arcza - 5 months
I've been blocking Hetzner, Digital Ocean, Linode, OVH and Contabo for a while. You can do this with pfBlockerNG by blocking ASNs, or with UFW rules (https://blog.abctaylor.com/ufw-and-firewalld-rules-to-block-...)
By @bdcravens - 5 months
In considering this, my first thought was that it would block bona fide desktops running in AWS, especially for service offerings like Amazon Workspaces. However, it looks like the IP space for such services is publicly documented if the need arises to specifically whitelist those IPs.

All that said, it's trivial to use proxies or VPNs to bypass any blocks.

By @rkagerer - 5 months
I did this a long time ago after dozens of Amazon servers started slowing down our on-prem servers. I think it might have been some kind of attempted SSO stuff but never did entirely track it down. Just wrote a script to periodically download a list of their IP ranges and block 'em all, and moved on.
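
The commenter's script is not shown. A minimal sketch of the "download the ranges" half might look like the following, assuming the list comes from AWS's published `ip-ranges.json` feed; the function names `extract_prefixes` and `fetch_ranges` are hypothetical, and feeding the result into a firewall is left out.

```python
import ipaddress
import json
import urllib.request

# AWS publishes its address space as JSON at this URL.
AWS_RANGES_URL = "https://ip-ranges.amazonaws.com/ip-ranges.json"


def extract_prefixes(doc):
    """Return (ipv4, ipv6) lists of unique, sorted networks.

    `doc` is the parsed ip-ranges.json document. The same prefix can be
    listed once per service, so duplicates are dropped with a set.
    """
    v4 = {ipaddress.ip_network(p["ip_prefix"]) for p in doc.get("prefixes", [])}
    v6 = {ipaddress.ip_network(p["ipv6_prefix"]) for p in doc.get("ipv6_prefixes", [])}
    return sorted(v4), sorted(v6)


def fetch_ranges(url=AWS_RANGES_URL, timeout=30):
    """Download and parse the range document (performs a network request)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Running `extract_prefixes(fetch_ranges())` from a cron job or timer and handing the networks to whatever firewall is in use would approximate the periodic refresh described above.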
By @UI_at_80x24 - 5 months
I understand the appeal. If I wanted to wall off large swaths of the internet from what I create I would do the same thing. It's not much different than blocking entire countries.

The desire to limit the noise and only allow a "small circle of friends" is also appealing.

But I do that for specific services, not my domains in general. Mumble server: only open to the 3-4 countries that my friends are in, and none of the 'cloud providers'. Tech blog: world+dog can see it.

I am firmly in the 'We all benefit from shared knowledge' camp. So if my notes on modem init strings for my 300-baud C64 modem can help one other person; they won't go through the same pain I went through, and the world will be a better place.

I get the desire, for many reasons. That's cool. You do you.

By @User23 - 5 months
This makes me miss the Internet. It’s really hard to explain how wonderful it was pre-commercialization. The sour pleasure of having been exactly right about how that would turn out isn’t nearly satisfactory compensation for the loss.
By @hello_computer - 5 months
AWS is a boy scout compared to places like DigitalOcean, OVHcloud, ColoCrossing, Scaleway, Tencent, or even Google. I think DigitalOcean, in particular, has made a terrible mistake marketing to the “cybersecurity” community.
By @cmeacham98 - 5 months
Bias disclaimer: AWS is my current employer.

Maybe I'm missing something obvious, but if the author believes the ping traffic is being spoofed, how could they know AWS is the source?

By @knallfrosch - 5 months
Where's the summary? I didn't quite get what the problem is that you're trying to solve.

Data scraping? DDOS attacks? Bandwidth trouble? Security?

By @nullc - 5 months
I've found that blocking China is also a pretty good improvement on abusive traffic relative to disruption.
By @jeroenhd - 5 months
I should probably do this for most of my stuff. I run some servers that require cloud-to-cloud networking, but the only inbound stuff I see coming from cloud services is bots, scanners, and scrapers. I've had to block off China's largest ISP because some broken scraper kept re-downloading the same image assets for no reason, and kept popping up on other subnets.

I don't think anyone will miss my stuff if they're part of the small minority of people accessing the internet through a VPN hosted in large data centres.

The biggest challenge for implementing this will probably be figuring out how to block inbound connections but keep outbound connections working. I'm sure there's a good nftables rule I can come up with eventually.
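
One common way to get that behaviour is connection tracking: accept packets that belong to established or related connections first, then drop everything else from the blocked ranges. The sketch below renders such an nftables ruleset as text from a list of CIDRs. The function name, the table name `cloudblock`, and the set names are hypothetical, and the `input` hook only covers traffic addressed to the host itself, so a router would need an equivalent `forward` chain.

```python
import ipaddress


def render_nft_ruleset(cidrs, table="cloudblock"):
    """Render an nftables table that drops new inbound connections from
    `cidrs` while letting replies to outbound connections through."""
    nets = [ipaddress.ip_network(c) for c in cidrs]
    lines = [f"table inet {table} {{"]
    rules = []
    for version, set_name, addr_type, match in (
        (4, "blocked_v4", "ipv4_addr", "ip saddr"),
        (6, "blocked_v6", "ipv6_addr", "ip6 saddr"),
    ):
        # Interval sets must not contain overlapping ranges, so merge first.
        merged = list(
            ipaddress.collapse_addresses(n for n in nets if n.version == version)
        )
        if not merged:
            continue  # an empty "elements = { }" would be a syntax error
        elements = ", ".join(str(n) for n in merged)
        lines += [
            f"    set {set_name} {{",
            f"        type {addr_type}",
            "        flags interval",
            f"        elements = {{ {elements} }}",
            "    }",
        ]
        rules.append(f"        {match} @{set_name} drop")
    lines += [
        "    chain input {",
        "        type filter hook input priority 0; policy accept;",
        # Replies to connections this host opened are accepted before the drops.
        "        ct state established,related accept",
        *rules,
        "    }",
        "}",
    ]
    return "\n".join(lines) + "\n"
```

The output can be written to a file and loaded with `nft -f`. Rule order matters here: the `ct state` rule has to come before the `drop` rules, otherwise replies from the blocked ranges would be discarded as well.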

By @aorth - 5 months
It has really gotten terrible. Between DDoS and bots from massive tech companies, it seems like I have several events a year where thousands or tens of thousands of IPs from a single datacenter (Singapore!) are making requests to some of my infrastructure concurrently. What can we do?

I opted for CIDR aggregation and rate limiting of data center ISPs in nginx for one of my frontends. There are reasonable limits for normal IPs too. Not all of us have the capacity or desire to scale.
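
The nginx configuration itself is not shown in the comment. As an illustration of the idea only, here is a sketch in Python: data center CIDRs are aggregated into the fewest covering networks, each aggregated network shares one strict token bucket, and every other client IP gets its own more generous bucket. The class names and all rate and burst values are assumptions, not the commenter's settings.

```python
import ipaddress
import time


def aggregate(cidrs):
    """Collapse overlapping or adjacent CIDRs into the fewest networks."""
    nets = [ipaddress.ip_network(c) for c in cidrs]
    merged = []
    for version in (4, 6):  # collapse_addresses rejects mixed IP versions
        merged.extend(
            ipaddress.collapse_addresses(n for n in nets if n.version == version)
        )
    return merged


class TokenBucket:
    """Allows `rate` requests per second with bursts of up to `burst`."""

    def __init__(self, rate, burst, now):
        self.rate, self.burst = rate, burst
        self.tokens, self.stamp = float(burst), now

    def allow(self, now):
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.stamp) * self.rate)
        self.stamp = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class TieredLimiter:
    """One shared bucket per data center network, one bucket per other IP."""

    def __init__(self, datacenter_cidrs, dc_rate=1.0, dc_burst=5,
                 ip_rate=10.0, ip_burst=20):
        self.networks = aggregate(datacenter_cidrs)
        self.dc_limits = (dc_rate, dc_burst)
        self.ip_limits = (ip_rate, ip_burst)
        self.buckets = {}

    def _classify(self, addr):
        ip = ipaddress.ip_address(addr)
        for net in self.networks:
            if ip.version == net.version and ip in net:
                return net, self.dc_limits
        return ip, self.ip_limits

    def allow(self, addr, now=None):
        now = time.monotonic() if now is None else now
        key, (rate, burst) = self._classify(addr)
        bucket = self.buckets.get(key)
        if bucket is None:
            bucket = self.buckets[key] = TokenBucket(rate, burst, now)
        return bucket.allow(now)
```

Keying the bucket on the aggregated network rather than on the individual address is what makes this effective against thousands of IPs from one data center: they all draw from the same allowance. The linear scan in `_classify` is fine for a sketch but would want a prefix tree for large lists.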

By @perching_aix - 5 months
https is broken for the site
By @ruthmarx - 5 months
The cynic in me thinks this won't accomplish much. They can/do just buy data from other companies that scrape or some subsidiary.

This isn't a technical problem, it's a legal/social problem.

By @theideaofcoffee - 5 months
Yawn. Old man yells at cloud (literally). So he's taking his little netblock ball and going home because of some failed purity tests: bad or nonexistent PTRs, excessive ICMP, oh my! The gentleman's agreements that held together the early internet and web and the unwritten practices like that are long gone, get with the times, there ain't any going back to how it was. Otherwise, feel free to disconnect entirely if you don't want to deal with the new reality.

I'm going to go out on a limb and guess that all of this traffic isn't related directly to AWS, but to its customers. You can set PTRs for your allocated elastic IPs with a request to support. But then again nobody is going to do it because... it doesn't matter. It may have mattered when you were hosting with a block that you actually truly owned, before the ICANN times, but no more. No one cares. Everything is ephemeral, so why should the reverse matter when things get cycled through addresses multiple times per day? If you're seeing excessive anything, then it's probably time to reach out to the abuse contact published in the whois. Let me help you with that:

   OrgAbuseHandle: AEA8-ARIN
   OrgAbuseName:   Amazon EC2 Abuse
   OrgAbuseEmail:  trustandsafety@support.aws.com
   OrgAbuseRef:    https://rdap.arin.net/registry/entity/AEA8-ARIN

   Comment:        All abuse reports MUST include:
   Comment:        * src IP
   Comment:        * dest IP (your IP)
   Comment:        * dest port
   Comment:        * Accurate date/timestamp and timezone of activity
   Comment:        * Intensity/frequency (short log extracts)
   Comment:        * Your contact details (phone and email) Without these we will be unable to identify the correct owner of the IP address at that point in time.

Use modern features built in to modern versions of common packages and products: rate limiting, redirects, filters, and on and on. If you're just blocking to block to make some sort of statement into the void, you're just hastening that balkanization.
By @xcdzvyn - 5 months
I'm slightly ashamed to admit I don't know what RPZ, PTR, or SPF are, nor do I understand the asides about AI or reverse DNS.

What precisely is his problem with Amazon?

By @tonetegeatinst - 5 months
Firefox on mobile, using dark mode. That font actually hurts my eyes, and I straight-up closed the site. Maybe I'll write a script to scrape the text into a terminal output so I can read it without feeling like I'm having a headache. Probably just output the text via a terminal window.