Daily Usenet Feed Size Hits 300TB
The daily Usenet newsgroup feed has grown significantly, from 27.80 TiB in January 2017 to 300 TiB in March 2024, reflecting a rising volume of shared content. The figures come from NewsDemon, a Usenet provider that emphasizes premium access to newsgroups.
The daily Usenet newsgroup feed size has grown substantially over the years, rising from 27.80 TiB in January 2017 to 300 TiB in March 2024. The figures, published by Usenet provider NewsDemon, point to a steadily increasing volume of content posted to Usenet newsgroups each day. NewsDemon positions itself as a premium Usenet provider, offering its members access to a wide range of newsgroups and content.
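For context, here is a rough back-of-envelope sketch (mine, not from the article) of the annual growth rate implied by those two data points, in Python:

    # Implied compound annual growth rate of the daily feed, using the two
    # data points cited above: 27.80 TiB/day in Jan 2017, 300 TiB/day in Mar 2024.
    start_tib = 27.80        # daily feed size, January 2017
    end_tib = 300.0          # daily feed size, March 2024
    years = 7 + 2 / 12       # January 2017 to March 2024, roughly 7.2 years

    growth_factor = end_tib / start_tib      # ~10.8x overall
    cagr = growth_factor ** (1 / years) - 1  # ~39% per year

    print(f"Overall growth: {growth_factor:.1f}x")
    print(f"Implied annual growth rate: {cagr:.1%}")

In other words, the daily feed has roughly doubled every two years over that period.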
Related
Size of Wikipedia
English Wikipedia has 6,841,238 articles with 4.5 billion words, 60,922,102 pages, and a growth rate of 14,000 articles monthly. Word count per article averages 668, showing consistent growth.
Combine multiple RSS feeds into a single feed, as a service
The GitHub URL provides details on "RSS Combine," a tool merging multiple RSS feeds. It guides users on local setup, configuration via YAML or environment variables, and generating a static RSS file in S3. Simplifies feed consolidation.
A Large-Scale Structured Database of a Century of Historical News
A large-scale database named Newswire was created, containing 2.7 million U.S. newswire articles from 1878 to 1977. Reconstructed using deep learning, it aids research in language modeling, computational linguistics, and social sciences.
How I scraped 6 years of Reddit posts in JSON
The article covers scraping 6 years of Reddit posts for self-promotion data, highlighting challenges like post limits and cutoffs. Pushshift is suggested for Reddit archives. Extracting URLs and checking website status are explained. Findings reveal 40% of sites inactive. Trends in online startups are discussed.
X userbase 'grew 1.6%' since Musk's $44B takeover
X, previously Twitter, saw a 1.6% user increase post Elon Musk's acquisition, reaching 251 million daily users. Growth slowed compared to prior years, facing challenges like monetization and competition from Meta and Mastodon.
I use both types. Text groups almost certainly fell off a cliff earlier this year when Google Groups shut off their spam posting gateway. http://www.eternal-september.org/ is a good free project for the text groups. I've been a newsdemon customer for years but they suck for text groups (their headers got messed up a few years back after a move from highwinds backbone to usenetexpress backbone).
It seems so fundamentally ill-suited to the task.
And if the answer has something to do with privacy or warnings from your ISP, it seems like VPNs would be the answer.
What am I missing?
I'd be interested to see WHY this is the case. Is it attributable to a larger share of data that cannot be compressed vs more compressible data (e.g., Warez/Movies)?
It just seems highly unlikely that this is driven by a growing user base, but without more detail than this data table, I am at a loss for the reasons why.
“This program posts news to thousands of machines throughout the entire civilized world. Your message will cost the net hundreds if not thousands of dollars to send everywhere. Please be sure you know what you are doing.”
That's pretty insane lol. How many mirrors are there that can actually manage that much storage? If you're using 20TB disks that's ~5500 disks per year with zero redundancy. Double or triple that for a bare minimum... not counting the load of actually serving that data to everybody.
How is this economical for anybody at this point? Or are these Usenet mirrors all massive businesses that can support running hundreds of PBs of storage, and I'm just naive?
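A quick sketch of that back-of-envelope math (assuming a flat 300 TB/day feed, 20 TB drives, one year of retention, and no deduplication; real providers retain far longer):

    # Back-of-envelope storage math for a full-feed Usenet mirror.
    # Assumptions: flat 300 TB/day intake, 20 TB drives, one year of
    # retention, no compression or deduplication.
    feed_tb_per_day = 300
    drive_tb = 20
    days = 365

    raw_tb_per_year = feed_tb_per_day * days           # 109,500 TB/year
    drives_no_redundancy = raw_tb_per_year / drive_tb   # ~5,475 drives
    drives_triple = drives_no_redundancy * 3            # ~16,400 with 3x redundancy

    print(f"Raw intake per year: {raw_tb_per_year:,} TB (~{raw_tb_per_year / 1000:.0f} PB)")
    print(f"Drives, no redundancy: {drives_no_redundancy:,.0f}")
    print(f"Drives, 3x redundancy: {drives_triple:,.0f}")

That comes to roughly 110 PB of raw intake per year before any redundancy, which is what makes multi-year retention so expensive.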
A backchannel to download papers from sci-hub?
Or are people using Usenet as a way to send encrypted messages in a way that makes traffic analysis more difficult? (If 50,000 people download everything to a group, and post encrypted or steganographic message to that group, then it's easier than seeing that X sent an email blob to Y.)
Or, Usenet as the new numbers station?
Is there really even that much media produced every day? Or that much that actually gets uploaded?