Daily Usenet Feed Size Hits 300TB
The daily Usenet newsgroup feed has grown significantly, from 27.80 TiB in January 2017 to 300 TiB in March 2024, reflecting a rising volume of shared content. The figures come from NewsDemon, a Usenet provider that emphasizes premium access to newsgroups.
The daily Usenet newsgroup feed size has grown substantially over the years, rising from 27.80 TiB in January 2017 to 300 TiB in March 2024. The figures, published by Usenet provider NewsDemon, point to a steadily increasing volume of content posted to Usenet newsgroups each day. NewsDemon positions itself as a premium Usenet provider, offering its members access to a wide range of newsgroups and content.
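For context, here is a rough back-of-envelope sketch (mine, not from the article) of the annual growth rate implied by those two data points, in Python:

    # Implied compound annual growth rate of the daily feed, using the two
    # data points cited above: 27.80 TiB/day in Jan 2017, 300 TiB/day in Mar 2024.
    start_tib = 27.80        # daily feed size, January 2017
    end_tib = 300.0          # daily feed size, March 2024
    years = 7 + 2 / 12       # January 2017 to March 2024, roughly 7.2 years

    growth_factor = end_tib / start_tib      # ~10.8x overall
    cagr = growth_factor ** (1 / years) - 1  # ~39% per year

    print(f"Overall growth: {growth_factor:.1f}x")
    print(f"Implied annual growth rate: {cagr:.1%}")

In other words, the daily feed has roughly doubled every two years over that period.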
Related
Size of Wikipedia
English Wikipedia has 6,841,238 articles with 4.5 billion words, 60,922,102 pages, and a growth rate of 14,000 articles monthly. Word count per article averages 668, showing consistent growth.
Combine multiple RSS feeds into a single feed, as a service
The GitHub URL provides details on "RSS Combine," a tool merging multiple RSS feeds. It guides users on local setup, configuration via YAML or environment variables, and generating a static RSS file in S3. Simplifies feed consolidation.
A Large-Scale Structured Database of a Century of Historical News
A large-scale database named Newswire was created, containing 2.7 million U.S. newswire articles from 1878 to 1977. Reconstructed using deep learning, it aids research in language modeling, computational linguistics, and social sciences.
How I scraped 6 years of Reddit posts in JSON
The article covers scraping 6 years of Reddit posts for self-promotion data, highlighting challenges like post limits and cutoffs. Pushshift is suggested for Reddit archives. Extracting URLs and checking website status are explained. Findings reveal 40% of sites inactive. Trends in online startups are discussed.
X userbase 'grew 1.6%' since Musk's $44B takeover
X, previously Twitter, saw a 1.6% user increase post Elon Musk's acquisition, reaching 251 million daily users. Growth slowed compared to prior years, facing challenges like monetization and competition from Meta and Mastodon.
I use both types. Text groups almost certainly fell off a cliff earlier this year when Google Groups shut off their spam posting gateway. http://www.eternal-september.org/ is a good free project for the text groups. I've been a newsdemon customer for years but they suck for text groups (their headers got messed up a few years back after a move from highwinds backbone to usenetexpress backbone).
It seems so fundamentally ill-suited to the task.
And if the answer has something to do with privacy or warnings from your ISP, it seems like VPNs would be the answer.
What am I missing?
I'd be interested to see WHY this is the case. Is it attributable to a larger share of data that cannot be compressed vs more compressible data (e.g., Warez/Movies)?
It just seems highly unlikely that this is driven by a growing user base, but without more detail than this data table, I am at a loss for the reasons why.
“This program posts news to thousands of machines throughout the entire civilized world. Your message will cost the net hundreds if not thousands of dollars to send everywhere. Please be sure you know what you are doing.”
That's pretty insane lol. How many mirrors are there that can actually manage that much storage? If you're using 20TB disks that's ~5500 disks per year with zero redundancy. Double or triple that for a bare minimum... not counting the load of actually serving that data to everybody.
How is this economical for anybody at this point? Or are these Usenet mirrors all massive businesses that can support running hundreds of PBs of storage, and I'm just naive?
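A quick sketch of that back-of-envelope math (assuming a flat 300 TB/day feed, 20 TB drives, one year of retention, and no deduplication; real providers retain far longer):

    # Back-of-envelope storage math for a full-feed Usenet mirror.
    # Assumptions: flat 300 TB/day intake, 20 TB drives, one year of
    # retention, no compression or deduplication.
    feed_tb_per_day = 300
    drive_tb = 20
    days = 365

    raw_tb_per_year = feed_tb_per_day * days           # 109,500 TB/year
    drives_no_redundancy = raw_tb_per_year / drive_tb   # ~5,475 drives
    drives_triple = drives_no_redundancy * 3            # ~16,400 with 3x redundancy

    print(f"Raw intake per year: {raw_tb_per_year:,} TB (~{raw_tb_per_year / 1000:.0f} PB)")
    print(f"Drives, no redundancy: {drives_no_redundancy:,.0f}")
    print(f"Drives, 3x redundancy: {drives_triple:,.0f}")

That comes to roughly 110 PB of raw intake per year before any redundancy, which is what makes multi-year retention so expensive.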
A backchannel to download papers from sci-hub?
Or are people using Usenet as a way to send encrypted messages in a way that makes traffic analysis more difficult? (If 50,000 people download everything to a group, and post encrypted or steganographic message to that group, then it's easier than seeing that X sent an email blob to Y.)
Or, Usenet as the new numbers station?
Is there really even that much media produced every day? Or that much that actually gets uploaded?