August 3rd, 2024

Two months of feed reader behavior analysis

An analysis of feed reader behavior revealed significant request handling patterns, with some applications like Netvibes and NextCloud-News facing caching issues, while others like Miniflux performed better.

Over the past two months, an analysis of feed reader behavior has revealed significant patterns in request handling among various applications. The project, initiated at the end of May, logged over 70,000 requests from 97 unique keys. Notably, some applications exhibited poor caching and polling behaviors. For instance, Netvibes faced caching errors, while SpaceCowboys Android RSS Reader displayed erratic polling intervals. NextCloud-News and Feedbin were identified as having severe caching issues, leading to frequent unconditional requests. In contrast, some applications like Miniflux and rawdog demonstrated proper caching and pacing.

Several feed readers, including NetNewsWire and FreshRSS, consistently sent unconditional requests and polled at erratic intervals, raising concerns about their efficiency. Others, like Yarr and bdrss, maintained better request patterns, although timing issues persisted. The analysis highlighted how much proper configuration and working caching affect the load a reader places on a server. Overall, while some applications showed improvement, many still struggled with fundamental issues, so users should choose their feed readers with these behaviors in mind. Because behavior varies so widely between readers, it needs ongoing monitoring and adjustment.

Related

How to waste bandwidth, battery power, and annoy sysadmins

Excessive requests from web browsers waste bandwidth and drain battery power. Firefox for iOS's flawed link requests, especially for favicons, put needless load on servers. The article cautions users about these effects and argues for more efficient browsing behavior.

How I scraped 6 years of Reddit posts in JSON

The article covers scraping 6 years of Reddit posts for self-promotion data, highlighting challenges like post limits and cutoffs. Pushshift is suggested for Reddit archives. Extracting URLs and checking website status are explained. Findings reveal 40% of sites inactive. Trends in online startups are discussed.

Data Fetching for Single-Page Apps

Efficient data fetching in single-page applications is crucial for responsiveness. Key patterns include Asynchronous State Handler, Parallel Data Fetching, Fallback Markup, Code Splitting, and Prefetching. Optimizing data fetching enhances application performance universally.

90% of performance is data access patterns

A recent analysis revealed that 90% of application performance issues arise from data access patterns. A platform team improved performance by eliminating redundant API requests, which significantly reduced daily calls and improved latency.

NetNewsWire and Conditional Get Issues

Brent Simmons addresses bugs in NetNewsWire's conditional GET support, revealing issues with feed data processing. He suggests improved logic for updates and stresses the need for further testing to ensure reliability.

11 comments

By @quectophoton - 9 months

How many of the "polling issues" can be explained by clients doing heuristic caching[1][2] due to a lack of a `Cache-Control` header in the responses?

If the feed was last updated (for example) 1 hour ago, and the response lacks a `Cache-Control` header, in that case the response would be cached for 6 minutes (10% of 1 hour; the 10% is mentioned in RFC 9111 section 4.2.2).

[1]: https://developer.mozilla.org/en-US/docs/Web/HTTP/Caching#he...

[2]: https://datatracker.ietf.org/doc/html/rfc9111#name-calculati...
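The 10% arithmetic in this comment can be checked with a minimal sketch. It assumes the client derives a heuristic freshness lifetime from the `Date` and `Last-Modified` response headers, and the function name `heuristic_freshness` is hypothetical. It illustrates the calculation only; it does not describe what any particular feed reader actually does.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime


def heuristic_freshness(date_header: str, last_modified_header: str,
                        fraction: float = 0.10) -> timedelta:
    """Estimate how long a response may be reused without revalidation.

    Assumes no explicit freshness information (Cache-Control / Expires)
    was sent, so the lifetime is a fraction of the time since the
    resource was last modified. The 10% default is the typical value
    mentioned in RFC 9111 section 4.2.2.
    """
    date = parsedate_to_datetime(date_header)
    last_modified = parsedate_to_datetime(last_modified_header)
    # Guard against a Last-Modified value that is later than Date.
    age_at_response = max(date - last_modified, timedelta(0))
    return age_at_response * fraction


# Feed last updated one hour before the response was generated.
lifetime = heuristic_freshness(
    "Sat, 03 Aug 2024 12:00:00 GMT",
    "Sat, 03 Aug 2024 11:00:00 GMT",
)
print(lifetime.total_seconds() / 60, "minutes")
```

Under these assumptions a feed that changed an hour ago would be treated as fresh for about 6 minutes, which matches the figure in the comment.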

By @kkfx - 9 months

Personally I chose tt-RSS for one reason: how quickly a human can go through posts. I can have all my feeds arranged with just one line per entry, mark them as read by scrolling, and open one in a side panel without visually disturbing the current post and thread list. On a desktop with a 21:9 screen it's a perfect match. I've used Miniflux before, but it's slower for skimming posts, opening one, coming back to the compact list, etc.

Before that, a long time ago, I tried a few desktop readers, and my final choice was a Java/SWT one, RSSOwl, heavy but effective enough compared to the others. I also used elfeed (Emacs) for a while, but it's too slow for reading feeds for me: Emacs offers a UI for focused reading, while for a large volume of low-importance posts I need something that allows a very quick pass, even if something gets lost.

One thing I miss in ALL the feed readers I've tried so far is real-world filtering, like fuzzy-matching titles so that only one entry per news story is shown (say you follow many newspapers and nearly all of them report the same earthquake somewhere; there is no point in seeing 12 posts about the same event), ideally with a button to show all the matches if I want to go through them. Another is historical analysis: say there are wildfires around the world every year; I'd like to see, from the news I've read, whether they are almost the same as last year or more, whether they start appearing earlier and last longer, etc. It's still fuzzy keyword matching, nothing very hard, but still absent everywhere. I imagine I'm one of the very few interested in this kind of automation to use feeds as a personal aggregator. Gnus with scoring offers something similar, but it's too slow to really skim things, and easy to break as well.
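The title deduplication described above could be sketched roughly as follows. This is a minimal illustration, not a feature of any existing reader: it uses `difflib.SequenceMatcher` from the Python standard library, the function names and the 0.8 similarity threshold are assumptions chosen for the example, and the normalization is ASCII-only.

```python
import re
from difflib import SequenceMatcher


def normalize(title: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace (ASCII-only)."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", title.lower())
    return " ".join(cleaned.split())


def group_similar_titles(titles, threshold=0.8):
    """Greedily group titles whose normalized text is similar enough.

    Each title is compared against the first title of every existing
    group; it joins the first group that meets the threshold, otherwise
    it starts a new group. The threshold is an assumed value.
    """
    groups = []
    for title in titles:
        key = normalize(title)
        for group in groups:
            representative = normalize(group[0])
            if SequenceMatcher(None, key, representative).ratio() >= threshold:
                group.append(title)
                break
        else:
            groups.append([title])
    return groups


headlines = [
    "Earthquake strikes coastal region",
    "Earthquake strikes the coastal region!",
    "Local bakery wins award",
]
for group in group_similar_titles(headlines):
    # Show one entry per story, with a count of the collapsed duplicates.
    print(group[0], f"(+{len(group) - 1} similar)")
```

A reader could display only the first title of each group and keep the rest behind a "show all" control, as the comment suggests. Real headlines about the same event are often worded far more differently than this, so plain character-level matching would likely need tuning or a different similarity measure.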

By @trekz - 9 months

Not surprised to see Reeder in there. It’s a great app for Apple users. But that app can bring a website to its knees with how aggressive it is.

I can see in my logs that it constantly makes over ~20 requests to different RSS feeds on my domain, all in the exact same millisecond. Happens multiple times a day. And it appears to rotate IPs. Scary… Tried reaching out to the developer about it twice, but they never responded.

By @aquova - 9 months

Can someone explain how to parse this article? There's a number of similar entries for each client. I can't tell how to interpret these complaints.

By @jerlam - 9 months

I'm disappointed in Feedly here, as I've been on their platform since Google Reader was shut down - over ten years. And their script is still at 1.0.

Rachel's complaint about Feedly's overzealous polling also contrasts with the experience of author John Scalzi, whose blog was simply getting ignored by Feedly:

https://whatever.scalzi.com/2024/08/02/the-feedly-issue-appa...

By @8organicbits - 9 months

Does anyone have experience with WebSub? It's designed to solve the frequent polling problem while still giving readers immediate access to content.

https://en.m.wikipedia.org/wiki/WebSub

By @JustARandomGuy - 9 months

> Unread RSS Reader. Godawful poll timing. 6103 requests in 52 days is about one poll every 736 seconds _on average_, but they're hugely spread out. WTF? Put it this way: the list of unique intervals (nn seconds, nn minutes, ...) is four pages tall on my web browser.

Not entirely sure what the criticism here is, other than that polling every 12 minutes on average seems a little excessive at best. Why does it matter if the intervals are a bit wonky? I could think of many reasons why: maybe the poll intervals are shorter during the daytime and more spread out overnight to optimize for reading conditions, etc.
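The average quoted above is easy to reproduce, and a small sketch also shows why an average alone says little about pacing. The request count and duration come from the quoted text; the two timestamp lists are made-up illustrations, not data from the analysis.

```python
from statistics import mean, pstdev


def average_poll_interval(request_count: int, days: float) -> float:
    """Mean number of seconds between requests over the whole period."""
    return days * 24 * 60 * 60 / request_count


def poll_intervals(timestamps):
    """Gaps in seconds between consecutive request timestamps."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


interval = average_poll_interval(6103, 52)
print(f"{interval:.0f} s, about {interval / 60:.1f} min")  # 736 s, about 12.3 min

# Hypothetical request times (seconds) with the same mean gap of 736 s.
steady = [0, 736, 1472, 2208, 2944]
erratic = [0, 30, 60, 2900, 2944]
for name, stamps in (("steady", steady), ("erratic", erratic)):
    gaps = poll_intervals(stamps)
    print(name, "mean:", mean(gaps), "spread:", round(pstdev(gaps)))
```

Both hypothetical clients average one request every 736 seconds, but the second one sends bursts seconds apart and then goes quiet, which is the kind of spread the quoted complaint points at.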

By @butz - 9 months

Sad to see that so many feed readers are unable to solve polling problems. It should be in their interest to make the fewest requests possible. Especially the ones hosting multiple tenants that are very likely subscribing to the same feeds.

By @kibwen - 9 months

> rawdog/2.24rc1. Behavior is spot on. More like this, please.

I'd like to see a description of what the proper behavior is in this context. The OP uses terms like timing, pacing, conditionals, and unconditionals in a way that makes me think that these must be well-defined jargon in the context of RSS, but I don't see these in the RSS spec.

By @impure - 9 months

Looks like I need to update my reader to use the If-Modified-Since and If-None-Match headers
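For readers wondering what a "conditional" request looks like in practice, here is a minimal sketch using only the Python standard library. The `CachedFeed` class and both function names are hypothetical, and the logic reflects general HTTP conditional-request behavior rather than the exact criteria used in the analysis: the client stores the `ETag` and `Last-Modified` values from the last successful response and sends them back unchanged on the next poll.

```python
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Optional


@dataclass
class CachedFeed:
    """Validators and body remembered from the last successful fetch."""
    etag: Optional[str] = None
    last_modified: Optional[str] = None
    body: bytes = b""


def conditional_headers(cache: CachedFeed) -> dict:
    """Build request headers that echo the stored validators verbatim."""
    headers = {}
    if cache.etag:
        headers["If-None-Match"] = cache.etag
    if cache.last_modified:
        headers["If-Modified-Since"] = cache.last_modified
    return headers


def fetch_feed(url: str, cache: CachedFeed) -> bool:
    """Fetch the feed; return True if it changed, False on 304 Not Modified."""
    request = urllib.request.Request(url, headers=conditional_headers(cache))
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            cache.body = response.read()
            cache.etag = response.headers.get("ETag")
            cache.last_modified = response.headers.get("Last-Modified")
            return True
    except urllib.error.HTTPError as err:
        if err.code == 304:
            return False  # cached copy is still current; nothing was downloaded
        raise
```

An "unconditional" request, in this sense, is one sent without either header, so the server has to send the whole feed again even when nothing has changed.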

By @Twirrim - 9 months

Disappointing to see Feedly in the misbehaving bucket. It's an online service, you'd have expected them to get this right.
