September 18th, 2024

Twitter shut off API access; users volunteering their own data for an open API

Twitter's API shutdown has led to users sharing data for an open API, resulting in a dataset of one million tweets for developers, emphasizing user consent, privacy, and community-driven innovation.

Twitter's recent decision to shut off API access has led users to voluntarily share their data for the creation of an open API. Approximately one million tweets have been compiled into a publicly accessible dataset hosted on community-archive.org, allowing developers to build tools that can be commercialized or used for personal insights. The initiative emphasizes user consent and the benefits of sharing data, fostering a community-driven ecosystem. Users can analyze their own tweets or explore cultural trends within their communities. The project aims to support self-hosting to ensure data control and privacy, while also encouraging the development of offline tools for data analysis. The vision includes creating a collaborative environment where users can easily access and visualize their data, potentially leading to innovative applications and insights. The initiative draws parallels to successful open data projects like OpenStreetMap, highlighting the potential for competition and innovation in the tech landscape. The community is encouraged to contribute and share their archives, with discussions ongoing about funding and sustainability for the project.

- Twitter's API access shutdown has prompted users to share their data for an open API.

- A dataset of around one million tweets is now publicly available for developers.

- The initiative focuses on user consent, privacy, and community-driven tools.

- Self-hosting and offline analysis tools are key components of the project.

- The project aims to foster innovation and competition similar to OpenStreetMap.

21 comments
By @abdullahkhalids - 5 months
To play the devil's advocate. If you were running a large public forum, and you knew that many companies had started to scrape all data off your site, and were going to cumulatively make billions off that data, and some of those billions will come from polluting your forum with crap content, would you continue running your site in the open?

What is the game theory here? Twitter cooperates and OpenAI defects, and we call that a win?

By @bangaladore - 5 months
Here's a thought: someone "trustworthy" should maintain a Chrome extension or Tampermonkey script that automatically scrapes data from various social media sites in a fully anonymized fashion. As people browse Twitter, Reddit, or XYZ, the posts/comments are sent to some aggregation system. It might be against TOS, but certainly far less than scraping, and you couldn't tell, as it's the user driving what gets scraped.

I don't use Twitter often, but I'd run something like that if there were strong anonymity guarantees. Seems like a win-win for everyone.

Does anything like this exist today?

By @criticalfault - 5 months
Users should just get off this continued tragedy and API access wouldn't be an issue
By @throw0101a - 5 months
Many moons ago Twitter used to have RSS (Atom?) feeds for each user so you could use any old news aggregator to keep up to date.
By @kypro - 5 months
> What if I could ask patio’s archive: “what are some good books to read about [topic]” or “what advice would you give to someone trying to get a job at Stripe”

Or what if I could ask: "Given Omer Shehata's Twitter history, formulate a phishing scam that he would be likely vulnerable to".

The problem I see here is that there are far more bad actor use cases for identifiable user data than good. In my opinion the main reason most social networks have stopped doing public by default and now do private by default is because not doing so opens them up to Cambridge Analytica type scandals where people don't realise what they're signing up for.

Personally, if you do this, I would be very clear with your users that by submitting their data it will be made available publicly in an identifiable form. And that even if they revoke their data from your service, it's possible their data will continue to be archived by others, possibly for malicious reasons.

By @rasengan - 5 months
The trend of shutting down / charging steeply for API access has fundamentally changed the internet.
By @OmarShehata - 5 months
Here is the Washington Post doing this with TikTok users to reverse engineer the algorithm!! https://thewashingtonpost.formstack.com/forms/help_investiga...

they've got data from 800 users so far, with watch data on 55 million videos

By @nunobrito - 5 months
It was difficult to rescue data from Twitter even before the purchase. In our case it was relevant because this is online digital history for the people in my country.

The only thing we can do is motivate more people to use open platforms like NOSTR where API or data/identity handling is completely different.

By @jrm4 - 5 months
Funny, just now I've been playing around with the various tweet deleters and trying to get something working; presently I think I'm about to settle on something involving a basic screen macro recorder thing, like one of the iterations of AHK.

I'm somewhat surprised that this space feels relatively dormant compared to the more complex stuff out there.

(APIs suck)

By @molticrystal - 5 months
Twitter was originally a microblogging service that had RSS feeds to syndicate things or monitor the microblogs of people/companies that were interesting. It has gone way off into the fields.

Same journey Reddit is making, starting after it prepared to go public.

By @danielodievich - 5 months
The only useful thing on Twitter that I ever saw was the lovely and tender Dog Rates https://x.com/dog_rates. You could read anonymously and be all aww and shucks about all those good dogs. They've thankfully stopped engaging with the cesspool that it became and moved somewhere else, Instagram perhaps? Somewhere where I can't read without an account, so I don't read it anymore.
By @fb03 - 5 months
At this point, just nope out of it and use something like Bluesky
By @qingcharles - 5 months
I hate to be the one that says this, but what's to stop someone poisoning this and uploading a file of someone else's fake tweets?
By @bravetraveler - 5 months
Neat boundaries on the Town Square
By @xyst - 5 months
Switched to Mastodon, Bluesky long ago.
By @mrkramer - 5 months
Elon is trying really hard to destroy Twitter, isn't he?
By @ranger_danger - 5 months
> cheaply as possible (put everything in S3

yikes.