September 18th, 2024

Twitter shut off API access; users volunteering their own data for an open API

Twitter's API shutdown has led to users sharing data for an open API, resulting in a dataset of one million tweets for developers, emphasizing user consent, privacy, and community-driven innovation.

Twitter's recent decision to shut off API access has led users to voluntarily share their data for the creation of an open API. Approximately one million tweets have been compiled into a publicly accessible dataset hosted on community-archive.org, allowing developers to build tools that can be commercialized or used for personal insights. The initiative emphasizes user consent and the benefits of sharing data, fostering a community-driven ecosystem. Users can analyze their own tweets or explore cultural trends within their communities. The project aims to support self-hosting to ensure data control and privacy, while also encouraging the development of offline tools for data analysis. The vision includes creating a collaborative environment where users can easily access and visualize their data, potentially leading to innovative applications and insights. The initiative draws parallels to successful open data projects like OpenStreetMap, highlighting the potential for competition and innovation in the tech landscape. The community is encouraged to contribute and share their archives, with discussions ongoing about funding and sustainability for the project.

- Twitter's API access shutdown has prompted users to share their data for an open API.

- A dataset of around one million tweets is now publicly available for developers.

- The initiative focuses on user consent, privacy, and community-driven tools.

- Self-hosting and offline analysis tools are key components of the project.

- The project aims to foster innovation and competition similar to OpenStreetMap.
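
As a rough illustration of the offline analysis mentioned above, here is a minimal Python sketch that counts tweets per month from a personal archive export. It assumes the export's tweets.js file is a JavaScript assignment (something like `window.YTD.tweets.part0 = [...]`) whose right-hand side is a JSON array, and that each entry carries a `created_at` timestamp in the classic Twitter format. Both are assumptions about the export format, not something the summary confirms, and the function names and sample data are hypothetical.

```python
import json
from collections import Counter
from datetime import datetime

# Assumed timestamp format, e.g. "Wed Oct 10 20:19:24 +0000 2018".
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def parse_archive_js(text):
    """Return the JSON array embedded in an archive file such as tweets.js.

    Assumption: the file is a JS assignment whose right-hand side is a
    JSON array, so everything from the first "[" onward is valid JSON.
    """
    start = text.index("[")
    return json.loads(text[start:])


def tweets_per_month(entries):
    """Count tweets per calendar month, keyed as "YYYY-MM"."""
    counts = Counter()
    for entry in entries:
        # Assumption: each entry wraps the tweet under a "tweet" key;
        # fall back to the entry itself if it does not.
        tweet = entry.get("tweet", entry)
        created = datetime.strptime(tweet["created_at"], CREATED_AT_FORMAT)
        counts[created.strftime("%Y-%m")] += 1
    return dict(sorted(counts.items()))


# Hypothetical sample data standing in for a real export.
SAMPLE = (
    'window.YTD.tweets.part0 = ['
    '{"tweet": {"created_at": "Wed Oct 10 20:19:24 +0000 2018", "full_text": "a"}},'
    '{"tweet": {"created_at": "Fri Oct 12 09:30:00 +0000 2018", "full_text": "b"}},'
    '{"tweet": {"created_at": "Thu Nov 01 08:00:00 +0000 2018", "full_text": "c"}}'
    ']'
)

if __name__ == "__main__":
    print(tweets_per_month(parse_archive_js(SAMPLE)))
```

On a real export, the same two functions could be applied to the file contents read from disk (for example `open("tweets.js", encoding="utf-8").read()`, where the path is a placeholder). Nothing leaves the machine, which fits the project's stated emphasis on data control.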

A New Development in the Debate About Instagram and Teens

Meta launches a pilot program with the Center for Open Science, granting researchers access to Instagram data to study its impact on teen well-being. The initiative aims to address concerns about social media's effects on mental health.

Sourcegraph went dark

Sourcegraph has privatized its main repository, disappointing former employees like Eric Fritz, who is preserving important references by forking the repository and saving pull request data to maintain access.

Dogsheep: Tools for personal analytics using SQLite and Datasette

Dogsheep offers tools for personal data analytics using SQLite, enabling users to export data from platforms like Twitter and Google, promoting understanding and ownership of personal information.

Replace Twitter Embeds with Semantic HTML

Terence Eden replaced Twitter embeds with semantic HTML on his site to enhance user safety, prevent data tracking, and improve accessibility, sharing his Python code on GitHub for public feedback.

Mastering Twitter Osint: The Ultimate Guide

The guide on Twitter OSINT outlines tools and techniques for gathering intelligence, emphasizing secure environments, advanced search features, data collection tools, and the importance of ethical considerations in investigations.

21 comments

By @abdullahkhalids - 5 months

To play devil's advocate: if you were running a large public forum, and you knew that many companies had started to scrape all the data off your site and were going to cumulatively make billions off that data, and that some of those billions would come from polluting your forum with crap content, would you continue running your site in the open?

What is the game theory here? Twitter cooperates and OpenAI defects, and we call that a win?

By @bangaladore - 5 months

Here's a thought: someone "trustworthy" should maintain a Chrome extension or Tampermonkey script that automatically scrapes data from various social media sites in a fully anonymized fashion. As people browse Twitter, Reddit, or XYZ, the posts/comments are sent to some aggregation system. It might be against TOS, but certainly far less than scraping, and you couldn't tell, as it's the user driving what gets scraped.

I don't use Twitter often, but I'd run something like that if there were strong anonymity guarantees. Seems like a win-win for everyone.

Does anything like this exist today?

By @criticalfault - 5 months

Users should just get off this continued tragedy and API access wouldn't be an issue

By @throw0101a - 5 months

Many moons ago Twitter used to have RSS (Atom?) feeds for each user so you could use any old news aggregator to keep up to date.

By @kypro - 5 months

> What if I could ask patio’s archive: “what are some good books to read about [topic]” or “what advice would you give to someone trying to get a job at Stripe”

Or what if I could ask: "Given Omar Shehata's Twitter history, formulate a phishing scam that he would be likely vulnerable to".

The problem I see here is that there are far more bad-actor use cases for identifiable user data than good ones. In my opinion, the main reason most social networks have stopped doing public by default and now do private by default is that not doing so opens them up to Cambridge Analytica-type scandals where people don't realise what they're signing up for.

Personally, if you do this, I would be very clear with your users that by submitting their data it will be made available publicly in an identifiable form, and that even if they revoke their data from your service, it's possible their data will continue to be archived by others, possibly for malicious reasons.

By @toomuchtodo - 5 months

https://www.community-archive.org/

https://github.com/TheExGenesis/community-archive

By @rasengan - 5 months

The trend of shutting down / charging steeply for API access has fundamentally changed the internet.

By @OmarShehata - 5 months

Here is the Washington Post doing this with TikTok users to reverse engineer the algorithm!! https://thewashingtonpost.formstack.com/forms/help_investiga...

They've got data from 800 users so far, with watch data on 55 million videos.

By @nunobrito - 5 months

It has been difficult to rescue data from Twitter even before the purchase. In our case it was relevant because this is online digital history for the people in my country.

The only thing we can do is motivate more people to use open platforms like NOSTR, where API and data/identity handling is completely different.

By @jrm4 - 5 months

Funny, just now I've been playing around with the various tweet deleters and trying to get something working; presently I think I'm about to settle on something involving a basic screen macro recorder thing, like one of the iterations of AHK.

I'm somewhat surprised that this space feels relatively dormant compared to the more complex stuff out there.

(APIs suck)

By @molticrystal - 5 months

Twitter was originally a microblogging service that had RSS feeds to syndicate things or monitor the microblogs of interesting people and companies. It has since wandered way off into the fields.

Reddit is making the same journey, starting when it prepared to go public.

By @danielodievich - 5 months

The only useful thing on Twitter that I ever saw was the lovely and tender Dog Rates https://x.com/dog_rates. You could read anonymously and be all aww and shucks about all those good dogs. They've thankfully stopped engaging with this cesspool that it became and moved somewhere else, Instagram perhaps? Somewhere where I can't read without an account, so I don't read it anymore.

By @fb03 - 5 months

At this point, just nope out of it and use something like Bluesky

By @qingcharles - 5 months

I hate to be the one that says this, but what's to stop someone poisoning this and uploading a file of someone else's fake tweets?

By @bravetraveler - 5 months

Neat boundaries on the Town Square

By @xyst - 5 months

Switched to Mastodon, Bluesky long ago.

By @mrkramer - 5 months

Elon is trying really hard to destroy Twitter, isn't he?

By @ranger_danger - 5 months

> cheaply as possible (put everything in S3

yikes.
