Mapping Hacker News to find who knows what in the HN community
Wilson Lin's project analyzes 40 million Hacker News posts to create a semantic map, highlighting trusted voices and user relationships, while inviting feedback and participation to enhance community connections.
Wilson Lin discusses a project involving the analysis of 40 million posts and comments from Hacker News to create a semantic map of the community. This initiative aims to identify and highlight the trusted voices within the network, emphasizing the importance of people over content in social networks. Lin collaborated with Robert, who has experience in social semantic algorithms, to explore how to better understand the knowledge and relationships among users. The project allows users to see their contributions and unique linguistic identities within the community, as well as to search for expertise on various topics such as startups, programming languages, and neuroscience. The technology developed can organize user semantics, facilitate searches based on knowledge, and map community relationships, thereby revealing the expertise of individuals rather than just the information they produce. Lin invites feedback and encourages interested individuals to join a waitlist to further engage with the project, which aims to enhance connections among users based on their knowledge and interests rather than merely organizing information.
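As a rough illustration of what "searching for expertise" over embedded posts could look like, here is a minimal sketch. It is not the project's actual method, which the summary does not describe. It assumes each post already has an embedding vector, averages a user's post vectors into one user vector, and ranks users by cosine similarity to a query vector. The function names, the toy 3-dimensional vectors, and the usernames are all hypothetical.

```python
from collections import defaultdict
from math import sqrt


def mean_vector(vectors):
    """Element-wise mean of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_user_vectors(posts):
    """Group (user, embedding) pairs by user and average each group."""
    by_user = defaultdict(list)
    for user, embedding in posts:
        by_user[user].append(embedding)
    return {user: mean_vector(vecs) for user, vecs in by_user.items()}


def search_users(query_vec, user_vectors, top_k=3):
    """Return the top_k users whose averaged vector is closest to the query."""
    scored = [(cosine(query_vec, vec), user) for user, vec in user_vectors.items()]
    scored.sort(reverse=True)
    return [user for _, user in scored[:top_k]]


# Toy data (assumption): axes loosely mean startups, programming, neuroscience.
posts = [
    ("alice", [0.9, 0.1, 0.0]),
    ("alice", [0.8, 0.2, 0.0]),
    ("bob", [0.1, 0.9, 0.0]),
    ("carol", [0.0, 0.1, 0.9]),
]
user_vectors = build_user_vectors(posts)
result = search_users([0.0, 1.0, 0.0], user_vectors, top_k=2)
print(result)
```

Averaging is the simplest possible aggregation and tends to favor users who post on few topics; a real system would likely need something more careful.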
Related
Insights from over 10,000 comments on "Ask HN: Who Is Hiring" using GPT-4o
The analysis of over 10,000 Hacker News comments using GPT-4o and LangChain revealed job market trends like remote work opportunities, visa sponsorship stability, and skill demands. Insights suggest potential SaaS product development.
Evaluating a Decade of Hacker News Predictions: An Open-Source Approach
The blog post evaluates a decade of Hacker News predictions using LLMs and ClickHouse. Results show a 50% success rate, highlighting challenges in prediction nuances. Future plans include expanding the project. Website: https://hn-predictions.eamag.me/.
Show HN: 40M embeddings to find who knows what on HN
Wilson Lin and Robert embed 40 million Hacker News posts to create a semantic map, prioritizing individuals over content. They aim to highlight trusted voices and knowledge expertise within the community.
- Many users appreciate the innovative approach and visualization of user expertise.
- Concerns arise about the potential for misidentifying "trusted voices" and the implications of algorithmic influence on social interactions.
- Some commenters express doubts about the accuracy and relevance of the tool, particularly for less active users.
- There are discussions about privacy and the risks of exposing personal information through such analyses.
- Several users highlight the importance of content over individual user reputation in discussions.
Also, I feel like this tool selects for active commenters, not for knowledgeable experts. Not to mention throwaway accounts.
Still a cool project.
Show HN: Exploring HN by mapping and analyzing 40M posts and comments for fun - https://news.ycombinator.com/item?id=40307519 - May 2024 (159 comments)
Could this tool be repurposed for that? Presumably the “map” rendered in each user’s avatar could be encoded as a vector and then compared to that of another user.
EDIT: Wait, I just realized it already does this… (or at least I think so - it’s not immediately obvious if “Explore More Users” is ranked by similarity.)
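The idea in this comment can be sketched roughly as follows. This is only an illustration under stated assumptions, not how the site works: it assumes each user's "map" is a set of 2-D points with coordinates in [0, 1), bins them into a small grid histogram to get a vector, and compares users by cosine similarity. All names and data here are hypothetical.

```python
from math import sqrt


def map_to_vector(points, grid=4):
    """Bin 2-D points (x, y in [0, 1)) into a grid x grid histogram, flattened row by row."""
    vec = [0.0] * (grid * grid)
    for x, y in points:
        col = min(int(x * grid), grid - 1)
        row = min(int(y * grid), grid - 1)
        vec[row * grid + col] += 1.0
    return vec


def cosine_similarity(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_similar(target, users, grid=4):
    """Rank all other users by how closely their map histogram matches the target's."""
    target_vec = map_to_vector(users[target], grid)
    scores = [
        (cosine_similarity(target_vec, map_to_vector(points, grid)), name)
        for name, points in users.items()
        if name != target
    ]
    return [name for _, name in sorted(scores, reverse=True)]


# Hypothetical users and map coordinates.
users = {
    "user_a": [(0.1, 0.1), (0.15, 0.2), (0.8, 0.9)],
    "user_b": [(0.12, 0.05), (0.2, 0.22)],
    "user_c": [(0.9, 0.1), (0.95, 0.15)],
}
ranking = rank_similar("user_a", users)
print(ranking)
```

A coarse grid loses detail but makes two maps comparable even when the users have very different numbers of posts, since cosine similarity ignores vector length.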
"Phædrus was a master with this knife, and used it with dexterity and a sense of power. With a single stroke of analytic thought he split the whole world into parts of his own choosing, split the parts and split the fragments of the parts, finer and finer and finer until he had reduced it to what he wanted it to be. Even the special use of the terms "classic" and "romantic" are examples of his knifemanship."
In a bit of nominative determinism, or perhaps just having chosen the name because I know myself (or maybe I just over use these words), my keywords include: "part, system, level, language, article, object," etc.
What I'm saying is: I like the focus on the content and that it's not about who said it.
It got me to remove my twitter handle from my bio though. If you could update that in your app I would be thankful.
Rather than organizing the world's information, what if we could organize the world's people?
I do not wish to be organized.
I suspect a bias towards more common topics might be occurring.
Keywords the site actually associates with me: language, English, article, team, book. To be fair, at some zoom levels I do get "chess".
https://hn2.wilsonl.in/user/dang
Why does it say "Marion Milner" and why are there only so few posts on the map?
https://hn2.wilsonl.in/user/dep_b
I guess that's why I still like posting anonymously?
I think this is an extremely cool idea both on HN specifically and generally on the Internet. Bluesky does a bit of the thing where you can mix and match your content to your ranker/recommender system.
I hope you folks keep working on it, this is a refreshingly cool hack in the space.
Fallacy: appeal to authority. Practically, just because someone generates great content on subject A doesn't mean their take on subject B is any better than random. A well-reasoned self-consistent argument informed by accurate data is far more valuable than 'trust this expert opinion because this expert can be trusted' approaches - although it may require more work on the part of the reader. Don't get lazy.
Awesome project, fantastic UI.
Cool visualization and analysis, really well made!
Also, thanks a lot for attempting to put it in decoded form on another website. If I now get spam on that address, at least I know whose idea that was and that it's not yet spammers who got this clever, but rather it's due to well-meaning hackers with an idea: the most dangerous kind! :P
I guess having over a decade in startups counts for something, but it's crazy to score higher than sama or all the other founder/investors here who make 1000000x more from startups.
My favourite terms here are apparently "inclusive pregnancy emoji proposal", "Controversial tweet on China issue", and "enhancing pasta flavor with salt". I do remember all of these conversations, it's interesting to see it brought up.
Anyway, I'm gonna bookmark this for the next time I look for a job on HN lol
Checked RIP Aaron Swartz. Nothing meaningful. What am I missing?
Oh, would some Power the gift give us / To see ourselves as others see us!
Looks fantastic. Very cool project.
I left Twitter right before it became X, and focused on HN as a “breaking news and commentary” source that is mercifully free of politics.
This tool fills a discovery gap.
In any case, interesting idea & project. Philosophically i'm not sure i like the idea of identifying experts - i'd much rather people's comments stand on their own instead of their clout, but nonetheless definitely interesting.
Would things like this create/increase incentives to game the metrics?
I looked myself up and "google" is prominent. However I'm only posting anti google comments, nothing technical but on the privacy theme...
I also looked up 'yocto', which is something i know something about and i mentioned in posts a couple times, and the first user returned has some very interesting tags:
gur juvpu ohg guvax evtug qbrf
And the only really related tag i see is 'buildroot'. I guess it's just not a popular enough tag for the machine to have enough data.
Edit: and to join the choir of concerned voices:
1. It's not 'trusted'. It's at best 'popular'.
2. It reminds me of the main social networks and 'engagement'. Hope HN never becomes as predatory.
I like it. I feel this tool is quite sophisticated and incredibly accurate.
Seems pretty accurate. I do love me some PHP and WordPress.
I guess I should apply for a job at ESA.
Why not add their Gmail signature as well?
"Trusted voices" on HN are often not experts, yet espouse views which are not what the larger body of experts would consider correct. In addition, actual expert voices are drowned out by whatever the popular position is. HN also provides its own cultural bias, in that everything spoken on HN has to follow a rigorous cultural sieve set down by the guidelines, such that a "negative" view, even if correct, is considered either wrong or distasteful and buried. This is exacerbated by banning the use of humor to disarm controversial or heated comments. And this is the comments that HN does get; many opinions are never entered as comments here, so there exists a large knowledge gap. Then there's the "taboo" subjects like race, gender, religion, politics, social justice, etc which get buried for fear of controversy, so you're definitely not gonna find any expert opinions on those, as the stories just aren't there for discussion.
The end result is that often experts go unheeded or even downvoted, popular shallow opinions get upvoted, and substantive commentary based on evidence and experience is frequently missing. The fact is that we have no idea who knows what, or what's true or right. We just have "popularity" according to the particular cultural quirks of this site. So you can definitely find out "who thinks what", and who is considered to be more trustworthy to a HNer, but it has absolutely nothing to do with objective truth or the body of real knowledge that exists outside the world of HN comments. This is an echo chamber, but it's not a chamber of experts. It just seems that way because occasionally you see a minor tech celebrity, and people talk with absolute authority regardless of if they have any.
You want to find out who knows what? Look at their diplomas and careers. If they've done 20 years in a single field, probably they're an expert. If they have a degree (or multiple) in a field, probably they're an expert. If they spent half their life working on a single hobby, they're at least very knowledgeable in that field. But you can't determine that just by looking at who's talking about what or how many completely subjective "points" they get for what they say. Determining real knowledge requires analysis of specific criteria, filtered to get a higher quality result.
My name pops up.
Hell yeah. A++++ completely accurate would use again.
Please no. That sounds dystopian. We should not prefer having algorithms meddling with social interaction. We should not want things better designed to manipulate us.
I guess it’s inevitable at this point, but how does it feel to be among the first who dumb down real people to a set of caricature keywords based on an ml method of the day?