July 15th, 2024

Show HN: RAG on HN comments in 34 LOC

The tool hackerNewsRAG uses Algolia to search Hacker News comments, then extracts and summarizes their content with Substrate. Users can fork it after signing up for Substrate, which provides an API key and free credits. The tool is hosted on Val Town, a JavaScript hosting platform.

The article discusses a tool called hackerNewsRAG, which uses Algolia to find Hacker News comments related to a specific topic. The tool extracts content from those comments and generates a streaming markdown summary using Substrate. The implementation is 34 lines of Substrate code, and the article links to a walkthrough for further details. Users can fork the tool by signing up for Substrate to receive an API key and $50 in free credits. The tool searches for Hacker News comments matching a query, extracts a summary, sentiment, and metadata from each comment, and groups the comments by sentiment. It filters out posts related to hiring and those not relevant to the query. The markdown summary includes a link to the original story URL. Val Town, the platform hosting this tool, allows users to write and deploy JavaScript, build APIs, and schedule functions from their browser. No comments have been made on this specific tool on Val Town yet.
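The search, filter, and grouping steps described above can be sketched in a few lines of Python. This is an illustration only, not the tool's actual code: the original is written in JavaScript with Substrate, and the LLM extraction step is not reproduced here. The `sentiment` and `relevant` keys are hypothetical names standing in for whatever the extraction step emits, and the hiring filter is a simple title heuristic chosen for the example.

```python
from collections import defaultdict
from urllib.parse import urlencode

# Public Hacker News search endpoint provided by Algolia.
ALGOLIA_SEARCH = "https://hn.algolia.com/api/v1/search"


def build_search_url(query: str, hits_per_page: int = 30) -> str:
    """Build a search URL restricted to comments via the `tags` parameter."""
    params = {"query": query, "tags": "comment", "hitsPerPage": hits_per_page}
    return f"{ALGOLIA_SEARCH}?{urlencode(params)}"


def is_hiring_post(story_title: str) -> bool:
    """Heuristic (an assumption): treat the monthly hiring threads as hiring posts."""
    title = story_title.lower()
    return "who is hiring" in title or "who wants to be hired" in title


def group_by_sentiment(comments: list[dict]) -> dict[str, list[dict]]:
    """Drop hiring and irrelevant comments, then bucket the rest by sentiment.

    Each comment dict is assumed to already carry the hypothetical
    `sentiment` and `relevant` fields produced by an extraction step.
    """
    groups: dict[str, list[dict]] = defaultdict(list)
    for comment in comments:
        if is_hiring_post(comment.get("story_title") or ""):
            continue
        if not comment.get("relevant", True):
            continue
        groups[comment["sentiment"]].append(comment)
    return dict(groups)
```

Fetching the URL returned by `build_search_url` yields JSON whose `hits` array holds the matching comments; each hit would then pass through the extraction step before being handed to `group_by_sentiment`.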

Related

Show HN: R2R V2 – A open source RAG engine with prod features

The R2R GitHub repository offers an open-source RAG answer engine for scalable systems, featuring multimodal support, hybrid search, and a RESTful API. It includes installation guides, a dashboard, and community support. Developers benefit from configurable functionalities and resources for integration. Full documentation is available on the repository for exploration and contribution.

How I scraped 6 years of Reddit posts in JSON

The article covers scraping 6 years of Reddit posts for self-promotion data, highlighting challenges such as post limits and cutoffs. It suggests Pushshift for Reddit archives and explains how to extract URLs and check website status. The findings show that 40% of the sites are inactive. The article also discusses trends in online startups.

GraphRAG (from Microsoft) is now open-source!

GraphRAG, a tool available on GitHub, enhances question answering over private datasets through structured retrieval and response generation. According to the summary, it outperforms naive RAG methods, offering semantic analysis and more diverse, comprehensive data summaries.

Insights from over 10,000 comments on "Ask HN: Who Is Hiring" using GPT-4o

The analysis of over 10,000 Hacker News comments using GPT-4o and LangChain revealed job market trends like remote work opportunities, visa sponsorship stability, and skill demands. Insights suggest potential SaaS product development.

Evaluating a Decade of Hacker News Predictions: An Open-Source Approach

The blog post evaluates a decade of Hacker News predictions using LLMs and ClickHouse. Results show a 50% success rate, highlighting challenges in prediction nuances. Future plans include expanding the project. Website: https://hn-predictions.eamag.me/.

3 comments
By @benzguo - 6 months
Little demo that searches Hacker News comments for a topic (using https://hn.algolia.com/api), extracts sentiment and other metadata, then generates a research summary. Really proud of the API we've built at https://substrate.run – you don't have to think about graphs, but you implicitly create a DAG by relating tasks to each other. Because you submit the entire workflow to our inference service, you get automatic parallelization of dozens of LLM calls for free, zero roundtrips, and much faster execution of multi-step workflows (often running on the same machine).
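The idea in the comment above, that relating tasks to each other implicitly defines a DAG whose independent branches can run in parallel, can be illustrated with a small generic sketch. This is not Substrate's API and makes no claim about how their service works internally: `run_graph`, `NODES`, and `DEPS` are hypothetical names, the node functions are stand-ins for LLM calls, and the sketch assumes the dependency graph is acyclic.

```python
import asyncio


async def run_graph(nodes, deps):
    """Run async node functions, starting each as soon as its inputs are ready.

    nodes: name -> async function taking a dict of {dependency name: result}
    deps:  name -> list of dependency names (must form an acyclic graph)
    """
    started = {}

    def start(name):
        # Memoize so every node runs exactly once, however many nodes depend on it.
        if name not in started:
            started[name] = asyncio.create_task(run(name))
        return started[name]

    async def run(name):
        parents = deps.get(name, [])
        # Awaiting all parents together lets independent branches overlap.
        results = await asyncio.gather(*(start(p) for p in parents))
        return await nodes[name](dict(zip(parents, results)))

    await asyncio.gather(*(start(name) for name in nodes))
    return {name: started[name].result() for name in nodes}


# Stand-in nodes: one search step, two independent summaries, one report.
async def search(_inputs):
    return ["great tool", "too slow"]


async def summarize_first(inputs):
    await asyncio.sleep(0.01)  # placeholder for a slow model call
    return inputs["search"][0].upper()


async def summarize_second(inputs):
    await asyncio.sleep(0.01)  # placeholder for a slow model call
    return inputs["search"][1].upper()


async def report(inputs):
    return " | ".join([inputs["summarize_first"], inputs["summarize_second"]])


NODES = {
    "search": search,
    "summarize_first": summarize_first,
    "summarize_second": summarize_second,
    "report": report,
}
DEPS = {
    "summarize_first": ["search"],
    "summarize_second": ["search"],
    "report": ["summarize_first", "summarize_second"],
}

if __name__ == "__main__":
    print(asyncio.run(run_graph(NODES, DEPS)))
```

Here the two summary nodes depend only on `search`, so they run concurrently, while `report` waits for both. The caller only declares which task feeds which; the execution order falls out of those relationships.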
By @bosch_mind - 6 months
I’m new to RAG, but have been learning about it lately for fun and it’s pretty incredible as a concept.

What are your thoughts around these frameworks like llama index and langchain? Being a seasoned engineer, it seems like a ridiculous amount of fluff around an already simple process.

By @Something1234 - 6 months
How do I run it?