August 9th, 2024

Launch HN: Roe AI (YC W24) – AI-powered data warehouse to query multimodal data

Roe AI is developing a query engine that allows SQL queries on unstructured data using LLMs, simplifying analysis for teams and offering a free trial with AI credits.

Curiosity, Skepticism, Enthusiasm

Roe AI, founded by Richard and Jason, is developing a query engine that enables data analysts to perform SQL queries on unstructured data types, including videos, images, webpages, and documents, using large language model (LLM) powered data processors. The platform aims to address the challenges faced by product, advertising, and marketing teams in extracting insights from unstructured multimodal data, which typically requires complex analysis processes. Roe AI simplifies this by allowing users to execute queries with just a few lines of SQL. The system utilizes multimodal LLMs for data extraction and classification, features a user-friendly interface for data exploration, and includes a semantic index builder for multimodal data. The founders draw on their extensive experience in data analysis, having transitioned from traditional methods to more streamlined approaches like those offered by Snowflake. Roe AI is currently in its early stages, offering a free trial with $50 in AI credits for processing unstructured data. While the product is not open-sourced due to its complexity, the team is open to feedback and suggestions for improvement.

- Roe AI enables SQL queries on unstructured data using LLMs.

- The platform simplifies complex data analysis processes for various teams.

- It features a user interface and semantic index builder for multimodal data.

- The product is in early development, offering free trials with AI credits.

- Feedback from users is encouraged to enhance the platform.
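
To make the core idea concrete, here is a minimal sketch of running SQL over unstructured assets by exposing a model-backed function to the query engine. This is not Roe AI's actual syntax or implementation, which the post does not show. It assumes SQLite as a stand-in engine, and the function name `classify_asset`, the table `assets`, and the stubbed model output are all hypothetical.

```python
import sqlite3

# Hypothetical stand-in for a multimodal LLM call. A real system would send
# the file to a model; a lookup table keeps this sketch runnable offline.
FAKE_MODEL_OUTPUT = {
    "ads/shoe_promo.mp4": "footwear",
    "ads/phone_launch.mp4": "electronics",
    "ads/sneaker_drop.png": "footwear",
}


def classify_asset(path):
    """Return a product category for the asset at `path` (stubbed)."""
    return FAKE_MODEL_OUTPUT.get(path, "unknown")


def build_connection():
    """Create an in-memory database with a small table of asset paths."""
    conn = sqlite3.connect(":memory:")
    # Register the Python function so SQL can call classify_asset(path).
    conn.create_function("classify_asset", 1, classify_asset)
    conn.execute("CREATE TABLE assets (path TEXT, views INTEGER)")
    conn.executemany(
        "INSERT INTO assets VALUES (?, ?)",
        [
            ("ads/shoe_promo.mp4", 1200),
            ("ads/phone_launch.mp4", 800),
            ("ads/sneaker_drop.png", 300),
        ],
    )
    return conn


def views_by_category(conn):
    """Aggregate views by the category the (stubbed) model assigns."""
    query = """
        SELECT classify_asset(path) AS category, SUM(views) AS total_views
        FROM assets
        GROUP BY category
        ORDER BY category
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    print(views_by_category(build_connection()))
```

The point of the sketch is only that the analyst writes ordinary SQL while the per-row extraction or classification happens inside a function call. A production engine would also need batching, caching, and cost control around the model calls.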

AI: What people are saying
The comments on Roe AI's query engine reveal a mix of curiosity and skepticism about its functionality and application.
  • Users express interest in the product's potential but suggest improvements, such as adding a video demonstration.
  • There are questions about the technical aspects, including how the LLM processes video files and the integration with existing data sources.
  • Some commenters compare Roe AI's solution to existing technologies like PostgreSQL and express doubts about the necessity of a SQL interface for AI engineers.
  • There is a discussion about the target audience, with inquiries about whether the tool is more suited for data engineers or analysts.
  • Overall, the community is optimistic about the product's evolution and its potential to bridge gaps in data analysis.
13 comments
By @airstrike - 4 months
Congrats on the launch. Sounds cool and potentially useful, but I don't want to read blog posts or book a demo. I'd put a proper video at the very top of the page instead of the animated typing you currently have.

FYI your <title> tag needs to be updated.

By @dmpetrov - 4 months
Bridging the gap between AI and data warehouses is crucial, but I’m not sure SQL is the best fit for AI engineers who mainly work with Python and AI APIs.

At DataChain, we are solving this by creating a Python API that translates to SQL under the hood, which is pretty easy now with Pydantic. https://github.com/iterative/datachain

WDYT?

By @alpineidyll3 - 4 months
I am glad to see people focusing on this.

If this tool could parse drug patents and draw molecular structures with associated data, I know we would pay 200k/yr+ for that service, and there's a market for it.

In my own field, there's an incredibly important application to parse patents and scientific papers, but this would require specific image=>text models in order to get the required information out with high fidelity. Do you guys have plans to enable user supplied workflows where perhaps image patches can be sent to bespoke encoders, or finetunes?

By @namanyayg - 4 months
Congrats on the launch! What are you using to make the LLM understand a video file?

Are you doing transcription + sending frames to a vision model, or is there a third-party service for this?

By @fsndz - 4 months
Why this when I can just use PostgreSQL and pgvector? Like in this example I found recently: https://www.lycee.ai/courses/91b8b189-729a-471a-8ae1-717033c...
By @mnrozhkov - 4 months
Congratulations on the launch! I've been researching unstructured data management for some time, and I'm glad new tools have appeared.
By @atak1 - 4 months
This is awesome :) can we use this directly on our entire db?
By @datadrivenangel - 4 months
Is this more for data engineers or data analysts?

Seems like the type of thing that would be very useful in helping build data pipelines on semi-structured data.

By @nextworddev - 4 months
Why not just focus on the UI part and make it integrate with different data sources?
By @funnyenough - 4 months
Will this work with Redshift via SQL interface? Or am I looking at this wrong?
By @7thpower - 4 months
You are on to something here. Look forward to seeing this evolve.
By @hartator - 4 months
Why the name? It sounds like it will be about US politics.