Launch HN: Undermind (YC S24) – AI agent for discovering scientific papers
Josh and Tom are developing Undermind, a search engine for complex scientific research that uses large language models to improve search accuracy and comprehensiveness; they are inviting user feedback to guide improvements.
Josh and Tom, physicists and founders of Undermind, are developing a search engine specifically designed for complex scientific research. Their motivation stems from personal frustrations during their graduate studies, where they often struggled to efficiently find relevant research papers, leading to wasted time and missed opportunities. They aim to address this issue by creating a system that mimics effective human research strategies, utilizing a pipeline that incorporates large language models (LLMs) to provide tailored search results.
The search process begins with a conversation between the user and the LLM to clarify complex research goals. The system then conducts a thorough search for approximately three minutes, employing a tree search method that follows citations and adapts based on findings. This approach prioritizes accuracy and comprehensiveness, ensuring that users receive specific recommendations and are aware of all relevant existing research.
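In rough terms, the adaptive citation-following described above could resemble a best-first search over the citation graph. The sketch below only illustrates that idea and is not Undermind's actual implementation; `fetch_citations`, `score_relevance`, and the threshold are hypothetical stand-ins.

```
# A minimal sketch of an adaptive citation tree search, assuming two
# hypothetical callables: score_relevance(paper) -> float in [0, 1],
# and fetch_citations(paper) -> list of cited papers.
import heapq

def citation_tree_search(seed_papers, score_relevance, fetch_citations,
                         budget=200, threshold=0.5):
    """Best-first search over the citation graph, expanding the most
    promising papers first and adapting to what it finds."""
    # Max-heap keyed on relevance (scores negated since heapq is a min-heap);
    # the paper id breaks ties so dicts are never compared.
    frontier = [(-score_relevance(p), p["id"], p) for p in seed_papers]
    heapq.heapify(frontier)
    seen = {p["id"] for p in seed_papers}
    results = []

    while frontier and budget > 0:
        neg_score, _, paper = heapq.heappop(frontier)
        budget -= 1
        if -neg_score >= threshold:
            results.append(paper)
            # Follow citations only from papers that look relevant,
            # so the search adapts based on what it has found so far.
            for cited in fetch_citations(paper):
                if cited["id"] not in seen:
                    seen.add(cited["id"])
                    heapq.heappush(
                        frontier,
                        (-score_relevance(cited), cited["id"], cited))
    return sorted(results, key=score_relevance, reverse=True)
```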
Undermind's automated pipeline tracks the discovery process, allowing for statistical modeling of search saturation, which helps determine when all useful leads have been exhausted. The founders are currently focusing on reading abstracts and citations, with plans to include full texts in the future. They have made a demo video available and are inviting users to try the search engine without a signup requirement for a limited time. Feedback from users is encouraged to improve the platform further.
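The saturation idea can be illustrated with a toy model: if each successive batch of examined papers yields fewer new relevant ones, the decay rate can be extrapolated to estimate total coverage. The counts and the exponential-decay assumption below are purely illustrative, not Undermind's actual statistics.

```
# Toy search-saturation model: fit an exponential decay to the rate of
# newly discovered relevant papers per batch, then extrapolate the total.
import numpy as np
from scipy.optimize import curve_fit

def decay(batch, a, k):
    # Expected number of new relevant papers found in batch i.
    return a * np.exp(-k * batch)

# Hypothetical counts of new relevant papers per batch examined.
new_per_batch = np.array([12, 7, 5, 3, 2, 1, 1])
batches = np.arange(len(new_per_batch))

(a, k), _ = curve_fit(decay, batches, new_per_batch, p0=(10.0, 0.5))

found = new_per_batch.sum()
# Sum of the geometric-like series a * exp(-k*i) over all future batches.
total_estimate = a / (1 - np.exp(-k))
print(f"found {found}, estimated total ~{total_estimate:.0f}, "
      f"coverage ~{found / total_estimate:.0%}")
```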
- Many users find the search engine effective, discovering relevant research they previously missed.
- Suggestions for enhancements include better citation formats, the ability to save results, and improved search algorithms to prioritize important papers.
- Users express interest in pricing models, with requests for student tiers and pay-per-query options.
- Some users note limitations in the search results, such as missing key references and the need for more comprehensive coverage of gray literature.
- Overall, there is excitement about the potential of Undermind to streamline research processes.
I'd love to see limitations like this quantified and clearly flagged. Otherwise there's a danger that people may assume the results are definitive, and this could have the opposite outcome to that intended (much time spent working on something only to discover it's been done already).
Claude 3.5 is reluctant to provide references, although it will if coaxed by prompting.
Undermind solves this particular problem. A great complement for my research question: the evidence that brain volume is reduced as a function of age in healthy, cognitively normal humans. In mice we see a steady, slow increase that averages out to a gain of 5% between the human equivalents of 20 to 65 years of age. This increase is almost linear as a function of the log of age.
Here is the question that was refined with Undermind’s help:
>I want to find studies on adult humans (ages 20-100+) that have used true longitudinal repeated measures designs to study variations in brain volume over several years, focusing on individuals who are relatively healthy and cognitively functional.
I received a ranked and lightly annotated set of 100 citations in this format:
>[1] Characterization of Brain Volume Changes in Aging Individuals With Normal Cognition Using Serial Magnetic Resonance Imaging S. Fujita, ..., and O. Abe JAMA Network Open 2023 - 21 citations - Show abstract - Cite - PDF 99.6% topic match Provides longitudinal data on brain volume changes in aging individuals with normal cognition. Analyzes annual MRI data from 653 adults over 10 years to observe brain volume trajectories. Excludes populations with neurodegenerative diseases; employs true longitudinal design with robust MRI techniques.
If you can get this to work for patent searches across multiple languages, you'd really have a killer product. Patent searches, via an attorney, cost thousands of dollars each and are nevertheless frequently imperfect. (I had a patent denied because somewhere, in a throwaway paragraph buried in the 100-page description of an entirely different invention, something similar was briefly mentioned and never referred to again.)
I'd gladly pay $100/month for "Deep Patent Search," and more than that if it's really good.
That is exactly how you gain expertise in a field and/or find those subtle gaps in knowledge that are the seeds of real breakthroughs.
One thing I'd improve is how it asks followup questions.
For example, I asked "What are the best types of videogames that improve cognition?"
The followup response was
```
Are you specifically looking for studies that focus on cognitive improvements as measured by standardized tests, or are you more interested in everyday functional improvements observed in real-world settings?
Could you be more precise about which cognitive domains you're interested in? For example, are you focusing on memory, problem-solving, attention, or something else? Additionally, are you looking for papers that compare different genres of videogames (e.g., action, puzzle, strategy) or studies targeting specific age groups or populations (e.g., children, older adults)?
Lastly, are you interested in experimental studies where the cognitive benefits were directly measured pre- and post-intervention, or are observational studies that report correlations between videogame use and cognitive performance also of interest to you? Understanding this will help determine the type of articles we should prioritize in the search.
```
It would be great if it turned those into multiple choice. For example:
```
Could you be more precise about which cognitive domains you're interested in?
[] memory
[] problem-solving
[] attention
[] something else (please specify)
```
That would save a ton of time over replying to and rereading everything.
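For illustration, here is one hypothetical way that suggestion could be implemented: have the model emit follow-up questions as structured data and render them as checkboxes. Every name in this sketch is made up; it is not Undermind's actual code.

```
# Hypothetical sketch: represent a follow-up question as structured data
# and render it as the multiple-choice format suggested above.
from dataclasses import dataclass, field

@dataclass
class FollowUp:
    question: str
    options: list[str] = field(default_factory=list)
    allow_other: bool = True

def render(fu: FollowUp) -> str:
    lines = [fu.question]
    lines += [f"[] {opt}" for opt in fu.options]
    if fu.allow_other:
        lines.append("[] something else (please specify)")
    return "\n".join(lines)

print(render(FollowUp(
    question="Could you be more precise about which cognitive domains "
             "you're interested in?",
    options=["memory", "problem-solving", "attention"],
)))
```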
Arbitrary numbers really convey the least information. At least use last names and years, so I can have some idea which paper you are talking about without scrolling back and forth.
I tried a search on my previous research area (https://www.undermind.ai/query_app/display_one_search/5408b4...) and it missed some key theoretical papers. At the same time, it picked up the three or four papers I’d expected it to find plus a PhD thesis I expected it to find. The results at the top of the list though are very recent and one of them is on something totally different to what I asked it for (“Skyrmion bubbles” != “Skyrmions”). The 7th result is an absolutely core paper, would be the one I’d give to a new PhD student going into this area and the one I’d have expected it to push up to the top of the list.
There's a 50/50 false positive rate, but I can deal with that. It means looking at 10 papers to find 5 useful ones instead of looking at 1000 papers to also find 5 useful ones.
I'm impressed.
One thing that I would like to suggest (other than saving to PDF, as discussed elsewhere in the thread) is to give the possibility, not just to "expand" the search, but also to "refine" the search.
If it was possible for me, after reading through the result page, to go back to your conversational UI and say, "OK this was my original intent, and here's what's wrong with the results I've got" for your system to provide a "refined" version of my query, that would be next-level.
Keep up the good work. Congrats on a successful launch!
One anecdote that I heard from the team developing it: it turned out that researchers more readily sourced material from their social networks, notably Twitter at the time. Meta's search functionality didn't receive enough traffic and was eventually shut down.
Perhaps LLMs will make the search capability more compelling. I guess we'll see.
This was the search <https://www.undermind.ai/query_app/display_one_search/cba773...> if you need a reference to it, i.e., for bugs or performance monitoring...
You should have seen what it used to be like a few decades ago :)
But the fact I wanted to save a result is a good sign. Nice work!
One suggestion: the back-and-forth chat at the beginning could be improved with more extensive interaction, so that the final prompt is more fine-grained about the specific area, context, or anything else one is aiming for.
This is obnoxious. Please remove this unnecessary roadblock.
In my opinion, Elicit has a better-looking UI and many more features, and is further along.
It doesn’t recognize my university.
Are you breaking down the question into subtopics, doing a broad search, and then doing some sort of dimensionality reduction -> topical clustering to get it into that format?
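For reference, a pipeline along the lines this question guesses at might look like the following sketch; the embedding model and clustering choices are illustrative assumptions, not anything Undermind has confirmed.

```
# Illustrative subtopic-clustering pipeline: embed abstracts, reduce
# dimensionality, then cluster. Library and model choices are assumptions.
from sentence_transformers import SentenceTransformer
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

def cluster_abstracts(abstracts, n_topics=5):
    model = SentenceTransformer("all-MiniLM-L6-v2")
    embeddings = model.encode(abstracts)          # one vector per abstract
    reduced = PCA(n_components=10).fit_transform(embeddings)  # dim reduction
    labels = KMeans(n_clusters=n_topics, n_init=10).fit_predict(reduced)
    return labels  # cluster id per abstract, i.e. its subtopic
```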
I did 2 searches.
First I asked about a very specific niche thing. It gave me results, but none I wanted. It looked like I had missed a crucial piece of information.
So I did the second search. I started with the final request it had written for the previous search and added the information I thought I had missed. It gave me virtually the same results, with a little sprinkle of what I was actually after.
A few observations:
1. I'm not sure, but it seems to rely too much on citation count. Or maybe citations in papers make it think the paper is absolutely a must-read. I specifically said I'm not interested in what's in that paper, and I still got those results.
2. I don't see many dissertations/theses in the results. I know for sure that there are good results for my request in a few dissertations. None of them are in the results.
That said, while I didn't get exactly what I wanted, I found a few interesting papers, even if they're tangential to the actual request.
Let's say I am interested in coffee and I'd like to get new research papers on it. Would this work?
Are there any plans on releasing any sort of API integration? I work in Technology Transfer consultancy for research institutes in Europe, and often we have to do manual evaluation of publications for novelty check and similar developments. Since most of the projects we work on were developed by leading researchers in academic institutions, it is important for us to quickly assess if a certain topic has been studied already.
Currently, one of my company's internal projects is a LLM-powered software to automate much of the manual search, together with other features related to the industry.
I think it would be very beneficial for us to implement an academic paper search function, but for that an API system would be required.
Great work nonetheless, and good luck on the journey.
With the current attitudes to AI, the name feels a little tone deaf being so easily mistaken for AI undermining people.
I gave it a version of my question, it asked me reasonable follow-ups, and we refined the search to:
> I want to find randomized controlled trials published by December 2023, investigating interventions to reduce consumption of meat and animal products with control groups receiving no treatment, measuring direct consumption (self-reported outcomes are acceptable), with at least 25 subjects in treatment and control groups (or at least 10 clusters for cluster-assigned studies), and with outcomes measured at least one day after treatment begins.
I just got the results back: https://www.undermind.ai/query_app/display_one_search/e5d964....
It certainly didn't find everything in my dataset, but:
* the first result is in the dataset.
* The second one is a study I excluded for something buried deep in the text.
* The third is in our dataset.
* The fourth is excluded for something the machine should have caught (32 subjects in total), but perhaps I needed to clarify 25 subjects in treatment and control each.
* The fifth result is a protocol for the study in result 3, so a more sophisticated search would have identified that these were related.
* The sixth study was entirely new to me, and though it didn't qualify because of the way the control group received some aspect of treatment, it's still something that my existing search processes missed, so right away I see real value.
So, overall, I am impressed, and I can easily imagine my lab paying for this. It would have to advance substantially before it was my only search method for a meta-analysis -- it seems to have missed a lot of the gray literature, particularly those studies published on animal advocacy websites -- but that's a much higher bar than I need for it to be part of my research toolkit.
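As an aside, the numeric criteria in the refined query above (randomization, at least 25 subjects per arm or 10 clusters, an untreated control group, outcomes measured at least a day out) are in principle mechanically checkable, which is the point of the fourth bullet. A toy eligibility filter, with entirely hypothetical field names:

```
# Toy screening check for the inclusion criteria in the refined query.
# All dictionary keys are hypothetical; this is not Undermind's logic.
def eligible(study: dict) -> bool:
    if not study.get("randomized", False):
        return False
    if study.get("control_receives_treatment", False):
        return False
    if study.get("cluster_assigned", False):
        if study.get("n_clusters", 0) < 10:
            return False
    elif min(study.get("n_treatment", 0), study.get("n_control", 0)) < 25:
        return False
    return study.get("days_to_outcome", 0) >= 1

# The fourth result above (32 subjects total, so at most 16 per arm)
# would fail the per-arm check:
print(eligible({"randomized": True, "n_treatment": 16, "n_control": 16,
                "days_to_outcome": 7}))  # False
```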