July 23rd, 2024

Ask HN: What are your worst pain points when dealing with scientific literature?

The author, experienced in computer science, aims to develop tools to overcome challenges in extracting value from scientific literature, seeking input on existing effective tools and major challenges in the field.

The author, with a background in computer science and software engineering, has primarily collaborated with biologists and has identified significant challenges in extracting value from scientific literature. These challenges include obtaining reusable raw data and synthesizing information from multiple studies to create a coherent understanding that can guide research. The author expresses a desire to develop tools that address these issues and seeks input on the biggest challenges faced in this area. Additionally, they are interested in learning about any existing tools that are considered particularly effective or valuable in enhancing scientific research.

Six things to keep in mind while reading biology ML papers

The article outlines considerations for reading biology machine learning papers, cautioning against blindly accepting results, emphasizing critical evaluation, understanding limitations, and recognizing biases. It promotes a nuanced and informed reading approach.

We must seek a widely-applicable Science of Systems

The text discusses the importance of a Science of Systems, focusing on Complex Systems. Emphasizing computer science's role, it explores potential applications in various fields and advocates for scientific progress through unified theories.

Why haven't biologists cured cancer?

Biologists face challenges in curing cancer despite technical advancements. Integration of math in genomics has not led to transformative breakthroughs. Biology's complexity demands rapid experimentation for progress in research and trials.

How I Use AI

The author shares experiences using AI as a solopreneur, focusing on coding, search, documentation, and writing. They mention tools like GPT-4, Opus 3, Devv.ai, Aider, Exa, and Claude for different tasks. They are excited about AI's potential but wary of hype.

You got a null result. Will anyone publish it?

Researchers struggle to publish null or negative results, leading to bias favoring positive findings. Initiatives like registered reports aim to enhance transparency, but challenges persist in academia's culture. Efforts to encourage reporting null results continue, aiming to improve research integrity.

6 comments

By @kingkongjaffa - 9 months

Some kind of academic CRM where you can start a new 'project' with some keywords and it assembles highly cited works and the authors of the works.

From there you can recursively search through the bibliography of the seminal works and the works that cite the seminal work to build a research map.
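The recursive search described above amounts to a breadth-first walk over the citation graph in both directions (references and "cited by"). A minimal sketch, assuming the citation data is already available locally: the `REFERENCES` dictionary, the paper ids, and the `build_research_map` helper are all hypothetical, and a real tool would have to fetch this data from a bibliographic source.

```python
from collections import deque

# Hypothetical toy data: paper id -> ids of the papers it cites.
REFERENCES = {
    "seminal": ["older_a", "older_b"],
    "follow_up": ["seminal"],
    "recent": ["follow_up", "older_a"],
    "older_a": [],
    "older_b": [],
}


def build_research_map(seed, references, max_depth=2):
    """Breadth-first walk over both citation directions from a seed paper.

    Returns a dict mapping each reached paper id to its distance
    (number of citation hops) from the seed.
    """
    # Invert the reference lists to obtain the "cited by" links.
    cited_by = {}
    for paper, refs in references.items():
        for ref in refs:
            cited_by.setdefault(ref, []).append(paper)

    distance = {seed: 0}
    queue = deque([seed])
    while queue:
        paper = queue.popleft()
        if distance[paper] >= max_depth:
            continue  # do not expand beyond the requested depth
        neighbours = references.get(paper, []) + cited_by.get(paper, [])
        for other in neighbours:
            if other not in distance:
                distance[other] = distance[paper] + 1
                queue.append(other)
    return distance


research_map = build_research_map("seminal", REFERENCES, max_depth=2)
```

The `max_depth` cap matters because the number of papers reached can grow very quickly with each hop; sorting the result by distance gives a rough reading order outward from the seminal work.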

When researching different fields you often end up finding a) who the top researchers are, at which point you want to go read all of their stuff, and b) who is currently working on the thing in $current_year, whom you might want to contact and talk to.

For example, when it comes to internal combustion engine research, Heywood is the man: https://scholar.google.co.uk/scholar?hl=en&as_sdt=0%2C5&q=JB...

(Most cutting-edge research is locked away in the automotive companies, sadly.)

Or in computational fluid dynamics the 'entry point' to the field is basically JD Anderson.

In both cases you're like 6 degrees of separation away from the cutting edge in several micro topics of active research.

> synthesize them into a coherent mental model to inform your own research

For the mental model there's no real way around sitting and reading a bunch of papers. I basically taught myself how to read papers efficiently and then read papers every day (often hitting dead ends, which can be quickly discounted).

By @solardev - 9 months

Not a professional, but as someone with a science degree and a casual interest in reading papers now and then, I wish there were:

1) A better (cheaper) way to access them. It doesn't necessarily have to be free as in SciHub, but there's no way I'm going to pay $80 as an individual to read one paper.

2) An easy way to summarize them, ask questions about them, etc. Google's NotebookLM (https://notebooklm.google.com/) is actually decent at this... upload a PDF and you can ask it questions about that content, with minimal hallucination and with citations back to the source. However, it's buggy (some files just never finish loading, others won't accept any prompt at all). And it's probably another short-lived experiment soon to meet the Google Graveyard :(

I would be willing to pay maybe $10-$20/mo for a service that can do both (provide Netflix-like access to papers, and also use an LLM to summarize them and answer questions). Bonus points if it can do its own meta-analysis of multiple related papers and easily summarize them.

I suspect journal publishers would be heavily resistant to any of that. Probably a more technical workaround would be a web browser extension that uses public/school library logins to fetch papers on the client side and then mirror them into the service. There is something like this in the legal world, https://free.law/recap, to bypass access fees. But there are no copyright concerns there (since the documents themselves are public domain works of the federal gov, unlike scientific papers).

By @phewson - 9 months

The fact that the entire model is based on articles in a printed journal. I really like the Cochrane Collaboration systematic reviews. As part of the article, interested authorities can ask pertinent questions and receive responses. The best we get in most journals is "cited by" links, but that's it. Is it being cited because a point is contested? If so, what point is contested? Does the citing paper make a good case? Is it being cited as an inspiration for some derivative research with new applications in a different context? If so, does that reinforce the methodology in the paper you are looking at? Why not have something that quickly helps you determine whether the paper has been reproduced, and maybe even uprate it if so? And so on.

By @lbhdc - 9 months

The biggest pain points for me are discovery and access. It can be really difficult to find papers on the topic I am researching, and getting access to the papers I do find is often hard.

By @noncovalence - 9 months

A better way to organise and find papers I've looked at before. For example, being able to ask an LLM "what was that paper again that tried using X to solve Y but ran into some issue" about a paper I vaguely remember skimming a month ago but have only just realised might actually be useful to me, and having it find the right one from my reference manager and/or a subset of my browser history.
