June 28th, 2024

LLMs now write lots of science. Good

Read original article

Large language models (LLMs) are increasingly contributing to scientific papers, with over 10% of abstracts in scientific journals and up to 20% in computer science being partially written by these models. In China, a third of abstracts are influenced by LLMs. While some express concerns about the potential for poor-quality papers, biases, and plagiarism due to LLM usage, others argue that the benefits outweigh the risks. Journals like Science are implementing disclosure requirements for LLM use, but critics believe policing LLMs is challenging. The debate continues on whether the rise of LLMs in scientific writing will ultimately enhance or hinder the quality and progress of research.

Related

Researchers describe how to tell if ChatGPT is confabulating

Researchers at the University of Oxford devised a method to detect confabulation in large language models like ChatGPT. By assessing semantic equivalence, they aim to reduce false answers and enhance model accuracy.
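
The Oxford approach works by sampling several answers to the same question, clustering them by whether they mean the same thing, and measuring entropy over those meaning clusters; high entropy signals likely confabulation. A minimal Python sketch of the idea, with a toy string comparison standing in for the paper's entailment-based equivalence check:

```python
import math

def entails(a: str, b: str) -> bool:
    """Stand-in for a natural-language-inference check. The paper
    uses an NLI model; this toy version just normalizes strings so
    the sketch stays self-contained."""
    return a.strip().lower() == b.strip().lower()

def semantic_entropy(answers: list[str]) -> float:
    """Cluster sampled answers by bidirectional entailment, then
    compute entropy over the cluster sizes. High entropy means the
    model keeps changing its answer's meaning, a confabulation cue."""
    clusters: list[list[str]] = []
    for ans in answers:
        for cluster in clusters:
            # Same meaning only if each answer entails the other.
            if entails(ans, cluster[0]) and entails(cluster[0], ans):
                cluster.append(ans)
                break
        else:
            clusters.append([ans])
    n = len(answers)
    return -sum((len(c) / n) * math.log(len(c) / n) for c in clusters)

# Ten sampled answers to one question: mostly one meaning -> low entropy.
samples = ["Paris", "paris", "Paris", "Lyon", "Paris", "Paris",
           "Paris", "Marseille", "Paris", "Paris"]
print(round(semantic_entropy(samples), 3))  # ~0.639
```

In the real method, `entails` is an NLI model judging entailment in both directions, which is what lets paraphrases like "Paris" and "The capital is Paris" land in the same cluster.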

Delving into ChatGPT usage in academic writing through excess vocabulary

A study by Dmitry Kobak et al. examines ChatGPT's impact on academic writing, finding increased usage in PubMed abstracts. Concerns arise over accuracy and bias despite advanced text generation capabilities.
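
The study's core measurement is "excess" word frequency: project how often a word would have appeared in recent abstracts from its pre-ChatGPT trend, then compare against what was actually observed. A rough sketch of that comparison, using made-up counts for the famously over-represented "delve" (the real study works from full PubMed abstract data):

```python
# Excess-vocabulary sketch: compare a word's observed 2024 frequency
# against a counterfactual expectation extrapolated from earlier years.
# All numbers below are illustrative, not the study's actual figures.

def excess_ratio(history: dict[int, float], observed: float, year: int) -> float:
    """Fit a linear trend to historical per-abstract frequencies,
    extrapolate to `year`, and return observed / expected."""
    xs, ys = list(history.keys()), list(history.values())
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / \
            sum((x - x_mean) ** 2 for x in xs)
    expected = y_mean + slope * (year - x_mean)
    return observed / expected

# Hypothetical per-10k-abstract counts of "delve" in PubMed, 2018-2022,
# versus a sharp jump in 2024 after ChatGPT's release.
history = {2018: 2.1, 2019: 2.2, 2020: 2.4, 2021: 2.5, 2022: 2.6}
print(round(excess_ratio(history, observed=25.0, year=2024), 1))  # 8.7x excess
```

A ratio far above 1 for a word with no plausible topical explanation is the study's fingerprint of LLM-assisted writing.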

Claude 3.5 Sonnet

Anthropic introduces Claude 3.5 Sonnet, a fast and cost-effective large language model, alongside new features like Artifacts. Human evaluations show significant improvements over earlier Claude models, and privacy and safety evaluations are described. The post also explores the model's impact on engineering and coding work, and what such gains mean for recursive self-improvement in AI development.

Large Language Models are not a search engine

Large language models (LLMs) from Google and Meta generate content algorithmically, sometimes producing nonsensical "hallucinations." Companies struggle to catch these errors after generation because of factors like training data and temperature settings. LLMs aim to improve user interactions but raise skepticism about their ability to deliver factual information.
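
The "temperature settings" mentioned here refer to a decoding parameter that rescales the model's next-token distribution: low temperature concentrates probability on the likeliest token, while high temperature flattens the distribution and lets less likely, sometimes wrong, tokens through. A minimal sketch of temperature-scaled sampling (toy logits, not any vendor's API):

```python
import math
import random

def sample_with_temperature(logits: dict[str, float], temperature: float) -> str:
    """Softmax over logits scaled by 1/temperature, then sample.
    Near-zero temperature approaches greedy decoding; higher values
    flatten the distribution and admit less likely tokens."""
    scaled = {tok: l / temperature for tok, l in logits.items()}
    m = max(scaled.values())  # subtract max for numerical stability
    exp = {tok: math.exp(v - m) for tok, v in scaled.items()}
    total = sum(exp.values())
    probs = {tok: v / total for tok, v in exp.items()}
    return random.choices(list(probs), weights=list(probs.values()))[0]

# Toy next-token logits: at temperature 0.2 this almost always picks
# "Paris"; at temperature 2.0 the wrong answers appear far more often.
logits = {"Paris": 5.0, "Lyon": 2.0, "Berlin": 1.0}
print(sample_with_temperature(logits, temperature=0.7))
```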

AI Scaling Myths

The article challenges myths about scaling AI models, emphasizing limitations in data availability and cost. It discusses shifts towards smaller, efficient models and warns against overestimating scaling's role in advancing AGI.

4 comments
By @quantified - 5 months
> Peer review, for instance, will become even more important in a gen-ai world. It must be beefed up accordingly, perhaps by paying reviewers for the time they sacrifice to scrutinise papers.

So the cost savings in writing will be offset by additional costs of reading. Playing good defense is harder than playing offense.

The only strong argument here is that you can't tell anyway. Like effective doping in sport.

By @tivert - 5 months
> And most worrying of all, writing can be an integral part of the research process, by helping researchers clarify and formulate their own ideas. An excessive reliance on LLMs could therefore make science poorer.

This. The mistake so many people seem to make is to think of writing merely as outputting text, when it's a lot more.

I predict LLMs will cause general competence levels to decrease, and an increase in the intellectual equivalent of three-fingered hands as more and more people lose the ability to notice the problem.

By @bookofjoe - 5 months