September 8th, 2024

GPT-fabricated scientific papers on Google Scholar

The rise of GPT-generated scientific papers on Google Scholar threatens research integrity, with 62% lacking AI disclosure. Recommendations include improved filtering and education to enhance critical evaluation of research.


The increasing prevalence of GPT-fabricated scientific papers on Google Scholar raises significant concerns about research integrity and the potential for misinformation. A study analyzing a sample of these questionable papers found that roughly two-thirds were produced with generative AI, concentrated in controversial, disinformation-prone areas such as health, the environment, and computing. These papers often mimic legitimate scientific writing and are indexed alongside reputable research, making it harder for users to discern credible sources. The findings indicate that such fabricated content threatens the integrity of the scholarly communication system and could undermine public trust in scientific knowledge. The study highlights the risk of "evidence hacking," in which misleading studies are used to manipulate public opinion and policy. Recommendations include improving filtering options in academic search engines, increasing transparency in publication practices, and developing educational measures to help users critically evaluate scientific literature. The ongoing discussion of generative AI in academia underscores the need for a nuanced understanding of its impact on research and society.

- GPT-fabricated papers are increasingly found on Google Scholar, posing risks to research integrity.

- Approximately 62% of analyzed papers did not disclose the use of generative AI.

- Most of the questionable papers focus on controversial, disinformation-prone topics like health and the environment.

- The phenomenon raises concerns about "evidence hacking" and public trust in science.

- Recommendations include better filtering in search engines and educational initiatives for critical evaluation of research.

AI: What people are saying
The discussion surrounding GPT-generated scientific papers reveals several concerns and insights about research integrity and the use of AI in academia.
  • Many commenters express skepticism about the integrity of AI-generated content, emphasizing that human fabrication of research has been a longstanding issue.
  • There are concerns about the ability of non-experts to discern credible research from AI-generated material, highlighting the need for better education and critical evaluation skills.
  • Some commenters point out potential flaws in the methodologies used to identify AI-generated papers, suggesting that existing detection techniques may not be reliable.
  • Several participants advocate for stronger credentialing and accountability measures in academic publishing to combat misinformation.
  • There is a recognition that while AI can assist in writing, it does not inherently lead to data fabrication, and the responsibility lies with the authors to ensure the integrity of their work.
20 comments
By @Strilanc - 7 months
When I went to the APS March Meeting earlier this year, I talked with the editor of a scientific journal and asked them if they were worried about LLM-generated papers. They said their main worry actually wasn't LLM-generated papers; it was LLM-generated reviews.

LLMs are much better at plausibly summarizing content than they are at doing long sequences of reasoning, so they're much better at generating believable reviews than believable papers. Plus reviews are pretty tedious to do, giving an incentive to half-ass it with an LLM. Plus reviews are usually not shared publicly, taking away some of the potential embarrassment.

By @refibrillator - 7 months
Hmm, there may be a bug in the authors’ Python script that searches Google Scholar for the phrases "as of my last knowledge update" or "I don't have access to real-time data". You can see the code in appendix B.

The bug happens if the ‘bib’ key doesn’t exist in the API response. That leads to the urls array having more rows than the paper_data array, so the columns can become mismatched in the final data frame. It seems they made a third array, flag, which could be used to detect and remove the bad results, but it isn’t used anywhere in the posted code.

It's not clear to me how this would affect their analysis; it does seem like something they would catch when manually reviewing the papers. But perhaps the bibliographic data wasn't reviewed and was only used to calculate the summary stats etc. The failure mode is sketched below.
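
To make that concrete, here is a minimal Python sketch of the misalignment and the fix; the response shape and field names (bib, pub_url, pub_year) are assumptions modeled on typical scholarly-search APIs, not the authors' actual appendix B code:

    import pandas as pd

    # Hypothetical sample of search-API results; the second entry is
    # missing 'bib', which is the case the posted script mishandles.
    results = [
        {"bib": {"title": "Paper A", "pub_year": "2023"},
         "pub_url": "https://example.org/a"},
        {"pub_url": "https://example.org/b"},  # no 'bib' key
    ]

    paper_data, urls, flag = [], [], []
    for result in results:
        urls.append(result.get("pub_url", ""))
        bib = result.get("bib")
        # Appending a placeholder row instead of skipping keeps paper_data
        # and urls the same length, so the columns cannot misalign.
        paper_data.append({
            "title": bib.get("title") if bib else None,
            "year": bib.get("pub_year") if bib else None,
        })
        flag.append(bib is None)  # the posted code builds flag but never uses it

    df = pd.DataFrame(paper_data)
    df["url"] = urls
    df = df[[not f for f in flag]]  # actually use flag to drop the bad rows
    print(df)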

By @nomilk - 7 months
GPT might make fabricating scientific papers easier, but let's not forget how many humans fabricated scientific research in recent years - they did a great job without AI!

For any who haven't seen/heard, this makes for some entertaining and eye-opening viewing!

https://www.youtube.com/results?search_query=academic+fraud

By @pcrh - 7 months
This kind of fabricated result is not a problem for practitioners in the relevant fields, who can easily distinguish between false and real work.

If there are instances where the ability to make such distinctions is lost, it is most likely because the content lacks novelty, i.e. it simply regurgitates known and established facts, in which case it is a pointless effort, even if it might inflate the supposed author's list of publications.

As to the integrity of researchers, this is a known issue. The temptation to fabricate data existed long before the latest innovations in AI, and it is very easy to act on in most fields, particularly in medicine and the biosciences, which produce the bulk of irreproducible research. Policing this kind of behavior is not fundamentally altered by GPT or similar tools.

The bigger problem, however, is when non-experts attempt to become informed and are unable to distinguish between plausible and implausible sources of information. This was already a problem before AI; consider the debates over the origins of SARS-CoV-2, for example. The solution is the cultivation and funding of sources of expertise, e.g. in universities and similar institutions.

By @tkgally - 7 months
For a paper that includes both a broad discussion of the scholarly issues raised by LLMs and wide-ranging policy recommendations, I wish the authors had taken a more nuanced approach to data collection than just searching for “as of my last knowledge update” and/or “I don’t have access to real-time data” and weeding out the false positives manually. LLMs can be used in scholarly writing in many ways that will not be caught with such a coarse sieve. Some are obviously illegitimate, such as having an LLM write an entire paper with fabricated data. But there are other ways that are not so clearly unacceptable.

For example, the authors’ statement that “[GPT’s] undeclared use—beyond proofreading—has potentially far-reaching implications for both science and society” suggests that, for them, using LLMs for “proofreading” is okay. But “proofreading” is understood in various ways. For some people, it would include only correcting spelling and grammatical mistakes. For others, especially for people who are not native speakers of English, it can also include changing the wording and even rewriting entire sentences and paragraphs to make the meaning clearer. To what extent can one use an LLM for such revision without declaring that one has done so?

By @daghamm - 7 months
Last time we discussed this, someone basically searched for phrases such as "certainly I can do X for you" and assumed that meant GPT was used. HN noticed that many of the accused papers actually predated OpenAI.

Hope this research is better; a simple date sanity check, sketched below, would catch that failure.
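
A minimal sketch of such a check, with hypothetical match records rather than either study's actual data:

    # Drop phrase matches whose publication year predates ChatGPT's public
    # release (late November 2022), a strong false-positive signal.
    CHATGPT_RELEASE_YEAR = 2022

    matches = [
        {"title": "Paper A", "year": 2023},
        {"title": "Paper B", "year": 2019},  # phrase hit, but predates ChatGPT
    ]

    plausible = [m for m in matches
                 if m.get("year") and m["year"] >= CHATGPT_RELEASE_YEAR]
    print(plausible)  # Paper B is excluded as a likely false positive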

By @hodgesrm - 7 months
> Two main risks arise... First, the abundance of fabricated “studies” seeping into all areas of the research infrastructure... A second risk lies in the increased possibility that convincingly scientific-looking content was in fact deceitfully created with AI tools...

A third risk: ChatGPT has no understanding of "truth" in the sense of facts reported by established, trusted sources. I'm doing a research project related to the use of data lakes and tried using ChatGPT to search for original sources. It's a shitshow of fabricated links and pedestrian summaries of marketing materials.

This feels like an evolutionary dead end.

By @layer8 - 7 months
I appreciate that, appropriately, the article image is not AI-generated.
By @kgeist - 7 months
I wonder how many of the GPT-generated papers were actually written by people whose native language is not English and who wanted to improve their English. That would explain the various "as of my last knowledge update" phrases left intact in the papers, if the authors don't fully understand what they mean.
By @oefrha - 7 months
How about people stop responding to titles for a change. This isn’t about papers that merely used ChatGPT and got caught by some cutting-edge detection technique; it’s about papers that blatantly include ChatGPT boilerplate like

> “as of my last knowledge update” and/or “I don’t have access to real-time data”

which suggests no human (they don’t even need to be a researcher) read every sentence of these damn “papers”. That’s a pretty low bar to clear; if you can’t even be bothered to read the generated crap before including it in your paper, your academic integrity is negative and not a word from you can carry any weight. (The check involved is trivial; see the sketch below.)
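
For reference, the check really is a substring test; a minimal sketch, where the two phrases come from the study and everything else is hypothetical:

    # The two telltale phrases from the study; finding one verbatim in a
    # "paper" means nobody read the generated text before submitting it.
    TELLTALE_PHRASES = (
        "as of my last knowledge update",
        "i don't have access to real-time data",
    )

    def looks_unproofread(text: str) -> bool:
        lowered = text.lower()
        return any(phrase in lowered for phrase in TELLTALE_PHRASES)

    print(looks_unproofread("As of my last knowledge update, GDP rose."))  # True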

By @rosmax_1337 - 7 months
I think we might be entering a dark age of sorts.
By @RobotToaster - 7 months
If the papers are correct, what does it matter if the author used AI?

If the papers are incorrect, then the reviewers should catch them.

By @OutOfHere - 7 months
Just because ChatGPT was used to help write a paper doesn't in itself mean that the data or findings are fabricated.
By @gerdesj - 7 months
Colour me surprised. An IT-related search will generally end up with loads of returns that lead to AI-generated wankery.

For example, suppose you wish to back up switch configs or dump a file or whatever, and tftp is so easy and simple to set up. You'll tear it down later or firewall it or whatever.

So a quick search for "linux tftp server" gets you to, say: https://thelinuxcode.com/install_tftp_server_ubuntu/

All good until you try to use the --create flag, which should allow you to upload to the server. That flag is not valid for tftp-hpa; it is valid for tftpd (another tftp daemon).

That's a hallucination. Hallucinations are fucking annoying and increasingly prevalent. In Windows land the humans hallucinate - C:\ SFC /SCANNOW does not fix anything except for something really madly self-imposed.

By @jeremynixon - 7 months
The article shows no evidence of fabrication, fraud, or misinformation, while making accusations of all three. All it shows is that ChatGPT was used, which is wildly escalated into "evidence manipulation" (ironically, without evidence).

Much more work is needed to show that this means anything.

By @Barrin92 - 7 months
Honestly what we need to do is establish much stronger credentialing schemes. The "only a good guy with an AI can stop a bad guy with an AI" approach of trying to filter out bad content is just a hopeless arms race and unproductive.

In a sense we need to go back two steps: websites need to be much stronger curators of knowledge again, and we need reliable ways to sign and attribute real authorship to publications (a minimal sketch follows this comment), so that when someone publishes a fake paper there is always a human being who signed it and can be held accountable. There's a practically unlimited number of automated systems, but only a limited number of people trying to benefit from them.

In the same way https went from being rare to being the norm because the assumption that things are default-authentic doesn't hold, the same needs to happen to publishing. If you have a functioning reputation system and can put a price on fake information, 99% of it is disincentivized.
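
As a rough illustration only, a minimal signing sketch using Ed25519 keys from the third-party Python cryptography package; a real scheme would also need identity binding, key registries, and revocation, which this omits:

    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives.asymmetric import ed25519

    # The author signs the exact bytes of the manuscript with a long-lived key.
    private_key = ed25519.Ed25519PrivateKey.generate()
    public_key = private_key.public_key()

    manuscript = b"full text of the submitted paper"
    signature = private_key.sign(manuscript)

    # Anyone holding the author's public key (a journal, an indexer, a reader)
    # can later check that this person vouched for exactly this text.
    try:
        public_key.verify(signature, manuscript)
        print("signature valid: a known key holder signed this text")
    except InvalidSignature:
        print("signature invalid: attribution fails")

The appeal of this design is that verification is cheap for everyone downstream, so an indexer could in principle down-rank unsigned or unverifiable submissions.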

By @kazinator - 7 months
> often controversial topics susceptible to disinformation: ... and computing

Ouch!