Has LLM killed traditional NLP?
Large Language Models (LLMs) streamline Natural Language Processing (NLP) through zero-shot prompting, reducing the need for extensive training data and model retraining and challenging the relevance and efficiency of traditional NLP methods.
The article discusses the impact of Large Language Models (LLMs) on traditional Natural Language Processing (NLP) methods. Traditionally, NLP has meant breaking tasks down into smaller problems, such as text classification and Named Entity Recognition (NER), each requiring extensive training data and model retraining whenever new intents are added. For instance, to classify intents like check-in time inquiries, developers must write detailed examples and retrain models as new intents appear. With the advent of LLMs like ChatGPT, the process has become far more streamlined: LLMs can handle many NLP tasks from a zero-shot prompt, letting users state questions and intents without extensive examples or retraining. This shift raises questions about the future of traditional NLP techniques, as LLMs offer a more efficient and flexible approach to language-related problems.
- LLMs simplify NLP tasks by using zero-shot prompts, reducing the need for extensive training data.
- Traditional NLP methods require detailed examples and retraining for new intents, making them more time-consuming.
- The rise of LLMs may challenge the relevance of traditional NLP techniques in the industry.
- LLMs can potentially handle a wider range of language tasks more efficiently than traditional models.
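The zero-shot approach the summary describes can be sketched in a few lines: instead of training an intent classifier on labelled utterances, the label set is handed to the model at inference time. This is a minimal illustration; the intent names and the sample message are invented, not taken from the article.

```python
# Zero-shot intent classification: the label set travels in the prompt,
# so adding a new intent means editing a list, not retraining a model.
# Intent names and the sample utterance are illustrative examples.
INTENTS = ["check_in_time", "book_flight", "cancel_reservation"]

def zero_shot_prompt(utterance: str, intents: list[str]) -> str:
    """Build a prompt asking an LLM to pick one intent, with no examples given."""
    labels = ", ".join(intents)
    return (
        f"Classify the user's message into exactly one of these intents: {labels}.\n"
        "Reply with the intent name only.\n\n"
        f"Message: {utterance}"
    )

prompt = zero_shot_prompt("What time can I get into my room?", INTENTS)
print(prompt)
```

The prompt string would then go to whichever LLM endpoint you use; the parsing of the one-word reply is trivial by design.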
I have worked on AWS Connect (online call center) and Amazon Lex (the backing NLP engine) projects.
Before LLMs, it was a tedious process of trying to figure out all of the different “utterances” people could say, across all the languages you had to support. With LLMs, it’s just prompting.
https://chatgpt.com/share/678bab08-f3a0-8010-82e0-32cff9c0b4...
I used something like this using Amazon Bedrock and a Lambda hook for Amazon Lex. Of course it wasn’t booking a flight; it was another system.
The above is a simplified version. In the real world, I gave it a list of intents (book flights, reserve a room, rent a car) and the properties ("slots") I needed for each intent.
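The intents-plus-slots setup described above can be sketched roughly as follows. This is my own illustrative reconstruction, not the commenter's actual Bedrock/Lex code; the intent names, slot names, and JSON shape are assumptions.

```python
import json

# Illustrative sketch: the model is given every known intent plus the
# "slots" each intent needs, and asked to return structured JSON. A
# Lambda hook would hand the parsed result back to Amazon Lex.
INTENT_SLOTS = {
    "book_flight": ["origin", "destination", "date"],
    "reserve_room": ["city", "check_in", "nights"],
    "rent_car": ["pickup_location", "pickup_date"],
}

def build_prompt(utterance: str) -> str:
    spec = json.dumps(INTENT_SLOTS, indent=2)
    return (
        "You are an intent classifier for a call center.\n"
        f"Known intents and their required slots:\n{spec}\n"
        'Return JSON of the form {"intent": ..., "slots": {...}}. '
        "Use null for slots the message does not mention.\n\n"
        f"Message: {utterance}"
    )

def parse_reply(reply: str) -> dict:
    """Validate the model's JSON reply before passing it downstream."""
    result = json.loads(reply)
    assert result["intent"] in INTENT_SLOTS
    return result

# A reply an LLM might plausibly produce for
# "I need a flight to Denver on Friday":
reply = '{"intent": "book_flight", "slots": {"destination": "Denver", "date": "Friday"}}'
print(parse_reply(reply))
```

The key design point is that new intents are added by editing `INTENT_SLOTS`, where the pre-LLM Lex workflow required enumerating utterances and retraining.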
Text classification, clustering, named entity recognition, etc. are NLP tasks. LLMs can perform these tasks. ML models that are not LLMs (or even not deep learning models) can also perform these tasks. Is the author perhaps asking if the concept of a "completion" has replaced all of these tasks?
When I hear "traditional NLP" I think not of the above types of tasks but rather the methodology employed for performing them. For example, building a pipeline to do stemming/lemmatization, part of speech tagging, coreference resolution, etc. before the text gets fed to a classifier model. This was SOTA 10 years ago but I don't think many people are still doing it today.
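The shape of that kind of pipeline can be shown with a toy example. Real pipelines used NLTK, spaCy, or Stanford CoreNLP for proper stemming, tagging, and coreference; this stdlib-only sketch just illustrates the normalize-before-classify structure.

```python
import re

# Toy "traditional NLP" preprocessing pipeline: tokenize and stem text
# before it reaches a classifier. The stemmer is a crude stand-in for a
# Porter-style stemmer, for illustration only.
def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z']+", text.lower())

def stem(token: str) -> str:
    """Strip a few common suffixes, keeping a minimum stem length."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def preprocess(text: str) -> list[str]:
    return [stem(t) for t in tokenize(text)]

# Downstream, a classifier model would consume these normalized tokens.
print(preprocess("Booking flights and booked hotels"))
# → ['book', 'flight', 'and', 'book', 'hotel']
```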
''E's got a 'ittle box 'n a big 'un,' she said, 'wit' th' 'ittle 'un 'bout 2'×6". An' no, y'ain't cryin' on th' "soap box" to me no mo, y'hear. 'Cause it 'tweren't ever a spec o' fun!' I says to my frien'.
The library is integrated into my Markdown editor, KeenWrite (https://keenwrite.com/), to correctly curl quotation marks into entities before passing them over to ConTeXt for typesetting. While there are other ways to indicate opening and closing quotation marks, none are as natural to type in plain text as straight quotes. I would not trust an LLM to curl quotation marks accurately. For the curious, you can try it at:
https://whitemagicsoftware.com/keenquotes/
If you find any edge cases that don't work, do let me know. The library correctly curls my entire novel. There are a few edge cases that are completely ambiguous, however, that require semantic knowledge (part-of-speech tagging), which I haven't added. PoS tagging would be a heavy operation that could prevent real-time quote curling for little practical gain.
The lexer, parser, and test cases are all open source.
https://gitlab.com/DaveJarvis/KeenQuotes/-/tree/main/src/mai...
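For contrast, here is a naive quote-curler, far simpler than KeenQuotes' actual lexer/parser (this is my own sketch, not the library's algorithm): open versus close is guessed from the neighbouring character. The dialect sentence quoted upthread defeats exactly this kind of heuristic, which is why a real implementation needs proper lexing and, for the fully ambiguous cases, PoS tagging.

```python
import re

# Naive quote curling: decide open vs. close from adjacent characters.
# Elisions like 'tis or 'E's (leading apostrophes) break this heuristic.
def curl(text: str) -> str:
    # Apostrophes inside words (don't, it's) become right single quotes.
    text = re.sub(r"(?<=\w)'(?=\w)", "\u2019", text)
    # A quote at the start of input or after whitespace/open-bracket opens...
    text = re.sub(r'(?<![^\s(\[{])"', "\u201c", text)
    text = re.sub(r"(?<![^\s(\[{])'", "\u2018", text)
    # ...and any remaining straight quote closes.
    return text.replace('"', "\u201d").replace("'", "\u2019")

print(curl('She said, "it\'s done."'))  # She said, “it’s done.”
```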
https://www.microsoft.com/en-us/research/blog/lazygraphrag-s...
https://freedium.cfd/https://medium.com/altitudehq/is-tradit...
An equivalently funny attitude seems to be the "natural language will replace programming languages". Let's see how that one will work out when the hype is over.
Every business is something of a unicorn in its problems, and NLP is only a small part of them. Even if an LLM could perform the NLP cheaply enough, how would you replace parts like:
1. An evaluation system that uses calibration (human labels)
2. Ground-truth collection (human, sometimes semi-automated)
3. QA testing by end users
Even if LLMs make the NLP itself easier, it is so entangled with the above that you still need an engineer. And an engineer who does only NLP and nothing else is hyper-specialized, like someone who only builds planes: perhaps 0.01% of the engineering work out there.
FYI - If anyone doesn't know the difference between the two or has no idea what NLP or an LLM is, this has a good breakdown: https://medium.com/@melindaboone80722/nlp-vs-llm-b339abdc651...
We've completely replaced that with LLMs. We still use our own DNNs for certain tasks, but not for NLP.
- Depending on the complexity of the task and the required results, SVMs or BERT can be enough in many cases and use far fewer resources, especially when plenty of training data is available. Training these models on LLM outputs could also be an interesting way to get there.
- When resources are constrained or latency is important.
- In some cases, the labeled data contains classes whose examples have no semantic connection to one another, so explaining such a class to an LLM could be tricky.
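The first bullet's idea of training a small model on LLM outputs can be sketched with a tiny stdlib-only Naive Bayes classifier. The "LLM-labelled" examples below are invented; a real setup would use an SVM or fine-tuned BERT on thousands of such labels, but the distillation shape is the same: label once with the expensive model, then serve the cheap one.

```python
import math
from collections import Counter, defaultdict

# Toy distillation: texts labelled by an LLM (hard-coded stand-ins here)
# train a multinomial Naive Bayes model that runs with negligible
# latency and no GPU.
llm_labelled = [
    ("what time is check in", "check_in_time"),
    ("when can i check in", "check_in_time"),
    ("book me a flight to rome", "book_flight"),
    ("i need a flight tomorrow", "book_flight"),
]

class NaiveBayes:
    def fit(self, data):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter()
        for text, label in data:
            self.label_counts[label] += 1
            self.word_counts[label].update(text.split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        def score(label):
            # Log prior plus Laplace-smoothed log likelihood per word.
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            s = math.log(self.label_counts[label])
            for w in text.split():
                s += math.log((self.word_counts[label][w] + 1) / denom)
            return s
        return max(self.label_counts, key=score)

model = NaiveBayes().fit(llm_labelled)
print(model.predict("flight to paris"))  # → book_flight
```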