July 16th, 2024

Mistaking Engineering Achievements for Human Linguistic Agency

Abeba Birhane and Marek McGann's paper challenges assumptions about Large Language Models (LLMs), emphasizing the dynamic nature of language and critiquing claims of LLM linguistic capabilities. They caution against overstating LLM agency.

Skepticism · Criticism · Debate

The paper titled "Large Models of What? Mistaking Engineering Achievements for Human Linguistic Agency" by Abeba Birhane and Marek McGann challenges the assumptions underlying claims about the linguistic capabilities of Large Language Models (LLMs). It argues against the notions of language completeness and data completeness, highlighting that language is a dynamic process of interaction and embodiment rather than a static entity that can be entirely captured by data. The authors emphasize that LLMs lack key characteristics of enacted language, such as embodiment, participation, and precariousness, making them fundamentally different from human linguistic agents. The paper warns against sensationalized claims about LLM agency and capabilities, attributing them to a misunderstanding of both human language and the nature of LLMs, and uses examples like 'algospeak' to illustrate the limitations of LLMs in replicating human language behaviors. The paper is set to appear in the journal Language Sciences and provides a critical perspective on the current discourse surrounding LLMs and human linguistic agency.

Related

AI: What people are saying
The comments on Abeba Birhane and Marek McGann's paper on LLMs spark a diverse discussion:
  • Some find the paper unconvincing and argue that LLMs could potentially develop an "internal" concept of reality despite lacking corporeal experience.
  • Critics note that the paper's arguments are somewhat dated and overlook recent advancements in LLMs, such as iterative post-training with human feedback.
  • There is skepticism about the paper's narrow view on language and embodiment, with some suggesting that LLMs can still achieve human-like communication without physical interaction.
  • Several comments highlight the philosophical debate on whether LLMs can truly mimic human linguistic capabilities and consciousness.
  • Some commenters express frustration with the paper's theoretical approach, arguing that practical applications and empirical evidence should be the focus.
15 comments
By @GeneralMayhem - 7 months
I am highly skeptical of LLMs as a mechanism to achieve AGI, but I also find this paper fairly unconvincing, bordering on tautological. I feel similarly about this as I do about what I've read of Chalmers - I agree with pretty much all of the conclusions, but I don't feel like the text would convince me of those conclusions if I disagreed; it's more like it's showing me ways of explaining or illustrating what I already believed.

On embodiment - yes, LLMs do not have corporeal experience. But it's not obvious that this means that they cannot, a priori, have an "internal" concept of reality, or that it's impossible to gain such an understanding from text. The argument feels circular: LLMs are similar to a fake "video game" world because they aren't real people - therefore, it's wrong to think that they could be real people? And the other half of the argument is that because LLMs can only see text, they're missing out on the wider world of non-textual communication; but then, does that mean that human writing is not "real" language? This argument feels especially weak in the face of multi-modal models that are in fact able to "see" and "hear".

The other flavor of argument here is that LLM behavior is empirically non-human - e.g., the argument about not asking for clarification. But that only means that they aren't currently matching humans, not that they couldn't.

Basically all of these arguments feel like they fall down to the strongest counterargument I see proposed by LLM-believers, which is that sufficiently advanced mimicry is not only indistinguishable from the real thing, but at the limit in fact is the real thing. If we say that it's impossible to have true language skills without implicitly having a representation of self and environment, and then we see an entity with what appears to be true language skills, we should conclude that that entity must contain within it a representation of self and environment. That argument doesn't rely on any assumptions about the mechanism of representation other than a reliance on physicalism. Looking at it from the other direction, if you assume that all that it means to "be human" is encapsulated in the entropy of a human body, then that concept is necessarily describable with finite entropy. Therefore, by extension, there must be some number of parameters and some model architecture that completely encode that entropy. Questions like whether LLMs are the perfect architecture or whether the number of parameters required is a number that can be practically stored on human-manufacturable media are engineering questions, not philosophical ones: finite problems admit finite solutions, full stop.

Again, that conclusion feels wrong to me... but if I'm being honest with myself, I can't point to why, other than to point at some form of dualism or spirituality as the escape hatch.

By @Animats - 7 months
Full paper: [1].

Not much new here. The basic criticism is that LLMs are not embodied; they have no interaction with the real world. The same criticism can be applied to most office work.

Useful insight: "We (humans) are always doing more than one thing." This is in the sense of language output having goals for the speaker, not just delivering information. This is related to the problem of LLMs losing the thread of a conversation. Probably the only reasonably new concept in this paper.

Standard rant: "Humans are not brains that exist in a vat..."

"LLMs ... have nothing at stake." Arguable, in that some LLMs are trained using punishment. Which seems to have strong side effects. The undesirable behavior is suppressed, but so is much other behavior. That's rather human-like.

"LLMs Don’t Algospeak". The author means using word choices to get past dumb censorship algorithms. That's probably do-able, if anybody cares.

[1] https://arxiv.org/pdf/2407.08790

By @mnkv - 7 months
Good summary of some of the main "theoretical" criticisms of LLMs, but I feel that it's a bit dated and ignores the recent trend of iterative post-training, especially with human feedback. Major chatbots are no doubt being iteratively refined on feedback from users, i.e. interaction feedback, RLHF, RLAIF. So ChatGPT could fall within the sort of "enactive" perspective on language, and it definitely goes beyond the issues of static datasets and data completeness.
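
The shape of that loop, as a toy sketch only (real RLHF trains a reward model and fine-tunes the LLM's weights with reinforcement learning; the score table and simulated rater here are made-up stand-ins for illustration):

    import random

    # Hypothetical candidate responses with learned preference scores.
    candidates = {
        "It depends on context; could you clarify what you mean?": 0.0,
        "Here is a list of ten philosophers: ...": 0.0,
        "I cannot answer that.": 0.0,
    }

    def simulated_rater(a: str, b: str) -> str:
        """Stand-in for a human rater; pretend raters favour clarifying questions."""
        if "clarify" in a:
            return a
        if "clarify" in b:
            return b
        return random.choice([a, b])

    def feedback_round(lr: float = 0.1) -> None:
        a, b = random.sample(list(candidates), 2)
        winner = simulated_rater(a, b)
        loser = b if winner == a else a
        candidates[winner] += lr  # reinforce the preferred response
        candidates[loser] -= lr   # discourage the dispreferred one

    for _ in range(200):
        feedback_round()

    print(max(candidates, key=candidates.get))  # the response the loop comes to favour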

Sidenote: the authors make a mistake when citing Wittgenstein to find similarity between humans and LLMs. Language modelling on a static dataset is mostly not a language game (see Bender and Koller's section on distributional semantics and the caveats on learning meaning from "control codes").

By @kazinator - 7 months
The authors of this paper are just another instance of the AI hype being used by people who have no connection to it, to attract some kind of attention.

"Here is what we think about this current hot topic; please read our stuff and cite generously ..."

> Language completeness assumes that a distinct and complete thing such as 'a natural language' exists, the essential characteristics of which can be effectively and comprehensively modelled by an LLM

Replace "LLM" by "linguistics". Same thing.

> The assumption of data completeness relies on the belief that a language can be quantified and wholly captured by data.

That's all a baby has to go on, and the baby nonetheless becomes a native speaker of their surrounding language. Language acquisition does not imply a totality of data: not every native speaker recognizes exactly the same vocabulary and exactly the same set of grammar rules.

By @KHRZ - 7 months
That's a lot of thinking they've done about LLMs, but how much did they actually try LLMs? I have long threads where ChatGPT refines solutions to coding problems. Their example of losing the thread after printing a tiny list of 10 philosophers seems really outdated. Also, it seems LLMs utilize nested contexts as well, for example when they can break their own rules while telling a story or speaking hypothetically.

By @beepbooptheory - 7 months
There is a lot of frustration here over what appears to be essentially this claim:

> ...we argue that it is possible to offer generous interpretations of some aspects of LLM engineering to find parallels with human language learning. However, in the majority of key aspects of language learning and use, most specifically in the various kinds of linguistic agency exhibited by human beings, these small apparent comparisons do little to balance what are much more deep-rooted contrasts.

Now, why is this so hard to stomach? This is the argument of this paper. To feel like this extremely general claim is something you have to argue against means you believe in a fundamental similarity between our linguistic agency and that of the model. But is embodied human agency something that you really need the LLMs to have right now? Why? What are the stakes here? The ones actually related to the argument at hand?

This is ultimately not that strong of a claim! To the point that it's almost vacuous... Of course the LLM will never learn the stove is "hot" like you did when you were a curious child. How can this still be too much to admit for someone? What is lost?

It makes me feel a little crazy here that people constantly jump over the text at hand whenever something gets a little too philosophical, and the arguments become long pseudo-theories that aren't relevant to the argument.

By @throwthrowuknow - 7 months
“Enactivism”, really? I wonder if these complaints will continue as LLMs see wider adoption - the old "first they ignore you, then they ridicule you, then they fight you…" trope that is halfway accurate. Any field that focuses on building theories on top of theories is in for a bad time.

https://en.m.wikipedia.org/wiki/Enactivism

By @Simon_ORourke - 7 months
Where I work, there's a somewhat haphazardly divided org structure, where my team has some responsibility to answer the executives' demands to "use AI to help our core business". So we applied off-the-shelf models to extract structured context from mostly unstructured text - effectively a data engineering job - and thereby support analytics and create more dashboards for the execs to mull over.
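
Concretely, the pattern is roughly the following; this is a sketch only, where call_llm is a hypothetical stand-in for whatever off-the-shelf model client is in use and the schema fields are invented for illustration:

    import json

    SCHEMA_HINT = '{"customer": string, "product": string, "sentiment": "pos"|"neg"|"neutral"}'

    def call_llm(prompt: str) -> str:
        """Hypothetical wrapper around whichever off-the-shelf model API is available."""
        raise NotImplementedError("wire up the provider's client here")

    def extract_record(text: str) -> dict:
        prompt = (
            "Extract the following fields from the text and reply with JSON only.\n"
            f"Schema: {SCHEMA_HINT}\n\n"
            f"Text: {text}"
        )
        raw = call_llm(prompt)
        try:
            return json.loads(raw)  # one structured row for the analytics layer
        except json.JSONDecodeError:
            return {"error": "model did not return valid JSON", "raw": raw}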

Another team, with a similar role in a different part of the org, has jumped (feet first) into optimizing large language models to turn them into agents, without consulting the business about whether they need such things. RAG, LoRA, and all this optimization is well and good, but this engineering focus has found no actual application, except wasting several million bucks hiring staff to do something nobody wants.

By @flimflamm - 7 months
How would the authors consider a paralyzed individual who has only been able to move their eyes since birth? That person can learn the same concepts as other humans and, using only their eyes, communicate just as richly. Clearly, the paper is viewing the problem very narrowly.

By @nativeit - 7 months
I'm more or less a layperson when it comes to LLMs and this nascent concept of AI, but there's one argument that I keep seeing that I feel like I understand, even without a thorough fluency with the underlying technology. I know that neural nets, and the mechanisms LLMs employ to train and form relational connections, can plausibly be compared to how synapses form signal paths between neurons. I can see how that makes intuitive sense.

I'm struggling to articulate my cognitive dissonance here, but is there any empirical evidence that LLMs, or their underlying machine learning technology, share anything at all with biological consciousness beyond a convenient metaphor for describing "neural networks" using terms borrowed from neuroscience? I don't know that it necessarily follows that just because something was inspired by, or is somehow mimicking, the structure of the brain and its basic elements, it should relate to its modeled reality in any literal way, let alone provide a sufficient basis for instantiating a phenomenon we frankly know very little about. Not for nothing, but our models naturally cannot replicate any biological functions we do not fully understand. We haven't managed to reproduce biological tissues that are exponentially less complex than the brain; are we really claiming that we're just jumping straight past lab-grown T-bones to intelligent minds?
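
For concreteness, the thing the borrowed vocabulary names is just this kind of arithmetic - a single artificial "neuron" in its textbook form; real transformer layers are far larger but are built from the same weighted-sum-plus-nonlinearity pieces:

    import math

    def artificial_neuron(inputs: list[float], weights: list[float], bias: float) -> float:
        """What the word "neuron" names here: a weighted sum squashed by a
        nonlinearity. No membranes, neurotransmitters, or spike timing."""
        activation = sum(x * w for x, w in zip(inputs, weights)) + bias
        return 1.0 / (1.0 + math.exp(-activation))  # sigmoid "firing rate"

    print(artificial_neuron([0.2, 0.9], weights=[1.5, -0.7], bias=0.1))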

I'm sure most of the people reading this will have seen Matt Parker's videos where they "teach" matchboxes to win a game against humans. Is anyone suggesting those matchboxes, given infinite time and repetition, would eventually spark emergent consciousness?

> The argument would be that that conceptual model is encoded in the intermediate-layer parameters of the model, in a different but analogous way to how it's encoded in the graph and chemical structure of your neurons.

Sorry if I have misinterpreted anyone. I honestly thought all the "neuron" and "synapse" references were handy metaphors to explain otherwise complex computations that resemble this conceptual idea of how our brains work. But it reads a lot like some of the folks in this thread believe it's not just a metaphor but a literal analog.

By @mistrial9 - 7 months
Oh, what a kettle of worms here... Now the mind must consider "repetitive speech under pressure and in formal situations" in contrast and comparison to "limited mechanical ability to produce grammatical sequences of well-known words"... where is the boundary there?

I am a fan of this paper, warts and all! (And the paper summary paragraph contained some atrocious grammar, btw.)

By @rramadass - 7 months
See also "Beyond the Hype: A Realistic Look at Large Language Models" • Jodie Burchell • GOTO 2024 - https://www.youtube.com/watch?v=Pv0cfsastFs

By @Royshiloh - 7 months
Why assume you "know" what language is? Like there is a study-backed insight on the ultimate definition of language? It's the same as saying "oh, it's not 'a, b, c', it's 'x, y, z'", which makes you as dogmatic as the one you critique. This is absurd.
By @dboreham - 7 months
The first stage is denial.

By @amne - 7 months
tl;dr: we're duck-typing LLMs as AGI