Mistaking Engineering Achievements for Human Linguistic Agency
Abeba Birhane and Marek McGann's paper challenges assumptions about Large Language Models (LLMs), emphasizing the dynamic nature of language and critiquing claims of LLM linguistic capabilities. They caution against overstating LLM agency.
The paper titled "Large Models of What? Mistaking Engineering Achievements for Human Linguistic Agency" by Abeba Birhane and Marek McGann challenges the assumptions underlying claims about the linguistic capabilities of Large Language Models (LLMs). It argues against the notions of language completeness and data completeness, highlighting that language is a dynamic process of interaction and embodiment rather than a static entity that can be entirely captured by data. The authors emphasize that LLMs lack key characteristics of enacted language, such as embodiment, participation, and precariousness, making them fundamentally different from human linguistic agents. The paper warns against sensationalized claims regarding LLM agency and capabilities, attributing them to a misunderstanding of both human language and the nature of LLMs. The discussion includes examples such as 'algospeak' to illustrate the limitations of LLMs in replicating human language behaviors. The paper is set to appear in the journal Language Sciences and provides a critical perspective on the current discourse surrounding LLMs and human linguistic agency.
Related
- Some find the paper unconvincing and argue that LLMs could potentially develop an "internal" concept of reality despite lacking corporeal experience.
- Critics note that the paper's arguments are somewhat dated and overlook recent advancements in LLMs, such as iterative post-training with human feedback.
- There is skepticism about the paper's narrow view on language and embodiment, with some suggesting that LLMs can still achieve human-like communication without physical interaction.
- Several comments highlight the philosophical debate on whether LLMs can truly mimic human linguistic capabilities and consciousness.
- Some commenters express frustration with the paper's theoretical approach, arguing that practical applications and empirical evidence should be the focus.
On embodiment - yes, LLMs do not have corporeal experience. But it's not obvious that this means that they cannot, a priori, have an "internal" concept of reality, or that it's impossible to gain such an understanding from text. The argument feels circular: LLMs are similar to a fake "video game" world because they aren't real people - therefore, it's wrong to think that they could be real people? And the other half of the argument is that because LLMs can only see text, they're missing out on the wider world of non-textual communication; but then, does that mean that human writing is not "real" language? This argument feels especially weak in the face of multi-modal models that are in fact able to "see" and "hear".
The other flavor of argument here is that LLM behavior is empirically non-human - e.g., the argument about not asking for clarification. But that only means that they aren't currently matching humans, not that they couldn't.
Basically all of these arguments feel like they fall down to the strongest counterargument I see proposed by LLM-believers, which is that sufficiently advanced mimicry is not only indistinguishable from the real thing, but at the limit in fact is the real thing. If we say that it's impossible to have true language skills without implicitly having a representation of self and environment, and then we see an entity with what appears to be true language skills, we should conclude that that entity must contain within it a representation of self and environment. That argument doesn't rely on any assumptions about the mechanism of representation other than a reliance on physicalism. Looking at it from the other direction, if you assume that all that it means to "be human" is encapsulated in the entropy of a human body, then that concept is necessarily describable with finite entropy. Therefore, by extension, there must be some number of parameters and some model architecture that completely encode that entropy. Questions like whether LLMs are the perfect architecture or whether the number of parameters required is a number that can be practically stored on human-manufacturable media are engineering questions, not philosophical ones: finite problems admit finite solutions, full stop.
Again, that conclusion feels wrong to me... but if I'm being honest with myself, I can't point to why, other than to point at some form of dualism or spirituality as the escape hatch.
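For what it's worth, the finite-entropy step in the comment above can be spelled out as a back-of-the-envelope sketch. This assumes strict physicalism, as the comment does; the symbols S, b, and N are introduced purely for illustration and come from neither the paper nor the thread:

```latex
% Sketch only, assuming strict physicalism; S, b, N are illustrative symbols, not from the paper.
\begin{align*}
  S &= \text{bits sufficient to specify a human body's physical state (finite, by assumption)}, \\
  b &= \text{bits stored per model parameter (e.g. 16 for fp16 weights)}, \\
  N &\ge \left\lceil S / b \right\rceil
     \quad\Rightarrow\quad \text{a finite parameter count suffices, in principle, to encode that state.}
\end{align*}
```

What the arithmetic does not show is whether any trainable architecture actually realizes such an encoding; that gap is exactly what the comment files under "engineering questions, not philosophical ones."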
Not much new here. The basic criticism is that LLMs are not embodied; they have no interaction with the real world. The same criticism can be applied to most office work.
Useful insight: "We (humans) are always doing more than one thing." This is in the sense of language output having goals for the speaker, not just delivering information. This is related to the problem of LLMs losing the thread of a conversation. Probably the only reasonably new concept in this paper.
Standard rant: "Humans are not brains that exist in a vat..."
"LLMs ... have nothing at stake." Arguable, in that some LLMs are trained using punishment. Which seems to have strong side effects. The undesirable behavior is suppressed, but so is much other behavior. That's rather human-like.
"LLMs Don’t Algospeak". The author means using word choices to get past dumb censorship algorithms. That's probably do-able, if anybody cares.
Sidenote: the authors make a mistake when citing Wittgenstein to find similarity between humans and LLMs. Language modelling on a static dataset is mostly not a language game (see Bender and Koller's section on distributional semantics and their caveats on learning meaning from "control codes").
"Here is what we think about this current hot topic; please read our stuff and cite generously ..."
> Language completeness assumes that a distinct and complete thing such as 'a natural language' exists, the essential characteristics of which can be effectively and comprehensively modelled by an LLM
Replace "LLM" by "linguistics". Same thing.
> The assumption of data completeness relies on the belief that a language can be quantified and wholly captured by data.
That's all a baby has, yet the baby becomes a native speaker of the surrounding language. Language acquisition does not require totality of data: not every native speaker recognizes exactly the same vocabulary or exactly the same set of grammar rules.
> ...we argue that it is possible to offer generous interpretations of some aspects of LLM engineering to find parallels with human language learning. However, in the majority of key aspects of language learning and use, most specifically in the various kinds of linguistic agency exhibited by human beings, these small apparent comparisons do little to balance what are much more deep-rooted contrasts.
Now, why is this so hard to stomach? This is the argument of the paper. Feeling that this extremely general claim is something you have to argue against means you believe in a fundamental similarity between our linguistic agency and the model's. But is embodied human agency something you really need LLMs to have right now? Why? What are the stakes here, the ones actually related to the argument at hand?
This is ultimately not that strong a claim! To the point that it's almost vacuous... Of course the LLM will never learn the stove is "hot" the way you did as a curious child. How can this still be too much to admit? What is lost?
It makes me feel a little crazy that people constantly jump over the text at hand whenever something gets a little too philosophical, and the arguments become long pseudo-theories that aren't relevant to the argument.
Another team, with a similar role in a different part of the org, has jumped (feet first) into optimizing large language models to turn them into agents, without consulting the business about whether it needs such things. RAG, LoRA, and all this optimization are well and good, but the engineering focus has found no actual application, except wasting several million bucks hiring staff to do something nobody wants.
I'm struggling to articulate my cognitive dissonance here, but is there any empirical evidence that LLMs, or their underlying machine learning technology, share anything at all with biological consciousness beyond a convenient metaphor for describing "neural networks" in terms borrowed from neuroscience? It doesn't necessarily follow that, just because something was inspired by or somehow mimics the structure of the brain and its basic elements, it should relate to its modeled reality in any literal way, let alone provide a sufficient basis for instantiating a phenomenon we frankly know very little about. Not for nothing, but our models naturally cannot replicate any biological functions we do not fully understand. We haven't managed to reproduce biological tissues that are exponentially less complex than the brain; are we really claiming that we're jumping straight past lab-grown t-bones to intelligent minds?
I'm sure most of the people reading this will have seen Matt Parker's videos where they "teach" matchboxes to win a game against humans. Is anyone suggesting those matchboxes, given infinite time and repetition, would eventually spark emergent consciousness?
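For context, the matchbox learner (Donald Michie's MENACE, which Parker recreated for noughts and crosses) is reinforcement by bead counting: one box per board state, one bead colour per move, beads added after wins and removed after losses. A minimal sketch of that update rule is below; it uses a toy Nim variant instead of noughts and crosses for brevity, and the bead numbers are arbitrary choices, not Michie's or Parker's actual construction:

```python
import random
from collections import defaultdict

# Toy, MENACE-style "matchbox" learner: one box of beads per game state,
# one bead colour per legal move. Wins add beads to the moves played; losses remove them.

class MatchboxPlayer:
    def __init__(self, legal_moves, initial_beads=4):
        self.legal_moves = legal_moves           # function: state -> list of legal moves
        self.initial_beads = initial_beads
        self.boxes = defaultdict(dict)           # state -> {move: bead count}
        self.history = []                        # (state, move) pairs from the current game

    def choose(self, state):
        box = self.boxes[state]
        if not box:                              # first visit: seed the box with beads
            box.update({m: self.initial_beads for m in self.legal_moves(state)})
        moves, beads = zip(*box.items())
        move = random.choices(moves, weights=beads)[0]
        self.history.append((state, move))
        return move

    def learn(self, won):
        # Reinforce or punish every move from the finished game (never emptying a box).
        for state, move in self.history:
            self.boxes[state][move] = max(1, self.boxes[state][move] + (3 if won else -1))
        self.history.clear()

# Stand-in game (misère Nim): take 1-3 sticks; whoever takes the last stick loses.
def legal(pile):
    return [n for n in (1, 2, 3) if n <= pile]

player = MatchboxPlayer(legal)
for _ in range(5000):                            # train against a random opponent
    pile, won = 12, False
    while pile > 0:
        pile -= player.choose(pile)
        if pile == 0:                            # we took the last stick: we lose
            break
        pile -= random.choice(legal(pile))
        won = pile == 0                          # opponent took the last stick: we win
    player.learn(won)

print(player.boxes[12])                          # bead counts for the opening position
```

After enough games the bead counts should skew toward the moves that tend to win, which is the whole trick the comment alludes to: bookkeeping that improves with repetition, with no comprehension anywhere in the loop.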
> The argument would be that that conceptual model is encoded in the intermediate-layer parameters of the model, in a different but analogous way to how it's encoded in the graph and chemical structure of your neurons.
Sorry if I have misinterpreted anyone. I honestly thought all the "neuron" and "synapse" references were handy metaphors to explain otherwise complex computations that loosely resemble our conceptual idea of how brains work. But it reads a lot like some of the folks in this thread believe it's not just a metaphor but a literal analog.
I am a fan of this paper, warts and all! (And the paper summary paragraph contained some atrocious grammar, btw.)