Amazing post, as always :) More evidence (at least in my interpretation) for the "bag of heuristics" explanation can be found in On the Geometry of Deep Learning by Prof. Balestriero and colleagues, which discusses how these models effectively rely on enormous, piecewise-linear tilings to collectively solve tasks – https://arxiv.org/abs/2408.04809.
What is the relationship between a "bag of heuristics" and deep-network tessellation with affine splines? The paper you reference does not mention a "bag of heuristics."
Sorry for the conceptual leaps there. These splines seem to effectively form regions that function as a locality-sensitive hashing table. That isn't to say conclusively that they don't perform reasoning. However, as I understand it, the way the splines learn in one part of the space affects how they learn in another, with these tiles forming something akin to K-means clustering. Yet this apparent elasticity of the splines (along with many other features of DNNs) still only produces an (albeit more sophisticated) input-output mapping. To me, this doesn't seem conducive to an additional layer of deduction or reasoning on top of sophisticated pattern matching, or, in other words, a "bag of heuristics".
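To make that "regions as hash buckets" picture concrete, here is a minimal toy sketch (my own construction, not taken from the Balestriero paper): in a ReLU network, the pattern of which units are active determines which affine "tile" an input lands on, and that activation pattern behaves like a hash code for the region.

```python
import numpy as np

# Toy 2-layer ReLU net with random weights; the sign pattern of its pre-activations
# identifies which piecewise-linear region ("tile") an input falls into.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(16, 2)), rng.normal(size=16)
W2, b2 = rng.normal(size=(16, 16)), rng.normal(size=16)

def region_code(x):
    """Binary activation pattern = an LSH-like key for the tile containing x."""
    h1 = W1 @ x + b1
    h2 = W2 @ np.maximum(h1, 0) + b2
    return tuple((h1 > 0).astype(int)) + tuple((h2 > 0).astype(int))

x = np.array([0.3, -0.7])
print(region_code(x) == region_code(x + 1e-3))              # tiny perturbation: usually the same tile
print(region_code(x) == region_code(np.array([5.0, 5.0])))  # distant input: usually a different tile
```

Within a single tile the network computes exactly one affine map; the "elasticity" I mentioned corresponds, on this reading, to how training moves the tile boundaries around.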
Loved this two-part post. Shared it with my subscribers. A lot of people don't get that this world-model issue is at the heart of AI utility. If world models are emergent, then yes, AGI could be a thing, and soon! If they are not, if LLMs remain brittle bags of tricks, then a huge part of the use case for AI goes away.
Really liked this passage:
I’d guess that it’s actually our human limitations—constraints on working memory, on processing speed, on available energy—as well as our continually changing and complex environments that require us to form more abstract and generalizable internal models.
An analogy I'd like to suggest is to consider the difference in the way birds and airplanes fly. Birds are far more efficient in using available power, but airplanes have so much power to spare that it doesn't matter.
+1; this is the best and most unique idea in this (excellent) piece. And it cuts in a different direction from most AI development, which is expansionist in nature. Maybe constraining things is a solution rather than a problem?
Fantastic post as always.
It seems to me that when we reference world models of the kind humans use, we rely on something like the phenomenology of world-model-having. And I doubt any intermediate layer in ANNs can pass that test -- it might never fully, intuitively capture that feeling.
So we need a rigorous definition for a human-like world model that doesn't rely on such intuitions. We need to better study and extract the world models of the only systems that we know to have them. Then prove that, under the hood, what's happening is something remarkably different than lumps of heuristics. It's not impossible that running heuristic bags sufficiently well would "feel like" having a world model.
Excellent post. Clearly and concisely explained as always.
In my Master's degree almost 15 years ago, I did a project on using a Genetic Algorithm to learn Othello. Since it was a GA, I basically had to code the world model into the system by hand. It would have been easier if the computer could have learned it by itself!
(First, thanks for the great article...one of the best ever in my opinion.) In this type of discussion, I find humans tend to lionize the human approach. We humans certainly use a world model. Crucially, we can interrogate ourselves about whether we have such a model, and whether it's good. It's how we understand our own intelligence. But in a practical sense, there is no requirement for a world model, other than as a means to an end. And it's a human-centric means, which we over-value because we comprehend it. AI may have no model... but does it matter? With respect to text/chat robustness, for example: a decade ago the behavior of an LLM would be brittle to the point of comedy.... now suddenly LLMs are very useful, even with no proven world model per se. In the same vein: I may protest that the AI chessbot cannot rationally defend its next move, or any move. No matter; it still whips me on the chessboard 100% of the time. And so on. Therefore: I think we can expect future models to look at that wonderful complex photo of the woman/dog/man/traffic, and know how to correctly achieve its proper ends, by means which will still seem hazy to us. (And actually it knows today already, in a simple sense, at least in true autonomous cars: "those could be humans --> stay away!")
Very interesting discussion indeed. It would, however, be very useful to clearly lay down better characterizations of the concept of a "world model", or, even more, of the more general idea of a "model". I find the definitions coming up in the AI field quite diverse and simplistic. In the fields of the philosophy of science, the theory of mind, cognitive psychology and others, many diverse and detailed frameworks have been proposed to conceptualize what a mental model is.
Another puzzling feature that seems to predate these discussions in AI is the extremely audacious jumps to assumptions and claims about how humans build or evolve models. What is the basis for such claims? As far as I know, the (observable) ability to model worlds is far from being a specifically human feature; much "simpler" yet behaviorally rich animals, including insects, may reasonably be considered to manipulate world models in some sense. Are bees manipulating bags of heuristics or "abstract compressed algorithmic descriptions"? I remember a comment somewhere by Hofstadter about someone imagining a spider doing complex trigonometric calculations while constructing its web :) Do biological brains actually build such "compressed algorithmic descriptions"? I would love to know more about internal evidence for these claims (say, from neuroscience). I confess that all this seems pretty speculative from a methodological viewpoint. We know that computer game software explicitly manipulates a model in the sense described in this article, with compressed representations for objects, locations, states, state modifiers and so on, carefully programmed, e.g., in C++. Can we reverse engineer and find these things inside a running computer "brain" using an oscilloscope probe?
There's a rather critical typo, I believe, where "casual" probably should have been "causal". A small nit on an otherwise excellent post.
Thanks, fixed now.
I'm curious whether, in your view, "having a world model" is a property of the internal functioning of a system. If you had a system that was only observable as a black box (you could see inputs and outputs and that's it, like a mathematical function), would it make any sense to ask whether it has a world model?
Asked another way, would it be possible to build two different systems that were exactly identical in terms of input and output, but one of them had a world model, and one did not?
Personally, I try to avoid using the term "world model" (like "AGI") because it is imprecisely defined in this way. I find it more practical to talk about the capabilities of models, because it is much easier to measure their capabilities than to inspect their workings. For that matter, we can barely inspect the workings of our own brains.
The problem is that even an outrageous amount of text, images, and video is not enough to properly disambiguate the properties of each of the concepts an LLM deals with, especially since the sampling isn't designed to query the objects exhaustively.
As an example, watching videos about fluid turbulence and then being able to generate plausible videos of such turbulence is not the same as understanding turbulence. If you insert an unexpected object in such a scenario, the generated liquid flow will likely be quite wrong.
It seems like we're getting into mixed bags of AI technology where LLMs are building knowledge graphs. I wonder what aspects of a worldview will be visible? This is a very interesting post, and I really appreciate your clear thinking.
Knowledge graphs are useful for high-level reasoning. Our own world knowledge is also partially stored in various schema and abstractions, so this appears to be a relevant direction.
At finer levels, probably simulators can be invoked, via generating and running code.
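As a toy illustration of the knowledge-graph side of this (my own sketch, not tied to any particular system), world knowledge can be stored as subject-predicate-object triples and queried with simple hops:

```python
# Facts as (subject, predicate, object) triples; a two-hop query stands in for
# "high-level reasoning" over explicit, inspectable knowledge.
triples = [
    ("window", "made_of", "glass"),
    ("glass", "has_property", "transparent"),
    ("glass", "has_property", "fragile"),
]

def objects_of(subject, predicate):
    return [o for s, p, o in triples if s == subject and p == predicate]

# What properties does a window inherit from its material?
materials = objects_of("window", "made_of")
print({m: objects_of(m, "has_property") for m in materials})  # {'glass': ['transparent', 'fragile']}
```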
I think every complex system has a world model. It is more a matter of what it represents. A program can model a digital image as a set of RGB pixel values or as a 3-D model with camera and lighting parameters. Which is better depends on what you want the program to do, though one could argue that the 3-D model operates at a higher level. When it comes to whether some AI employs the "right" world model, it needs to be a model close to what a human brain must contain. This is required so that it can perform like a human. Unfortunately, we don't know in detail what world model humans use.
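To make the contrast in the previous paragraph concrete, here is a hypothetical sketch (all names and numbers invented for illustration) of the same picture represented once as raw pixel values and once as a 3-D scene with camera and lighting parameters:

```python
# Representation 1: the image as a grid of RGB pixel values.
pixel_model = [
    [(255, 0, 0), (255, 0, 0)],
    [(30, 30, 30), (30, 30, 30)],
]

# Representation 2: the same image as a 3-D scene description.
scene_model = {
    "objects": [{"shape": "sphere", "color": (255, 0, 0), "position": (0.0, 0.5, 5.0)}],
    "camera":  {"position": (0.0, 0.0, 0.0), "fov_degrees": 60},
    "lights":  [{"direction": (1.0, -1.0, 0.0), "intensity": 0.8}],
}

# Rendering scene_model would reproduce something like pixel_model; recovering the
# scene from the pixels is the harder, more "model-like" inverse direction.
```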
One problem with games like Othello is that humans can't tell us much about the internal models they use while playing the game. Players might be able to produce theories as to why they made a certain move but they probably aren't reliable. If humans and AIs make similar mistakes while playing a game, that is evidence their internal world models are similar, at least in some respects.
As to treating an AI like a black box, this has problems. When an LLM, trained on massive amounts of human-produced text, produces a sentence, who wrote it? If an AI plays a game after being trained on a massive number of games, perhaps it is simply finding a game in its vast database that includes the current position and bases its next move on it.
Yeah, so overall it is doubtful LLMs have what it takes. I do not discount the possibility that a massive neural net put in the head of a robot would eventually discover world models after a lengthy "life" interacting with the world, but LLMs are nowhere near that.
LLMs learn about the world implicitly through the relationships between words and their higher-order connections. The structure of the world is "encoded" in language, and transformers are efficient at learning this structure. The concept of "second-order similarity" is key here, as originally discussed and studied by Roger Shepard in the 70s. A second-order similarity is a similarity between two representational spaces (e.g. language and the world). So LLMs do not need "sense data", as they can infer semantics through the second-order similarity discovered in the structure of language. Whether it's "enough" is unclear, but what we know is that it's *a lot* and will probably be a lot more when video/audio/action are better integrated.
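A small numerical sketch of second-order similarity (a toy example of mine, not Shepard's original construction): compute the pairwise-similarity matrix within each representational space, then correlate the two matrices. A high correlation means the spaces share their similarity structure even though the individual representations look nothing alike.

```python
import numpy as np

rng = np.random.default_rng(1)
world = rng.normal(size=(6, 3))            # made-up "world" features for 6 concepts
language = world @ rng.normal(size=(3, 8)) + 0.1 * rng.normal(size=(6, 8))  # a noisy re-encoding

def similarity_matrix(X):
    """Cosine similarities between all pairs of rows (one row per concept)."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    return Xn @ Xn.T

iu = np.triu_indices(6, k=1)               # compare only the off-diagonal pairs
r = np.corrcoef(similarity_matrix(world)[iu], similarity_matrix(language)[iu])[0, 1]
print(f"second-order similarity (correlation of the two similarity matrices): {r:.2f}")
```

In this toy case the correlation should come out high, because the second space is just a noisy re-encoding of the first; the claim about LLMs is that language stands in a roughly analogous relation to the world.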
What will be needed is lots of action and "reaction": a lot more than is available now, and likely better represented internally as well.
Verrry interesting...! My first reaction (the same as many others here, I see) to the idea of localised piecemeal heuristics is to exclaim 'Oh, just like humans!'. Our habits, skills and processes (I try not to say words like 'belief') also seem to be piecemeal and contextual, each having developed in a particular context for a particular function and subject to particular constraints. Now they all jostle each other along in a haphazard way that more or less works as a whole, albeit with many inconsistencies and dissonances.
I agree, but I think our heuristics differ from those learned by LLMs -- they have more capacity to learn extremely specific rules, whereas we have a drive to use more general heuristics.
I'm wondering whether an LLM's "large bags of heuristics" are comparable to the kind of world model congenitally blind people have regarding concepts like light, colors, and transparency, or deaf people have regarding sound, music, etc. They can also meaningfully talk about these things, but they resort to logical associations (such as the word "window" being associated with the word "transparency") without, I guess, a real understanding of what the words truly signify. Thoughts?
I think the congenitally blind have extremely sophisticated world models, which shows that such models can be inferred from indirect data, including language (and other senses).
Yes, but it remains a conceptual relational model, not a lived representation.
Hi Melanie, I probably need to do a bunch more study on this, but it is my impression that humans also use an enormous amount of heuristics. Maybe the term means something different in this context. - best.
I agree. But I think the kinds of heuristics we tend to use are more general than the extremely specific heuristics learned by LLMs, precisely because we are more constrained (as I argued in the post). Plus I think there is a lot of evidence that we have the kinds of "world models" I described in the first part of the post.
LLMs do not have world models. They don't know the properties of the objects they deal with.
They are still an amazing breakthrough. That's because, while world models are important, we already know how to build such models well in well-defined areas.
What we failed to create for decades was a system that can work through free-form data, where world models sometimes exist and sometimes don't, and where one should be able to reason at multiple levels of abstraction, loading and letting go of world models as the need arises.
This is what an LLM provides. I think of it as an all-encompassing framework onto which world models can be attached as needed: via code generation, running and observing internal simulations, invoking tools, asking users for guidance, doing sanity checks as it reasons along, etc.
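A hypothetical sketch of that "framework with attachable world models" picture (every name below is invented; real agent frameworks differ in the details): the LLM produces a plan, and each step either invokes an attached tool or simulator or falls back to free-form reasoning.

```python
# `llm` is any text-in/text-out callable; `tools` maps step names to callables such as
# a code runner, a physics simulator, or a calculator. All names here are made up.
def answer(question, llm, tools):
    plan = llm("List steps, one per line, as 'tool_name: subtask' or 'none: subtask': " + question)
    results = []
    for step in plan.splitlines():
        name, _, subtask = step.partition(":")
        tool = tools.get(name.strip())
        if tool is not None:
            results.append(tool(subtask.strip()))   # attach a world model: run a simulator, code, etc.
        else:
            results.append(llm(subtask.strip()))    # otherwise, reason in free form
    return llm(f"Combine these step results into one final answer: {results}")
```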
Dear Prof Mitchell, I really liked your essay, thank you!
The question of world model versus bag of heuristics will particularly play out in AI-steered equipment operating in three-dimensional space shared with humans, no? Robots, drones, etc. If LLMs, due to their basic architecture, are locked into a bag of heuristics for world representation, their predictive accuracy should decline dramatically as the possibility space grows exponentially, something that cannot be compensated for by an increase in compute.
So AI-steered equipment in three-dimensional space with LLMs "under the hood" may perform well in stable external environments (e.g. AI robots in factories) or when pursuing a narrow task in a less stable external environment (e.g. a killer drone chasing a "locked-on" target, where equipment loss is part of the mission). But it will not work for a robot companion that goes with me through my life with all its facets, colours and surprises.
Have you come across statistical data on the performance of AI-steered robots and drones in different domains? And on the compute efficiency of the underlying AI models?
I put quite some hope on robots supporting humans in the future, particularly in shrinking societies. :-)
I’m not familiar with the intricacies of this "bag of heuristics," but I imagine they don’t all carry the same weight—perhaps most are even disposable. If the model truly relies on only a small subset of heuristics, couldn’t that make it algorithmically efficient and possibly generalizable?