Discussion about this post

Herbert Roitblat:

Thanks for writing this. I must admit, however, that I am a bit confused. There are two hypotheses. One hypothesis is that the model learns the statistical properties of Othello moves. The second hypothesis is that the model learns properties of Othello that are not statistical. It seems to me that any analysis that attempts to distinguish between these two would have to find a situation where the output of the system could not be explained by statistical properties (the more parsimonious hypothesis). I do not see how finding that the activation state of the model correlates with the state of the board addresses that distinction. I would expect that the statistical properties of the moves would be perfectly correlated with the state of the board. Am I missing something? Any specific sequence of moves would result in a specific state of the board. Multiple sequences could result in the same state, but is that not a statistical relation?
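
For concreteness, the kind of probing analysis I am referring to looks roughly like the sketch below: train a simple classifier to predict each board square from the model's hidden activations and measure held-out accuracy. This is only an illustration with synthetic placeholder arrays, not the actual Othello-GPT data or setup.

```python
# Minimal sketch of a board-state probing analysis.
# The activations and board_states arrays are hypothetical placeholders;
# in the real experiments they come from a trained Othello move-sequence model.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
activations = rng.normal(size=(5000, 512))        # (positions, hidden size)
board_states = rng.integers(0, 3, size=(5000, 64))  # 0=empty, 1=black, 2=white

X_train, X_test, y_train, y_test = train_test_split(
    activations, board_states, test_size=0.2, random_state=0
)

# One linear probe per board square; report mean held-out accuracy.
accuracies = []
for square in range(board_states.shape[1]):
    probe = LogisticRegression(max_iter=1000)
    probe.fit(X_train, y_train[:, square])
    accuracies.append(probe.score(X_test, y_test[:, square]))

print(f"mean per-square probe accuracy: {np.mean(accuracies):.3f}")
```

High probe accuracy shows that board-state information is linearly recoverable from the activations, but, per my question above, it is not obvious that this distinguishes the two hypotheses, since the statistics of the move sequence already determine the board state.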

Here is an example of a test close to what I mean for a language model. Embeddings are supposed to capture the meaning of a sentence and they do a reasonable job as a first approximation because of distributional semantics. But consider these three sentences:

1. Skinny weighed 297 pounds.

2. Edward weighed 297 pounds.

3. Skinny weighed 297 pounds of potatoes.

It seems to me that sentences 1 and 2 are very similar in meaning relative to sentence 3. Both describe a person's weight, whereas sentence 3 uses "weighed" to describe an action performed on some potatoes; it does not refer to the state of a person. Yet the embeddings for sentences 1 and 3 were much closer to one another than either one was to the embedding for sentence 2.
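
The comparison is easy to reproduce with any off-the-shelf embedding model. Here is a minimal sketch assuming the sentence-transformers library and the all-MiniLM-L6-v2 model, which are illustrative choices rather than necessarily the model I used:

```python
# Minimal sketch: pairwise cosine similarity between the three example sentences.
# Model choice (all-MiniLM-L6-v2) is an illustrative assumption.
from sentence_transformers import SentenceTransformer, util

sentences = [
    "Skinny weighed 297 pounds.",
    "Edward weighed 297 pounds.",
    "Skinny weighed 297 pounds of potatoes.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(sentences, convert_to_tensor=True)

# Compare each pair of sentence embeddings.
print("sim(1, 2):", util.cos_sim(embeddings[0], embeddings[1]).item())
print("sim(1, 3):", util.cos_sim(embeddings[0], embeddings[2]).item())
print("sim(2, 3):", util.cos_sim(embeddings[1], embeddings[2]).item())
```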

This one example does not "prove" anything, certainly, but it does illustrate the kind of contrast I think we need in order to evaluate claims of cognitive functions.

For more on evaluating LLMs for general intelligence, see: http://arxiv.org/abs/2502.07828

Dominic Ignatius:

I've always been confused by people who think LLMs can form a sophisticated world model from learning text alone. It seems obvious to me that words themselves are not enough, on a really basic Philosophy 101 "The Map Is Not the Territory" and "This Is Not a Pipe" level of reasoning. Text-based LLMs aren't really experiencing the world, just a shadows-on-a-cave-wall version of it. How can they have anything even close to a complete world model?
