What should we believe about the reasoning abilities of today’s large language models? As the headlines above illustrate, there’s a debate raging over whether these enormous pre-trained neural networks have achieved humanlike reasoning abilities, or whether their skills are in fact “a mirage.”
Thanks for this very clear exposition. Talking as if you are reasoning is different from reasoning. To claim that a system known to work by predicting the next word is reasoning is an extraordinary claim, and it should require extraordinary evidence; yet, as you point out, there is little critical evaluation. The public, governments, and some computer scientists are being bamboozled into thinking that these models do things they are incapable of doing, without considering the possibility that they are just following the statistical language patterns they have learned. This should be just basic science, but evidently it is not. Your critical thought is essential in the medium and long run. Thanks!
I have been focusing on music instead of AI lately :) But a while back I looked into the chain-of-thought reasoning claims and I do not think it constitutes any reasoning whatsoever, nor even a good illusion.
The most significant observation: suppose an LLM can solve problem A and problem B with chain-of-thought prompting. If the LLM truly understood step-by-step thinking, it should also be able to solve "first do A, then do B" with chain-of-thought prompting, since that's still step-by-step. But this is not the case! I suspect the linguistic trick fails because it can't handle conjunctions.
Concretely: GPT-4 tends to guess on counting problems unless you specify chain-of-thought prompting, in which case it'll put the problem in 1-1 correspondence with the whole numbers and get the right answer. But if you ask it to count *two* things step-by-step, it goes right back to inaccurate guessing. The weakness and unreliability of chain-of-thought prompting goes against the rosy anthropomorphic interpretation.
Another way of looking at it: if you ask GPT-4 to explain a simple fact "step-by-step," it will throw in a bunch of extraneous "steps" that aren't germane to the fact. The shape of step-by-step reasoning is what GPT is going for, but it doesn't understand why the tactic works for some problems and fails for others.
I am verbose and not an expert so you might just want to jump to the screenshots...
"Reasoning" is the application of knowledge to a problem statement that results in making certain information explicit, namely, an answer that was always there, it just wasn't written down yet.
If the knowledge permits derivation of the result in one step of pattern matching---equivalent to constraint satisfaction---then we can expect many architectures to work. If instead, multiple steps are required, then the problem becomes one of search. Search requires keeping track of intermediate results, along with info to navigate the search space.
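The one-step-versus-multi-step distinction can be made concrete with a toy Python sketch (the rewrite rules and function names here are my own invention, purely for illustration): a single pattern-matching step is just a lookup, while a multi-step derivation becomes a search that must carry intermediate results plus the bookkeeping needed to navigate the search space.

```python
from collections import deque

# Hypothetical rewrite rules: any adjacent pair matching a key
# may be replaced by its value.
RULES = {"AB": "B", "BB": "A"}

# One step of pattern matching is just a lookup:
print(RULES.get("AB"))  # -> B

def rewrites(s):
    """All strings reachable from s by applying one rule at one position."""
    out = []
    for i in range(len(s) - 1):
        pair = s[i:i + 2]
        if pair in RULES:
            out.append(s[:i] + RULES[pair] + s[i + 2:])
    return out

def derive(start, goal):
    """Breadth-first search: the queue and `seen` set are exactly the
    'intermediate results plus navigation info' a reasoner must hold."""
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if state == goal:
            return path
        for nxt in rewrites(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [nxt]))
    return None

print(derive("AABB", "A"))  # -> ['AABB', 'ABB', 'BB', 'A']
```

The point of the sketch: once more than one step is required, the solver must hold a frontier of partial derivations somewhere, which is precisely the question the next paragraph asks about LLMs.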
Where might an LLM hold this information?
In a transformer, in order to parse and emit natural language, the early and late layers probably must attend primarily to lexical and syntactic matters. Presumably the middle layers are the ones that can afford to represent semantics, including various forms of abstraction. That said, the activation vectors must carry information at different time scales as the residual stream gets modified: local linguistic patterns must be carried from input to output, but through superposition, activations in the middle layers also carry longer-range pressures and constraints associated with the categories and manifolds underlying the structure of the problem domain and the problem statement.
Does a transformer LLM have enough room to hold intermediate steps of a complex reasoning task?
The amount of activation-vector capacity is the depth of the network times the number of tokens in play. Chain-of-thought prompting expands capacity by allocating tokens in the context to *procedural steps* in the reasoning process. Instead of having to internally shoehorn placeholders for its location in the search tree into a short context consisting of the problem statement and a compact output, the model now has the luxury of seeing where it is in the search space right there in writing (output tokens representing steps along the chain of thought). Crucially, intermediate results are made explicit in the context, which allows a large and complex reasoning process to be subdivided into smaller parts, each of which can be solved with a smaller, more tractable pattern-matching step.
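The capacity argument can be sketched as code (all names here are hypothetical, not from the comment): a direct prompt forces the model to hold all intermediate state internally, while a chain-of-thought prompt writes each intermediate result back into the context as ordinary tokens that later forward passes can attend to.

```python
# Minimal illustration of how chain-of-thought prompting externalizes
# search state: completed steps become plain tokens in the context.

def direct_prompt(problem: str) -> str:
    # The model must track all intermediate state internally.
    return f"{problem}\nAnswer:"

def cot_prompt(problem: str, steps: list[str]) -> str:
    # Each completed step is appended to the context, where later
    # forward passes can read it back instead of re-deriving it.
    lines = [problem, "Let's think step by step."]
    lines += [f"Step {i + 1}: {s}" for i, s in enumerate(steps)]
    return "\n".join(lines)

problem = "How many vowels are in 'transformer'?"
steps = ["t-no", "r-no", "a-yes (1)", "n-no", "s-no", "f-no",
         "o-yes (2)", "r-no", "m-no", "e-yes (3)", "r-no"]
print(cot_prompt(problem, steps))
```

Each "Step i" line is exactly the kind of explicit intermediate result the paragraph describes: the next pattern-matching step only needs to extend the written record, not reconstruct the whole derivation internally.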
Transformers are not the only LLM architecture. Other architectures, especially ones that hold a great deal of internal state that is not closely constrained to text tokens, might behave quite differently.
Thanks for this insightful post.
"Take a deep breath and work step-by-step!" is now more effective in some tasks than "Let’s think step by step". https://arxiv.org/pdf/2309.03409.pdf
It seems the teams at DeepMind were right all along: the same patterns of "content effects" affect both humans and LLMs.
One thing we should look for to detect abstract reasoning is cases where abstract reasoning leads to errors. The classic examples are things like "Birds can fly, a penguin is a kind of bird, therefore a penguin flies." Abstract reasoning works by discarding most (or all) of the context and applying an abstract rule to draw a conclusion. This is dangerous, of course, and can lead to faulty conclusions. I suspect that human inference includes a third step, checking the conclusion for plausibility by referring back to what is known about the context. I guess we would need to create novel contexts where the AI system lacks (pre-training) experience in order to detect these kinds of errors.
Thanks for a very insightful and detailed discussion on the cognitive ability and in some sense a step towards understanding what creativity means in the context of LLM/GPT.
I am not sure what your thoughts are on this (or whether you know of any related discussions), but I think reasoning and creativity are very important and practical questions. As we deal with the legality of patents and copyrights, the central questions also come to concern the intellectual capabilities of the AI. So these are no longer academic or merely philosophical questions, but issues that will shape AI legislation and its impact.
Did you catch the paper "The Larger They Are, the Harder They Fail: Language Models Do Not Recognize Identifier Swaps in Python"?
That would have made a perfect inclusion in this article
The discourse around the reasoning abilities of LLMs is indeed multi-faceted. While the emergence of CoT prompting has unveiled certain latent capabilities in these models, the depth of true reasoning versus sophisticated pattern recognition remains an open question. The studies you mentioned indeed hint at a more complex interplay between memorization and reasoning, showcasing not just the strides made in AI development but also the intricate pathway that lies ahead in achieving genuine artificial general intelligence.
Exciting times ahead!
I truly value this informative article, thank you! We featured it in today's newsletter. :)
Really appreciated this clear discussion, thank you!
As with most of AI discussions these days, we get so caught up in the glitzy outputs we don’t stop to define our terms or think robustly about what is required to test. Thank you for helping cut through the noise!
That's a wonderful summary of the SoTA, Melanie. I have written about the supposed "emergent" features here, providing a perspective from complexity science.
I’m interested in the question of an LLM “memorizing” something. When I prompt ChatGPT with “To be or not to be”–nothing more–it returns the entire soliloquy along with a bit of commentary. Given that it was trained to predict the next word, what must have been the case for it to be able to return that entire soliloquy, word for word?
Shakespeare’s “Hamlet” is a well-known play and likely appeared many times in the training corpus. That soliloquy is also well-known and probably appeared many times independently of the play itself, along with commentary. Given that GPT encountered that particular string many times, it makes sense that it should have “memorized” it, whatever that means.
Now, what happens when you prompt it with a phrase from the soliloquy? I opened a new session and prompted it with “The insolence of office”. It returned pretty much the entire soliloquy. “The slings and arrows” (another new session) returned the first five lines of the soliloquy (it begins the third line). Then I prompted it with “and sweat under a,” (new session), which is from the middle of a line a bit past the soliloquy’s midpoint. ChatGPT didn’t recognize it, but then did so when I told it that it was from Hamlet’s famous soliloquy. I think this is worth exploring further – prompting with phrases from various locations – but I haven’t done so yet.
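That fragment-position experiment is easy to put into a harness. Here is a minimal Python sketch (the function names and the stub completer are my own, for illustration): it slices fragments from several relative positions in a text and checks whether a completer continues the source word-for-word. Swapping the stub for a call to a real chat model would run the actual experiment.

```python
# Probe whether a completer continues a (possibly memorized) text when
# prompted with fragments taken from different relative positions.
# `complete` is any callable prompt -> continuation; a stub is used here
# so the harness can be checked before pointing it at a real model.

def probe_fragments(text, complete, frag_len=5,
                    positions=(0.0, 0.25, 0.5, 0.75)):
    words = text.split()
    results = {}
    for p in positions:
        start = int(p * (len(words) - frag_len))
        fragment = " ".join(words[start:start + frag_len])
        expected_next = (words[start + frag_len]
                         if start + frag_len < len(words) else "")
        continuation = complete(fragment)
        # Success = the first continued word matches the source text.
        results[p] = continuation.split()[:1] == [expected_next]
    return results

# Stub completer that has "memorized" the text perfectly:
soliloquy = ("To be or not to be that is the question "
             "Whether tis nobler in the mind to suffer "
             "The slings and arrows of outrageous fortune")

def perfect_recall(prompt):
    idx = soliloquy.find(prompt)
    return soliloquy[idx + len(prompt):].strip() if idx >= 0 else ""

print(probe_fragments(soliloquy, perfect_recall))
```

A real model would presumably show the pattern described above: success for fragments near the famous opening, degrading for fragments buried mid-line, which is exactly what the per-position result dictionary would reveal.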
Then we have the rather different situation of things that are (likely to have been) in the training corpus but do not show up when prompted for. Years ago I heard Dizzy Gillespie play at the Left Bank Jazz Society in Baltimore. I blogged about it twice, both times well before the cut-off date for training. I have no way of knowing whether or not those blog posts were actually in the training corpus, but I gave ChatGPT the following prompt: “Dizzy Gillespie plays for the Left Bank Jazz Society in Baltimore’s Famous Ballroom.” It didn’t recognize the event and gave a confused reply. I then named the two blogs where I’d posted about the concert. Again, nothing.
Since I don’t know whether or not those blog posts were actually in the training corpus, I don’t know quite what to think. But my default belief at this time is that there’s a bunch of stuff in the corpus that never shows up in response to prompting, because those events didn’t appear very often in the corpus.
Between those two cases we have something like the Johnstown flood of 1889, a historical event of some moderate importance, but certainly not as prominent as, say, the bombing of Pearl Harbor. I prompted ChatGPT with “Johnstown flood, 1889,” and got a reasonable response. (Having grown up in Johnstown, I know something about that flood.) I issued the same prompt at a different session and again got a reasonable response, but one that was different from the first.
I’ve written this up in a blog post: https://new-savanna.blogspot.com/2023/09/what-must-be-case-that-chatgpt-would.html
Best overview on the subject of reasoning to date! Great read
Great post. Always good to read critical overviews of large neural networks and the hype surrounding their cognitive capabilities. LLMs are very good at confabulation as would be expected from their training process (because they essentially compress a large corpus of text onto a smooth manifold) so they're great at generating some text on some topic with a specific style or content that combines 2 or more topics like "Adelic Quantum Group Representations" (https://news.ycombinator.com/item?id=37368561) but no one should expect smooth manifold approximations of text (and other modalities) to have any logical or abstract reasoning capabilities.
What an enlightening post! In the same vein, look at a recent talk by Evelina Fedorenko from MIT (https://bit.ly/40TDF9I) based on direct brain observations (EEG, fMRI, etc.). She shows that language and thought only weakly overlap, even in the human brain.
I know some LLMs that can reason a lot better than some humans I know