As always, thought-provoking and unbiased.
Love it! You deserve it Melanie!
I have a much more mundane and non-expert question for you guys here: years ago, before Substack existed, and before LLMs – back when "deep neural networks" were all the rage – I followed a computer science professor from Boston University on Quora. One thing he expressed was his amazement at how easily and quickly his four-year-old daughter could identify objects across the room, or how quickly even smaller children could grasp new concepts, like, for example, that of "ball".
Now, years later, we are talking about LLMs supercharged to teach them reasoning skills at Maths Olympiad level, but I very much wonder whether anything about the former observation has really improved. An LLM will know what a ball is simply because it was part of the training data. (Or Pythagoras's theorem.) But a small child can learn new concepts from an extremely small sample base. I simply fail to see any significant progress on that front. Or am I mistaken?
I mean, much of real research, and even more so real-world practical problem solving, is about making sense of things that have a very small sample base. Are LRMs a step in that direction? My sense is that they are not, but maybe something slightly different could be. Something that observes things, builds a reasoning model around the observation, derives conclusions and sees if new observations fit its model. That would be more practical and more akin to the wondrous flexibility of human intelligence.
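To make that last idea a bit more concrete, here is a minimal, purely illustrative sketch of the observe / build-a-model / derive-conclusions / check-new-observations loop I have in mind (all names here are hypothetical, and the "model" is nothing more than a line fitted to a tiny sample):

```python
# Toy sketch of the observe -> model -> predict -> check loop described above.
# Purely illustrative: the "reasoning model" is just a least-squares line fitted
# to a handful of observations, and new observations either confirm it or not.

def fit_line(points):
    """Fit y = a*x + b from a very small sample of (x, y) pairs."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

def fits_model(model, observation, tolerance=0.5):
    """Does a new observation agree with the model's derived prediction?"""
    a, b = model
    x, y = observation
    return abs((a * x + b) - y) <= tolerance

# A handful of observations is enough to build the model...
model = fit_line([(1, 2.1), (2, 3.9), (3, 6.0)])

# ...and each new observation either confirms it or signals that a revision is needed.
for obs in [(4, 8.1), (5, 15.0)]:
    print(obs, "fits" if fits_model(model, obs) else "does not fit -> revise model")
```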
Excellent article, two nit-picks:
> After this special training, when given a problem, the LRM does not generate tokens one at a time but generates entire chains of thought
It still generates one token at a time!
> some models even intersperse reasoning steps with words like “Hmmm,” “Aha!” or “Wait!” to make them sound more humanlike.
Maybe that's part of the reason, but the main reason is that these interjections signal to the next iteration that it needs to go in a different direction of thought. (A reason at another level of abstraction is that real CoTs recorded by real humans, which are in the training data, contain these interjections.)
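On the first nit-pick, here is a minimal greedy-decoding sketch (GPT-2 via Hugging Face transformers, used only as a stand-in model) showing that even a long chain of thought is emitted by the same one-token-at-a-time loop, and that an interjection like "Wait," already in the context is simply more conditioning for the next token:

```python
# Minimal greedy-decoding sketch: the whole "chain of thought" is produced by
# this loop, one token per iteration, each token conditioned on everything
# generated so far (including interjections such as "Wait," in the context).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "Problem: 17 * 24 = ?\nLet me think step by step. 17 * 24 = 17 * 20 + 17 * 4. Wait,"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    for _ in range(40):                      # the entire "chain" is just this loop
        logits = model(input_ids).logits     # scores for the *next* token only
        next_id = logits[0, -1].argmax()     # greedy pick of one token
        input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=1)

print(tokenizer.decode(input_ids[0]))
```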
With forty years in the AI field, I'd be more interested in what you are saying if you didn't continue to conflate the field (AI) with a single tool/technique/approach (LLMs and their various flavors) within it. Where is the discussion of all the other tools (e.g., various approaches to Symbolic Reasoning, Neural Networks, Neuro-Symbolic hybrids, Machine Learning other than LLMs), many of which have long track records as the basis of successful, robust, scalable, and transparent commercial applications?
“Where is the discussion of all the other tools…?”
Part is here: https://en.wikipedia.org/wiki/Artificial_Intelligence:_A_Guide_for_Thinking_Humans !!!
Thanks for this very clear and "scientifically neutral" popularization, as usual!
Could you situate LLMs and LRMs in relation to another LxM I recently heard about, the LCM, which claims to model Concepts rather than words? (Very tempting, if it's a way to finally bring symbolic AI and neural AI closer together!)
Just buzz?
Interesting, but still mainly based on LLMs?
A new revolution?
Do you have a reference to "LCM"?
Then, googling, I found this paper from Meta:
https://ai.meta.com/research/publications/large-concept-models-language-modeling-in-a-sentence-representation-space/
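From a quick skim, and heavily simplified, the contrast seems to be: an LLM predicts the next token, while an LCM predicts a representation of the next sentence (a "concept") in some sentence-embedding space. Here is a purely hypothetical toy sketch of that idea; encode_sentence and the single linear layer below are placeholders of my own, not the paper's actual architecture:

```python
# Toy contrast between next-token prediction (LLM) and next-"concept"
# prediction (LCM). Everything here is a hypothetical placeholder:
# encode_sentence stands in for a real sentence encoder, and the single
# linear layer stands in for the actual model.
import torch
import torch.nn as nn

DIM = 16

def encode_sentence(sentence: str) -> torch.Tensor:
    """Stand-in for a real sentence encoder: a deterministic pseudo-embedding
    derived from the text, just so the example runs."""
    torch.manual_seed(abs(hash(sentence)) % (2 ** 31))
    return torch.randn(DIM)

# An LLM models P(next token | previous tokens).
# An LCM instead models: next sentence embedding = f(previous sentence embeddings).
predict_next_concept = nn.Linear(DIM, DIM)

story = ["The sky darkened.", "Thunder rolled in the distance.", "Rain began to fall."]
context = torch.stack([encode_sentence(s) for s in story[:-1]]).mean(dim=0)

predicted = predict_next_concept(context)          # a vector, not a token
target = encode_sentence(story[-1])                # embedding of the real next sentence
loss = nn.functional.mse_loss(predicted, target)   # the training signal lives in concept space
print("loss in concept space:", float(loss))
```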
Thanks!
I saw this post on LinkedIn:
https://www.linkedin.com/pulse/large-language-modelsllm-concept-modelslcm-explained-ghada-richani-a1buc?utm_source=share&utm_medium=member_ios&utm_campaign=share_via
Great article! Out of curiosity do you have any suggested references on the question of “what is reasoning”? Seems like a lot of the questions around LRM hinge on how one thinks about that question.
Lots of debate in the cogsci literature on "what is reasoning". I found this article (Chapter 12 of the linked book) useful but not comprehensive: https://arthurjensen.net/wp-content/uploads/2014/06/2004-sternberg-cognitionandintelligenceidentifyingthemechanismsofthemind.pdf#page=239
This is a great article! I really liked the first link about METAPHOR. I see that too! TYSM. I am an old-school AI researcher, and my work indicates (spoiler alert):
AI looks like a massive PRISM (metaphor).
Where REFLECTION and PROJECTION go way deeper ... than we (can) think! Added fun with "spectra" and "spectrum" - they match the math!! Also important... the SPARK of light is the binary... (better metaphor)! Love to talk more about this... ✌️🙂↕️🪄
Nice documents. Must-read.