As always, thought-provoking and unbiased.
Love it! You deserve it Melanie!
I have a much more mundane and non-expert question for you guys here: years ago, before Substack existed, and before LLMs – back when "deep neural networks" were all the rage – I followed a computer science professor from Boston University on Quora. One thing he expressed was his amazement at how easily and quickly his four-year-old daughter could identify objects across the room, or how quickly even smaller children could grasp new concepts, like, for example, that of "ball".
Now, years later, we are talking about LLMs supercharged to teach them reasoning skills at Maths Olympiad level, but I very much wonder whether anything about the former observation has really improved. An LLM will know what a ball is simply because it was part of the training data. (Or Pythagoras's theorem.) But a small child can learn new concepts from an extremely small sample base. I simply fail to see any significant progress on that front. Or am I mistaken?
I mean, much of real research, and even more so real-world practical problem solving, is about making sense of things that have a very small sample base. Are LRMs a step in that direction? My sense is that they are not, but maybe something slightly different could be. Something that observes things, builds a reasoning model around the observation, derives conclusions and sees if new observations fit its model. That would be more practical and more akin to the wondrous flexibility of human intelligence.
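To make that last idea a bit more concrete, here is a minimal, purely illustrative sketch of the observe / build-a-model / derive-conclusions / check-new-observations loop I have in mind (all names here are hypothetical, and the "model" is nothing more than a line fitted to a tiny sample):

```python
# Toy sketch of the observe -> model -> predict -> check loop described above.
# Purely illustrative: the "reasoning model" is just a least-squares line fitted
# to a handful of observations, and new observations either confirm it or not.

def fit_line(points):
    """Fit y = a*x + b from a very small sample of (x, y) pairs."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

def fits_model(model, observation, tolerance=0.5):
    """Does a new observation agree with the model's derived prediction?"""
    a, b = model
    x, y = observation
    return abs((a * x + b) - y) <= tolerance

# A handful of observations is enough to build the model...
model = fit_line([(1, 2.1), (2, 3.9), (3, 6.0)])

# ...and each new observation either confirms it or signals that a revision is needed.
for obs in [(4, 8.1), (5, 15.0)]:
    print(obs, "fits" if fits_model(model, obs) else "does not fit -> revise model")
```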
Excellent article, two nit-picks:
> After this special training, when given a problem, the LRM does not generate tokens one at a time but generates entire chains of thought
It still generates one token at a time!
> some models even intersperse reasoning steps with words like “Hmmm,” “Aha!” or “Wait!” to make them sound more humanlike.
Maybe that's part of the reason, but the main reason is that these interjections signal to the next iteration that it needs to go in a different direction of thought. (A reason at another level of abstraction is that real CoTs recorded by real humans, which are in the training data, contain these interjections.)
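On the first nit-pick, here is a minimal greedy-decoding sketch (GPT-2 via Hugging Face transformers, used only as a stand-in model) showing that even a long chain of thought is emitted by the same one-token-at-a-time loop, and that an interjection like "Wait," already in the context is simply more conditioning for the next token:

```python
# Minimal greedy-decoding sketch: the whole "chain of thought" is produced by
# this loop, one token per iteration, each token conditioned on everything
# generated so far (including interjections such as "Wait," in the context).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "Problem: 17 * 24 = ?\nLet me think step by step. 17 * 24 = 17 * 20 + 17 * 4. Wait,"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    for _ in range(40):                      # the entire "chain" is just this loop
        logits = model(input_ids).logits     # scores for the *next* token only
        next_id = logits[0, -1].argmax()     # greedy pick of one token
        input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=1)

print(tokenizer.decode(input_ids[0]))
```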
With forty years in the AI field, I'd be more interested in what you are saying if you didn't continue to conflate the field (AI) with a single tool/technique/approach (LLMs and their various flavors) within it. Where is the discussion of all the other tools (e.g., various approaches to Symbolic Reasoning, Neural Networks, Neuro-Symbolic hybrids, Machine Learning other than LLMs), many of which have long track records as the basis of successful, robust, scalable, and transparent commercial applications?
“Where is the discussion of all the other tools…?”
Part is here: https://en.wikipedia.org/wiki/Artificial_Intelligence:_A_Guide_for_Thinking_Humans !!!
Thanks for this very clear and "scientifically neutral" popularization, as usual!
Could you situate LLMs and LRMs in relation to another LxM I recently heard about, the LCM, which claims to model Concepts rather than words? (Very tempting, if it's a way to finally bring symbolic AI and neural AI closer together!)
Just buzz?
Interesting, but still mainly based on LLMs?
A new revolution?
Do you have a reference to "LCM"?
Then, googling, I found this paper from Meta:
https://ai.meta.com/research/publications/large-concept-models-language-modeling-in-a-sentence-representation-space/
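From a quick skim, and heavily simplified, the contrast seems to be: an LLM predicts the next token, while an LCM predicts a representation of the next sentence (a "concept") in some sentence-embedding space. Here is a purely hypothetical toy sketch of that idea; encode_sentence and the single linear layer below are placeholders of my own, not the paper's actual architecture:

```python
# Toy contrast between next-token prediction (LLM) and next-"concept"
# prediction (LCM). Everything here is a hypothetical placeholder:
# encode_sentence stands in for a real sentence encoder, and the single
# linear layer stands in for the actual model.
import torch
import torch.nn as nn

DIM = 16

def encode_sentence(sentence: str) -> torch.Tensor:
    """Stand-in for a real sentence encoder: a deterministic pseudo-embedding
    derived from the text, just so the example runs."""
    torch.manual_seed(abs(hash(sentence)) % (2 ** 31))
    return torch.randn(DIM)

# An LLM models P(next token | previous tokens).
# An LCM instead models: next sentence embedding = f(previous sentence embeddings).
predict_next_concept = nn.Linear(DIM, DIM)

story = ["The sky darkened.", "Thunder rolled in the distance.", "Rain began to fall."]
context = torch.stack([encode_sentence(s) for s in story[:-1]]).mean(dim=0)

predicted = predict_next_concept(context)          # a vector, not a token
target = encode_sentence(story[-1])                # embedding of the real next sentence
loss = nn.functional.mse_loss(predicted, target)   # the training signal lives in concept space
print("loss in concept space:", float(loss))
```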
Thanks!
I saw this post on LinkedIn:
https://www.linkedin.com/pulse/large-language-modelsllm-concept-modelslcm-explained-ghada-richani-a1buc?utm_source=share&utm_medium=member_ios&utm_campaign=share_via
Great article! Out of curiosity do you have any suggested references on the question of “what is reasoning”? Seems like a lot of the questions around LRM hinge on how one thinks about that question.
Lots of debate in the cogsci literature on "what is reasoning". I found this article (Chapter 12 of the linked book) useful but not comprehensive: https://arthurjensen.net/wp-content/uploads/2014/06/2004-sternberg-cognitionandintelligenceidentifyingthemechanismsofthemind.pdf#page=239
This is a great article! I really liked the first link about METAPHOR. I see that too! TYSM. I am an old-school AI researcher, and my work indicates (spoiler alert):
AI looks like a massive PRISM (metaphor).
Where REFLECTION and PROJECTION go way deeper ... than we (can) think! Added fun with "spectra" and "spectrum" - they match the math!! Also important... the SPARK of light is the binary... (better metaphor)! Love to talk more about this... ✌️🙂↕️🪄
Nice documents. Must-read.