
>the limits of LLMs will be hit long before they start to take on human capabilities.

Why do you think this? The rest of the comment is just rephrasing this point ("LLMs aren't suited for AGI"), but you don't seem to provide any argument.



Fair point.

Basically AGI describes human-like capabilities.

The problem with LLMs is that they’re, at their core, token prediction models. Tokens, typically fragments of text, are given numeric values, which the model then uses to predict which tokens should follow.

This makes them extremely good at things like working with source code and other sources of text where relationships are defined via semantics.
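To make the token-prediction point concrete, here’s a rough sketch using the Hugging Face transformers library (GPT-2 chosen only because it’s small to download; treat it as an illustration, not a claim about any particular product):

    # text -> token ids -> a score for every possible next token
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    inputs = tokenizer("def add(a, b):\n    return", return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits        # shape: [1, seq_len, vocab_size]
    next_id = int(logits[0, -1].argmax())      # most likely next token
    print(tokenizer.decode([next_id]))

Everything an LLM does is built on that loop: score every possible next token, pick one, append it, repeat.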

The problem with this is that it makes them very poor at dealing with:

1. Limited datasets. Smaller models are consistently shown to be less capable. So LLMs often need to ingest significantly more information than a human would learn in their entire lifetime, just to approximate what that human might produce on any specific subject.

2. Learning new content. Here we have to rely on non-AI tooling like MCPs. This works really well with current models because we can say “scrape these software development references” (etc.) to keep the model up to date. But there’s no independence behind those actions. An MCP only works because the prompt is given a description of it, how to use it, and why it should be used (rough sketch after this list). Whereas if you look at humans, even babies know how to investigate and learn independently. Our ability to self-learn is one of the core principles of human intelligence.

3. Remembering past content that resides outside of the original model training. I think this is actually a solvable problem for LLMs, but the current behaviour is to bundle all prior interactions into the next prompt (sketch after the list). In reality, the LLM hasn’t remembered anything; you’re just reminding it of everything with each exchange. So each subsequent prompt gets longer and thus more fallible. It also means that context is always volatile. Basically it’s a hack that only works because context sizes have grown so dramatically. But if we want AGI then there needs to be a persistent way of retaining that context. There are some workarounds here, but they depend on tools.

4. Any operation that isn’t driven by semantics. Things like maths, for example. LLMs have to call a tool (e.g. via MCPs) to perform calculations. But that means a non-AI function returns the result rather than the AI reasoning about the maths (sketch after the list). So it’s another hack. And there are a lot of domains in this category where even clever tokenisation is simply not enough. This, I think, is going to be the biggest hurdle for LLMs.

5. Anything related to the physical world. We’ve all seen examples of image generation models drawing too many fingers on a hand, or leaving disembodied objects floating. The fixes amount to defining what a hand should look like. But without the AI having access to a physical, three-dimensional world to explore, it’s all just guessing what things might look like. This is particularly hard for LLMs because they’re language models, not 3D coordinate systems.
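On point 2, this is roughly why a tool (MCP or otherwise) only gets used: a plain-text description of it is pushed into the model’s context on every request. The names and format below are made up for illustration, not the actual MCP protocol:

    # The model only "knows" about these tools because this text is
    # prepended to every prompt. Nothing is learned or remembered.
    TOOLS = {
        "fetch_docs": "Fetch the latest docs for a library. Args: {\"library\": \"<name>\"}",
    }

    def describe_tools():
        lines = [f"- {name}: {desc}" for name, desc in TOOLS.items()]
        return "You can call these tools:\n" + "\n".join(lines)

    system_prompt = describe_tools() + "\nReply with TOOL:<name> <json args> to call one."
    print(system_prompt)

Take that text out of the prompt and the “ability” disappears; there’s no independent decision to go and learn something.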
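On point 3, this is roughly what chat “memory” looks like in practice. complete() here is a stub standing in for whatever model API you’d actually call; the shape of the loop is the point:

    # Every turn, the ENTIRE history is re-sent as part of the next prompt.
    history = [{"role": "system", "content": "You are a helpful assistant."}]

    def complete(messages):
        # stub in place of a real model call, so the sketch runs on its own
        return f"(reply generated from {len(messages)} messages of context)"

    def chat(user_message):
        history.append({"role": "user", "content": user_message})
        reply = complete(history)              # the model sees all prior turns again
        history.append({"role": "assistant", "content": reply})
        return reply

    chat("Remember that my build target is arm64.")
    chat("What was my build target?")          # only "remembered" because it was re-sent

Drop the list (or overflow the context window) and the “memory” is gone, which is why it’s a hack rather than genuine retention.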
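On point 4, a rough sketch of maths-by-tool-call. The CALC: convention is invented for illustration; the point is that ordinary non-AI code, not the model, does the arithmetic:

    import ast, operator

    OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
           ast.Mult: operator.mul, ast.Div: operator.truediv}

    def calc(expr):
        # Safely evaluate a small arithmetic expression. This function,
        # not the model, is what actually does the maths.
        def ev(node):
            if isinstance(node, ast.Constant):
                return node.value
            if isinstance(node, ast.BinOp):
                return OPS[type(node.op)](ev(node.left), ev(node.right))
            raise ValueError("unsupported expression")
        return ev(ast.parse(expr, mode="eval").body)

    model_output = "CALC: (12 + 7) * 3"        # pretend the model emitted this
    if model_output.startswith("CALC:"):
        print(calc(model_output[len("CALC:"):].strip()))   # 57

The model’s contribution is recognising that a calculation is needed and formatting the request; the actual reasoning about numbers happens outside it.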

There’s also the question of whether holding vector databases of token weights is the same thing as “reasoning”, but I’ll leave that argument to the philosophers.

I think a theoretical AGI might use LLMs as part of its subsystems. But it would need AI throughout that can handle topics which are more than just token relationships, and that’s something LLMs cannot do.



