Food for thought

Why Claude Mythos is not AGI

Yesterday, on April 7, 2026, Anthropic announced "Project Glasswing", an initiative with partners across Big Tech to deploy their latest model, "Claude Mythos Preview", "to secure the world's most critical software".

I've seen some people, mostly on Twitter and some on Hacker News, talking about "AGI" and how we are all "cooked".

I will not talk about the risk of LLMs compromising every piece of software through vulnerability search.

What is AGI?

The definition of "Artificial General Intelligence" is actively debated among Artificial Intelligence researchers and tech bros on Twitter.

We have some lunatics who have been announcing AGI since the release of GPT-4 in March of 2023 (three years ago!), and more and more venture capitalists announcing it daily. But it's just noise to feed the hype machine; we can ignore it.

But at its core, an AGI must be "general". I believe that a general intelligence must be able to learn new concepts that are not in its training corpus.

You may think this means we should make LLMs able to learn continually (Ilija Lichkovski wrote a good article about it on Twitter), but no: they are fundamentally limited.

How can we make a General Intelligence?

LLMs are limited by their training objective, Next Token Prediction. This objective forces LLMs to become very good text autocompleters, but it restricts their weights to thinking inside language (even if we can feed them videos, images, audio and other modalities).

This means they cannot plan or reason about the impact of their actions beyond generating more tokens, which is not very efficient.

I believe that a General Intelligence must be able to "think" through its internal "world", learned from the training corpus.

This is why I think that Latent World Models are the future for General Intelligence.

Why are Latent World Models the future?

A Latent World Model is a deep-learning model that takes as input the current state and an action (both in the form of vectors) and predicts the next state (also a vector).
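As a minimal sketch of that interface (using a single linear map as a stand-in for a deep network; all dimensions and names here are made up for illustration):

```python
import numpy as np

# Hypothetical toy latent world model. Real world models are deep networks
# operating in a learned latent space; a linear map keeps the interface
# visible: (state vector, action vector) -> predicted next state vector.
STATE_DIM, ACTION_DIM = 4, 2
rng = np.random.default_rng(0)
A = rng.normal(size=(STATE_DIM, STATE_DIM))   # state-transition weights
B = rng.normal(size=(STATE_DIM, ACTION_DIM))  # action-effect weights

def predict_next_state(state, action):
    """One step of the world model, entirely in vector (latent) space."""
    return A @ state + B @ action

s0 = rng.normal(size=STATE_DIM)   # current latent state
a0 = rng.normal(size=ACTION_DIM)  # candidate action
s1 = predict_next_state(s0, a0)   # predicted next latent state
```

A real model would replace `A` and `B` with learned network weights, but the contract (vectors in, vector out) is the same.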

We can even use World Models to plan. By freezing the model's weights and optimizing the action vectors directly via gradient descent, we can find a sequence of actions that leads to a desired future state, without ever interacting with the real world!

We can generate the target state for this process by encoding the desired outcome with the world model's encoder!
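Here is a self-contained sketch of that planning loop, again with a hypothetical linear world model so the action gradient can be written by hand (every name and dimension here is illustrative, not from any real system):

```python
import numpy as np

# Toy linear world model standing in for a deep network.
STATE_DIM, ACTION_DIM = 4, 2
rng = np.random.default_rng(0)
A = rng.normal(size=(STATE_DIM, STATE_DIM)) * 0.1  # frozen transition weights
B = rng.normal(size=(STATE_DIM, ACTION_DIM)) * 0.5  # frozen action weights

def predict_next_state(state, action):
    return A @ state + B @ action

s_current = rng.normal(size=STATE_DIM)
s_target = rng.normal(size=STATE_DIM)  # in practice: encoder(desired observation)

# Gradient descent on ||f(s, a) - s_target||^2 with respect to the ACTION only:
# A and B never change (the model is frozen), and we never touch the real world.
action = np.zeros(ACTION_DIM)
lr = 0.1
initial_error = np.linalg.norm(predict_next_state(s_current, action) - s_target)
for _ in range(1000):
    residual = predict_next_state(s_current, action) - s_target
    grad = B.T @ residual  # analytic gradient w.r.t. the action vector
    action -= lr * grad
final_error = np.linalg.norm(predict_next_state(s_current, action) - s_target)
```

With a real deep network, the hand-derived gradient would be replaced by an autodiff framework, and a sequence of actions would be optimized jointly instead of a single step; the principle is unchanged.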

I believe that in the future, LLMs will become components of world models, used to communicate in natural language and nothing more.


If you are curious about this concept, I recommend reading articles and scientific papers about methods in this space:

#AGI #Anthropic #Claude