'world model' in AI
(harvests of a Google search for 'world model')

What Is a World Model? nvidia.com

...World models are generative AI models that understand the dynamics of the real world, including physics and spatial properties. They use input data, including text, image, video, and movement, to generate videos. They understand the physical qualities of real-world environments by learning to represent and predict dynamics like motion, force, and spatial relationships from sensory data.

World Models: Can agents learn inside of their own dreams? github.io

We explore building generative neural network models of popular reinforcement learning environments. Our world model can be trained quickly in an unsupervised manner to learn a compressed spatial and temporal representation of the environment. By using features extracted from the world model as inputs to an agent, we can train a very compact and simple policy that can solve the required task. We can even train our agent entirely inside of its own dream environment generated by its world model, and transfer this policy back into the actual environment.
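The paper's architecture decomposes into three parts: a vision model (V) that compresses each frame into a latent code, a memory model (M) that predicts the next latent, and a small controller (C) that acts on the current latent plus the memory's hidden state. The sketch below (PyTorch) is a minimal, illustrative layout of that decomposition; the dimensions, the plain VAE-style encoder, the LSTM standing in for the paper's MDN-RNN, and the linear controller are assumptions, not the authors' code.

    import torch
    import torch.nn as nn

    class Vision(nn.Module):                    # V: compress a frame into a latent z
        def __init__(self, z_dim=32):
            super().__init__()
            self.enc = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64 * 3, 256), nn.ReLU())
            self.mu = nn.Linear(256, z_dim)
            self.logvar = nn.Linear(256, z_dim)
        def forward(self, frame):               # frame: (batch, 3, 64, 64)
            h = self.enc(frame)
            mu, logvar = self.mu(h), self.logvar(h)
            return mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterized sample

    class Memory(nn.Module):                    # M: predict the next latent from (z, action)
        def __init__(self, z_dim=32, a_dim=3, h_dim=256):
            super().__init__()
            self.rnn = nn.LSTM(z_dim + a_dim, h_dim, batch_first=True)
            self.to_z = nn.Linear(h_dim, z_dim)
        def forward(self, z, a, state=None):    # z, a: (batch, seq, dim)
            out, state = self.rnn(torch.cat([z, a], dim=-1), state)
            return self.to_z(out), state        # predicted next latent, recurrent state

    class Controller(nn.Module):                # C: tiny policy over (z, hidden state)
        def __init__(self, z_dim=32, h_dim=256, a_dim=3):
            super().__init__()
            self.fc = nn.Linear(z_dim + h_dim, a_dim)
        def forward(self, z, h):
            return torch.tanh(self.fc(torch.cat([z, h], dim=-1)))

Training "inside the dream" then amounts to rolling M forward on its own predicted latents and optimizing only C against those imagined rollouts, without touching the real environment.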

LLMs, Make Room For World Models Brian Hopkins at Forrester.com

...World Models Are Emerging And Important

At the frontier of AI research lives a potentially huge development: world models. Technically, a world model is a neural network architecture for learning through observation and prediction. But don't confuse it with predictive analytics. The ambition for world models is no less than approximating human observation, learning, reasoning, planning, and acting ... in other words, thinking. For those who like to read the literature, world models were first named in this research paper from David Ha in 2018. Yann LeCun from Meta is the most prominent AI researcher working on an entire cognitive architecture based on world models.

'World Models,' an Old Idea in AI, Mount a Comeback John Pavlus at Quanta Magazine

You're carrying around in your head a model of how the world works. Will AI systems need to do the same?

...a world model: a representation of the environment that an AI carries around inside itself like a computational snow globe. The AI system can use this simplified representation to evaluate predictions and decisions before applying them to its real-world tasks.
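The "snow globe" use of a world model is essentially planning by imagined rollout: try candidate decisions inside the internal model, score them, and only then act in the real environment. A minimal sketch, assuming a hypothetical world_model(state, action) callable that returns a predicted next state and reward, with random-shooting search as an illustrative (not canonical) planner:

    import random

    def plan(state, world_model, actions, horizon=5, candidates=100):
        """Pick an action by evaluating random action sequences in the internal model."""
        best_seq, best_return = None, float("-inf")
        for _ in range(candidates):
            seq = [random.choice(actions) for _ in range(horizon)]
            s, total = state, 0.0
            for a in seq:
                s, r = world_model(s, a)   # imagined step: no real-world side effects
                total += r
            if total > best_return:
                best_seq, best_return = seq, total
        return best_seq[0]                 # execute only the first action, then replan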

...does this mean that AI researchers have finally found a core concept whose meaning everyone can agree upon? As a famous physicist once wrote: Surely you're joking. A world model may sound straightforward — but as usual, no one can agree on the details. What gets represented in the model, and to what level of fidelity? Is it innate or learned, or some combination of both? And how do you detect that it's even there at all?

It helps to know where the whole idea started. In 1943, a dozen years before the term "artificial intelligence" was coined, a 29-year-old Scottish psychologist named Kenneth Craik published an influential monograph in which he mused that "if the organism carries a 'small-scale model' of external reality ...within its head, it is able to try out various alternatives, conclude which is the best of them ... and in every way to react in a much fuller, safer, and more competent manner." Craik's notion of a mental model or simulation presaged the "cognitive revolution" [George Miller 2003] that transformed psychology in the 1950s and still rules the cognitive sciences today. What's more, it directly linked cognition with computation: Craik considered the "power to parallel or model external events" to be "the fundamental feature" of both "neural machinery" and "calculating machines."

...In the past few years, as the large language models behind chatbots like ChatGPT began to demonstrate emergent capabilities that they weren't explicitly trained for — like inferring movie titles from strings of emojis, or playing the board game Othello — world models provided a convenient explanation for the mystery. To prominent AI experts such as Geoffrey Hinton, Ilya Sutskever and Chris Olah, it was obvious: Buried somewhere deep within an LLM's thicket of virtual neurons must lie "a small-scale model of external reality," just as Craik imagined.

The truth, at least so far as we know, is less impressive. Instead of world models, today's generative AIs appear to learn "bags of heuristics": scores of disconnected rules of thumb that can approximate responses to specific scenarios, but don't cohere into a consistent whole. (Some may actually contradict each other.) It's a lot like the parable of the blind men and the elephant, where each man only touches one part of the animal at a time and fails to apprehend its full form. One man feels the trunk and assumes the entire elephant is snakelike; another touches a leg and guesses it's more like a tree; a third grasps the elephant's tail and says it's a rope. When researchers attempt to recover evidence of a world model from within an LLM — for example, a coherent computational representation of an Othello game board — they're looking for the whole elephant. What they find instead is a bit of snake here, a chunk of tree there, and some rope.
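The probing experiments alluded to here typically fit a simple (often linear) classifier on a model's internal activations and ask whether task-relevant state, say the occupancy of each Othello square, can be read out. A minimal sketch with placeholder arrays standing in for real transformer activations and board labels, using scikit-learn as an illustrative choice:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    hidden = np.random.randn(1000, 512)             # placeholder activations: (n_positions, d_model)
    square_label = np.random.randint(0, 3, 1000)    # placeholder labels for one square: empty/black/white

    probe = LogisticRegression(max_iter=1000).fit(hidden[:800], square_label[:800])
    print("held-out probe accuracy:", probe.score(hidden[800:], square_label[800:]))

High probe accuracy across squares is read as evidence of a coherent board representation; the "bag of heuristics" finding is that what gets read out tends to be piecemeal rather than one consistent board.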

...Given the benefits that even simple world models can confer, it's easy to understand why every large AI lab is desperate to develop them — and why academic researchers are increasingly interested in scrutinizing them, too. Robust and verifiable world models could uncover, if not the El Dorado of AGI, then at least a scientifically plausible tool for extinguishing AI hallucinations, enabling reliable reasoning, and increasing the interpretability of AI systems.

LLMs and World Models, Part 1: How do Large Language Models Make Sense of Their "Worlds"? Melanie Mitchell (and Part 2)

...there's a fiery debate in the AI community about how these systems achieve their high performance. Have they basically memorized their training data, retrieving it (in some "approximate" way) to solve new problems? Have they learned much more numerous and detailed, yet still brittle, heuristic shortcuts? Or do they have something more like the robust "world models" that humans seem to use to understand and act in the world?

OpenAI co-founder Ilya Sutskever asserts that these systems have learned robust world models:

"When we train a large neural network to accurately predict the next word in lots of different texts....it is learning a world model.... This text is actually a projection of the world.... What the neural network is learning is more and more aspects of the world, of people, of the human conditions, their hopes, dreams, and motivations...the neural network learns a compressed, abstract, usable representation of that."

...The term "world model" has become a buzzword in AI circles, but it doesn't have a single, agreed-upon, definition. Here are a few definitions of a world model from the AI literature.

"[I]nternal representations that simulate aspects of the external world."

"[R]epresentations which preserve the causal structure of the environment as far as is necessitated by the tasks an agent needs to perform."

"[S]tructure-preserving, behaviorally efficacious representations of the entities, relations, and processes in the real world. These representations capture, at an abstract level, their counterpart real-world processes (which typically involve causal relations), in algorithmically efficient forms, to support relevant behaviors."

These informal definitions emphasize that world models exist in an organism's brain or in, say, an LLM's neural network; that they capture something about the world that is causal and abstract (or compressed) rather than simply based on large sets of statistical associations; that they don't require too much work for the agent to use ("algorithmically efficient"); and that they are relevant to tasks the agent performs.

...It's important to note that our world models don't just exist for the real world; they can also be formed and used to reason about imaginary worlds, such as those created in science fiction or fantasy literature.

(links to Language Models, World Models, and Human Model-Building, Jacob Andreas, which offers three models:

...The map, the orrery, and the simulator are all models of the same underlying system. Where they differ is in their affordances—the set of questions they enable a user of the model to answer, and the actions the user needs to take in order to obtain those answers. The map lets us answer static, timeless questions whose answers can be obtained from some prior snapshot of system state. The orrery lets us answer conditional questions about the past and future states of the system, by additionally giving us a crank that moves it forward or backward in time. And the simulator lets us answer counterfactual questions about the system by representing something close to its true underlying dynamics (but requires us to do substantially more work to specify the initial conditions for these counterfactuals).

With these differences in affordances come differences in the complexity required to implement each model. You can make a map with stone-age technology, and build the orrery in a 17th-century goldsmith's shop, but can really only produce the simulator with 20th-century technology (from chip fabs to FORTRAN compilers).

and he cites Evaluating the World Model Implicit in a Generative Model, Keyon Vafa et al. (2024))
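Andreas's map / orrery / simulator distinction is, in effect, a distinction between interfaces. A toy sketch of the three affordance levels, with invented class names and placeholder dynamics purely for illustration:

    class Map:                                 # static lookup: answers come from a stored snapshot
        def __init__(self, snapshot):
            self.snapshot = snapshot
        def query(self, key):
            return self.snapshot[key]

    class Orrery:                              # adds a crank: step the stored state forward or backward
        def __init__(self, state, step_fn, unstep_fn):
            self.state, self.step, self.unstep = state, step_fn, unstep_fn
        def advance(self, steps):
            for _ in range(abs(steps)):
                self.state = self.step(self.state) if steps > 0 else self.unstep(self.state)
            return self.state

    class Simulator:                           # near-true dynamics: caller supplies initial conditions
        def __init__(self, dynamics):
            self.dynamics = dynamics
        def run(self, initial_conditions, t_steps):
            s = initial_conditions
            for _ in range(t_steps):
                s = self.dynamics(s)           # supports counterfactual "what if we had started from X?"
            return s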

A world model: On the political logics of generative AI Louise Amoore et al. Political Geography (2024)

...generative AI is shaping and delimiting the political parameters of what can be known and actioned in the world. Contra the promise of a generalizable "world model" in computer science, the article addresses how and why generative AI gives rise to a model of the world, and with it a set of political logics and governing rationalities that have profound and enduring effects on how we live today.

...This orientation of AI towards general discovery of "how the world works" and adaptation to new domains and tasks is captured by what LeCun calls a "world model", which would supply "an internal model of how the world works" so that AI becomes "configurable" to each new situation it encounters (2022:2-3). The powerful claim that a flexible, reconfigurable world model could deal with all potential future unencountered situations defines much of the politics of contemporary generative AI. It is a claim that promises a general resolution of difficult problems across technical computational and political paradigms: an AI model that draws upon a structure of "how the world works" in order to respond to an input it had never encountered in training; and a political model that is always capable of action in the face of the unencountered situation. Though the concept of a world model is present in the ambitions of AI designers for a better and more adaptive "fit" to the world, it is also present in the critical concerns of the humanities and social sciences, where it is said, for example, that "GPT3 does not have a model of the world" whereas "every human grows up with a model of the world" (Hayles, 2023, p. 258). In these formulations — spanning the AI proponents' desires and the critics' disquiet — a model of the world defines something that AI lacks, whether due to its inefficiencies, its absence of embodiment, or its need to incorporate and learn from prior experiences.