
December 22, 2025 · By Shrey Kothari
Games, World Models, and Beyond
As 2025 comes to a close, I find myself reflecting on our journey so far, and what's in store for us (and the broader field) in 2026. This past year brought rapid advances in language and vision models. We watched "reasoning" and "agency" move from research adjectives to product primitives. We collectively experienced an AI take-off: the efficiency gains are hard to measure and quantify, but they are evident from a quick survey of the workflows and tools we use today compared to those at the beginning of 2025.
This past year also highlighted how far today's systems are from generalized intelligence. They look brilliant inside narrow loops but fail in ways that feel almost elementary. We've learned to respect the gap between generating text and operating in the real world.
Themes for 2026
From this perspective, two themes feel especially decisive for 2026: world models and the convergence of robotics with vision-language models.
Language models simulate our world by compressing it into text (which seems like an extraordinary hack). It works because there is an abundance of text data, training is relatively efficient, and the interface is universal. But it's also an over-distillation of reality. Text fails to capture the parts that don't fit cleanly into sentences: the physics, the continuity, and the latent constraints. World models, by contrast, are an attempt to capture more of this signal. They internalize how environments evolve with an agent's actions over time. World models' core promise is counterfactual: if I do x, what happens next? Once we can reliably answer that, we can stop treating intelligence as a text-completion problem and start treating it as a control problem under uncertainty.
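To make the counterfactual promise concrete, here is a minimal sketch. The dynamics below (a toy 1-D gridworld) are a hypothetical stand-in for a learned world model; the point is only the interface: given a state and an action, predict what happens next, so plans can be compared entirely inside the model without touching the real environment.

```python
# Toy stand-in for a learned world model: predict the next state from
# (state, action). Real world models learn this mapping from data; here
# the dynamics are hand-written purely for illustration.

def world_model(state: int, action: str) -> int:
    """Predict the next state for a 1-D gridworld on positions 0..10."""
    if action == "left":
        return max(0, state - 1)
    if action == "right":
        return min(10, state + 1)
    return state  # "stay"

def rollout(state: int, actions: list[str]) -> int:
    """Simulate a sequence of actions entirely inside the model."""
    for a in actions:
        state = world_model(state, a)
    return state

# Counterfactual comparison: two candidate plans from the same start,
# evaluated without ever acting in the real environment.
plan_a = rollout(5, ["right", "right", "right"])  # ends at 8
plan_b = rollout(5, ["left", "stay", "left"])     # ends at 3
```

Answering "if I do x, what happens next?" then reduces to comparing rollouts, which is exactly the control-under-uncertainty framing above.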
This is where robotics becomes unavoidable. There is no AGI without embodiment or physical intelligence. In robotics, where latency is crucial, it is hard (almost impossible) to neatly separate perception and action. As language models converge with robotics through VLAs (vision-language-action models) and agentic control stacks, the bottleneck shifts from "can you reason and talk" to "can you reason, act, and adapt."
Games as the Middle Layer
Games fit neatly into this structure. Just like world models, games model our world in a much richer modality than text (albeit in a much more constrained manner). They're the missing middle layer: a place to manufacture embodiment before realizing it in the real world. Games force players to balance curiosity with caution, deciding when to gather information and when to exploit it. They are the ideal testbed for teaching long-horizon reasoning, linking a win/loss to decisions made twelve steps earlier. Better still, games let us scale experience cheaply: we can generate millions of trajectories, instrument every decision, and collect not just what was done, but why.
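A hedged sketch of what "instrument every decision" might look like in practice: each step of a game trajectory logs the observation, the action taken, and a rationale for it, not just the outcome. The environment and agent below are hypothetical placeholders, not our actual stack.

```python
# Illustrative trajectory collection loop: log (observation, action,
# rationale, reward) at every step. The toy environment is a random
# walk that terminates on reaching state 3 or drifting out of bounds.

import random

def toy_env_step(state: int, action: str):
    """Placeholder environment step: returns (next_state, reward, done)."""
    next_state = state + (1 if action == "advance" else -1)
    reward = 1.0 if next_state == 3 else 0.0
    done = next_state == 3 or abs(next_state) > 5
    return next_state, reward, done

def collect_trajectory(seed: int = 0) -> list[dict]:
    rng = random.Random(seed)
    state, trajectory, done = 0, [], False
    while not done:
        action = rng.choice(["advance", "retreat"])
        rationale = f"chose {action} at state {state}"  # the "why"
        next_state, reward, done = toy_env_step(state, action)
        trajectory.append(
            {"obs": state, "action": action, "why": rationale, "reward": reward}
        )
        state = next_state
    return trajectory

traj = collect_trajectory()
```

Scaling this loop across seeds and environments is what turns games into cheap, fully instrumented experience.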
Highlights from 2025
- We raised our pre-seed round led by Active Capital and our entire team relocated to San Francisco
- We rebranded our company (4Wall AI → Antim Labs) to reflect our core thesis: interactive training environments and data
- We made our first hire and started building a team around execution velocity
- We open-sourced our first set of game environments to allow anyone to run experiments, reproduce results, and build on top
- We stood up model training and AI R&D to push the boundaries of research
- We delivered our first interactive simulation built to generate data for VLAs (vision-language-action systems)
- We started development on our first consumer game (on 4Wall AI), and established a distribution + gameplay loop to scale data collection
Plans for 2026
Our 2026 roadmap is focused on three compounding launches:
General Game Agents
We're training and releasing a general game agent model and demonstrating clean transfer from game-learned skills to economically valuable tasks (tool use, planning, reasoning under uncertainty).
We plan to open-source the model so anyone can fine-tune it for their niche (new games, new tasks, new domains) while we keep pushing the frontier on environments + training loops.
Action-Labeled Gameplay + Reasoning Traces at Scale
We're building the infrastructure to scale collection of action-labeled gameplay data across many game genres, with optional reasoning traces (the "why" behind actions, not just the "what").
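As a rough sketch of what one such record could look like, here is a hypothetical schema for an action-labeled gameplay step with an optional reasoning trace. Field names are illustrative only, not a published format.

```python
# Illustrative record schema for action-labeled gameplay data: the
# "what" (action) is always present, the "why" (reasoning) is optional.
# All names here are hypothetical, chosen for the example.

from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class GameplayRecord:
    game: str                  # game or genre identifier
    timestep: int
    observation: str           # e.g. a frame reference or encoded state
    action: str                # the "what"
    reasoning: Optional[str]   # the "why" (optional trace)

record = GameplayRecord(
    game="platformer-01",
    timestep=42,
    observation="frame_000042.png",
    action="jump",
    reasoning="gap ahead; jumping avoids falling",
)
row = asdict(record)  # ready for serialization into a dataset
```

Keeping the reasoning field optional lets the same pipeline ingest raw gameplay and annotated gameplay alike.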
World Model → Environment Generator
We're starting training on a world model that can serve as a fast environment generator for:
- Interactive RL training environments for language and vision models
- VLA interactive sims / Robotics (semantics + actionability)
- 4Wall AI games (a closed data generation loop: games → data → more games)