DeepMind, Google’s AI research organization, has unveiled a new model called Genie 2 that can generate an “endless” variety of playable 3D worlds. The successor to the original Genie, the model creates interactive, real-time scenes from a single image and text description.
Genie 2 can simulate object interactions, animations, lighting, physics, reflections, and the behavior of “NPCs” (non-player characters). It can also generate consistent worlds from different perspectives, such as first-person and isometric views. The model responds intelligently to keyboard input, identifying the playable character and moving it correctly.
What sets Genie 2 apart from other world models is its ability to remember parts of a simulated scene that are out of view and render them accurately when they come back into frame, making for more consistent simulations with fewer artifacts.
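That consistency property can be pictured as a world model that caches what it has already generated rather than re-inventing it. The toy Python sketch below illustrates the principle with a grid world; every name in it is hypothetical, since DeepMind has not published Genie 2’s internals or any API. Content is generated the first time a location comes into view and re-rendered from memory on return.

```python
# Minimal sketch of keeping off-screen content consistent, assuming a toy
# grid world. All names are illustrative placeholders, not a real Genie 2 API.
import random

class ToyWorldModel:
    """Generates scene content lazily and caches it, so revisiting a
    location renders the same content instead of inventing new content."""

    def __init__(self, seed: int = 0):
        self.rng = random.Random(seed)
        self.memory: dict[tuple[int, int], str] = {}  # persisted scene content
        self.pos = (0, 0)

    def step(self, action: str) -> str:
        dx, dy = {"W": (0, 1), "S": (0, -1), "A": (-1, 0), "D": (1, 0)}[action]
        self.pos = (self.pos[0] + dx, self.pos[1] + dy)
        # Generate content only on first visit; afterwards, re-render it
        # from memory so the world stays consistent when it reappears.
        if self.pos not in self.memory:
            self.memory[self.pos] = self.rng.choice(["tree", "rock", "water", "grass"])
        return self.memory[self.pos]

world = ToyWorldModel()
print(world.step("W"))  # new location: content generated here
print(world.step("S"))  # back toward the start: new cell, new content
print(world.step("W"))  # returns to the first cell: same content as before
```

A learned model would replace the random lookup with a neural network conditioned on past frames and actions, but the contract is the same: previously seen content should survive leaving the frame.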
DeepMind claims that Genie 2 can generate a “vast diversity of rich 3D worlds” and is positioning the model as a research and creative tool for prototyping interactive experiences and evaluating AI agents. The company says Genie 2’s out-of-distribution generalization allows concept art and drawings to be turned into fully interactive environments.
While some in the video game industry may view the technology with concern, Google has poured increasing resources into world model research, which it is betting will be the next big thing in AI. The company has hired top talent, including Tim Brooks and Tim Rocktäschel, to work on video generation technologies and world simulators.
Source: https://techcrunch.com/2024/12/04/deepminds-genie-2-can-generate-interactive-worlds-that-look-like-video-games