Hard Krypton Exclusive Interview: WANG Zhongyuan, Dean of Beijing Academy of Artificial Intelligence

https://eu.36kr.com/en/p/3853016586359817

Publish Date: 2026-06-14 23:00:00

Source Domain: eu.36kr.com

  • World Model Emergence: The “world model” has become an important term in the AI and robotics industries and is seen as a way to address the limitations of current AI in the physical world.

  • World Model Shortcomings: While robots can recognize objects and follow instructions, they struggle to understand physical causality and the consequences of actions, such as predicting that a cup will fall.

  • The Essence of World Model: The relationship between world models and embodied intelligence is akin to the connection between the human “brain” and “body.”

  • Application Challenges: Despite significant investment of capital and resources, it remains uncertain how the world model will be applied in practice.

  • World Model Paths: According to Wang Zhongyuan, the director of the Beijing Academy of Artificial Intelligence (BAAI), there are four main types of world models: language-centered, pixel-centered, 3D structure-centered, and visual-representation-centered.

  • BAAI Approach: BAAI emphasizes integrating language and vision into a unified “latent space representation” to give robots a comprehensive understanding of the physical world.

  • World Model Capabilities: For the world model to truly function in the physical world, it needs to be physically correct, make the causal link between actions and outcomes traceable, stay consistent over long sequences, and generalize.

  • Future Prospects: The world model is expected to enhance robotics with abilities like general task execution, spatial reasoning, and complex decision-making, thus becoming the true “brain” for robots, though this is still a long-term goal.