Gradient-based Planning for World Models at Longer Horizons – The Berkeley Artificial Intelligence Research Blog

https://bair.berkeley.edu/blog/2026/04/20/grasp/

Summary of Article on GRASP: Gradient RelAxed Stochastic Planner for Long-Horizon Planning

The article presents GRASP, a gradient-based planner designed to make long-horizon planning practical for learned dynamics models (world models). It addresses three primary difficulties of long-horizon planning: deep, poorly conditioned computation graphs produced by unrolling the model over many steps; an optimization landscape that is non-greedy and has numerous local minima; and high dimensionality, which makes optimization fragile. GRASP tackles these problems with three ingredients: a lifted-state (collocation-based) planner whose per-step computations run in parallel, direct perturbation of the states for exploration, and reshaped gradients that flow through the model's action inputs rather than its state inputs, which avoids the state-input sensitivity of deep learning models. Exploring by adding noise to the state iterates, together with periodic refinement using full-path gradients, improves stability and success rates, especially at longer horizons. Future work includes extending the approach to diffusion-based models and integrating it into RL policy learning systems.
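
The contrast between unrolled (shooting) planning and the lifted formulation can be written out as below. This is a sketch in assumed notation, not the article's own: $f_\theta$ stands for the learned world model, $s_0$ for the start state, $g$ for the goal, $T$ for the horizon, and the squared-error costs are illustrative choices.

```latex
% Shooting: only the actions are free; states come from unrolling the model.
\min_{a_{0:T-1}} \; \lVert s_T - g \rVert^2
\quad \text{s.t.} \quad s_{t+1} = f_\theta(s_t, a_t),
\qquad
\frac{\partial s_T}{\partial a_t}
  = \Bigl( \prod_{k=t+1}^{T-1} \frac{\partial f_\theta}{\partial s}(s_k, a_k) \Bigr)
    \frac{\partial f_\theta}{\partial a}(s_t, a_t)
% (factors ordered with larger k on the left; the product is the identity when t = T-1)

% Lifted / collocation: intermediate states are free variables and the
% dynamics constraint is relaxed into a penalty.
\min_{s_{1:T-1},\, a_{0:T-1}} \; \sum_{t=0}^{T-1} \lVert s_{t+1} - f_\theta(s_t, a_t) \rVert^2,
\qquad s_0 \text{ fixed}, \quad s_T = g
```

In the first form, the product of $T - t - 1$ state Jacobians is the term that can explode or vanish as $T$ grows. In the second form, each summand involves a single model call, so the summands can be evaluated in parallel.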

Key Points:

Overview of Challenges in Long-Horizon Planning: Long horizons and complex models pose significant challenges for gradient-based planning, in particular exploding or vanishing gradients and optimization landscapes that are hard to navigate.
GRASP’s Methodology: GRASP uses collocation planning to relax the dynamics constraints and optimize states and actions jointly, adds stochasticity to the states to explore the solution space, and reshapes gradients so that they flow through actions, avoiding brittle state gradients.
Achievements and Future Directions: GRASP significantly improves success rates and optimization speed for long-horizon planning, and it opens avenues for integration with RL policies and for exploring more sophisticated optimizers.
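
The three ingredients listed above can be illustrated with a small toy sketch. It is not the article's implementation: the function names `step`, `plan`, and `rollout` are hypothetical, a hand-written single integrator stands in for the learned world model, the gradients are derived by hand instead of by automatic differentiation, the step sizes and noise schedule are arbitrary, and the periodic full-path refinement is omitted.

```python
import numpy as np


def step(s, a, dt=1.0):
    """Toy stand-in for a learned world model: a single integrator."""
    return s + dt * a


def plan(s0, goal, horizon=20, iters=600, lr_state=0.2, lr_action=0.2,
         sigma=0.3, dt=1.0, seed=0):
    """Lifted-state planner on the toy model (all hyperparameters are assumptions)."""
    rng = np.random.default_rng(seed)
    d = s0.shape[0]
    # Lifted variables: intermediate states are free, the endpoints are pinned.
    states = np.linspace(s0, goal, horizon + 1)   # shape (horizon + 1, d)
    actions = np.zeros((horizon, d))
    for k in range(iters):
        # Every one-step prediction is independent of the others,
        # so all of them are computed in a single batched call.
        pred = step(states[:-1], actions, dt)
        resid = states[1:] - pred                 # dynamics violation per step
        # Gradient of sum ||resid||^2 w.r.t. actions (d pred / d action = dt * I).
        grad_a = -2.0 * dt * resid
        # States receive gradient only as prediction targets; the path through
        # the model's state input is deliberately dropped (a stop-gradient).
        grad_s = 2.0 * resid[:-1]
        # Exploration noise on the state iterates, annealed to zero
        # over the first half of the iterations.
        scale = sigma * max(0.0, 1.0 - 2.0 * k / iters)
        noise = scale * rng.standard_normal((horizon - 1, d))
        actions -= lr_action * grad_a
        states[1:-1] += -lr_state * grad_s + noise
    return states, actions


def rollout(s0, actions, dt=1.0):
    """Apply the planned actions sequentially through the toy model."""
    s = np.array(s0, dtype=float)
    for a in actions:
        s = step(s, a, dt)
    return s


if __name__ == "__main__":
    s0 = np.array([0.0, 0.0])
    goal = np.array([3.0, -2.0])
    states, actions = plan(s0, goal)
    print("distance to goal after rollout:",
          np.linalg.norm(rollout(s0, actions) - goal))
```

Because the noise is switched off for the second half of the iterations, the dynamics residuals in this toy problem shrink to numerical precision, and rolling the planned actions through `step` ends at the goal. With a real learned model, the hand-derived gradients would be replaced by automatic differentiation with a stop-gradient on the model's state input.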