Description: AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners.
Tags: reinforcement learning, diffusion model, trajectory optimization
Recently, diffusion models have emerged as a promising backbone for the sequence modeling paradigm in offline reinforcement learning. However, these works mostly lack the ability to generalize across tasks with reward or dynamics changes. To tackle this challenge, in this paper we propose a task-oriented conditioned diffusion planner for offline meta-RL (MetaDiffuser), which treats the generalization problem as a conditional trajectory generation task with contextual representation. The key is to learn a context-conditioned diffusion model that can generate task-oriented trajectories for planning across diverse tasks.
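As a rough illustration of this conditional-generation idea, below is a minimal sketch of a context-conditioned denoiser: a context encoder summarizes a few collected transitions into a task context c, which conditions the noise prediction on a trajectory. The ContextEncoder and ConditionalDenoiser modules, their shapes, and the simplified noising step are illustrative assumptions, not MetaDiffuser's actual architecture.

```python
import torch
import torch.nn as nn

class ContextEncoder(nn.Module):
    """Summarizes a few warm-up transitions into a task context c (assumed design)."""
    def __init__(self, transition_dim, context_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(transition_dim, 128), nn.ReLU(),
            nn.Linear(128, context_dim),
        )

    def forward(self, transitions):               # (batch, K transitions, transition_dim)
        return self.net(transitions).mean(dim=1)  # aggregate -> (batch, context_dim)

class ConditionalDenoiser(nn.Module):
    """Predicts the noise on a flattened trajectory, conditioned on timestep and context."""
    def __init__(self, traj_dim, context_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(traj_dim + context_dim + 1, 256), nn.ReLU(),
            nn.Linear(256, traj_dim),
        )

    def forward(self, noisy_traj, t, c):
        t = t.float().unsqueeze(-1)
        return self.net(torch.cat([noisy_traj, t, c], dim=-1))

# One training step: noise an expert trajectory and regress the noise,
# conditioned on the task context inferred from warm-up transitions.
encoder, denoiser = ContextEncoder(10, 16), ConditionalDenoiser(64, 16)
transitions = torch.randn(8, 20, 10)   # (batch, K warm-up transitions, dim)
traj = torch.randn(8, 64)              # flattened expert trajectories
t = torch.randint(0, 1000, (8,))
noise = torch.randn_like(traj)
noisy_traj = traj + noise              # placeholder for the real noise schedule
loss = ((denoiser(noisy_traj, t, encoder(transitions)) - noise) ** 2).mean()
loss.backward()
```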
Motivation overview of AdaptDiffuser. It enables diffusion models to generate rich synthetic expert data using guidance from reward gradients of either seen or unseen goal-conditioned tasks. It then iteratively selects high-quality data via a discriminator to finetune the diffusion model for self-evolving, leading to improved performance on seen tasks and better generalization to unseen tasks. Planning with diffusion models (Janner et al., 2022b) provides a promising paradigm for offline RL, which utilizes the denoising process to generate trajectories under the guidance of reward gradients.
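The generate-filter-finetune loop described above can be sketched in a few lines. This is a hedged sketch only: guided_sample, self_evolve, and every callable passed to them are hypothetical stand-ins for illustration, not AdaptDiffuser's actual API.

```python
import torch

def guided_sample(denoise_step, reward, x, num_steps, guidance_scale=0.1):
    """Reverse diffusion with reward-gradient guidance over trajectories x."""
    for t in reversed(range(num_steps)):
        x = x.detach().requires_grad_(True)
        grad = torch.autograd.grad(reward(x).sum(), x)[0]  # push samples toward high reward
        x = denoise_step(x, t) + guidance_scale * grad
    return x.detach()

def self_evolve(denoise_step, reward, discriminator, finetune, rounds=3):
    for _ in range(rounds):
        # 1) generate synthetic trajectories guided by the (possibly unseen) task reward
        x = guided_sample(denoise_step, reward, torch.randn(64, 32), num_steps=50)
        # 2) keep only the trajectories the discriminator accepts as high quality
        kept = x[discriminator(x)]
        # 3) finetune the diffusion model on the selected data, then repeat
        finetune(kept)

# Toy usage with stand-in components:
denoise_step = lambda x, t: 0.9 * x
reward = lambda x: -(x ** 2).sum(dim=-1)            # prefer trajectories near zero
discriminator = lambda x: reward(x) > reward(x).median()
self_evolve(denoise_step, reward, discriminator, finetune=lambda data: None)
```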
The core idea of our work is to simplify task-solving by letting the low-level agent focus exclusively on low-level skills such as object manipulation, while leaving high-level planning to a high-level agent. The low-level agent's objective then reduces to following the high-level agent as closely as possible.
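That "follow the high-level agent" objective can be written as a one-line intrinsic reward: the negative distance between the achieved state and the subgoal proposed by the high-level planner. The Euclidean distance and the variable names below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def low_level_reward(state, subgoal):
    """Intrinsic reward: stay as close to the high-level subgoal as possible."""
    return -np.linalg.norm(state - subgoal)

# The high-level agent proposes subgoals; the low-level agent is trained only to reach them.
subgoal = np.array([1.0, 2.0])
state = np.array([0.8, 2.1])
print(low_level_reward(state, subgoal))  # -> approximately -0.2236
```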