Diffusion World Model: Future Modeling Beyond Step-by-Step Rollout for Offline Reinforcement Learning

Ding, Zihan; Zhang, Amy; Tian, Yuandong; Zheng, Qinqing

arXiv:2402.03570 (cs)

[Submitted on 5 Feb 2024 (v1), last revised 15 Oct 2024 (this version, v4)]

Abstract: We introduce Diffusion World Model (DWM), a conditional diffusion model capable of predicting multistep future states and rewards concurrently. As opposed to traditional one-step dynamics models, DWM offers long-horizon predictions in a single forward pass, eliminating the need for recursive queries. We integrate DWM into model-based value estimation, where the short-term return is simulated by future trajectories sampled from DWM. In the context of offline reinforcement learning, DWM can be viewed as a conservative value regularization through generative modeling. Alternatively, it can be seen as a data source that enables offline Q-learning with synthetic data. Our experiments on the D4RL dataset confirm the robustness of DWM to long-horizon simulation. In terms of absolute performance, DWM significantly surpasses one-step dynamics models with a $44\%$ performance gain, and is comparable to or slightly surpasses their model-free counterparts.
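
To make the idea of simulating the short-term return concrete, here is a minimal sketch of an H-step value-expansion target: the discounted sum of rewards along one sampled future trajectory, plus a discounted bootstrap value at the final step. This is a standard way to combine simulated rewards with a learned value function; whether it matches the paper's exact procedure is an assumption. The function name `value_expansion_target`, the discount `gamma`, and the numbers below are hypothetical and stand in for rewards that a model such as DWM would sample.

```python
from typing import Sequence


def value_expansion_target(rewards: Sequence[float],
                           terminal_value: float,
                           gamma: float = 0.99) -> float:
    """Discounted sum of H simulated rewards plus a bootstrapped tail value.

    rewards        : r_0 .. r_{H-1} from one sampled future trajectory
                     (hypothetical stand-in for a world model's output).
    terminal_value : value estimate at the final simulated state,
                     e.g. a target Q-value (assumed to be given).
    """
    target = 0.0
    discount = 1.0
    for r in rewards:
        target += discount * r
        discount *= gamma
    # After the loop, discount == gamma ** H.
    return target + discount * terminal_value


# Illustrative numbers only, not results from the paper.
simulated_rewards = [1.0, 0.5, 0.25]
tail_q = 2.0
target = value_expansion_target(simulated_rewards, tail_q, gamma=0.5)
```

With an empty reward sequence the function returns the bootstrap value unchanged, which corresponds to the ordinary model-free case with no simulated steps.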

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2402.03570 [cs.LG]
	(or arXiv:2402.03570v4 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2402.03570

Submission history

From: Zihan Ding
[v1] Mon, 5 Feb 2024 22:43:57 UTC (15,662 KB)
[v2] Sun, 11 Feb 2024 17:33:16 UTC (15,640 KB)
[v3] Sun, 16 Jun 2024 23:35:37 UTC (15,314 KB)
[v4] Tue, 15 Oct 2024 20:56:47 UTC (13,389 KB)
