Distribution Matching Distillation Meets Reinforcement Learning
A core technical paper of Z-Image. It introduces the DMDR framework, which integrates reinforcement learning into the distribution matching distillation (DMD) process.
Paper
Research
DMDR
Reinforcement Learning
Overview
This paper proposes DMDR, a framework that integrates reinforcement learning (RL) into the distribution matching distillation (DMD) process. The authors report that, when RL is applied to few-step generators, the DMD loss itself is a more effective regularizer than traditional regularization methods.
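To make the idea of using the DMD loss as the regularizer for RL concrete, here is a minimal toy sketch in one dimension. It is not the paper's algorithm: the Gaussian teacher, the quadratic reward, the assumption that the fake score tracks the generator exactly, and every name and parameter (`train`, `rl_weight`, `target`, and so on) are hypothetical choices made for illustration. The dynamic distribution guidance and dynamic renoise sampling strategies are not modeled.

```python
import random


def real_score(x, mu_real):
    # Score (gradient of log density) of the teacher distribution N(mu_real, 1).
    return -(x - mu_real)


def fake_score(x, mu_fake):
    # Score of the generator's own distribution N(mu_fake, 1).
    # Assumption: the fake score is exact here; in DMD it is a learned network.
    return -(x - mu_fake)


def reward_grad(x, target):
    # Gradient with respect to x of the hypothetical reward r(x) = -(x - target)^2.
    return -2.0 * (x - target)


def train(mu_real=0.0, target=2.0, rl_weight=0.5, lr=0.05,
          steps=500, batch=256, seed=0):
    """Toy one-parameter generator x = theta + eps, trained on DMD + reward."""
    rng = random.Random(seed)
    theta = mu_real  # start the generator at the teacher mean
    for _ in range(steps):
        grad = 0.0
        for _ in range(batch):
            x = theta + rng.gauss(0.0, 1.0)  # reparameterized sample, dx/dtheta = 1
            # DMD-style gradient: (fake score - real score) pulls samples
            # back toward the teacher distribution.
            dmd_term = fake_score(x, theta) - real_score(x, mu_real)
            # RL term: gradient ascent on the reward, written as a loss gradient.
            rl_term = -reward_grad(x, target)
            grad += dmd_term + rl_weight * rl_term
        theta -= lr * grad / batch
    return theta


if __name__ == "__main__":
    print("theta with reward:", train())
    print("theta without reward:", train(rl_weight=0.0))
```

In this toy setting the generator settles between the teacher mean and the reward optimum, at (mu_real + 2 * rl_weight * target) / (1 + 2 * rl_weight), which is 1.0 for the defaults. With `rl_weight=0.0` the DMD term alone keeps the generator at the teacher mean, which is the sense in which it acts as a regularizer on the reward objective.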
Features
- DMDR: a framework that fuses DMD with reinforcement learning
- Dynamic distribution guidance strategy
- Dynamic renoise sampling strategy for training
- Improved performance for few-step generators
- Few-step student reported to exceed the performance of its multi-step teacher model
Images
DMDR framework architecture diagram
Usage
PDF download: https://arxiv.org/pdf/2511.13649.pdf