---
license: apache-2.0
base_model:
- Wan-AI/Wan2.1-T2V-1.3B
- Wan-AI/Wan2.1-T2V-14B
pipeline_tag: text-to-video
---
# rCM: Score-Regularized Continuous-Time Consistency Model
# Causal-rCM: Teacher-Forcing meets Self-Forcing in Autoregressive Diffusion Distillation for Streaming Video Generation and Interactive World Models
[**Paper**](https://arxiv.org/abs/2510.08431) | [**Website**](https://research.nvidia.com/labs/dir/rcm) | [**Code**](https://github.com/NVlabs/rcm)
This repo holds the official Wan checkpoints converted to the format used by (Causal-)rCM and TurboDiffusion.
Specifically, (Causal-)rCM replaces the `Conv3d` patch-embedding layer of the original Wan with a mathematically equivalent `Linear` layer, which facilitates further optimization. The layer weight is reshaped directly, with no change to its values, e.g., from shape [5120, 16, 1, 2, 2] (`Conv3d`) to shape [5120, 64] (`Linear`).
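The equivalence can be checked numerically. The snippet below is a minimal NumPy sketch, not the rCM implementation: the helper names `patch_embed_conv` and `patch_embed_linear`, the small output width of 8 (instead of 5120), the toy latent size, and the omission of the bias are all assumptions made for illustration. It assumes the convolution is non-overlapping (stride equal to the kernel size `(1, 2, 2)`) and that each patch is flattened in `(channel, kt, kh, kw)` order, which is the order a plain reshape of a `[out, in, kt, kh, kw]` weight produces.

```python
import numpy as np

def patch_embed_conv(x, w):
    """Non-overlapping 3D convolution (stride == kernel size, no bias).
    x: [C, T, H, W], w: [O, C, kt, kh, kw] -> [O, T/kt, H/kh, W/kw]"""
    O, C, kt, kh, kw = w.shape
    _, T, H, W = x.shape
    # Split every axis into (patch index, offset inside the patch).
    p = x.reshape(C, T // kt, kt, H // kh, kh, W // kw, kw)
    return np.einsum("ctihjwk,ocijk->othw", p, w)

def patch_embed_linear(x, w_lin, patch):
    """Same operation as patchify + matrix multiply.
    x: [C, T, H, W], w_lin: [O, C*kt*kh*kw] -> [num_patches, O]"""
    kt, kh, kw = patch
    C, T, H, W = x.shape
    p = x.reshape(C, T // kt, kt, H // kh, kh, W // kw, kw)
    # Move patch indices first, then flatten each patch in (c, kt, kh, kw) order.
    tokens = p.transpose(1, 3, 5, 0, 2, 4, 6).reshape(-1, C * kt * kh * kw)
    return tokens @ w_lin.T

rng = np.random.default_rng(0)
out_dim, in_ch, patch = 8, 16, (1, 2, 2)      # 8 stands in for 5120 (assumption)
w_conv = rng.standard_normal((out_dim, in_ch, *patch))
w_lin = w_conv.reshape(out_dim, -1)            # pure reshape, values unchanged
x = rng.standard_normal((in_ch, 3, 4, 6))      # toy latent [C, T, H, W]

y_conv = patch_embed_conv(x, w_conv)           # [8, 3, 2, 3]
y_lin = patch_embed_linear(x, w_lin, patch)    # [18, 8]

# Flatten the conv output to [num_patches, O] and compare.
same = np.allclose(y_conv.reshape(out_dim, -1).T, y_lin)
print(w_lin.shape, same)
```

With 16 input channels and a `(1, 2, 2)` patch, each token has 16 × 1 × 2 × 2 = 64 features, which matches the `[5120, 64]` weight shape quoted above.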
## Citation
```bibtex
@article{zheng2025rcm,
title={Large Scale Diffusion Distillation via Score-Regularized Continuous-Time Consistency},
author={Zheng, Kaiwen and Wang, Yuji and Ma, Qianli and Chen, Huayu and Zhang, Jintao and Balaji, Yogesh and Chen, Jianfei and Liu, Ming-Yu and Zhu, Jun and Zhang, Qinsheng},
journal={arXiv preprint arXiv:2510.08431},
year={2025}
}
@article{zheng2026causal,
title={Causal-rCM: Teacher-Forcing meets Self-Forcing in Autoregressive Diffusion Distillation for Streaming Video Generation and Interactive World Models},
author={Zheng, Kaiwen and He, Guande and Zhao, Min and Zhu, Hongzhou and Zhang, Jintao and Chen, Huayu and Chen, Jianfei and Lin, Chen-Hsuan and Liu, Ming-Yu and Zhu, Jun and Ma, Qianli},
journal={TODO},
year={2026}
}
```