GoodEnough/TiM-T2I


nielsr (HF Staff) committed 1 day ago

Commit 73b9f12 (verified) · 1 parent: 9962bec

Improve model card for Transition Models (TiM)


This PR significantly enhances the model card for the Transition Models (TiM) repository. It replaces the minimal existing content with a comprehensive overview to improve discoverability and provide essential information to users on the Hugging Face Hub.

Key updates include:
* Adding `license: apache-2.0` and `pipeline_tag: text-to-image` to the metadata for better categorization and searchability.
* Linking to the official Hugging Face paper page: [Transition Models: Rethinking the Generative Learning Objective](https://huggingface.co/papers/2509.04394).
* Providing a direct link to the official GitHub repository: [https://github.com/WZDTHU/TiM](https://github.com/WZDTHU/TiM).
* Including a summary of the paper's highlights and the architecture's key features, such as arbitrary-step generation and high-resolution output.
* Adding the detailed "Model Zoo" tables from the GitHub README, showcasing Text-to-Image and Class-guided Image Generation variants with their respective performance metrics and associated VAEs.
* Including the BibTeX citation for proper academic attribution.

These improvements will make the model more accessible, understandable, and easier for the community to engage with.
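
The metadata change in the first bullet can be checked locally before pushing. The following is a minimal sketch, assuming the README starts with a `---`-delimited front-matter block made of flat `key: value` pairs. The helper `read_front_matter` is a hypothetical name for illustration and is not part of any Hub library; nested or quoted values would need a real YAML parser.

```python
def read_front_matter(text: str) -> dict[str, str]:
    """Parse a flat `key: value` front-matter block delimited by '---' lines."""
    lines = text.splitlines()
    # Front matter must open on the very first line.
    if not lines or lines[0].strip() != "---":
        return {}
    meta: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta  # closing delimiter reached
        if ":" in line:
            key, value = line.split(":", 1)
            meta[key.strip()] = value.strip()
    return {}  # no closing delimiter: treat as no front matter


# The first lines of the README as added in this commit.
readme = """---
license: apache-2.0
pipeline_tag: text-to-image
---
# Transition Models: Rethinking the Generative Learning Objective
"""

meta = read_front_matter(readme)
print(meta)
```

A check like this only confirms that the two keys are present and spelled as intended; it does not validate them against the Hub's list of accepted licenses or pipeline tags.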

Files changed (1)

README.md +58 -1


@@ -1 +1,58 @@
-arxiv.org/abs/2509.04394

+---
+license: apache-2.0
+pipeline_tag: text-to-image
+---
+# Transition Models: Rethinking the Generative Learning Objective
+This repository contains the official implementation of **Transition Models (TiM)**, a novel generative model presented in the paper "[Transition Models: Rethinking the Generative Learning Objective](https://huggingface.co/papers/2509.04394)".
+TiM addresses a core trade-off in generative modeling: iterative diffusion models reach high fidelity at significant computational cost, while efficient few-step generators hit a quality ceiling. It introduces an exact, continuous-time dynamics equation that analytically defines state transitions across any finite time interval. This enables a generative paradigm that adapts to arbitrary-step transitions, traversing the generative trajectory from single leaps to fine-grained refinement with more steps.
+For more detailed information, code, and usage instructions, please refer to the official [GitHub repository](https://github.com/WZDTHU/TiM).
+## Highlights
+*   **Arbitrary-Step Generation**: TiM learns to master arbitrary state-to-state transitions, unifying few-step and many-step regimes within a single, powerful model. This approach allows it to learn the entire solution manifold of the generative process.
+*   **State-of-the-Art Performance**: Despite having only 865M parameters, TiM achieves state-of-the-art performance, surpassing leading models such as SD3.5 (8B parameters) and FLUX.1 (12B parameters) across all evaluated step counts on the GenEval benchmark.
+*   **Monotonic Quality Improvement**: Unlike previous few-step generators, TiM demonstrates consistent quality improvement as the sampling budget increases.
+*   **High-Resolution Fidelity**: When employing its native-resolution strategy, TiM delivers exceptional fidelity at resolutions up to 4096x4096.
+<p align="center">
+  <img src="https://github.com/WZDTHU/TiM/raw/main/assets/illustration.png" width="800" alt="TiM Illustration">
+</p>
+## Model Zoo
+A single TiM model can perform any-step generation (one-step, few-step, and multi-step) and demonstrate monotonic quality improvement as the sampling budget increases.
+### Text-to-Image Generation
+| Model   | Model Size | VAE                                                                    | 1-NFE GenEval | 8-NFE GenEval | 128-NFE GenEval |
+|---------|------------|------------------------------------------------------------------------|---------------|---------------|-----------------|
+| TiM-T2I | 865M       | [DC-AE](https://huggingface.co/mit-han-lab/dc-ae-f32c32-sana-1.1-diffusers) | 0.67          | 0.76          | 0.83            |
+### Class-guided Image Generation
+| Model     | Model Size | VAE                                                                    | 2-NFE FID | 500-NFE FID |
+|-----------|------------|------------------------------------------------------------------------|-----------|-------------|
+| TiM-C2I-256 | 664M       | [SD-VAE](https://huggingface.co/stabilityai/sd-vae-ft-ema)             | 6.14      | 1.65        |
+| TiM-C2I-512 | 664M       | [DC-AE](https://huggingface.co/mit-han-lab/dc-ae-f32c32-sana-1.1-diffusers) | 4.79      | 1.69        |
+## Citation
+If you find this project useful, please cite:
+```bibtex
+@article{wang2025transition,
+  title={Transition Models: Rethinking the Generative Learning Objective},
+  author={Wang, Zidong and Zhang, Yiyuan and Yue, Xiaoyu and Yue, Xiangyu and Li, Yangguang and Ouyang, Wanli and Bai, Lei},
+  year={2025},
+  eprint={2509.04394},
+  archivePrefix={arXiv},
+  primaryClass={cs.LG}
+}
+```
+## License
+This project is licensed under the Apache-2.0 license.
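
The arbitrary-step idea described under Highlights in the new README can be illustrated with a toy example. The sketch below is not the TiM model, its training objective, or its API. It assumes a one-dimensional linear dynamics dx/dt = A·x, chosen only because its exact transition between any two times is known in closed form, and it uses the hypothetical names `transition` and `sample`. When the transition over any finite interval is exact, a single leap and many small steps reach the same endpoint.

```python
import math

A = -1.5  # hypothetical rate for the toy dynamics dx/dt = A * x


def transition(x: float, t: float, r: float) -> float:
    """Exact state at time r given state x at time t, for dx/dt = A * x."""
    return x * math.exp(A * (r - t))


def sample(x_start: float, nfe: int, t_start: float = 1.0, t_end: float = 0.0) -> float:
    """Walk from t_start to t_end using `nfe` equal-sized transitions."""
    x = x_start
    step = (t_end - t_start) / nfe
    for i in range(nfe):
        t = t_start + i * step
        x = transition(x, t, t + step)  # one function evaluation per transition
    return x


# Same step budgets as the GenEval columns in the Model Zoo table.
results = {nfe: sample(2.0, nfe) for nfe in (1, 8, 128)}
for nfe, x in results.items():
    print(f"{nfe:>3} NFE -> {x:.6f}")
```

In this toy the three budgets agree up to floating-point error because the transition is exact. A learned model only approximates such transitions, which is consistent with the README's reported scores still improving as the number of function evaluations grows.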