# OneVAE: Unified Repository for Continuous and Discrete VAE Training

Also the official open-source implementation of our work, OneVAE.

📄 Paper: OneVAE: Joint Discrete and Continuous Optimization Helps Discrete VAE Train Better

## Key Contributions

- **Multiple Structural Improvements** — Introduces several architecture-level enhancements to the discrete VAE that boost reconstruction quality under high compression.
- **Progressive Training with a Pretrained Continuous VAE** — Initializes from a high-quality pretrained continuous VAE and gradually transitions to a discrete VAE, effectively leveraging its strong prior.
- **Unified Model** — Achieves superior performance on both continuous and discrete representations within a single model.

## Development Status

In addition to releasing the code of this work, we aim to provide a unified repository that supports fine-tuning and training of multiple pretrained VAE models, enabling the community to better adapt VAEs to their specific needs. We are actively organizing and refining the codebase, and ⚡ most features and resources will be released within two weeks!

## Open Source Model

## Visual Results

### Video Gallery

#### More Discrete Video Results on the High-Compression VAE (4×16×16)

Video1
Video2
Video3

## Planned Supported Fine-Tuning

**Image VAE**
- FluxVAE
- LlamaGen
- SD-VAE

**Video VAE**
- OneVAE (ours)
- WanVAE (Alibaba)
- HunyuanVideo VAE (Tencent)

## TODO

- Release model code (to be completed within two weeks)
- Provide pretrained-weight download links
- Support additional types of VAE models

## LICENSE

The code is licensed under the Apache License 2.0. When using our repository to fine-tune other models, you must comply with the licenses of the respective pretrained models.
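The "Progressive Training" contribution above describes gradually transitioning from a pretrained continuous VAE to a discrete one. As a rough illustration only — the exact mechanism OneVAE uses is not spelled out in this README — the sketch below blends a continuous latent with its vector-quantized version under an annealed coefficient `alpha`. The names `quantize` and `progressive_latent`, the linear blending rule, and the annealing schedule are all hypothetical choices for illustration, not OneVAE's actual implementation.

```python
import random


def quantize(z, codebook):
    """Map each continuous latent vector to its nearest codebook entry (L2)."""
    out = []
    for vec in z:
        best = min(codebook, key=lambda c: sum((a - b) ** 2 for a, b in zip(vec, c)))
        out.append(list(best))
    return out


def progressive_latent(z, codebook, alpha):
    """Blend the continuous latent with its quantized version.

    alpha = 0.0 -> purely continuous (pretrained continuous-VAE behaviour)
    alpha = 1.0 -> purely discrete  (VQ behaviour)
    Annealing alpha from 0 to 1 over training yields a gradual
    continuous-to-discrete transition.
    """
    zq = quantize(z, codebook)
    return [
        [(1.0 - alpha) * a + alpha * b for a, b in zip(vec, qvec)]
        for vec, qvec in zip(z, zq)
    ]


# Toy usage: anneal alpha linearly over a hypothetical training run.
random.seed(0)
codebook = [[random.gauss(0, 1) for _ in range(4)] for _ in range(8)]
z = [[random.gauss(0, 1) for _ in range(4)] for _ in range(3)]
for step in (0, 500, 1000):
    alpha = min(1.0, step / 1000)
    z_mix = progressive_latent(z, codebook, alpha)
```

In a real setup the blend would operate on the encoder output inside the training loop (with a straight-through gradient for the quantizer), but the annealing idea is the same: the decoder first sees the familiar continuous latents, then is eased onto the discrete codes.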