OneVAE: Unified Repository for Continuous and Discrete VAE Training

This repository is also the official open-source implementation of our work OneVAE.

📄 Paper: OneVAE: Joint Discrete and Continuous Optimization Helps Discrete VAE Train Better

Key Contributions:

  1. Multiple Structural Improvements — Introduces several architecture-level enhancements to discrete VAEs that boost reconstruction quality under high compression.
  2. Progressive Training from a Pretrained Continuous VAE — Initializes from a high-quality pretrained continuous VAE and gradually transitions to a discrete VAE, effectively leveraging strong priors.
  3. Unified Model — Achieves strong performance on both continuous and discrete representations within a single model.
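The progressive transition in contribution 2 can be sketched as a blend between the pretrained continuous latent and its quantized counterpart, with the blend weight annealed toward fully discrete over training. This is an illustrative sketch only; the function names, the linear schedule, and the nearest-neighbor quantizer are assumptions, not the paper's exact recipe.

```python
def quantize(z, codebook):
    """Nearest-neighbor quantization: snap each latent vector to its closest code."""
    def nearest(v):
        return min(codebook, key=lambda c: sum((a - b) ** 2 for a, b in zip(v, c)))
    return [nearest(v) for v in z]

def progressive_latent(z, codebook, step, total_steps):
    """Blend continuous and quantized latents (hypothetical linear schedule).

    alpha ramps 0 -> 1 over training: early steps pass through the pretrained
    continuous latent, late steps use the discrete (quantized) latent.
    """
    alpha = min(1.0, step / total_steps)
    zq = quantize(z, codebook)
    return [[(1 - alpha) * a + alpha * q for a, q in zip(v, vq)]
            for v, vq in zip(z, zq)]
```

At step 0 this reduces to the continuous VAE; at the final step it is fully discrete, so the decoder never sees an abrupt switch in latent statistics.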

Development Status

In addition to releasing the code of this work, we aim to provide a unified repository that supports fine-tuning and training of multiple pretrained VAE models, enabling the community to better adapt VAEs to their specific needs.
We are actively organizing and refining the codebase, and ⚡ most features and resources will be released within two weeks!

Open Source Model

| Model Name | Encoding Method | Compression Ratio | Download Link |
| --- | --- | --- | --- |
| OneVAE | Discrete, Multi-Token Quant = 2 | 8 × 16 × 16 | Link |
| OneVAE | Discrete, Multi-Token Quant = 2 | 16 × 16 × 16 | Link |
| OneVAE | Discrete, Multi-Token Quant = 2 | 8 × 8 × 8 | Link |
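The compression ratios above read as temporal × height × width. As a rough sketch of what that means for latent sizes (the helper below is hypothetical, and real video VAEs often handle the first frame specially under causal encoding):

```python
def latent_shape(frames, height, width, ratio=(8, 16, 16)):
    """Approximate latent grid for a temporal x height x width compression
    ratio. Illustrative only: ignores causal first-frame handling."""
    rt, rh, rw = ratio
    return (frames // rt, height // rh, width // rw)

# e.g. a 64-frame 512x512 clip under 8 x 16 x 16 compression
# yields roughly an 8 x 32 x 32 latent grid.
```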

Visual Results

Video1 Video2

More Discrete Video Results on High-Compression VAE (4×16×16)

Video1 Video2 Video3

Planned Fine-Tuning Support

Image VAE

  • FluxVAE
  • LlamaGen
  • SD-VAE

Video VAE

  • OneVAE (ours)
  • WanVAE (Alibaba)
  • HunyuanVideo VAE (Tencent)

TODO

  • Release model code (to be completed within two weeks)
  • Provide pretrained weights download links
  • Support additional types of VAE models

LICENSE

The code is licensed under the Apache License 2.0. When using our repository to fine-tune other models, you must comply with the licenses of the respective pretrained models.
