	---
	license: apache-2.0
	library_name: diffusers
	pipeline_tag: image-to-image
	base_model: shallowdream204/BitDance-Tokenizer
	language:
	- en
	tags:
	- bitdance
	- tokenizer
	- autoencoder
	- custom-architecture
	- diffusers
	---

	# BitDance-Tokenizer (Diffusers)

	Diffusers-formatted BitDance tokenizer autoencoders (AE), converted from the upstream `shallowdream204/BitDance-Tokenizer` checkpoints.

	## Available Autoencoders

	- `ae_d16c32` (`z_channels=32`, `gan_decoder=false`)
	- `ae_d32c128` (`z_channels=128`, `gan_decoder=true`)
	- `ae_d32c256` (`z_channels=256`, `gan_decoder=true`)

	Each subfolder includes:

	- `config.json` with the autoencoder architecture
	- `conversion_metadata.json` documenting the source checkpoint and config
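As a quick way to inspect what was converted, the two JSON files can be read with the standard library alone. The sketch below assumes a local clone of this repository. The `describe_autoencoder` helper is hypothetical, and the key names `z_channels` and `gan_decoder` are assumed to appear in `config.json` under the same names used in the list above; keys that are absent are reported as `None` rather than guessed.

```python
import json
from pathlib import Path


def describe_autoencoder(repo_dir, subfolder):
    """Summarize one autoencoder subfolder from its two JSON files.

    Hypothetical helper: only the two file names come from this README.
    """
    folder = Path(repo_dir) / subfolder
    config = json.loads((folder / "config.json").read_text(encoding="utf-8"))

    # conversion_metadata.json is treated as optional here so that a partial
    # checkout still yields a usable summary.
    metadata_path = folder / "conversion_metadata.json"
    metadata = {}
    if metadata_path.exists():
        metadata = json.loads(metadata_path.read_text(encoding="utf-8"))

    return {
        "subfolder": subfolder,
        # Assumed key names; .get() returns None if the config names differ.
        "z_channels": config.get("z_channels"),
        "gan_decoder": config.get("gan_decoder"),
        "metadata_keys": sorted(metadata),
    }
```

From the root of a local clone, `describe_autoencoder(".", "ae_d16c32")` would return a small dictionary for that variant; the same call works for the other two subfolders.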

	## Test (load tokenizer only)

	This repo is self-contained: it includes the `bitdance_diffusers` package (copied from BitDance-14B-64x-diffusers), which provides the `BitDanceAutoencoder` class. The bundled test loads all three autoencoders and runs a quick encode/decode check with `ae_d16c32`; it does not run full image generation.

	## Loading tokenizer autoencoders

	```python
	import sys
	from pathlib import Path

	# Self-contained: add this script's directory to sys.path so the bundled bitdance_diffusers package is importable
	BASE_DIR = Path(__file__).resolve().parent
	sys.path.insert(0, str(BASE_DIR))

	from bitdance_diffusers import BitDanceAutoencoder

	# Load any tokenizer autoencoder (use repo path or local path)
	ae = BitDanceAutoencoder.from_pretrained(
	    "BiliSakura/BitDance-Tokenizer-diffusers",  # or str(BASE_DIR) for local
	    subfolder="ae_d16c32",
	)
	# ae_d16c32: z_channels=32, patch_size=16
	# ae_d32c128: z_channels=128, patch_size=32
	# ae_d32c256: z_channels=256, patch_size=32
	```
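To make the naming concrete, the sketch below tabulates the three variants and estimates the latent grid for a given image size. It assumes that `patch_size` is the spatial downsampling factor of the autoencoder, which this README does not state explicitly. The `AE_VARIANTS` table and the `latent_shape` helper are illustrative names, not part of `bitdance_diffusers`.

```python
# Values copied from the comments in the loading example; the table name is hypothetical.
AE_VARIANTS = {
    "ae_d16c32": {"z_channels": 32, "patch_size": 16},
    "ae_d32c128": {"z_channels": 128, "patch_size": 32},
    "ae_d32c256": {"z_channels": 256, "patch_size": 32},
}


def latent_shape(variant, height, width):
    """Return (channels, latent_height, latent_width) for an image size.

    Assumption: patch_size is the spatial downsampling factor, so each side
    must be divisible by it.
    """
    spec = AE_VARIANTS[variant]
    patch = spec["patch_size"]
    if height % patch or width % patch:
        raise ValueError(f"{height}x{width} is not divisible by patch_size={patch}")
    return spec["z_channels"], height // patch, width // patch


print(latent_shape("ae_d16c32", 1024, 1024))   # (32, 64, 64)
print(latent_shape("ae_d32c128", 1024, 1024))  # (128, 32, 32)
```

Under that assumption, the `d32` variants produce a grid with a quarter as many spatial positions as `ae_d16c32` for the same image, with more channels per position.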

	## Using with a BitDance pipeline (full inference)

	To swap a tokenizer into a BitDance diffusers pipeline for image generation:

	```python
	import torch
	from diffusers import DiffusionPipeline

	# Load a BitDance diffusers pipeline first (provides BitDanceAutoencoder class).
	pipe = DiffusionPipeline.from_pretrained(
	    "BiliSakura/BitDance-14B-16x-diffusers",
	    trust_remote_code=True,
	    torch_dtype=torch.bfloat16,
	).to("cuda")

	# Swap in a tokenizer autoencoder from this repository.
	pipe.autoencoder = pipe.autoencoder.__class__.from_pretrained(
	    "BiliSakura/BitDance-Tokenizer-diffusers",
	    subfolder="ae_d16c32",
	).to("cuda")

	image = pipe(
	    prompt="A watercolor painting of a red fox in a snowy forest.",
	    height=1024,
	    width=1024,
	).images[0]
	image.save("bitdance_with_custom_tokenizer.png")
	```

	> Note: this repository stores only the tokenizer autoencoder components. The custom pipeline classes come from a BitDance runtime repo (such as `BiliSakura/BitDance-14B-16x-diffusers`), which must be loaded with `trust_remote_code=True`.
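
Before swapping, it may be worth checking that the replacement autoencoder matches what the pipeline was built for. The sketch below compares selected config fields of two autoencoders passed as plain mappings. The `config_mismatches` helper and the choice of fields (`z_channels`, `patch_size`) are assumptions for illustration; whether a given pipeline tolerates a mismatch is not stated in this README.

```python
def config_mismatches(current, replacement, keys=("z_channels", "patch_size")):
    """Return {key: (current_value, replacement_value)} for fields that differ.

    Both arguments are plain mappings, e.g. dict(model.config) for a model
    that exposes a diffusers-style config (assumed, not confirmed, here).
    """
    diffs = {}
    for key in keys:
        old, new = current.get(key), replacement.get(key)
        if old != new:
            diffs[key] = (old, new)
    return diffs


# Stand-in configs using the values listed in this README.
d16c32 = {"z_channels": 32, "patch_size": 16}
d32c128 = {"z_channels": 128, "patch_size": 32}

print(config_mismatches(d16c32, d16c32))   # {}
print(config_mismatches(d16c32, d32c128))  # {'z_channels': (32, 128), 'patch_size': (16, 32)}
```

With real models this could be called as `config_mismatches(dict(pipe.autoencoder.config), dict(new_ae.config))`, assuming both objects expose a `config` mapping.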

	## Citation

	If you use this model, please cite BitDance and Diffusers:

	```bibtex
	@article{ai2026bitdance,
	  title = {BitDance: Scaling Autoregressive Generative Models with Binary Tokens},
	  author = {Ai, Yuang and Han, Jiaming and Zhuang, Shaobin and Hu, Xuefeng and Yang, Ziyan and Yang, Zhenheng and Huang, Huaibo and Yue, Xiangyu and Chen, Hao},
	  journal = {arXiv preprint arXiv:2602.14041},
	  year = {2026}
	}

	@misc{von-platen-etal-2022-diffusers,
	  title = {Diffusers: State-of-the-art diffusion models},
	  author = {Patrick von Platen and Suraj Patil and Anton Lozhkov and Pedro Cuenca and Nathan Lambert and Kashif Rasul and Mishig Davaadorj and Dhruv Nair and Sayak Paul and William Berman and Yiyi Xu and Steven Liu and Thomas Wolf},
	  year = {2022},
	  publisher = {GitHub},
	  journal = {GitHub repository},
	  howpublished = {\url{https://github.com/huggingface/diffusers}}
	}
	```

	## License

	This repository is distributed under the Apache-2.0 license, consistent with the upstream BitDance release.