| license: openrail++ | |
| library_name: diffusers | |
| inference: false | |
| tags: | |
| - lora | |
| - text-to-image | |
| - stable-diffusion | |
| # Hyper-SD | |
| Official Repository of the paper: *[Hyper-SD](https://arxiv.org/abs/2404.13686)*. | |
| Project Page: https://hyper-sd.github.io/ | |
|  | |
| ## News🔥🔥🔥 | |
| * Apr.26, 2024. 💥💥💥 Our CFG-Preserved Hyper-SD15/SDXL, which supports negative prompts and larger guidance scales (e.g. 5~10), will be coming soon!!! 💥💥💥 | |
| * Apr.26, 2024. Thanks to @[Pete](https://huggingface.co/pngwn) for contributing a larger canvas to our [scribble demo](https://huggingface.co/spaces/ByteDance/Hyper-SD15-Scribble) 👏. | |
| * Apr.24, 2024. The ComfyUI [workflow](https://huggingface.co/ByteDance/Hyper-SD/blob/main/comfyui/Hyper-SDXL-1step-Unet-workflow.json) and [checkpoint](https://huggingface.co/ByteDance/Hyper-SD/blob/main/Hyper-SDXL-1step-Unet-Comfyui.fp16.safetensors) for the 1-Step SDXL UNet ✨ are also available! Don't forget ⭕️ to install the custom [scheduler](https://huggingface.co/ByteDance/Hyper-SD/tree/main/comfyui/ComfyUI-HyperSDXL1StepUnetScheduler) in your `ComfyUI/custom_nodes` folder!!! | |
| * Apr.23, 2024. ComfyUI workflows on N-Steps LoRAs are [released](https://huggingface.co/ByteDance/Hyper-SD/tree/main/comfyui)! Worth a try for creators 💥! | |
| * Apr.23, 2024. Our technical report 📚 is uploaded to [arXiv](https://arxiv.org/abs/2404.13686)! Many implementation details are provided and we welcome more discussions👏. | |
| * Apr.21, 2024. Hyper-SD ⚡️ is highly compatible and works well with different base models and ControlNets. For clarity, we have also added a ControlNet usage example [here](https://huggingface.co/ByteDance/Hyper-SD#controlnet-usage). | |
| * Apr.20, 2024. Our checkpoints and two demos 🤗 (i.e. [SD15-Scribble](https://huggingface.co/spaces/ByteDance/Hyper-SD15-Scribble) and [SDXL-T2I](https://huggingface.co/spaces/ByteDance/Hyper-SDXL-1Step-T2I)) are publicly available on [HuggingFace Repo](https://huggingface.co/ByteDance/Hyper-SD). | |
| ## Try our Hugging Face demos: | |
| Hyper-SD Scribble demo hosted on [🤗 scribble](https://huggingface.co/spaces/ByteDance/Hyper-SD15-Scribble) | |
| Hyper-SDXL One-step Text-to-Image demo hosted on [🤗 T2I](https://huggingface.co/spaces/ByteDance/Hyper-SDXL-1Step-T2I) | |
| ## Introduction | |
| Hyper-SD is one of the new state-of-the-art diffusion model acceleration techniques. | |
| In this repository, we release the models distilled from [SDXL Base 1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0) and [Stable-Diffusion v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5). | |
| ## Checkpoints | |
| * `Hyper-SDXL-Nsteps-lora.safetensors`: LoRA checkpoint for SDXL-related models (the unified LoRA is named `Hyper-SDXL-1step-lora.safetensors`). | |
| * `Hyper-SD15-Nsteps-lora.safetensors`: LoRA checkpoint for SD1.5-related models (the unified LoRA is named `Hyper-SD15-1step-lora.safetensors`). | |
| * `Hyper-SDXL-1step-Unet.safetensors`: UNet checkpoint distilled from SDXL-Base. | |
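The LoRA file names used in the usage examples of this README follow a regular pattern, so they can be built from the base model family and the step count. The helper below is a minimal sketch, not part of the repository: the function name `hyper_sd_lora_filename` is hypothetical, and the 4-step and 8-step names are assumed to follow the same pattern as the 2-step names shown in the examples.

```python
def hyper_sd_lora_filename(base: str, steps: int) -> str:
    """Build a Hyper-SD LoRA file name (hypothetical helper, for illustration).

    base:  "SDXL" or "SD15"
    steps: 1 (unified LoRA), 2, 4 or 8
    """
    if base not in ("SDXL", "SD15"):
        raise ValueError(f"unknown base model family: {base!r}")
    if steps not in (1, 2, 4, 8):
        raise ValueError(f"no LoRA assumed for {steps} steps")
    # The examples use "1step" for the unified LoRA and "2steps" for the
    # 2-step LoRA; 4 and 8 are assumed to follow the plural form as well.
    suffix = "step" if steps == 1 else "steps"
    return f"Hyper-{base}-{steps}{suffix}-lora.safetensors"
```

For example, `hyper_sd_lora_filename("SDXL", 2)` gives `Hyper-SDXL-2steps-lora.safetensors`, the name passed to `hf_hub_download` in the SDXL 2-step example.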
| ## Text-to-Image Usage | |
| ### SDXL-related models | |
| #### 2-Steps, 4-Steps, 8-steps LoRA | |
| The example below uses the 2-steps LoRA. To use another LoRA, load its checkpoint and set `num_inference_steps` to the matching number of steps. | |
| ```python | |
| import torch | |
| from diffusers import DiffusionPipeline, DDIMScheduler | |
| from huggingface_hub import hf_hub_download | |
| base_model_id = "stabilityai/stable-diffusion-xl-base-1.0" | |
| repo_name = "ByteDance/Hyper-SD" | |
| # Take 2-steps lora as an example | |
| ckpt_name = "Hyper-SDXL-2steps-lora.safetensors" | |
| # Load model. | |
| pipe = DiffusionPipeline.from_pretrained(base_model_id, torch_dtype=torch.float16, variant="fp16").to("cuda") | |
| pipe.load_lora_weights(hf_hub_download(repo_name, ckpt_name)) | |
| pipe.fuse_lora() | |
| # Ensure ddim scheduler timestep spacing set as trailing !!! | |
| pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config, timestep_spacing="trailing") | |
| # guidance_scale=0 disables classifier-free guidance | |
| prompt="a photo of a cat" | |
| image=pipe(prompt=prompt, num_inference_steps=2, guidance_scale=0).images[0] | |
| ``` | |
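To see why the examples insist on `timestep_spacing="trailing"`, it helps to compare which timesteps a 2-step schedule visits under each rule. The sketch below re-implements the two spacing rules in plain Python as I understand them from the diffusers DDIM scheduler. It is an illustration rather than the library code, and it assumes 1000 training timesteps; the function names and the `steps_offset` default of 0 are illustrative choices.

```python
def trailing_timesteps(num_inference_steps, num_train_timesteps=1000):
    """Count down from the last training timestep in equal strides."""
    step = num_train_timesteps / num_inference_steps
    return [round(num_train_timesteps - i * step) - 1
            for i in range(num_inference_steps)]


def leading_timesteps(num_inference_steps, num_train_timesteps=1000, steps_offset=0):
    """Count up from timestep 0 in equal strides, then reverse."""
    step = num_train_timesteps // num_inference_steps
    return [i * step + steps_offset
            for i in reversed(range(num_inference_steps))]


print(trailing_timesteps(2))  # [999, 499]
print(leading_timesteps(2))   # [500, 0]
```

With trailing spacing, sampling starts at timestep 999, i.e. from the noisiest state. With leading spacing, a 2-step schedule would start near timestep 500 and skip the upper half of the trajectory, which a few-step model starting from pure noise is presumably not trained for.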
| #### Unified LoRA (support 1 to 8 steps inference) | |
| You can adjust the number of inference steps (1 to 8) and the eta value to get the best results. | |
| ```python | |
| import torch | |
| from diffusers import DiffusionPipeline, TCDScheduler | |
| from huggingface_hub import hf_hub_download | |
| base_model_id = "stabilityai/stable-diffusion-xl-base-1.0" | |
| repo_name = "ByteDance/Hyper-SD" | |
| ckpt_name = "Hyper-SDXL-1step-lora.safetensors" | |
| # Load model. | |
| pipe = DiffusionPipeline.from_pretrained(base_model_id, torch_dtype=torch.float16, variant="fp16").to("cuda") | |
| pipe.load_lora_weights(hf_hub_download(repo_name, ckpt_name)) | |
| pipe.fuse_lora() | |
| # Use TCD scheduler to achieve better image quality | |
| pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config) | |
| # Lower eta results in more detail for multi-steps inference | |
| eta=1.0 | |
| prompt="a photo of a cat" | |
| image=pipe(prompt=prompt, num_inference_steps=1, guidance_scale=0, eta=eta).images[0] | |
| ``` | |
| #### 1-step SDXL Unet | |
| This checkpoint supports single-step inference only. | |
| ```python | |
| import torch | |
| from diffusers import DiffusionPipeline, UNet2DConditionModel, LCMScheduler | |
| from huggingface_hub import hf_hub_download | |
| from safetensors.torch import load_file | |
| base_model_id = "stabilityai/stable-diffusion-xl-base-1.0" | |
| repo_name = "ByteDance/Hyper-SD" | |
| ckpt_name = "Hyper-SDXL-1step-Unet.safetensors" | |
| # Load model. | |
| unet = UNet2DConditionModel.from_config(base_model_id, subfolder="unet").to("cuda", torch.float16) | |
| unet.load_state_dict(load_file(hf_hub_download(repo_name, ckpt_name), device="cuda")) | |
| pipe = DiffusionPipeline.from_pretrained(base_model_id, unet=unet, torch_dtype=torch.float16, variant="fp16").to("cuda") | |
| # Use LCM scheduler instead of ddim scheduler to support specific timestep number inputs | |
| pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config) | |
| # Set start timesteps to 800 in the one-step inference to get better results | |
| prompt="a photo of a cat" | |
| image=pipe(prompt=prompt, num_inference_steps=1, guidance_scale=0, timesteps=[800]).images[0] | |
| ``` | |
| ### SD1.5-related models | |
| #### 2-Steps, 4-Steps, 8-steps LoRA | |
| The example below uses the 2-steps LoRA. To use another LoRA, load its checkpoint and set `num_inference_steps` to the matching number of steps. | |
| ```python | |
| import torch | |
| from diffusers import DiffusionPipeline, DDIMScheduler | |
| from huggingface_hub import hf_hub_download | |
| base_model_id = "runwayml/stable-diffusion-v1-5" | |
| repo_name = "ByteDance/Hyper-SD" | |
| # Take 2-steps lora as an example | |
| ckpt_name = "Hyper-SD15-2steps-lora.safetensors" | |
| # Load model. | |
| pipe = DiffusionPipeline.from_pretrained(base_model_id, torch_dtype=torch.float16, variant="fp16").to("cuda") | |
| pipe.load_lora_weights(hf_hub_download(repo_name, ckpt_name)) | |
| pipe.fuse_lora() | |
| # Ensure ddim scheduler timestep spacing set as trailing !!! | |
| pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config, timestep_spacing="trailing") | |
| prompt="a photo of a cat" | |
| image=pipe(prompt=prompt, num_inference_steps=2, guidance_scale=0).images[0] | |
| ``` | |
| #### Unified LoRA (support 1 to 8 steps inference) | |
| You can adjust the number of inference steps (1 to 8) and the eta value to get the best results. | |
| ```python | |
| import torch | |
| from diffusers import DiffusionPipeline, TCDScheduler | |
| from huggingface_hub import hf_hub_download | |
| base_model_id = "runwayml/stable-diffusion-v1-5" | |
| repo_name = "ByteDance/Hyper-SD" | |
| ckpt_name = "Hyper-SD15-1step-lora.safetensors" | |
| # Load model. | |
| pipe = DiffusionPipeline.from_pretrained(base_model_id, torch_dtype=torch.float16, variant="fp16").to("cuda") | |
| pipe.load_lora_weights(hf_hub_download(repo_name, ckpt_name)) | |
| pipe.fuse_lora() | |
| # Use TCD scheduler to achieve better image quality | |
| pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config) | |
| # Lower eta results in more detail for multi-steps inference | |
| eta=1.0 | |
| prompt="a photo of a cat" | |
| image=pipe(prompt=prompt, num_inference_steps=1, guidance_scale=0, eta=eta).images[0] | |
| ``` | |
| ## ControlNet Usage | |
| ### SDXL-related models | |
| #### 2-Steps, 4-Steps, 8-steps LoRA | |
| Take Canny Controlnet and 2-steps inference as an example: | |
| ```python | |
| import torch | |
| from diffusers.utils import load_image | |
| import numpy as np | |
| import cv2 | |
| from PIL import Image | |
| from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline, AutoencoderKL, DDIMScheduler | |
| from huggingface_hub import hf_hub_download | |
| # Load original image | |
| image = load_image("https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/sd_controlnet/hf-logo.png") | |
| image = np.array(image) | |
| # Prepare Canny Control Image | |
| low_threshold = 100 | |
| high_threshold = 200 | |
| image = cv2.Canny(image, low_threshold, high_threshold) | |
| image = image[:, :, None] | |
| image = np.concatenate([image, image, image], axis=2) | |
| control_image = Image.fromarray(image) | |
| control_image.save("control.png") | |
| control_weight = 0.5 # recommended for good generalization | |
| # Initialize pipeline | |
| controlnet = ControlNetModel.from_pretrained( | |
| "diffusers/controlnet-canny-sdxl-1.0", | |
| torch_dtype=torch.float16 | |
| ) | |
| vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16) | |
| pipe = StableDiffusionXLControlNetPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", controlnet=controlnet, vae=vae, torch_dtype=torch.float16).to("cuda") | |
| pipe.load_lora_weights(hf_hub_download("ByteDance/Hyper-SD", "Hyper-SDXL-2steps-lora.safetensors")) | |
| # Ensure ddim scheduler timestep spacing set as trailing !!! | |
| pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config, timestep_spacing="trailing") | |
| pipe.fuse_lora() | |
| image = pipe("A chocolate cookie", num_inference_steps=2, image=control_image, guidance_scale=0, controlnet_conditioning_scale=control_weight).images[0] | |
| image.save('image_out.png') | |
| ``` | |
| #### Unified LoRA (support 1 to 8 steps inference) | |
| Take Canny Controlnet as an example: | |
| ```python | |
| import torch | |
| from diffusers.utils import load_image | |
| import numpy as np | |
| import cv2 | |
| from PIL import Image | |
| from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline, AutoencoderKL, TCDScheduler | |
| from huggingface_hub import hf_hub_download | |
| # Load original image | |
| image = load_image("https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/sd_controlnet/hf-logo.png") | |
| image = np.array(image) | |
| # Prepare Canny Control Image | |
| low_threshold = 100 | |
| high_threshold = 200 | |
| image = cv2.Canny(image, low_threshold, high_threshold) | |
| image = image[:, :, None] | |
| image = np.concatenate([image, image, image], axis=2) | |
| control_image = Image.fromarray(image) | |
| control_image.save("control.png") | |
| control_weight = 0.5 # recommended for good generalization | |
| # Initialize pipeline | |
| controlnet = ControlNetModel.from_pretrained( | |
| "diffusers/controlnet-canny-sdxl-1.0", | |
| torch_dtype=torch.float16 | |
| ) | |
| vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16) | |
| pipe = StableDiffusionXLControlNetPipeline.from_pretrained( | |
| "stabilityai/stable-diffusion-xl-base-1.0", | |
| controlnet=controlnet, vae=vae, torch_dtype=torch.float16).to("cuda") | |
| # Load Hyper-SDXL-1step lora | |
| pipe.load_lora_weights(hf_hub_download("ByteDance/Hyper-SD", "Hyper-SDXL-1step-lora.safetensors")) | |
| pipe.fuse_lora() | |
| # Use TCD scheduler to achieve better image quality | |
| pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config) | |
| # Lower eta results in more detail for multi-steps inference | |
| eta=1.0 | |
| image = pipe("A chocolate cookie", num_inference_steps=4, image=control_image, guidance_scale=0, controlnet_conditioning_scale=control_weight, eta=eta).images[0] | |
| image.save('image_out.png') | |
| ``` | |
| ### SD1.5-related models | |
| #### 2-Steps, 4-Steps, 8-steps LoRA | |
| Take Canny Controlnet and 2-steps inference as an example: | |
| ```python | |
| import torch | |
| from diffusers.utils import load_image | |
| import numpy as np | |
| import cv2 | |
| from PIL import Image | |
| from diffusers import ControlNetModel, StableDiffusionControlNetPipeline, DDIMScheduler | |
| from huggingface_hub import hf_hub_download | |
| controlnet_checkpoint = "lllyasviel/control_v11p_sd15_canny" | |
| # Load original image | |
| image = load_image("https://huggingface.co/lllyasviel/control_v11p_sd15_canny/resolve/main/images/input.png") | |
| image = np.array(image) | |
| # Prepare Canny Control Image | |
| low_threshold = 100 | |
| high_threshold = 200 | |
| image = cv2.Canny(image, low_threshold, high_threshold) | |
| image = image[:, :, None] | |
| image = np.concatenate([image, image, image], axis=2) | |
| control_image = Image.fromarray(image) | |
| control_image.save("control.png") | |
| # Initialize pipeline | |
| controlnet = ControlNetModel.from_pretrained(controlnet_checkpoint, torch_dtype=torch.float16) | |
| pipe = StableDiffusionControlNetPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16).to("cuda") | |
| pipe.load_lora_weights(hf_hub_download("ByteDance/Hyper-SD", "Hyper-SD15-2steps-lora.safetensors")) | |
| pipe.fuse_lora() | |
| # Ensure ddim scheduler timestep spacing set as trailing !!! | |
| pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config, timestep_spacing="trailing") | |
| image = pipe("a blue paradise bird in the jungle", num_inference_steps=2, image=control_image, guidance_scale=0).images[0] | |
| image.save('image_out.png') | |
| ``` | |
| #### Unified LoRA (support 1 to 8 steps inference) | |
| Take Canny Controlnet as an example: | |
| ```python | |
| import torch | |
| from diffusers.utils import load_image | |
| import numpy as np | |
| import cv2 | |
| from PIL import Image | |
| from diffusers import ControlNetModel, StableDiffusionControlNetPipeline, TCDScheduler | |
| from huggingface_hub import hf_hub_download | |
| controlnet_checkpoint = "lllyasviel/control_v11p_sd15_canny" | |
| # Load original image | |
| image = load_image("https://huggingface.co/lllyasviel/control_v11p_sd15_canny/resolve/main/images/input.png") | |
| image = np.array(image) | |
| # Prepare Canny Control Image | |
| low_threshold = 100 | |
| high_threshold = 200 | |
| image = cv2.Canny(image, low_threshold, high_threshold) | |
| image = image[:, :, None] | |
| image = np.concatenate([image, image, image], axis=2) | |
| control_image = Image.fromarray(image) | |
| control_image.save("control.png") | |
| # Initialize pipeline | |
| controlnet = ControlNetModel.from_pretrained(controlnet_checkpoint, torch_dtype=torch.float16) | |
| pipe = StableDiffusionControlNetPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16).to("cuda") | |
| # Load Hyper-SD15-1step lora | |
| pipe.load_lora_weights(hf_hub_download("ByteDance/Hyper-SD", "Hyper-SD15-1step-lora.safetensors")) | |
| pipe.fuse_lora() | |
| # Use TCD scheduler to achieve better image quality | |
| pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config) | |
| # Lower eta results in more detail for multi-steps inference | |
| eta=1.0 | |
| image = pipe("a blue paradise bird in the jungle", num_inference_steps=1, image=control_image, guidance_scale=0, eta=eta).images[0] | |
| image.save('image_out.png') | |
| ``` | |
| ## ComfyUI Usage | |
| * `Hyper-SDXL-Nsteps-lora.safetensors`: [text-to-image workflow](https://huggingface.co/ByteDance/Hyper-SD/blob/main/comfyui/Hyper-SDXL-Nsteps-lora-workflow.json) | |
| * `Hyper-SD15-Nsteps-lora.safetensors`: [text-to-image workflow](https://huggingface.co/ByteDance/Hyper-SD/blob/main/comfyui/Hyper-SD15-Nsteps-lora-workflow.json) | |
| * `Hyper-SDXL-1step-Unet-Comfyui.fp16.safetensors`: [text-to-image workflow](https://huggingface.co/ByteDance/Hyper-SD/blob/main/comfyui/Hyper-SDXL-1step-Unet-workflow.json) | |
| * **REQUIREMENT / INSTALL** for 1-Step SDXL UNet: Please install our [scheduler folder](https://huggingface.co/ByteDance/Hyper-SD/tree/main/comfyui/ComfyUI-HyperSDXL1StepUnetScheduler) into your `ComfyUI/custom_nodes` to enable sampling from timestep 800 instead of 999. | |
| * i.e. make sure the `ComfyUI/custom_nodes/ComfyUI-HyperSDXL1StepUnetScheduler` folder exists. | |
| * For more details, please refer to our [technical report](https://arxiv.org/abs/2404.13686). | |
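The install requirement above comes down to one folder being present under `custom_nodes`. A minimal sketch of a check you could run before loading the 1-Step SDXL UNet workflow is shown below. The function name `scheduler_installed` is hypothetical, and the location of your ComfyUI checkout is an assumption you need to supply.

```python
from pathlib import Path

# Folder name taken from the install note above.
SCHEDULER_DIR = "ComfyUI-HyperSDXL1StepUnetScheduler"


def scheduler_installed(comfyui_root):
    """Return True if the custom scheduler folder exists under custom_nodes."""
    return (Path(comfyui_root) / "custom_nodes" / SCHEDULER_DIR).is_dir()
```

For example, `scheduler_installed("ComfyUI")` checks a checkout in the current directory. The check only confirms that the folder exists, not that ComfyUI has loaded the node.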
| ## Citation | |
| ```bibtex | |
| @misc{ren2024hypersd, | |
| title={Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis}, | |
| author={Yuxi Ren and Xin Xia and Yanzuo Lu and Jiacheng Zhang and Jie Wu and Pan Xie and Xing Wang and Xuefeng Xiao}, | |
| year={2024}, | |
| eprint={2404.13686}, | |
| archivePrefix={arXiv}, | |
| primaryClass={cs.CV} | |
| } | |
| ``` |