Deep Learning: The Diffusers Library (personal notes)

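(This article covers installing the Diffusers library and its dependencies; understanding its core concepts (Pipeline, Model, Scheduler); running inference with pretrained models (text-to-image, image-to-image, and so on); customizing models and schedulers; training your own diffusion model (optional, and resource-intensive); and advanced applications such as ControlNet and LoRA.)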

Official site: huggingface

Mirror: mirror

I. Installing the Diffusers library

Diffusers has been tested on Python 3.8+, PyTorch 1.7.0+, and Flax. Follow the installation instructions below that match the deep learning library you are using:

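# Install with conda
# After activating your virtual environment, install with conda (the package is community-maintained):
conda install -c conda-forge diffusers

# Install from source
# Before installing 🤗 Diffusers from source, make sure PyTorch and 🤗 Accelerate are installed.
# To install 🤗 Accelerate:
pip install accelerate
# Then install 🤗 Diffusers from source:
pip install git+https://github.com/huggingface/diffusers

# Editable install
# Clone the repository, then install it in editable mode:
git clone https://github.com/huggingface/diffusers.git
cd diffusers
pip install -e ".[torch]"

# Update the clone to the latest version of 🤗 Diffusers:
cd ~/diffusers/
git pull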

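Installing from source gives you the bleeding-edge development version rather than the latest stable release. This is useful for keeping up with the latest developments, for example when a bug has been fixed since the last official release but a new release has not yet been published. It also means the version may not always be stable. The maintainers aim to keep it working, and most problems are usually resolved within a few hours or a day. If you run into a problem, open an issue at https://github.com/huggingface/diffusers/issues/new/choose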
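After installing, it can help to confirm which versions actually ended up in the environment. The following is a minimal sketch using only the Python standard library; the function name `installed_versions` and the default package list are choices made for this illustration, not part of Diffusers.

```python
from importlib import metadata


def installed_versions(packages=("diffusers", "accelerate", "torch", "safetensors")):
    """Return a dict mapping each package name to its version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in the current environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

A `None` entry means the package could not be found, which usually points to the wrong virtual environment being active.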

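II. Model files and layouts

Diffusion models are saved in various file types and organized in different layouts. Diffusers stores model weights as safetensors files in the Diffusers-multifolder layout, and it also supports loading files (such as safetensors and ckpt files) from the single-file layout that is common in the diffusion ecosystem. Each layout has its own advantages and use cases. This guide shows how to load the different files and layouts, and how to convert between them.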

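1. The Safetensors library

Safetensors is a safe and fast file format for storing and loading tensors. It caps the header size to rule out certain kinds of attacks, supports lazy loading (useful in distributed setups), and generally loads faster.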

# Make sure the Safetensors library is installed (the leading "!" runs a shell command from a notebook cell).
!pip install safetensors
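To see why the header matters for safety and lazy loading, here is a small standard-library sketch of the published safetensors layout: an 8-byte little-endian header length, a JSON header describing each tensor, then the raw tensor bytes. The helper names and the `MAX_HEADER_BYTES` value are assumptions for this illustration; for real files, use the `safetensors` library rather than this code.

```python
import json
import struct

# Illustrative cap only; the real library enforces its own header-size limit.
MAX_HEADER_BYTES = 100_000_000


def write_tiny_safetensors(path):
    """Hand-write a one-tensor file that follows the safetensors layout."""
    data = struct.pack("<2f", 1.0, 2.0)  # two float32 values, 8 bytes
    header = {"weight": {"dtype": "F32", "shape": [2], "data_offsets": [0, len(data)]}}
    header_bytes = json.dumps(header, separators=(",", ":")).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))  # 8-byte header length
        f.write(header_bytes)                          # JSON header
        f.write(data)                                  # raw tensor bytes


def read_header(path):
    """Read only the JSON header, without touching the tensor data."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        if header_len > MAX_HEADER_BYTES:
            raise ValueError(f"header of {header_len} bytes exceeds the cap")
        return json.loads(f.read(header_len))
```

Because the header lists each tensor's dtype, shape, and byte offsets, a reader can seek straight to one tensor instead of loading the whole file, and no code is executed while parsing.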

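Diffusers is an open-source Python library developed by Hugging Face that simplifies deploying and using diffusion models. Safetensors model files can be organized in two layouts:

Diffusers-multifolder layout: there may be several separate safetensors files, one per pipeline component (text encoder, UNet, VAE), organized in subfolders (see the stable-diffusion-v1-5/stable-diffusion-v1-5 repository for an example)
Single-file layout: all model weights may be saved in one file (see the WarriorMama777/OrangeMixs repository for an example)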
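The two layouts can usually be told apart on disk. The sketch below is a heuristic written for these notes, not a Diffusers API: it assumes that a multifolder checkpoint has a `model_index.json` file at its root and that single-file checkpoints end in `.safetensors` or `.ckpt`. The name `guess_layout` is hypothetical.

```python
from pathlib import Path

SINGLE_FILE_SUFFIXES = {".safetensors", ".ckpt"}


def guess_layout(path):
    """Guess whether a local path is a multifolder or single-file checkpoint."""
    p = Path(path)
    # Assumption: a multifolder pipeline directory carries model_index.json at its root.
    if p.is_dir() and (p / "model_index.json").is_file():
        return "diffusers-multifolder"
    if p.is_file() and p.suffix in SINGLE_FILE_SUFFIXES:
        return "single-file"
    return "unknown"
```

A "diffusers-multifolder" result suggests loading with from_pretrained(), while "single-file" suggests from_single_file().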

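2. LoRA files

LoRA is a lightweight adapter that is fast and easy to train, which makes it especially popular for generating images in a particular way or style. These adapters are usually stored in safetensors files and are widely shared on model-sharing platforms such as civitai. A LoRA is loaded into a base model with the load_lora_weights() method.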

from diffusers import StableDiffusionXLPipeline
import torch

# base model
pipeline = StableDiffusionXLPipeline.from_pretrained(
    "Lykon/dreamshaper-xl-1-0", torch_dtype=torch.float16, variant="fp16"
).to("cuda")

# download LoRA weights (notebook shell command)
!wget https://civitai.com/api/download/models/168776 -O blueprintify.safetensors

# load LoRA weights
pipeline.load_lora_weights(".", weight_name="blueprintify.safetensors")
prompt = "bl3uprint, a highly detailed blueprint of the empire state building, explaining how to build all parts, many txt, blueprint grid backdrop"
negative_prompt = "lowres, cropped, worst quality, low quality, normal quality, artifacts, signature, watermark, username, blurry, more than one bridge, bad architecture"

# generate with a fixed seed so the result is reproducible
image = pipeline(prompt=prompt, negative_prompt=negative_prompt, generator=torch.manual_seed(0)).images[0]
image
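A LoRA file is small because low-rank adaptation stores, for each adapted weight matrix of shape d_out × d_in, two thin matrices of shapes d_out × r and r × d_in instead of a full-size update. The arithmetic below is a sketch with hypothetical dimensions (1280 × 1280, rank 8) chosen only for illustration; they are not taken from any particular model.

```python
def lora_param_counts(d_out, d_in, rank):
    """Return (full_update_params, lora_params) for one weight matrix."""
    full = d_out * d_in           # a full-size update has one value per weight
    lora = rank * (d_out + d_in)  # B is d_out x rank, A is rank x d_in
    return full, lora


full, lora = lora_param_counts(d_out=1280, d_in=1280, rank=8)
print(f"full update: {full} params, LoRA: {lora} params, ratio: {lora / full:.4f}")
```

For these numbers the adapter holds 20,480 values against 1,638,400 for a full update, about 1.25% of the size.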

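3. CKPT

PyTorch's torch.save function uses Python's pickle utility to serialize and save models. These files are saved as ckpt files and contain the weights of the entire model.

Use the from_single_file() method to load a ckpt file directly.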

from diffusers import StableDiffusionPipeline
pipeline = StableDiffusionPipeline.from_single_file(
    "https://huggingface.co/stable-diffusion-v1-5/stable-diffusion-v1-5/blob/main/v1-5-pruned.ckpt"
)
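Because ckpt files rely on pickle, loading one can execute code chosen by whoever created the file, which is the main reason safetensors is preferred for weights from untrusted sources. The harmless sketch below shows the mechanism using only the standard library; the `Payload` class is invented for this demonstration and simply calls `print` during unpickling.

```python
import pickle


class Payload:
    """An object that asks pickle to call a function when it is loaded."""

    def __reduce__(self):
        # pickle stores "call print(...) with this argument" instead of object state.
        return (print, ("this code ran during unpickling",))


blob = pickle.dumps(Payload())

# Loading the bytes triggers the call; the result is whatever the callable returned.
result = pickle.loads(blob)
```

Here the callable is `print`, so `result` is `None` and only a message appears, but a malicious file could name any importable callable. Only load ckpt files from sources you trust.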

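4. Storage layouts

Model files can be organized in two ways: the Diffusers-multifolder layout and the single-file layout. Diffusers-multifolder is the default, with each component file (text encoder, UNet, VAE) stored in its own subfolder. Diffusers also supports loading models from the single-file layout, in which all components are bundled together.

Diffusers-multifolder

The Diffusers-multifolder layout is the default storage layout for Diffusers. The weights of each component (text encoder, UNet, VAE) are stored in a separate subfolder, and can be saved as safetensors or ckpt files.

To load from the Diffusers-multifolder layout, use the from_pretrained() method.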

import torch
from diffusers import DiffusionPipeline
pipeline = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
).to("cuda")

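Benefits of the Diffusers-multifolder layout include:

Faster loading, since each component file can be loaded individually or in parallel.
Lower memory usage, because you load only the components you need. For example, models such as SDXL Turbo, SDXL Lightning, and Hyper-SD share every component except the UNet. With the from_pipe() method you can reuse the shared components without consuming any additional memory (see the Reuse a pipeline guide) and load only the UNet. This avoids downloading redundant components and using more memory than necessary.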

import torch
from diffusers import StableDiffusionXLPipeline, UNet2DConditionModel, EulerDiscreteScheduler

# download one model
sdxl_pipeline = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
).to("cuda")

# switch UNet for another model
unet = UNet2DConditionModel.from_pretrained(
    "stabilityai/sdxl-turbo", subfolder="unet", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
)
# reuse all the same components in new model except for the UNet
turbo_pipeline = StableDiffusionXLPipeline.from_pipe(sdxl_pipeline, unet=unet).to("cuda")
# keyword fixed: timestep_spacing (the original "timestep+spacing" is a syntax error)
turbo_pipeline.scheduler = EulerDiscreteScheduler.from_config(
    turbo_pipeline.scheduler.config, timestep_spacing="trailing"
)
image = turbo_pipeline(
    "an astronaut riding a unicorn on mars", num_inference_steps=1, guidance_scale=0.0
).images[0]
image
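The memory saving comes from sharing component objects by reference rather than copying them. The toy class below is not Diffusers code; `SharedComponentPipeline` and its `from_pipe` are hypothetical stand-ins that mimic the idea so it can be checked without downloading any model.

```python
class SharedComponentPipeline:
    """Toy pipeline that just holds named components."""

    def __init__(self, **components):
        self.components = dict(components)

    @classmethod
    def from_pipe(cls, other, **overrides):
        # Reuse the other pipeline's component objects, replacing only the overrides.
        return cls(**{**other.components, **overrides})


base = SharedComponentPipeline(text_encoder=object(), vae=object(), unet="base-unet")
turbo = SharedComponentPipeline.from_pipe(base, unet="turbo-unet")

# Components that are the very same object in both pipelines.
shared = [name for name in base.components if turbo.components[name] is base.components[name]]
print(shared)
```

In this toy, the text encoder and VAE are shared and only the UNet differs. With the real pipelines above, a similar identity check (for example on the `vae` attribute) would be expected to hold, though that has not been run here.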

III. Understanding the core concepts of Diffusers

Pipeline, Model, Scheduler

Overview of core features

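Feature	Example	Illustrative call
Text-to-image	Input "a castle under a starry sky" → generate a high-resolution image	`pipe("a castle under a starry sky").images[0]`
Image-to-image	Turn a photo into Van Gogh style	`pipe(image=input_image, prompt="Van Gogh style")`
Inpainting	Fill in the damaged areas of an old photo	`inpaint_pipeline(prompt=prompt, image=original_image, mask_image=mask)`
Video generation	Generate a 3-second animated clip	`video_pipe("a dancing robot", num_frames=24)`
Audio synthesis	Convert text to natural-sounding speech	`audio_pipe("Hello, world", output_type="mp3")`

Note: the calls in the last column are schematic. Names such as pipe, inpaint_pipeline, video_pipe, and audio_pipe stand for pipelines that have already been loaded, and the accepted arguments depend on the specific pipeline class.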

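Core concepts quick reference

Concept	Description	Example class
Pipeline	End-to-end generation workflow	`StableDiffusionPipeline`
Scheduler	Controls the diffusion (denoising) process	`EulerDiscreteScheduler`
Model	Core neural network	`UNet2DConditionModel`
VAE	Image encoding/decoding	`AutoencoderKL`
Tokenizer	Text processing	`CLIPTokenizer`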
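The division of labor among Pipeline, Model, and Scheduler can be sketched with a toy denoising loop. Everything below is invented for illustration (the `Toy*` classes, the 1/t update rule, the list-of-floats "image"). Only the method names `set_timesteps` and `step` and the `timesteps` attribute are borrowed from the scheduler interface in Diffusers, and the math is deliberately much simpler than a real scheduler's.

```python
import random


class ToyModel:
    """Stands in for the UNet: predicts the noise contained in a sample."""

    def __init__(self, target):
        self.target = target

    def __call__(self, sample, t):
        return [s - c for s, c in zip(sample, self.target)]


class ToyScheduler:
    """Decides the timesteps and how much predicted noise to remove at each one."""

    def set_timesteps(self, num_inference_steps):
        self.timesteps = list(range(num_inference_steps, 0, -1))

    def step(self, noise_pred, t, sample):
        # Remove a 1/t share of the predicted noise; the final step (t=1) removes all of it.
        return [s - n / t for s, n in zip(sample, noise_pred)]


class ToyPipeline:
    """Ties model and scheduler together into one end-to-end call."""

    def __init__(self, model, scheduler):
        self.model = model
        self.scheduler = scheduler

    def __call__(self, num_inference_steps, seed=0):
        rng = random.Random(seed)
        sample = [rng.gauss(0.0, 1.0) for _ in self.model.target]  # start from pure noise
        self.scheduler.set_timesteps(num_inference_steps)
        for t in self.scheduler.timesteps:
            noise_pred = self.model(sample, t)                       # Model: predict noise
            sample = self.scheduler.step(noise_pred, t, sample)      # Scheduler: denoise one step
        return sample


pipe = ToyPipeline(ToyModel(target=[1.0, 2.0, 3.0]), ToyScheduler())
output = pipe(num_inference_steps=4)
```

The point of the split is that the scheduler can be swapped (as with EulerDiscreteScheduler earlier) without retraining the model, and the pipeline hides the loop behind a single call.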

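3. Running inference with pretrained models (text-to-image, image-to-image, and so on)

4. Customizing models and schedulers

5. Training your own diffusion model (optional, resource-intensive)

6. Advanced applications: ControlNet, LoRA, and more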
