Stable Diffusion WebUI features
Original txt2img and img2img modes
One click install and run script (but you still must install python and git)
Outpainting
Inpainting
Color Sketch
Prompt Matrix
Stable Diffusion Upscale
Attention, specify parts of text that the model should pay more attention to
a man in a ((tuxedo)) - will pay more attention to tuxedo
a man in a (tuxedo:1.21) - alternative syntax
select text and press Ctrl+Up or Ctrl+Down (or Command+Up or Command+Down if you're on macOS) to automatically adjust attention to selected text (code contributed by anonymous user)
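As an illustrative sketch (not the web UI's actual prompt parser), the two weighting syntaxes above can be read like this: each pair of parentheses multiplies attention by 1.1, while `(text:weight)` sets the weight explicitly.

```python
import re

def parse_attention(token: str):
    """Toy parser for the two weighting syntaxes shown above.
    Returns (text, weight). Illustrative sketch only."""
    # explicit weight: (tuxedo:1.21)
    m = re.fullmatch(r"\((.+):([\d.]+)\)", token)
    if m:
        return m.group(1), float(m.group(2))
    # nested parentheses: each pair multiplies attention by 1.1
    m = re.fullmatch(r"(\(+)(.+?)(\)+)", token)
    if m and len(m.group(1)) == len(m.group(3)):
        return m.group(2), round(1.1 ** len(m.group(1)), 4)
    # no markup: default weight
    return token, 1.0
```

Note that `((tuxedo))` and `(tuxedo:1.21)` come out equivalent, since 1.1 × 1.1 = 1.21.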
Loopback, run img2img processing multiple times
X/Y/Z plot, a way to draw a 3 dimensional plot of images with different parameters
Textual Inversion
have as many embeddings as you want and use any names you like for them
use multiple embeddings with different numbers of vectors per token
works with half precision floating point numbers
train embeddings on 8GB (also reports of 6GB working)
Extras tab with:
GFPGAN, neural network that fixes faces
CodeFormer, face restoration tool as an alternative to GFPGAN
RealESRGAN, neural network upscaler
ESRGAN, neural network upscaler with a lot of third party models
SwinIR and Swin2SR (see here), neural network upscalers
LDSR, Latent diffusion super resolution upscaling
Resizing aspect ratio options
Sampling method selection
Adjust sampler eta values (noise multiplier)
More advanced noise setting options
Interrupt processing at any time
4GB video card support (also reports of 2GB working)
Correct seeds for batches
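"Correct seeds" here means each image in a batch is individually reproducible; a common convention (assumed here, matching how the UI reports per-image seeds) is that image i in the batch gets seed + i:

```python
def batch_seeds(seed: int, batch_size: int):
    """Sketch of the per-image seed convention: image i in a batch
    gets seed + i, so any single image can be regenerated alone."""
    return [seed + i for i in range(batch_size)]
```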
Live prompt token length validation
Generation parameters
parameters you used to generate images are saved with that image
in PNG chunks for PNG, in EXIF for JPEG
can drag the image to PNG info tab to restore generation parameters and automatically copy them into UI
can be disabled in settings
drag and drop an image/text-parameters to promptbox
Read Generation Parameters Button, loads parameters in promptbox to UI
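The saved parameter text has a simple layout: the prompt, an optional "Negative prompt:" line, and a final comma-separated "Key: value" line. A rough sketch of reading it back (not the web UI's actual parser):

```python
def parse_infotext(text: str) -> dict:
    """Rough sketch of parsing the generation-parameters text stored
    with images (PNG text chunk / JPEG EXIF). Assumes the common
    three-part layout described above."""
    lines = text.strip().split("\n")
    result = {"prompt": "", "negative_prompt": ""}
    # last line holds the settings if it contains "Key: value" pairs
    settings = lines[-1] if ":" in lines[-1] else ""
    body = lines[:-1] if settings else lines
    for line in body:
        if line.startswith("Negative prompt:"):
            result["negative_prompt"] = line[len("Negative prompt:"):].strip()
        else:
            result["prompt"] += (" " if result["prompt"] else "") + line
    for pair in settings.split(", "):
        if ": " in pair:
            key, _, value = pair.partition(": ")
            result[key] = value
    return result
```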
Settings page
Running arbitrary python code from UI (must run with --allow-code to enable)
Mouseover hints for most UI elements
Possible to change defaults/min/max/step values for UI elements via text config
Tiling support, a checkbox to create images that can be tiled like textures
Progress bar and live image generation preview
Can use a separate neural network to produce previews with almost no VRAM or compute requirements
Negative prompt, an extra text field that allows you to list what you don't want to see in generated image
Styles, a way to save part of prompt and easily apply them via dropdown later
Variations, a way to generate same image but with tiny differences
Seed resizing, a way to generate same image but at slightly different resolution
CLIP interrogator, a button that tries to guess prompt from an image
Prompt Editing, a way to change prompt mid-generation, say to start making a watermelon and switch to anime girl midway
Batch Processing, process a group of files using img2img
Img2img Alternative, reverse Euler method of cross attention control
Highres Fix, a convenience option to produce high resolution pictures in one click without usual distortions
Reloading checkpoints on the fly
Checkpoint Merger, a tab that allows you to merge up to 3 checkpoints into one
Custom scripts with many extensions from community
Composable-Diffusion, a way to use multiple prompts at once
separate prompts using uppercase AND
also supports weights for prompts: a cat :1.2 AND a dog AND a penguin :2.2
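A minimal sketch (assumed behavior, not the web UI's implementation) of splitting such a prompt on uppercase AND and pulling out per-subprompt weights, defaulting to 1.0:

```python
def split_composable(prompt: str):
    """Split a Composable-Diffusion prompt on uppercase AND and
    extract trailing ':weight' values. Illustrative sketch only."""
    parts = []
    for sub in prompt.split(" AND "):
        sub = sub.strip()
        text, sep, weight = sub.rpartition(":")
        # accept the suffix as a weight only if it parses as a number
        if sep and weight.replace(".", "", 1).isdigit():
            parts.append((text.strip(), float(weight)))
        else:
            parts.append((sub, 1.0))
    return parts
```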
No token limit for prompts (original stable diffusion lets you use up to 75 tokens)
DeepDanbooru integration, creates danbooru style tags for anime prompts
xformers, major speed increase for select cards: (add --xformers to commandline args)
via extension: History tab: view, direct and delete images conveniently within the UI
Generate forever option
Training tab
hypernetworks and embeddings options
Preprocessing images: cropping, mirroring, autotagging using BLIP or deepdanbooru (for anime)
Clip skip
Hypernetworks
Loras (same as Hypernetworks but more pretty)
A separate UI where you can choose, with preview, which embeddings, hypernetworks or Loras to add to your prompt
Can select to load a different VAE from settings screen
Estimated completion time in progress bar
API
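When started with the --api flag, the web UI exposes JSON endpoints such as /sdapi/v1/txt2img. A minimal client sketch (the base URL and the small subset of payload fields shown here are assumptions; the API accepts many more options):

```python
import json
from urllib import request

def build_txt2img_payload(prompt, steps=20, width=512, height=512,
                          negative_prompt=""):
    """Assemble a minimal request body for /sdapi/v1/txt2img.
    Only a few common fields are shown here."""
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "steps": steps,
        "width": width,
        "height": height,
    }

def txt2img(payload, base_url="http://127.0.0.1:7860"):
    """POST the payload to a locally running web UI started with --api.
    Returns the decoded JSON response, which contains a list of
    base64-encoded images under the "images" key."""
    req = request.Request(
        base_url + "/sdapi/v1/txt2img",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())
```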
Support for dedicated inpainting model by RunwayML
via extension: Aesthetic Gradients, a way to generate images with a specific aesthetic by using clip images embeds (implementation of https://github.com/vicgalle/stable-diffusion-aesthetic-gradients)
Stable Diffusion 2.0 support - see wiki for instructions
Alt-Diffusion support - see wiki for instructions
Now without any bad letters!
Load checkpoints in safetensors format
Eased resolution restriction: generated image's dimensions must be a multiple of 8 rather than 64
Now with a license!
Reorder elements in the UI from settings screen