Conversation
I think this implementation of control_noise_refiner doesn't work properly, because the image doesn't follow the control image and some artifacts are present in the output. It seems to me that it's dead weight in this model; perhaps you just need to continue using the control_layer as a refiner and, at the end of the forward pass, apply some dummy to bypass these layers.
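The "apply some dummy to bypass these layers" idea can be sketched framework-agnostically: swap the suspect refiner layers for an identity function so the rest of the forward pass runs unchanged. The class and function names below are hypothetical, not diffusers API:

```python
def identity_layer(hidden_states, *args, **kwargs):
    # Dummy layer: pass activations through unchanged,
    # effectively disabling the control noise refiner.
    return hidden_states

class TinyControlStack:
    """Toy stand-in for a transformer whose refiner layers can be bypassed."""

    def __init__(self, layers, bypass_refiner=False):
        # When bypassing, every refiner layer is replaced by the identity.
        self.layers = [identity_layer if bypass_refiner else layer for layer in layers]

    def forward(self, x):
        for layer in self.layers:
            x = layer(x)
        return x

noisy_refiner = lambda x: x + 1  # pretend this layer injects the artifacts
stack = TinyControlStack([noisy_refiner, noisy_refiner], bypass_refiner=True)
assert stack.forward(10) == 10  # refiner fully bypassed
```

In the real model the same effect could be had by replacing the modules with `torch.nn.Identity()`; the sketch only illustrates the control flow.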
Thanks,

Could you give a demo?
The issue is mainly due to an incorrectly written layer, which led to an extra 2.1 release that shouldn't have existed. This has been quite frustrating.
like this:

```json
{
  "_class_name": "ZImageControlTransformer2DModel",
  "_diffusers_version": "0.36.0.dev0",
  "all_f_patch_size": [1],
  "all_patch_size": [2],
  "axes_dims": [32, 48, 48],
  "axes_lens": [1536, 512, 512],
  "cap_feat_dim": 2560,
  "dim": 3840,
  "in_channels": 16,
  "n_heads": 30,
  "n_kv_heads": 30,
  "n_layers": 30,
  "n_refiner_layers": 2,
  "norm_eps": 1e-05,
  "qk_norm": true,
  "rope_theta": 256.0,
  "t_scale": 1000.0,
  "control_layers_places": [0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28],
  "control_refiner_layers_places": [0, 1],
  "add_control_noise_refiner": true,
  "control_in_dim": 33,
  "version": "v2.1"
}
```

so in your `ZImageControlTransformer2DModel` class `__init__` you can do something like this:

```python
@register_to_config
def __init__(
    self,
    control_layers_places=None,
    control_refiner_layers_places=None,
    control_in_dim=None,
    add_control_noise_refiner=None,
    all_patch_size=(2,),
    all_f_patch_size=(1,),
    in_channels=16,
    dim=3840,
    n_layers=30,
    n_refiner_layers=2,
    n_heads=30,
    n_kv_heads=30,
    norm_eps=1e-5,
    qk_norm=True,
    cap_feat_dim=2560,
    rope_theta=256.0,
    t_scale=1000.0,
    axes_dims=[32, 48, 48],
    axes_lens=[1024, 512, 512],
    version="v1",
):
    super().__init__()
    self.all_patch_size = all_patch_size
    self.all_f_patch_size = all_f_patch_size
    self.in_channels = in_channels
    self.out_channels = in_channels
    self.dim = dim
    self.n_layers = n_layers
    self.n_refiner_layers = n_refiner_layers
    self.n_heads = n_heads
    self.n_kv_heads = n_kv_heads
    self.norm_eps = norm_eps
    self.qk_norm = qk_norm
    self.cap_feat_dim = cap_feat_dim
    self.rope_theta = rope_theta
    self.t_scale = t_scale
    self.axes_dims = axes_dims
    self.axes_lens = axes_lens
    self.version = version
    self.gradient_checkpointing = False

    # Version-dependent defaults for the control branch: v1 checkpoints
    # have no control keys in config.json, so fall back to the legacy
    # layout; anything else gets the v2.x layout.
    if version == "v1":
        (
            _control_in_dim_default,
            _layers_places_default,
            _refiner_places_default,
            _add_refiner_default,
        ) = (16, [0, 5, 10, 15, 20, 25], [], False)
    else:
        (
            _control_in_dim_default,
            _layers_places_default,
            _refiner_places_default,
            _add_refiner_default,
        ) = (33, list(range(0, 30, 2)), [0, 1], True)
    self.control_in_dim = (
        control_in_dim if control_in_dim is not None else _control_in_dim_default
    )
    self.control_layers_places = (
        control_layers_places
        if control_layers_places is not None
        else _layers_places_default
    )
    self.control_refiner_layers_places = (
        control_refiner_layers_places
        if control_refiner_layers_places is not None
        else _refiner_places_default
    )
    self.add_control_noise_refiner = (
        add_control_noise_refiner
        if add_control_noise_refiner is not None
        else _add_refiner_default
    )
    # ...
```

If the file exists in the folder, whoever uses `from_pretrained` will pick up the values from `config.json`. If it doesn't exist, the safety fallback creates the values automatically. This is also useful for those who use `from_single_file` for a GGUF, for example, loading it by specifying the repository in the `config` parameter:

```python
transformer = ZImageControlTransformer2DModel.from_single_file(
    "https://huggingface.co/alibaba-pai/Z-Image-Turbo-Fun-Controlnet-Union-2.0/blob/main/Z-Image-Turbo-Fun-Controlnet-Union-2.0.gguf",
    torch_dtype=torch.bfloat16,
    config="https://huggingface.co/alibaba-pai/Z-Image-Turbo-Fun-Controlnet-Union-2.0",
)
```
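The version-gated default resolution in the `__init__` sketch above can be checked in isolation. A minimal standalone sketch (the helper name and return shape are mine, not diffusers API):

```python
def resolve_control_defaults(
    version,
    control_in_dim=None,
    control_layers_places=None,
    control_refiner_layers_places=None,
    add_control_noise_refiner=None,
):
    """Resolve control-branch config values: explicit config.json keys win,
    otherwise fall back to version-dependent defaults."""
    if version == "v1":
        defaults = (16, [0, 5, 10, 15, 20, 25], [], False)
    else:  # v2 / v2.1 checkpoints
        defaults = (33, list(range(0, 30, 2)), [0, 1], True)
    d_in, d_places, d_refiner, d_add = defaults
    return {
        "control_in_dim": control_in_dim if control_in_dim is not None else d_in,
        "control_layers_places": (
            control_layers_places if control_layers_places is not None else d_places
        ),
        "control_refiner_layers_places": (
            control_refiner_layers_places
            if control_refiner_layers_places is not None
            else d_refiner
        ),
        "add_control_noise_refiner": (
            add_control_noise_refiner if add_control_noise_refiner is not None else d_add
        ),
    }

# A v1 config.json without control keys resolves to the legacy layout,
# while explicit keys (as in the v2.1 config above) always take priority.
legacy = resolve_control_defaults("v1")
explicit = resolve_control_defaults("v2.1", control_in_dim=33)
```

This mirrors the `X if X is not None else default` pattern used in the constructor, so an empty list in config.json (falsy but not `None`) is still respected.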
Well, if version 2.1 works as expected, then perhaps there's no point in having a separate 2.1 version; a corrected 2.0 would only require updating the repository and notifying people to download again or do a "pull" on the repository to update the weights, thus avoiding a lot of extra code paths.
Hi @bubbliiiing, I'm the author of the Diffusers PR. Apologies for any confusion; I don't think it's necessary to change anything here. I've already handled the different versions by uploading HF Hub repos for the configs (and
I'll go ahead and merge this PR. Feel free to share any feedback or suggestions for changes. Thx.
Since the weights have already been propagated, and considering that the two sets of weights cannot be distinguished very precisely, I'm concerned this might affect performance in actual use.