
Add control noise refiner correctly#404

Merged
bubbliiiing merged 8 commits into main from add_control_noise_refiner_correctly on Dec 18, 2025

Conversation

@bubbliiiing
Collaborator

No description provided.

@elismasilva

elismasilva commented Dec 16, 2025

I think this implementation of control_noise_refiner doesn't work properly: the generated image doesn't follow the control image, and some artifacts appear in the output. It seems to me that it's dead weight in this model; perhaps you just need to continue using the control_layer as a refiner and, at the end of the forward pass, apply some dummy layers to bypass these layers.

With control noise refiner (not followed pose):
image (34)

Without control noise(followed pose):
image (35)

@bubbliiiing
Collaborator Author

I think this implementation of control_noise_refiner doesn't work properly, because the generated image doesn't follow the control image, and some artifacts appear in the output. It seems to me that it's dead weight in this model; perhaps you just need to continue using the control_layer as a refiner and, at the end of the forward pass, apply some dummy layers to bypass these layers.

With control noise refiner (not followed pose): image (34)

Without control noise (followed pose): image (35)

The old model cannot predict properly and needs to be retrained. I'm currently uploading the newly trained model.

@bubbliiiing
Collaborator Author

speed 2.1
image
speed 2.0
image

@bubbliiiing bubbliiiing requested a review from hkunzhe December 17, 2025 10:05
@elismasilva

elismasilva commented Dec 17, 2025

speed 2.1
image
speed 2.0
image

Thanks.

Since the models internally share the same key structure, I think it would be beneficial to have separate repositories with a config.json file for each. Within that config you could include a key like "version": "v1", where the value changes to "v2", and so on. Then in the model code you check that key to determine the flow; with this you can even discard the config.yaml and move its keys into the .json. This way you stay aligned with the diffusers standard and don't need to pass any more parameters via custom kwargs.

@bubbliiiing
Collaborator Author

Could you give a demo?

@bubbliiiing
Collaborator Author

The issue is mainly due to an incorrectly written layer, which caused an extra 2.1 that shouldn't exist. This has been quite frustrating.

@elismasilva

elismasilva commented Dec 17, 2025

Could you give a demo?

Like this:

{
  "_class_name": "ZImageControlTransformer2DModel",
  "_diffusers_version": "0.36.0.dev0",
  "all_f_patch_size": [
    1
  ],
  "all_patch_size": [
    2
  ],
  "axes_dims": [
    32,
    48,
    48
  ],
  "axes_lens": [
    1536,
    512,
    512
  ],
  "cap_feat_dim": 2560,
  "dim": 3840,
  "in_channels": 16,
  "n_heads": 30,
  "n_kv_heads": 30,
  "n_layers": 30,
  "n_refiner_layers": 2,
  "norm_eps": 1e-05,
  "qk_norm": true,
  "rope_theta": 256.0,
  "t_scale": 1000.0,
  "control_layers_places": [0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28],
  "control_refiner_layers_places": [0, 1],
  "add_control_noise_refiner": true,  
  "control_in_dim": 33,
  "version": "v2.1"
}

So in your ZImageControlTransformer2DModel class `__init__` you can do something like this:

    @register_to_config
    def __init__(
        self,
        control_layers_places=None,
        control_refiner_layers_places=None,
        control_in_dim=None,
        add_control_noise_refiner=None,
        all_patch_size=(2,),
        all_f_patch_size=(1,),
        in_channels=16,
        dim=3840,
        n_layers=30,
        n_refiner_layers=2,
        n_heads=30,
        n_kv_heads=30,
        norm_eps=1e-5,
        qk_norm=True,
        cap_feat_dim=2560,
        rope_theta=256.0,
        t_scale=1000.0,
        axes_dims=[32, 48, 48],
        axes_lens=[1024, 512, 512],
        version="v1",
    ):
        super().__init__()
        self.all_patch_size = all_patch_size
        self.all_f_patch_size = all_f_patch_size
        self.in_channels = in_channels
        self.out_channels = in_channels
        self.dim = dim
        self.n_layers = n_layers
        self.n_refiner_layers = n_refiner_layers
        self.n_heads = n_heads
        self.n_kv_heads = n_kv_heads
        self.norm_eps = norm_eps
        self.qk_norm = qk_norm
        self.cap_feat_dim = cap_feat_dim
        self.rope_theta = rope_theta
        self.t_scale = t_scale
        self.axes_dims = axes_dims
        self.axes_lens = axes_lens
        self.version = version
        self.gradient_checkpointing = False

        if version == "v1":
            (
                _control_in_dim_default,
                _layers_places_default,
                _refiner_places_default,
                _add_refiner_default,
            ) = (16, [0, 5, 10, 15, 20, 25], [], False)            
        else:
            (
                _control_in_dim_default,
                _layers_places_default,
                _refiner_places_default,
                _add_refiner_default,
            ) = (33, list(range(0, 30, 2)), [0, 1], True)

        self.control_in_dim = (
            control_in_dim if control_in_dim is not None else _control_in_dim_default
        )
        self.control_layers_places = (
            control_layers_places
            if control_layers_places is not None
            else _layers_places_default
        )
        self.control_refiner_layers_places = (
            control_refiner_layers_places
            if control_refiner_layers_places is not None
            else _refiner_places_default
        )
        self.add_control_noise_refiner = (
            add_control_noise_refiner
            if add_control_noise_refiner is not None
            else _add_refiner_default
        )

.....

If the file exists in the folder, whoever uses from_pretrained will pick up the values from config.json. If it doesn't exist, a safety check fills in the values automatically. This is also useful for those who use from_single_file with a GGUF, for example, loading it by specifying the repository in the config parameter.

transformer = ZImageControlTransformer2DModel.from_single_file(
    "https://huggingface.co/alibaba-pai/Z-Image-Turbo-Fun-Controlnet-Union-2.0/blob/main/Z-Image-Turbo-Fun-Controlnet-Union-2.0.gguf",
    torch_dtype=torch.bfloat16,
    config="https://huggingface.co/alibaba-pai/Z-Image-Turbo-Fun-Controlnet-Union-2.0",
)
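As a minimal, self-contained sketch of the fallback idea described above (all names here are illustrative, not the repository's actual API; `resolve_control_config` is a hypothetical helper), the version-keyed defaults could look like:

```python
# Sketch of version-keyed defaults: explicit config.json values win;
# otherwise the "version" field selects sensible defaults.
# Hypothetical helper, not part of the actual model code.

def resolve_control_config(version="v1", control_in_dim=None,
                           control_layers_places=None,
                           control_refiner_layers_places=None,
                           add_control_noise_refiner=None):
    if version == "v1":
        defaults = dict(
            control_in_dim=16,
            control_layers_places=[0, 5, 10, 15, 20, 25],
            control_refiner_layers_places=[],
            add_control_noise_refiner=False,
        )
    else:  # v2 and later
        defaults = dict(
            control_in_dim=33,
            control_layers_places=list(range(0, 30, 2)),
            control_refiner_layers_places=[0, 1],
            add_control_noise_refiner=True,
        )
    explicit = dict(
        control_in_dim=control_in_dim,
        control_layers_places=control_layers_places,
        control_refiner_layers_places=control_refiner_layers_places,
        add_control_noise_refiner=add_control_noise_refiner,
    )
    # Any explicitly provided value overrides the version default.
    return {k: (v if v is not None else defaults[k]) for k, v in explicit.items()}
```

With this, `from_single_file` without a config.json could still produce working defaults from the version string alone.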

@elismasilva

The issue is mainly due to an incorrectly written layer, which caused an extra 2.1 that shouldn't exist. This has been quite frustrating.

Well, if the corrected version works as expected, then perhaps there's no point in having a 2.1 version; a corrected 2.0 would just mean updating the repository and notifying people to download again or pull the repository to update the weights, thus avoiding a lot of extra code paths.

@hlky

hlky commented Dec 17, 2025

Hi @bubbliiiing, I'm the author of the Diffusers PR, apologies for any confusion, I don't think it is necessary to change anything here, I've already handled the different versions by uploading HF Hub repos for the configs (and from_pretrained support - but from_single_file works too for the original weights).

@bubbliiiing bubbliiiing merged commit ca4cc15 into main Dec 18, 2025
@bubbliiiing
Collaborator Author

Hi @bubbliiiing, I'm the author of the Diffusers PR, apologies for any confusion, I don't think it is necessary to change anything here, I've already handled the different versions by uploading HF Hub repos for the configs (and from_pretrained support - but from_single_file works too for the original weights).

I'll go ahead and merge this PR. Feel free to share any feedback or suggestions for changes. Thx.

@bubbliiiing
Collaborator Author

The issue is mainly due to an incorrectly written layer, which caused an extra 2.1 that shouldn't exist. This has been quite frustrating.

Well, if the corrected version works as expected, then perhaps there's no point in having a 2.1 version; a corrected 2.0 would just mean updating the repository and notifying people to download again or pull the repository to update the weights, thus avoiding a lot of extra code paths.

Since the weights have already been distributed, and the two sets of weights cannot be distinguished very precisely, I'm concerned this might affect performance in actual use.
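One common way to disambiguate checkpoints that share the same key naming scheme is to inspect the state dict itself, either for version-specific keys or for tensor shapes that differ between versions, before instantiating the model. A hedged sketch (the key names below are hypothetical placeholders, not the actual checkpoint keys):

```python
# Sketch: guess the checkpoint version from its state dict instead of
# relying on user-supplied kwargs. Key names are hypothetical and would
# need to be replaced with the real checkpoint keys.

def guess_version(state_dict):
    # Newer checkpoints add dedicated noise-refiner layers; their mere
    # presence identifies the newer format.
    if any(k.startswith("control_noise_refiner.") for k in state_dict):
        return "v2"
    # Fall back to a shape check: the newer format widens the control
    # input channels (e.g. 16 -> 33).
    w = state_dict.get("control_x_embedder.weight")
    if w is not None and w.shape[-1] >= 33:
        return "v2"
    return "v1"
```

If a reliable discriminating key or shape exists in the two released weight sets, a check like this could remove the ambiguity without requiring users to re-download anything.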

@bubbliiiing bubbliiiing deleted the add_control_noise_refiner_correctly branch December 19, 2025 07:06