The output of
import deepinv
import torch
denoiser = deepinv.models.DiffUNet(large_model=False)
denoiser.eval()
x = torch.randn(1, 3, 128, 128)
denoiser.forward_diffusion(x, torch.tensor([0.1])).shape
is torch.Size([1, 6, 128, 128]), i.e. 6 channels instead of the expected 3.
Any idea where the extra channels come from?
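
In case it helps narrow things down, my (unverified) guess is that the underlying UNet has a learned-variance head, so the 6 channels would be the 3-channel noise prediction stacked with a 3-channel variance output, as in the guided-diffusion UNets. A minimal sketch of what I mean, reusing the snippet above; the eps/var split is only an assumption on my side:

import torch
import deepinv

denoiser = deepinv.models.DiffUNet(large_model=False)
denoiser.eval()

x = torch.randn(1, 3, 128, 128)
with torch.no_grad():
    out = denoiser.forward_diffusion(x, torch.tensor([0.1]))  # shape [1, 6, 128, 128]

# Assumed split (not confirmed in the deepinv source): first half = noise
# prediction, second half = variance-related output, concatenated channel-wise.
eps, var = torch.split(out, 3, dim=1)
print(eps.shape, var.shape)  # both torch.Size([1, 3, 128, 128])
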
cc @annegnx