CUDA error (again ! :-)) #554

@olivbrau

Hi,
I've tried SD 1.4 with the CUDA backend on two configurations:

  1. On my personal computer with an RTX 4070, everything works well, thanks to ag2s20150909 and the build https://github.com/ag2s20150909/stable-diffusion.cpp/releases/tag/master-74a21a7

  2. On my work computer, a laptop with an RTX A1000, I still get errors that I don't understand:

ggml_cuda_compute_forward: GET_ROWS failed
CUDA error: no kernel image is available for execution on the device
current device: 0, in function ggml_cuda_compute_forward at D:\a\stable-diffusion.cpp\stable-diffusion.cpp\ggml\src\ggml-cuda\ggml-cuda.cu:2174

Here is the full log:

D:\Users\braultoli\Desktop\sd-master-9578fdc-bin-win-avx2-x64\inference_tool_CUDA_2025_01_01>"sd.exe" -m "..\StableDiffusion 1.4 F32\sd-v1-4.ckpt" -p "a cute cat" --sampling-method euler --steps 10 -W 512 -H 512 -s 42 -t 20
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
Device 0: NVIDIA RTX A1000 Laptop GPU, compute capability 8.6, VMM: yes
[INFO ] stable-diffusion.cpp:195 - loading model from '..\StableDiffusion 1.4 F32\sd-v1-4.ckpt'
[INFO ] model.cpp:891 - load ..\StableDiffusion 1.4 F32\sd-v1-4.ckpt using checkpoint format
ZIP 0, name = archive/data.pkl, dir = archive/
[INFO ] stable-diffusion.cpp:242 - Version: SD 1.x
[INFO ] stable-diffusion.cpp:275 - Weight type: f32
[INFO ] stable-diffusion.cpp:276 - Conditioner weight type: f32
[INFO ] stable-diffusion.cpp:277 - Diffusion model weight type: f32
[INFO ] stable-diffusion.cpp:278 - VAE weight type: f32
|==================================================| 1131/1131 - 0.00it/s
[INFO ] stable-diffusion.cpp:516 - total params memory size = 2719.24MB (VRAM 2719.24MB, RAM 0.00MB): clip 469.44MB(VRAM), unet 2155.33MB(VRAM), vae 94.47MB(VRAM), controlnet 0.00MB(VRAM), pmid 0.00MB(VRAM)
[INFO ] stable-diffusion.cpp:520 - loading model from '..\StableDiffusion 1.4 F32\sd-v1-4.ckpt' completed, taking 9.06s
[INFO ] stable-diffusion.cpp:550 - running in eps-prediction mode
[INFO ] stable-diffusion.cpp:682 - Attempting to apply 0 LoRAs
[INFO ] stable-diffusion.cpp:1235 - apply_loras completed, taking 0.00s
ggml_cuda_compute_forward: GET_ROWS failed
CUDA error: no kernel image is available for execution on the device
current device: 0, in function ggml_cuda_compute_forward at D:\a\stable-diffusion.cpp\stable-diffusion.cpp\ggml\src\ggml-cuda\ggml-cuda.cu:2174
err
D:\a\stable-diffusion.cpp\stable-diffusion.cpp\ggml\src\ggml-cuda\ggml-cuda.cu:70: CUDA error
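
For context on the error above: CUDA reports "no kernel image is available for execution on the device" when the binary contains no GPU code that the device can run, which usually means it was not compiled for that device's compute capability. The Python sketch below only illustrates that compatibility rule; it is not something sd.exe does. It reads the compute capability from the `ggml_cuda_init` log line and checks it against a list of architectures. `HYPOTHETICAL_BUILD_ARCHS` and both function names are assumptions made for the example; the architectures actually compiled into this build are not stated in the log.

```python
import re

# Hypothetical example: architectures a build might contain native code for.
# The real list for this sd.exe build is NOT known from the log.
HYPOTHETICAL_BUILD_ARCHS = [(8, 9)]


def parse_compute_capability(log_text):
    """Return (major, minor) from a 'compute capability X.Y' log line, or None."""
    match = re.search(r"compute capability (\d+)\.(\d+)", log_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def has_kernel_image(device_cc, sass_archs, ptx_archs=()):
    """Check whether a device could run code built for the given architectures.

    Native code (SASS) built for X.y runs on devices X.z with z >= y.
    Embedded PTX built for some capability can be JIT-compiled on any
    device with an equal or higher compute capability.
    """
    major, minor = device_cc
    for arch_major, arch_minor in sass_archs:
        if arch_major == major and arch_minor <= minor:
            return True
    for arch in ptx_archs:
        if tuple(arch) <= (major, minor):
            return True
    return False


if __name__ == "__main__":
    line = "Device 0: NVIDIA RTX A1000 Laptop GPU, compute capability 8.6, VMM: yes"
    cc = parse_compute_capability(line)
    print("device compute capability:", cc)
    print("runnable with hypothetical build:",
          has_kernel_image(cc, HYPOTHETICAL_BUILD_ARCHS))
```

If the real architecture list of the build turns out not to cover 8.6, a build that includes that architecture (or embeds PTX for an earlier one) would be the thing to try.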

Does anybody have an idea?

Thanks a lot in advance

Olivier
