Optimize pipeline creation in LCM-LoRA mode #328
This commit attempts to optimize pipeline creation by reusing the base txt2img pipeline to keep four different pipelines ready to use at all times, reducing the places in the code that require rebuilding the generation pipeline. The available generation pipelines are:
- StableDiffusionPipeline
- StableDiffusionImg2ImgPipeline
- StableDiffusionControlNetPipeline
- StableDiffusionControlNetImg2ImgPipeline

For SDXL, the corresponding pipelines are created in a similar fashion. This commit also tries to ensure that the garbage collector runs when rebuilding pipelines by removing all pipeline references first, including a pipeline reference in the LoRA code that was preventing the garbage collector from releasing the pipeline memory.
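The sharing described above can be sketched in plain Python (the class and component names below are illustrative stand-ins, not FastSD's actual code). In diffusers terms it corresponds to building a variant pipeline from an existing pipeline's components, e.g. `StableDiffusionImg2ImgPipeline(**base.components)`, so the heavy models are loaded from disk only once:

```python
class Component:
    """Stand-in for a heavy model component (UNet, VAE, text encoder, ...)."""
    def __init__(self, name):
        self.name = name

class Txt2ImgPipeline:
    """Stand-in for the base txt2img pipeline; loading this is the expensive step."""
    def __init__(self):
        self.components = {
            "unet": Component("unet"),
            "vae": Component("vae"),
            "text_encoder": Component("text_encoder"),
        }

class DerivedPipeline:
    """Img2img / ControlNet variants reuse the same component objects."""
    def __init__(self, components):
        self.components = components  # shared with the base pipeline, not copied

base = Txt2ImgPipeline()
img2img = DerivedPipeline(base.components)
controlnet = DerivedPipeline(base.components)

# All pipelines point at the *same* UNet object: no extra RAM, no reload.
assert img2img.components["unet"] is base.components["unet"]
assert controlnet.components["unet"] is base.components["unet"]
```

Because the components are shared, keeping all four pipelines alive costs roughly the same RAM as keeping one.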
This commit adapts the Qt GUI and the WebUI to the recent pipeline changes; in particular, ControlNet pipelines are always built from the base txt2img pipeline, and LoRA models are always loaded onto the base txt2img pipeline.
This commit improves the code that switches between the multiple pipelines according to the user-selected generation options, so that the garbage collector correctly releases pipeline memory when switching models. It also reverts the previous commit's change that used the default pipeline variable instead of the txt2img_pipeline variable for pipeline manipulation (loading LoRAs, for example); this ensures the code keeps working in generation modes other than LCM-LoRA.
This commit explicitly calls del on all pipelines to help ensure that the garbage collector releases all memory when rebuilding pipelines.
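A minimal sketch of the reference-dropping pattern (using a trivial stand-in class, since the point is Python's reference semantics, not diffusers itself):

```python
import gc
import weakref

class Pipeline:
    """Hypothetical stand-in for a loaded generation pipeline."""
    pass

txt2img = Pipeline()
img2img = Pipeline()
probe = weakref.ref(txt2img)  # observe the object without keeping it alive

# Rebuild: drop *every* reference before creating new pipelines,
# otherwise the old models stay resident while the new ones load.
del txt2img
del img2img
gc.collect()  # refcounting frees acyclic objects immediately; this also sweeps cycles

assert probe() is None  # the old pipeline's memory is reclaimable
```

If any other variable or attribute still points at the old pipeline, `probe()` would return the live object instead of `None`, which is exactly the failure mode this commit guards against.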
This commit explicitly calls unload_lora_weights() when rebuilding pipelines in LCM-LoRA mode.
This commit fixes pipeline creation in LCM mode.
Merge branch 'cli' from the github repo into cli
This commit makes some minor changes to apply the multiple pipelines method to LCM mode in addition to the LCM-LoRA mode.
Minor changes
OK, I removed the call to unload_lora_weights(); it's obvious that it doesn't solve the RAM issue, and whatever is causing it is not within FastSD's control.
@monstruosoft Do SDXL ControlNets work now? Which ControlNets did you test? Please let me know.
Yes, ControlNet has worked since previous PRs. I've tried these 1, 2 ControlNets; use the 640 MB versions (the fp16 versions should also work, but for some reason those end up using a lot more RAM). I also tried this one, but RAM usage was so high that I literally only used it once, though I can say it worked that one time.
@monstruosoft Yes, ControlNet works on the latest master branch, but it is not respecting changes to the ControlNet conditioning scale value from the WebUI. Is that handled in this PR?
SDXL will take around 14 GB of system RAM to generate a 768x768 image.
Gonna have to try it with the WebUI. I usually run FastSD either in the CLI or QtGUI versions. Will take a look at the WebUI as soon as I can.
Yes, that's the typical amount of RAM used by SDXL. Weights have to be loaded in fp32 mode for CPU inference, which is a shame; even when using quantized models, the weights have to be loaded into RAM in fp32 mode.
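The ~14 GB figure is consistent with a back-of-the-envelope estimate, assuming SDXL's components total roughly 3.5 billion parameters (an approximation, not an exact count) stored as 4-byte fp32 values:

```python
# Rough memory estimate for fp32 CPU inference (parameter count is approximate).
params = 3.5e9            # SDXL UNet + text encoders + VAE, roughly
bytes_per_param = 4       # fp32 weights
weights_gb = params * bytes_per_param / 1024**3

print(f"{weights_gb:.1f} GB")  # ~13 GB for the weights alone, before activations
```

Activations, the Python runtime, and any duplicated buffers during loading push the real peak above the weights-only figure, which matches the observed ~14 GB.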
I just tried the code from this PR in the WebUI and ControlNet is working correctly. Can you please give more details on the issues you're having?
@monstruosoft I've added comments; could you please resolve them?
rupeshs left a comment:
Please resolve.
Oh, right. Will do ASAP.
Minor changes
OK, it's done. I've also fixed a minor issue where the WebUI used int values for the ControlNet conditioning scale when the value was set to 0 or 1; it should be fixed now.
@monstruosoft Thank you for the contribution
Feedback needed!
In previous versions of FastSD, changing ControlNet settings would trigger a pipeline rebuild. With the recent addition of SDXL safetensors model support, and given that loading an SDXL model takes over a minute on my machine, it became obvious that rebuilding the pipeline on every settings change was not the best idea. This PR therefore attempts to optimize pipeline creation by keeping up to four pipelines ready for generation at all times and selecting the appropriate one based on the user's generation settings, effectively limiting full pipeline rebuilds to the case where the generation models themselves change.
The code for this PR focuses on LCM-LoRA mode and, hopefully, shouldn't break anything in other generation modes; the new code is enclosed in a conditional statement to avoid affecting other modes.
This PR also fixes an issue in the LoRA code that stored a reference to the current pipeline, preventing the garbage collector from releasing the memory used by the pipeline.
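The bug pattern can be demonstrated with `weakref` (the `LoraLoader` class below is a hypothetical stand-in for the LoRA code, not FastSD's actual implementation):

```python
import gc
import weakref

class Pipeline:
    """Hypothetical stand-in for a loaded generation pipeline."""
    pass

class LoraLoader:
    """Sketch of the bug: caching the pipeline keeps it alive indefinitely."""
    def __init__(self):
        self.pipeline = None
    def load(self, pipe):
        self.pipeline = pipe  # hidden strong reference

loader = LoraLoader()
pipe = Pipeline()
loader.load(pipe)
probe = weakref.ref(pipe)

del pipe
gc.collect()
assert probe() is not None  # still alive: the loader's reference pins it

loader.pipeline = None      # the fix: drop the cached reference
gc.collect()
assert probe() is None      # now the memory can be released
```

This is why deleting the pipeline variables alone was not enough: any long-lived object holding a stray reference silently defeats the cleanup.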
However, some issues are still present. I have 16 GB of RAM, barely enough to run SDXL, but sometimes, when switching models, RAM usage goes up to 30 GB, effectively killing performance on my machine. I've tried to pinpoint the cause without success; it seems to happen randomly. Sometimes I can switch models a dozen times with RAM usage remaining at a reasonable level, and other times switching models just a couple of times eats all available RAM and starts using swap space. I've tried a lot of things, like explicitly calling del on all pipelines, but every time I think I've found a solution, the issue randomly pops up again. The last thing I tried was calling unload_lora_weights() (in line 310 of lcm_text_to_image.py) before deleting the pipeline; that seemed to help somewhat but doesn't solve the issue. At this point, I suspect the OS is caching the safetensors files in RAM, forcing the pipeline to be created in swap space. So, I'd like some feedback on this subject.
Note that this issue happens when switching models within FastSD in LCM-LoRA mode. If, instead of switching models, you close FastSD so the OS can reclaim the RAM, then launch FastSD again and select the new model, the issue shouldn't appear. This is not an ideal solution, but for the time being it's the best I can do.