fix: make ilp solver enumeration lazy by coroa · Pull Request #3900 · snakemake/snakemake

coroa · 2026-01-02T11:58:17Z

Hi everyone,

The enumeration of available solvers with pulp.listSolvers(onlyAvailable=True) typically acquires license slots of all installed commercial solvers, which becomes problematic, if only a limited number of shared slots are available. pulp correctly releases the licenses immediately again, but it still repeatedly leads to friction.

This PR avoids the pulp solver enumeration by using pulp.getSolver(...).available() to check for whether the chosen solver is available and only on failure queries the full list.

The default solver choice from #3736 was updated to rely on pulp.apis.LpSolverDefault, which is compatible with "PULP_CBC_CMD" (if it is available).

I also moved the pulp specific checking out of api.py entirely, so that all the solver choice is now handled by the new structure.

With these changes, even if pulp is not installed, snakemake gracefully falls back to the greedy scheduler.

QC

The PR contains a test case for the changes or the changes are already covered by an existing test case.
The documentation (docs/) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).

Summary by CodeRabbit

Improvements
- Lazy, more robust solver discovery with improved default selection for ILP scheduling, reducing startup overhead and surprises.
- Scheduler now detects when the configured solver is unavailable and automatically falls back to the greedy scheduler.
New Features
- Exposes a solver-availability indicator for easier diagnostics and decisions.
Bug Fixes
- Clearer warning when falling back, indicating missing solver support and required external solver dependencies.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

only calls pulp.listSolvers if a non-default solver is selected to avoid acquiring a license.

coderabbitai · 2026-01-02T11:58:29Z

📝 Walkthrough

Walkthrough

Add a lazy LpSolverCollection to discover available pulp solvers, replace the global solver list with it, update SchedulerSettings to use its .default and expose .lp_solver_available, and make the API fallback to the greedy scheduler when no ILP solver is available.

Changes

Cohort / File(s)	Summary
MILP solver collection & scheduling `src/snakemake/scheduling/milp.py`	Add `LpSolverCollection(Collection[str])` (lazy discovery, `.default`, iteration/containment); replace previous `lp_solvers` list with `LpSolverCollection()`; simplify `_solve_ilp` to use `pulp.getSolver(self.settings.solver)`; update imports and typing.
SchedulerSettings adjustments `src/snakemake/scheduling/milp.py`	Change `SchedulerSettings.solver` default to `lp_solvers.default`; add `lp_solver_available` property indicating whether configured solver is present.
API ILP fallback logic `src/snakemake/api.py`	Replace direct pulp checks with `scheduler_settings.lp_solver_available`; if unavailable, log a warning and switch scheduler and settings to greedy; update warning text to reference pulp and coin-or/glpk requirement.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant API as API (scheduler request)
  participant Settings as SchedulerSettings
  participant Collection as LpSolverCollection
  participant Pulp as pulp (external)
  participant Scheduler as Scheduler (ILP / Greedy)

  API->>Settings: use configured solver
  Settings->>Collection: check availability (lp_solver_available)
  Collection->>Pulp: query available solvers / LpSolverDefault.name
  Pulp-->>Collection: solver list / default
  alt solver available
    Collection-->>Settings: available=True
    Settings-->>Scheduler: select ILP solver (pulp.getSolver(...))
  else solver unavailable
    Collection-->>Settings: available=False
    Settings-->>API: indicate fallback
    API-->>Scheduler: switch to Greedy scheduler
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

fix: properly fall back to greedy scheduler in case of pulp/cbc errors #3733: Modifies ILP solver handling and fallback behavior in src/snakemake/scheduling/milp.py; closely related to solver-availability and fallback logic.
fix: use proper default and fallback LP solver in scheduler settings (thanks @raffienficiaud) #3736: Adjusts LP solver discovery and SchedulerSettings.solver default/fallback in src/snakemake/scheduling/milp.py; overlaps with the collection/default selection changes.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 12.50% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'fix: make ilp solver enumeration lazy' directly and clearly describes the main technical change: converting eager solver enumeration to lazy evaluation to avoid acquiring license slots.
Description check	✅ Passed	The PR description includes a clear explanation of the problem, the solution approach, implementation details, and both QC checkboxes are marked as completed.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

📜 Recent review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 656c377 and 65a823e.

📒 Files selected for processing (1)

src/snakemake/scheduling/milp.py

🧰 Additional context used

📓 Path-based instructions (1)

**/*.py

⚙️ CodeRabbit configuration file

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Files:

src/snakemake/scheduling/milp.py

🧠 Learnings (6)

📓 Common learnings

Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3676
File: src/snakemake/cli.py:0-0
Timestamp: 2025-07-29T14:53:04.598Z
Learning: In the Snakemake repository, deprecated scheduler interfaces (like scheduler_ilp_solver, --scheduler-solver-path, --scheduler-greediness) should be kept for backward compatibility during the migration to the plugin-based scheduler system, rather than being removed immediately.

📚 Learning: 2025-07-29T14:53:04.598Z

Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3676
File: src/snakemake/cli.py:0-0
Timestamp: 2025-07-29T14:53:04.598Z
Learning: In the Snakemake repository, deprecated scheduler interfaces (like scheduler_ilp_solver, --scheduler-solver-path, --scheduler-greediness) should be kept for backward compatibility during the migration to the plugin-based scheduler system, rather than being removed immediately.

Applied to files:

src/snakemake/scheduling/milp.py

📚 Learning: 2025-09-05T16:17:21.298Z

Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3726
File: src/snakemake/common/__init__.py:413-414
Timestamp: 2025-09-05T16:17:21.298Z
Learning: Snakemake requires Python 3.11 at runtime nowadays. Only imports need to work with Python 3.7 for compatibility reasons.

Applied to files:

src/snakemake/scheduling/milp.py

📚 Learning: 2024-11-12T12:08:20.342Z

Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3148
File: snakemake/dag.py:1332-1336
Timestamp: 2024-11-12T12:08:20.342Z
Learning: In `snakemake/dag.py`, when code is outdated and will disappear upon resolving merge conflicts, avoid making code review suggestions on that code.

Applied to files:

src/snakemake/scheduling/milp.py

📚 Learning: 2024-08-13T09:25:24.046Z

Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 2985
File: tests/tests.py:2051-2051
Timestamp: 2024-08-13T09:25:24.046Z
Learning: In the Snakemake repository, avoid suggesting return type annotations for test functions.

Applied to files:

src/snakemake/scheduling/milp.py

📚 Learning: 2024-08-13T16:22:09.641Z

Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3014
File: snakemake/workflow.py:1147-1147
Timestamp: 2024-08-13T16:22:09.641Z
Learning: Avoid suggesting type annotations for functions that are inside methods in the Snakemake codebase.

Applied to files:

src/snakemake/scheduling/milp.py

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (40)

GitHub Check: tests (10, ubuntu-latest, py311)
GitHub Check: tests (10, ubuntu-latest, py312)
GitHub Check: tests (7, ubuntu-latest, py311)
GitHub Check: tests (8, ubuntu-latest, py311)
GitHub Check: tests (10, ubuntu-latest, py313)
GitHub Check: tests (9, ubuntu-latest, py311)
GitHub Check: tests (6, ubuntu-latest, py312)
GitHub Check: tests (9, ubuntu-latest, py312)
GitHub Check: tests (9, windows-2022, py313)
GitHub Check: tests (9, ubuntu-latest, py313)
GitHub Check: tests (8, ubuntu-latest, py313)
GitHub Check: tests (8, ubuntu-latest, py312)
GitHub Check: tests (10, windows-2022, py313)
GitHub Check: tests (2, ubuntu-latest, py313)
GitHub Check: tests (7, ubuntu-latest, py313)
GitHub Check: tests (6, ubuntu-latest, py313)
GitHub Check: tests (4, windows-2022, py313)
GitHub Check: tests (8, windows-2022, py313)
GitHub Check: tests (3, ubuntu-latest, py311)
GitHub Check: tests (3, ubuntu-latest, py312)
GitHub Check: tests (5, ubuntu-latest, py312)
GitHub Check: tests (5, windows-2022, py313)
GitHub Check: tests (2, ubuntu-latest, py311)
GitHub Check: tests (6, windows-2022, py313)
GitHub Check: tests (4, ubuntu-latest, py311)
GitHub Check: tests (4, ubuntu-latest, py312)
GitHub Check: tests (3, windows-2022, py313)
GitHub Check: tests (5, ubuntu-latest, py311)
GitHub Check: tests (5, ubuntu-latest, py313)
GitHub Check: tests (6, ubuntu-latest, py311)
GitHub Check: tests (4, ubuntu-latest, py313)
GitHub Check: tests (3, ubuntu-latest, py313)
GitHub Check: tests (1, ubuntu-latest, py311)
GitHub Check: tests (1, ubuntu-latest, py313)
GitHub Check: tests (1, windows-2022, py313)
GitHub Check: tests (2, windows-2022, py313)
GitHub Check: tests (2, ubuntu-latest, py312)
GitHub Check: tests (1, ubuntu-latest, py312)
GitHub Check: formatting
GitHub Check: apidocs

🔇 Additional comments (3)

src/snakemake/scheduling/milp.py (3)

13-59: Excellent lazy evaluation implementation!

The LpSolverCollection successfully addresses the PR objective by avoiding eager pulp.listSolvers(onlyAvailable=True) calls that can acquire license slots. Key strengths:

Default solver retrieved from pulp.apis.LpSolverDefault without enumerating all solvers

nondefault_solvers computed lazily via cached_property only when needed

__contains__ checks availability directly via pulp.getSolver(x).available() without materializing the full list

Proper Collection[str] interface implementation with __iter__, __contains__, and __len__

Python 3.7+ compatible type annotations (List[str])

The past review concerns about type annotations and iterator consistency have been properly addressed.

66-82: Clean integration of lazy solver checking.

The changes to SchedulerSettings properly support lazy evaluation:

Switching the default from lp_solvers[0] to lp_solvers.default (line 67) avoids triggering the cached nondefault_solvers property unnecessarily

The lp_solver_available property provides a clear API for the fallback logic in api.py mentioned in the PR objectives

Correctly delegates to lp_solvers.__contains__ which checks availability via pulp.getSolver(solver).available()

255-255: Appropriate simplification of solver selection.

Removing conditional logic in favor of direct pulp.getSolver(self.settings.solver) is clean and correct. The safety is provided by:

SchedulerSettings.solver defaulting to lp_solvers.default (which is None only if no solver is available)

The api.py logic checking lp_solver_available before attempting ILP scheduling (per PR objectives)

The PulpSolverError catch on line 208 providing fallback if solver initialization fails

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coroa · 2026-01-02T12:12:13Z

@johanneskoester Happy to add docstrings to the methods, if you think that improves legibility. I am doubtful.

coderabbitai

Actionable comments posted: 2

Fix all issues with AI Agents 🤖

In @src/snakemake/scheduling/milp.py:
- Around line 44-60: The iterator and length logic currently skip
nondefault_solvers when self.default is None; update __iter__ to always iterate
over self.nondefault_solvers (filtering them for availability using
pulp.getSolver(name).available() like __contains__ does) and update __len__ to
count available nondefault_solvers when default is None (or in all cases return
(1 if self.default is not None else 0) plus the number of available
nondefault_solvers). Ensure you reference the same availability check used in
__contains__ to keep behavior consistent.
- Around line 31-42: The return type annotation on the cached_property
nondefault_solvers uses Python 3.9+ syntax (list[str]) which breaks
compatibility with Snakemake 9 (Python 3.7); change the annotation to use
typing.List[str] and add List to the imports (e.g., import List from typing at
the top), or alternatively enable postponed evaluation by adding from __future__
import annotations at the top of the file; update the signature of
nondefault_solvers accordingly (referencing nondefault_solvers) so the file
stays compatible with Python 3.7.

🧹 Nitpick comments (1)

src/snakemake/scheduling/milp.py (1)
49-55: Catch PulpSolverError to handle invalid solver names more robustly.

pulp.getSolver() raises PulpSolverError (not KeyError) when given an unknown solver name. The current code only catches KeyError, leaving PulpSolverError unhandled. While the code works in normal cases since self.solver is typically validated through SchedulerSettings choices, a more defensive approach would improve robustness.

To properly handle this, use nested try-except blocks to avoid referencing pulp.PulpSolverError when the import itself fails:
Suggested approach
    def __contains__(self, x: object) -> bool:
        try:
            import pulp
+           from pulp import PulpSolverError
        except ImportError:
            return False
        
        try:
            return pulp.getSolver(x).available()
-       except (ImportError, KeyError):
+       except (KeyError, PulpSolverError):
            return False

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b11aba6 and 19d3e95.

📒 Files selected for processing (2)

src/snakemake/api.py
src/snakemake/scheduling/milp.py

🧰 Additional context used

📓 Path-based instructions (1)

**/*.py

⚙️ CodeRabbit configuration file

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Files:

src/snakemake/scheduling/milp.py
src/snakemake/api.py

🧠 Learnings (2)

📓 Common learnings

Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3676
File: src/snakemake/cli.py:0-0
Timestamp: 2025-07-29T14:53:04.598Z
Learning: In the Snakemake repository, deprecated scheduler interfaces (like scheduler_ilp_solver, --scheduler-solver-path, --scheduler-greediness) should be kept for backward compatibility during the migration to the plugin-based scheduler system, rather than being removed immediately.

📚 Learning: 2025-07-29T14:53:04.598Z

Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3676
File: src/snakemake/cli.py:0-0
Timestamp: 2025-07-29T14:53:04.598Z
Learning: In the Snakemake repository, deprecated scheduler interfaces (like scheduler_ilp_solver, --scheduler-solver-path, --scheduler-greediness) should be kept for backward compatibility during the migration to the plugin-based scheduler system, rather than being removed immediately.

Applied to files:

src/snakemake/scheduling/milp.py
src/snakemake/api.py

🧬 Code graph analysis (1)

src/snakemake/api.py (1)

src/snakemake/scheduling/milp.py (1)

lp_solver_available (81-82)

🔇 Additional comments (3)

src/snakemake/api.py (1)

609-616: LGTM! Clean integration with the new lazy solver collection.

The fallback logic correctly uses scheduler_settings.lp_solver_available which internally checks solver availability via pulp.getSolver().available() rather than enumerating all solvers. This avoids the license acquisition issue mentioned in the PR objectives.

src/snakemake/scheduling/milp.py (2)

80-82: LGTM! Correctly implements lazy solver availability check.

The property uses the collection's __contains__ method which calls pulp.getSolver(x).available() directly, avoiding the listSolvers enumeration that triggers license acquisition. This is the core fix for the PR objective.

68-73: Good: Using lp_solvers.default avoids triggering solver enumeration.

The previous lp_solvers[0] approach would have required implementing __getitem__ or would have failed. Using the explicit .default attribute aligns with the lazy initialization pattern and keeps the default solver lookup O(1) without enumerating all solvers.

coderabbitai

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/snakemake/scheduling/milp.py (1)
254-258: Add defensive check for None solver to improve error clarity.

If both self.settings.solver and pulp.apis.LpSolverDefault are None, the code would assign solver = None, causing an AttributeError at line 261 when accessing solver.optionsDict. While the API code (per the PR summary) checks lp_solver_available before invoking ILP, a defensive check here would improve robustness against edge cases (solver becoming unavailable between check and use, or direct instantiation bypassing API checks).
🔎 Proposed defensive check
 solver = (
     pulp.getSolver(self.settings.solver)
     if self.settings.solver
     else pulp.apis.LpSolverDefault
 )
+if solver is None:
+    raise pulp.PulpSolverError(
+        "No ILP solver available. Please install a supported solver "
+        "(e.g., CBC, GLPK) or specify a solver with --solver."
+    )
This raises PulpSolverError so it's caught by the existing except block at line 207, which logs a clear warning and falls back to the greedy scheduler.

🧹 Nitpick comments (1)

src/snakemake/scheduling/milp.py (1)

13-59: Approve the lazy solver enumeration approach.

The LpSolverCollection class successfully implements lazy solver discovery, avoiding pulp.listSolvers calls until explicitly needed. The __iter__ and __len__ methods correctly include nondefault_solvers regardless of whether default is None, ensuring the collection properly reports all available solvers.

Consider the consistency difference between listSolvers and getSolver(...).available().

The nondefault_solvers property caches results from pulp.listSolvers(onlyAvailable=True), while __contains__ checks availability via pulp.getSolver(x).available(). If PuLP's internal logic differs between these two methods, or if solver availability changes between the cache time and check time, the collection could report a solver in __iter__ but return False for __contains__. Depending on how critical this edge case is for your use case, you might consider normalizing the availability check logic.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 03c627d and f75ef82.

📒 Files selected for processing (1)

src/snakemake/scheduling/milp.py

🧰 Additional context used

📓 Path-based instructions (1)

**/*.py

⚙️ CodeRabbit configuration file

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Files:

src/snakemake/scheduling/milp.py

🧠 Learnings (6)

📓 Common learnings

Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3676
File: src/snakemake/cli.py:0-0
Timestamp: 2025-07-29T14:53:04.598Z
Learning: In the Snakemake repository, deprecated scheduler interfaces (like scheduler_ilp_solver, --scheduler-solver-path, --scheduler-greediness) should be kept for backward compatibility during the migration to the plugin-based scheduler system, rather than being removed immediately.

📚 Learning: 2025-07-29T14:53:04.598Z

Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3676
File: src/snakemake/cli.py:0-0
Timestamp: 2025-07-29T14:53:04.598Z
Learning: In the Snakemake repository, deprecated scheduler interfaces (like scheduler_ilp_solver, --scheduler-solver-path, --scheduler-greediness) should be kept for backward compatibility during the migration to the plugin-based scheduler system, rather than being removed immediately.

Applied to files:

src/snakemake/scheduling/milp.py

📚 Learning: 2025-09-05T16:17:21.298Z

Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3726
File: src/snakemake/common/__init__.py:413-414
Timestamp: 2025-09-05T16:17:21.298Z
Learning: Snakemake requires Python 3.11 at runtime nowadays. Only imports need to work with Python 3.7 for compatibility reasons.

Applied to files:

src/snakemake/scheduling/milp.py

📚 Learning: 2024-11-12T12:08:20.342Z

Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3148
File: snakemake/dag.py:1332-1336
Timestamp: 2024-11-12T12:08:20.342Z
Learning: In `snakemake/dag.py`, when code is outdated and will disappear upon resolving merge conflicts, avoid making code review suggestions on that code.

Applied to files:

src/snakemake/scheduling/milp.py

📚 Learning: 2024-10-08T17:41:54.542Z

Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 2985
File: tests/tests.py:2051-2051
Timestamp: 2024-10-08T17:41:54.542Z
Learning: In the Snakemake repository, avoid suggesting return type annotations for test functions.

Applied to files:

src/snakemake/scheduling/milp.py

📚 Learning: 2024-10-08T17:41:54.542Z

Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3014
File: snakemake/workflow.py:1147-1147
Timestamp: 2024-10-08T17:41:54.542Z
Learning: Avoid suggesting type annotations for functions that are inside methods in the Snakemake codebase.

Applied to files:

src/snakemake/scheduling/milp.py

🔇 Additional comments (1)

src/snakemake/scheduling/milp.py (1)

64-81: LGTM: SchedulerSettings correctly integrates with lazy solver collection.

The changes properly use lp_solvers.default instead of indexing, and the new lp_solver_available property provides a clean interface for checking solver availability at use time via __contains__.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

src/snakemake/scheduling/milp.py (1)
78-80: Consider adding a docstring to clarify the property's purpose.

The lp_solver_available property is concise and correct, but a brief docstring would help document that it checks whether the configured solver is available without triggering full solver enumeration.
🔎 Suggested docstring
 @property
 def lp_solver_available(self):
+    """Check if the configured solver is available via pulp.getSolver().available()."""
     return self.solver in lp_solvers

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f75ef82 and 40f4401.

📒 Files selected for processing (1)

src/snakemake/scheduling/milp.py

🧰 Additional context used

📓 Path-based instructions (1)

**/*.py

⚙️ CodeRabbit configuration file

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Files:

src/snakemake/scheduling/milp.py

🧠 Learnings (6)

📓 Common learnings

Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3676
File: src/snakemake/cli.py:0-0
Timestamp: 2025-07-29T14:53:04.598Z
Learning: In the Snakemake repository, deprecated scheduler interfaces (like scheduler_ilp_solver, --scheduler-solver-path, --scheduler-greediness) should be kept for backward compatibility during the migration to the plugin-based scheduler system, rather than being removed immediately.

📚 Learning: 2025-07-29T14:53:04.598Z

Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3676
File: src/snakemake/cli.py:0-0
Timestamp: 2025-07-29T14:53:04.598Z
Learning: In the Snakemake repository, deprecated scheduler interfaces (like scheduler_ilp_solver, --scheduler-solver-path, --scheduler-greediness) should be kept for backward compatibility during the migration to the plugin-based scheduler system, rather than being removed immediately.

Applied to files:

src/snakemake/scheduling/milp.py

📚 Learning: 2025-09-05T16:17:21.298Z

Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3726
File: src/snakemake/common/__init__.py:413-414
Timestamp: 2025-09-05T16:17:21.298Z
Learning: Snakemake requires Python 3.11 at runtime nowadays. Only imports need to work with Python 3.7 for compatibility reasons.

Applied to files:

src/snakemake/scheduling/milp.py

📚 Learning: 2024-11-12T12:08:20.342Z

Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3148
File: snakemake/dag.py:1332-1336
Timestamp: 2024-11-12T12:08:20.342Z
Learning: In `snakemake/dag.py`, when code is outdated and will disappear upon resolving merge conflicts, avoid making code review suggestions on that code.

Applied to files:

src/snakemake/scheduling/milp.py

📚 Learning: 2024-10-08T17:41:54.542Z

Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 2985
File: tests/tests.py:2051-2051
Timestamp: 2024-10-08T17:41:54.542Z
Learning: In the Snakemake repository, avoid suggesting return type annotations for test functions.

Applied to files:

src/snakemake/scheduling/milp.py

📚 Learning: 2024-10-08T17:41:54.542Z

Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3014
File: snakemake/workflow.py:1147-1147
Timestamp: 2024-10-08T17:41:54.542Z
Learning: Avoid suggesting type annotations for functions that are inside methods in the Snakemake codebase.

Applied to files:

src/snakemake/scheduling/milp.py

🔇 Additional comments (3)

src/snakemake/scheduling/milp.py (3)

13-59: Excellent lazy loading design that achieves the PR objective.

The LpSolverCollection successfully defers the expensive listSolvers(onlyAvailable=True) call until needed (e.g., CLI help display) while enabling efficient individual solver checks via getSolver().available(). The use of cached_property on nondefault_solvers ensures the list is enumerated at most once.

1-6: LGTM! Clean imports and integration.

The addition of cached_property and the global lp_solvers instance integrate cleanly with the existing code structure.

Also applies to: 61-61

254-254: The upstream guard is sufficient—no action needed.

The api.py check at line 609 confirms that scheduler_settings.lp_solver_available is verified before ILP scheduling is invoked. The lp_solver_available property (milp.py:79-80) validates that self.solver is in the available solvers list, ensuring the solver is never None or invalid when _solve_ilp is called. The try-except wrapping pulp.getSolver provides additional defensive fallback for solver initialization failures.

coroa · 2026-01-05T13:55:20Z

@coderabbitai Add docstrings, please

coroa · 2026-01-05T13:56:06Z

@coderabbitai generate docstrings

coderabbitai · 2026-01-05T13:56:12Z

✅ Actions performed

Initiated docstring generation; will generate only if new commits exist.

coderabbitai · 2026-01-05T13:56:15Z

Note

Docstrings generation - SUCCESS
Generated docstrings for this pull request at #3904

coderabbitai · 2026-01-05T13:56:28Z

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Read the scheduling/milp.py file to see the current code structure
cat src/snakemake/scheduling/milp.py

Length of output: 8739

@coroa: I'll add comprehensive docstrings to the methods and classes in src/snakemake/scheduling/milp.py. Here's the updated code:

from dataclasses import dataclass, field
import math
import os
from functools import cached_property
from pathlib import Path
from typing import Collection, Dict, Iterator, List, Mapping, Optional, Sequence, Union
from snakemake_interface_scheduler_plugins.base import SchedulerBase
from snakemake_interface_scheduler_plugins.settings import SchedulerSettingsBase
from snakemake_interface_scheduler_plugins.interfaces.jobs import JobSchedulerInterface
from snakemake_interface_common.io import AnnotatedStringInterface


class LpSolverCollection(Collection[str]):
    """A lazy collection that avoids calling pulp.listSolvers if the default solver is selected.
    
    This collection provides access to available linear programming solvers from PuLP
    without eagerly enumerating all solvers, which can cause issues with commercial
    solver license acquisition.
    """

    def __init__(self):
        """Initialize the solver collection and determine the default solver.
        
        Attempts to import PuLP and retrieve the default solver name from
        pulp.apis.LpSolverDefault. If PuLP is not installed or no default
        solver is available, sets default to None.
        """
        default = None
        try:
            import pulp

            solver_default = pulp.apis.LpSolverDefault
            if solver_default is not None:
                default = solver_default.name
        except ImportError:
            pass

        self.default = default

    @cached_property
    def nondefault_solvers(self) -> List[str]:
        """Get a sorted list of available non-default solvers.
        
        This property is cached and only evaluated when accessed, deferring the
        call to pulp.listSolvers until necessary.
        
        Returns:
            List[str]: Sorted list of available solver names excluding the default solver.
        """
        try:
            import pulp

            return sorted(
                solver
                for solver in pulp.listSolvers(onlyAvailable=True)
                if solver != self.default
            )
        except ImportError:
            return []

    def __iter__(self) -> Iterator[str]:
        """Iterate over all available solvers, starting with the default.
        
        Yields:
            str: Solver names, with the default solver yielded first if available.
        """
        if self.default is not None:
            yield self.default
        yield from self.nondefault_solvers

    def __contains__(self, x: object) -> bool:
        """Check if a solver is available.
        
        Args:
            x: The solver name to check.
            
        Returns:
            bool: True if the solver is available, False otherwise.
        """
        try:
            import pulp

            return pulp.getSolver(x).available()
        except Exception:  # noqa: BLE001
            return False

    def __len__(self) -> int:
        """Get the total number of available solvers.
        
        Returns:
            int: The count of available solvers including the default.
        """
        return (1 if self.default is not None else 0) + len(self.nondefault_solvers)


lp_solvers = LpSolverCollection()


@dataclass
class SchedulerSettings(SchedulerSettingsBase):
    """Settings for the MILP scheduler.
    
    Attributes:
        solver: Name of the MILP solver to use. Defaults to the system's default solver.
        solver_path: Optional path to search for scheduler solver binaries.
    """
    
    solver: Optional[str] = field(
        default=lp_solvers.default,
        metadata={
            "help": "Set MILP solver to use",
            "choices": lp_solvers,
        },
    )
    solver_path: Optional[Path] = field(
        default=None,
        metadata={"help": "Set the PATH to search for scheduler solver binaries."},
    )

    @property
    def lp_solver_available(self):
        """Check if the configured solver is available.
        
        Returns:
            bool: True if the configured solver exists in the available solvers.
        """
        return self.solver in lp_solvers


class Scheduler(SchedulerBase):
    """MILP-based job scheduler for Snakemake.
    
    This scheduler uses Mixed Integer Linear Programming to optimally select jobs
    based on priorities, resource requirements, and temporary file management.
    Falls back to greedy scheduling if the ILP solver fails or is unavailable.
    """
    
    def __post_init__(self) -> None:
        """Initialize the scheduler state after dataclass initialization.
        
        Sets up the technical failure flag used to track solver failures
        and enable early fallback to greedy scheduling.
        """
        self._technical_failure = False

    def select_jobs(
        self,
        selectable_jobs: Sequence[JobSchedulerInterface],
        remaining_jobs: Sequence[JobSchedulerInterface],
        available_resources: Mapping[str, Union[int, str]],
        input_sizes: Dict[AnnotatedStringInterface, int],
    ) -> Optional[Sequence[JobSchedulerInterface]]:
        """Select jobs to execute using MILP optimization.
        
        Formulates a Mixed Integer Linear Programming problem to select an optimal
        subset of jobs considering job priorities, resource constraints, core load,
        and temporary file management. Falls back to None (greedy scheduler) on failure.
        
        Args:
            selectable_jobs: Jobs that can potentially be scheduled in this round.
            remaining_jobs: All jobs remaining in the workflow.
            available_resources: Currently available resources (cores, memory, etc.).
            input_sizes: Sizes of input files in bytes.
            
        Returns:
            Optional[Sequence[JobSchedulerInterface]]: Selected jobs to execute, or None
                to fall back to greedy scheduling.
        """
        if self._technical_failure:
            # fallback early since we failed before already
            return None
        import pulp
        from pulp import lpSum
        from pulp import PulpSolverError

        scheduled_jobs = {
            job: pulp.LpVariable(
                f"job_{idx}", lowBound=0, upBound=1, cat=pulp.LpInteger
            )
            for idx, job in enumerate(selectable_jobs)
        }

        job_temp_files = {}
        for job in remaining_jobs:
            job_temp_files[job] = {
                infile for infile in job.input if infile.is_flagged("temp")
            }

        temp_files = {
            f for job in selectable_jobs for f in job.input if f.is_flagged("temp")
        }

        temp_sizes_gb = {f: input_sizes[f] / 1e9 for f in temp_files}

        temp_job_improvement = {
            temp_file: pulp.LpVariable(
                f"temp_file_{idx}", lowBound=0, upBound=1, cat="Continuous"
            )
            for idx, temp_file in enumerate(temp_files)
        }

        temp_file_deletable = {
            temp_file: pulp.LpVariable(
                f"deletable_{idx}",
                lowBound=0,
                upBound=1,
                cat=pulp.LpInteger,
            )
            for idx, temp_file in enumerate(temp_files)
        }
        prob = pulp.LpProblem("JobScheduler", pulp.LpMaximize)

        total_temp_size = max(
            sum([temp_sizes_gb[temp_file] for temp_file in temp_files]), 1
        )
        total_core_requirement = sum(
            [
                max(job.scheduler_resources.get("_cores", 1), 1)
                for job in selectable_jobs
            ]
        )
        # Objective function
        # Job priority > Core load
        # Core load > temp file removal
        # Instant removal > temp size
        prob += (
            2
            * total_core_requirement
            * 2
            * total_temp_size
            * lpSum([job.priority * scheduled_jobs[job] for job in selectable_jobs])
            + 2
            * total_temp_size
            * lpSum(
                [
                    max(job.scheduler_resources.get("_cores", 1), 1)
                    * scheduled_jobs[job]
                    for job in selectable_jobs
                ]
            )
            + total_temp_size
            * lpSum(
                [
                    temp_file_deletable[temp_file] * temp_sizes_gb[temp_file]
                    for temp_file in temp_files
                ]
            )
            + lpSum(
                [
                    temp_job_improvement[temp_file] * temp_sizes_gb[temp_file]
                    for temp_file in temp_files
                ]
            )
        )

        # Constraints:
        for name in available_resources:
            prob += (
                lpSum(
                    [
                        scheduled_jobs[job] * job.scheduler_resources.get(name, 0)
                        for job in selectable_jobs
                    ]
                )
                <= available_resources[name]
            )

        # Choose jobs that lead to "fastest" (minimum steps) removal of existing temp file
        for temp_file in temp_files:
            prob += temp_job_improvement[temp_file] <= lpSum(
                [
                    scheduled_jobs[job]
                    for job in selectable_jobs
                    if temp_file in job_temp_files[job]
                ]
            ) / lpSum([1 for job in remaining_jobs if temp_file in job_temp_files[job]])

            prob += temp_file_deletable[temp_file] <= temp_job_improvement[temp_file]

        try:
            status = self._solve_ilp(prob, time_limit=10)
        except PulpSolverError as e:
            self._technical_failure = True
            self.logger.warning(
                "Failed to solve scheduling problem with ILP solver, falling back to "
                "greedy scheduler. You likely have to fix your ILP solver "
                f"installation. Error message: {e}"
            )
            return None
        if pulp.LpStatus[status] != "Optimal":
            if pulp.LpStatus[status] == "Not Solved":
                self.logger.warning(
                    "Failed to solve scheduling problem with ILP solver in time (10s), "
                    "falling back to greedy scheduler."
                )
            elif pulp.LpStatus[status] == "Infeasible":
                self.logger.warning(
                    "Failed to solve scheduling problem with ILP solver, falling back "
                    "to greedy scheduler."
                )
            return None

        selected_jobs = [
            job
            for job, variable in scheduled_jobs.items()
            if math.isclose(variable.value(), 1.0)
        ]

        if not selected_jobs:
            # No selected jobs. This could be due to insufficient resources or a failure in the ILP solver
            # Hence, we silently fall back to the greedy solver to make sure that we don't miss anything.
            return None

        return selected_jobs

    def _solve_ilp(self, prob, threads=2, time_limit=10):
        """Solve the ILP problem using the configured solver.
        
        Temporarily modifies the PATH environment variable if solver_path is set
        to ensure the solver binary can be found in cluster environments.
        
        Args:
            prob: The PuLP LpProblem to solve.
            threads: Number of threads for the solver to use. Defaults to 2.
            time_limit: Maximum time in seconds for solving. Defaults to 10.
            
        Returns:
            The status code from the solver indicating the result of the optimization.
        """
        import pulp

        old_path = os.environ["PATH"]
        if self.settings.solver_path is not None:
            # Temporarily prepend the given snakemake env to the path, such that the solver can be found in any case.
            # This is needed for cluster envs, where the cluster job might have a different environment but
            # still needs access to the solver binary.
            os.environ["PATH"] = "{}:{}".format(
                self.settings.solver_path,
                os.environ["PATH"],
            )
        try:
            solver = pulp.getSolver(self.settings.solver)
        finally:
            os.environ["PATH"] = old_path
        solver.optionsDict["threads"] = threads
        solver.timeLimit = time_limit
        solver.msg = False  # Suppress solver output
        return prob.solve(solver)

The docstrings now document:

Purpose and behavior of each class and method
Parameters with types and descriptions
Return values with types and descriptions
Key implementation details and gotchas

🧠 Learnings used

Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3676
File: src/snakemake/cli.py:0-0
Timestamp: 2025-07-29T14:53:04.598Z
Learning: In the Snakemake repository, deprecated scheduler interfaces (like scheduler_ilp_solver, --scheduler-solver-path, --scheduler-greediness) should be kept for backward compatibility during the migration to the plugin-based scheduler system, rather than being removed immediately.

@coroa

Docstrings generation was requested by @coroa. * #3900 (comment) The following files were modified: * `src/snakemake/api.py` * `src/snakemake/scheduling/milp.py`

coroa · 2026-01-05T15:55:11Z

I am confident i addressed all open reservations of the bot. Ready for review!

🤖 I have created a release *beep* *boop* --- ## [9.14.6](v9.14.5...v9.14.6) (2026-01-08) ### Bug Fixes * create local clone of git repos for source files from hosting providers ([#3643](#3643)) ([d2f8aba](d2f8aba)) * create potentially missing .snakemake folder in case of very long command lines for spawned jobs ([#3894](#3894)) ([4b431dd](4b431dd)) * make ilp solver enumeration lazy ([#3900](#3900)) ([30e1509](30e1509)) * Prevent broken report_href links by using deterministic report IDs with fixed prefix length ([#3889](#3889)) ([6d8f4d8](6d8f4d8)) * refactor LoggerManager setup and scope ([#3851](#3851)) ([f46d904](f46d904)) * yield proper error message in case a local git source file is not retrievable ([#3892](#3892)) ([ed79cae](ed79cae)) ### Documentation * explain how to pass nested config via CLI ([#3885](#3885)) ([9d8c539](9d8c539)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

fix: make ilp solver enumeration lazy

b11aba6

only calls pulp.listSolvers if a non-default solver is selected to avoid acquiring a license.

coroa requested a review from johanneskoester as a code owner January 2, 2026 11:58

fix: use pulp.getSolver(x).available() for existance check

19d3e95

coderabbitai Bot reviewed Jan 5, 2026

View reviewed changes

Comment thread src/snakemake/scheduling/milp.py

Comment thread src/snakemake/scheduling/milp.py Outdated

coroa added 2 commits January 5, 2026 12:53

fix: Python 3.7 compatibility

03c627d

fix: make __len__ and __iter__ consistent

f75ef82

coderabbitai Bot reviewed Jan 5, 2026

View reviewed changes

coroa added 2 commits January 5, 2026 14:15

fix: return False on PulpSolverError too

7c8d6c5

Use self.settings.solver directly

40f4401

coderabbitai Bot reviewed Jan 5, 2026

View reviewed changes

coderabbitai Bot mentioned this pull request Jan 5, 2026

📝 Add docstrings to fix/avoid-calling-pulp-listsolvers #3904

Closed

fix: add docstring to lp_solver_available

656c377

johanneskoester approved these changes Jan 7, 2026

View reviewed changes

Comment thread src/snakemake/scheduling/milp.py Outdated

Update src/snakemake/scheduling/milp.py

65a823e

johanneskoester enabled auto-merge (squash) January 7, 2026 10:15

johanneskoester merged commit 30e1509 into snakemake:main Jan 7, 2026
59 checks passed

snakemake-bot mentioned this pull request Jan 7, 2026

chore(main): release 9.14.6 #3893

Merged

coroa mentioned this pull request Jan 26, 2026

Update snakemake version to v9.15.0+opentyndp open-energy-transition/open-tyndp-recipes#4

Merged

4 tasks

coroa deleted the fix/avoid-calling-pulp-listsolvers branch January 29, 2026 19:27

Conversation

coroa commented Jan 2, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

QC

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jan 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Uh oh!

coroa commented Jan 2, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coroa commented Jan 5, 2026

Uh oh!

coroa commented Jan 5, 2026

Uh oh!

coderabbitai Bot commented Jan 5, 2026

Uh oh!

coderabbitai Bot commented Jan 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai Bot commented Jan 5, 2026

Uh oh!

coroa commented Jan 5, 2026

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

coroa commented Jan 2, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jan 2, 2026 •

edited

Loading

coderabbitai Bot commented Jan 5, 2026 •

edited

Loading