
Move venv inside the workspace folder of the job dir #1400

Merged
deep1401 merged 2 commits into main from fix/move-local-venv on Feb 25, 2026


Conversation


@deep1401 deep1401 (Member) commented Feb 25, 2026

Summary by CodeRabbit

  • Refactor
    • Reorganized per-job Python environment placement to live at the workspace level, improving consistency and isolation across jobs.
    • Environment activation and synchronization now use the workspace home, reducing setup failures and improving reliability of job execution.


coderabbitai bot commented Feb 25, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ae33959 and d61b3ce.

📒 Files selected for processing (1)
  • api/transformerlab/compute_providers/local.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • api/transformerlab/compute_providers/local.py

📝 Walkthrough

Walkthrough

Per-job Python virtual environment creation and synchronization were moved from the job directory (job_dir/venv) into the per-job workspace (workspace/venv). The venv is created inside the workspace HOME and then wired into PATH and HOME during environment setup.
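As a rough sketch of the relocated flow described above (the real helper names in local.py are not shown in this summary, so `setup_job_env` and the exact layout here are hypothetical):

```python
import os
import venv
from pathlib import Path

def setup_job_env(workspace: Path) -> dict:
    """Create workspace/venv and return env vars wiring HOME and PATH to it."""
    venv_path = workspace / "venv"                 # venv now lives in the workspace,
    venv_path.mkdir(parents=True, exist_ok=True)   # not in job_dir
    venv.EnvBuilder(with_pip=False).create(venv_path)

    env = os.environ.copy()
    env["HOME"] = str(workspace)                   # workspace acts as the job's HOME
    env["PATH"] = str(venv_path / "bin") + os.pathsep + env.get("PATH", "")
    return env
```

Because both HOME and the head of PATH point into the workspace, the venv's interpreter and any dotfile-style state are isolated per job, which is the consistency gain the summary refers to.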

Changes

Cohort: Virtual Environment Path Relocation
File(s): api/transformerlab/compute_providers/local.py
Summary: Moves the venv creation and synchronization target from job_dir/venv to workspace/venv; adjusts HOME and PATH setup so the workspace HOME contains the venv lifecycle. No public signatures changed.

Sequence Diagram(s)

(omitted)

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Suggested reviewers

  • aliasaria
  • dadmobile

Poem

🐰 In workspace warm, my venv sleeps,
I hopped it there through tunnels deep.
PATH arranged and HOME made new,
A tidy burrow through and through.
🥕✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Description Check ✅ Passed: Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed: The title 'Move venv inside the workspace folder of the job dir' directly and clearly describes the main change in the pull request: relocating the virtual environment from the job directory to the workspace folder.
  • Docstring Coverage ✅ Passed: Docstring coverage is 100.00%, which meets the required threshold of 80.00%.



@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
api/transformerlab/compute_providers/local.py (2)

246-260: ⚠️ Potential issue | 🔴 Critical

os.kill return-value check is inverted — running processes always reported as DOWN.

os.kill(pid, 0) always returns None when the process is alive and raises ProcessLookupError/OSError when it is not. Because os_killed is unconditionally None, the condition os_killed is not None is always False, so the else branch (returning ClusterState.DOWN) is always taken for live processes. The state check is completely non-functional. The comment on line 248 ("Return up only if the process is not running") also reveals the inverted intent.

🐛 Proposed fix
```diff
-        try:
-            pid = int(pid_file.read_text().strip())
-            os_killed = os.kill(pid, 0)
-            # Return up only if the process is not running
-            if os_killed is not None:
-                return ClusterStatus(
-                    cluster_name=cluster_name,
-                    state=ClusterState.UP,
-                    status_message="Process running",
-                )
-            else:
-                return ClusterStatus(
-                    cluster_name=cluster_name,
-                    state=ClusterState.DOWN,
-                    status_message="Process not running",
-                )
-        except (ValueError, ProcessLookupError, OSError):
+        try:
+            pid = int(pid_file.read_text().strip())
+            os.kill(pid, 0)  # raises if process does not exist
+            # If we reach here, the process is alive
             return ClusterStatus(
                 cluster_name=cluster_name,
-                state=ClusterState.DOWN,
-                status_message="Process not running",
+                state=ClusterState.UP,
+                status_message="Process running",
             )
+        except (ValueError, ProcessLookupError, OSError):
+            return ClusterStatus(
+                cluster_name=cluster_name,
+                state=ClusterState.DOWN,
+                status_message="Process not running",
+            )
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/transformerlab/compute_providers/local.py` around lines 246 - 260, The
os.kill(pid, 0) check in the cluster status code is inverted: instead of
inspecting os_killed it should rely on exceptions (no exception => process
alive). Update the logic in the function that builds ClusterStatus (around where
pid is read and os.kill is called) to call os.kill(pid, 0) inside a try/except;
on success return ClusterStatus(cluster_name=cluster_name,
state=ClusterState.UP, status_message="Process running"), and on
ProcessLookupError/OSError return ClusterStatus(..., state=ClusterState.DOWN,
status_message="Process not running"); also remove or correct the misleading
comment about the reversed intent.

186-193: ⚠️ Potential issue | 🟠 Major

Resource leak: stdout/stderr file handles are never closed.

The two open(...) calls are passed inline to Popen without being stored or closed. Popen duplicates the descriptors for the child process but never closes the parent's file objects, so each launch leaks two descriptors until the garbage collector happens to finalize them, which is non-deterministic and can exhaust fd limits under load.

🛡️ Proposed fix
```diff
-        proc = subprocess.Popen(
-            ["/bin/bash", "-c", config.command or "true"],
-            cwd=str(job_dir),
-            env=env,
-            stdout=open(job_dir / "stdout.log", "w"),
-            stderr=open(job_dir / "stderr.log", "w"),
-            start_new_session=True,
-        )
+        stdout_fh = open(job_dir / "stdout.log", "w")
+        stderr_fh = open(job_dir / "stderr.log", "w")
+        proc = subprocess.Popen(
+            ["/bin/bash", "-c", config.command or "true"],
+            cwd=str(job_dir),
+            env=env,
+            stdout=stdout_fh,
+            stderr=stderr_fh,
+            start_new_session=True,
+        )
+        stdout_fh.close()
+        stderr_fh.close()
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/transformerlab/compute_providers/local.py` around lines 186 - 193, The
inline open(...) calls passed to subprocess.Popen leak file descriptors; change
the Popen invocation in local.py so you first open stdout/stderr into variables
(e.g., out_f, err_f), pass those file objects to subprocess.Popen ([
"/bin/bash", "-c", config.command or "true" ], cwd=str(job_dir), env=env,
stdout=out_f, stderr=err_f, start_new_session=True), and then immediately close
the parent handles after proc is created (or ensure closure in a try/finally) to
guarantee the file descriptors are closed even on exceptions; reference the proc
variable and the Popen call to locate where to add the out_f/err_f open, pass,
and close logic.
🧹 Nitpick comments (1)
api/transformerlab/compute_providers/local.py (1)

110-111: Redundant Path() re-cast on an already-typed Path parameter.

The parameter venv_path is declared Path at line 101, so the reassignment on line 110 is a no-op.

♻️ Proposed cleanup
```diff
-        venv_path = Path(venv_path)
-        venv_path.mkdir(parents=True, exist_ok=True)
+        venv_path.mkdir(parents=True, exist_ok=True)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/transformerlab/compute_providers/local.py` around lines 110 - 111, The
code redundantly re-casts the already-typed Path parameter venv_path with
venv_path = Path(venv_path); remove that reassignment and call
venv_path.mkdir(parents=True, exist_ok=True) directly so the function uses the
original Path parameter (look for the venv_path parameter and the
venv_path.mkdir(...) call in the same function and delete the unnecessary
Path(...) re-cast).

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4aebbb7 and ae33959.

📒 Files selected for processing (1)
  • api/transformerlab/compute_providers/local.py


sentry bot commented Feb 25, 2026

Codecov Report

❌ Patch coverage is 0% with 2 lines in your changes missing coverage. Please review.

Files with missing lines | Patch % | Lines
api/transformerlab/compute_providers/local.py | 0.00% | 2 Missing ⚠️


@deep1401 deep1401 merged commit 3c13432 into main on Feb 25, 2026
9 of 10 checks passed