Move venv inside the workspace folder of the job dir#1400
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review infoConfiguration used: defaults Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughPer-job Python virtual environment creation and synchronization were moved from the job directory ( Changes
Sequence Diagram(s)(omitted) Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
api/transformerlab/compute_providers/local.py (2)
246-260:⚠️ Potential issue | 🔴 Critical
os.killreturn-value check is inverted — running processes always reported asDOWN.
os.kill(pid, 0)always returnsNonewhen the process is alive and raisesProcessLookupError/OSErrorwhen it is not. Becauseos_killedis unconditionallyNone, the conditionos_killed is not Noneis alwaysFalse, so theelsebranch (returningClusterState.DOWN) is always taken for live processes. The state check is completely non-functional. The comment on line 248 ("Return up only if the process is not running") also reveals the inverted intent.🐛 Proposed fix
- try: - pid = int(pid_file.read_text().strip()) - os_killed = os.kill(pid, 0) - # Return up only if the process is not running - if os_killed is not None: - return ClusterStatus( - cluster_name=cluster_name, - state=ClusterState.UP, - status_message="Process running", - ) - else: - return ClusterStatus( - cluster_name=cluster_name, - state=ClusterState.DOWN, - status_message="Process not running", - ) - except (ValueError, ProcessLookupError, OSError): + try: + pid = int(pid_file.read_text().strip()) + os.kill(pid, 0) # raises if process does not exist + # If we reach here, the process is alive return ClusterStatus( cluster_name=cluster_name, - state=ClusterState.DOWN, - status_message="Process not running", + state=ClusterState.UP, + status_message="Process running", ) + except (ValueError, ProcessLookupError, OSError): + return ClusterStatus( + cluster_name=cluster_name, + state=ClusterState.DOWN, + status_message="Process not running", + )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/transformerlab/compute_providers/local.py` around lines 246 - 260, The os.kill(pid, 0) check in the cluster status code is inverted: instead of inspecting os_killed it should rely on exceptions (no exception => process alive). Update the logic in the function that builds ClusterStatus (around where pid is read and os.kill is called) to call os.kill(pid, 0) inside a try/except; on success return ClusterStatus(cluster_name=cluster_name, state=ClusterState.UP, status_message="Process running"), and on ProcessLookupError/OSError return ClusterStatus(..., state=ClusterState.DOWN, status_message="Process not running"); also remove or correct the misleading comment about the reversed intent.
186-193:⚠️ Potential issue | 🟠 MajorResource leak:
stdout/stderrfile handles are never closed.The two
open(...)calls are passed inline toPopenwithout being stored or closed.Popentakes ownership of the file objects but does not guarantee closure. These file descriptors leak until the GC collects them, which is non-deterministic and can exhaust fd limits under load.🛡️ Proposed fix
- proc = subprocess.Popen( - ["/bin/bash", "-c", config.command or "true"], - cwd=str(job_dir), - env=env, - stdout=open(job_dir / "stdout.log", "w"), - stderr=open(job_dir / "stderr.log", "w"), - start_new_session=True, - ) + stdout_fh = open(job_dir / "stdout.log", "w") + stderr_fh = open(job_dir / "stderr.log", "w") + proc = subprocess.Popen( + ["/bin/bash", "-c", config.command or "true"], + cwd=str(job_dir), + env=env, + stdout=stdout_fh, + stderr=stderr_fh, + start_new_session=True, + ) + stdout_fh.close() + stderr_fh.close()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/transformerlab/compute_providers/local.py` around lines 186 - 193, The inline open(...) calls passed to subprocess.Popen leak file descriptors; change the Popen invocation in local.py so you first open stdout/stderr into variables (e.g., out_f, err_f), pass those file objects to subprocess.Popen ([ "/bin/bash", "-c", config.command or "true" ], cwd=str(job_dir), env=env, stdout=out_f, stderr=err_f, start_new_session=True), and then immediately close the parent handles after proc is created (or ensure closure in a try/finally) to guarantee the file descriptors are closed even on exceptions; reference the proc variable and the Popen call to locate where to add the out_f/err_f open, pass, and close logic.
🧹 Nitpick comments (1)
api/transformerlab/compute_providers/local.py (1)
110-111: RedundantPath()re-cast on an already-typedPathparameter.The parameter
venv_pathis declaredPathat line 101, so the reassignment on line 110 is a no-op.♻️ Proposed cleanup
- venv_path = Path(venv_path) - venv_path.mkdir(parents=True, exist_ok=True) + venv_path.mkdir(parents=True, exist_ok=True)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/transformerlab/compute_providers/local.py` around lines 110 - 111, The code redundantly re-casts the already-typed Path parameter venv_path with venv_path = Path(venv_path); remove that reassignment and call venv_path.mkdir(parents=True, exist_ok=True) directly so the function uses the original Path parameter (look for the venv_path parameter and the venv_path.mkdir(...) call in the same function and delete the unnecessary Path(...) re-cast).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@api/transformerlab/compute_providers/local.py`:
- Around line 246-260: The os.kill(pid, 0) check in the cluster status code is
inverted: instead of inspecting os_killed it should rely on exceptions (no
exception => process alive). Update the logic in the function that builds
ClusterStatus (around where pid is read and os.kill is called) to call
os.kill(pid, 0) inside a try/except; on success return
ClusterStatus(cluster_name=cluster_name, state=ClusterState.UP,
status_message="Process running"), and on ProcessLookupError/OSError return
ClusterStatus(..., state=ClusterState.DOWN, status_message="Process not
running"); also remove or correct the misleading comment about the reversed
intent.
- Around line 186-193: The inline open(...) calls passed to subprocess.Popen
leak file descriptors; change the Popen invocation in local.py so you first open
stdout/stderr into variables (e.g., out_f, err_f), pass those file objects to
subprocess.Popen ([ "/bin/bash", "-c", config.command or "true" ],
cwd=str(job_dir), env=env, stdout=out_f, stderr=err_f, start_new_session=True),
and then immediately close the parent handles after proc is created (or ensure
closure in a try/finally) to guarantee the file descriptors are closed even on
exceptions; reference the proc variable and the Popen call to locate where to
add the out_f/err_f open, pass, and close logic.
---
Nitpick comments:
In `@api/transformerlab/compute_providers/local.py`:
- Around line 110-111: The code redundantly re-casts the already-typed Path
parameter venv_path with venv_path = Path(venv_path); remove that reassignment
and call venv_path.mkdir(parents=True, exist_ok=True) directly so the function
uses the original Path parameter (look for the venv_path parameter and the
venv_path.mkdir(...) call in the same function and delete the unnecessary
Path(...) re-cast).
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
Summary by CodeRabbit