@makseq makseq commented Nov 25, 2025

Improvements and fixes

  1. Auto-configure Local Files root in Community Edition when env vars are missing – Added backend autodetection of LOCAL_FILES_DOCUMENT_ROOT and LOCAL_FILES_SERVING_ENABLED via autodetect_local_files_root(), scanning for mydata / label-studio-data under the current working directory and logging the chosen root when found.

  2. Documented Community Edition Docker auto-detection for Local Storage – Updated the storage guide to explain that, in Community Edition, Label Studio automatically uses mydata or label-studio-data in the working directory as the local files root when related environment variables are unset, including Docker mount examples.

  3. Hardened Local Files storage path handling and validation – Introduced normalize_storage_path() and a migration to canonicalize stored paths (trim, normalize separators, remove trailing slashes), enforced normalization on save/clean, and tightened validation so paths must exist, cannot equal LOCAL_FILES_DOCUMENT_ROOT, and must be strict subdirectories with clearer error messages and Community Edition hints. See "Problem with path normalization" section below for details.

  4. Refactored Local Files download endpoint into the storage app with ETag support

    • Moved /data/local-files/ handling from core views into io_storages.localfiles and preserved permission checks over LocalFilesImportStorage.
    • Added weak ETag generation plus If-None-Match handling to return 304 Not Modified when appropriate.

  5. Aligned Local Files export lifecycle with annotation deletion – Extended LocalFilesExportStorage with a deletion API and a pre_delete signal hook so that exported files and links are removed when annotations are deleted, while respecting can_delete_objects to keep disk artifacts when deletion is disabled.

  6. Improved Local Files serializers and surfaced field-level validation errors – Updated serializers to normalize paths, catch both Django and DRF validation errors, stringify nested error structures, and consistently re-raise them as DRF ValidationError, enabling the UI to display precise backend validation messages per field.

  7. Enhanced Storage Settings UI for Local Files configuration – Reworked the Local Files provider into a React-based config with a dedicated warning when local file serving is disabled, Community Edition tips for automatic enablement (including Docker mount hints), and doc-root–aware default path suggestions plus new Storage Settings styling.

  8. Made StorageProviderForm respect backend validation and custom defaults – Added defaultValue support on provider fields, improved default extraction logic, wired the storage API hook to intercept 400 responses with validation_errors, and merged those into the form’s error state so server-side validation shows inline on the corresponding inputs.

  9. Extended shared UI components and reduced log noise – Introduced an info variant for the Alert component, added a TypeScript declaration file for the InlineError component, and suppressed verbose Faker and Redis “not connected” warnings to keep logs cleaner.

  10. Backed changes with focused test coverage – Added tests for Local Files path normalization, validation rules, serializer behavior, ETag/304 download handling, and export-file cleanup, plus pytest fixtures to let IO storage tests run in isolation.
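
To illustrate the weak ETag and If-None-Match handling from item 4, here is a minimal standalone sketch. It is not the code from this PR: the function names `weak_etag` and `is_not_modified` are hypothetical, and building the tag from file size and modification time is an assumed scheme chosen only for the example.

```python
import os
from typing import Optional


def weak_etag(path: str) -> str:
    """Build a weak ETag from file size and mtime (assumed scheme, not the PR's)."""
    st = os.stat(path)
    return f'W/"{st.st_size:x}-{st.st_mtime_ns:x}"'


def _strip_weak(tag: str) -> str:
    """Drop surrounding whitespace and the optional W/ prefix."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def is_not_modified(if_none_match: Optional[str], etag: str) -> bool:
    """Return True when the response should be 304 Not Modified."""
    if not if_none_match:
        return False
    if if_none_match.strip() == "*":
        return True
    # If-None-Match uses weak comparison, so the W/ prefix is ignored on both sides.
    # Splitting on commas is a simplification that is safe for the tags built above.
    candidates = {_strip_weak(t) for t in if_none_match.split(",")}
    return _strip_weak(etag) in candidates
```

A view would compute the tag for the requested file, return an empty 304 response when `is_not_modified(request_header, tag)` is true, and otherwise stream the file with the tag in the `ETag` response header.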

Problem with path normalization

Local Files access was brittle because LocalFilesImportStorage.path stored whatever users typed (with or without trailing slashes and with mixed / and \ separators). The /data/local-files/?d=… view checked permissions by comparing the requested file’s directory against storage.path prefixes in the database; when paths didn’t match exactly (for example /home/user/dataset vs /home/user/dataset/), the prefix check failed and users saw 404s even though the storage and files were valid. A previous attempt to work around this by scanning all LocalFilesImportStorage rows in Python fixed some edge cases but made the permission check O(number_of_storages) per request and therefore not acceptable at scale.

We made Local Files storage paths canonical at the model boundary and backfilled existing rows so that all paths use a single, normalized representation. A new helper normalize_storage_path in LocalFilesMixin trims whitespace, converts backslashes to the OS separator, and runs os.path.normpath to remove redundant/trailing separators; this helper is applied in clean(), save(), and in the serializers before validation, so both new and updated LocalFilesImportStorage/LocalFilesExportStorage rows are stored consistently. A data migration (0022_normalize_localfiles_paths) uses the same helper to normalize all existing rows in-place, which is a simple bounded update (no async job needed) and keeps runtime behavior uniform across deployments.
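
The normalization steps described above (trim, convert backslashes, `os.path.normpath`) can be sketched roughly as follows. This is a standalone approximation rather than the actual `LocalFilesMixin` method; in particular, passing `None` and empty strings through unchanged is an assumption made for the example.

```python
import os


def normalize_storage_path(path):
    """Return a canonical form of a user-supplied storage path (sketch)."""
    if path is None:
        return None
    cleaned = path.strip()
    if not cleaned:
        # Assumption: leave empty input untouched so validation can reject it.
        return cleaned
    # Treat backslashes as separators so Windows-style input matches the host OS.
    cleaned = cleaned.replace("\\", os.sep)
    # normpath collapses duplicate separators and drops the trailing one.
    return os.path.normpath(cleaned)
```

On a POSIX host, inputs such as ` /home/user/dataset/ ` and `/home/user\dataset\` would both come out as `/home/user/dataset`, which is what lets the stored value be compared as a plain prefix later.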

With all storage paths canonical, the /data/local-files view goes back to using an efficient database-side filter instead of a Python scan: it normalizes the requested file’s directory once and uses an annotated _full_path__startswith=F('path') query to select only candidate storages, then checks project permissions as before. We added focused tests for normalize_storage_path (including trailing slashes, mixed separators, and Windows-style inputs) plus an end‑to‑end view test to prove that storages configured with trailing slashes or backslashes still resolve correctly. Overall, the fix removes the user-facing 404s caused by path formatting, keeps behavior consistent across OSes, and restores the original scalable query pattern for large numbers of Local Files storages.
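
The effect of canonical paths on the prefix check can be shown with a small pure-Python stand-in for the database filter. This sketch does not reproduce the ORM query; `candidate_storage_paths` is a hypothetical helper that mirrors only the string-prefix semantics of a `startswith` lookup, applied to the normalized directory of the requested file.

```python
import os


def candidate_storage_paths(file_path, storage_paths):
    """Return stored paths that are a string prefix of the requested file's directory."""
    # Normalize the requested file's directory once, as the view is described to do.
    requested = file_path.strip().replace("\\", os.sep)
    requested_dir = os.path.normpath(os.path.dirname(requested))
    # Plain prefix comparison, analogous to a startswith filter in the database.
    return [p for p in storage_paths if requested_dir.startswith(p)]


# A non-canonical stored path (trailing slash) fails the prefix check:
print(candidate_storage_paths("/home/user/dataset/img.png", ["/home/user/dataset/"]))
# The canonical form of the same path matches:
print(candidate_storage_paths("/home/user/dataset/img.png", ["/home/user/dataset"]))
```

The first call returns no candidates because `/home/user/dataset` does not start with `/home/user/dataset/`, which is the mismatch that surfaced as a 404. Permission checks on the selected storages would still follow as a separate step.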

UI Improvements

New icon: [image]

Improved error display: [image]

Alerts for env variables: [image]

Absolute local path is automatically preloaded from LOCAL_FILES_DOCUMENT_ROOT: [image]

@makseq makseq requested a review from a team as a code owner November 25, 2025 22:54

netlify bot commented Nov 25, 2025

Deploy Preview for label-studio-docs-new-theme canceled.

Name Link
🔨 Latest commit 581b91a
🔍 Latest deploy log https://app.netlify.com/projects/label-studio-docs-new-theme/deploys/6928cf3f7ee39000074992b3


netlify bot commented Nov 25, 2025

Deploy Preview for heartex-docs canceled.

Name Link
🔨 Latest commit 581b91a
🔍 Latest deploy log https://app.netlify.com/projects/heartex-docs/deploys/6928cf3f0bc600000833d043


netlify bot commented Nov 25, 2025

Deploy Preview for label-studio-storybook canceled.

Name Link
🔨 Latest commit 581b91a
🔍 Latest deploy log https://app.netlify.com/projects/label-studio-storybook/deploys/6928cf3f60573d0007cf7ace

@github-actions github-actions bot added the feat label Nov 25, 2025

netlify bot commented Nov 25, 2025

Deploy Preview for label-studio-playground canceled.

Name Link
🔨 Latest commit 581b91a
🔍 Latest deploy log https://app.netlify.com/projects/label-studio-playground/deploys/6928cf3f154d32000839b803


codecov bot commented Nov 27, 2025

Codecov Report

❌ Patch coverage is 94.72296% with 20 lines in your changes missing coverage. Please review.
✅ Project coverage is 65.11%. Comparing base (7502391) to head (581b91a).
⚠️ Report is 1 commits behind head on develop.
✅ All tests successful. No failed tests found.

Files with missing lines                               Patch %   Lines
label_studio/io_storages/localfiles/models.py          79.59%    10 Missing ⚠️
label_studio/io_storages/localfiles/views.py           93.54%    4 Missing ⚠️
label_studio/core/settings/base.py                     66.66%    3 Missing ⚠️
label_studio/io_storages/localfiles/serializers.py     88.46%    3 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #8891      +/-   ##
===========================================
- Coverage    66.28%   65.11%   -1.18%     
===========================================
  Files          820      828       +8     
  Lines        64438    64773     +335     
  Branches     11041    11041              
===========================================
- Hits         42713    42174     -539     
- Misses       21721    22595     +874     
  Partials         4        4              
Flag               Coverage Δ
lsf-e2e            46.43% <ø> (-5.39%) ⬇️
lsf-integration    48.54% <ø> (-0.03%) ⬇️
lsf-unit           8.33% <ø> (ø)
pytests            81.31% <94.72%> (+0.27%) ⬆️

makseq commented Nov 27, 2025

/git merge

Workflow run
Successfully merged: 19 files changed, 239 insertions(+), 255 deletions(-)

@makseq makseq enabled auto-merge (squash) November 27, 2025 23:33
@makseq makseq disabled auto-merge November 27, 2025 23:33
@makseq makseq merged commit 43e7124 into develop Nov 28, 2025
78 of 82 checks passed
@makseq makseq deleted the fb-bros-643 branch November 28, 2025 00:06