Windows compatibility issue: Path separator in cloud storage URLs #783

@mohamed-laarej

Issue Summary

On Windows, tests for the pf7, pf8, and pv4 datasets fail with "Invalid bucket name" errors. The cause is that cloud storage paths are built with the Windows backslash separator (\) instead of the forward slash (/) that cloud storage URLs require.

Root Cause

In malariagen_data/plasmodium.py, os.path.join() is used to construct cloud storage paths in four methods:

# In sample_metadata() method:
path = os.path.join(self._path, self.CONF["metadata_path"])

# In _open_variant_calls_zarr() method:
path = os.path.join(self._path, self.CONF["variant_calls_zarr_path"])

# In open_genome() method:
path = os.path.join(self._path, self.CONF["reference_path"])

# In genome_features() method:
path = os.path.join(self._path, self.CONF["annotations_path"])

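On Windows, os.path.join() inserts the platform separator, a backslash, between the two components. Cloud storage URLs only recognise forward slashes as separators, so the bucket name and the first path component are read as a single, invalid bucket name (for example 'pf7_release\metadata').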
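The behaviour can be reproduced on any operating system, because the standard library exposes the Windows implementation of os.path as ntpath. The following is a minimal sketch, assuming a base path of "gs://pf7_release" without a trailing slash and a relative path of "metadata/samples.txt"; both values are illustrative and not taken from the package configuration. The bucket_of() helper is hypothetical and only approximates how a filesystem library might split off the bucket name.

```python
import ntpath      # Windows flavour of os.path, importable on every platform
import posixpath   # POSIX flavour of os.path

# Illustrative values (assumptions, not the real CONF entries).
base = "gs://pf7_release"
relative = "metadata/samples.txt"


def bucket_of(url: str) -> str:
    """Hypothetical helper: take everything between '://' and the first '/'."""
    return url.split("://", 1)[1].split("/", 1)[0]


# What os.path.join() does on Windows: the components are glued with "\".
windows_path = ntpath.join(base, relative)

# What os.path.join() does on Linux/macOS: the components are glued with "/".
posix_path = posixpath.join(base, relative)

print(windows_path, "->", bucket_of(windows_path))
print(posix_path, "->", bucket_of(posix_path))
```

With these inputs the Windows-style join yields a bucket name that contains a backslash, which matches the shape of the error reported by the tests.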

Error Details

Tests fail with errors such as:

gcsfs.retry.HttpError: Invalid bucket name: 'pf7_release\metadata', 400
botocore.exceptions.ParamValidationError: Parameter validation failed: Invalid bucket name

Proposed Fix

Use string formatting with explicit forward slashes:

# In sample_metadata() method:
path = f"{self._path}/{self.CONF['metadata_path']}"

# In _open_variant_calls_zarr() method:
path = f"{self._path}/{self.CONF['variant_calls_zarr_path']}"

# In open_genome() method:
path = f"{self._path}/{self.CONF['reference_path']}"

# In genome_features() method:
path = f"{self._path}/{self.CONF['annotations_path']}"
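One thing to watch with plain string formatting is that a trailing slash on self._path, or a leading slash on the configured value, would produce a double slash. A small helper could centralise the join and guard against that. The sketch below is only a suggestion: join_cloud_path is a hypothetical name that does not exist in the package, and the example values are illustrative.

```python
def join_cloud_path(base: str, *parts: str) -> str:
    """Join URL components with "/" regardless of the host operating system.

    Hypothetical helper, not part of malariagen_data.
    """
    # Remove slashes at the joins so that "a//b" cannot occur.
    stripped = (part.strip("/") for part in parts)
    # Skip components that are empty once their slashes are removed.
    cleaned = [part for part in stripped if part]
    return "/".join([base.rstrip("/"), *cleaned])


# Illustrative values (assumptions, not the real CONF entries).
print(join_cloud_path("gs://pf7_release", "metadata/samples.txt"))
print(join_cloud_path("gs://pf7_release/", "/metadata/samples.txt"))
```

Both calls print the same URL. Each of the four methods could then call the helper instead of repeating the f-string, for example path = join_cloud_path(self._path, self.CONF["metadata_path"]).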

Alternative approach: keep os.path.join() and normalise the separators afterwards:

# For any of the methods:
path = os.path.join(self._path, self.CONF["metadata_path"])  # or other paths
path = path.replace('\\', '/')
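A further option in the same spirit is posixpath.join() from the standard library, which always joins with "/" on every platform and so needs no follow-up replace(). A minimal sketch with illustrative values (not the real CONF entries):

```python
import posixpath

# Illustrative values (assumptions, not the real CONF entries).
base = "gs://pf7_release"
relative = "metadata/samples.txt"

# Joins with "/" even when the code runs on Windows.
path = posixpath.join(base, relative)
print(path)

# Caveat: a component that starts with "/" is treated as absolute and
# discards everything before it, so the bucket prefix is lost here.
reset = posixpath.join(base, "/" + relative)
print(reset)
```

The caveat only matters if a configured path begins with a slash; if that cannot be ruled out, stripping the leading slash before joining avoids it.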

Files to Change

  • malariagen_data/plasmodium.py
