Skip to content

Conversation

@d-chambers
Copy link
Contributor

@d-chambers d-chambers commented Aug 23, 2025

Description

A user provided a Silixa Carina SEGY file that DASCore could not read. The issue was the version string in the header is "0.100", which is rather odd. Simply adding another FiberIO in dascore.io.segy.core with this version, however, did the trick.

Checklist

I have (if applicable):

  • referenced the GitHub issue this PR closes.
  • documented the new feature with docstrings or appropriate doc page.
  • included a test. See testing guidelines.
  • your name has been added to the contributors page (docs/contributors.md).
  • added the "ready_for_review" tag once the PR is ready to be reviewed.

Summary by CodeRabbit

  • New Features

    • Added support for SEGY format version 0.100 and broadened minor-version recognition to accept additional SEGY variants, improving detection of more files.
  • Chores

    • Updated plugin/entry-point registrations to expose additional SEGY variants (including 0.0 and 0.100) for automatic discovery.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Aug 23, 2025

Walkthrough

Adds a new SEGY handler class for version "0.100", extends utils to accept minor version 100, and registers the new handler via packaging entry points. No existing read/write/scan logic was changed; changes are additive to enable dispatch to the new versioned handler.

Changes

Cohort / File(s) Summary
SEGY core handler
dascore/io/segy/core.py
Adds SegyV0_100(SegyV1_0) with version = "0.100" to expose the SEGY 0.100 variant.
SEGY utils version parsing
dascore/io/segy/utils.py
Accepts minor version 100 in allowed set; assigns version = f"{major}.{minor}" before returning; minor comment cleanup.
Packaging entry points
pyproject.toml
Adds SEGY__V0_0 -> dascore.io.segy.core:SegyV0_0 and SEGY__V0_100 -> dascore.io.segy.core:SegyV0_100 under [project.entry-points."dascore.fiber_io"].

Sequence Diagram(s)

sequenceDiagram
    autonumber
    actor User
    participant Dispatcher as IO Dispatcher
    participant Utils as SEGY Utils
    participant Handler as SegyV0_100 Handler

    User->>Dispatcher: open(file.segy)
    Dispatcher->>Utils: read_header_and_get_version()
    Utils-->>Dispatcher: "0.100"
    Dispatcher->>Dispatcher: resolve entry point for version 0.100
    Dispatcher->>Handler: instantiate SegyV0_100
    Handler->>Handler: parse/scan/read (existing logic)
    Handler-->>User: return data/metadata
    note right of Dispatcher: New dispatch path for minor version 100
Loading

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Poem

A rabbit taps the version tree,
Sniffs out "0.100" with glee.
A new handler hops into place,
Entry points marked, utils embrace.
Carrots stored in tidy byte-space. 🥕

Tip

🔌 Remote MCP (Model Context Protocol) integration is now available!

Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats.


📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between a4d8b9d and 5d86f4c.

📒 Files selected for processing (1)
  • dascore/io/segy/core.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • dascore/io/segy/core.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (15)
  • GitHub Check: test_code (windows-latest, 3.12)
  • GitHub Check: test_code (windows-latest, 3.11)
  • GitHub Check: test_code (windows-latest, 3.10)
  • GitHub Check: test_code (ubuntu-latest, 3.12)
  • GitHub Check: test_code (ubuntu-latest, 3.11)
  • GitHub Check: test_code (macos-latest, 3.12)
  • GitHub Check: test_code (macos-latest, 3.11)
  • GitHub Check: test_code (ubuntu-latest, 3.10)
  • GitHub Check: test_code (macos-latest, 3.10)
  • GitHub Check: test_code_min_deps (macos-latest, 3.12)
  • GitHub Check: test_code_min_deps (macos-latest, 3.13)
  • GitHub Check: test_code_min_deps (windows-latest, 3.12)
  • GitHub Check: test_code_min_deps (windows-latest, 3.13)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.12)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.13)
✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch silixa_carina_segy

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@codecov
Copy link

codecov bot commented Aug 23, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.87%. Comparing base (a89ff7b) to head (5d86f4c).
⚠️ Report is 2 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #519   +/-   ##
=======================================
  Coverage   99.87%   99.87%           
=======================================
  Files         118      118           
  Lines        9732     9735    +3     
=======================================
+ Hits         9720     9723    +3     
  Misses         12       12           
Flag Coverage Δ
unittests 99.87% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (3)
dascore/io/segy/utils.py (3)

38-38: Nit: hyphenate “400-byte” for clarity

Minor wording improvement in the comment.

-    # Read 400byte header into byte string.
+    # Read 400-byte header into a byte string.

63-65: Accepting minor version 100 is correct; consider centralizing allowed versions

Allowing 100 here unblocks Silixa Carina (0.100). To reduce future drift across modules, consider extracting the allowed minor set into a module-level constant.

-        format_number_minor in {0, 1, 2, 3, 100},
+        format_number_minor in ALLOWED_SEGY_MINOR_VERSIONS,

Add near the top of this module:

# Allowed minor versions observed in the wild (including Silixa's 0.100)
ALLOWED_SEGY_MINOR_VERSIONS: set[int] = {0, 1, 2, 3, 100}

I can propagate this constant to any other places that duplicate the set, if you’d like.


67-69: Return shape change is fine; consider adding a targeted test for 0.100

Switching to compute version via f-string is clear and keeps “0.100” intact. Please add a test that feeds a minimal 400-byte header with major=0, minor=100 into _get_segy_version and asserts ("segy", "0.100").

If helpful, I can draft a unit test that constructs a BytesIO with just the binary header and validates dispatch without needing a full SEGY payload.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between a89ff7b and a4d8b9d.

📒 Files selected for processing (3)
  • dascore/io/segy/core.py (1 hunks)
  • dascore/io/segy/utils.py (2 hunks)
  • pyproject.toml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (15)
  • GitHub Check: test_code_min_deps (windows-latest, 3.13)
  • GitHub Check: test_code_min_deps (windows-latest, 3.12)
  • GitHub Check: test_code_min_deps (macos-latest, 3.12)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.12)
  • GitHub Check: test_code (windows-latest, 3.12)
  • GitHub Check: test_code_min_deps (macos-latest, 3.13)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.13)
  • GitHub Check: test_code (windows-latest, 3.11)
  • GitHub Check: test_code (macos-latest, 3.11)
  • GitHub Check: test_code (ubuntu-latest, 3.12)
  • GitHub Check: test_code (windows-latest, 3.10)
  • GitHub Check: test_code (macos-latest, 3.12)
  • GitHub Check: test_code (ubuntu-latest, 3.11)
  • GitHub Check: test_code (macos-latest, 3.10)
  • GitHub Check: test_code (ubuntu-latest, 3.10)

Comment on lines +123 to +124
SEGY__V0_0 = "dascore.io.segy.core:SegyV0_0"
SEGY__V0_100 = "dascore.io.segy.core:SegyV0_100"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

⚠️ Potential issue

Ensure entry points resolve after class rename

SEGY__V0_0 is fine. SEGY__V0_100 points to dascore.io.segy.core:SegyV0_100, which currently does not exist because the class is named SegyV0_1 in core.py. After renaming the class to SegyV0_100 (preferred), this will be consistent.

If you decide to keep the class name SegyV0_1 (not recommended), update the entry point instead:

-SEGY__V0_100 = "dascore.io.segy.core:SegyV0_100"
+SEGY__V0_100 = "dascore.io.segy.core:SegyV0_1"
🤖 Prompt for AI Agents
In pyproject.toml around lines 123-124, the entry point SEGY__V0_100 points to
dascore.io.segy.core:SegyV0_100 which does not exist because the class is
currently named SegyV0_1; to fix, rename the class in dascore/io/segy/core.py
from SegyV0_1 to SegyV0_100 (and update any internal references/imports) so the
entry point resolves, or if you prefer to keep the class name, change the
pyproject.toml entry to point to SegyV0_1 and run project tests to ensure
imports remain correct.

💡 Verification agent

🧩 Analysis chain

Confirm intent for other 0.x minor versions (0.1/0.2/0.3) registration

utils._get_segy_version accepts minor versions {0,1,2,3,100}, but only V0_0 and V0_100 are registered here. If files report 0.1/0.2/0.3, will the framework dispatch correctly without explicit entry points? If exact version matching is required, we may need SEGY__V0_1, SEGY__V0_2, and SEGY__V0_3 entries (possibly all mapping to SegyV1_0 or a shared v0 handler).

Run this script to audit for corresponding classes/entry points:

Expected: Either explicit entries for V0_1/V0_2/V0_3 or a documented fallback mechanism that doesn't require exact version registration. If fallback exists, consider noting it in a docstring or comment next to these entry points.


🏁 Script executed:

#!/bin/bash
# Check for registered SEGY v0.* entry points and classes
rg -n 'SEGY__V0_' pyproject.toml -n
rg -nP 'class\s+SegyV0_' dascore/io/segy/core.py -n -C2

Length of output: 439


🏁 Script executed:

#!/bin/bash
# Inspect fallback logic in utils._get_segy_version
rg -nP 'def\s+_get_segy_version' -nC10 dascore/io/segy/utils.py

Length of output: 488


🏁 Script executed:

#!/bin/bash
# Corrected search for _get_segy_version definition with context
rg -nP 'def\s+_get_segy_version' -C10 dascore/io/segy/utils.py

Length of output: 782


🏁 Script executed:

#!/bin/bash
# 1. Show version detection and fallback logic in utils._get_segy_version
sed -n '29,100p' dascore/io/segy/utils.py

# 2. Inspect how discovered version is mapped to entry points (dispatching logic)
sed -n '1,200p' dascore/io/segy/__init__.py

# 3. Examine the entry-point group and context around SEGY__V0_0/V0_100 in pyproject.toml
sed -n '100,140p' pyproject.toml

Length of output: 4810


SEGY version handlers are misaligned with detected minor versions

The _get_segy_version utility recognizes minor versions {0, 1, 2, 3, 100}, but only two handlers are registered—and one of them points to a non-existent class. Without explicit support for all detected codes, loading files with versions 0.1, 0.2, or 0.3 (and even 0.100) will break.

Issues to address:

  • utils/_get_segy_version (dascore/io/segy/utils.py: lines 39–47) allows format_number_minor in {0,1,2,3,100}, returning "0.1", "0.2", "0.3", or "0.100".
  • core handlers defined (dascore/io/segy/core.py):
    • class SegyV0_0 (≔ version 0.0)
    • class SegyV0_1 (/doc: “version 0.100”/**)
      → No SegyV0_100, SegyV0_2, or SegyV0_3 classes exist.
  • Entry points (pyproject.toml: lines 123–128):
    SEGY__V0_0   = "dascore.io.segy.core:SegyV0_0"
    SEGY__V0_100 = "dascore.io.segy.core:SegyV0_100"
    
    SegyV0_100 isn’t defined, and there are no entries for V0_1, V0_2, or V0_3.

Next steps:

  • Define or rename classes for all minor versions:
    SegyV0_1SegyV0_100 (if it truly implements 0.100)
    • Add SegyV0_1, SegyV0_2, and SegyV0_3 as needed.
  • Register matching entry points in pyproject.toml:
    SEGY__V0_1, SEGY__V0_2, SEGY__V0_3, and correct SEGY__V0_100.
  • Or implement and document a clear fallback mechanism for unsupported minor versions.

These fixes are critical to ensure SEGY files with versions 0.1/0.2/0.3/0.100 load correctly.

🤖 Prompt for AI Agents
pyproject.toml around lines 123-124: update the SEGY entry points and
corresponding core classes so registered handlers match the minor versions
allowed by utils/_get_segy_version (0,1,2,3,100); specifically either (A) rename
the existing class in dascore/io/segy/core.py so the 0.100 implementation is
exported as SegyV0_100 and add concrete classes SegyV0_1, SegyV0_2, SegyV0_3 (or
alias existing implementations to those names if appropriate), then add matching
entries SEGY__V0_0, SEGY__V0_1, SEGY__V0_2, SEGY__V0_3, SEGY__V0_100 to
pyproject.toml pointing to dascore.io.segy.core:SegyV0_X; or (B) implement a
documented fallback loader in dascore/io/segy/utils.py that maps unsupported
minor versions to an explicit existing handler and update pyproject.toml to only
register the handler names that actually exist. Ensure names and entry points
exactly match class names and update any docstrings/comments to avoid the
current mismatch.

@d-chambers d-chambers merged commit 271bcc8 into master Aug 23, 2025
20 checks passed
@d-chambers d-chambers deleted the silixa_carina_segy branch September 19, 2025 11:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants