
Conversation

@d-chambers d-chambers commented Sep 20, 2025

Description

This PR fixes a bug in patch reverse operations (which are currently incorrect) and makes spool equality comparisons more robust to circular references (just in case).

Checklist

I have (if applicable):

  • referenced the GitHub issue this PR closes.
  • documented the new feature with docstrings or appropriate doc page.
  • included a test. See testing guidelines.
  • added my name to the contributors page (docs/contributors.md).
  • added the "ready_for_review" tag once the PR is ready to be reviewed.

Summary by CodeRabbit

  • New Features

    • Reverse arithmetic support for Patch-like objects (e.g., scalar/array − Patch, **, /, //, %).
    • Added a public deep-equality utility for robust comparisons of nested structures, arrays, and DataFrames.
  • Bug Fixes

    • More reliable spool equality handling for mixed and nested types.
    • Strain-rate conversions now validate step_multiple and raise clear errors for non-positive values.
    • Deprecation warnings now point to the correct call site for clearer diagnostics.
  • Tests

    • Expanded coverage for patch ops, spool equality, strain validations, and deep-equality utility.


coderabbitai bot commented Sep 20, 2025

Walkthrough

Updates warning stacklevel in CoordManager; implements explicit reverse arithmetic operators on Patch; refactors BaseSpool equality to use a new deep_equality_check utility; adds that utility; enforces positive step_multiple in strain-rate transforms; and expands tests for these behaviors.

Changes

Cohort / File(s) Summary of changes
CoordManager warning level
dascore/core/coordmanager.py
Increased deprecation warning stacklevel in CoordManager.__getitem__ from 2 to 6.
Patch reverse operators & tests
dascore/core/patch.py, tests/test_core/test_patch.py
Replaced reverse-operator aliasing with explicit __rsub__, __rpow__, __rtruediv__, __rfloordiv__, __rmod__ methods that call dascore.utils.array.apply_ufunc with appropriate NumPy ufuncs; added tests for reverse subtraction with scalar and array operands.
Spool equality refactor & tests
dascore/core/spool.py, tests/test_core/test_spool.py
Replaced in-method deep-dict/object comparison with a call to deep_equality_check; added tests exercising spool equality across mixed types, objects with __dict__, arrays, and DataFrames.
Deep equality utility & tests
dascore/utils/misc.py, tests/test_utils/test_misc.py
Added public deep_equality_check(obj1, obj2, visited=None) handling mappings, objects (__dict__), numpy arrays, pandas DataFrames, and circular references; added extensive unit tests for many edge cases.
Strain validation & tests
dascore/transform/strain.py, tests/test_transform/test_strain.py
Added validation requiring step_multiple > 0 in velocity_to_strain_rate and velocity_to_strain_rate_edgeless (raises ParameterError on non-positive); expanded tests for negative/zero/odd/even and valid even cases.

Possibly related PRs

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Title Check ✅ Passed The title "Fix spool equality, patch reverse operation" is concise and directly reflects the two principal changes in the diff (spool equality robustness and correction of patch reverse operations), making it clear for a teammate scanning history. Minor nit: "operation" could be pluralized to "operations" since several reverse operator methods changed.
Description Check ✅ Passed The PR description follows the repository template with a Description and Checklist and succinctly explains the intent: fix patch reverse operations and harden spool equality against circular references. However, the checklist items are all unchecked, no related issue or documentation links are provided, and the added tests and docstring/changelog updates are not mentioned. The core intent is clear, but the description would benefit from these small additions.
Docstring Coverage ✅ Passed Docstring coverage is 80.95% which is sufficient. The required threshold is 80.00%.


@coderabbitai coderabbitai bot added bug Something isn't working patch related to Patch class proc Related to processing module labels Sep 20, 2025

codecov bot commented Sep 20, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.92%. Comparing base (25b4b75) to head (886e3e4).
⚠️ Report is 1 commit behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #542   +/-   ##
=======================================
  Coverage   99.92%   99.92%           
=======================================
  Files         124      124           
  Lines       10336    10375   +39     
=======================================
+ Hits        10328    10367   +39     
  Misses          8        8           
Flag Coverage Δ
unittests 99.92% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
dascore/utils/misc.py (1)

842-842: Consider type consistency in base case.

Line 842 performs direct equality comparison (return obj1 == obj2) for non-dict objects. However, this might not handle numpy arrays consistently since their equality returns an array, not a boolean. Consider standardizing the return value handling similar to lines 866-872.

-        if not isinstance(obj1, dict) or not isinstance(obj2, dict):
-            # Non-dict comparison, use direct comparison
-            return obj1 == obj2
+        if not isinstance(obj1, dict) or not isinstance(obj2, dict):
+            # Non-dict comparison, handle arrays and other types
+            try:
+                equal = obj1 == obj2
+                # Handle numpy arrays and other array-like objects
+                if hasattr(equal, "all"):
+                    return equal.all()
+                return bool(equal)
+            except (ValueError, TypeError):
+                # For objects that can't be compared, fall back to False
+                return False
tests/test_utils/test_misc.py (1)

523-523: Consider using a more specific exception message.

The message "Cannot compare" is quite generic. Based on the static analysis hint, consider using a more specific error message that explains why comparison failed.

Apply this diff:

-            def __eq__(self, other):
-                raise ValueError("Cannot compare")
+            def __eq__(self, other):
+                raise ValueError("BadComparisonObj instances cannot be compared")
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 25b4b75 and e10bb6e.

📒 Files selected for processing (9)
  • dascore/core/coordmanager.py (1 hunks)
  • dascore/core/patch.py (1 hunks)
  • dascore/core/spool.py (2 hunks)
  • dascore/transform/strain.py (2 hunks)
  • dascore/utils/misc.py (1 hunks)
  • tests/test_core/test_patch.py (1 hunks)
  • tests/test_core/test_spool.py (1 hunks)
  • tests/test_transform/test_strain.py (2 hunks)
  • tests/test_utils/test_misc.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (8)
dascore/core/spool.py (1)
dascore/utils/misc.py (1)
  • deep_equality_check (789-879)
dascore/core/patch.py (1)
dascore/utils/array.py (1)
  • apply_ufunc (448-495)
dascore/utils/misc.py (2)
dascore/core/coordmanager.py (1)
  • equals (746-756)
dascore/proc/basic.py (1)
  • equals (108-153)
dascore/transform/strain.py (1)
dascore/exceptions.py (1)
  • ParameterError (26-27)
tests/test_utils/test_misc.py (1)
dascore/utils/misc.py (1)
  • deep_equality_check (789-879)
tests/test_core/test_patch.py (2)
dascore/examples.py (1)
  • get_example_patch (659-706)
dascore/core/patch.py (1)
  • data (237-239)
tests/test_core/test_spool.py (3)
dascore/examples.py (2)
  • random_spool (553-576)
  • get_example_patch (659-706)
tests/conftest.py (3)
  • random_spool (447-451)
  • patch (375-377)
  • spool (569-571)
dascore/core/spool.py (1)
  • spool (656-685)
tests/test_transform/test_strain.py (4)
tests/conftest.py (1)
  • patch (375-377)
dascore/exceptions.py (1)
  • ParameterError (26-27)
dascore/transform/strain.py (2)
  • velocity_to_strain_rate (21-121)
  • velocity_to_strain_rate_edgeless (128-210)
dascore/core/patch.py (1)
  • attrs (227-229)
🪛 Ruff (0.13.1)
tests/test_utils/test_misc.py

523-523: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)
  • GitHub Check: test_code (windows-latest, 3.10)
  • GitHub Check: test_code (windows-latest, 3.13)
  • GitHub Check: test_code (windows-latest, 3.12)
  • GitHub Check: test_code (windows-latest, 3.11)
  • GitHub Check: test_code (macos-latest, 3.11)
  • GitHub Check: test_code (ubuntu-latest, 3.12)
  • GitHub Check: test_code (ubuntu-latest, 3.13)
  • GitHub Check: test_code (macos-latest, 3.12)
  • GitHub Check: test_code (macos-latest, 3.10)
  • GitHub Check: test_code (macos-latest, 3.13)
  • GitHub Check: test_code (ubuntu-latest, 3.11)
  • GitHub Check: test_code (ubuntu-latest, 3.10)
  • GitHub Check: test_code_min_deps (windows-latest, 3.12)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.12)
  • GitHub Check: test_code_min_deps (windows-latest, 3.13)
  • GitHub Check: test_code_min_deps (macos-latest, 3.12)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.13)
🔇 Additional comments (20)
dascore/transform/strain.py (2)

102-105: LGTM! Proper input validation for step_multiple.

The addition of validation to ensure step_multiple is positive is appropriate and prevents invalid states early. The error message is clear and consistent with other parameter validation in the codebase.


180-183: LGTM! Consistent validation across both functions.

Good consistency in adding the same validation to the edgeless variant. This ensures both functions have uniform input validation.

dascore/core/spool.py (2)

32-32: Good refactor: Centralized deep equality logic.

Importing deep_equality_check from dascore.utils.misc promotes code reuse and centralizes the equality checking logic.


86-90: Clean replacement with robust equality checker.

The replacement of inline dictionary comparison with deep_equality_check is cleaner and provides better handling of complex nested structures, circular references, and special types like DataFrames and numpy arrays.

tests/test_core/test_patch.py (2)

653-662: LGTM! Important test for reverse subtraction operator.

This test validates that the new __rsub__ implementation correctly handles scalar - patch, ensuring the result is scalar - patch.data rather than patch.data - scalar. This is crucial for mathematical correctness.


663-672: LGTM! Comprehensive test with array operand.

Good addition of array-based reverse subtraction test, ensuring the operator works correctly with different operand types.

dascore/utils/misc.py (1)

789-879: Well-designed deep equality utility with robust circular reference handling.

The implementation is comprehensive and handles various edge cases well:

  • Proper circular reference detection using object IDs
  • Support for multiple data types (dicts, DataFrames, numpy arrays, objects with __dict__)
  • Clean try-finally pattern to ensure visited set cleanup

A few observations:

  1. The docstring examples are helpful and demonstrate key use cases
  2. The short-circuit on object identity (if val1 is val2) is a good optimization
  3. The fallback behavior for comparison errors is reasonable
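The id-pair bookkeeping described above can be sketched as follows; `deep_eq` is an illustrative simplification, not the dascore.utils.misc implementation:

```python
def deep_eq(a, b, visited=None):
    """Cycle-safe deep equality via a visited set of id pairs (sketch)."""
    visited = set() if visited is None else visited
    pair = (id(a), id(b))
    if pair in visited:
        # This pair is already being compared higher in the recursion,
        # so treat it as equal to break the cycle.
        return True
    visited.add(pair)
    try:
        if isinstance(a, dict) and isinstance(b, dict):
            if a.keys() != b.keys():
                return False
            return all(deep_eq(a[k], b[k], visited) for k in a)
        return bool(a == b)
    finally:
        # Clean up so sibling branches can re-compare the same objects.
        visited.discard(pair)

# Circular structures no longer recurse forever:
d1, d2 = {}, {}
d1["self"], d2["self"] = d1, d2
```

With this pattern, `deep_eq(d1, d2)` terminates and returns True instead of raising RecursionError.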
dascore/core/coordmanager.py (1)

211-211: Verify stacklevel value change for coords deprecation

CoordManager.__getitem__ (dascore/core/coordmanager.py:211) warns with stacklevel=6, which attributes the warning four frames further up the call stack; confirm this is intentional and that it points to user code rather than skipping relevant DASCore-internal call sites.
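How `stacklevel` shifts a warning's reported location can be shown with a small self-contained example (generic Python behavior, not DASCore code):

```python
import warnings

def _inner():
    # stacklevel counts frames: 1 is this line, 2 is _inner's caller,
    # 3 is the caller of that caller, and so on. Each internal wrapper
    # between user code and warnings.warn needs the level bumped by one.
    warnings.warn("deprecated", DeprecationWarning, stacklevel=3)

def _wrapper():
    _inner()

def user_code():
    _wrapper()  # with stacklevel=3 the warning is attributed to this call

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    user_code()
```

A stacklevel that over- or undershoots the user's frame is harmless at runtime but produces a misleading file/line in the warning, which is what this comment asks to verify.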

tests/test_utils/test_misc.py (4)

12-12: Good addition of pandas import for DataFrame testing.

The pandas import is correctly added to support the comprehensive DataFrame comparison tests in the new TestDeepEqualityCheck class.


20-20: Correctly importing the new deep_equality_check function.

The import aligns with the new public API function added in dascore/utils/misc.py.


387-596: Comprehensive test coverage for deep_equality_check function.

This test class provides excellent coverage of all the edge cases and code paths in the deep_equality_check function:

  • Circular reference detection and handling ✓
  • Non-dict vs dict comparisons ✓
  • Nested dictionary comparisons ✓
  • pandas DataFrame comparisons using .equals() method ✓
  • Objects with __dict__ attributes ✓
  • NumPy array comparisons ✓
  • Exception handling for objects that raise ValueError on comparison ✓
  • Mixed complex data structures ✓

The tests are well-structured and cover both positive and negative cases. The use of custom classes like BadComparisonObj effectively tests the exception handling paths.


574-587: Good edge case coverage for mixed equals method scenarios.

This test effectively covers the case where one object has an equals method (DataFrame) and the other doesn't (custom object), ensuring the comparison falls through to the general comparison path correctly.

dascore/core/patch.py (2)

150-170: Excellent implementation of explicit reverse operators.

The implementation correctly replaces the problematic aliasing approach (__rsub__ = __sub__) with explicit methods that ensure proper operand ordering. Key strengths:

  1. Correct operand order: other - self, other ** self, etc.
  2. Consistent with existing pattern: Uses dascore.utils.array.apply_ufunc like forward operators
  3. Proper documentation: Each method includes clear docstrings
  4. Complete coverage: All necessary reverse operators are implemented

This fixes the reverse operation issue mentioned in the PR objectives where the current reverse behavior was incorrect.


148-148: Verify the aliasing for commutative operations.

The __radd__ = __add__ and __rmul__ = __mul__ aliases are correct since addition and multiplication are commutative operations. This is appropriate and efficient.

Also applies to: 154-154
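Why the aliasing is safe for commutative operators but was a bug for subtraction can be checked directly (hypothetical toy class, not Patch):

```python
import numpy as np

class Wrapped:
    """Toy wrapper showing reverse-operator aliasing semantics."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def __add__(self, other):
        return self.data + other

    # Safe: a + b == b + a, so reusing __add__ for the reverse is correct.
    __radd__ = __add__

    def __sub__(self, other):
        return self.data - other

    # Buggy pattern (kept for demonstration): "other - self" silently
    # computes self - other instead.
    __rsub__ = __sub__

w = Wrapped([1, 2, 3])
ok = 10 + w     # [11, 12, 13] either way
wrong = 10 - w  # yields [-9, -8, -7] instead of [9, 8, 7]
```

This is exactly the incorrect reverse behavior the PR replaces with explicit `__rsub__`-style methods.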

tests/test_transform/test_strain.py (3)

102-129: Excellent comprehensive validation tests for step_multiple parameter.

These tests provide thorough coverage of the input validation logic added to the strain transformation functions:

  1. Negative values: Correctly tests that negative step_multiple raises ParameterError with "must be positive" message
  2. Zero values: Validates zero is also rejected with the same message
  3. Odd values: Ensures odd values still trigger the existing "must be even" validation
  4. Valid cases: Confirms positive even values work correctly and produce expected data_type

The tests align perfectly with the validation logic shown in the relevant code snippets from dascore/transform/strain.py.


201-213: Good parallel testing for edgeless variant.

The tests for velocity_to_strain_rate_edgeless correctly mirror the validation tests for the main function, ensuring both implementations have consistent input validation for the positive constraint on step_multiple.


189-200: Validate the test logic for function equivalence.

The test verifies that both strain-rate conversion functions produce equivalent results for even step multiples. This is a valuable integration test that ensures the new input validation doesn't break the existing behavior equivalence between the two approaches.

tests/test_core/test_spool.py (3)

658-792: Excellent targeted coverage for spool equality edge cases.

This new test class provides comprehensive coverage for the nuanced equality scenarios that the deep_equality_check function handles within spool comparisons:

  1. Non-dict comparisons (Lines 661-677): Tests string comparisons that should hit the non-dict comparison path
  2. Object with __dict__ differences (Lines 679-694): Tests objects with different internal state
  3. Object with __dict__ equality (Lines 696-711): Tests objects with same internal state via recursive comparison
  4. Mixed data types (Lines 713-739): Comprehensive testing with integers, lists, NumPy arrays
  5. DataFrame comparisons (Lines 740-762): Tests pandas DataFrame equality using the .equals() method
  6. Specific line coverage (Lines 764-791): Targeted tests to ensure specific code paths are exercised

The tests are well-structured and use appropriate fixtures and assertions. The approach of adding attributes to spool instances is a clever way to test the deep equality logic without interfering with the core spool functionality.


713-738: Verify NumPy array comparison behavior.

The test correctly validates that NumPy arrays with different values ([1, 2, 4] vs [1, 2, 3]) are detected as unequal. This ensures the deep_equality_check function properly handles the .all() method on NumPy array comparison results.


740-762: Good DataFrame equality testing using pandas equals() method.

The test correctly demonstrates that DataFrames with identical data are considered equal via the .equals() method, while DataFrames with different data are correctly identified as unequal. The pandas .equals() method treats NaNs in the same location as equal and compares both shape and elements, making it the appropriate comparison for DataFrames in this context.
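The DataFrame comparison pitfalls discussed in this review can be reproduced in a few lines (generic pandas/NumPy behavior, not DASCore code):

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"a": [1.0, np.nan]})
df2 = pd.DataFrame({"a": [1.0, np.nan]})

# Elementwise == returns a DataFrame, not a bool, and NaN != NaN there.
elementwise = df1 == df2        # a DataFrame of [True, False]
# Calling .all() on that DataFrame reduces per column, yielding a Series,
# which is the non-bool leak warned about below.
per_column = elementwise.all()  # a pandas Series, still not a bool

# .equals() compares shape and values and treats aligned NaNs as equal.
assert df1.equals(df2)

# For plain NumPy arrays, np.array_equal gives a single bool directly.
assert np.array_equal(np.array([1, 2, 3]), np.array([1, 2, 3]))
```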


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (6)
dascore/utils/misc.py (3)

853-856: Generalize to Mapping for broader dict-like support

This function currently restricts deep traversal to dict. Using collections.abc.Mapping covers other dict-like types (e.g., defaultdict, OrderedDict) without changing semantics.

Apply this diff:

-        if not isinstance(obj1, dict) or not isinstance(obj2, dict):
+        if not isinstance(obj1, Mapping) or not isinstance(obj2, Mapping):
             # Non-dict comparison, handle arrays and other types
             return _robust_equality_check(obj1, obj2)
@@
-            elif isinstance(val1, dict) and isinstance(val2, dict):
+            elif isinstance(val1, Mapping) and isinstance(val2, Mapping):
                 if not deep_equality_check(val1, val2, visited):
                     return False

Also applies to: 864-866


831-836: Address Ruff TRY300 (minor): add an else and normalize bool conversion

Pure style, but also makes the intent explicit and avoids returning non-bool accidentally.

Apply this diff if you don’t take the larger refactor above:

-            if hasattr(equal, "all"):
-                return equal.all()
-            return equal
+            if hasattr(equal, "all"):
+                return bool(equal.all())
+            else:
+                return bool(equal)

847-851: Cycle detection note (optional): consider symmetric pair check

Current visited pairs are ordered (id(obj1), id(obj2)). In rare symmetric cycles, you could also short-circuit when the reversed pair was visited to reduce redundant work.

Example tweak:

-    if pair_id in visited:
+    if pair_id in visited or (obj2_id, obj1_id) in visited:
         return True
tests/test_utils/test_misc.py (3)

387-658: Add a top-level pandas equality test to prevent regressions

deep_equality_check(df1, df2) (not wrapped in dicts) currently returns a non-bool without the fix above. Add a direct test to lock in correct behavior.

Apply this addition:

def test_top_level_dataframe_equality():
    df1 = pd.DataFrame({"a": [1, 2, 3]})
    df2 = pd.DataFrame({"a": [1, 2, 3]})
    assert deep_equality_check(df1, df2) is True

def test_top_level_dataframe_inequality():
    df1 = pd.DataFrame({"a": [1, 2, 3]})
    df2 = pd.DataFrame({"a": [1, 2, 4]})
    assert deep_equality_check(df1, df2) is False

546-548: Remove redundant local imports of pandas

pd is imported at the module top (Line 12). Re-importing inside tests is unnecessary.

Apply this diff:

-        import pandas as pd
+        # pd already imported at module scope
@@
-        import pandas as pd
+        # pd already imported at module scope

Also applies to: 575-577


639-645: Avoid brittle references to source line numbers in test docstrings

Docstrings referencing exact source line numbers will rot with unrelated edits. Describe behavior instead.

Apply this diff:

-    def test_non_array_equal_comparison(self):
-        """Test line 834: return equal path for non-array objects."""
+    def test_non_array_equal_comparison(self):
+        """Non-array comparison returns correct boolean for simple types."""
@@
-    def test_circular_reference_return_true(self):
-        """Test line 850: circular reference detection returns True."""
+    def test_circular_reference_return_true(self):
+        """Circular reference detection returns True without infinite recursion."""

Also applies to: 646-658

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e10bb6e and 5bfbebb.

📒 Files selected for processing (2)
  • dascore/utils/misc.py (1 hunks)
  • tests/test_utils/test_misc.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
tests/test_utils/test_misc.py (1)
dascore/utils/misc.py (1)
  • deep_equality_check (789-881)
dascore/utils/misc.py (2)
dascore/core/coordmanager.py (1)
  • equals (746-756)
dascore/proc/basic.py (1)
  • equals (108-153)
🪛 Ruff (0.13.1)
tests/test_utils/test_misc.py

523-523: Avoid specifying long messages outside the exception class

(TRY003)


617-617: Avoid specifying long messages outside the exception class

(TRY003)


631-631: Avoid specifying long messages outside the exception class

(TRY003)

dascore/utils/misc.py

835-835: Consider moving this statement to an else block

(TRY300)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)
  • GitHub Check: test_code (windows-latest, 3.13)
  • GitHub Check: test_code (macos-latest, 3.12)
  • GitHub Check: test_code (windows-latest, 3.12)
  • GitHub Check: test_code (windows-latest, 3.11)
  • GitHub Check: test_code (windows-latest, 3.10)
  • GitHub Check: test_code (macos-latest, 3.13)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.12)
  • GitHub Check: test_code (ubuntu-latest, 3.13)
  • GitHub Check: test_code (macos-latest, 3.10)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.13)
  • GitHub Check: test_code_min_deps (windows-latest, 3.13)
  • GitHub Check: test_code_min_deps (windows-latest, 3.12)
  • GitHub Check: test_code_min_deps (macos-latest, 3.13)
  • GitHub Check: test_code_min_deps (macos-latest, 3.12)
  • GitHub Check: test_code (ubuntu-latest, 3.12)
  • GitHub Check: test_code (ubuntu-latest, 3.11)
  • GitHub Check: test_code (ubuntu-latest, 3.10)
🔇 Additional comments (1)
tests/test_utils/test_misc.py (1)

375-385: Nice edge case for BytesIO pointer position

Good addition; verifies fallback reads entire buffer regardless of current file pointer.

Comment on lines 828 to 839
def _robust_equality_check(obj1, obj2):
    """Robust equality check to also handle arrays."""
    try:
        equal = obj1 == obj2
        # Handle numpy arrays and other array-like objects
        if hasattr(equal, "all"):
            return equal.all()
        return equal
    except (ValueError, TypeError):
        # For objects that can't be compared, fall back to False
        return False


⚠️ Potential issue

Always return a bool; handle pandas/NumPy explicitly in _robust_equality_check

When comparing top-level pandas objects (e.g., DataFrame vs DataFrame), obj1 == obj2 yields a DataFrame of booleans and .all() returns a Series, not a bool. That leaks a non-bool return from deep_equality_check. Also, prefer equals() for pandas and np.array_equal for NumPy.

Apply this diff to harden and normalize returns:

-    def _robust_equality_check(obj1, obj2):
-        """Robust equality check to also handle arrays."""
-        try:
-            equal = obj1 == obj2
-            # Handle numpy arrays and other array-like objects
-            if hasattr(equal, "all"):
-                return equal.all()
-            return equal
-        except (ValueError, TypeError):
-            # For objects that can't be compared, fall back to False
-            return False
+    def _robust_equality_check(obj1, obj2):
+        """Robust equality check to also handle pandas/NumPy and array-likes."""
+        # Prefer object-provided equality when available (e.g., pandas)
+        if hasattr(obj1, "equals") and hasattr(obj2, "equals"):
+            try:
+                return bool(obj1.equals(obj2))
+            except Exception:
+                return False
+        # NumPy arrays: exact structural equality
+        if isinstance(obj1, np.ndarray) and isinstance(obj2, np.ndarray):
+            try:
+                return bool(np.array_equal(obj1, obj2))
+            except Exception:
+                return False
+        # Fallback to Python/array-like equality
+        try:
+            equal = obj1 == obj2
+            # Convert array-like results to a bool
+            if hasattr(equal, "to_numpy"):  # pandas objects
+                try:
+                    return bool(np.asarray(equal.to_numpy()).all())
+                except Exception:
+                    return False
+            if hasattr(equal, "all"):  # NumPy arrays or similar
+                try:
+                    return bool(equal.all())
+                except Exception:
+                    return False
+            return bool(equal)
+        except (ValueError, TypeError):
+            # For objects that can't be compared, fall back to False
+            return False


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
dascore/utils/misc.py (1)

828-845: Strengthen _robust_equality_check for pandas/NumPy; always return bool (duplicate of earlier feedback)

Prefer equals() and np.array_equal() when available; keeps NaN semantics consistent and avoids DataFrame->Series truth pitfalls. Current code coerces to bool but still computes elementwise first.

     def _robust_equality_check(obj1, obj2):
         """Robust equality check to also handle arrays."""
-        try:
+        # Prefer object-provided/vectorized equality when available
+        if hasattr(obj1, "equals") and hasattr(obj2, "equals"):
+            try:
+                return bool(obj1.equals(obj2))
+            except Exception:
+                return False
+        if isinstance(obj1, np.ndarray) and isinstance(obj2, np.ndarray):
+            try:
+                return bool(np.array_equal(obj1, obj2))
+            except Exception:
+                return False
+        try:
             equal = obj1 == obj2
             # Handle numpy arrays and other array-like objects
             if hasattr(equal, "all"):
                 result = equal.all()
                 # Handle case where .all() returns a Series (e.g., pandas DataFrame)
                 # In such cases, call .all() again to get a boolean
                 if hasattr(result, "all"):
                     result = result.all()
                 # Ensure we return a Python bool, not numpy.bool_
                 return bool(result)
             return bool(equal)
         except (ValueError, TypeError):
             # For objects that can't be compared, fall back to False
             return False
🧹 Nitpick comments (5)
dascore/utils/misc.py (3)

886-887: Make cleanup resilient

Use discard to avoid a rare KeyError if visited were externally mutated while recursing.

-    finally:
-        visited.remove(pair_id)
+    finally:
+        visited.discard(pair_id)

789-826: Docstring claims top-level dict handling — align implementation or adjust docs

After applying the above refactor, the docstring will be accurate. If not, narrow the claim.


846-858: Optional: guard against unhashable/unsupported Mapping subtypes

If custom Mapping implementations have surprising behavior, consider collections.abc.Mapping check (already used) plus try/except around key iteration. Low risk; keep as-is if no known offenders.

tests/test_utils/test_misc.py (2)

387-684: Great coverage; add two targeted cases to lock desired semantics

  • Top-level objects with __dict__ should compare by attributes (not only when nested).
  • Top-level DataFrames with NaNs should follow DataFrame.equals (NaNs equal).
@@ class TestDeepEqualityCheck:
+    def test_top_level_object_with_dict(self):
+        class T:
+            def __init__(self, x): self.x = x
+        assert deep_equality_check(T(1), T(1)) is True
+        assert deep_equality_check(T(1), T(2)) is False
+
+    def test_top_level_dataframe_nan_semantics(self):
+        df1 = pd.DataFrame({"a": [1.0, np.nan]})
+        df2 = pd.DataFrame({"a": [1.0, np.nan]})
+        # pandas.equals treats NaNs as equal; deep_equality_check should too
+        assert deep_equality_check(df1, df2) is True

483-494: Top-level DataFrame tests: consider asserting NaN and index/dtype parity

As-is they only cover simple equality; extend to cover NaNs and differing index/dtype to mirror equals.
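A hedged illustration (hypothetical frames, not the PR's fixtures) of the parity this suggestion asks tests to cover: DataFrame.equals requires matching dtype and index, not just matching values.

```python
import pandas as pd

base = pd.DataFrame({"a": [1.0, 2.0]})
other_dtype = pd.DataFrame({"a": [1, 2]})                  # int64 vs float64
other_index = pd.DataFrame({"a": [1.0, 2.0]}, index=[1, 2])

print(base.equals(other_dtype))   # dtype mismatch: False
print(base.equals(other_index))   # index mismatch: False
print(base.equals(base.copy()))   # same values/index/dtype: True
```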

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5bfbebb and 886e3e4.

📒 Files selected for processing (2)
  • dascore/utils/misc.py (1 hunks)
  • tests/test_utils/test_misc.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
dascore/utils/misc.py (2)
dascore/core/coordmanager.py (1)
  • equals (746-756)
dascore/proc/basic.py (1)
  • equals (108-153)
tests/test_utils/test_misc.py (1)
dascore/utils/misc.py (1)
  • deep_equality_check (789-887)
🪛 Ruff (0.13.1)
tests/test_utils/test_misc.py

535-535: Avoid specifying long messages outside the exception class

(TRY003)


627-627: Avoid specifying long messages outside the exception class

(TRY003)


641-641: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)
  • GitHub Check: test_code (windows-latest, 3.12)
  • GitHub Check: test_code_min_deps (windows-latest, 3.13)
  • GitHub Check: test_code (windows-latest, 3.10)
  • GitHub Check: test_code_min_deps (windows-latest, 3.12)
  • GitHub Check: test_code (macos-latest, 3.11)
  • GitHub Check: test_code (windows-latest, 3.11)
  • GitHub Check: test_code (macos-latest, 3.13)
  • GitHub Check: test_code (macos-latest, 3.10)
  • GitHub Check: test_code (windows-latest, 3.13)
  • GitHub Check: test_code (ubuntu-latest, 3.13)
  • GitHub Check: test_code (ubuntu-latest, 3.11)
  • GitHub Check: test_code (ubuntu-latest, 3.12)
  • GitHub Check: test_code_min_deps (macos-latest, 3.13)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.13)
  • GitHub Check: test_code (ubuntu-latest, 3.10)
  • GitHub Check: test_code_min_deps (macos-latest, 3.12)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.12)
🔇 Additional comments (4)
tests/test_utils/test_misc.py (3)

12-12: LGTM: pandas import added for new equality tests


608-619: Array-like .all() path covered — nice

These verify the _robust_equality_check array branch.


634-647: Exception paths are exercised — good negative coverage

Covers ValueError/TypeError fallbacks to False.

dascore/utils/misc.py (1)

859-862: Resolved — BaseSpool.__eq__ delegates to deep_equality_check. Confirmed: dascore/core/spool.py returns deep_equality_check(my_dict, other_dict) (line 90).
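The confirmed delegation pattern can be sketched with toy stand-ins (ToySpool and the simplified helper below are hypothetical, not the actual dascore source): __eq__ hands the instances' attribute dicts to a deep equality helper instead of doing ad-hoc comparisons.

```python
from collections.abc import Mapping


def deep_equality_check(d1, d2):
    """Toy stand-in for the real helper: recursive mapping comparison."""
    if not (isinstance(d1, Mapping) and isinstance(d2, Mapping)):
        return d1 == d2
    if set(d1) != set(d2):
        return False
    return all(deep_equality_check(d1[k], d2[k]) for k in d1)


class ToySpool:
    def __init__(self, **state):
        self.__dict__.update(state)

    def __eq__(self, other):
        if not isinstance(other, ToySpool):
            return NotImplemented
        # Delegate: compare attribute dicts deeply rather than field-by-field.
        return deep_equality_check(self.__dict__, other.__dict__)


print(ToySpool(a=1, b={"c": 2}) == ToySpool(a=1, b={"c": 2}))  # True
```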

Comment on lines +859 to +862
if not isinstance(obj1, Mapping) or not isinstance(obj2, Mapping):
# Non-dict comparison, handle arrays and other types
return _robust_equality_check(obj1, obj2)


⚠️ Potential issue

Handle top-level pandas/NumPy/objects with __dict__ before fallback to avoid inconsistent semantics

Currently, top-level DataFrames/ndarrays/objects with __dict__ go through _robust_equality_check, while nested ones use equals/recursive __dict__, leading to different NaN and attribute semantics. Gate these types first, then branch to Mapping vs. fallback.

 def deep_equality_check(obj1, obj2, visited=None):
@@
-    try:
-        if not isinstance(obj1, Mapping) or not isinstance(obj2, Mapping):
-            # Non-dict comparison, handle arrays and other types
-            return _robust_equality_check(obj1, obj2)
+    try:
+        # Fast paths: identity and type-specialized equality
+        if obj1 is obj2:
+            return True
+        # Prefer object-provided equality (e.g., pandas)
+        if hasattr(obj1, "equals") and hasattr(obj2, "equals"):
+            try:
+                return bool(obj1.equals(obj2))
+            except Exception:
+                pass
+        # NumPy arrays
+        if isinstance(obj1, np.ndarray) and isinstance(obj2, np.ndarray):
+            try:
+                return bool(np.array_equal(obj1, obj2))
+            except Exception:
+                return False
+        # Objects with __dict__: compare attribute dicts deeply
+        if hasattr(obj1, "__dict__") and hasattr(obj2, "__dict__"):
+            return deep_equality_check(obj1.__dict__, obj2.__dict__, visited)
+
+        # Mapping path
+        if isinstance(obj1, Mapping) and isinstance(obj2, Mapping):
             # ...
-        if (set1 := set(obj1)) != set(obj2):
+        if (set1 := set(obj1)) != set(obj2):
             return False
         for key in set1:
             val1, val2 = obj1[key], obj2[key]
             # Check for object identity first to handle self-references
             if val1 is val2:
                 continue
             elif isinstance(val1, Mapping) and isinstance(val2, Mapping):
                 if not deep_equality_check(val1, val2, visited):
                     return False
             # this is primarily for dataframes which have equals method.
             elif hasattr(val1, "equals") and hasattr(val2, "equals"):
                 if not val1.equals(val2):
                     return False
             # Handle object comparison carefully to avoid infinite recursion
             elif hasattr(val1, "__dict__") and hasattr(val2, "__dict__"):
                 # For objects with __dict__, use recursive comparison
                 if not deep_equality_check(val1.__dict__, val2.__dict__, visited):
                     return False
             else:
                 if not _robust_equality_check(val1, val2):
                     return False
-        return True
+        # Non-mapping fallback
+        return True if isinstance(obj1, Mapping) and isinstance(obj2, Mapping) else _robust_equality_check(obj1, obj2)
🤖 Prompt for AI Agents
In dascore/utils/misc.py around lines 859 to 862, top-level pandas objects
(DataFrame/Series), NumPy ndarrays, and objects with a __dict__ are currently
falling through to _robust_equality_check which differs from how nested
instances are compared; detect and handle these types before the Mapping check:
first check for pandas DataFrame/Series and use their .equals (or appropriate
pandas comparison) to compare, then check for numpy.ndarray and compare
elementwise with NaN-aware semantics (e.g., numpy functions that treat NaNs as
equal), then if the object has a __dict__ compare their __dict__ recursively
using the same equality logic; only after these guards run, fall back to the
Mapping vs fallback branch as before.
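The NaN-aware ndarray comparison the prompt alludes to is available directly in NumPy: np.array_equal accepts equal_nan=True (NumPy >= 1.19) so NaNs in matching positions compare equal, unlike the default semantics. A short sketch:

```python
import numpy as np

a = np.array([1.0, np.nan, 3.0])
b = np.array([1.0, np.nan, 3.0])

default = np.array_equal(a, b)                    # False: NaN != NaN
nan_aware = np.array_equal(a, b, equal_nan=True)  # True

print(default, nan_aware)  # False True
```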

@d-chambers d-chambers merged commit 272d0e7 into master Sep 20, 2025
22 checks passed
@d-chambers d-chambers deleted the bug_review branch September 20, 2025 12:28
@coderabbitai coderabbitai bot mentioned this pull request Oct 4, 2025
4 tasks

Labels

bug: Something isn't working
patch: related to Patch class
proc: Related to processing module
