Waterfall refactor #574
Conversation
Walkthrough

Adds IQR-based default color scaling and robust scale handling to the waterfall plot: validates 2D patches and `scale_type`, centralizes scale logic into `_get_scale` (relative/absolute/IQR fence), introduces axis and colorbar formatting helpers, updates `waterfall()` signature/flow, and extends tests, docs, and examples.
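For intuition, here is a minimal sketch of the IQR-fence idea described in the walkthrough. The helper name and structure are illustrative only, not dascore's actual implementation (which lives in `_get_scale`):

```python
import numpy as np

def iqr_fence_limits(data, multiplier=1.5):
    """Tukey-fence color limits, clipped to the data range (hypothetical helper)."""
    q1, q3 = np.nanpercentile(data, [25, 75])
    iqr = q3 - q1
    lower = max(q1 - multiplier * iqr, np.nanmin(data))
    upper = min(q3 + multiplier * iqr, np.nanmax(data))
    return lower, upper

rng = np.random.default_rng(0)
data = rng.normal(size=(100, 100))
data[0, 0] = 1_000.0  # a single outlier no longer dominates the color scale
print(iqr_fence_limits(data))  # roughly (-2.7, 2.7) rather than (min, 1000.0)
```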
Codecov Report

✅ All modified and coverable lines are covered by tests.

```
@@           Coverage Diff           @@
##           master     #574   +/-  ##
=======================================
  Coverage   99.92%   99.92%
=======================================
  Files         128      128
  Lines       10878    10938    +60
=======================================
+ Hits        10870    10930    +60
  Misses          8        8
```
Flags with carried forward coverage won't be shown. View full report in Codecov by Sentry.
Hey @ahmadtourei, if you have a few minutes to test this out it would be appreciated. It is a very commonly used function so we need some extra testing. Hopefully it provides a big improvement, especially to new users.
Actionable comments posted: 3
🧹 Nitpick comments (6)
dascore/viz/waterfall.py (4)
75-81: Validate relative fractions are within [0, 1]

Docs state relative values are fractions of the data range. Enforce the upper bound too and clarify the error.
Apply:
```diff
-    if np.any(scale < 0) or scale[0] > scale[1] or len(scale) != 2:
+    if (
+        len(scale) != 2
+        or np.any(scale < 0)
+        or np.any(scale > 1)
+        or scale[0] > scale[1]
+    ):
         msg = (
-            "Relative scale values cannot be negative and the first "
-            f"value must be less than the second. You passed {scale}"
+            "Relative scale values must be within [0, 1], length-2, "
+            "and ordered ascending. You passed "
+            f"{scale}"
         )
         raise ParameterError(msg)
```
85-87: Absolute scalar semantics mismatch with examples

Examples claim `scale_type="absolute", scale=0.5` gives [-0.5, 0.5], but the code passes the scalar through. Either update the docs or implement symmetric limits for scalars. If you want the code to match the docs, apply:
```diff
-    # Case 4: Absolute scaling (default, no match needed)
-    # Scale values are used directly as colorbar limits
-    return scale
+    # Case 4: Absolute scaling (default)
+    # If a scalar is provided, interpret as symmetric about zero.
+    if scale_type == "absolute" and isinstance(scale, (int, float)):
+        scale = np.array([-abs(scale), abs(scale)])
+    return scale
```

Otherwise, adjust the example text to reflect current behavior.
94-102: Add strict=True to zip (Ruff B905)

Be explicit and catch mismatched lengths (we validate 2D already).
Based on static analysis hints
Apply:

```diff
-    for dim, x in zip(dims_r, ["x", "y"]):
+    for dim, x in zip(dims_r, ["x", "y"], strict=True):
```
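For context, `strict=True` (Python 3.10+) turns silent truncation into an error; a quick standalone demonstration:

```python
dims_r = ("distance", "time")
for dim, x in zip(dims_r, ["x", "y"], strict=True):
    print(dim, x)  # distance x, then time y

# With mismatched lengths, strict=True raises instead of truncating:
try:
    list(zip(("distance",), ["x", "y"], strict=True))
except ValueError as exc:
    print(exc)  # zip() argument 2 is longer than argument 1
```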
61-69: Handle zero IQR/dynamic range to avoid singular CLIM

If data are constant, limits can be equal, producing singular normalization. Add a small epsilon or fall back to [min, max] ± eps.
Example:
```diff
-    diff = q3 - q2  # Interquartile range (IQR)
+    diff = q3 - q2  # Interquartile range (IQR)
+    if not np.isfinite(diff) or diff == 0:
+        eps = 1e-12 if np.isfinite(q2) else 1.0
+        return np.asarray([q2 - eps, q3 + eps])
```

tests/test_viz/test_waterfall.py (2)
195-207: Add a case for relative values > 1

If relative fractions must be within [0, 1], add a test to assert values > 1 raise ParameterError.
Apply:
```diff
     # More than two values
     with pytest.raises(ParameterError, match=msg):
         random_patch.viz.waterfall(scale=(0.1, 0.2, 0.9), scale_type="relative")
+
+    # Values greater than 1
+    with pytest.raises(ParameterError, match=msg):
+        random_patch.viz.waterfall(scale=(0.1, 1.1), scale_type="relative")
```
160-166: Consider asserting absolute scalar semantics (doc vs code)

Docs suggest `scale_type="absolute", scale=<scalar>` yields symmetric [-v, v]. If you adopt that behavior, add an assertion here to lock it in.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- dascore/viz/waterfall.py (3 hunks)
- tests/test_viz/test_waterfall.py (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
tests/test_viz/test_waterfall.py (4)
- dascore/exceptions.py (1): ParameterError (26-27)
- dascore/viz/waterfall.py (1): waterfall (120-210)
- dascore/proc/coords.py (2): select (433-525), squeeze (691-721)
- dascore/core/coords.py (5): select (323-329), select (1015-1035), select (1167-1192), select (1338-1364), select (1449-1474)

dascore/viz/waterfall.py (4)
- dascore/exceptions.py (1): ParameterError (26-27)
- dascore/core/patch.py (4): dims (212-226), data (272-283), attrs (239-252), coords (255-269)
- dascore/utils/plotting.py (5): _get_dim_label (14-19), _format_time_axis (84-103), _get_ax (29-33), _get_cmap (22-26), _get_extents (36-81)
- dascore/units.py (1): get_quantity_str (200-224)
🪛 Ruff (0.14.0)
dascore/viz/waterfall.py
94-94: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
dascore/viz/waterfall.py (outdated)
```python
def _get_scale(scale, scale_type, patch):
    """
    Calculate the color bar scale limits based on scale and scale_type.
    """
    _validate_scale_type(scale_type)
    data = patch.data
    modifier = 1
    if scale_type == "relative":
        modifier = 0.5 * (np.nanmax(data) - np.nanmin(data))
    # only one scale parameter provided, center around mean
    if isinstance(scale, float | int):
        mean = np.nanmean(patch.data)
        scale = np.asarray([mean - scale * modifier, mean + scale * modifier])
    im.set_clim(scale)
    ...
    match (scale, scale_type):
```
Scale must be computed on the plotted data (fixes log=True CLIM mismatch)
Currently _get_scale uses patch.data, but the image may be log-transformed. This makes CLIM inconsistent for log=True. Compute scale from the actual data passed to imshow and ignore non-finite values.
Apply:
```diff
-def _get_scale(scale, scale_type, patch):
+def _get_scale(scale, scale_type, data):
 @@
-    _validate_scale_type(scale_type)
-    data = patch.data
+    _validate_scale_type(scale_type)
+    # operate on the plotted data and ignore non-finite values
+    d = np.asarray(data)
+    finite_mask = np.isfinite(d)
+    if not np.any(finite_mask):
+        return scale
+    data = d[finite_mask]
```

And update the call site:

```diff
-    scale = _get_scale(scale, scale_type, patch)
+    scale = _get_scale(scale, scale_type, data)
```

Optionally, add a small epsilon fallback if min==max or IQR==0 to avoid singular CLIM.
Also applies to: 197-200
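To see why this matters, a small standalone illustration (plain NumPy, independent of dascore): limits taken from the raw array are meaningless for a log image, and zeros become `-inf` that must be excluded:

```python
import numpy as np

data = np.array([[0.0, 1.0], [10.0, 100.0]])
plotted = np.log10(np.absolute(data))    # [[-inf, 0.], [1., 2.]] (warns on log10(0))
finite = plotted[np.isfinite(plotted)]
print(finite.min(), finite.max())        # 0.0 2.0   -> sensible CLIM for the log image
print(np.nanmin(data), np.nanmax(data))  # 0.0 100.0 -> wrong limits for a log plot
```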
CodSpeed Performance Report

Merging #574 will degrade performance by 29.41%.
✅ Documentation built.
Sure, I'll test it today or tomorrow.
```python
sp_lf_merged = sp_lf.chunk(time=None, conflict="keep_first")
pa_lf_merged = sp_lf_merged[0]
# Visualize the results. Try different scale values for better visualization.
```
Would it be helpful to keep this comment, as it's still true?
@d-chambers |
Ya, that is unfortunate, but I am still not sure how we can handle it besides perhaps raising an issue with matplotlib. Have you had any other thoughts? For the record, I haven't run into it, but I have been looking at minutes of data with sample rates up to 500 Hz lately, so maybe that's too small to show the effect you document?
Same here, I'm not sure how we can tackle that issue. Maybe we should keep the matplotlib version below 3.10 for now to ensure reproducibility for the users? That's what I'm doing for my env :) But that may cause dependency issues as you know. I ran into the performance issue I mentioned only for one of the datasets I was looking at (the VA mine), which is in patches of ~255 sec with a 2000 Hz sampling rate (~2 GB per patch).
I dug into this a bit more; it appears there were some changes in data interpolation from 3.9 to 3.10, which you can see here. So we can simply set these parameters to the old defaults in `imshow`.

With these parameters set, matplotlib 3.10 now produces this image with the code provided in #512:
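For reference, the change amounts to passing the old defaults explicitly to `imshow`, along these lines (a sketch; the exact values dascore adopted are in the waterfall.py diff):

```python
import matplotlib.pyplot as plt
import numpy as np

data = np.random.default_rng(0).normal(size=(2000, 4000))
fig, ax = plt.subplots()
im = ax.imshow(
    data,
    aspect="auto",
    interpolation="antialiased",  # pre-3.10 default
    interpolation_stage="data",   # interpolate before colormapping, as pre-3.10
)
fig.colorbar(im, ax=ax)
plt.show()
```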
Actionable comments posted: 1
♻️ Duplicate comments (3)
dascore/viz/waterfall.py (3)
220-258: Verify the scale computation fix is applied consistently

Once the critical issue with `_get_scale` is fixed (passing `data` instead of `patch`), ensure line 233 uses the transformed data:

```python
scale = _get_scale(scale, scale_type, data)
```

This ensures the scale is computed on the actual plotted data (after log transform if `log=True`).

Based on learnings
180-182: Fix documentation: Fractions of range, not percentiles

The past review comment is still valid. Relative scaling with a sequence maps fractions of the [min, max] range, not percentiles. Line 90 uses `dmin + scale * data_range`, which is linear interpolation, not `np.percentile`. Apply this correction:

```diff
-    >>> # Use relative scaling with a tuple to show the middle 80% of data range
-    >>> # Scale values of (0.1, 0.9) map to 10th and 90th percentile of data
+    >>> # Use relative scaling with a tuple to show the middle 80% of the data range
+    >>> # Scale values of (0.1, 0.9) map to 10% and 90% of the data range
```

Based on learnings
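A quick numeric check makes the distinction concrete; fractions of the range and percentiles only coincide for uniformly spread data:

```python
import numpy as np

data = np.array([0.0, 1.0, 2.0, 3.0, 100.0])
dmin, dmax = data.min(), data.max()
# Fractions of the [min, max] range (what the code does):
frac = dmin + np.array([0.1, 0.9]) * (dmax - dmin)
# Statistical percentiles (what the doc incorrectly claimed):
pct = np.percentile(data, [10, 90])
print(frac)  # [10. 90.]
print(pct)   # [ 0.4 61.2]
```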
49-98: Critical: Scale computed on wrong data when log=True

The past review comment remains valid and must be addressed. Currently `_get_scale` computes scale on `patch.data` (line 54), but when `log=True`, the plotted data is `log10(abs(data))` (line 226). This causes incorrect colorbar limits for logarithmic plots.

The scale must be computed on the actual data passed to `imshow`. Apply this fix:

```diff
-def _get_scale(scale, scale_type, patch):
+def _get_scale(scale, scale_type, data):
     """
     Calculate the color bar scale limits based on scale and scale_type.
     """
     _validate_scale_type(scale_type)
-    data = patch.data
+    # Ensure we work with the actual data array and handle non-finite values
+    data = np.asarray(data)
```

And update the call site at line 233:

```diff
-    scale = _get_scale(scale, scale_type, patch)
+    # Compute scale on the data that will actually be plotted
+    scale = _get_scale(scale, scale_type, data)
```

This ensures colorbar limits match the visualized data range.
Based on learnings
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- dascore/viz/waterfall.py (5 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
dascore/viz/waterfall.py (4)
- dascore/exceptions.py (1): ParameterError (26-27)
- dascore/proc/coords.py (2): get_coord (96-148), get_array (151-200)
- dascore/utils/plotting.py (5): _get_dim_label (14-19), _format_time_axis (84-103), _get_ax (29-33), _get_cmap (22-26), _get_extents (36-81)
- dascore/units.py (1): get_quantity_str (200-224)
🔇 Additional comments (7)
dascore/viz/waterfall.py (7)
12-12: LGTM: Clean imports and well-documented constant

The addition of `ParameterError` and the public `IQR_FENCE_MULTIPLIER` constant aligns well with the refactor goals. The 1.5 multiplier is the standard Tukey fence value.

Also applies to: 23-23
26-32: LGTM: Clear validation with informative error

The explicit validation and ParameterError align with the PR objective of replacing assertions with explicit exceptions.
34-47: LGTM: Robust dimension validation

The defensive approach of attempting to squeeze before raising an error is user-friendly and the error message is informative.
100-113: LGTM: Selective Y-axis inversion preserves seismic convention

The conditional inversion (only for time-like Y axes) successfully addresses the PR objective while maintaining flexibility. Users can still manually invert via the returned axis if needed.
115-127: LGTM: Clean colorbar implementation

The colorbar labeling correctly handles both regular and logarithmic scales with appropriate unit formatting.
130-138: LGTM: Well-typed function signature

The type annotations improve code clarity and enable better IDE support.
234-246: LGTM: Thoughtful handling of matplotlib version differences

The explicit `interpolation` and `interpolation_stage` parameters address the performance regression with matplotlib 3.10+ mentioned in the PR discussion, while maintaining visual consistency.
Awesome! I agree, the imshow parameters in versions before 3.10 work better with DAS (and probably any multichannel seismic) data. Thanks for working on it @d-chambers!
Actionable comments posted: 0
♻️ Duplicate comments (1)
dascore/viz/waterfall.py (1)
179-181: Fix documentation: sequence scaling uses fractions, not percentiles.

The comment incorrectly states that scale values map to percentiles. With relative scaling, `(0.1, 0.9)` maps to 10% and 90% of the `[data_min, data_max]` range (fractions), not the 10th and 90th percentiles. Apply this diff:

```diff
-    >>> # Use relative scaling with a tuple to show the middle 80% of data range
-    >>> # Scale values of (0.1, 0.9) map to 10th and 90th percentile of data
+    >>> # Use relative scaling with a tuple to show a specific fraction of data range
+    >>> # Scale values of (0.1, 0.9) map to 10% and 90% of the [data_min, data_max] range
     >>> _ = patch.viz.waterfall(scale=(0.1, 0.9), scale_type="relative")
```
🧹 Nitpick comments (1)
dascore/viz/waterfall.py (1)
67-70: Fix misleading variable names in IQR calculation.

The variables are named `q2, q3` but represent the 25th and 75th percentiles (Q1 and Q3), not Q2 (median) and Q3. While the logic is correct, the naming is confusing. Apply this diff to improve clarity:

```diff
-        case (None, "relative"):
-            q2, q3 = np.nanpercentile(data, [25, 75])
+        case (None, "relative"):
+            q1, q3 = np.nanpercentile(data, [25, 75])
             dmin, dmax = np.nanmin(data), np.nanmax(data)
-            diff = q3 - q2  # Interquartile range (IQR)
-            q_lower = np.nanmax([q2 - diff * IQR_FENCE_MULTIPLIER, dmin])
+            diff = q3 - q1  # Interquartile range (IQR)
+            q_lower = np.nanmax([q1 - diff * IQR_FENCE_MULTIPLIER, dmin])
             q_upper = np.nanmin([q3 + diff * IQR_FENCE_MULTIPLIER, dmax])
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- dascore/viz/waterfall.py (5 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
dascore/viz/waterfall.py (4)
- dascore/exceptions.py (1): ParameterError (26-27)
- dascore/core/patch.py (6): ndim (229-231), dims (212-226), data (272-283), dtype (318-320), attrs (239-252), coords (255-269)
- dascore/proc/coords.py (2): squeeze (691-721), get_coord (96-148)
- dascore/utils/plotting.py (5): _get_dim_label (14-19), _format_time_axis (84-103), _get_ax (29-33), _get_cmap (22-26), _get_extents (36-81)
🔇 Additional comments (8)
dascore/viz/waterfall.py (8)
12-12: LGTM! Clean imports and well-defined constant.

The `ParameterError` import and `IQR_FENCE_MULTIPLIER` constant are appropriate additions. The 1.5 multiplier is the standard value for Tukey's fence and matches the updated documentation.

Also applies to: 23-23
26-31: LGTM! Clear and effective validation.

The validation logic is straightforward with an informative error message.
34-46: LGTM! Robust dimension validation with helpful fallback.

The automatic squeeze attempt improves usability, and the error message provides clear diagnostic information.
49-96: Excellent refactor! Scale computed on actual plotted data.

The refactored `_get_scale` function now receives the transformed `data` directly, which correctly addresses the previous concern about CLIM mismatch when `log=True`. The scale is now computed on the actual data that will be plotted.
99-111: LGTM! Smart conditional Y-axis inversion.

The logic correctly implements time-aware axis flipping as described in the PR objectives: inverting only when the Y-axis is time-like, which preserves seismic convention without affecting other coordinate types.
114-125: LGTM! Clean colorbar formatting with proper log scale indication.

The label construction handles edge cases gracefully and clearly indicates logarithmic scaling when applicable.
233-245: Good addition of matplotlib 3.10 compatibility parameters.

The explicit `interpolation` and `interpolation_stage` parameters address the performance regression and visual differences reported in issue #512 with matplotlib 3.10.
219-257: Well-structured refactor with clean separation of concerns.

The updated `waterfall` function effectively delegates to the new helper functions, improving maintainability. The flow correctly validates inputs, transforms data when needed, computes appropriate scales, and applies formatting.
Actionable comments posted: 1
♻️ Duplicate comments (2)
dascore/viz/waterfall.py (2)
177-186: Fix doc examples: tuple vs scalar mismatch; fractions vs percentiles wording
- The comment says “tuple” but the call uses a scalar.
- Relative tuple maps fractions of [min, max], not percentiles. This was flagged earlier.
Apply:
```diff
-    >>> # Use relative scaling with a tuple to show a specific fraction
-    >>> # of data range. Scale values of (0.1, 0.9) map to 10% and 90%
-    >>> # of the [data_min, data_max] range data's dynamic range
-    >>> _ = patch.viz.waterfall(scale=0.1, scale_type="relative")
+    >>> # Use relative scaling with a tuple to show a specific fraction
+    >>> # of the data range. (0.1, 0.9) maps to 10% and 90% of [min, max]
+    >>> _ = patch.viz.waterfall(scale=(0.1, 0.9), scale_type="relative")
 @@
-    >>> # Use relative scaling with a tuple to show the middle 80% of data range
-    >>> # Scale values of (0.1, 0.9) map to 10th and 90th percentile of data
+    >>> # Use relative scaling with a tuple to show the middle 80% of the data range
+    >>> # (0.1, 0.9) map to 10% and 90% of the [min, max] data range
```
49-76: Handle non-finite values in scale computation and add constant-data fallback

Log transforms produce ±inf; NaNs may also exist. Percentiles/min/max should ignore non-finite values, and identical limits should be widened slightly to avoid singular CLIM.
Apply:
```diff
 def _get_scale(scale, scale_type, data):
 @@
-    _validate_scale_type(scale_type)
-    # This ensures we have a list of the previous scale parameters.
-    scale = maybe_convert_percent_to_fraction(scale)
+    _validate_scale_type(scale_type)
+    # Convert percent inputs and operate on finite plotted data only.
+    scale = maybe_convert_percent_to_fraction(scale)
+    d = np.asarray(data)
+    finite = np.isfinite(d)
+    if not np.any(finite):
+        return scale  # nothing finite; leave scale unchanged
+    data = d[finite]
 @@
-            q1, q3 = np.nanpercentile(data, [25, 75])
-            dmin, dmax = np.nanmin(data), np.nanmax(data)
+            q1, q3 = np.nanpercentile(data, [25, 75])
+            dmin, dmax = np.nanmin(data), np.nanmax(data)
             diff = q3 - q1  # Interquartile range (IQR)
             q_lower = np.nanmax([q1 - diff * IQR_FENCE_MULTIPLIER, dmin])
             q_upper = np.nanmin([q3 + diff * IQR_FENCE_MULTIPLIER, dmax])
-            scale = np.asarray([q_lower, q_upper])
+            scale = np.asarray([q_lower, q_upper])
+            # Fallback for constant/near-constant data or degenerate fence.
+            if not np.all(np.isfinite(scale)) or scale[0] >= scale[1]:
+                eps = np.finfo(float).eps * max(1.0, abs(dmin))
+                scale = np.asarray([dmin - eps, dmax + eps])
     return scale
```
🧹 Nitpick comments (3)
dascore/units.py (1)
360-411: Function implementation looks good with one minor note.

The function correctly converts percentage quantities to fractions and handles various input types. The logic is sound and well-documented.
Minor note on Line 404: The `isinstance(obj, Quantity | Unit)` check includes `Unit`, but `Unit` objects may not have an `ndim` attribute, which could raise an `AttributeError` if a `Unit` is passed. However, based on the docstring and examples, the function appears designed for `Quantity` objects and plain values, so this is unlikely to be encountered in practice. Consider either narrowing the check to just `Quantity` or documenting that `Unit` objects are not supported, for clarity.

dascore/viz/waterfall.py (2)
101-114: Treat timedelta axes as time-like for Y inversion

Y inversion triggers only for datetime64. Timedelta-based time axes should invert too to match the stated behavior.
Apply:
```diff
-    if np.issubdtype(coord.dtype, np.datetime64):
-        _format_time_axis(ax, dim, x)
-        # Invert the y axis so origin is at the top. This follows the
-        # convention for seismic shot gathers where time increases downward.
-        if x == "y":
-            ax.invert_yaxis()
+    if np.issubdtype(coord.dtype, np.datetime64) or np.issubdtype(coord.dtype, np.timedelta64):
+        if np.issubdtype(coord.dtype, np.datetime64):
+            _format_time_axis(ax, dim, x)
+        # Invert the y axis so origin is at the top for time-like axes.
+        if x == "y":
+            ax.invert_yaxis()
```
120-127: Safeguard colorbar label when attrs key is missing

Accessing `attrs["data_type"]` can raise KeyError. Use `.get` with a default. Apply:

```diff
-    data_type = str(patch.attrs["data_type"])
+    data_type = str(patch.attrs.get("data_type", ""))
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- dascore/proc/taper.py (0 hunks)
- dascore/units.py (2 hunks)
- dascore/viz/waterfall.py (5 hunks)
- tests/test_units.py (2 hunks)
- tests/test_viz/test_waterfall.py (3 hunks)
💤 Files with no reviewable changes (1)
- dascore/proc/taper.py
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/test_viz/test_waterfall.py
🧰 Additional context used
🧬 Code graph analysis (3)
tests/test_units.py (1)
- dascore/units.py (2): maybe_convert_percent_to_fraction (360-411), get_quantity (83-108)

dascore/units.py (1)
- dascore/utils/misc.py (1): iterate (257-269)

dascore/viz/waterfall.py (4)
- dascore/exceptions.py (1): ParameterError (26-27)
- dascore/units.py (2): get_quantity_str (200-224), maybe_convert_percent_to_fraction (360-411)
- dascore/core/patch.py (5): ndim (229-231), dims (212-226), data (272-283), attrs (239-252), coords (255-269)
- dascore/utils/plotting.py (5): _get_dim_label (14-19), _format_time_axis (84-103), _get_ax (29-33), _get_cmap (22-26), _get_extents (36-81)
🔇 Additional comments (4)
tests/test_units.py (2)
20-20: LGTM!

The import is necessary for the new test class and correctly imports from `dascore.units`.
359-419: Excellent test coverage!

The test class comprehensively covers the function's behavior including edge cases (0%, 250%, fractional percentages), mixed inputs, and no-op scenarios for non-percentage values. The use of `np.isclose` for floating point comparisons in `test_fractional_percentage` is appropriate.
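For readers skimming the tests, the intended behavior is roughly the following. These calls are inferred from the test descriptions above and are illustrative only; the exact return shapes follow the implementation in dascore/units.py:

```python
from dascore.units import get_quantity, maybe_convert_percent_to_fraction

# Percent quantities become plain fractions; non-percent values pass through.
for value in (get_quantity("10%"), get_quantity("250%"), 0.3, None):
    print(repr(value), "->", maybe_convert_percent_to_fraction(value))
```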
dascore/units.py (1)

17-17: LGTM!

The `iterate` import is necessary for the new `maybe_convert_percent_to_fraction` function.

dascore/viz/waterfall.py (1)
242-248: No issues found: `interpolation="antialiased"` is valid in matplotlib 3.10+

The "antialiased" value is a supported interpolation argument in matplotlib 3.10+, and using it restores the pre-3.10 visual behavior for large rasters. The code implementation is correct.
Actionable comments posted: 1
♻️ Duplicate comments (1)
dascore/viz/waterfall.py (1)
184-186: Fix doc: Scale values represent fractions of range, not percentiles.

Lines 185-186 still incorrectly state that scale values "map to 10th and 90th percentile". This was flagged previously but the text remains incorrect. The relative scale maps fractions of the [data_min, data_max] range, not statistical percentiles. Apply this diff:

```diff
-    >>> # Use relative scaling with a tuple to show the middle 80% of data range
-    >>> # Scale values of (0.1, 0.9) map to 10th and 90th percentile of data
+    >>> # Use relative scaling with a tuple to show the middle 80% of data range
+    >>> # Scale values of (0.1, 0.9) map to 10% and 90% of [data_min, data_max]
     >>> _ = patch.viz.waterfall(scale=(0.1, 0.9), scale_type="relative")
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- dascore/units.py (2 hunks)
- dascore/viz/waterfall.py (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- dascore/units.py
🧰 Additional context used
🧬 Code graph analysis (1)
dascore/viz/waterfall.py (6)
- dascore/exceptions.py (1): ParameterError (26-27)
- dascore/units.py (2): get_quantity_str (200-224), maybe_convert_percent_to_fraction (360-411)
- dascore/utils/time.py (1): dtype_time_like (415-425)
- dascore/core/patch.py (6): ndim (229-231), dims (212-226), data (272-283), dtype (318-320), attrs (239-252), coords (255-269)
- dascore/proc/coords.py (1): squeeze (691-721)
- dascore/utils/plotting.py (5): _get_dim_label (14-19), _format_time_axis (84-103), _get_ax (29-33), _get_cmap (22-26), _get_extents (36-81)
🔇 Additional comments (8)
dascore/viz/waterfall.py (8)
12-13: LGTM!

The new imports are appropriate and used correctly throughout the file for parameter validation, unit handling, and time axis detection.
Also applies to: 22-22
24-24: LGTM!

The constant follows the standard Tukey fence convention and is well-named.

27-32: LGTM!

The validation function is clear and raises an appropriate exception with a helpful message.

35-47: LGTM!

The validation function correctly handles dimension checking with a helpful squeeze attempt before raising an error. The error message provides useful diagnostic information.

102-113: LGTM!

The function correctly handles axis formatting and applies Y-axis inversion only for time-like coordinates, which aligns with the PR objectives and seismic convention.

116-127: LGTM!

The colorbar creation and labeling logic is correct and handles edge cases appropriately.

131-139: LGTM!

The function signature now has complete type annotations that accurately reflect the parameter and return types.
224-258: LGTM! Implementation correctly addresses prior review concerns.

The refactored implementation:
- Computes scale on the plotted data (line 235), fixing the log=True CLIM mismatch issue.
- Guards `set_clim` with proper validation (lines 249-250), preventing errors from invalid scale values.
- Adds matplotlib 3.10 compatibility parameters (lines 246-247) to address the performance regression noted in PR comments.
The control flow is clear and the new helper functions improve maintainability.
Actionable comments posted: 0
🧹 Nitpick comments (3)
dascore/viz/waterfall.py (3)
60-76: Consider handling degenerate scale edge cases.

When data is constant (min==max) or when scale=0.0 is passed for relative scaling, the computed scale becomes [mean, mean], which can produce a degenerate colorbar. While these are arguably user errors, adding a small epsilon or early validation could improve robustness.
Optional enhancement:
case (scale, "relative") if len(scale) == 1: scale = scale[0] + if scale == 0: + msg = "Relative scale value of 0 would produce degenerate colorbar limits" + raise ParameterError(msg) mod = 0.5 * (np.nanmax(data) - np.nanmin(data)) + if mod == 0: + # Constant data, use small epsilon to avoid degenerate limits + mod = 1e-10 mean = np.nanmean(data) scale = np.asarray([mean - scale * mod, mean + scale * mod])
177-180: Minor: Simplify redundant phrasing in comment.

Line 179 contains redundant text: "of the [data_min, data_max] range data's dynamic range".
Apply this diff:
```diff
-    >>> # of the [data_min, data_max] range data's dynamic range
+    >>> # of the data's [min, max] range
```
242-247: Nitpick: Minor formatting in issue reference.

Line 244 references "# 512" with a space. Consider using "#512" for consistency with typical GitHub issue reference format.
Apply this diff:
```diff
-    # like matplotlib < 3.9, which tends to work better for visualizing large
-    # DAS data. See # 512. We might consider making these parameters of the
+    # like matplotlib < 3.9, which tends to work better for visualizing large
+    # DAS data. See #512. We might consider making these parameters of the
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- dascore/viz/waterfall.py (5 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
dascore/viz/waterfall.py (6)
- dascore/exceptions.py (1): ParameterError (26-27)
- dascore/units.py (2): get_quantity_str (200-224), maybe_convert_percent_to_fraction (360-411)
- dascore/utils/time.py (1): dtype_time_like (415-425)
- dascore/core/patch.py (6): ndim (229-231), dims (212-226), data (272-283), dtype (318-320), attrs (239-252), coords (255-269)
- dascore/proc/coords.py (1): get_coord (96-148)
- dascore/utils/plotting.py (5): _get_dim_label (14-19), _format_time_axis (84-103), _get_ax (29-33), _get_cmap (22-26), _get_extents (36-81)
🔇 Additional comments (7)
dascore/viz/waterfall.py (7)
27-33: LGTM: Clear validation with explicit exception.

The validation logic correctly enforces the two valid scale types and provides a helpful error message.

35-47: LGTM: Robust dimension validation with squeeze fallback.

The function appropriately attempts to squeeze degenerate dimensions before raising an error, improving usability.

102-113: LGTM: Y-axis inversion correctly restricted to time-like coordinates.

The implementation properly checks for time-like data types before inverting the y-axis, aligning with the PR objective to preserve seismic convention only when appropriate.

116-127: LGTM: Colorbar formatting handles attribute variations gracefully.

The label construction logic correctly handles missing or empty data_type/data_units attributes and appropriately indicates log scaling.

224-236: Good: Scale computed on transformed data (addresses past review).

The current implementation correctly computes the scale on the log-transformed data (line 230 transforms, line 236 uses it for scale), which addresses the concern from a previous review about log=True CLIM consistency.
Optional enhancement for inf value handling:
```diff
     data = np.log10(np.absolute(patch.data)) if log else patch.data
+    # Filter inf from log(0) if present, as they can skew colorbar limits
+    if log:
+        data = np.where(np.isfinite(data), data, np.nan)
```

This would treat log(0) = -inf as NaN, preventing it from dominating the colorbar limits.
236-250: Good: CLIM guard prevents errors with invalid scale values.

The guard at line 249 robustly checks that scale is a valid 2-element sequence with finite values before calling `set_clim`, preventing potential errors. This addresses the concern from a previous review.
50-99: The review comment misidentifies the order of operations. The code already correctly computes the scale on log-transformed data when `log=True`.

At line 228, `data` is reassigned to the log-transformed version (`np.log10(np.absolute(patch.data))` when `log=True`). Then at line 234, `_get_scale` is called with this already-transformed `data`. The scale is computed on the correct data, and no fix is needed.

Likely an incorrect or invalid review comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
dascore/viz/waterfall.py (1)
232-266: Implementation is solid; consider documenting log transform behavior with zeros.

The refactored waterfall implementation correctly validates inputs, transforms data before scale computation (addressing past review concerns), and guards clim setting. The flow is clear and modular.
One optional consideration: When `log=True`, line 237 applies `np.log10(np.absolute(patch.data))`, which produces `-inf` for zero values. While the `set_clim` guard on line 257 prevents invalid scale limits from being applied, the plot may still display `-inf` values, potentially causing visualization artifacts.

Consider adding a note in the docstring for the `log` parameter mentioning that zero values will result in `-inf` and suggesting users pre-process data if needed (e.g., masking or adding a small epsilon). For example:

```diff
     log
-        If True, visualize the common logarithm of the absolute values of patch data.
+        If True, visualize the common logarithm of the absolute values of patch data.
+        Note: Zero values will result in -inf; consider masking or preprocessing if needed.
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- dascore/viz/waterfall.py (5 hunks)
- tests/test_viz/test_waterfall.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/test_viz/test_waterfall.py
🧰 Additional context used
🧬 Code graph analysis (1)
dascore/viz/waterfall.py (5)
- dascore/exceptions.py (1): ParameterError (26-27)
- dascore/units.py (2): get_quantity_str (200-224), maybe_convert_percent_to_fraction (360-411)
- dascore/utils/time.py (1): dtype_time_like (415-425)
- dascore/core/patch.py (6): ndim (229-231), dims (212-226), data (272-283), dtype (318-320), attrs (239-252), coords (255-269)
- dascore/utils/plotting.py (5): _get_dim_label (14-19), _format_time_axis (84-103), _get_ax (29-33), _get_cmap (22-26), _get_extents (36-81)
🔇 Additional comments (8)
dascore/viz/waterfall.py (8)
1-24: LGTM: Imports and constants are well-structured.

The new imports (`ParameterError`, `maybe_convert_percent_to_fraction`, `dtype_time_like`) support the refactored validation and scaling logic, and `IQR_FENCE_MULTIPLIER` is appropriately defined as a public constant.
27-32: LGTM: Clean validation with clear error messaging.

The function correctly validates `scale_type` and raises an appropriate exception.
35-47: LGTM: Robust dimension validation with helpful fallback.

The function correctly validates 2D requirements, attempts to squeeze degenerate dimensions, and provides clear error context.

110-122: LGTM: Axis formatting correctly implements time-aware Y-axis inversion.

The function properly formats labels and selectively inverts the Y-axis only when it's time-like, which aligns with the PR objective to preserve seismic convention while avoiding unnecessary flips for non-time coordinates.

124-135: LGTM: Colorbar construction handles edge cases gracefully.

The function correctly formats colorbar labels with appropriate unit handling and log notation.
244-256: Excellent: Matplotlib 3.10 compatibility fix is well-documented.

The explicit `interpolation` and `interpolation_stage` parameters address the performance regression with matplotlib 3.10 noted in the PR discussion. The inline comment clearly documents the rationale and references issue #512.
179-231: Excellent: Comprehensive documentation with clear breaking change notice.

The docstring provides extensive examples covering all scaling modes and includes a detailed note about the version 0.1.13 behavior change. This will help users understand and adapt to the new default IQR-based scaling.
50-107: None handling verified correct; empty data edge case is defensively handled.

The None-to-empty-list conversion works as expected: `iterate(None)` returns an empty tuple, causing `maybe_convert_percent_to_fraction(None)` to yield `[]`, which matches the `case ([], "relative")` pattern on line 77 and activates the IQR fence logic.

For empty data arrays, while `np.nanmin()` and `np.nanmax()` would return NaN that propagates through calculations, the defensive check at line 257 prevents `set_clim()` from being called with invalid values. Additionally, empty data arrays in a 2D patch would be anomalous and unlikely in practice.

The scale computation is correct and handles these edge cases appropriately.
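The dispatch described here relies on structural pattern matching; a self-contained sketch of the pattern (hypothetical return strings, not dascore's actual code):

```python
def describe_scale(scale, scale_type):
    match (scale, scale_type):
        case ([], "relative"):
            return "no scale given: apply the IQR fence default"
        case ([v], "relative"):
            return f"center on the mean with half-width {v} of the data range"
        case ([lo, hi], "relative"):
            return f"map fractions ({lo}, {hi}) of [min, max]"
        case (_, "absolute"):
            return "use the values directly as colorbar limits"

print(describe_scale([], "relative"))
print(describe_scale([0.2], "relative"))
print(describe_scale([0.1, 0.9], "relative"))
print(describe_scale([-1.0, 1.0], "absolute"))
```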

Description
This PR refactors the `viz.waterfall` function. It does a few things:

- Fixes typos, clarifies the docs, and does some minor refactoring of internal functions.
- Changes the default behavior when scale=None to apply the standard statistical fence to select the color range rather than using the whole data range. This will eliminate, or at least alleviate, the need for constant fiddling with the scale parameter when there are a few outliers in the data (very common). The old behavior can be re-created using scale=1.0 (see the sketch after this list).
- Replaces the assertions with proper exception raises for malformed parameters.
- Makes the automatic Y axis flip only occur when the Y axis of the patch is time-like. This still matches seismic convention but stops flipping for other types of coordinates. The axis can always be manipulated directly via the returned matplotlib axis.
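A quick usage sketch of the new default and the opt-out, assuming dascore's example patch helper:

```python
import dascore as dc

patch = dc.get_example_patch()

# New default: IQR-fence color limits, robust to a few outliers.
ax = patch.viz.waterfall()

# Recover the old full-range behavior explicitly (per the PR description).
ax = patch.viz.waterfall(scale=1.0)
```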