Improve the performance of tiled decompression by Cadair · Pull Request #14430 · astropy/astropy

Cadair · 2023-02-22T15:45:49Z

This is a work in progress for the moment as we continue to iterate on the performance hot spots.

github-actions · 2023-02-22T15:46:24Z

github-actions · 2023-02-22T15:46:35Z

👋 Thank you for your draft pull request! Do you know that you can use [ci skip] or [skip ci] in your commit messages to skip running continuous integration tests until you are ready?

astrofrog · 2023-02-22T15:48:48Z

We should try to rewrite `_iter_array_tiles` so that it does not use NumPy, as I think NumPy slows things down a bit here.
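As a rough illustration of the suggestion above, tile iteration can be done with plain Python integers and `itertools.product`, with no NumPy arrays involved. The helper name `iter_tile_slices` and its signature are hypothetical, not the actual `_iter_array_tiles` from this PR, and the sketch assumes tiles are numbered in C order (last axis varying fastest).

```python
from itertools import product


def iter_tile_slices(data_shape, tile_shape):
    """Yield (row_index, slices) for every tile using only Python ints.

    Hypothetical helper for illustration; not the astropy implementation.
    """
    if len(data_shape) != len(tile_shape):
        raise ValueError("data_shape and tile_shape must have the same length")
    # Number of tiles along each axis, rounded up so edge tiles are included.
    n_tiles = [-(-n // t) for n, t in zip(data_shape, tile_shape)]
    for row_index, tile_index in enumerate(product(*(range(n) for n in n_tiles))):
        # Clip the upper bound so edge tiles do not run past the array.
        slices = tuple(
            slice(i * t, min((i + 1) * t, n))
            for i, t, n in zip(tile_index, tile_shape, data_shape)
        )
        yield row_index, slices


tiles = list(iter_tile_slices((5, 4), (2, 3)))
```

For per-tile bookkeeping on small tuples like these, Python integers and tuples avoid the overhead of creating many tiny arrays, which is presumably the slowdown being referred to.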

astropy/io/fits/_tiled_compression/tiled_compression.py

astropy/io/fits/_tiled_compression/codecs.py

astropy/io/fits/_tiled_compression/tiled_compression.py

astrofrog · 2023-02-23T17:09:16Z

astropy/io/fits/_tiled_compression/tiled_compression.py

+
+    # If more than half the data is requested, read in all the heap.
+    # TODO: decide what heuristic we actually want here
+    if np.product(buffer_shape) > 0.5 * np.product(data_shape):


@Cadair - we need to think about this!
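For reference, the heuristic in the hunk above can be isolated as a small function. This is a sketch only: the name `should_read_whole_heap` and the `threshold` parameter are assumptions made for illustration, and it uses `math.prod` from the standard library rather than `np.product`.

```python
from math import prod


def should_read_whole_heap(buffer_shape, data_shape, threshold=0.5):
    """Return True when the requested region is more than `threshold` of the image.

    Hypothetical helper; the 0.5 default mirrors the value in the diff, which
    is itself marked as a TODO there.
    """
    return prod(buffer_shape) > threshold * prod(data_shape)


# Requesting 60 of 100 rows exceeds half of the data, so read the whole heap.
read_all = should_read_whole_heap((60, 100), (100, 100))
# Requesting a 10x10 cutout does not.
read_some = should_read_whole_heap((10, 10), (100, 100))
```

Making the threshold a parameter keeps the open question from the TODO visible and easy to tune once a heuristic is agreed on.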

Cadair · 2023-03-07T14:08:00Z

astropy/io/fits/_tiled_compression/tiled_compression.py

+        if algorithm in ("RICE_1", "RICE_ONE", "PLIO_1") and tile_size < len(
+            tile_buffer
+        ):


astrofrog · 2023-03-07T14:21:39Z

@saimn - I believe this is ready for review. It speeds up our new tiled compression implementation by a factor of a few, bringing it much closer to the performance of previous astropy versions.

astropy/io/fits/_tiled_compression/codecs.py

saimn · 2023-03-07T18:11:01Z

astropy/io/fits/_tiled_compression/tiled_compression.py

+        # We have to calculate the tilesize from the shape of the tile not the
+        # header, so that it's correct for edge tiles etc.
+        settings["tilesize"] = prod(actual_tile_shape)
+    elif compression_type in ("RICE_1", "RICE_ONE"):


Merge PLIO_1 with this one?
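To make the comment in the hunk above concrete, here is a minimal sketch of why the tile size has to come from the tile's real shape: when the image dimensions are not a multiple of the tile dimensions, edge tiles are smaller than the nominal tile. The function `actual_tile_shape_for` is hypothetical and is not taken from the PR.

```python
from math import prod


def actual_tile_shape_for(tile_index, tile_shape, data_shape):
    """Return the real shape of the tile at `tile_index`, clipped at the image edge.

    Hypothetical helper for illustration only.
    """
    return tuple(
        min(t, n - i * t) for i, t, n in zip(tile_index, tile_shape, data_shape)
    )


data_shape = (5, 4)
tile_shape = (2, 3)

# An interior tile has the nominal size.
interior_size = prod(actual_tile_shape_for((0, 0), tile_shape, data_shape))
# A corner tile is clipped along both axes.
corner_size = prod(actual_tile_shape_for((2, 1), tile_shape, data_shape))
```

With these shapes, using the nominal tile size for the corner tile would overstate the number of pixels to decompress.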

astrofrog · 2023-03-07T22:04:11Z

@saimn - I've addressed your comments 😄

saimn

Nice improvements, thanks @Cadair & @astrofrog!

github-actions bot added Docs io.fits testing labels Feb 22, 2023

astrofrog reviewed Feb 22, 2023

astropy/io/fits/_tiled_compression/tiled_compression.py (outdated)

Cadair commented Feb 22, 2023

astropy/io/fits/_tiled_compression/codecs.py (outdated)

astrofrog force-pushed the performance_issues_decompress_tile branch from 0ef5093 to e9f8f07 Compare February 23, 2023 14:39

astrofrog mentioned this pull request Feb 23, 2023

Add the ability to convert a compressed image HDU to a dask array for parallel decompression #14452

Closed

Cadair force-pushed the performance_issues_decompress_tile branch from 6d15639 to b2921ce Compare February 23, 2023 16:40

astrofrog reviewed Feb 23, 2023

astropy/io/fits/_tiled_compression/tiled_compression.py (outdated)

astrofrog reviewed Feb 23, 2023

Cadair mentioned this pull request Mar 3, 2023

Reading real data into the Dask array is *slow* DKISTDC/dkist#226

Open

8 tasks

astrofrog added Affects-dev PRs and issues that do not impact an existing Astropy release no-changelog-entry-needed labels Mar 6, 2023

astrofrog force-pushed the performance_issues_decompress_tile branch from b2921ce to fff7370 Compare March 6, 2023 11:37

Cadair and others added 10 commits March 7, 2023 09:39

Refactor a little for some significant performance improvements

dd865da

Further performance improvements.

17f9667

* Read the whole heap in one go if a significant fraction of the data is needed.
* Minimize FITS_rec access and cast to np.array as soon as possible
* Consolidate code to read in tiles/heap
* Other small optimizations

Re-wrote tile iteration code to not use Numpy arrays

ad83a66

Release GIL inside tiled compression C extension

93e314e

Expose more properties on CompImageSection to make it work well with dask.from_array

d3c511c

Cleanup

79646c5

Use empty to init arrays not zeros

8a03933

Make variable naming more consistent

4429d41

Fixed segfaults and tests

84159f3

More cleanup

55c3580

astrofrog force-pushed the performance_issues_decompress_tile branch from 0825234 to 55c3580 Compare March 7, 2023 09:43

Fix code style

7ccab79

Cadair commented Mar 7, 2023

Only cache heap when loading the whole dataset, and document when using .data might be more appropriate than using .section

d5b1fb8

astrofrog marked this pull request as ready for review March 7, 2023 14:15

astrofrog requested a review from saimn as a code owner March 7, 2023 14:15

saimn reviewed Mar 7, 2023

Implement review comments from @saimn

0c6dd57

astrofrog requested a review from saimn March 7, 2023 22:35

saimn approved these changes Mar 8, 2023

saimn added this to the v5.3 milestone Mar 8, 2023

saimn merged commit a3f4ae6 into astropy:main Mar 8, 2023

Cadair deleted the performance_issues_decompress_tile branch March 9, 2023 11:28

pllim mentioned this pull request Jan 9, 2025

STY: semi-manual fix for FURB188 (slice-to-remove-prefix-or-suffix) (io.fits) #17619

Closed

1 task
