Support specifying output channels in io.image.read_image#2988
Support specifying output channels in io.image.read_image#2988fmassa merged 16 commits intopytorch:masterfrom
Conversation
…e from assets and reduce duplicate code. Moving jpeg assets used by encode and write unit-tests on their separate folders.
…and adding checks for inputs.
datumbox
left a comment
There was a problem hiding this comment.
I left a few comments to explain parts of the implementation.`
|
The failing tests on Travis are not related to this PR. I would like to rebase to master once #2985 is merged to ensure all tests still pass. |
Codecov Report
@@ Coverage Diff @@
## master #2988 +/- ##
==========================================
- Coverage 73.39% 73.37% -0.03%
==========================================
Files 99 99
Lines 8825 8825
Branches 1391 1391
==========================================
- Hits 6477 6475 -2
- Misses 1929 1931 +2
Partials 419 419
Continue to review full report at Codecov.
|
fmassa
left a comment
There was a problem hiding this comment.
Thanks a lot for the PR, it looks great!
I would really prefer if we could unify the name of the arguments across jpeg / png, this way it is clear to the user that they both represent the same thing.
About channels=1 keeping the image as a palette type, although it makes sense I wonder if the expected behavior would be instead to convert it to a grayscale. Thoughts?
I've made a few other comments on the PR, let me know what you think
|
The usage of pth files is due to the difference between libjpeg and libjpeg-turbo on Windows and Mac, which right now we are not able to use. |
|
@fmassa Thanks for the review. I marked as resolved anything that I either accept your proposal or is already covered discussed. I kept open anything that requires a second look. I'll send now another commit with the changes. I would appreciate to review the last remaining points. @andfoy Thanks for providing the background story. |
* Adding output channels implementation for pngs. * Adding tests for png. * Adding channels in the API and documentation. * Fixing formatting. * Refactoring test_image.py to remove huge grace_hopper_517x606.pth file from assets and reduce duplicate code. Moving jpeg assets used by encode and write unit-tests on their separate folders. * Adding output channels implementation for jpegs. Fix asset locations. * Add tests for JPEG, adding the channels in the API and documentation and adding checks for inputs. * Changing folder for unit-test. * Fixing windows flakiness, removing duplicate test. * Replacing components to channels. * Adding reference for supporting CMYK. * Minor changes: num_components to output_components, adding comments, fixing variable name etc. * Reverting output_components to num_components. * Replacing decoding with generic method on tests. * Palette converted to Gray.
* Adding output channels implementation for pngs. * Adding tests for png. * Adding channels in the API and documentation. * Fixing formatting. * Refactoring test_image.py to remove huge grace_hopper_517x606.pth file from assets and reduce duplicate code. Moving jpeg assets used by encode and write unit-tests on their separate folders. * Adding output channels implementation for jpegs. Fix asset locations. * Add tests for JPEG, adding the channels in the API and documentation and adding checks for inputs. * Changing folder for unit-test. * Fixing windows flakiness, removing duplicate test. * Replacing components to channels. * Adding reference for supporting CMYK. * Minor changes: num_components to output_components, adding comments, fixing variable name etc. * Reverting output_components to num_components. * Replacing decoding with generic method on tests. * Palette converted to Gray.
Fixes #2948
I add a
channelsparameter that allows the users to specify the number of output channels while reading an image. The default value is 0 which leaves the image as-is and ensures the change is BC. The following public API methods of thetorchvision.io.imagepackage were updated:decode_png(input, channels=0)decode_jpeg(input, channels=0)decode_image(input, channels=0)read_image(path, channels=0)There is a small update on the originally proposed pitch because I added support for grayscale transparency and handling for palette images. Here are the supported values:
The PR adds 3 JPEG assets with total size 7kb. These are used to test the supported conversions. It also removes a 900kb asset file which is no longer needed. The assets were produced using the following snippet: