fixed reading the data from the stream and fixed the size of chunks by joerg1985 · Pull Request #10409 · SeleniumHQ/selenium

joerg1985 · 2022-03-02T20:41:48Z

Description

InputStream.read may read less bytes than actually are in the stream, this might corrupt the forwarded data.
Also fixed the size of the chunks send after the initial chunk.

Motivation and Context

I noticed the http chunks after the initial 1MiB chunk are only 8KiB, e.g. a 3 MiB screenshot was chunked
in one chunk of 1MiB and 256 chunks of 8KiB, i don't think this was the intention when implementing chunks.

While fixing this i noticed the call to is.read(ary) did expect the function to block until the array is full or the end of steam is reached, like is.readNBytes(ary, 0, ary.length) would do. Since is.readNBytes is Java 9 i had to fallback to com.google.common.io.ByteStreams already used in other areas of the code.

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)

Checklist

I have read the contributing document.
My change requires a change to the documentation.
I have updated the documentation accordingly.
I have added tests to cover my changes.
All new and existing tests passed.

diemol

Thank you for this PR, @joerg1985.

I left a comment because I did not understand one change. The other one seems fine. I will ask someone else for an extra look, and then we can merge this.

diemol · 2022-03-04T13:48:09Z

    byte[] ary = new byte[CHUNK_SIZE];
    InputStream is = seResponse.getContent().get();
-    int byteCount = is.read(ary);
+    int byteCount = ByteStreams.read(is, ary, 0, ary.length);


I do not understand why this is needed. I thought is.read(ary) becomes is.read(ary, 0, ary.length).

From the PR description, I understood that the initial 1MB chunck is read correctly, right?

This is not related to the chunking, it is more about the correct usage of the InputStream.

The InputStream.read method does not guarantee to fill the complete array, even if there is more data in the stream. The InputStream.read javadoc states:
Reads some number of bytes from the input stream and stores them into the buffer array b. The number of bytes actually read is returned as an integer. This method blocks until input data is available, end of file is detected, or an exception is thrown.

An InputStream could - for whatever reason - only read some bytes from the stream and write it into the array. This would fullfill the guarantees of the javadoc. The implementation here would assume the end of stream is reached and close the stream.

On the other hand the ByteStream.read javadoc:
Reads some bytes from an input stream and stores them into the buffer array b. This method blocks until len bytes of input data have been read into the array, or end of file is detected.

There is a guarantee the array is filled to the end or the end of stream is reached. This allows detect the EOF by checking the number of bytes read, what is done in the current implementation.

Best wishes
Jörg

PS: is.readNBytes(ary, 0, ary.length) is Java 9 and i think Selenium is at Java 8.

codecov-commenter · 2022-03-08T05:30:35Z

Codecov Report

Merging #10409 (8109d44) into trunk (d40b1a8) will not change coverage.
The diff coverage is n/a.

❗ Current head 8109d44 differs from pull request most recent head d12c0cd. Consider uploading reports for the commit d12c0cd to get more accurate results

@@           Coverage Diff           @@
##            trunk   #10409   +/-   ##
=======================================
  Coverage   45.03%   45.03%           
=======================================
  Files          85       85           
  Lines        5522     5522           
  Branches      269      269           
=======================================
  Hits         2487     2487           
  Misses       2766     2766           
  Partials      269      269

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d40b1a8...d12c0cd. Read the comment docs.

InputStream.read may read less bytes than actually are in the stream, this might corrupt the forwarded data. Also fixed the size of the chunks send after the initial chunk.

sonarqubecloud · 2022-03-11T15:27:38Z

Kudos, SonarCloud Quality Gate passed!

0 Bugs
0 Vulnerabilities
0 Security Hotspots
0 Code Smells

0.0% Coverage
0.0% Duplication

diemol

Thank you, @joerg1985

…eleniumHQ#10409) InputStream.read may read less bytes than actually are in the stream, this might corrupt the forwarded data. Also fixed the size of the chunks send after the initial chunk. [skip ci]

diemol added the B-grid Everything grid and server related label Mar 4, 2022

diemol reviewed Mar 4, 2022

View reviewed changes

fixed reading the data from the stream and fixed the size of chunks

2afba69

InputStream.read may read less bytes than actually are in the stream, this might corrupt the forwarded data. Also fixed the size of the chunks send after the initial chunk.

joerg1985 force-pushed the chunk-size branch from d12c0cd to 2afba69 Compare March 11, 2022 15:13

diemol approved these changes Mar 11, 2022

View reviewed changes

diemol merged commit 7e8c430 into SeleniumHQ:trunk Mar 11, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fixed reading the data from the stream and fixed the size of chunks#10409

fixed reading the data from the stream and fixed the size of chunks#10409
diemol merged 1 commit intoSeleniumHQ:trunkfrom
joerg1985:chunk-size

joerg1985 commented Mar 2, 2022

Uh oh!

diemol left a comment

Uh oh!

diemol Mar 4, 2022

Uh oh!

joerg1985 Mar 4, 2022

Uh oh!

joerg1985 Mar 4, 2022

Uh oh!

codecov-commenter commented Mar 8, 2022

Uh oh!

sonarqubecloud Bot commented Mar 11, 2022

Uh oh!

diemol left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

joerg1985 commented Mar 2, 2022

Description

Motivation and Context

Types of changes

Checklist

Uh oh!

diemol left a comment

Choose a reason for hiding this comment

Uh oh!

diemol Mar 4, 2022

Choose a reason for hiding this comment

Uh oh!

joerg1985 Mar 4, 2022

Choose a reason for hiding this comment

Uh oh!

joerg1985 Mar 4, 2022

Choose a reason for hiding this comment

Uh oh!

codecov-commenter commented Mar 8, 2022

Codecov Report

Uh oh!

sonarqubecloud Bot commented Mar 11, 2022

Uh oh!

diemol left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants