Skip to content

Conversation

@Godin
Copy link
Member

@Godin Godin commented Jan 2, 2019

On example of our own XML report majority of lines do not have branches:

$ cat jacoco.xml | tr '>' '\n' | grep "<line" | wc -l
6403
$ cat jacoco.xml | tr '>' '\n' | grep "mb=\"0\" cb=\"0\"" | wc -l
5522

By not printing mb="0" cb="0" we can save 77308 bytes, which is about 7% out of 915390.

And by not printing zeros at all for IMPLIED attributes mi, ci, mb and cb (as implemented in this PR) we can save 127498 bytes, which is about 14%.


Furthermore (not implemented in this PR) by changing DTD to mark missed and covered as not REQUIRED and by not printing these too, we can save additional 93098 bytes, which is about 10%.

@Godin Godin requested a review from marchof January 2, 2019 20:45
@marchof
Copy link
Member

marchof commented Jan 8, 2019

This looks like a major braking change as probably nobody supplies our DTD.

@Godin Do you really think this optimization is worth it?

@Godin
Copy link
Member Author

Godin commented Jan 10, 2019

@marchof

probably nobody supplies our DTD

what about inlining of DTD into XML:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE report [
  <!ELEMENT report ... >
  ...
]>
<report>
...

or maybe even XSD ?


looks like a major braking change

The second part of description not implemented here about change for attributes marked as required is indeed a breaking change.

The first implemented part about change for attributes already marked as optional since a very long time did not looked as breaking to me, exactly because consumers already must not assume their presence.

Unfortunately you're probably right - no one reads DTD as like any documentation. SonarQube for example assumes presence of these of attributes 😞


worth it?

14% just for our small project and just for first implemented part, and 27% in total with second part...

@marchof
Copy link
Member

marchof commented Jan 10, 2019

I like the idea of inlining the DTD! Also we can improve our DTD to define default values. Then existing parsers shouldn't be affected.

marchof added a commit that referenced this pull request Jan 22, 2019
marchof added a commit that referenced this pull request Jan 22, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: Candidates

Development

Successfully merging this pull request may close these issues.

3 participants