Refactor BLAST::Report to lazily generate reports #734
Merged
yannickwurm merged 2 commits into wurmlab:master on Mar 21, 2024
We're about to refactor the BLAST::Report class, which does not appear to have any unit tests. The implementation is hard to follow: variable names are generic and abbreviated, and data is accessed by index. To support the refactoring, add a regression test that takes a job generated on current master as the gold standard and compares it with the output of the live code. If the refactoring changes the output, the test fails.
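A minimal sketch of the gold-standard comparison described above. `build_report` and `changed_keys` are hypothetical names used for illustration only; they are not part of BLAST::Report, and a real test would load the gold standard from a job directory generated on master rather than build it inline.

```ruby
require 'json'

# Hypothetical stand-in for the code being refactored (not BLAST::Report).
def build_report(hits)
  { 'hit_count' => hits.size, 'queries' => hits.map { |hit| hit[0] }.uniq }
end

# Returns the keys whose values differ between the gold standard and the
# live output. An empty array means the refactoring preserved the output.
def changed_keys(gold, live)
  (gold.keys | live.keys).reject { |key| gold[key] == live[key] }
end

hits = [%w[q1 s1], %w[q1 s2], %w[q2 s3]]

# Assumption: the gold standard is stored as JSON. Here it is produced
# inline and round-tripped through JSON to mimic reading it from disk.
gold = JSON.parse(JSON.generate(build_report(hits)))
live = build_report(hits)

diff = changed_keys(gold, live)
raise "report changed in: #{diff.join(', ')}" unless diff.empty?
```

Reporting the differing keys, rather than only pass or fail, makes it easier to see which part of the report a refactoring step altered.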
In some cases (e.g. when used by some extensions) it is beneficial not to generate the report files upfront, and to generate them only when required. The performance gains are significant with large result sets (hundreds of MB, or GBs), because the data does not need to be loaded into process memory. Lazy method evaluation with memoization achieves this without adding cognitive load for the developer: operations now run only when the methods are invoked, rather than upfront by default. The implementation logic is unchanged, and the regression test prepared beforehand passes, confirming identical results.
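For illustration, lazy evaluation with memoization can be sketched in Ruby as follows. `LazyReport`, `rows`, `hit_count`, and `load_count` are hypothetical names and do not reflect the actual BLAST::Report code; the tab-separated input format is also an assumption.

```ruby
require 'tempfile'

# Hypothetical report class showing the pattern only.
class LazyReport
  attr_reader :load_count

  def initialize(path)
    @path = path      # nothing is read here, so construction stays cheap
    @load_count = 0   # counts how often the expensive step actually runs
  end

  # Memoized accessor: the file is parsed on the first call and the
  # result is reused on every later call.
  def rows
    @rows ||= load_rows
  end

  def hit_count
    rows.size
  end

  private

  # The expensive step: read and split the whole file.
  def load_rows
    @load_count += 1
    File.readlines(@path, chomp: true).map { |line| line.split("\t") }
  end
end

Tempfile.create('hits') do |file|
  file.write("q1\ts1\nq1\ts2\n")
  file.flush

  report = LazyReport.new(file.path)
  puts report.load_count # 0: nothing loaded yet
  puts report.hit_count  # 2: first access triggers the load
  report.rows
  puts report.load_count # 1: later calls reuse the memoized rows
end
```

One caveat of the `||=` idiom: if the computed value is `nil` or `false`, it is recomputed on every call, so it suits methods that always return a truthy object such as an array.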