Software Dependability Analysis of Apache’s Commons-CSV Library
Yash Banke Bihari Mittal*
University of Salerno, Fisciano, Italy, [Link]@[Link]
CCS CONCEPTS • Software and its engineering → Software testing and debugging; Software
maintenance tools; Open source model.
Additional Keywords and Phrases: Java, Apache Commons CSV, Apache Commons, Maven, Automated Test
Case Generation, Bug Fixing, Code Coverage, Code Quality, Snyk, EvoSuite, FindSecBugs, JaCoCo, OWASP DC,
PiTest, Software Analytics, Software Dependability, Software Testing, Software Vulnerabilities, SonarCloud.
ABSTRACT
This analysis assessed the dependability of the Apache Commons CSV library using several tools and approaches.
SonarCloud reported good code quality and maintainability, and JaCoCo reported near-complete code coverage.
Mutation testing with PiTest identified areas where the test suite could be strengthened, while the library's
existing performance tests showed satisfactory performance. EvoSuite generated additional test cases, and
security analyses with FindSecBugs, OWASP Dependency Check, and Snyk found no major vulnerabilities after
dependencies were updated. Overall, this well-maintained, well-tested, and secure library is suitable for working
with CSV data in Java.
INTRODUCTION
Core Functionality:
Offers robust and efficient methods for reading and writing CSV data.
Supports various CSV dialects and configurations, including custom separators, escape characters, and quoting
styles.
Provides flexible options for processing CSV data, including handling empty lines, comments, and headers.
Integrates seamlessly with other Apache Commons libraries like BeanUtils and IO for further data manipulation.
Key Features:
Flexibility: Handles diverse CSV formats and provides customization options.
Efficiency: Optimized for high performance when dealing with large datasets.
Ease of Use: Offers a clear and well-documented API for developers.
Open-Source: Continuously maintained and improved by the community.
Applications:
Data Import/Export: Widely used for transferring data between applications and formats.
Data Analysis: Efficiently parses and processes CSV data for analysis.
Configuration Management: Reads and writes configuration files in CSV format.
Logging: Captures and analyzes log data stored in CSV files.
LINK TO THE REPOSITORY
[Link]
1 SOFTWARE QUALITY ANALYSIS USING SONAR CLOUD
Figure 1.1: Analysis of the master repository
Fixes applied to the repository (issues were addressed in order of severity):
1. Cognitive Complexity:
a) createConverter: Reduced from 18 to 15 by extracting helper methods and breaking down logic.
b) validate: Reduced from 25 to 15 by extracting helper methods and simplifying checks.
c) createHeaders: Reduced from 25 to 15 by separating parsing and mapping, handling specific
cases.
d) read: Reduced from 17 to 15 by improving readability and modularity of reading logic.
e) nextToken: Reduced from 27 to 15 by breaking down logic and simplifying token determination.
f) parseEncapsulatedToken: Reduced from 23 to 15 by extracting logic and simplifying checks.
g) parseSimpleToken: Reduced from 20 to 15 by extracting logic and simplifying token type
assignment.
h) checkQuoteCondition: Reduced from 24/31 to 15 by extracting logic, simplifying checks, and
combining conditions.
i) Assertions were added to validate test cases.
2. Other Fixes:
a) Defined a constant instead of duplicating "format" literal.
b) Avoided throwing same checked exception multiple times.
c) Changed test method to non-public within the package.
3. Ignored issues (fixing them caused further problems):
a) 8 naming-convention issues related to an enumerator, whose fix caused an error.
b) 2 code smells related to adding assertions to test cases.
c) 3 methods with cognitive complexity of 45, 47, and 37.
d) 2 false positives: methods returning ‘null’ whose return type, if changed, would alter the
semantics of the code.
Figure 1.2: Screenshot after applying fixes
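The cognitive-complexity fixes listed above share one pattern: nested conditionals in a long method are moved into small, named helper methods. The sketch below illustrates that pattern only. It is not the Commons CSV code; the class `ValidateSketch`, its methods, and the specific checks are hypothetical and were chosen just to show the shape of the refactoring.

```java
// Hypothetical illustration; not taken from the Commons CSV sources.
class ValidateSketch {

    // Before: every check is nested inside one method, which raises cognitive complexity.
    static void validateBefore(char delimiter, Character quote, Character escape) {
        if (quote != null) {
            if (quote.charValue() == delimiter) {
                throw new IllegalArgumentException("quote equals delimiter");
            } else {
                if (escape != null) {
                    if (escape.equals(quote)) {
                        throw new IllegalArgumentException("escape equals quote");
                    }
                }
            }
        }
        if (escape != null) {
            if (escape.charValue() == delimiter) {
                throw new IllegalArgumentException("escape equals delimiter");
            }
        }
    }

    // After: the same checks in the same order, each delegated to one flat helper.
    static void validateAfter(char delimiter, Character quote, Character escape) {
        requireDifferent(quote, delimiter, "quote equals delimiter");
        requireDifferent(escape, quote, "escape equals quote");
        requireDifferent(escape, delimiter, "escape equals delimiter");
    }

    // Rejects the configuration when 'value' is set and equal to 'other'.
    private static void requireDifferent(Character value, Character other, String message) {
        if (value != null && value.equals(other)) {
            throw new IllegalArgumentException(message);
        }
    }

    // Runs a check and reports either "ok" or the rejection message.
    static String outcome(Runnable check) {
        try {
            check.run();
            return "ok";
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Each row is {quote, escape}; the delimiter is always ','.
        Character[][] cases = { {'"', '\\'}, {',', null}, {'"', '"'}, {null, ','} };
        for (Character[] c : cases) {
            String before = outcome(() -> validateBefore(',', c[0], c[1]));
            String after = outcome(() -> validateAfter(',', c[0], c[1]));
            System.out.println(before + " / " + after);
        }
    }
}
```

Both versions accept and reject the same inputs, so a refactoring of this kind can be checked by running the existing tests before and after the change.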
2 CODE COVERAGE COMPUTATION USING JACOCO
JaCoCo reports 98% code coverage for the repository, which is generally considered very good. Higher coverage is
possible, but the additional effort and risk required to achieve it would likely be significant.
Commands used:
$ mvn clean && mvn verify && mvn jacoco:report
Figure 2: JaCoCo code coverage report
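A coverage percentage of this kind is the ratio covered / (covered + missed) for a given counter. Besides the HTML report, `jacoco:report` also writes a `jacoco.csv` file with one row per class and paired `*_MISSED` / `*_COVERED` columns. The sketch below shows how such a figure could be recomputed from that file. The class `CoverageSummary` is hypothetical, the sample rows are invented for illustration and are not the Commons CSV results, and which counter the 98% figure refers to is not stated here.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; the sample rows below are made up and are not project data.
class CoverageSummary {

    // Sums one MISSED/COVERED column pair over all rows and returns the coverage in percent.
    static double coveragePercent(List<String> csvLines, String missedColumn, String coveredColumn) {
        String[] header = csvLines.get(0).split(",");
        int missedIdx = Arrays.asList(header).indexOf(missedColumn);
        int coveredIdx = Arrays.asList(header).indexOf(coveredColumn);
        if (missedIdx < 0 || coveredIdx < 0) {
            throw new IllegalArgumentException("unknown column");
        }
        long missed = 0;
        long covered = 0;
        for (String line : csvLines.subList(1, csvLines.size())) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            // A plain split is enough for this sample because it has no quoted fields;
            // a real CSV parser is the safer choice for arbitrary files.
            String[] cells = line.split(",");
            missed += Long.parseLong(cells[missedIdx]);
            covered += Long.parseLong(cells[coveredIdx]);
        }
        long total = missed + covered;
        return total == 0 ? 0.0 : 100.0 * covered / total;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "GROUP,PACKAGE,CLASS,INSTRUCTION_MISSED,INSTRUCTION_COVERED,BRANCH_MISSED,BRANCH_COVERED,"
                + "LINE_MISSED,LINE_COVERED,COMPLEXITY_MISSED,COMPLEXITY_COVERED,METHOD_MISSED,METHOD_COVERED",
            "demo,org.example,Parser,10,90,5,15,3,27,4,16,1,9",
            "demo,org.example,Printer,0,100,0,20,0,30,0,12,0,8");
        System.out.println("instructions: "
            + coveragePercent(sample, "INSTRUCTION_MISSED", "INSTRUCTION_COVERED"));
        System.out.println("branches: "
            + coveragePercent(sample, "BRANCH_MISSED", "BRANCH_COVERED"));
    }
}
```

Instruction and branch coverage can differ noticeably for the same code, so it helps to state which counter a reported percentage refers to.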
3 MUTATION TESTING USING PITEST
Commands used:
$ mvn test-compile [Link]:pitest-maven:mutationCoverage
$ mvn -DwithHistory test-compile [Link]:pitest-maven:mutationCoverage
Fig 3: PiTest Coverage Report
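Mutation testing makes small changes to the code (mutants) and checks whether the test suite fails as a result. A mutant that no test detects "survives" and points to a gap in the tests. The sketch below illustrates the idea with PiTest's conditionals-boundary mutator, which replaces `<=` with `<`. The class `MutationSketch` and the method `fits` are hypothetical and are not part of Commons CSV; the two "tests" are plain methods so that the example needs no test framework.

```java
import java.util.function.BiPredicate;

// Hypothetical illustration of a surviving and a killed mutant.
class MutationSketch {

    // Original code under test.
    static boolean fits(int length, int capacity) {
        return length <= capacity;
    }

    // The same method after a conditionals-boundary mutation (<= becomes <).
    static boolean fitsMutant(int length, int capacity) {
        return length < capacity;
    }

    // A weak test: it never exercises the boundary, so it passes for both versions.
    static boolean weakTestPasses(BiPredicate<Integer, Integer> impl) {
        return impl.test(1, 10) && !impl.test(11, 10);
    }

    // A boundary test: it passes for the original and fails for the mutant.
    static boolean boundaryTestPasses(BiPredicate<Integer, Integer> impl) {
        return impl.test(10, 10);
    }

    public static void main(String[] args) {
        // A mutant is killed when a test that passes on the original fails on the mutant.
        boolean killedByWeak = weakTestPasses(MutationSketch::fits)
            && !weakTestPasses(MutationSketch::fitsMutant);
        boolean killedByBoundary = boundaryTestPasses(MutationSketch::fits)
            && !boundaryTestPasses(MutationSketch::fitsMutant);
        System.out.println("killed by weak test: " + killedByWeak);
        System.out.println("killed by boundary test: " + killedByBoundary);
    }
}
```

Both tests execute `fits`, so line coverage is the same in either case; only the boundary test detects the mutant. This is why mutation testing can reveal weaknesses that a coverage report alone does not show.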
4 PERFORMANCE TEST
The Apache Commons CSV library already includes a comprehensive performance test suite ([Link]) that covers a
variety of parsing scenarios. The test results indicate that the performance of the Commons CSV parser is good.
Fig 4: Output of [Link]
5 AUTOMATED TEST CASE GENERATION USING EVOSUITE
Commands used:
$ mvn compile
$ mvn evosuite:generate \
-Dclass=[Link]
$ mvn evosuite:info
$ mvn evosuite:export
$ java -jar [Link] -class ExtendedBufferedReader \
-criterion LINE,BRANCH,EXCEPTION,WEAKMUTATION,OUTPUT,METHOD,METHODNOEXCEPTION,CBRANCH \
-projectCP target/classes
Fig 5.1: [Link]
6 SECURITY ANALYSIS USING FINDSECBUGS, OWASP DC AND SNYK
FindSecBugs (SpotBugs):
Commands used:
$ mvn spotbugs:check && mvn spotbugs:gui
Fig 6.1: SpotBugs analysis output in GUI and terminal
OWASP DC:
Vulnerabilities were found in outdated versions of the following .jar files, so they were updated to the latest
versions available on Maven Central:
1. plexus-interpolation
2. plexus-classworlds
3. plexus-component-annotations
4. bcprov-jdk18on
5. maven-core
6. maven-settings
7. maven-shared-utils
8. bcpg-jdk18on
Fig 6.2.1: Number of vulnerabilities analyzed by OWASP DC
Commands used:
$ mvn [Link]:dependency-check-maven:check
Fig 6.2.2: OWASP DC output before updating .jar files
Fig 6.2.3: OWASP DC output after updating .jar files
Snyk:
Snyk found no vulnerabilities.
Fig 6.3: Snyk Output
ACKNOWLEDGMENT
I would like to thank Prof. Dario Di Nucci for giving me the opportunity to conduct the above analysis. I would also
like to thank Abdul Wasif, Mohammed Aziz, Franco Merenda and my seniors for always supporting and motivating
me.