
Software Dependability Analysis of Apache’s Commons-CSV Library

Yash Banke Bihari Mittal*


University of Salerno, Fisciano, Italy, [Link]@[Link]

CCS CONCEPTS • Software and its engineering → Software testing and debugging; Software
maintenance tools; Open source model.

Additional Keywords and Phrases: Java, Apache Commons CSV, Apache Commons, Maven, Automated Test
Case Generation, Bug Fixing, Code Coverage, Code Quality, Snyk, EvoSuite, FindSecBugs, JaCoCo, OWASP DC,
PiTest, Software Analytics, Software Dependability, Software Testing, Software Vulnerabilities, SonarCloud.

ABSTRACT

This analysis assesses the dependability of the Apache Commons CSV library using several tools and approaches.
SonarCloud reported good code quality and maintainability, and JaCoCo reported 98% code coverage. Mutation
testing with PiTest identified areas where the test suite could be strengthened, and the library's existing
performance tests gave satisfactory results. EvoSuite generated additional test cases, and security analyses with
FindSecBugs, OWASP Dependency Check, and Snyk found no major vulnerabilities after the dependencies were
updated. Overall, this well-maintained, well-tested, and secure library is suitable for working with CSV data in Java.

INTRODUCTION
Core Functionality:
• Offers robust and efficient methods for reading and writing CSV data.
• Supports various CSV dialects and configurations, including custom separators, escape characters, and quoting styles.
• Provides flexible options for processing CSV data, including handling empty lines, comments, and headers.
• Can be combined with other Apache Commons libraries such as BeanUtils and IO for further data manipulation.

Key Features:
• Flexibility: Handles diverse CSV formats and provides customization options.
• Efficiency: Optimized for high performance when dealing with large datasets.
• Ease of Use: Offers a clear and well-documented API for developers.
• Open-Source: Continuously maintained and improved by the community.

Applications:
• Data Import/Export: Widely used for transferring data between applications and formats.
• Data Analysis: Efficiently parses and processes CSV data for analysis.
• Configuration Management: Reads and writes configuration files in CSV format.
• Logging: Captures and analyzes log data stored in CSV files.

LINK TO THE REPOSITORY


[Link]

1 SOFTWARE QUALITY ANALYSIS USING SONARCLOUD

Figure 1.1: Analysis of the master repository

Fixes applied to the repository (issues were resolved in order of severity):

1. Cognitive Complexity (each method was brought down to 15, the default threshold of SonarCloud's cognitive complexity rule):
   a) createConverter: Reduced from 18 to 15 by extracting helper methods and breaking down logic.
   b) validate: Reduced from 25 to 15 by extracting helper methods and simplifying checks.
   c) createHeaders: Reduced from 25 to 15 by separating parsing and mapping, handling specific cases.
   d) read: Reduced from 17 to 15 by improving readability and modularity of reading logic.
   e) nextToken: Reduced from 27 to 15 by breaking down logic and simplifying token determination.
   f) parseEncapsulatedToken: Reduced from 23 to 15 by extracting logic and simplifying checks.
   g) parseSimpleToken: Reduced from 20 to 15 by extracting logic and simplifying token type assignment.
   h) checkQuoteCondition: Reduced from 24/31 to 15 by extracting logic, simplifying checks, and combining conditions.
   i) Assertions were added to validate test cases.
2. Other Fixes:
   a) Defined a constant instead of duplicating the "format" literal.
   b) Avoided throwing the same checked exception multiple times.
   c) Changed a test method from public to package-private.
3. Ignored issues (fixing them caused further problems):
   a) 8 naming-convention issues related to an enum, where renaming caused an error.
   b) 2 code smells related to adding test case assertions.
   c) 3 methods with cognitive complexity of 45, 47, and 37.
   d) 2 false positives: methods returning 'null', whose return types could not be changed without altering the semantics of the code.
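
The helper-extraction and constant-definition fixes listed above can be illustrated with a small sketch. It is not taken from Commons CSV: the class `FormatValidationSketch`, its methods, and the validation rules are hypothetical and only show the general pattern of moving nested checks into named helpers and replacing a duplicated literal with a constant.

```java
// Hypothetical illustration; none of these names or rules come from Commons CSV.
final class FormatValidationSketch {

    // One constant replaces a literal that would otherwise be repeated in every message.
    private static final String FORMAT = "format";

    // Before: nested conditionals and duplicated literals in a single method.
    static void validateNested(char delimiter, Character quote, Character escape) {
        if (delimiter == '\n' || delimiter == '\r') {
            throw new IllegalArgumentException("invalid format: delimiter is a line break");
        } else {
            if (quote != null) {
                if (quote == delimiter) {
                    throw new IllegalArgumentException("invalid format: quote equals delimiter");
                }
            }
            if (escape != null) {
                if (escape == delimiter) {
                    throw new IllegalArgumentException("invalid format: escape equals delimiter");
                }
            }
        }
    }

    // After: the top-level method reads as a flat sequence of named checks.
    static void validate(char delimiter, Character quote, Character escape) {
        requireNotLineBreak(delimiter);
        requireDistinct("quote", quote, delimiter);
        requireDistinct("escape", escape, delimiter);
    }

    private static void requireNotLineBreak(char delimiter) {
        if (delimiter == '\n' || delimiter == '\r') {
            throw new IllegalArgumentException(invalid("delimiter is a line break"));
        }
    }

    private static void requireDistinct(String name, Character candidate, char delimiter) {
        if (candidate != null && candidate == delimiter) {
            throw new IllegalArgumentException(invalid(name + " equals delimiter"));
        }
    }

    private static String invalid(String reason) {
        return "invalid " + FORMAT + ": " + reason;
    }

    public static void main(String[] args) {
        validate(',', '"', '\\'); // accepted: all three characters differ
        try {
            validate(',', ',', null); // rejected: quote equals delimiter
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Both versions reject the same inputs with the same messages; only the control flow differs, which is what the cognitive complexity metric measures.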

Figure 1.2: Screenshot after applying fixes

2 CODE COVERAGE COMPUTATION USING JACOCO


JaCoCo reports 98% code coverage for the repository, which is high by common industry standards. Higher coverage
is possible, but covering the remaining code would likely take significant effort, and the changes needed carry risk.
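
One common reason the last few percent are expensive is defensive code that ordinary inputs never reach. The sketch below is hypothetical: the class `CoverageSketch` and its methods are not part of Commons CSV, and this report does not identify which lines remain uncovered. It only shows that a `catch` block guarding an I/O failure is not exercised by a test using an in-memory `StringReader`, so covering it needs a purpose-built failing `Reader`.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical illustration; not code from Commons CSV.
final class CoverageSketch {

    // Returns the first line of the source, or an empty string if the source is empty.
    static String firstLine(Reader source) {
        try (BufferedReader reader = new BufferedReader(source)) {
            String line = reader.readLine();
            return line == null ? "" : line;
        } catch (IOException e) {
            // Reached only when the underlying reader fails, which typical tests never trigger.
            throw new UncheckedIOException(e);
        }
    }

    // A reader that always fails, written only so that a test can reach the catch block.
    static Reader failingReader() {
        return new Reader() {
            @Override
            public int read(char[] buffer, int offset, int length) throws IOException {
                throw new IOException("simulated failure");
            }

            @Override
            public void close() {
                // nothing to release
            }
        };
    }

    public static void main(String[] args) {
        // Covers the normal path only.
        System.out.println(firstLine(new StringReader("a,b\nc,d")));
        // Covers the catch block, at the cost of extra test scaffolding.
        try {
            firstLine(failingReader());
        } catch (UncheckedIOException e) {
            System.out.println("failure path reached: " + e.getCause().getMessage());
        }
    }
}
```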

Commands used:
$ mvn clean && mvn verify && mvn jacoco:report

Figure 2: JaCoCo code coverage report

3 MUTATION TESTING USING PITEST


Commands used:
$ mvn test-compile [Link]:pitest-maven:mutationCoverage

$ mvn -DwithHistory test-compile [Link]:pitest-maven:mutationCoverage

Fig 3: PiTest Coverage Report
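
To make the idea behind mutation testing concrete, the sketch below hand-writes one mutant of the kind PiTest generates automatically (a conditional boundary change, `<=` to `<`). The class `MutationSketch` and its methods are hypothetical and not taken from Commons CSV. A set of test inputs "kills" the mutant only if at least one input makes the original and the mutant disagree.

```java
// Hypothetical illustration; not code from Commons CSV.
final class MutationSketch {

    // Original: a length equal to the limit is accepted.
    static boolean isWithinLimit(int length, int limit) {
        return length <= limit;
    }

    // Mutant: the boundary operator `<=` has been changed to `<`.
    static boolean isWithinLimitMutant(int length, int limit) {
        return length < limit;
    }

    // A set of inputs kills the mutant if any input yields a different result.
    static boolean killsMutant(int[] inputs, int limit) {
        for (int input : inputs) {
            if (isWithinLimit(input, limit) != isWithinLimitMutant(input, limit)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int limit = 10;
        int[] withoutBoundary = {3, 25};   // both methods agree on every input
        int[] withBoundary = {3, 10, 25};  // 10 is the boundary value
        System.out.println("without boundary input: " + killsMutant(withoutBoundary, limit)); // false
        System.out.println("with boundary input: " + killsMutant(withBoundary, limit));       // true
    }
}
```

Both input sets execute every line of `isWithinLimit`, so line coverage alone cannot tell them apart; a surviving mutant points to a missing boundary test.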

4 PERFORMANCE TEST
The Apache Commons CSV library already includes a performance test suite ([Link]) that covers a variety of
parsing scenarios. The test output in Fig 4 indicates that the Commons CSV parser performs well across these
scenarios.

Fig 4: Output of [Link]

5 AUTOMATED TEST CASE GENERATION USING EVOSUITE
Commands used:
$ mvn compile
$ mvn evosuite:generate \
    -Dclass=[Link]
$ mvn evosuite:info
$ mvn evosuite:export
$ java -jar [Link] -class ExtendedBufferedReader -projectCP target/classes \
    -criterion LINE:BRANCH:EXCEPTION:WEAKMUTATION:OUTPUT:METHOD:METHODNOEXCEPTION:CBRANCH

Fig 5.1: [Link]

6 SECURITY ANALYSIS USING FINDSECBUGS, OWASP DC AND SNYK

FindSecBugs (SpotBugs):
Commands used:
$ mvn spotbugs:check && mvn spotbugs:gui

Fig 6.1: SpotBugs Analysis Output in GUI and Terminal

OWASP DC:
OWASP DC found vulnerabilities in outdated versions of several .jar files. The following .jar files were therefore
updated to the latest versions available on Maven Central:
1. plexus-interpolation
2. plexus-classworlds
3. plexus-component-annotations
4. bcprov-jdk18on
5. maven-core
6. maven-settings
7. maven-shared-utils
8. bcpg-jdk18on

Fig 6.2.1: Number of vulnerabilities analyzed by OWASP DC

Commands used:
$ mvn [Link]:dependency-check-maven:check

Fig 6.2.2: OWASP DC output before updating .jar files

Fig 6.2.3: OWASP DC output after updating .jar files

Snyk:
Snyk found no vulnerabilities.

Fig 6.3: Snyk Output

ACKNOWLEDGMENT
I would like to thank Prof. Dario Di Nucci for giving me the opportunity to conduct the above analysis. I would also
like to thank Abdul Wasif, Mohammed Aziz, Franco Merenda and my seniors for always supporting and motivating
me.
