18/09/2019 GitHub - guigolab/bamstats: A command line tool to compute mapping statistics from a BAM file
guigolab / bamstats
A command line tool to compute mapping statistics from a BAM file
223 commits 3 branches 7 releases 1 contributor BSD-3-Clause
emi80 Bump version to 0.3.3 for release Latest commit a97df7c 14 days ago
Bamstats
build passing coverage 85%
Bamstats is a command line tool written in Go for computing mapping statistics from a BAM file.
Installation instructions
Use one of the following methods to install Bamstats.
Install a released version
The easiest way is to download a pre-compiled binary from GitHub releases. The following example installs version 0.3.2
on 64-bit Linux:
export VERSION=0.3.2 OS=linux ARCH=amd64 BIN=/usr/local/bin
wget -O - [Link]
v${VERSION}-${OS}-${ARCH}.tar.bz2 | tar xj --strip-components 3 -C ${BIN}
Install the latest version with go
The following command will install the latest version from the master branch into $GOPATH:
go get [Link]/guigolab/bamstats/cmd/bamstats
Provided statistics
Bamstats can currently compute the following mapping statistics:
general
genome coverage
RNA-seq
General
The general mapping statistics include:
Total number of reads
Number of unmapped reads
Number of mapped reads grouped by number of multimaps (NH tag in the BAM file)
Number of mappings
Ratio of mappings vs mapped reads
If the data is paired-end, a section for read pairs is also reported. In addition to the above metrics, this section contains a
map from each insert size to the number of reads supporting it.
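The relationship between the last two metrics can be illustrated with a minimal Go sketch. It is not taken from the Bamstats source: the function name `MappingsRatio` and the input map are hypothetical, and it assumes that a read whose NH tag equals n contributes n mappings.

```go
package main

import "fmt"

// MappingsRatio is a hypothetical helper, not part of Bamstats.
// readsByNH maps an NH value (number of reported alignments for a read)
// to the number of mapped reads carrying that value.
// Assumption: a read with NH = n contributes n mappings.
func MappingsRatio(readsByNH map[int]uint64) (mapped, mappings uint64, ratio float64) {
	for nh, reads := range readsByNH {
		mapped += reads
		mappings += uint64(nh) * reads
	}
	if mapped == 0 {
		// Avoid dividing by zero when there are no mapped reads.
		return 0, 0, 0
	}
	return mapped, mappings, float64(mappings) / float64(mapped)
}

func main() {
	// Illustrative counts only: 90 uniquely mapped reads and
	// 10 reads that map to two locations each.
	mapped, mappings, ratio := MappingsRatio(map[int]uint64{1: 90, 2: 10})
	fmt.Printf("mapped=%d mappings=%d ratio=%.2f\n", mapped, mappings, ratio)
}
```

A ratio close to 1 indicates that most mapped reads are uniquely mapped; larger values indicate more multimapping.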
Genome coverage
The genome coverage statistics are computed for RNA-seq data and include counts for the following genomic regions:
exon
intron
exonic_intronic
intergenic
others
The above metrics are computed for continuous and split mapped reads. An aggregated total is computed across elements
and read types too.
The --uniq (or -u) command line flag additionally reports the genome coverage statistics for uniquely mapped reads.
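The aggregation across elements and read types could be sketched in Go as follows. The `ElementCounts` type and its methods are hypothetical and only mirror the region names listed above; they are not the data structures used by Bamstats, and the counts are made up.

```go
package main

import "fmt"

// ElementCounts is a hypothetical container for read counts per genomic
// region. The fields mirror the region names listed in this section.
type ElementCounts struct {
	Exon, Intron, ExonicIntronic, Intergenic, Others uint64
}

// Add returns the element-wise sum of two sets of counts, for example
// continuous plus split mapped reads.
func (a ElementCounts) Add(b ElementCounts) ElementCounts {
	return ElementCounts{
		Exon:           a.Exon + b.Exon,
		Intron:         a.Intron + b.Intron,
		ExonicIntronic: a.ExonicIntronic + b.ExonicIntronic,
		Intergenic:     a.Intergenic + b.Intergenic,
		Others:         a.Others + b.Others,
	}
}

// Sum returns the total number of reads across all regions.
func (a ElementCounts) Sum() uint64 {
	return a.Exon + a.Intron + a.ExonicIntronic + a.Intergenic + a.Others
}

func main() {
	// Illustrative counts only.
	continuous := ElementCounts{Exon: 700, Intron: 150, ExonicIntronic: 50, Intergenic: 80, Others: 20}
	split := ElementCounts{Exon: 250, ExonicIntronic: 30, Others: 5}

	total := continuous.Add(split)
	fmt.Printf("per-element totals: %+v\n", total)
	fmt.Printf("all reads: %d\n", total.Sum())
}
```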
RNA-seq
The RNA-seq statistics follow IHEC recommendations for RNA-seq data quality metrics. They include counts for the following
regions:
intergenic (different from coverage stats)
ribosomal RNA (rRNA)
They also include fractional metrics for the following read types:
mapped
intergenic
rRNA
duplicates
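As a rough illustration of what a fractional metric means here, the Go sketch below divides each read-type count by a total. The `Fractions` function is hypothetical, and the choice of the total number of reads as the denominator is an assumption; the README does not state which denominator Bamstats uses.

```go
package main

import "fmt"

// Fractions is a hypothetical helper, not part of Bamstats.
// It divides each count by total and returns the resulting fractions.
// Assumption: total is the total number of reads in the BAM file.
func Fractions(counts map[string]uint64, total uint64) map[string]float64 {
	out := make(map[string]float64, len(counts))
	for name, n := range counts {
		if total == 0 {
			// Report 0 rather than dividing by zero.
			out[name] = 0
			continue
		}
		out[name] = float64(n) / float64(total)
	}
	return out
}

func main() {
	// Illustrative counts only.
	counts := map[string]uint64{
		"mapped":     9000,
		"intergenic": 400,
		"rRNA":       150,
		"duplicates": 1200,
	}
	fractions := Fractions(counts, 10000)

	// Print in a fixed order, since Go map iteration order is random.
	for _, name := range []string{"mapped", "intergenic", "rRNA", "duplicates"} {
		fmt.Printf("%-10s %.4f\n", name, fractions[name])
	}
}
```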
Output examples:
Some examples of the program output can be found in the data folder of this GitHub repository:
General Stats
Genomic coverage stats
Genomic coverage stats with uniquely mapped reads (Note that the coverageUniq stats are reported as an additional
JSON object)
RNA-seq stats
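Since the output is JSON, a consumer can check whether the additional `coverageUniq` object is present before reading it. The Go sketch below makes that check on a heavily simplified, made-up document: the `TopLevelKeys` function is hypothetical, and both the `coverage` key and the placement of `coverageUniq` at the top level are assumptions. The files in the data folder show the actual layout.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// TopLevelKeys is a hypothetical helper. It decodes a JSON object and
// returns its top-level keys in sorted order, leaving values undecoded.
func TopLevelKeys(data []byte) ([]string, error) {
	var obj map[string]json.RawMessage
	if err := json.Unmarshal(data, &obj); err != nil {
		return nil, err
	}
	keys := make([]string, 0, len(obj))
	for k := range obj {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys, nil
}

func main() {
	// Made-up, simplified document; the real schema may differ.
	sample := []byte(`{"coverage": {}, "coverageUniq": {}}`)

	keys, err := TopLevelKeys(sample)
	if err != nil {
		fmt.Println("invalid JSON:", err)
		return
	}
	for _, k := range keys {
		if k == "coverageUniq" {
			fmt.Println("stats for uniquely mapped reads are present")
		}
	}
}
```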
License
This software is released under a BSD-style license. Please check the LICENSE file for more details.