The NCBI SARS-CoV-2 Variant Calling (SC2VC) Pipeline allows calling high-confidence variants from SARS-CoV-2 NGS data in a standardized format

ncbi/sars2variantcalling

NCBI SARS-CoV-2 Variant Calling (SC2VC) Pipeline

Contact:

email: [email protected]


The NCBI SARS-CoV-2 Variant Calling (SC2VC) Pipeline is a collection of Snakemake workflows for calling high-confidence variants from SARS-CoV-2 NGS data in a standardized format. An overview of the pipeline is available here.

Results are available via the following NCBI resources: NCBI SARS-COV-2 at AWS, NCBI SARS-COV-2 at GCP

Also available as: Weekly ACTIV-TRACE reports

Content:

  • Wrapper script to run workflow:
  • Reference sequence and snpEff databases:
    • toolbox/static/reference/
    • toolbox/static/snpEff/

Dependencies:

The following third-party tools are assumed to be installed in the user environment:

For the comprehensive list of tools used and their versions, please refer to the Dockerfile.

Reference sequence indexing

The pipeline expects the reference sequence to be indexed. To create the indexes, run:

gatk CreateSequenceDictionary -R toolbox/static/reference/NC_045512.2.fa
samtools faidx toolbox/static/reference/NC_045512.2.fa
gatk IndexFeatureFile --input toolbox/static/reference/NC_045512.2.known_sites.vcf
hisat2-build toolbox/static/reference/NC_045512.2.fa toolbox/static/reference/NC_045512.2.fa
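
After running the indexing commands, it can be useful to confirm that the index files exist before starting a workflow. The sketch below is not part of the pipeline: `check_indexes` is a hypothetical helper, and the file names it looks for are assumptions based on each tool's default output naming (`.dict` from `gatk CreateSequenceDictionary`, `.fai` from `samtools faidx`, `.idx` from `gatk IndexFeatureFile`, and `.1.ht2` as the first of several files from `hisat2-build`). Adjust the names if your tool versions behave differently.

```shell
# Hypothetical helper (not part of the pipeline): report missing index files.
# File names are assumed from the default output naming of each tool:
#   NC_045512.2.dict                 gatk CreateSequenceDictionary
#   NC_045512.2.fa.fai               samtools faidx
#   NC_045512.2.known_sites.vcf.idx  gatk IndexFeatureFile
#   NC_045512.2.fa.1.ht2             hisat2-build (first of several .ht2 files)
check_indexes() {
    ref_dir="${1:-toolbox/static/reference}"
    base="NC_045512.2"
    missing=0
    for f in \
        "${base}.dict" \
        "${base}.fa.fai" \
        "${base}.known_sites.vcf.idx" \
        "${base}.fa.1.ht2"
    do
        # -s: the file exists and is not empty
        if [ ! -s "${ref_dir}/${f}" ]; then
            echo "missing: ${ref_dir}/${f}"
            missing=1
        fi
    done
    # Exit status 0 when all files are present, 1 otherwise
    return "${missing}"
}
```

Calling `check_indexes` with no argument checks `toolbox/static/reference`; pass a different directory as the first argument if the reference files live elsewhere.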

Usage:

  run.sh --platform --accession|--list [--instrument] [--conf] [--workdir] [--help]
    --platform:   platform, choices = [illumina, ont, pacbio, genbank], default = illumina
    --instrument: applicable to ONT platform only, use with single accession option, default = PromethION
    --accession:  single accession, optional, either --accession or --list must be specified
    --list:       file with accession list, optional, either --accession or --list must be specified
                  NOTE: ONT file list is expected to have at least two columns: <acc> <instrument>
    --conf:       optional custom configfile to override default
    --workdir:    optional working directory, default = ./workdir
    --help:       to display this help

Usage examples:

  1. ILLUMINA
    toolbox/workflow/run.sh --accession SRR21830388
    # in case of using docker image
    /pipelines/toolbox/workflow/run.sh --accession SRR21830388 --conf /pipelines/toolbox/workflow/extra.config.yaml 
  2. OXFORD-NANOPORE
    toolbox/workflow/run.sh --accession SRR15965069 --platform ont --instrument GridION
  3. PacBio
    toolbox/workflow/run.sh --accession SRR14895419 --platform pacbio
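
The usage help notes that an ONT accession list needs at least two columns, `<acc> <instrument>`. The sketch below shows one way to check such a file before passing it to `--list`. `validate_ont_list` is a hypothetical helper, not part of the pipeline, and it assumes the columns are separated by whitespace, which the help text does not state explicitly.

```shell
# Hypothetical helper (not part of the pipeline): check an ONT accession list.
# Assumption: columns are whitespace-separated, in the order <acc> <instrument>.
validate_ont_list() {
    list_file="$1"
    # Report every non-empty line with fewer than two fields;
    # the exit status is 1 if any such line was found, 0 otherwise.
    awk 'NF > 0 && NF < 2 {
             printf "line %d: expected <acc> <instrument>, got: %s\n", NR, $0
             bad = 1
         }
         END { exit bad + 0 }' "$list_file"
}

# Example usage, with the accession and instrument from the ONT example above:
#   printf 'SRR15965069 GridION\n' > ont.list
#   validate_ont_list ont.list && toolbox/workflow/run.sh --platform ont --list ont.list
```

Running the check first means a malformed list fails immediately with a line number instead of partway through a workflow run.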
