Scripts and tutorials for using dbSNP data
dbSNP build release JSON files are available on the FTP site.
├── Variation Services # Tutorial for working with SPDI Variation Service
├── eUtils.ipynb # Sample dbSNP eUtils query
├── # Script using eUtils to get rs flanking sequences
├── MafGraph.ipynb # eUtils query and MAF parsing and graphing
├── # parse dbSNP RS JSON object and extract the rs annotation using Hadoop
├── # parse dbSNP RS JSON object and extract clinical rs data using Hadoop
├── # parse dbSNP RS JSON object and extract rs merge history using Hadoop
├── # parse dbSNP RS JSON object and extract rs mapping information (i.e., position)
├── refsnp-sample.json.gz # Sample data containing one RefSNP JSON example for rs268 for testing
├── # Sample Python script to parse RefSNP (rs) JSON object. The script
| produces a tab-delimited output containing the assembly version, sequence ID,
| position, reference allele, variant allele and ClinVar clinical significance,
| if available. NOTE: this script was tested using Python 2.7.12.
├── # Extract allele information: position, plus the mRNA and protein SPDI reference allele (deleted) and variant allele (inserted) sequences
├── # Extract submission information (ss, local_snp_id, etc.)
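
As a rough illustration of the tab-delimited output described for the sample parsing script, the sketch below walks one RefSNP JSON object and emits the assembly, sequence ID, position, reference allele, variant allele and clinical significance. It is not the repository's script. The field names (`primary_snapshot_data`, `placements_with_allele`, `is_ptlp`, `allele_annotations`, and so on) and the pairing of `allele_annotations` with `alleles` by index are assumptions about the RefSNP JSON layout and should be checked against a real record such as `refsnp-sample.json.gz`. The function name `refsnp_rows` and the embedded record are hypothetical, with made-up values. The sketch is written for Python 3.

```python
import json

# Toy record shaped like a RefSNP JSON object. Every value here is made up
# for illustration; the field names are assumptions about the real layout.
SAMPLE_LINE = json.dumps({
    "refsnp_id": "0",
    "primary_snapshot_data": {
        "placements_with_allele": [
            {
                "is_ptlp": True,  # assumed flag marking the preferred placement
                "placement_annot": {
                    "seq_id_traits_by_assembly": [{"assembly_name": "ASSEMBLY.X"}]
                },
                "alleles": [
                    {"allele": {"spdi": {"seq_id": "SEQ_1", "position": 99,
                                         "deleted_sequence": "A",
                                         "inserted_sequence": "A"}}},
                    {"allele": {"spdi": {"seq_id": "SEQ_1", "position": 99,
                                         "deleted_sequence": "A",
                                         "inserted_sequence": "G"}}},
                ],
            }
        ],
        # Assumed to be listed in the same order as "alleles" above.
        "allele_annotations": [
            {"clinical": []},
            {"clinical": [{"clinical_significances": ["benign"]}]},
        ],
    },
})


def refsnp_rows(line):
    """Return tab-delimited rows for the variant alleles of one RefSNP record."""
    record = json.loads(line)
    snapshot = record.get("primary_snapshot_data", {})
    annotations = snapshot.get("allele_annotations", [])
    rows = []
    for placement in snapshot.get("placements_with_allele", []):
        if not placement.get("is_ptlp"):
            continue  # keep only the preferred placement
        traits = placement.get("placement_annot", {}).get(
            "seq_id_traits_by_assembly", [])
        assembly = traits[0].get("assembly_name", "") if traits else ""
        for index, entry in enumerate(placement.get("alleles", [])):
            spdi = entry.get("allele", {}).get("spdi")
            if spdi is None:
                continue  # allele without an SPDI description
            if spdi["deleted_sequence"] == spdi["inserted_sequence"]:
                continue  # deleted == inserted means this is the reference allele
            clinical = annotations[index].get("clinical", []) if index < len(annotations) else []
            significance = ",".join(sorted(
                {s for c in clinical for s in c.get("clinical_significances", [])}))
            rows.append("\t".join([
                assembly,
                spdi["seq_id"],
                str(spdi["position"]),      # SPDI positions are 0-based
                spdi["deleted_sequence"],   # reference allele
                spdi["inserted_sequence"],  # variant allele
                significance,               # empty string when no clinical data
            ]))
    return rows


if __name__ == "__main__":
    for row in refsnp_rows(SAMPLE_LINE):
        print(row)
```

To try it on real data, the same function could be applied to each line read with `gzip.open("refsnp-sample.json.gz", "rt")`, assuming the file holds one JSON object per line.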
Run and explore the notebooks interactively on a Binder server. It may take a few minutes for the Binder server to start up.
| Notebook | Description | Binder |
| --- | --- | --- |
| eUtils.ipynb | dbSNP eUtils query | |
| MafGraph.ipynb | eUtils query and MAF parsing and graphing | |
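
For readers who want a feel for an eUtils query before opening the notebooks, here is a minimal sketch that only builds a request URL for the NCBI E-utilities `esummary` endpoint against the `snp` database. It is not taken from `eUtils.ipynb`, whose code may use a different endpoint or client library. The helper name `esummary_url` is hypothetical, and rs268 is used only because it is the sample RefSNP mentioned above. No network request is made.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esummary_url(rs_numbers):
    """Build an esummary URL for one or more rs numbers (digits only, no 'rs' prefix)."""
    params = {
        "db": "snp",                                  # the dbSNP Entrez database
        "id": ",".join(str(n) for n in rs_numbers),   # comma-separated ID list
        "retmode": "json",                            # ask for a JSON response
    }
    # safe="," keeps the commas in the ID list readable in the final URL.
    return EUTILS_BASE + "/esummary.fcgi?" + urlencode(params, safe=",")


if __name__ == "__main__":
    print(esummary_url([268]))
```

The resulting URL can be opened in a browser or fetched with any HTTP client. NCBI asks heavy users to limit request rates and supply an API key, so a real script would likely add those parameters.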