etheleon/omics

Graph Data Model

workflow

Create omics DB

Clone the repository.

$ git clone --recursive git@github.com:etheleon/omics.git

Batch import the data into a single <database.db> file.

$ ./configure -d=contig -d=metabolism -d=taxonomy -t=10 \
    -x=$HOME/db/taxonomy \
    -j=$HOME/local/neo4j-community-2.2.2/bin/neo4j-import \
    -c=$HOME/contigs \
    -ftp --user=<keggFTP username> --password=<keggFTP password>

Dependencies

Software   Version / Packages / etc
Perl       > 5.10
R          > 3.1.2, with the packages dplyr, igraph, XML and magrittr
neo4j      2.2.3 (requires Java; JAVA_HOME has to be defined in your $HOME/.bashrc, otherwise neo4j-import will not work)
Docker     any version
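
The requirements above can be checked before running ./configure. The following is a minimal sketch, not part of the repository: the function names check_deps and check_java_home are hypothetical, and it only tests that the commands exist and that JAVA_HOME is set, not that the versions match the table.

```shell
# Report every required command that is not on the search path.
# Returns non-zero if at least one is missing.
check_deps() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

# neo4j-import relies on JAVA_HOME, so fail early if it is empty or unset.
check_java_home() {
    if [ -z "${JAVA_HOME:-}" ]; then
        echo "JAVA_HOME is not set" >&2
        return 1
    fi
}

# Example call (the command names are the usual executables for the tools listed above):
#   check_deps perl Rscript java docker && check_java_home
```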

External datasets

  • NCBI taxonomy (download on your own)
  • KEGG FTP (optional)

Reading and serving an existing Omics DB (with Docker)

The following highlights how you will set up MetamapsDB using Docker.

(Optional) Downloading DB

The following describes how to download and run the specific omicsDB mentioned in my thesis.

  • Database created by the ./configure script
  • s3cmd - for downloading the DB from DigitalOcean

The files are stored in DigitalOcean Spaces.

Download S3 CLI tool

pip install s3cmd

Authenticate

(contact the author for the ACCESS_KEY and SECRET)

s3cmd --configure

# namespace
# sgp1.digitaloceanspaces.com

# URL template
# %(bucket)s.nyc3.digitaloceanspaces.com

Refer to the docs for more about using s3cmd

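ROOTDIR=$HOME/metamapsdb
mkdir -p $ROOTDIR
BUCKET=metamaps
KEY=neo4j/allKOS_fullnr.tar.gz.gpg
# Do not call this variable PATH: that would overwrite the shell's
# command search path and s3cmd would no longer be found
S3_URI=s3://${BUCKET}/${KEY}
s3cmd get $S3_URI $ROOTDIR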

The archive is encrypted, so it has to be decrypted before decompressing.

# password is found in chapter1's .lyx file
# work in the directory the archive was downloaded to
cd $ROOTDIR
FILE=allKOS_fullnr.tar.gz
gpg --output $FILE --decrypt ${FILE}.gpg

tar zvxf $FILE
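
Before extracting, it can help to confirm which top-level directory the archive unpacks to, since the Docker step expects a directory named allKOS_fullnr. A minimal sketch, assuming a gzip-compressed tarball; the function name archive_top_dirs is hypothetical:

```shell
# Print the distinct top-level entries of a .tar.gz archive without extracting it.
archive_top_dirs() {
    # tar tzf lists member paths; strip a leading "./", keep the first
    # path component, drop empty lines, and de-duplicate.
    tar tzf "$1" | sed 's|^\./||' | cut -d/ -f1 | grep . | sort -u
}

# Example call:
#   archive_top_dirs allKOS_fullnr.tar.gz
```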

Run the Docker container

# Set location to where you've downloaded the
# metamaps database
DATA=$HOME/metamapsdb/allKOS_fullnr/

# Start NEO4J
docker run \
    --name omics \
    --publish=7474:7474 --publish=7687:7687 \
    --volume=$DATA:/data \
    etheleon/omics-neo4j-container

The database has to be mounted into the container, so make sure the mounted directory has the following folder structure, where graph.db is the database output.

/DATA
└── graph.db
    ├── bad.log
    ├── index
    ├── messages.log
    ├── neostore
    ├── neostore.counts.db.a
    ├── neostore.counts.db.b
    ├── neostore.id
    ├── neostore.labeltokenstore.db
    ├── neostore.labeltokenstore.db.id
    ├── neostore.labeltokenstore.db.names
    ├── neostore.labeltokenstore.db.names.id
    ├── neostore.nodestore.db
    ├── neostore.nodestore.db.id
    ├── neostore.nodestore.db.labels
    ├── neostore.nodestore.db.labels.id
    ├── neostore.propertystore.db
    ├── neostore.propertystore.db.arrays
    ├── neostore.propertystore.db.arrays.id
    ├── neostore.propertystore.db.id
    ├── neostore.propertystore.db.index
    ├── neostore.propertystore.db.index.id
    ├── neostore.propertystore.db.index.keys
    ├── neostore.propertystore.db.index.keys.id
    ├── neostore.propertystore.db.strings
    ├── neostore.propertystore.db.strings.id
    ├── neostore.relationshipgroupstore.db
    ├── neostore.relationshipgroupstore.db.id
    ├── neostore.relationshipstore.db
    ├── neostore.relationshipstore.db.id
    ├── neostore.relationshiptypestore.db
    ├── neostore.relationshiptypestore.db.id
    ├── neostore.relationshiptypestore.db.names
    ├── neostore.relationshiptypestore.db.names.id
    ├── neostore.schemastore.db
    ├── neostore.schemastore.db.id
    ├── neostore.transaction.db.21
    ├── neostore.transaction.db.22
    ├── neostore.transaction.db.23
    ├── neostore.transaction.db.24
    ├── neostore.transaction.db.25
    ├── neostore.transaction.db.26
    ├── neostore.transaction.db.27
    ├── rrd
    ├── schema
    └── store_lock
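
The layout above can be verified before starting the container. This is a minimal sketch with a hypothetical function name, check_graph_db; it only checks for the graph.db directory and the neostore file shown in the listing, not for every store file.

```shell
# Check that a data directory contains graph.db with a neostore file inside.
check_graph_db() {
    data_dir=${1%/}    # tolerate a trailing slash, as in DATA above
    if [ ! -d "$data_dir/graph.db" ]; then
        echo "no graph.db directory under $data_dir" >&2
        return 1
    fi
    if [ ! -e "$data_dir/graph.db/neostore" ]; then
        echo "graph.db under $data_dir has no neostore file" >&2
        return 1
    fi
    echo "ok: $data_dir/graph.db"
}

# Example call:
#   check_graph_db "$HOME/metamapsdb/allKOS_fullnr/"
```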

After starting the container, navigate to 127.0.0.1:7474 in your machine's browser.
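
The server may take a while to come up, so the address can also be polled from a second terminal. A minimal sketch, assuming curl is installed; the function name wait_for_neo4j and its default retry values are hypothetical:

```shell
# Poll a URL until it answers or the attempts run out.
# Arguments: URL, number of attempts, seconds to sleep between attempts.
wait_for_neo4j() {
    url=${1:-http://127.0.0.1:7474/}
    attempts=${2:-30}
    delay=${3:-2}
    i=1
    while [ "$i" -le "$attempts" ]; do
        # -s: quiet, -f: treat HTTP errors (>= 400) as failure, -o: discard the body
        if curl -s -f -o /dev/null --max-time 5 "$url"; then
            echo "Neo4j is answering at $url"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "no response from $url after $attempts attempts" >&2
    return 1
}

# Example call:
#   wait_for_neo4j http://127.0.0.1:7474/ 30 2
```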

Running without Docker

You'll need to edit the config to point to the location of the database.

For example, in the settings file neo4j-server.properties, change org.neo4j.server.database.location=/graph/db to org.neo4j.server.database.location=</path2/meta4j/out/database/database.db>.
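
The same change can be scripted. A minimal sketch, not part of the repository: the function name set_db_location is hypothetical, and the location of neo4j-server.properties depends on your installation. It rewrites the existing property line, or appends one if the key is absent.

```shell
# Set org.neo4j.server.database.location in a properties file.
# Arguments: path to the properties file, path to the database.
set_db_location() {
    props_file=$1
    db_path=$2
    key='org.neo4j.server.database.location'
    tmp=$(mktemp) || return 1
    # Note: awk -v interprets backslash escapes, so this assumes db_path has none.
    awk -v key="$key" -v val="$db_path" '
        index($0, key "=") == 1 { print key "=" val; found = 1; next }
        { print }
        END { if (!found) print key "=" val }
    ' "$props_file" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Write back through cat so the original file keeps its permissions.
    cat "$tmp" > "$props_file"
    rm -f "$tmp"
}

# Example call (both paths are placeholders):
#   set_db_location /path/to/neo4j-server.properties /path/to/database.db
```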

About

Initialisation of metagenomics + metatranscriptomics + taxonomy + metabolism data into a graph DB
