These wrappers are copyright 2010-2018 by Peter Cock (James Hutton Institute, UK) and additional contributors including Edward Kirton, John Chilton, Nicola Soranzo, Jim Johnson, Bjoern Gruening, Caleb Easterly, Anton Nekrutenko and Anthony Bretaudeau. See the licence text below.
Note these wrappers do not work with the NCBI 'legacy' BLAST suite written in C
(e.g. binary name blastall), only with its replacement BLAST+, which is
written in C++ (e.g. binary name blastn).
Note that these wrappers (and the associated datatypes) were originally
distributed as part of the main Galaxy repository, but as of August 2012
moved to the Galaxy Tool Shed as ncbi_blast_plus (and blast_datatypes).
My thanks to Dannon Baker from the Galaxy development team for his assistance
with this.
These wrappers are available from the Galaxy Tool Shed at: http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus
In-development test releases are available from the Test Tool Shed at: http://testtoolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus/
Please cite the following paper:
NCBI BLAST+ integrated into Galaxy. P.J.A. Cock, J.M. Chilton, B. Gruening, J.E. Johnson, N. Soranzo. GigaScience 2015, 4:39. https://doi.org/10.1186/s13742-015-0080-7
You should also cite the NCBI BLAST+ tools:
BLAST+: architecture and applications. C. Camacho et al. BMC Bioinformatics 2009, 10:421. https://doi.org/10.1186/1471-2105-10-421
Galaxy should be able to automatically install the dependencies, i.e. the
BLAST+ binaries and the blast_datatypes repository which defines the
BLAST XML file format (blastxml), protein and nucleotide BLAST databases
(blastdbp and blastdbn), and so on.
See the configuration notes below.
For those not using Galaxy's automated installation from the Tool Shed, put
the XML and Python files in the tools/ncbi_blast_plus/ folder and add the
XML files to your tool_conf.xml as normal. For example, use:
<section name="NCBI BLAST+" id="ncbi_blast_plus_tools">
  <tool file="ncbi_blast_plus/ncbi_blastn_wrapper.xml" />
  <tool file="ncbi_blast_plus/ncbi_blastp_wrapper.xml" />
  <tool file="ncbi_blast_plus/ncbi_blastx_wrapper.xml" />
  <tool file="ncbi_blast_plus/ncbi_tblastn_wrapper.xml" />
  <tool file="ncbi_blast_plus/ncbi_tblastx_wrapper.xml" />
  <tool file="ncbi_blast_plus/ncbi_makeblastdb.xml" />
  <tool file="ncbi_blast_plus/ncbi_dustmasker_wrapper.xml" />
  <tool file="ncbi_blast_plus/ncbi_blastdbcmd_wrapper.xml" />
  <tool file="ncbi_blast_plus/ncbi_blastdbcmd_info.xml" />
  <tool file="ncbi_blast_plus/ncbi_rpsblast_wrapper.xml" />
  <tool file="ncbi_blast_plus/ncbi_rpstblastn_wrapper.xml" />
  <tool file="ncbi_blast_plus/ncbi_makeprofiledb.xml" />
  <tool file="ncbi_blast_plus/blastxml_to_tabular.xml" />
</section>
You will also need to install blast_datatypes from the Tool Shed. This
defines the BLAST XML file format (blastxml), BLAST databases, etc.
As with an automated installation, you must also tell Galaxy about any
system level BLAST databases using the tool-data/blastdb*.loc
files. Also merge the tool-data/tool_data_table_conf.xml.sample contents
into your tool_data_table_conf.xml file.
You must install the NCBI BLAST+ standalone tools somewhere on the system path. Currently the unit tests are written using BLAST+ 2.2.30.
Run the functional tests (adjusting the section identifier to match your
tool_conf.xml.sample file):
./run_tests.sh -sid NCBI_BLAST+-ncbi_blast_plus_tools
You must tell Galaxy about any system level BLAST databases using the
configuration files blastdb.loc (nucleotide databases like NT), blastdb_p.loc
(protein databases like NR), and blastdb_d.loc (protein domain databases
like CDD or SMART), which are located in the tool-data/ folder. Sample
files are included which explain the tab-based format to use.
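To sanity check a .loc file before restarting Galaxy, something like the following Python sketch may help. It assumes each entry has three tab-separated fields (a unique identifier, a display caption, and the database base path) and that lines starting with # are comments; check the bundled sample files for the authoritative format. The function name and the example path are hypothetical.

```python
import io


def parse_blast_loc(handle):
    """Yield (unique_id, caption, base_path) tuples from a blastdb*.loc file.

    Assumes three tab-separated fields per entry; see the sample files.
    """
    for line_number, line in enumerate(handle, start=1):
        line = line.rstrip("\r\n")
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 3:
            raise ValueError(
                "line %d: expected 3 tab-separated fields, found %d"
                % (line_number, len(fields))
            )
        yield tuple(field.strip() for field in fields)


# Hypothetical entry, for illustration only.
example = io.StringIO(
    "# unique_id<TAB>caption<TAB>base_name_path\n"
    "nt_example\tNCBI NT (example)\t/data/blastdb/nt\n"
)
for entry in parse_blast_loc(example):
    print(entry)
```

A common mistake is using spaces instead of tabs, which this check reports as a wrong field count.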
You can download the NCBI provided databases as tar-balls from here:
- ftp://ftp.ncbi.nlm.nih.gov/blast/db/ (nucleotide and protein databases like NT and NR)
- ftp://ftp.ncbi.nih.gov/pub/mmdb/cdd/little_endian/ (domain databases like CDD)
If using the optional taxonomy columns, you will also need to download the
NCBI taxonomy files (taxdb.btd and taxdb.bti from taxdb.tar.gz on
the BLAST database FTP site). Currently explicit version tracking of the
taxonomy is not supported, and in order to use this you must set the
$BLASTDB environment variable to include the path where you unzipped the
taxonomy files. If this is not done, the taxonomy columns like species name
will appear as N/A in the tabular output.
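If the taxonomy columns come back as N/A, a quick check along these lines can confirm whether the two taxonomy files are visible via $BLASTDB. This is only a sketch: it assumes $BLASTDB may hold several directories joined with the platform's path separator, and the function name is hypothetical.

```python
import os

TAXONOMY_FILES = ("taxdb.btd", "taxdb.bti")


def find_taxonomy_dir(blastdb_value):
    """Return the first directory in a BLASTDB-style value that holds
    both taxonomy files, or None if no directory does."""
    for directory in blastdb_value.split(os.pathsep):
        if not directory:
            continue
        if all(
            os.path.isfile(os.path.join(directory, name))
            for name in TAXONOMY_FILES
        ):
            return directory
    return None


if __name__ == "__main__":
    found = find_taxonomy_dir(os.environ.get("BLASTDB", ""))
    if found is None:
        print("taxdb.btd and taxdb.bti not found via $BLASTDB")
    else:
        print("Taxonomy files found in", found)
```

Note the check must be run in the same environment as the Galaxy jobs, since that is where $BLASTDB needs to be set.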
The BLAST+ binaries support multi-threaded operation, which is handled via the
$GALAXY_SLOTS environment variable. This should be set automatically by
Galaxy via your job runner settings, which allows you to (for example) allocate
four cores to each BLAST job.
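As a rough illustration of how a thread count can be taken from $GALAXY_SLOTS and passed to BLAST+ via its -num_threads option, consider this Python sketch. It is not the wrappers' own code: the helper names, the file names, and the fallback of one thread when the variable is unset or invalid are assumptions made for the example.

```python
import os


def blast_threads(environ=None, default=1):
    """Return the thread count from GALAXY_SLOTS, or default if the
    variable is missing, not an integer, or not positive."""
    environ = os.environ if environ is None else environ
    try:
        slots = int(environ.get("GALAXY_SLOTS", ""))
    except ValueError:
        return default
    return slots if slots > 0 else default


def build_blastn_command(query, database, output, environ=None):
    """Build a blastn argument list using the allocated thread count."""
    return [
        "blastn",
        "-query", query,
        "-db", database,
        "-out", output,
        "-num_threads", str(blast_threads(environ)),
    ]


# Simulate a job runner which allocated four cores to this job.
print(build_blastn_command("query.fasta", "nt", "hits.tabular",
                           {"GALAXY_SLOTS": "4"}))
```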
The BLAST+ wrappers also support high level parallelism by task
splitting if use_tasked_jobs = True is set in the config/galaxy.ini
configuration file (previously universe_wsgi.ini on older versions of
Galaxy). Essentially, the FASTA input query files are broken up into
batches of 1000 sequences, a separate BLAST child job is run for each chunk,
and then the BLAST output files are merged (in order). This is transparent
for the end user.
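The batching idea can be sketched in Python as follows. This is an illustration of splitting a FASTA file into fixed-size batches while preserving record order, not Galaxy's actual splitting code; the function name is hypothetical, and the small batch size in the example is only there to keep the output short.

```python
import io


def split_fasta(handle, batch_size=1000):
    """Yield lists of FASTA records (as strings), with at most
    batch_size records per list, in the original order."""
    batch = []
    record = []
    for line in handle:
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if record:
                batch.append("".join(record))
                if len(batch) == batch_size:
                    yield batch
                    batch = []
            record = [line]
        elif record:
            # Sequence line belonging to the current record; any text
            # before the first header is ignored.
            record.append(line)
    if record:
        batch.append("".join(record))
    if batch:
        yield batch


# Five tiny records split into batches of two.
text = "".join(">seq%d\nACGT\n" % i for i in range(5))
for number, batch in enumerate(split_fasta(io.StringIO(text), batch_size=2)):
    print("batch", number, "holds", len(batch), "records")
```

Because the batches are yielded in input order, concatenating the per-batch results in the same order reproduces the ordering of a single unsplit run.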
The wrappers now follow the Galaxy convention of using the underlying tool version plus a Galaxy specific suffix, which is reset to zero with each new BLAST+ version:
| Version | Changes |
| 2.14.1+galaxy2 | |
| 2.14.1+galaxy1 | |
| 2.14.1+galaxy0 | |
| 2.10.1+galaxy3 | |
| 2.10.1+galaxy2 | |
| 2.10.1+galaxy1 | |
| 2.10.1+galaxy0 | |
| 2.9.0+galaxy0 | |
| 2.7.1+galaxy0 | |
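When comparing wrapper versions in a script, note that a plain string sort would place 2.9.0 after 2.14.1. The following Python sketch splits a version of the form used above into the BLAST+ version and the Galaxy suffix so they compare numerically. The function name is hypothetical.

```python
import re

VERSION_RE = re.compile(r"^(\d+(?:\.\d+)*)\+galaxy(\d+)$")


def parse_wrapper_version(version):
    """Split e.g. '2.14.1+galaxy2' into ((2, 14, 1), 2)."""
    match = VERSION_RE.match(version)
    if match is None:
        raise ValueError(
            "not in '<BLAST version>+galaxy<N>' form: %r" % version
        )
    blast = tuple(int(part) for part in match.group(1).split("."))
    return blast, int(match.group(2))


versions = ["2.9.0+galaxy0", "2.14.1+galaxy2",
            "2.10.1+galaxy3", "2.14.1+galaxy0"]
# Oldest first, comparing the numeric parts rather than the text.
print(sorted(versions, key=parse_wrapper_version))
```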
Prior releases used a self-contained version number (deliberately kept low to avoid any confusion with the NCBI BLAST version numbers):
| Version | Changes |
| v0.3.3 | |
| v0.3.2 | |
| v0.3.1 | |
| v0.3.0 | |
| v0.2.02 | |
| v0.2.01 | |
| v0.2.00 | |
| v0.1.08 | |
| v0.1.07 | |
| v0.1.06 | |
| v0.1.05 | |
| v0.1.04 | |
| v0.1.03 | |
| v0.1.02 | |
| v0.1.01 | |
| v0.1.00 | |
| v0.0.22 | |
| v0.0.21 | |
| v0.0.20 | |
| v0.0.19 | |
| v0.0.17 | |
| v0.0.16 | |
| v0.0.15 | |
| v0.0.14 | |
| v0.0.13 | |
| v0.0.12 | |
| v0.0.11 | |
You can file an issue at https://github.com/peterjc/galaxy_blast/issues or ask us on the Galaxy development list at http://lists.bx.psu.edu/listinfo/galaxy-dev
This script and related tools were originally developed on the 'tools' branch of the following Mercurial repository: https://bitbucket.org/peterjc/galaxy-central/
As of July 2013, development is continuing on a dedicated GitHub repository: https://github.com/peterjc/galaxy_blast
To push a release to the test or main "Galaxy Tool Shed", use the following
Planemo commands (which require that you have set your Tool Shed access details
in ~/.planemo.yml and that you have access rights on the Tool Shed):
$ planemo shed_update -t testtoolshed --check_diff tools/ncbi_blast_plus/ ...
or:
$ planemo shed_update -t toolshed --check_diff tools/ncbi_blast_plus/ ...
To just build and check the tar ball, use:
$ planemo shed_upload --tar_only tools/ncbi_blast_plus/
...
$ tar -tzf shed_upload.tar.gz
test-data/blastdb.loc
...
tools/ncbi_blast_plus/tool_dependencies.xml
$ tar -tzf shed_upload.tar.gz | wc -l
117
This simplifies ensuring a consistent set of files is bundled each time, including all the relevant test files.
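The same check on the tarball can be scripted. The sketch below uses Python's standard tarfile module to list the archive and report any expected files which are missing. The helper names are hypothetical, and the two required paths are simply the ones visible in the listing above.

```python
import os
import tarfile


def list_tarball(path):
    """Return the names of all entries in a gzipped tarball."""
    with tarfile.open(path, "r:gz") as archive:
        return archive.getnames()


def missing_entries(names, required):
    """Return the required names which are absent from the listing."""
    present = set(names)
    return [name for name in required if name not in present]


if __name__ == "__main__" and os.path.isfile("shed_upload.tar.gz"):
    names = list_tarball("shed_upload.tar.gz")
    print(len(names), "entries")
    print("missing:", missing_entries(names, [
        "test-data/blastdb.loc",
        "tools/ncbi_blast_plus/tool_dependencies.xml",
    ]))
```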
When updating the version of BLAST+, many of the sample data files used for the unit tests must be regenerated. This script automates that task:
$ tools/ncbi_blast_plus/update_test_files.sh
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.