ElasticBLAST is a new way to BLAST large numbers of queries, faster and on the cloud. Here are the top three reasons you should use ElasticBLAST:
1. ElasticBLAST can handle much LARGER queries!
ElasticBLAST can search query sets of hundreds to millions of sequences against BLAST databases of all sizes.
2. ElasticBLAST is FASTER
ElasticBLAST distributes your searches across multiple cloud instances to process them simultaneously. The ability to scale resources in this way allows you to process large numbers of queries in a shorter time than you could with BLAST+.
3. ElasticBLAST is EASY to run on the cloud
ElasticBLAST is easy to set up using our step-by-step instructions for Amazon Web Services (AWS) and Google Cloud Platform (GCP), and it allows you to leverage the power of the cloud. Once configured, it manages the software and database installation, partitions the BLAST workload among the instances, and deallocates cloud resources when the searches are done.
ElasticBLAST also selects the instance (i.e., machine) type for you based on database size. Of course, you can also choose the instance type manually if you prefer.
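To make the idea of partitioning a BLAST workload concrete, here is a minimal Python sketch that splits a FASTA query set into batches of bounded total length, so that each batch could be handed to a different instance. It is an illustration only, not ElasticBLAST's actual implementation: the function names `parse_fasta` and `make_batches`, the `max_residues` parameter, and the greedy batching strategy are all assumptions made for this example.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def parse_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)


def make_batches(records: Iterable[Record], max_residues: int) -> Iterator[List[Record]]:
    """Greedily group records so each batch holds at most max_residues letters.

    A single sequence longer than max_residues still gets a batch of its own.
    (Hypothetical strategy for illustration; not taken from ElasticBLAST.)
    """
    batch: List[Record] = []
    size = 0
    for header, seq in records:
        if batch and size + len(seq) > max_residues:
            yield batch
            batch, size = [], 0
        batch.append((header, seq))
        size += len(seq)
    if batch:
        yield batch


if __name__ == "__main__":
    fasta = """>q1
MKTAYIAKQR
>q2
MKV
>q3
GAVLIPFMWSTC
""".splitlines()
    for i, batch in enumerate(make_batches(parse_fasta(fasta), max_residues=15)):
        print(i, [header for header, _ in batch])
```

Batching by total residues rather than by sequence count keeps the amount of work per batch roughly even when query lengths vary widely, which is one plausible reason a tool would partition this way.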
Why stop at 3? Here are 3 more reasons to use ElasticBLAST:
4. You can search NCBI-provided databases or your own
ElasticBLAST can access the 28 NCBI databases available on AWS and GCP. These are the same databases that are available on the NCBI FTP site. Of course, you can also search your own database.
5. ElasticBLAST supports BLAST+ options and programs
ElasticBLAST supports all BLAST programs, including those not supported on the web, such as rpstblastn, which identifies conserved protein domains encoded in nucleotide sequences. You can also easily limit your search by taxonomy to include only records from the organism(s) of interest.
6. ElasticBLAST helps manage cloud costs
Using cloud resources on AWS or GCP incurs some cost. ElasticBLAST helps manage that cost by allocating only the resources your particular search requires. For example, a protein search with a query of about 20 million residues against a database of about 20 billion residues can cost less than $5. Even a larger search with a query of 3-4 billion DNA bases costs only around $50.
Both cloud platforms also offer an option to use spot instances or preemptible nodes for less than full price, which can result in significant cost savings. ElasticBLAST can be easily configured to request such instances. Your costs will vary based on many factors, and we encourage you to explore these options with the individual cloud platforms. Also, both AWS and GCP offer a free tier or time-limited trial of their cloud services. You can find information about using ElasticBLAST with the free tiers in the ElasticBLAST Overview.
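As a rough way to reason about these costs, the back-of-the-envelope sketch below multiplies node count, run time, and hourly price, then applies a discount for spot or preemptible capacity. The function name `estimate_cost_usd` and every number in the usage example are hypothetical placeholders, not real AWS or GCP prices and not figures produced by ElasticBLAST; substitute the current rates from your cloud provider.

```python
def estimate_cost_usd(num_nodes: int, hours: float, hourly_price_usd: float,
                      discount: float = 0.0) -> float:
    """Estimate compute cost for a run.

    discount is the fractional saving from spot/preemptible pricing
    (0.0 means full on-demand price). All inputs are user-supplied
    assumptions; storage and data-transfer charges are ignored.
    """
    if num_nodes < 0 or hours < 0 or hourly_price_usd < 0:
        raise ValueError("num_nodes, hours, and hourly_price_usd must be non-negative")
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in the range [0, 1)")
    return num_nodes * hours * hourly_price_usd * (1.0 - discount)


if __name__ == "__main__":
    # Hypothetical inputs: 4 nodes for 2 hours at $1.50 per node-hour.
    on_demand = estimate_cost_usd(4, 2.0, 1.50)
    # Same run with an assumed 50% spot/preemptible discount.
    discounted = estimate_cost_usd(4, 2.0, 1.50, discount=0.5)
    print(f"on-demand: ${on_demand:.2f}, discounted: ${discounted:.2f}")
```

Keep in mind that spot and preemptible instances can be reclaimed by the provider, so actual run time, and therefore cost, may differ from a simple estimate like this.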
What do you think of ElasticBLAST?
Your feedback is crucial to the development and support of ElasticBLAST. Please write to us with any questions or suggestions or refer to further details for support. We’d love to hear from you.
ElasticBLAST is a cloud-native package developed by the National Center for Biotechnology Information (NCBI) at the National Library of Medicine (NLM) with support from the NIH Science and Technology Research Infrastructure for Discovery, Experimentation, and Sustainability (STRIDES) Initiative.