As we begin a new year, let’s look back at the top NCBI Insights Blog posts of 2024 based on number of views.
In case you missed any of these, check them out:
Continue reading “Top of 2024: A Look at the NCBI Insights Blog”
Tag: Nucleotide BLAST (blastn)
Download the updated bacterial and archaeal reference genome collection! We built this collection of 20,403 genomes by selecting the “best” genome assembly for each species among the 350,000+ prokaryotic genomes in RefSeq (except for E. coli, for which two assemblies were selected as reference). We have changed the selection criteria, including upgrades for type and complete assemblies, so this update contains far more changes than previous ones.
Continue reading “Updated Bacterial and Archaeal Reference Genome Collection now Available!”
Interested in faster nucleotide BLAST searches with more focused search results? As previously announced, NCBI has been re-evaluating the BLAST nucleotide database (nt) to make it more compact and more efficient. Thanks to your feedback, NCBI’s BLAST is excited to introduce the core nucleotide database (core_nt), an alternative to the default nt database that contains better-defined content and is less than half the size.
Download the updated bacterial and archaeal reference genome collection! We built this collection of 19,328 genomes by selecting the “best” genome assembly for each species among the 350,000+ prokaryotic genomes in RefSeq (except for E. coli for which two assemblies were selected as reference).
Continue reading “Now Available! Updated Bacterial and Archaeal Reference Genomes Collection”
Download the updated bacterial and archaeal reference genome collection! This collection (18,941 genomes as of Jan 18, 2024) was built by selecting the “best” genome assembly for each species among the 330,000+ prokaryotic genomes in RefSeq (except for E. coli for which two assemblies were selected as reference). You can speed up your sequence searches by running them against these high-quality genomes instead of the entire nucleotide or protein database.
The criteria for selecting the reference assembly for a given species include assembly contiguity and completeness and quality of the RefSeq annotation.
Continue reading “Updated Bacterial and Archaeal Reference Genome Collection is Available!”
An updated bacterial and archaeal reference genome collection is available! This collection of 18,343 genomes was built by selecting exactly one genome assembly for each species among the 312,000+ prokaryotic genomes in RefSeq, except for E. coli for which two assemblies were selected as reference.
The criteria for selecting the reference assembly for a given species include assembly contiguity and completeness and quality of the RefSeq annotation.
Continue reading “Now Available! Updated Bacterial and Archaeal Reference Genomes Collection”
NEW in BLAST! We made smaller nucleotide databases to help you find the sequences you need faster and more easily. You can now find these databases on the main nucleotide BLAST search page (Figure 1) and even download them (Databases: nt_euk, nt_prok, nt_viruses, nt_others). They are separated by organism type: eukaryotes, prokaryotes, viruses, and others (including synthetic sequences).
Figure 1. The database selection section of the main nucleotide BLAST page with the ‘Experimental databases’ radio button selected. You can choose one or more of the organism database subsets for your search.
Continue reading “Now Available! Faster BLAST Searches with New Nucleotide Databases”
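For readers who download these subsets and search them with standalone BLAST+, here is a minimal sketch of how a search command might be assembled. The database names come from the post above. Everything else is an assumption for illustration: the helper name `build_blastn_command`, the file names `query.fa` and `hits.tsv`, the E-value cutoff, and the premise that the databases are already installed where `blastn` can find them.

```python
import shlex

# Subset names as listed in the post; whether they are installed locally is an assumption.
NT_SUBSETS = ("nt_euk", "nt_prok", "nt_viruses", "nt_others")


def build_blastn_command(query_path, databases, out_path, evalue=1e-5):
    """Return an argument list for a blastn search against one or more nt subsets.

    Hypothetical helper: it only builds the command and does not run it.
    """
    unknown = [db for db in databases if db not in NT_SUBSETS]
    if not databases or unknown:
        raise ValueError(f"expected one or more of {NT_SUBSETS}, got {list(databases)}")
    return [
        "blastn",
        "-query", query_path,
        # blastn accepts several databases as one space-separated -db value
        "-db", " ".join(databases),
        "-evalue", str(evalue),
        "-outfmt", "6",  # tabular output
        "-out", out_path,
    ]


if __name__ == "__main__":
    # query.fa and hits.tsv are placeholder file names
    cmd = build_blastn_command("query.fa", ["nt_viruses", "nt_prok"], "hits.tsv")
    print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` once the databases and query file exist. Restricting the `-db` value to the subsets you actually need is what keeps the search smaller than one against the full nt database.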
As previously announced, we are continuously curating a better Prokaryotic Reference Genomes Collection. An updated bacterial and archaeal reference genome collection is now available! This collection of 17,623 genomes was built by selecting exactly one genome assembly for each species among the 283,000+ prokaryotic genomes in RefSeq, except for E. coli for which two assemblies were selected as reference.
An updated bacterial and archaeal reference genome collection is available! This collection of 17,163 genomes was built by selecting exactly one genome assembly for each species among the 272,000+ prokaryotic genomes in RefSeq, except for E. coli for which two assemblies were selected as reference.
A total of 497 species are included in this collection for the first time. In addition, compared with the October 2022 set, 174 species are represented by a better assembly and 15 species were removed because of changes in NCBI Taxonomy or uncertainty in their species assignment. The criteria for selecting one assembly for a given species from all assemblies available in RefSeq for that species include assembly contiguity and completeness and quality of the RefSeq annotation. See the documentation for details.
We have updated the nucleotide BLAST RefSeq reference genomes database (fourth in the menu) as well as the database on the Microbial Nucleotide BLAST page to reflect these changes. You can also run BLAST searches against the proteins annotated on these reference genomes (RefSeq Select proteins database, second in the menu).
The ongoing sequencing revolution has resulted in exponential growth of the NCBI BLAST databases. The default BLAST nucleotide database (nt), the most popular Web BLAST database, is currently 903 billion letters and continues to grow rapidly – doubling in size in the last year. This growth will cause longer search times, reduced capacity, and more delays in updating the database. In the not-too-distant future, searching the entire nt database on the web will no longer be possible unless we modify the database scope and composition.
Because of the above concerns, we want to make the default Web BLAST nucleotide database smaller and more efficient. Some options are to:
Continue reading “Re-evaluating the BLAST Nucleotide Database (nt)”