Tag: Basic Local Alignment Search Tool (BLAST)

Read About NCBI Resources in 2023 Nucleic Acids Research Database Issue

The 2023 Nucleic Acids Research Database Issue features papers from NCBI staff on GenBank, the Conserved Domain Database, and more. The citations are available in PubMed, with full text available in PubMed Central (PMC). To read an article, click on the PMCID number listed below.

Now Available! Add your favorite organism(s) to your BLAST ClusteredNR searches

Do you currently add one or more organism names to focus your searches when using the standard BLAST nr database? You can now focus your searches by organism with the BLAST ClusteredNR database and get faster results with a better overview of protein homologs in a wider range of organisms. Your searches will be restricted to protein clusters that contain one or more sequences from the organism(s) you add.
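The organism restriction described above can be pictured as a filter over clusters. The following is a minimal sketch, not NCBI's implementation: the `Cluster` type, its fields, and the example data are all hypothetical, and the code only illustrates the rule that a cluster is kept when at least one of its member sequences comes from a requested organism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """Hypothetical stand-in for a ClusteredNR protein cluster."""
    representative: str          # illustrative label for the cluster
    member_organisms: frozenset  # organisms with a sequence in the cluster


def restrict_to_organisms(clusters, organisms):
    """Keep clusters that contain one or more sequences from `organisms`."""
    wanted = frozenset(organisms)
    # A non-empty intersection means at least one member matches.
    return [c for c in clusters if c.member_organisms & wanted]


# Made-up example data, for illustration only.
clusters = [
    Cluster("cluster_A", frozenset({"Physeter catodon", "Homo sapiens"})),
    Cluster("cluster_B", frozenset({"Mus musculus"})),
    Cluster("cluster_C", frozenset({"Tursiops truncatus"})),
]

kept = restrict_to_organisms(clusters, ["Physeter catodon", "Tursiops truncatus"])
print([c.representative for c in kept])
```

Note that a kept cluster can still contain sequences from organisms you did not ask for, which is consistent with the broader overview of homologs that ClusteredNR is meant to give.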

ClusteredNR results

A search of the ClusteredNR database using human myoglobin (NP_005359.1) as a query and limited to Cetacea (whales and dolphins) returns clusters containing all the whale myoglobin matches present in a search of standard nr, as well as matches to clusters containing cytoglobin (Figure 1A). These significant cytoglobin matches are not shown in the standard nr results with the Cetacea limit, which are dominated by matches to proteins from a single species, Physeter catodon (sperm whale) (Figure 1B).

Updated bacterial and archaeal reference genomes collection now available!

An updated bacterial and archaeal reference genome collection is available! This collection of 17,163 genomes was built by selecting exactly one genome assembly for each species among the 272,000+ prokaryotic genomes in RefSeq, except for E. coli, for which two assemblies were selected as references.

A total of 497 species are included in this collection for the first time. In addition, compared to the October 2022 set, 174 species are represented by a better assembly, and 15 species were removed because of changes in NCBI Taxonomy or uncertainty in their species assignment. The criteria for selecting one assembly for a given species from all the assemblies available in RefSeq for that species include assembly contiguity and completeness, and the quality of the RefSeq annotation. See the documentation for details.
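As a rough illustration of the "one assembly per species" idea, the sketch below groups assemblies by species and keeps the best-ranked one, with an optional larger quota for selected species (as for E. coli). Everything specific here is an assumption: the `Assembly` fields, the metric scales, the example records, and especially the ordering in `rank_key` are hypothetical and do not reproduce NCBI's actual ranked criteria, which are described in the documentation.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Assembly:
    """Hypothetical record; field names and scales are assumptions."""
    accession: str           # placeholder identifier, not a real accession
    species: str
    completeness: float      # higher is better
    contig_count: int        # fewer contigs means a more contiguous assembly
    annotation_score: float  # higher is better


def rank_key(a):
    # Assumed ordering, for illustration only: completeness first,
    # then contiguity, then annotation quality.
    return (-a.completeness, a.contig_count, -a.annotation_score)


def select_references(assemblies, per_species=None):
    """Pick the best-ranked assembly (or assemblies) for each species."""
    per_species = per_species or {}
    by_species = defaultdict(list)
    for a in assemblies:
        by_species[a.species].append(a)
    selected = {}
    for species, group in by_species.items():
        quota = per_species.get(species, 1)  # default: exactly one per species
        selected[species] = sorted(group, key=rank_key)[:quota]
    return selected


# Made-up example data.
assemblies = [
    Assembly("asm1", "Escherichia coli", 99.5, 1, 0.98),
    Assembly("asm2", "Escherichia coli", 99.0, 3, 0.97),
    Assembly("asm3", "Escherichia coli", 95.0, 80, 0.90),
    Assembly("asm4", "Bacillus subtilis", 98.0, 12, 0.95),
    Assembly("asm5", "Bacillus subtilis", 99.2, 1, 0.96),
]

refs = select_references(assemblies, per_species={"Escherichia coli": 2})
for species, chosen in refs.items():
    print(species, [a.accession for a in chosen])
```

Sorting on a tuple key keeps the ranking lexicographic, so a later criterion only breaks ties left by the earlier ones.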

We have updated the nucleotide BLAST RefSeq reference genomes database (fourth in the menu) as well as the database on the Microbial Nucleotide BLAST page to reflect these changes. You can also run BLAST searches against the proteins annotated on these reference genomes (RefSeq Select proteins database, second in the menu).

NIH Comparative Genomics Resource project

The potential impact of emerging model organisms on human health

Comparative genomics is a science that compares genomic data either within a species or across species to answer questions in biomedicine. Laboratory experiments can then investigate the functional impact of those genomic similarities and differences. The history of comparative genomics goes back to the mid-1990s, but the field is now accelerating. A flood of new data is emerging as DNA sequencing technology becomes cheaper and commoditized. While this growth poses many challenges to current tools and approaches, it also offers immense opportunity for scientific research and understanding. These insights continue to reveal novel model organisms that can further the impact of comparative genomics on human health.

Now Available! BLAST ClusteredNR database for blastx and PSI-BLAST searches

ClusteredNR, the new protein database that provides results with a better overview of protein homologs in a wider range of organisms, is now available for blastx (translated nucleotide query) and PSI-BLAST (Position Specific Iterative BLAST) searches (Figure 1). Simply select ClusteredNR in the database section of the BLAST form. You can even search standard nr at the same time to compare results.

Figure 1. Composite image from the BLAST search forms. The ClusteredNR database is now available for blastx and PSI-BLAST searches in addition to blastp. For all types of searches, you can choose to search both ClusteredNR and standard nr at the same time so you can compare results.

ClusteredNR is especially useful with blastx for finding more distant homologs when searching with queries from over-represented groups. For PSI-BLAST, the greater taxonomic scope of the ClusteredNR database allows you to work more effectively with the default number of target sequences in the first round. The two searches described below highlight these advantages of ClusteredNR.

Join NCBI at PAG 30

San Diego, January 13-18, 2023 

NCBI is looking forward to seeing you in person at the International Plant and Animal Genome Conference (PAG 30), January 13-18, 2023, in San Diego, California.

We’re especially excited to share our recent efforts on the NIH Comparative Genomics Resource (CGR), a multi-year National Library of Medicine (NLM) project to maximize the impact of eukaryotic research organisms and their genomic data resources on biomedical research.  

We also want to hear from you! If you’d like to share feedback on your needs and experiences with comparative genomics tools to help inform CGR, consider joining our Feedback Session.

Check out NCBI’s schedule of activities and events:  

Now available: Updated prokaryote representative genomes collection

An updated bacterial and archaeal representative genomes collection is available! We selected a total of 16,665 of the 262,000 prokaryotic assemblies in RefSeq to represent their respective species. For the first time, more complete assemblies (as calculated by CheckM) were ranked higher than less complete assemblies. See the ranked list of criteria for selecting representative assemblies.

Connect with NCBI at ASHG 2022

Join us October 25-29 in Los Angeles, CA

We are looking forward to seeing you in person at the American Society of Human Genetics (ASHG) annual meeting, October 25-29, 2022, in Los Angeles, California.

We will present a variety of talks and posters featuring our clinical and human genetic resources, as well as genome products and tools. We are excited to introduce the NIH Comparative Genomics Resource (CGR), a multi-year National Library of Medicine (NLM) project to maximize the impact of eukaryotic research organisms and their genomic data resources on biomedical research. If you’re interested in providing feedback that will help drive CGR forward, consider joining our round table discussion.

Check out NCBI’s schedule of activities and events: 

New Upcoming NCBI Virtual Workshops!

Apply to attend October 2022 interactive, hands-on workshops

Want to learn more about NCBI resources and how to implement our cutting-edge tools in your research? NCBI offers a variety of educational opportunities, including workshops, webinars, codeathons, tutorials, and more!

We are excited to announce our upcoming virtual workshop series for October 2022. Our interactive, hands-on workshops are taught by experienced NCBI Education Faculty. Applications are open to the public; however, each workshop will accept a limited number of participants to provide the best possible educational experience.

Announcing new links and annotations on Conserved Domain Search results!

Conserved Domain Search (CD Search) results now show domain architecture information and other annotations that further characterize predicted domain and protein function. These include links to PubMed, Gene Ontology (GO) terms, Enzyme Commission (EC) numbers, and the SPARCLE Domain Architecture Viewer. You can use these links on the results to find literature (PubMed), assign biological roles and protein function (GO and EC), and find proteins with the same domain architecture (Domain Architecture Viewer). These annotations are currently available for a limited number of architectures, but we will continue to add them as part of our curation effort.

Figure 1 shows the results of an example CD Search with these new links. Note that you can use the GO and EC information provided to retrieve protein models with these annotations from the Protein Family Models database, for example GO:0030246[GOTermId] (molecular function: carbohydrate binding) or 2.7.11.1[ECNumber] (non-specific serine/threonine protein kinase).
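If you build such search terms programmatically, the pattern is simply a value followed by a field name in square brackets. The sketch below is a minimal helper for composing the two example terms from the text. The function names are hypothetical, and combining terms with a Boolean operator such as OR is an assumption about the search syntax of the Protein Family Models database, not something confirmed here.

```python
def field_term(value, field):
    """Format one fielded search term, e.g. GO:0030246[GOTermId]."""
    return f"{value}[{field}]"


def combine(terms, operator="AND"):
    """Join fielded terms with a Boolean operator (assumed to be supported)."""
    return f" {operator} ".join(terms)


# The two example annotations mentioned in the text.
go_term = field_term("GO:0030246", "GOTermId")  # carbohydrate binding
ec_term = field_term("2.7.11.1", "ECNumber")    # non-specific Ser/Thr protein kinase

print(go_term)
print(ec_term)
print(combine([go_term, ec_term], "OR"))
```

The resulting strings can be pasted into the database search box; the helper only formats text and does not contact any NCBI service.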

Figure 1. Conserved Domain Database search results for a hypothetical protein (XP_007132600.1) from the common bean (Phaseolus vulgaris). The results classify the protein as a plant receptor-like protein kinase. The results also show the EC number and the GO terms associated with this domain architecture, a link to a PubMed citation for the protein family (receptor-like protein kinases), and a link to the Domain Architecture Viewer for G-type lectin S-receptor-like serine/threonine-protein kinases. The Domain Architecture Viewer shows other proteins from the NCBI databases with the same domain architecture (order, number, and types of domains).