Tag: Comparative Genomics Resource (CGR)

Now Available: RefSeq Release 235

RefSeq release 235 is now available online and from the FTP site! You can access RefSeq data through NCBI Datasets. The release is provided in several directories, both as a complete dataset and divided into logical groupings.

What’s included in this release? 

As of May 11, 2026, this full release incorporates genomic, transcript, and protein data containing:   

  • 616,942,961 records  
  • 473,570,633 proteins  
  • 81,124,747 RNAs  
  • Sequences from 180,620 organisms  

Now Available: RefSeq Release 233

RefSeq release 233 is now available online and from the FTP site! You can access RefSeq data through NCBI Datasets. The release is provided in several directories, both as a complete dataset and divided into logical groupings.

What’s included in this release? 

As of January 26, 2026, this full release incorporates genomic, transcript, and protein data containing:  

  • 578,285,616 records 
  • 442,943,508 proteins 
  • 76,278,418 RNAs 
  • Sequences from 174,157 organisms 

GenBank Now Supports EGAPx-Based Annotation

With the latest release of EGAPx, we’re excited to announce that you can now submit genome assemblies with EGAPx annotations directly to GenBank. We’re making it easier for researchers to share richly annotated eukaryotic genomes, complete with structural and functional features generated by the EGAPx pipeline. 

What’s new? 
  • Easily integrate your EGAPx annotations into GenBank: You can now attach the EGAPx-generated ASN.1 annotation file as part of a submission package. 

Candidozyma and Cryptococcus Fungal Data now in the Multiple Comparative Genome Viewer (MCGV)!

NCBI’s Multiple Comparative Genome Viewer (MCGV) continues to expand available alignments! We are excited to announce the addition of two new fungal datasets: Candidozyma auris and Cryptococcus multigenome sequence alignments. 

You can now visualize and compare multiple whole genome assemblies for these human pathogens, zooming in on specific genes, tracking evolutionary changes, and identifying critical nucleotide differences across a variety of fungal strains.

Top Posts of 2025: A Look at the NCBI Insights Blog

As we begin a new year, let’s look back at the top viewed NCBI Insights Blog posts of 2025!   

In case you missed any of these, check them out.

Improved Gene Data Access with Redesigned NCBI Gene Pages

We are excited to announce redesigned NCBI Gene pages! The redesigned pages (available through NCBI Datasets) offer a modern, clean interface with an intuitive layout that makes browsing and downloading NCBI Gene data easier. The legacy Gene page will continue to be available during this transition period, and we will continue to communicate updates and changes through NCBI websites and this blog.  

What’s new? 
  • Improved Search Capabilities: The new Gene landing page offers improved search functionality, allowing users to find gene information more effectively. Users can now search by taxon, locus tag and symbol (with or without specifying a taxon, and above species level) in addition to NCBI GeneID and accession. 

Compare Nucleotide Differences in the Multiple Comparative Genome Viewer (MCGV)!

NCBI’s new Multiple Comparative Genome Viewer (MCGV) is an interactive graphical genome browser that allows you to visualize multiple genome assemblies in a single view. MCGV displays whole genome alignments created by the research community. 

What’s new?

We have made several significant updates to this application since its initial release last winter! Now you can:

  • View gene annotation with exon and intron structure for all assemblies 
  • Filter your view to only the assemblies and species you want to compare 
  • Change the alignment dataset to view an 8-way ape alignment with the choice of multiple alternate anchor assemblies 

Faster, Better Results for Protein BLAST Searches

Effective August 2025, ClusteredNR will become the protein BLAST default database 

We are excited to announce that the default database for protein BLAST searches will soon be the NCBI ClusteredNR database! Introduced in 2022, ClusteredNR is a collection of protein sequence clusters built from the current default database, nr. A representative sequence is chosen for each cluster; it is generally well annotated and indicates the function of the proteins in the cluster, helping you focus on meaningful biological insights and reducing redundant results.

What’s better about ClusteredNR?
  • Faster searches 
  • Decreased redundancy in results 
  • Broader taxonomic coverage in results 

RefSeq Release 229 is Now Available!

Check out RefSeq release 229, now available online and from the FTP site. You can access RefSeq data through NCBI Datasets. The release is provided in several directories, both as a complete dataset and divided into logical groupings.

What’s included in this release?

As of March 3, 2025, this full release incorporates genomic, transcript, and protein data containing:

  • 522,879,448 records
  • 399,577,538 proteins
  • 68,985,910 RNAs
  • Sequences from 164,117 organisms 
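The totals quoted in the three RefSeq announcements on this page (releases 229, 233, and 235) can be compared directly. The following is a minimal sketch that computes the absolute and percentage growth between consecutive listed releases. The counts are copied from the announcements above; the names `RELEASES` and `growth` are illustrative and not part of any NCBI tool.

```python
# Totals copied from the RefSeq release announcements on this page.
RELEASES = [
    {"release": 229, "records": 522_879_448, "proteins": 399_577_538,
     "rnas": 68_985_910, "organisms": 164_117},
    {"release": 233, "records": 578_285_616, "proteins": 442_943_508,
     "rnas": 76_278_418, "organisms": 174_157},
    {"release": 235, "records": 616_942_961, "proteins": 473_570_633,
     "rnas": 81_124_747, "organisms": 180_620},
]

FIELDS = ("records", "proteins", "rnas", "organisms")


def growth(old, new):
    """Return {field: (absolute_change, percent_change)} between two releases."""
    result = {}
    for field in FIELDS:
        delta = new[field] - old[field]
        result[field] = (delta, 100.0 * delta / old[field])
    return result


if __name__ == "__main__":
    # Compare each listed release with the next one listed.
    for old, new in zip(RELEASES, RELEASES[1:]):
        print(f"Release {old['release']} -> {new['release']}")
        for field, (delta, pct) in growth(old, new).items():
            print(f"  {field:<10} +{delta:,} ({pct:.1f}%)")
```

Note that the releases listed here are not consecutive (230 through 232 and 234 are not shown on this page), so each comparison spans more than one release cycle.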
