Check out RefSeq release 223, now available online and from the FTP site. You can access RefSeq data through NCBI Datasets.
What’s included in this release?
As of March 4, 2024, this full release incorporates genomic, transcript, and protein data containing:
- 425,594,654 records
- 316,329,937 proteins
- 60,886,133 RNAs
- sequences from 147,591 organisms
The release is provided in several directories, both as a complete dataset and divided into logical groupings.
New rat assembly and annotation
The Genome Reference Consortium has released a new rat assembly, GRCr8. Annotation Release GCF_036323735.1-RS_2024_02 contains the annotated genes, transcripts, and proteins in the new assembly. Curated RefSeq Select transcripts are available for 80% of protein-coding genes. The annotation products are available in the sequence databases and on the FTP site.
Updated mouse genome annotation
Annotation Release GCF_000001635.27-RS_2024_02 is an update of NCBI Mus musculus Annotation Release 109. The annotation products are available in the sequence databases and on the FTP site.
Annotation of the Asian tiger mosquito
Annotation Release GCF_035046485.1-RS_2024_01 is now available for the Aedes albopictus (Asian tiger mosquito) assembly AalbF5. The annotation products are available in the sequence databases and on the FTP site.
New eukaryotic genome annotations
This release includes new annotations generated by NCBI’s eukaryotic genome annotation pipeline for 36 additional species, including:
- Guinea pig, based on new assembly mCavPor4.1 (GCF_034190915.1-RS_2024_02)
- Philippine flying lemur, based on new assembly mCynVol1.pri (GCF_027409185.1-RS_2024_02)
- Song sparrow, based on new assembly bMelMel2.pri (GCF_035770615.1-RS_2024_02)
- Domestic silkworm, based on new assembly ASM3026992v2 (GCF_030269925.1-RS_2024_01)
- Japanese rose (Rosa rugosa), based on new assembly drRosRugo1.1 (GCF_958449725.1-RS_2024_01)
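The annotation release names listed above appear to share one shape: a RefSeq assembly accession with its version, followed by an "RS" suffix carrying two numeric fields. For readers who want to sort or filter these names in a script, here is a minimal Python sketch that splits a name into those parts. The pattern is inferred only from the names in this post and is not an official specification; the function name parse_release_name is hypothetical, and reading the two trailing fields as year and month is an assumption.

```python
import re

# Pattern inferred from the release names quoted in this post.
# It is an assumption, not an official NCBI naming specification.
RELEASE_NAME = re.compile(
    r"^(?P<accession>GCF_\d+\.(?P<version>\d+))"
    r"-RS_(?P<year>\d{4})_(?P<month>\d{2})$"
)


def parse_release_name(name):
    """Split an annotation release name into its apparent parts.

    Hypothetical helper for illustration only.
    """
    match = RELEASE_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized annotation release name: {name!r}")
    return {
        "assembly_accession": match["accession"],
        "assembly_version": int(match["version"]),
        "year": int(match["year"]),
        # Assumed to be a month; the post does not say what this field means.
        "month": int(match["month"]),
    }


if __name__ == "__main__":
    for name in ("GCF_036323735.1-RS_2024_02", "GCF_000001635.27-RS_2024_02"):
        print(parse_release_name(name))
```

Names that do not fit the inferred pattern raise a ValueError rather than being guessed at, so a change in naming would surface as an error instead of silently producing wrong fields.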
Human alignment files archived on the FTP site
The alignments in this directory are no longer being updated but are provided for archival purposes. Users interested in the latest set of alignments of human RefSeq transcripts to the genome should use the files located in the historical directory.
Stay up to date
RefSeq is part of the NIH Comparative Genomics Resource (CGR). CGR facilitates reliable comparative genomics analyses for all eukaryotic organisms through an NCBI Toolkit and community collaboration. Follow us on social media @NCBI and join our mailing list to keep up to date with RefSeq and other CGR news.
Questions?
If you have questions or would like to provide feedback, please reach out to us!