GenBank release 264.0 (12/19/2024) is now available on the NCBI FTP site. This release has 38.97 trillion bases and 5.36 billion records.
The current release has:
- 254,365,075 traditional records containing 5,085,904,976,338 base pairs of sequence data
- 3,957,195,833 WGS records containing 32,983,029,087,303 base pairs of sequence data
- 957,403,887 bulk-oriented TSA records containing 820,128,973,511 base pairs of sequence data
- 187,349,466 bulk-oriented TLS records containing 77,038,271,475 base pairs of sequence data
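As a quick sanity check, the four component figures above can be summed to reproduce the headline totals. The following is a minimal Python sketch; the `COMPONENTS` dictionary and its layout are illustrative names chosen here, not part of any NCBI tooling.

```python
# Component figures copied from the release 264.0 list above: (records, base pairs).
COMPONENTS = {
    "traditional": (254_365_075, 5_085_904_976_338),
    "WGS": (3_957_195_833, 32_983_029_087_303),
    "TSA": (957_403_887, 820_128_973_511),
    "TLS": (187_349_466, 77_038_271_475),
}

# Sum each column across the four components.
total_records = sum(records for records, _ in COMPONENTS.values())
total_bases = sum(bases for _, bases in COMPONENTS.values())

print(f"{total_records:,} records ({total_records / 1e9:.2f} billion)")
print(f"{total_bases:,} base pairs ({total_bases / 1e12:.2f} trillion)")
```

The rounded results should agree with the 5.36 billion records and 38.97 trillion bases quoted for the release.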
What’s new?
During the 61 days between the close dates for GenBank releases 263.0 and 264.0, the traditional portion of GenBank grew by 834,962,402,657 base pairs and by 2,017,411 sequence records. We updated 37,007 records during that same period. We added and/or updated an average of 33,679 traditional records per day!
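The daily average quoted above appears to follow from dividing the combined count of new and updated traditional records by the 61-day interval. A short Python sketch of that arithmetic, with variable names chosen here for illustration:

```python
# Figures taken from the paragraph above.
DAYS_BETWEEN_RELEASES = 61
NEW_RECORDS = 2_017_411
UPDATED_RECORDS = 37_007

# Assumption: the average counts new and updated records together.
records_touched = NEW_RECORDS + UPDATED_RECORDS
average_per_day = records_touched / DAYS_BETWEEN_RELEASES

print(f"{records_touched:,} records added or updated")
print(f"about {average_per_day:,.0f} records per day")
```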
Between releases 263.0 and 264.0, the WGS component of GenBank grew by 1,620,574,619,635 base pairs and by 211,423,075 sequence records. The TSA component of GenBank grew by 7,467,511,700 base pairs and by 8,670,291 sequence records. The TLS component of GenBank grew by 767,007 base pairs and by 71 sequence records.
The total number of sequence data files with this release increased by 777. The divisions are as follows:
- BCT: 18 new files, now a total of 412
- ENV: 2 fewer files, now a total of 27
- EST: 1 new file, now a total of 164
- INV: 148 new files, now a total of 1082
- MAM: 9 new files, now a total of 165
- PAT: 2 new files, now a total of 80
- PLN: 215 new files, now a total of 1875
- PRI: 353 new files, now a total of 678
- ROD: 2 new files, now a total of 115
- VRL: 2 new files, now a total of 333
- VRT: 29 new files, now a total of 317
The decrease in the number of ENV-division files is due to the suppression of 8,233 sequence records with OY accession prefixes, which were erroneously submitted as chromosomes. Further details can be obtained from the European Nucleotide Archive.
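The per-division changes listed above can be checked against the stated net increase of 777 files. The Python sketch below also derives each division's file count in the previous release by subtracting the change from the current total; the `DIVISIONS` table is an illustrative structure built from the list, not an NCBI data format.

```python
# (change in file count, current total) per division, copied from the list above.
DIVISIONS = {
    "BCT": (18, 412),
    "ENV": (-2, 27),
    "EST": (1, 164),
    "INV": (148, 1082),
    "MAM": (9, 165),
    "PAT": (2, 80),
    "PLN": (215, 1875),
    "PRI": (353, 678),
    "ROD": (2, 115),
    "VRL": (2, 333),
    "VRT": (29, 317),
}

# Net change across all listed divisions.
net_change = sum(change for change, _ in DIVISIONS.values())

# File count per division before this release: current total minus change.
previous_totals = {name: total - change for name, (change, total) in DIVISIONS.items()}

print(f"net change: {net_change} files")
for name, before in previous_totals.items():
    print(f"{name}: {before} -> {DIVISIONS[name][1]}")
```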
Upcoming changes
The INSDC will begin to mandate inclusion of /geo_loc_name (formerly /country) and /collection_date for sequence submissions. This requirement is expected to take effect by the end of December 2024.
Because there are valid circumstances in which location and/or the collection date for a sequenced sample cannot be provided, the domain of allowed values for these two source-feature qualifiers will be expanded to include a variety of “null terms.”
Additional information
For downloading purposes, please keep in mind that the uncompressed GenBank release 264.0 sequence data flat files require roughly 7,452 GB. The ASN.1 data files require approximately 2,595 GB.
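Before starting a download, it may be worth confirming that the target filesystem has enough free space. The following Python sketch uses the standard-library `shutil.disk_usage`; the helper name `has_room` is hypothetical, and the sketch assumes decimal gigabytes (10^9 bytes), which the figures above do not specify.

```python
import shutil

FLAT_FILE_GB = 7_452    # uncompressed sequence data flat files (figure from the text)
ASN1_GB = 2_595         # ASN.1 data files (figure from the text)
BYTES_PER_GB = 10**9    # assumption: decimal gigabytes


def has_room(path: str, required_gb: int) -> bool:
    """Return True if the filesystem holding `path` has at least `required_gb` free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * BYTES_PER_GB


if __name__ == "__main__":
    # Check the current directory; replace "." with the intended download location.
    for label, needed in [("flat files", FLAT_FILE_GB), ("ASN.1 files", ASN1_GB)]:
        status = "ok" if has_room(".", needed) else "insufficient space"
        print(f"{label}: need about {needed:,} GB -> {status}")
```

Compressed downloads need less space in transit, so this check is conservative for the download itself and more relevant to the unpacked data.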
For more information about GenBank release 264.0, see the release notes, as well as the README files in the GenBank and ASN.1 (ncbi-asn1) directories on FTP.
Stay up to date
Follow us on social media @NCBI and join our mailing list to keep up to date with GenBank and other NCBI news.
Questions?
Please send any comments or questions to info@ncbi.nlm.nih.gov.