GenBank

Prokaryotic rRNA submissions must meet the following requirements:
- All sequences are prokaryotic
- All sequences in the FASTA file are of one of the following types: 16S ribosomal RNA, 23S ribosomal RNA, or 16S-23S ribosomal RNA intergenic spacer region
- All sequences are sampled from an uncultured environmental source or from pure cultured strains
- Sequences from 454, Illumina, or other next-generation sequencing technologies are accepted only if they are assembled (each sequence assembled from two or more overlapping sequence reads) or processed into OTUs, bins, or individual phylotypes.
- The following information must be provided for each sequence: organism name, a unique sample ID (strain, isolate, or clone ID), collection-date, and geo_loc_name (geographic location of collection). Isolation-source or host is also required for sequences from an uncultured source.
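The per-sequence metadata rule above can be sketched as a small pre-submission check. This is only an illustration, not part of any NCBI tool: the function names, the `uncultured` flag, and the use of a plain dictionary keyed by the field names from the list are all assumptions made for the example.

```python
# Hypothetical pre-submission check; not part of any NCBI tool.
REQUIRED = ("organism", "collection-date", "geo_loc_name")
ID_FIELDS = ("strain", "isolate", "clone")


def missing_fields(record, uncultured=False):
    """Return the required fields that are absent or empty in one record."""
    missing = [name for name in REQUIRED if not record.get(name)]
    # At least one of strain, isolate, or clone must identify the sample.
    if not any(record.get(name) for name in ID_FIELDS):
        missing.append("strain|isolate|clone")
    # Uncultured sources additionally need isolation-source or host.
    if uncultured and not (record.get("isolation-source") or record.get("host")):
        missing.append("isolation-source|host")
    return missing


def duplicate_ids(records):
    """Return sample IDs that appear more than once across all records."""
    seen, dupes = set(), set()
    for record in records:
        sample_id = next((record[n] for n in ID_FIELDS if record.get(n)), None)
        if sample_id in seen:
            dupes.add(sample_id)
        elif sample_id is not None:
            seen.add(sample_id)
    return sorted(dupes)
```

An empty list from both functions suggests the metadata is complete under these assumptions; the wizard itself remains the authority on what is accepted.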
Eukaryotic rRNA and rRNA-ITS submissions must meet the following requirements:
- All sequences are eukaryotic
- All sequences in the FASTA file are of one of the following types: nuclear small or large subunit ribosomal RNA, nuclear internal transcribed spacer 1 or 2, nuclear rRNA-ITS region, or mitochondrial or chloroplast small or large subunit ribosomal RNA.
- The following information must be provided for each sequence: organism name, a unique sample ID (such as isolate, strain, clone, cultivar, culture-collection, specimen-voucher, or breed), collection-date, and geo_loc_name (geographic location of collection).
Metazoan (multicellular animal) Mitochondrial COX1 submissions must meet the following requirements:
- All sequences are from metazoan (multicellular animal) organisms.
- All sequences in the FASTA file contain only mitochondrial COX1 sequence. Flanking sequence should not be included.
- The following information must be provided for each sequence: organism name, unique isolate or specimen-voucher, collection-date, and geo_loc_name (geographic location of collection).
- Mitochondrial genetic code must be provided if the organism is not in the NCBI taxonomy database.
Eukaryotic nuclear mRNA submissions must meet the following requirements:
- All sequences are from eukaryotic nuclear-encoded messenger RNA (mRNA/cDNA).
- All mRNA sequences are protein-coding and contain a coding region (CDS). CDS annotations may be provided by completing a form, providing the corresponding protein sequences, or uploading a 5-column tab-delimited feature table.
- The following source information must be provided for each sequence: organism name, a sample identifier (such as isolate, strain, clone, cultivar, culture-collection, specimen-voucher, or breed), collection-date, and geo_loc_name (geographic location of collection).
- This mRNA workflow is not appropriate for genomic DNA, non-protein coding mRNAs, organelle sequences, synthetic constructs, non-eukaryotic organisms, transcriptome, or third-party annotation submissions.
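For the 5-column tab-delimited feature table option mentioned above, a minimal sketch of one CDS entry may help. It assumes the usual layout of that format: a `>Feature` header line naming the sequence, a line with start, stop, and feature key in columns 1 to 3, and qualifier lines with the qualifier name and value in columns 4 and 5. The helper name, sequence ID, coordinates, and product name are made up for illustration.

```python
def cds_feature_table(seq_id, start, stop, product):
    """Build a one-CDS feature table as tab-delimited text (illustrative helper)."""
    lines = [
        f">Feature {seq_id}",             # header line names the sequence
        f"{start}\t{stop}\tCDS",          # columns 1-3: start, stop, feature key
        f"\t\t\tproduct\t{product}",      # columns 4-5: qualifier name and value
    ]
    return "\n".join(lines) + "\n"


# Made-up example values.
print(cds_feature_table("Seq1", 1, 300, "hypothetical protein"), end="")
```

The sequence ID in the header has to match the ID used in the FASTA file; partial coding regions and additional qualifiers are not covered by this sketch.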
Influenza submissions must meet the following requirements:
- All sequences are derived from Influenza A, B, or C virus. A single submission can contain sequences from only one viral type; for example, you currently cannot submit a mix of Influenza A and Influenza B sequences.
- The following information must be provided regarding the virus: isolate, serotype (if Influenza A), complete collection date, host, and geo_loc_name (geographic location of collection).
Norovirus submissions must meet the following requirements:
- All sequences are derived from Norovirus. A single submission can contain sequences from only one genogroup; for example, you currently cannot submit a mix of Norovirus GI and Norovirus GII sequences.
- The following information must be provided regarding the virus: unique isolate, genotype, complete collection date, host, and geo_loc_name (geographic location of collection).
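The one-type rule for Influenza and the one-genogroup rule for Norovirus can be checked the same way before uploading. The sketch below assumes you already have one label per sequence (for example "A" or "GII"); the function names are hypothetical and nothing here is an NCBI API.

```python
def groups_in_batch(labels):
    """Return the sorted distinct labels (viral type or genogroup) in a batch."""
    return sorted({label.strip().upper() for label in labels})


def is_single_group(labels):
    """True when every sequence in the batch carries the same label."""
    return len(groups_in_batch(labels)) == 1


# A mixed Norovirus batch would need to be split into two submissions.
print(groups_in_batch(["GII", "GI", "GII"]))
print(is_single_group(["GII", "GI", "GII"]))
```

When more than one label comes back, splitting the FASTA file by label and submitting each part separately satisfies the rule.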
Dengue submissions must meet the following requirements:
- All sequences are derived from Dengue virus.
- The following information must be provided regarding the virus: unique isolate, genotype, complete collection date, host, and geo_loc_name (geographic location of collection).
SARS-CoV-2 submissions must meet the following requirements:
- All sequences are derived from Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2).
- The following information must be provided regarding the virus: unique isolate, complete collection date, host, and geo_loc_name (geographic location of collection).
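The virus workflows above ask for a complete collection date. A rough check is sketched below under two assumptions that the text does not state: that "complete" means day, month, and year are all present, and that dates are written as YYYY-MM-DD or DD-Mmm-YYYY. The function name is hypothetical.

```python
import re
from datetime import date

MONTHS = {name: number for number, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}


def is_complete_date(text):
    """True if text names a real calendar day as YYYY-MM-DD or DD-Mmm-YYYY."""
    iso = re.fullmatch(r"(\d{4})-(\d{2})-(\d{2})", text)
    dmy = re.fullmatch(r"(\d{2})-([A-Z][a-z]{2})-(\d{4})", text)
    try:
        if iso:
            year, month, day = (int(part) for part in iso.groups())
        elif dmy and dmy.group(2) in MONTHS:
            day, month, year = int(dmy.group(1)), MONTHS[dmy.group(2)], int(dmy.group(3))
        else:
            return False  # year-only or month-year dates are treated as incomplete
        date(year, month, day)  # raises ValueError for impossible days such as 30 Feb
        return True
    except ValueError:
        return False
```

Under these assumptions a value such as "Mar-2021" or "2021" would be flagged as incomplete, since it lacks a day.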
Learn more about requirements and sequence processing steps for this wizard.