Tag: NCBI Nucleotide

Access Avian Influenza A (H5N1) Virus Sequences from the Current Outbreak at NCBI

The U.S. Centers for Disease Control and Prevention (CDC) has been monitoring the ongoing outbreak of avian influenza A (H5N1) virus. The virus is widespread globally in wild birds, has led to sporadic outbreaks in poultry, cows, and several species of wild animals, and has been detected in exposed humans. The CDC recently sequenced the H5N1 virus in two respiratory specimens collected from a U.S. patient who was severely ill and has since died (PQ809549-PQ809564).

As previously announced, GenBank sequences, annotations, and metadata, including those from this patient, are available through NLM’s NCBI resources.
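The accession range quoted above (PQ809549-PQ809564) is shorthand for a run of consecutive GenBank records. As a minimal sketch, the range can be expanded into individual accessions before looking them up. The helper name `expand_accession_range` is hypothetical, and the code assumes each accession is an uppercase letter prefix followed by digits, with no version suffix.

```python
import re
from typing import List

# Assumption: an accession is an uppercase letter prefix followed by digits.
ACCESSION_PATTERN = re.compile(r"([A-Z]+)(\d+)")


def expand_accession_range(range_text: str) -> List[str]:
    """Expand a range such as 'PQ809549-PQ809564' into individual accessions."""
    start_text, end_text = (part.strip() for part in range_text.split("-"))
    start = ACCESSION_PATTERN.fullmatch(start_text)
    end = ACCESSION_PATTERN.fullmatch(end_text)
    if start is None or end is None:
        raise ValueError(f"Unrecognized accession range: {range_text!r}")

    prefix, first_digits = start.groups()
    end_prefix, last_digits = end.groups()
    if prefix != end_prefix or len(first_digits) != len(last_digits):
        raise ValueError("Both ends of the range must share a prefix and width")
    if int(first_digits) > int(last_digits):
        raise ValueError("Range start must not exceed range end")

    width = len(first_digits)  # preserve zero padding, if any
    return [
        f"{prefix}{number:0{width}d}"
        for number in range(int(first_digits), int(last_digits) + 1)
    ]


accessions = expand_accession_range("PQ809549-PQ809564")
print(len(accessions), accessions[0], accessions[-1])
```

The resulting list can then be pasted into a Nucleotide search or passed to whichever retrieval tool you already use.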

MANE v1.4 with MANE Select for Non-Coding Genes

The next release (v1.4) of Matched Annotation from NCBI and EMBL-EBI (MANE) is here! MANE is a collaborative dataset produced jointly by NCBI and EMBL-EBI that provides a representative transcript (MANE Select) for each human protein-coding gene, to be used as a universal standard for variant reporting and browser display. A second transcript, MANE Plus Clinical, is provided for genes where MANE Select alone is not sufficient to report all known variants. The new MANE release adds another important component to this high-value dataset: non-coding genes, some of which are known to be associated with human disease.

NCBI will assign 64-bit numeric GIs by November 15th. Update affected software!

As announced last month, NCBI will begin assigning larger (64-bit) numeric ‘GIs’ to the remaining sequence types that still receive these identifiers. This change is expected as soon as November 15, 2021, but could occur earlier if data submission volumes are unexpectedly high. This is a reminder that all organizations and developers using our products should review their software for any remaining reliance on GIs and for compatibility with these larger identifiers.

How do you know if your software or organization may be impacted?

If you have built custom software that interfaces with NCBI data and consumes a sequence database UID (i.e., a GI), processes the GI from an ASN.1 or XML product, or processes the GI from any tabular product on FTP, you should review all code to ensure that the new, longer, 64-bit GIs will be handled properly. To ensure a smooth transition and the best overall experience, please update to the latest versions of NCBI-provided programmatic and command-line tools. Alternatively, you could update your code to use accession.version identifiers instead of GIs.
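One way to review code for this change is to test it with a GI value that no longer fits in 32 bits. The sketch below is illustrative only: it assumes the affected code stores GIs in signed 32-bit fields (the announcement does not state the previous width), the helper names are hypothetical, and the GI value is made up for the test rather than taken from a real record.

```python
import struct

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit field can hold
INT64_MAX = 2**63 - 1  # largest value a signed 64-bit field can hold


def parse_gi(text: str) -> int:
    """Parse a GI from text and check that it fits in a signed 64-bit integer."""
    gi = int(text)
    if gi <= 0 or gi > INT64_MAX:
        raise ValueError(f"GI out of range: {text!r}")
    return gi


def needs_64_bit(gi: int) -> bool:
    """Return True if the GI would overflow a signed 32-bit field."""
    return gi > INT32_MAX


def pack_gi(gi: int) -> bytes:
    """Serialize a GI as a little-endian signed 64-bit integer."""
    return struct.pack("<q", gi)


# Hypothetical GI chosen only because it exceeds the 32-bit limit.
large_gi = parse_gi("3000000000")

try:
    struct.pack("<i", large_gi)  # a 32-bit field rejects the value
    fits_in_32_bits = True
except struct.error:
    fits_in_32_bits = False

print(needs_64_bit(large_gi), fits_in_32_bits, len(pack_gi(large_gi)))
```

The same kind of check applies to database column types and to fixed-width integer types in compiled languages, where an overflow may be silent rather than raising an error.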

NCBI is here to help the community as we make this change. Stay tuned here or follow NCBI on Twitter, where we will share updates and additional information, such as a final confirmation of the projected cutover date.

Please contact [email protected] with any questions about this change or to determine if any software you are using is affected.

Vertebrate Genome Project genome assemblies annotated by NCBI

NCBI is an active partner of the Vertebrate Genomes Project (VGP), which recently published a series of papers on the initial results of its effort to sequence all 70,000 vertebrate species. See the VGP press release for more details. To date, this project has submitted over 130 diploid chromosome-level assemblies to NCBI’s GenBank and the European Nucleotide Archive. NCBI has annotated 94 of the VGP assemblies from 85 species using the NCBI Eukaryotic Genome Annotation Pipeline.

These sequence and annotation data are available through NCBI web resources including Gene, Assembly, Nucleotide, Protein, and Datasets, and are included in the GenBank and RefSeq releases. You can browse the assemblies in the Genome Data Viewer and download metadata, sequence, and annotation data for the latest assemblies in the VGP BioProject using the NCBI Datasets command-line tools.

Prokaryotic representative genomes updated — now over 13 thousand assemblies!

We have updated the bacterial and archaeal representative genome collection! The current collection contains over 13,000 assemblies selected from the 203,000 prokaryotic RefSeq assemblies to represent their respective species. The collection has grown by 11% since August 2020. We’ve included about 1,400 species for the first time, selected better assemblies for 1,177 species, and removed 65 species because of changes in NCBI Taxonomy or uncertainty in their species assignment.

We have also updated the Representative Genomes Database on the Microbial Nucleotide BLAST page, as well as the RefSeq Representative Genome Database on basic nucleotide BLAST, to reflect these changes.

Novel coronavirus complete genome from the Wuhan outbreak now available in GenBank

Updated!

Get rapid access to Wuhan coronavirus (2019-nCoV) sequence data from the current outbreak as it becomes available. We will continue to update the page with newly released data.

The complete annotated genome sequence of the novel coronavirus associated with the outbreak of pneumonia in Wuhan, China is now available from GenBank for free and easy access by the global biomedical community. Figure 1 shows the relationship of the Wuhan virus to selected coronaviruses.


Figure 1. Phylogenetic tree showing the relationship of Wuhan-Hu-1 (circled in red) to selected coronaviruses. The nucleotide alignment was done with MUSCLE 3.8. The phylogenetic tree was estimated with MrBayes 3.2.6 using parameters for the GTR+G+I model. The scale bar indicates estimated substitutions per site, and all branch support values are 99.3% or higher.


NCBI Will Retire the Probe Database in April 2020

NCBI released the Probe database in 2005 as a registry of nucleic acid reagents for biomedical research. At that time, array-based assays were prevalent, but their use has since declined with the advent of short-read sequencing. As a result, NCBI will retire the web interface for the Probe database in April 2020. You can continue to access the content of the database on the NCBI FTP site, but it will no longer be updated. As of this announcement, Probe no longer accepts new submissions.

If you have questions or concerns about this retirement, we’d love to hear from you. Please comment here or contact us at [email protected].

Vector graphics downloads now available in NCBI genome browsers and sequence views

You can now download images in both PDF and Scalable Vector Graphics (SVG) formats from our Sequence Viewer and genome browsers such as the Genome Data Viewer! SVG files are ideal for editing in image editors and provide high-quality graphics for publications, posters, and presentations. Both the PDF and SVG files that you download contain vector graphics for high-fidelity images.

You can download image files by choosing the “Printer-Friendly PDF/SVG” option under the Tools menu from any Graphical Sequence Viewer application (Figure 1).

Figure 1. Printer-friendly download options from the graphical view in the Genome Data Viewer. You can download either PDF or SVG formats, which are easily edited in standard graphics applications.

New results for organelle genome searches

As part of our ongoing effort to improve your search experience, we’ve made it easier for you to find the sequence of your favorite organelle genome plus all the information and data associated with it. To find organelle genomes, search for an organism name combined with an organelle description, for example human mitochondrion, tomato chloroplast, or Toxoplasma gondii RH apicoplast.

A new results panel will appear with links to the organelle genome sequence, annotated genes, and related phylogenetic and population studies. The panel appears with these searches in an All Databases search or within any of NCBI’s sequence databases, including Gene, Nucleotide, Protein, Genome, and Assembly. For the human mitochondrial genome, a graphical schematic of the genome allows you to navigate to individual mitochondrially encoded genes (Figure 1).


Figure 1. The organelle genome results for a search with human mitochondrion. The panel provides access to analysis tools, downloads, and other relevant results. Clicking any of the gene objects on the genome graphic leads to the relevant Gene record, for example Gene ID: 4512 in the case of COX1.

Try it out with example searches such as those above and let us know what you think!

September 11 Webinar: A beginner’s guide to genes and sequences at NCBI

On Wednesday, September 11, 2019 at 12 PM, NCBI staff will present a webinar for people with limited experience working with gene and sequence information. You will learn about the kinds of data available for genes and sequences, how to select the most informative records, and how to find related genes and sequences using pre-computed information and the BLAST sequence search service.

  • Date and time: Wed, Sep 11, 2019 12:00 PM – 12:30 PM EDT
  • Register

After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.