Tag: Genome Browser

Now Available! More Mammalian Cross-Species Alignments in the Comparative Genome Viewer (CGV)

In response to your feedback, we’ve made more whole genome cross-species alignments available in NCBI’s Comparative Genome Viewer (CGV). You can use these alignments to explore genome rearrangements between species. You can also zoom in to analyze regions of conserved gene synteny.

There are over 20 new cross-species alignments available, including human-mouse, mouse-rat, human-chimp, human-cattle, dog-cat, and others! These cross-species alignments provide additional opportunities to explore evolutionary relationships at the genomic and gene levels. We will add more cross-species alignments in the coming months.

The latest cross-species alignments added to CGV include imports from the UCSC Genomics Institute, as well as those generated at NCBI.

Check out two examples of cross-species whole-genome alignments in CGV below (Figure 1).

Figure 1. Whole genome alignments between (A) mouse and human (GRCm39 vs. GRCh38.p14) and (B) cat and dog (F.catus_Fca126_mat1.0 vs. ROS_Cfam_1.0). Colored bands connect aligned regions; green indicates same orientation, blue indicates opposite orientation.

When you zoom in on an alignment (Figure 2), you can compare gene annotation on the two assemblies and see the extent of conservation of synteny. You can also see which genes are missing from one or the other assembly, indicating changes in sequence or differences in annotation.

NCBI genome browsers: search and you will find!

If you’ve ever tried searching for a genomic location in NCBI’s Genome Data Viewer (GDV) or Variation Viewer and found that your search term didn’t work, it’s time to try again! We recently expanded support for searches in our genome browsers using non-NCBI identifiers such as HGVS patterns (e.g. NM_001318787.2:c.2258G>A) and Ensembl IDs. You can also search by chromosome coordinates, cytogenetic band, assembly scaffold/component, disease/phenotype, dbSNP identifier, or RefSeq transcript/protein accession. We’ve gathered example searches in the table below.

Search term                            Example(s)
Chromosome coordinate                  chr1:1,500,000-2,000,000
                                       chr2: 1.5M-2,540.2K
                                       3: 21.335M..21.337M
                                       chr5
Cytogenetic band                       1p36.21
                                       2q13
Assembly scaffold                      NT_005403.18
                                       NW_021159987.1
Assembly component                     AC106865.4
                                       AC018680.4
Gene/protein name                      PTEN
                                       protease
Disease/phenotype                      diabetes
                                       eye color
SNP rsID                               rs863223352
dbVar ID                               rs863223352
RefSeq transcript/protein accession    NM_017551.3
                                       XP_011538173.1
Ensembl gene/transcript identifier     ENSG00000233258
                                       ENST00000404547
HGVS                                   NM_001318787.2:c.2258G>A
                                       NP_001289617: p.Arg272Cys

When you search by single coordinate, SNP or dbVar ID, or HGVS, the browser view zooms to the location of the search result. A marker is automatically created to identify the searched position.  For HGVS, the marker is labelled with the corresponding rsID, if there is one.

Figure 1. Variation Viewer showing results of search by an HGVS pattern, NP_001289617.1: p.Arg272Cys.
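
To get a feel for how the identifier types in the table differ, here is a minimal Python sketch that guesses the type of a search term from its shape. The regular expressions and the `classify` function are our own illustrative assumptions based only on the example terms above; they are not the logic the NCBI browsers use.

```python
import re

# Illustrative patterns inferred from the example terms in the table.
# They are assumptions, not the browsers' actual search rules.
# Order matters: the first pattern that matches wins.
PATTERNS = [
    ("HGVS", re.compile(r"[A-Z]{2}_\d+(\.\d+)?:\s*[cgpnmr]\.\S+")),
    ("SNP rsID", re.compile(r"rs\d+")),
    ("Ensembl gene/transcript", re.compile(r"ENS[GT]\d{11}")),
    ("RefSeq transcript/protein", re.compile(r"[NX][MRP]_\d+(\.\d+)?")),
    ("Assembly scaffold", re.compile(r"N[TW]_\d+(\.\d+)?")),
    ("Assembly component", re.compile(r"[A-Z]{1,2}\d{5,6}(\.\d+)?")),
    ("Cytogenetic band", re.compile(r"(\d{1,2}|[XY])[pq]\d+(\.\d+)?")),
]


def classify(term):
    """Return a best-guess label for a search term, or a fallback label."""
    term = term.strip()
    for label, pattern in PATTERNS:
        if pattern.fullmatch(term):
            return label
    return "free text or other"


if __name__ == "__main__":
    for example in ["NM_001318787.2:c.2258G>A", "rs863223352",
                    "ENSG00000233258", "XP_011538173.1",
                    "NT_005403.18", "AC106865.4", "1p36.21", "PTEN"]:
        print(f"{example:28} {classify(example)}")
```

Terms such as gene names or phenotypes have no fixed shape, so the sketch simply reports them as free text.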

As always, please contact us if you have additional questions or suggestions about this or any other feature in GDV or Variation Viewer. You can use the Feedback button on the page or write to the NCBI Help Desk directly.

Three outdated browsers (1000 Genomes, dbGaP Data, and GeT-RM) to retire in April 2022. Data available in GDV

The Genome Data Viewer (GDV) is now the comprehensive NCBI genome browser. The development of GDV led to a few different types of genome browsers along the way, each one originally delivering visual displays for particular datasets. We developed the 1000 Genomes Browser for variation data from the 1000 Genomes project, the dbGaP Data Browser for controlled-access sequence read alignment data, and the GeT-RM browser for Genome in a Bottle (GIAB) data.

The data displayed in these three browsers are now either obsolete or largely accessible from GDV or other NCBI resources. Moreover, unlike GDV, these older browsers are no longer under active development, and their data have not been updated to meet the changing needs of the communities they were developed to serve. For these reasons, we will retire these browsers in April 2022. Please see the details below for more information on the data displayed in these browsers and how to access and display these data now through GDV and other means.

View intron feature evidence in the Genome Data Viewer and Sequence Viewer

Do you work on gene biology, and are you interested in alternative splice patterns in your gene or genes of interest? If so, be sure to explore the intron feature evidence available in graphics views of genome assemblies annotated by NCBI. You can view the NCBI evidence used for calling splice variants for genes, add other intron feature evidence tracks, and use new display and filter options that make it easier to interpret the data.

Figure 1. Graphical view of the monoamine oxidase gene (MAOA and MAOB) region on the human X chromosome showing intron feature tracks (‘RNA-seq intron features, aggregate’ and ‘Intropolis RNA-Seq intron features’). Mousing over an intron feature activates a tooltip that shows details such as the number of reads with the splice site, the location on the chromosome, the length of the intron, and the donor and acceptor bases at the splice site. The Intropolis track was added through the search feature of the Configure Tracks menu and configured (bottom menu) so that the features were sorted by strand and filtered so that only features with more than 500 reads appear.
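
If you work with intron features outside the viewer, the same sort-and-filter idea is easy to reproduce. The sketch below is only an illustration: the `IntronFeature` record, its field names, and the sample values are hypothetical and do not reflect the format of the NCBI tracks.

```python
from dataclasses import dataclass


@dataclass
class IntronFeature:
    """Hypothetical record for one intron feature (not an NCBI format)."""
    start: int
    stop: int
    strand: str  # "+" or "-"
    reads: int   # number of reads supporting the splice site


def filter_and_sort(features, min_reads=500):
    """Keep features with more than min_reads reads, sorted by strand, then start."""
    kept = [f for f in features if f.reads > min_reads]
    return sorted(kept, key=lambda f: (f.strand, f.start))


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    sample = [
        IntronFeature(1000, 2000, "-", 1200),
        IntronFeature(1500, 1800, "+", 40),
        IntronFeature(300, 900, "+", 750),
        IntronFeature(500, 700, "-", 500),
    ]
    for feature in filter_and_sort(sample):
        print(feature)
```

Note that the threshold is strict, matching the caption: a feature with exactly 500 reads is dropped.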

Improved chromosome searching in Genome Browsers

Are you interested in searching for a chromosomal region in a genome, but don’t know how to write the correct query?  The good news is that the NCBI Genome Data Viewer (GDV) now supports a much wider array of search options. Some examples are listed below:

  • chr1:1,500,000-2,000,000
  • chr2: 1.5M – 2M
  • chr2: 1.5M-2,540.2K
  • 2:1,500,000-2,000,000
  • 3: 21.33M – 22.01M
  • 3: 21.335M..21.337M
  • chr1:1,500,000 / 200
  • chr1:101,500,200
  • 1:101,500,200
  • 1:1,500K/0.5K
  • chr5
  • 10

You can use any of these queries or the ones described below for assembly aliases either on the GDV landing page or in the GDV search box (Figure 1).

Figure 1. The search boxes on the GDV landing page (left) and within the GDV graphical interface (right) showing queries with chromosome aliases for the domestic cat.
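
To show how flexible these coordinate formats are, here is a small Python sketch that turns several of the example queries into a chromosome name plus start and stop positions. It is our own approximation, not GDV's parser: the function names are hypothetical, K and M are assumed to mean thousands and millions of bases, and the "/" form is deliberately left out because its meaning is not defined in this post.

```python
import re
from decimal import Decimal

# Assumed meaning of the suffixes: K = 1,000 bases, M = 1,000,000 bases.
_MULTIPLIER = {"": 1, "K": 1_000, "M": 1_000_000}


def _to_bases(text):
    """Convert '1,500,000', '1.5M' or '2,540.2K' to an integer position."""
    cleaned = text.replace(",", "").strip()
    match = re.fullmatch(r"(\d+(?:\.\d+)?)\s*([KkMm]?)", cleaned)
    if match is None:
        raise ValueError(f"cannot read position: {text!r}")
    # Decimal avoids floating-point rounding surprises such as 21.335 * 1e6.
    return int(Decimal(match.group(1)) * _MULTIPLIER[match.group(2).upper()])


def parse_query(query):
    """Return (chromosome, start, stop); start and stop are None for a bare chromosome."""
    chrom, sep, rest = query.strip().partition(":")
    chrom = chrom.strip()
    if chrom.lower().startswith("chr"):
        chrom = chrom[3:]
    if not sep:
        return chrom, None, None
    if "/" in rest:
        raise ValueError("'/' queries are not handled in this sketch")
    # Ranges may be written with '-', an en dash, or '..'.
    parts = re.split(r"\s*(?:\.\.|-|–)\s*", rest.strip())
    if len(parts) == 1:
        position = _to_bases(parts[0])
        return chrom, position, position
    if len(parts) == 2:
        return chrom, _to_bases(parts[0]), _to_bases(parts[1])
    raise ValueError(f"cannot read range: {rest!r}")


if __name__ == "__main__":
    for q in ["chr1:1,500,000-2,000,000", "chr2: 1.5M-2,540.2K",
              "3: 21.335M..21.337M", "chr1:101,500,200", "chr5", "10"]:
        print(f"{q:28} {parse_query(q)}")
```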

Coronavirus host gene regulatory elements now annotated by RefSeq Functional Elements

The COVID-19 pandemic has drawn attention to the human host genes associated with SARS-CoV-2 entry and to the elements that regulate expression of these genes. At NCBI, we have prioritized curation of experimentally validated regulatory elements for these genes in the RefSeq Functional Elements project. Our annotations include several enhancers, promoters, cis-regulatory elements and protein binding sites, among other feature types. We have annotated 236 regulatory features for 27 distinct biological regions in the latest human Annotation Release (109.20200522), including regulatory elements for the ABO, ACE2, ANPEP, CD209, CLEC4G, CLEC4M, CTSL, DPP4, and TMPRSS2 genes.

You can view our regulatory element to target gene linkages in the regulatory interactions track using our new track hub that we recently announced.  You can also see the biological regions and features tracks. These have functional and descriptive metadata, including biological region summaries, experimental evidence types, publication support and more.

The example in Figure 1 shows RefSeq Functional Element feature annotation in NCBI’s Genome Data Viewer (GDV) for the ABO gene region (GRCh38, NW_009646201.1: 73,864-103,789), the determiner of the human ABO blood group. A genome-wide association study recently identified non-coding ABO variants associated with COVID-19 disease severity (PMID:32558485), which map to some of the RefSeq Functional Elements in this region.

Figure 1. The human ABO gene region in the NCBI GDV displaying the RefSeq Functional Element features. The biological regions aggregate track shows underlying feature annotation for an ABO upstream enhancer (LOC112637023), promoter region (LOC112679202), +5.8 intron 1 enhancer (LOC112679198), a 3′ regulatory region (LOC112639999), and a +36.0 downstream enhancer (LOC112637025). Functional Element features include numerous enhancers, promoters, cis-regulatory elements and protein / transcription factor binding sites.

We have more information about RefSeq Functional Elements on our website, including data download and extraction options. Stay tuned to NCBI Insights and other NCBI social media for future announcements about RefSeq Functional Elements!

Recent enhancements in Genome Workbench version 3.4.1

New Features

Version 3.4.1 of Genome Workbench, NCBI’s sequence annotation and analysis platform, includes new features for the Multiple Sequence Alignment View, the Graphical Sequence View and the Sequence Editing and Submission Package as well as a number of other improvements and bug fixes.

In the Multiple Sequence Alignment View, you can now export publication-quality graphics (Save As PDF/SVG…, Figure 1). In the Graphical Sequence View, you can now search by locus tag, use improved search capabilities for genes by locus, and better display the selected location in the feature editing dialog when annotating a sequence.

Figure 1. A multiple alignment view in Genome Workbench highlighting the new ability to save presentation-quality image files (Save As PDF and SVG formats).

In the Sequence Editing and Submission Package, we rearranged the controls in the Table Reader dialog to fit onto smaller screens and improved the import of feature tables that contain mat-peptide (mature peptide) features.

Bug Fixes and Improvements

We have made a number of other fixes and improvements. For macOS users, we fixed blurry text in some dialogs, fixed the copy to clipboard problem, and improved support for the latest Catalina version. We also fixed a crashing problem in the Active Object Inspector interface. You should also see improvements in loading SNP data and better recovery in cases of power outages or other events causing local file corruption.

In the Sequence Editing and Submission Package, we fixed a bug that occurred when applying miscellaneous descriptors and structured comment fields using the Table Reader and an issue with using a PubMed ID to look up a publication.

Please see the extensive help documentation including FAQs, videos, and tutorials linked to the Genome Workbench homepage for more information and examples on how to use Genome Workbench in your research.

 

Non-human variation data from EVA now available in the Genome Data Viewer

You can now view SNP variation data for many commonly studied animals and plants – including mouse, cow, Drosophila, Arabidopsis, maize, cabbage, and many more – in the Genome Data Viewer (GDV) and other graphical sequence viewers. This data is streamed from the European Variation Archive (EVA)  at the European Bioinformatics Institute (EBI).

On any NCBI graphical sequence view you can use the Configure Tracks menu and the Track Configuration Panel to add the track for the EVA RefSNP data. This track is available through the left-hand tab for Remote Variation Data (Figure 1).  The EVA RefSNP track displayed on the pig (Sus scrofa) chromosome 12 graphical view is shown in Figure 2.

Figure 1. The Track Configuration panel showing the Remote Variation Data tab and the EVA RefSNP Release 1 track. Select the track checkbox and click Configure to load the track.

Figure 2. The graphical sequence viewer showing the region of the growth hormone gene on pig chromosome 12 (NC_010454.4) with the EVA RefSNP Release 1 track at the bottom. The track header has an (R) and a green highlight to indicate that it is remote data streamed from an external website. NCBI is not responsible for the content or availability of these data.

The EVA SNP FTP site has more information about the EVA SNP data release.

Please contact us using the Feedback link on the graphical view to let us know what you think and how we can further improve your experience with the NCBI genome browsers and graphical sequence viewers.

 

dbVar clinical and common structural variants track hub now available

dbVar, NCBI’s database of large-scale genetic variants, has a new track hub for viewing and downloading structural variation (SV) data in popular genome browsers. Initial tracks include Clinical and Common SV datasets. dbVar’s new track hub can be viewed using NCBI’s Genome Data Viewer through the “User Data and Track Hubs” feature (Figure 1) and other genome browsers by selecting “dbVar Hub” from the list of public tracks or by specifying the following URL.

https://ftp.ncbi.nlm.nih.gov/pub/dbVar/sandbox/dbvarhub/hub.txt

Figure 1. Loading the dbVar track hub in the Genome Data Viewer. The Track Hubs feature in the left-hand column of the browser allows you to add the track by searching for it or by entering the direct URL. You can select the specific tracks to load, for example “NCBI curated common SVs: All populations”, from the Configure Track Hubs dialog.
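
If you are curious what a hub URL like the one above points to, a hub.txt file in the UCSC track hub convention is a short list of "key value" lines. The sketch below reads such a file into a dictionary. The `parse_hub_txt` and `fetch_hub` helpers are our own, and the sample text is a made-up example of the layout, not the actual contents of the dbVar hub file.

```python
from urllib.request import urlopen


def parse_hub_txt(text):
    """Parse 'key value' lines into a dict, skipping blank and comment lines."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(" ")
        settings[key] = value.strip()
    return settings


def fetch_hub(url):
    """Download a hub.txt file and parse it (requires network access)."""
    with urlopen(url) as response:
        return parse_hub_txt(response.read().decode("utf-8"))


# Hypothetical sample in the usual hub.txt layout, for illustration only.
SAMPLE_HUB = """\
hub exampleHub
shortLabel Example hub
longLabel An example track hub with made-up values
genomesFile genomes.txt
email someone@example.org
"""

if __name__ == "__main__":
    for key, value in parse_hub_txt(SAMPLE_HUB).items():
        print(f"{key}: {value}")
```

To try it on the real hub, you could pass the URL given above to `fetch_hub`; the fields returned depend on what the file contains at the time.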

Try out our new table download options from the NCBI genome browsers and sequence viewers!

Have you ever wanted a list of the genes you’re looking at in the browser – maybe to give you a starting point for candidate gene analysis, or to cross-reference with other data?

In response to your feedback and helpful discussions with you, we’re excited to announce a new option to download gene annotation data directly from the web sequence viewers and browsers.

This new feature lets you get a table of gene names, coordinates and other helpful information from your genomic region of interest.

Go to the Download menu on the toolbar of the graphical viewer to find options for getting sequence and annotation data.
