Overlapping Genes
Dr Tony Southall
OBJECTIVES
• Understand that genes may be organised in genomes in a complex
spatial arrangement
• Be able to discuss frameshifting mechanisms (with examples)
• Understand potential advantages and disadvantages of overlapping genes
• Be able to describe how overlapping antisense genes can regulate
each other (with examples)
Introduction
Overlapping gene: ‘Adjacent genes, located on either DNA strand, sharing one or
more nucleotides in coding sequence’
• Can encode 2 or more(!) proteins using the same DNA sequence
• An imprecise definition?
• What about regulatory regions/non-protein coding genes?
• 1st reported in 1976 in ssDNA genome phages
• Originally thought to be rare
• Now found in mitochondria, microbes & eukaryotes
• Humans (10% of genes are overlapping)
• Arabidopsis (14% of 25,000 genes)
Types of Overlap
Remember:
There are 2 strands of DNA in a helix so we have two “sites” of overlap.
• Same-strand overlapping:
• Called ‘uni-directional’
• 3’ end of one gene overlaps with the 5’ end of the other
• The genes may be regulated by a common promoter
• Common in bacteria (PMID: 12047938)
• Different-strand overlapping
• 2 categories:
• ‘Convergent’
• 3’ end overlap
• ‘Divergent’
• 5’ ends overlap
• Bidirectional promoters active?
Types of Overlap contd.
• The genomic distribution of these gene pairs is species specific
• e.g. Convergent gene pairs are prevalent in Drosophila but rarer in human and mouse
• May depend on evolutionary constraints
• We also see authors using:
‘Complete’/‘Internal’/‘Embedded’/‘Nested’ Overlaps
- Useful for discussing eukaryotes (see next slide)
[Figure: e.g. nested overlaps drawn with 5’/3’ strand orientation and ‘P’ labels: (Same Strand Nested) and (Different Strand Nested)]
‘Partial’ or ‘Terminal’ Overlaps
Involving only small 5’ or 3’ overlap of coding sequence.
Confusingly some older papers use the term ‘out of phase’ to mean much deeper overlaps
“Examples”
• Genes sharing the same locus on the same strand, however coding for different proteins
• Genes sharing promoter region
• Nested gene
• Embedded gene
• Genes on opposite strands with overlapping locus but no overlap in the exonic region
• Tail-to-tail overlap in the exonic region
• Head-to-head overlap involving 3ʹ-UTRs and coding sequence
PMID: 15680581
Gene “Phase”
• We have to consider the ‘phase’ (reading frame) of the overlapping genes relative to one another.
5’ 3’
ATGCGTTTCATGAAATGTCGTGGTTTTGCGTGTCCG
• One gene is taken as the ‘reference gene’; the reading frame of the other gene is compared against it
• If the overlapping genes have coincident reading frames, they are ‘In Phase’
5’ 3’
ATGCGTTTCATGAAATGTCGTGGTTTTGCGTGTCCG
Reference Gene
Gene 2
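As a minimal sketch of how relative phase can be worked out, the snippet below uses the sequence printed on the slide and treats every ATG as a candidate start codon, which is an assumption made purely for illustration (the slide does not say where Gene 2 begins). The offset is the distance between the two starts modulo 3; note that papers differ in which non-zero offset they call "phase 1" and which "phase 2", so only the in phase / out of phase distinction is reported here.

```python
def relative_phase(ref_start: int, other_start: int) -> int:
    """Offset (0, 1 or 2) of another gene's reading frame from the reference gene."""
    return (other_start - ref_start) % 3


seq = "ATGCGTTTCATGAAATGTCGTGGTTTTGCGTGTCCG"  # sequence shown on the slide

# Assumption for illustration: every ATG is a possible start codon.
starts = [i for i in range(len(seq) - 2) if seq[i:i + 3] == "ATG"]
ref = starts[0]  # the first ATG plays the role of the reference gene
phases = {s: relative_phase(ref, s) for s in starts[1:]}

for s, phase in phases.items():
    label = "in phase" if phase == 0 else "out of phase"
    print(f"ATG at index {s}: offset {phase} ({label})")
```

With this sequence the ATG at index 9 shares the reference frame, while the ATG at index 14 does not.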
“In Phase Overlaps”
• Common in bacteria and viruses
• 2 categories: genes that differ in translation initiation and genes that differ in translation termination
• “Initiation”:
• Alternative translation start site?
• New internal promoter formation
• Genes share terminator
• Dissimilar N-termini
• Identical C-termini
• Bind same substrate but catalyse different reactions?
• e.g. Thermus flavus aspartokinase
• askA: α subunit (405 aa)
• askB (3’ end of askA): β subunit (161 aa)
• “Termination”:
• Same initiator codon
• Termination at distinct codons
• e.g. CS3 Pili genes in E. coli
• 5 polypeptides
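The “Initiation” case (different start, shared end) can be illustrated with a short translation sketch. It reuses the toy sequence from the Gene “Phase” slide rather than the real askA/askB sequences, treats each ATG as a start site (an assumption), and uses the standard genetic code; the slide sequence contains no stop codon, so translation simply runs to the end. An in-phase internal start yields a shorter protein with an identical C-terminus, whereas an out-of-phase start yields an unrelated protein.

```python
from itertools import product

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str, start: int) -> str:
    """Translate from `start` until a stop codon or the end of the sequence."""
    protein = []
    for i in range(start, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


seq = "ATGCGTTTCATGAAATGTCGTGGTTTTGCGTGTCCG"  # toy sequence from the slides

long_form = translate(seq, 0)      # MRFMKCRGFACP
short_form = translate(seq, 9)     # MKCRGFACP (in phase: same C-terminus)
other_frame = translate(seq, 14)   # MSWFCVS   (out of phase: different protein)

print(long_form, short_form, other_frame)
```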
“Out of Phase Overlaps”
• Some genes overlap in ways that don’t result in identical reading frames
• If both are Phase 0 then they are considered ‘in phase’
• Common in prokaryotes
• Common with phages
• Short overlaps often in phase 2, large overlaps in phase 1 (due to genetic code probabilities)
• Different strand out of phase overlaps:
- In prokaryotes, different strand overlaps seem evenly distributed between phases
“Out of Phase Overlaps”
• e.g. the two products of the mouse Ink4a/Arf locus (p16Ink4a and p19Arf)
• They have alternative first exons (1α and 1β) that are transcribed from different promoters
• These are spliced to the same acceptor site in exon 2, which is translated in alternative frames
• Both are key tumour suppressors
PMID: 11584300
Partial Overlap of Genes
• Partial or Terminal Overlap:
• Small overlaps on 5’ or 3’ end
• Common for prokaryotes with functionally dependent genes
• Termination codon of one gene overlaps with the initiation codon of the next
• Tryptophan operon (tryptophan biosynthesis genes) is an example:
• trpE-trpD: one-base overlap
• trpB-trpA: also a one-base overlap (the Shine-Dalgarno sequence for trpA lies within trpB)
• Proteins synthesised in equimolar ratios
• Translational coupling depends on this overlap (PMID: 2685759)
• Proximity of the trpB stop codon to the trpA start influences trpA translation
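A one-base stop/start overlap can be made concrete with a small helper. This is a sketch under stated assumptions: the two toy sequences and the function name `coding_overlap` are invented for illustration and are not taken from the trp operon; the junction `TGATG` is simply one arrangement in which the last base of a stop codon (TGA) is also the first base of a start codon (ATG).

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def coding_overlap(seq: str, stop_pos: int, start_pos: int) -> int:
    """Nucleotides shared by an upstream gene whose stop codon begins at
    `stop_pos` and a downstream gene whose start codon begins at `start_pos`."""
    if seq[stop_pos:stop_pos + 3] not in STOP_CODONS:
        raise ValueError("no stop codon at stop_pos")
    if seq[start_pos:start_pos + 3] != START_CODON:
        raise ValueError("no start codon at start_pos")
    upstream_end = stop_pos + 3  # first index past the upstream gene
    return max(upstream_end - start_pos, 0)


overlapping = "GCTTTCTGATGGCT"     # hypothetical junction: ...TGATG...
separated = "GCTTTCTGAGGATGGCT"    # hypothetical junction with a 2-nt gap

one_base = coding_overlap(overlapping, stop_pos=6, start_pos=8)
no_overlap = coding_overlap(separated, stop_pos=6, start_pos=11)
print(one_base, no_overlap)  # 1 0
```

In the overlapping arrangement a ribosome finishing the upstream gene is already positioned on the downstream start codon, which is the situation the translational coupling bullets describe.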
Summary
• Genes can overlap in all ways you could have thought possible!
Translational recoding
• Ribosomes can be directed to:
• utilise alternative start sites
• bypass or recode termination codons
• undergo a site-specific ‘Programmed Shift of Reading Frame’ (PSRF)
• Ribosomal Frameshift:
• Ribosome pauses on the mRNA, then moves 1 nucleotide backwards (-1) or forwards (+1) before continuing
• > 1 protein per mRNA
• Not every ribosome may shift!
• Depends on:
• mRNA regulatory sequence & structure
• Remember: all mRNA structure must be unfolded before it can be translated
• Affects codon/anticodon binding and can lead to uncoupling
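The effect of a single-nucleotide backward shift on the protein product can be sketched as follows. The mRNA is a made-up toy sequence and `shift_after` is a hypothetical parameter naming the codon after which the ribosome slips back by one nucleotide; real frameshifting is probabilistic and depends on the signals listed above, none of which are modelled here.

```python
from itertools import product

# Standard genetic code in UCAG order (RNA alphabet).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna: str, shift_after=None) -> str:
    """Translate from index 0. If `shift_after` is the start index of a codon,
    the ribosome moves back one nucleotide after decoding that codon (-1 shift)."""
    protein = []
    i = 0
    while i + 3 <= len(mrna):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
        i += 2 if i == shift_after else 3  # advancing by 2 re-reads one nucleotide
    return "".join(protein)


mrna = "AUGGCCGGUUUUUUAGGGUAACUGAAUGA"  # toy mRNA, invented for this sketch

zero_frame = translate(mrna)                   # MAGFLG (stops at UAA)
frameshifted = translate(mrna, shift_after=12)  # MAGFLRVTE (reads past that UAA)
print(zero_frame, frameshifted)
```

Both products share the same N-terminus, but the shifted ribosome no longer sees the original stop codon, so one mRNA yields two proteins.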
-1 Programmed Ribosomal
Frameshifting
• Very common in prokaryotes
• mRNA requires:
i) ‘Slippery’ sequence
• 7 nucleotides (where shift takes place)
X XXY YYZ (original reading frame)
XXX YYY Z (shifted reading frame)
XXX: 3 identical nucleotides, YYY: AAA/UUU, Z: not often G
ii) spacer sequence - 12 nucleotides or less
iii) downstream ‘stimulatory’ structure
• pseudoknots, kissing stem loop
• energetic barrier
• aids positioning over slippery site
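The slippery-sequence rule lends itself to a simple pattern search. The sketch below encodes X XXY YYZ as a regular expression and keeps only matches where X is the last base of a codon in the original frame. Two simplifications are assumptions of this sketch: Z is restricted to A, C or U even though the slide only says G is uncommon, and the spacer and downstream stimulatory structure are not checked. The mRNA is a toy sequence.

```python
import re

# XXX: three identical nucleotides; YYY: AAA or UUU; Z: A, C or U (simplified).
# The lookahead lets overlapping candidates be found.
SLIPPERY = re.compile(r"(?=(([ACGU])\2\2(?:AAA|UUU)[ACU]))")


def find_slippery_sites(mrna: str, frame: int = 0):
    """Return (index, heptamer) pairs laid out as X XXY YYZ in the given frame."""
    sites = []
    for match in SLIPPERY.finditer(mrna):
        # In the original frame, X is the third base of its codon.
        if (match.start() - frame) % 3 == 2:
            sites.append((match.start(), match.group(1)))
    return sites


mrna = "AUGGCCGGUUUUUUAGGGUAACUGAAUGA"  # toy mRNA, invented for this sketch
sites = find_slippery_sites(mrna)
print(sites)  # [(8, 'UUUUUUA')]
```

A hit from this search is only a candidate: whether a site actually promotes frameshifting depends on the spacer and the stimulatory structure that follow it.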
-1 Programmed Ribosomal
Frameshifting contd.
• Slippage may occur during distinct points of translation elongation cycle:
• During accommodation of the A-site tRNA
• Or just after accommodation, before peptidyl transfer
• Or during EF-G catalysed translocation
-1 Programmed Ribosomal
Frameshifting contd.
PMID: 9586242
+1 PRF example
• Saccharomyces cerevisiae: OAZ1
- Mammalian equivalent: ornithine decarboxylase antizyme (OAZ)
- Ornithine decarboxylase (ODC) produces polyamines
- OAZ stimulates ubiquitin-independent degradation of ODC
- Polyamines stabilise a pseudoknot in the OAZ mRNA, inducing +1 PRF and production of functional antizyme
+1 PRF example (cont.)
[Figure: the OAZ/ODC regulatory cycle]
• ODC increases polyamine levels
• Polyamines stabilise pseudoknot in OAZ mRNA
• Induction of +1 PRF on OAZ mRNA
• Frameshift allows production of functional OAZ protein
• OAZ promotes degradation of ODC
Negative feedback loop!
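The logic of this loop can be explored with a toy numerical model. Everything quantitative here is an assumption made for illustration: the three equations, the saturating dependence of frameshifting on polyamines, and all rate constants are arbitrary choices, not measured values or a published model. The sketch only shows the qualitative point that switching the OAZ-driven degradation on lowers the polyamine level the system settles at.

```python
def simulate(feedback: float, steps: int = 40_000, dt: float = 0.05):
    """Euler integration of a toy ODC / polyamine / OAZ loop.

    `feedback` scales how strongly OAZ promotes ODC degradation
    (0 switches the loop off). All constants are arbitrary.
    """
    odc = polyamine = oaz = 0.0
    for _ in range(steps):
        # Assumed: +1 PRF efficiency rises with polyamines and saturates.
        frameshift = polyamine / (1.0 + polyamine)
        d_odc = 1.0 - (0.1 + feedback * oaz) * odc   # synthesis - (basal + OAZ-driven) decay
        d_polyamine = odc - 0.1 * polyamine          # made by ODC, removed at a fixed rate
        d_oaz = frameshift - 0.1 * oaz               # functional OAZ requires the frameshift
        odc += d_odc * dt
        polyamine += d_polyamine * dt
        oaz += d_oaz * dt
    return odc, polyamine, oaz


_, polyamine_without, _ = simulate(feedback=0.0)
_, polyamine_with, _ = simulate(feedback=1.0)
print(f"polyamines without feedback: {polyamine_without:.1f}")
print(f"polyamines with feedback:    {polyamine_with:.1f}")
```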
-1 PRF example
• HIV-1 Gag is produced as a 55 kDa precursor protein that forms the virus particle
• The 160 kDa GagPol polyprotein precursor, containing the viral enzymes protease, reverse transcriptase and integrase, is also expressed at ~5% of the level of Gag
• Due to a -1 programmed ribosomal frameshifting event!
Figure adapted from
PMID: 26119571
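The ~5% figure can be turned into an approximate per-ribosome frameshift probability with a line of arithmetic. The sketch assumes that every ribosome either shifts (making GagPol) or does not (making Gag), and that the two products are equally stable; neither assumption comes from the slide, and the function names are invented for illustration.

```python
def frameshift_efficiency(gagpol_to_gag: float) -> float:
    """Fraction of ribosomes that shift, given the GagPol:Gag product ratio."""
    return gagpol_to_gag / (1.0 + gagpol_to_gag)


def product_ratio(efficiency: float) -> float:
    """GagPol:Gag ratio expected for a given frameshift efficiency."""
    return efficiency / (1.0 - efficiency)


efficiency = frameshift_efficiency(0.05)
print(f"frameshift efficiency: {efficiency:.3f}")            # roughly 1 ribosome in 21
print(f"back-calculated ratio: {product_ratio(efficiency):.3f}")
```

Under these assumptions a GagPol level of ~5% of Gag corresponds to slightly fewer than 5% of ribosomes shifting frame.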
Targeting PRF with drugs?
Images by James Brisbois PMID: 24387306
Overlapping genes
and Programmed Ribosome Frameshifting
Why?!
• Allows for genome compression (see next slide)
• PRF provides another method to increase the diversity of the proteome
PRF and protein diversity
• Stoichiometry: PRF and shared promoters allow proteins to be
expressed at stable levels relative to each other – coordinated control
Advantages of overlapping genes
• Genome size
• Viruses:
• Small genome size / genome compression
• Useful as the capsid limits how much genome a virus can package:
• Can’t package a larger genome (PMID: 20610432)
• Especially with an icosahedral capsid
• Capsid size can only increase with increased subunit number
• Fitness cost (e.g. host cell resources and replication speed)
• Genome replication: faster, so can outcompete other viruses?
Advantages/disadvantages of overlapping genes
• Mutations and evolution
• Mutations: in theory, overlapping might mitigate the detrimental effects of mutation (PMID: 20610432)
Disadvantage:
• Evolution – overlapping genes may be subject to evolutionary constraint
Difference between prokaryotes
and eukaryotes
• Eukaryotes:
• Have larger genomes
• Contain introns so overlapping genes may be located in introns
• More abundant different strand overlaps:
• A result of a more complex genome structure?
• Avoidance of exon sharing retains flexibility?
• A lower proportion of divergent different strand overlaps
• 5’ region delicate?
• Prokaryotes:
• Genes are almost entirely coding sequence, so overlap of coding sequence is common
• Unidirectional overlapping is the most common layout
• operons a driving force?
• PRF more prevalent? Selected for more strongly due to genome size constraints?
Genes can be overlapping for:
i) Genome compression
ii) Sharing promoters
iii) Stoichiometry
Why else might they be
overlapping?
Graphic: Christine Daniloff
Gene regulation by
antisense transcription
Antisense transcription can affect gene expression at 3 different stages:
1) Transcription initiation
2) During transcription
3) Post transcription
Gene regulation by antisense transcription
1) Transcription initiation
Example:
Image adapted from PMID: 24217315
Gene regulation by antisense transcription
3) Post transcription
Example:
Image adapted from PMID: 24217315
Gene regulation by antisense transcription
Sense–antisense pairs as self-regulatory circuits
Fine tuning – antisense expression slightly modulates expression of the sense gene
Bistable switch – strong mutual repression
Image adapted from PMID: 24217315
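The difference between fine tuning and a bistable switch can be sketched with a generic mutual-repression model. This is an assumption-laden illustration, not the model from the cited review: the equations are a standard symmetric "toggle" form, the Hill coefficient of 2 and the two `strength` values are arbitrary, and units are dimensionless. With weak repression both starting conditions settle on the same state; with strong mutual repression the outcome depends on which transcript starts ahead.

```python
def simulate_pair(strength: float, sense0: float, antisense0: float,
                  hill: int = 2, steps: int = 5000, dt: float = 0.01):
    """Euler integration of two transcripts that repress each other's production."""
    sense, antisense = sense0, antisense0
    for _ in range(steps):
        d_sense = strength / (1.0 + antisense ** hill) - sense
        d_antisense = strength / (1.0 + sense ** hill) - antisense
        sense += d_sense * dt
        antisense += d_antisense * dt
    return sense, antisense


# Weak repression: same end state whichever transcript starts ahead.
weak_a = simulate_pair(1.0, sense0=2.0, antisense0=0.0)
weak_b = simulate_pair(1.0, sense0=0.0, antisense0=2.0)

# Strong mutual repression: the initial winner stays on, the other stays off.
strong_a = simulate_pair(10.0, sense0=2.0, antisense0=0.0)
strong_b = simulate_pair(10.0, sense0=0.0, antisense0=2.0)

print("weak:  ", weak_a, weak_b)
print("strong:", strong_a, strong_b)
```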
Summary
• Overlapping genes may be on the same strand or different strand
• Different strand overlaps may be convergent or divergent
• Same strand overlaps may express themselves via Programmed
Ribosomal Frameshifting (PRF)
• Benefits include smaller genome size and faster replication
• Anti-sense transcription can regulate sense gene expression by multiple
mechanisms
References
This lecture is based upon one previously given by Dr Timothy Simpson
and Dr Rey Carabeo (both formerly Imperial College London).
Key references:
• Makalowska, I. et al. (2005) Overlapping genes in vertebrate genomes. Computational Biology and Chemistry 29:1-12
• Dinman, J.D. (2013) Mechanisms and implications of programmed translational frameshifting. Wiley Interdisciplinary Reviews: RNA 3(5):661-673
• Chirico et al. (2010) Why genes overlap in viruses. Proceedings of the Royal Society B 277(1701):3809-3817
• Pelechano, V. and Steinmetz, L.M. (2013) Gene regulation by antisense transcription. Nature Reviews Genetics 14:880-893
Term Table
In the literature different authors (especially considering the year the article was
published in!) may use different pieces of terminology to discuss the same thing.
Therefore, here is a table to simplify things for you
Uni-directional    Convergent       Divergent
Same-strand        Tail-to-tail     Head-to-head
Co-directed        Anti-parallel    Head-on
Parallel           End-on
Tandem