Genex Summary 24

Genex Summary, SS24

Uploaded by

mario.mijat

GENE EXPRESSION: PART 1

Levels of control and the molecules acting at each level:

• Transcriptional control: ncRNA and proteins
• Transcript processing (splicing, transport): snRNA and proteins
• Transcript stability: ncRNA
• Translational control: ncRNA
• Post-translational modification: protein activity
• Protein stability: enzymes

How important are these levels?

Transcriptional control accounts for about 75% of the overall regulation.

BIOCHEMISTRY OF TRANSCRIPTION
Phosphodiester bond: the bond between nucleotides, formed with release of pyrophosphate.
Nucleotides are always added to the 3' end of the growing RNA or DNA chain.

The 5' triphosphate of the incoming nucleotide reacts with the 3'-OH group of the last nucleotide of the chain, and pyrophosphate is released.

• Template strand: non-coding strand (it serves as the template, and the RNA is complementary to it)
• Coding strand: carries the information we want; it has the same sequence as the RNA (with T instead of U)

Growth direction of the transcript: 5' towards the 3' end. The non-coding (template) DNA strand is read in the 3'->5' direction, so that the mRNA grows in the 5'->3' direction.

E. coli rates of synthesis:

• Transcription: 40 nucleotides/s at 37 ºC
• Translation: 15 amino acids/s at 37 ºC
• DNA replication: 50000 bp/min (eukaryotes: 2000 bp/min)

In E. coli transcription and translation happen at the same time, so the two rates have to be roughly matched. They are: 3 nucleotides encode 1 amino acid, so 15 amino acids/s corresponds to 45 nucleotides/s, which almost matches the 40 nucleotides/s of transcription (40:45).
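
The rate comparison above is simple arithmetic, and a minimal sketch makes it explicit. The constants are the values quoted in these notes; the function names and the 1200-nucleotide example gene are illustrative assumptions, not part of the lecture.

```python
# Rates as quoted in the notes (E. coli, 37 ºC).
TRANSCRIPTION_NT_PER_S = 40
TRANSLATION_AA_PER_S = 15
NT_PER_CODON = 3


def translation_nt_per_s(aa_per_s=TRANSLATION_AA_PER_S, nt_per_codon=NT_PER_CODON):
    """mRNA nucleotides a ribosome reads per second."""
    return aa_per_s * nt_per_codon


def seconds_to_transcribe(gene_length_nt, rate=TRANSCRIPTION_NT_PER_S):
    """Time the polymerase needs for a gene of the given length."""
    return gene_length_nt / rate


ribosome_rate = translation_nt_per_s()
print(f"polymerase: {TRANSCRIPTION_NT_PER_S} nt/s, ribosome: {ribosome_rate} nt/s")
print(f"hypothetical 1200 nt gene: {seconds_to_transcribe(1200)} s to transcribe")
```

The two rates differ by only about 10%, which is what allows a ribosome to follow closely behind the polymerase on the same mRNA.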
COMPONENTS OF TRANSCRIPTION
The region of DNA containing all signals and information for the synthesis of a complete transcript is called the transcription unit. It must have a start region, a control region and a termination signal.

The polymerase binds to the DNA at the control region, whose sequence determines where and when to start.

• Prokaryotic: multiple genes per transcription unit (polycistronic). Shorter and fewer control sequences (bacteria have to respond to the environment very quickly)
• Eukaryotic: only one gene (monocistronic). Longer and more numerous control sequences. Many signals regulate transcription (to coordinate different organs and systems). The genes have introns and exons.

Start site:

• Transcription starts at +1 (the first nucleotide that is transcribed) and continues downstream
• The position directly before +1 is -1, which is upstream of the transcribed region; there is no position 0
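
Because the numbering skips 0, converting a sequence coordinate into a position relative to the start site needs one extra step. The following is a small sketch of that convention; the function name and the coordinate system (integer positions along a strand) are assumptions made for illustration.

```python
def relative_to_tss(pos, tss, strand="+"):
    """Position of `pos` relative to the transcription start site `tss`.

    The start site itself is +1, the base just upstream is -1,
    and position 0 does not exist.
    """
    # Distance measured in the direction of transcription.
    offset = pos - tss if strand == "+" else tss - pos
    # Downstream positions (including the start site) are shifted by one.
    return offset + 1 if offset >= 0 else offset


for p in (90, 99, 100, 109):
    print(p, relative_to_tss(p, tss=100))
```

With a start site at coordinate 100, coordinate 65 comes out as -35 and coordinate 90 as -10, the two positions that matter for bacterial promoters.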

STEPS
1. POLYMERASE BINDS TO THE PROMOTER: The polymerase binds to the promoter, the part of the DNA that makes the start of transcription possible. Closed complex formation.
2. TRANSCRIPTION BUBBLE: The two strands are separated from each other. The bubble moves along with the polymerase (13-40 nucleotides in the bubble). Open complex formation.
3. PROMOTER CLEARANCE: The polymerase synthesizes a short piece, stops, and repeats this a few times. Why? It requires a signal that tells it that it is ok to elongate. Extra control step.
4. ELONGATION
5. TERMINATION: Different in eukaryotes and prokaryotes

RNA POLYMERASE
All multisubunit RNA polymerases of all organisms share a common ancestry.

The bacterial RNA polymerase is a multisubunit protein. 5 subunits:

• 2 alpha subunits
• 2 beta subunits (β and β')
• 1 sigma subunit: promoter specificity

Pincers: create a channel for the DNA

Rudder: separates the RNA-DNA hybrid into RNA and DNA


TOPOISOMERASES
The DNA is constrained.

Supercoiling stress is created by the moving RNA polymerase. Because the DNA cannot rotate freely, two types of supercoils form: a positive supercoil (overwound DNA, fewer base pairs per helical turn) in front of the polymerase, where the DNA is more compressed, and a negative supercoil (underwound DNA, more base pairs per helical turn) behind the RNA polymerase, where there is less compression. This is energetically unfavorable. The supercoiling stress is removed by type I and type II topoisomerases.

• Type 1: Cuts only one strand. Relaxation of negative supercoils (type 1A), or relaxation of both negative and positive supercoils (type 1B). No energy is needed because the DNA returns to a more favorable state

• Type 2: Double-strand cut. Relaxation of positive and establishment of negative supercoils. Energy in the form of ATP is required

PROOFREADING BY THE POLYMERASE

Proofreading is the ability of the polymerase to recognize mismatched nucleotides, remove them and incorporate the correct nucleotide. Two mechanisms:

• Pyrophosphorolytic editing: the last nucleotide of the growing RNA is removed by reaction with pyrophosphate. This mechanism is the back reaction of synthesis and is carried out by the 3'-5' exonuclease activity of RNA polymerase
• Hydrolytic editing: The polymerase goes back to handle the mistake. It moves backwards by a few nucleotides (backtracking) and removes a piece of mismatched RNA. In this situation the 3'->5' exonuclease activity requires the elongation factors Gre in bacteria or TFIIS in eukaryotes
TERMINATION
PROKARYOTES: 2 WAYS

• Rho dependent: The protein Rho uses energy to end transcription. The Rho factor recognizes the rut sequence and binds the transcribed RNA. It uses energy (ATP) to pull the RNA away from the polymerase and the bubble.

• Rho independent: The RNA contains a palindromic sequence (usually GC-rich inverted repeats) that forms an intramolecular hairpin at its end. After the hairpin there is a poly-A sequence in the template, which is transcribed into a poly-U stretch. This A-U pairing is less stable than C-G pairing. The combination of the hairpin and the A-U tail destabilizes the complex and thereby releases the polymerase. This can be used to regulate gene expression (phage lambda)
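
The two features of a Rho-independent terminator (a GC-rich inverted repeat followed by a run of U) can be searched for in a transcript. The sketch below is a deliberately naive scan that requires a perfectly paired stem; the stem length, loop sizes, GC cutoff and U-run length are illustrative assumptions, not values from the lecture, and real terminator predictors use thermodynamic scoring instead.

```python
def reverse_complement_rna(seq):
    """Reverse complement of an RNA sequence (A-U, G-C pairing)."""
    return seq.translate(str.maketrans("ACGU", "UGCA"))[::-1]


def find_intrinsic_terminators(rna, stem=5, loops=range(3, 9), min_u=4, min_gc=0.6):
    """Return (stem_start, loop_length) for each hairpin + U-run found."""
    hits = []
    for i in range(len(rna) - stem + 1):
        left = rna[i:i + stem]
        gc = sum(base in "GC" for base in left) / stem
        if gc < min_gc:
            continue  # the stem must be GC-rich
        for loop in loops:
            j = i + stem + loop
            right = rna[j:j + stem]                  # second half of the inverted repeat
            tail = rna[j + stem:j + stem + min_u]    # bases directly after the hairpin
            if right == reverse_complement_rna(left) and tail == "U" * min_u:
                hits.append((i, loop))
    return hits


# Hypothetical transcript end: stem GCGCC, loop AAAA, stem GGCGC, then a U-run.
example = "CC" + "GCGCC" + "AAAA" + "GGCGC" + "UUUUUU"
print(find_intrinsic_terminators(example))
```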

Antitermination in phage lambda

There is a terminator after the early gene transcript, so transcription cannot go beyond it. Among the products of the early genes there are antitermination proteins that make the polymerase ignore the terminator of the early genes, so that the middle-phase genes can be transcribed, and so on.

Antitermination proteins (N, Q) are used as switches between the early and late phases of gene expression.
BACTERIAL PROMOTERS
DNA elements: Do not encode anything but regulate the process of transcription (or other
processes)

Promoters:

- Define the transcription start of a gene.
- Contain DNA sequences (or elements) that direct RNA polymerase to the transcription start.

SIGMA FACTOR: Converts the polymerase into an enzyme with a high affinity for promoters. Allows RNA polymerase to discriminate between promoters and random DNA sequences.

The RNA polymerase holoenzyme (core enzyme plus sigma factor) has a low dissociation constant for promoters. Binding becomes more efficient thanks to the sigma factor.

Which DNA elements distinguish a promoter? Consensus sequences. Two regions in bacterial promoters:

• Region -10: TATA box (consensus TATAAT)
• Region -35: GACA box (consensus TTGACA)
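
A consensus sequence can be used to scan DNA for candidate promoters. The sketch below counts matching bases against the two hexamers TTGACA (-35) and TATAAT (-10). The allowed spacer lengths of 16 to 18 bp between the boxes, the scoring scheme and all names are assumptions for illustration; real promoters rarely match the consensus perfectly.

```python
def matches(window, consensus):
    """Number of positions at which window and consensus agree."""
    return sum(a == b for a, b in zip(window, consensus))


def scan_promoter(dna, box35="TTGACA", box10="TATAAT", spacers=(16, 17, 18)):
    """Return (score, start_of_-35_box, start_of_-10_box) of the best candidate.

    The score is the number of matching bases, at most 12.
    Returns None if the sequence is too short.
    """
    best = None
    for i in range(len(dna)):
        for spacer in spacers:
            j = i + len(box35) + spacer
            if j + len(box10) > len(dna):
                continue
            score = matches(dna[i:i + len(box35)], box35) + matches(dna[j:j + len(box10)], box10)
            if best is None or score > best[0]:
                best = (score, i, j)
    return best


# Hypothetical sequence with a perfect consensus promoter and a 17 bp spacer.
example = "GG" + "TTGACA" + "C" * 17 + "TATAAT" + "GGCC"
print(scan_promoter(example))
```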

MAPPING CONSENSUS SEQUENCES

• Footprinting: The polymerase leaves a footprint where it binds. The DNA is treated with a chemical that modifies it, but the bound polymerase protects its binding site from the modification.

• Site-directed mutagenesis: Create a series of mutants and see which mutations prevent transcription.
SIGMA FACTORS
Different sigma factors (each forming a holoenzyme with the polymerase) recognize different consensus sequences, and therefore different groups of genes.

• Sigma 70: Growth. Recognizes -10 and -35
• Sigma 38: Starvation stress. Stops processes not essential for survival
• Sigma 32: Heat shock. Transcription of chaperones that counteract the denaturation of proteins

Lambda phage gene stage switching (like antitermination): The early genes contain a gene for a sigma factor that starts the middle genes, and so on.

TRANSCRIPTION CONTROL IN BACTERIA


OPERON

• An array of genes characteristic of bacteria
• Defines a group of genes in direct succession on the DNA.
• Genes in one operon are under coordinate regulation and are transcribed into one polycistronic mRNA. There is one control region for multiple genes, e.g. the lac operon

TRANSCRIPTIONAL CONTROL
• Positive regulation: The gene is naturally transcribed at a low rate. The transcription rate has to be increased
• Negative regulation: The gene is naturally transcribed. Transcription has to be shut down

Ligand: Small molecule that binds to the repressor or activator and controls its function.

• POSITIVE REGULATION
- Ligand activates the activator
- Ligand inactivates the activator

• NEGATIVE REGULATION
- Ligand activates the repressor: co-repressor (represses transcription)
- Ligand inactivates the repressor: inducer (induces transcription)
This way of controlling is very common in bacteria, less so in eukaryotes.
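
The four ligand/regulator combinations above form a small truth table. The sketch below encodes it as a boolean function; it is an idealized on/off model (real regulation is graded), and the function and parameter names are illustrative assumptions.

```python
def is_transcribed(regulator, ligand_effect, ligand_present):
    """Idealized on/off outcome for one regulator and one ligand.

    regulator:     "activator" (positive regulation) or "repressor" (negative regulation)
    ligand_effect: "activates" or "inactivates" the regulator
    """
    if regulator not in ("activator", "repressor"):
        raise ValueError(f"unknown regulator: {regulator}")
    if ligand_effect not in ("activates", "inactivates"):
        raise ValueError(f"unknown ligand effect: {ligand_effect}")
    regulator_active = ligand_present if ligand_effect == "activates" else not ligand_present
    # An active activator switches transcription on, an active repressor switches it off.
    return regulator_active if regulator == "activator" else not regulator_active


for regulator in ("activator", "repressor"):
    for effect in ("activates", "inactivates"):
        for ligand in (False, True):
            print(regulator, effect, "ligand" if ligand else "no ligand",
                  "->", "ON" if is_transcribed(regulator, effect, ligand) else "OFF")
```

The inducer case is `is_transcribed("repressor", "inactivates", True)`, and the co-repressor case is `is_transcribed("repressor", "activates", True)`.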

MECHANISMS
• Some activators increase the binding of the polymerase to the DNA by interacting with its alpha or beta subunits.
• Some repressors block the polymerase binding site. Others induce the formation of a repressive loop, usually brought about by protein-protein interaction (also in eukaryotes), etc.

OPERONS
LAC OPERON
When lactose is in the environment, this operon is activated (the cell needs to cleave and metabolize the lactose). It encodes the enzymes needed to metabolize lactose.

STRUCTURE

• Promoter
• CAP binding site
• Operator
• 3 genes: lacZ, lacY, lacA
INDUCERS

• Lactose
• Allolactose
• Synthetic inducers (listed in the lecture slides), used in molecular biology experiments. A chromogenic substrate of β-galactosidase is also used to measure the activity of the operon

LAC REPRESSOR

The repressor shuts down transcription under normal conditions. The inducers inactivate the repressor, so when they are added the genes are transcribed.

The repressor binding site is a palindrome, so we can conclude that this operon is controlled by a repressor that binds as a dimer.

LAC ACTIVATORS

Positive control by cAMP (co-activator), a starvation signal that is produced when there is no glucose and ATP levels are low. It binds to the activator protein CAP, which becomes active and stimulates transcription.

Presence of lactose and presence of glucose: slightly activated, but at a really low level. The negative control is relieved, but the positive control is not activated. Transcription is low.

CAP PROTEIN:

• Changes the conformation of the DNA. Creates a bend that helps the polymerase bind to the DNA.
• Interacts with the CAP binding site

TWO BINDING SITES: One with low affinity, the other with high affinity

In the absence of lactose and glucose, the CAP protein stabilizes the DNA loop formed by the lac repressor bound to operators O1 and O3 (when glucose is present, the repressor binds O1 and O2), so transcription is completely shut down. CAP thus creates a (low-affinity) binding site for the repressor. This is why the operon is not transcribed when there is neither glucose nor lactose.

EXPRESSION OF LAC OPERON

GLUCOSE   LACTOSE   OUTCOME
-         -         -
+         -         +/-
+         +         +
-         +         ++++
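
The expression table can be reproduced from the two control inputs described in these notes: the repressor is bound when lactose is absent, and CAP is active when glucose is absent. This is a sketch of the notes' qualitative model only; the function name and the returned symbols are illustrative assumptions.

```python
def lac_expression(glucose, lactose):
    """Qualitative lac operon output, following the table in the notes."""
    repressor_bound = not lactose   # inducers (lactose) inactivate the repressor
    cap_active = not glucose        # cAMP-CAP forms only when glucose is absent
    if repressor_bound and cap_active:
        return "-"      # CAP stabilizes the repressor loop: completely shut down
    if repressor_bound:
        return "+/-"    # repressed, only leaky expression
    if not cap_active:
        return "+"      # repression relieved, but no positive control
    return "++++"       # repression relieved and CAP active


for glucose in (False, True):
    for lactose in (False, True):
        print(f"glucose={glucose!s:5} lactose={lactose!s:5} -> {lac_expression(glucose, lactose)}")
```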
ARABINOSE OPERON
Works similarly to the lac operon.

The arabinose operon is regulated by the AraC protein, which acts as a dimer, and by the formation of a repressive loop. There are two promoters in the operon, one for the araC gene and one for the araBAD genes, which are needed for the catabolism of arabinose. There are also different binding sites for the AraC protein: the operators (araO1 and araO2) and the araI sites (araI1 and araI2). There are 3 states:

1. Repressed state: When there is no arabinose present, one subunit of the AraC dimer is bound to the operator araO2 and the other one is bound to araI1. Their interaction causes the formation of a repressive DNA loop, and the araBAD genes cannot be transcribed because the promoter is not active.
2. Inducing state: When arabinose is present, it binds to the AraC dimer, which changes conformation and now binds to araI1 and araI2. In this situation both promoters are active, leading to the transcription of both the araC gene and the araBAD genes.
3. Autoregulation state: Once there is enough AraC, it autoregulates itself. AraC binds to the operator araO1 without forming a repressive loop, so that the araBAD genes can still be transcribed, but it represses its own synthesis by interfering with the promoter of the araC gene.

Gene regulation is achieved by switching the DNA conformation between the repressive loop and a non-repressed state. This operon is also controlled by the cAMP-CAP complex and therefore also by the presence or absence of glucose.
SIGNAL TRANSDUCTION: phoR-phoB
A system that senses a nutrient outside of the cell and regulates gene expression accordingly. This one regulates genes that maintain intracellular phosphate levels. It is a 2-component system:

• Sensor (PhoR): measures how much phosphate there is. Low phosphate activates the sensor, which activates the response regulator.
• Response regulator (PhoB): activates genes that increase phosphate levels.

Such systems link the environment to the transcription factors. The balance of activation and repression determines the amount of transcripts.

METHODS TO ANALYZE GENE EXPRESSION
How many genes do I want to look at?

• Few genes expressed in a cell population: qPCR
• All genes expressed in a cell: NGS-based technologies (RNA-seq)

Do I want to know the transcription rates?

• Transcription rates of a few genes: nuclear run-on assay
• Transcription rates of the whole genome: GRO-seq

How was it done originally?

• Northern blot (outdated)
• RT-PCR (outdated because it is not quantitative)

Therefore: Real time RT-PCR

Real time RT-PCR

• Intercalation of a fluorescent dye into the amplified dsDNA
• The increase in fluorescence is determined in real time during PCR
• Determination and evaluation of the exponential phase of PCR and the cycle threshold (Ct) value.

Ct value: number of PCR cycles required to exceed the threshold value. After normalization to a housekeeping gene, the Ct value allows a comparison of the DNA amounts present in different samples. Used to measure individual genes or a few genes.
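
One common way to turn Ct values into a relative expression level is the delta-delta-Ct calculation, sketched below. It is not spelled out in these notes, so treat it as an assumption: it presumes that the amount of product doubles in every cycle (100% PCR efficiency), and the function name and the Ct values are made up for illustration.

```python
def fold_change(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """Relative expression of a target gene, sample versus control.

    Each Ct is first normalized to the housekeeping (reference) gene of the
    same sample, then the two normalized values are compared.
    Assumes perfect doubling per cycle.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2 ** -(delta_sample - delta_control)


# Hypothetical Ct values: the target crosses the threshold 3 cycles earlier
# in the treated sample, while the housekeeping gene is unchanged.
print(fold_change(ct_target_sample=22, ct_ref_sample=18,
                  ct_target_control=25, ct_ref_control=18))
```

A lower Ct means more starting template, which is why the exponent carries a minus sign.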
GENOME WIDE TECHNOLOGIES FOR GENE EXPRESSION ANALYSIS
DNA MICROARRAYS (GENE CHIPS)
Outdated. A chip carries probes in the form of oligonucleotides for all the genes of the organism you are looking at. All the probes are in defined positions.

1. RNA extraction from the sample
2. cDNA synthesis with fluorescent labeling (reverse transcriptase)
3. Hybridization of the labeled cDNA from the sample with the probes on the chip
4. Observe the fluorescent spots on the chip

Another way is to grow the organism with fluorescently labeled nucleotides, which are incorporated during transcription, and then extract the RNA, which is already labeled. You can compare two states or environments (like healthy and diseased tissue) and therefore compare the transcripts of the two states.

NGS BASED TECHNOLOGIES: DNA


Massive parallel sequencing: Sequence a lot in a small amount of time

1. DNA extraction
2. Fragmentation
3. Library preparation: Ligation of adapters to sequence from both ends
4. PCR amplification
5. NGS: Sequence reads of different lengths
6. Annotation

NGS BASED TECHNOLOGIES: RNA

In-silico alignment to reference genomes:

Highly expressed genes have many sequence reads aligned to their loci. The number of aligned reads is therefore directly correlated with the level of mRNA expression: the number of reads from a transcript indicates its abundance.

1. RNA extraction
2. cDNA synthesis
3. Library preparation: Ligation of adapters to sequence from both ends
4. PCR amplification
5. NGS: Short sequence reads
6. Annotation

The multiple peaks seen for one gene (instead of one) are due to splicing: the gaps between the peaks correspond to introns.
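
The idea that read counts reflect transcript abundance can be sketched in a few lines. The example assigns each aligned read to the gene whose interval contains its start position and then scales the counts to reads per million assigned reads. The gene names, coordinates and the scaling are illustrative assumptions; real pipelines also account for gene length, strand and spliced reads.

```python
from collections import Counter


def count_reads_per_gene(read_starts, genes):
    """Count reads whose start falls inside each gene interval [start, end)."""
    counts = Counter({name: 0 for name in genes})
    for pos in read_starts:
        for name, (start, end) in genes.items():
            if start <= pos < end:
                counts[name] += 1
    return counts


def per_million(counts):
    """Scale counts to reads per million assigned reads."""
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: count * 1_000_000 / total for name, count in counts.items()}


# Hypothetical annotation and aligned read start positions.
genes = {"geneA": (0, 100), "geneB": (200, 300)}
reads = [5, 50, 99, 250, 150]   # the read at 150 lies between the genes
counts = count_reads_per_gene(reads, genes)
print(dict(counts))
print(per_million(counts))
```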
RNA SEQUENCING: LIBRARY PREPARATION

1. The polyA tail of the RNA is captured with a polyT oligo
2. The RNA is fragmented and primed
3. The first strand of cDNA is synthesized (RNA-DNA hybrid)
4. The second strand is synthesized
5. Standard ligation procedure
6. Fully functional library

RNA SEQUENCING: RESULTS

Genome-wide gene expression data is often displayed as heatmaps. Hierarchical clustering of the genes reveals genes with similar behavior.

SINGLE CELL RNA SEQUENCING

Single cell sequencing examines the sequence information of individual cells. This provides a higher resolution of cellular differences and a better understanding of the function of an individual cell in the context of its microenvironment.

Each cell is encapsulated together with the reagents in a small oil droplet (emulsion). The first steps are done in these droplets, and a barcode specific to each cell is added. Then all the droplets can be pooled, because the barcode still identifies the cell each read came from, and the pool can be sequenced.
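
After pooled sequencing, the reads are sorted back to their cell of origin by the barcode. A minimal sketch of this demultiplexing step, assuming that the barcode is simply the first few bases of each read (the barcode length of 4 and the example reads are invented for illustration):

```python
from collections import defaultdict


def demultiplex(reads, barcode_len=4):
    """Group reads by their leading cell barcode and strip the barcode."""
    cells = defaultdict(list)
    for read in reads:
        barcode, insert = read[:barcode_len], read[barcode_len:]
        cells[barcode].append(insert)
    return dict(cells)


# Hypothetical pooled reads from two cells (barcodes AAAA and CCCC).
pooled = ["AAAAGGCT", "CCCCTTGA", "AAAATTTT"]
print(demultiplex(pooled))
```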

All the technologies above measure bulk RNA. Now we'll discuss the technologies that allow us to separate the effect of transcription from the influence of post-transcriptional events like processing (splicing, 3' end modification, nuclear transport, degradation etc.). This tells us how much of the RNA we measure is actually due to the control of transcription.
NUCLEAR RUN ON ASSAY

It is based on the principle that the number of polymerases transcribing a gene at a given time is proportional to the rate of transcription. A high transcription rate means that many polymerases are transcribing the gene at the same time, one after another. If a gene is transcribed at a low rate, the number of polymerases is lower too.
This gives a measure of transcription because RNA stability does not play a role yet at this point: what you are measuring is the amount of nascent transcripts (they are not yet finished and therefore not matured or processed). To measure nascent transcripts, radioactively labelled nucleotides are added so that they become part of the mRNA. The radioactivity is therefore directly proportional to the transcription rate of this gene at a given time.

1. Isolate nuclei from the cells of interest (e.g. from infected and non-infected cells) and put them on ice to make the polymerase stop transcribing
2. Add a radioactively labeled nucleotide that will be incorporated into the RNA (usually uridine)
3. Put the nuclei back at 37 degrees, so that the polymerase resumes transcribing and incorporates the radioactive nucleotide
4. Isolate the nascent RNA transcripts
5. Hybridize the RNA to probes on nitrocellulose filters and expose them to film to visualise the radioactive signals
6. Compare the signal with the radioactivity in the total RNA to reveal the relative transcription rate of the gene of interest.

GENOMIC RUN ON SEQUENCING (GRO-SEQ)

The nuclear run-on assay is fine for a few genes. But what if we want to determine the mRNA increase due to transcriptional control genome-wide? We do GRO-seq:
1. Isolate nuclei from the cells of interest (e.g. from infected and non-infected cells)
2. Add a chemically modified UTP (e.g. bio-UTP) that will be incorporated into the RNA
3. Elongate the transcripts in the presence of the chemically modified UTP
4. Isolate the nascent RNA via the chemical modification of U (e.g. with streptavidin)
5. Reverse transcription and RNA-seq
6. Bioinformatic analysis of transcript frequency

The experiment starts just like a classical nuclear run-on assay, but by isolating and purifying the labelled nascent RNAs you can perform RNA-seq, except that the RNA-seq is now done only on the nascent transcripts and not on the total cellular RNA. The bioinformatic analysis gives you the frequency of nascent transcripts: the higher the amount of nascent transcript corresponding to a given gene, the higher the transcription rate of that gene.

Unlike the classical approach, this technology is not limited to a few genes: it lets you determine the transcription rate of all genes that are transcribed in a cell at the same time.
You can compare the results to bulk RNA-seq.

These two technologies allow us to measure gene expression as transcription rates.


Now we will discuss transcription control by the promoter. For that we need promoter isolation, which requires two things:

• Transcription start determination
• Genomic cloning, transient transfection, reporter gene assay

PROMOTER ISOLATION
Primer extension and PRO-cap are technologies that allow us to determine transcription starts for individual genes or for the entire genome, respectively.
PRIMER EXTENSION
Good for determining transcription starts of individual genes. Because you don't know exactly where the transcript starts, you synthesize a primer complementary to some part of the mRNA and extend it with a reverse transcriptase up to the 5' end of the mRNA. Then you measure the length of the extended fragment, which tells you exactly where the transcript started.
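
The start position follows from the fragment length by simple subtraction. A sketch, assuming that positions are numbered along the coding strand of the known gene sequence and that the 5' end of the primer pairs with a known position (the function name and the numbers are illustrative):

```python
def tss_from_primer_extension(primer_5prime_pos, product_length):
    """Coordinate of the transcription start on the coding strand.

    primer_5prime_pos: coordinate that pairs with the 5' end of the primer
    product_length:    length of the extended cDNA, primer included

    The cDNA covers every position from the start site up to the primer's
    5' end, so it spans exactly `product_length` positions.
    """
    if product_length < 1:
        raise ValueError("product length must be at least 1")
    return primer_5prime_pos - product_length + 1


# Hypothetical experiment: primer 5' end pairs with position 150,
# and the extension product is 100 nt long.
print(tss_from_primer_extension(150, 100))
```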

PRO-SEQ AND PRO-CAP

- PRO-SEQ: Like GRO-seq, it measures nascent transcripts. The methodology is the same, and after purifying the RNA the 5' cap is removed, which allows adapters to be ligated at both ends. This is the advantage of PRO-seq over GRO-seq: you get more information about the 5' end of the RNA.

- PRO-CAP: The same methodology is used, but adapters are ligated exclusively to those RNAs that had a 5' cap, so that only the 5' ends of the genes are examined. The difference is that before decapping, the ends of the RNA are dephosphorylated. Only the RNAs that had a 5' cap carry a 5' phosphate after decapping, so adapters can be ligated specifically to their 5' ends.

Using these technologies we can determine the transcription start point. Next we have to determine the promoter, which by definition usually sits upstream of the transcription start.

GENOMIC CLONING FROM SEQUENCED GENOMES

Once we know the transcription start, we want to isolate the promoter (of genes that have already been sequenced and annotated). Depending on the size, we can:
• Smaller DNA fragments: amplify them by PCR (up to 20 kb). So if the promoter lies within 20 kb, we'll be fine.

• Large genomic fragments: eukaryotic promoters are very complex, and sometimes parts of the promoter are more than 20 kb away. Such large fragments cannot be amplified by PCR, so we need to order clones from gene banks

• When you order your gene of interest, the DNA is usually provided as a bacterial artificial chromosome (BAC). BACs contain the origin of replication of the F plasmid and accept inserts of up to 300 kb, so you can replicate this DNA like a plasmid in E. coli.

REPORTER GENE ASSAY

Now we have a piece of DNA in hand that contains the control region of the gene of interest. How do we study and annotate the functional parts of this DNA? How do we find these regions? Using a reporter gene assay.

This assay is used to analyze functional elements in promoters (using BACs). It allows us to test whether a certain part of the promoter contains a sequence that is important for transcription.

We have a plasmid clone containing our piece of DNA with the promoter, and we do promoter bashing and a reporter gene assay:

The promoter is cut into shorter and shorter pieces with exonucleases. These pieces of different sizes are put into vectors with a reporter gene. Then you determine which of these vectors express the reporter gene and which do not. The principle is that you determine whether the promoter still works when a certain piece of it has been deleted.
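
The evaluation of a deletion series can be sketched as follows: each construct is described by the upstream position at which the remaining promoter fragment starts and by the reporter activity measured for it. A region is flagged when deleting it makes the activity fall by more than a chosen fraction. The 50% cutoff, the function name and all numbers are invented for illustration.

```python
def regions_needed(constructs, min_drop=0.5):
    """Find promoter regions whose deletion strongly reduces reporter activity.

    constructs: list of (fragment_start, activity), fragment_start relative to
                the transcription start (e.g. -1000 = fragment begins 1000 bp upstream)
    Returns a list of (from, to) intervals that were lost between two
    consecutive constructs when activity dropped by more than `min_drop`.
    """
    ordered = sorted(constructs)  # longest fragment (most negative start) first
    hits = []
    for (start_a, act_a), (start_b, act_b) in zip(ordered, ordered[1:]):
        if act_a > 0 and act_b < act_a * (1 - min_drop):
            hits.append((start_a, start_b))
    return hits


# Hypothetical 5' deletion series with reporter activity in arbitrary units.
series = [(-1000, 100.0), (-500, 95.0), (-200, 20.0), (-50, 2.0)]
print(regions_needed(series))
```

In this made-up series, removing -1000 to -500 changes little, while removing -500 to -200 and then -200 to -50 each costs most of the remaining activity, so both intervals are candidates for containing functional elements.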

WHAT IS A REPORTER GENE? It is a gene that tells you whether a promoter, e.g. that of a mammalian gene, is active in a mammalian cell. You fuse the promoter to a reporter gene that the mammalian cell does not normally contain. The original reporter gene was the CAT reporter.

In conclusion, the reporter gene assay tests promoter activity by transient transfection of cells.
REPORTER GENES

Reporter genes are not naturally present in the cells used for the reporter gene assay. Frequently used reporter genes:

• Chloramphenicol acetyltransferase (CAT)
• β-Galactosidase (ONPG color reaction)
• Luciferase (from fireflies; generates photons when the substrate luciferin and ATP are present)
• Green fluorescent protein (GFP; can be excited with a laser, and expression analysis in living cells is possible)

All these technologies are great, but they do not tell us where the transcription factors bind. With promoter bashing and the reporter gene assay we find functional sequences of the promoter, but these sequences are too big to pinpoint where a TF binds. So we need more information.

TECHNOLOGIES TO FIND TF BINDING SITES
PHYLOGENETIC FOOTPRINTING
Comparing sequences of homologous promoter DNA regions from different species (phylogenetic footprinting) allows detection of regulatory DNA. Done purely in silico.
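
The in silico comparison can be sketched as a search for stretches that are identical in all species of an alignment. The example assumes the promoter sequences are already aligned to the same length (gaps written as "-"); the minimum block length of 6 and the three sequences are invented for illustration, and real tools score conservation statistically rather than requiring perfect identity.

```python
def conserved_blocks(aligned, min_len=6):
    """Return (start, sequence) for every run of fully conserved columns."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    blocks, start = [], None
    for i in range(length + 1):  # one extra step closes a run at the very end
        conserved = (i < length
                     and aligned[0][i] != "-"
                     and len({seq[i] for seq in aligned}) == 1)
        if conserved and start is None:
            start = i
        elif not conserved and start is not None:
            if i - start >= min_len:
                blocks.append((start, aligned[0][start:i]))
            start = None
    return blocks


# Hypothetical aligned promoter fragments from three species.
species = ["ACGTTGACGTCAAGT",
           "TCGATGACGTCATGA",
           "GCGCTGACGTCACGC"]
print(conserved_blocks(species))
```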

MOTIF SEARCH AND MOTIF DISCOVERY

DNA sequences can be searched for transcription factor binding sites with motif discovery algorithms, using motif databases.

CRISPR-CAS9 TECHNOLOGY
It is used to test the importance and function of transcription factor binding sites. A gRNA directs Cas9 or dCas9 to the TF binding site.

• Binding site destroyed (genome editing with Cas9)
• Does the binding site mediate gene repression? Fuse dCas9 to a domain that mediates repression. If I guide it here, will it repress the gene?
• Does the binding site mediate gene activation? Fuse dCas9 to a domain that mediates activation. If I guide it here, will it activate the gene?
ASSAYS FOR THE MEASUREMENT OF TRANSCRIPTION FACTOR BINDING
ELECTROPHORETIC MOBILITY SHIFT ASSAY (EMSA)
1. Mix a nuclear extract containing transcription factors (an extract of proteins) with a radioactively labeled piece of DNA that you suspect contains a binding site for your transcription factor.
2. Separate the protein-DNA complexes by native gel electrophoresis.

The complex formed by the DNA binding site and the transcription factor migrates more slowly than the free DNA, so the TF causes a mobility shift. The extent of this mobility shift is characteristic of the TF that binds, so different TFs create different mobility shifts. You can also distinguish the number of TFs that are bound.

Of course, the mobility shift alone does not identify the TF and does not prove that the shift was caused by a particular TF. To identify the TF beyond doubt you carry out a SUPER SHIFT ASSAY: you also add to the mix an antibody that recognizes the TF. This gives an antibody-TF-DNA complex, which migrates even more slowly in the gel. The antibody causes a mobility super shift.

CHROMATIN IMMUNOPRECIPITATION AND ChIP-SEQ

The major difference from EMSA is that ChIP measures the binding of a TF across the entire nuclear genome. So it's really a genome-wide assay.

1. Covalent crosslinking of proteins (TFs) with DNA, and chromatin fragmentation by sonication
2. Purify the protein-DNA complexes
3. Immunoprecipitation of the complexes using an antibody against the TF of interest
4. Reverse the crosslink, digest the proteins (TFs), and purify the precipitated DNA
5. Quantitative determination of the precipitated DNA by real-time PCR

Problem with real-time PCR: you already need an idea of where the binding site in the promoter is, because you have to design primers that recognize the sequence for the PCR. You only get a signal if the fragment you suspected and amplified actually contains a TF binding site. So what do you do? ChIP-seq

ChIP-seq: Instead of doing real-time PCR, you sequence the DNA fragments. You can sequence the entire library, or only the DNA bound to the TF of interest. The good thing is that you get all the sites in the genome where your TF binds.

After that you align the sequenced fragments to the genome and get a 'peak map':

• Alignment of sequencing reads to the reference genome
• Peak calling
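
The two analysis steps can be illustrated with a toy example: aligned reads are piled up into a per-base coverage, and a peak is reported wherever the coverage reaches a cutoff. This fixed-threshold approach is an assumption chosen for simplicity; real peak callers compare the signal against a control sample with a statistical model. All names and numbers are illustrative.

```python
def coverage(reads, genome_len):
    """Per-base read depth; each read is a half-open interval (start, end)."""
    depth = [0] * genome_len
    for start, end in reads:
        for i in range(max(start, 0), min(end, genome_len)):
            depth[i] += 1
    return depth


def call_peaks(depth, min_depth):
    """Return half-open intervals where the depth is at least min_depth."""
    peaks, start = [], None
    for i, d in enumerate(depth + [0]):  # trailing 0 closes a peak at the end
        if d >= min_depth and start is None:
            start = i
        elif d < min_depth and start is not None:
            peaks.append((start, i))
            start = None
    return peaks


# Hypothetical aligned ChIP fragments on a 60 bp toy genome.
reads = [(10, 20), (12, 22), (15, 25), (40, 50)]
print(call_peaks(coverage(reads, genome_len=60), min_depth=3))
```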

All these technologies are great for identifying transcription factor binding sites. But what do you do when you want to:

Purify a transcription factor

DNA affinity chromatography, protein sequencing, cDNA cloning (outdated; from before sequencing and bioinformatics)

Analyze transcription factor structure-function (functional domains)

Deletion or point mutagenesis of the cDNA, co-transfection assays, genetic analysis (knock-out, knock-down)

IDENTIFY RELEVANCE OF TF BY LOSS OF FUNCTION

Using genetic engineering and recombination techniques, you make the protein lose its function, which allows you to determine what the original function was. You can do it by:

• siRNA/shRNA-mediated knock-down
• Gene targeting: knock-out, by conventional gene targeting via homologous recombination or CRISPR/Cas9-mediated
CO-TRANSFECTION ASSAYS

These assays are used to determine the activity of a TF by co-transfecting a reporter gene. You have a cDNA clone of your TF and would like to use it to get structure-function information.

The assay is a simple modification of the reporter gene assay. There are two plasmids:

- One with a reporter gene and, upstream of it, a promoter with a binding site for the TF of interest. If the reporter is expressed, you know the TF has bound, because the promoter has been activated

- The other one contains the cDNA clone of the TF of interest

Both plasmids are transfected together into the host cells (co-transfection), and the TF is then expressed. If the binding sequence that we fused to the promoter is a site where our specific TF can bind, the reporter gene is also expressed. This is a way of finding out whether the TF is involved in stimulating our promoter of interest.

STRUCTURE FUNCTION ANALYSIS:

Instead of using a plasmid that encodes the wild-type TF, you chop the TF into pieces and determine the functionality of these pieces by putting them into vectors and analyzing whether the reporter gene is expressed or not.

These kinds of assays are valuable tools for determining the functional domains of TFs. The outcomes of such experiments show that there are no rules as to where the functional elements have to sit in a TF.

Transcription factors show a modular structure of functional domains for DNA binding and transactivation.
IDENTIFICATION OF PROTEIN INTERACTIONS
TFs do not work as single entities. Eukaryotic TFs do not usually interact directly with the RNA polymerase; usually there is another protein or complex in between. To understand mechanistically how the TF activates transcription, we need to know its partners in the process. We usually analyze this by biochemical purification coupled to mass spectrometry:

1. Purification of protein complexes. You can use a tag on your TF of interest to purify it together with all the proteins that are bound to it
2. Separation into individual components by gel electrophoresis or 2-D gel electrophoresis
3. Protease digestion, so that you get peptides of the individual proteins
4. Determination of peptide masses by mass spectrometry. A second round of mass spectrometry on selected fragments (tandem MS) provides more information
5. Comparison of the hits with databases
6. Identification of proteins
TRANSCRIPTION CONTROL EUKARYOTES


PROKARYOTES EUKARYOTES
1 polymerase 3 polymerases

Polycistronic mRNA Monocystronic mRNA

Transcription and translation are Transcription and translation are


coupled (there is no uncoupled (because
compartmentalization) compartmentalization, transport, and
the mRNA processing)
Simple promoter structure Complex promoter structure, multiple
binding sites, chromatin…
Promoter positioning by the RNA Complex regulation RNA polymerase,
polymerase holoenzyme initiation and elongation.

Even though the polymerases of prokaryotes and eukaryotes are closely related (they share
essentially the same core), the eukaryotic ones are more complex and have more subunits.

POL I          POL II              POL III
Nucleolus      Nucleoplasm         Nucleoplasm
rRNA           Pre-mRNA, snRNA     Non-coding: tRNA, 5S rRNA, snRNA, others

How was this found out? Thanks to a toxin (α-amanitin) produced by a fungus, which affects
the three polymerases to different degrees.
TRANSCRIPTION FACTORS (TFs)

Every transcription factor needs a DNA-binding domain (DBD), a structure with which it binds to
the DNA. A transactivating domain (TAD) is also needed to interact with the RNA polymerase
or other proteins to activate transcription. The third structural element that we find in
most (but not all) TFs is a dimerization domain, since most TFs act as dimers.
Two large groups of transcription factors can be distinguished:
• Basal or general transcription factors: Very generally used for the initiation of
transcription. Are components of complexes required for transcriptional initiation in
eukaryotes (initiation complex of transcription). They help recruit RNA polymerase to
the transcription start.
• Transcription factors (constitutive or regulated activity): The general TFs are needed
to form the initiation complex, but they still need a signal that tells them when to form
it. These TFs can act by:
- Binding to sequences proximal or distal to the promoter (distal binding sites are
called enhancers)
- Influencing the frequency with which initiation complexes are formed, also by
interacting with them
- Interacting with proteins needed for RNA polymerase to elongate the
mRNA

TFs can act positively, in which case they are activators, or negatively, in which case they are
repressors.

STRUCTURAL CLASSIFICATION OF TFs BY THE DBD


HELIX-TURN-HELIX MOTIF
The homeodomain is a well-known DNA-binding domain of the helix-turn-helix type.

2 alpha helices linked by a short amino acid turn:

- Recognition helix: Makes base-specific contacts with the DNA. It only recognizes certain DNA
sequences, which confers specificity, and only binds if the specific sequence is in place
- Second helix: Does not make base-specific contacts, but stabilizes the complex by making
contact with the DNA backbone
ZINC FINGER MOTIF

The structure has:

- One alpha helix that contains histidines in specific positions
- Two strands of a beta-pleated sheet that contain cysteines in specific positions
- Loops of amino acids that link these structures and also contain cysteines in specific
positions

The finger conformation is formed because the cysteines and histidines coordinate a zinc
atom. This characteristic binding of the zinc atom forms the finger shape. The finger then
inserts itself into the major groove of the DNA, where it makes base-specific contacts.

Most zinc-finger TFs contain more than one zinc-finger domain to recognize the DNA. In
particular, there is a group of zinc-finger TFs that are special because they bind to the
DNA as monomers (usually TFs act as dimers). These are the zinc-finger proteins called C2H2:
they can bind to the DNA as monomers because they have several zinc-finger domains that
contact adjacent DNA grooves, so you get a stable and base-specific complex.

The zinc-finger proteins that act as dimers have a modification in their structure. Instead of
coordinating the zinc atom with 2 cysteines and 2 histidines, they coordinate the zinc atom with
4 cysteines, and they are called C4 proteins.

BETA SCAFFOLDS
While the helix-turn-helix motif and the zinc-finger structure make contact with the major
groove of the DNA, there are other TFs that make contact with the minor groove. This is
mediated by beta scaffold domains.
LEUCINE ZIPPER AND BASIC REGION
Transcriptional regulators combine leucine zippers with a basic region. These two structures
combine DNA binding and dimerization.

BASIC REGION: is positively charged at physiological pH (rich in basic amino acids).

- Mediates the DNA contact: it can interact with the negatively charged surface of the
DNA. This only explains why this structure forms a stable contact; there must
also be a base-specific contact.

- Forms helices: The helix of one monomer makes contact with the DNA major groove
from one side and the helix of the other monomer binds to the same groove from the
opposite side

LEUCINE ZIPPER: This is the dimerization part.

It is a coiled-coil structure formed by amphipathic helices, which have one hydrophobic and one
hydrophilic face. This property is what makes the protein dimerize. The inner
side of each helix forms a hydrophobic surface, created by leucine residues, that induces specific
dimerization.

HELIX-LOOP-HELIX STRUCTURE

The helix-loop-helix class of TFs follows the same principle of dimerization and DNA binding
as the leucine zippers.

- Coiled-coil structure, also amphipathic, interrupted by an amino acid loop. The loop does
not contribute to the dimerization, so the two helices make the hydrophobic surface that
induces dimerization

- In transcription factors it is coupled to a basic region, like the leucine zipper

DIMERIZATION DOMAIN

Examples of dimerization domains of eukaryotic transcription factors:

- ZIP
- HLH
- COILED-COIL
- SH2

SH2 domain: Protein domain that binds to partner proteins that contain phosphorylated
tyrosine residues
TRANSACTIVATING DOMAINS (TADs)
They are platforms for the binding of proteins. Each TAD has a specific set of proteins that it
prefers to interact with. These interactions are very important for communicating with the RNA
polymerase.

• Mediate protein interactions required for promoter activation, transcriptional
initiation or transcriptional elongation
• Poorly structured. They usually have intrinsically disordered regions that have a
predominance of certain amino acids
• Classified by predominance of amino acids with common chemical properties:
- Acidic TADs (Glu, Asp)
- Gln-rich TADs
- Pro-rich TADs
- Ser/Thr-rich TADs
• Different classes of TADs may contact different protein complexes during promoter
activation/transcriptional initiation

INITIATION OF TRANSCRIPTION
Eukaryotic transcription works very differently from transcription in prokaryotic cells. It
goes through the following distinct events:

1. (Pre)initiation: formation of a (pre)initiation complex (PIC or IC) that brings RNA
polymerase to the promoter
2. Clearance: RNA Pol transcribes a few nucleotides and pauses.
3. Elongation: Continuous (mostly) mRNA-synthesis.
4. Termination: RNA Pol leaves the template, processing of the mRNA 3´ end.

The more complex the organism, the more complex its promoters and gene
regulation tend to be. Eukaryotic cells have more regulatory sequences and these are more
widely spread. Among others, we can find:
• Promoter
• Proximal binding sites for transcription factors
• Distal binding sites called enhancers of transcription
• Regulatory elements within intron sequences
• Regulatory elements downstream of the promoter
• Enhancers that sit at the 3´ end of genes

What always needs to be present is a binding site for RNA polymerase, where the initiation
complex can be formed at the transcription start.
DNA SEQUENCES IMPORTANT FOR TRANSCRIPTIONAL INITIATION
Some eukaryotic promoters contain a TATA box that is usually close to the transcription start;
however, its localization is not as well defined as in bacteria. Most eukaryotic
genes do not have a TATA box, since its function can be taken over by other DNA sequences.
These sequences are recognized by one or more TFs or TF complexes.

TBP: A subunit of the general transcription factor TFIID that binds to the TATA box. Acting
as a monomer, it bends the DNA around the initiation site, which facilitates the binding of
RNA polymerase.

Two TF complexes, TFIIB and TFIID, recognize sequences at the transcription start
of eukaryotic genes.

BASAL (GENERAL) TRANSCRIPTION FACTORS OF RNA POLYMERASE II


They are necessary to form the initiation complex. Most of them form dimers and/or
complexes and therefore have subunits:
• TFIIA: has 2 subunits
• TFIIB: has 1 subunit
• TFIID: has 1 TBP subunit and 13 TAFs
• TFIIE: has 2 subunits
• TFIIF : has 3 subunits
• TFIIH: has 9 subunits. Phosphorylates the CTD domain of Pol II.

THE INITIATION COMPLEX

1. TFIID binds either to the TATA box (in this case it uses its TBP
subunit) or to one of the other elements, and introduces a bend
2. TFIIA stabilizes the binding of TFIID to the promoter. This complex
recruits TFIIB
3. TFIIB binding is very important because it recruits RNA pol II and mediates its
association with the initiation complex. TFIIB interacts
with the DNA, the TBP subunit of TFIID and RNA pol II
4. RNA pol II recruits TFIIF, TFIIH and TFIIE, which bind directly to the
polymerase instead of the DNA
5. At this point the initiation complex is complete and the polymerase is ready
for promoter clearance, which means that the polymerase leaves
the promoter
INITIATION AND REINITIATION

When a gene is transcribed, it is usually transcribed multiple times until the cell no longer
requires it. So multiple initiation complexes have to be formed. Do we need to start them
from scratch every time? No. The 'scaffold complex' accelerates reinitiation: TFIID, TFIIH,
TFIIE, TFIIA and the mediator remain bound to the promoter.

TFIID AND TFIIB FUNCTIONS


TFIID plays a critical role in the initiation of transcription:

• Its TBP subunit bends DNA around the initiation site which allows the polymerase to
accommodate the DNA better.
• It also contacts directly with the RNA polymerase

TFIIB also plays a critical role:

• Interacts with the TBP subunit of TFIID and also contacts the RNA
polymerase directly
• Contacts the DNA directly by binding to the BRE element
• Has a linker region that helps separate the DNA strands by binding near the rudder of the
polymerase
• Has a reader region (the TFIIB reader) that helps the polymerase to find and recognize the
transcription start
• When the polymerase goes into elongation, TFIIB is released. This is why it is not
present in the scaffold complex
POLYMERASE C-TERMINAL DOMAIN (CTD)
The transcription cycle involves phosphorylation and dephosphorylation of the CTD
(C-terminal domain) of RNA polymerase II. The other polymerases do not have this domain.

CTD STRUCTURE:

• Contains heptad repeats: seven amino acids repeated over and over
• The number of repeats depends on the species. In humans there are 52 repeats

The seven amino acid sequence:

Tyrosine - Serine - Proline - Threonine - Serine - Proline - Serine

The only one of these amino acids that cannot be phosphorylated is proline. That means that 5 of these 7 amino
acids can be phosphorylated. Because these 7 amino acids are repeated 52 times, a complex phosphorylation pattern can
be generated. Depending on which amino acids are phosphorylated, the proteins that bind to the CTD change.
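
To get a feeling for how complex this pattern can become, the numbers can be worked out in Python. This is only a back-of-the-envelope sketch: it assumes that all 52 human repeats match the consensus heptad exactly and that every site is either phosphorylated or not, independently of the others.

```python
HEPTAD = "YSPTSPS"             # Tyr-Ser-Pro-Thr-Ser-Pro-Ser, as listed above
N_REPEATS = 52                 # human CTD, as stated above
PHOSPHO_ACCEPTORS = set("YST")  # proline cannot be phosphorylated

# 1-based positions within one repeat that can carry a phosphate
sites_per_repeat = [i + 1 for i, aa in enumerate(HEPTAD) if aa in PHOSPHO_ACCEPTORS]

states_per_repeat = 2 ** len(sites_per_repeat)    # each site: on or off
total_sites = len(sites_per_repeat) * N_REPEATS
total_patterns = 2 ** total_sites                 # theoretical upper bound

print(sites_per_repeat, states_per_repeat, total_sites)
print(f"about 10^{len(str(total_patterns)) - 1} theoretical patterns")
```

Only a small part of this theoretical space is likely used by the cell, but it shows why the CTD can serve as a flexible binding platform.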

CTD FUNCTION:

• Platform for other proteins to bind to
• The proteins that bind are important for initiation, elongation, termination, RNA
capping and RNA splicing (so RNA transcription and processing)
• The proteins that are bound to the CTD keep changing for each different process
depending on the phosphorylation pattern

PHOSPHORYLATION PATTERNS

• Before RNA polymerase joins the initiation complex, all the amino acids are dephosphorylated.
• TFIIH contains a kinase, CDK7, that phosphorylates Ser5, which allows the binding of
capping enzymes (first RNA processing event)
• P-TEFb (elongation factor) contains a kinase, CDK9, that phosphorylates Ser2, which
allows the binding of proteins required for elongation, splicing etc.
TFIIH: TRANSCRIPTION INITIATION AND EXCISION REPAIR
• Core TFIIH:
- XPB (3'-5' helicase)
- XPD (5'-3' helicase)
- p34, p44, p52, p62
• Holo TFIIH (promoter clearance):
- Core TFIIH + CTD kinase
- (CDK7/CycH/MAT1)
• Holo TFIIH (DNA repair):
- Core TFIIH + repair proteins
- (Rad, XPG)

Many of the proteins that contribute to TFIIH are named XP followed by a letter. XP stands for the
disease xeroderma pigmentosum. People who have this disease are very sensitive to UV light,
because the DNA damage it causes is not repaired properly.

The repair process requires a number of these XP proteins. Some of them are found in the core
of TFIIH, which combines either with the CTD kinase or with repair proteins to form a holo
TFIIH. This means that TFIIH has a dual role: it is required for promoter clearance
(core + CTD kinase) and also for DNA repair (core + repair proteins).

ELONGATION FACTOR P-TEFb


In the transition to elongation, the RNA pol starts to transcribe and then it stops. Why? There
is an elongation block that prevents the pol from elongating. This block is formed by 2 factors:

- DSIF (heterodimer of 2 proteins, SPT4 and SPT5)
- NELF

To lift this block, P-TEFb has to phosphorylate:

- Ser2 of the CTD of the RNA pol
- SPT5 of DSIF (which does not dissociate from the pol, however)
- NELF (which dissociates from the pol)

SPT6 is also needed: this protein dissociates NELF from the RNA pol-DSIF-NELF
complex. NELF is replaced by PAF1C, a protein complex required for elongation and transcript
processing (so it stays bound to the RNA pol). PAF1C requires the phosphorylation of
Ser2 and Ser5 of the CTD to associate.

So with the phosphorylation of DSIF and the dissociation of NELF, the RNA pol acquires the
ability to elongate transcripts.
REGULATION OF P-TEFb by C-Myc
This is another potential regulatory step in transcription.

c-Myc: Promotes transcriptional elongation by recruiting P-TEFb

ELONGATION FACTORS
During elongation backtracking occurs, and it has 2 important functions:

• Correction of mismatches and errors. The cleavage of the nucleotides requires an
elongation factor called TFIIS
• There are situations where the DNA template is difficult to transcribe and the pol
gets stuck (arrest complex). In these situations backtracking occurs over longer
distances (not just 1 or 2 nucleotides). These situations also require TFIIS, because it
helps RNA pol to cleave the backtracked RNA and to resume elongation.

TFIIS thereby helps keep the speed of transcription largely independent of the DNA
sequence.

TRANSCRIPTION RATES BY THE ACTION OF TFs


How do TFs change a gene’s transcription rate?

• By promoting/repressing the formation of an initiation complex (IC)
• By promoting/suppressing the transition to elongation

There are several ways in which a TF can influence initiation/elongation of transcription:

• Direct contact with components of the IC
• Contact with the mediator, a large protein complex required for gene transcription
• Enhancement or suppression of CTD phosphorylation
• Changes of the promoter chromatin

TFIID AND MEDIATOR


• TFIID and mediator are large protein complexes that offer contact surfaces for TFs.
Through these contacts, TFs can increase or decrease the binding of TFIID
• Mediator binds to RNA polymerase.
TFIID: has a TBP subunit and 13 TBP-associated factors (TAFs) as subunits, which can all
function as possible contact points for different transcription factors that influence
transcription initiation. The TAFs of TFIID determine:

- Association with Inr, DCE, MTE, DPE sequences (promoter sequences)
- Association with TBP
- Contacts with TFs
- Histone acetylation

Mediator: The mediator allows for indirect contacts between transcriptional activators bound to
enhancer sequences and RNA Pol II. The mediator can reach TFs bound at distal sites
(enhancers) while it is already in contact with the RNA polymerase.

What do these transcriptional activators do in terms of TFIID?

• Improve the binding of TFIID
• Alter the conformation of TFIID so it has a higher affinity for the DNA

Depending on which type of TAD the TF has (in this case acting as an activator), it will
preferentially contact different TAFs in TFIID and therefore have different outcomes.

Mediator structure: The mediator complex contains a huge number of subunits

- Head domain
- Middle domain
- MED domain (for nuclear receptors)
- Tail domain and a kinase domain

The mediator can contact multiple TFs, individually or at the same time, thanks to its
different domains. These complexes with TFs bound in distal regions can occur because of the
flexibility of the DNA and its ability to form loops.

The mediator also interacts with the CTD of the RNA pol, and contacts the proteins of the
initiation complex.

In some cases the contact between TFs and mediator is not direct and is facilitated by a
coactivator protein such as PGC-1α.
INITIATION BY RNA POLYMERASE I AND III
RNA POL I
To form an initiation complex for RNA pol I, the following are needed:

- Core promoter
- UCE sequence (upstream control element) that contains 2 binding sites for UBF

UBF: is a TF that allows the binding of the SL1 factor to the core promoter, which in turn
recruits RNA pol I

SL1: is a TF that acts as the TFIID analog for RNA pol I transcription. It also contains TBP and
TAFs specific to RNA pol I.

RNA POL III


PROMOTER: The promoters of pol III are different from those of the other polymerases. One of
these promoters is peculiar because the binding site for the TFs that initiate
transcription lies in the middle of the gene. This is the 5S rRNA gene.

5S rRNA GENE

Ribosomes contain several ribosomal RNAs, and most of these rRNAs are transcribed by
RNA pol I in the nucleolus. However, one of the rRNAs is the exception to
this rule: the 5S rRNA, which is transcribed by RNA pol III.

It was found that the critical sequence for transcriptional initiation was inside the gene.

How does this work?

- In the gene there are 2 binding sites, Box A and Box B
- The general TF called TFIIIC binds to these boxes
- TFIIIC recruits TFIIIB and TBP
- TFIIIB recruits RNA pol III
- When the complex is formed, TFIIIC leaves the promoter and the polymerase can now
transcribe the gene from the transcription start
MODIFICATION OF CHROMATIN
• Modification of histones: Recruitment of protein complexes with histone-modifying
enzymatic activity
• Chromatin remodeling: Change the accessibility of DNA

THE NUCLEOSOME
Nucleosomes are built around a histone core made of 4 different histones:

• 146 bp of DNA are wound approx. twice around an octamer of histones
• Neighbouring nucleosomes are spaced by between 20 bp and 60 bp of linker DNA.
• Histone H1 associates with the linker DNA and causes
compaction of nucleosomal DNA.
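
As a rough illustration of these numbers, the following Python sketch estimates how many nucleosomes fit on a stretch of DNA. It assumes perfectly regular spacing with one fixed linker length, which real chromatin does not have; the 1 Mb example length is arbitrary.

```python
def nucleosome_count(dna_length_bp, linker_bp, core_bp=146):
    """Number of complete nucleosome repeats (core + linker) on a DNA stretch."""
    return dna_length_bp // (core_bp + linker_bp)


# Bounds for 1 Mb of DNA using the shortest and longest linker from the text
max_nucleosomes = nucleosome_count(1_000_000, linker_bp=20)
min_nucleosomes = nucleosome_count(1_000_000, linker_bp=60)
print(min_nucleosomes, max_nucleosomes)
```

So, under these assumptions, 1 Mb of DNA carries on the order of 5,000 to 6,000 nucleosomes.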

Histones have extensions (the N-termini) that stick out of the core structure. These N-termini
can be modified, and thereby the structure of the nucleosome can be changed as well. The
N-termini are important because their modification can be regulated by TFs.

How is this histone-DNA complex held together?

This is to a large extent due to charges. The surface of the histone core is positively
charged (accumulation of positively charged amino acids) and the DNA backbone is negatively
charged because of its phosphates. The interaction between the histone core and the DNA
is therefore determined by electrostatic interactions.

STRUCTURE OF THE HISTONES: THE HISTONE FOLD


The nucleosome is formed by 4 different types of histones:

• H2A
• H2B
• H3
• H4

It is an ordered structure:

- H2A-H2B dimer

- H3-H4 tetramer

The octamer is formed by one H3-H4 tetramer and 2 H2A-H2B dimers.

The N-terminus of one histone interacts with the C-terminus of the other histone. These
interactions hold the whole structure of the octamer together.

Does an H1 exist? Yes, but it is not part of the nucleosome core. Its function is to bind
to the surface of the nucleosome and mediate interactions between 2 nucleosomes, and
thereby compact the DNA further.

ASSEMBLY OF HISTONES
The assembly of an octamer is not a spontaneous process; it requires histone chaperones:

• It starts with an H3-H4 dimer that interacts with the DNA
• This DNA-dimer complex is joined by a second H3-H4 dimer to form a tetramer
• This tetramer is then joined by the first H2A-H2B dimer to form a hexasome
• The second H2A-H2B dimer is added to form the octamer

There are situations where nucleosomes need to be disassembled (like in transcription). The
disassembly of nucleosomes is exactly the reverse process of their assembly and is also mediated
by histone chaperones.

ELONGATION FACTOR FACT


• FACT displaces the H2A/H2B dimer from the nucleosome.
• This loosens the DNA-nucleosome contact to allow transcription
• In addition, FACT acts as a histone-chaperone to reinstall the H2A/H2B dimer.

CHROMATIN REMODELLING
Most TFs are not able to bind DNA that is tightly wrapped so histones need to be removed or
moved around, to make the TFs binding sites accessible. This is done by chromatin
remodelling complexes.

Different ways to remodel chromatin:

• Site exposure
- Repositioning: All the histones stay but are moved around. Sliding of
nucleosomes along the DNA
- Ejection: One nucleosome is ejected, so there is one nucleosome less
- Unwrapping: Loosening of the contact between the DNA and the histone octamer, so
the DNA can be accessed
• Altered composition
- Dimer exchange: A histone or a histone dimer is exchanged for another
- Dimer ejection: A histone or a histone dimer is ejected

The remodelling complexes use energy to make these modifications in the nucleosome.

How can we determine the nucleosome structure of the DNA? Using ATAC-seq. This
technology tells us which parts of the chromatin are accessible (so the DNA can interact with
proteins) and which parts are not.
ATAC-seq (Assay for Transposase-Accessible Chromatin using sequencing)
The Tn5 transposase moves Tn5 transposons in bacteria such as Escherichia coli, and normally
its cleavage activity is limited to transposons.

Tn5 transposase

- Can be mutated to hyperactive Tn5. This mutant not only binds to accessible chromatin
but also cleaves open chromatin with high activity and tags the fragments with a DNA
adapter (= tagmentation)
- The tags can be used to amplify and sequence the open chromatin fragments
- The sequence reads can be aligned to reference genomes to generate peaks
of open chromatin
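
How aligned reads turn into "peaks" can be illustrated with a deliberately naive Python sketch: Tn5 cut positions are counted in fixed bins, and neighbouring bins above a threshold are merged into open regions. The bin size, threshold and positions are invented for the example; dedicated peak callers use statistical background models instead of a fixed cutoff.

```python
from collections import Counter


def call_open_regions(cut_sites, bin_size=100, min_count=5):
    """Return merged (start, end) intervals of bins with at least min_count cuts."""
    counts = Counter(pos // bin_size for pos in cut_sites)
    open_bins = sorted(b for b, n in counts.items() if n >= min_count)
    regions = []
    for b in open_bins:
        start, end = b * bin_size, (b + 1) * bin_size
        if regions and regions[-1][1] == start:
            # Adjacent to the previous open bin: extend that region
            regions[-1] = (regions[-1][0], end)
        else:
            regions.append((start, end))
    return regions


# Toy cut positions on one chromosome (hypothetical data)
cut_sites = [105, 110, 120, 130, 150,
             205, 210, 220, 230, 240, 250,
             900,
             1500, 1510, 1520, 1530, 1540]
regions = call_open_regions(cut_sites)
print(regions)
```

The single cut at position 900 stays below the threshold, so only two open regions are reported.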

With this experiment we check the accessibility of the DNA

In this example, cells were stimulated with an interferon, which acts by inducing new gene
transcription, in this case of Mx1.

Black line: The cells are not stimulated by the interferon, and the small peaks indicate that
the DNA is only slightly accessible.

Green and purple lines: The gene is now transcribed and widely accessible. The 5’
region (where the promoter is located and transcription starts) is also more accessible.

In a representation of a chromatin landscape, the TF-bound enhancers and the promoter of a
gene are nucleosome-depleted and thus accessible.

Single-cell chromatin accessibility profiles can also be generated: the yellow and blue cells
activate the gene through enhancer A and the green cells do it through enhancer B.

Once you have the accessibility peaks, you can perform further downstream analysis to get a
genome-wide picture of the accessibility.

CHROMATIN REMODELLING COMPLEXES DOMAINS


Since chromatin remodeling is an ATP-dependent process, the complexes must contain proteins
that can cleave ATP (ATPases). Therefore the families of nucleosome remodelers are
defined by their ATPase subunit.

- SWI/SNF family
- ISWI family
- CHD family
- INO80 family

These ATPases have a characteristic structure, and they all contain the following domains:

- DExx domain
- HELICc domain
- Short insertion or long insertion
- Interaction domains: To interact with histones or with other proteins of the
remodeling complex

Interaction domains:

- Bromodomain: Interaction with acetylated histones
- HAS domain: Interactions with proteins in the remodelling complex
- Chromodomain: Interactions with methylated histones
- SANT/SLIDE domains: Interaction with histone tails, important for nucleosome
repositioning (sliding)

The nucleosome position determines the accessibility of TF binding sites. 2 events can occur:

1. Frequent event: Binding of the TF requires that the nucleosome is moved such that the
binding site becomes accessible
2. Rare event: Some TFs (pioneer factors) are able to bind to nucleosomal DNA, so they
can interact with their sites even if these are still wrapped around histones. They often help
to reposition nucleosomes.

MODELS OF COOPERATIVE TF BINDING


For example, 2 TFs need to bind to the DNA in order to activate the gene (cooperation).
There are different models for 2 binding sites:

- 2 binding sites close to each other: both TFs bind to their sites and also interact
physically, so they enhance each other's affinity for the DNA
- 2 binding sites far away from each other: the TFs don’t physically interact, but they can
interact through a coactivator
- One binding site is inaccessible because it’s wrapped in a nucleosome and the other one is
accessible: the TF at the accessible site recruits a remodeler, which makes the
other binding site accessible
- One binding site is inaccessible because it’s wrapped in a nucleosome and the other one is
accessible: binding of the TF at the accessible site makes the other binding site
accessible even in the absence of a remodeler

So the nucleosome position can be influenced by TFs. Through their binding they
determine chromatin accessibility, for example in a process called ´bookmarking´.

Bookmarking: During DNA replication certain regions of the DNA are marked so that they do not
include nucleosome structures, so these regions will be accessible and free of nucleosomes in
the replicated DNA. It can also work the other way: TFs can interact with histones to
help position the nucleosomes.

CHROMATIN CHANGES BY VARIANT HISTONES


The exchange of histone variants corresponds to a certain biological response.

Exchange of H2A for:

- H2AX: The corresponding region of DNA requires repair. Allows interaction with DNA
repair enzymes. Marks the site of DNA damage
- H2AZ: This has to do with the process of gene activation. When a gene becomes active,
histones of its promoter get exchanged for H2AZ
- macroH2A: X-chromosome inactivation

Exchange of H3 for:

- H3.3: Transcriptional activation. It also reduces the stability of the nucleosome, so it
makes remodeling easier and makes it easier to insert other variants
- CENP-A: Centromere function

HISTONE MODIFICATIONS
Histone N-termini protrude from the nucleosome and are accessible for modifying enzymes

ACETYLATION AND DEACETYLATION OF HISTONES


The acetylation of histones is typically linked to transcriptionally active chromatin.

- Acetyl groups are attached by HATs (histone acetyl transferases). Acetylation
neutralizes the positive charge of the histone lysines, which loosens the DNA
interaction with the nucleosome and increases the accessibility of DNA for TFs
- Acetylated histones are recognized by reader proteins containing bromodomains
or other domains that interact specifically with acetylated proteins
- Histone deacetylases (HDACs) remove the acetyl groups from lysines, so they
exert the opposite effect

OTHER MODIFICATIONS
- Acetylation: Always occurs at a lysine (K)
- Methylation: Always occurs at a lysine or arginine (K or R). More than one methyl
group can be attached to one amino acid (e.g. trimethyl-lysine)
- Phosphorylation: Occurs at serines (S) (sometimes also threonines)

There is a histone code. All these modifications are interpreted by different proteins so they
will have different outcomes.

INTERPRETATION OF HISTONE MODIFICATIONS


There are 3 types of proteins:

- Readers: Recognize the histones specifically when they carry a modification,
e.g. proteins with bromodomains that recognize acetyl groups
- Writers: They write the code by adding the modifications, e.g. a HAT adds the
acetyl group or an HMT (histone methyl transferase) adds the methyl group
- Erasers: They remove the modifications, e.g. an HDAC removes the acetyl group or
an HDM (histone demethylase) removes the methyl group

Why are there so many proteins? E.g. why so many HATs?

Different histone acetyl transferases have preferences for different histone lysine residues.
They differ in their specificities for the different lysines in the different histone tails.

We can use ChIP-Seq to determine histone modifications and TF binding sites.

- We use an antibody that is specific only for our modification

Some histone H3 modifications that are widely used to determine the transcription status of
chromatin:
- H3K4me1, H3K4me2: poised enhancers
- H3K4me3: active promoters and enhancers
- H3K9Ac, H3K14Ac: active promoters
- H3K27Ac: active promoters and enhancers
- H3K36me3: transcribed chromatin
- H3K27me3: repressed promoters and enhancers
- H3K9me3: repressed chromatin (heterochromatin)
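
The list above can be used like a lookup table. The following Python sketch annotates a region from the marks detected on it by ChIP-Seq; the dictionary simply restates the list, and the function name and example regions are hypothetical.

```python
# Mark-to-status table, copied from the list above
MARK_STATUS = {
    "H3K4me1": "poised enhancers",
    "H3K4me2": "poised enhancers",
    "H3K4me3": "active promoters and enhancers",
    "H3K9Ac": "active promoters",
    "H3K14Ac": "active promoters",
    "H3K27Ac": "active promoters and enhancers",
    "H3K36me3": "transcribed chromatin",
    "H3K27me3": "repressed promoters and enhancers",
    "H3K9me3": "repressed chromatin (heterochromatin)",
}


def annotate_region(marks):
    """Return the sorted, de-duplicated statuses suggested by the given marks."""
    return sorted({MARK_STATUS[m] for m in marks if m in MARK_STATUS})


active = annotate_region(["H3K4me3", "H3K27Ac"])
silent = annotate_region(["H3K9me3"])
print(active, silent)
```

Regions that carry both activating and repressing marks would return several statuses, which is a reminder that marks are interpreted in combination rather than one at a time.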

Chromatin landscape: The sum of nucleosome positions, histone modifications, variant
histones and associations of other proteins. E.g. a certain histone code is needed to initiate
transcription.

INTERACTION OF TFs WITH HISTONE MODIFIERS


In some situations histone-modifying enzymes are recruited to promoters by DNA-binding TFs.

For example:

- An activator recruits a HAT, so the gene gets transcriptionally activated
- A repressor recruits an HDAC, so the gene gets transcriptionally inactivated

Association of chromatin remodeling complexes and histone-modifying complexes with
chromatin occurs either through reading of established modifications or through association
with transcription factors (or both).

Question: Do these histone codes need to be established in the process of transcriptional
activation of a gene, or are they pre-established?

Answer: If the code is not pre-established, a TF is needed to recruit a histone modifier that
modifies the histones such that the chromatin remodeler can now bind, or the other way
around. Binding of transcription factors can be cause or consequence of chromatin
remodelling and/or chromatin modification.
Transcriptional activation of the Gal1 gene by Gal4

SWI/SNF binding requires interaction with Gal4 and with histones acetylated by SAGA/GCN5
complexes.

Regulation of heat shock genes

Cells have to respond very quickly and transcribe the genes necessary to survive (i.e.
chaperones).

To allow for this, these genes have a pre-formed initiation complex ready to go. However, the
polymerase is stopped by the DSIF/NELF roadblock.

This roadblock needs to be modified in order to release the pol. So the heat shock factor
(HSF) recruits the elongation factor P-TEFb to lift this block.
ORGANIZATION OF THE GENOME
• Euchromatin: contains transcribed regions of the genome with nucleosomal DNA that
is not in compact higher-order chromatin structures.
• Heterochromatin:
- Contains less transcribed or non-transcribed genomic regions, often in tightly packed
chromatin
- Is constitutive (telomeric and centromeric regions) or induced (X-chromosome
inactivation)
- Chromatin modification and density distinguish different forms of
heterochromatin
- Low content of acetylated histones
• DNA elements determine the boundaries and properties of eu- and heterochromatin
(locus control region (LCR), insulator, boundary elements).

CHROMOSOMAL ORGANIZATION
Chromosomes are organized into defined nuclear chromosome territories

• Chromosome territories: Each chromosome occupies its own, defined territory in the
nucleus; few loops are formed between different chromosomes (infrequent event).
• Chromosome painting: Chromosome territories are defined by
chromosome painting, which involves labelling the chromosomes with different
fluorescent dyes

ORGANIZATION OF CHROMATIN: TADs


The loop structure of individual chromosomes is further organized. One of the levels of
organization within chromosomes (so how the genome is organized within individual
chromosomes) is formed by the so-called topologically associating domains (TADs).

Every chromosome touches the nuclear membrane

- The regions that are not transcribed are touching the nuclear lamina
- The regions that are active are inside in the lumen of the nucleus

TAD COMPARTMENTS

Within these regions the formation of DNA


loops is a lot more likely to occur than
outside these compartments

- TAD -> chromatin loops form with higher frequency within a TAD than
between TADs.
- A and B compartments contain
transcriptionally active and inactive
TADs, respectively (2 types of TADs).

What determines the boundaries of a TAD?

A binding site for a transcription factor called


CTCF, which binds to regions that define the
boundaries of TADs.

CHROMOSOME CONFORMATION CAPTURE (3C)


This technology makes it possible to map intra- and inter-chromosomal contacts of DNA loops.

1. Regions A and B are not normally close to each other in the linear structure, but
the loop brings them into proximity
2. Formaldehyde treatment cross-links (fixes) the loop structure
3. Restriction enzymes digest away the loop
4. The cross-linked ends are ligated back together. The connection is now a small loop,
not an extended one, so a circular molecule forms
5. The formaldehyde cross-links are removed and PCR is performed
6. The circular molecule is analyzed

Hi-C technology can be used to create contact maps of the whole genome or of large
regions.
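
A contact map is, in essence, a count of how often two genome regions were ligated together. A minimal sketch in Python, assuming the ligation junctions are already available as pairs of positions; the function name, the bin size and the example positions are made up for illustration and are not part of the lecture:

```python
def contact_map(ligation_pairs, bin_size, n_bins):
    """Count ligation junctions per pair of genomic bins (symmetric matrix)."""
    matrix = [[0] * n_bins for _ in range(n_bins)]
    for pos_a, pos_b in ligation_pairs:
        i, j = pos_a // bin_size, pos_b // bin_size
        matrix[i][j] += 1
        if i != j:
            # a contact between bins i and j is also a contact between j and i
            matrix[j][i] += 1
    return matrix


# Hypothetical junctions: three of them connect the start and the end of a 10 kb region
pairs = [(100, 9500), (150, 9400), (2000, 2500), (120, 9800)]
cmap = contact_map(pairs, bin_size=1000, n_bins=10)
```

A high count far away from the diagonal (here between bin 0 and bin 9) is what a loop between two regions that are distant in the linear sequence would look like.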
LOOPS FORMATION
How do the loops form? Chromosome loop extrusion model

- There are regions in the genome contacted by the cohesin complex
- The cohesin complex interacts with 2 regions and then, using ATP, loops out the
region between them
- The cohesin complex uses the binding sites of CTCF, which mark the boundaries
of the loop
- The binding sites are oriented towards each other
- The binding sites determine whether promoters and enhancers are in the same
loop, and therefore whether or not they can interact
- Not every loop is formed in this way. Some simply form by
protein-protein interaction
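
The loop extrusion model can be sketched as a toy calculation. This is only an illustration under simplifying assumptions: cohesin loads at one position, extrudes in both directions, and stops on each side at the first CTCF site whose motif points towards the inside of the loop. All names, positions and orientations are hypothetical:

```python
def extrude_loop(ctcf_sites, load_position):
    """Return the (left, right) loop anchors, or None if no loop is closed.

    ctcf_sites: list of (position, orientation), where '>' is a motif pointing
    to the right and '<' a motif pointing to the left.
    """
    # On the left side only a '>' site faces the loop interior,
    # on the right side only a '<' site does (convergent orientation).
    left = [pos for pos, ori in ctcf_sites if pos < load_position and ori == ">"]
    right = [pos for pos, ori in ctcf_sites if pos > load_position and ori == "<"]
    if not left or not right:
        return None
    return max(left), min(right)


def in_same_loop(loop, enhancer, promoter):
    """True if enhancer and promoter both lie inside the loop."""
    if loop is None:
        return False
    start, end = loop
    return start <= enhancer <= end and start <= promoter <= end


sites = [(10, ">"), (50, "<"), (60, ">"), (90, "<")]
loop = extrude_loop(sites, load_position=30)
```

With these made-up sites the loop closes between positions 10 and 50, so an enhancer at 20 can reach a promoter at 40 but not one at 70.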

DNA ELEMENTS INVOLVED IN GENOME ORGANIZATION


It’s important to note that we are talking about models, so this does not necessarily work like
this everywhere in the genome. But some instances have been found that do work like this

• INSULATORS: Insulate heterochromatin (non-transcribed regions) from euchromatin
(transcribed regions). Insulator (INS) elements form boundaries between genome regions
with different structural and functional properties.
• LOCUS CONTROL REGION (LCR): Interaction with promoter or enhancer sequences
determines which genes within a chromosomal domain can be transcribed. Within a
transcribed region determines which genes are active and which genes aren’t.
• LCR + INS: Formation of chromatin hubs, very often involving a protein named CTCF.
Chromatin hubs decide whether genes inside are transcribed or not.

INSULATORS AND LCRs


Insulator sequences determine which part of the genome is
accessible for TF and therefore transcribed.

Here we have INS A, which is typically a binding site for CTCF. Then we have
heterochromatin that spreads but stops at the INS sequence. What lies between the
INS sequences remains active

We have a boundary between eu- and hetero- chromatin.


This can be cell type specific: Epigenetic gene regulation
B-GLOBIN LOCUS

How did they discover the existence of LCR in a period where there were no ATAC-seq
technologies? By the chicken b-globin locus

The principle is that if chromatin is compact in a region, the enzyme (DNase I) will not cleave,
but if the DNA is accessible, the enzyme will be able to cleave.

• Dnase I hypersensitivity indicates that a genome region is nucleosome-free and


therefore highly accessible. Dnase I will cleave such regions, but not nucleosomal DNA

• Consistent with the fact that the hemoglobins are synthesized differently in the
different stages of development there was a switch in the DNAse sensitive sites.

It was found that there was one hypersensitive site that is always present, at every stage
of development

Why? Because it determines at all stages which of the hemoglobin genes are transcribed
and when.

Hypersensitive site = Sites where the TFs can bind


because the DNA is accessible. So enhancers f.e.
would form hypersensitive sites

Developmental control of b-globin


gene expression

The LCR exerts an activating


influence on the different
hemoglobins at the different times
of development

There are types of thalassemia (failure to produce hemoglobin) that do not delete the
hemoglobin genes, but delete the LCR. In these cases the activating influence of the LCR on
the individual enhancers is missing.

Question: If the LCR activates all the enhancers of the b-globins, why does it not activate the
enhancer of the folate receptor gene or the odorant receptor gene?

Answer: Because there are insulator elements that prevent the LCR from influencing these genes

How does this work? Chromatin hubs


CHROMATIN HUBS

Chromatin hubs: The insulators, which contain CTCF binding sites, interact with each other
forming a loop. These loops bring the LCRs close to the pertinent enhancers so that they can
interact with them (but not with those that should not be activated).

So the chromatin hubs are structures that prevent the LCRs from exerting their influence all
over the place and restrict it to certain regions. These hubs are formed thanks to the
interaction of insulator sequences with CTCF.

TRANSCRIPTIONAL ENHANCERS
Must fulfill several requirements:

• Must be in an accessible region of the chromatin


• Defined by a characteristic chromatin landscape:
H3K4me1, H3K4me2, H3K4me3, H3K27Ac, H3.3, H2AZ, CBP
• Often a binding site for several different TFs, so they can
respond to different signals (they are modular)
• Variable distances to the transcription start site, thanks to the
loops
• Bidirectional transcription, i.e. formation of bidirectional
enhancer RNAs (eRNAs). Enhancers are actually transcribed

PROPERTIES OF eRNAs
• They are capped: They have a 5’ cap
• They are unstable: Because they often are not polyadenylated
• Proposed functions:
- Regulation of HAT activity: Recruitment of HATs, which provides a signal for histone
modification.
- Interaction with cohesins: Promote loop extrusions and therefore strengthen the loops
between promoters and enhancers
- Interaction with NELF: Stimulate elongation by alleviating the roadblock
- Inhibit the binding of DNA repressors
SELF ORGANIZATION: LIQUID-LIQUID SEPARATION
Proteins are usually soluble because their surface interacts with the surrounding water
molecules, which prevents them from forming aggregates.

LIQUID-LIQUID PHASE SEPARATION: What if there is not enough water because of the high
concentrations of proteins? They form aggregates through hydrophobic interactions.

➔ These aggregates will form transcriptional factories. This is a form of gene control.

TRANSCRIPTION FACTORS:

1. There are more transcription factors in the nucleus than are actively engaged
in transcription at any given time
2. The contact between a transcription factor and the DNA is very short: it binds
to the binding site and then leaves.
3. With a low concentration of TFs it’s very hard to start a transcription event. So we
need high concentrations of TFs

TRANSCRIPTION FACTORY COMPLEX MODEL:

• In the nucleus the proteins that regulate transcription are not randomly floating around
• Through liquid-liquid phase separation they form regions of very high concentration
(transcription factories)
• For this model to work, the enhancers and promoters would have to be located where
the transcription factories are.

WHAT IS SCIENCE AND WHAT IS FICTION?

FACTS:

• Liquid phase separation has been demonstrated


• Liquid phase separation has been shown in the nucleus
• The high off rates of TFs require high local concentrations

So it all makes sense. But the fact that it makes sense does not necessarily mean that it’s true

EPIGENETIC GENE REGULATION


EPIGENETICS: Phenomena that are heritable and that do not depend on the DNA sequence,
but rather on the structure of chromatin. They are mitotically heritable but also reversible.

• Responsible for different gene expression patterns from identical DNA in different
cells and organs.
• The patterns are stably inherited from mother to daughter cells.
• Epigenetic gene control results from the formation of distinct chromatin structures
(eu/heterochromatin) and DNA methylation.

EXAMPLE:

- A liver cell recapitulates all the embryonic development events that distinguish the liver
cell from other cell types.
- When a liver cell divides the daughter will also be a liver cell. This has a lot to do with
chromatin structure that is passed on through mitosis.
- We can maintain a cell type identity through epigenetic phenomena.

CHROMATIN FORMATION
One way of heterochromatin formation is through the modification of histones. One of these
modifications is the METHYLATION OF LYSINE 9 OF THE HISTONE 3 (H3K9):

1. The HP1 chromodomain recognizes methylated H3K9 (mono-, di- or trimethylated)
2. The SET domain of the HMT Suv39 (histone methyltransferase) binds to HP1
3. The HMT transfers methyl groups to the neighbouring histones and helps to spread
heterochromatin
4. Spreading stops when it reaches a boundary element such as an insulator

This mechanism explains how the heterochromatin is formed and spread, but it does not
explain who adds the first methyl group.

This mechanism also explains one way in which an existing methylation pattern is
maintained:

- During DNA replication, the modified histones are distributed
- Both parental and daughter strands get some of the modified histones
- So the replicated region of heterochromatin inherits part of the methylated histones
required for heterochromatin formation
- The new nucleosomes that are formed then receive the methyl group thanks to the
HP1-HMT complex

This is a part of the reason why a chromatin structure can be inheritable. Because the
distribution of the modified histones gets inherited from the parental to the daughter
strands. The newly formed nucleosomes will be methylated in the same way
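
The inheritance of the H3K9 methylation pattern can be sketched as a toy model. The assumptions here are purely illustrative: nucleosomes are a list of True/False (methylated or not), parental nucleosomes are distributed alternately to the two daughter strands (in reality the distribution is not this regular), and the HP1-HMT complex copies the mark onto any unmodified neighbour until it hits a boundary position:

```python
def replicate(parental):
    """Split parental nucleosomes between two daughters; gaps get new, unmarked ones."""
    d1 = [mark if i % 2 == 0 else False for i, mark in enumerate(parental)]
    d2 = [mark if i % 2 == 1 else False for i, mark in enumerate(parental)]
    return d1, d2


def restore(marks, boundaries=frozenset()):
    """Spread the mark from methylated nucleosomes to unmodified neighbours.

    Positions listed in `boundaries` behave like insulators: they are never
    marked, so the mark cannot spread past them.
    """
    marks = list(marks)
    changed = True
    while changed:
        changed = False
        for i, marked in enumerate(marks):
            if marked or i in boundaries:
                continue
            neighbours = marks[max(i - 1, 0):i] + marks[i + 1:i + 2]
            if any(neighbours):
                marks[i] = True
                changed = True
    return marks


# Four methylated nucleosomes, an insulator at position 4, one active nucleosome
parental = [True, True, True, True, False, False]
daughter1, daughter2 = replicate(parental)
```

In this sketch both daughters recover the parental pattern as long as the boundary at position 4 is respected; without it the mark spreads over the whole region.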
POLYCOMB AND TRITHORAX COMPLEX
There are two complexes that are important for this epigenetic way of transferring information
through histone modification.

• POLYCOMB REPRESSIVE COMPLEXES (PRC): Important for the formation of
inaccessible chromatin and gene silencing (repressing chromatin). There are 2
complexes that introduce repressive histone marks:
- PRC2 introduces H3K27me3
- PRC1 introduces H2AK119ub (ubiquitination)
• TRITHORAX COMPLEXES (Trx): Contain histone methyltransferases introducing the
activating mark H3K4me3 (activating chromatin).

Polycomb complexes: PRC1 requires that PRC2 first introduces the tri-methylation in
order for PRC1 to introduce the ubiquitination

Trx complexes: They introduce mono, di, and tri methyl groups. In transcriptional enhancers
the mono and di methyl groups act as poised enhancers, and if the tri-methyl modification is
introduced, the enhancer will be active

How do these complexes know where they have to put their marks? There are 3 ways:
• Mediated by TFs: There are DNA sequences that can be TRE (for polycomb) or CGI (for
trithorax). TFs (activators or repressors) bind to these sequences and then recruit the
complexes as needed.
• Through noncoding RNAs: There are ncRNAs that can bind to polycomb or to the
trithorax and those can then mediate the interaction with these DNA sequences
(TRE/CGI)
• Through reader domains: Both polycomb and trithorax contain reader domains, so
that they can recognize chromatin by determining the histone modification

CELLULAR MEMORY: Once the polycomb and trithorax complexes have determined which
region of the genome can be transcribed and which cannot, this is then going to be inherited
to daughter cells. From then on a pattern of gene expression has been determined
EXAMPLE: Recruitment and displacement of polycomb during lineage specification

We have different stem cells that will differentiate into different cell types. The genes specific
for lineage A, lineage B and lineage C are turned off by polycomb

➔ If the cell differentiates into cell type A, the stem cell genes will be silenced along
with genes B and C, and genes A will be activated by trithorax (Trx)

INACTIVATION OF THE X CHROMOSOME

1. The Xist RNA binds to the X chromosome that has to be silenced
2. Xist recruits complexes that make the chromatin of the X chromosome
repressed, such as PRC1, HDAC, PRC2
3. Chaperons bind to these modifications and we have a compact inactive X chromosome

EPIGENETIC ESTABLISHMENT OF ENHANCERS


Where are future enhancers going to be? Enhancers are important in determining gene
expression, so it is important that cells know. How?
➔ Interaction between transcription factors and histone modifying enzymes

Undifferentiated cells get epigenetic enhancer marks (H3K4 methylation, H3.3, H2A.Z and
binding of pioneer transcription factors)
1. Pioneer TFs bind to nucleosomal (not accessible) chromatin
2. Initial opening of chromatin
3. Binding of chromatin opening TFs
4. Recruitment of chromatin remodelling complexes to fully open
chromatin
5. The enhancer sequences are now open

Differentiated cells get cellular signalling that


stimulates binding of TF to enhancer
sequences
SUMMARY OF GENE REGULATION BY TFs
• Impact on the initiation of transcription
- Contacts with TAFs or mediator, sometimes directly or through coactivators
• Impact on elongation (pausing pol II)
- pTEFb recruitment (through mediator or other mechanisms)
• Impact on DNA structure
- Bending (e.g. TBP)
- DNA loops, chromatin hubs (CTCF)
• Impact on chromatin modification and structure
- Recruitment of histone-modifying enzyme complexes (HAT, HDAC, methyl
transferases, demethylases etc.)
- Association with remodelling complexes (e.g. SWI/SNF)
- Association with activator complexes (e.g. Trx)
- Association with silencer complexes (e.g. PRC2, PRC1)

TRANSCRIPTIONAL REPRESSORS
Almost always we talk about transcriptional activators when
we think of TFs, but they can also repress transcription
through different mechanisms

• Competition: Prevent an activator from binding


through competition or overlapping
• Inhibition: Binds at the same time as the activator and
inhibits it by interaction
• Direct repression: Contacts the mediator complex and
changes the conformation of it
• Indirect repression: Recruits histone modifying
enzymes such as HDACs

DNA METHYLATION
While DNA methylation is an epigenetic mechanism of regulation it should not be confused
with histone modifications. METHYLATION OF THE DNA:

• Can occur only at cytosines that are followed by a guanine (CpG).
- Exception: CpG islands often found at the 5’ ends of housekeeping genes.
• Mostly (but not exclusively) correlated with transcriptional silencing
• Permanent epigenetic gene silencing (X-chromosome inactivation, imprinted genes)
• Stable silencing of endogenous retroviruses, retrotransposons and transgenes.
• Exceptionally reversible gene silencing (e.g. the IL-2 gene in T lymphocytes)
STRUCTURE: The methyl group is added at the 5 carbon atom of the cytosine base.

➔ DNMTs (DNA methyltransferases) are the enzymes that add the methyl group
➔ The base pairing with the guanine is not affected

DNA METHYLATION DE NOVO AND MAINTENANCE


There are 2 types of DNMTs:

• Maintenance DNMTs: These enzymes transfer the


pattern of methylation during the process of
replication to the newly synthesized strand of
DNA
• De novo DNMTs: These enzymes are important
during embryogenesis, because they attach new
methyl groups when there is no pattern to read

CPG ISLANDS
They are DNA regions that contain many CpG sequences

• Very often found in promoter proximal regions, especially for housekeeping genes
• CpG islands help keep the 5’ region open
• Most CpG islands are not methylated but some genes are silenced by methylation of
CpG islands


DNA METHYLATION TRANSCRIPTIONAL REPRESSION


Mechanisms by which DNA methylation can repress transcription:

1. Direct interference with transcription factor binding

The TFs can no longer bind to the DNA when the DNA is methylated

2. Inactive chromatin structure formation

DNA methylation can lead to the formation of heterochromatin, which interferes with the
binding of TFs

3. Specific transcriptional repressors

Enzymatic complexes (like HDACs) that confer repressive marks to the chromatin are
recruited by methyl-cytosine binding proteins (MeCPs) that recognize methylated cytosines

METHODS TO DETECT CpG METHYLATION


How do we know where the genome is methylated?

There are 2 methods we can use:

• BISULFITE CONVERSION
This is a method that doesn’t work with de novo sequencing (a reference genome is needed)

1. Add bisulfite to the DNA
2. The unmethylated cytosines will change to uracils; methylated cytosines stay unchanged.
3. PCR amplification, so there will be a T wherever there was a U
4. Sequencing and comparison with the reference genome
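
The logic of bisulfite sequencing can be sketched in a few lines of Python. This is a simplified illustration only: it looks at a single strand, assumes perfect conversion and a read that aligns to the reference without gaps, and the sequence and methylated positions are invented:

```python
def bisulfite_convert(seq, methylated_positions):
    """Unmethylated C -> U (read as T after PCR); methylated C stays C."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def call_methylation(reference, converted_read):
    """Positions where the reference has a C and the read still shows a C."""
    return [
        i
        for i, (ref, read) in enumerate(zip(reference, converted_read))
        if ref == "C" and read == "C"
    ]


reference = "ACGTCGACCG"   # hypothetical reference sequence
true_methylation = {1, 8}  # hypothetical methylated cytosines (0-based positions)
read = bisulfite_convert(reference, true_methylation)
called = call_methylation(reference, read)
```

The comparison step is the reason a reference genome is needed: a T in the read is only informative if we know that the original base at that position was a C.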

• ChIP-SEQ WITH meC-SPECIFIC ANTIBODIES


This method has low resolution

1. Genome fragmentation
2. Immunoprecipitation of the fragments using antibodies specific for the methyl group
3. Purification of the DNA
4. Sequencing
GENOMIC IMPRINTING
In a zygote 2 haploid genomes form a new diploid organism. These 2 haploid genomes have to
come from different sexes for the embryo to be viable. So how is the maternal genome
different from the paternal?

➔ Because of imprinting (methylation pattern)

IMPRINTING

• In diploid organisms some loci of the maternal and


paternal haploid genomes are differentially modified
(imprinted)
• Imprinting causes different expression of maternal
and paternal genes
• Imprinting is essential for development and
homeostasis of mammalian organisms.

ERASURE AND DE NOVO IMPRINTS


After fertilization, the zygote has a male and female nucleus
with different imprints. These two imprints are maintained
on the same parental chromosome after each cell division in
the embryo cells.

➔ How does imprinting arise during development?

During development the embryonic gonad produces germ cells (sperm in a male, oocytes in
a female). In the gonad, both types of imprints are erased (only in the germ cells). After the
erasure, only one type of imprint is set de novo through de novo DNA methylation. It is
important to note that a female embryo must produce only female imprints in its germ
cells, and the same goes for male embryos.

EXAMPLES OF IMPRINTED GENES


There is no rule as to which of the two genomes, the maternal or the paternal, will be active or
inactive. Here are some examples of imprinted loci:
• LOCUS CHROMOSOME 7: This imprinted locus contains 3 genes H19(lncRNA), SmN
and Igf2 (insulin-like growth factor).
- Maternal allele: H19 is active but Igf2 and SmN are not transcribed
- Paternal allele: H19 is inactive but Igf2 and SmN are transcribed
• LOCUS CHROMOSOME 11: This imprinted locus contains the U2AFbp-rs
- Maternal allele: U2AFbp-rs is inactive and therefore not
transcribed
- Paternal allele: U2AFbp-rs is active and therefore transcribed

• LOCUS CHROMOSOME 17: This imprinted locus contains the Igf2R gene
- Maternal allele: Igf2R is active and therefore transcribed
- Paternal allele: Igf2R is inactive and therefore not transcribed

MODEL OF GENOMIC IMPRINTING


There are different mechanisms for imprinting, depending on which locus we are talking
about, and even for the same locus there can be different models of how things may work

➔ THE IGF2 – H19 LOCUS


Imprinting control regions (ICR): Insulator regions that usually sit between active and inactive
parts of our genome and can be bound by TFs such as CTCF

How does the mechanism of this locus work?

For the imprinting of this locus what is decisive is whether or not CTCF binds to the ICR
region situated between the Igf2 and the H19 genes

CTCF is involved in the formation of DNA loops and chromatin hubs. CTCF is one of those
TFs that are sensitive to DNA methylation: if the ICR is not methylated CTCF will bind,
and if the ICR is methylated CTCF will not bind

Difference between maternal and paternal alleles:


- In the paternal allele the ICR is methylated, so CTCF will not bind
- In the maternal allele the ICR is not methylated, so CTCF will bind

There are different models as to how the imprinting may work:

MODEL 1

This model is based on the fact that to activate the Igf2 gene, a loop must be formed between
this gene and the enhancer PRE. The enhancer also controls the H19 gene and activates
it when it is not forming a loop

• MATERNAL ALLELE (not methylated): The binding of CTCF to the ICR region in the
maternal chromosome blocks the formation of the loop so Igf2 will not be active.
Since the loop is not formed and H19 is also not methylated, this gene will be
transcribed.
• PATERNAL ALLELE (methylated): CTCF cannot bind to the ICR region because of the
methylation, and therefore the PRE enhancer will be able to form a loop and interact
with Igf2 so it will be transcribed. Since the loop is formed and also H19 is methylated,
this gene will not be active.

MODEL 2
This model proposes that CTCF really acts as an insulator when sitting on the ICR by directly
recruiting chromatin remodeler (CHD8).

• MATERNAL ALLELE (not methylated): The binding of CTCF to the ICR region triggers
the closing of the chromatin in the Igf2 region but in the region downstream where
the H19 gene is located the chromatin is open. Meaning that there is no transcription
of Igf2 but H19 is active.

• PATERNAL ALLELE (methylated): CTCF cannot bind to the ICR region so the
chromatin is closed from the ICR region to the H19 gene while the chromatin is open
in the Igf2 direction. Meaning that there is no transcription of H19 but Igf2 is active.
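
Both models lead to the same outcome for the two alleles, which can be written down as a small truth table. A minimal sketch; the function name and the dictionary keys are chosen here only for illustration:

```python
def igf2_h19_expression(icr_methylated):
    """Expression state of one allele, given the methylation state of its ICR."""
    # CTCF is methylation-sensitive: it only binds the unmethylated ICR
    ctcf_bound = not icr_methylated
    return {
        "CTCF_bound": ctcf_bound,
        "Igf2": not ctcf_bound,  # Igf2 is only active when CTCF is absent
        "H19": ctcf_bound,       # H19 is only active when CTCF is bound
    }


maternal = igf2_h19_expression(icr_methylated=False)
paternal = igf2_h19_expression(icr_methylated=True)
```

The sketch deliberately ignores how CTCF blocks Igf2 (loop formation in model 1, chromatin remodelling in model 2), since both give the same expression pattern.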

➔ THE IGF2R LOCUS


RNAs have the ability to bring transcription factors or histone modifiers to chromatin as they
can be involved in the recruitment process.

This locus, apart from containing the Slc22 genes and the Igf2 receptor gene (Igf2R), also
contains another gene called Air, which codes for a lncRNA.

What does Air do? It is important to note that Air only acts in cis, so it can only act upon the
chromosome from which it is transcribed; it cannot diffuse to the other chromosome. This
lncRNA has 2 functions:
1. The transcription of Air interferes with the transcription of Igf2R because both genes
overlap, which means only one of the two can be transcribed.
2. Air participates in the recruitment of histone modifiers such as histone
methyltransferases (HMTs) that silence the Slc22 genes by adding the H3K27
trimethylation mark

MODEL
• MATERNAL ALLELE (methylated): Air is not transcribed, which means the Igf2
receptor gene will be transcribed and the Slc22 genes will be active as well, since Air will
not act on them.

• PATERNAL ALLELE (not methylated): Air is transcribed, which means that the promoter
of the Igf2 receptor gene is occluded, so it will not be transcribed. The Slc22 genes will not
be active either, since Air will act on them and silence them.

GENE DE NOVO SYNTHESIS OR


POSTTRANSLATIONAL MODIFICATION OF TFs
There are different mechanisms that determine when and how TFs are turned from inactive to
active. There are 2 conceptual ways of producing transcription activity in the cells:

• As soon as the TF is produced it is going to be active. So the cell only synthesizes when
it’s needed.
• The TF is already produced and it’s there all the time and when the cell needs it, the TF
will be modified and made active. This could also happen the other way around, when
the TF is already active and the modification makes it inactive.

MECHANISMS OF TFS ACTIVATION


The TFs can be activated or inactivated through different mechanisms:

• By ligand binding
• By chemical modification
• By proteolytic cleavage of a precursor
• By dissociation or association of an inhibitor protein
➔ BY LIGAND BINDING

TFs can be activated or inactivated by ligand binding.

In mammals TFs do not usually bind to metabolites to regulate metabolism. Nonetheless,
some of them do, and they bind to hormones or steroids. Additionally they can also be
activated by cell receptors.

NUCLEAR HORMONE RECEPTORS (NHR)

• Associate with
- Steroids: Glucocorticoids/Cortisol (GR), sex hormones (ER, PR, AR etc.)
- Vitamin D (VDR)
- Other lipophilic hormones: retinoic acid (RAR), thyroid hormone (TR) etc.

There are 2 groups of nuclear hormone receptors. They bind the DNA as:

• Homodimers: (GR/GR), (ER/ER), (PR/PR) etc.


• Heterodimers: This family will associate with the RXR protein (TR/RXR, RAR/RXR etc.)

The receptors are not usually just neutral, but many of them act as repressors in absence of
hormones. When the hormones are bound to them they get activated. These receptors will
recognize the activating ligands in different localizations:

• In the cytosol: glucocorticoid-receptor, GR etc.


• In the nucleus bound to DNA: RXR-heterodimers such as TR/RXR, RAR/RXR, VDR/RXR.
These usually are repressing transcription in the absence of ligand

SPECIFICITY OF NHRs BINDING TO DNA

The NHRs bind either to inverted repeats (palindromes) or to direct repeats:

• Homodimeric receptors bind to inverted repeats

• Heterodimeric receptors bind to direct repeats
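
The difference between the two arrangements can be checked with a few lines of Python: in an inverted repeat the second half-site is the reverse complement of the first, in a direct repeat it is an identical copy. A minimal sketch; the helper names are hypothetical and the two example sequences are textbook-style consensus half-sites used for illustration, not taken from the lecture:

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def classify_response_element(site, half_len=6):
    """Compare the first and last half-site of a response element."""
    first, second = site[:half_len], site[-half_len:]
    if second == first:
        return "direct repeat"
    if second == reverse_complement(first):
        return "inverted repeat"
    return "unrelated"


# Half-sites separated by a 3 bp spacer (palindromic arrangement)
homodimer_site = "AGAACAGGGTGTTCT"
# Identical half-sites separated by a 4 bp spacer (tandem arrangement)
heterodimer_site = "AGGTCAGGGGAGGTCA"
```

Read on the opposite strand, an inverted repeat looks the same as on the top strand, which fits a symmetric homodimer; a direct repeat does not, which fits a heterodimer with two different partners.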

LIGAND-INDUCED CHANGES OF NHRs STRUCTURE

When the receptors bind to their ligands, there is a conformational change that modifies their
structure. This requires NHRs to contain a ligand binding domain
The structural change in the receptor is required for the binding of co activator proteins that
they need to activate transcription.

In the absence of hormones many of these receptors bind to co repressors

This is true for all the receptors that have a conformational change because of the binding of
the ligand.

ACTIVATION OF NHRS THROUGH ACTIVATION OF COACTIVATORS AND HISTONE ACETYLASES

The model presented before is true for all the receptors that have a conformational change
because of the binding of the ligand. But there is a group of receptors that works differently;

1. The NHRs in the inactive state sit in the cytoplasm and are bound to chaperon
complexes (such as HSP90) or repressors that keep them from going to the nucleus.
2. When the ligands bind to the receptors the chaperones dissociate and the active
receptors can enter the nucleus and with the help of a co activator promote
transcription.

TRANSCRIPTION OF ENHANCERS THROUGH NHRs

The estrogen receptor causes transcription of an enhancer sequence to form a long noncoding
RNA (lncRNA) that facilitates DNA looping

1. Two estrogen receptors interact with their ligands (estrogens)
2. One of the receptors binds to the downstream binding site and activates the TPCG gene
3. The other receptor binds to the upstream binding site and activates the transcription
of an enhancer sequence into a lncRNA
4. The transcribed lncRNA facilitates the formation of a loop, which brings together the
two binding sites and thereby enhances transcription
➔ BY CHEMICAL MODIFICATION
This chemical modification usually is the attachment of a phosphate group. There are different
major pathways.

It is important to note that there are 3 major targets for phosphorylation in a cell: serine,
threonine and tyrosine

JAK-STAT PATHWAY

STAT: Group of transcription factors that are activated by tyrosine phosphorylation and that
function as dimers. They interact with receptors that have JAKs (tyrosine kinases) attached

SH2 domain: Proteins that interact with STATs must contain a module that recognizes
phosphotyrosine in a specific amino acid environment. The most prominent of these domains
is the SH2 domain. STATs contain an SH2 domain, which allows them to dimerize with each
other by recognizing each other’s phosphotyrosines.

Activation of signal transducers and activators of transcription (STATs) by tyrosine
phosphorylation:

1. A ligand binds to its receptor and activates the JAK proteins
2. The JAK proteins phosphorylate the STAT transcription factors, which form dimers
3. The active dimer moves to the nucleus and activates transcription of the genes under
its control

TGF-BETA PATHWAY

SMADs: Group of transcription factors that are activated by receptors that bind cytokines
from the TGF-beta family. These receptors are serine/threonine kinases.

1. TGF-beta binds its receptor, which gets activated
2. The activated receptor phosphorylates threonines of the R-SMAD
3. The phosphorylated threonine allows the interaction of the R-SMAD with SMAD 4
4. The SMAD complex moves to the nucleus and promotes transcription of the genes
under its control
SECOND MESSENGER PATHWAY - cAMP

cAMP: Second messenger that controls a lot of cell functions

CREB: Group of transcription factors that localize to the nucleus and are already sitting on
the DNA

1. cAMP binds to protein kinase A
2. The kinase will phosphorylate CREB that is already sitting on the DNA
3. The phosphorylated CREB will now be able to interact with the histone
acetyltransferase CBP
4. CBP acetylates the surrounding nucleosomes to activate transcription of genes
under CREB’s control

MAPK CASCADE SIGNALING AND TRANSCRIPTIONAL CONTROL

Binding of growth factors to receptor tyrosine kinases (RTKs) stimulates MAPK signaling.

RAS: GTP-binding protein that is usually the starting point of the MAPK cascade. RAS is usually
activated by RTKs that interact with adaptor proteins, which make sure that RAS binds GTP

MAPK cascade:

1- RAS bound to GTP will activate Raf, the first kinase of the cascade
2- Activated Raf will phosphorylate and activate the second kinase, MEK
3- Phosphorylated MEK will phosphorylate and activate the third kinase, MAPK
4- Activated MAPK will phosphorylate its targets, among which are TFs
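
The cascade is a simple chain in which every member is only active if the one above it is active. A minimal sketch of that dependency; the `inhibited` parameter is a hypothetical addition to show what happens when one kinase is blocked:

```python
def mapk_cascade(ras_gtp_bound, inhibited=frozenset()):
    """Propagate activation down RAS -> Raf -> MEK -> MAPK."""
    active = {"RAS": ras_gtp_bound}
    upstream = "RAS"
    for kinase in ("Raf", "MEK", "MAPK"):
        # a kinase is active only if its upstream activator is active
        # and the kinase itself is not blocked
        active[kinase] = active[upstream] and kinase not in inhibited
        upstream = kinase
    return active


stimulated = mapk_cascade(ras_gtp_bound=True)
resting = mapk_cascade(ras_gtp_bound=False)
mek_blocked = mapk_cascade(ras_gtp_bound=True, inhibited={"MEK"})
```

Blocking any single step silences everything downstream of it, so MAPK targets such as TFs stay unphosphorylated even though RAS is loaded with GTP.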

Transcription control by MAPK signaling:

Phosphorylated MAPK does 2 things:

• Activates the TF serum response factor (SRF):
1. MAPK phosphorylates and activates the kinase p90RSK
2. p90RSK goes to the nucleus, where it phosphorylates and activates SRF
3. Activated SRF will bind to the SRE sequence
• Activates the TF called TCF (ternary complex factor):
1. Phosphorylated MAPK goes to the nucleus
2. MAPK phosphorylates and activates TCF
3. Activated TCF will bind to the SRE sequence
• TCF and two SRF molecules will form a complex on the SRE sequence and activate
the genes
➔ BY PRECURSOR PROTEOLYSIS
In these cases the TF is translated as an inactive precursor that must be cleaved by a
protease in order to be activated

HYPOXIA-INDUCIBLE FACTOR (HIF)

HIF: Transcription factor that responds to hypoxic states and promotes transcription of genes
that allow better use of the low amount of oxygen available.

• Presence of oxygen: We don’t need the TF. Prolyl hydroxylases attach a hydroxyl group
to a proline of HIF-1 alpha. This acts as a signal for ubiquitination and therefore
degradation.
• Lack of oxygen: We need the TF. Without oxygen the prolyl hydroxylases are inactive,
which allows HIF-1 alpha to interact with HIF-1 beta. This complex can activate the
necessary genes

In this case the entire TF is degraded

SREBP PROTEOLYSIS

SREBP: TF that responds to low cholesterol levels in the cell.

1. The SCAP protease senses the amount of cholesterol.
2. When cholesterol is low, SCAP cleaves SREBP, which gets activated
3. The transcriptionally active SREBP will stimulate genes involved in uptake or synthesis
of cholesterol

➔ BY DISSOCIATION OF AN INHIBITOR
To activate the TF the cell must get rid of the inhibitor protein

THE NFkB PATHWAY

NFkB: Heterodimeric TF formed by two subunits, p50 and p65, that


binds to the promoter of the kappa B genes. NFkB is usually associated
with the inhibitor IkB .

1. A stimulus activates the IkB kinase, which will phosphorylate IkB
2. The phosphorylation of IkB is a signal for ubiquitination
3. Ubiquitinated IkB will be degraded, which will release NFkB
4. NFkB will move to the nucleus and activate the target genes
THE Hsp90 SHOCK FACTOR

Heat shock factor (HSF): TF that responds to high temperatures and activates chaperone genes

Hsp90: Chaperone that helps fold theproteins during high temperatures. In normal conditions
this chaperone is bound to HSF

In the presence of high temperature:

1. Proteins will start to lose their structure and associate with chaperones
2. The unfolded proteins will start to decrease the amount of Hsp90 available for binding
to HSF
3. HSF will be free to trimerize
4. The HSF trimer will promote the transcription of more chaperones

➔ BY FORMATION OF DIFFERENT HETERODIMERS

This is a mode of regulation by TFs that belong to the HLH family.

HLH: These TFs contain a helix-loop-helix structure for dimerization, and a basic region for
association with DNA. One important HLH is MyoD

• To be active, 2 HLH transcription factors must form a dimer and they must have the
two basic regions to bind to the DNA
• If an HLH dimerizes with a protein that has the helix structure, but not the basic
region, it will not be able to bind to the DNA (low affinity)

TF SIGNALING
How does the signal arrive at the cell in order to activate a transcription factor? There are
3 different situations

• Signal penetrates the cell membrane (hormones)
• Signal binds to receptor which directly phosphorylates the TF
• Signal binds to receptor which phosphorylates the TF through intermediates

TERMINATION OF TRANSCRIPTION IN EUKARYOTES


The termination of transcription in eukaryotes is entirely different from that in bacteria.

In bacteria there are the Rho-dependent and the Rho-independent ways of terminating
transcription, in which the RNA is dragged away from the polymerase. In eukaryotes the RNA
is separated from the pol by cleavage.

As the RNA is transcribed, at the end of the gene, sequences will be transcribed that constitute
termination signals, aka polyadenylation signals.

1. Polyadenylation signals are recognized by the nuclease CPSF, which will cleave the RNA.
2. The cleaved-off RNA will still be bound to CPSF, which will recruit the poly-A
polymerase
3. Poly-A pol will, in a template-independent manner, attach adenosine nucleotides to
the 3’ end of the RNA
4. The polyadenylation protects the RNA from degradation
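
The cleavage and polyadenylation steps above can be sketched as a toy model. This is only an illustration: the signal sequence AAUAAA, the cleavage offset of 20 nucleotides and the tail length of 50 are assumptions chosen for the example and are not taken from these notes.

```python
def cleave_and_polyadenylate(pre_mrna, signal="AAUAAA", offset=20, tail_len=50):
    """Toy model: cut downstream of the poly-A signal, then add a poly-A tail."""
    pos = pre_mrna.find(signal)
    if pos == -1:
        return None  # no polyadenylation signal: no cleavage in this model
    # Step 1: cleavage a fixed (assumed) distance downstream of the signal
    cut = min(pos + len(signal) + offset, len(pre_mrna))
    cleaved = pre_mrna[:cut]
    # Step 3: template-independent addition of adenosines to the new 3' end
    return cleaved + "A" * tail_len


pre = "GGCUAGCUAGG" + "AAUAAA" + "C" * 40   # hypothetical end of a transcript
mature = cleave_and_polyadenylate(pre)
```

The sequence downstream of the cut is simply discarded here, which mirrors the idea that the RNA is released from the polymerase by cleavage.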
CENTRAL DOGMA OF MOLECULAR BIOLOGY
Gene expression: Process by which information from a gene is used in the synthesis of a
functional gene product (protein or non-coding RNA), which ultimately affects a phenotype
as the final effect.

The central dogma of molecular biology deals with the detailed residue-by-residue transfer of
sequential information. It states that such information cannot be transferred back from
protein to either protein or nucleic acid.

Gene expression is regulated at the level of transcription. Why is regulation at the
translational level required or beneficial?

➔ Because we have to respond fast to the environment

FROM DNA TO PROTEIN IN PROKARYOTES


What is the major difference between prokaryotes and eukaryotes?

➔ In prokaryotes everything takes place in the same compartment and at the same time.

In bacteria we have a co-transcriptional protein synthesis

• While the RNA pol is transcribing, the ribosomes are already there translating the
nascent mRNA.
• RNA pol directly interacts with the first ribosome to arrive and forms the so-called
expressome

There are different scenarios for the translation of mRNAs in prokaryotes

• Co-transcriptional translation
• Translation of mature mRNA: These are already fully transcribed
• Translation and simultaneous insertion into the membrane
TRANSCRIPTION
PROMOTER AND SIGMA FACTOR
Quick recap of prokaryotic transcription, promoters, and RNA pol.

RNA POL:

• Core enzyme
- 2 alpha subunits
- 2 beta subunits (beta and beta')
- 1 omega subunit
• Sigma factor: Identifies the promoter

We have several different sigma factors which regulate changes in the transcriptome due to
different stress conditions:

• Sigma 70: Housekeeping genes
• Sigma 38: Stationary phase
• Sigma 32: Heat shock response
• Sigma 28: Motility genes
• Sigma 54: Nitrogen metabolism

All these sigma factors recognize a sequence located between -35 and -10 relative to the
starting point of transcription

➔ Sigma factors have a C-terminal domain that recognizes the -35 region

MESSENGER RNA (mRNA)


What are the features of pro- and eukaryotic mRNAs that are important for translation?

PROKARYOTIC mRNA:

• Tri-phosphorylated at the 5’ end:
- Protects RNA from degradation
• 5’ UTR:
- Protects RNA from degradation
- Contains the Shine-Dalgarno sequence, which is a recognition motif for the
ribosomes
- Can form very specific structures (secondary or tertiary), which will allow or inhibit
ribosome, RNA or protein binding
• ORF
- Contains the AUG start codon
- Contains the stop codon
• 3’ UTR:
- Landing platform for the ribosome
- Regulatory platform
- It is very flexible and, depending on the nature of the 5’ UTR, translation can also
be regulated from downstream

There are 2 types of mRNAs

• Monocistronic: Only one ORF that will give rise to only one protein
• Polycistronic: They are transcribed from operons, which are consecutive genes that
are transcribed from one starting point by the RNA pol. So the final mRNA will have
several ORFs
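
The difference between the two types can be illustrated with a minimal ORF scanner. This is a simplified sketch: it only looks for AUG followed by an in-frame stop codon, ignores ribosome binding sites and overlapping ORFs, and the two example sequences are made up.

```python
STOPS = {"UAA", "UAG", "UGA"}


def find_orfs(mrna):
    """Return (start, end) index pairs of AUG...stop ORFs, scanning 5' to 3'."""
    orfs = []
    i = 0
    while i <= len(mrna) - 3:
        if mrna[i:i + 3] == "AUG":
            # Walk codon by codon in the frame set by this AUG
            for j in range(i + 3, len(mrna) - 2, 3):
                if mrna[j:j + 3] in STOPS:
                    orfs.append((i, j + 3))
                    i = j + 2  # resume the scan right after this ORF
                    break
        i += 1
    return orfs


mono = "GGAGG" + "AUGAAAUAA"                                     # one ORF
poly = "GGAGG" + "AUGAAAUAA" + "CCAGGAGG" + "AUGCCCGGGUGA"       # two ORFs
```

Calling `find_orfs(mono)` finds a single ORF, while `find_orfs(poly)` finds two consecutive ORFs on the same transcript, as expected for an operon.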

What is the point of polycistronic mRNAs?

➔ The regulation is much easier. Usually these types of RNA comprise ORFs whose
products are found either in a common regulatory pathway, or make up one machine
with different modules or subunits

IMPORTANT: Polycistronic mRNA can have intrinsic different regulatory systems

E.g.: Maybe there is a secondary structure that prevents translation initiation of ORF2, and
only if ORF1 is translated can the stalling ribosomes open the start point of ORF2

THE GENETIC CODE


3 potential reading frames, but only one encodes the
correct amino acid sequence
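
As a small illustration of the three reading frames, the sketch below splits a sequence into codons starting at offsets 0, 1 and 2. The example sequence is invented for the purpose.

```python
def reading_frames(seq):
    """Split a sequence into codons for each of the three reading frames."""
    frames = []
    for offset in range(3):
        # Incomplete codons at the 3' end are dropped
        codons = [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]
        frames.append(codons)
    return frames


frames = reading_frames("AUGGCUUAA")
# frames[0] is the frame that starts at the AUG; the other two are shifted by 1 and 2 nt
```

Only the first frame here reads AUG, GCU, UAA (start, alanine, stop); the shifted frames produce completely different codons from the same nucleotides.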

tRNA: Is the adaptor molecule of the genetic code. It binds to the codon through its
anticodon sequence. This molecule identifies the codon and translates it into the
amino acid, which is attached at its 3’ end.

Redundancy of the genetic code: There are 64 possible combinations of codons, but only
20 different amino acids. Some aa are specified by more than one triplet.

There are 3 codons that specify the stop of protein synthesis. There is one start codon (AUG),
which codes for the aa methionine. Methionine is encoded by only this one codon. We do have
a distinct initiator tRNA, which is different from the elongator tRNAs that read the codons for
methionine or other aa.

Why is it that some aa are encoded by a lot of codons? The genetic code is very optimized!!

➔ Abundance of the amino acid has to be taken into account
➔ Chemically similar aa often have related codons to minimize the effect of mutations

Third base degeneracy: In order to preserve the speed of translation, the most important are
the first 2 nucleotides of the codon. Their pairing to the anticodon is thoroughly checked by
the ribosomes.

• The 3rd one is not thoroughly checked, so it doesn’t matter that much what the third
nucleotide is.
• To keep the error as low as possible, the different aa’s that share the first 2
nucleotides are usually chemically similar

Proline: It is coded by all 4 codons that start with CC (so CCN). This is done to keep the proline
where it should be. Why?

• Proline is the only aa that does not have a primary amino group. So whenever a
proline is inserted the structure changes a lot. It can be for instance a helix breaker.
• Additionally, if we have 2 or 3 prolines in a row to translate, a particular elongation
factor for translation is required in order to make the peptide bonds.

Tryptophan: It is coded by only one codon. Why?

• It’s the least abundant aa which is incorporated.
• Moreover, the synthesis of tryptophan is very energetically demanding.
• Additionally, it has a huge hydrophobic pi electron surface area. So it is also very
particular in determining the protein structure
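
The redundancy described above can be checked by building the standard genetic code table and counting codons per amino acid. The compact string encoding below is just one convenient way to write the standard table (bases in the order U, C, A, G; `*` marks a stop codon).

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# How many codons specify each amino acid (or stop)?
codons_per_aa = Counter(CODON_TABLE.values())
proline_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "P")
```

In this table proline (`P`) gets the four CCN codons, tryptophan (`W`) and methionine (`M`) get a single codon each, and three codons are stops.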

DECODING SITE OF THE RIBOSOME STRUCTURE

The universally conserved nucleotides G530, A1492 and A1493 of the 16S ribosomal
RNA are critical for tRNA binding in the A site.

These nucleotides are involved in identifying the perfect Watson-Crick base pairing

Only when there is a correct codon-anticodon pairing do these nucleotides induce an overall
ribosome conformational change, so the ribosome knows that the tRNA fits properly

In red we have the tRNA localized in the A site of the ribosome, with its anticodon. In yellow
we have the mRNA with the codon.
CODON–ANTICODON RECOGNITION INVOLVES WOBBLING
➔ The pairing between the first base of the anticodon and the third base of the codon
can vary from standard Watson-Crick base pairing according to specific wobble rules

BASE IN FIRST POSITION OF ANTICODON    BASE(S) RECOGNIZED IN THIRD POSITION OF CODON
U                                      A or G
C                                      G only
A                                      U only
G                                      C or U
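
The wobble table can be turned into a small lookup that lists the codons a given anticodon can read. This sketch encodes only the four rules in the table above and treats the other two positions as strict Watson-Crick pairs; modified bases are not considered.

```python
# First anticodon base -> bases it can pair with in the third codon position
WOBBLE = {"U": "AG", "C": "G", "A": "U", "G": "CU"}
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def codons_read_by(anticodon):
    """List the codons (5'->3') that an anticodon (written 5'->3') can read."""
    first, second, third = anticodon
    # Pairing is antiparallel: anticodon positions 3 and 2 pair with
    # codon positions 1 and 2 by standard Watson-Crick rules.
    prefix = COMPLEMENT[third] + COMPLEMENT[second]
    return [prefix + wobble_base for wobble_base in WOBBLE[first]]


phe_codons = codons_read_by("GAA")   # anticodon GAA reads both Phe codons
met_codons = codons_read_by("CAU")   # anticodon CAU reads only AUG
```

An anticodon starting with G or U covers two codons, which is why fewer tRNAs than codons are needed.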

DECIPHERING THE GENETIC CODE


How did scientists identify the nature of the code?

EXPERIMENT

➔ Question: What amino acids are specified by codons composed of only one type of
base?

METHOD

1. They started with a simple homopolymer, poly U
2. They labeled free amino acids
3. They added the amino acids to the vial with poly U
4. They precipitated the peptide that had been formed
5. They saw that only phenylalanine was being incorporated.

They concluded that the codon UUU codes for phenylalanine. They went on to check the
other combinations and finally cracked the genetic code.

TRANSFER RNA (tRNA)


We have a primary sequence of tRNA that folds, forming a secondary structure that consists of:

• D loop
• T loop
• Anticodon loop
• 3’ end: There is a CCA sequence to which the aa is attached
- In prokaryotes the CCA end is encoded in the DNA
- In eukaryotes there is an enzyme that adds the CCA end

The secondary structure of the mature tRNA folds into a very particular L shape. These tRNAs
are encoded all over the chromosomes
RIBOSOMAL RNA OPERONS
There are several operons where the ribosomal RNA is encoded. E. coli has 7 of these
operons. The arrangement of the operons is always the same:

• After 2 promoters we have the ribosomal RNA 16S (small subunit)
• After the 16S there can be one tRNA
• After the tRNA come the ribosomal RNAs 23S and 5S (they make the scaffold for the
large ribosomal subunit)
• Sometimes after the 5S gene, there are more tRNA genes

When the entire RNA is transcribed, there are RNases that cleave the 16S and the 23S rRNAs
out of the operon transcript.

➔ HOW ARE THE tRNAs CORRECTLY PROCESSED AT THEIR 5’ AND 3’ ENDS?

A number of endo- and exonucleases are required. One of the most important is RNase P,
which processes the mature 5’ end. This enzyme is composed of RNA and protein, but its
functional part is the RNA. For the 3’ end there are different exonucleases that chop off the
3’ end up to the CCA sequence.

CHARGING AND ACTIVATION OF tRNA


The tRNAs must be charged with their respective amino acid. This process of aminoacylation is
performed by the enzymes aminoacyl-tRNA synthetases (aaRs). For each and every amino
acid there is one synthetase.

CATALYTIC MECHANISM OF AMINOACYLATION


There are different steps:

1. Formation of aminoacyl adenylate: The amino acid situated in the aaRs is first
activated through its linkage to an AMP (ATP minus 2 P) forming the intermediate
aminoacyl adenylate. The aaRs has very specific sites where only one amino acid can
bind
2. Aminoacyl-tRNA is charged: The tRNA binds to the aaRs, which recognizes the
anticodon of the tRNA. Then there is a nucleophilic attack from the 3’ OH of the last A
of the tRNA on the adenylated amino acid. A transesterification occurs and a bond is
formed while the AMP leaves the complex.

This means that the aaRs have to interact with different anticodons and have to determine
which one is the correct one.

There are two classes of aaRs Class I and Class II. They interact with the tRNA from opposite
positions.

PROOFREADING DURING AMINOACYLATION

Specificity of amino acid-tRNA pairing is controlled by proofreading reactions that hydrolyze
incorrectly formed aminoacyl adenylates and aminoacyl-tRNAs. There are 2 kinds of
proofreading:

• Kinetic proofreading: A proofreading mechanism that depends on the fact that
incorrect events proceed more slowly than correct events. Incorrect events are
reversed before a subunit is added to a polymeric chain. A lot of this has to do with
orientation

• Chemical proofreading: A proofreading mechanism in which the correction event
occurs after the addition of an incorrect subunit to the polymeric chain, by means of
reversing the addition reaction. Similar to the pol proofreading. The aaRs have an
editing site, and whenever a correct amino acid is attached, it does not fit into the
editing site, so the charged tRNA is released. An incorrect amino acid that does fit
into the editing site is hydrolyzed off.

RECOGNITION PROBLEM
The tRNAs have to be recognized very specifically by their corresponding aaRs, but are more
general in their placement on the ribosome.

The affinity of each and every tRNA for the ribosome should be the same in order to keep
the pace of translation constant and also the insertion of the tRNA in the A site constant.
There is a problem in achieving this though, since each tRNA has a different aa attached. So
the affinities would not be the same. The cell needs to balance the affinities of the tRNAs for
the ribosome by compensating for the chemical nature of the attached amino acids.

➔ How can that be achieved?

The tRNAs can contain modified bases to balance the chemical nature of the aa, and
therefore to have the same affinity for the ribosome.

• 81 examples of modified bases in tRNAs have been reported
• Modification usually involves direct alteration of the primary bases. Some cases
involve base removal and replacement by another base.
• Modifications confer increased stability on tRNAs and modulate their recognition by
proteins and other RNAs in the translational apparatus

ADENOSINE TO INOSINE MODIFICATION

Sometimes the modifications not only contribute to the overall affinity for the ribosome, but
also have a particular meaning.

Inosine can pair with 3 nucleotides (U, C or A). The adenosine to inosine modification is
particularly found at position 34, which corresponds to the wobble position of the
anticodon. Modifications in the anticodon affect the pattern of wobble pairing and thus are
important for tRNA specificity.

SUPPRESSOR tRNAs
Suppressor tRNAs: have mutated anticodons that recognize new codons, but they still have
the original aa attached.

So we have a normal sequence with a normal codon. Due to a mutation, this codon of the
sequence changes to a termination codon.

➔ What will happen?

• Nonsense mutant: The termination codon will be recognized by a release factor and
we’ll have a truncated protein.
• NONSENSE SUPPRESSION: Through the suppressor tRNAs the nonsense mutant is
corrected. There are suppressor tRNAs for each and every stop codon. These recognize
the stop codon, because of their own mutation, but still introduce the original aa they
are attached to instead of stopping the translation

This was used in molecular biology to change some amino acids in a protein of interest to see
whether the aa was important or could be exchanged etc.

Suppressor mutations are not very efficient. If they were efficient, termination would not
work anymore, even when it’s the correct thing.
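
The competition between release factor and suppressor tRNA at a stop codon can be sketched as a toy translation loop. Everything specific here is an assumption for illustration: the tiny codon table, the UAG stop, the tyrosine-carrying suppressor and the `efficiency` parameter (probability that the suppressor wins).

```python
import random

CODONS = {"AUG": "M", "UAC": "Y", "GGC": "G", "AAA": "K"}  # small illustrative subset
STOP = "UAG"


def translate(mrna, suppressor_aa=None, efficiency=0.0, rng=random):
    """Translate codon by codon; a suppressor tRNA may read through UAG."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon == STOP:
            if suppressor_aa and rng.random() < efficiency:
                peptide.append(suppressor_aa)  # suppressor tRNA inserts its aa
                continue
            break  # release factor wins: termination
        peptide.append(CODONS[codon])
    return "".join(peptide)


wild_type = "AUGUACGGCAAA"
nonsense = "AUGUAGGGCAAA"   # UAC (Tyr) mutated to UAG (stop)
```

Without a suppressor the nonsense mutant yields a truncated peptide; with `suppressor_aa="Y"` and an efficiency between 0 and 1, only a fraction of the ribosomes read through, which is the situation described above where suppression must stay inefficient so that normal termination still works.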

So we have a normal sequence with a normal codon. Due to a mutation, this codon of the
sequence codes for another amino acid.

Missense: This new codon will be recognized by a normal tRNA that has an aa that should
not be there, so there will be an error in the peptide chain.

• MISSENSE SUPPRESSION: The missense codon will be recognized by a suppressor tRNA
that is actually carrying the correct original aa but that, thanks to its own mutation,
recognizes the missense codon.

Suppressor tRNAs compete with wild-type tRNAs that have the same anticodon (but different
aa) to read the corresponding codon(s).

RIBOSOME
To start translation the ribosome has to identify the start codon on the ORF, and then it will
translate the mRNA into a peptide and finally reach the stop codon where it will start
termination and recruit the recycling factors. But…

➔ What is the ribosome composed of?

RIBOSOME STRUCTURE
There are 2 ribosomal subunits, each with a different structure:

• Small subunit: Is the 30S
- Head: When the ribosome is assembled the mRNA is localized here and in the neck
- Body
- Neck: Where the mRNA is bent
• Large subunit: Is the 50S
- L11 arm: Important for recruitment of the factors during translation. It has the
protein L7/L12 where the elongation factors will bind
- L1 arm: Important for recruitment of the factors during translation

There are 3 different positions on the ribosome for the tRNA. These are only positions. There
will never be a ribosome with more than two tRNAs. They will either be in the A and P sites or
the P and E sites. Whenever a tRNA comes to the A site, the one that is in the E site has to
leave the ribosome.

➔ What is the S nomenclature? Why do they sum 70?

To purify a ribosome we have to sediment it. The S stands for the sedimentation coefficient
(Svedberg unit), which depends on the mass as well as the shape of the particle, so S values
are not additive. The assembled ribosome is more compact (roughly spherical) than the two
separate subunits, so it sediments at 70S and not at the arithmetic sum of 30S and 50S.
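
Why S values do not simply add up can be made plausible with a very idealized calculation. The sketch assumes that every particle is a compact sphere of the same density, in which case friction grows with the radius and the sedimentation coefficient scales with mass to the power 2/3. Real ribosomes are not ideal spheres, so the number is only indicative.

```python
def combined_s(s_values):
    """Sedimentation coefficient of one compact sphere built from the given particles.

    Idealized assumption: equal-density spheres, so s is proportional to mass**(2/3),
    i.e. mass is proportional to s**1.5 (arbitrary units).
    """
    total_mass = sum(s ** 1.5 for s in s_values)
    return total_mass ** (2 / 3)


s_ribosome = combined_s([30, 50])
```

Under these assumptions the combined particle comes out in the mid-60s rather than at 80, which is much closer to the observed 70S than the plain sum.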

rRNA: The ribosomal RNA is mainly localized in the middle or interior of the ribosome, while
the proteins coat the outside surface. So the catalytic function is really done by the rRNA.
The proteins most likely help the rRNA adopt its correct position

EUKARYOTIC RIBOSOME
If we compare the eukaryotic 40S (which is the equivalent to the 30S in prok.) to the
prokaryotic 30S, we can appreciate that there are a lot more proteins in the eukaryotic
ribosome. The entire translation process is more aided by proteins in eukaryotes. The more
complex the organism, the more proteins on the ribosome.

EVOLUTION OF THE RIBOSOME

There is a layered evolution from the prokaryotic ribosome to the ones of more
complex organisms, by adding proteins which have diverse functions and mechanisms
of modulation.

The mammalian rRNA expands a lot by adding expansion segments. There are a lot of
very long rRNA extensions that are very flexible and contribute to a variety of functions.

The bacterial ribosome RNA is the core for protein synthesis. For eukaryotes there is a
lot of protein and small extensions. The mammalian ribosome evolution introduced
really huge extensions

RIBOSOME PURIFICATION
The ribosome is quite easy to purify via sucrose density gradient centrifugation.

SUCROSE DENSITY GRADIENT CENTRIFUGATION

1. We have a linear sucrose gradient where the sucrose concentration at the top is 15%
while at the bottom it is 45%
2. We take cells, break them, pellet the membranes and vesicles, and take the
supernatant where the cytoplasmic extract will be (RNA, DNA etc)
3. We will load the cytoplasmic extract onto the sucrose tubes and then we will place
them in the swing-out rotor and centrifuge them at ultra speed

The components of the ribosome and the cell will separate. The larger they are the further
they move in the gradient. At the bottom we’ll have the 70S assembled ribosome, and at the
top the 50S and 30S subunits.

4. After the centrifugation, we’ll pump in a highly concentrated sucrose solution to push
the gradient out of the tube and run it through a UV detector
5. We’ll measure the absorbance of the rRNA at 260 nm (A260). Every time there is a
peak we can identify the position of the rRNA in the gradient (top or bottom)

6. We can separate the fractions and load them on a gel

Here we have the total RNAs of the cytoplasm (including all types of RNAs) and then the
fractions of these gradients. The 70S comprises the assembled ribosome, for the 30S subunit
we only have the 16S part, and for the 50S we only have the 23S part. Not a super nice
purification then.
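
Reading the gradient profile comes down to finding the peaks in the A260 trace. The sketch below does this with a simple local-maximum test; the absorbance values and the threshold are invented for illustration and do not come from a real run.

```python
def find_peaks(absorbance, threshold=0.1):
    """Return indices of fractions that are local maxima above a threshold."""
    peaks = []
    for i in range(1, len(absorbance) - 1):
        if (absorbance[i] > threshold
                and absorbance[i] > absorbance[i - 1]
                and absorbance[i] >= absorbance[i + 1]):
            peaks.append(i)
    return peaks


# Hypothetical A260 readings; fraction 0 is the top of the gradient
a260 = [0.02, 0.05, 0.30, 0.08, 0.04, 0.55, 0.10, 0.03, 0.90, 0.20, 0.02]
peaks = find_peaks(a260)
# Peaks are ordered from top to bottom, so here they would be read as 30S, 50S, 70S
labels = dict(zip(peaks, ["30S", "50S", "70S"]))
```

Since larger particles travel further, the peak closest to the bottom of the tube is assigned to the assembled 70S ribosome.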

STRUCTURE ANALYSIS
To analyze the structure of an RNA (either ribosomal or another) what methods are used?
STRUCTURAL MAPPING OF RNA BY CHEMICAL AND ENZYMATIC PROBING

There are a lot of possible structures that can form in RNA. There are single stranded
regions that are not paired, but most of the time the RNA tends to fold and make helices,
loops etc.

Nowadays we have programs that, because we know the RNA sequence, can predict folds
and structures. But the predictions are not the reality because, for example, there are
chaperones that favor a structure that would not be calculated bioinformatically. So how do
we do it?

• BY ENZYMATIC PROBING
We can use RNases to do an enzymatic cleavage. We’ll use RNases that are specific for
cleaving at specific positions.
There are a lot of RNases that degrade single stranded regions but there are also
others that cleave at double stranded regions. The ones used are the ones for double
stranded regions

• BY CHEMICAL PROBING
There are a lot of chemicals which modify the nucleotides. There are different probes
which modify different positions, either in the backbone or at the nucleosides.

With the enzymatic and chemical probing we can modify or cleave the RNA, and we can
identify the positions. It is very important to use either the chemicals or enzymes at a
concentration at which statistically every molecule is only modified or cleaved once.

Then we’ll simply compare the modified RNA with the original one. We do a reverse
transcription in the form of a primer extension analysis (easier).

PRIMER EXTENSION ANALYSIS:

1. We’ll have denatured strands of modified RNA and unmodified RNA
2. We’ll add a labeled primer
3. We use reverse transcriptase to obtain a cDNA from the labeled primer towards the 5’
end of the RNA
4. In the modified RNA, when we have a changed nucleotide or a cleavage the reaction
will stop at that spot, so we’ll have a shorter fragment
5. We separate the fragments on the gel.

With this we can identify at which positions the RNA was modified or cleaved and at which
it was not, and we can trace it back
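
How band lengths map back to RNA positions can be sketched as follows. The coordinate convention is an assumption made for this example: positions are counted from the 5’ end of the RNA, `primer_end` is the RNA position paired with the 5’ end of the primer, and reverse transcriptase is taken to stop one nucleotide before a modified base.

```python
def cdna_lengths(primer_end, modified_positions):
    """Expected cDNA lengths (primer included) in a primer extension analysis.

    primer_end: RNA position (1-based, from the 5' end) paired with the primer 5' end.
    A stop before a modified base at position p gives a cDNA covering p+1 .. primer_end.
    """
    full_length = primer_end  # unmodified RNA: extension runs to position 1
    stops = {pos: primer_end - pos
             for pos in modified_positions if pos < primer_end}
    return full_length, stops


full_length, stops = cdna_lengths(primer_end=40, modified_positions=[10, 20])
```

The closer a modification lies to the primer, the shorter the fragment, so the modified positions can be read off the gel next to a sequencing ladder of the same RNA.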

Let’s analyze the following example:

This example RNA was modified by DMS and CMCT

CMCT: Modifies U’s and possibly G’s

DMS: Modifies A’s and C’s

➔ We know that these modifications can only happen if the RNA is single stranded.

Whenever we see a signal in the gel indicating a modification (intense band) at a position (for
example the band before U9 in the CMCT lane), we can conclude that this part of the sequence
is single stranded. Now we trace back to the molecule, and we conclude that the ‘ugau’ region
(around position 10) is single stranded.

For the DMS lane, we can see a signal in the A19 to A22 region, so the A’s there are single
stranded because they were modified.

With this information we can deduce and assemble the structure. We can also use it to
identify protein binding sites, because when we add proteins that bind, the bound region will
be protected from modification or cleavage (footprinting).
ROLE OF THE rRNA IN TRANSLATION
The rRNA is the major determinant for the interaction between the ribosomal subunits
because it is situated in the interior, forming the core.

Intersubunit bridges: Nucleotides that provide the contact sites for subunit interactions.
Therefore these nucleotides are highly conserved (mutations are very detrimental). These are
RNA-RNA contacts. In addition to the RNA-RNA contacts, there are also protein-RNA contacts

• Peptide bond formation is catalyzed by the nucleotides of the 23S rRNA
• The universally conserved nucleotides G530, A1492 and A1493 of the 16S ribosomal
RNA are critical for tRNA binding in the A site. They check the correct codon-anticodon
pairing

16S rRNA: plays an active role in the functions of the 30S subunit. It directly interacts with
mRNA, the 50S subunit, and the anticodons of tRNAs in the P and A sites.

PROTEIN SYNTHESIS IN PROKARYOTES


The ribosome and the translation mechanism are targets for many antibiotics:

Initiation of protein synthesis involves the formation of a 70S ribosome (composed of a 30S
and a 50S subunit) with the initiator tRNA and the start codon of the mRNA positioned at the
P-site. This process is inhibited by antibiotics such as kasugamycin (Ksg)

Elongation cycle involves the delivery of the aminoacylated-tRNA (aa-tRNA) to the A-site of
the ribosome by the elongation factor Tu (EF-Tu). This process is inhibited by antibiotics such
as tetracyclines (Tet).

Termination and recycling are the final phases and involve the release of the polypeptide
chain and subsequent dissociation of the 70S ribosome, followed by recycling of the
components for the next round of initiation. These processes are inhibited by antibiotics such
as Cam.

So there are a lot of molecules that inhibit translation at very specific time points. Thanks to
this property, we can use these antibiotics to study the translational process and we can also
identify which factors interfere with translation up to a precise point. We can ask ourselves ‘is
X factor found before this step or after this step?’ We can ask this because we can stop the
reaction at a very specific moment

These antibiotics are also very useful to study the structure of the ribosome. At every step of
the translation process, the ribosome’s conformation changes, so we can freeze the process at
any time, purify the ribosome or the complexes the ribosome forms with other proteins, and
analyze its structure at a precise time.

TRANSLATION INITIATION IN PROKARYOTES


What defines the translation initiation site? How does the ribosome recognize the start of the
open reading frame (ORF) or the ribosome binding site, respectively?

IDENTIFICATION OF BINDING SITE: The ribosome identifies its binding site directly at an
internal position of the mRNA. It doesn’t first recognize the AUG start codon. The first point
of interaction is with the Shine-Dalgarno (SD) sequence

- SD sequence: Localized 5-7 nucleotides upstream of the start codon. It base pairs with
(so it is complementary to) the anti-SD (aSD).
- aSD: Single stranded region at the 3’ end of the 16S rRNA that comprises a UCCUCC
sequence which is perfectly complementary to the AGGAGG sequence of the SD.
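
The search for an SD sequence at the right distance from the start codon can be sketched like this. The consensus AGGAGG and the 5-7 nucleotide spacing come from the text above; the minimum number of matching positions (`min_matches`) and the example mRNA are assumptions made for the illustration.

```python
SD_CONSENSUS = "AGGAGG"


def find_sd(mrna, start_codon_pos, min_spacing=5, max_spacing=7, min_matches=4):
    """Search upstream of the start codon for an SD-like motif.

    Returns (position, spacing, matches) of the best hit, or None.
    Spacing = number of nucleotides between the end of the motif and the AUG.
    """
    best = None
    for spacing in range(min_spacing, max_spacing + 1):
        pos = start_codon_pos - spacing - len(SD_CONSENSUS)
        if pos < 0:
            continue
        window = mrna[pos:pos + len(SD_CONSENSUS)]
        matches = sum(a == b for a, b in zip(window, SD_CONSENSUS))
        if matches >= min_matches and (best is None or matches > best[2]):
            best = (pos, spacing, matches)
    return best


mrna = "GCUAAGGAGGUUAACCAUGAAACGC"        # hypothetical 5' region of an mRNA
hit = find_sd(mrna, mrna.find("AUG"))
```

For this example the perfect AGGAGG motif is found 6 nucleotides upstream of the AUG. Real SD sequences are often imperfect, which is why the sketch tolerates mismatches.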

RECRUITMENT tRNA: The next step is the codon-anticodon interaction between the initiator
tRNA and the AUG start codon.

PROTEIN-RNA INTERACTIONS: These primary interactions are based on mRNA-rRNA contact.
But there are also ribosomal proteins involved:

- S1: Is very flexible because it has 6 domains that are very similar and that have an
OB-fold (these folds interact with nucleic acids). The domains bind, in most cases, to
single stranded regions of the RNA. S1 has the ability to open structures by loosening
them (e.g. hairpins). It will keep the initiation region open.

- S21: It has an anti-SD helix in its structure. It places the helix in a pocket, which locks
the ribosome in an optimal position for codon-anticodon interaction. This is why the
distance between the SD and the AUG codon is so important: it has to accommodate
the S21 helix through the formation of a pocket
- Quality control: S1 and S21 are the last two proteins that are assembled into the
ribosome. This is important because it prevents premature binding of the ribosome to
its binding site. S21 for instance can only bind if every step (modifications, maturations
etc) has been done.

IDENTIFY THE RIBOSOME BINDING SITE

1. The ribosome binds to the initiation site on mRNA
2. Add nuclease to digest all unprotected mRNA
3. Isolate the protected fragment of mRNA
4. Determine the sequence of the protected fragment
5. Identify the SD sequence and the AUG codon

IDENTIFY THE FUNCTIONALITY OF SD – ANTI-SD INTERACTIONS

We need to create specialized ribosomes. It was first done with a plasmid containing a 16S
rRNA with an altered aSD sequence

➔ They swapped the aSD for an SD sequence on the 16S. So they created a ribosome with
an SD sequence and they coupled it with a reporter gene.

This system is added on top of the wild type ribosomes, since E. coli cannot live with a wrong
aSD sequence. So the only thing they could test was the synthesis of the reporter gene.

With this experiment they were able to show that the aSD-SD interaction is important for
the recognition of the ribosome binding site.

INITIATOR tRNA
There are two distinct forms of methionine tRNAs:

- The Met-tRNA which is an elongation tRNA
- The fMet-tRNA which is the initiator tRNA

There is a structural difference between these. What is really important in the initiator tRNA is:

• Three G-C base pairs in the anticodon stem: They contribute to the positioning of
the initiator tRNA in the P site. This is the only tRNA that can enter the P site directly
(not moving from the A site to the P site). They are also required for the recognition by
IF2.
• C-A non base pairing: The nucleotide at the 5’ end is not base paired. This is in
contrast to the elongation tRNA, which is base paired. It is important for the
formylation of the amino acid. This formyl group will be the very N-terminus of the
protein, and in several cases it will be removed after the synthesis of the protein. It is
also speculated that the formyl group is needed for the nascent peptide to enter the
exit tunnel.

TRANSLATION INITIATOR FACTOR 2 (IF2)


IF2: Is a GTPase (so it comes with a GTP bound to it) whose function is to position the initiator
tRNA in the P site of the 30S ribosomal complex. It also promotes the joining of the 30S
with the 50S to form the complete 70S ribosome.

POSITIONING OF INITIATOR tRNA: The positioning of fMet-tRNAf is controlled by IF2 and the
ribosome

- Only the fMet-tRNA enters the P site on the 30S that is bound to the mRNA
- Only an aa-tRNA (not the initiator) enters the A site on the complete 70S ribosome

STRUCTURE OF IF2:

- IF2 is quite large, so it covers the site of the 30S interface
- The C-terminal part of IF2 is the one that interacts with the initiator tRNA

1ST TERNARY COMPLEX: This complex comprises the initiator tRNA, the IF2 factor and the
GTP. GTP hydrolysis is necessary for the release of the factor:

1. IF2 positions the tRNA in the P site
2. IF2 introduces some kind of rotation of the 30S that facilitates the binding of the 50S
subunit. So we have a pre-70S initiation complex
3. GTP hydrolysis occurs, which introduces another rotation of the 30S subunit
4. The rotation allows the release of IF2, and confers the perfect position on the 50S
so the 70S complex is very tight (intersubunit bridges are formed)
TRANSLATION INITIATOR FACTOR 1 (IF1)
IF1: Its structure is only one small fold, an OB-fold, that binds to the A site. It has different
functions:

- In the initiation complex it prevents the positioning of the initiator tRNA in the A site
- By binding in the A site it prevents elongator tRNAs from binding there
- Provides the anchoring point for IF2 and IF3 on the 30S, and enhances their activity.
- Controls the conformational dynamics of the 30S subunit

There is an IF1 footprinting experiment that was used to demonstrate that the addition of IF1
reduces the modification of the nucleotides A1492 and A1493, which are the ones important
for codon-anticodon scanning in the A site.

TRANSLATION INITIATOR FACTOR 3 (IF3)


IF3: Is a 2-domain protein that has a C-terminal domain and an N-terminal domain that are
connected by a flexible linker. It has different functions:

- IF3 takes the initiator tRNA and introduces a conformational change into it
- It also discriminates against non-initiator tRNA complexes. This is because only the
correct codon-anticodon interaction (provided by the correct initiator tRNA) can
withstand the conformational changes. So it stimulates translation starting at AUG
- IF3 sterically blocks subunit joining. When IF3 is released, IF2 will promote the joining
of the 2 subunits to form the 70S ribosome.

REGULATION OF THE infC GENE


The cell must balance and keep the homeostasis for all these translational factors, and also has
to coordinate this with the translational activity. For IF3, there is a regulatory feedback loop

➔ If we now that IF3 promotes initiation by favoring AUG codons, what could be the
basis of the negative regulatory feedback loop?

infC: Is the gene that encodes the IF3 factor. In contrast to the majority of genes in bacteria, it
starts with an AUU codon.

MECHANISM

If IF3 concentration is low: Translation is allowed to start at non-AUG codons. Since infC
starts with an AUU codon, the positioning of the initiator tRNA on this initiation codon is
allowed, so IF3 will be translated.

If IF3 concentration is high: Translation is only allowed to start at AUG codons. Since infC
starts with an AUU codon, the positioning of the initiator tRNA on this initiation codon is
not allowed, so IF3 will no longer be translated.
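
The feedback loop described above can be sketched numerically. This is only an illustration, not a measured model: the Hill-type repression curve and all parameter values (`k_half`, `hill`, `max_synthesis`, `dilution`) are assumptions chosen to make the behavior visible. It shows that IF3 settles at the same level whether the cell starts with too little or too much of it.

```python
def auu_initiation_fraction(if3, k_half=1.0, hill=2.0):
    """Fraction of AUU initiation events that IF3 lets through (assumed Hill-type repression)."""
    return 1.0 / (1.0 + (if3 / k_half) ** hill)


def simulate_if3(if3_start, steps=2000, dt=0.01, max_synthesis=2.0, dilution=1.0):
    """Integrate d[IF3]/dt = synthesis - dilution with simple Euler steps."""
    if3 = if3_start
    for _ in range(steps):
        # infC translation depends on initiation at its AUU start codon,
        # which becomes less likely as IF3 accumulates.
        synthesis = max_synthesis * auu_initiation_fraction(if3)
        if3 += dt * (synthesis - dilution * if3)
    return if3


low_start = simulate_if3(0.05)   # cell starts with very little IF3
high_start = simulate_if3(5.0)   # cell starts with an excess of IF3
print(round(low_start, 3), round(high_start, 3))
```

Both runs converge to the same steady state, which is the defining property of a negative feedback loop.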
EXPERIMENTAL VERIFICATION
There are different valid methods to verify the regulatory infC mechanism. For instance,
one could vary the concentration of IF3 by placing this factor under the control of an
inducible promoter. There are other methods though:

METHOD 1

This method aims to investigate the impact of a mutant infC gene, known to have reduced
activity, on the expression of a fusion gene, rpsO-lacZ, containing different initiation codons.

1. Mutant infC under an Inducible Promoter: The researchers placed a mutant infC gene
under the control of an inducible promoter. This allowed them to control the
expression of the mutant infC gene.
2. Fusion of rpsO to lacZ: They fused the rpsO gene (encoding a ribosomal protein) with
the reporter gene lacZ. This fusion allowed them to monitor the expression of rpsO by
measuring the activity of the lacZ reporter.
3. Change in Start Codon: The start codon of the fused gene was changed from the
typical AUG to AUU. This change in the initiation codon affects the initiation of
translation for the fused gene.
4. Comparison of Wild Type and Mutant infC: The researchers examined the synthesis of
the fused gene under two conditions: one with the wild-type infC and the other with
the mutant infC.
5. Interpretation: The experiment revealed that when the mutant infC was used in
conjunction with the fusion gene containing the AUU start codon, the levels of
β-galactosidase were higher.

The results can be explained based on the regulatory feedback loop involving IF3 and the
unique AUU initiation codon of the infC gene.

- Low Activity of Mutant infC: The reduced activity of mutant infC results in a lower
concentration of active IF3 within the cell.
- AUU Start Codon in the Fused Gene: The fused gene contains an AUU start codon.
Normally, when IF3 levels are high, translation initiation primarily occurs at AUG
codons, and initiation at AUU is limited.
- Higher β-Galactosidase Levels: With the mutant infC and its lower IF3 activity, there is
a reduced ability to prevent initiation at the AUU codon in the fused gene.
Consequently, translation initiation at AUU is allowed, leading to increased expression
of the rpsO-lacZ fusion gene.

This experiment supports the negative regulatory feedback loop of IF3.

METHOD 2: TOEPRINTING ASSAY

The toeprinting assay is a molecular technique used to investigate the formation of translation
initiation complexes and analyze the function of specific proteins, such as IF-3, involved in the
initiation of protein synthesis.
This assay involves the use of mRNA containing an AUG start codon and a Shine-Dalgarno
sequence, along with a labeled primer and reverse transcriptase.

1. Primer Annealing: A labeled primer anneals downstream of the AUG start codon on
the mRNA.
2. Primer Extension: A reverse transcriptase extends the primer towards the 5' end of
the mRNA (forming cDNA).
3. Formation of Translation Initiation Complex: If a translation initiation complex (30S
and initiator tRNA) forms, it can withstand reverse transcriptase activity.
4. Shorter cDNA Fragment Formation: Primer extension is terminated where the
complex is bound, so the reverse transcriptase falls off at the complex site, creating
a shorter cDNA fragment.
5. Gel Electrophoresis: The different cDNA fragments are separated on a gel to compare
the full-length fragment to the shorter one.

The ratio between the signals from the full-length and shorter fragments provides information
about the strength and efficiency of the translation initiation complex formation.

How is this applied to Analyzing IF-3 Function?

In addition to the 30S and the mRNA, different tRNAs are also added including the initiator
tRNA and other tRNAs that recognize various codons instead of the original start codon.

When present, IF3 causes the complex to position itself precisely over the AUG start codon
using the typical initiator tRNA, generating a specific short cDNA fragment (+15 nucleotides).

Without IF-3, the complex will position itself on different codons using the other tRNAs, which
will generate different cDNA lengths depending on which codon the complex is bound to.
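
The arithmetic behind the two bands on the gel can be sketched as follows. This is a simplified illustration under stated assumptions: the mRNA sequence and the primer position are hypothetical, and the stop position is taken as +15 relative to the A of the AUG (+1), as stated above.

```python
def toeprint_cdna_lengths(mrna, primer_5prime_index, toeprint_offset=15):
    """Return (full_length, toeprint_length) of the cDNA products in nucleotides.

    mrna is written 5'->3'. primer_5prime_index is the 0-based mRNA position
    paired with the 5' end of the primer (the most downstream template nucleotide).
    """
    start_index = mrna.find("AUG")  # the A of the first AUG is position +1
    if start_index < 0:
        raise ValueError("no AUG start codon found")
    # Without a bound complex, reverse transcriptase runs to the 5' end of the mRNA.
    full_length = primer_5prime_index + 1
    # With a bound complex, extension stops at position +toeprint_offset.
    stop_index = start_index + toeprint_offset - 1
    if stop_index > primer_5prime_index:
        raise ValueError("primer must anneal downstream of the toeprint position")
    toeprint_length = primer_5prime_index - stop_index + 1
    return full_length, toeprint_length


# Hypothetical mRNA: leader with an SD sequence, AUG, then a short coding region.
mrna = "GCUUAAGGAGGUCAACC" + "AUG" + "GCUAAA" * 7
full_length, toeprint_length = toeprint_cdna_lengths(mrna, primer_5prime_index=len(mrna) - 1)
print(full_length, toeprint_length)
```

If the complex sits on a different codon, only `start_index` changes, which is why each misplaced complex produces a band of a different length.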

In essence, the toeprinting assay shows that IF-3 plays a crucial role in discriminating against
other initiation complexes and favoring the formation of the complex over the AUG start
codon, confirming its specific function in translation initiation.

PROKARYOTIC AND EUKARYOTIC RIBOSOMAL RNA


When we examine the rRNA in eukaryotic and prokaryotic cells, we find that the overall
structure and critical functional regions are quite similar. While eukaryotic rRNA shares many
similarities with its prokaryotic counterpart, it also exhibits some unique characteristics.

EUKARYOTIC RRNA FEATURES

Expansion segments: Eukaryotic rRNA contains additional regions known as "expansion
segments" that are not present in prokaryotic rRNA.

5S RNA: Eukaryotic ribosomes share this component with prokaryotic ribosomes. It is found on
the central protuberance of the large subunit of both prokaryotic and eukaryotic ribosomes.
It usually mediates the subunit bridge between the 23S (or 25S) and the 16S (or 18S) rRNA.

25S and 18S rRNAs: In eukaryotic ribosomes, the equivalents of the prokaryotic 23S and 16S
rRNAs are the 25S and 18S rRNAs respectively. These eukaryotic rRNAs perform similar functions
to their prokaryotic counterparts but are slightly different in size and structure.

5.8S rRNA: This rRNA is unique to eukaryotes and absent from prokaryotic ribosomes. It plays
a specific role by interacting with the 25S rRNA (the eukaryotic counterpart of the 23S rRNA).

REGULATION OF PROTEIN SYNTHESIS IN BACTERIA


The step of initiation in bacteria is the rate limiting step for protein synthesis and it is the
main step where regulation occurs. Steps it takes:
1. The ribosome binding site is recognized by the 30S subunit which usually has bound
the proteins S1 and S2
2. The initiator tRNA is brought in by the IF2. At the same time IF1 prevents the tRNA
from entering the A site. IF3 scans the entire process so the tRNA is correctly bound
3. At this point we have the 30S initiation complex formed
4. After the GTP hydrolysis by IF2 the conformational change allows the positioning of
50S
5. At this point the 70S initiation complex is formed which will transition into elongation

In translation, regulation occurs via protein factors, RNA factors, and also mRNA features.
This can happen at different steps and different positions.
INITIATION REGULATION

START CODON
• Start codon flexibility: While over 90% of start codons are AUG, exceptions exist
where other codons initiate coding sequences, such as the infC gene
• Infrequent start codons: In rare cases, GUG ("valine") and UUG ("leucine") can also act
as start codons to initiate translation.
• Initiator tRNA: Despite the use of alternative start codons like GUG and UUG, a
methionine will still be incorporated as the first aa, since only the initiator tRNA can be
positioned in the P-site. GUG and UUG will be read by the initiator Met-tRNA.

SHINE DALGARNO SEQUENCE


The 16S ribosomal RNA (rRNA) molecule contains, at its 3' end, the anti-Shine Dalgarno (aSD)
sequence which is complementary to the SD sequence typically found upstream of the start
codon.

AFFINITY AND STRENGTH


The degree of complementarity between SD and aSD sequences impacts ribosomal affinity for
the translation initiation region. It also determines the strength of translation initiation.

Strong complementarity results in robust initiation, while weak complementarity leads to
less efficient initiation.

SD-aSD DUPLEX
Between the aSD and SD sequences, a duplex (double-stranded
RNA structure) forms. The duplex binds to a ribosomal pocket
created by proteins S1, S11, and S18. This insertion is also
strengthened by the ribosomal protein S21.

The stability and correct positioning of this duplex within the ribosome's binding pocket is
essential for ensuring that the AUG start codon is accurately placed at the ribosome's P-site,
since the duplex helps anchor and orient the mRNA.

SENSITIVITY TO DISTANCE
Translation initiation is highly sensitive to the distance between the SD sequence and the start
codon
- This distance directly affects the alignment and positioning of the AUG codon with
respect to the P-site.
- Deviations in distance can impact the efficiency of translation initiation.
- If the distance is too short or too long, the ribosome will not correctly align the AUG
codon with the P-site.

Maintaining the optimal distance between the SD sequence and the start codon is essential
for ensuring that translation initiation occurs accurately.
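
Both properties discussed above (SD strength and SD-to-AUG spacing) can be read off a sequence with a small helper. This is a rough sketch: it scores windows against the commonly cited SD consensus AGGAGG as a stand-in for aSD complementarity, ignores G-U pairs and thermodynamics, and the example mRNA is hypothetical.

```python
SD_CONSENSUS = "AGGAGG"  # commonly cited consensus, complementary to the 16S rRNA 3' end


def best_sd_site(mrna, start_index):
    """Return (matches, spacer) for the best SD-like window upstream of the start codon.

    matches: number of positions identical to the consensus (0-6).
    spacer:  number of nucleotides between the end of the window and the start codon.
    """
    best = None
    width = len(SD_CONSENSUS)
    for pos in range(0, start_index - width + 1):
        window = mrna[pos:pos + width]
        matches = sum(a == b for a, b in zip(window, SD_CONSENSUS))
        spacer = start_index - (pos + width)
        if best is None or matches > best[0]:
            best = (matches, spacer)
    return best


# Hypothetical mRNA: leader containing AGGAGG, then AUG and a short coding region.
mrna = "GCUUAAGGAGGUCAACC" + "AUG" + "GCUAAA" * 3
sd_result = best_sd_site(mrna, mrna.find("AUG"))
print(sd_result)
```

A high match count with a spacer far outside the optimal range would still be expected to initiate poorly, which is the point of the distance sensitivity described above.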

TOEPRINTING ANALYSIS
The toeprinting assay can be used to analyze the optimal distance in translation initiation

Experimental setup:

- Sequences of mRNA (probes) with perfect SD sequences are used
- Each mRNA probe will have a different number of nucleotides inserted between the
SD sequence and the AUG start codon.

Measurement of Translation Initiation: The key aspect is that a signal is generated only when
the ternary complex (composed of the ribosome, mRNA, and initiator tRNA) forms, indicating
the potential for translation initiation.

Results and Optimal Distance: The obtained cDNA fragments from the assays are separated on
a gel for analysis. The observations reveal a distinct pattern:

- The signal for ternary complex formation increases as the number of nucleotides
between the SD sequence and the start codon increases.
- The signal peaks at around 6-7 nucleotides in between, indicating the optimal
distance for efficient translation initiation.
- Beyond this optimal range the signal weakens, indicating less efficient translation
initiation.

RIBOSOMAL PROTEIN S1
Ribosomal protein S1 plays a critical role in translation initiation.

- It binds to the 30S subunit through protein-protein interactions, rather than protein-RNA
interactions.
- This protein consists of six similar domains with distinct functions.

S1 Domains and Functions:

- N-terminal Domain (D1): Required for interaction with ribosomal protein S2 (rS2). It is
bound to the ribosome through the S2 protein.
- Domains D3-D6: RNA binding domains that interact with mRNA upstream of the SD
sequence within the 5´UTR.
- RNA Unfolding Capacity: S1 has the ability to unfold RNA structures within the 5´UTR,
facilitating translation initiation. It is a passive way of opening the RNA structures so it
doesn’t require ATP.

S1's Role in Translation Initiation: S1 is required for translation initiation for canonical
mRNAs containing structures within the 5´UTR.

IN VITRO TRANSLATION ASSAY


The primary goal of this in vitro translation assay is to investigate how the presence or absence
of ribosomal protein S1 affects the translation of messenger RNA (mRNA) into proteins.

Experimental Setup:

1. Cell Growth and Harvesting:
- The experiment begins by growing bacterial cells in a culture. After reaching a
specific growth stage, the cells are harvested.
2. Cell Lysis and Fractionation:
- The harvested cells are lysed. The lysate is then fractionated, separating
different cellular components.
- The fractionation process isolates the cytoplasmic fraction, which contains the
cellular components involved in translation, such as ribosomes.
3. Addition of mRNA and Components:
- In this isolated cytoplasmic fraction, a specific mRNA molecule of interest is
added.
- Cofactors, substrates, energy sources (like ATP and GTP), and salts are also
introduced to create an environment conducive to translation.
4. Labeling Initiator tRNA with Methionine:
- To visualize the synthesized proteins, a labeled form of methionine is added to
the mixture, which will be the first aa in the growing polypeptide chain during
translation.
5. Observation and Analysis:
- Synthesized proteins are separated based on their size using techniques like
SDS-PAGE
6. Results of In Vitro Translation Assay:
- Translation with 70S ribosomes from which S1
was removed shows minimal translation.
- Adding purified S1 stimulates translation
again.
- 70S ribosomes with normal S1 display normal
translation.

LEADERLESS mRNAs
- Leaderless mRNAs, lacking a 5´UTR, directly start with a 5´ AUG start codon.
- Translation of leaderless mRNAs does not depend on S1.
- Excess of S1 can reduce translation levels, potentially due to saturation.

WORKING MODEL OF S1
Two possible pathways:

• S1 binds first to the 30S subunit, recruiting mRNA that will have secondary
structures, and unfold them there.
• S1 binds first to mRNA, unfolds its structures, and then the binary complex is
positioned on the 30S subunit.

Both pathways lead to the formation of a ternary complex on the translation initiation region.

RNA STRUCTURES
RNA structures can significantly impact translation initiation. This section explores the
influence of RNA secondary structures on translation regulation through various examples:

REGULATION OF T4 LYSOZYME EXPRESSION


T4 lysozyme mRNA can be transcribed from two different promoters: the early promoter and
the late promoter. The choice of promoter determines the structure of the 5' untranslated
region (5'UTR) of this mRNA.

• Early Promoter:
- Results in a strong stem-loop structure within the 5'UTR that prevents the
binding of the ribosome.
- The Shine-Dalgarno sequence and the AUG start codon are not accessible due to the
stem-loop, so the ribosome cannot access the mRNA and initiate translation.
• Late Promoter:
- Generates a shorter mRNA with an open 5'UTR that lacks the inhibitory
stem-loop structure.
- The translation initiation region, comprising the SD sequence and the AUG
start codon, is open and accessible so the ribosome can initiate translation effectively.

In essence, the choice of promoter dictates the structure of the 5'UTR in T4 lysozyme mRNA,
which, in turn, affects ribosome binding and translation initiation. The choice of promoter is
typically dictated by the regulatory signals and environmental conditions within a cell.

BREATHING OF mRNA STRUCTURES


Translation initiation can be influenced by the equilibrium between two structural states of
mRNA, open and closed.

This equilibrium is sensitive to the complementarity and stability of hairpin structures within
the mRNA.
REGULATION OF TRANSLATION INITIATION OF PHAGE LAMBDA S GENE
Lambda S Holin Protein:

The lambda phage produces a protein called the S holin protein, which inserts into the inner
membrane of the host cell and forms "holes" in that membrane. There are two versions of this
protein, each one with a different translation start site.

- S105: Its translation starts at the second AUG start codon. It results in the production
of a specific form of the S holin protein that is capable of forming lethal holes in the
inner membrane. This form of the protein has its N-terminus oriented toward the outside of
the cell, specifically the periplasmic space.

- S107: Its translation starts at the first AUG start codon. It results in the production of a
different form of the S holin protein. This form of the protein acts as an inhibitor of hole
formation, preventing the phages from being released.

Importance of a Single Amino Acid:

The critical difference between these two forms of the protein lies in the presence or absence
of a lysine at the N-terminus, which is a positively charged residue. The lysine influences
whether the protein acts as a hole-forming or inhibitory protein.

Regulation by mRNA Structure:

The decision between translation initiation at the first start site (S107) or the second start site
(S105) is determined by the structural dynamics of the mRNA (breathing of the stem-loops)

- The equilibrium between S105 and S107 initiation depends on the stability and
presence of upstream and downstream hairpin structures.
- The mRNA's structural dynamics determine ribosome positioning on either the first or
the second start codon.
- The "breathing" of the stem-loop structure causes an equilibrium shift.

So the structural characteristics of the mRNA, including hairpin structures, influence which
start site is used, ultimately affecting the outcome of phage infection.

In Vitro Demonstration:

The impact of mutations in start codons or hairpin structures can be studied in vitro.
Mutations reveal how changes in RNA structure influence translation initiation.
1. mRNA Variants: Researchers prepare mRNA variants with specific mutations or
structural changes that influence the choice of translation initiation site.
2. Translation Reaction: They set up translation reactions adding ribosomes, initiation
factors, etc, with the prepared mRNA variants.
3. Toeprinting Assay: To determine where translation initiation occurs on the mRNA
variants, they use a toeprinting assay.
4. Results:
- In the wild-type mRNA, ribosomes can initiate translation at both sites. This
results in the production of both S105 and S107 proteins.
- When they mutate the first codon at the S107 site, it becomes nonfunctional.
As a result, only translation from the S105 site occurs.
- When they mutate the second codon at the S105 site, it becomes
nonfunctional. As a result, only translation from the S107 site occurs.
- If both methionine codons are mutated, translation initiation at both sites is
blocked, and no protein synthesis occurs from this mRNA.
5. Interpretation: These experiments demonstrate that the choice of translation
initiation site depends on the presence and functionality of the start codons at those
sites.

THERMOSENSORS
Thermosensors control gene expression in response to temperature.

• Ribosome binding sites are blocked within secondary structures at low temperatures
• High temperatures melt the stem structures, making the ribosome binding site
accessible.
• Various structures, including the FourU structure, are found in thermosensor mRNAs.
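
The temperature dependence described in the bullets above can be illustrated with a two-state (closed/open) model of the hairpin. This is a textbook-style sketch, not data from a real thermosensor: the folding enthalpy and the melting temperature are hypothetical values chosen so that the switch falls between 30°C and 37°C.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def fraction_open(temp_celsius, melting_temp_celsius, enthalpy_kcal=-40.0):
    """Two-state estimate of the fraction of mRNA molecules with an open hairpin.

    enthalpy_kcal is the (assumed) folding enthalpy; the folding entropy is fixed
    by requiring a folding free energy of zero at the melting temperature.
    """
    temp_k = temp_celsius + 273.15
    tm_k = melting_temp_celsius + 273.15
    # Folding free energy: negative (stable hairpin) below Tm, positive above it.
    delta_g_fold = enthalpy_kcal * (1.0 - temp_k / tm_k)
    k_open = math.exp(delta_g_fold / (R_KCAL * temp_k))  # [open] / [closed]
    return k_open / (1.0 + k_open)


open_30 = fraction_open(30.0, melting_temp_celsius=34.0)
open_37 = fraction_open(37.0, melting_temp_celsius=34.0)
print(round(open_30, 2), round(open_37, 2))
```

In this picture the ribosome binding site is accessible only in the open fraction, so a modest temperature shift across the melting temperature changes translation initiation without any new mRNA synthesis.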

An example is the prfA gene in Listeria monocytogenes. Its mRNA forms a stem-loop
structure at low temperatures that melts at high temperatures. PrfA is a transcription
factor that activates virulence genes at 37°C (body temperature), but not at 30°C.

Advantages of Translation-Based Control:

• Fast response. Translation-based control responds rapidly to temperature changes.
• The mRNA is always present, and temperature-induced structural changes allow
immediate translation initiation.

Two Mechanisms of Thermosensors:

- Molecular Zippers: Simple structures that open with increasing temperature. Example:
rpoH mRNA in E. coli, which encodes the sigma factor 32 (σ32) required under heat
shock conditions.

- Switches: Structures that refold into an alternative secondary structure under
different conditions. Example: CspA is a protein required during cold shock to
stimulate specific genes.

RIBOSWITCHES
Metabolite Sensing by Riboswitches: Riboswitches are RNA elements that can sense and
measure metabolites in the cell and respond by causing an allosteric change in RNA structure
that impacts downstream gene expression. This regulation primarily occurs at the translation
level.

STRUCTURE AND FUNCTION


Riboswitches consist of:

• Aptamer: A binding domain with high specificity and affinity for the regulating
molecule (the metabolite). It’s a structured binding pocket
• Expression platform: It contains the ribosome binding site. It can also control and
regulate transcription, not only translation.
• Coding sequence: The corresponding regulated gene

The binding or release of the metabolite causes a conformational change in the expression
platform that alters gene expression or translation. When the ribosome binding site is trapped,
translation initiation is blocked.

FUNCTIONS

In bacteria, both transcription and translation can be controlled by external events related to
the cell's metabolic state. This control often involves the formation of RNA stem-loop
structures that either:

• Prematurely terminate transcription
• Sequester the SD sequence, inhibiting translation initiation.
RIBOSWITCH TRANSLATION CONTROL
Translation control by metabolite binding rearranges the mRNA structure leading to the
formation of either sequestor or anti-sequestor (mutually exclusive)

GENE REPRESSION

• In the absence of the metabolite, translation is allowed. The 5’ end of the mRNA forms an
anti-sequestor loop.
• In the presence of the metabolite, translation is not allowed. The metabolite will bind to
the mRNA and the 5’ end will change its structure so the anti-sequestor cannot be formed,
but the sequestor loop can (sequesters ribosome binding site).

GENE ACTIVATION

• In the absence of the metabolite, translation is not allowed. The 5’ end of the mRNA forms
a sequestor loop (sequesters ribosome binding site).
• In the presence of the metabolite, translation is allowed. The metabolite will bind to the
mRNA and the 5’ end will change its structure so the anti-sequestor can be formed,
but the sequestor loop cannot.
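
The two modes (repression and activation) can be summarized with a simple equilibrium binding model. This is a minimal sketch under stated assumptions: one metabolite binds one aptamer with dissociation constant `kd`, the bound RNA is always in the ligand-induced fold, and co-transcriptional kinetics are ignored. The function names are illustrative.

```python
def fraction_bound(ligand, kd):
    """Fraction of riboswitch RNA with metabolite bound (simple 1:1 binding)."""
    if ligand < 0 or kd <= 0:
        raise ValueError("ligand must be >= 0 and kd must be > 0")
    return ligand / (kd + ligand)


def relative_expression(ligand, kd, mode):
    """Relative expression (0 to 1) for a repressing or an activating riboswitch."""
    bound = fraction_bound(ligand, kd)
    if mode == "repress":    # metabolite binding favors the sequestor loop
        return 1.0 - bound
    if mode == "activate":   # metabolite binding favors the anti-sequestor loop
        return bound
    raise ValueError("mode must be 'repress' or 'activate'")


# Concentrations are in the same (arbitrary) units as kd.
for ligand in (0.0, 1.0, 10.0):
    print(ligand,
          round(relative_expression(ligand, 1.0, "repress"), 2),
          round(relative_expression(ligand, 1.0, "activate"), 2))
```

The same occupancy curve drives both modes; only the structure that the bound state stabilizes differs.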

RIBOSWITCH TRANSCRIPTION CONTROL


Transcription control by metabolite binding rearranges the mRNA structure leading to the
formation of either terminator or anti-terminator (mutually exclusive)

If it affects transcription, at least the first part of the mRNA has to be already transcribed
(nascent mRNA). So this is a co-transcriptional regulatory mechanism. It can activate or
repress gene expression.

GENE REPRESSION

• In the absence of the metabolite, transcription is allowed. The 5’ end of the mRNA forms an
anti-terminator loop.
• In the presence of the metabolite, transcription is not allowed. The metabolite will bind to
the mRNA and the 5’ end will change its structure so the anti-terminator cannot be formed,
but the terminator loop can.
GENE ACTIVATION

• In the absence of the metabolite, transcription is not allowed. The 5’ end of the mRNA
forms a terminator loop.
• In the presence of the metabolite, transcription is allowed. The metabolite will bind to the
mRNA and the 5’ end will change its structure so the anti-terminator can be formed, but the
terminator loop cannot.

TRANSCRIPTION TERMINATION
There are 2 types of transcription termination in bacteria distinguished by their mechanism
and structural features

FACTOR DEPENDENT TERMINATION (RHO-DEPENDENT)

Factor-dependent terminators are characterized by their inconsistent sequence homology but
uniformly require the essential protein factor Rho for termination of transcription.

Rho Function: Rho is an essential protein factor that plays a critical role in factor-dependent
termination.

• It is an RNA/DNA helicase or translocase
• Rho binds to the single-stranded RNA molecule at a specific site known as the rut site.

Termination Mechanism

1. Rho binds to the RNA at the rut site
2. It translocates itself along the RNA strand
3. It uses ATP hydrolysis to dissociate the RNA polymerase from the DNA template
4. This releases the nascent RNA transcript, which results in transcription termination

INTRINSIC TERMINATION (RHO-INDEPENDENT)

Intrinsic termination is a termination mechanism that occurs at specific template sequences
and does not require the involvement of any auxiliary proteins. It is characterized by the
presence of a GC-rich inverted repeat followed by a consecutive stretch of thymidylate (T)
residues.

Termination Mechanism:
1. The moment they are transcribed the inverted repeat sequences fold into a hairpin
structure.
2. Following the hairpin structure, there is a stretch of uridylate (U) residues.
3. The hairpin structure, along with the consecutive Us, creates steric hindrance on the
mRNA.
4. This steric hindrance exerts pressure on the RNA polymerase enzyme.
5. As a result of this pressure, the transcription complex disintegrates, causing RNA
polymerase to release the RNA transcript.
6. Termination of transcription occurs at the poly-U stretches
following the hairpin structure.
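
The sequence signature described above (GC-rich inverted repeat followed by a U stretch) can be searched for in a transcript. This is a deliberately simple sketch: it only accepts perfect Watson-Crick stems of a fixed length, whereas real terminators tolerate mismatches, G-U pairs and variable stem lengths. All thresholds and the example sequence are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_intrinsic_terminators(rna, stem_len=6, loop_range=(3, 8), min_gc=0.6, min_u_run=4):
    """Return (stem_start, loop_length) for each perfect GC-rich hairpin followed by Us."""
    hits = []
    for start in range(len(rna) - stem_len + 1):
        left = rna[start:start + stem_len]
        gc_fraction = (left.count("G") + left.count("C")) / stem_len
        if gc_fraction < min_gc:
            continue
        for loop_len in range(loop_range[0], loop_range[1] + 1):
            right_start = start + stem_len + loop_len
            right = rna[right_start:right_start + stem_len]
            if len(right) < stem_len:
                break  # ran off the end of the transcript
            if right != reverse_complement(left):
                continue
            tail = rna[right_start + stem_len:right_start + stem_len + min_u_run]
            if tail == "U" * min_u_run:
                hits.append((start, loop_len))
    return hits


# Hypothetical transcript: 6-bp GC-rich stem, GAAA loop, then a U tract.
transcript = "AACG" + "GCCGCC" + "GAAA" + "GGCGGC" + "UUUUUUAC"
terminators = find_intrinsic_terminators(transcript)
print(terminators)
```

Replacing the U tract with another sequence removes the hit, which mirrors the requirement for both elements in intrinsic termination.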

Timing and Efficiency:

- Rho-independent termination depends on the precise coordination of hairpin folding and
transcription termination.
- The efficiency of termination can be influenced by the length, size, and sequence of
the hairpin structure.

EXAMPLES OF RIBOSWITCHES
There are different kinds of riboswitches, each with its own mechanism and with different
ligands (coenzymes, aa, ions, signaling molecules, nucleotide derivatives, etc.)

tRNA BINDING RIBOSWITCH: T-BOX


The primary regulatory mechanism that regulates the tyrS operon is the tyrosine-sensitive
transcription attenuation that leads to conditional transcription termination

tyrS operon: It encodes the tyrosyl-tRNA synthetase enzyme that charges tRNAs with
tyrosines. So the sensing mechanism senses if there is enough tyrosine charged tRNA or not

Regulation Mechanism:

• Termination occurs at an intrinsic terminator hairpin
• A crucial element is the presence of a 14-base transcript sequence (T box) located
immediately upstream of the terminator hairpin.

Role of the T Box: It serves as a recognition element that interacts with the free (no aa)
3’ end of the tRNA

The aptamer specifically recognizes the tyrosine-specific tRNA (tRNA-Tyr) by:

- The tRNA anticodon sequence
- The elbow of the tRNA, which is very specific
There can be 2 situations:

• High tRNA charging:
- If the aa (tyrosine) is charged on the tRNA, there is no interaction between the
T-box and the 3’ end of the tRNA.
- This allows the formation of a termination loop. There will not be transcription
• Low tRNA charging:
- If the aa (tyrosine) is not charged on the tRNA, there is an interaction between
the T-box and the 3’ end of the tRNA.
- This allows the formation of an anti-termination loop. There will be transcription

FLAVIN MONONUCLEOTIDE-BINDING RIBOSWITCHES: RFN BOX


Unlike tRNAs, which are relatively large, riboswitches primarily interact with smaller
molecules. One example of this is the riboswitch found in the operon ribDEAHT.

ribD operon: It encodes the necessary genes for the biosynthesis of riboflavin, which is the
precursor of flavin mononucleotide (FMN). So the sensing mechanism senses if there is enough
FMN or not.

Role of the RFN Box: It is an aptamer that interacts with FMN molecules. Thus, the riboswitch
regulates gene expression in response to the availability of FMN

There can be 2 situations:

• Low concentration of FMN:
- When FMN levels are low this molecule fails to bind to the aptamer.
- In response, the ribD leader transcript has a conformational change and adopts a
secondary structure that includes an antiterminator loop. There will be transcription
• High concentration of FMN:
- When FMN levels are high this molecule will bind to the aptamer.
- In response, the ribD leader transcript has a conformational change and adopts a
secondary structure that includes a terminator loop. There will not be transcription
CLASSES OF RIBOSWITCHES AND LIGANDS
There are a lot of different riboswitch classes and several known ligands

They are involved, for example, in regulation of:

• vitamin biosynthesis
• amino acid metabolism
• purine metabolism

RIBOSWITCH OF COENZYME B12-RESPONSIVE RNA


An example of a metabolite-sensing riboswitch is the cobalamin-responsive RNA, which regulates
the expression of the cob-operon responsible for coenzyme B12 synthesis.

Mechanism: The RNA has a structural element that binds to the coenzyme B12, which is the
end product of the synthesis pathway.

There are two ways of using this mechanism, depending on whether the bacterium is gram negative
or gram positive, because it can be used at the transcriptional or translational level

GRAM POSITIVE: Regulation takes place at a transcriptional level

• Off state – B12 levels are high:
- The ligand binds to a structural element of the RNA
- This triggers a conformational change and formation of a terminator loop
- Additionally the conformational change sequesters the antiterminator loop so
there will not be transcription
• On state – B12 levels are low:
- There is no B12 so it must be produced
- In the absence of B12 an anti-terminator loop forms
- Additionally the terminator loop cannot be formed so there will be transcription
GRAM NEGATIVE: Regulation takes place at a translational level

• Off state – B12 levels are high:
- The ligand binds to a structural element of the RNA
- This triggers a conformational change and the formation of an anti-RBS hairpin
- There is no translation because the anti-RBS hairpin interacts with the SD
sequence and prevents ribosome binding
• On state – B12 levels are low:
- There is no B12 so it must be produced
- In the absence of B12 an anti-anti-RBS structure forms
- There is translation because the new structure sequesters the anti-RBS hairpin
so it does not interact with the SD sequence, so there will be ribosome binding

EXPERIMENTAL ANALYSIS

Structure analysis can be done through chemical or enzymatic probing. There is another
possibility to analyze the structural rearrangements during metabolite binding:

RNA analysis by in-line probing: Analyzes RNA structure and how it changes upon metabolite
binding.
➔ It relies on the spontaneous degradation of RNA. Cleavage at each position can be either
stimulated or inhibited depending on the local structure and the presence or absence of
metabolite.

Basis of in-line probing:


• RNA degradation primarily occurs through a nucleophilic attack by the 2′ oxygen on the
adjacent phosphorus center.
• Efficient cleavage occurs only when the attacking 2′ oxygen, the phosphorus, and the
departing 5′ oxygen of the phosphodiester linkage align in a linear configuration
(in-line process)
• The rate of spontaneous cleavage depends on the local structural context of each RNA
linkage.
• Linkages located in highly structured regions of RNA, like a base-paired helix, tend to
resist cleavage because they are not in an in-line configuration.
Experimental Analysis:
1. Start with your RNA sample, which is typically labeled at the 5′ end with a radioactive
marker (e.g., 32P).
2. Place the RNA in a buffer solution, with or without the
metabolite of interest.
3. Allow the RNA to incubate for several hours or days,
allowing for structural changes.
4. During this incubation, the RNA begins to degrade
through the in line process.
5. The speed at which spontaneous cleavage occurs
varies depending on the local RNA structure.
6. By comparing the degradation patterns of RNA in the
presence and absence of the metabolite, you can
observe differences in cleavage patterns.

SAM-DEPENDENT RIBOSWITCHES


• 7 distinct families of riboswitches bind S-adenosylmethionine (SAM) as their effector
• They regulate genes involved in sulfur metabolism
• SAM riboswitches also regulate expression of genes essential for survival and/or virulence
in medically important pathogens.
• Although SAM is used as a coenzyme for methylase enzymes, in certain bacteria this
compound also seems to be an important genetic signal.

There are 2 ways of using this mechanism, depending on whether the bacterium is gram positive
or negative:

Gram negative: The mechanism is regulated at the translational level, through an S-box
mechanism.
Gram positive: The mechanism is regulated at the transcriptional level. It mimics the B12
mechanism (termination, anti-termination)

VALIDATION OF TRANSCRIPTIONAL ATTENUATION IN VITRO


To prove that in Gram-positive bacteria the regulation is at the
transcriptional level, we do an in vitro transcription analysis:
1. Provide a DNA template and an RNA pol
2. Use radioactively labeled nucleotides
3. We will see mRNA fragments of different sizes:
- The full-length mRNA, transcribed down to
the end
- Shorter mRNA fragments, due to the
formation of terminator structures
4. The ratio of full-length to short fragments will shift depending on whether SAM
is added or not
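
The read-out of step 4 can be expressed as the fraction of full-length transcript. The sketch below assumes hypothetical band intensities that have already been normalized for the amount of label per transcript; the function name and the numbers are illustrative only.

```python
def fraction_full_length(full_length_signal, terminated_signal):
    """Fraction of transcripts that read through the terminator.

    Signals are band intensities, assumed to be already normalized for the
    number of labeled nucleotides in each transcript.
    """
    total = full_length_signal + terminated_signal
    if total <= 0:
        raise ValueError("no transcript signal")
    return full_length_signal / total


# Hypothetical gel quantification
without_sam = fraction_full_length(70.0, 30.0)
with_sam = fraction_full_length(20.0, 80.0)
print(f"read-through without SAM: {without_sam:.2f}")
print(f"read-through with SAM:    {with_sam:.2f}")
```

A drop in the read-through fraction when SAM is added is what one would expect if SAM binding favors the terminator structure.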

THIAMINE PYROPHOSPHATE (TPP) RIBOSWITCH


Thiamine pyrophosphate (TPP) is a thiamine (vitamin B1) derivative and functions as a
coenzyme for decarboxylase enzymes, thus it is a key factor for carbon metabolism. The
riboswitch can regulate at all levels:

• Gram negative: The mechanism is regulated at the translational level. TPP binding to
the riboswitch (localized at the 5’UTR) can modulate translation by inhibition
• Gram positive: The mechanism is regulated at
the transcriptional level. TPP binding to the
riboswitch (localized at the 5’UTR) triggers a
termination loop so it inhibits transcription
• Other organisms like molds: The mechanism is
regulated at the splicing level. TPP binding to
the riboswitch (localized at the 5’UTR) interferes
with splicing
• Other organisms like plants: The mechanism is
regulated at the processing level. TPP binding to
the riboswitch situated at the 3’ UTR regulates
the processing and stability of the mRNA

This TPP riboswitch has a very similar mechanism in all
these organisms, but its consequences are
completely different.

Riboswitch: Intrinsic RNA structure within the mRNA
that changes its conformation upon binding of a metabolite
or other molecule.

A SPECIAL CASE: TRYPTOPHAN OPERON


For this operon, the regulation of transcript attenuation is mediated by the ribosome.

➔ Why? Because the translation machinery is also the sensing mechanism for the
concentration of tryptophan

On the sequence of the operon we have:

• The promoter and operator
• Genes that encode enzymes for the synthesis of tryptophan
• In the region where transcription starts (after promoter and operator and before
the genes) there is a 162 bp leader region
LEADER REGION: This element encodes:

- Leader peptide
- Attenuator codons: an array of several tryptophan codons. This is the sensing
mechanism

HIGH TRP LEVELS:

• This means that there are a lot of tRNAs charged with trp
• Translation of the leader region happens very fast.
• The ribosome will stop at the end of the leader peptide ORF and release the
translated peptide
• The leader mRNA (not the peptide) will then form a terminator hairpin followed by a
stretch of U’s, which prevents the RNA pol from advancing and transcribing the
genes of the operon

LOW TRP LEVELS:

• This means that there are few tRNAs charged with trp
• Translation of the leader region is stalled because the ribosome cannot continue
past the trp codons of the leader peptide.
• The ribosome will sit on the trp codons, unable to continue. This prevents the
formation of the terminator loop and instead triggers the formation of an
anti-terminator loop
• The anti-terminator will allow the RNA polymerase to continue transcription of the
downstream genes of the operon

It’s a feedback system: When the operon genes are transcribed, trp is synthesized, so its
levels rise and the cell can charge more tRNAs with trp. When the trp
concentration reaches a certain level, translation of the leader region proceeds
without stalling and the operon genes are no longer transcribed.
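
The attenuation decision and the resulting feedback can be written as a small toy model. This is a sketch under stated assumptions: the threshold, synthesis and consumption values are arbitrary, and the function names are hypothetical.

```python
def leader_outcome(charged_trp_trna_abundant):
    """Map the tRNA-Trp charging state to the attenuation outcome."""
    if charged_trp_trna_abundant:
        return {
            "ribosome": "reads through the trp codons and finishes the leader peptide",
            "leader_mrna": "terminator hairpin",
            "operon_transcribed": False,
        }
    return {
        "ribosome": "stalls on the trp codons",
        "leader_mrna": "anti-terminator",
        "operon_transcribed": True,
    }


def simulate_feedback(steps=12, trp=0.0, threshold=5.0, synthesis=2.0, consumption=1.0):
    """Toy feedback loop; all numbers are arbitrary illustration values."""
    history = []
    for _ in range(steps):
        outcome = leader_outcome(trp >= threshold)
        if outcome["operon_transcribed"]:
            trp += synthesis               # enzymes are made, trp is synthesized
        trp = max(0.0, trp - consumption)  # the cell consumes trp
        history.append((outcome["operon_transcribed"], trp))
    return history


history = simulate_feedback()
for operon_on, level in history:
    print("operon ON " if operon_on else "operon OFF", level)
```

In this toy run the operon stays on while trp accumulates and then switches on and off around the threshold, which is the behavior the feedback description implies.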
RIBOREGULATION
In contrast to riboswitches, riboregulation refers to regulation
by small RNA regulators (sRNAs).

Benefits of sRNA regulation: Faster synthesis compared to
proteins, energy-efficient, and rapid regulation. There are two
different classes: cis sRNAs and trans sRNAs

CIS SRNA (CONTROLLED BY CIS-ACTING ELEMENTS):

• Control plasmids, transposons, and phage genes.
• Encoded at the same genetic locus as their target genes.
• They are transcribed in the opposite direction to, and overlapping with, their
target mRNA.
• Form a perfect duplex with the target mRNA (perfect
complementarity).

Functions:

- Prevent ribosome binding by binding to the RBS.
- Can promote mRNA cleavage.
- Can lead to transcription termination if they interact with
a riboswitch.
- Typically only one target, due to perfect duplex
formation.

TRANS SRNA (CONTROLLED BY TRANS-ACTING ELEMENTS):

• Involved in bacterial responses to environmental stress.


• Encoded at different positions on the chromosome.
• Duplexes with mRNA are not perfect (mismatches).
• Can have multiple targets due to imperfect binding to
mRNAs.

Functions:

- Can prevent translation.


- Promotes mRNA degradation.
- Can stimulate translation under certain conditions.

CIS REGULATION sRNA


CIS COPY NUMBER
Cis sRNAs can control the copy number of specific elements.

Plasmid copy number: Every plasmid has a characteristic copy number, which is the number of
copies of it maintained per cell.
The copy number control of the E. coli plasmid ColE1 involves two different small RNAs:

• RNA II: Forms a duplex at the origin of replication (ori) of the plasmid, acting as a
preprimer for replication. So it triggers replication

• RNA I: Its levels increase with higher plasmid copy numbers (so every time the plasmid
replicates, the RNA I levels increase).

Regulation: When the copy number is high, RNA I forms a kissing-loop complex with RNA II
with the help of the adaptor protein Rom, which inhibits replication. Plasmid copy number
is balanced by the RNA I and RNA II concentrations

Characteristics of RNA I:

- High turnover: It is degraded fast in order to allow the
balance between the two sRNAs
- The concentration of RNA I is proportional to the concentration
of the ColE1 plasmid
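
The negative feedback between RNA I and RNA II can be illustrated with a toy model. Everything here is an assumption made for illustration: RNA I is set equal to the copy number, the inhibition follows a simple saturation term, and the parameters `attempts` and `k` are arbitrary.

```python
def next_copy_number(copies, attempts=3.0, k=10.0):
    """One generation: replication damped by RNA I, then halving at cell division."""
    rna1 = copies                      # assumption: RNA I is proportional to copy number
    success = 1.0 / (1.0 + rna1 / k)   # fraction of RNA II preprimers that escape RNA I
    replicated = copies * (1.0 + attempts * success)
    return replicated / 2.0            # plasmids are split between two daughter cells


def steady_state(start, generations=200):
    """Iterate the toy model until the copy number has settled."""
    copies = float(start)
    for _ in range(generations):
        copies = next_copy_number(copies)
    return copies


from_low = steady_state(1)
from_high = steady_state(100)
print(round(from_low, 3), round(from_high, 3))
```

With these parameters both runs settle at the same copy number, k * (attempts - 1), regardless of the starting value. That is the point of the RNA I feedback: too few plasmids means little RNA I and more replication, too many means more RNA I and less replication.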

TRANSCRIPTIONAL TERMINATION
Some cis sRNAs are involved in transcriptional termination. For example the sRNA regulating
the gene virG

➔ This sRNA forms a perfect duplex with the 5'UTR of the virG mRNA

In the presence of sRNA:

- The duplex is formed.
- Due to duplex formation, the antiterminator structure in the mRNA 5’UTR cannot form
- Instead, a terminator hairpin folds in the target mRNA
- Transcription terminates

In the absence of sRNA:

- No duplex is formed.
- The mRNA 5’UTR can fold into
the antiterminator structure
- This prevents formation of the
terminator loop
- Transcription is allowed
OTHER MECHANISMS
It can take place at different levels and with different mechanisms.

TRANS REGULATION sRNA


Different environmental stresses, and therefore different signals, trigger a variety of trans sRNAs

INACTIVATION
• sRNA binds to the ribosome binding site (RBS).
• Prevents ribosome binding so it inhibits translation.
• mRNA is not protected by ribosomes and may degrade.
• Provides cleavage sites for RNases leading to further degradation.

ACTIVATION
Trans sRNAs normally inhibit or attenuate translation, but in some cases
they can activate translation.

DsrA: sRNA that stimulates the translation of rpoS (stress sigma factor).

Mechanism:

1. In the absence of the sRNA (DsrA), the RBS is sequestered in a
duplex within the mRNA
2. Binding of the sRNA to the upstream region that pairs with the RBS
opens the structure and frees the RBS.

So the sRNA stimulates translation by binding and rearranging the mRNA structure

Problem: Since the complementarity is not perfect, it’s difficult
for the sRNA to find its target.

➔ It requires the chaperone HFQ for proper binding


HFQ PROTEIN
• Acts as an RNA chaperone and it is heat-stable.
• Interacts with sRNAs, mRNAs, and other proteins, and mediates sRNA-mRNA
interactions
• Induces conformational changes in sRNAs and mRNAs to facilitate the interactions.
• Plays a role in virulence in pathogens and is a pleiotropic regulator

STRUCTURE

HFQ protein possesses two surfaces (proximal and distal). It facilitates the binding of one
RNA molecule on one surface and another RNA molecule on the other surface.

Binding pockets: Hfq has nucleotide binding pockets on both surfaces to interact with the
RNAs

- On the proximal site, the binding pockets recognize poly-U sequences.
- On the distal site, the binding pockets recognize sequences with the motif (A-R-N)n.
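
To make the (A-R-N)n motif concrete, the sketch below scans an RNA sequence for runs of A-R-N triplets, where R is a purine (A or G) and N is any nucleotide. The minimum of three repeats and the example sequence are assumptions chosen for illustration.

```python
import re

# (A-R-N)n motif: A = adenine, R = purine (A or G), N = any nucleotide.
# Requiring at least three consecutive triplets is an arbitrary choice here.
ARN_REPEAT = re.compile(r"(?:A[AG][ACGU]){3,}")


def find_arn_tracts(rna):
    """Return (start, tract) pairs for runs of at least three A-R-N triplets."""
    return [(m.start(), m.group()) for m in ARN_REPEAT.finditer(rna.upper())]


# Made-up sequence containing the triplets AAG, AGC and AAU in a row
tracts = find_arn_tracts("CCAAGAGCAAUCC")
print(tracts)
```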

ELECTROPHORETIC MOBILITY SHIFT ASSAY (EMSA)


• Purpose: Used to confirm that the Hfq protein stimulates RNA-RNA interactions.
• Principle: Measures the shift in the velocity of mobility when RNA molecules interact
with Hfq.
• Method: Conducted using a native (non-denaturing) gel, and labeled RNAs are
essential for this assay.

Over time, when the sRNA and mRNA have formed the duplex (dimerization), they no longer
require Hfq to maintain this duplex.

➔ So HFQ's role is primarily facilitating the initial interaction between the sRNA and
mRNA, acting as a catalyst.

Conclusion:

• HFQ acts as a platform on which both RNA ligands can transiently bind to either
surface of the protein, facilitating their interaction.
• HFQ increases the local concentration of ligands, promoting the chances of RNA-RNA
interactions.
• HFQ allows sRNA to sample large spaces, facilitating base-pairing between mRNA and
sRNA. This represents Hfq's inherent RNA chaperone functions.
• Ultimately, HFQ assists in ligand release and the formation of stable RNA duplexes,
promoting effective RNA-RNA interactions.

TRANSLATIONAL SILENCING BY sRNA


Trans sRNA can silence translation. One example is the sRNA RyhB responsible for regulating
genes involved in iron metabolism

Iron: Its homeostasis must be tightly regulated because

- It plays a crucial role in various biological processes, including serving as a cofactor in
many proteins and enzymes (for example heme proteins such as hemoglobin).

- It can be toxic due to the Fenton reaction, leading to the production of reactive
oxygen species

Mechanism: It is based on the regulator protein Fur (ferric uptake regulator)

• Fur forms dimers in the presence of iron ions (Fe²⁺).
• The dimers bind to specific operator regions, inhibiting the transcription of
RyhB by preventing RNA pol access to the promoter

Iron Limitation:

1. When iron is limited, Fur cannot effectively block
sRNA (RyhB) transcription.

2. The transcribed RyhB sRNA interferes with the
translation of proteins that rely on iron but are
not essential.

3. This ensures that non-essential proteins are
reduced when iron concentration is low, allowing
the limited available iron to bind to essential
iron-binding proteins.

Silencing Mechanism:

• RyhB sRNA recruits Hfq, a protein known for facilitating
RNA-RNA interactions.

• With Hfq's assistance, RyhB binds to its mRNA targets.

• This binding event triggers the recruitment of the RNA
degradosome, a complex consisting of RNase E and other
proteins.

DEGRADOSOME
RppH enzyme: In prokaryotes, primary RNA transcripts carry a triphosphate at the 5' end. RppH
can remove two of these phosphates, leaving a monophosphate, which marks the RNA for degradation

RNase E:

• It makes the first cuts in mRNAs
• It possesses a binding pocket for the 5' monophosphate of marked RNAs.
• It triggers downstream cleavage at single-stranded regions rich in A's and U's.

Cleavage and Degradation:

1. RNase E binds to the 5’ monophosphate through its pocket.
2. Downstream cleavage occurs at single-stranded regions rich in A's and U's.
3. Each cleavage generates two fragments: an upstream fragment with a 3' OH end
and a downstream fragment with a new 5' monophosphate
4. These downstream fragments can in turn trigger further cleavage
5. Finally, numerous 3' exonucleases degrade the fragments generated
by RNase E down to nucleotides.

RNase E Structure: It is structured as a dimer of dimers, consisting of four RNase E molecules
together.

• N-terminal catalytic domain: It is essential for the cleavage reaction and has several
subdomains:
- 5' sensor domain: Senses the 5' monophosphate and contains the binding pocket.
- S1 domain: Binds to the RNA substrate.
- DNase-like domain: Carries out the cleavage.

• C-terminal interaction domain: Allows binding to various enzymes to form the
degradosome and potentially regulates its function.

sRNA NETWORKS
Bacterial cells employ various sRNAs to regulate numerous mRNA targets, forming a complex
sRNA regulatory network.

EXPERIMENTAL IDENTIFICATION sRNA


sRNAs are characterized by 2 important features: They are small in size and they interact with
HFQ. These characteristics make their purification and identification possible through specific
methods:

• Size exclusion: Purify the small RNA fraction through size exclusion methods and
sequence it. Since this method is based on size, we may also purify small RNA
fragments that are non-regulatory.
• Overexpression: Enrich the sRNA fraction by co-purification with overexpressed HFQ
and further purify it for sequencing. The HFQ has to be tagged in order to be purified

Sequencing: Generate short reads from the purified sRNAs for alignment with reference
genomes, enabling the determination of the sRNAs' origin and encoding location.

HFQ TAGGING
A row of histidines (His tag), which interacts with nickel, can be added to the HFQ protein. The
tag facilitates purification through a nickel column, eliminating the need for antibodies.

➔ Why not antibodies? Antibodies bind broadly and could interfere with RNA-HFQ
interactions (polyclonal antibodies bind many epitopes of the protein)

PLACEMENT AND PREPARATION OF HFQ-HIS TAG

To introduce a His tag into the HFQ protein, one can use a
plasmid with a sequence encoding HFQ plus the His tag.

The His tag can be placed at the N-terminus or C-terminus of the
protein. The N-terminal placement may interfere with
protein structure and function

- N-terminus: AUG start codon - His codons - HFQ gene -
stop codon.
- C-terminus: AUG start codon - HFQ gene - His codons -
stop codon

EVALUATING HIS TAG'S IMPACT IN PROTEIN FUNCTION

To assess the potential interference of the tag with the
structure or function of the protein, 2 methods can be used:
• EMSA assay: Do the assay with the sRNAs and check if the tagged HFQ binds to them
• Gene reporter system: Compare the levels of the reporter gene when the non-tagged
HFQ is overexpressed versus when the tagged HFQ is overexpressed.

IMPORTANCE OF THE CONTROL REPLICAS

Control experiments without tagged HFQ are necessary to account for non-specific binding in
purification. Control function:

- Verify that there is a selective interaction between HFQ and the sRNA
- Subtract the signals produced by RNA fragments that are not from sRNAs

NORTHERN BLOT
Northern blotting is used to identify specific RNAs in a cell. We already know the sequence of
the specific RNA, but we want to know if it’s present in the cell.

➔ It involves purifying the total RNA cell fraction, separating the RNAs by length on a
denaturing gel, transferring it to a membrane, and detecting the target RNA with a
complementary fluorescent probe.

➔ sRNA expression (presence) under different conditions can be compared. This can
reveal gene expression profiles.

FUNCTIONAL sRNA ANALYSIS


BIOINFORMATIC APPROACH
Bioinformatic analysis: Search for potential mRNA targets. It can also be used to identify
potential sRNA

➔ Validation: For bioinformatically identified sRNAs and their targets, validation via
northern blotting (for sRNAs) or other methods is crucial.

OVEREXPRESSION OF sRNA
Overexpression of sRNAs through recombinant methods (cloning in plasmids etc).

• Can lead to the identification of targets


• May also identify indirect targets through transcriptomics or proteomics
• One can also analyze how the expression of sRNA changes the protein pattern

DELETION OF sRNA
Comparison of a wild type cell with a mutant lacking the sRNA

• Can lead to the identification of targets


• May also identify indirect targets through transcriptomics or proteomics
• Analysis of the reaction of the mutant to stress conditions and it’s differences (in the
transcriptome, proteome etc) with the wild type
PULSE EXPRESSION
Induce short-term expression (pulse) of sRNAs and identify the changes in the patterns of the
cell. This method is useful for identifying immediate effects (direct targets).

Procedure:

1. Clone the sRNA gene into a plasmid with a regulatable promoter.


2. Induce sRNA expression in bacterial cells using an inducer.
3. Identify the changes in the cell patterns (genes, RNAs, proteins etc.) at short intervals
after induction
4. Rapidly shut down sRNA expression after a brief period.

Useful for:

• Identify direct mRNA targets affected by the sRNA.


• Study changes in gene expression, protein abundance, mRNA stability, and
translation efficiency.
• Construct regulatory networks to map sRNA interactions.

Advantages: Pulse expression offers precise control and captures early sRNA responses,
reducing the risk of observing indirect effects.

In summary, pulse expression is a focused approach to studying sRNA function, providing
insights into immediate and direct sRNA impacts on gene expression and cellular responses.

IDENTIFICATION OF DIRECT sRNA TARGETS
In addition to the experimental approaches mentioned earlier,
there are methods specifically designed to determine the
direct interactions between sRNAs and their target mRNAs.

GRIL-SEQ
In this method, the target mRNA that you want to identify is
present within the bacterial cell.

1. Overexpression of small RNA: Introduce into the cell a
plasmid containing the sRNA of interest. This plasmid
allows for the overexpression of the small RNA.
2. RNA ligase plasmid: Introduce another plasmid that
contains an RNA ligase gene.
3. Pulse expression: Turn on the expression of the small
RNA, creating a high concentration of the sRNA in the
cell. The sRNA then hybridizes with the target RNA.
4. Induction of RNA Ligase: Induce the expression of the RNA ligase within the cell.
5. Ligation of RNAs: The RNA ligase catalyzes the ligation of the two RNAs. This ligation
results in the formation of chimeric RNA molecule, consisting of the sRNA linked to the
target RNA.
6. Isolation of chimeric RNA: Tagged oligonucleotides (with a polyA tail) that are
complementary to the sRNA part are introduced. Magnetic beads coated with polyT
oligos can then be used to pull out the chimeric RNA based on these tags.
7. cDNA Library Construction: The isolated chimeric RNA molecules can then be used to
create a cDNA library.
8. Sequencing: Perform RNA sequencing. By aligning the obtained sequences with a
reference genome, you can precisely identify the target RNA that is ligated to the small
RNA.

Once you identify the target RNA downstream of this method, you need to validate that this
interaction is biologically relevant and that the sRNA indeed regulates the purified RNA (do a
functional analysis)

OTHER METHODS
MAP-seq: In this method, the sRNA is equipped with an MS2 tag, which forms hairpin loops.
There are proteins that bind to MS2 and we can use them to purify the sRNAs with their targets.

GRAD-seq: Complexes of sRNAs, HFQ, and target mRNAs fractionate at the same position in a
sucrose gradient. This approach allows researchers to directly identify sRNA targets by
determining which sRNAs and target mRNAs are found in the same gradient fractions,
indicating their binding to each other.
TRANSLATIONAL REPRESSORS
The translation initiation regions of mRNAs can be blocked by protein binding factors that
target specific regions such as the RBS and the AUG start codon. These proteins act as
translational repressors

GP32 REPRESSOR
The GP32 repressor is encoded by a bacteriophage (T4 gene 32). It has a dual purpose in the cell:

• Binding to single-stranded DNA: GP32 binds to single-stranded DNA during phage
DNA replication
• Translational repression: GP32 acts as an
autoregulatory repressor. When the
concentration of GP32 becomes too high,
it can block translation by binding to the
RBS of its own mRNA.

Feedback Loop: This autoregulatory mechanism creates a feedback loop to maintain the
balance of GP32 in the cell.

AFFINITY OF GP32

Principle of regulatory feedback loops: The regulatory protein must have a higher affinity for
its primary (functional) target and a lower affinity for the second, regulatory target.

In this case GP32 has a higher affinity for the DNA target. Only the excess of GP32 then binds
to the RBS of its own mRNA to block its own translation
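
The affinity argument can be sketched numerically. The model below assumes that binding to single-stranded DNA is so tight that it is effectively stoichiometric, so only the GP32 left over after all ssDNA sites are filled can bind the mRNA. The function name, units and numbers are hypothetical.

```python
def mrna_occupancy(total_gp32, ssdna_sites, kd_mrna):
    """Fraction of gp32 mRNA whose RBS is occupied by GP32.

    Assumption: ssDNA sites are filled first (high affinity); only the excess
    protein binds the lower-affinity mRNA site with dissociation constant kd_mrna.
    """
    excess = max(0.0, total_gp32 - ssdna_sites)
    return excess / (excess + kd_mrna)


# Hypothetical amounts in arbitrary units
below_saturation = mrna_occupancy(total_gp32=50.0, ssdna_sites=100.0, kd_mrna=20.0)
above_saturation = mrna_occupancy(total_gp32=120.0, ssdna_sites=100.0, kd_mrna=20.0)
print(below_saturation, above_saturation)
```

As long as ssDNA is not saturated, translation of the GP32 mRNA is unrepressed; repression only sets in once GP32 exceeds what the DNA can bind.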

RIBOSOME BIOSYNTHESIS
The assembly of ribosomes involves multiple components that must be present in the correct
ratios within the cell.

• The genes responsible for these components are often organized in operons. However,
ribosomal proteins are encoded by different operons.
• It is crucial to regulate the operons precisely to ensure the correct stoichiometry of
ribosomal components.

STRINGENT CONTROL
Stringent control measures the metabolic status of a cell.

Key player: Guanosine tetra- or pentaphosphate, (p)ppGpp (also
called alarmone). Its concentration increases during growth or
stress conditions. Its concentration is dependent on RelA.
RelA: This enzyme senses the presence of uncharged tRNAs on the A site of the ribosomes,
which typically indicates a shortage of amino acids and nutrients

- Under these conditions RelA triggers (p)ppGpp synthesis
- During growth, at some point the uncharged tRNA signal will become very strong because
the cells consume a lot of nutrients

SpoT: SpoT is another enzyme involved in the stringent response. It has 2 functions:

- It can sense different signals and trigger (p)ppGpp synthesis.
- In other conditions it changes its enzymatic activity and hydrolyzes (p)ppGpp

Effects of (p)ppGpp: It removes the sigma 70 factor from the RNA pol and allows binding of
alternative sigma factors. This triggers the stringent control response which affects rRNA and
ribosomal protein transcription.

DOWNREGULATION OF rRNA SYNTHESIS

Under stress conditions the (p)ppGpp concentration
increases and triggers 2 things:

• Downregulation of rRNA and ribosomal
protein transcription
• Upregulation of other genes required for
survival under these conditions.

In order to follow the reduced synthesis of rRNA,
the proteins that bind to the ribosome must also
be regulated.

RIBOSOMAL OPERONS REGULATION

There are several operons in which the ribosomal RNA is encoded

Chromosome localization: In the E. coli genome, one half of the
circle contains the ORI (replication origin), and the other half
contains termination sites. The operons of the ribosomal RNAs
(rrn) are localized in the ORI half of the genome. Why?

From the ORI site, replication occurs in both directions. This
means:

➔ During replication, the cell will have 2 copies of the genes localized on the
ORI half. If replication starts again before the first
round is done, we’ll have 4 copies.
➔ The cell will still have only 1 copy of the genes localized in the termination half.

Gene dosage effect: Replication is also coupled to the metabolic status of the cell. If the cell has
enough nutrients it will start to replicate and therefore will have more copies of the genes.
• In good conditions (enough nutrients) replication allows the cell to have a strong
expression of all operons involved in rRNA. A lot of ribosomes can be synthesized.
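
The gene dosage effect can be put into numbers with the Cooper-Helmstetter relation, which is not part of these notes but is commonly used for this purpose: the average copy number of a gene depends on its position between origin and terminus, the replication time C, the time D between termination and division, and the doubling time. The C, D and doubling-time values below are illustrative assumptions.

```python
def copies_per_cell(position, c_period, d_period, doubling_time):
    """Average gene copies per cell (Cooper-Helmstetter relation).

    position: 0.0 at the replication origin, 1.0 at the terminus.
    All times must be in the same unit (here: minutes).
    """
    exponent = (c_period * (1.0 - position) + d_period) / doubling_time
    return 2.0 ** exponent


# Illustrative values: C = 40 min, D = 20 min
fast_ori = copies_per_cell(0.0, 40.0, 20.0, doubling_time=20.0)
fast_ter = copies_per_cell(1.0, 40.0, 20.0, doubling_time=20.0)
slow_ori = copies_per_cell(0.0, 40.0, 20.0, doubling_time=60.0)
slow_ter = copies_per_cell(1.0, 40.0, 20.0, doubling_time=60.0)

print("fast growth, ori/ter ratio:", fast_ori / fast_ter)
print("slow growth, ori/ter ratio:", round(slow_ori / slow_ter, 2))
```

Under fast growth the origin-proximal genes are present in several times more copies than terminus-proximal genes, and the difference shrinks when growth slows. This matches the reasoning for placing the rRNA operons near the ORI.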

AUTOREGULATION OF RIBOSOMAL PROTEINS


The cells have a very tightly regulated feedback loop which ensures that, in the end, rRNA and
ribosomal proteins are present at the correct concentrations in the cell.

Besides the regulation at the level of transcription (stringent control), the ribosomal
proteins have the capacity to downregulate their own synthesis at the level of translation

How can ribosomal proteins be regulated so they are present in the same stoichiometry?

➔ They bind to their own mRNA and regulate their own expression

How can ribosomal proteins bind to their own mRNA?

➔ Since ribosomal proteins bind to the rRNA of the ribosome, their own mRNA mimics
one part of the rRNA

RIBOSOMAL PROTEINS OPERONS


• Ribosomal protein genes are organized in
operons.
• Most operons contain genes for both small
and large subunit proteins.
• Some non-ribosomal proteins that have
roles in the translation process are also encoded in
these operons.
• One gene in each operon (dark colour) acts
as a regulator.
• In each case the regulatory protein is an
rRNA binding protein.

• Genes marked + are under translational
feedback regulation by the dark coloured
gene.
AUTOREGULATION OF RIBOSOMAL PROTEIN S15
S15: Is a ribosomal protein that is encoded by the rpsO operon

S15 binding:

• Usually S15 binds to a specific structure of the 16S rRNA
• It also binds to a similar structure found in the 5’UTR of the
mRNA of the rpsO operon

Autoregulation mechanism:

1. S15 binds to its target structure on the 16S rRNA
2. When the 16S rRNA is saturated, S15 will bind to its own mRNA
3. It will shut down its own translation

AUTOREGULATION OF tRNA SYNTHETASES


tRNA synthetases: enzymes that charge tRNAs with their corresponding amino
acid. For example, ThrRS charges tRNAs with threonine.

A lot of these proteins autoregulate their own translation:

1. When there are not enough aa’s, the tRNAs cannot bind to their synthetase protein
2. This causes the presence of a lot of free synthetase proteins
3. These proteins will bind to a specific structure (stem loop) of their own mRNAs. This
structure mimics the structure of a tRNA
4. This will cause the blocking of the RBS and the inhibition of translation

Analyze the effect in vivo: Remove the potential structure (the structure that you think they
bind to) to which these proteins bind in their own mRNAs and observe the effects

THE CARBON STORAGE REGULATOR A (CsrA)


CsrA: Translational regulator that oversees various pathways involved in carbohydrate
metabolism. This regulator has numerous mRNA targets, over which it can exert both positive
and negative control.

➔ Recognition motif: CsrA forms dimers that
bind at specific sites that have the nucleotide
motif GGA. These motifs are typically found at
the tips of two consecutive hairpin loop
structures within the mRNAs.

CsrB: Is an sRNA that sequesters CsrA molecules, acting
like a sponge. CsrB achieves this by folding into a
structure with many hairpins, each having the
recognition motif (GGA) at the tip
Absence of CsrB:

- CsrA binds to the recognition motif on its target mRNA
- CsrA blocks the RBS
- Translation is inhibited.

Presence of CsrB:

- CsrB sequesters CsrA
- CsrB effectively removes CsrA from the target mRNA
- This allows translation to proceed

CsrB RNA functions as an antagonist of CsrA activity, preventing it from interacting with its
target mRNAs. The only way to release CsrA from CsrB's sequestration is for CsrB to be
degraded by an RNase E enzyme.

CsrA functions: CsrA uses various
regulatory mechanisms, which explains its
diverse effects on different RNAs.
While we've discussed a repressor
mechanism, CsrA can also activate
RNA translation, stabilize mRNA, and
participate in other mechanisms.

THE OPERON glgCAP


glgCAP operon: is responsible for glycogen biosynthesis. In this operon the 5'UTR of the
transcribed mRNA contains loops with CsrA recognition motifs.

IDENTIFYING REGULATION MECHANISMS

To study and elucidate the regulatory mechanisms of the glgCAP operon a reporter gene like
GFP can be employed. Here's how it works:

1. The reporter GFP is placed under the regulatory region of glgC, within the 5'UTR.
2. CsrA is provided on a plasmid
3. When CsrA is turned on, it downregulates the expression of GFP, causing a decrease in
fluorescence
4. Turning off CsrA expression allows GFP to fluoresce again.

Addition of CsrB: CsrB acts as one level of regulation above CsrA

➔ When CsrA expression is turned on, there is no fluorescence. However, if CsrB


expression is also activated, it sequesters CsrA, leading to the re-emergence of GFP
fluorescence.

Addition of CsrD: It is another regulator, whose function is to degrade CsrB

➔ When expression of CsrA and CsrB is
turned on, there is fluorescence.
However, if CsrD expression is
also activated, it will degrade
CsrB, which in turn will release
CsrA. This will inhibit
GFP fluorescence.

Recap:

1. We first added a plasmid
containing CsrA, which caused a
decrease in fluorescence
2. Then we added a plasmid
containing CsrB, which restored
fluorescence
3. Finally we added a plasmid
containing CsrD, which shut down
fluorescence again

This multi-level, multi-plasmid system can be used to validate the role of different molecules
in regulation and can also allow us to modulate the expression of target genes.
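
The three-level reporter experiment behaves like a small logic circuit. The sketch below encodes it with booleans; it is a deliberate simplification (each regulator is either fully on or off) and the function name is hypothetical.

```python
def gfp_fluorescent(csrA_on, csrB_on, csrD_on):
    """Predict reporter fluorescence in the CsrA / CsrB / CsrD plasmid system."""
    csrB_present = csrB_on and not csrD_on       # CsrD triggers degradation of CsrB
    free_csrA = csrA_on and not csrB_present     # CsrB sequesters CsrA
    return not free_csrA                         # free CsrA blocks the RBS of the reporter


steps = [
    ("no regulator", False, False, False),
    ("+ CsrA", True, False, False),
    ("+ CsrA + CsrB", True, True, False),
    ("+ CsrA + CsrB + CsrD", True, True, True),
]
for label, a, b, d in steps:
    print(f"{label:22s} GFP {'ON' if gfp_fluorescent(a, b, d) else 'OFF'}")
```

The printed sequence follows the recap: fluorescence is lost with CsrA, restored with CsrB and lost again with CsrD.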

TRANSLATIONAL COUPLING
In bacterial cells, mRNAs are typically polycistronic

➔ Single mRNA transcribed from one promoter contains several ORFs


➔ The genes in a polycistronic mRNA often encode proteins that form a multiprotein complex

This arrangement ensures that the expression of multiple genes is coordinated, both in
transcription and translation.

Translational coupling: Is the regulatory mechanism by which stoichiometry between these
proteins is maintained, which is crucial for the proper assembly and function of multiprotein
complexes. There are two main types of translational coupling:

NON-OVERLAPPING TRANSLATIONAL COUPLING

There is a clear separation between the stop codon of one gene (Gene A) and the start codon
of the next gene (Gene B) downstream. This separation is typically achieved by an intergenic
spacer region. Here's how it works:

1. After translating Gene A, the ribosome reaches the stop codon.


2. There is a spacer sequence (intergenic region) between Gene A and Gene B.
3. At the end of the spacer, there is the SD sequence and the start codon of Gene B.
4. Two different ribosomes translate Gene A and Gene B sequentially.
OVERLAPPING TRANSLATIONAL COUPLING:

The stop codon of Gene A overlaps with the start codon (AUG) of the next gene (Gene B)
downstream. Here's how it works:

1. When translating Gene A, the ribosome reaches the stop codon (e.g., UGA).
2. The overlap creates a continuous ribosome-binding site.
3. In this case, the ribosome does not dissociate after translating Gene A.
4. Instead, the ribosome shifts back to accommodate this overlap.
5. The ribosome repositions itself in a way
that the start codon of Gene B is now
positioned in the ribosomal P-site.
6. This allows the initiator tRNA to locate
the start codon of Gene B, and
translation can reinitiate from there.
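
Whether two neighboring ORFs are coupled in the overlapping or non-overlapping way can be read directly from the positions of the stop and start codons. The sketch below assumes 0-based indices of the first nucleotide of each codon; the sequences and the function name are made up for illustration.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def coupling_type(mrna, stop_index, start_index):
    """Classify the arrangement of gene A's stop codon and gene B's start codon.

    stop_index / start_index: 0-based position of the first nucleotide of the
    stop codon of gene A and of the AUG start codon of gene B.
    """
    if mrna[stop_index:stop_index + 3] not in STOP_CODONS:
        raise ValueError("no stop codon at stop_index")
    if mrna[start_index:start_index + 3] != "AUG":
        raise ValueError("no AUG at start_index")
    if start_index > stop_index + 2:
        return "non-overlapping"   # start codon lies after the stop codon (spacer)
    return "overlapping"           # the two codons share at least one nucleotide


# AUGA: the start codon AUG (index 3) overlaps the stop codon UGA (index 4)
overlapping = coupling_type("GCUAUGAAA", stop_index=4, start_index=3)
# UAA stop (index 3), spacer containing an SD-like GGAGG, AUG start (index 16)
spaced = coupling_type("GCUUAAGGAGGUAUAUAUGAAA", stop_index=3, start_index=16)
print(overlapping, spaced)
```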

OVERLAPPING TRANSLATIONAL COUPLING


CONTROL OF IRAD EXPRESSION
Context: IraD is an anti-adapter protein that binds to an adapter (RssB), which is required for
the degradation of RpoS sigma factor

Mechanism: The translational control of iraD expression is tightly regulated.

• The iraD mRNA contains an upstream open reading frame (uORF) called idlP (IraD
leader peptide).
• The SD sequence of the iraD mRNA is initially trapped in a secondary structure,
preventing ribosome binding.

Regulation:

• Several GGA sequences within the 5'UTR of iraD can be bound by CsrA, which would
block translation initiation.
• When CsrA is not bound to the mRNA, translation can occur from the start codon of
the idlP uORF until the stop codon of the uORF

Ribosome Shifting:

• When the ribosome reaches the stop codon of the uORF it engages in a ribosomal
shifting mechanism.
• Via interaction between the SD sequence of the next ORF and the aSD sequence of the
uORF, the ribosome shifts by +2 nucleotides to the 3' side.
Result:

• The shifting places the start codon of the second ORF in the ribosomal P-site,
resembling a perfect translation initiation complex.
• Initiation factors like IF2 can then bind to the AUG start codon, allowing synthesis of
IraD.

NON-OVERLAPPING TRANSLATIONAL COUPLING


PHAGE MS2
Context: In phage MS2 there is a polycistronic mRNA that encodes 2 genes (gene 1 and 2).

➔ The region downstream of gene 1 (spacer region) interacts with the RBS of
gene 2.
➔ This interaction initially blocks ribosome entry at the binding site of the 2nd gene

Mechanism: The coupling ensures that translation of the first gene is required to facilitate the
translation of the second gene.

• Translation of gene 1 triggers the opening of the secondary structure formed in the
spacer region.
• By opening this structure, the RBS of gene 2 is freed. This allows translation initiation
by a second ribosome.

TRANSLATIONAL ATTENUATION – RESISTANCE GENES


Regulation of resistance genes against antibiotics: A leader ORF in an mRNA forms a
secondary structure that blocks ribosome binding for the downstream resistance genes.

Mechanism:

• Without antibiotic, the ribosome translates the upstream leader ORF and is then
released.
• This allows the rapid formation of the secondary structure that blocks the translation
initiation region of the resistance gene.

Antibiotic Binding:

1. When an antibiotic binds in the exit tunnel of the ribosome, translation of the leader
ORF can continue until it reaches a specific sequence with two positively charged
residues.
2. This residue motif blocks the ribosome, so it gets stalled.
3. The stalling of the ribosome gives time for the remodeling of the mRNA structure.
4. This structural change opens up the translation initiation region of the resistance
gene, allowing a second ribosome to bind and initiate translation.
5. When the resistance gene is expressed, its product can remove the antibiotic from the
ribosome, so the translation of the leader ORF continues.

It basically couples the expression of the resistance gene to the presence of the antibiotic.
Otherwise it would be a waste of energy to always express the resistance gene.
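The feedback loop described above can be written as a small toy model. This is a minimal sketch under strong assumptions: time runs in discrete steps, the resistance protein never decays, and `threshold` (the amount of protein needed to clear the antibiotic from the ribosome) is an invented parameter, as are all the names in the code.

```python
def simulate_attenuation(steps: int, antibiotic_present: bool, threshold: int = 3):
    """Toy model: returns a list of (ribosome_stalled, resistance_protein) per step."""
    protein = 0
    history = []
    for _ in range(steps):
        # The antibiotic stays bound only while too little resistance protein exists.
        antibiotic_bound = antibiotic_present and protein < threshold
        # A bound antibiotic stalls the ribosome inside the leader ORF.
        ribosome_stalled = antibiotic_bound
        # Stalling gives the mRNA time to refold and expose the initiation region.
        initiation_region_open = ribosome_stalled
        if initiation_region_open:
            protein += 1  # a second ribosome translates the resistance gene
        history.append((ribosome_stalled, protein))
    return history


print(simulate_attenuation(5, antibiotic_present=True))
print(simulate_attenuation(3, antibiotic_present=False))
```

With the antibiotic present, the model stalls and produces protein until the threshold is reached, then switches off again. Without the antibiotic, the resistance gene is never translated.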

TRANSLATIONAL ATTENUATION - REGULATION OF secA BY secM


SecA: Motor chaperone involved in the transport of proteins through the inner membrane in
bacteria.

SecM: A uORF that regulates the translation of secA to ensure the tight control of SecA
expression. SecM is positioned in the upstream region of the secA mRNA.

Mechanism: The polycistronic mRNA of secA contains an RBS that is initially trapped in a
secondary structure, preventing ribosome binding under normal conditions.

1. Ribosome translation: The ribosome starts translation of the secM ORF.
2. SecM stalling sequence: SecM contains a signal sequence that interacts with positively
charged residues present in the exit tunnel. This interaction forms a secondary structure
that stalls the ribosome.

At this point 2 things can happen:

• Concentration of SecA is low: If there is no SecA, the ribosome continues to be stalled,
which gives time for the secondary structure to open. This allows a second ribosome to
initiate translation of secA.
• Concentration of SecA is high: SecA will bind to the ribosome and pull the nascent
peptide, which releases the ribosome and stops further translation.

SecA Recruitment: When there are high levels of SecA, this protein is recruited to the stalled
ribosome, allowing translation of the SecM peptide to continue. Upon translation termination,
the inhibitory structure reforms rapidly.
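The two cases above amount to a negative feedback loop on the SecA level, which can be sketched as a toy simulation. Everything numeric here is an assumption made for illustration (integer SecA units, a fixed `release_threshold`, constant `synthesis` and `decay` per step), and the function name is hypothetical.

```python
def simulate_secA(steps: int, start_level: int = 0, release_threshold: int = 5,
                  synthesis: int = 2, decay: int = 1) -> list[int]:
    """Toy model of SecM-mediated feedback; returns the SecA level after each step."""
    level = start_level
    trace = []
    for _ in range(steps):
        # Low SecA: the ribosome stays stalled on secM, the structure opens,
        # and a second ribosome translates secA.
        if level < release_threshold:
            level += synthesis
        # High SecA: the stall is released, the inhibitory structure reforms,
        # and no new SecA is made; existing SecA is slowly lost.
        level = max(0, level - decay)
        trace.append(level)
    return trace


print(simulate_secA(8))  # prints [1, 2, 3, 4, 5, 4, 5, 4]
```

In this sketch the SecA level rises while it is below the threshold and then hovers around it, which is the kind of tight control the SecM mechanism provides.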
