Chapter 10
Reprinted with permission from Science Vol. 287, No. 5461 24, March 2000. (Drawings are from the Archives, California Institute of Technology.) Copyright 2000 AAAS.
In this chapter, we will explore how the understanding of inheritance developed over time, from Mendel's model of inherited factors to today's understanding of the genetic material as DNA, the material of genes and chromosomes, and its organisation into genomes.
In a monastery garden
In the summer of 1856, visitors to the monastery of St Thomas in the town of Brno, in what is now the Czech Republic, would have seen monks at work and prayer. Visitors may have noticed one monk examining flowers on pea plants in the vegetable garden near the monastery kitchen.
[Figure 10.2: Structure of a pea flower, panels (a) and (b); labelled parts: standard petal, wing, keel, stigma, ovary, flower bud.]
Figure 10.2 shows the typical structure of a pea flower. Under normal conditions, pea plants are self-fertilising; that is, pollen from one flower fertilises the ovules of the same flower. However, this monk was carrying out a procedure to prevent self-fertilisation. Using forceps, he carefully removed the stamens from flower buds on one pea plant and dusted pollen that he had collected from another pea plant onto the stigma of the first plant. In doing this, he was artificially crossing the pea plants (see figure 10.3).
[Figure 10.3: Carrying out an artificial cross, panels (a) to (c); label: forceps.]
Later, the monk wrote about his procedure for an artificial cross as follows: 'For this purpose, the bud is opened before it is perfectly developed, the keel is removed and each stamen carefully extracted by means of forceps, after which the stigma can at once be dusted over with the foreign pollen.'
At another time, this monk could be seen in another section of the vegetable garden where he recorded the characteristics of mature pea plants in his notebook. Later, with others assisting him, the monk sat at a table where he shelled peas, sorted them into groups of different colours and shapes and counted the numbers in the various groups.
Who was this quiet monk? He was Gregor Mendel (1822–1884) (see figure 10.5a, page 342). Growing up on a farm, the young Mendel would have noticed variation in the offspring of farm animals. Years later in the monastery, Mendel turned his attention to edible pea plants (Pisum sativum) and examined the inheritance of variation in seven different traits of this species (see figure 10.4). He also used other plant species, such as beans, and experimented with bees.
Trait                    Variations
Stem length              tall / short
Seed (cotyledon) colour  yellow / green
Seed (cotyledon) shape   round / wrinkled
Seed coat colour         grey / white
Pod texture              inflated / constricted
Flower position          axial / terminal
Pod texture: constricted pods lack a hard inner pod lining so that the seed outlines can be seen (as in snow peas); inflated pods have a tough parchment-like lining.
Flower position: in the axial arrangement, flowers can arise along the entire length of the stem; in the terminal arrangement, flowers are bunched at the top of the stem.
ODD FACT
Tall pea plants grow to a height of about two metres, while short pea plants grow to less than half a metre.
Figure 10.5 (a) Gregor Mendel and (b) the monastery gardens in which he carried out his plant-breeding experiments. Mendel stopped his genetic experiments in 1871 after being elected Abbot in 1868. He died of Bright's disease.
Just one first-hand account of Mendel exists and it is that of a horticulturalist named Eichling who visited Mendel at the Brno monastery in 1878. Recalling this visit years later in 1942, Eichling wrote that Mendel gave him lunch and showed him the monastery garden. Mendel told Eichling that he had 'reshaped (the green peas) in height as well as in type of fruit'. In response to Eichling's question of how he had done that, Mendel answered: 'It is just a little trick, but there is a long story connected with it which would take too long to tell.'
Oodles of peas
Mendel's choice of pea plants for his breeding experiments meant that he was able to obtain relatively large numbers of offspring from even a single cross. Every pea in a pod on a pea plant is a single offspring and each pea 'baby' will grow into a mature plant (see figure 10.7).
Figure 10.6
NATURE OF BIOLOGY BOOK 2
In all, Mendel produced thousands of offspring from his pea plant crosses over eight years. Large numbers of offspring allow regularities to be recognised and valid averages to be identified. If only small numbers of offspring are obtained, regularities may not be seen and averages may be biased by chance events. Numbers do matter! For example, imagine that you have four coins and that one of them is double-headed. Would you be absolutely confident that you could identify the double-headed coin on the basis of the result of tossing each coin just once? What about ten tosses? Likewise, when Mendel was examining various outcomes from his crosses, such as green pods or yellow pods, he obtained large numbers because he wanted to 'ascertain their statistical relations'.
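The coin question can be explored with a little arithmetic. The sketch below is only an illustration: it assumes that the other three coins are fair and that tosses are independent, and the function names are invented for this example. It calculates the chance that every fair coin shows at least one tail, so that the double-headed coin is the only one to show heads every time.

```python
from fractions import Fraction

def prob_all_heads(tosses: int) -> Fraction:
    """Chance that one fair coin shows heads on every one of `tosses` tosses."""
    return Fraction(1, 2) ** tosses

def prob_unambiguous(tosses: int, fair_coins: int = 3) -> Fraction:
    """Chance that each fair coin shows at least one tail.

    Only then is the double-headed coin the sole coin with all heads.
    Assumption: three fair coins unless stated otherwise.
    """
    return (1 - prob_all_heads(tosses)) ** fair_coins

for n in (1, 10):
    p = prob_unambiguous(n)
    print(f"{n} toss(es) per coin: {p} (about {float(p):.3f})")
```

Under these assumptions, one toss per coin identifies the double-headed coin with certainty only one time in eight, while ten tosses per coin do so more than 99 per cent of the time. Larger samples leave less room for chance to mislead.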
Figure 10.7 (a) Hybrid plant with its 'baby' offspring enclosed in pods. How many offspring have been produced from the self-fertilisation of the plant shown? Traits that are expressed in 'baby' peas include pea shape and pea (cotyledon) colour. (b) After planting, each pea develops into a mature pea plant. Traits that are expressed in mature plants include flower position, stem length, seed coat colour and pod colour.
2. For each trait, individual plants had two factors that could be identical or different. Plants with two identical factors (such as long and long) were referred to as pure breeding, while plants with different factors (such as long and short) were called hybrids.
3. Each factor was a discrete particle that retained its identity across generations. This idea challenged the commonly held view that inheritance was a blending process in which factors lost their identity (see figure 10.9).
4. The character that was expressed in the F1 hybrid plants was dominant, while the hidden character in the hybrid was recessive. For example, green pod colour is dominant and yellow pod colour is recessive.
5. During gamete formation, the members of each pair of factors separated to different gametes, with one factor per gamete. This is the principle of segregation of alleles or Mendel's first law.
6. In separating, members of one pair of factors behaved independently of members of other pairs of factors. This is the principle of independent assortment or Mendel's second law.
7. The results of a particular cross were the same, regardless of which plant was used as the male parent and which as the female parent.
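The assumptions about paired factors, dominance, segregation and independent assortment can be combined into a small counting exercise. The following sketch is illustrative only: the symbols G/g (green and yellow pod colour) and T/t (a second, hypothetical trait) and the function names are chosen for this example, with an uppercase letter standing for the dominant factor.

```python
from collections import Counter
from itertools import product

def gametes(genotype):
    """All gametes a plant can form: one factor from each pair (segregation),
    with pairs combined in every possible way (independent assortment)."""
    return list(product(*genotype))

def cross(parent1, parent2):
    """Count offspring phenotypes over all equally likely gamete pairings.

    A genotype is a list of factor pairs, e.g. [("G", "g")].
    Assumption: an uppercase factor is dominant over its lowercase partner.
    """
    counts = Counter()
    for g1, g2 in product(gametes(parent1), gametes(parent2)):
        phenotype = tuple(
            "dominant" if (a.isupper() or b.isupper()) else "recessive"
            for a, b in zip(g1, g2)
        )
        counts[phenotype] += 1
    return counts

hybrid = [("G", "g")]                      # hybrid for pod colour
print(cross(hybrid, hybrid))               # one trait

double_hybrid = [("G", "g"), ("T", "t")]   # hybrid for two traits
print(cross(double_hybrid, double_hybrid))
```

Crossing two hybrids for one trait gives three offspring with the dominant character for every one with the recessive character. For two traits the same counting gives 9 : 3 : 3 : 1 among the sixteen equally likely combinations.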
ODD FACT
Mendel was disappointed that his work was not recognised and is reported to have stated: 'Meine Zeit wird schon kommen' ('My time is sure to come'). That recognition did not come until 1900, more than 30 years after Mendel's work was published and 16 years after his death.
ODD FACT
The first report of a human condition behaving as a Mendelian dominant characteristic was published in 1905. This condition is abnormally short fingers, or brachydactyly.
acquaintance from the monastery in Brno could have explained the occurrence of this long-haired kitten!
In the years following the publication of Charles Darwin's The Origin of Species in 1859, it is claimed that the attention of scientists moved to evolution and to the differences between species. As a result, there was a decline in interest in the work of plant and animal breeders who were concerned with differences within species. In this climate, few biologists would have been interested in the plant-breeding experiments of an obscure monk in a monastery in Brno. Mendel's explanatory model was ignored for more than 30 years.
Mendel's model was rediscovered in 1900 by three biologists working independently. The biologists were de Vries, a Dutch plant breeder; Correns, a German botanist; and Tschermak, an Austrian botanist. After its rediscovery, biologists in Europe and America demonstrated that Mendel's model applied to inheritance in many plants and animals. By the end of the first decade of the twentieth century, Mendelian principles had been found to apply to many organisms, including:
nettles (Urtica pilulifera): serrated leaf margin dominant to entire
wheat (Triticum sp.): late ripening dominant to early ripening
stocks (Matthiola sp.): coloured dominant to white
maize (Zea mays): smooth seed dominant to wrinkled
mice (Mus musculus): coloured coat dominant to albino
rabbits (Oryctolagus cuniculus): short fur dominant to Angora (long) fur
cattle (Bos taurus): polled (hornless) dominant to horned
poultry (Gallus gallus): brown eggs dominant to white eggs
sheep (Ovis aries): white wool dominant to black.
The Mendelian model of inheritance was soon universally accepted as the basis of inheritance in plant and animal species.
KEY IDEAS
Mendel carried out carefully planned experiments using techniques different from other plant breeders.
Mendel developed a model of inheritance built on a set of assumptions about his 'factors'.
Mendel's model both explained observed results and allowed predictions.
Mendel's model of inheritance was ignored by the scientific community but was rediscovered in 1900 independently by three biologists.
After its rediscovery, Mendel's model was soon found to apply to other kinds of living things.
QUICK-CHECK
1 Identify the following as true or false.
  a Mendel's model assumed that parental characters blended in their offspring.
  b Pea plants normally undergo cross-fertilisation.
  c Mendel's model applies to inheritance in plants and animals but not to human inheritance.
2 List two assumptions of Mendel's explanatory model for inheritance.
3 What impact did Mendel's model make on the scientific community at the time it was first reported?
4 Name the three biologists who rediscovered Mendel's work.
ODD FACT
The science of genetics was initially called 'Mendelism' after Gregor Mendel, who laid the foundations for the modern science of genetics. The name 'genetics' was introduced in the early 1900s by William Bateson, the first professor of genetics at Cambridge University.
Figure 10.12 Linkage groups in (a) maize (Zea mays) and (b) in chickens (Gallus gallus). Bateson and Punnett were the first to recognise that Mendel's factors (genes) did not always assort independently.
Walter Sutton did not carry out any experimental crosses; instead, he synthesised the independent results of other scientists, recognised patterns and made the key link between the chromosomes studied by cytologists and the genes studied by geneticists (see table 10.1). Figures 10.13 and 10.14 show the parallel relationship between the behaviour of genes and chromosomes during meiosis.
Genetic behaviour: Segregation of the members of each pair of alleles into different gametes
Chromosomal behaviour: Separation (disjunction) of the members of each pair of matching chromosomes into different gametes (figure 10.13). Random orientation of different pairs of chromosomes across the cell equator prior to their separation during gamete formation (figure 10.14)
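[Figure 10.13: Alleles R and r separating at anaphase 1 and anaphase 2 of meiosis, giving gametes in the proportions 1/2 R and 1/2 r.]
[Figure 10.14: Two pairs of chromosomes carrying the alleles R, r and T, t orienting at random before anaphase 1 and anaphase 2, giving gametes in the proportions 1/4 RT, 1/4 rt, 1/4 Rt and 1/4 rT.]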
The parallel behaviour of chromosomes and genes provided strong evidence for Sutton's conclusion that genes were located on chromosomes. However, it was not until 1910 that the first specific gene was demonstrated to be located on a specific chromosome. This was done by T. H. Morgan (1866–1945) (see figure 10.15).
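The parallel between chromosome behaviour and gene behaviour can be checked by enumeration. The sketch below is an illustration under stated assumptions: each pair of matching chromosomes carries one allele per chromosome, each pair can face either way across the cell equator with equal likelihood, and the function name is invented for this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def gamete_frequencies(chromosome_pairs):
    """Expected gamete proportions when each pair of matching chromosomes
    orients at random and its two members separate to opposite poles.

    chromosome_pairs: one tuple of alleles per pair, e.g. [("R", "r"), ("T", "t")].
    """
    counts = Counter()
    # Every pair can face either way, so there are 2**n equally likely orientations.
    for orientation in product((0, 1), repeat=len(chromosome_pairs)):
        pole_a = "".join(pair[side] for pair, side in zip(chromosome_pairs, orientation))
        pole_b = "".join(pair[1 - side] for pair, side in zip(chromosome_pairs, orientation))
        counts[pole_a] += 1
        counts[pole_b] += 1
    total = sum(counts.values())
    return {gamete: Fraction(n, total) for gamete, n in counts.items()}

print(gamete_frequencies([("R", "r")]))              # one pair of chromosomes
print(gamete_frequencies([("R", "r"), ("T", "t")]))  # two pairs of chromosomes
```

One pair of chromosomes gives gametes R and r in equal halves, which mirrors segregation. Two pairs give RT, rt, Rt and rT in equal quarters, which mirrors independent assortment.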
NATURE, STRUCTURE AND ORGANISATION OF THE GENETIC MATERIAL
Morgan and his co-workers at Columbia University (United States) confirmed Sutton's conclusion. They showed that factors (genes) were not free particles like 'peas in soup', but were organised into larger structures: chromosomes. They showed that when genes were located close together on homologous chromosomes, specific alleles of these linked genes tended to be inherited together. Morgan's findings explained the strange observation of Bateson and Punnett. Clearly, the genes controlling pea flower shape and pea flower colour were located close together on the same chromosome.
Figure 10.15 Thomas Hunt Morgan with fly drawings. Morgan used fruit flies (Drosophila melanogaster) in his experiments that showed that genes were located on chromosomes. In 1933, T. H. Morgan was awarded a Nobel Prize for his contribution to genetics.
A commonly held, but incorrect, view at that time was that genes were probably made of protein. However, over the first half of the twentieth century, several experiments revealed what genes were made of.
Griffith carried out experiments using two kinds of pneumococci bacteria (see figure 10.17). He found that:
injection of living smooth type into mice caused them to die from pneumonia
injection of living rough type left the mice healthy.
Griffith also killed smooth bacteria by heating and extracted the contents of these dead cells. When mice received an injection of this material, they remained healthy. These results supported the conclusion that pneumonia was caused by living smooth pneumococci bacteria.
Griffith then mixed the contents from dead smooth cells with living rough cells. He injected mice with these treated rough cells and found that they died from pneumonia. When the dead mice were examined, living smooth cells were found, even though no living smooth cells had been injected. How had this happened?
4. Living rough cells + dead smooth cells: mouse dead from pneumonia; living smooth cells present in mouse
Figure 10.18
The deadly smooth bacteria found in the mice had formerly been harmless rough bacteria. The harmless rough bacteria had been changed or transformed by something in the contents of the smooth cells. This change agent caused the harmless rough bacteria to produce an external capsule and become the deadly smooth type of bacteria. This 'something' became known as the transforming factor. It was concluded that the transforming substance was equivalent to the substance of the genetic material itself. Griffith's experiments demonstrated that genetic material was a chemical substance (see figure 10.18). But what was it?
Result
ability to transform rough to smooth remained
ability to transform rough to smooth remained
ability to transform rough to smooth remained
transformation ability destroyed
Later, Avery extracted the contents from smooth bacteria, then separated and purified the various components until he had a highly purified sample of the transforming factor. When this was identified, it was found to be DNA. The momentous discovery of Avery and his co-workers was not accepted immediately by the entire scientific community. Some biologists doubted the validity of Avery's conclusion. Even books published some years after Avery announced his discovery include cautious statements about the identity of genetic material, for example:
'. . . the present experiments strongly suggest rather than prove that genes are pure DNA . . .' (1957)
New scientific discoveries are not always rapidly accepted by the entire scientific community. If scientists hold strongly competing alternative views, they may not readily accept new findings that disagree with their views. It is now universally accepted that genes are made of the chemical compound DNA. DNA belongs to the class of chemical substances called nucleic acids. Figure 10.20 shows one bacterial cell (centre) that has been treated to release its genetic material: strands of the nucleic acid DNA.
Figure 10.20 Threads of genetic material from the bacterium Escherichia coli. What is this material?
KEY IDEAS
Genes are located on chromosomes.
The first suggestion that genes were located on chromosomes was made in 1902 by Sutton.
In 1910, Morgan and his co-workers carried out the first experiments that demonstrated that genes were located on chromosomes.
Griffith demonstrated the chemical nature of the genetic material by showing that a substance existed in bacteria that could change or transform one strain of harmless bacteria into a lethal strain and that this change was then passed on to the next generation.
Avery's experiments led to the conclusion that the transforming factor, and hence the genetic material, is DNA (deoxyribonucleic acid).
QUICK-CHECK
5 Whose experiments were the first that suggested that Mendel's factors did not always behave independently?
6 How did the work of Morgan change Mendel's model?
7 What is the 'transforming factor'?
8 Briefly explain how the work of the following scientists contributed to an understanding of the chemical nature of Mendel's factors.
  a Griffith
  b Avery
Nature of genes
With Avery's discovery, Mendel's factors or genes were identified as being made of DNA. Definitions of genes before Avery's work naturally make no reference to their chemical nature. The following definitions of a gene come from textbooks, including some published during the early part of the twentieth century. A gene was defined as:
'a name for the thing in a germ cell which makes the germ cell develop a particular character, such as tallness as opposed to dwarfness' (1911)
'an independently inheritable element of the genotype by the presence of which some particular character in the organism is made possible' (1918)
'the hypothetical unit in a germ cell which determines the production of a particular character in the individual derived from that germ cell' (1921)
'a hypothetical unit in the chromatin of a cell which has a specific influence on certain characteristics' (1934)
'something which appears to pass through the reproductive cells and to influence a particular character in the offspring' (1939)
'a segment of a DNA molecule that can copy itself and pass on to other generations the directions it contains' (1966).
Analysing DNA
Figure 10.21
Genes are made of the chemical substance called deoxyribonucleic acid (DNA). Let's examine further this substance that we introduced in chapter 1 (page 23).
ODD FACT
Two of the bases, cytosine and thymine, belong to the class of chemical compounds called pyrimidines. The other two, adenine and guanine, are larger and belong to the class of chemical compounds called purines.
ODD FACT
DNA was first isolated from nuclei of cells from pus on bandages by the young Swiss post-graduate student Miescher. At that time, in 1869, he was working in a laboratory located in a castle in Tübingen, Germany, and he gave the name 'nuclein' to the compound he isolated. DNA was the only known nucleic acid for many years, but in the 1930s a second kind was found in the cell cytosol; this was given the name RNA.
Figure 10.22 Four different representations of the nucleotides that are the sub-units of DNA. Can you identify the phosphate, the sugar and the base in each nucleotide?
When many nucleotides join to form a chain, a bond forms between the sugar of one nucleotide and the phosphate group of the next nucleotide, and so on (see figure 10.23). So, one chain of nucleotides runs from head to tail, with a phosphate group at the head end (also known as the 5′ [5 prime] end) and a sugar molecule at the tail end (also known as the 3′ [3 prime] end).
[Figure 10.23: A chain of nucleotides, each drawn as a phosphate (P), a sugar (S) and a base (A or T).]
So DNA is built from nucleotides joined to form a chain. However, the question remained: how is a typical DNA molecule arranged in three-dimensional space? For example, does it consist of one nucleotide chain coiled into a ball? Does it contain more than one nucleotide chain?
Table 10.4 Approximate values for the total amounts of the four kinds of nucleotides in DNA samples from different organisms. What predictions would you make about DNA from calf liver cells?
                     Nucleotides
Source of DNA        A     T     C     G
calf thymus          1.7   1.6   1.0   1.0
yeast cells          1.8   1.9   1.0   1.0
tubercle bacteria    1.1   1.0   2.6   2.6
herring sperm        1.1   1.1   0.9   0.9
These figures indicate a possible pattern. It appears that in DNA the proportions of A and T are about equal and also that the proportions of C and G are about equal. This idea, known as Chargaff's rule, was an important observation that later contributed to understanding the 3-D structure of DNA.
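The pattern can be checked directly from the values in table 10.4. In the sketch below, the data are copied from the table, while the function names and the tolerance of 0.15 for 'about equal' are assumptions made for this illustration only.

```python
# Approximate nucleotide amounts from table 10.4.
TABLE_10_4 = {
    "calf thymus":       {"A": 1.7, "T": 1.6, "C": 1.0, "G": 1.0},
    "yeast cells":       {"A": 1.8, "T": 1.9, "C": 1.0, "G": 1.0},
    "tubercle bacteria": {"A": 1.1, "T": 1.0, "C": 2.6, "G": 2.6},
    "herring sperm":     {"A": 1.1, "T": 1.1, "C": 0.9, "G": 0.9},
}

def chargaff_ratios(amounts):
    """Return the ratios A/T and C/G for one DNA sample."""
    return amounts["A"] / amounts["T"], amounts["C"] / amounts["G"]

def fits_chargaff(amounts, tolerance=0.15):
    """True if both ratios lie within `tolerance` of 1 (an assumed cut-off)."""
    a_to_t, c_to_g = chargaff_ratios(amounts)
    return abs(a_to_t - 1) <= tolerance and abs(c_to_g - 1) <= tolerance

for source, amounts in TABLE_10_4.items():
    a_to_t, c_to_g = chargaff_ratios(amounts)
    print(f"{source}: A/T = {a_to_t:.2f}, C/G = {c_to_g:.2f}, fits: {fits_chargaff(amounts)}")
```

Both ratios stay close to 1 for every sample, even though the amount of A and T relative to C and G differs widely between organisms.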
KEY IDEAS
The genetic material DNA is built of sub-units called nucleotides joined to form a linear chain.
Each nucleotide consists of a deoxyribose sugar part, a phosphate part and an N-containing base.
The sugar and phosphate parts are identical in nucleotides found in DNA, but there are four different bases, namely adenine (A), guanine (G), cytosine (C) and thymine (T).
Chargaff's rule states that certain bases occur in equal proportions in DNA.
QUICK-CHECK
9 What are the sub-units of DNA?
10 If the sub-units of DNA were analysed, what parts would be identified?
11 The relative proportion of the G nucleotide in DNA from human gut cells was found to be 1.4 and that of T was 0.9. What other conclusions are possible?
nucleotide chains arranged to form a double helix. Their work built on the work of other scientists, in particular the pioneering X-ray crystallography work of Rosalind Franklin (1920–1958) and Maurice Wilkins (1916–2004).
ODD FACT
In 1962 the Nobel Prize was awarded jointly to Watson, Crick and Wilkins for their work in discovering the structure of DNA. The other person who played a decisive role in this discovery was Rosalind 'Rosy' Franklin. She died of cancer in 1958 at the age of 37. The rules governing the Nobel Prize do not permit an award to be made to a person after death.
The key features of the double helix model of DNA (see figure 10.25) are:
Each DNA molecule consists of two nucleotide chains.
The chains run in opposite directions and are said to be anti-parallel.
The sugar-phosphate backbones of the two chains are on the outside of the DNA double helix and they coil around each other in a regular manner to form a molecule with a constant diameter.
The bases (A, T, C and G) are arranged so that they point to the inside of the DNA molecule.
The bases in one chain pair with the bases in the second chain in a very specific way: there is pairing only between A and T or between C and G.
Weak hydrogen bonds form between the base pairs.
The base pairs between the two strands, namely A with T and C with G, are said to be complementary base pairs (see figure 10.26).
This complementary double helix structure for DNA fits with the known properties of the genetic material, including the facts that DNA:
can act as a template for its own replication
contains genetic instructions
can undergo change or mutation.
The box on pages 377–8 contains excerpts from James Watson's personal account of the discovery of the DNA double helix (The Double Helix, Atheneum, New York, 1968). Reading these excerpts may help you understand how scientists work and realise that they spend time thinking about problems and assessing alternative ideas, not just doing experiments.
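Because pairing occurs only between A and T and between C and G, the base sequence of one chain fixes the sequence of its partner. The sketch below illustrates this; the function names and the example sequence ATTAGC are chosen for illustration only.

```python
# Complementary base pairs in DNA: A with T, C with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complementary_strand(strand: str) -> str:
    """Partner bases written in the same left-to-right order as `strand`.

    Because the chains are anti-parallel, this partner reads 3' to 5'
    when `strand` is written 5' to 3'.
    """
    return "".join(PAIRS[base] for base in strand.upper())

def reverse_complement(strand: str) -> str:
    """The partner chain rewritten in its own 5' to 3' direction."""
    return complementary_strand(strand)[::-1]

chain = "ATTAGC"
print(chain)                        # 5' ATTAGC 3'
print(complementary_strand(chain))  # 3' TAATCG 5'
print(reverse_complement(chain))    # 5' GCTAAT 3'
```

Taking the complement of a complement returns the original chain, which is one way to see how each chain can act as a template for the other.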
Figure 10.28 The double helix structure of DNA can undergo a reversible change. What is this change? (Figure labels: heat, dissociation; cool, re-association.)
KEY IDEAS
DNA normally exists as a double helix molecule.
In a DNA double helix, bases along one chain pair with complementary bases in the other chain and are stabilised by weak hydrogen bonds.
Double-helical DNA can be reversibly dissociated into two single DNA chains by heating and the chains re-associate on cooling.
QUICK-CHECK
12 What is meant by a 'double helix'?
13 What would be expected to happen if a solution of DNA was (a) gently heated, and (b) then allowed to cool?
14 Consider part of a DNA chain with the nucleotides: . . . T T A G G A C . . . Which of the following is part of the complementary strand:
  a . . . C C G A A G T . . . ?
  b . . . T T A G G A C . . . ?
  c . . . A A T C C T G . . . ?
ODD FACT
The smallest chromosome deletion that can be detected visually through a microscope is about four million bp.
Figure 10.29
ODD FACT
At a biological conference in early 2001, researchers recorded their guesses about the number of human genes; their estimates ranged from 27 462 to 200 000!
Figure 10.30
ODD FACT
Of the 37 genes carried on mtDNA, 28 are located on one strand, known as the H strand, and 9 are located on the complementary strand, known as the L strand. The H(eavy) strand has many more G bases than the L(ight) strand that is rich in Cs.
genes. The term 'D-loop' comes from the fact that, during replication of mtDNA, the first newly synthesised strand displaces one of the parental strands and forms a loop.
Figure 10.32 Map of the double-stranded circular molecule of human mitochondrial DNA (mtDNA) showing the various genes for which this DNA codes. Do all the mtDNA genes control production of proteins? To avoid congestion, symbols for the 22 tRNA genes are not shown. Can you locate these 22 genes by their colour code? OH and OL denote the origins of replication of the two chains of the mtDNA molecule.
Colour code: Complex I genes (NADH dehydrogenase); Complex III genes (ubiquinol : cytochrome c oxidoreductase); Complex IV genes (cytochrome c oxidase); Complex V genes (ATP synthase); transfer RNA genes; ribosomal RNA genes.
Labels on the map include: D-loop (nt 0/16569), OH, OL, 12s rRNA, 16s rRNA, Cyt b, ND1, ND2, ND4, ND5 and ND6.
Gene structure
The template strand and its partner
A gene consists of part of a double-helical molecule of DNA. One of the two chains contains the information present in a particular gene and this is called the template strand. The complementary chain is sometimes called the nontemplate strand.
Representations of DNA
DNA cannot be seen with a light microscope. However, a technique known as scanning tunnelling microscopy (STM) allows DNA molecules to be visualised (see figure 10.33). Part of a single chain of DNA could be shown as follows:
. . . -nucleotide-nucleotide-nucleotide-nucleotide-nucleotide- . . .
OR it could be shown as:
. . . -P-sugar-P-sugar-P-sugar-P-sugar-P-sugar- . . .
           |       |       |       |       |
          base    base    base    base    base
OR the specific nucleotides in one chain could be shown:
5′ . . . A T T A G C T T G A G G C G . . . 3′
Which representation is correct? All are correct. Which representation is the most informative? The third is the most informative because it gives the information about the order of nucleotides, the only variable part of the genetic material. What information does the second representation provide?
DNA is not always represented in diagrams as a double helix. Figure 10.34 shows some of the many ways of representing DNA. The representation used will depend on the purpose of the diagram. Each provides different information about DNA. For example, (a) gives information about the coding regions (exons) and the non-coding regions (introns) within a gene, while (c) gives the base sequence.
Figure 10.34 Different ways of representing DNA. Panels (a) to (d); labels include Exon 1, Exon 2, Exon 3, Intron 1, Intron 2, 879 bp and 286 bp.
Gene sequencing
What is this? ATGGTGCACCTGACTCCTGAGGAGAA This is part of the nucleotide sequence of the non-template (coding) strand of the human HBB gene, which controls production of one of the protein chains found in haemoglobin. What is the sequence of the complementary strand? When the order of the nucleotides in a gene is identified, the gene is said to be sequenced. The order in many genes from animals, plants and bacteria has been identified, and data are being added each week. Gene sequencing involves the process of identifying the order of nucleotides along a gene. Figure 10.35 shows a scientist examining some sets of bands arranged in columns. Each band represents one nucleotide and the order of the bands down the column corresponds to the gene sequence. New techniques of sequencing are described in the box below.
Are gene sequencers used only for human genes? No! The genetic material of all organisms is DNA and the structure of that DNA is identical, regardless of whether it comes from wheat, jellyfish, ducks, Bacillus bacteria, insects or people. In all organisms, genes are built of the same alphabet of four letters, namely the nucleotides A, T, C and G of DNA. The only exception to this generalisation is that some viruses, such as the retroviruses (family Retroviridae), have ribonucleic acid or RNA as their genetic material, not DNA. So the genetic instruction kit to make a human being or make an oak tree or make a white shark consists of thousands of instructions, each consisting of DNA with different base sequences.
ODD FACT
In 1990, the cost of DNA sequencing was about $10 per base. By 2003, the cost had dropped to about 5 cents per base.
DNA SEQUENCERS
The process of gene sequencing has now been automated and is done using instruments known as DNA sequencers (see figure 10.36a). This automated system involves the use of four different coloured fluorescent dyes, each of which binds to a specific base (A, T, C or G) in DNA. The DNA chain is sequenced using a procedure that makes a complementary copy of the DNA template in steps, with each copy being one nucleotide longer than the previous one, as shown below:
DNA template: CTCTCCGCCAAACGCATAACC
1st copy:     G*
2nd copy:     GA*
3rd copy:     GAG*
4th copy:     GAGA*
etc.
21st copy:    GAGAGGCGGTTTGCGTATTGG*
Figure 10.36 (a) ABI Prism 310 Genetic Analyzer from Applied Biosystems (b) Output from a DNA sequencer showing the laser signals and the output from the computer, which identified the base sequence in part of a DNA fragment. What is signalled by a red band?
In each case, the nucleotide at the end of each copy becomes attached to the specific fluorescent dye (shown as a *). The copies move in turn, shortest first, past a scanning laser that activates the dye so that it emits a fluorescent signal, which is captured by a detector. This detector transfers the signal to a microcomputer, which determines the entire base sequence. The output from a DNA sequencer shows the base sequences as a series of coloured signals (see figure 10.36b) with a yellow peak denoting G, a red peak denoting T, a green peak denoting A and a blue peak denoting C. DNA sequencing laboratories exist in several Australian cities, including the Brisbane Division of the Australian Genome Research Facility and the DNA Sequencing Laboratory at the Walter and Eliza Hall Institute in Melbourne. You can visit the latter at the web site http://www.wehi.edu.au/dsl.
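The logic of the DNA sequencers box can be sketched in a few lines. This is a simplified illustration of the idea only, not of how any real instrument's software works: the function names are invented, the copies are written in the same left-to-right order as the template (as in the box), and the colour code is the one given in the box.

```python
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}
# Colour of the signal for each end base, as stated in the box.
DYE_COLOUR = {"G": "yellow", "T": "red", "A": "green", "C": "blue"}
COLOUR_TO_BASE = {colour: base for base, colour in DYE_COLOUR.items()}

def labelled_copies(template: str) -> list[str]:
    """Complementary copies of the template, each one nucleotide longer than the last."""
    full_copy = "".join(PAIRS[base] for base in template)
    return [full_copy[:n] for n in range(1, len(full_copy) + 1)]

def detector_signals(copies: list[str]) -> list[str]:
    """Colours seen as the copies pass the laser, shortest first.

    Only the end nucleotide of each copy carries a dye.
    """
    return [DYE_COLOUR[copy[-1]] for copy in sorted(copies, key=len)]

def read_sequence(signals: list[str]) -> str:
    """Turn the series of coloured signals back into a base sequence."""
    return "".join(COLOUR_TO_BASE[colour] for colour in signals)

copies = labelled_copies("CTCTCCGCCAAACGCATAACC")
signals = detector_signals(copies)
print(copies[:4])
print(signals[:4])
print(read_sequence(signals))
```

Reading the colours in order of copy length recovers the sequence of the longest copy, which is the complement of the template.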
Table 10.5 Part of the sequences of different genes from various organisms. Numbers are placed above the sequences for ease of locating a particular nucleotide.
1 Organism P
10 TTA
20
30
40
GCC CTC CTT GCG CTC CTT TCC CTT TTA 20 30 40 AAA GAG GGA AGC 40
Organism Q
ATG AAG TGT AAT GAA TGT AAC AGG GTT CAA TTA 1 10 20 30
Organism R
ATG ACG CTG ACT CAA GCT GAG AAG GCT GCC GTG ATC ACC ATC TGG 1 10 20 30 40 GGG TTC TGC TGG GCT
Organism S
ATG AGG CTC TTG TGG TTG CTT TTC ACC ATT
KEY IDEAS
The length of a double-helical DNA molecule can be expressed as the number of base pairs (bp) it contains. Each human chromosome contains one long molecule of doublestranded DNA with millions of base pairs. A typical gene consists of tens of thousands of base pairs. The estimated total number of human genes is 20 000 to 25 000. Of the two DNA chains in a gene, the one containing the genetic information is known as the template strand of DNA, while its complementary chain is called the non-template strand. Genetic instructions are coded in an alphabet of four letters only: (the nucleotides) A, T, C and G. Identication of the order of nucleotides along a length of DNA is called DNA sequencing. Different genes vary in the nucleotide sequences along their DNA.
QUICK-CHECK
15 A piece of DNA contains 20 000 bp. Is this more likely to be a whole chromosome or a whole gene? Explain.
16 If genes were isolated from a cat, a cyanobacterium and a cauliflower, what similarity would be seen?
17 If the two human genes for making blood-clotting factor and salivary amylase were compared:
   a in what way would they be similar?
   b in what way different?
various combinations of these elements, Morse code can convey very complex information, such as all the words in this chapter.
Proteins have many functions, and the various types include: structural proteins, which occur in connective tissues and in cell membranes; contractile protein of muscle and myofilaments; enzymes that regulate chemical processes; proteins of the immune system, such as the antibodies; oxygen-carrying proteins, such as haemoglobin; and hormonal proteins, such as insulin and growth hormone. By encoding the sets of instructions on how to make the various types of proteins, genes control the structure and the biochemical and physiological functioning of an organism. The estimated 20 000 to 25 000 genes of a human organism contain all the instructions on 'How to make a human organism' that, if printed as the base sequences, would fill 1000 volumes of an encyclopedia.
Table 10.6
Number of nucleotides in one instruction | Total number of different instructions possible
1 (e.g. T) | 4
2 (e.g. AA, AT, GA) | 16
3 (e.g. TTA, GCC, AAA) | 64
4 (e.g. GGGA, TGCA, AATG) | 256
In discussing the genetic code, the term 'base' is sometimes used. You should recall that each base is part of a nucleotide.
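The totals in table 10.6 follow from a simple rule: with four different nucleotides, an instruction that is n nucleotides long can be written in 4 multiplied by itself n times (4 to the power n) different ways. The short Python sketch below is an illustration only; the function names are our own and are not part of any standard genetics software.

```python
from itertools import product

BASES = "ATCG"  # the four-letter alphabet of DNA


def count_instructions(n):
    """Number of different instructions possible with n nucleotides each."""
    return len(BASES) ** n


def list_instructions(n):
    """Write out every possible instruction of length n."""
    return ["".join(letters) for letters in product(BASES, repeat=n)]


for n in range(1, 5):
    print(n, "nucleotide(s) per instruction:", count_instructions(n))
```

Listing the instructions, as well as counting them, is a useful check: `list_instructions(3)` contains TTA, GCC and AAA, the three examples given in the table.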
In fact, one genetic instruction consists of a group of three bases, such as AAT, GCT and so on. Because of this, the genetic code is referred to as a triplet code. This form of code is sufficient to account for the pieces of information that must be encoded. Consider a piece of DNA with the base sequence:
TAC AAA CAA GCT CCT ACT . . .
This DNA has six coded instructions (shown here as groups of three bases) that are decoded or translated as follows:
1. TAC = Start building a protein, commencing with the amino acid, met
2. AAA = now add the amino acid, phe
3. CAA = now add the amino acid, val
4. GCT = now add the amino acid, arg
5. CCT = now add the amino acid, gly
6. ACT = now stop.
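The decoding steps can be sketched in a few lines of Python. This is a minimal illustration, assuming only the six triplets that appear in the example sequence (TAC, AAA, CAA, GCT, CCT and ACT); it is not the full genetic code of 64 triplets, and the names `TRIPLET_MEANINGS`, `split_into_triplets` and `decode` are our own.

```python
# Meanings of the six template-strand triplets in the example sequence.
# This small table is for illustration and is NOT the complete genetic code.
TRIPLET_MEANINGS = {
    "TAC": "START (met)",
    "AAA": "phe",
    "CAA": "val",
    "GCT": "arg",
    "CCT": "gly",
    "ACT": "STOP",
}


def split_into_triplets(dna):
    """Split a base sequence into non-overlapping groups of three bases."""
    dna = dna.replace(" ", "").upper()
    whole = len(dna) - len(dna) % 3  # ignore any incomplete final group
    return [dna[i:i + 3] for i in range(0, whole, 3)]


def decode(dna):
    """Read the triplets in order and stop at the STOP instruction."""
    steps = []
    for triplet in split_into_triplets(dna):
        meaning = TRIPLET_MEANINGS.get(triplet, "not listed in this example")
        steps.append((triplet, meaning))
        if meaning == "STOP":
            break
    return steps


for triplet, meaning in decode("TAC AAA CAA GCT CCT ACT"):
    print(triplet, "=", meaning)
```

Because the code is read in non-overlapping groups of three, a sequence of six bases such as TTAGGG holds two instructions, not four or six.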
NATURE, STRUCTURE AND ORGANISATION OF THE GENETIC MATERIAL
ODD FACT
The genetic code is not completely universal. Some triplets that code for one instruction in most organisms code for a different instruction in a few other organisms. For example, the triplet ATC, which is a STOP signal in most organisms, codes for the addition of the amino acid glutamine (gln) in some protists, including Paramecium. The code is also different for mitochondrial DNA, where TCT is a STOP signal, not a code for arg.
KEY IDEAS
DNA contains information encoded in the base sequence of its template strand.
Genes contain coded instructions for joining specific amino acids into proteins.
The genetic code in DNA is a non-overlapping triplet code consisting of groups of three bases.
One piece of genetic code typically contains the information to add one amino acid to a protein.
QUICK-CHECK
18 Give an example of a code.
19 In what form is information held in a DNA molecule?
20 Which of the following statements is most accurate? Explain your choice.
   a DNA is converted to the amino acid sub-units of protein.
   b DNA contains the coded information for joining amino acids to form protein.
   c DNA turns into protein.
21 How many instructions (for adding amino acids) are present in the base sequence: TTAGGG?
22 What code translates as 'START joining amino acids to form a protein'?
23 What are the meanings of the following codes in DNA: CAA and ACT?
ODD FACT
The largest genome yet identified is that of an amoeba (Amoeba dubia). Its genome comprises 670 000 000 000 base pairs. This is more than 200 times larger than the human genome!
What is a genome?
The genome of an organism is its complete set of genetic instructions, encoded in DNA. For humans, the genome consists of the DNA of the haploid set of autosomes, plus sex chromosomes. Similarly, for other eukaryotes (animals, plants, fungi, protists), their genomes are the DNA of the haploid sets of their chromosomes. When we refer to the genome of a eukaryotic organism, for example, the chimp genome or the rice genome, we are speaking about the nuclear DNA. We can also talk about the genome of those organelles that contain DNA, such as the mitochondrial genome or the chloroplast genome. The field of study of genomes is termed genomics.
For prokaryotes (bacteria and archaeans), their genomes comprise the DNA of the single circular chromosome that carries the genetic instructions of each species. For viruses, their genomes consist of their entire genetic instructions encoded in DNA or, in the case of RNA viruses such as retroviruses, in RNA.
When a genome is sequenced, it means that the precise order or sequence of bases in the DNA of the genome has been identified. Reports of the sequencing of the genome of various organisms appear in scientific journals. For example, in March 2000, the fruit fly genome sequence was published (refer back to figure 10.1) and, in October 2005, sequences of the RNA of the genomes of more than 200 influenza viruses were published, including one influenza virus that was the cause of the 1918 flu pandemic.
ODD FACT
Reports of the sequencing of the genomes of various organisms appear regularly in the scientific press. Publication of genome sequences typically occurs in stages: a preliminary release, a final draft sequence, then a finished version. The error rate in the final sequence is less than one in every 10 000 bases.
ODD FACT
In December 1999, chromosome 22 became the first human chromosome to be sequenced (33 500 000 bp and 545 genes). In May 2000, sequencing of the 33 800 000 base pairs of chromosome 21 was completed.
Here are some facts about the human genome.
How big? About 3 000 000 000 bp (3 billion base pairs) organised as a DNA double helix in the chromosomes. The most recent measurement puts the size of the human genome at 2 850 000 000 base pairs. If you were to count the bases on one chain of DNA at the rate of one per second, it would take you over 90 years to complete the count.
How many genes? 20 000 to 25 000 genes organised into the DNA of 22 non-homologous autosomes and the X and Y sex chromosomes. The majority of these genes contain the instructions for building proteins and, surprisingly, these genes constitute only about 1.5 per cent of the total genome! About 95 per cent of the genome is of unknown function. (While this is sometimes called 'junk DNA', it should more correctly be called non-coding DNA, because functions for at least some of this DNA will be identified in the future.)
How do the genomes of individuals differ? Many of the differences between the genomes of different people are single base differences in the DNA sequence of the genome (see figure 10.40), known as single nucleotide polymorphisms or SNPs. Other differences involve variation in the number of repeats of short sequences of bases.
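The '90 years' figure can be checked with a little arithmetic. The sketch below is an illustration only: it assumes an average year of 365.25 days and counting without a break, and the function names are our own.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # assumption: average year of 365.25 days


def years_to_count(bases, bases_per_second=1):
    """Years needed to count the given number of bases without stopping."""
    return bases / bases_per_second / SECONDS_PER_YEAR


def coding_bases(genome_size, coding_percent=1.5):
    """Approximate number of bases in protein-coding genes."""
    return genome_size * coding_percent / 100


measured = 2_850_000_000  # most recent measurement quoted in the text
rounded = 3_000_000_000   # the rounded '3 billion' figure

print(round(years_to_count(measured), 1), "years for the measured size")
print(round(years_to_count(rounded), 1), "years for the rounded size")
print(round(coding_bases(measured)), "bases in protein-coding genes (about 1.5 per cent)")
```

Both values come out at a little over 90 years, in line with the statement in the text.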
Figure 10.41
NATURE OF BIOLOGY BOOK 2
ODD FACT
The DNA sequenced in the HGP came from 12 anonymous donors who provided either blood samples (from females) or sperm samples (from males).
This knowledge might be used to identify people at risk and to develop treatments to reduce or prevent these conditions. Human biology: Data from the HGP will allow us to understand better the genetic control of normal human development and will also provide new insights into human evolution, anthropology and the prehistoric migrations of human groups.
To view photos and read more about Cinnamon and Tasha, go to www.genome.gov and click on Newsroom or do a search for each name.
Table 10.7 Selected organisms whose genomes have been sequenced as at the end of 2005
Organism | Date published | Size of genome (base pairs, bp) | Estimated number of genes | Comment
phiX174 (virus) | Apr. 1993 | 5 386 | 11 | first genome sequenced
smallpox virus | May 1993 | 186 000 | 197 |
Haemophilus influenzae (bacterium) | July 1995 | 1 830 000 | 1 850 | first bacterium
Saccharomyces cerevisiae (brewer's yeast) | Apr. 1996 | 12 069 000 | 6 294 | first eukaryote and first fungus
Methanococcus jannaschii (archaean found at hydrothermal vents) | Aug. 1998 | 1 700 000 | 1 738 | first archaean
Caenorhabditis elegans (nematode worm) | Dec. 1998 | 97 000 000 | 19 099 | first animal
Drosophila melanogaster (fruit fly) | Mar. 2000 | 137 000 000 | 14 100 |
Arabidopsis thaliana (thale cress) | Dec. 2000 | 115 000 000 | 25 498 | first plant
Anopheles gambiae (mosquito) | Oct. 2002 | 278 000 000 | 14 000 | main vector of malaria
Rattus norvegicus (rat) | Apr. 2004 | 275 000 000 | 20 975 |
Oryza sativa (rice) | Apr. 2005 | 390 000 000 | 37 544 |
In June 2004, it was announced that Australian and US scientists would cooperate on sequencing the genome of an Australian marsupial, the tammar wallaby (Macropus eugenii). The Australian researchers involved in this project are at the Australian Genome Research Facility in Melbourne. Read the story of one of the scientists, Dr Sue Forrest, who is involved in the sequencing of the tammar wallaby (see page 369). In addition to the organisms listed in table 10.7 above, the genomes of the mitochondria of several species, including the human mitochondrial genome (16 568 bp), have been sequenced. So, too, have the chloroplast genomes of several plant species.
A genome is more than simply a sequence of so many base pairs. Researchers also explore genomes in terms of other features, including: organisation of the genome into chromosomes; genome maps, identifying the order of genes and the relative position of each gene on the DNA of the chromosomes; and proteins produced by a genome, identifying all the protein products of the coding genes within a genome, a new field of study termed proteomics.
Comparative genomics
The availability of the complete sequences of the genomes of an increasing number of organisms has created a new field of study known as comparative genomics. Comparing the genomes of various species will elucidate how various features of genomes have evolved and how the genomes of closely related species differ. Comparative genomics also provides data to assist research into medicine, ecology and biodiversity and is also a powerful tool in exploring evolution (see chapter 14, pages 556–7). For example, comparative genomics has provided evidence of the occurrence of processes such as gene duplication, where a second copy of a gene appears in a genome, and horizontal gene transfer, where a new gene has been acquired by one species as a result of the transfer of DNA from a second species.
In August 2005, the sequencing of the genome of the chimp (Pan troglodytes) (see figure 10.42) was completed. Comparisons between the chimp genome and the human genome are expected to elucidate the genes that control the distinctive features of primates, such as high brain-to-body-mass ratios, and the genes that determine our uniquely human features.
Figure 10.43 shows a map of the B2 cat chromosome. This figure also shows a comparison of that chromosome with the human chromosome-6. Note that some of the genes are conserved in the two species, as indicated by the dotted lines.
BIOLOGIST AT WORK
Dr Sue Forrest molecular geneticist
Dr Sue Forrest is a molecular geneticist and the Director/CEO of the Australian Genome Research Facility, a major national research facility with nodes in Brisbane, Melbourne and Adelaide. Her previous position totalled 13 years at the Murdoch Childrens Research Institute at the Royal Children's Hospital in Parkville. Most recently, she headed the Gene Discovery Group there for five years, developing methodologies for the discovery of the genes responsible for common human diseases and, prior to this, ran the DNA Diagnostic laboratory for eight years.
My interest in genetics developed right from my first introduction to this fascinating area in first-year Biology as part of my Bachelor of Science at Melbourne University. Following completion of my Honours degree, majoring in Biochemistry and Genetics, I headed overseas to study for my PhD at Oxford University where I was fortunate to work with Professor Kay Davies. We cloned the gene, dystrophin, in 1987. This gene, when mutated, results in Duchenne muscular dystrophy and was one of the first disease-causing genes to be cloned in the late 1980s.
In the gene discovery laboratory, we were particularly interested in neurodegenerative diseases. We looked for large pedigrees of individuals demonstrating clear Mendelian inheritance where we could obtain blood samples and determine the most likely location for the disease-causing gene using genetic markers. These studies would result in locating disease-causing genes to smaller sections of a particular chromosome but the hard part remained finding the exact gene that was mutated.
A fantastic event in genetic history occurred in 2003 when the sequence of the human genome was announced as completed. The first major outcome was that there were only about 25 000 genes in the human genome, compared to the 100 000 originally predicted, requiring new ideas about gene structure and function to be developed. Since the Human Genome Project was instigated, there is far more information about genes and their sequences on the Internet and much of the research is now done as 'computer cloning' rather than actual laboratory bench work! The challenge now is to determine the function of the genes in the human genome and how they are regulated.
During this finalisation of the Human Genome Project in 2001, I was offered the position of Scientific Director of the Australian Genome Research Facility (AGRF) followed by Director/CEO in 2003. AGRF is partly funded by the Federal Government to provide access to state-of-the-art genetic tools and technologies that can be utilised by researchers across the whole biological spectrum. Thus, moving to this position dramatically opened my eyes to the vast spectrum of molecular biology and genetic research in species, from microbes through to animals, that was occurring within Australia and around the world.
Australia did not play a major role in the sequencing of the human genome, but through a unique collaboration between the National Institutes of Health in the USA and the Australian Genome Research Facility, funded by the State Government of Victoria, the genetic sequence of the tammar wallaby is being determined. This sequence will assist with defining which regions of the genome share sequence between human and wallaby, thereby indicating that they are likely to have a significant function. Such sequences could be involved in regulating gene expression as an example. Also, much biological research has been done in Australia on the tammar wallaby demonstrating novel properties of lactation, development and reproduction that will be unravelled using the genetic sequence.
It is an exciting time in genetics and I certainly would never have predicted in the early 1980s that, 20 years later, I could read the whole human genome sequence on the Internet. I wonder what the next 20 years will bring us!
Figure 10.44
Dr Sue Forrest, Director of the Australian Genome Research Facility, and a tammar wallaby named Wriggles
KEY IDEAS
The genome of an organism consists of its complete set of genetic instructions.
The human genome consists of about 3000 million base pairs of DNA.
The Human Genome Project was completed in 2003.
Comparative genomics will provide new insights into our understanding of evolution.
QUICK-CHECK
24 List one of the benefits of the Human Genome Project.
25 Identify the first eukaryotic organism to have its genome sequenced.
26 Is the following statement true or false? 'Most of the genomes of two unrelated persons would be different.'
27 What is meant by the term comparative genomics?
Figure 10.46 Dr Jenny Cox coordinates courses in Medical Radiations. Here she operates an X-ray machine. Notice on Jenny's coat the personal radiation monitor that measures her exposure to unseen radiation.
ODD FACT
Spontaneous mutations or 'sports' appear from time to time in domesticated plants and animals. These heritable mutations include the first appearance of short-legged (Ancon) sheep in 1791 on Seth Wright's farm in New England, United States.
Kinds of mutations
Here is part of the base sequence in part of the template strand of a DNA molecule:
10 20
ODD FACT
In the 1920s, flies were taken up in a balloon to a height of more than 18 kilometres and it was found that gene mutations in these flies occurred more than five times more rapidly at this height than at sea level. Can you suggest a possible explanation?
This mutation involves replacement (substitution) of G by C at position number 13. Addition: insertion of one or more nucleotides into the DNA strand. After exposure to a mutagenic agent, the original sequence is changed to:
10 20 AAT GTC GGT AGT C
This mutation involves addition of a T between original nucleotides (numbers 17 and 18). Deletion: removal of one or more nucleotides from the DNA strand. After exposure to a mutagenic agent, the original sequence is altered to:
10 20 AAT GCG GAG TC
This mutation has deleted T from between original nucleotides (numbers 13 and 15).
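The three kinds of mutation can be compared side by side in a short Python sketch. This is an illustration under stated assumptions: the starting sequence is a hypothetical one borrowed from the triplet-code example earlier in the chapter, not the sequence discussed in this passage, and the function names are our own. Positions are counted from 1, as in the text.

```python
def substitute(seq, position, new_base):
    """Substitution: replace the base at a 1-based position."""
    i = position - 1
    return seq[:i] + new_base + seq[i + 1:]


def add_after(seq, position, new_bases):
    """Addition: insert one or more bases immediately after a 1-based position."""
    return seq[:position] + new_bases + seq[position:]


def delete(seq, position, count=1):
    """Deletion: remove `count` bases starting at a 1-based position."""
    i = position - 1
    return seq[:i] + seq[i + count:]


def as_triplets(seq):
    """Group a sequence into threes so that changes to the triplets are easy to see."""
    return " ".join(seq[i:i + 3] for i in range(0, len(seq), 3))


original = "TACAAACAAGCTCCTACT"  # hypothetical example, not the sequence in the text

examples = [
    ("original", original),
    ("substitution", substitute(original, 13, "G")),
    ("addition", add_after(original, 17, "T")),
    ("deletion", delete(original, 14)),
]
for label, sequence in examples:
    print(f"{label:13}", as_triplets(sequence))
```

A substitution leaves the length of the sequence unchanged, whereas an addition or a deletion changes the length and so regroups every triplet that follows the point of change.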
Chemical mutagens include: mustard gas, peanut oil and ethyl methane sulfonate (EMS).
ODD FACT
Fragile-X syndrome is the most common inherited form of mental retardation in males and occurs at a frequency of about 1/1250 male births. Affected males often have long faces, large ears and, after puberty, may have enlarged testes (macro-orchidism). In many cases, this mutation is visibly expressed on the X chromosome.
In 1991, it was recognised that males affected by fragile-X syndrome (see figure 10.48) have from 200 to 2000 repeats of the CCG trinucleotide within their mutant FMR1 allele, in contrast to the normal 6 to about 50 repeats seen in unaffected persons. Similarly, males and females affected by Huntington disease have from 36 to over 100 repeats of the CAG trinucleotide in their mutant HD allele, while unaffected persons normally have from 6 to about 35 copies of this trinucleotide. Trinucleotide repeat expansion mutations tend to be unstable so that the number of repeats can change from one generation to the next. For example, the number of trinucleotide repeats has been observed to increase when the mutant HD allele is transmitted from an affected male parent to his children.
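Counting repeats is a task well suited to a computer. The sketch below is an illustration only: it uses the repeat ranges quoted above as if they were sharp cut-offs (the text gives some of them as approximate), the test allele is made up, and the names `RANGES`, `longest_repeat_run` and `classify` are our own.

```python
import re

# Repeat ranges quoted in the text. Counts that fall outside them
# (for example, between 50 and 200 for FMR1) are not described there.
RANGES = {
    "FMR1": {"unit": "CCG", "unaffected": (6, 50), "affected": (200, 2000)},
    "HD": {"unit": "CAG", "unaffected": (6, 35), "affected": (36, None)},
}


def longest_repeat_run(seq, unit):
    """Largest number of back-to-back copies of `unit` found in `seq`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", seq)
    return max((len(run) // len(unit) for run in runs), default=0)


def classify(gene, repeats):
    """Compare a repeat count with the ranges quoted in the text."""
    for status in ("unaffected", "affected"):
        low, high = RANGES[gene][status]
        if repeats >= low and (high is None or repeats <= high):
            return status
    return "outside the ranges quoted in the text"


allele = "GG" + "CAG" * 40 + "TT"  # made-up HD allele with 40 repeats
count = longest_repeat_run(allele, RANGES["HD"]["unit"])
print(count, "repeats:", classify("HD", count))
```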
Figure 10.48 (a) A male with fragile-X syndrome (b) Fragile X (left) and Y chromosomes from an affected male (c) A fragile X (left) and normal X chromosome from a carrier female
KEY IDEAS
The genetic material DNA is usually stable.
DNA can undergo change (mutation).
Mutations of DNA can vary and include deletions, substitutions and additions of nucleotides, as well as the type known as trinucleotide repeat expansions.
Agents that can cause mutations are known as mutagenic agents.
Mutations can be somatic or germline, and only germline mutations can be transmitted to the next generation.
DNA mutations often, but not always, have deleterious results.
QUICK-CHECK
28 How do a substitution mutation and a deletion mutation differ?
29 The DNA in the gametes of an industrial worker is altered because of exposure to a chemical.
   a Is this a spontaneous or an induced mutation?
   b Is this a somatic or a germline mutation?
   c Can this mutation be transmitted to the worker's children?
ODD FACT
Mutations that change bases at exon–intron junctions have serious effects on the gene product. Why?
An unexpected discovery about the genes in eukaryotes was made in 1977. Until then, the coding region of a gene was thought to be continuous (see figure 10.52a). Instead, the coding region is interrupted by other segments of DNA. Each segment of the coding region of a gene is called an exon. The exons are separated by lengths of DNA that do not contain instructions relating to the protein chain. These non-coding segments are called introns (see figure 10.52b).
A ROYAL MUTATION
Queen Victoria of England (1819–1901) had nine children, many of whom married into the European royal families that existed during the 1800s. Victoria carried a gene mutation. The mutation may have occurred in the germline cell in one of Victoria's parents. Another possibility is that the mutation occurred in the queen's gonadal tissues. This gene mutation affected the DNA of the F8C gene, which is located on the X chromosome and controls production of a blood-clotting substance.
Victoria's daughters, Alice and Beatrice, inherited from their mother an X chromosome with the mutated allele (h). Since these daughters inherited a normal allele (H) from their father, Albert, they were heterozygous carriers (Hh) of haemophilia. One of Victoria's sons, Leopold, inherited the mutation from his mother. Males are hemizygous for X-linked traits, so Leopold was genotype h (Y) and showed the recessive condition, haemophilia. In haemophilia, the blood fails to clot normally and internal bleeding can occur. Another of Victoria's sons, Albert (later King Edward VII of England), did not inherit this mutation and so haemophilia disappeared from the English royal family.
The haemophilia mutation was introduced into the royal families of Russia and Spain by Alice and Beatrice, who transmitted it to some of their children. One of Victoria's carrier granddaughters was Alexandra, who married Nikolas II, the Tsar of Russia. Their only son, Alexis, suffered from haemophilia. Nikolas and Alexandra became preoccupied with their son's condition and sought cures, particularly through the monk, Rasputin. Some people have speculated that this led to neglect of state matters by the Tsar and may have contributed to the Russian Revolution of 1917.
Figure 10.51 Victoria and her descendants. A germline mutation in Victoria, or one of her parents, caused haemophilia to appear in her family. This family group includes several of Victoria's daughters who were heterozygous carriers of this trait and passed it on to their sons. Will haemophilia appear in the sons or daughters of carrier females? (Individuals labelled in the figure: Edward VII, Alice of Hesse, Beatrice, George V, Waldemar, Henry, Alexis, Rupert, Alfonso, Gonzalo, Elizabeth II, Philip and Juan Carlos.)
So, genes are not like nursery rhymes in a book, where the reader starts at the beginning and reads through to the end. The information in genes is broken up into segments and the sections in between are filled with other printed material that is unrelated to the rhyme.
Figure 10.53 Regions of the template strand of a typical gene. What is the DNA sequence in the other DNA strand?
Here is an interrupted rhyme:
Hey, diddle diddle the cat and the fid HERE IS AN INTERRUPTION dle, the cow jumped over the AND HERE IS ANOTHER INTERRUPTION moon. The little dog laughed to see such fun and the dish HERE'S ANOTHER ran away with the spoon.
If this interrupted rhyme were thought of as a gene, how many exons and how many introns would it contain? The portions in ordinary type are like exons, and there are four of them. The interruptions (in capitals) are like introns, and there are three of them. They are removed from the mRNA before translation.
The number of exons and introns in genes varies. The DNA making up the HBB gene that controls the production of one chain of the haemoglobin molecules consists of three exons and two introns. The F8C gene that controls the production of Factor VIII that assists in blood clotting consists of 26 exons and 25 introns.
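The rhyme analogy can be acted out in a few lines of Python. This is a sketch of the analogy only, not of how cells actually remove introns; the list `rhyme_gene` and the function names are our own, and each piece of the rhyme has been labelled by hand as an exon or an intron.

```python
# Each piece of the interrupted rhyme, labelled by hand.
rhyme_gene = [
    ("exon", "Hey, diddle diddle the cat and the fid"),
    ("intron", "HERE IS AN INTERRUPTION"),
    ("exon", "dle, the cow jumped over the "),
    ("intron", "AND HERE IS ANOTHER INTERRUPTION"),
    ("exon", "moon. The little dog laughed to see such fun and the dish "),
    ("intron", "HERE'S ANOTHER"),
    ("exon", "ran away with the spoon."),
]


def count_segments(gene, kind):
    """Count the segments of one kind ('exon' or 'intron')."""
    return sum(1 for label, _ in gene if label == kind)


def splice(gene):
    """Remove the introns and join the exons in their original order."""
    return "".join(text for label, text in gene if label == "exon")


print(count_segments(rhyme_gene, "exon"), "exons")
print(count_segments(rhyme_gene, "intron"), "introns")
print(splice(rhyme_gene))
```

Note that a gene made of exons separated by introns always has one more exon than intron, as with the HBB gene (three exons, two introns) and the F8C gene (26 exons, 25 introns).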
KEY IDEAS
Each gene in eukaryote organisms contains a coding region, and also includes flanking regions upstream and downstream of the coding region.
The coding region of a gene typically consists of several exons separated or interrupted by introns.
QUICK-CHECK
30 Using words or diagrams, distinguish between the members of each of the following pairs:
   a intron and exon
   b coding region and flanking region.
31 True or false? All genes contain the same number of exons.
Watson commented:
. . . my stomach sank in apprehension . . . Seeing that neither Francis nor I could bear any further suspense, he (Peter) quickly told us that the model was a three-chain helix with the sugar-phosphate backbone in the centre. This sounded so suspiciously like our aborted effort of last year . . .
At first, Watson and Crick tried a three-chain model, with the sugar-phosphate backbones at the inside of the model. Watson wrote:
. . . we decided upon models in which the sugar-phosphate backbone was in the centre of the molecule. Only in that way would it be possible to obtain a structure regular enough to give the crystalline diffraction patterns observed by Rosy and Maurice. Our first few minutes with the models, though, were not joyous . . . After tea, however, a shape began to emerge which brought back our spirits. Three chains twisted about each other . . . Admittedly, a few of the atomic contacts were still too close for comfort, but, after all, the fiddling had just begun.
In fact, Pauling's model was incorrect. The critical breakthrough came when Rosalind Franklin prepared X-ray diffraction patterns of a different form of DNA (the B-form) (see figure 10.54). Watson wrote:
The instant I saw the picture my mouth fell open and my pulse began to race. The pattern was unbelievably simpler than those obtained previously. Moreover, the black cross of reflections which dominated the picture could arise only from a helical structure . . .
He later recalled:
Then as the train jerked towards Cambridge, I tried to decide between the two- and three-chain models . . . Thus by the time I had cycled back to college and climbed over the back gate, I had decided to build two-chain models. Francis would have to agree. Even though he was a physicist, he knew that important biological objects come in pairs.
Eventually, Watson and Crick realised that this three-chain model had major faults. Watson wrote:
A fresh start would be necessary to get the problem rolling again.
Fitting two chains together
Over the next months, Watson and Crick tried to build a two-chain model, but they were still working with the incorrect idea that the sugar-phosphate backbones were in the centre of the molecule and the bases on the outside. Watson said:
. . . for a day and a half I tried to find a suitable two-chain model with the backbone in the centre . . . Though I kept insisting that we should keep the backbone in the centre, I knew none of my reasons held water. But the real stumbling block was the bases. As long as they were outside, we did not have to consider them. If they were pushed inside, the frightful problem existed of how to pack together two or more chains with irregular sequences of bases.
Chargaff's rule provides a clue
A new avenue of exploration was raised by the ratio of the four bases in DNA:
The moment was thus appropriate to think seriously about some curious regularities in DNA chemistry . . . the number of adenine (A) molecules was very similar to the number of thymine (T) molecules, while the number of guanine (G) molecules was very close to the number of cytosine (C) molecules. Back in my rooms I lit the coal fire . . . With my fingers too cold to write legibly I huddled next to the fireplace, daydreaming about how several DNA chains could fold together in a pretty and hopefully scientific way.
At that time, Linus Pauling, an American scientist, developed his model for the structure of DNA. He sent the details of his model through his son Peter to Watson and Crick, who at first were disappointed to think that they had been beaten to the answer.
The sugar-phosphate backbones were arranged on the outside of the model; they posed no further problem, but the nagging problem of what to do with the bases still remained:
I went ahead spending most evenings at the films, vaguely dreaming that at any moment the answer would suddenly hit me . . . Even during good films I found it almost impossible to forget the bases. Thus, unless some very special trick existed, randomly twisting two polynucleotide chains around one another should result in a mess.
Watson first considered the pairing of identical bases, that is, A with A, T with T, and so on:
I thus started wondering whether each DNA molecule consisted of two chains with identical base sequences held together by hydrogen bonds between pairs of identical bases. For over two hours I happily lay awake with pairs of adenine residues whirling in front of my eyes. Only for brief moments did fear shoot through me that an idea this good could be wrong.
Watson and Crick announced their double-helix model in a short article that briefly outlines a discovery that ranks as one of the major discoveries of the twentieth century: Watson, J. D. and Crick, F. H. C. (1953) 'Molecular structure of nucleic acids: a structure for deoxyribose nucleic acid', Nature, vol. 171, pp. 737–738.
You can read what Francis Crick thought at the following website: www.accessexcellence.org/AE/AEC/CC/crick.html.
The DNA double helix structure was at last identified. The pairing between the bases in DNA involves hydrogen bonding between complementary bases, not identical bases. This model with A–T and C–G pairs between the chains made sense in terms of Chargaff's rule that the number of As was about equal to the number of Ts, and that the number of Gs was about the same as the number of Cs.
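The link between complementary pairing and Chargaff's rule can be checked with a short Python sketch. This is an illustration under simplifying assumptions: the example strand is made up, the two chains are written in the same left-to-right order, and the function names are our own.

```python
from collections import Counter

# Complementary base pairs in the double helix.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complementary_strand(strand):
    """Write the partner base for every base along one chain."""
    return "".join(PAIRS[base] for base in strand)


def base_counts(strand):
    """Count each base across both chains of the double helix."""
    return Counter(strand + complementary_strand(strand))


one_chain = "TACAAACAAGCTCCTACT"  # made-up example sequence
other_chain = complementary_strand(one_chain)
counts = base_counts(one_chain)

print(one_chain)
print(other_chain)
print("A =", counts["A"], " T =", counts["T"])
print("G =", counts["G"], " C =", counts["C"])
```

Whatever sequence is chosen for one chain, every A is partnered by a T and every G by a C, so the totals across the two chains must match. Pairing identical bases (A with A, T with T) would give no such guarantee.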
BIOCHALLENGE
The human genome of about 3000 million base pairs includes 20 000 to 25 000 protein-coding genes and large regions of non-coding DNA. In this biochallenge, you will access two databases relating to the human genome. Do not worry about the complexity of some of the information in the database. The purpose is simply to familiarise you with some aspects of using major genetic databases.
Exploring human genes
Information about known human genes and inherited disorders comprises a large body of data held in the public database called Online Mendelian Inheritance in Man (OMIM). Go to www.jaconline.com.au/natureofbiology/natbiol2-3e and click on the OMIM weblink for this chapter. (Scroll to the bottom of the first page and read the note.)
Exploring the nucleotide sequence
The Genome Browser Gateway web site contains the nucleotide sequence of the genomes of many organisms. It allows scientists to explore many aspects of different genomes. Your biochallenge is simply to access the site, recognise the complexity and amount of data held there and see the base sequence of a tiny portion of one of the human chromosomes. Go to www.jaconline.com.au/natureofbiology/natbiol2-3e and click on the Genome Browser weblink for this chapter.
Locate OMIM Facts on the left menu, and select Statistics.
a How many genes and DNA markers are included in the database on the date that you access it?
b How many genes have been mapped to specific loci on the various chromosomes?
c Has every gene in the database been mapped to a specific chromosomal locus?
Return to the left menu and locate OMIM, then select Search Gene Map. Conduct a search on the BRCA1 gene symbol. A table similar to the one below will appear.
Location | Symbol | Title | MIM # | Disorder
17q21 | BRCA1 | | 113705 | Breast cancer
1 2 3 4 5
Locate the Genome eld, which defaults to Human. What other vertebrate genomes are in the database? Ensure that you have Human selected in the genome box. The Assembly eld shows the date of the version of the human genome you are using. The Position or search term eld defaults to chr7:127,471,196127,495,720. This means: chromosome-7, starting from base number 127 471 196 and ending at base 127 495 720. Click on the Submit button to reveal a set of complex information about this region of the chromosome. At the top of this display, locate the zoom button labelled base. Press this button to see the start of the base sequence of this region of chromosome-7 just a series of As, Ts, Cs and Gs. a What are the rst ten bases shown in this sequence? b If you went to any other sequence in any other human chromosome, would you expect to see anything other than a series of As, Ts, Cs and Gs? c What would you expect to see if you viewed the genome of another organism, such as a mouse or a carrot?
a The Title gives the name of the gene. What is the title of the BRCA1 gene? b The Location identies the gene locus. Where is this gene located? On the web site, click on the Location entry, and this will take you to a chromosome map. Notice the large number of genes and DNA markers on this chromosomal region. c Identify another inherited disorder very close (or linked) to the BRCA1 gene locus. Return to the table, and click on the MIM# entry to read a description of the gene. d Scroll down to the section on Clinical features. List three clinical features of this condition. e Scroll down further to the section on Inheritance. What is the mode of inheritance? You can search the OMIM database by gene name (such as FRM1, F8C or CTRF) or by chromosome regions (such as Xp21.1 or 4p16.3) or by a disorder (such as achondroplasia or alkaptonuria). You might like to try another search.
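The Genome Browser position string used in this biochallenge, of the form chr7:127,471,196-127,495,720, packs a chromosome name, a start base and an end base into one piece of text. A minimal Python sketch of how such a string can be pulled apart is given here. The function name is hypothetical, and counting both the start and end bases as part of the region is an assumption made for illustration, not a statement about how the web site itself counts.

```python
def parse_position(position: str) -> tuple[str, int, int]:
    """Split 'chrN:start-end' into chromosome name, start base and end base."""
    chromosome, _, span = position.partition(":")
    start_text, _, end_text = span.partition("-")
    # Remove the thousands separators before converting to whole numbers.
    start = int(start_text.replace(",", ""))
    end = int(end_text.replace(",", ""))
    return chromosome, start, end


chromosome, start, end = parse_position("chr7:127,471,196-127,495,720")

# Assumption: both the start base and the end base belong to the region.
region_length = end - start + 1

print(chromosome, start, end)
print("bases in region:", region_length)
```

Under that assumption the default region is 24 525 bases long, a tiny fraction of the roughly 3000 million base pairs in the human genome.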
CHAPTER REVIEW
Key words
adenine, base, base pairs, base sequence, Chargaff's rule, chromosomes, coding region, comparative genomics, complementary base pairs, cytosine, D-loop, decoded, deoxyribonucleic acid (DNA), deoxyribose, dihybrid, dissociation, DNA sequencers, dominant, double helix, encoded, exon, flanking regions, frameshift, gene, gene duplication,
gene sequence, gene sequencing, genetic code, genome, genomics, germline mutation, guanine, horizontal gene transfer, Human Genome Project (HGP), hybridisation, hydrogen bonds, induced mutation, introns, Mendel's factors, monohybrid, mutagenic agents, mutation, nucleic acids, nucleotide sequence, nucleotides, promoters, proteomics, purines, pyrimidines, re-association, recessive,
retroviruses, single nucleotide polymorphisms, somatic mutation, spontaneous mutation, TATA box, template strand, thymine, transforming factor, trinucleotide repeat expansion (TRE), triplet code
Questions
1 Making connections between concepts Use at least eight of the key words above to prepare a concept map on DNA structure. You may add any other concepts that you wish.
2 Developing explanations Suggest explanations for the following:
a The statement that 'Genes are made of DNA' is absent from textbooks published before the mid-1940s.
b In a DNA double helix, the number of adenine molecules can be used to predict the number of thymine molecules.
c In a single strand of DNA, the number of adenine molecules cannot be used to predict the number of thymine molecules.
d In a DNA double helix, the ratio (A + C)/(T + G) is equal to 1.
e The information encoded in a piece of DNA consisting of only As (AAAAAAAAAAAAAAAAAA etc.) is decoded as a protein containing only one kind of amino acid.
f The information in a DNA template strand consisting of more than 100 nucleotides was greatly changed by the removal of just one base.
NATURE OF BIOLOGY BOOK 2
Figure 10.55 Egyptian number (figure not reproduced)
3 Applying knowledge Each answer is a number:
a When four nucleotides join to form a chain, how many sugar-phosphate bonds are formed?
b A sequence of 12 nucleotides within the coding region of a gene encodes the information to join how many amino acids into a protein?
c A protein consists of a chain of 22 amino acids. What is the minimum number of nucleotides needed to encode the instructions to join these amino acids to form this protein?
d How many nucleotides are in the START signal of a gene?
e The total length of DNA from a human sperm cell.
f The number of different kinds of amino acids that can be found in proteins.
g This number corresponds to the distance in nanometres between adjacent pairs of bases in a DNA double helix.
h The number of DNA molecules in a chromosome.
4 Analysing data Ancient Egyptians used a system of numbers that was based on a code. Examine the data in figure 10.55.
a How many elements of this code are shown?
b Try to crack the code by assigning a numeric meaning to each element.
c Why do biologists talk about a genetic code?
5 Demonstrating your understanding Some biologists have likened parts of the genetic code to letters, words and sentences. To which of these would the following correspond:
a TTA?
b G?
c GCG?
d TAGCGTGTAGGCCTGTTGCAAA?
e In the English language, words are of different lengths, such as 'ox' and 'rhinoceros'. Is this also true of the 'words' in the genetic code?
6 Demonstrating your understanding Samples of DNA were analysed and the following proportions were found:
DNA   Sample 1   Sample 2   Sample 3
A     1.2        0.8        1.0
C     1.2        0.8        0.7
G     1.2        0.4        0.7
T     1.2        0.4        1.0
Identify which of the above sample(s) could be double-stranded DNA. Explain.
7 Evaluating information Refer to pages 377-8 relating to the discovery of the double helix structure for DNA. Using Watson and Crick as examples of scientists, identify the following statements as true (T) or false (F) and briefly explain your choice:
a Scientists might spend more time planning experiments than doing them.
b Scientists start investigations without reference to the work of other scientists.
c Discoveries occur only in laboratories as a result of experiments.
d Unplanned inspiration can play a role in scientific discoveries.
NATURE, STRUCTURE AND ORGANISATION OF THE GENETIC MATERIAL
8 Demonstrating knowledge and understanding
a What is Chargaff's rule?
b After demonstrating that the proportions of A and T and of C and G were about equal in the DNA from many organisms, Chargaff and his co-workers wrote:
'A comparison of the molar proportions reveals certain striking, but perhaps meaningless, regularities.' (1949)
Does this fact suggest that scientists will be aware of the importance of facts or regularities that they discover? Explain.
c The paper describing Watson and Crick's 3-D model of DNA appeared in the journal Nature, vol. 171, page 737 (1953). Visit a library or search the Internet and locate this article. Does this article suggest that major discoveries can be explained only in long articles that are difficult to understand?
9 Applying principles Consider part of the template strand of the coding sequence of a gene from a flowering gum that includes the base sequence at left:
a Write the sequence in the corresponding portion of the complementary strand.
b Refer to the genetic code (page 656) and copy and complete the following table.
Code (in DNA)   Decoded information
_____           Start with amino acid, met
_____           Now add the amino acid, ser
Figure 10.56 (graph not reproduced: horizontal axis, Radiation (Roentgen units), 0 to 6000; vertical axis scale, 0 to 1500)
c If this sequence were from the DNA template strand from a monkey, would it translate in a different way from that of the flowering gum?
d Describe the result of each of the following DNA mutations separately:
i deletion of nucleotide number 10
ii substitution of nucleotide number 13 (G) by A.
10 Demonstrating your understanding DNA is often described as the 'blueprint of life'.
a Explain why this is a reasonable analogy.
b Which of the following is DNA more like: i a recipe for a cake? ii the ingredients that go into a cake? Explain.
11 Interpreting data presented graphically Examine the graph in figure 10.56:
a What variable is plotted along the horizontal axis? Identify the units of measurement.
b What variable is plotted along the vertical axis? Identify the units of measurement.
c Describe the general trend shown in the graph.
d What approximate dosage of radiation produced a mutation rate of about 0.1?
12 Interpreting data and making predictions
a Explain why people working in a situation involving exposure to radiation wear protective clothing or radiation monitoring devices.
b Which organs of the human body are at risk of a germline mutation?
c A segment of the eye tissue of a developing fruit-fly undergoes a spontaneous mutation so that, after metamorphosis, the adult is a red-eyed fly with a white sector in the right eye. Would you predict that offspring of this fly in the next generation would show this mutant phenotype? Explain.
13 Demonstrating knowledge
a Define a mutagenic agent.
b Give an example of a chemical mutagen.
c Identify two types of radiation that are known to be mutagenic.
14 Analysing data Refer back to figure 10.43 on page 368. Examine this figure and answer the following questions.
a Which cat chromosome is depicted here?
b Which human chromosome is shown?
c The human chromosome is labelled as Hsa6 from Homo sapiens. i What is meant by Hsa21? ii Suggest what might be meant by FcaB2.
d Approximately how many gene loci are shown on the cat chromosome?
e The dotted lines link matching (homologous) gene loci that occur in both species. How many homologous genes exist in the two species on these chromosomes?
f In general, where genes are conserved between two species, is their order on the chromosomes also more or less conserved?
15 Using the web In 2005, the National Human Genome Research Institute published a press release regarding the detailed analysis of human chromosomes 2 and 4. Go to www.jaconline.com.au/natureofbiology/natbiol2-3e, click on the Gene Deserts weblink for this chapter and answer the following questions:
a How many genes have been identified on chromosome 2 and on chromosome 4?
b List two gene loci of interest on chromosome 4.
c Chromosome 2 carries the locus of the gene that encodes genetic instructions for the muscle protein titin. What is distinctive about this gene (and this protein)?
d What is a gene desert?
e Regions of gene desert have been conserved in the evolution of birds and in the evolution of mammals. What does this finding suggest?
f Humans have 23 pairs of chromosomes while chimps (and other great apes) have 24 pairs. i What long-standing hypothesis existed to explain how this reduction in chromosome number arose? ii Does the detailed analysis of human chromosome 2 provide support for this hypothesis? Explain.