Documenti di Didattica
Documenti di Professioni
Documenti di Cultura
A MAP OF GENETICS
Mendel’s Modified
second Mendelian
law 3 ratios 6
Evolution
1, 14, 18, 19, 20
Linkage 4, 17
Independent Chromosome
Mendel’s assortment 3 rearrangements
first law 2 Population 17
Crossing-over 4, 16 genetics 18, 19, 20 Single gene
mutations
Recombination 2, 16
3, 4, 16
Equal Quantitative
segregation 2 inheritance 3, 19
Transposons
15
Inheritance of more Inheritance and Epigenetics
than one gene recombination of
Chromosomal 12
3, 4, 19 genes on bacterial
genes 2, 3, 4
chromosomes
and plasmids and of
Single gene phage genes 5, 10 Transcription Translation
inheritance 8, 11, 12 9, 11, 12
2 Meiosis and
mitosis in
eukaryotes
THE GENOME
2, 3 DNA Gene Gene
(DNA, genes,
replication chromosomes) regulation interaction
Organelle 1, 7 11, 12, 14 6, 11, 12, 13, 14
1, 2, 3, 5, 7, 14, 15
gene inheritance
3
Development
11, 12, 13
The map displays the general divisions of genetics in boxes, with arrows
showing the main connections between them covered in this book.
Orange, broadly, is inheritance, purple is function, and green is change.
Numbers are chapters covering the topic, with main discussions in bold.
Introduction to
Genetic
Analysis
About the Authors
[ Barbara Moon.]
[ Iqbal Pittawala.]
[ Sean Carroll.]
[ John Doebley.]
Introduction to
Genetic
Analysis
ELEVENTH EDITION
Anthony J. F. Griffiths
University of British Columbia
Susan R. Wessler
University of California, Riverside
Sean B. Carroll
Howard Hughes Medical Institute
University of Wisconsin–Madison
John Doebley
University of Wisconsin–Madison
Publisher Kate Ahr Parker
Senior Acquisitions Editor Lauren Schultz
Executive Marketing Manager John Britch
Marketing Assistant Bailey James
Developmental Editor Erica Champion
Editorial Assistant Alexandra Garrett
Supplements Editor Erica Champion
Executive Media Editor Amanda Dunning
Media Editors Donna Broadman,
Erica Champion, Tue Tran
Art Director Diana Blume
Cover and Interior Designer Vicki Tomaselli
Senior Project Editor Jane O’Neill
Permissions Manager Jennifer MacMillan
Photo Editor Richard Fox
Illustration Coordinator Janice Donnola
Illustrations Dragonfly Media Group
Production Supervisor Susan Wein
Composition and Layout Sheridan Sellers
Printing and Binding RR Donnelley
Cover Photo Susan Schmitz/Shutterstock
Hardcover: ISBN-13: 978-1-4641-0948-5
ISBN-10: 1-4641-0948-6
Loose-Leaf: ISBN-13: 978-1-4641-8804-6
ISBN-10: 1-4641-8804-1
First printing
www.macmillanhighered.com
Contents in Brief Contents
v
vi CONTENTS
3.5 Organelle Genes: Inheritance Independent F plasmids that carry genomic fragments 188
of the Nucleus 110 R plasmids 188
Patterns of inheritance in organelles 111 5.3 Bacterial Transformation 191
Cytoplasmic segregation 113 The nature of transformation 191
Cytoplasmic mutations in humans 115 Chromosome mapping using transformation 191
mtDNA in evolutionary studies 116 5.4 Bacteriophage Genetics 192
Infection of bacteria by phages 192
4 Mapping Eukaryote Chromosomes Mapping phage chromosomes by using phage
by Recombination 127 crosses 194
4.1 Diagnostics of Linkage 129 5.5 Transduction 196
Using recombinant frequency to recognize linkage 129 Discovery of transduction 196
How crossovers produce recombinants for linked Generalized transduction 197
genes 132 Specialized transduction 198
Linkage symbolism and terminology 132 Mechanism of specialized transduction 200
Evidence that crossing over is a breakage-and-
rejoining process 133 5.6 Physical Maps and Linkage Maps Compared 201
Evidence that crossing over takes place at
the four-chromatid stage 133 6 Gene Interaction 215
Multiple crossovers can include more than two 6.1 Interactions Between the Alleles of a
chromatids 134 Single Gene: Variations on Dominance 216
4.2 Mapping by Recombinant Frequency 135 Complete dominance and recessiveness 216
Map units 136 Incomplete dominance 218
Three-point testcross 139 Codominance 219
Deducing gene order by inspection 141 Recessive lethal alleles 220
CONTENTS vii
6.2 Interaction of Genes in Pathways 223 Small nuclear RNAs (snRNAs): the mechanism
of exon splicing 307
Biosynthetic pathways in Neurospora 224
Self-splicing introns and the RNA world 308
Gene interaction in other types of pathways 226
8.5 Small Functional RNAs That Regulate
6.3 Inferring Gene Interactions 227
and Protect the Eukaryotic Genome 310
Sorting mutants using the complementation test 227
miRNAs are important regulators of gene expression 310
Analyzing double mutants of random mutations 231
siRNAs ensure genome stability 311
6.4 Penetrance and Expressivity 239 Similar mechanisms generate siRNA and miRNA 314
PART Ii From DNA to Phenotype 9 Proteins and Their Synthesis 319
9.1 Protein Structure 322
7 Structure and Replication 259 9.2 The Genetic Code 324
7.1 DNA: The Genetic Material 260 Overlapping versus nonoverlapping codes 325
Discovery of transformation 261 Number of letters in the codon 325
Hershey–Chase experiment 263 Use of suppressors to demonstrate a triplet code 325
Degeneracy of the genetic code 327
7.2 DNA Structure 264
Cracking the code 328
DNA structure before Watson and Crick 264
Stop codons 329
The double helix 267
9.3 tRNA: The Adapter 329
7.3 Semiconservative Replication 270
Codon translation by tRNA 331
Meselson–Stahl experiment 271
Degeneracy revisited 331
The replication fork 272
DNA polymerases 273 9.4 Ribosomes 332
Ribosome features 333
7.4 Overview of DNA Replication 274
Translation initiation, elongation, and termination 335
7.5 The Replisome: A Remarkable Replication
Nonsense suppressor mutations 338
Machine 277
9.5 The Proteome 339
Unwinding the double helix 279
Assembling the replisome: replication initiation 280 Alternative splicing generates protein isoforms 339
Posttranslational events 340
7.6 Replication in Eukaryotic Organisms 280
Eukaryotic origins of replication 280
10 Gene Isolation and Manipulation 351
DNA replication and the yeast cell cycle 281
Replication origins in higher eukaryotes 282 10.1 Overview: Isolating and Amplifying
Specific DNA Fragments 353
7.7 Telomeres and Telomerase: Replication
Termination 283 10.2 Generating Recombinant DNA Molecules 354
Genomic DNA can be cut up before cloning 355
8 RNA: Transcription and Processing 291 The polymerase chain reaction amplifies selected
regions of DNA in vitro 356
8.1 RNA 293
DNA copies of mRNA can be synthesized 358
Early experiments suggest an RNA intermediate 293
Attaching donor and vector DNA 358
Properties of RNA 294
Amplification of donor DNA inside a bacterial cell 362
Classes of RNA 294
Making genomic and cDNA libraries 366
8.2 Transcription 296
10.3 Using Molecular Probes to Find and
Overview: DNA as transcription template 296
Analyze a Specific Clone of Interest 367
Stages of transcription 298
Finding specific clones by using probes 367
8.3 Transcription in Eukaryotes 301
Finding specific clones by functional
Transcription initiation in eukaryotes 303 complementation 369
Elongation, termination, and pre-mRNA Southern- and Northern-blot analysis of DNA 371
processing in eukaryotes 304
10.4 Determining the Base Sequence of a
8.4 Intron Removal and Exon Splicing 307 DNA Segment 374
viii CONTENTS
10.5 Aligning Genetic and Physical Maps to Gal4 regulates multiple genes through upstream
Isolate Specific Genes 377 activation sequences 436
Using positional cloning to identify a human-disease The Gal4 protein has separable DNA-binding
gene 378 and activation domains 438
Using fine mapping to identify genes 379 Gal4 activity is physiologically regulated 439
Gal4 functions in most eukaryotes 439
10.6 Genetic Engineering 382
Activators recruit the transcriptional machinery 440
Genetic engineering in Saccharomyces cerevisiae 383
The control of yeast mating type: combinatorial
Genetic engineering in plants 383 interactions 440
Genetic engineering in animals 386 12.3 Dynamic Chromatin 443
Chromatin-remodeling proteins and gene activation 444
11 Regulation of Gene Expression Modification of histones 445
in Bacteria and Their Viruses 397 Histone methylation can activate or repress
11.1 Gene Regulation 399 gene expression 448
The basics of prokaryotic transcriptional The inheritance of histone modifications and
regulation: genetic switches 400 chromatin structure 448
A first look at the lac regulatory circuit 401 Histone variants 449
11.2 Discovery of the lac System: Negative Control 404 DNA methylation: another heritable mark that
influences chromatin structure 449
Genes controlled together 405
12.4 Activation of Genes in a Chromatin
Genetic evidence for the operator and repressor 405
Environment 450
Genetic evidence for allostery 407
The β-interferon enhanceosome 451
Genetic analysis of the lac promoter 408
Enhancer-blocking insulators 452
Molecular characterization of the Lac repressor
and the lac operator 408 12.5 Long-Term Inactivation of Genes in a
Genetic analysis of the lac promoter 408 Chromatin Environment 454
Molecular characterization of the Lac repressor Mating-type switching and gene silencing 454
and the lac operator 408 Heterochromatin and euchromatin compared 455
11.3 Catabolite Repression of the lac Operon: Position-effect variegation in Drosophila reveals
Positive Control 409 genomic neighborhoods 456
Genetic analysis of PEV reveals proteins necessary
The basics of lac catabolite repression:
for heterochromatin formation 457
choosing the best sugar to metabolize 410
The structures of target DNA sites 410 12.6 Gender-Specific Silencing of Genes
and Whole Chromosomes 460
A summary of the lac operon 411
Genomic imprinting explains some unusual
11.4 Dual Positive and Negative Control:
patterns of inheritance 460
The Arabinose Operon 413
But what about Dolly and other cloned mammals? 461
11.5 Metabolic Pathways and Additional
Silencing an entire chromosome: X-chromosome
Levels of Regulation: Attenuation 414 inactivation 462
11.6 Bacteriophage Life Cycles: More 12.7 Post-Transcriptional Gene Repression
Regulators, Complex Operons 417 by miRNAs 463
Molecular anatomy of the genetic switch 421
Sequence-specific binding of regulatory proteins 13 The Genetic Control of Development 469
to DNA 422
11.7 Alternative Sigma Factors Regulate Large 13.1 The Genetic Approach to Development 471
Sets of Genes 423 13.2 The Genetic Toolkit for Drosophila
Development 474
Classification of genes by developmental function 474
12 Regulation of Gene Expression
Homeotic genes and segmental identity 474
in Eukaryotes 431
Organization and expression of Hox genes 476
12.1 Transcriptional Regulation in Eukaryotes:
The homeobox 478
An Overview 432
Clusters of Hox genes control development in
12.2 Lessons from Yeast: The GAL System 436 most animals 479
CONTENTS ix
13.3 Defining the Entire Toolkit 482 14.7 Functional Genomics and Reverse
The anteroposterior and dorsoventral axes 483 Genetics 536
Expression of toolkit genes 484 “’Omics” 536
Turning sequence reads into an assembled sequence 511 Transposable elements in the human genome 568
16.3 The Molecular Basis of Induced Mutations 593 The inbreeding coefficient 680
Mechanisms of mutagenesis 593 Population size and inbreeding 682
The Ames test: evaluating mutagens in our 18.4 Genetic Variation and Its Measurement 684
environment 595
18.5 The Modulation of Genetic Variation 687
16.4 Biological Repair Mechanisms 596
New alleles enter the population: mutation and
Direct reversal of damaged DNA 597 migration 687
Base-excision repair 598 Recombination and linkage disequilibrium 689
Nucleotide-excision repair 599 Genetic drift and population size 691
Postreplication repair: mismatch repair 602 Selection 696
Error-prone repair: translesion DNA synthesis 604 Forms of selection 698
Repair of double-strand breaks 606 Balance between mutation and drift 702
The involvement of DSB repair in meiotic Balance between mutation and selection 703
recombination 608
18.6 Biological and Social Applications 704
16.5 Cancer: An Important Phenotypic
Conservation genetics 704
Consequence of Mutation 609
Calculating disease risks 705
How cancer cells differ from normal cells 609
DNA forensics 706
Mutations in cancer cells 609
Googling your DNA mates 707
S
ince its first edition in 1974, Introduction to Genetic Analysis has emphasized
the power and incisiveness of the genetic approach in biological research
and its applications. Over its many editions, the text has continuously ex-
panded its coverage as the power of traditional genetic analysis has been extended
with the introduction of recombinant DNA technology and then genomics. In the
eleventh edition, we continue this tradition and show how the flowering of this
powerful type of analysis has been used for insight into research in biology, agri-
culture, and human health.
Pedagogical Tools
One of the important new features in this edition is the inclusion of
lists of learning outcomes at the beginning of each chapter. Learning LEARNING OUTCOMES
outcomes are crucial components of understanding. One of the tenets After completing this chapter, you will be able to
of the constructivist theory of learning is that although understand- • Perform a quantitative analysis of the
ing might be a series of new mental circuits, the learner can never be progeny of a dihybrid testcross to assess
sure of what is in his or her brain until called upon for some type of whether or not the two genes are linked on
performance. Indeed, understanding has even been defined by some the same chromosome.
as flexible performance capacity. The lists of goals show learners what • Extend the same type of analysis to several
precise performances are expected of them. The notes that follow loci to produce a map of the relative positions
of loci on a chromosome.
show how the benefits of the learning outcomes in this book can be
maximized for instructors who wish to use them. • In ascomycete fungi, map the centromeres
to other linked loci.
Classroom sessions large and small (for example, lectures and
tutorials) should be structured as far as possible on learning out- • In asci, predict allele ratios stemming from
specific steps in the heteroduplex model of
comes closely paralleling those in these chapters. At various stages crossing over.
in the classes students should be asked to demonstrate their under-
standing of the material just covered by attaining one or more learn-
ing outcomes. In writing examination or test questions, the instructor should try
to stick closely to learning outcomes. When reviewing test results, show in what
ways the outcomes have been attained or not attained by the learner.
Students should read the list of learning outcomes before embarking on a
chapter. Although it will not be possible to understand most of them before read-
ing the chapter, their wording gives a good idea of the lay of the land, and shows
the extent of what the instructor’s expectations are. Ideally, after reading a sec-
tion of the chapter, it is a good idea for a student to go back to the list and match
the material covered to an outcome. This process should be repeated at the end
of the chapter by scanning the sections and making a complete match with each
outcome as far as possible. In solving the end-of-chapter problems, try to focus
effort on the skills described in the learning outcomes. Students should use the
learning outcomes for rapid review when studying for exams; they should try to
imagine ways that they will be expected to demonstrate understanding through
the application of the outcomes.
The general goal of a course in genetics is to learn how to think and work like a
geneticist. The learning outcomes can fractionate this general goal into the many
different skills required in this analytical subject.
In this edition we have replaced “Messages” with “Key Concepts.” Messages
have been in the book since its first edition in 1974. In the 1960s and 1970s, per-
haps due to the popularity of Marshall McLuhan’s principle “The medium is the
message,” the word message was in common use, and teachers were often asked,
“What is your message?” Although with the rise of electronic media it is perhaps
time for a resurgence of McLuhan’s principle, we felt that the word message no
longer has the meaning it had in 1974.
xiii
xiv PREFACE
Flood-intolerant and flood-tolerant rice SUB1 gene increases rice yield under flooding
6.0
5.0
4.0
Yield (t ha–1)
3.0
2.0
Swarna
1.0
Swarna-Sub1
0.0
0 5 10 15 20 25 30
Duration of submergence (days)
M
A
A A H4 H3
Genome surveillance: Cutting-edge research in trans- Ser
A Lys
Lys
M A A
A
M
Lys Lys A A
posable elements has uncovered genome surveillance Lys
Ser Lys
Lys
Lys
Lys
Ser Lys
systems in plants, animals, and bacteria similar to that
previously identified in C. elegans. Chapter 15 now pro-
F igure 12 -13 (a) Histone tails protrude from the nucleosome core (purple).
vides an overview of piRNAs in animals and crRNAs in (b) Examples of histone tail modifications are shown. Circles with A represent
bacteria, and allows students to compare and contrast acetylation while circles with M represent methylation. See text for details.
those approaches to Tc1 elements in worms and MITEs in
plants.
Transcription Transcription
Inactive element
not transcribed
Processing Processing
piRNA mRNA
piwi-
Argonaute Translation
Enduring Features
Coverage of model organisms
The eleventh edition retains the enhanced coverage of model systems in formats
that are practical and flexible for both students and instructors.
• Chapter 1 introduces some key genetic model organisms and highlights some of
the successes achieved through their use.
• Model Organism boxes presented in context where appropriate provide
additional information about the organism in nature and its use experimentally.
• A Brief Guide to Model Organisms, at the back of the book, provides quick
access to essential, practical information about the uses of specific model
organisms in research studies.
• An Index to Model Organisms, on the endpapers at the back of the book,
provides chapter-by-chapter page references to discussions of specific
organisms in the text, enabling instructors and students to easily find and
assemble comparative information across organisms.
Problem sets
No matter how clear the exposition, deep understanding requires the student to
personally engage with the material. Hence our efforts to encourage student prob-
lem solving. Building on its focus on genetic analysis, the eleventh edition provides
students with opportunities to practice problem-solving skills—both in the text and
online through the following features.
• Versatile Problem Sets. Problems span the full range of degrees of
difficulty. They are categorized according to level of difficulty—basic or
challenging.
• Working with the Figures. An innovative set of problems included at the
back of each chapter asks students pointed questions about figures in the
chapter. These questions encourage students to think about the figures and
help them to assess their understanding of key concepts.
• Solved Problems. Found at the end of each chapter, these worked examples
illustrate how geneticists apply principles to experimental data.
• Unpacking the Problems. A genetics problem draws on a complex matrix
of concepts and information. “Unpacking the Problem” helps students learn
to approach problem solving strategically, one step at a time, concept on
concept.
• NEW Multiple-choice versions of the end-of-chapter
problems are available on our online LaunchPad for quick gradable quizzing
and easily gradable homework assignments. The Unpacking the Problem
tutorials from the text have been converted to in-depth online tutorials and
expanded to help students learn to solve problems and think like a geneticist.
New videos demonstrate how to solve selected difficult problems.
Includes all the electronic resources listed below for teachers. Contact your W. H.
Freeman sales representative to learn how to log on as an instructor.
e-Book
The e-Book fully integrates the text and its interactive media in a format that features
a variety of helpful study tools (full-text, Google-style searching; note taking; book-
marking; highlighting; and more). Available as a stand-alone item or on the LaunchPad.
Clicker Questions
Jump-start discussions, illuminate important points, and promote better conceptual
understanding during lectures.
Layered PowerPoint Presentations
Illuminate challenging topics for students by deconstructing intricate genetic con-
cepts, sequences, and processes step-by-step in a visual format.
All Images from the Text
More than 500 illustrations can be downloaded as JPEGs and PowerPoint slides. Use
high-resolution images with enlarged labels to project clearly for lecture hall presen-
tations. Additionally, these JPEG and PowerPoint files are available without labels
for easy customization in PowerPoint.
67 Continuous-Play Animations
A comprehensive set of animations, updated and expanded for the eleventh edition,
covers everything from basic molecular genetic events and lab techniques to analyzing
crosses and genetic pathways. The complete list of animations appears on page xix.
xviii Preface
Assessment Bank
This resource brings together a wide selection of genetics problems for use in test-
ing, homework assignments, or in-class activities. Searchable by topic and pro-
vided in MS Word format, as well as in LaunchPad and Diploma, the assessment
bank offers a high level of flexibility.
Animations
Sixty-seven animations are fully integrated with the content and figures in the text
chapters. These animations are available on the LaunchPad and the Book Compan-
ion site.
CHAPTER 1
A Basic Plant Cross (Figure 1-3)
The Central Dogma (Figure 1-10)
CHAPTER 2
Mitosis (Chapter Appendix 2-1)
Meiosis (Chapter Appendix 2-2)
X-Linked Inheritance in Flies (Figure 2-17)
CHAPTER 3
Punnett Square and Branch Diagram Methods for Predicting the Outcomes
of Crosses (Figure 3-4)
Meiotic Recombination Between Unlinked Genes by Independent Assortment
(Figures 3-8 and 3-13)
Analyzing a Cross: A Solved Problem (Solved Problem 2)
CHAPTER 4
Crossing Over Produces New Allelic Combinations (Figures 4-2 and 4-3)
Meiotic Recombination Between Linked Genes by Crossing Over (Figure 4-7)
A Molecular Model of Crossing Over (Figure 4-21)
A Mechanism of Crossing Over: A Heteroduplex Model (Figure 4-21)
A Mechanism of Crossing Over: Genetic Consequences of the Heteroduplex Model
Mapping a Three-Point Cross: A Solved Problem (Solved Problem 2)
CHAPTER 5
Bacterial Conjugation and Mapping by Recombination (Figures 5-11 and 5-17)
CHAPTER 6
Interactions Between Alleles at the Molecular Level, RR: Wild-Type
Interactions Between Alleles at the Molecular Level, rr: Homozygous
Recessive, Null Mutation
Interactions Between Alleles at the Molecular Level, r ′r ′: Homozygous
Recessive, Leaky Mutation
Interactions Between Alleles at the Molecular Level, Rr: Heterozygous,
Complete Dominance
Screening and Selecting for Mutations
A Model for Synthetic Lethality (Figure 6-20)
CHAPTER 7
DNA Replication: The Nucleotide Polymerization Process (Figure 7-15)
DNA Replication: Coordination of Leading and Lagging Strand Synthesis
(Figure 7-20)
DNA Replication: Replication of a Chromosome (Figure 7-23)
xx Preface
CHAPTER 8
Transcription in Prokaryotes (Figures 8-7 to 8-10)
Transcription in Eukaryotes (Figures 8-12 and 8-13)
Mechanism of RNA Splicing (Figures 8-16 and 8-17)
CHAPTER 9
Peptide-Bond Formation (Figure 9-2)
tRNA Charging (Figure 9-7)
Translation (Figure 9-14 to 9-16)
Nonsense Suppression at the Molecular Level: The rod ns Nonsense Mutation
(Figure 9-18)
Nonsense Suppression at the Molecular Level: The tRNA Nonsense Suppressor
(Figure 9-18)
Nonsense Suppression at the Molecular Level: Nonsense Suppression of the rod ns
Allele (Figure 9-18)
CHAPTER 10
Polymerase Chain Reaction (Figure 10-3)
Plasmid Cloning (Figure 10-9)
Finding Specific Cloned Genes by Functional Complementation: Functional
Complementation of the Gal− Yeast Strain and Recovery of the Wild-Type
GAL gene
Finding Specific Cloned Genes by Functional Complementation: Making a Library
of Wild-Type Yeast DNA
Finding Specific Cloned Genes by Functional Complementation: Using the Cloned
GAL Gene as a Probe for GAL mRNA
SDS Gel Electrophoresis and Immunoblotting
Dideoxy Sequencing of DNA (Figure 10-17)
Creating a Transgenic Mouse (Figures 10-29 and 10-30)
CHAPTER 11
Regulation of the Lactose System in E. coli: Assaying Lactose Presence/Absence
Through the Lac Repressor (Figure 11-6)
Regulation of the Lactose System in E. coli: OC lac Operator Mutations
(Figure 11-8)
Regulation of the Lactose System in E. coli: I− Lac Repressor Mutations
(Figure 11-9)
Regulation of the Lactose System in E. coli: IS Lac Superrepressor Mutations
(Figure 11-10)
CHAPTER 12
Three-Dimensional Structure of Nuclear Chromosomes (Figure 12-11)
Gal4 Binding and Activation (Figures 12-6 through 12-9)
Chromatin Remodeling (Figures 12-13 and 12-14)
CHAPTER 13
Drosophila Embryonic Development
Sex Determination in Flies (Figure 13-23)
CHAPTER 14
DNA Microarrays: Using an Oligonucleotide Array to Analyze Patterns of
Gene Expression (Figure 14-20)
DNA Microarrays: Synthesizing an Oligonucleotide Array
Yeast Two-Hybrid Systems (Figure 14-21)
Preface xxi
CHAPTER 15
Replicative Transposition (Figure 15-9)
Life Cycle of a Retrovirus (Figure 15-11)
The Ty1 Mechanism of Retrotransposition (Figures 15-13 and 15-14)
CHAPTER 16
Replication Slippage Creates Insertion or Deletion Mutations (Figure 16-8)
UV-Induced Photodimers and Excision Repair (Figure 16-19)
Base-Excision Repair, Nucleotide Excision Repair, and Mismatch Repair
(Figures 16-20, 16-22, and 16-23)
CHAPTER 17
Autotetraploid Meiosis (Figure 17-6)
Meiotic Nondisjunction at Meiosis I (Figure 17-12)
Meiotic Nondisjunction at Meiosis II (Figure 17-12)
Chromosome Rearrangements: Paracentric Inversion, Formation of
Paracentric Inversions (Figure 17-27)
Chromosome Rearrangements: Paracentric Inversion, Meiotic Behavior
of Paracentric Inversions (Figure 17-28)
Chromosome Rearrangements: Reciprocal Translocation, Formation of
Reciprocal Translocations (Figure 17-30)
Chromosome Rearrangements: Reciprocal Translocation, Meiotic Behavior
of Reciprocal Translocations (Figure 17-30)
Chromosome Rearrangements: Reciprocal Translocation, Pseudolinkage
of Genes by Reciprocal Translocations (Figure 17-32)
Acknowledgments
We extend our thanks and gratitude to our colleagues who reviewed this edition and whose insights and advice were
most helpful:
Shirlean Goodwin, University of Memphis Ann Murkowski, North Seattle Community College
Topher Gee, UNC Charlotte Saraswathy Nair, University of Texas at Brownsville
John Graham, Berry College Sang-Chul Nam, Texas A&M International University
Theresa Grana, University of Mary Washington Scot Nelson, University of Hawaii at Manoa
Janet Guedon, Duquesne University Brian Nichols, University of Illinois at Chicago
Patrick Gulick, Concordia University Todd Nickle, Mount Royal University
Richard Heineman, Kutztown University Juliet Noor, Duke University
Anna Hicks, Memorial University Mohamed Noor, Duke University
Susan Hoffman, Miami University Daniel Odom, California State University, Northridge
Stanton Hoegerman, College of William and Mary Kirk Olsen, East Los Angeles College
Margaret Hollingsworth, University at Buffalo Kavita Oommen, Georgia State University
Nancy Huang, Colorado College Maria Orive, University of Kansas
Jeffrey Hughes, Millikin University Laurie Pacarynuk, University of Lethbridge
Varuni Jamburuthugoda, Fordham University Patricia Phelps, Austin Community College
Pablo Jenik, Franklin & Marshall College Martin Poenie, University of Texas at Austin
Aaron Johnson, University of Colorado School Jennifer Powell, Gettysburg College
of Medicine Robyn Puffenbarger, Bridgewater College
Anil Kapoor, University of La Verne Jason Rauceo, John Jay College (CUNY)
Jim Karagiannis, University of Western Ontario Eugenia Ribiero-Hurley, Fordham University
Kathleen Karrer, Marquette University Ronda Rolfes, Georgetown University
Jessica Kaufman, Endicott College Edmund Rucker, University of Kentucky
Darrell Killian, Colorado College Jeffrey Sands, Lehigh University
Dennis Kraichely, Cabrini College Monica Sauer, University of Toronto at Scarborough, UTSC
Anuj Kumar, University of Michigan Ken Saville, Albion College
Janice Lai, Austin Community College Pratibha Saxena, University of Texas at Austin
Evan Lau, West Liberty University Jon Schnorr, Pacific University
Min-Ken Liao, Furman University Malcolm Schug, University of North Carolina at
Sarah Lijegren, University of Mississippi Greensboro
Renyi Liu, University of California, Riverside Deborah Schulman, Lake Erie College
Diego Loayza, Hunter College Allan Showalter, Ohio University
James Lodolce, Loyola University Chicago Elaine Sia, University of Rochester
Joshua Loomis, Nova Southeastern University Robert Smith, Nova Southeastern University
Amy Lyndaker, Ithaca College Joyce Stamm, University of Evansville
Jessica Malisch, Claremont McKenna College Tara Stoulig, Southeastern Louisiana University
Patrick Martin, North Carolina A&T State University Julie Torruellas Garcia, Nova Southeastern University
Presley Martin, Hamline University Virginia Vandergon, California State University,
Dmitri Maslov, University of California, Riverside Northridge
Maria Julia Massimelli, Claremont McKenna College Charles Vigue, University of New Haven
Endre Mathe, Vasile Goldis Western University of Arad Susan Walsh, Rollins College
Herman Mays, University of Cincinnati Michael Watters, Valparaiso University
Thomas McGuire, Penn State Abington Roger Wartell, Georgia Institute of Technology
Mark Meade, Jacksonville State University Matthew White, Ohio University
Ulrich Melcher, Oklahoma State University Dwayne Wise, Mississippi State University
Philip Meneely, Haverford College Andrew Wood, Southern Illinois University
Ron Michaelis, Rutgers University Mary Alice Yund, UC Berkeley Extension
Chris Mignone, Berry College Malcom Zellars, Georgia State University
Sarah Mordan-McCombs, Franklin College of Indiana Deborah Zies, University of Mary Washington
Preface xxiii
1
C h a p t e r
The Genetics
Revolution
Learning Outcomes
After completing this chapter, you will be able to
outline
1.1 The birth of genetics
1.3 Genetics today
1
2 CHAPTER 1 The Genetics Revolution
G
enetics is a form of information science. Geneticists seek to understand
the rules that govern the transmission of genetic information at three lev-
els—from parent to offspring within families, from DNA to gene action
within and between cells, and over many generations within populations of organ-
isms. These three foci of genetics are known as transmission genetics, molecular-
developmental genetics, and population-evolutionary genetics. The three parts of
this text examine these three foci of genetics.
The science of genetics was born just over 100 years ago. Since that time,
genetics has profoundly changed our understanding of life, from the level of the
individual cell to that of a population of organisms evolving over millions of years.
In 1900, William Bateson, a prominent British biologist, wrote presciently that an
“exact determination of the laws of heredity will probably work more change in
man’s outlook on the world, and in his power over nature, than any other advance
in natural knowledge that can be foreseen.” Throughout this text, you will see the
realization of Bateson’s prediction. Genetics has driven a revolution in both the
biological sciences and society in general.
In this first chapter, we will look back briefly at the history of genetics, and in
doing so, we will review some of the basic concepts of genetics that were discov-
ered over the last 100 years. After that, we will look at a few examples of how
genetic analysis is being applied to critical problems in biology, agriculture, and
human health today. You will see how contemporary research in genetics inte-
grates concepts discovered decades ago with recent technological advances. You
will see that genetics today is a dynamic field of investigation in which new dis-
coveries are continually advancing our understanding of the biological world.
heights, from short to tall, and we have not all narrowed in on a single average Gregor Mendel
height despite the many generations that human populations have dwelled on
Earth.
Two gene
copies
First-generation
hybrid
Self-pollination
Second-generation
hybrids
Eggs Sperm
of offspring from this cross all had purple flowers, just like
one of the parents. There was no blending. Then, Mendel self-
pollinated the first-generation hybrid plants and grew a sec-
ond generation of offspring. Among the progeny, he saw
plants with purple flowers as well as plants with white flow-
ers. Of the 929 plants, he recorded 705 with purple flowers
and 224 with white flowers (Figure 1-4). He observed that
there were roughly 3 purple-flowered plants for every 1 white-
flowered plant.
How did Mendel explain his results? Clearly, blending the-
ory would not work since that theory predicts a uniform group
of first-generation hybrid plants with light purple flowers. So
Mendel proposed that the factors that control traits act like
particles rather than fluids and that these particles do not
blend together but are passed intact from one generation to
the next. Today, Mendel’s particles are known as genes.
Mendel proposed that each individual pea plant has two
copies of the gene controlling flower color in each of the cells of
the plant body (somatic cells). However, when the plant forms
sex cells, or gametes (eggs and sperm), only one copy of the
gene enters into these reproductive cells (see Figure 1-3). Then,
when egg and sperm unite to start a new individual, once again
there will be two copies of the flower color gene in each cell of
the plant body.
Mendel had some further insights. He proposed that the
F I G U R E 1- 4 Excerpts from Mendel’s gene for flower color comes in two gene variants, or alleles—
1866 publication, Versuche über Pflanzen- one that conditions purple flowers and one that conditions white flowers. He pro-
Hybriden (Experiments on plant hybrids).
posed that the purple allele of the flower color gene is dominant to the white
[ Augustinian Abbey in Old Brno, Courtesy of
the Masaryk University, Mendel Museum.]
allele such that a plant with one purple allele and one white allele would have
purple flowers. Only plants with two white alleles would have white flowers (see
Figure 1-3). Mendel’s two conclusions, (1) that genes behaved like particles that
do not blend together and (2) that one allele is dominant to the other, enabled
him to explain the lack of blending in the first-generation hybrids and the re-
appearance of white-flowered plants in the second-generation hybrids with a
3 : 1 ratio of purple- to white-flowered plants. This revolutionary advance in our
understanding of inheritance will be fully discussed in Chapter 2.
How did Mendel get it right when so many others before him were wrong? Mendel
chose a good organism and good traits to study. The traits he studied were all
controlled by single genes. Traits that are controlled by several genes, as many
traits are, would not have allowed him to discover the laws of inheritance so easily.
Mendel was also a careful observer, and he kept detailed records of each of his
experiments. Finally, Mendel was a creative thinker capable of reasoning well
Introduction to Genetic Analysis, 11e beyond the ideas of his times.
Figure 01.04 #104 Mendel’s particulate theory of inheritance was published in 1866 in the Pro-
04/15/14 ceedings of the Natural History Society of Brünn (see Figure 1-4). At that time, his
05/01/14 work was noticed and read by some other biologists, but its implications and
Dragonfly Media Group
importance went unappreciated for over 30 years. Unlike Charles Darwin, whose
discovery of the theory of evolution by natural selection made him world-
renowned virtually overnight, when Mendel died in 1884, he was more or less
unknown in the world of science. As biochemist Erwin Chargaff put it, “There are
people who seem to be born in a vanishing cap. Mendel was one of them.”
F I G U R E 1- 6 Students at
the Connecticut Agriculture
College in 1914 show a
range of heights. Ronald
Fisher proposed that
continuously variable traits
like human height are
controlled by multiple
Mendelian genes. [ A. F.
Blakeslee, “Corn and Men,”
Journal of Heredity 5, 11,
4:10 4:11 5:0 5:1 5:2 5:3 5:4 5:5 5:6 5:7 5:8 5:9 5:10 5:11 6:0 6:1 6:2
1914, 511–518.]
6 CHAPTER 1 The Genetics Revolution
1903 Walter Sutton and Theodor Boveri hypothesized that chromosomes are the hereditary elements. 4
1905 William Bateson introduced the term “genetics” for the study of inheritance. 2
1908 G. H. Hardy and Wilhelm Weinberg proposed the Hardy–Weinberg law, the foundation for population 18
genetics.
1910 Thomas H. Morgan demonstrated that genes are located on chromosomes. 4
1913 Alfred Sturtevant made a genetic linkage map of the Drosophila X chromosome, the first genetic map. 4
1918 Ronald Fisher proposed that multiple Mendelian factors can explain continuous variation for traits, founding 19
the field of quantitative genetics.
1931 Harriet Creighton and Barbara McClintock showed that crossing over is the cause of recombination. 4, 16
1941 Edward Tatum and George Beadle proposed the one-gene—one-polypeptide hypothesis. 6
1944 Oswald Avery, Colin MacLeod, and Maclyn McCarty provided compelling evidence that DNA is the genetic 7
material in bacterial cells.
1946 Joshua Lederberg and Edward Tatum discovered bacterial conjugation. 5
1948 Barbara McClintock discovered mobile elements (transposons) that move from one place to another in the 15
genome.
1950 Erwin Chargaff showed DNA composition follows some simple rules for the relative amounts of A, C, G, and T. 7
1952 Alfred Hershey and Martha Chase proved that DNA is the molecule that encodes genetic information. 7
1953 James Watson and Francis Crick determined that DNA forms a double helix. 7
1958 Matthew Meselson and Franklin Stahl demonstrated the semiconservative nature of DNA replication. 7
1958 Jérôme Lejeune discovered that Down syndrome resulted from an extra copy of the 21st chromosome. 17
1961 François Jacob and Jacques Monod proposed that enzyme levels in cells are controlled by feedback 11
mechanisms.
1961– Marshall Nirenberg, Har Gobind Khorana, Sydney Brenner, and Francis Crick "cracked" the genetic code. 9
1967
1968 Motoo Kimura proposed the neutral theory of molecular evolution. 18, 20
1977 Fred Sanger, Walter Gilbert, and Allan Maxam invented methods for determining the nucleotide sequences of 10
DNA molecules.
1980 Christiane Nüsslein-Volhard and Eric F. Wieschaus defined the complex of genes that regulate body plan 13
development in Drosophila.
1989 Francis Collins and Lap-Chee Tsui discovered the gene causing cystic fibrosis. 4, 10
1993 Victor Ambrose and colleagues described the first microRNA. 13
1995 First genome sequence of a living organism (Haemophilus influenzae) published. 14
1996 First genome sequence of a eukaryote (Saccharomyces cerevisiae) published. 14
1998 First genome sequence of an animal (Caenorhabditis elegans) published. 14
2000 First genome sequence of a plant (Arabidopsis thaliana) published. 14
2001 The sequence of the human genome first published. 14
2006 Andrew Fire and Craig Mello win the Nobel prize for their discovery of gene silencing by double-stranded RNA. 8
2012 John Gurdon and Shinya Yamanaka win the Nobel prize for their discovery that just four regulatory genes can 8, 12
convert adult cells into stem cells.
1.1 The Birth of Genetics 7
was that continuous traits are each controlled by multiple Mendelian genes. Fisher’s
insight is known as the multifactorial hypothesis. In Chapter 19, we will dissect the
FPO
mathematical model and experimental evidence for Fisher’s hypothesis.
How do genes function inside cells in a way that enables them to control different
states for a trait like flower color? In 1941, Edward Tatum and George Beadle proposed
that genes encode enzymes. Using bread mold (Neurospora crassa) as their experi-
mental organism, they demonstrated that genes encode the enzymes that perform
metabolic functions within cells (Figure 1-7). In the case of the pea plant, there is a
gene that encodes an enzyme required to make the purple pigment in the cells of a
flower. Tatum and Beadle’s breakthrough became known as the one-gene–one-
enzyme hypothesis. You’ll see how they developed this hypothesis in Chapter 6.
What is the physical nature of the gene? Are genes composed of protein, nucleic
acid, or some other substance? In 1944, Oswald Avery, Colin MacLeod, and Mac-
lyn McCarty offered the first compelling experimental evidence that genes are
made of deoxyribonucleic acid (DNA). They showed that DNA extracted from a
virulent strain of bacteria carried the necessary genetic information to transform
a nonvirulent strain into a virulent one. You’ll learn exactly how they demon-
strated this in Chapter 7.
How can DNA molecules store information? In the 1950s, there was something
of a race among several groups of geneticists and chemists to answer this ques-
tion. In 1953, James Watson and Francis Crick working at Cambridge University in
England won that race. They determined that the molecular structure of DNA
was in the form of a double helix—two strands of DNA wound side-by-side in a
spiral. Their structure of the double helix is like a twisted ladder (Figure 1-8). The
sides of the ladder are made of sugar and phosphate groups. The rungs of the lad-
der are made of four bases: adenine (A), thymine (T), guanine (G), and cyto-
sine (C). The bases face the center, and each base is hydrogen bonded to the base
facing it in the opposite strand. Adenine in one strand is always paired with thy-
mine in the other by a double hydrogen bond, whereas guanine is always paired
with cytosine by a triple hydrogen bond. The bonding specificity is based on the
complementary shapes and charges of the bases. The sequence of A, T, G, and C
represents the coded information carried by the DNA molecule. You will learn in
Chapter 7 how this was all worked out.
How are genes regulated? Cells need mechanisms to turn genes on or off in spe-
cific cell and tissue types and at specific times during development. In 1961, François
Jacob and Jacques Monod made a conceptual breakthrough on this question. Work-
Introduction to Genetic Analysis, 11e
ing on01.07
Figure the genes
#123 necessary to metabolize the sugar lactose in the bacterium Esche-
richia coli, they demonstrated that genes have regulatory elements that regulate
04/03/14
gene expression—that
Dragonfly Media Group is, whether a gene is turned on or off (Figure 1-9). The regu-
latory elements are specific DNA sequences to which a regulatory protein binds and
acts as either an activator or repressor of the expression of the gene. In Chapter 11,
you will explore the logic behind the experiments of Jacob and Monod with E. coli,
and in Chapter 12, you will explore the details of gene regulation in eukaryotes.
8 CHAPTER 1 The Genetics Revolution
O 5´
P O
3´ O
O H N O
H
5´ CH2
O T N H N A O 4´
2´
3´ O 1´ 3´
4´ 1´ 2´
O O
5´ CH2
P O
O O
O P O H N O
O CH2
O NC O
GN H
O 3´
O N H
O
CH2
P O
O O
O P N H O O
O 5´ CH
H N T
O 2
AN O
O
O
O
CH2
P O
O O
O P N H O O
O CH2
O C N H N G O
O H N
O O
CH2 H
O 3´
O P
C
(a) (b) 5´ O
How is the information stored in DNA decoded to synthesize proteins? While the
discovery of the double-helical structure of DNA was a watershed for biology,
many details were still unknown. Precisely how information was encoded into
DNA and how it was decoded to form the enzymes that Tatum and Beadle had
shown to be the workhorses of gene action remained unknown. Over the years
1961 through 1967, teams of molecular geneticists and chemists working in several
countries answered these questions when they “cracked the genetic code.” What
this means is that they deduced how a string of DNA nucleotides, each with one
of four different bases (A, T, C, or G), encodes the set of 20 different amino acids
that are the building blocks of proteins. They also discovered that there is a mes-
senger molecule made of ribonucleic acid (RNA) that carries information in the
DNA in the nucleus to the cytoplasm where proteins are synthesized. By 1967, the
basic flowchart for information transmission in cells was known. This flowchart is
called the central dogma of molecular biology.
F I G U R E 1- 9 The structure of a
Genes have regulatory and coding regions protein-coding gene showing a regulatory
DNA element (GGGCCC) to which a
RNA polymerase regulatory protein binds, the promoter
Regulatory complex
region where the RNA polymerase
protein complex binds to initiate transcription, and
Direction of
transcription a protein-coding region
GGGCCC
Regulatory Site where the Protein coding
element RNA polymerase sequence
complex binds
Replication
Transcription Translation
DNA RNA Protein
(a)
F I G U R E 1-10 (a) One version of Francis Crick’s sketch of the central dogma, showing
information flow between biological molecules. The circular arrow represents DNA
replication, the central straight arrow represents the transcription of DNA into RNA, and the
right arrow the translation of RNA into protein. (b) More detailed sketch showing how the
two strands of the DNA double helix are independently replicated, how the two strands are
disassociated for transcription, and how the messenger RNA (mRNA) is translated into
protein at the ribosome.
10 CHAPTER 1 The Genetics Revolution
complete copy of all the DNA in the parent cell. In Chapter 7, you will explore the
details of the structure of DNA and its replication.
Another arrow connects DNA to RNA, symbolizing how the sequence of base
pairs in a gene (DNA) is copied to an RNA molecule. The process of RNA synthe-
sis from a DNA template is called transcription. One class of RNA molecules
made by transcription is messenger RNA, or mRNA for short. mRNA is the tem-
plate for protein synthesis. In Chapter 8, you’ll discover how transcription is
accomplished.
The final arrow in Figure 1-10b connects mRNA and protein. This arrow sym-
bolizes protein synthesis, or the translation of the information in the specific
sequence of bases in the mRNA into the sequence of amino acids that compose a
protein. Proteins are the workhorses of cells, comprising enzymes, structural com-
ponents of the cell, and molecules for cell signaling. The process of translation takes
place at the ribosomes in the cytoplasm of each cell. In Chapter 9, you will learn how
the genetic code is written in three-letter words called codons. A codon is a set of
three consecutive nucleotides in the mRNA that specifies an amino acid in a pro-
tein. CGC specifies the amino acid arginine, AGC specifies serine, and so forth.
Since Crick proposed the central dogma, additional pathways of genetic infor-
mation flow have been discovered. We now know that there are classes of RNA
that do not code for proteins, instances in which mRNA is edited after transcrip-
tion, and cases in which the information in RNA is copied back to DNA (see Chap-
ters 8, 9, and 15).
Model organisms
Geneticists make special use of a small set of model organisms for genetic analy-
sis. A model organism is a species used in experimental biology with the pre-
sumption that what is learned from the analysis of that species will hold true for
other species, especially other closely related species. The philosophy underlying
the use of model organisms in biology was wryly expressed by Jacques Monod:
“Anything found to be true of E. coli must also be true of elephants.”1
As genetics matured and focused on model organisms, Mendel’s pea plants
fell to the wayside, but Morgan’s fruit flies rose to prominence to become one of
the most important model organisms for genetic research. New species were
added to the list. An inconspicuous little plant that grows as a weed called Arabi-
dopsis thaliana became the model plant species and a minute roundworm called
Caenorhabditis elegans that lives in compost heaps became a star of genetic analy-
sis in developmental biology (Figure 1-11).
What features make a species suitable as a model organism? (1) Small organisms
that are easy and inexpensive to maintain are very convenient for research. So fruit
flies are good, blue whales not so good. (2) A short generation time is imperative
because geneticists, like Mendel, need to cross different strains and then study their
1F. Jacob and J. Monod, Cold Spring Harbor Quant. Symp. Biol. 26, 1963, 393.
1.2 After Cracking the Code 11
Fruit fly
Drosophila Nematode
melanogaster Caenorhabditis elegans
Yeast
Saccharomyces
cerevisiae
Mouse
Mus musculus
Mouse-eared cress
Arabidopsis
thaliana
Eukaryotes
Mycoplasma Archaea
gentalium
Bacillus
subtilis
Helicobacter
pylori E. coli
Eubacteria
FIGURE 1-11 The tree shows evolutionary relationships among the major groups of
organisms: Bacteria, Archaea, and Eukaryota (plants, fungi, and animals). [ (Clockwise, from top,
FPO
center) Sinclair Stammers/Science Source; SciMAT/Science Source; Darwin Dale/Science Source;
Biophoto Associates/Science Photo Library; Imagebroker.net/SuperStock; © blickwinkel/Alamy.]
first- and second-generation hybrids. The shorter the generation time, the sooner
the experiments can be completed. (3) A small genome is useful. As you will learn
in Chapter 15, some species have large genomes and others small genomes in terms
of the total number of DNA base pairs. Much of the extra size of large genome spe-
cies is composed of repetitive DNA elements between the genes. If a geneticist is
looking for genes, these can be more easily found in organisms with smaller genomes
and fewer repetitive elements. (4) Organisms that are easy to cross or mate and that
produce large numbers of offspring are best.
As you read this textbook, you will encounter certain organisms over and over.
Organisms such as Escherichia coli (a bacterium), Saccharomyces cerevisiae (baker’s
yeast), Caenorhabditis elegans (nematode or roundworm), Drosophila melanogas-
Introduction to Genetic Analysis, 11e
ter (fruit fly),
Figure 01.11 and Mus musculus (mice) have been used repeatedly in experiments
#127
and revealed much of what we know about how inheritance works. Model organ-
04/04/14
Dragonfly
isms can Media Group
be found on diverse branches of the tree of life (see Figure 1-11), repre-
senting bacteria, fungi, algae, plants, and invertebrate and vertebrate animals.
12 CHAPTER 1 The Genetics Revolution
This diversity enables each geneticist to use a model best suited to a particular
question. Each model organism has a community of scientists working on it who
share information and resources, thereby facilitating each other’s research.
Mendel’s experiments were possible because he had several different varieties
of pea plants, each of which carried a different genetic variant for traits such as
purple versus white flowers, green versus yellow seeds, or tall versus dwarf stems.
For each of the model species, geneticists have assembled large numbers of variet-
ies (also called strains or stocks) with special genetic characters that make them
useful in research. There are strains of fruit flies that have trait variants such as
red versus white eyes. There are strains of mice that are prone to develop specific
forms of cancer or other disease conditions such as diabetes. For baker’s yeast,
there is a collection of nearly 5000 deletion stocks, each of these having just one
gene deleted from the genome. These stocks enable geneticists to study the func-
tion of each gene by examining how yeast is affected when the gene is removed.
Since baker’s yeast has about 6000 total genes, this collect of 5000 deletion stocks
covers most of the genes in the genome.
The different strains of each model organism are available to researchers
through stock centers that maintain and distribute the strains. Lists of available
stocks are on the Internet (see Appendix B). To view an example for mouse stocks,
go to the link http://jaxmice.jax.org/. Then, click the “Find JAX mice” button at
the top of the page. Next, enter the word “black” in the search field and click the
Search button. Now, click the “C57BL/6J” link. You will see an image and informa-
tion on a commonly used C57-Black mouse strain. Other search terms such as
“albino” or “obese” will link you with strains with other features.
5′ 3′
5′ 3′ 5′ 3′
Heat Cool
Denature Anneal
5′ 3′ 5′ 3′
3′ 5′
(a) (b)
F I G U R E 1-13 (a) The two strands of the DNA double helix can be dissociated by heat in aqueous
solutions. Upon cooling under controlled conditions, strands reassociate, or hybridize, with their
complement. (b) A cloned copy of the human BAPX1 gene was tagged with a green fluorescent dye.
The fluorescent-tagged DNA was then denatured and allowed to hybridize to the chromosomes in a
single cell. The fluorescent-tagged clone hybridized to the location on chromosome 4 (green
fluorescent regions) where the gene is located. [ (b) C. Tribioli and T. Lufkin, “Molecular cloning,
chromosomal mapping and developmental expression of BAPX1, a novel human homeobox-containing gene
homologous to Drosophila bagpipe,” Gene, 203, 2, 1997, 225–233, Fig. 6, © Elsevier.]
Au-ED,
14 CHAPTER 1 The Genetics Revolution
“You have this clear, tangible phenomenon in which children resemble their
parents. Despite what students get told in elementary-school science, we just
don’t know how that works.”
(a) (b)
both parents is only 1 in 1024. In Chapter 2, you’ll learn how to calculate such
probabilities.
With this hint from the family history, the UDP team now knew where to look
in the genome for the mutant gene. They needed to look for a segment on one of
the chromosomes for which the copy that Benge inherited from her mother is
identical to the copy she inherited from her father. Moreover, each of Benge’s
siblings must also have two copies of this segment identical to Benge’s. Such
regions are very rare in people unless their parents are related, as in the case of
Benge since her parents are third cousins. Generally, a segment of a chromosome
that is just a few hundred base pairs long will have several differences in the
sequence of A’s, C’s, G’s, and T’s between the copy we inherited from our mother
and the one we inherited from our
Introduction father. Analysis,
to Genetic These differences
11e are known as single
nucleotide polymorphisms, or SNPs
Figure 1UN2 #138 for short (see Box 1-1).
The UDP team used 04/03/14
a new genomic technology, called a DNA microarray (see
05/01/14
Chapter 18), that allowed them to study one million base-pair positions across the
Dragonfly Media Group
genome. At each of these base-pair positions along the chromosomes, the team
could see where Benge’s two chromosomal segments were identical, and whether
all of Benge’s siblings also carried two identical copies in this segment. For Benge,
a portion of only 1/512 of her genome is expected to have two identical copies,
and the chance that all four of her siblings will also have the same two identical
copies is far smaller.
Looking over the genome-wide SNP data, the UDP team found exactly the
type of chromosome segment for which they were looking. There was a small seg-
ment on one of Benge’s chromosomes for which she and her siblings all had the
same two identical copies. Furthermore, they discovered that the gene that
encodes the CD73 enzyme is located in this segment. This result suggested that
Benge and her siblings all had two identical copies of the same defective CD73-
encoding gene. The team seemed to have found the needle in a haystack for which
they were looking; however, there was one last experiment to perform.
The team needed to identify the specific defect in the defective CD73 gene
that Benge and her siblings had inherited. After determining the DNA sequence
for the CD73 gene from Benge and her siblings, the team found the defect in the
gene—“the smoking gun.” The defective gene encoded only a short, or truncated,
protein—it did not encode the complete sequence of amino acids. One of the DNA
1.3 Genetics Today 17
codons with letters TCG that encodes the amino acid serine was mutated to TAG,
which signals the truncation of the protein. The protein made from Benge’s ver-
sion of the CD73 gene was truncated so it could not signal cells in the arteries to
keep the calcification pathway turned off.
Louise Benge’s journey from first experiencing pain in her legs to learning
that she had a new disease called ACDC was a long one. The diagnosis of her dis-
ease was a triumph made possible by the integration of classic transmission genet-
ics and genomics. Knowing the defect underlying the disease ACDC allowed the
doctors to try a medication that they would never have considered before they
knew that the cause was a defective CD73 enzyme. The medication in question is
called etidronate, and it can substitute for CD73 in signaling cells to keep the cal-
cification pathway turned off. Clinical trials with etidronate are currently under-
way for ACDC patients and are scheduled for completion in 2017.
Mother Father
Copy M1 • • C AGCAGA T TGC TGC T T TGT A TGAG • • Copy F1 • • C AGC TGA T TGC TGC T T TGT AGGAG • •
Copy M2 • • C AGC TGA T TGC TGC T T TGT A TGAG • • Copy F2 • • C AAC TGA T TGC TGC T T TGT A TGAG • •
Child
F I G U R E 1-17 A short segment of DNA individual has two copies of the segment. Notice that copy M1 in the mother has
from one of the chromosomes is shown. a SNP (green letter) that distinguishes it from copy M2. Similarly, there are two
Each individual has two copies of the SNPs (purple letters) that distinguish the father’s two copies of this segment.
segment. In the mother, these are labeled
Comparing the child to the parents, we see that the child inherited copy M1 from
M1 and M2; in the father, F1 and F2. The
child inherited copy M1 from its mother its mother and copy F2 from its father. Look closer at the child’s two copies of the
and F2 from its father. The version of F2 in segment, and you’ll notice something else. There is a unique variant (red letter)
the child carries a new point mutation that occurs in the child but neither of its parents. This is a de novo point mutation.
(red). Single nucleotide polymorphisms It this case, it is a mutation from a guanine (G) to a thymine (T). We can see that
(SNPs) that distinguish the different copies the mutation arose in the father since it is on the F2 copy of the segment.
are shown in green (mother) and purple Where and exactly when did the new mutation depicted in Figure 1-17 arise?
(father).
Most of our bodies are composed of somatic cells that make up everything from
our brain to our blood. However, we also have a special lineage of cells called the
germline that divide to produce eggs in women and sperm in men. New muta-
tions that arise in somatic cells as they divide during the growth and development
of our bodies are not passed on to our offspring. However, a new mutation that
occurs in the germline can be transmitted to the offspring. The mutation depicted
in Figure 1-17 arose in the germline of the father.
With the genome sequence data for the trios, the Icelandic geneticists made
some pretty startling discoveries. First, among the 78 children in the study, they
Introduction to Genetic Analysis,observed
11e a total of 4933 new point mutations. Each child carried about 63 unique
Figure 01.17 #131 mutations that did not exist in its parents. Most of these occurred in parts of the
04/03/14
05/01/14
genome where they have only a small chance to pose a health risk, but 62 of the
Dragonfly Media Group 4933 mutations caused potentially damaging changes to the genes such that they
altered the amino acid sequence of the protein encoded. Second, among the muta-
tions that could be assigned a parent of origin, there were on average 55 from the
father for every 14 from the mother. The children were inheriting nearly four
times as many new mutations from their fathers as their mothers. The Icelandic
team had confirmed Haldane’s prediction made 90 years earlier.
The genome sequences also allowed the team to test Weinberg’s prediction
that the frequency of mutation rises with the age of the parents. For each trio, the
researchers knew the ages of the mother and the father at the time of conception.
When they investigated whether the frequency of mutation rises with the moth-
er’s age when controlling for the age of the father, the team found no evidence
that it did. Older mothers did not pass on more new point mutations to their off-
spring than younger ones. (Older mothers are known to produce more chromo-
somal aberrations than younger mothers, such as an extra copy of the 21st
chromosome that causes Down syndrome; see Chapter 17.) Next, they examined
the relationship between mutation and the age of the father when controlling for
the age of the mother. Here, they found a powerful relationship. The older the
father, the higher the frequency of new point mutations (Figure 1-18). In fact, for
1.3 Genetics Today 19
80
60
40
15 20 25 30 35 40 45
Age of father at conception of child (years)
each year of increase in his age, a father will pass on two additional new mutations
to his children. A 20-year-old father will pass on about 25 new mutations to each
of his children, but a 40-year-old father will pass on about 65 new mutations.
Weinberg’s observation made 100 years earlier was confirmed.
Why does the age of the father matter, while that of the mother seems to have
no effect on the frequency of new point mutations? The answer lies in the differ-
ent ways by which men and women form gametes. In women, as in the females of
other mammals, the process of making eggs takes place largely before a woman is
born. Thus, when a woman is born she possesses in her ovaries a set of egg precur-
sor cells that will mature into egg cells without further rounds of DNA replication.
For a woman, from the point when she was conceived until the formation of the
egg cells in her ovaries, there are about 24 rounds of cell division, 23 of which
Introduction
have a roundto Genetic Analysis, 11e
of chromosome (DNA) replication and an opportunity for a copying
Figure 01.18 #132
error or
04/02/14
mutation. All 23 of these rounds of chromosome replication occur before
a05/01/14
woman is born, so there are no additional rounds after her birth and no chance
for additional
Dragonfly Mediamutations
Group as she ages. Thus, older mothers contribute no more new
point mutations to their children than younger mothers.
Sperm production is altogether different. The cell divisions that produce
sperm continue throughout a man’s life, and there are many more rounds of cell
division in sperm formation than in egg formation. Sperm produced by 20-year-
old men will have experienced about 150 rounds of DNA replication from the
time of the man’s conception, almost seven times as many as for the eggs pro-
duced by 20-year-old women. By the time a man is age 40, his sperm will have a
history that involves over 25 times as many rounds of DNA replication as for eggs
in a woman of the same age. Thus, there is much more risk of new point muta-
tions occurring during these extra rounds of cell division and DNA replication
with the increase in the age of the father.
There is one final twist to the remarkable project performed by the Icelandic
geneticists. The 78 trios that they studied were chosen because the children in
most of the trios had inherited disorders. These included 44 children with autism
spectrum disorder and 21 with schizophrenia. For all these children, there were
no other cases of these disorders among their relatives, suggesting that their
20 CHAPTER 1 The Genetics Revolution
control these levels. If the water in the paddies gets too deep (greater than 50 cm)
for a prolonged period, then the rice plants, like the weeds, can suffer or even die.
Paddy agriculture, as practiced in the lowlands of India, Southeast Asia, and
West Africa, relies on natural rainfall, rather than irrigation, to flood the fields.
This circumstance poses a risk. When the rains are heavy, water depth in the pad-
dies can exceed 50 cm and completely submerge the plants, causing rice plants to
either suffer a loss in yield or simply die. Of the 60 million hectares of rain-fed
lowland paddies, one-third experience damaging floods on a regular basis. The
heavy rains and monsoons that flood the fields are estimated to cause a loss of
rice worth more than US$1 billion each year. In India, Indonesia, and Bangladesh
alone, 4 million tons of rice are lost to flooding each year, enough to feed 30 mil-
lion people. Since this loss is mostly incurred by the poorest farmers, it can lead
to malnourishment and even starvation.
In the early 1990s, David Mackill, a plant geneticist and breeder at the Inter-
national Rice Research Institute, had an idea about how to improve rice so that it
could tolerate being submerged in flood waters. He identified a remarkable variety
of rice called FR13A that could survive submergence and even thrive after the
plants remained fully submerged in deep water for up to two weeks. Unfortunately,
FR13A had a low yield and the quality of its grain was marginal. So Mackill set out
to transfer FR13A’s genetic factor(s) for submergence tolerance into a rice variety
with a higher yield and higher grain quality. He first crossed FR13A and a superior
variety of rice and then for several generations crossed the hybrid plants back to the
superior variety until he had created an improved form of rice that combined sub-
mergence tolerance and high yield.
Mackill had achieved his initial goal of transferring submergence tolerance
into a superior variety, but the genetic basis for why FR13A was submergence
tolerant remained obscure. Was FR13A’s submergence tolerance controlled by
many genes on multiple chromosomes, or might it be mostly controlled by just
one gene? To delve into the genetic basis of submergence tolerance, Mackill and
his team conducted a form of genetic analysis called quantitative trait locus
(QTL) mapping (see Chapter 19). A QTL is a genetic locus that contributes incre-
mentally or quantitatively to variation for a trait. Mendel’s gene for flower color
had two categorical alleles: one for purple flowers and the other for white flowers.
QTL have alleles that usually engender only partial changes such as the difference
between a pale purple and a medium purple. Using QTL mapping, Mackill learned
that the secret to FR13A exceptionalism was mostly due to a single genetic locus
or QTL on one of the rice chromosomes. He named this locus SUB1 for “submer-
gence tolerant.”
With the chromosomal location of SUB1 revealed, it was time to delve even
deeper and identify the molecular nature of SUB1. What type of protein did it
encode? How did the allele of SUB1 found in FR13A allow the plant to cope with
submergence? What is the physiological response that enables the plant to sur-
vive submergence?
To address these questions, molecular geneticists Pamela Ronald at the Univer-
sity of California, Davis, and Julia Bailey-Serres at the University of California, Riv-
erside, joined the team. Working with Mackill, this expanded team zeroed in on the
chromosome segment containing the SUB1 QTL and determined that it encom-
passes a member of a class of genes called ethylene response factors (ERFs). ERF
genes encode regulatory proteins that bind to regulatory elements in other genes
and thereby regulate their expression. Thus, SUB1 is a gene that regulates the
expression of other genes. Moreover, they determined that the allele of SUB1 in
FR13A is switched on in response to submergence, while the allele of SUB1 found in
submergence-sensitive varieties is not switched on by submergence.
22 CHAPTER 1 The Genetics Revolution
The next question was, how does switching on SUB1 enable FR13A to survive
complete submergence? To answer this question, let’s review how ordinary rice
plants respond to submergence. When a plant is completely submerged, oxygen
levels in its cells drop to a low level, and the concentration of ethylene, a plant hor-
mone, in the cells increases. Ethylene signals the plant to escape submergence by
elongating its leaves and stems to keep its “head” above water. This escape strategy
works fine as long as the water is not so deep that the plant fails to grow enough to
position its stems and leaves above the flood waters. If the flood waters are too deep,
then the plant cannot grow enough to escape. As a plant in such deeply flooded
circumstances grows to escape the flood water, it uses up all its energy reserves
(carbohydrates), becomes spindly and weak, and eventually dies.
How does the FR13A variety manage to survive submergence while many other
types of rice cannot? FR13A has a different strategy that could be called sit tight. In
response to complete submergence, rather than attempt rapid growth to escape the
flood, an FR13A plant using the sit-tight strategy becomes quiescent. It stops the
elongation growth response, thereby preventing itself from burning up all its
reserve carbohydrates and becoming weak and spindly. With the sit-tight strategy, a
plant can remain in a quiescent, submerged state for up to two weeks and then
emerge healthy and resume normal growth when the flood waters recede.
The sit-tight strategy of FR13A is controlled by SUB1, which acts as the master
switch or regulatory gene to activate this strategy. When the flood waters rise, the
concentration of the plant hormone ethylene increases in plant cells. Because
SUB1 is an ERF, it is switched on in response to the elevated ethylene levels. Then,
the protein that SUB1 encodes orchestrates the plant’s response by switching on
(or off) a battery of genes involved in plant growth and metabolism. In FR13A
plants that become submerged, genes involved in stem and leaf elongation as part
of the escape strategy are switched off, as are genes involved in mobilizing the
energy reserves (carbohydrates) needed to fuel the escape strategy. Using the tools
of molecular genetics and genomics such as DNA microarrays (see Chapters 10 and
14), the rice team was able to decipher the extensive catalog of genes controlling
Yield (t ha–1)
location of SUB1 on one of the chromosomes, they could
transfer it into a superior variety with surgical precision. This 3.0
precision is important because it enabled the team to avoid
transferring other undesirable genes at the same time. For 2.0
this project, they worked with a submergence-intolerant, but Swarna
1.0
superior, Indian variety, called Swarna, which is widely Swarna-Sub1
grown and favored by farmers. The new line they created is 0.0
called Swarna-Sub1, and it has lived up to expectations. 0 5 10 15 20 25 30
Field trials showed a striking difference in plant survival Duration of submergence (days)
and yield between Swarna and Swarna-Sub1 when there is
complete submergence (Figure 1-20). As shown in Figure F I G U R E 1-2 1 Yield comparison
1-21, Swarna-Sub1 provides higher yield than the original Swarna under all differ- between variety Swarna that is not tolerant
ent levels of flooding. In various trials, the SUB1 improved yield between 1 to 3 to flooding (purple circles) and variety
tons of grain per hectare. Swarna-Sub1 that is tolerant (green
With the support and sponsorship of international research organizations, circles). Yield in tons per hectare ( y-axis)
versus duration of flooding in days (x-axis).
governmental agencies, and philanthropies, Swarna-Sub1 and other superior vari- [ Data from Ismail et al., “The contribution of
eties carrying the SUB1 allele from FR13A have now been distributed to farmers. submergence-tolerant (Sub 1) rice varieties to
In 2008, only 700 farmers were growing SUB1 enhanced rice, but by 2012, that food security in flood-prone rainfed lowland
number had grown to 3.8 million farmers. By 2014, the number of farmers grow- areas in Asia,” Field Crops Research 152, 2013,
ing rice with SUB1 should climb to 5 million, adding considerably to food security 83–93.]
among some of the world’s poorest farmers.
In the long run, the impact of the SUB1 research may not be limited to rice.
Many crops are subjected to damaging floods that reduce yields or destroy the
crop altogether. The genetic research on SUB1 has provided a deep understanding
of the molecular genetics of how plants respond to flooding. With this knowledge,
it will be possible to manipulate the genomes of other crop plants so that they too
can withstand getting their feet a little too wet.
conditions of life on different parts of the globe. This work revealed that three
factors have been particularly powerful in shaping the types of gene variants that
occur in different human populations. These factors are (1) pathogens such as
malaria or smallpox; (2) local climatic conditions including solar radiation, tem-
perature, and altitude; and (3) diet, such as the relative amounts of meat, cereals,
or dairy products eaten. In Chapter 20, you’ll learn how a genetic variant in the
hemoglobin gene has enabled people in Africa to adapt to the ravages of malaria.
Let’s look briefly at examples of genetic adaptations to climate and diet. We’ll
start with a case of human adaptation to life at high altitude.
FPO
[ Stefan Auth/imagebroker/AGE Fotostock; 2 V. J. Vitzthum, “The home team advantage: Reproduction in women indigenous to high altitude,”
(inset) Planet Observer/UIG/Getty Images.] Journal of Experimental Biology 204, 2001, 3141–3150.
1.3 Genetics Today 25
9 EPAS1
6
Statistical test value
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
Chromosome
F I G U R E 1-2 3 Twenty-two human chromosomes are arrayed from left to right. The y-axis
shows results from a statistical test of whether there is a significant difference in SNP
frequency between Tibetans and Han Chinese. Each small dot represents one of the SNPs
that was tested. SNPs above the horizontal red line are significantly different. Only the
SNPs in the EPAS1 gene show a significant difference. [ C. Beall et al. Proceedings of the
National Academy of Sciences USA, 107, 25, 2010, 11459–11464, Fig. 1.]
occur at very different frequencies in Tibetans (87 percent) and Han Chinese (9
percent). Their results are shown in Figure 1-23. In this figure, the human chromo-
somes, numbered 1 through 22, are along the x-axis, and a measure of the differ-
ence in SNP variant frequency between Tibetans and Chinese is on the y-axis. Each
dot represents a SNP. SNPs that fall above the horizontal red line are those for
which the frequency difference between Tibetans and Han Chinese is so large that
the gene near these SNPs must have provided some advantage to people who colo-
nized the Tibetan Plateau. The SNPs in EPAS1 fall above this line.
These results suggest that Tibetans have a special variant of EPAS1 that helps
them adapt to life at high elevation. To understand this better, let’s first review
what is known about EPAS1. This gene regulates the number of red blood cells
(RBCs) that our bodies produce. Moreover, it regulates the number of RBCs in
Introduction to Genetic Analysis, 11e
response to the level of oxygen in our tissues. When oxygen levels in our tissues
Figure 01.23 #000
are low, EPAS1 signals the body to produce more RBCs.
04/02/14
Why does EPAS1 direct our bodies to produce more RBCs when the oxygen
05/01/14
Dragonfly
levels Media
in our Groupare low? The EPAS1 response to low oxygen may be how our
tissues
bodies normally respond to anemia (too few red blood cells). People with low RBC
counts get too little oxygen in their tissues, and so EPAS1 could signal the body to
make more RBCs to correct anemia. This mechanism could explain why people
who live at low elevation need the EPAS1 gene.
Now, let’s think about how a person from low elevation would respond if they
move to high elevation. Because of the thin air at high elevation, their tissues
would get less oxygen. If their bodies interpreted low oxygen due to thin air as a
sign of anemia, then EPAS1 would try to correct the problem by signaling their
26 CHAPTER 1 The Genetics Revolution
body to make more RBCs. However, since they are not anemic and already have
enough RBCs, their blood would become overloaded with RBCs. Too many RBCs
can cause pulmonary hypertension and the formation of blot clots, the conditions
underlying CMS.
Finally, how could a new variant of EPAS1 have helped Tibetans avoid CMS
and adapt to high elevation? The answer to this question is not known, and it is
now being actively investigated, but here is one hypothesis. Unlike lowlanders,
Tibetans maintain relatively normal levels of RBCs at high elevation, and they
have a lower risk of blot clot formation and pulmonary hypertension than low-
landers who move to high elevation. Thus, the Tibetan version of EPAS1 may no
longer cause the overproduction of RBCs at high elevation, while providing
another mechanism to cope with the thin air. The Tibetan variant of EPAS1 helps
them live at high elevation without suffering from CMS.
Lactose tolerance Before the invention of agriculture about 10,000 to 12,000 years
ago, human populations subsisted on foods harvested from nature by hunting wild
animals and gathering wild fruits and vegetables. At that time, no human popula-
tions used dairy products. Cattle were yet to be domesticated, and methods for
milking cows were not yet invented. Children nursed on mother’s milk, but as they
aged, the gene that encodes the enzyme lactase, which enables children to digest
milk sugar (lactose), was switched off. Once weaned, a child in pre-agricultural soci-
eties no longer needed the lactase enzyme, and so the lactase gene had a “switch” or
regulatory element that turned it off during late childhood.
With the origin of agriculture, cattle were domesticated from wild aurochs.
The early farmers may have kept cattle as a source of meat at first. After milking
was invented, milk offered another source of food. But there was a problem.
Although children in these ancient societies could digest milk sugar, the adults
could not. Adults could consume milk, but since they could not digest the lactose,
they would experience bloating, cramps, and diarrhea. Adults who experience
these symptoms from drinking milk are lactose intolerant. Importantly, because
they could not digest milk sugar, they were not utilizing this source of nutrition.
In ancient societies, where food could be scarce at times, the difference between
life and death could hinge on making the best use of all available food sources. Yet,
because the lactase gene is switched off in adults, adults could not digest milk sugar.
Lactase gene
RNA polymerase
complex
Direction of
OCT1 transcription
K e y C o n c e p t Evolutionary genetics provides the tools to document how to Genetic Analysis, 11e
Introduction
gene variants that provide a beneficial effect can rise in frequency in a population
Figure 01.25 #000
and make individuals in the population better adapted to the environment in which
04/04/14
they live. 05/01/14
Dragonfly Media Group
28 CHAPTER 1 The Genetics Revolution
s u m m a ry
As you begin your study of genetics, imagine yourself as a • How do cells manage to seamlessly orchestrate the in-
person at halftime on an amazing journey of discovery. The credibly complex array of interacting genes and bio-
last 100 years have witnessed a remarkable revolution in chemical reactions that are found within them?
human knowledge about how biological systems are put
• How do genetic variants at hundreds or even thousands
together and how they work. Genetics has been at the epi-
of genes control the yield of crop plants?
center of that revolution. Genetic analysis has answered
many fundamental questions about the transmission of • How can genetics guide both the prevention and treat-
genetic information within families, inside cells, and over ment of cancer, autism, and other diseases?
the eons of evolutionary time. Yet, as you will learn, the dis-
• How do genes give humans the capacity for language and
covery process in genetics has never been more dynamic
consciousness?
and the pace of growth in knowledge never greater.
Unanswered questions abound.
Genetic analysis over the next 100 years promises to help
• How do all the genes in the genome work together to answer many questions like these.
transform a fertilized egg into an adult organism?
key terms
adenine (A) (p. 7) gene (p. 4) one-gene–one-enzyme hypothesis
alleles (p. 4) gene expression (p. 7) (p. 7)
blending theory (p. 2) genetically modified organism point mutation (p. 17)
chromosome theory (p. 5) (GMO) (p. 13) quantitative trait locus (QTL) (p. 21)
codon (p. 10) genetics (p. 5) regulatory element (p. 7)
complementary (p. 7) genomics (p. 13) single nucleotide polymorphism
cytosine (C) (p. 7) guanine (G) (p. 7) (SNP) (p. 16)
DNA polymerase (p. 12) ligase (p. 12) somatic cells (p. 4)
DNA sequencing (p. 13) messenger RNA (mRNA) (p. 10) thymine (T) (p. 7)
dominant (p. 4) model organism (p. 10) transcription (p. 10)
DNA replication (p. 9) multifactorial hypothesis (p. 7) transformation (p. 13)
gametes (p. 4) nuclease (p. 12) translation (p. 10)
p r obl e m s
Most of the problems are also available for review/grading through http://www.whfreeman.com/
launchpad/iga11e.
6. In Figure 1-8b, can you tell if the number of hydrogen 13. The complementary strands of DNA in the double helix
bonds between adenine and thymine is the same as that are held together by hydrogen bonds: G ≡ C or A = T.
between cytosine and guanine? Do you think that a These bonds can be broken (denatured) in aqueous so-
DNA molecule with a high content of A + T would be lutions by heating to yield two single strands of DNA
more stable than one with high content of G + C? (see Figure 1-13a). How would you expect the relative
7. Which of three major groups (domains) of life in amounts of GC versus AT base pairs in a DNA double
Figure 1-11 is not represented by a model organism? helix to affect the amount of heat required to denature
it? How would you expect the length of a DNA double
8. Figure 1-13b shows the human chromosomes in a single
helix in base pairs to affect the amount of heat required
cell. The green dots show the location of a gene called
to denature it?
BAPX1. Is the cell in this figure a sex cell (gamete)?
Explain your answer. 14. The figure at the bottom of the page shows the DNA
sequence of a portion of one of the chromosomes from
9. Figure 1-15 shows the family tree, or pedigree, for Louise
a trio (mother, father, and child). Can you spot any new
Benge (Individual VI-1) who suffers from the disease
point mutations in the child that are not in either par-
ACDC because she has two mutant copies of the CD73
ent? In which parent did the mutation arise?
gene. She has four siblings (VI-2, VI-3, VI-4, and VI-5)
who have this disease for the same reason. Do all of the
C h a ll e n g i n g P r obl e m s
10 children of Louise and her siblings have the same
number of mutant copies of the CD73 gene, or might 15. a. There are three nucleotides in each codon, and each
this number be different for some of the 10 children? of these nucleotides can have one of four different bas-
es. How many possible unique codons are there?
B a s i c P r obl e m s b. If DNA had only two types of bases instead of
10. Below is the sequence of a single strand of a short DNA four, how long would codons need to be to specify all
molecule. On a piece of paper, rewrite this sequence 20 amino acids?
and then write the sequence of the complementary 16. Fathers contribute more new point mutations to their
strand below it. children than mothers. You may know from general bi-
ology that people have sex chromosomes—two X chro-
GTTCGCGGCCGCGAAC mosomes in females and an X plus a Y chromosome in
Comparing the top and bottom strands, what do you males. Both sexes have the autosomes (A’s).
notice about the relationship between them? a. On which type of chromosome (A, X, or Y) would
11. Mendel studied a tall variety of pea plants with stems you expect the genes to have the greatest number of
that are 20 cm long and a dwarf variety with stems that new mutations per base pair over many generations in
are only 12 cm long. a population? Why?
a. Under blending theory, how long would you expect b. On which type of chromosome would you expect the
the stems of first and second hybrids to be? least number of new mutations per base pair? Why?
b. Under Mendelian rules and assuming stem length c. Can you calculate the expected number of new
is controlled by a single gene, what would you expect mutations per base pair for a gene on the X and Y
to observe in the second-generation hybrids if all the chromosomes for every one new mutation in a gene
first-generation hybrids were tall? on an autosome if the mutation rate in males is twice
12. If a DNA double helix that is 100 base pairs in length has that in females?
32 adenines, how many cytosines, guanines, and thy- 17. For young men of age 20, there have been 150 rounds of
mines must it have? DNA replication during sperm production as compared
Mother Father
Copy M1 • • C AGC AGA T TGC TGC T T TGT A TGAG • • Copy F1 • • C AGC TGA T TGC TGC T T TGT AGGAG • •
Copy M2 • • C AGC TGA T TGC TGC T T TGT A TGAG • • Copy F2 • • C A A C TGA T TGC TGC T T TGT A TGAG • •
Child
to only 23 rounds for a woman of age 20. That is a 6.5- 19. The human genome is approximately 3 billion base
fold greater number of cell divisions and proportion- pairs in size.
ately greater opportunity for new point mutations. Yet, a. Using standard 8.5″ × 11″ paper with one-inch mar-
on average, 20-year-old men contribute only about gins, a 12-point font size, and single-spaced lines, how
twice as many new point mutations to their offspring as many sheets of paper printed on one side would be
do women. How can you explain this discrepancy? required to print out the human genome?
18. In computer science, a bit stores one of two states, 0 or b. A ream of 500 sheets of paper is about 5 cm thick.
1. A byte is a group of 8 bits that has 28 = 256 possible How tall would the stack of paper with the entire
states. Modern computer files are often megabytes human genome be?
(106 bytes) or even gigabytes (109 bytes) in size. The hu-
c. Would you want a backpack, shopping cart, or a
man genome is approximately 3 billion base pairs in
semitrailer truck to haul around this stack?
size. How many nucleotides are needed to encode a
single byte? How large of a computer file would it take
to store the same amount of information as a single hu-
man genome?
344
2
Ch a p t e r
Single-Gene
Inheritance
Learning Outcomes
After completing this chapter, you will be able to
31
32 C H APTER 2 Single-Gene Inheritance
W
hat kinds of research do biologists do? One central area of research in
the biology of all organisms is the attempt to understand how an organ-
ism develops from a fertilized egg into an adult—in other words, what
makes an organism the way it is. Usually, this overall goal is broken down into the
study of individual biological properties such as the development of plant flower
color, or animal locomotion, or nutrient uptake, although biologists also study
some general areas such as how a cell works. How do geneticists analyze biological
properties? The genetic approach to understanding any biological property is to
find the subset of genes in the genome that influence that property, a process
sometimes referred to as gene discovery. After these genes have been identified,
their cellular functions can be elucidated through further research.
There are several different types of analytical approaches to gene discovery,
but one widely used method relies on the detection of single-gene inheritance pat-
terns, and that is the topic of this chapter.
All of genetics, in one aspect or another, is based on heritable variants. The
basic approach of genetics is to compare and contrast the properties of variants,
and from these comparisons make deductions about genetic function. It is similar
to the way in which you could make inferences about how an unfamiliar machine
works by changing the composition or positions of the working parts, or even by
removing parts one at a time. Each variant represents a “tweak” of the biological
machine, from which its function can be deduced.
In genetics, the most common form of any property of an organism is called
the wild type, that which is found “in the wild,” or in nature. The heritable vari-
ants observed in an organism that differs from the wild type are mutants, indi-
vidual organisms having some abnormal form of a property. As examples, the wild
type and some mutants in two model organisms are shown in Figure 2-1. The
alternative forms of the property are called phenotypes. In this analysis we dis-
tinguish a wild-type phenotype and a mutant phenotype.
F i g u r e 2 -1 These photographs Compared to wild type, mutants are rare. We know that they arise from wild
show the range of mutant phenotypes types by a process called mutation, which results in a heritable change in the
typical of those obtained in the genetic DNA of a gene. The changed form of the gene is also called a mutation. Mutations
dissection of biological properties. are not always detrimental to an organism; sometimes they can be advantageous,
These cases are from the dissection of but most often they have no observable effect. A great deal is known about the
floral development in Arabidopsis
mechanisms of mutation (see Chapter 16), but generally it can be said that they
thaliana (a) and hyphal growth in
Neurospora crassa, a mold (b). WT = arise from mistakes in cellular processing of DNA.
wild type. [ (a) George Haughn ; (b) Anthony Most natural populations also show polymorphisms, defined as the coexis-
Griffiths/Olivera Gavric.] tence of two or more reasonably common phenotypes of a biological property,
lfy WT ap1 WT
ap2 ap3 ag
(a) (b)
Single-Gene Inheritance 3 3
see reverse genetics at work in later chapters. In brief, it starts with genomic analy-
sis at the DNA level to identify a set of genes as candidates for encoding the biological
property of interest, then induces mutants targeted specifically to those genes, and
then examines the mutant phenotypes to see if they indeed affect the property
under study.)
in Figure 2-2. His results were substantially the same for The seven phenotypic pairs studied by Mendel
each character, and so we can use one character, pea seed
color, as an illustration. All of the lines used by Mendel were The seven phenotypic pairs studied by Mendel
pure lines, meaning that, for the phenotype in question, all
offspring produced by matings within the members of that Round or wrinkled ripe seeds
line were identical. For example, within the yellow-seeded
line, all the progeny of any mating were yellow seeded.
Mendel’s analysis of pea heredity made extensive use of
crosses. To make a cross in plants such as the pea, pollen is
simply transferred from the anthers of one plant to the stig- Yellow or green seeds
mata of another. A special type of mating is a self (self-
pollination), which is carried out by allowing pollen from a
flower to fall on its own stigma. Crossing and selfing are
illustrated in Figure 2-3. The first cross made by Mendel
mated plants of the yellow-seeded lines with plants of the
green-seeded lines. In his overall breeding program, these Axial or terminal flowers
Purple or white petals
lines constituted the parental generation, abbreviated P.
In Pisum sativum, the color of the seed (the pea) is deter-
mined by the seed’s own genetic makeup; hence, the peas
resulting from a cross are effectively progeny and can be
conveniently classified for phenotype without the need to
grow them into plants. The progeny peas from the cross
between the different pure lines were found to be all yellow,
no matter which parent (yellow or green) was used as male
or female. This progeny generation is called the first filial Inflated or pinched ripe pods
generation, or F1. The word filial comes from the Latin
words filia (daughter) and filius (son). Hence, the results of
these two reciprocal crosses were as follows, where × repre-
sents a cross:
female from yellow line × male from green line →
F1 peas all yellow Long or short stems
Mendel noted that this outcome was very close to a mathematical ratio of
three-fourths (75%) yellow and one-fourth (25%) green. A simple calculation
shows us that 6022/8023 = 0.751 or 75.1%, and 2001/8023 = 0.249 or 24.9%.
Hence, there was a 3 : 1 ratio of yellow to green. Interestingly, the green pheno-
type, which had disappeared in the F1, had reappeared in one-fourth of the F2
individuals, showing that the genetic determinants for green must have been
present in the yellow F1, although unexpressed.
To further investigate the nature of the F2 plants, Mendel selfed plants grown
from the F2 seeds. He found three different types of results. The plants grown
from the F2 green seeds, when selfed, were found to bear only green peas.
3 6 C H APTER 2 Single-Gene Inheritance
Cross-pollination Selfing
Stigma
Progeny Progeny
These two types of matings, the F1 self and the cross of the F1 with any green-
seeded plant, both gave yellow and green progeny, but in different ratios. These
two ratios are represented in Figure 2-4. Notice that the ratios are seen only when
the peas in several pods are combined.
The 3 : 1 and 1 : 1 ratios found for pea color were also found for comparable
crosses for the other six characters that Mendel studied. The actual numbers for
the 3 : 1 ratios for those characters are shown in Table 2-1.
either
F2 F2
Progeny Progeny
seeds seeds
Total 21 7 Total 11 11
Table 2-1 Results of All Mendel’s Crosses in Which Parents Differed in One Character
Parental phenotypes F1 F2 F2 ratio
1. round × wrinkled seeds All round 5474 round; 1850 wrinkled 2.96 : 1
2. yellow × green seeds All yellow 6022 yellow; 2001 green 3.01 : 1
3. purple × white petals All purple 705 purple; 224 white 3.15 : 1
4. inflated × pinched pods All inflated 882 inflated; 299 pinched 2.95 : 1
5. green × yellow pods All green 428 green; 152 yellow 2.82 : 1
6. axial × terminal flowers All axial 651 axial; 207 terminal 3.14 : 1
7. long × short stems All long 787 long; 277 short 2.84 : 1
Pure Pure
P P Y /Y y /y
F1 F1 Y/y
Equal segregation
Selfed Crossed with
y /y
green
F2
1 1
2 Y 2 y all y
1 1 1
4 Y /Y 4 Y/y 2 Y/y
F2 1 1
3 1 2 Y 2 Y
4 2
1 1 1
4 Y/y 4 y /y 2 y /y
1 1
1 1 2 y 2 y
4 2
Mendel’s research in the mid-nineteenth century was not noticed by the inter-
national scientific community until similar observations were independently pub-
lished by several other researchers in 1900. Soon research in many species of plants,
animals, fungi, and algae showed that Mendel’s law of equal segregation was appli-
cable to all sexual eukaryotes and, in all cases, was based on the chromosomal seg-
regations taking place in meiosis, a topic that we turn to in the next section.
either 2n → 2n + 2n
or n → n + n
This “trick” of constancy is accomplished when each chromosome replicates to
make two identical copies of itself, with underlying DNA replication. The two iden-
tical copies, which are often visually discernible, are called sister chromatids.
Then, each copy is pulled to opposite ends of the cell. When the cell divides, each
daughter cell has the same chromosomal set as its progenitor.
In addition, most eukaryotes have a sexual cycle, and, in these organisms, spe-
cialized diploid cells called meiocytes are set aside that divide to produce sex
cells such as sperm and egg in plants and animals, or sexual spores in fungi or
algae. Two sequential cell divisions take place, and the two nuclear divisions that
accompany them are called meiosis. Because there are two divisions, four cells
are produced from each progenitor cell. Meiosis takes place only in diploid cells,
and the resulting gametes (sperm and eggs in animals and plants) are haploid.
Hence, the net result of meiosis is
2n → n + n + n + n
This overall halving of chromosome number during meiosis is achieved
through one replication and two divisions. As with mitosis, each chromosome
replicates once, but in meiosis the replicated chromosomes (sister chromatids)
remain attached. One of each of the replicated chromosome pairs is pulled to
opposite ends of the cell, and division occurs. At the second division, the sister
chromatids separate and are pulled to opposite ends of the cell.
Original Daughter
cell cells
G G
Figure 2- 6
2.2 The Chromosomal Basis of Single-Gene Inheritance Patterns 41
2n Meiocytes 2n
Meiosis Meiosis
2n Meiocytes 2n n n
n n n n n n
n
n
Tetrad Tetrad
Meiosis Meiosis
Mitosis 2n Transient
diploid cell
n n n n n n
n
n (meiocyte)
Tetrad Tetrad gp gp gp gp gp
Meiosis
sperm egg
(gamete) (gamete)
Sperm n n Egg
Sexual
2n Zygote n n n n spores
Tetrad
Mitosis
Animal Fungus
Plant
The location of the meiocytes in animal, plant, and fungal life cycles is shown F i g u r e 2 -7 The life cycles of humans,
in Figure 2-7. plants, and fungi, showing the points at
The basic genetic features of mitosis and meiosis are summarized in Figure 2-8. which mitosis and meiosis take place.
Note that in the females of humans and
To make comparison easier, both processes are shown in a diploid cell. Notice, again,
many plants, three cells of the meiotic
that mitosis takes place in one cell division, and the two resulting “daughter” cells tetrad abort. The abbreviation n indicates
have the same genomic content as that of the “mother” (progenitor) cell. The first a haploid cell, 2n a diploid cell; gp stands
key process to note is a premitotic chromosome replication. At the DNA level, this for gametophyte, the name of the small
stage is the synthesis, or S, phase (see Figure 2-6), at which the DNA is replicated. structure composed of haploid cells that
The replication produces pairs of identical sister chromatids, which become visible will produce gametes. In many plants such
at the beginning of mitosis. When a cell divides, each member of a pair of sister as corn, a nucleus from the male
gametophyte fuses with two nuclei from
chromatids is pulled into a daughter cell, where it assumes the role of a fully fledged
the female gametophyte, giving rise to a
chromosome. Hence, each daughter cell has the same chromosomal content as the triploid (3n ) cell, which then replicates to
original cell. form the endosperm, a nutritive tissue that
Before meiosis, as in mitosis, chromosome replication takes place to form sis- surrounds the embryo (which is derived
ter chromatids, which become visible at meiosis. The centromere appears not to from the 2n zygote).
divide at this stage, whereas it does in mitosis. Also in contrast with mitosis, the
homologous pairs of sister chromatids now unite to form a bundle of four homol-
ogous chromatids. This joining of the homologous pairs is called synapsis, and it
relies on the properties of a macromolecular assemblage called the synaptonemal
complex (SC), which runs down the center of the pair. Replicate sister chromo-
somes are together called a dyad (from the Greek word for “two”). The unit com-
prising the pair of synapsed dyads is called a bivalent. The four chromatids that
42 C H APTER 2 Single-Gene Inheritance
F i g u r e 2 - 8 Simplified representation
of mitosis and meiosis in diploid cells (2n,
diploid; n, haploid). (Detailed versions are
shown in Appendix 2-1, page 83.)
Mitosis
Interphase Prophase Metaphase
2n 4n
Replication
Meiosis
2n 4n
Replication
Pairing
make up a bivalent are called a tetrad (Greek for “four”), to indicate that there are
four homologous units in the bundle.
dyad
bivalent SC tetrad
dyad
(A parenthetical note. The process of crossing over takes place at this tetrad
stage. Crossing over changes the combinations of alleles of several different genes
but does not directly affect single-gene inheritance patterns; therefore, we will
postpone its detailed coverage until Chapter 4. For the present, it is worth noting
that, apart from its allele-combining function, crossing over is also known to be a
crucial event that must take place in order for proper chromosome segregation in
the first meiotic division.)
The bivalents of all chromosomes move to the cell’s equator, and, when the
cell divides, one dyad moves into each new cell, pulled by spindle fibers attached
near the centromeres. In the second cell division of meiosis, the centromeres
divide and each member of a dyad (each member of a pair of chromatids) moves
into a daughter cell. Hence, although the process starts with the same genomic
content as that for mitosis, the two successive segregations result in four haploid
cells. Each of the four haploid cells that constitute the four products of meiosis
contains one member of a tetrad; hence, the group of four cells is sometimes
called a tetrad, too. Meiosis can be summarized as follows:
2.2 The Chromosomal Basis of Single-Gene Inheritance Patterns 4 3
Daughter cells
Telophase
Anaphase
2n
2n
Segregation
Products of
Telophase II meiosis
n
Prophase II Metaphase II Anaphase II
Telophase I
Anaphase I n
Segregation
n
Segregation
Research in cell biology has shown that the spindle fibers that pull apart chro-
mosomes are polymers of the molecule tubulin. The pulling apart is caused
mainly by a depolymerization and hence shortening of the fibers at the point
where they are attached to the chromosomes.
The behavior of chromosomes during meiosis clearly explains Mendel’s law of
equal segregation. Consider a heterozygote of general type A/a. We can simply follow
the preceding summary while considering what happens to the alleles of this gene:
Start: one homolog carries A and one carries a
Replication: one dyad is AA and one is aa
Pairing: tetrad is A/A/a/a
First-division products: one cell AA, the other cell aa (crossing over can mix
these types of products up, but the overall ratio is not changed)
Second-division products: four cells, two of type A and two of type a
4 4 C H APTER 2 Single-Gene Inheritance
1
Hence, the products of meiosis from a heterozygous meiocyte A/a are 2 A
1
and 2 a, precisely the equal ratio that is needed to explain Mendel’s first law.
Meiosis
A A
A 1
A 2 A
A a A A
a a
a 1
a 2
a
a a
Note that we have focused on the broad genetic aspects of meiosis necessary
to explain single-gene inheritance. More complete descriptions of the detailed
stages of mitosis and meiosis are presented in Appendices 2-1 and 2-2 at the end
of this chapter.
defining them at the molecular level. What are the structural dif-
ferences between wild-type and mutant alleles at the DNA level
of a gene? What are the functional differences at the protein level?
Mutant alleles can be used to study single-gene inheritance with- Diploid
out needing to understand their structural or functional nature. +
However, because a primary reason for embarking on single- 2n
gene inheritance is ultimately to investigate a gene’s function, we r
must come to grips with the molecular nature of wild-type and
mutant alleles at both the structural and the functional level.
Chromosome replication
Haploid b
Figure 2-10 Each chromosome divides longitudinally into two
b b A
chromatids (left ); at the molecular level (right ), the single DNA molecule
b A T of each chromosome replicates, producing two DNA molecules, one
T A for each chromatid. Also shown are various combinations of a gene
with wild-type allele b+ and mutant form b, caused by the change in a
b T
single base pair from GC to AT. Notice that, at the DNA level, the two
chromatids produced when a chromosome replicates are always
identical with each other and with the original chromosome.
2.3 The Molecular Basis of Mendelian Inheritance Patterns 47
A
a
A a
A
A
a A A
a A
A a A
Mitosis Mitosis Meiosis
a
Alignment Alignment Pairing of
on equator on equator homologs
at equator A
A A a (tetrad)
A A a
a
A A a
A End of
first division
a A a
and and
A A
A a
a
End of
and second division and
1 1
A Sex cells a
2 2
F i g u r e 2 -11 DNA and gene transmission in mitosis and meiosis in eukaryotes. The S phase and the main stages of mitosis
and meiosis are shown. Mitotic divisions (left and middle ) conserve the genotype of the original cell. At the right, the two
successive meiotic divisions that take place during the sexual stage of the life cycle have the net effect of halving the number of
chromosomes. The alleles A and a of one gene are used to show how genotypes are transmitted in cell division.
48 C H APTER 2 Single-Gene Inheritance
alleles (say, A and a) in the parents and the meiotic products: the result would be that
one-half of the products would have the A DNA sequence and one-half would have
the a DNA sequence. The same would be true for any DNA sequence that differed in
the inherited chromosomes, including those not necessarily inside alleles correlated
with known phenotypes such as red and white flowers. Thus, we see the rules of
segregation enunciated by Mendel apply not only to genes but to any stretch of DNA
along a chromosome.
phenylalanine
phenylpyruvic acid
Babies are now routinely tested for this processing deficiency at birth. If the
deficiency is detected, phenylalanine can be withheld with the use of a special
diet and the development of the disease arrested.
The PAH enzyme is made up of a single type of protein. What changes have
occurred in the mutant form of the PKU gene’s DNA, and how can such change at
the DNA level affect protein function and produce the disease phenotype? Sequenc-
ing of the mutant alleles from many PKU patients has revealed a plethora of muta-
tions at different sites along the gene, mainly in the protein-encoding regions, or the
exons; the results are summarized in Figure 2-12. They represent a range of DNA
changes, but most are small changes affecting only one nucleotide pair among the
thousands that constitute the gene. What these alleles have in common is that they
encode a defective protein that no longer has normal PAH activity. By changing one
or more amino acids, the mutations all inactivate some essential part of the protein
encoded by the gene. The effect of the mutation on the function of the gene depends
on where within the gene the mutation occurs. An important functional region of
the gene is that encoding an enzyme’s active site; so this region is very sensitive to
mutation. In addition, a minority of mutations are found to be in introns, and these
mutations often prevent the normal processing of the primary RNA transcript.
2.3 The Molecular Basis of Mendelian Inheritance Patterns 49
24 5 7 21
Exon 4 11 10 4 7 37 12 9 1
mutations
1 2 3 4 5 6 7 8 9 10 11 12 13
Intron
mutations 1 4 2 2 1 3 1 41 1
Some of the general consequences of mutation at the protein level are shown in F i g u r e 2 -12 Many mutations of the
Figure 2-13. Many of the mutant alleles are of a type generally called null alleles: human phenylalanine hydroxylase gene
the proteins encoded by them completely lack PAH function. Other mutant alleles that cause enzyme malfunction are
known. The number of mutations in the
reduce the level of enzyme function; they are sometimes called leaky mutations,
exons, or protein-encoding regions (black),
because some wild-type function seems to “leak” into the mutant phenotype. DNA are listed above the gene. The number of
sequencing often detects changes that have no functional impact at all, so they are mutations in the intron regions (green,
functionally wild type. Hence, we see that the terms wild type and mutant some- numbered 1 through 13) that alter splicing
times have to be used carefully. are listed below the gene. [ Data from C. R.
Scriver, Ann. Rev. Genet. 28, 1994, 141–165.]
K e y C o n c e p t Most mutations that alter phenotype alter the amino acid
sequence of the gene’s protein product, resulting in reduced or absent function.
We have been pursuing the idea that finding a set of genes that impinge on the
biological property under investigation is an important goal of genetics, because it
defines the components of the system. However, finding the precise way in which
mutant alleles lead to mutant phenotypes is often challenging, requiring not only
the identification of the protein products of these genes, but also detailed cellular
and physiological studies to measure the effects of the mutations. Furthermore,
DNA
Components of protein active site
Promoter Intron
5´ 3´ Wild type
Exon Exon
m1: null
m2: null
m3: null
m4: leaky
m5: silent
m6: null
finding how the set of genes interacts is a second level of challenge and a topic that
we will pursue later, starting in Chapter 6.
The standard procedure is to cross the mutant with wild type. (If the mutant
is sterile, then another approach is needed.) First, we will consider three simple
cases that cover most of the possible outcomes:
1. A fertile flower mutant with no pigment in the petals (for example, white
petaled in contrast with the normal red)
2. A fertile fruit-fly mutant with short wings
3. A fertile mold mutant that produces excess hyphal branches (hyperbranching)
showing gametes and gametic fusions is called a Punnett square, named after an
early geneticist, Reginald C. Punnett. They are useful devices for explaining genetic
ratios. We shall encounter more in later discussions.
when the mutant first shows up in the population, it will be in the heterozygous
P SH
state. (This is not true for a recessive mutation such as that in the preceding plant
example, which must be homozygous to be expressed and must have come from
/ SH/ the selfing of an unidentified heterozygous plant in the preceding generation.)
When long-winged progeny were interbred, all of their progeny were long
winged, as expected of a recessive wild-type allele. When the short-winged prog-
/ SH/ eny were interbred, their progeny showed a ratio of three-fourths short to one-
fourth long.
Dominant mutations are represented by uppercase letters or words: in the
present example, the mutant allele might be named SH, standing for “short.”
Then the crosses would be represented symbolically as
F1
P +/+ × SH/+
1
/ / F1 2 +/+
1
2 SH/+
/ /
F1 +/+ × +/+
all +/+
F1 SH/+ × SH/+
1
F1 SH 4 SH/SH
1
2 SH/+
1
/ SH/ 4 +/+
or graphically as shown in the grids on the left.
SH SH/ SH/SH This analysis of the fly mutant identifies a gene that is part of a subset of genes
that, in wild-type form, are crucial for the normal development of a wing. Such a
result is the starting point of further studies that would focus on the precise devel-
opmental and cellular ways in which the growth of the wing is arrested, which,
once identified, reveal the time of action of the wild-type allele in the course of
development.
P + × hb
Diploid meiocyte +/hb
1
F 1 2 +
1
2 hb
The mutation and inheritance analysis has uncovered a gene whose wild-type
allele is essential for normal control of branching, a key function in fungal disper-
sal and nutrient acquisition. Now the mutant needs to be investigated to see the
location in the normal developmental sequence at which the mutant produces a
block. This information will reveal the time and place in the cells at which the
normal allele acts.
Sometimes, the severity of a mutant phenotype renders the organism sterile,
unable to go through the sexual cycle. How can the single-gene inheritance of
2.5 Sex-Linked Single-Gene Inheritance Patterns 5 3
chromosomes also segregate equally, but the phenotypic ratios seen in progeny
are often different from the autosomal ratios.
Sex chromosomes
Most animals and many plants show sexual dimorphism; in other words, individu-
als are either male or female. In most of these cases, sex is determined by a special
pair of sex chromosomes. Let’s look at humans as an example. Human body cells
have 46 chromosomes: 22 homologous pairs of autosomes plus 2 sex chromosomes.
Females have a pair of identical sex chromosomes called the X chromosomes.
Males have a nonidentical pair, consisting of one X and one Y. The Y chromosome
is considerably shorter than the X. Hence, if we let A represent autosomal chromo-
somes, we can write
females = 44A + XX
males = 44A + XY
At meiosis in females, the two X chromosomes pair and segregate like auto-
somes, and so each egg receives one X chromosome. Hence, with regard to sex
chromosomes, the gametes are of only one type and the female is said to be the
homogametic sex. At meiosis in males, the X and the Y chromosomes pair over a
short region, which ensures that the X and Y separate so that there are two types
of sperm, half with an X and the other half with a Y. Therefore, the male is called
the heterogametic sex.
The inheritance patterns of genes on the sex chromosomes are different from
those of autosomal genes. Sex-chromosome inheritance patterns were first inves-
tigated in the early 1900s in the laboratory of the great geneticist Thomas Hunt
Morgan, using the fruit fly Drosophila melanogaster (see the Model Organism box
on page 56). This insect has been one of the most important research organisms
in genetics; its short, simple life cycle contributes to its usefulness in this regard.
Fruit flies have three pairs of autosomes plus a pair of sex chromosomes, again
referred to as X and Y. As in mammals, Drosophila females have the constitution
XX and males are XY. However, the mechanism of sex determination in Drosophila
differs from that in mammals. In Drosophila, the number of X chromosomes in rela-
tion to the autosomes determines sex: two X’s result in a female, and one X results
in a male. In mammals, the presence of the Y chromosome determines maleness and
the absence of a Y determines femaleness. However, it is important to note that,
despite this somewhat different basis for sex determination, the single-gene
inheritance patterns of genes on the sex chromosomes are remarkably similar in
Drosophila and mammals.
Vascular plants show a variety of sexual arrangements. Dioecious species are
those showing animal-like sexual dimorphism, with female plants bearing flowers
containing only ovaries and male plants bearing flowers containing only anthers
(Figure 2-14). Some, but not all, dioecious plants have a nonidentical pair of chro-
mosomes associated with (and almost certainly determining) the sex of the plant.
Of the species with nonidentical sex chromosomes, a large proportion have an XY
system. For example, the dioecious plant Melandrium album has 22 chromosomes
per cell: 20 autosomes plus 2 sex chromosomes, with XX females and XY males.
Other dioecious plants have no visibly different pair of chromosomes; they may
still have sex chromosomes but not visibly distinguishable types.
X-linked inheritance
For our first example of X linkage, we turn to eye color in Drosophila. The wild-type F i g u r e 2 -14 Examples of two
eye color of Drosophila is dull red, but pure lines with white eyes are available dioecious plant species are (a) Osmaronia
dioica and (b) Aruncus dioicus.
[ (a) Leslie Bohm; (b) Anthony Griffiths.]
Pseudoautosomal
region 1
Model
Organism Drosophila
Drosophila melanogaster was one of the first model organ- Adult
isms to be used in genetics. It is readily available from ripe
fruit, has a short life cycle, and is simple to culture and
cross. Sex is determined by X and Y sex chromosomes
(XX = female, XY = male), and males and females are
easily distinguished. Mutant phenotypes regularly arise 1 day
1 1
3 2 –4 2 days
in lab populations, and their frequency can be increased
by treatment with mutagenic radiation or chemicals. It is
a diploid organism, with four pairs of homologous chro- Egg
mosomes (2n = 8). In salivary glands and certain other
1 day
tissues, multiple rounds of DNA replication without chro-
mosomal division result in “giant chromosomes,” each
Pupa
with a unique banding pattern that provides geneticists
with landmarks for the study of chromosome mapping First instar
and rearrangement. There are many species and races of
Drosophila, which have been important raw material for 1 day
the study of evolution. 1
2 2 –3 days
Time flies like an arrow; fruit flies like a banana. Second instar
(Groucho Marx)
Third instar
1 day
Abortion or stillbirth
Dizygotic (sex unspecified)
(nonidentical twins)
2.5 Sex-Linked Single-Gene Inheritance Patterns 57
X X X Y X X X Y
w w
1 1 1 1
2 2 2 2
w w w w w w w w
Female Female
gametes gametes
1 1 1 1
2 Red female 2 Red male 2 Red female 2 White male
w w
1 1 1 1
2 2 2 2
w w w w w w w w
1 1
2 2
1 1 1 1
4 Red female 4 Red male 4 Red female 4 Red male
Female Female
gametes gametes
w w w w w w w w
1 1
2 2
1 1 1 1
4 Red female 4 White male 4 White female 4 White male
F i g u r e 2 -17 Reciprocal crosses between red-eyed (red) and white-eyed ( white) Drosophila give different results.
The alleles are X linked, and the inheritance of the X chromosome explains the phenotypic ratios observed, which are
different from those of autosomal genes. (In Drosophila and many other experimental systems, a superscript plus sign
is used to designate the normal, or wild-type, allele. Here, w + encodes red eyes and w encodes white eyes.)
58 C H APTER 2 Single-Gene Inheritance
The F2 consists of one-half red-eyed and one-half white-eyed flies of both sexes.
Hence, in sex linkage, we see examples not only of different ratios in different
sexes, but also of differences between reciprocal crosses.
Note that Drosophila eye color has nothing to do with sex determination, and
so we have an illustration of the principle that genes on the sex chromosomes are
not necessarily related to sexual function. The same is true in humans: in the
discussion of pedigree analysis later in this chapter, we shall see many X-linked
genes, yet few could be construed as being connected to sexual function.
The abnormal allele associated with white eye color in Drosophila is recessive,
but abnormal alleles of genes on the X chromosome that are dominant also arise,
such as the Drosophila mutant hairy wing (Hw). In such cases, the wild-type allele
(Hw +) is recessive. The dominant abnormal alleles show the inheritance pattern
corresponding to that of the wild-type allele for red eyes in the preceding exam-
ple. The ratios obtained are the same.
3 : 1 and 1 : 1 ratios are usually not seen unless many similar
Pedigree symbols
pedigrees are combined. The approach to pedigree analysis
also depends on whether one of the contrasting phenotypes Male
2 3 Number of children
is a rare disorder or both phenotypes of a pair are common of sex indicated
(in which case they are said to be “morphs” of a polymor- Female
phism). Most pedigrees are drawn for medical reasons and Affected individuals
therefore concern medical disorders that are almost by defi- Mating
nition rare. In this case, we have two phenotypes: the pres-
Heterozygotes for
ence and the absence of the disorder. Four patterns of autosomal recessive
Parents and
single-gene inheritance are revealed in pedigrees. Let’s look, children:
first, at recessive disorders caused by recessive alleles of sin- 1 boy; 1 girl Carrier of sex-linked
gle autosomal genes. (in order of birth) recessive
From this pattern, we can deduce a simple monohybrid cross, with the recessive
allele responsible for the exceptional phenotype (indicated in black). Both par-
ents must be heterozygotes—say, A/a; both must have an a allele because each
contributed an a allele to each affected child, and both must have an A allele
because they are phenotypically normal. We can identify the genotypes of the
children (shown left to right) as A/−, a/a, a/a, and A/−. Hence, the pedigree can
be rewritten as follows:
A/a A/a
This pedigree does not support the hypothesis of X-linked recessive in-
heritance, because, under that hypothesis, an affected daughter must have a
heterozygous mother (possible) and a hemizygous father, which is clearly
6 0 C H APTER 2 Single-Gene Inheritance
Early-onset
Parkinson’s disease
(PARK7),
Male infertility (USP9Y), autosomal Ehlers-Danlos syndrome
Y-linked. recessive. type IV (COL3A1),
Defect of sperm cells. Neurodegeneration. autosomal dominant.
Stretchy collagen.
Hemophilia (F8), X-linked recessive.
Inactive blood clotting factor.
Alkaptonuria (HGD),
autosomal recessive. Black urine.
XY 1 2
autosomal recessive.
Metabolic disorder.
22 3
21 Chromosome 4
Pseudoachondroplasia (COMP), 20 pairs 5 Cystic fibrosis (CFTR),
autosomal recessive.
autosomal dominant. A type of dwarfism.
Abnormal chlorine and
19 6 sodium transport;
mucus in the lungs
interferes with breathing.
18 7
17 8
Hereditary hemorrhagic telangiectasia
(MADH4), autosomal dominant. 16 9
Dilation of capillaries causing bleeding.
15 10
14 11
13 12
Werner syndrome
Canavan disease (ASPA), (WRN), autosomal
autosomal recessive. recessive. Premature aging.
Damage to nerve cells
and brain.
Pseudoachondroplasia phenotype
F i g u r e 2 -2 2 The human
pseudoachondroplasia phenotype is
illustrated here by a family of five
sisters and two brothers. The
phenotype is determined by a
dominant allele, which we can call D,
that interferes with the growth of long
bones during development. This
photograph was taken when the family
arrived in Israel after the end of World
War II. [ Bettmann/CORBIS.]
2.6 Human Pedigree Analysis 6 3
Autosomal polymorphisms
The alternative phenotypes of a polymorphism (the morphs) are often inherited
as alleles of a single autosomal gene in the standard Mendelian manner. Among
the many human examples are the following dimorphisms (with two morphs, the
simplest polymorphisms): brown versus blue eyes, pigmented versus blond hair,
ability to smell Freesias (a fragrant type of flower) versus inability, widow’s peak
versus none, sticky versus dry earwax, and attached versus free earlobes. In each
example, the morph determined by the dominant allele is written first.
100
Of all persons carrying the allele,
50
Polydactyly
II
5,5
6,6
III
5,5 5,5 5,5 6,6 6,6
5,5 5,5 6,6 5,5 5,5
IV
6 normal 7 normal
3 afflicted
5,5 5,5 5,5 5,5 5,6
6,6 6,6 6,6 6,6 6,7
V 12 normal
6,6
(a) (b) 6,6
(a)
I
1 2
II 4 3
1– 4 5 6 7 8 9 10 11– 13
III
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
IV
1 2 3 4
(b)
Figure 2-26 Piebald spotting is a rare dominant human phenotype. Although the phenotype is encountered
sporadically in all races, the patterns show up best in those with dark skin. (a) The photographs show front and
back views of affected persons IV-1, IV-3, III-5, III-8, and III-9 from (b) the family pedigree. Notice the variation in
expression of the piebald gene among family members. The patterns are believed to be caused by the dominant
allele interfering with the migration of melanocytes (melanin-producing cells) from the dorsal to the ventral surface
in the course of development. The white forehead blaze is particularly characteristic and is often accompanied
Introduction to Genetic Analysis, 11e
by a white forelock in the hair.
Figure 02.26ab #256
Piebaldism is not a form of albinism; the cells in the light patches have the genetic potential to make melanin,
04/20/14
but, because
Dragonfly theyGroup
Media are not melanocytes, they are not developmentally programmed to do so. In true albinism, the
cells lack the potential to make melanin. (Piebaldism is caused by mutations in c-kit, a type of gene called a
proto-oncogene; see Chapter 16.) [ Photos (a) and data (b) from I. Winship, K. Young, R. Martell, R. Ramesar, D. Curtis,
and P. Beighton, “Piebaldism: An Autonomous Autosomal Dominant Entity,” Clin. Genet. 39, 1991, 330. © Reproduced with
permission of John Wiley & Sons, Inc.]
(a)
(b)
F i g u r e 2 -2 9 A pedigree for the X-linked recessive condition hemophilia in the royal families of
Europe. A recessive allele causing hemophilia (failure of blood clotting) arose through mutation in
the reproductive cells of Queen Victoria or one of her parents. This hemophilia allele spread into
other royal families by intermarriage. (a) This partial pedigree shows affected males and carrier
females (heterozygotes). Most spouses marrying into the families have been omitted from the
pedigree for simplicity. Can you deduce the likelihood of the present British royal family’s harboring
the recessive allele? (b) A painting showing Queen Victoria surrounded by her numerous
descendants. [ (b) © Lebrecht Music and Arts Photo Library/Alamy.]
6 8 C H APTER 2 Single-Gene Inheritance
they develop as females (Figure 2-30). They have female external genitalia, a blind
Testicular feminization
phenotype vagina, and no uterus. Testes may be present either in the labia or in the abdomen.
Although many such persons marry, they are sterile. The condition is not reversed
by treatment with the male hormone androgen, and so it is sometimes called andro-
gen insensitivity syndrome. The reason for the insensitivity is that a mutation in the
androgen-receptor gene causes the receptor to malfunction, and so the male hor-
mone can have no effect on the target organs that contribute to maleness. In
humans, femaleness results when the male-determining system is not functional.
Y-linked inheritance
Only males inherit genes in the differential region of the human Y chromosome,
with fathers transmitting the genes to their sons. The gene that plays a primary role
in maleness is the SRY gene, sometimes called the testis-determining factor. Genomic
analysis has confirmed that, indeed, the SRY gene is in the differential region of the
F i g u r e 2 - 3 0 An XY individual with Y chromosome. Hence, maleness itself is Y linked and shows the expected pattern
testicular feminization syndrome, caused of exclusively male-to-male transmission. Some cases of male sterility have been
by the recessive X-linked allele for
androgen insensitivity. [ © Wellcome Photo
shown to be caused by deletions of Y-chromosome regions containing sperm-pro-
Library/Custom Medical Stock.] moting genes. Male sterility is not heritable, but, interestingly, the fathers of these
men have normal Y chromosomes, showing that the deletions are new.
There have been no convincing cases of nonsexual phenotypic variants associ-
ated with the Y chromosome. Hairy ear rims (Figure 2-32) have been proposed as
a possibility, although disputed. The phenotype is extremely rare among the pop-
ulations of most countries but more common among the populations of India. In
some families, hairy ear rims have been shown to be transmitted exclusively from
fathers to sons.
I
Xa/ Xa XA/ Y
II
F i g u r e 2 - 31 All the daughters of a Xa/ Y XA/ Xa XA/ Xa Xa/ Y Xa/ Y
male expressing an X-linked dominant
phenotype will show the phenotype.
Females heterozygous for an X-linked III
dominant allele will pass the condition on Xa/ Xa XA/ Xa Xa/ Y XA/ Y
to half their sons and daughters.
2.6 Human Pedigree Analysis 6 9
?
The probability of the couple’s first child having Tay-Sachs can be calculated in
the following way. Because neither of the couple has the disease, each can only be an
unaffected homozygote or heterozygote. If both are heterozygotes, then they each
stand a chance of passing the recessive allele on to a child, who would then have Tay- F i g u r e 2 - 3 2 Hairy ear rims have been
Sachs disease. Hence, we must calculate the probability of their both being heterozy- proposed to be caused by an allele of a
Y-linked gene. [ © Mark Collinson/Alamy.]
gotes, and then, if so, the probability of passing the deleterious allele on to a child.
1. The husband’s grandparents must have both been heterozygotes (T/t) because
they produced a t/t child (the uncle). Therefore, they effectively constituted a
monohybrid cross. The husband’s father could be T/T or T/t, but within the 3/4 of
unaffected progeny we know that the relative probabilities of these genotypes must
be 1/4 and 1/2, respectively (the expected progeny ratio in a monohybrid cross is
1 1 1
4 T/T, 2 T/t, 4 t/t). Therefore, there is a 2/3 probability that the father is a heterozy-
gote (two-thirds is the proportion of unaffected progeny who are heterozygotes:
that is the ratio of 2/4 to 3/4).
2. The husband’s mother is assumed to be T/T, because she married into the fam-
ily and disease alleles are generally rare. Thus, if the father is T/t, then the mating
with the mother was a cross T/t × T/T and the expected proportions in the prog-
1 1
eny (which includes the husband) are 2 T/T, 2 T/t.
3. The overall probability of the husband’s being a heterozygote must be calcu-
lated with the use of a statistical rule called the product rule, which states that
6. Overall, the probability of the couple’s having an affected child is the probability
of them both being heterozygous and then both transmitting the recessive allele to
a child. Again, these events are independent, and so we can calculate the overall
probability as 1/3 × 1/3 × 1/4 = 1/36. In other words, there is a 1 in 36 chance of
them having a child with Tay-Sachs disease.
s u m m a ry
In somatic cell division, the genome is transmitted by mito- The main events of
sis, a nuclear division. In this process, each chromosome mitosis aned meiosis
replicates into a pair of chromatids and the chromatids are
pulled apart to produce two identical daughter cells. Mitosis Meiosis
(Mitosis can take place in diploid or haploid cells.) At meio- Pair of homologous
sis, which takes place in the sexual cycle in meiocytes, each chromosomes
homolog replicates to form a dyad of chromatids; then, the
dyads pair to form a tetrad, which segregates at each of the
two cell divisions. The result is four haploid cells, or gam-
etes. Meiosis can take place only in a diploid cell; hence, Chromatid
formation
haploid organisms unite to form a diploid meiocyte.
An easy way to remember the main events of meiosis, by
using your fingers to represent chromosomes, is shown in
Figure 2-33.
Genetic dissection of a biological property begins with a Alignment at equator Pairing at equator
collection of mutants. Each mutant has to be tested to see if it
is inherited as a single-gene change. The procedure followed
is essentially unchanged from the time of Mendel, who per-
formed the prototypic analysis of this type. The analysis is
based on observing specific phenotypic ratios in the progeny
Anaphase Anaphase I
of controlled crosses. In a typical case, a cross of A/A × a /a
produces an F1 that is all A/a. When the F1 is selfed or inter-
crossed, a genotypic ratio of 41 A/A : 21 A/a : 41 a /a is produced
in the F2. (At the phenotypic level, this ratio is 43 A/− : 41 a /a .)
The three single-gene genotypes are homozygous dominant,
heterozygous (monohybrid), and homozygous recessive. If an Anaphase II
A/a individual is crossed with a/a (a testcross), a 1 : 1 ratio is
produced in the progeny. The 1 : 1, 3 : 1, and 1 : 2 : 1 ratios stem
from the principle of equal segregation, which is that the hap-
loid products of meiosis from A/a will be 21 A and 21 a. The cel-
lular basis of the equal segregation of alleles is the segregation
of homologous chromosomes at meiosis. Haploid fungi can be F i g u r e 2 - 3 3 Using fingers to remember the main events of
used to show equal segregation at the level of a single meiosis mitosis and meiosis.
(a 1 : 1 ratio in an ascus).
Solved Problems 71
The molecular basis for chromatid production in meio- pattern that differs in the two sexes, often resulting in dif-
sis is DNA replication. Segregation at meiosis can be ferent ratios in the male and female progeny.
observed directly at the molecular (DNA) level. The molec- Mendelian single-gene segregation is useful in identifying
ular force of segregation is the depolymerization and subse- mutant alleles underlying many human disorders. Analyses
quent shortening of microtubules that are attached to the of pedigrees can reveal autosomal or X-linked disorders of
centromeres. Recessive mutations are generally in genes both dominant and recessive types. The logic of Mendelian
that are haplosufficient, whereas dominant mutations are genetics has to be used with caution, taking into account that
often due to gene haploinsufficiency. human progeny sizes are small and phenotypic ratios are not
In many organisms, sex is determined chromosomally, necessarily typical of those expected from larger sample sizes.
and, typically, XX is female and XY is male. Genes on the If a known single-gene disorder is present in a pedigree,
X chromosome (X-linked genes) have no counterparts on Mendelian logic can be used to predict the likelihood of chil-
the Y chromosome and show a single-gene inheritance dren inheriting the disease.
key terms
allele (p. 37) homogametic sex (p. 54) property (p. 32)
ascus (p. 44) homozygote (p. 38) propositus (p. 58)
bivalent (p. 41) homozygous dominant (p. 38) pseudoautosomal regions 1 and 2
character (p. 34) homozygous recessive (p. 38) (p. 55)
chromatid (p. 40) law of equal segregation (Mendel’s pure line (p. 35)
cross (p. 33) first law) (p. 37) recessive (p. 37)
dimorphism (p. 63) leaky mutation (p. 49) reverse genetics (p. 34)
dioecious species (p. 54) meiocyte (p. 40) second filial generation (F2) (p. 35)
dominant (p. 38) meiosis (p. 40) self (p. 35)
dyad (p. 41) mitosis (p. 40) sex chromosome (p. 54)
first filial generation (F1) (p. 35) monohybrid (p. 38) sex linkage (p. 55)
forward genetics (p. 33) monohybrid cross (p. 38) SRY gene (p. 68)
gene (p. 37) morph (p. 63) testcross (p. 53)
gene discovery (p. 32) mutant (p. 32) tester (p. 53)
genetic dissection (p. 33) mutation (p. 32) tetrad (p. 42)
genotype (p. 38) null allele (p. 49) trait (p. 34)
haploinsufficient (p. 50) parental generation (P) (p. 35) wild type (p. 32)
haplosufficient (p. 50) pedigree analysis (p. 58) X chromosome (p. 54)
hemizygous (p. 55) phenotype (p. 32) X linkage (p. 55)
heterogametic sex (p. 54) polymorphism (p. 32) Y chromosome (p. 54)
heterozygote (p. 38) product of meiosis (p. 42) Y linkage (p. 55)
heterozygous (p. 38) product rule (p. 69) zygote (p. 38)
s olv e d p r obl e m s
This section in each chapter contains a few solved problems problem. Most of the problems use data taken from research
that show how to approach the problem sets that follow. The that somebody actually carried out: ask yourself why the re-
purpose of the problem sets is to challenge your understand- search might have been initiated and what was the probable
ing of the genetic principles learned in the chapter. The best goal. Find out exactly what facts are provided, what as-
way to demonstrate an understanding of a subject is to be sumptions have to be made, what clues are given in the
able to use that knowledge in a real or simulated situation. Be problem, and what inferences can be made from the avail-
forewarned that there is no machine-like way of solving these able information. Second, be methodical. Staring at the
problems. The three main resources at your disposal are the problem rarely helps. Restate the information in the prob-
genetic principles just learned, logic, and trial and error. lem in your own way, preferably using a diagrammatic rep-
Here is some general advice before beginning. First, it is resentation or flowchart to help you think out the problem.
absolutely essential to read and understand all of the Good luck.
72 C H APTER 2 Single-Gene Inheritance
SOLVED PROBLEM 1. Crosses were made between two pure mula is that genes can act differently in different environ-
lines of rabbits that we can call A and B. A male from line A ments; so
was mated with a female from line B, and the F1 rabbits were genotype 1 + environment 1 = phenotype 1
subsequently intercrossed to produce an F2. Three-fourths of
the F2 animals were discovered to have white subcutaneous and
fat, and one-fourth had yellow subcutaneous fat. Later, the F1 genotype 1 + environment 2 = phenotype 2
was examined and was found to have white fat. Several years
later, an attempt was made to repeat the experiment by using In the present problem, the different diets constituted
the same male from line A and the same female from line B. different environments, and so a possible explanation of the
This time, the F1 and all the F2 (22 animals) had white fat. results is that the homozygous recessive w/w produces yellow
The only difference between the original experiment and the fat only when the diet contains fresh vegetables. This expla-
repeat that seemed relevant was that, in the original, all the nation is testable. One way to test it is to repeat the experi-
animals were fed fresh vegetables, whereas in the repeat, ment again and use vegetables as food, but the parents might
they were fed commercial rabbit chow. Provide an explana- be dead by this time. A more convincing way is to breed sev-
tion for the difference and a test of your idea. eral of the white-fatted F2 rabbits from the second experi-
ment. According to the original interpretation, some of them
Solution should be heterozygous, and, if their progeny are raised on
The first time that the experiment was done, the breeders vegetables, yellow fat should appear in Mendelian propor-
would have been perfectly justified in proposing that a pair tions. For example, if a cross happened to be W/w and w/w,
of alleles determine white versus yellow body fat because the progeny would be 21 white fat and 21 yellow fat.
the data clearly resemble Mendel’s results in peas. White If this outcome did not happen and no progeny having
must be dominant, and so we can represent the white allele yellow fat appeared in any of the matings, we would be
as W and the yellow allele as w. The results can then be forced back to the first or second explanation. The second
expressed as follows: explanation can be tested by using larger numbers, and if
this explanation doesn’t work, we are left with the first ex-
P W/W × w/w
planation, which is difficult to test directly.
F1 W/w As you might have guessed, in reality, the diet was the
F2 1
W/W culprit. The specific details illustrate environmental effects
4
beautifully. Fresh vegetables contain yellow substances called
1
2 W/w xanthophylls, and the dominant allele W gives rabbits the
1
4 w/w ability to break down these substances to a colorless (“white”)
form. However, w/w animals lack this ability, and the xantho-
No doubt, if the parental rabbits had been sacrificed, one phylls are deposited in the fat, making it yellow. When no
parent (we cannot tell which) would have been predicted to xanthophylls have been ingested, both W/− and w/w animals
have white fat and the other yellow. Luckily, the rabbits were end up with white fat.
not sacrificed, and the same animals were bred again, leading
to a very interesting, different result. Often in science, an un- SOLVED PROBLEM 2. Phenylketonuria (PKU) is a human
expected observation can lead to a novel principle, and, rath- hereditary disease resulting from the inability of the body
er than moving on to something else, it is useful to try to ex- to process the chemical phenylalanine, which is contained
plain the inconsistency. So why did the 3 : 1 ratio disappear? in the protein that we eat. PKU is manifested in early
Here are some possible explanations. infancy and, if it remains untreated, generally leads to men-
First, perhaps the genotypes of the parental animals had tal retardation. PKU is caused by a recessive allele with
changed. This type of spontaneous change affecting the simple Mendelian inheritance.
whole animal, or at least its gonads, is very unlikely, because A couple intends to have children but consult a genetic
even common experience tells us that organisms tend to be counselor because the man has a sister with PKU and the
stable to their type. woman has a brother with PKU. There are no other known
Second, in the repeat, the sample of 22 F2 animals did cases in their families. They ask the genetic counselor to de-
not contain any yellow fat simply by chance (“bad luck”). termine the probability that their first child will have PKU.
This explanation, again, seems unlikely, because the sample What is this probability?
was quite large, but it is a definite possibility.
A third explanation draws on the principle that genes do Solution
not act in a vacuum; they depend on the environment for What can we deduce? If we let the allele causing the PKU
their effects. Hence, the formula “genotype + environment phenotype be p and the respective normal allele be P, then
= phenotype” is a useful mnemonic. A corollary of this for- the sister and brother of the man and woman, respectively,
Solved Problems 73
must have been p/p. To produce these affected persons, all Solution
four grandparents must have been heterozygous normal. a. The most likely mode of inheritance is X-linked domi-
The pedigree can be summarized as follows: nant. We assume that the disease phenotype is dominant
because, after it has been introduced into the pedigree by
P/p P/p P/p P/p the male in generation II, it appears in every generation. We
assume that the phenotype is X linked because fathers do
... ... not transmit it to their sons. If it were autosomal dominant,
father-to-son transmission would be common.
p/p P/ P/ p/p In theory, autosomal recessive could work, but it is im-
? probable. In particular, note the marriages between affected
members of the family and unaffected outsiders. If the condi-
When these inferences have been made, the problem is tion were autosomal recessive, the only way in which these
reduced to an application of the product rule. The only way marriages could have affected offspring is if each person mar-
in which the man and woman can have a PKU child is if both rying into the family were a heterozygote; then the matings
of them are heterozygotes (it is obvious that they themselves would be a/a (affected) × A/a (unaffected). However, we are
do not have the disease). Both the grandparental matings are told that the disease is rare; in such a case, heterozygotes are
simple Mendelian monohybrid crosses expected to produce highly unlikely to be so common. X-linked recessive inheri-
progeny in the following proportions: tance is impossible, because a mating of an affected woman
with a normal man could not produce affected daughters. So
1
4 P/P
we can let A represent the disease-causing allele and a repre-
1 Normal (34) sent the normal allele.
2 P/p
1
p/p PKU(14) b. 1 × 9: Number 1 must be heterozygous A/a because she
4
must have obtained a from her normal mother. Number 9
We know that the man and the woman are normal, and must be A/Y. Hence, the cross is A/a × A/Y .
so the probability of each being a heterozygote is 2/3 be-
cause, within the P/− class, 2/3 are P/p and 1/3 are P/P. Female Male
The probability of both the man and the woman being gametes gametes Progeny
heterozygotes is 2/3 × 2/3 = 4/9. If both are heterozygous, 1
A
1
A/A
2 4
then one-quarter of their children would have PKU, and so 1
2 A 1 1
the probability that their first child will have PKU is 1/4 and 2 Y 4
A/Y
the probability of their being heterozygous and of their first 1 1
A A/a
child’s having PKU is 4/9 × 1/4 = 4/36 = 1/9, which is the 1
a
2 4
2
answer. 1
2 Y 1
a/Y
4
Female Male
gametes gametes Progeny
1 1
2
a 4
A/a
1
2 A
1 1
2 Y 4 A/Y
1 1
2
a 4
a/a
1
2 a
1 1
2 Y 4 a/Y
1 2 3 4 5 6 7 8 9 10
2 × 3: Must be a/Y × A/a (same as 1 × 4).
2 × 8: Must be a/Y × a/a (all progeny normal).
a. Deduce the most likely mode of inheritance.
b. What would be the outcomes of the cousin marriages
1 × 9, 1 × 4, 2 × 3, and 2 × 8 ?
74 C H APTER 2 Single-Gene Inheritance
p r obl e m s
Most of the problems are also available for review/grading through the http://www.whfreeman.com/
launchpad/iga 11e.
26. Francis Galton, a geneticist of the pre-Mendelian era, de- 36. In a fungus with four ascospores, a mutant allele lys-5
vised the principle that half of our genetic makeup is de- causes the ascospores bearing that allele to be white,
rived from each parent, one-quarter from each grand- whereas the wild-type allele lys-5+ results in black asco-
parent, one-eighth from each great-grandparent, and so spores. (Ascospores are the spores that constitute the
forth. Was he right? Explain. four products of meiosis.) Draw an ascus from each of
the following crosses:
27. If children obtain half their genes from one parent and
half from the other parent, why aren’t siblings identical? a. lys-5 × lys-5+
28. State where cells divide mitotically and where they di- b. lys-5 × lys-5
vide meiotically in a fern, a moss, a flowering plant, a
pine tree, a mushroom, a frog, a butterfly, and a snail. c. lys-5+ × lys-5+
29. Human cells normally have 46 chromosomes. For each 37. For a certain gene in a diploid organism, eight units of
of the following stages, state the number of nuclear DNA protein product are needed for normal function. Each
molecules present in a human cell: wild-type allele produces five units.
a. Metaphase of mitosis a. If a mutation creates a null allele, do you think this
b. Metaphase I of meiosis allele will be recessive or dominant?
c. Telophase of mitosis b. What assumptions need to be made to answer part a?
d. Telophase I of meiosis 38. A Neurospora colony at the edge of a plate seemed to be
e. Telophase II of meiosis sparse (low density) in comparison with the other colo-
nies on the plate. This colony was thought to be a possi-
30. Four of the following events are part of both meiosis and
ble mutant, and so it was removed and crossed with a
mitosis, but only one is meiotic. Which one? (1) Chromatid
formation, (2) spindle formation, (3) chromosome con- wild type of the opposite mating type. From this cross,
densation, (4) chromosome movement to poles, (5) 100 ascospore progeny were obtained. None of the colo-
synapsis. nies from these ascospores was sparse, all appearing to
be normal. What is the simplest explanation of this re-
31. In corn, the allele f ′ causes floury endosperm and the sult? How would you test your explanation? (Note:
allele f ″ causes flinty endosperm. In the cross f ′/f ′ × Neurospora is haploid.)
f ″/f ″ , all the progeny endosperms are floury, but, in
the reciprocal cross, all the progeny endosperms are 39. From a large-scale screen of many plants of Collinsia
flinty. What is a possible explanation? (Check the legend grandiflora, a plant with three cotyledons was discov-
for Figure 2-7.) ered (normally, there are two cotyledons). This plant
was crossed with a normal pure-breeding wild-type
32. What is Mendel’s first law? plant, and 600 seeds from this cross were planted.
33. If you had a fruit fly (Drosophila melanogaster) that was There were 298 plants with two cotyledons and 302
of phenotype A, what cross would you make to deter- with three cotyledons. What can be deduced about the
mine if the fly’s genotype was A/A or A/a? inheritance of three cotyledons? Invent gene symbols
as part of your explanation.
34. In examining a large sample of yeast colonies on a petri
dish, a geneticist finds an abnormal-looking colony 40. In the plant Arabidopsis thaliana, a geneticist is interest-
that is very small. This small colony was crossed with ed in the development of trichomes (small projections).
wild type, and products of meiosis (ascospores) were A large screen turns up two mutant plants (A and B) that
spread on a plate to produce colonies. In total, there have no trichomes, and these mutants seem to be poten-
were 188 wild-type (normal-size) colonies and 180 tially useful in studying trichome development. (If they
small ones. were determined by single-gene mutations, then finding
a. What can be deduced from these results regarding the normal and abnormal functions of these genes would
the inheritance of the small-colony phenotype? (Invent be instructive.) Each plant is crossed with wild type; in
genetic symbols.) both cases, the next generation (F1) had normal tri-
chomes. When F1 plants were selfed, the resulting F2’s
b. What would an ascus from this cross look like?
were as follows:
35. Two black guinea pigs were mated and over several years
F2 from mutant A: 602 normal; 198 no trichomes
produced 29 black and 9 white offspring. Explain these
results, giving the genotypes of parents and progeny. F2 from mutant B: 267 normal; 93 no trichomes
76 C H APTER 2 Single-Gene Inheritance
a. What do these results show? Include proposed geno- 3. Can parts of the problem be restated by using branch
types of all plants in your answer. diagrams?
b. Under your explanation to part a, is it possible to 4. In the pedigree, identify a mating that illustrates
confidently predict the F1 from crossing the original Mendel’s first law.
mutant A with the original mutant B? 5. Define all the scientific terms in the problem, and look
41. You have three dice: one red (R), one green (G), and one up any other terms about which you are uncertain.
blue (B). When all three dice are rolled at the same time, 6. What assumptions need to be made in answering this
calculate the probability of the following outcomes: problem?
a. 6 (R), 6 (G), 6 (B) 7. Which unmentioned family members must be consid-
b. 6 (R), 5 (G), 6 (B) ered? Why?
8. What statistical rules might be relevant, and in what
c. 6 (R), 5 (G), 4 (B)
situations can they be applied? Do such situations exist
d. No sixes at all in this problem?
e. A different number on all dice 9. What are two generalities about autosomal recessive
diseases in human populations?
42. In the pedigree below, the black symbols represent indi-
10. What is the relevance of the rareness of the phenotype
viduals with a very rare blood disease.
under study in pedigree analysis generally, and what
can be inferred in this problem?
11. In this family, whose genotypes are certain and whose
are uncertain?
12. In what way is John’s side of the pedigree different
If you had no other information to go on, would you from Martha’s side? How does this difference affect
think it more likely that the disease was dominant or your calculations?
recessive? Give your reasons.
13. Is there any irrelevant information in the problem as
43. a. The ability to taste the chemical phenylthiocarbamide stated?
is an autosomal dominant phenotype, and the inability 14. In what way is solving this kind of problem similar to
to taste it is recessive. If a taster woman with a nontaster solving problems that you have already successfully
father marries a taster man who in a previous marriage solved? In what way is it different?
had a nontaster daughter, what is the probability that
15. Can you make up a short story based on the human di-
their first child will be
lemma in this problem?
(1) A nontaster girl
(2) A taster girl Now try to solve the problem. If you are unable to do so, try
to identify the obstacle and write a sentence or two describing
(3) A taster boy your difficulty. Then go back to the expansion questions and see
b. What is the probability that their first two children if any of them relate to your difficulty.
will be tasters of either sex?
45. Holstein cattle are normally black and white. A superb
www black-and-white bull, Charlie, was purchased by a farmer
Unpacking the Problem 44 for $100,000. All the progeny sired by Charlie were nor-
www
John and Martha are contemplating having children, but mal in appearance. However, certain pairs of his progeny,
John’s brother has galactosemia (an autosomal recessive when interbred, produced red-and-white progeny at a
disease) and Martha’s great-grandmother also had galacto- frequency of about 25 percent. Charlie was soon removed
semia. Martha has a sister who has three children, none of from the stud lists of the Holstein breeders. Use symbols
whom have galactosemia. What is the probability that John to explain precisely why.
and Martha’s first child will have galactosemia?
46. Suppose that a husband and wife are both heterozy-
1. Can the problem be restated as a pedigree? If so, write gous for a recessive allele for albinism. If they have di-
one. zygotic (two-egg) twins, what is the probability that
2. Can parts of the problem be restated by using Punnett both the twins will have the same phenotype for
squares? pigmentation?
Problems 77
47. The plant blue-eyed Mary grows on Vancouver Island Plants were collected from nature before flowering
and on the lower mainland of British Columbia. The and were crossed or selfed with the following results:
populations are dimorphic for purple blotches on the
leaves—some plants have blotches and others don’t. Near Number of progeny
Nanaimo, one plant in nature had blotched leaves. This
Pollination Winged Wingless
plant, which had not yet flowered, was dug up and taken
to a laboratory, where it was allowed to self. Seeds were Winged (selfed) 91 1*
collected and grown into progeny. One randomly select-
Winged (selfed) 90 30
ed (but typical) leaf from each of the progeny is shown in
the accompanying illustration. Wingless (selfed) 4* 80
Winged × wingless 161 0
Winged × wingless 29 31
Winged × wingless 46 0
Winged × winged 44 0
*Phenotype probably has a nongenetic explanation.
I
1 2
II
a. Formulate a concise genetic hypothesis to explain 1 2 3 4
these results. Explain all symbols and show all genotypic
classes (and the genotype of the original plant).
b. How would you test your hypothesis? Be specific. III
1 2 3 4 5 6 7 8
48. Can it ever be proved that an animal is not a carrier of a
recessive allele (that is, not a heterozygote for a given
gene)? Explain. IV
49. In nature, the plant Plectritis congesta is dimorphic for 1 2 3 4 5 6 7 8 9
fruit shape; that is, individual plants bear either wingless
or winged fruits, as shown in the illustration.
a. How is the disorder inherited? State reasons for your
answer.
b. Give genotypes for as many individuals in the
pedigree as possible. (Invent your own defined allele
symbols.)
c. Consider the four unaffected children of parents
III-4 and III-5. In all four-child progenies from parents
of these genotypes, what proportion is expected to
Wingless fruit Winged fruit
contain all unaffected children?
78 C H APTER 2 Single-Gene Inheritance
51. Four human pedigrees are shown in the accompanying 53. The pedigree below was obtained for a rare kidney
illustration. The black symbols represent an abnormal disease.
phenotype inherited in a simple Mendelian manner.
1 2
2
II
1 2 3 4 5 6 7
55. Consider the accompanying pedigree of a rare autosomal mal to small-winged flies can be expected in each sex in
recessive disease, PKU. the F1? If F1 flies are intercrossed, what F2 progeny ratios
are expected? What progeny ratios are predicted if F1
I
females are backcrossed with their father?
60. An X-linked dominant allele causes hypophosphatemia
II in humans. A man with hypophosphatemia marries a
normal woman. What proportion of their sons will have
hypophosphatemia?
III
61. Duchenne muscular dystrophy is sex linked and usually
affects only males. Victims of the disease become pro-
IV
A B
gressively weaker, starting early in life.
a. What is the probability that a woman whose brother
a. List the genotypes of as many of the family members has Duchenne’s disease will have an affected child?
as possible. b. If your mother’s brother (your uncle) had Duchenne’s
b. If persons A and B marry, what is the probability that disease, what is the probability that you have received
their first child will have PKU? the allele?
c. If their first child is normal, what is the probability c. If your father’s brother had the disease, what is the
that their second child will have PKU? probability that you have received the allele?
d. If their first child has the disease, what is the 62. A recently married man and woman discover that each
probability that their second child will be unaffected? had an uncle with alkaptonuria (black urine disease), a
rare disease caused by an autosomal recessive allele of a
(Assume that all people marrying into the pedigree lack
single gene. They are about to have their first baby. What
the abnormal allele.)
is the probability that their child will have alkaptonuria?
56. A man has attached earlobes, whereas his wife has free 63. The accompanying pedigree concerns a rare inherited
earlobes. Their first child, a boy, has attached earlobes. dental abnormality, amelogenesis imperfecta.
a. If the phenotypic difference is assumed to be due to
two alleles of a single gene, is it possible that the gene
is X linked?
b. Is it possible to decide if attached earlobes are
dominant or recessive?
a. What mode of inheritance best accounts for the
57. A rare recessive allele inherited in a Mendelian manner
transmission of this trait?
causes the disease cystic fibrosis. A phenotypically nor-
mal man whose father had cystic fibrosis marries a phe- b. Write the genotypes of all family members according
notypically normal woman from outside the family, and to your hypothesis.
the couple consider having a child. 64. A couple who are about to get married learn from study-
a. Draw the pedigree as far as described. ing their family histories that, in both their families, their
b. If the frequency in the population of heterozygotes unaffected grandparents had siblings with cystic fibrosis
for cystic fibrosis is 1 in 50, what is the chance that the (a rare autosomal recessive disease).
couple’s first child will have cystic fibrosis? a. If the couple marries and has a child, what is the
c. If the first child does have cystic fibrosis, what is the probability that the child will have cystic fibrosis?
probability that the second child will be normal? b. If they have four children, what is the chance that the
children will have the precise Mendelian ratio of 3 : 1 for
58. The allele c causes albinism in mice (C causes mice to be
normal : cystic fibrosis?
black). The cross C/c × c/c produces 10 progeny. What is
the probability of all of them being black? c. If their first child has cystic fibrosis, what is the
probability that their next three children will be normal?
59. The recessive allele s causes Drosophila to have small
wings, and the s+ allele causes normal wings. This gene is 65. A sex-linked recessive allele c produces a red–green color
known to be X linked. If a small-winged male is crossed blindness in humans. A normal woman whose father was
with a homozygous wild-type female, what ratio of nor- color blind marries a color-blind man.
8 0 C H APTER 2 Single-Gene Inheritance
a. What genotypes are possible for the mother of the 69. A plant geneticist has two pure lines, one with purple
color-blind man? petals and one with blue. She hypothesizes that the
b. What are the chances that the first child from this phenotypic difference is due to two alleles of one gene.
marriage will be a color-blind boy? To test this idea, she aims to look for a 3 : 1 ratio in the
F2. She crosses the lines and finds that all the F1 proge-
c. Of the girls produced by these parents, what pro-
ny are purple. The F1 plants are selfed, and 400 F2
portion can be expected to be color blind?
plants are obtained. Of these F2 plants, 320 are purple
d. Of all the children (sex unspecified) of these parents, and 80 are blue. Do these results fit her hypothesis
what proportion can be expected to have normal color well? If not, suggest why.
vision?
www
66. Male house cats are either black or orange; females are Unpacking the Problem 70
black, orange, or calico. www
A man’s grandfather has galactosemia, a rare autosomal
a. If these coat-color phenotypes are governed by a sex- recessive disease caused by the inability to process galactose,
linked gene, how can these observations be explained? leading to muscle, nerve, and kidney malfunction. The man
b. Using appropriate symbols, determine the pheno- married a woman whose sister had galactosemia. The woman
types expected in the progeny of a cross between an is now pregnant with their first child.
orange female and a black male. a. Draw the pedigree as described.
c. Half the females produced by a certain kind of mating b. What is the probability that this child will have
are calico, and half are black; half the males are orange, galactosemia?
and half are black. What colors are the parental males
and females in this kind of mating? c. If the first child does have galactosemia, what is the
probability that a second child will have it?
d. Another kind of mating produces progeny in the
following proportions: one-fourth orange males, one-
Ch a ll e n g i n g P r obl e m s
fourth orange females, one-fourth black males, and one-
fourth calico females. What colors are the parental males 71. A geneticist working on peas has a single plant monohy-
and females in this kind of mating? brid Y/ y (yellow) plant and, from a self of this plant,
wants to produce a plant of genotype y/y to use as a tes-
67. The pedigree below concerns a certain rare disease that is ter. How many progeny plants need to be grown to be 95
incapacitating but not fatal. percent sure of obtaining at least one in the sample?
72. A curious polymorphism in human populations has to
do with the ability to curl up the sides of the tongue to
make a trough (“tongue rolling”). Some people can do
this trick, and others simply cannot. Hence, it is an ex-
ample of a dimorphism. Its significance is a complete
mystery. In one family, a boy was unable to roll his
? ? ? tongue but, to his great chagrin, his sister could.
Furthermore, both his parents were rollers, and so were
a. Determine the most likely mode of inheritance of this both grandfathers, one paternal uncle, and one paternal
disease. aunt. One paternal aunt, one paternal uncle, and one
b. Write the genotype of each family member according maternal uncle could not roll their tongues.
to your proposed mode of inheritance. a. Draw the pedigree for this family, defining your
c. If you were this family’s doctor, how would you advise symbols clearly, and deduce the genotypes of as many
the three couples in the third generation about the individual members as possible.
likelihood of having an affected child?
b. The pedigree that you drew is typical of the inheritance
68. In corn, the allele s causes sugary endosperm, whereas S of tongue rolling and led geneticists to come up with the
causes starchy. What endosperm genotypes result from inheritance mechanism that no doubt you came up with.
each of the following crosses? However, in a study of 33 pairs of identical twins, both
a. s/s female × S/S male members of 18 pairs could roll, neither member of 8 pairs
could roll, and one of the twins in 7 pairs could roll but
b. S/S female × s/s male
the other could not. Because identical twins are derived
c. S/s female × S/s male from the splitting of one fertilized egg into two embryos,
Problems 81
I ?
II
III
IV
II
III
IV
VI
Red hair Red beard and body hair
the members of a pair must be genetically identical. How ratios in each of the three types of mating be accounted
can the existence of the seven discordant pairs be for?
reconciled with your genetic explanation of the pedigree? 75. A condition known as icthyosis hystrix gravior appeared
73. Red hair runs in families, as the pedigree above shows. in a boy in the early eighteenth century. His skin became
very thick and formed loose spines that were sloughed
(Pedigree data from W. R. Singleton and B. Ellis, Journal off at intervals. When he grew up, this “porcupine man”
of Heredity 55, 1964, 261.) married and had six sons, all of whom had this condition,
a. Does the inheritance pattern in this pedigree and several daughters, all of whom were normal. For
suggest that red hair could be caused by a dominant or four generations, this condition was passed from father
a recessive allele of a gene that is inherited in a simple to son. From this evidence, what can you postulate about
Mendelian manner? the location of the gene?
b. Do you think that the red-hair allele is common or
76. The wild-type (W) Abraxas moth has large spots on its
rare in the population as a whole?
wings, but the lacticolor (L) form of this species has very
74. When many families were tested for the ability to taste small spots. Crosses were made between strains differing
the chemical phenylthiocarbamide, the matings were in this character, with the following results:
grouped into three types and the progeny were totaled,
Parents Progeny
with the results shown below:
Children Cross F1 F2
Number Non- 1 L W W 1
L, 12 W
2
Parents of families Tasters tasters W W
1
Taster × taster 425 929 130 2 W L L 2 W, 12 L
1
Taster × nontaster 289 483 278 W 2 W, 12 L
Nontaster × nontaster 86 5 218
Provide a clear genetic explanation of the results in
With the assumption that PTC tasting is dominant (P ) these two crosses, showing the genotypes of all individual
and nontasting is recessive ( p), how can the progeny moths.
82 C H APTER 2 Single-Gene Inheritance
?
shown, but the fathers in each mating have been omitted
How would you advise the parents about the probability
to draw attention to the remarkable pattern.
of their child being a deaf boy, a deaf girl, a normal boy, or
a normal girl? Be sure to state any assumptions that you a. Concisely state exactly what is unusual about this
make. pedigree.
79. The accompanying pedigree shows a very unusual inheri- b. Can the pattern be explained by Mendelian
tance pattern that actually did exist. All progeny are inheritance?
Appendix 2-1 Stages of Mitosis 8 3
2 Early mitotic
6 Mitotic telophase
prophase
The photographs show mitosis in the nuclei of root-tip cells of Lilium regale. [ J. McLeish and B. Snoad,
Looking at Chromosomes. Copyright 1958, St. Martin’s, Macmillan.]
8 4 C H APTER 2 Single-Gene Inheritance
16 Young pollen grains The tetrad and young pollen grains: In the anthers of a flower, the four products of
meiosis develop into pollen grains. In other organisms, the products of meiosis
differentiate into other kinds of structures, such as sperm cells in animals.
15 The tetrad
Cells divide
4 Diplotene
4 Diplotene 5 Diakinesis
5 Diakinesis
Prophase I: Diplotene.
Prophase Although
I: Diplotene. the DNAthe
Although has
DNA has Prophase I: Diakinesis.
Prophase Further Further
I: Diakinesis. chromosome
chromosome
alreadyalready
replicated during the
replicated premeiotic
during S phase,
the premeiotic S phase, contraction produces
contraction compactcompact
produces units that arethat
units veryare very
this factthis
firstfact
becomes manifestmanifest
first becomes only in diplotene
only in diplotene maneuverable.
maneuverable.
as eachas chromosome
each chromosome is seen is
toseen
havetobecome a
have become a
pair of sister
pair ofchromatids. The synapsed
sister chromatids. structure
The synapsed structure
now consists of a bundle
now consists of a of four homologous
bundle of four homologous
chromosomes.
chromosomes.The paired The homologs separate
paired homologs separate
slightly,slightly,
and oneand or more
one orcross-shaped structures
more cross-shaped structures
called chiasmata
called chiasmata(singular, chiasma)
(singular, appear appear
chiasma)
betweenbetween
nonsister chromatids.
nonsister chromatids.
Metaphase I: The nuclear
Metaphase I: The nuclear
membranemembrane
has has
disappeared,
disappeared,
and each andpair
each
of homologs
pair of homologs
takes takes
up a position
up a position
in the equatorial
in the equatorial
plane. Atplane.
this stage
At thisofstage of
meiosis,meiosis,
the centromeres
the centromeres
do not divide;
do not this
divide;
lackthis
of lack of
divisiondivision
is a majoris adifference
major difference
from mitosis.
from mitosis.
The twoThe two
centromeres
centromeres
of a homologous
of a homologous
chromosome
chromosome
pair pair
attach to
attach
spindle
to spindle
fibers from
fibers
opposite
from opposite
poles. poles. 6 Metaphase
6 Metaphase I I
Cell divides
Cell divides
10 Interphase
10 Interphase 9 Telophase
9 Telophase I I
This page intentionally left blank
344
3
C h a p t e r
Independent
Assortment of Genes
Learning Outcomes
After completing this chapter, you will be able to
T
his chapter is about the principles at work when two or more cases of sin-
gle-gene inheritance are analyzed simultaneously. Nowhere have these
principles been more important than in plant and animal breeding in agri-
culture. For example, between the years 1960 and 2000, the world production of
food plants doubled, marking a so-called Green Revolution. What made this
Green Revolution possible? In part, it was due to improved agricultural practice,
but more important was the development of superior crop genotypes by plant
geneticists. These breeders are constantly on the lookout for the chance occur-
rence of single-gene mutations that significantly increase yield or nutrient value.
However, such mutations arise in different lines in different parts of the world.
For example, in rice, one of the world’s main food crops, the following mutations
have been crucial in the Green Revolution:
sd1. This recessive allele results in short stature, making the plant more resis-
tant to “lodging,” or falling over, in wind and rain; it also increases the relative
amount of the plant’s energy that is routed into the seed, the part that we eat.
se1. This recessive allele alters the plant’s requirement for a specific daylength,
enabling it to be grown at different latitudes.
Xa4. This dominant allele confers resistance to the disease bacterial blight.
bph2. This allele confers resistance to brown plant hoppers (a type of insect).
Snb1. This allele confers tolerance to plant submersion after heavy rains.
To make a truly superior genotype, combining such alleles into one line is clearly
desirable. To achieve such a combination, mutant lines must be intercrossed two at
a time. For instance, a plant geneticist might start by crossing a strain homozygous
for sd1 to another homozygous for Xa4. The F1 progeny of this cross would carry
both mutations but in a heterozygous state. However, most agriculture uses pure
lines, because they can be efficiently propagated and distributed to farmers. To
obtain a pure-breeding doubly mutant sd1/sd1·Xa4/Xa4 line, the F1 would have to
be bred further to allow the alleles to “assort” into the desirable combination. Some
products of such breeding are shown in Figure 3-1. What principles are relevant
here? It depends very much on whether the two genes are on the same chromo-
some pair or on different chromosome pairs. In the latter case, the chromosome
pairs act independently at meiosis, and the
alleles of two heterozygous gene pairs are said
Rice lines
to show independent assortment.
This chapter explains how we can recognize
independent assortment and how the principle
of independent assortment can be used in strain
construction, both in agriculture and in basic
genetic research. (Chapter 4 covers the analo-
gous principles applicable to heterozygous gene
pairs on the same chromosome pair.)
We shall also see that independent assort-
ment of an array of genes is also useful in
providing a basic heritable mechanism for con-
tinuous phenotypes. These are properties such
as height or weight that do not fall into distinct
categories but are nevertheless often heavily
influenced by multiple genes collectively called
“polygenes.” We shall examine the role of inde-
F i g u r e 3 -1 Superior genotypes of crops such as rice have revolutionized pendent assortment in the inheritance of con-
agriculture. This photograph shows some of the key genotypes used in rice tinuous phenotypes influenced by such poly-
breeding programs. [ Bloomberg/Getty Images.]
3.1 Mendel’s Law of Independent Assortment 8 9
genes. We will see that independent assortment of polygenes can produce a con-
tinuous phenotypic distribution among progeny.
Lastly, we will introduce a different type of independent inheritance, that of
genes in the organelles mitochondria and chloroplasts. Unlike nuclear chromo-
somes, these genes are inherited cytoplasmically and result in different patterns
of inheritance than observed for nuclear genes and chromosomes. This pattern is
independent of genes showing nuclear inheritance.
First, we examine the analytical procedures that pertain to the independent
assortment of nuclear genes. These were first developed by the father of genetics,
Gregor Mendel. So, again, we turn to his work as a prototypic example.
The probabilities of the four possible outcomes are calculated by using the
product rule (the probability of two independent events occurring together is the
product of their individual probabilities). Hence, we multiply along the branches
in the diagram. For example, 3/4 of all seeds will be round, and 3/4 of the round
seeds will be yellow, so the probability of a seed being both round and yellow is
calculated as 3/4 × 3/4, which equals 9/16. These multiplications give the follow-
ing four proportions:
3 3 9
4
× 4
= 16
round, yellow
3 1 3
4
× 4
= 16
round, green
1 3 3
4
× 4
= 16
wrinkled, yellow
1 1 1
4
× 4
= 16 wrinkled, green
These proportions constitute the 9 : 3 : 3 : 1 ratio that we are trying to explain.
However, is this exercise not merely number juggling? What could the combina-
tion of the two 3 : 1 ratios mean biologically? The way that Mendel phrased his
explanation does in fact amount to a biological mechanism. In what is now known
as the law of independent assortment (Mendel’s second law), he concluded that
different gene pairs assort independently during gamete formation. The consequence
is that, for two heterozygous gene pairs A/a and B/b, the b allele is just as likely to
end up in a gamete with an a allele as with an A allele, and likewise for the B allele.
3.1 Mendel’s Law of Independent Assortment 91
In hindsight, we now know that, for the most part, this “law” applies to genes on
different chromosomes. Genes on the same chromosome generally do not assort
independently because they are held together by the chromosome itself.
Mendel’s original statement of this law was that different genes assort inde-
pendently because he apparently did not encounter (or he ignored) any excep-
tions that might have led to the concept of linkage.
We have explained the 9 : 3 : 3 : 1 phenotypic ratio as two randomly combined
3 : 1 phenotypic ratios. But can we also arrive at the 9 : 3 : 3 : 1 ratio from a consid-
eration of the frequency of gametes, the actual meiotic products? Let us consider
the gametes produced by the F1 dihybrid R/r ; Y/y (the semicolon shows that we
are now embracing the idea that the genes are on different chromosomes). Again,
we will use the branch diagram to get us started because it visually illustrates
independence. Combining Mendel’s laws of equal segregation and independent
assortment, we can predict that
1
2
of these R gametes will be Y
1
2 of the gametes will be R
1
2 will be y
1
2 of these r gametes will be Y
1
2 of the gametes will be r
1
2 will be y
Multiplication along the branches according to the product rule gives us the
gamete proportions:
1
4
R ; Y
1
4
R ; y
1
4
r ; Y
1
4
r ; y
These proportions are a direct result of the application of the two Mendelian
laws: of segregation and of independence. However, we still have not arrived at the
9 : 3 : 3 : 1 ratio. The next step is to recognize that, because male and female gametes
obey the same laws during formation, both the male and the female gametes will
show the same proportions just given. The four female gametic types will be fertil-
ized randomly by the four male gametic types to obtain the F2. The best graphic way
of showing the outcomes of the cross is by using a 4 × 4 grid called a Punnett square,
which is depicted in Figure 3-4. We have already seen that grids are useful in genet-
ics for providing a visual representation of the data. Their usefulness lies in the fact
that their proportions can be drawn according to the genetic proportions or ratios
under consideration. In the Punnett square in Figure 3-4, for example, four rows
and four columns were drawn to correspond to the four genotypes of female gam-
etes and the four of male gametes. We see that there are 16 boxes representing the
various gametic fusions and that each box is 1/16th of the total area of the grid. In
accord with the product rule, each 1/16th is a result of the fertilization of one egg
type at frequency 1/4 by one sperm type also at frequency 1/4, giving the probabil-
ity of that fusion as (1/4)2. As the Punnett square shows, the F2 contains a variety of
genotypes, but there are only four phenotypes and their proportions are in the
9 : 3 : 3 : 1 ratio. So we see that, when we calculate progeny frequencies directly
92 CHA P TER 3 Independent Assortment of Genes
made, and these tests and other types of tests all showed
R /r ; Y/ y R /r ; y/y r /r ; y/y r/r ; Y/y
r ;y 1 1 1 1 that he had, in fact, devised a robust model to explain the
16 16 16 16
1
4
inheritance patterns observed in his various pea crosses.
In the early 1900s, both of Mendel’s laws were tested in
a wide spectrum of eukaryotic organisms. The results of
R /r ; Y/ Y R /r ; Y/y r /r ; Y/y r/r ; Y/Y
r ;Y 1 1 1 1
these tests showed that Mendelian principles were gener-
1
16 16 16 16 ally applicable. Mendelian ratios (such as 3 : 1, 1 : 1, 9 : 3 : 3 : 1,
4
and 1 : 1 : 1 : 1) were extensively reported, suggesting that
equal segregation and independent assortment are funda-
9 :3 :3 :1 mental hereditary processes found throughout nature.
Mendel’s laws are not merely laws about peas; they are laws
round, yellow wrinkled, yellow about the genetics of eukaryotic organisms in general.
round, green wrinkled, green As an example of the universal applicability of the
principle of independent assortment, we can examine its
F i g u r e 3 - 4 We can use a Punnett action in haploids. If the principle of equal segregation is
square to predict the result of a dihybrid valid across the board, then we should be able to observe its action in haploids,
cross. This Punnett square shows the given that haploids undergo meiosis. Indeed, independent assortment can be
predicted genotypic and phenotypic observed in a cross of the type A ; B × a ; b. Fusion of parental cells results in a
constitution of the F2 generation from a transient diploid meiocyte that is a dihybrid A/a ; B/b, and the randomly sampled
dihybrid cross. products of meiosis (sexual spores such as ascospores in fungi) will be
1
4
A ; B
1
4
A ; b
1
4
a ; B
1
4
a ; b
3.2 Working with Independent Assortment 9 3
Hence, we see the same ratio as in the dihybrid testcross in a diploid organism;
again, the ratio is a random combination of two monohybrid 1 : 1 ratios because of
independent assortment.
Progeny Progeny
genotypes phenotypes
from a self from a self Gametes
1 3 1
4 B/B 4 B/ 2 B
1 1 3 1
4 A/A 2 B/b 4 A/ 2 A
1 1 1
4 b/b 4 b/b 2 b
1 3 1
4 B/B 4 B/ 2 B
1 1 1 1
2 A/a 2 B/b 4 a/a 2 a
1 1 1
4 b/b 4 b/b 2 b
1
4 B/B
1 1
4 a/a 2 B/b
1
4 b/b
Note, however, that the “tree” of branches for genotypes is quite unwieldy
even in this simple case, which uses two gene pairs, because there are 32 = 9 geno-
types. For three gene pairs, there are 33, or 27, possible genotypes. To simplify this
problem, we can use a statistical approach, which constitutes a third method
for calculating the probabilities (expected frequencies) of specific phenotypes
or genotypes coming from a cross. The two statistical rules needed are the
9 4 CHA P TER 3 Independent Assortment of Genes
product rule (introduced in Chapter 2) and the sum rule, which we will now
consider together.
The possible outcomes from rolling two dice follow the product rule because
the outcome on one die is independent of the other. As an example, let us calcu-
late the probability, p, of rolling a pair of 4’s. The probability of a 4 on one die is
1/6 because the die has six sides and only one side carries the number 4. This
probability is written as follows:
1
p (one 4) = 6
Therefore, with the use of the product rule, the probability of a 4 appearing
on both dice is 1/6 × 1/6 = 1/36, which is written
1 1 1
p (two 4’s) = 6 × 6 = 36
K e y C o n c e p t The sum rule states that the probability of either one or the other
of two mutually exclusive events occurring is the sum of their individual probabilities.
(Note that, in the product rule, the focus is on outcomes A and B. In the sum
rule, the focus is on the outcome A′ or A″.)
Dice can also be used to illustrate the sum rule. We have already calculated
that the probability of two 4’s is 1/36; clearly, with the use of the same type of
calculation, the probability of two 5’s will be the same, or 1/36. Now we can calcu-
late the probability of either two 4’s or two 5’s. Because these outcomes are mutu-
ally exclusive, the sum rule can be used to tell us that the answer is 1/36 + 1/36,
which is 1/18. This probability can be written as follows:
1 1 1
p (two 4’s or two 5’s) = 36 + 36 = 18
A/a ; b/b ; C/c ; D/d ; E/e
and
A/a ; B/b ; C/c ; d/d ; E/e
How many progeny do we need to grow? To take the preceding example a step
farther, suppose we need to estimate how many progeny plants need to be grown to
stand a reasonable chance of obtaining the desired genotype a/a ; b/b ; c/c ; d/d ; e/e.
We first calculate the proportion of progeny that is expected to be of that genotype.
As just shown, we learn that we need to examine at least 256 progeny to stand an
average chance of obtaining one individual plant of the desired genotype.
The probability of obtaining one “success” (a fully recessive plant) out of 256
has to be considered more carefully. This is the average probability of success.
Unfortunately, if we isolated and tested 256 progeny, we would very likely have no
successes at all, simply from bad luck. From a practical point of view, a more
meaningful question to ask would be, What sample size do we need to be 95 per-
cent confident that we will obtain at least one success? (Note: This 95 percent
confidence value is standard in science.) The simplest way to perform this calcu-
lation is to approach it by considering the probability of complete failure—that is,
the probability of obtaining no individuals of the desired genotype. In our exam-
ple, for every individual isolated, the probability of its not being the desired type
is 1 - (1/256) = 255/256. Extending this idea to a sample of size n, we see that the
probability of no successes in a sample of n is (255/256)n. (This probability is a
simple application of the product rule: 255/256 multiplied by itself n times.)
Hence, the probability of obtaining at least one success is the probability of all pos-
sible outcomes (this probability is 1) minus the probability of total failure, or
(255/256)n. Hence, the probability of at least one success is 1 - (255/256)n. To
satisfy the 95 percent confidence level, we must put this expression equal to 0.95
(the equivalent of 95 percent).
Therefore,
1 - (255/256)n = 0.95
Solving this equation for n gives us a value of 765, the number of progeny
needed to virtually guarantee success. Notice how different this number is from
the naïve expectation of success in 256 progeny. This type of calculation is useful
in many applications in genetics and in other situations in which a successful
outcome is needed from many trials.
How many distinct genotypes will a cross produce? The rules of probability
can be easily used to predict the number of genotypes or phenotypes in the prog-
eny of complex parental strains. (Such calculations are used routinely in research,
in progeny analysis, and in strain building.) For example, in a self of the “tetrahy-
brid” A/a ; B/b ; C/c ; D/d, there will be three genotypes for each gene pair; for
example, for the first gene pair, the three genotypes will be A/a, A/A, and a/a.
Because there are four gene pairs in total, there will be 34 = 81 different geno-
types. In a testcross of such a tetrahybrid, there will be two genotypes for each
gene pair (for example, A/a and a/a) and a total of 24 = 16 genotypes in the prog-
eny. Because we are assuming that all the genes are on different chromosomes, all
these testcross genotypes will occur at an equal frequency of 1/16.
9 6 CHA P TER 3 Independent Assortment of Genes
that estimates the degree of agreement between the expected (hypothesized) and
observed (actual) results, with the number growing larger as the agreement
increases.
The calculation is most simply performed by using a table:
Now we must look up this χ2 value in Table 3-1, which will give us the probabil-
ity value that we want. The rows in Table 3-1 list different values of degrees of free-
dom (df ). The number of degrees of freedom is the number of independent variables
in the data. In the present context, the number of independent variables is simply
the number of phenotypic classes minus 1. In this case, df = 2 − 1 = 1. So we look
only at the 1 df line. We see that our χ2 value of 0.84 lies somewhere between the
columns marked 0.5 and 0.1—in other words, between 50 percent and 10 percent.
This probability value is much greater than the cutoff value of 5 percent, and so we
accept the observed results as being compatible with the hypothesis.
Some important notes on the application of this test follow:
1. What does the probability value actually mean? It is the probability of observ-
ing a deviation from the expected results at least as large (not exactly this devia-
tion) on the basis of chance if the hypothesis is correct.
2. The fact that our results have “passed” the chi-square test because p > 0.05
does not mean that the hypothesis is true; it merely means that the results are
compatible with that hypothesis. However, if we had obtained a value of p < 0.05,
we would have been forced to reject the hypothesis. Science is all about falsifiable
hypotheses, not “truth.”
3. We must be careful about the wording of the hypothesis because tacit assump-
tions are often buried within it. The present hypothesis is a case in point; if we were
to state it carefully, we would have to say that the “individual under test is a hetero-
zygote A/a, these alleles show equal segregation at meiosis, and the A/a and a/a
progeny are of equal viability.” We will investigate allele effects on viability in
Chapter 6, but, for the time being, we must keep them in mind as a possible compli-
cation because differences in survival would affect the sizes of the various classes.
The problem is that, if we reject a hypothesis that has hidden components, we do
not know which of the components we are rejecting. For example, in the present
case, if we were forced to reject the hypothesis as a result of the χ2 test, we would
not know if we were rejecting equal segregation or equal viability or both.
4. The outcome of the χ2 test depends heavily on sample sizes (numbers in the
classes). Hence, the test must use actual numbers, not proportions or percentages.
Additionally, the larger the samples, the more powerful is the test.
1 1 1
4
A/A 2
A/a 4
a/a
1 1 1 1 1
4
A/A 8
A/A 4
A/a 8
a/a 4
a/a
wheat by Charles Saunders in the early part of the twentieth century. Saunders’s
goal was to develop a productive wheat line that would have a shorter growing sea-
son and hence open up large areas of terrain in northern countries such as Canada
and Russia for growing wheat, another of the world’s staple foods. He crossed a line
having excellent grain quality called Red Fife with a line called Hard Red Calcutta,
which, although its yield and quality were poor, matured 20 days earlier than Red
Fife. The F1 produced by the cross was presumably heterozygous for multiple genes
controlling the wheat qualities. From this F1, Saunders made selfings and selections
that eventually led to a pure line that had the combination of favorable properties
needed—good-quality grain and early maturation. This line was called Marquis. It
was rapidly adopted in many parts of the world.
A similar approach can be applied to the rice lines with which we began the
chapter. All the single-gene mutations are crossed in pairs, and then their F1
plants are selfed or intercrossed with other F1 plants. As a demonstration, let’s
consider just four mutations, 1 through 4. A breeding program might be as fol-
lows, in which the mutant alleles and their wild-type counterparts are always
listed in the same order (recall that the + sign designates wild type):
1/1 ; +/+ ; +/+ ; +/+ +/+ ; 2/2 ; +/+ ; +/+ +/+ ; +/+ ; 3/3 ; +/+ +/+ ; +/+ ; +/+ ; 4/4
Self Self
Select the homozygote 1/1 ; 2/2 ; +/+ ; +/+ Select the homozygote +/+ ; +/+ ; 3/3 ; 4/4
This type of breeding has been applied to many other crop species. The color-
ful and diverse pure lines of tomatoes used in commerce are shown in Figure 3-5.
Note that, in general when a multiple heterozygote is selfed, a range of differ-
ent homozygotes is produced. For example, from A/a ; B/b ; C/c, there are two
homozygotes for each gene pair (that is, for the first gene, the homozygotes are
A/A and a/a), and so there are 23 = 8 different homozygotes possible: A/A ; b/b ; C/c,
and a/a ; B/B ; c/c, and so on. Each distinct homozygote can be the start of a new
pure line.
Hybrid vigor
F i g u r e 3 - 5 Tomato breeding has
We have been considering the synthesis of superior pure lines for research and for resulted in a wide range of lines of different
agriculture. Pure lines are convenient in that propagation of the genotype from genotypes and phenotypes. [ © Mascarucci/
year to year is fairly easy. However, a large proportion of commercial seed that Corbis.]
10 0 CHA P TER 3 Independent Assortment of Genes
(a) (b)
(a)
Figure 3-6 Multiple heterozygous hybrid flanked by the two pure lines crossed to make it.
(a) The plants. (b) Cobs from the same plants. [ (a) Photo courtesy of Jun Cao, Schnable
Laboratory, Iowa State University; (b) Deana Namuth-Covert, PhD, Univ. of Nebraska.]
farmers (and gardeners) use is called hybrid seed. Curiously, in many cases in
which two disparate lines of plants (and animals) are united in an F1 hybrid (pre-
sumed heterozygote), the hybrid shows greater size and vigor than do the two
contributing lines (Figure 3-6). This general superiority of multiple heterozygotes
is called hybrid vigor. The molecular reasons for hybrid vigor are mostly unknown
and still hotly debated, but the phenomenon is undeniable and has made large
contributions to agriculture. A negative aspect of using hybrids is that, every sea-
son, the two parental lines must be grown separately and then intercrossed to
make hybrid seed for sale. This process is much more inconvenient than main-
taining pure lines, which requires only letting plants self; consequently, hybrid
seed is more expensive than seed from pure lines.
From the user’s perspective, there is another negative aspect of using hybrids.
After a hybrid plant has grown and produced its crop for sale, it is not realistic to
keep some of the seeds that it produces and expect this seed to be equally vigorous
the next year. The reason is that, when the hybrid undergoes meiosis, independent
assortment of the various mixed gene pairs will form many different allelic combi-
nations, and very few of these combinations will be that of the original hybrid. For
example, the earlier described tetrahybrid, when selfed, produces 81 different gen-
otypes, of which only a minority will be tetrahybrid. If we assume independent
assortment, then, for each gene pair, selfing will produce one-half heterozygotes
A/a → 41 A/A, 21 A/a, and 41 a/a. Because there are four gene pairs in this tetrahybrid,
the proportion of progeny that will be like the original hybrid A/a ; B/b ; C/c ; D/d
will be (1/2)4 = 1/16.
Prophase. Chromosomes A
and centromeres have a
replicated, but centromeres a
have not split. b b
B
B
2
A B
Prophase. A B
Homologs synapse. a b
a b
3
A B The other, A b
Anaphase. B equally b
A A
Centromeres attach to a b frequent, a B
spindle and are pulled alignment
to poles of cell. a b a B
4 4´
A B A b
Telophase. A B A b
Two cells form. a b a B
a b a B
5 5´
A B A b
Second anaphase.
New spindles form, A B A b
and centromeres a b a B
finally divide.
6 a b 6´ a B
A B A b
1 1
4 4
A B A b
End of meiosis.
Four cells produced
from each meiosis. a b a B
1 1
4 4
a b a B
7 7´
equal frequencies. In other words, the frequency of each of the four genotypes is 1/4.
This gametic distribution is that postulated by Mendel for a dihybrid, and it is the
one that we inserted along one edge of the Punnett square in Figure 3-4. The random
fusion of these gametes results in the 9 : 3 : 3 : 1 F2 phenotypic ratio.
3.3 The Chromosomal Basis of Independent Assortment 10 3
1
4
vg /vg (vestigial)
A
For the X-linked white eye gene, the ratios will be as
Tetrad
A
follows:
Meiocyte after A
1 1
chromatid Females 2 +/+ and 2 +/w (all wild type)
formation A A
1 1
Males 2 +/Y (wild type) and 2 w/Y (white)
A
A A
A
If the autosomal and X-linked genes are combined,
A
a a the F2 phenotypic ratios will be
a a
3
Females 4
fully wild type
a a
1
4
vestigial
a
a a 1
Males 3
8
fully wild type ( 43 × 2 )
First
1
meiotic
a
3
8
white ( 43 × 2 )
division Second
meiotic 1 1 1
8
vestigial ( 4
× 2 )
division Mitosis 1
1
8
vestigial, white ( 41 × 2 )
F i g u r e 3 -10 Neurospora is an ideal model system for studying Hence, we see a progeny ratio that reveals clear ele-
allelic segregation at meiosis. (a) The four products of meiosis (tetrad) ments of both autosomal and X-linked inheritance.
undergo mitosis to produce an octad. The products are contained
within an ascus. (b) An A /a meiocyte undergoes meiosis followed by
mitosis, resulting in equal numbers of A and a products and
Recombination
demonstrating the principle of equal segregation. The independent assortment of genes at meiosis is one of
the main ways by which an organism produces new com-
binations of alleles. The production of new allele combi-
nations is formally called recombination.
3.3 The Chromosomal Basis of Independent Assortment 10 5
Model
Organism Neurospora
Neurospora crassa was one of the first eukaryotic microbes
to be adopted by geneticists as a model organism. It is a
haploid fungus (n = 7) found growing on dead vegetation
in many parts of the world. When an asexual spore (hap-
loid) germinates, it produces a tubular structure that
extends rapidly by tip growth and throws off multiple side
branches. The result is a mass of branched threads (called
hyphae), which constitute a colony. Hyphae have no cross-
(a)
walls, and so a colony is essentially one cell containing
many haploid nuclei. A colony buds off millions of asexual
spores, which can disperse and repeat the asexual cycle.
Asexual colonies are easily and inexpensively main-
tained in the laboratory on a defined medium of inorganic
salts plus an energy source such as sugar. (An inert gel such
as agar is added to provide a firm surface.) The fact that
Neurospora can chemically synthesize all its essential mol-
ecules from such a simple medium led biochemical geneti-
cists (beginning with George Beadle and Edward Tatum;
see Chapter 6) to choose it for studies of synthetic path-
ways. Geneticists worked out the steps in these pathways
by introducing mutations and observing their effects. The
haploid state of Neurospora is ideal for such mutational
analysis because mutant alleles are always expressed
directly in the phenotype.
Neurospora has two mating types, MAT-A and MAT-a, (b)
which can be regarded as simple “sexes.” When colonies of The fungus Neurospora crassa. (a) Orange colonies of
different mating type come into contact, their cell walls Neurospora growing on sugarcane. In nature, Neurospora
and nuclei fuse, resulting in many transient diploid nuclei, colonies are most often found after fire, which activates dormant
each of which undergoes meiosis. The four haploid prod- ascospores. (Fields of sugarcane are burned to remove foliage
ucts of one meiosis stay together in a sac called an ascus. before harvesting the cane stalks.) (b) Developing Neurospora
Each of these products of meiosis undergoes a further octads from a cross of wild type to a strain carrying an
engineered allele of jellyfish green fluorescent protein fused to
mitotic division, resulting in eight ascospores within each
histone. The octads show the expected 4 : 4 Mendelian
ascus. Ascospores germinate and produce colonies exactly segregation of fluorescence. In some spores, the nucleus has
like those produced by asexual spores. Hence, such ascomy- divided mitotically to form two; eventually, each spore will contain
cete fungi are ideal for the study of the segregation and several nuclei. [ (a) David J. Jacobson, Ph.D.; (b) Namboori B. Raju,
recombination of genes in individual meioses. Stanford University.]
This seemingly wordy definition is actually quite simple; it makes the impor-
tant point that we detect recombination by comparing the inputs into meiosis
with the outputs (Figure 3-11). The inputs are the two haploid genotypes that com-
bine to form the meiocyte, the diploid cell that undergoes meiosis. For humans,
the inputs are the parental egg and sperm. They unite to form a diploid zygote,
which divides to yield all the body cells, including the meiocytes that are set aside
within the gonads. The output genotypes are the haploid products of meiosis. In
humans, these haploid products are a person’s own eggs or sperm. Any meiotic
product that has a new combination of the alleles provided by the two input geno-
types is by definition a recombinant.
First, let us look at how recombinants are detected experimentally. The detec-
tion of recombinants in organisms with haploid life cycles such as fungi or algae
is straightforward. The input and output types in haploid life cycles are the geno-
types of individuals rather than gametes and may thus be inferred directly from
phenotypes. Figure 3-11 can be viewed as summarizing the simple detection of
recombinants in organisms with haploid life cycles. Detecting recombinants in
organisms with diploid life cycles is trickier. The input and output types in diploid
cycles are gametes. Thus, we must know the genotypes of both input and output
gametes to detect recombinants in an organism with a diploid cycle. Though we
n n
Input A.B a .b
2n
Meiotic .
A / a B/ b
diploid
Meiosis
n A .b Recombinant
n a .B Recombinant
2n 2n
P .
A /A B/B .
a /a b/ b
Input n A .B n a .b
2n 2n
Meiotic
diploid (F1)
.
A / a B/ b .
a /a b/ b
Tester
Meiosis Meiosis
Progeny (2n)
Output Parental-type
gamete
n
A .B + a .b n Fertilization
A /a . B/ b Parental type
Parental-type
gamete
n
a .b + a .b n
a /a . b/ b Parental type
Recombinant
gamete
n A .b + a .b n
A /a . b/ b Recombinant
Recombinant
gamete
n
a .B + a .b n
a /a . B/ b Recombinant
cannot detect the genotypes of input or output gametes directly, we can infer
these genotypes by using the appropriate techniques:
• To know the input gametes, we use pure-breeding diploid parents because they
can produce only one gametic type.
• To detect recombinant output gametes, we testcross the diploid individual and
observe its progeny (Figure 3-12).
Frequency
Both male and female gametes will show the genotypic pro-
portions as follows:
R1 ; R2 2 doses of redness
R1 ; r2 1 dose of redness
r1 ; R2 1 dose of redness Metric character
(e.g., color intensity)
r1 ; r2 0 doses of redness
Overall, in this gamete population, one-fourth have two doses, one-half have F i g u r e 3 -14 In a population, a metric
one dose, and one-fourth have zero doses. The union of male and female gametes character such as color intensity can take
both showing this array of R doses is illustrated in Figure 3-15. The number of on many values. Hence, the distribution
is in the form of a smooth curve, with the
doses in the progeny ranges from four (R1 /R1 ; R2 /R2) down to zero (r1/r2 ; r2 /r2),
most common values representing the
with all values between. high point of the curve. If the curve is sym-
The proportions in the grid of Figure 3-15 can be drawn as a histogram, as metrical, it is bell shaped, as shown.
shown in Figure 3-16. The shape of the histogram can be thought of as a scaffold
that could be the underlying basis for a bell-shaped distribution curve. When this
analysis of redness in wheat seeds was originally done, variation was found within
the classes that allegedly represented one polygene “dose” level. Presumably, this
variation within a class is the result of environmental differences. Hence, the
environment can be seen to contribute in a way that rounds off the sharp shoul-
ders of the histogram bars, resulting in a smooth bell-shaped curve (the red line
in the histogram). If the number of polygenes is increased, the histogram more
gametes
2 doses 1 dose 0 doses
1 1 1
4 2 4
5 15
6
ths
ths
20
64
4
16
1
1
Frequency in
Frequency in
3 10
4 4 15 15
2
5
1
1 1 1 6 6 1
0 0
0 1 2 3 4 0 1 2 3 4 5 6
Number of contributing Number of contributing
polygenic alleles, or “doses” polygenic alleles, or “doses”
Organelle genomes
A IR
IR B
(a) (b)
(ad +) (ad – )
Poky, ad – Normal, ad –
2n 2n
(ad – ) (ad +)
F i g u r e 3 -2 0 Reciprocal crosses of poky and wild-type Neurospora produce different results because a
different parent contributes the cytoplasm. The female parent contributes most of the cytoplasm of the progeny
cells. Brown shading represents cytoplasm with mitochondria containing the poky mutation, and green shading
represents cytoplasm with wild-type mitochondria. Note that all the progeny in part a are poky, whereas all the
progeny in part b are normal. Hence, both crosses show maternal inheritance. The nuclear gene with the alleles
ad + (black) and ad− (red) is used to illustrate the segregation of the nuclear genes in the 1:1 Mendelian ratio
expected for this haploid organism.
in addition to poky; notice how the Mendelian inheritance of the nuclear gene is
independent of the maternal inheritance of the poky phenotype.
Cytoplasmic segregation
In some cases, cells contain mixtures of mutant and normal
Variegated leaves caused by a mutation
organelles. These cells are called cytohets, or heteroplasmons. In
in cpDNA
these mixtures, a type of cytoplasmic segregation can be
detected, in which the two types apportion themselves into dif-
ferent daughter cells. The process most likely stems from chance
partitioning of the multiple organelles in the course of cell divi-
sion. Plants provide a good example. Many cases of white leaves
are caused by mutations in chloroplast genes that control the
production and deposition of the green pigment chlorophyll.
Because chlorophyll is necessary for a plant to live, this type of
mutation is lethal, and white-leaved plants cannot be obtained
for experimental crosses. However, some plants are variegated,
bearing both green and white patches, and these plants are via-
ble. Thus, variegated plants provide a way of demonstrating All-white branch
cytoplasmic segregation.
The four-o’clock plant in Figure 3-21 shows a commonly All-green branch
observed variegated leaf and branch phenotype that demon-
strates the inheritance of a mutant allele of a chloroplast gene.
The mutant allele causes chloroplasts to be white; in turn, the color of the chloro-
plasts determines the color of cells and hence the color of the branches composed
of those cells. Variegated branches are mosaics of all-green and all-white cells.
Flowers can develop on green, white, or variegated branches, and the chloroplast
genes of a flower’s cells are those of the branch on which it grows. Hence, in a
cross (Figure 3-22), the maternal gamete within the flower (the egg cell) deter-
mines the color of the leaves and branches of the progeny plant. For example, if
an egg cell is from a flower on a green branch, all the progeny will be green,
regardless of the origin of the pollen. A white branch will have white chloroplasts,
and the resulting progeny plants will be white. (Because of lethality, white descen-
dants would not live beyond the seedling stage.)
The variegated zygotes (bottom of Figure 3-22) demonstrate cytoplasmic seg-
regation. These variegated progeny come from eggs that are cytohets. Interest-
ingly, when such a zygote divides, the white and green chloroplasts often
segregate; that is, they sort themselves into separate cells, yielding the distinct
green and white sectors that cause the variegation in the branches. Here, then, is
a direct demonstration of cytoplasmic segregation.
Given that a cell is a population of organelle molecules, how is it ever possible
to obtain a “pure” mutant cell, containing only mutant chromosomes? Most likely,
pure mutants are created in asexual cells as follows. The variants arise by muta-
tion of a single gene in a single chromosome. Then, in some cases, the mutation-
bearing chromosome may by chance increase in frequency in the population
Green Any
Green
Variegated Any
Egg
type White
F i g u r e 3 -2 2 The results of the Mirabilis 1
jalapa crosses can be explained by
autonomous chloroplast inheritance. The large,
dark spheres represent nuclei. The smaller Egg
type Green
bodies represent chloroplasts, either green or 2
white. Each egg cell is assumed to contain
many chloroplasts, and each pollen cell is
assumed to contain no chloroplasts. The first
Egg
two crosses exhibit strict maternal inheritance. type
If, however, the maternal branch is variegated, 3 Cell
Variegated
within the cell. This process is called random genetic drift. A cell that is a cytohet
may have, say, 60 percent A chromosomes and 40 percent a chromosomes. When
this cell divides, sometimes all the A chromosomes go into one daughter, and all
the a chromosomes into the other (again, by chance). More often, this partition-
ing requires several subsequent generations of cell division to be complete
(Figure 3-23). Hence, as a result of these chance events, both alleles are expressed
in different daughter cells, and this separation will continue through the descen-
dants of these cells. Note that cytoplasmic segregation is not a mitotic process; it
does take place in dividing asexual cells, but it is unrelated to mitosis. In chloro-
plasts, cytoplasmic segregation is a common mechanism for producing variegated
(green-and-white) plants, as already mentioned. In fungal mutants such as the
poky mutant of Neurospora, the original mutation in one mtDNA molecule must
have accumulated and undergone cytoplasmic segregation to produce the strain
expressing the poky symptoms.
In certain special systems such as in fungi and algae, cytohets that are “dihybrid”
have been obtained (say, AB in one organelle chromosome and ab in another). In
such cases, rare crossover-like processes can occur, but such an occurrence must be
considered a minor genetic phenomenon.
Aminoglycoside-
induced deafness Deafness Myopathy
MELAS MELAS Respiratory deficiency
PEO MILS
Myopathy 12S F Myopathy
Cardiomyopathy V P
T
Diabetes & Myopathy
deafness 16S Cytb
E
LHON/
MELAS L Dystonia
ND6
LHON ND1
MELAS
PEO I Human mtDNA
Cardiomyopathy 16,569 bp
Q
M ND5
ND2
Chorea W Typical
MILS A
N deletion in L
C S Anemia
PEO Y KSS/PEO
H
Encephalopathy
COX I Myopathy
Myopathy
ND4L/4 LHON
S
D
R
MERRF COX II ND3 LHON/
COX III G
K Dystonia
Deafness ATPase 8/6
Ataxia; myoclonus
Deafness Cardiomyopathy
Cardiopathy MELAS
MERRF NARP Myoglobinuria
Encephalomyopathy
MILS
Diseases: FBSN
Figure 3-24 This map of human mtDNA shows loci of mutations leading to cytopathies. The
transfer RNA genes are represented by single-letter amino acid abbreviations: ND = NADH
dehydrogenase; COX = cytochrome C oxidase; and 12S and 16S refer to ribosomal RNAs.
[ Data from S. DiMauro et al., “Mitochondria in Neuromuscular Disorders,” Biochim. Biophys. Acta 1366,
1998, 199–210.]
The inheritance of a human mitochondrial disease is shown in Figure 3-25. Note that
the condition is always passed to offspring by mothers and never fathers. Occasion-
ally, a mother will produce an unaffected child (not shown), probably owing to cyto-
plasmic segregation in the gamete-forming tissue.
II
III
One key finding is that the “root” of the human mtDNA tree is in Africa, suggest- F i g u r e 3 -2 5 This pedigree shows that
ing that Homo sapiens originated in Africa and from there dispersed throughout a human mitochondrial disease is inherited
the world (see Chapter 18). only from the mother.
s u m m a ry
Genetic research and plant and animal breeding often neces- Sequential generations of selfing increase the pro-
sitate the synthesis of genotypes that are complex combina- portions of homozygotes, according to the principles of
tions of alleles from different genes. Such genes can be on the equal segregation and independent assortment (if the genes
same chromosome or on different chromosomes; the latter is are on different chromosomes). Hence, selfing is used to cre-
the main subject of this chapter. ate complex pure lines with combinations of desirable
In the simplest case—a dihybrid for which the two gene mutations.
pairs are on different chromosome pairs—each individual gene The independent assortment of chromosomes at meiosis
pair shows equal segregation at meiosis as predicted by can be observed cytologically by using heteromorphic chro-
Mendel’s first law. Because nuclear spindle fibers attach ran- mosome pairs (those that show a structural difference). The
domly to centromeres at meiosis, the two gene pairs are parti- X and Y chromosomes are one such case, but other, rarer
tioned independently into the meiotic products. This principle cases can be found and used for this demonstration. The
of independent assortment is called Mendel’s second law independent assortment of genes at the level of single meio-
because Mendel was the first to observe it. From a dihybrid cytes can be observed in the ascomycete fungi, because the
A/a ; B/b, four genotypes of meiotic products are produced, asci show the two alternative types of segregations at equal
A ; B, A ; b, a ; B, and a ; b, all at an equal frequency of 25 percent frequencies.
each. Hence, in a testcross of a dihybrid with a double reces- One of the main functions of meiosis is to produce recom-
sive, the phenotypic proportions of the progeny also are 25 per- binants, new combinations of alleles of the haploid genotypes
cent (a 1 : 1 : 1 : 1 ratio). If such a dihybrid is selfed, the pheno- that united to form the meiocyte. Independent assortment is
typic classes in the progeny are are 169 A/- ; B/-, 163 A/- ; b/b, the main source of recombinants. In a dihybrid testcross show-
163 a/a ; B/-, and 161 a/a ; b/b. The 1 : 1 : 1 : 1 and 9 : 3 : 3 : 1 ratios ing independent assortment, the recombinant frequency will
are both diagnostic of independent assortment. be 50 percent.
More complex genotypes composed of independently Metric characters such as color intensity show a continu-
assorting genes can be treated as extensions of the case for ous distribution in a population. Continuous distributions can
single-gene segregation. Overall genotypic, phenotypic, or be based on environmental variation or on variant alleles of
gametic ratios are calculated by applying the product rule— multiple genes or on a combination of both. A simple genetic
that is, by multiplying the proportions relevant to the indi- model proposes that the active alleles of several genes (called
vidual genes. The probability of the occurrence of any of polygenes) contribute more or less additively to the metric
several categories of progeny is calculated by applying the character. In an analysis of the progeny from the self of a mul-
sum rule—that is, by adding their individual probabilities. In tiply heterozygous individual, the histogram showing the pro-
mnemonic form, the product rule deals with “A AND B,” portion of each phenotype approximates a bell-shaped curve
whereas the sum rule deals with “A′ OR A″.” The χ2 test can typical of continuous variation.
be used to test whether the observed proportions of classes The small subsets of the genome found in mitochondria
in genetic analysis conform to the expectations of a genetic and chloroplasts are inherited independently of the nuclear
hypothesis, such as a hypothesis of single- or two-gene inher- genome. Mutants in these organelle genes often show mater-
itance. If a probability value of less than 5 percent is calcu- nal inheritance, along with the cytoplasm, which is the loca-
lated, the hypothesis must be rejected. tion of these organelles. In genetically mixed cytoplasms
118 CHA P TER 3 Independent Assortment of Genes
(cytohets), the two genotypes (say, wild type and mutant) Mitochondrial mutation in humans results in diseases that
often sort themselves out into different daughter cells by a show cytoplasmic segregation in body tissues and maternal
poorly understood process called cytoplasmic segregation. inheritance in a mating.
key terms
chi-square test (p. 96) law of independent assortment product rule (p. 94)
chloroplast DNA (cpDNA) (p. 111) (Mendel’s second law) (p. 89) quantitative trait locus (QTL)
cytoplasmic segregation (p. 113) maternal inheritance (p. 111) (p. 108)
dihybrid (p. 89) meiotic recombination (p. 105) recombinant (p. 106)
dihybrid cross (p. 89) mitochondrial DNA (mtDNA) (p. 111) recombination (p. 104)
hybrid vigor (p. 100) polygene (quantitative trait locus) sum rule (p. 94)
independent assortment (p. 88) (p. 108) uniparental inheritance (p. 111)
s olv e d p r obl e m s
SOLVED PROBLEM 1. Two Drosophila flies that had normal the inheritance of wing transparency differs from the in-
(transparent, long) wings were mated. In the progeny, two heritance of wing shape. First, long and clipped are found in
new phenotypes appeared: dusky wings (having a semi- a 3 : 1 ratio in both males and females. This ratio can be ex-
opaque appearance) and clipped wings (with squared ends). plained if both parents are heterozygous for an autosomal
The progeny were as follows: gene; we can represent them as L/ l, where L stands for long
Females Males and l stands for clipped.
Having done this partial analysis, we see that only the
179 transparent, long 92 transparent, long inheritance of wing transparency is associated with sex. The
58 transparent, clipped 89 dusky, long most obvious possibility is that the alleles for transparent
28 transparent, clipped (D) and dusky (d ) are on the X chromosome, because we
31 dusky, clipped have seen in Chapter 2 that gene location on this chromo-
some gives inheritance patterns correlated with sex. If this
a. Provide a chromosomal explanation for these results, suggestion is true, then the parental female must be the one
showing chromosomal genotypes of parents and of all pro sheltering the d allele, because, if the male had the d, he
geny classes under your model. would have been dusky, whereas we were told that he had
b. Design a test for your model. transparent wings. Therefore, the female parent would be
D/d and the male D. Let’s see if this suggestion works: if it is
Solution true, all female progeny would inherit the D allele from
a. The first step is to state any interesting features of the their father, and so all would be transparent winged, as was
data. The first striking feature is the appearance of two new observed. Half the sons would be D (transparent) and half d
phenotypes. We encountered the phenomenon in Chapter 2, (dusky), which also was observed.
where it was explained as recessive alleles masked by their So, overall, we can represent the female parent as
dominant counterparts. So, first, we might suppose that one D/d ; L/ l and the male parent as D ; L/l. Then the progeny
or both parental flies have recessive alleles of two different would be
genes. This inference is strengthened by the observation that Females
some progeny express only one of the new phenotypes. If the 3
L/—
3
D/D ; L/—
4 8
new phenotypes always appeared together, we might sup- 1
2 D/D transparent,
1 1 3
pose that the same recessive allele determines both. l/l D/D ; l/l
4 8 4
long
However, the other striking feature of the data, which 3
L/—
3
D/d ; L/— 1
1 4 8 transparent,
we cannot explain by using the Mendelian principles from D/d 4
2 1
l/l
1
D/d ; l/l clipped
Chapter 2, is the obvious difference between the sexes: al- 4 8
b. Generally, a good way to test such a model is to make a Because A, B, and C were all crossed with the same
cross and predict the outcome. But which cross? We have to plant, all the differences between the three progeny popula-
predict some kind of ratio in the progeny, and so it is impor- tions must be attributable to differences in the underlying
tant to make a cross from which a unique phenotypic ratio genotypes of A, B, and C.
can be expected. Notice that using one of the female prog- You might remember a lot about these analyses from the
eny as a parent would not serve our needs: we cannot say chapter, which is fine, but let’s see how much we can deduce
from observing the phenotype of any one of these females from the data. What about dominance? The key cross for
what her genotype is. A female with transparent wings deducing dominance is B. Here, the inheritance pattern is
could be D/D or D/d, and one with long wings could be L/L
or L/l. It would be good to cross the parental female of the yellow, round × green, wrinkled → all yellow, round
original cross with a dusky, clipped son, because the full
genotypes of both are specified under the model that we So yellow and round must be dominant phenotypes be-
have created. According to our model, this cross is cause dominance is literally defined by the phenotype of a
hybrid. Now we know that the green, wrinkled parent used
D/d ; L/l × d ; l/l in each cross must be fully recessive; we have a very conve-
nient situation because it means that each cross is a test-
From this cross, we predict
cross, which is generally the most informative type of cross.
Females Turning to the progeny of A, we see a 1 : 1 ratio for yel-
1 1
low to green. This ratio is a demonstration of Mendel’s first
1 2 L/l 4 D/d ; L/l law (equal segregation) and shows that, for the character of
2 D/d 1
l/l 1
D/d ; l/l color, the cross must have been heterozygote × homozygous
2 4
1 1
recessive. Letting Y represent yellow and y represent green,
1 2 L/l 4 d/d ; L/l we have
2 d/d 1 1
l/l d/d ; l/l 1 1
2 4
Y/y × y/y → 2 Y/y (yellow) → 2 y/y (green)
Males
For the character of shape, because all the progeny are
1 1
1 2 L/l 4 D ; L/l round, the cross must have been homozygous dominant ×
2 D 1 1 homozygous recessive. Letting R represent round and r rep-
2 l/l 4 D ; l/l
resent wrinkled, we have
1 1
1 2 L/l 4 d ; L/l
2 d 1 1
R/R × r/r → R/r (round)
2 l/l 4 d ; l/l
Combining the two characters, we have
SOLVED PROBLEM 2. Consider three yellow, round peas, 1 1
Y/y ; R/R × y/y ; r/r → 2 Y/y ; R/r 2 y/y ; R/r
labeled A, B, and C. Each was grown into a plant and crossed
with a plant grown from a green wrinkled pea. Exactly 100 Now cross B becomes crystal clear and must have been
peas issuing from each cross were sorted into phenotypic
classes as follows: Y/Y ; R/R × y/y ; r/r → Y/y ; r/r
A: 51 yellow, round
49 green, round because any heterozygosity in pea B would have given rise
to several progeny phenotypes, not just one.
B: 100 yellow, round What about C? Here, we see a ratio of 50 yellow : 50
green (1 : 1) and a ratio of 49 round : 51 wrinkled (also 1 : 1).
C: 24 yellow, round So both genes in pea C must have been heterozygous, and
26 yellow, wrinkled cross C was
25 green, round Y/y ; R/r × y/y ; r/r
25 green, wrinkled which is a good demonstration of Mendel’s second law (in-
What were the genotypes of A, B, and C? (Use gene symbols dependent assortment of different genes).
of your own choosing; be sure to define each one.) How would a geneticist have analyzed these crosses?
Basically, the same way that we just did but with fewer in-
Solution tervening steps. Possibly something like this: “yellow
Notice that each of the crosses is and round dominant; single-gene segregation in A; B
homozygous dominant; independent two-gene segregation
yellow, round × green, wrinkled → progeny in C.”
120 CHA P TER 3 Independent Assortment of Genes
p r obl e m s
Most of the problems are also available for review/grading through the http://www.whfreeman.com/
launchpad/iga11e.
Working with the Figures 13. Normal mitosis takes place in a diploid cell of genotype
1. Using Table 3-1, answer the following questions about A/a ; B/b. Which of the following genotypes might repre-
probability values (see p. 97): sent possible daughter cells?
a. If χ2 is calculated to be 17 with 9 df, what is the a. A ; B b. a ; b c. A ; b
approximate probability value? d. a ; B e. A/A ; B/B f. A/a ; B/b
b. If χ2 is 17 with 6 df, what is the probability value? g. a/a ; b/b
c. What trend (“rule”) do you see in the previous two 14. In a diploid organism of 2n = 10, assume that you can
calculations? label all the centromeres derived from its female parent
2. Inspect Figure 3-8: which meiotic stage is responsible and all the centromeres derived from its male parent.
for generating Mendel’s second law? When this organism produces gametes, how many male-
and female-labeled centromere combinations are possi-
3. In Figure 3-9,
ble in the gametes?
a. identify the diploid nuclei.
15. It has been shown that when a thin beam of light is aimed
b. identify which part of the figure illustrates Mendel’s at a nucleus, the amount of light absorbed is proportional
first law. to the cell’s DNA content. Using this method, the DNA in
4. Inspect Figure 3-10: what would be the outcome in the the nuclei of several different types of cells in a corn plant
octad if on rare occasions a nucleus from the postmeiotic were compared. The following numbers represent the
mitotic division of nucleus 2 slipped past a nucleus from relative amounts of DNA in these different types of cells:
the postmeiotic mitotic division of nucleus 3? How could
0.7, 1.4, 2.1, 2.8, and 4.2
you measure the frequency of such a rare event?
5. In Figure 3-11, if the input genotypes were a B and A b,
• •
Which cells could have been used for these measure-
what would be the genotypes colored blue? ments? (Note: In plants, the endosperm part of the seed
is often triploid, 3n.)
6. In the progeny seen in Figure 3-13, what are the origins
of the chromosomes colored dark blue, light blue, and 16. Draw a haploid mitosis of the genotype a+ ; b.
very light blue? 17. In moss, the genes A and B are expressed only in the ga-
7. In Figure 3-17, in which bar of the histogram would the metophyte. A sporophyte of genotype A/a ; B/b is al-
genotype R1/r1 R2 /R2 r3/r3 be found?
• • lowed to produce gametophytes.
8. In examining Figure 3-19, what do you think is the main a. What proportion of the gametophytes will be A ; B?
reason for the difference in size of yeast and human b. If fertilization is random, what proportion of sporo-
mtDNA? phytes in the next generation will be A/a ; B/b?
9. In Figure 3-20, what color is used to denote cytoplasm 18. When a cell of genotype A/a ; B/b ; C/c having all the
containing wild-type mitochondria? genes on separate chromosome pairs divides mitotically,
10. In Figure 3-21, what would be the leaf types of progeny what are the genotypes of the daughter cells?
of the apical (top) flower? 19. In the haploid yeast Saccharomyces cerevisiae, the two
11. From the pedigree in Figure 3-25, what principle can you mating types are known as MATa and MATα. You cross a
deduce about the inheritance of mitochondrial disease purple (ad-) strain of mating type a and a white (ad+)
from affected fathers? strain of mating type α. If ad - and ad+ are alleles of one
gene, and a and α are alleles of an independently inher-
B a s i c P r obl e m s ited gene on a separate chromosome pair, what progeny
do you expect to obtain? In what proportions?
12. Assume independent assortment and start with a plant
that is dihybrid A/a ; B/b: 20. In mice, dwarfism is caused by an X-linked recessive al-
lele, and pink coat is caused by an autosomal dominant
a. What phenotypic ratio is produced from selfing it?
allele (coats are normally brownish). If a dwarf female
b. What genotypic ratio is produced from selfing it? from a pure line is crossed with a pink male from a pure
c. What phenotypic ratio is produced from testcrossing line, what will be the phenotypic ratios in the F1 and F2 in
it? each sex? (Invent and define your own gene symbols.)
d. What genotypic ratio is produced from testcrossing 21. Suppose you discover two interesting rare cytological
it? abnormalities in the karyotype of a human male. (A
Problems 121
karyotype is the total visible chromosome complement.) 27. Note: The first part of this problem was introduced in
There is an extra piece (satellite) on one of the chromo- Chapter 2. The line of logic is extended here.
somes of pair 4, and there is an abnormal pattern of In the plant Arabidopsis thaliana, a geneticist is
staining on one of the chromosomes of pair 7. With the interested in the development of trichomes (small
assumption that all the gametes of this male are equally projections) on the leaves. A large screen turns up two
viable, what proportion of his children will have the mutant plants (A and B) that have no trichomes, and
same karyotype that he has? these mutants seem to be potentially useful in studying
22. Suppose that meiosis occurs in the transient diploid stage trichome development. (If they are determined by
of the cycle of a haploid organism of chromosome num- single-gene mutations, then finding the normal and
ber n. What is the probability that an individual haploid abnormal function of these genes will be instructive.)
cell resulting from the meiotic division will have a com- Each plant was crossed with wild type; in both cases,
plete parental set of centromeres (that is, a set all from the next generation (F1) had normal trichomes. When
one parent or all from the other parent)? F1 plants were selfed, the resulting F2’s were as follows:
23. Pretend that the year is 1868. You are a skilled young lens F2 from mutant A: 602 normal ; 198 no trichomes
maker working in Vienna. With your superior new lenses,
F2 from mutant B: 267 normal ; 93 no trichomes
you have just built a microscope that has better resolution
than any others available. In your testing of this micro- a. What do these results show? Include proposed
scope, you have been observing the cells in the testes of genotypes of all plants in your answer.
grasshoppers and have been fascinated by the behavior of b. Assume that the genes are located on separate
strange elongated structures that you have seen within chromosomes. An F1 is produced by crossing the original
the dividing cells. One day, in the library, you read a re- mutant A with the original mutant B. This F1 is
cent journal paper by G. Mendel on hypothetical “factors” testcrossed: What proportion of testcross progeny will
that he claims explain the results of certain crosses in have no trichomes?
peas. In a flash of revelation, you are struck by the paral-
lels between your grasshopper studies and Mendel’s pea 28. In dogs, dark coat color is dominant over albino, and
studies, and you resolve to write him a letter. What do you short hair is dominant over long hair. Assume that these
write? (Problem 23 is based on an idea by Ernest Kroeker.) effects are caused by two independently assorting genes.
Seven crosses were done as shown below, in which D and
24. From a presumed testcross A /a × a /a, in which A repre-
A stand for the dark and albino phenotypes, respectively,
sents red and a represents white, use the χ2 test to find
and S and L stand for the short-hair and long-hair
out which of the following possible results would fit the
phenotypes.
expectations:
a. 120 red, 100 white b. 5000 red, 5400 white Number of progeny
c. 500 red, 540 white d. 50 red, 54 white Parental phenotypes D, S D, L A, S A, L
25. Look at the Punnett square in Figure 3-4.
a. D, S × D, S 88 31 29 12
a. How many different genotypes are shown in the 16
squares of the grid? b. D, S × D, L 19 18 0 0
b. What is the genotypic ratio underlying the 9 : 3 : 3 : 1 c. D, S × A, S 21 0 20 0
phenotypic ratio? d. A, S × A, S 0 0 29 9
c. Can you devise a simple formula for the calculation e. D, L × D, L 0 31 0 11
of the number of progeny genotypes in dihybrid,
f. D, S × D, S 45 16 0 0
trihybrid, and so forth crosses? Repeat for phenotypes.
g. D, S × D, L 31 30 10 10
d. Mendel predicted that, within all but one of the
phenotypic classes in the Punnett square, there should
be several different genotypes. In particular, he Write the genotypes of the parents in each cross. Use the
performed many crosses to identify the underlying symbols C and c for the dark and albino coat-color alleles
genotypes of the round, yellow phenotype. Show two and the symbols H and h for the short-hair and long-hair
different ways that could be used to identify the various alleles, respectively. Assume parents are homozygous
genotypes underlying the round, yellow phenotype. unless there is evidence otherwise.
(Remember, all the round, yellow peas look identical.) 29. In tomatoes, one gene determines whether the plant has
26. Assuming independent assortment of all genes, develop purple (P) or green (G) stems, and a separate, indepen-
formulas that show the number of phenotypic classes dent gene determines whether the leaves are “cut” (C) or
and the number of genotypic classes from selfing a plant “potato” (Po). Five matings of tomato-plant phenotypes
heterozygous for n gene pairs. give the following results:
122 CHA P TER 3 Independent Assortment of Genes
www
Parental Number of progeny Unpacking Problem 31
www
Mating phenotypes P, C P, Po G, C G, Po Before attempting a solution to this problem, try answering the
following questions:
1 P, C × G, C 323 102 309 106
1. What does the word normal mean in this problem?
2 P, C × P, Po 220 206 65 72
2. The words line and strain are used in this problem.
3 P, C × G, C 723 229 0 0
What do they mean, and are they interchangeable?
4 P, C × G, Po 405 0 389 0 3. Draw a simple sketch of the two parental flies showing
5 P, Po × G, C 71 90 85 78 their eyes, wings, and sexual differences.
4. How many different characters are there in this
a. Which alleles are dominant? problem?
b. What are the most probable genotypes for the parents 5. How many phenotypes are there in this problem, and
in each cross? which phenotypes go with which characters?
30. A mutant allele in mice causes a bent tail. Six pairs of 6. What is the full phenotype of the F1 females called
mice were crossed. Their phenotypes and those of their “normal”?
progeny are given in the following table. N is normal 7. What is the full phenotype of the F1 males called “short
phenotype; B is bent phenotype. Deduce the mode of in- winged”?
heritance of bent tail. 8. List the F2 phenotypic ratios for each character that you
Parents Progeny came up with in answer to question 4.
Cross 9. What do the F2 phenotypic ratios tell you?
10. What major inheritance pattern distinguishes sex-
1 N B All B All N linked inheritance from autosomal inheritance?
1 1 1 1
2 B N 2 B, 2 N 2 B, 2 N 11. Do the F2 data show such a distinguishing criterion?
3 B N All B All B 12. Do the F1 data show such a distinguishing criterion?
4 N N All N All N 13. What can you learn about dominance in the F1? The F2?
5 B B All B All B 14. What rules about wild-type symbolism can you use
6 B B All B 1 1 in deciding which allelic symbols to invent for these
2 B, 2 N
crosses?
a. Is it recessive or dominant? 15. What does “deduce the inheritance of these pheno-
types” mean?
b. Is it autosomal or sex-linked?
c. What are the genotypes of all parents and progeny? Now try to solve the problem. If you are unable to do so, make a
list of questions about the things that you do not understand.
31. The normal eye color of Drosophila is red, but strains in
Inspect the learning goals at the beginning of the chapter and ask
which all flies have brown eyes are available. Similarly,
yourself which are relevant to your questions. If this approach
wings are normally long, but there are strains with short
doesn’t work, inspect the Key Concepts of this chapter and ask
wings. A female from a pure line with brown eyes and
yourself which might be relevant to your questions.
short wings is crossed with a male from a normal pure
line. The F1 consists of normal females and short-winged 32. In a natural population of annual plants, a single plant is
males. An F2 is then produced by intercrossing the F1. found that is sickly looking and has yellowish leaves. The
Both sexes of F2 flies show phenotypes as follows: plant is dug up and brought back to the laboratory.
3 Photosynthesis rates are found to be very low. Pollen
red eyes, long wings
8 from a normal dark-green-leaved plant is used to fertilize
3
8 red eyes, short wings emasculated flowers of the yellowish plant. A hundred
1 seeds result, of which only 60 germinate. All the resulting
brown eyes, long wings
8 plants are sickly yellow in appearance.
1
8
brown eyes, short wings a. Propose a genetic explanation for the inheritance
pattern.
Deduce the inheritance of these phenotypes; use
clearly defined genetic symbols of your own invention. b. Suggest a simple test for your model.
State the genotypes of all three generations and the c. Account for the reduced photosynthesis, sickliness,
genotypic proportions of the F1 and F2. and yellowish appearance.
Problems 123
33. What is the basis for the green-and-white color variega- following cross, including a mutation nic3 located on
tion in the leaves of Mirabilis? If the following cross is chromosome VI?
made, stp ⋅ nic3 × wild type
variegated × green
40. In polygenic systems, how many phenotypic classes cor-
what progeny types can be predicted? What about the responding to number of polygene “doses” are expected
reciprocal cross? in selfs
34. In Neurospora, the mutant stp exhibits erratic stop-and- a. of strains with four heterozygous polygenes?
start growth. The mutant site is known to be in the mtD- b. of strains with six heterozygous polygenes?
NA. If an stp strain is used as the female parent in a cross
with a normal strain acting as the male, what type of 41. In the self of a polygenic trihybrid R1/r1 ; R2/r2 ; R3/r3,
progeny can be expected? What about the progeny from use the product and sum rules to calculate the propor-
the reciprocal cross? tion of progeny with just one polygene “dose.”
35. Two corn plants are studied. One is resistant (R) and the 42. Reciprocal crosses and selfs were performed between
other is susceptible (S) to a certain pathogenic fungus. the two moss species Funaria mediterranea and F. hygro-
The following crosses are made, with the results shown: metrica. The sporophytes and the leaves of the gameto-
phytes are shown in the accompanying diagram.
S × R → all progeny S
R × S → all progeny R
What can you conclude about the location of the genetic
determinants of R and S?
36. A presumed dihybrid in Drosophila, B/b ; F/f, is test-
crossed with b/b ; f/f. (B = black body ; b = brown body;
F = forked bristles; f = unforked bristles.) The results are
black, forked 230 brown, forked 240
black, unforked 210 brown, unforked 250
Use the χ2 test to determine if these results fit the results
expected from testcrossing the hypothesized dihybrid.
37. Are the following progeny numbers consistent with the
results expected from selfing a plant presumed to be a
dihybrid of two independently assorting genes, H/h ;
R/r? (H = hairy leaves; h = smooth leaves; R = round
The crosses are written with the female parent first.
ovary; r = elongated ovary.) Explain your answer.
Progeny
hairy, round 178 smooth, round 56
Progeny
hairy, elongated 62 smooth, elongated 24
38. A dark female moth is crossed with a dark male. All the
male progeny are dark, but half the female progeny are
light and the rest are dark. Propose an explanation for
this pattern of inheritance.
39. In Neurospora, a mutant strain called stopper (stp) arose
spontaneously. Stopper showed erratic “stop and start”
growth, compared with the uninterrupted growth of
wild-type strains. In crosses, the following results were
found: Progeny
stopper × wild type → progeny all stopper Progeny
a. Describe the results presented, summarizing the main only which crosses to make, but also how many progeny
findings. should be sampled in each case.
b. Propose an explanation of the results. 47. We have dealt mainly with only two genes, but the same
c. Show how you would test your explanation; be sure principles hold for more than two genes. Consider the
to show how it could be distinguished from other following cross:
explanations. A/a ; B/b ; C/c ; D/d ; E/e × a/a ; B/b ; c/c ; D/d ; e/e
43. Assume that diploid plant A has a cytoplasm genetical- a. What proportion of progeny will phenotypically
ly different from that of plant B. To study nuclear–cyto- resemble (1) the first parent, (2) the second parent,
plasmic relations, you wish to obtain a plant with the (3) either parent, and (4) neither parent?
cytoplasm of plant A and the nuclear genome predomi-
b. What proportion of progeny will be genotypically the
nantly of plant B. How would you go about producing
same as (1) the first parent, (2) the second parent,
such a plant?
(3) either parent, and (4) neither parent?
44. You are studying a plant with tissue comprising both
green and white sectors. You wish to decide whether this Assume independent assortment.
phenomenon is due (1) to a chloroplast mutation of the 48. The accompanying pedigree shows the pattern of trans-
type considered in this chapter or (2) to a dominant nu- mission of two rare human phenotypes: cataract and
clear mutation that inhibits chlorophyll production and pituitary dwarfism. Family members with cataract are
is present only in certain tissue layers of the plant as a shown with a solid left half of the symbol; those with
mosaic. Outline the experimental approach that you pituitary dwarfism are indicated by a solid right half.
would use to resolve this problem.
I
1 2
C h a ll e n g i n g P r obl e m s
45. You have three jars containing marbles, as follows: II
1 2 3 4 5 6 7
jar 1 600 red and 400 white
jar 2 900 blue and 100 white
III
jar 3 10 green and 990 white 1 2 3 4 5 6 7 8 9
Assume independent assortment of the three gene pairs. radioactive nucleotide was added and was incorporated
(Note: Corn will self or cross-pollinate easily.) into newly synthesized DNA. The cells were then re-
moved from the radioactivity, washed, and allowed to pro-
50. In humans, color vision depends on genes encoding
ceed through mitosis. Radioactive chromosomes or chro-
three pigments. The R (red pigment) and G (green pig-
matids can be detected by placing photographic emulsion
ment) genes are close together on the X chromosome,
on the cells; radioactive chromosomes or chromatids ap-
whereas the B (blue pigment) gene is autosomal. A reces-
peared covered with spots of silver from the emulsion.
sive mutation in any one of these genes can cause color
(The chromosomes “take their own photograph.”) Draw
blindness. Suppose that a color-blind man married a
the chromosomes at prophase and telophase of the first
woman with normal color vision. The four sons from this
and second mitotic divisions after the radioactive treat-
marriage were color-blind, and the five daughters were
ment. If they are radioactive, show it in your diagram. If
normal. Specify the most likely genotypes of both par-
there are several possibilities, show them, too.
ents and their children, explaining your reasoning. (A
pedigree drawing will probably be helpful.) (Problem 50 53. In the species of Problem 52, you can introduce radioac-
is by Rosemary Redfield.) tivity by injection into the anthers at the S phase before
meiosis. Draw the four products of meiosis with their
51. Consider the accompanying pedigree for a rare human
chromosomes, and show which are radioactive.
muscle disease.
54. The plant Haplopappus gracilis is diploid and 2n = 4.
There are one long pair and one short pair of chromo-
somes. The diagrams below (numbered 1 through 12)
represent anaphases (“pulling apart” stages) of individu-
al cells in meiosis or mitosis in a plant that is genetically
a dihybrid (A /a ; B /b ) for genes on different chromo-
a. What unusual feature distinguishes this pedigree somes. The lines represent chromosomes or chromatids,
from those studied earlier in this chapter? and the points of the V’s represent centromeres. In each
b. Where do you think the mutant DNA responsible for case, indicate if the diagram represents a cell in meiosis
this phenotype resides in the cell? I, meiosis II, or mitosis. If a diagram shows an impossible
52. The plant Haplopappus gracilis has a 2n of 4. A diploid cell situation, say so.
culture was established and, at premitotic S phase, a
A 1 A a 7
a B b
A a B b
a A b B
B b a A b B
A 2 A A 8
b B B
a a b b
a a b b
A b B B
A A
A 3 A a 9
B B b
A a B b
A a B b
A B A a B b
a 4 A 10
B b
A b
a B
a B a B
A a 5 a 11
b B B
a B
a B
A a b B a B
A a 6 A 12
B b a
A a
B b
a A b B B b
126 CHA P TER 3 Independent Assortment of Genes
55. The pedigree below shows the recurrence of a rare neu- pollen give male-sterile plants. In addition, some lines of
rological disease (large black symbols) and spontaneous corn are known to carry a dominant nuclear restorer al-
fetal abortion (small black symbols) in one family. lele (Rf ) that restores pollen fertility in male-sterile lines.
(A slash means that the individual is deceased.) Provide a. Research shows that the introduction of restorer
an explanation for this pedigree in regard to the cyto- alleles into male-sterile lines does not alter or affect
plasmic segregation of defective mitochondria. the maintenance of the cytoplasmic factors for male
sterility. What kind of research results would lead to
such a conclusion?
b. A male-sterile plant is crossed with pollen from a
plant homozygous for Rf. What is the genotype of the
F1? The phenotype?
56. A man is brachydactylous (very short fingers; rare auto- c. The F1 plants from part b are used as females in a
somal dominant), and his wife is not. Both can taste the testcross with pollen from a normal plant (rf/rf ). What
chemical phenylthiocarbamide (autosomal dominant; are the results of this testcross? Give genotypes and
common allele), but their mothers could not. phenotypes, and designate the kind of cytoplasm.
a. Give the genotypes of the couple. d. The restorer allele already described can be called
Rf-1. Another dominant restorer, Rf-2, has been found.
If the genes assort independently and the couple has Rf-1 and Rf-2 are located on different chromosomes.
four children, what is the probability of Either or both of the restorer alleles will give pollen
b. all of them being brachydactylous? fertility. With the use of a male-sterile plant as a tester,
c. none being brachydactylous? what will be the result of a cross in which the male
parent is
d. all of them being tasters?
(1) heterozygous at both restorer loci?
e. all of them being nontasters?
(2) homozygous dominant at one restorer locus and
f. all of them being brachydactylous tasters?
homozygous recessive at the other?
g. none being brachydactylous tasters?
(3) heterozygous at one restorer locus and homozygous
h. at least one being a brachydactylous taster? recessive at the other?
57. One form of male sterility in corn is maternally transmit- (4) heterozygous at one restorer locus and homozygous
ted. Plants of a male-sterile line crossed with normal dominant at the other?
344
Mapping Eukaryote
4
C h a p t e r
Chromosomes by
Recombination
Learning Outcomes
After completing this chapter, you will be able to
outline
S
ome of the questions that geneticists want to answer about the genome are,
What genes are present in the genome? What functions do they have? What
positions do they occupy on the chromosomes? Their pursuit of the third
question is broadly called mapping. Mapping is the main focus of this chapter, but
all three questions are interrelated, as we will see.
We all have an everyday feeling for the importance of maps in general, and,
indeed, we have all used them at some time in our lives to find our way around.
Relevant to the focus of this chapter is that, in some situations, several maps need
to be used simultaneously. A good example in everyday life is in navigating the
dense array of streets and buildings of a city such as London, England. A street
map that shows the general layout is one necessity. However, the street map is
used by tourists and Londoners alike in conjunction with another map, that of the
underground railway system. The underground system is so complex and spa-
ghetti-like that, in 1933, an electrical circuit engineer named Harry Beck drew up
the streamlined (although distorted) map that has remained to this day an icon of
London. The street and underground maps of London are compared in Figure 4-1.
Note that the positions of the underground stations and the exact distances
between them are of no interest in themselves, except as a way of getting to a
destination of interest such as Westminster Abbey. We will see three parallels with
the London maps when chromosome maps are used to zero in on individual “des-
tinations,” or specific genes. First, several different types of chromosome maps
are often necessary and must be used in conjunction; second, maps that contain
distortions are still useful; and third, many sites on a chromosome map are
F i g u r e 4 -1 These London maps illustrate the principle that, often, several maps are
needed to get to a destination of interest. The map of the underground railway (“the Tube”)
is used to get to a destination of interest such as a street address, shown on the street
map. In genetics, two different kinds of genome maps are often useful in locating a gene,
leading to an understanding of its structure and function. [ (left) © MAPS.com/Corbis; (right)
Transport for London.]
4.1 Diagnostics of Linkage 129
charted only because they are useful in trying to zero in on other sites that are the
ones of real interest.
Obtaining a map of gene positions on the chromosomes is an endeavor that
has occupied thousands of geneticists for the past 80 years or so. Why is it so
important? There are several reasons:
1. Gene position is crucial information needed to build complex genotypes re-
quired for experimental purposes or for commercial applications. For example, in
Chapter 6, we will see cases in which special allelic combinations must be put to-
gether to explore gene interaction.
2. Knowing the position occupied by a gene provides a way of discovering its
structure and function. A gene’s position can be used to define it at the DNA level.
In turn, the DNA sequence of a wild-type gene or its mutant allele is a necessary
part of deducing its underlying function.
3. The genes present and their arrangement on chromosomes are often slightly
different in related species. For example, the rather long human chromosome
number 2 is split into two shorter chromosomes in the great apes. By comparing
such differences, geneticists can deduce the evolutionary genetic mechanisms
through which these genomes diverged. Hence, chromosome maps are useful in
interpreting mechanisms of evolution.
a standard self of a dihybrid F1, the F2 did not show the 9 : 3 : 3 : 1 ratio predicted
by the principle of independent assortment. In fact, Bateson and Punnett noted
that certain combinations of alleles showed up more often than expected, almost
as though they were physically attached in some way. However, they had no
explanation for this discovery.
Later, Thomas Hunt Morgan found a similar deviation from Mendel’s second
law while studying two autosomal genes in Drosophila. Morgan proposed linkage
as a hypothesis to explain the phenomenon of apparent allele association.
Let’s look at some of Morgan’s data. One of the genes affected eye color (pr ,
purple, and pr +, red), and the other gene affected wing length (vg , vestigial, and
vg +, normal). (Vestigial wings are very small compared to wild type.) The wild-
type alleles of both genes are dominant. Morgan performed a cross to obtain dihy-
brids and then followed with a testcross:
Testcross:
pr+/pr ⋅ vg+/vg/ × pr/pr ⋅ vg/vg
F1 dihybrid female Tester male
Morgan’s use of the testcross is important. Because the tester parent contrib-
utes gametes carrying only recessive alleles, the phenotypes of the offspring
directly reveal the alleles contributed by the gametes of the dihybrid parent, as
described in Chapters 2 and 3. Hence, the analyst can concentrate on meiosis in
one parent (the dihybrid) and essentially forget about meiosis in the other (the
tester). In contrast, from an F1 self, there are two sets of meioses to consider in the
analysis of progeny: one in the male parent and the other in the female.
Morgan’s testcross results were as follows (listed as the gametic classes from
the dihybrid):
Testcross:
pr+/pr ⋅ vg+/vg/ × pr/pr ⋅ vg/vg
Again, these results are not even close to a 1 : 1 : 1 : 1 Mendelian ratio. Now,
however, the recombinant classes are the converse of those in the first analysis,
pr + vg + and pr vg. But notice that their frequency is approximately the same:
(157 + 146)/2335 × 100 = 12.9 percent. Again, linkage is suggested, but, in this
case, the F1 dihybrid must have been as follows:
pr1 vg
pr vg1
Dihybrid testcross results like those just presented are commonly encoun-
tered in genetics. They follow the general pattern:
AB/ab or + +/ab
Cis
Ab/aB or + b/a +
Trans
Wx C
wx c
In the progeny of a testcross of this plant, they compared recombinants and
parental genotypes. They found that all the recombinants inherited one or the
other of the two following chromosomes, depending on their recombinant makeup:
wx C
Wx c
Thus, there was a precise correlation between the genetic event of the appear-
ance of recombinants and the chromosomal event of crossing over. Consequently,
the chiasmata appeared to be the sites of exchange, although what was considered
to be the definitive test was not undertaken until 1978.
What can we say about the molecular mechanism of chromosome exchange in
a crossover event? The short answer is that a crossover results from the breakage
and reunion of DNA. Two parental chromosomes break at the same position, and
then each piece joins up with the neighboring piece from the other chromosome. In
Section 4.8, we will see a model of the molecular processes that allow DNA to break
and rejoin in a precise manner such that no genetic material is lost or gained.
algae. The products of meiosis of a single tetrad can be isolated, which is equiva-
lent to isolating all four chromatids from a single meiocyte. Tetrad analyses of
crosses in which genes are linked show many tetrads that contain four different
allele combinations. For example, from the cross
AB × ab
Therefore, for any pair of homologous chromosomes, two, three, or four chro-
matids can take part in crossing-over events in a single meiocyte. Note, however,
that any single crossover is between two chromatids.
4.2 Mapping by Recombinant Frequency 13 5
A B C A b C A B C A b c
a b c a B C a b c a B C
a b c a b c a b c a b C
A B A B
Meioses Parental
with no A B A B
crossover Parental
between a b a b
Parental
the genes
a b a b
Parental
F i g u r e 4 -7
Recombinants arise from
A B A B meioses in which a
Meioses Parental
A B A b
crossover takes place
with a between nonsister
Recombinant
crossover
between a b a B chromatids.
Recombinant
the genes ANIMATED
a b a b
Parental ART: Meiotic recombination
between linked genes by
crossing over
13 6 CHAPTER 4 Mapping Eukaryote Chromosomes by Recombination
F i g u r e 4 - 9 A chromosome region
Map distances are generally additive containing three linked genes. Because
map distances are additive, calculation of
A B A–B and A–C distances leaves us with
Map based on A–B recombination the two possibilities shown for the B–C
5 m.u. distance.
A C
Map based on A–C recombination
3 m.u.
C A B
3 m.u. 5 m.u.
Possible combined maps 8 m.u.
A C B
3 m.u. 2 m.u.
5 m.u.
molecular analysis that a chromosome is a single DNA molecule with the genes
arranged along it, it is no surprise for us today to learn that recombination-based
maps are linear because they reflect a linear array of genes.
How is a map represented? As an example, in Drosophila, the locus of the eye-
color gene and the locus of the wing-length gene are approximately 11 m.u. apart,
as mentioned earlier. The relation is usually diagrammed in the following way:
pr 11.0 m.u. vg
A B C
Meiocyte 1
a b c
A B C
Meiocyte 2
a b c
A B C
Meiocyte 3
a b c
A B C
Meiocyte 4
…
etc.
a b c
Few Numerous
recombinants recombinants
Three-point testcross
So far, we have looked at linkage in crosses of dihybrids (double heterozygotes)
with doubly recessive testers. The next level of complexity is a cross of a trihybrid
(triple heterozygote) with a triply recessive tester. This kind of cross, called a
three-point testcross or a three-factor cross, is commonly used in linkage anal-
ysis. The goal is to deduce whether the three genes are linked and, if they are, to
deduce their order and the map distances between them.
Let’s look at an example, also from Drosophila. In our example, the mutant
alleles are v (vermilion eyes), cv (crossveinless, or absence of a crossvein on the
wing), and ct (cut, or snipped, wing edges). The analysis is carried out by perform-
ing the following crosses:
P v+/v+ ⋅ cv /cv ⋅ ct /ct × v /v ⋅ cv+/cv+ ⋅ ct+/ct+
↓
Gametes v+ ⋅ cv ⋅ t v ⋅ cv+ ⋅ ct+
v /v+ ⋅ cv /cv+ ⋅ ct /ct+
F1 trihybrid
Let’s analyze the loci two at a time, starting with the v and cv loci. In other words,
we look at just the first two columns under “Gametes” and cover up the third one.
Because the parentals for this pair of loci are v +· cv and v· cv +, we know that the
recombinants are by definition v· cv and v +· cv +. There are 45 + 40 + 89 + 94 = 268
140 CHAPTER 4 Mapping Eukaryote Chromosomes by Recombination
The testcross can be rewritten as follows, now that we know the linkage
arrangement:
v+ ct cv/v ct+ cv+ × v ct cv/v ct cv
Note several important points here. First, we have deduced a gene order that
is different from that used in our list of the progeny genotypes. Because the point
of the exercise was to determine the linkage relation of these genes, the original
listing was of necessity arbitrary; the order was simply not known before the data
were analyzed. Henceforth, the genes must be written in correct order.
Second, we have definitely established that ct is between v and cv. In the dia-
gram, we have arbitrarily placed v to the left and cv to the right, but the map could
equally well be drawn with the placement of these loci inverted.
Third, note that linkage maps merely map the loci in relation to one another,
with the use of standard map units. We do not know where the loci are on a chro-
mosome—or even which specific chromosome they are on. In subsequent analy-
ses, as more loci are mapped in relation to these three, the complete chromosome
map would become “fleshed out.”
F i g u r e 4 -11 Example of a double A final point to note is that the two smaller map distances, 13.2 m.u. and
crossover between two chromatids. 6.4 m.u., add up to 19.6 m.u., which is greater than 18.5 m.u., the distance calcu-
Notice that a double crossover produces lated for v and cv. Why? The answer to this question lies in the way in which we
double-recombinant chromatids that have treated the two rarest classes of progeny (totaling 8) with respect to the
have the parental allele combinations at recombination of v and cv. Now that we have the map, we can see that these two
the outer loci. The position of the
rare classes are in fact double recombinants, arising from two crossovers (Figure
centromere cannot be determined from
the data. It has been added for 4-11). However, when we calculated the RF value for v and cv, we did not count the
completeness. v ct cv+ and v+ ct+ cv genotypes; after all, with regard to v and cv , they are parental
combinations (v cv+ and v+ cv). In light of our map, however,
Double recombinants arising we see that this oversight led us to underestimate the dis-
from two crossovers tance between the v and the cv loci. Not only should we have
counted the two rarest classes, we should have counted each
v ct + cv + of them twice because each represents double recombinants.
Hence, we can correct the value by adding the numbers 45 +
v ct + cv + 40 + 89 + 94 + 3 + 3 + 5 + 5 = 284. Of the total of 1448, this
v+ ct cv number is exactly 19.6 percent, which is identical with the
sum of the two component values. (In practice, we do not
v+ ct cv need to do this calculation, because the sum of the two shorter
distances gives us the best estimate of the overall distance.)
4.2 Mapping by Recombinant Frequency 141
Interference
Knowing the existence of double crossovers permits us to ask questions about their
possible interdependence. We can ask, Are the crossovers in adjacent chromosome
regions independent events or does a crossover in one region affect the likelihood of
there being a crossover in an adjacent region? The answer is that, generally, cross-
overs inhibit each other somewhat in an interaction called interference. Double-
recombinant classes can be used to deduce the extent of this interference.
Interference can be measured in the following way. If the crossovers in the two
regions are independent, we can use the product rule (see page 94) to predict the
frequency of double recombinants: that frequency would equal the product of the
recombinant frequencies in the adjacent regions. In the v-ct-cv recombination data,
the v-ct RF value is 0.132 and the ct-cv value is 0.064; so, if there is no interference,
double recombinants might be expected at the frequency 0.132 × 0.064 = 0.0084
(0.84 percent). In the sample of 1448 flies, 0.0084 × 1448 = 12 double recombinants
v ct + cv + v ct cv +
v+ ct cv v+ ct + cv
ct + v cv + ct + v+ cv +
ct v+ cv ct v cv
are expected. But the data show that only 8 were actually observed. If this deficiency
of double recombinants were consistently observed, it would show us that the two
regions are not independent and suggest that the distribution of crossovers favors
singles at the expense of doubles. In other words, there is some kind of interference:
a crossover does reduce the probability of a crossover in an adjacent region.
Interference is quantified by first calculating a term called the coefficient of
coincidence (c.o.c.), which is the ratio of observed to expected double recombi-
nants. Interference (I) is defined as 1 − c.o.c. Hence,
observed frequency, or
number, of double recombinants
I = 1−
expected frequency, or
ble recombinants
number, of doub
In our example,
8 4 1
I=1- 12 = 12 = 3
, or 33 percent
In some regions, there are never any observed double recombinants. In these
cases, c.o.c. = 0, and so I = 1 and interference is complete. Interference values any-
where between 0 and 1 are found in different regions and in different organisms.
You may have wondered why we always use heterozygous females for test-
crosses in Drosophila. The explanation lies in an unusual feature of Drosophila
males. When, for example, pr vg / pr+ vg+ males are crossed with pr vg / pr vg females,
only pr vg / pr+ vg+ and pr vg / pr vg progeny are recovered. This result shows that
there is no crossing over in Drosophila males. However, this absence of crossing over
in one sex is limited to certain species; it is not the case for males of all species (or
for the heterogametic sex). In other organisms, there is crossing over in XY males
and in WZ females. The reason for the absence of crossing over in Drosophila males
is that they have an unusual prophase I, with no synaptonemal complexes. Inciden-
tally, there is a recombination difference between human sexes as well. Women
show higher recombinant frequencies for the same autosomal loci than do men.
With the use of a reiteration of the preceding recombination-based tech-
niques, maps have been produced of thousands of genes for which variant
(mutant) phenotypes have been identified. A simple illustrative example from
the tomato is shown in Figure 4-13. The tomato chromosomes are shown in Figure
4-13a, their numbering in Figure 4-13b, and recombination-based gene maps in
Figure 4-13c. The chromosomes are shown as they appear under the microscope,
together with chromosome maps based on linkage analysis of various allelic pairs
shown with their phenotypes.
5
9 LII
3
12
11 9SI
2
7 8SI 10
4
1 8I
(a) (b)
1 2 5 7
(c)
23
Normal (F ) 23 Fasciated (f )
Normal (M ) 12 Mottled (m ) Red (R ) Yellow (r ) Green-base Uniform fruit
(U ) (u)
15
Purple (A ) Green (a )
4 21
Tall (D) Dwarf (d ) 18 Smooth (H ) Hairy (h)
Yellow (W f ) White (w f )
Hairy (H I) Hairless (h l )
3 20
17 Non-tangerine 30 Tangerine (t)
Smooth (P) Peach (p )
(T )
2
Normal (L f ) 16 Leafy (lf )
30
Dihybrid testcrossed
(independent assortment) 1:1:1:1
Dihybrid selfed
(independent assortment) 9:3:3:1
Dihybrid testcrossed
(linked) [Example only (P:R:R:P)]
Trihybrid testcrossed
(independent assortment)
1:1:1:1:1:1:1:1
Trihybrid testcrossed
(all linked) [Example only
(P:P:SCO:SCO:SCO:SCO:DCO:DCO)]
....AAGGCTCAT....
....TTCCGAGTA....
and, in another, it might be
....AAAGCTCAT....
....TTTCGAGTA....
4.3 Mapping with Molecular Markers 14 5
5′ C-A-C-A-C-A-C-A-C-A-C-A-C-A-C-A 3′
3′ G-T-G-T-G-T-G-T-G-T-G-T-G-T-G-T 5′
146 CHAPTER 4 Mapping Eukaryote Chromosomes by Recombination
These results tell us that the testcross must have been in the following
conformation:
A M1/a M2 × a M1/a M1
and the two progeny genotypes on the right in the list must be recombinants,
giving a map distance of 2 m.u. between the A/a locus and the molecular locus
M1/M2. Hence, we now know the general location of the gene in the genome and
can narrow its location down with more finely scaled approaches. In addition,
different molecular markers can be mapped to each other, creating a map that can
act like a series of stepping-stones on the way to some gene with an interesting
phenotype.
Although mapping molecular markers with the use of what are effectively
testcrosses is the simplest type of informative analysis, in many analyses (such as
those in humans) the molecular markers cannot be mapped using a testcross.
However, because each molecular allele has its own signature, recombinant and
nonrecombinant products can be identified from any meiosis, even in crosses that
are not testcrosses. Such an analysis is diagrammed in Figure 4-15.
Figure 4-16 contains some real data showing how molecular markers can flesh
out a map of a human chromosome. You can see that the number of mapped
molecular markers greatly exceeds the number of mapped genes with mutant
phenotypes. Note that SNPs, because of their even higher density, cannot be rep-
resented on a whole-chromosome map such as that in Figure 4-16, inasmuch as
there would be thousands of them. One centimorgan (1 m.u.) of human DNA is a
huge segment, estimated as 1 megabase (1 Mb = 1 million base pairs, or 1000 kb).
Hence, you can see the need for closely packed molecular markers for a fine-scale
analysis that resolves smaller distances. Note that the DNA equivalent of 1 m.u.
p M´ p M´´´
×
P M´´ p M´´´´
1 2 3 4 5 6
M´´´´
F i g u r e 4 -15 A PCR banding pattern is
M´´
M´
shown for a family with six children, and
this pattern is interpreted at the top of the
M´´´ illustration with the use of four different-
size microsatellite “alleles,” M′ through
M″″. One of these markers (M′′ ) is
probably linked in cis configuration to the
PCR products disease allele P. (Note: This mating is not a
testcross, yet is informative about linkage.)
14 8 CHAPTER 4 Mapping Eukaryote Chromosomes by Recombination
D1S434
19 9 -1 D1S234
D1S247
36.2 idiogram), by using techniques described later in this chapter. Having
D1S513
36.1
common landmark markers on the different genetic maps permits the
D1S233
D1S201 locations of other genes and molecular markers to be estimated on
D1S441 35
1- 10 13.7 D1S472 D1S496 each map. [ Data from B. R. Jasny et al., Science, September 30, 1994.]
D1S186 34.3
D1S1157 34.2
2 8 3 17.0 D1S193 34.1
D1S319
D1S161 33
D1S209
3 6 4 7.7
5 10 2 7.3
D1S417 32.3 varies a lot between species; for example, in the malarial
D1S200 32.2
3 8 4.7 D1S476 32.1 parasite Plasmodium falciparium, 1 m.u. = 17 kb.
6 18 3 11.4 D1S220 31.3
D1S312 31.2
D1S473
1- 6 13.4 D1S246 31.1 K e y C o n c e p t Loci of any DNA heterozygosity can be
D1S1613
4 7 8.7 D1S198 mapped and used as molecular chromosome markers or
22.3
1 4 1 5.0 D1S159 milestones.
D1S224 22.2
1 3 11.8 D1S532 22.1
D1S500
D1S221
2 15 9.8 D1S1728 21
D1S207
1 11 -1 11.0 D1S167
D1S188
13.3
13.2
13.1
4.4 Centromere Mapping with Linear
D1S236 12
5 9 -1 14.2
D1S223
D1S239
11
11
Tetrads
1 9 9.4 D1S221
4 6 1 4.4 D1S187
D1S418
Centromeres are not genes, but they are regions of DNA on
12
6 11 4 13.9 D1S189
D1S440
which the orderly reproduction of living organisms abso-
6 11 -1 13.6
D1S534
D1S498
21.1
21.2
lutely depends and are therefore of great interest in genetics.
5 1 6.5
D1S305
D1S303
21.3 In most eukaryotes, recombination analysis cannot be used
22
2 4 7.9 SPTA1 to map the loci of centromeres because they show no hetero-
D1S431
CRP
1 6 1 7.5 D1S484
APOA2
23 zygosity that can enable them to be used as markers. How-
5 7 10.7
7 1 6.2
D1S104
D1S194 24
ever, in the fungi that produce linear tetrads (see Chapter 3,
1 11 1 5.6 D1S318
D1S210
page 103), centromeres can be mapped. We will use the fun-
D1S218
D1S416
25 gus Neurospora as an example. Recall that, in haploid fungi
8 24 6 33.8 D1S215 such as Neurospora, haploid nuclei from each parent fuse to
D1S237 D1S412
D1S399 31
D1S240
D1S191
form a transient diploid, which undergoes meiotic divisions
32.1
1 10 9.3 D1S518
D1S461 32.2
along the long axis of the ascus, and so each meiocyte pro-
2 14 9.5 D1S422
D1S412
32.3
41
duces a linear array of eight ascospores, called an octad.
2 5 12.7 D1S310
D1S510
These eight ascospores constitute the four products of meio-
42.1
D1S249 sis (a tetrad) plus a postmeiotic mitosis.
D1S446
a A a A A a A
a A
126 132 9 11 10 12 A
a
Total = 300 A A
a
a a
The first two columns on the left are from meioses First
with no crossover in the region between the A locus division
a
and the centromere. The letter M is used to stand for Second
division
a type of segregation at meiosis. The patterns for the
Mitosis
first two columns are called MI patterns, or first-
division segregation patterns, because the two dif-
A second-
ferent alleles segregate into the two daughter nuclei at division
the first division of meiosis. The other four columns are segregation
all from meiocytes with a crossover. These patterns are pattern, M II
called second-division segregation patterns (MII)
because, as a result of crossing over in the centromere- F i g u r e 4 -17 A and a segregate into
to-locus region, the A and a alleles are still together in the nuclei at the end of the separate nuclei at the second meiotic
first division of meiosis (Figure 4-17). There has been no first-division segrega- division when there is a crossover
tion. However, the second meiotic division does segregate the A and a alleles into between the centromere and the A locus.
separate nuclei. The other patterns are produced similarly; the difference is that
the chromatids move in different directions at the second division (Figure 4-18).
You can see that the frequency of octads with an MII pattern should be propor-
tional to the size of the centromere–A/a region and could be used as a measure of
the size of that region. In our example, the MII frequency is 42/300 = 14 percent.
Does this percentage mean that the A/a locus is 14 m.u. from the centromere? The
answer is no, but this value can be used to calculate the number of map units. The
14 percent value is a percentage of meioses, which is not the way that map units are
A a A a
A a A a
a a A A a a A A
F i g u r e 4 -18 In the second
A A a a a a A A meiotic division, the centromeres
a A A a attach to the spindle at random,
a A A a producing the four arrangements
1 2 3 4 shown. The four arrangements
are equally frequent.
150 CHAPTER 4 Mapping Eukaryote Chromosomes by Recombination
defined. Map units are defined as the percentage of recombinant chromatids issuing
from meiosis. Because a crossover in any meiosis results in only 50 percent recom-
binant chromatids (four out of eight; see Figure 4-17), we must divide the 14 percent
by 2 to convert the MII frequency (a frequency of meioses) into map units (a fre-
quency of recombinant chromatids). Hence, this region must be 7 m.u. in length,
and this measurement can be introduced into the map of that chromosome.
A B 0.25
A b 0.25
a B 0.25
a b 0.25
A cross of this type was made and the following phenotypes obtained in a
progeny sample of 200.
A B 60
A b 37
a B 41
a b 62
There is clearly a deviation from the prediction of no linkage (which would
have given the progeny numbers 50 : 50 : 50 : 50). The results suggest that the dihy-
brid was a cis configuration of linked genes, A B / a b, because the progeny A B and
a b are in the majority. The recombinant frequency would be (37 + 41)/200 =
78/200 = 39 percent, or 39 m.u.
However, we know that chance deviations due to sampling error can provide
results that resemble those produced by genetic processes; hence, we need the χ2
(pronounced “chi square”) test to help us calculate the probability of a chance
deviation of this magnitude from a 1 : 1 : 1 : 1 ratio.
First, let us examine the allele ratios for both loci. These are 97 : 103 for A : a,
and 101 : 99 for B : b. Such numbers are close to the 1 : 1 allele ratios expected from
Mendel’s first law, so skewed allele ratios cannot be responsible for the quite large
deviations from the expected numbers of 50 : 50 : 50 : 50.
We must apply the χ2 analysis to test a hypothesis of no linkage. If that hypoth-
esis is rejected, we can infer linkage. (We cannot test a hypothesis of linkage
directly because we have no way of predicting what recombinant frequency to
test.) The calculation for testing lack of linkage is as follows:
Since there are four genotypic classes, we must use 4 − 1 = 3 degrees of freedom.
Consulting the chi-square table in Chapter 3, we see our values of 9.88 and 3 df give
a p value of ~0.025, or 2.5 percent. This is less than the standard cut-off value of
5 percent, so we can reject the hypothesis of no linkage. Hence, we are left with the
conclusion that the genes are very likely linked, approximately 39 m.u. apart.
Notice, in retrospect, that it was important to make sure alleles were segregat-
ing 1 : 1 to avoid a compound hypothesis of 1 : 1 allele ratios and no linkage. If we
rejected such a compound hypothesis, we would not know which part of it was
responsible for the rejection.
F i g u r e 4 -19 Demonstration that the
average RF is 50 percent for meioses in
which the number of crossovers is not
4.6 Accounting for Unseen Multiple Crossovers zero. Recombinant chromatids are brown.
Two-strand double crossovers produce all
In the discussion of the three-point testcross, some parental (nonrecombinant) parental types; so all the chromatids are
chromatids resulted from double crossovers. These crossovers initially could not be orange. Note that all crossovers are
counted in the recombinant frequency, skewing the results. This situation leads to between nonsister chromatids. Try the
the worrisome notion that all map distances based on recombinant frequency triple crossover class yourself.
might be underestimations of physical distances because
undetected multiple crossovers might have occurred, some of Any number of crossovers gives
whose products would not be recombinant. Several creative 50 percent recombinants
mathematical approaches have been designed to get around
the multiple-crossover problem. We will look at two methods.
A B
First, we examine a method originally worked out by J. B. S. No crossovers
A B
Haldane in the early years of genetics. RF
a b 0
4
a b 0%
A mapping function
The approach worked out by Haldane was to devise a map-
A B
ping function, a formula that relates an observed recombi- One crossover
A B
nant-frequency value to a map distance corrected for multiple (Can be between any RF
crossovers. The approach works by relating RF to the mean nonsister pair.) a b 2
4
50%
number of crossovers, m, that must have taken place in that a b
chromosomal segment per meiosis and then deducing what
map distance this m value should have produced. A B
Two crossovers
To find the relation of RF to m, we must first think about A B
RF
Two-
(Holding one crossover strand
outcomes of the various crossover possibilities. In any chro- constant and varying
a b
0
4
double
0%
mosomal region, we might expect meioses with 0, 1, 2, 3, 4, the position of the a b crossover
or more crossovers. Surprisingly, the only class that is really second produces four
equally frequent two-
crucial is the zero class. To see why, consider the following. It crossover meioses.) A B
is a curious but nonintuitive fact that any number of cross- A B Three-
overs produces a frequency of 50 percent recombinants a b
RF
2 strand
within those meioses. Figure 4-19 proves this statement for
4
double
50% crossover
a b
single and double crossovers as examples, but it is true for
any number of crossovers. Hence, the true determinant of
A B
RF is the relative sizes of the classes with no crossovers (the
A B Three-
zero class) compared with the classes with any nonzero RF strand
2
number of crossovers. a b 4
double
50% crossover
Now the task is to calculate the size of the zero class. The a b
occurrence of crossovers in a specific chromosomal region is
well described by a statistical distribution called the Poisson A B
distribution. The Poisson formula in general describes the A B Four-
RF
distribution of “successes” in samples when the average a b 4 strand
4
double
probability of successes is low. An illustrative example is to a b
100%
crossover
dip a child’s net into a pond of fish: most dips will produce
8
no fish, a smaller proportion will produce one fish, an even Average two-crossover RF 16 50%
smaller proportion two, and so on. This analogy can be
152 CHAPTER 4 Mapping Eukaryote Chromosomes by Recombination
fi = (e-mmi)/i!
The Poisson distribution tells us that the frequency of the i = 0 class (the key
one) is
m0
e-m
0!
Because m0 and 0! both equal 1, the formula reduces to e−m.
Now we can write a function that relates RF to m. The frequency of the class
with any nonzero number of crossovers will be 1 − e −m, and, in these meioses,
50 percent (1/2) of the products will be recombinant; so
RF = 21 (1 - e-m)
and this formula is the mapping function that we have been seeking.
Let’s look at an example in which RF is converted into a map distance corrected
for multiple crossovers. Assume that, in one testcross, we obtain an RF value of
27.5 percent (0.275). Plugging this into the function allows us to solve for m:
0.275 = 21 (1 - e-m)
so
e-m = 1 - (2 × 0.275) = 0.45
By using a calculator to find the natural logarithm (ln) of 0.45, we can deduce
that m = 0.8. That is, on average, there are 0.8 crossovers per meiosis in that chro-
mosomal region.
The final step is to convert this measure of crossover frequency to give a “cor-
rected” map distance. All that we have to do to convert into corrected map units
is to multiply the calculated average crossover frequency by 50 because, on aver-
age, a crossover produces a recombinant frequency of 50 percent. Hence, in the
preceding numerical example, the m value of 0.8 can be converted into a cor-
rected recombinant fraction of 0.8 × 50 = 40 corrected m.u. We see that, indeed,
this value is substantially larger than the 27.5 m.u. that we would have deduced
from the observed RF.
Note that the mapping function neatly explains why the maximum RF value
for linked genes is 50 percent. As m gets very large, e−m tends to zero and the RF
tends to 1/2, or 50 percent.
The recombinant genotypes are shown in red. If the genes are linked, a simple
approach to mapping their distance apart might be to use the following formula:
because this formula gives the percentage of all recombinants. However, in the
1960s, David Perkins developed a formula that compensates for the effects of
double crossovers. The Perkins formula thus provides a more accurate estimate of
map distance:
corrected map distance = 50(T + 6 NPD)
We will not go through the derivation of this formula other than to say that it is
based on the totals of the PD, T, and NPD classes expected from meioses with 0, 1,
and 2 crossovers (it assumes that higher numbers are vanishingly rare). Let’s look at
an example of its use. In our hypothetical cross of A B × a b, the observed frequen-
cies of the tetrad classes are 0.56 PD, 0.41 T, and 0.03 NPD. By using the Perkins
formula, we find the corrected map distance between the a and b loci to be
Let us compare this value with the uncorrected value obtained directly from
the RF. By using the same data, we find
This distance is 6 m.u. less than the estimate that we obtained by using the
Perkins formula because we did not correct for double crossovers.
As an aside, what PD, NPD, and T values are expected when dealing with
unlinked genes? The sizes of the PD and NPD classes will be equal as a result of
independent assortment. The T class can be produced only from a crossover
between either of the two loci and their respective centromeres, and, therefore,
the size of the T class will depend on the total size of the two regions lying between
locus and centromere. However, the formula 21 T + NPD should always yield 0.50,
reflecting independent assortment.
Recombination
map
F i g u r e 4 -2 0 Comparison of relative
positions on physical and recombination
of interest. Further studies are needed to narrow the choice to one. If that single maps can connect phenotype with an
case is a gene whose function is known for other organisms, then a function for unknown gene function.
the gene of interest is suggested. In this way, the phenotype mapped on the
recombination map can be tied to a function deduced from the physical map.
Molecular markers on both maps (not shown in Figure 4-20) can be aligned to
help in the zeroing-in process. Hence, we see that both maps contain elements of
function: the physical map shows a gene’s possible action at the cellular level,
whereas the recombination map contains information related to the effect of the
gene at the phenotypic level. At some stage, the two have to be melded to under-
stand the gene’s contribution to the development of the organism.
There are several other genetic-mapping techniques, some of which we will
encounter in Chapters 5, 18, and 19.
F i g u r e 4 -2 1 A molecular model of
Crossing over creates heteroduplex DNA
crossing over. Only the two chromatids
(blue and red) participating in the crossover
Inner two chromatids
are shown. The 3′-to-5′ strand is placed on
Double-strand break the inside of both for clarity. The
chromatids differ at one site, GC, in one
G allele (perhaps allele A) and AT in the other
5' 3'
(perhaps a). Only the outcome with
3' 5' mispaired heteroduplex DNA and a
C
T crossover are shown. The final crossover
3' 5' products are shaded in yellow and blue.
5' 3'
A
The observation of a nonidentical
Erosion
sister-spore pair suggests that the DNA
of one of the final four meiotic homologs
G
contains heteroduplex DNA. Hetero-
duplex DNA is DNA in which there is a
T
mismatched nucleotide pair in the gene
under study. The logic is as follows. If in
A
a cross of A × a, one allele (A) is G : C and
the other allele (a) is A : T, the two alleles
Invasion and would usually replicate faithfully. How-
displacement
ever, a heteroduplex, which forms only
G rarely, would be a mismatched nucleo-
tide pair such G : T or A : C (effectively a
DNA molecule bearing both A and a
T
information). Note that a heteroduplex
involves only one nucleotide position:
A the surrounding DNA segment might be
as follows, where the heteroduplex site is
Polymerization
shown in bold:
G GCTAATGTTATTAG
CGATTATAATAATC
T At replication to form an octad, a
G : T heteroduplex would pull apart and
A replicate faithfully, with G bonding to
Heteroduplex region C and A bonding to T. The result would
Resolution to be a nonidentical spore pair of G : C
crossover by nicks ( ) (allele A) and A : T (allele a).
G Nonidentical sister spores (and
aberrant octads generally) were found
T to be statistically correlated with cross-
T ing over in the region of the gene con-
cerned, providing an important clue
A that crossing over might be based on the
formation of heteroduplex DNA.
In the currently accepted model (follow it in Figure 4-21), the heteroduplex
DNA and a crossover are both produced by a double-stranded break in the DNA
of one of the chromatids participating in the crossover. Let’s see how that works.
Molecular studies show that broken ends of DNA will promote recombination
between different chromatids. In step 1, both strands of a chromatid break in the
same location. From the break, DNA is eroded at the 5′ end of each broken strand,
leaving both 3′ ends single stranded (step 2). One of the single strands “invades”
the DNA of the other participating chromatid; that is, it enters the center of the
Summary 157
helix and base-pairs with its homologous sequence (step 3), displacing the other
strand. Then the tip of the invading strand uses the adjacent sequence as a tem-
plate for new polymerization, which proceeds by forcing the two resident strands
of the helix apart (step 4). The displaced single-stranded loop hydrogen bonds
with the other single strand (the blue one in the figure). If the invasion and strand
displacement spans a site of heterozygosity (such as A/a), then a region of hetero-
duplex DNA is formed. Replication also takes place from the other single-stranded
end to fill the gap left by the invading strand (also shown on the upper blue strand
in step 4 of Figure 4-21). The replicated ends are sealed, and the net result is a
strange structure with two single-stranded junctions called Holliday junctions
after their original proposer, Robin Holliday. These junctions are potential sites of
single-strand breakage and reunion; two such events, shown by the darts in the
figure, then lead to a complete double-stranded crossover (step 5).
Note that when the invading strand uses the invaded DNA as a replication
template, this automatically results in an extra copy of the invaded sequence at
the expense of the invading sequence, thus explaining the departure from the
expected 4 : 4 ratio.
This same sort of recombination takes place at many different chromosomal
sites where the invasion and strand displacement do not span a heterozygous
mutant site. Here DNA would be formed that is heteroduplex in the sense that it
is composed of strands of each participating chromatid, but there would not be a
mismatched nucleotide pair and the resulting octad would contain only identical
spore pairs. Those rare occasions in which the invasion and polymerization do
span a heterozygous site are simply lucky cases that provided the clue for the
mechanism of crossing over.
s u m m a ry
In a dihybrid testcross in Drosophila, Thomas Hunt Morgan wondered if these values corresponded to the actual dis-
found a deviation from Mendel’s law of independent assort- tances between genes on a chromosome. Alfred Sturtevant, a
ment. He postulated that the two genes were located on the student of Morgan’s, developed a method of determining the
same pair of homologous chromosomes. This relation is distance between genes on a linkage map, based on the RF.
called linkage. The easiest way to measure RF is with a testcross of a dihy-
Linkage explains why the parental gene combinations brid or trihybrid. RF values calculated as percentages can be
stay together but not how the recombinant (nonparental) used as map units to construct a chromosomal map showing
combinations arise. Morgan postulated that, in meiosis, there the loci of the genes analyzed. In ascomycete fungi, centro-
may be a physical exchange of chromosome parts by a pro- meres also can be located on the map by measuring second-
cess now called crossing over. A result of the physical break- division segregation frequencies.
age and reunion of chromosome parts, crossing over takes Single nucleotide polymorphisms (SNPs) are single-
place at the four-chromatid stage of meiosis. Thus, there are nucleotide differences in sequences of DNA. Single
two types of meiotic recombination. Recombination by sequence length polymorphisms (SSLPs) are differences in
Mendelian independent assortment results in a recombinant the number of repeating units. SNPs and SSLPs can be used
frequency of 50 percent. Crossing over results in a recombi- as molecular markers for mapping genes.
nant frequency generally less than 50 percent. Although the basic test for linkage is deviation from
As Morgan studied more linked genes, he discovered independent assortment, such a deviation may not be obvi-
many different values for recombinant frequency and ous in a testcross, and a statistical test is needed. The χ2 test,
158 CHAPTER 4 Mapping Eukaryote Chromosomes by Recombination
which tells how often observations deviate from expecta- which shows all the gene-like sequences. Knowledge of gene
tions purely by chance, is particularly useful in determining position in both maps enables the melding of cellular func-
whether loci are linked. tion with a gene’s effect on phenotype.
Some multiple crossovers can result in nonrecombinant The mechanism of crossing over is thought to start with
chromatids, leading to an underestimation of map distance a double-stranded break in one participating chromatid.
based on RF. The mapping function, applicable in any Erosion leaves the ends single stranded. One single strand
organism, corrects for this tendency. The Perkins formula invades the double helix of the other participating chroma-
has the same use in fungal tetrad analysis. tid, leading to the formation of heteroduplex DNA. Gaps are
In genetics generally, the recombination-based map of filled by polymerization. The molecular resolution of this
loci conferring mutant phenotypes is used in conjunction structure becomes a full double-stranded crossover at the
with a physical map such as the complete DNA sequence, DNA level.
key terms
centimorgan (cM) (p. 136) interference (p. 141) restriction fragment length
chromosome map (p. 129) linkage map (p. 136) polymorphism (RFLP) (p. 145)
cis conformation (p. 132) linked genes (p. 129) second-division segregation pattern
coefficient of coincidence (c.o.c.) locus (p. 129) (MII) (p. 149)
(p. 142) mapping function (p. 151) simple sequence length
crossing over (p. 132) microsatellite marker (p. 145) polymorphism (SSLP) (p. 145)
crossover product (p. 132) minisatellite marker (p. 145) single nucleotide polymorphism
DNA fingerprint (p. 146) molecular marker (p. 144) (SNP) (p. 145)
double-stranded break (p. 156) octad (p. 148) three-factor cross (p. 139)
first-division segregation pattern physical map (p. 154) three-point testcross (p. 139)
(MI pattern) (p. 149) Poisson distribution (p. 151) trans conformation (p. 132)
genetic map unit (m.u.) recombinant frequency (RF) variable number tandem repeat
(p. 136) (p. 136) (VNTR) (p. 145)
heteroduplex DNA (p. 156) recombination map (p. 129)
s olv e d p r obl e m s
SOLVED PROBLEM 1. A human pedigree shows people c. If there is evidence of linkage, then draw the alleles on
affected with the rare nail–patella syndrome (misshapen the relevant homologs of the grandparents. If there is no
nails and kneecaps) and gives the ABO blood-group genotype evidence of linkage, draw the alleles on two homologous
of each person. Both loci concerned are autosomal. Study the pairs.
pedigree below. d. According to your model, which generation II descen-
a. Is the nail–patella syndrome a dominant or recessive phe- dants are recombinants?
notype? Give reasons to support your answer. e. What is the best estimate of RF?
b. Is there evidence of linkage between the nail–patella f. If man III-1 marries a normal woman of blood type O,
gene and the gene for ABO blood type, as judged from this what is the probability that their first child will be blood
pedigree? Why or why not? type B with nail–patella syndrome?
1 2
I
i/i I B/i
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
II
i/i I B/i I B/i i/i I B/i I B/i I A/i i/i i/i i/i I B/i i/i i/i i/i I B/i I B/i
1 2 3 4 5
III
I B/i I B/i I B/i I A/i I A/i
Solved Problems 159
and for wx–cn, convert this percentage into map units, we must divide by 2,
which gives 5.05 m.u.
69+ 67+ 6 +5
RF = = 14 .7 % nic
1000
Now we have to put the two together and decide between the
b cn1 wx following alternatives, all of which are compatible with the
preceding locus-to-centromere distances:
c. The expected number of double recombinants is 0.103 ×
0.147 × 1000 = 15.141. The observed number is 6 + 5 = 11, a. nic ad
and so interference can be calculated as
I = 1 − (11/15.141) = 1 − 0.726 = 0.274 = 27.4% 5.05 m.u. 9.30 m.u.
b. nic ad
SOLVED PROBLEM 3. A cross is made between a haploid
strain of Neurospora of genotype nic+ ad and another hap- 5.05 m.u. 9.30 m.u.
loid strain of genotype nic ad +. From this cross, a total of c. nic ad
1000 linear asci are isolated and categorized as in the table
below. Map the ad and nic loci in relation to centromeres
5.05 m.u.
and to each other.
9.30 m.u.
Solution
What principles can we draw on to solve this problem? It is a Here, a combination of common sense and simple analy-
good idea to begin by doing something straightforward, sis tells us which alternative is correct. First, an inspection of
which is to calculate the two locus-to-centromere distances. the asci reveals that the most common single type is the one
We do not know if the ad and the nic loci are linked, but we do labeled 1, which contains more than 80 percent of all the asci.
not need to know. The frequencies of the MII patterns for This type contains only nic+⋅ ad and nic ⋅ ad + genotypes, and
each locus give the distance from locus to centromere. (We they are parental genotypes. So we know that recombination
can worry about whether it is the same centromere later.) is quite low and the loci are certainly linked. This rules out
Remember that an MII pattern is any pattern that is not alternative a.
two blocks of four. Let’s start with the distance between the Now consider alternative c. If this alternative were cor-
nic locus and the centromere. All we have to do is add the rect, a crossover between the centromere and the nic locus
ascus types 4, 5, 6, and 7, because all of them are MII pat- would generate not only an MII pattern for that locus, but
terns for the nic locus. The total is 5 + 90 + 1 + 5 = 101 of also an MII pattern for the ad locus, because it is farther
1000, or 10.1 percent. In this chapter, we have seen that to from the centromere than nic is. The ascus pattern pro-
1 2 3 4 5 6 7
nic ad nic ad nic ad nic ad nic ad nic ad nic ad
nic ad nic ad nic ad nic ad nic ad nic ad nic ad
nic ad nic ad nic ad nic ad nic ad nic ad nic ad
nic ad nic ad nic ad nic ad nic ad nic ad nic ad
nic ad nic ad nic ad nic ad nic ad nic ad nic ad
nic ad nic ad nic ad nic ad nic ad nic ad nic ad
nic ad nic ad nic ad nic ad nic ad nic ad nic ad
nic ad nic ad nic ad nic ad nic ad nic ad nic ad
808 1 90 5 90 1 5
Problems 161
1
duced by a crossover between nic and the centromere in al- the recombinant ascospores, but using the formula RF = 2 T
ternative c should be + NPD is simpler. The T asci are classes 3, 4, and 7, and the
nic1 ad NPD asci are classes 2 and 6. Hence, RF = [ 21 (100) + 2]/1000
= 5.2 percent, or 5.2 m.u., and a better map is
nic1 ad
nic ad
nic1 ad nic ad1
nic1 ad nic ad1 5.05 m.u. 5.2 m.u.
nic ad1 nic1 ad
nic ad1 10.25 m.u.
nic1 ad
The reason for the underestimation of the ad-to-
nic ad1 centromere distance calculated from the MII frequency is
nic ad1 the occurrence of double crossovers, which can produce an
MI pattern for ad, as in ascus type 4:
Remember that the nic locus shows MII patterns in asci types nic1 ad
4, 5, 6, and 7 (a total of 101 asci); of them, type 5 is the very
one that we are talking about and contains 90 asci. Therefore, nic1 ad
alternative c appears to be correct because ascus type 5 com- nic ad
prises about 90 percent of the MII asci for the nic locus. This nic1 ad
relation would not hold if alternative b were correct because nic1 ad nic ad
crossovers on either side of the centromere would generate nic ad1 nic1 ad1
the MII patterns for the nic and the ad loci independently. nic ad1
Is the map distance from nic to ad simply 9.30 − 5.05 = nic1 ad1
4.25 m.u.? Close, but not quite. The best way of calculating nic ad1
map distances between loci is always by measuring the recom-
binant frequency. We could go through the asci and count all nic ad1
p r obl e m s
Most of the problems are also available for review/grading through the http://www.whfreeman.com/
.com/launchpad/iga11e.
Working with the Figures 8. Using the conventions of Figure 4-15, draw parents and
1. In Figure 4-3, would there be any meiotic products that progeny classes from a cross
did not undergo a crossover in the meiosis illustrated? If P M′′′/p M′ × p M′/p M′′′′
so, what colors would they be in the color convention
used? 9. In Figure 4-17, draw the arrangements of alleles in an oc-
tad from a similar meiosis in which the upper product of
2. In Figure 4-6, why does the diagram not show meioses
the first division segregated in an upside-down manner
in which two crossovers occur between the same two
at the second division.
chromatids (such as the two inner ones)?
10. In Figure 4-19, what would be the RF between A/a and
3. In Figure 4-8, some meiotic products are labeled
B/b in a cross in which purely by chance all meioses had
parental. Which parent is being referred to in this
four-strand double crossovers in that region?
terminology?
11. a. In Figure 4-21, let GC = A and AT = a, then draw
4. In Figure 4-9, why is only locus A shown in a constant
the fungal octad that would result from the final struc-
position?
ture (5).
5. In Figure 4-10, what is the mean frequency of crossovers
b. (Challenging) Insert some closely linked flanking
per meiosis in the region A–B ? The region B–C ?
markers into the diagram, say P/p to the left and Q/q to
6. In Figure 4-11, is it true to say that from such a cross the the right (assume either cis or trans arrangements). As-
product v cv+ can have two different origins? sume neither of these loci show non-Mendelian segre-
7. In Figure 4-14, in the bottom row four colors are labeled gation. Then draw the final octad based on the structure
SCO. Why are they not all the same size (frequency)? in part 5.
162 CHAPTER 4 Mapping Eukaryote Chromosomes by Recombination
2
a. Determine which genes are linked.
e⋅f 6 b. Draw a map that shows distances in map units.
Explain these results. c. Calculate interference, if appropriate.
Recombinant for
Progeny
Class phenotypes F1 gametes Numbers p b p–v v–b
1 gre sen pla 3,210
2 pur res bro p v b 3,222
3 gre res pla v 1,024 R R
4 pur sen bro p b 1,044 R R
5 pur res pla p v 690 R R
6 gre sen bro b 678 R R
7 gre res bro v b 72 R R
8 pur sen pla p 60 R R
www
Unpacking Problem 21 24. What is the formula for interference? How are the “ex-
www pected” frequencies calculated in the coefficient-of-
1. Sketch cartoon drawings of the P, F1, and tester corn coincidence formula?
plants, and use arrows to show exactly how you would
perform this experiment. Show where seeds are obtained. 25. Why does part c of the problem say “if appropriate”?
26. How much work is it to obtain such a large progeny size
2. Why do all the +’s look the same, even for different
in corn? Which of the three genes would take the most
genes? Why does this not cause confusion?
work to score? Approximately how many progeny are
3. How can a phenotype be purple and brown, for exam- represented by one corncob?
ple, at the same time?
4. Is it significant that the genes are written in the order 22. You have a Drosophila line that is homozygous for autoso-
p-v-b in the problem? mal recessive alleles a, b, and c, linked in that order. You
5. What is a tester and why is it used in this analysis? cross females of this line with males homozygous for the
corresponding wild-type alleles. You then cross the F1 het-
6. What does the column marked “Progeny phenotypes” erozygous males with their heterozygous sisters. You ob-
represent? In class 1, for example, state exactly what tain the following F2 phenotypes (where letters denote
“gre sen pla” means. recessive phenotypes and pluses denote wild-type pheno-
7. What does the line marked “Gametes” represent, and types): 1364 + + +, 365 a b c, 87 a b +, 84 + + c,
how is it different from the column marked “F1 gam- 47 a + +, 44 + b c, 5 a + c, and 4 + b +.
etes”? In what way is comparison of these two types of a. What is the recombinant frequency between a and
gametes relevant to recombination? b? Between b and c? (Remember, there is no crossing
8. Which meiosis is the main focus of study? Label it on over in Drosophila males.)
your drawing. b. What is the coefficient of coincidence?
9. Why are the gametes from the tester not shown? 23. R. A. Emerson crossed two different pure-breeding lines
of corn and obtained a phenotypically wild-type F1 that
10. Why are there only eight phenotypic classes? Are there
was heterozygous for three alleles that determine reces-
any classes missing?
sive phenotypes: an determines anther; br, brachytic;
11. What classes (and in what proportions) would be ex- and f, fine. He testcrossed the F1 with a tester that was
pected if all the genes are on separate chromosomes? homozygous recessive for the three genes and obtained
12. To what do the four pairs of class sizes (very big, two these progeny phenotypes: 355 anther; 339 brachytic,
intermediates, very small) correspond? fine; 88 completely wild type; 55 anther, brachytic, fine;
21 fine; 17 anther, brachytic; 2 brachytic; 2 anther, fine.
13. What can you tell about gene order simply by inspect- a. What were the genotypes of the parental lines?
ing the phenotypic classes and their frequencies?
b. Draw a linkage map for the three genes (include
14. What will be the expected phenotypic class distribution map distances).
if only two genes are linked? c. Calculate the interference value.
15. What does the word “point” refer to in a three-point 24. Chromosome 3 of corn carries three loci (b for plant-col-
testcross? Does this word usage imply linkage? What or booster, v for virescent, and lg for liguleless). A test-
would a four-point testcross be like? cross of triple recessives with F1 plants heterozygous for
16. What is the definition of recombinant, and how is it ap- the three genes yields progeny having the following
plied here? genotypes: 305 + v lg, 275 b + +, 128 b + lg, 112 + v +,
74 + + lg, 66 b v +, 22 + + +, and 18 b v lg. Give the gene
17. What do the “Recombinant for” columns mean? sequence on the chromosome, the map distances be-
18. Why are there only three “Recombinant for” columns? tween genes, and the coefficient of coincidence.
19. What do the R’s mean, and how are they determined? 25. Groodies are useful (but fictional) haploid organisms
that are pure genetic tools. A wild-type groody has a fat
20. What do the column totals signify? How are they used? body, a long tail, and flagella. Mutant lines are known
21. What is the diagnostic test for linkage? that have thin bodies, are tailless, or do not have flagella.
Groodies can mate with one another (although they are
22. What is a map unit? Is it the same as a centimorgan?
so shy that we do not know how) and produce recombi-
23. In a three-point testcross such as this one, why aren’t nants. A wild-type groody mates with a thin-bodied
the F1 and the tester considered to be parental in calcu- groody lacking both tail and flagella. The 1000 baby
lating recombination? (They are parents in one sense.) groodies produced are classified as shown in the
16 4 CHAPTER 4 Mapping Eukaryote Chromosomes by Recombination
illustration here. Assign genotypes, and map the three 28. From several crosses of the general type A/A ⋅ B/B ×
genes. (Problem 25 is from Burton S. Guttman.) a/a ⋅ b/b, the F1 individuals of type A/a ⋅ B/b were test-
crossed with a/a ⋅ b/b. The results are as follows:
Testcross progeny
398 370
Testcross of A/a ⋅ a/a ⋅ A/a ⋅ a/a ⋅
F1 from cross B/b b/b b/b B/b
72 67 1 310 315 287 288
2 36 38 23 23
3 360 380 230 230
44 35
4 74 72 50 44
I I
P long, ebony short, gray 1 2 1 2
a. Does the pedigree show any evidence that the genes Explain these proportions with the aid of simplified
are linked? meiosis diagrams.
b. If there is linkage, does the pedigree show any 33. In the tiny model plant Arabidopsis, the recessive allele
evidence of crossing over? hyg confers seed resistance to the drug hygromycin, and
Explain your answers to parts a and b with the aid of her, a recessive allele of a different gene, confers seed
the diagram. resistance to herbicide. A plant that was homozygous
hyg/hyg ⋅ her/her was crossed with wild type, and the F1
c. Can you calculate a value for the recombination be- was selfed. Seeds resulting from the F1 self were placed
tween these genes? Is this recombination by independent on petri dishes containing hygromycin and herbicide.
assortment or by crossing over?
a. If the two genes are unlinked, what percentage of
31. In corn, a triple heterozygote was obtained carrying the seeds are expected to grow?
mutant alleles s (shrunken), w (white aleurone), and b. In fact, 13 percent of the seeds grew. Does this per-
y (waxy endosperm), all paired with their normal wild- centage support the hypothesis of no linkage? Explain.
type alleles. This triple heterozygote was testcrossed, and If not, calculate the number of map units between the
the progeny contained 116 shrunken, white; 4 fully wild loci.
type; 2538 shrunken; 601 shrunken, waxy; 626 white;
c. Under your hypothesis, if the F1 is testcrossed, what
2708 white, waxy; 2 shrunken, white, waxy; and 113 waxy.
proportion of seeds will grow on the medium containing
a. Determine if any of these three loci are linked and, hygromycin and herbicide?
if so, show map distances.
b. Show the allele arrangement on the chromosomes 34. In a diploid organism of genotype A/a ; B/b ; D/d, the
of the triple heterozygote used in the testcross. allele pairs are all on different chromosome pairs. The
two diagrams below purport to show anaphases (“pull-
c. Calculate interference, if appropriate.
ing apart” stages) in individual cells. State whether
32. a. A mouse cross A/a ⋅ B/b × a/a ⋅ b/b is made, and in each drawing represents mitosis, meiosis I, or meiosis II
the progeny there are or is impossible for this particular genotype.
25% A/a ⋅ B/b, 25% a/a ⋅ b/b, 35. The Neurospora cross al-2+ × al-2 is made. A linear tet-
25% A/a ⋅ b/b, 25% a/a ⋅ B/b rad analysis reveals that the second-division segrega-
Explain these proportions with the aid of simplified tion frequency is 8 percent.
meiosis diagrams. a. Draw two examples of second-division segregation
b. A mouse cross C/c ⋅ D/d × c/c ⋅ d/d is made, and in patterns in this cross.
the progeny there are b. What can be calculated by using the 8 percent value?
45% C/c ⋅ d/d, 45% c/c ⋅ D/d,
5% c/c ⋅ d/d, 5% C/c ⋅ D/d
a. f.
A B D A b
a b d d
A B D d
a B
a b d
b. g.
A b d a b
A b d d
A b d d
a b
A b d
c. h.
A B d A A B B D D
A b d A A B B D D
a B D a a b b d d
a a b b d d
a b D
d. i.
A b D A a B b D d
A b D
a B d
A a B b D d
a B d
e. j.
A B A a B b D d
D
D
A B a A b B d D
16 6 CHAPTER 4 Mapping Eukaryote Chromosomes by Recombination
36. From the fungal cross arg-6 ⋅ al-2 × arg-6+ ⋅ al-2+, what 12. How is a cross made in a fungus such as Neurospora?
will the spore genotypes be in unordered tetrads that Explain how to isolate asci and individual ascospores.
are (a) parental ditypes? (b) tetratypes? (c) nonparen- How does the term tetrad relate to the terms ascus and
tal ditypes? octad?
37. For a certain chromosomal region, the mean number of 13. Where does meiosis take place in the Neurospora life
crossovers at meiosis is calculated to be two per meiosis. cycle? (Show it on a diagram of the life cycle.)
In that region, what proportion of meioses are predict- 14. What does Problem 38 have to do with meiosis?
ed to have (a) no crossovers? (b) one crossover? (c) two
15. Can you write out the genotypes of the two parental
crossovers?
strains?
38. A Neurospora cross was made between a strain that car-
16. Why are only four genotypes shown in each class?
ried the mating-type allele A and the mutant allele arg-1
and another strain that carried the mating-type allele a 17. Why are there only seven classes? How many ways have
and the wild-type allele for arg-1 (+). Four hundred lin- you learned for classifying tetrads generally? Which of
ear octads were isolated, and they fell into the seven these classifications can be applied to both linear and
classes given in the table below. (For simplicity, they are unordered tetrads? Can you apply these classifications
shown as tetrads.) to the tetrads in this problem? (Classify each class in as
many ways as possible.) Can you think of more possibili-
a. Deduce the linkage arrangement of the mating-type
ties in this cross? If so, why are they not shown?
locus and the arg-1 locus. Include the centromere or
centromeres on any map that you draw. Label all intervals 18. Do you think there are several different spore orders
in map units. within each class? Why would these different spore or-
b. Diagram the meiotic divisions that led to class 6. Label ders not change the class?
clearly.
1 2 3 4 5 6 7
A arg A A arg A arg A arg A A
A arg A A a arg a a arg a arg
A a arg a arg A A arg A A arg
A a arg a a a a arg a
127 125 100 36 2 4 6
www
Unpacking Problem 38 19. Why is the following class not listed?
www a⋅+ A ⋅ arg
1. Are fungi generally haploid or diploid? A ⋅ arg
a⋅+
2. How many ascospores are in the ascus of Neurospora?
Does your answer match the number presented in this 20. What does the expression linkage arrangement mean?
problem? Explain any discrepancy. 21. What is a genetic interval?
3. What is mating type in fungi? How do you think it is de- 22. Why does the problem state “centromere or centro-
termined experimentally? meres” and not just “centromere”? What is the general
method for mapping centromeres in tetrad analysis?
4. Do the symbols A and a have anything to do with domi-
nance and recessiveness? 23. What is the total frequency of A ⋅ + ascospores? (Did you
calculate this frequency by using a formula or by inspec-
5. What does the symbol arg-1 mean? How would you test tion? Is this a recombinant genotype? If so, is it the only
for this genotype? recombinant genotype?)
6. How does the arg-1 symbol relate to the symbol +? 24. The first two classes are the most common and are ap-
7. What does the expression wild type mean? proximately equal in frequency. What does this infor-
8. What does the word mutant mean? mation tell you? What is their content of parental and
9. Does the biological function of the alleles shown have recombinant genotypes?
anything to do with the solution of this problem?
39. A geneticist studies 11 different pairs of Neurospora loci
10. What does the expression linear octad analysis mean? by making crosses of the type a ⋅ b × a+ ⋅ b+ and then ana-
11. In general, what more can be learned from linear tetrad lyzing 100 linear asci from each cross. For the conve-
analysis that cannot be learned from unordered tetrad nience of making a table, the geneticist organizes the data
analysis? as if all 11 pairs of genes had the same designation—a and
Problems 167
b—as shown below. For each cross, map the loci in relation
to each other and to centromeres.
Number of asci of type
a ⋅ b a ⋅ b+ a ⋅ b a ⋅ b a ⋅ b a ⋅ b+ a ⋅ b+
a ⋅ b a ⋅ b+ a ⋅ b+ a+ ⋅ b a+ ⋅ b+ a+ ⋅ b a+ ⋅ b
a+ ⋅ b+ a+ ⋅ b a+ ⋅ b+ a+ ⋅ b+ a+ ⋅ b+ a+ ⋅ b a+ ⋅ b+
Cross a+ ⋅ b+ a+ ⋅ b a+ ⋅ b a ⋅ b+ a ⋅ b a ⋅ b+ a ⋅ b
1 34 34 32 0 0 0 0
2 84 1 15 0 0 0 0
3 55 3 40 0 2 0 0
4 71 1 18 1 8 0 1
5 9 6 24 22 8 10 20
6 31 0 1 3 61 0 4
7 95 0 3 2 0 0 0
8 6 7 20 22 12 11 22
9 69 0 10 18 0 1 2
10 16 14 2 60 1 2 5
11 51 49 0 0 0 0 0
40. Three different crosses in Neurospora are analyzed on 42. A rice breeder obtained a triple heterozygote carrying
the basis of unordered tetrads. Each cross combines a the three recessive alleles for albino flowers (al), brown
different pair of linked genes. The results are shown in awns (b), and fuzzy leaves (fu), all paired with their
the following table: normal wild-type alleles. This triple heterozygote was
testcrossed. The progeny phenotypes were
Non-
Parental Tetra- parental 170 wild type 710 albino
ditypes types ditypes 150 albino, brown, fuzzy 698 brown, fuzzy
Cross Parents (%) (%) (%) (%) 5 brown 42 fuzzy
1 a ⋅ b+ × a+⋅ b 51 45 4 3 albino, fuzzy 38 albino, brown
2 c⋅ d+ × c+ ⋅ d 64 34 2 a. Are any of the genes linked? If so, draw a map labeled
3 e⋅ f+ × e+ ⋅ f 45 50 5 with map distances. (Don’t bother with a correction for
multiple crossovers.)
For each cross, calculate b. The triple heterozygote was originally made by cross-
a. the frequency of recombinants (RF). ing two pure lines. What were their genotypes?
b. the uncorrected map distance, based on RF. 43. In a fungus, a proline mutant (pro) was crossed with a
c. the corrected map distance, based on tetrad histidine mutant (his). A nonlinear tetrad analysis gave
frequencies. the following results:
d the corrected map distance, based on the mapping + + + + + his
function. + + + his + his
41. On Neurospora chromosome 4, the leu3 gene is just to the pro his pro + pro +
left of the centromere and always segregates at the first pro his pro his pro +
division, whereas the cys2 gene is to the right of the cen- 6 82 112
tromere and shows a second-division segregation fre-
quency of 16 percent. In a cross between a leu3 strain and a. Are the genes linked or not?
a cys2 strain, calculate the predicted frequencies of the b. Draw a map (if linked) or two maps (if not linked),
following seven classes of linear tetrads where l = leu3 and showing map distances based on straightforward
c = cys2. (Ignore double and other multiple crossovers.) recombinant frequency where appropriate.
(i) l c (ii) l + (iii) l c (iv) l c (v) l c (vi) l + (vii) l + c. If there is linkage, correct the map distances for
l c l + l + + c + + + c + c multiple crossovers (choose one approach only).
+ + + c + + + + + + + c + + 44. In the fungus Neurospora, a strain that is auxotrophic for
+ + + c + c l + l c l + l c thiamine (mutant allele t) was crossed with a strain that is
16 8 CHAPTER 4 Mapping Eukaryote Chromosomes by Recombination
auxotrophic for methionine (mutant allele m). Linear asci c. If the F1 is testcrossed, what proportion of the
were isolated and classified into the following groups: progeny will have all four recessive phenotypes?
47. In corn, the cross WW ee FF × ww EE ff is made. The
Spore pair Ascus types
three loci are linked as follows:
1 and 2 t+ t+ t+ t+ tm tm
W/w E/e F/f
3 and 4 t+ tm +m ++ tm ++
5 and 6 +m ++ t+ tm ++ t+ 8 m.u. 24 m.u.
7 and 8 +m +m +m +m ++ +m Assume no interference.
Number 260 76 4 54 1 5 a. If the F1 is testcrossed, what proportion of progeny
will be ww ee ff?
a. Determine the linkage relations of these two genes
b. If the F1 is selfed, what proportion of progeny will be
to their centromere(s) and to each other. Specify
ww ee ff?
distances in map units.
48. The fungal cross + ⋅ + × c ⋅ m was made, and nonlinear
b. Draw a diagram to show the origin of the ascus type
(unordered) tetrads were collected. The results were
with only one single representative (second from right).
++ ++ +m
45. A corn geneticist wants to obtain a corn plant that has
the three dominant phenotypes: anthocyanin (A), long ++ +m +m
tassels (L), and dwarf plant (D). In her collection of c m c + c+
pure lines, the only lines that bear these alleles are AA
c m c m c+
LL dd and aa ll DD. She also has the fully recessive line
aa ll dd. She decides to intercross the first two and test- Total 112 82 6
cross the resulting hybrid to obtain in the progeny a a. From these results, calculate a simple recombinant
plant of the desired phenotype (which would have to be frequency.
Aa Ll Dd in this case). She knows that the three genes
b. Compare the Haldane mapping function and the
are linked in the order written, that the distance be-
Perkins formula in their conversions of the RF value
tween the A/a and the L/l loci is 16 m.u., and that the
into a “corrected” map distance.
distance between the L/l and the D/d loci is 24 m.u.
c. In the derivation of the Perkins formula, only the
a. Draw a diagram of the chromosomes of the parents,
possibility of meioses with zero, one, and two crossovers
the hybrid, and the tester.
was considered. Could this limit explain any discrepancy
b. Draw a diagram of the crossover(s) necessary to in your calculated values? Explain briefly (no calculation
produce the desired genotype. needed).
c. What percentage of the testcross progeny will be of 49. In mice, the following alleles were used in a cross:
the phenotype that she needs?
W = waltzing gait w = nonwaltzing gait
d. What assumptions did you make (if any)?
G = normal gray color g = albino
46. In the model plant Arabidopsis thaliana, the following
B = bent tail b = straight tail
alleles were used in a cross:
A waltzing gray bent-tailed mouse is crossed with a non-
T = presence of trichomes t = absence of trichomes
waltzing albino straight-tailed mouse and, over several
D = tall plants d = dwarf plants years, the following progeny totals are obtained:
W = waxy cuticle w = nonwaxy waltzing gray bent 18
A = presence of purple a = absence (white) waltzing albino bent 21
anthocyanin pigment nonwaltzing gray straight 19
The T/t and D/d loci are linked 26 m.u. apart on chromo- nonwaltzing albino straight 22
some 1, whereas the W/w and A/a loci are linked 8 m.u. waltzing gray straight 4
apart on chromosome 2. waltzing albino straight 5
A pure-breeding double-homozygous recessive trichome- nonwaltzing gray bent 5
less nonwaxy plant is crossed with another pure-breeding nonwaltzing albino bent 6
double-homozygous recessive dwarf white plant. Total 100
a. What will be the appearance of the F1? a. What were the genotypes of the two parental mice
b. Sketch the chromosomes 1 and 2 of the parents and in the cross?
the F1, showing the arrangement of the alleles. b. Draw the chromosomes of the parents.
Problems 16 9
a. parental ditypes showing MI patterns for both loci? C/c D/d, is testcrossed with a/a b/b c/c d/d, and 1000
• • • •
b. nonparental ditypes showing MI patterns for both progeny are classified by the gametic contribution of
loci? the heterozygous parent as follows:
c. tetratypes showing an MI pattern for +/f and an MII a B C D
• • • 42
pattern for +/p?
d. tetratypes showing an MII pattern for +/f and an MI A b c d
• • • 43
pattern for +/p? A B C d
• • • 140
51. In a haploid fungus, the genes al-2 and arg-6 are 30 m.u. a b c D
• • • 145
apart on chromosome 1, and the genes lys-5 and met-1 a B c D
• • • 6
are 20 m.u. apart on chromosome 6. In a cross
A b C d
• • • 9
al-2 + ; + met-1 × + arg-6 ; lys-5 +
A B c d
• • • 305
what proportion of progeny would be prototrophic + + ;
a b C D
• • • 310
+ +?
52. The recessive alleles k (kidney-shaped eyes instead of a. Which genes are linked?
wild-type round), c (cardinal-colored eyes instead of wild- b. If two pure-breeding lines had been crossed to
type red), and e (ebony body instead of wild-type gray) produce the heterozygous individual, what would their
identify three genes on chromosome 3 of Drosophila. genotypes have been?
Females with kidney-shaped, cardinal-colored eyes were c. Draw a linkage map of the linked genes, showing the
mated with ebony males. The F1 was wild type. When F1 order and the distances in map units.
females were testcrossed with kk cc ee males, the follow-
ing progeny phenotypes were obtained: d. Calculate an interference value, if appropriate.
56. An autosomal allele N in humans causes abnormalities
k c e 3 in nails and patellae (kneecaps) called the nail–patella
k c + 876 syndrome. Consider marriages in which one partner
k + e 67 has the nail–patella syndrome and blood type A and
k + + 49 the other partner has normal nails and patellae and
+ c e 44 blood type O. These marriages produce some children
+ c + 58 who have both the nail–patella syndrome and blood
+ + e 899 type A. Assume that unrelated children from this phe-
+ + + 4 notypic group mature, intermarry, and have children.
Total 2000 Four phenotypes are observed in the following per-
centages in this second generation:
a. Determine the order of the genes and the map
distances between them. nail–patella syndrome, blood type A 66%
b. Draw the chromosomes of the parents and the F1. normal nails and patellae, blood type O 16%
c. Calculate interference and say what you think of its normal nails and patellae, blood type A 9%
significance. nail–patella syndrome, blood type O 9%
53. From parents of genotypes A/A ⋅ B/B and a/a ⋅ b/b, a
Fully analyze these data, explaining the relative frequen-
dihybrid was produced. In a testcross of the dihybrid,
cies of the four phenotypes. (See pages 219–220 for the
the following seven progeny were obtained:
genetic basis of these blood types.)
A/a ⋅ B/b, a/a ⋅ b/b, A/a ⋅ B/b, A/a ⋅ b/b,
57. Assume that three pairs of alleles are found in Drosophila:
a/a ⋅ b/b, A/a ⋅ B/b, and a/a ⋅ B/b
x + and x, y + and y, and z + and z. As shown by the symbols,
Do these results provide convincing evidence of linkage? each non-wild-type allele is recessive to its wild-type
170 CHAPTER 4 Mapping Eukaryote Chromosomes by Recombination
allele. A cross between females heterozygous at these and a left-sided heart (r). The three loci are autosomal
three loci and wild-type males yields progeny having the and linked as shown in this linkage map:
following genotypes: 1010 x + y + z + females, 430 x y + z
• • • •
P A R
males, 441 x + y z + males, 39 x y z males, 32 x + y + z
• • • • • •
a. On what chromosome of Drosophila are the genes Mr. Spock, first officer of the starship Enterprise, has a
carried? Vulcan father and an Earthling mother. If Mr. Spock mar-
ries an Earth woman and there is no (genetic) interfer-
b. Draw the relevant chromosomes in the heterozygous
ence, what proportion of their children will have
female parent, showing the arrangement of the alleles.
a. Vulcan phenotypes for all three characters?
c. Calculate the map distances between the genes and
the coefficient of coincidence. b. Earth phenotypes for all three characters?
58. The five sets of data given in the following table repre- c. Vulcan ears and heart but Earth adrenals?
sent the results of testcrosses using parents with the d. Vulcan ears but Earth heart and adrenals?
same alleles but in different combinations. Determine 61. In a certain diploid plant, the three loci A, B, and C are
the order of genes by inspection—that is, without calcu- linked as follows:
lating recombination values. Recessive phenotypes are
symbolized by lowercase letters and dominant pheno- A B C
types by pluses.
20 m.u. 30 m.u.
Phenotypes Data sets
observed in One plant is available to you (call it the parental plant). It
3-point testcross 1 2 3 4 5 has the constitution A b c/a B C.
+ + + 317 1 30 40 305 a. With the assumption of no interference, if the plant is
selfed, what proportion of the progeny will be of the
+ + c 58 4 6 232 0
genotype a b c/a b c?
+ b + 10 31 339 84 28
b. Again, with the assumption of no interference, if the
+ b c 2 77 137 201 107 parental plant is crossed with the a b c/a b c plant, what
a + + 0 77 142 194 124 genotypic classes will be found in the progeny? What will
a + c 21 31 291 77 30 be their frequencies if there are 1000 progeny?
a b + 72 4 3 235 1 c. Repeat part b, this time assuming 20 percent inter-
a b c 203 1 34 46 265 ference between the regions.
62. The following pedigree shows a family with two rare ab-
59. From the phenotype data given in the following table for normal phenotypes: blue sclerotic (a brittle-bone defect),
two 3-point testcrosses for (1) a, b, and c and (2) b, c, and represented by a black-bordered symbol, and hemophil-
d, determine the sequence of the four genes a, b, c, and d ia, represented by a black center in a symbol. Members
and the three map distances between them. Recessive represented by completely black symbols have both dis-
phenotypes are symbolized by lowercase letters and orders. The numbers in some symbols are the numbers of
dominant phenotypes by pluses. individuals with those types.
1 2
+ + + 669 b c d 8
a b + 139 b + + 441
a + + 3 b + d 90
3
+ + c 121 + c d 376
5
+ b c 2 + + + 14
a + c 2280 + + d 153
a b c 653 + c + 65
+ b + 2215 b c + 141
The geneticist crosses the F1 with a recessive tester and 8 m.u. 14 m.u.
classifies the progeny by the gametic contribution of
the F1: If the following cross is made
+ + +/+ + + × w s e/w s e
A⋅B⋅C⋅D⋅E 316
a⋅b⋅C⋅d⋅E 314 and the F1 is testcrossed with w s e/w s e, and if it is
assumed that there is no interference on this region of
A⋅B⋅C⋅d⋅E 31
the chromosome, what proportion of progeny will be of
a⋅b⋅C⋅D⋅E 39 the following genotypes?
A⋅b⋅C⋅d⋅E 130 a. + + + e. + + e
a⋅B⋅C⋅D⋅E 140 b. w s e f. w s +
A⋅b⋅C⋅D⋅E 17 c. + s e g. w + e
a⋅B⋅C⋅d⋅E 13 d. w + + h. + s +
1000 67. Every Friday night, genetics student Jean Allele, exhaust-
ed by her studies, goes to the student union’s bowling
The second cross of pure lines is A/A B/B C/C D/D
• • • •
lane to relax. But, even there, she is haunted by her ge-
E/E × a/a B/B c/c D/D e/e.
• • • •
netic studies. The rather modest bowling lane has only
The geneticist crosses the F1 from this cross with a four bowling balls: two red and two blue. They are bowled
recessive tester and obtains at the pins and are then collected and returned down the
172 CHAPTER 4 Mapping Eukaryote Chromosomes by Recombination
chute in random order, coming to rest at the end stop. As the two cultures to mate. The diploid cells then divide
the evening passes, Jean notices familiar patterns of the meiotically and form unordered tetrads. Some of the as-
four balls as they come to rest at the stop. Compulsively, cospores will grow on minimal medium. You classify a
she counts the different patterns. What patterns did she large number of these tetrads for the phenotypes ARG−
see, what were their frequencies, and what is the relevance (arginine requiring) and ARG+ (arginine independent)
of this matter to genetics? and record the following data:
68. In a tetrad analysis, the linkage arrangement of the p and Segregation Frequency
q loci is as follows: of ARG- : ARG+ (%)
(i) (ii) 4 : 0 40
3 : 1 20
p q
2 : 2 40
Assume that
• in region i, there is no crossover in 88 percent of meioses a. Using symbols of your own choosing, assign genotypes
and there is a single crossover in 12 percent of meioses; to the two parental cultures. For each of the three kinds
of segregation, assign genotypes to the segregants.
• in region ii, there is no crossover in 80 percent of meio-
b. If there is more than one locus governing arginine
ses and there is a single crossover in 20 percent of meio-
requirement, are these loci linked?
ses; and
70. An RFLP analysis of two pure lines A/A B/B and a/a b/b
• •
• there is no interference (in other words, the situation in
showed that the former was homozygous for a long RFLP
one region does not affect what is going on in the other
allele (l) and the latter for a short allele (s). The two were
region).
crossed to form an F1, which was then backcrossed to the
What proportions of tetrads will be of the following types? second pure line. A thousand progeny were scored as
(a) MIMI, PD; (b) MIMI, NPD; (c) MIMII, T; (d)MIIMI, T; (e) follows:
MIIMII, PD; (f) MIIMII, NPD; (g) MIIMII,T. (Note: Here the
Aa Bb ss 9 Aa bb ss 43
M pattern written first is the one that pertains to the p
locus.) Hint: The easiest way to do this problem is to start Aa Bb ls 362 Aa bb ls 93
by calculating the frequencies of asci with crossovers in aa bb ls 11
both regions, region i, region ii, and neither region. Then aa Bb ls 37
determine what MI and MII patterns result. aa bb ss 358 aa Bb ss 87
69. For an experiment with haploid yeast, you have two dif-
ferent cultures. Each will grow on minimal medium to a. What do these results tell us about linkage?
which arginine has been added, but neither will grow on
b. Draw a map if appropriate.
minimal medium alone. (Minimal medium is inorganic
salts plus sugar.) Using appropriate methods, you induce c. Incorporate the RFLP fragments into your map.
344
The Genetics
5
C h a p t e r
of Bacteria and
Their Viruses
Learning Outcomes
After completing this chapter, you will be able to
5.5 Transduction
173
174 CHA P TER 5 The Genetics of Bacteria and Their Viruses
D
NA technology is responsible for the rapid advances being made in the genet-
The fruits of DNA
ics of all model organisms. It is also a topic of considerable interest in the
technology, made possible by
public domain. Examples are the highly publicized announcement of the full
bacterial genetics
genome sequences of humans and chimpanzees in recent years and the popularity
of DNA-based forensic analysis in television shows and movies (Figure 5-1). Indeed,
improvements in technology have led to the sequencing of the genomes of many
hundreds of species. Such dramatic results, whether in humans, fish, insects, plants,
or fungi, are all based on the use of methods that permit small pieces of DNA to be
isolated, carried from cell to cell, and amplified into large pure samples. The sophis-
ticated systems that permit these manipulations of the DNA of any organism are
almost all derived from bacteria and their viruses. Hence, the advance of modern
genetics to its present state of understanding was entirely dependent on the devel-
opment of bacterial genetics, the topic of this chapter.
However, the goal of bacterial genetics has never been to facilitate eukaryotic
molecular genetics. Bacteria are biologically important in their own right. They
are the most numerous organisms on our planet. They contribute to the recycling
of nutrients such as nitrogen, sulfur, and carbon in ecosystems. Some are agents
of human, animal, and plant disease. Others live symbiotically inside our mouths
and intestines. In addition, many types of bacteria are useful for the industrial
synthesis of a wide range of organic products. Hence, the impetus for the genetic
F i g u r e 5 -1 The dramatic results of dissection of bacteria has been the same as that for multicellular organisms—to
modern DNA technology, such as understand their biological function.
sequencing the human genome, were Bacteria belong to a class of organisms known as prokaryotes, which also
possible only because bacterial genetics includes the blue-green algae (classified as cyanobacteria). A key defining feature
led to the invention of efficient DNA
of prokaryotes is that their DNA is not enclosed in a membrane-bounded nucleus.
manipulation vectors. [ Science 291, 2001,
pp. 1145–1434. Image by Ann E. Cutting.
Like higher organisms, bacteria have genes composed of DNA arranged in a long
Reprinted with permission from AAAS.] series on a “chromosome.” However, the organization of their genetic material is
unique in several respects. The genome of most bacteria is a single molecule of
double-stranded DNA in the form of a closed circle. In addition, bacteria in nature
often contain extra DNA elements called plasmids. Most plasmids also are DNA
circles but are much smaller than the main bacterial genome.
Bacteria can be parasitized by specific viruses called bacteriophages or, sim-
ply, phages. Phages and other viruses are very different from the organisms that
we have been studying so far. Viruses have some properties in common with
organisms; for example, their genetic material can be DNA or RNA, constituting a
short “chromosome.” However, most biologists regard viruses as nonliving because
they are not cells and they have no metabolism of their own. Hence, for the study
of their genetics, viruses must be propagated in the cells of their host organisms.
When scientists began studying bacteria and phages, they were naturally curi-
ous about their hereditary systems. Clearly, bacteria and phages must have hered-
itary systems because they show a constant appearance and function from one
generation to the next (they are true to type). But how do these hereditary sys-
tems work? Bacteria, like unicellular eukaryotic organisms, reproduce asexually
by cell growth and division, one cell becoming two. This asexual reproduction is
quite easy to demonstrate experimentally. However, is there ever a union of dif-
ferent types for the purpose of sexual reproduction? Furthermore, how do the
much smaller phages reproduce? Do they ever unite for a sex-like cycle? These
questions are pursued in this chapter.
We will see that there is a variety of hereditary processes in bacteria and
phages. These processes are interesting because of the basic biology of these
forms, but they also act as models—as sources of insight into genetic processes at
work in all organisms. For a geneticist, the attraction of these forms is that they
can be cultured in very large numbers because they are so small. Consequently, it
is possible to detect and study very rare genetic events that are difficult or impos-
sible to study in eukaryotes.
The Genetics of Bacteria and Their Viruses 175
Partial genome
Transformation transfer by
DNA uptake
Genome
Genome
F i g u r e 5 -2 Bacterial DNA
can be transferred from cell to
Transduction cell in four ways: conjugation with
plasmid transfer, conjugation with
Transfer as part of viral genome partial genome transfer,
transformation, and transduction.
176 CHA P TER 5 The Genetics of Bacteria and Their Viruses
the need for cell division. This term distinguishes this type
Bacterial colonies, each derived from a single cell
of DNA transfer from that during vertical transmission,
the passage of DNA down thorough the bacterial genera-
tions. Horizontal transmission can spread DNA rapidly
through a bacterial population by contact in much the
same way that a disease spreads. For bacteria, horizontal
transmission provides a powerful method by which they
can adapt rapidly to changing environmental conditions.
Phages themselves can undergo recombination when
two different genotypes both infect the same bacterial cell
(phage recombination, not shown in Figure 5-2).
Suspension of Suspension spread on Before we analyze these modes of genetic exchange,
bacterial cells petri plate with agar gel let’s consider the practical ways of handling bacteria,
which are much different from those used in handling
Incubate from
multicellular organisms.
1 to 2 days
ability to use a specific energy source; for example, the wild type (lac+) can use
Distinguishing lac+ and lac–
lactose and grow, whereas a mutant (lac−) cannot. Figure 5-4 shows another way by using a red dye
of distinguishing lac+ and lac− colonies by using a dye. In another mutant cate-
gory, whereas wild types are susceptible to an inhibitor, such as the antibiotic
streptomycin, resistant mutants can divide and form colonies in the presence of
the inhibitor. All these types of mutants allow the geneticist to distinguish differ-
ent individual strains, thereby providing genetic markers (marker alleles) to
keep track of genomes and cells in experiments. Table 5-1 summarizes some
mutant bacterial phenotypes and their genetic symbols.
The following sections document the discovery of the various processes by
which bacterial genomes recombine. The historical methods are interesting in
themselves but also serve to introduce the diverse processes of recombination, as
well as analytical techniques that are still applicable today.
then plated, however, a few colonies met – bio – thr + leu + thi + Mixture met + bio + thr – leu – thi –
appear on the agar plate. These colonies
derive from single cells in which genetic
material has been exchanged; they are
therefore capable of synthesizing all the
required constituents of metabolism.
MM MM MM
No met + bio + thr + leu + thi + No
colonies Prototrophic colonies
colonies
(b)
Hfr strains
An important breakthrough came when Luca Cavalli-Sforza
Bacteria conjugate by using pili
discovered a derivative of an F+ strain with two unusual
properties:
1. On crossing with F− strains, this new strain produced 1000
times as many recombinants as a normal F+ strain. Cavalli-
Sforza designated this derivative an Hfr strain to symbolize its
ability to promote a high frequency of recombination.
2. In Hfr × F− crosses, virtually none of the F− parents were
converted into F+ or into Hfr. This result is in contrast with
F+ × F− crosses, in which, as we have seen, infectious transfer
of F results in a large proportion of the F− parents being con-
verted into F+.
F Plasmid
Recipient F –
F+ culture. Cavalli-Sforza isolated examples of these rare cells from F+ cultures and
found that, indeed, they now acted as true Hfr’s.
Does an Hfr cell die after donating its chromosomal material to an F− cell? The
answer is no. Just like the F plasmid, the Hfr chromosome replicates and transfers a
single strand to the F− cell during conjugation. That the transferred DNA is a single
Model
Organism Escherichia coli
The seventeenth-century microscopist Antony van form the basis of the gene transfers at the center of modern
Leeuwenhoek was probably the first to see bacterial cells genetic engineering.
and to recognize their small size: “There are more living in E. coli is unicellular and grows by simple cell division.
the scum on the teeth in a man’s mouth than there are men Because of its small size (~1 µm in length), E. coli can be
in the whole kingdom.” However, bacteriology did not grown in large numbers and subjected to intensive selection
begin in earnest until the nineteenth century. In the 1940s, and screening for rare genetic events. E. coli research repre-
Joshua Lederberg and Edward Tatum made the discovery sents the beginning of “black box” reasoning in genetics:
that launched bacteriology into the burgeoning field of through the selection and analysis of mutants, the workings
genetics: they discovered that, in a certain bacterium, there of the genetic machinery could be deduced even though it
was a type of sexual cycle including a crossing-over-like was too small to be seen. Phenotypes such as colony size,
process. The organism that they chose for this experiment drug resistance, carbon-source utilization, and colored-dye
has become the model not only for prokaryote genetics but, production took the place of the visible phenotypes of
in a sense, for all of genetics. The organism was Escherichia eukaryotic genetics.
coli, a bacterium named after its discoverer, the nineteenth-
century German bacteriologist Theodore Escherich.
The choice of E. coli was fortunate because it has proved
to have many features suitable for genetic research, not the
least of which is that it is easily obtained, given that it lives
in the gut of humans and other animals. In the gut, it is a
benign symbiont, but it occasionally causes urinary tract
infections and diarrhea.
E. coli has a single circular chromosome 4.6 Mb in
length. Of its 4000 intron-free genes, about 35 percent are
of unknown function. The sexual cycle is made possible by
the action of an extragenomic plasmid called F, which con-
fers a type of “maleness.” Other plasmids carry genes An electron micrograph of an E. coli cell showing long flagella, used
whose functions equip the cell for life in specific environ- for locomotion, and fimbriae, proteinaceous hairs that are important
ments, such as drug-resistance genes. These plasmids have in anchoring the cells to animal tissues. (Sex pili are not shown in
been adapted as gene vectors, which are gene carriers that this micrograph.) [ Biophoto Associates/Science Photo Library.]
5.2 Bacterial Conjugation 181
strand can be demonstrated visually with the use of special strains and
Integration of the F plasmid
antibodies, as shown in Figure 5-10. The replication of the chromo- creates an Hfr strain
some ensures a complete chromosome for the donor cell after mat-
ing. The transferred strand is converted into a double helix in the
recipient cell, and donor genes may become incorporated in the
recipient’s chromosome through crossovers, creating a recombinant
cell (Figure 5-11). If there is no recombination, the transferred frag-
ments of DNA are simply lost in the course of cell division. F+ F
Hfr azir tonr lac+ gal+ strs × F- azis tons lac- gal- strr
Integrated F
(Superscripts “r” and “s” stand for resistant and sensitive, respec-
tively.) At specific times after mixing, they removed samples, which
were each put in a kitchen blender for a few seconds to separate the mating cell F i g u r e 5 - 9 In an F+ strain the free
pairs. This procedure is called interrupted mating. The sample was then plated F plasmid occasionally integrates into
onto a medium containing streptomycin to kill the Hfr donor cells, which bore the the E. coli chromosome, creating an
Hfr strain.
sensitivity allele strs. The surviving strr cells then were tested for the presence of
alleles from the donor Hfr genome. Any strr cell bearing a donor allele must have
taken part in conjugation; such cells are called exconjugants. The results are
plotted in Figure 5-12a, showing a time course of entry of each donor allele azir,
tonr, lac+, and gal+. Figure 5-12b portrays the transfer of Hfr alleles.
The key elements in these results are
1. Each donor allele first appears in the F− recipients at a specific time after mat-
ing began.
2. The donor alleles appear in a specific sequence.
3. Later donor alleles are present in fewer recipient cells.
Exconjugant
Exogenote c+ b+ a+ Transferred
fragment
c– b– a– converted
Endogenote into double helix
Recombinant Lost
c+ b – a–
Double crossover
c– b+ a+ inserts
donor DNA
Putting all these observations together, Wollman and Jacob deduced that, in
the conjugating Hfr, single-stranded DNA transfer begins from a fixed point on
the donor chromosome, termed the origin (O), and continues in a linear fashion.
The point O is now known to be the site at which the F plasmid is inserted. The
farther a gene is from O, the later it is transferred to the F−. The transfer process
will generally stop before the farthermost genes are transferred, and, as a result,
these genes are included in fewer exconjugants. Note that a type of chromosome
map can be produced in units of minutes, based on time of entry of marked genes.
In the example in Figure 5-12, the map would be:
azi r tonr lac + gal +
0 10 12 17 25
10 2 5 8
How can we explain the second unusual property of Hfr crosses, that F−
exconjugants are rarely converted into Hfr or F+? When Wollman and Jacob
allowed Hfr × F− crosses to continue for as long as 2 hours before disruption, they
found that in fact a few of the exconjugants were converted into Hfr. In other
words, the part of F that confers donor ability was eventually transmitted but at a
very low frequency. The rareness of Hfr exconjugants suggested that the inserted
F was transmitted as the last element of the linear chromosome. We can summa-
rize the order of transmission with the following general type of map, in which
the arrow indicates the direction of transfer, beginning with O:
O a b c F
Thus, almost none of the F− recipients are converted, because the fertility fac-
tor is the last element transmitted and usually the transmission process will have
stopped before getting that far.
rr exconjugants
mating experiments with different, separately derived
exconjugants
r
azi
azi r
exconjugants
r exconjugants
genetic
Hfr strains. Significantly, the order of transmission of
genetic
80
80 azi
azi r r
genetic
genetic
the alleles differed from strain to strain, as in the fol- 80
80 ton
ton r
r
Hfr
of Hfr
60
60
Hfr
of rHfr
str
strstr
of
Hfr strain 60
60
(%)
among str
(%)(%)
of
among
(%)
among
H O thr pro lac pur gal his gly thi F
among
40
Frequency
40
Frequency
lac
lac +
+
40
Frequency
40
Frequency
1 O thr thi gly his gal pur lac pro F
characters
lac++
lac
characters
characters
characters
2 O pro thr thi gly his gal pur lac F 20 gal +
gal +
20
20 gal++
gal
3 O pur lac pro thr thi gly his gal F 20
AB 312 O thi thr pro lac pur gal his gly F 0
0
0 10 20 30 40 50 60
00 0 10 20 30 40 50 60
00 10
10 20 Time
20 30
30 40
40
Time (minutes)
(minutes) 50
50 60
60
Each line can be considered a map showing the order Time(minutes)
Time (minutes)
of alleles on the chromosome. At first glance, there
(b)
(b)
seems to be a random shuffling of genes. However, when F factor
F factor
(b)
(b)
some of the Hfr maps are inverted, the relation of the FFfactor
factor
sequences becomes clear.
10
10 min
min
H F thi gly his gal pur lac pro thr O
10min
10 min
(written backward)
1 O thr thi gly his gal pur lac pro F
s r
Hfr str
Hfr Origin Origin F − str
F −
2 O pro thr thi gly his gal pur lac F str s Origin Origin str r
Hfr strss
Hfrstr Origin
Origin Origin
Origin strr r
FF−−str
3 O pur lac pro thr thi gly his gal F
AB 312 F gly his gal pur lac pro thr thi O 17
17 min
min
(written backward) 17min
17 min
A single crossover inserts F at a specific locus, which then determines the order of gene transfer
O
Homologous
regions where
F pairing can
take place 2
1 2 1
a
a b
1 b c d a 2
Hfr
Transferred last Direction of transfer Transferred first
d c
E. coli chromosome d c
F i g u r e 5 -13 The insertion of F creates an Hfr cell. Hypothetical markers 1 and 2 are
shown on F to depict the direction of insertion. The origin (O) is the mobilization point where
insertion into the E. coli chromosome occurs; the pairing region is homologous with a region
on the E. coli chromosome; a through d are representative genes in the E. coli chromosome.
Pairing regions (hatched) are identical in plasmid and chromosome. They are derived from
mobile elements called insertion sequences (see Chapter 15). In this example, the Hfr cell
created by the insertion of F would transfer its genes in the order a, d, c, b.
A and D would give the order ABCD or DCBA, depending on orientation. Check
the different orientations of the insertions in Figure 5-14.
How is it possible for F to integrate at different sites? If F DNA had a region
homologous to any of several regions on the bacterial chromosome, any one of
them could act as a pairing region at which pairing could be followed by a cross-
over. These regions of homology are now known to be mainly segments of trans-
posable elements called insertion sequences. For a full explanation of insertion
sequences, see Chapter 15.
The fertility factor thus exists in two states:
1. The plasmid state: As a free cytoplasmic element, F is easily transferred to
F — recipients.
2. The integrated state: As a contiguous part of a circular chromosome, F is
transmitted only very late in conjugation.
The E. coli conjugation cycle is summarized in Figure 5-15.
thi gly his gal pur lac pro thr lac pur gal his gly thi thr pro pro lac pur gal his gly thi thr
H 2 1
thr thr
thi pro thi pro
Chromosome Plasmid
transfer transfer
F a+
F+ a +
Conjugation and
Insertion of F factor transfer of F factor
F a+ F+ a + F a+
Hfr a +
F– a – a–
Conjugation and
chromosome transfer
F a+ Hfr a + F a+
a+ F+ a +
a– F– a –
F – a +/ a – F
F i g u r e 5 -15 Conjugation can take
a–
Recombination No recombination place by partial transfer of a chromosome
F+ a –
containing the F factor or by transfer of an
F– a + F– a –
F plasmid that remains a separate entity.
18 6 CHA P TER 5 The Genetics of Bacteria and Their Viruses
F chromosome
To obtain this merozygote, we must first select stable exconjugants bearing the last
donor allele, which, in this case, is leu+. Why? In leu+ exconjugants, we know all
three markers were transferred into the recipient because leu is the last donor allele.
We also know that at least the leu+ marker was integrated into the endogenote. We
want to know how often the other two markers were also integrated so that we can
determine the number of recombination events in which arg+ or met+ was omitted
due to double crossover.
The goal now is to count the frequencies of crossovers at different locations.
Note that we now have a different situation from the analysis of interrupted con-
jugation. In mapping by interrupted conjugation, we measure the time of entry of
5.2 Bacterial Conjugation 187
individual loci; to be stably inherited, each marker has to recombine into the
recipient chromosome by a double crossover spanning it. However, in the recom-
binant frequency analysis, we have specifically selected trihybrids as a starting
point, and now we have to consider the various possible combinations of the three
donor alleles that can be inserted by double crossing over in the various intervals.
F i g u r e 5 -17 The diagram shows how
We know that leu+ must have entered and inserted because we selected it, but the
genes can be mapped by recombination
leu+ recombinants that we select may or may not have incorporated the other in E. coli. In exconjugants, selection is
donor markers, depending on where the double crossover took place. Hence, the made for merozygotes bearing the leu+
procedure is to first select leu+ exconjugants and then isolate and test a large sam- marker, which is donated late. The early
ple of them to see which of the other markers were integrated. markers (arg+ and met +) may or may not
Let’s look at an example. In the cross Hfr met+ arg+ leu+ strs × F− met− arg− be inserted, depending on the site where
leu− strr, we would select leu+ recombinants and then examine them for the arg+ recombination between the Hfr fragment
and the F− chromosome takes place. The
and met+ alleles, called the unselected markers. Figure 5-17 depicts the types of
frequencies of events diagrammed in parts
double-crossover events expected. One crossover must be on the left side of the a and b are used to obtain the relative
leu marker and the other must be on the right side. Let’s assume that the leu+ sizes of the leu–arg and arg–met regions.
exconjugants are of the following types and frequencies: Note that, in each case, only the DNA
inserted into the F− chromosome survives;
leu+ arg- met- 4%
the other fragment is lost.
leu+ arg+ met- 9%
ANIMATED ART: Bacterial
leu+ arg+ met+ 87% conjugation and mapping by recombination
Hfr fragment
arg met
leu leu arg met
F chromosome
(b) Insertion of late marker and one early marker
arg met
leu
(d) Insertion of late and early markers, but not of marker in between
arg met
leu
The double crossovers needed to produce these genotypes are shown in Figure 5-17.
The first two types are the key because they require a crossover between leu and
arg in the first case and between arg and met in the second. Hence, the relative
frequencies of these types correspond to the sizes of these two regions between
the genes. We would conclude that the leu−arg region is 4 m.u. and that the arg−
met is 9 m.u.
In a cross such as the one just described, one type of potential recombinants
of genotype leu+ arg− met+ requires four crossovers instead of two (see the bottom
of Figure 5-17). These recombinants are rarely recovered because their frequency
is very low compared with that of the other types of recombinants.
R plasmids
An alarming property of pathogenic bacteria first came to light through studies in
Japanese hospitals in the 1950s. Bacterial dysentery is caused by bacteria of the
genus Shigella. This bacterium was initially sensitive to a wide array of antibiotics
that were used to control the disease. In the Japanese hospitals, however, Shigella
5.2 Bacterial Conjugation 18 9
lac
Hfr chromosome
lac
(c) Excision
F' lac
lac
(d)
lac
These elements have been found to carry many different kinds of genes in bacte-
ria. Table 5-2 shows some of the characteristics that can be borne by plasmids.
Figure 5-19 shows an example of a well-traveled plasmid isolated from the dairy
industry.
F i g u r e 5 -19 The diagram shows the Engineered derivatives of R plasmids, such as pBR 322 and pUC (see Chap-
origins of genes of the Lactococcus
ter 10), have become the preferred vectors for the molecular cloning of the DNA
lactis plasmid pK214. The genes are from
many different bacteria. [ Data from Table 1 of all organisms. The genes on an R plasmid that confer resistance can be used as
in V. Perreten, F. Schwarz, L. Cresta, M. markers to keep track of the movement of the vectors between cells.
Boeglin, G. Dasen, and M. Teuber, Nature 389, On R plasmids, the alleles for antibiotic resistance are often contained within
1997, 801–802.] a unit called a transposon (Figure 5-20). Transposons are unique segments of DNA
Enterococcus Lactococcus
faecium lactis Mycoplasma
Listeria
monocytogenes
29 1 2
3
4
5
28
6
Lactococcus
Enterococcus 27
lactis
faecalis
7
26
25
8
Enterococcus
faecium 24
Listeria
monocytogenes
23 Plasmid pk214
Staphylococcus 22
9 Streptococcus
aureus
agalactiae
21
Lactobacillus
plantarum
10
Lactococcus
lactis
20 11
12
19 Streptococcus
13
14 pyogenes
Staphylococcus 18
aureus 17 15
16 Escherichia
coli
Enterococcus
faecium Lactococcus
Escherichia Escherichia lactis
coli coli
Staphylococcus
aureus
5.3 Bacterial Transformation 191
that can move around to different sites in the genome, a process called transposi-
An R plasmid with
tion. (The mechanisms for transposition, which occurs in most species studied, resistance genes carried
will be detailed in Chapter 15.) When a transposon in the genome moves to a new in a transposon
location, it can occasionally embrace between its ends various types of genes,
including alleles for drug resistance, and carry them along to their new locations
as passengers. Sometimes, a transposon carries a drug-resistance allele to a plas-
mid, creating an R plasmid. Like F plasmids, many R plasmids are conjugative; in
other words, they are effectively transmitted to a recipient cell during conjuga- Conjugative
plasmid
tion. Even R plasmids that are not conjugative and never leave their own cells can
donate their R alleles to a conjugative plasmid by transposition. Hence, through
plasmids, antibiotic-resistance alleles can spread rapidly throughout a population
IS50 IS50
of bacteria. Although the spread of R plasmids is an effective strategy for the sur- kan R neoR
vival of bacteria, it presents a major problem for medical practice, as mentioned
Transposon Tn5
earlier, because bacterial populations rapidly become resistant to any new anti-
biotic drug that is invented and applied to humans. F i g u r e 5 -2 0 A transposon such as
Tn5 can acquire several drug-resistance
genes (in this case, those for resistance
5.3 Bacterial Transformation to the drugs kanamycin and neomycin)
and transmit them rapidly on a plasmid,
Some bacteria can take up fragments of DNA from the external medium, and leading to the infectious transfer of
such uptake constitutes another way in which bacteria can exchange their genes. resistance genes as a package.
Insertion sequence 50 (IS50) forms the
The source of the DNA can be other cells of the same species or cells of other spe-
flanks of TN5.
cies. In some cases, the DNA has been released from dead cells; in other cases, the
DNA has been secreted from live bacterial cells. The DNA taken up integrates
into the recipient’s chromosome. If this DNA is of a different genotype from that
of the recipient, the genotype of the recipient can become permanently changed,
a process aptly termed transformation.
F i g u r e 5 -2 1 A bacterium undergoing
K e y C o n c e p t Bacteria can take up DNA
fragments from the surrounding medium. Inside the
transformation (a) picks up free DNA
cell, these fragments can integrate into the
released from a dead bacterial cell. As
chromosome.
DNA-binding complexes on the bacterial
surface take up the DNA (inset), enzymes
break down one strand into nucleotides; a
derivative of the other strand may integrate
into the bacterium’s chromosome (b).
5.4 Bacteriophage Genetics
The word bacteriophage, which is a name for bacterial viruses, means “eater of bac-
teria.” These viruses parasitize and kill bacteria. Pioneering work on the genetics of
bacteriophages in the middle of the twentieth century formed the foundation of
more recent research on tumor-causing viruses and other kinds of animal and plant
viruses. In this way, bacterial viruses have provided an important model system.
These viruses can be used in two different types of genetic analysis. First, two
distinct phage genotypes can be crossed to measure recombination and hence map
the viral genome. Mapping of the viral genome by this method is the topic of this
section. Second, bacteriophages can be used as a way of bringing bacterial genes
together for linkage and other genetic studies. We will study the use of phages in
bacterial studies in Section 5.5. In addition, as we will see in Chapter 10, phages are
used in DNA technology as carriers, or vectors, of foreign DNA. Before we can
understand phage genetics, we must first examine the infection cycle of phages.
Cycle of phage that lyses the host cells A plaque is a clear area in which all
bacteria have been lysed by phages
Uninfected
cell
Lysis of Adsorption
host cell of phage
Free to host cell
phages
F i g u r e 5 -2 6 Through repeated infection and production of
progeny phage, a single phage produces a clear area, or plaque,
on the opaque lawn of bacterial cells. [ D. Sue Katz, Rogers State
University, Claremore, OK.]
fied as parental (h− r+ and h+ r−) and recombinant (h+ r+ and h− r−), and a recom-
A phage cross made by doubly
binant frequency can be calculated as follows: infecting the host cell with
(h+ r+) + (h- r-) parental phages
RF =
total plaques
h– r + h+ r –
If we assume that the recombining phage chromosomes are linear, then single
crossovers produce viable reciprocal products. However, phage crosses are subject
to some analytical complications. First, several rounds of exchange can take place
within the host: a recombinant produced shortly after infection may undergo fur-
ther recombination in the same cell or in later infection cycles. Second, recombi-
nation can take place between genetically similar phages as well as between
different types. Thus, if we let P1 and P2 refer to general parental genotypes, crosses
of P1 × P1 and P2 × P2 take place in addition to P1 × P2. For both these reasons,
E. coli strain 1
recombinants from phage crosses are a consequence of a population of events rather
than defined, single-step exchange events. Nevertheless, all other things being equal,
F i g u r e 5 -2 7
the RF calculation does represent a valid index of map distance in phages.
Because astronomically large numbers of phages can be used in phage-recombi-
nation analyses, very rare crossover events can be detected. In the 1950s, Seymour
Benzer made use of such rare crossover events to map the mutant sites within the
rII gene of phage T4, a gene that controls lysis. For different rII mutant alleles aris-
ing spontaneously, the mutant site is usually at different positions within the gene.
Therefore, when two different rII mutants are crossed, a few rare crossovers may
take place between the mutant sites, producing wild-type
recombinants, as shown here:
Plaques from recombinant and
rII gene parental phage progeny
Parent 1
Parent 2
Wild type
Double
mutant
mapping has been largely superseded by the advent of inexpensive chemical meth-
ods for DNA sequencing, which identify the positions of mutant sites directly.
5.5 Transduction
Some phages are able to pick up bacterial genes and carry them from one bacte-
rial cell to another, a process known as transduction. Thus, transduction joins the
battery of modes of transfer of genomic material between bacteria—along with
Hfr chromosome transfer, F ′ plasmid transfer, and transformation.
Discovery of transduction
In 1951, Joshua Lederberg and Norton Zinder were testing for recombination in
the bacterium Salmonella typhimurium by using the techniques that had been suc-
cessful with E. coli. The researchers used two different strains: one was phe− trp−
tyr−, and the other was met− his−. We won’t worry about the nature of these alleles
except to note that all are auxotrophic. When either strain was plated on a mini-
mal medium, no wild-type cells were observed. However, after the two strains
were mixed, wild-type prototrophs appeared at a frequency of about 1 in 105.
Thus far, the situation seems similar to that for recombination in E. coli.
However, in this case, the researchers also recovered recombinants from a
U-tube experiment, in which conjugation was prevented by a filter separating the
two arms (recall Figure 5-6). They hypothesized that some agent was carrying
genes from one bacterium to another. By varying the size of the pores in the filter,
they found that the agent responsible for gene transfer was the same size as a
known phage of Salmonella, called phage P22. Furthermore, the filterable agent
and P22 were identical in sensitivity to antiserum and in immunity to hydrolytic
enzymes. Thus, Lederberg and Zinder had discovered a new type of gene transfer,
mediated by a virus. They were the first to call this process transduction. As a rar-
ity in the lytic cycle, virus particles sometimes pick up bacterial genes and trans-
fer them when they infect another host. Transduction has subsequently been
demonstrated in many bacteria.
To understand the process of transduction, we need to distinguish two types
of phage cycle. Virulent phages are those that immediately lyse and kill the host.
Temperate phages can remain within the host cell for a period without killing it.
Their DNA either integrates into the host chromosome to replicate with it or rep-
licates separately in the cytoplasm, as does a plasmid. A phage integrated into the
bacterial genome is called a prophage. A bacterium harboring a quiescent phage
is described as lysogenic and is itself called a lysogen. Occasionally, the quiescent
phage in a lysogenic bacterium becomes active, replicates itself, and causes the
spontaneous lysis of its host cell. A resident temperate phage confers resistance to
infection by other phages of that type.
There are two kinds of transduction: generalized and specialized. Generalized
transducing phages can carry any part of the bacterial chromosome, whereas spe-
cialized transducing phages carry only certain specific parts.
Generalized transduction
By what mechanisms can a phage carry out generalized transduction? In 1965,
H. Ikeda and J. Tomizawa threw light on this question in some experiments on the
E. coli phage P1. They found that, when a donor cell is lysed by P1, the bacterial
chromosome is broken up into small pieces. Occasionally, the newly forming
phage particles mistakenly incorporate a piece of the bacterial DNA into a phage
head in place of phage DNA. This event is the origin of the transducing phage.
A phage carrying bacterial DNA can infect another cell. That bacterial DNA
can then be incorporated into the recipient cell’s chromosome by recombination
(Figure 5-29). Because genes on any of the cut-up parts of the host genome can be
transduced, this type of transduction is by necessity of the generalized type.
Phages P1 and P22 both belong to a phage group that shows generalized trans-
duction. P22 DNA inserts into the host chromosome, whereas P1 DNA remains
free, like a large plasmid. However, both transduce by faulty head stuffing.
Generalized transduction can be used to obtain bacterial linkage information
when genes are close enough that the phage can pick them up and transduce
them in a single piece of DNA. For example, suppose that we wanted to find the
linkage distance between met and arg in E. coli. We could grow phage P1 on a
donor met+ arg+ strain and then allow P1 phages from lysis of this strain to infect
a met− arg− strain. First, one donor allele is selected, say, met+. Then the percent-
age of met+ colonies that are also arg+ is measured. Strains transduced to both
a+
a+ b+
b+
a+
Donor bacterium b+
b+
Phages carrying
donor genes
a+
a+
a+ a– a+
a–
F i g u r e 5 -2 9 A newly forming phage may pick up DNA from its host cell’s chromosome
(top) and then inject it into a new cell ( bottom right). The injected DNA may insert into the
new host’s chromosome by recombination ( bottom left). In reality, only a very small minority
of phage progeny (1 in 10,000) carry donor genes.
19 8 CHA P TER 5 The Genetics of Bacteria and Their Viruses
2.0
met+ and arg+ are called cotransductants. The greater the cotransduction fre-
quency, the closer two genetic markers must be (the opposite of most mapping
measurements). Linkage values are usually expressed as cotransduction frequen-
cies (Figure 5-30).
By using an extension of this approach, we can estimate the size of the piece of
host chromosome that a phage can pick up, as in the following type of experiment,
which uses P1 phage:
In this experiment, P1 phage grown on the leu+ thr+ azir donor strain infect the
leu− thr− azis recipient strain. The strategy is to select one or more donor alleles in
the recipient and then test these transductants for the presence of the unselected
alleles. Results are outlined in Table 5-3. Experiment 1 in Table 5-3 tells us that leu
is relatively close to azi and distant from thr, leaving us with two possibilities:
thr leu azi thr azi leu
or
Experiment 2 tells us that leu is closer to thr than azi is, and so the map must be
thr leu azi
By selecting for thr+ and leu+ together in the transducing phages in experi-
ment 3, we see that the transduced piece of genetic material never includes the
azi locus because the phage head cannot carry a fragment of DNA that big. P1 can
only cotransduce genes less than approximately 1.5 minutes apart on the E. coli
chromosome map.
Specialized transduction
A generalized transducer, such as phage P22, picks up fragments of broken host
DNA at random. How are other phages, which act as specialized transducers, able
to carry only certain host genes to recipient cells? The short answer is that a spe-
cialized transducer inserts into the bacterial chromosome at one position only.
When it exits, a faulty outlooping occurs (similar to the type that produces F ′
plasmids). Hence, it can pick up and transduce only genes that are close by.
The prototype of specialized transduction was provided by studies under-
taken by Joshua and Esther Lederberg on a temperate E. coli phage called lambda
(λ). Phage λ has become the most intensively studied and best-characterized phage.
Behavior of the prophage Phage λ has unusual effects when cells lysogenic for
it are used in crosses. In the cross of an uninfected Hfr with a lysogenic F− recipi-
ent [Hfr × F-(l)], lysogenic F− exconjugants with Hfr genes are readily recov-
ered. However, in the reciprocal cross Hfr(λ) × F−, the early genes from the Hfr
chromosome are recovered among the exconjugants, but recombinants for late
genes are not recovered. Furthermore, lysogenic exconjugants are almost never
recovered from this reciprocal cross. What is the explanation? The observations
make sense if the λ prophage is behaving as a bacterial gene locus behaves (that
is, as part of the bacterial chromosome). Thus, in the Hfr(λ) × F− cross, the pro-
phage would enter the F− cell at a specific time corresponding to its position in
the chromosome. Earlier genes are recovered because they enter before the pro-
phage. Later genes are not recovered because lysis destroys the recipient cell. In
interrupted-mating experiments, the λ prophage does in fact always enter the
F− cell at a specific time, closely linked to the gal locus.
In an Hfr(λ) × F− cross, the entry of the λ prophage into the cell immediately
triggers the prophage into a lytic cycle; this process is called zygotic induction
(Figure 5-31). However, in the cross of two lysogenic cells Hfr(λ) × F−(λ), there is
no zygotic induction. The presence of any prophage prevents another infecting
virus from causing lysis. This is because the prophage produces a cytoplasmic fac-
tor that represses the multiplication of the virus. (The phage-directed cytoplas-
mic repressor nicely explains the immunity of the lysogenic bacteria, because a
phage would immediately encounter a repressor and be inactivated.)
(a) (b)
F– F–
gal gal
F i g u r e 5 - 3 2 Reciprocal
recombination takes place between a
l phage inserts by a crossover at a specific site
specific attachment site on the circular
DNA and a specific region called the phage
attachment site on the E. coli chromosome
between the gal and bio genes.
Attachment site
. . . . E. coli
bio chromosome
gal
....
Integration enzymes
gal bio
E. coli chromosome
2
3 1 Attachment sites
gal 1 2 3
gal bio bio
2 2
3 1 3 1
dgal helper
1 2 gal 1 2 3
gal – bio
(i) Lysogenic transductants
1
2 gal
F i g u r e 5 - 3 4 The 1963
genetic map of E. coli genes A map of the E. coli genome obtained genetically
with mutant phenotypes.
D
Units are minutes, based on
A
B
I
C
interrupted-mating and
thrA,D
pdxA
uvrA
malB
pyrA
(lex)
metA
*purH
(ace upM
recombination experiments.
ara
aceE
*purthl
pgl
A
leu
aceF
azi
Y
*s
Z
A,D
ftsA
Asterisks refer to map
P
pan
O
*cy H
arg B
)
80 0
positions that are not as
arg C
c
1
ar gE
79
lac
pr cl
g
ph C
ar c
C
lo hoR *
precise as the other 2
la
o
p oA
B
*serB
(mutT)
*thyR
*
pp rts
(trpR)
(tp p)
A 78 10
(gua pil
*valS
tonA
* d)
positions. [ Data from G. S. m
hs p
(ra
(ast)
D m etF *
argF
*
n
e ,Bin
pyr
C)
glp tB m
A
Stent, Molecular Biology of pE
pro
77 11 u
B
fdp
rh K s s*
*p
Bacterial Viruses.] P rn pG*
tfr oB
a
urA A
*a
B 11.5 su A*
pr
m
A
glt
p
C 76 0 / 90 15
ilv c
O 85 5 x su L*
A rb ts l p
su ,B*
D pho s 74
mb E
(da S
u r tolA K
E rA) 80 10 p G
me 16 aro T
*tna tE
R nicA E
*tna
A gal ) O
73
B
(chlB
75 15 (mglR
C bgl
) (phr)
A pyrE 17
xyl gltH* 2)
(att434.8
(gad) glyS aroA* att l
72
*gltC 70 20 pyrD chlA
bioA
mtl pyrC 18
urvB*
purB
(cat) supC,O24
71
65 25 supF*
aroE tdk*
A (chlC
67 *spc ) gaIU
argR 25 att f
a s d A 80
*lin 60 30 pab tonB
yA B A
glpD *er sp aro trp
a H B
glpR 66 G aro * C
arg np pp D 26 cys
B
lA p ) 55 35 D
ma m s* pyr
laS a F E
B (a fda ph n* 36
aro ) * tC
e 50 40 eS
O
o B
(bi m
se gP
45
(ft gS
65 mo
rA
r
sB
*a
ar d)
y s m tB
l
)
37
*re
*c
(e f)
ot
*m ysC
A
ab ) A
d
(zw
(da utS
(da pA)
*p
(so
uv
c
D 64
*pr )
pB
rg
(mg
*rec d
rC
A
a
(tolC
m)
(
glpT
rA
56
purC
38
purF
*ctr
*nicB
supN
aroC
dsdA
dsdC
lP)
st
ha
)
pS
g
su
(tr
55 39
pH
(su
48
su E
ra
sh
pT
pD
54 50 49
*a
his
)
iA
cB
)
gn
galR
lysA
(re
d
thyA
argA
gua
fuc
purG*
glyA
tyrA
aroF*
pheA
uraP*
B
A
O
dapC*
cdsA*
gInD*
optA*
dapD
sefA*
polC
hlpA
IpxA
IpxB
rpsB
murG
murC
murE
firA
murF
erivA
mutT
secA
orf
orf
orf
polB
frsQ
tsf
ftsA
ftsZ
ddl
ftsl
orf
rpsT
lspA
ileS
ABC KJ AB
panBCD
acrC*
proS*
pyrH*
tadE*
popC*
sefA*
metD
mafB*
garB*
serR*
guaC
nadC
mrcB
pcnB
aceE
prlD*
dadB
fruR*
aceF
ftsM*
brnS*
ssyD
chlG*
aroP
(rimG) rimF*
apaH
gprA*
dapB
mafA
pdxA
ksgA
kefC
(popD) gprB
spe
Irs*
hpt
fhu
Ipd
folA
ant*
ara
leu
dna
orf
ilvJ
tolJ
car
ilv
orf
thr
(envN )
(tdi)
(toll)
0 1 2 3 4 5
F i g u r e 5 - 3 5 A linear scale drawing of a sequenced 5-minute section of the 100-minute
1990 E. coli linkage map. The parentheses and asterisks indicate markers for which the
exact location was unknown at the time of publication. Arrows above genes and groups of
genes indicate the direction of transcription. [ Data from B. J. Bachmann, “Linkage Map of
Escherichia coli K-12, Edition 8,” Microbiol. Rev. 54, 1990, 130–197.]
5.6 Physical Maps and Linkage Maps Compared 20 3
Te
r
Te min
Re rm us
inu
Re plic s
pl ho
ic re
ho 2
re
2
By 1963, the E. coli map (Figure 5-34) already detailed the positions of approx-
imately 100 genes. After 27 years of further refinement, the 1990 map depicted
the positions of more than 1400 genes. Figure 5-35 shows a 5-minute section of
the 1990 map (which is adjusted to a scale of 100 minutes). The complexity of
these maps illustrates the power and sophistication of genetic analysis. How well F i g u r e 5 - 3 7 An alignment of the
do these maps correspond to physical reality? In 1997, the DNA sequence of the genetic and physical maps. (a) Markers on
the 1990 genetic map in the region near
entire E. coli genome of 4,632,221 base pairs was completed, allowing us to com-
60 and 61 minutes. (b) The exact positions
pare the exact position of genes on the genetic map with the position of the cor- of every gene, based on the complete
responding coding sequence on the linear DNA sequence (the physical map). The sequence of the E. coli genome. (Not
full map is represented in Figure 5-36. Figure 5-37 makes a comparison for a seg- every gene is named in this map, for
ment of both maps. Clearly, the genetic map is a close match to the physical map. simplicity.) The elongated boxes are genes
Chapter 4 considered some ways in which the physical map (usually the full and putative genes. Each color represents
genome sequence) can be useful in mapping new mutations. In bacteria, the tech- a different type of function. For example,
red denotes regulatory functions, and dark
nique of insertional mutagenesis is another way to zero in rapidly on a muta-
blue denotes functions in DNA replication,
tion’s position on a known physical map. The technique causes mutations through recombination, and repair. Lines between
the random insertion of “foreign” DNA fragments. The inserts inactivate any gene the maps in parts a and b connect the
in which they land by interrupting the transcriptional unit. Transposons are par- same gene in each map. [ Data from
ticularly useful inserts for this purpose in several model organisms, including bac- F. R. Blattner et al., “The Complete Science
teria. To map a new mutation, the procedure is as follows. The DNA of a transposon 277, l997, 1453–1462.]
Proportions of the genetic and physical maps are similar but not identical
(a)
cysC cysH eno relA argA recC mutH
60 ptr thyA 61
(b)
mutS rpoS pcm cysC iap cysH eno relA barA syd sdaC exo gcvA mltA argA ptr recC thyA ptsP mutH aas galR araE glyU
20 4 CHA P TER 5 The Genetics of Bacteria and Their Viruses
s u m m a ry
Advances in bacterial and phage genetics within the past which the map unit is a unit of time (minutes). In an exten-
50 years have provided the foundation for molecular biology sion of this technique, the frequency of recombinants
and cloning (discussed in later chapters). Early in this period, between markers known to have entered the recipient can
gene transfer and recombination were found to take place provide a finer-scale map distance.
between different strains of bacteria. In bacteria, however, Several types of plasmids other than F can be found.
genetic material is passed in only one direction—for exam- R plasmids carry antibiotic-resistance alleles, often within a
ple, in Escherichia coli, from a donor cell (F+ or Hfr) to a mobile element called a transposon. Rapid plasmid spread
recipient cell (F−). Donor ability is determined by the pres- causes population-wide resistance to medically important
ence in the cell of a fertility factor (F), a type of plasmid. On drugs. Derivatives of such natural plasmids have become
occasion, the F factor present in the free state in F+ cells can important cloning vectors, useful for gene isolation and
integrate into the E. coli chromosome and form an Hfr cell. study in all organisms.
When this occurs, a fragment of donor chromosome can Genetic traits can also be transferred from one bacterial
transfer into a recipient cell and subsequently recombine cell to another in the form of pieces of DNA taken into the cell
with the recipient chromosome. Because the F factor can from the extracellular environment. This process of transfor-
insert at different places on the host chromosome, early mation in bacterial cells was the first demonstration that DNA
investigators were able to piece the transferred fragments is the genetic material. For transformation to occur, DNA
together to show that the E. coli chromosome is a single cir- must be taken into a recipient cell, and recombination must
cle, or ring. Interruption of the transfer at different times then take place between a recipient chromosome and the
has provided geneticists with an unconventional method incorporated DNA.
(interrupted mating) for constructing a linkage map of the Bacteria can be infected by viruses called bacteriophages.
single chromosome of E. coli and other similar bacteria, in In one method of infection, the phage chromosome may enter
Solved Problems 20 5
the bacterial cell and, by using the bacterial metabolic machin- alone into the phage head during lysis. In specialized transduc-
ery, produce progeny phages that burst the host bacterium. The tion, faulty excision of the prophage from a unique chromo-
new phages can then infect other cells. If two phages of different somal locus results in the inclusion of specific host genes as
genotypes infect the same host, recombination between their well as phage DNA in the phage head.
chromosomes can take place. Today, a physical map in the form of the complete genome
In another mode of infection, lysogeny, the injected phage sequence is available for many bacterial species. With the use
lies dormant in the bacterial cell. In many cases, this dormant of this physical genome map, the map position of a mutation
phage (the prophage) incorporates into the host chromosome of interest can be precisely located. First, appropriate muta-
and replicates with it. Either spontaneously or under appropriate tions are produced by the insertion of transposons (inser-
stimulation, the prophage can leave its dormant state and lyse tional mutagenesis). Then, the DNA sequence surrounding
the bacterial host cell. the inserted transposon is obtained and matched to a sequence
A phage can carry bacterial genes from a donor to a recipient. in the physical map. This technique provides the locus, the
In generalized transduction, random host DNA is incorporated sequence, and possibly the function of the gene of interest.
key terms
auxotroph (p. 176) genetic marker (p. 177) prokaryote (p. 174)
bacteriophage (phage) (p. 174) Hfr (high frequency of recombination) prophage (p. 196)
cell clone (p. 176) (p. 179) prototroph (p. 176)
colony (p. 176) horizontal transmission (p. 175) R plasmid (p. 189)
conjugation (p. 177) insertional mutagenesis (p. 203) recipient (p. 178)
cotransductant (p. 198) interrupted mating (p. 181) resistant mutant (p. 177)
λ attachment site (p. 199) lysate (p. 193) rolling circle replication (p. 179)
donor (p. 178) lysis (p. 193) screen (p. 195)
double (mixed) infection (p. 194) lysogen (lysogenic bacterium) selective system (p. 195)
double transformation (p. 191) (p. 196) specialized transduction (p. 199)
endogenote (p. 186) merozygote (p. 186) temperate phage (p. 196)
exconjugant (p. 181) minimal medium (p. 176) terminus (p. 183)
exogenote (p. 186) mixed (double) infection (p. 194) transduction (p. 196)
F+ (donor) (p. 179) origin (O) (p. 182) transformation (p. 191)
F − (recipient) (p. 179) phage (bacteriophage) (p. 174) unselected marker (p. 187)
F ′ plasmid (p. 188) phage recombination (p. 176) vertical transmission (p. 176)
fertility factor (F ) (p. 179) plaque (p. 193) virulent phage (p. 196)
generalized transduction plasmid (p. 179) virus (p. 174)
(p. 197) plating (p. 176) zygotic induction (p. 199)
s olv e d p r obl e m s
SOLVED PROBLEM 1. Suppose that a cell were unable to transduction entails the integration of the phage at a spe-
carry out generalized recombination (rec−). How would this cific point on the chromosome and the rare incorporation
cell behave as a recipient in generalized and in specialized of chromosomal markers near the integration site into the
transduction? First, compare each type of transduction, and phage genome. Therefore, only those markers that are near
then determine the effect of the rec− mutation on the inheri- the specific integration site of the phage on the host chro-
tance of genes by each process. mosome can be transduced.
Markers are inherited by different routes in generalized
Solution and specialized transduction. A generalized transducing
Generalized transduction entails the incorporation of chro- phage injects a fragment of the donor chromosome into the
mosomal fragments into phage heads, which then infect recipient. This fragment must be incorporated into the re-
recipient strains. Fragments of the chromosome are incor- cipient’s chromosome by recombination, with the use of the
porated randomly into phage heads, and so any marker on recipient’s recombination system. Therefore, a rec− recipient
the bacterial host chromosome can be transduced to another will not be able to incorporate fragments of DNA and cannot
strain by generalized transduction. In contrast, specialized inherit markers by generalized transduction. On the other
20 6 CHA P TER 5 The Genetics of Bacteria and Their Viruses
hand, the major route for the inheritance of markers by spe- recombinants for any marker among the leu+ recombinants.
cialized transduction is by integration of the specialized Because the inheritance of thr + is the highest, thr + must be
transducing particle into the host chromosome at the specific the first marker to enter after leu. The complete order is leu,
phage integration site. This integration, which sometimes re- thr, ile, mal, trp.
quires an additional wild-type (helper) phage, is mediated by
a phage-specific enzyme system that is independent of the SOLVED PROBLEM 4. A cross is made between an Hfr that is
normal recombination enzymes. Therefore, a rec− recipient met + thi + pur + and an F − that is met − thi − pur−. Interrupted-
can still inherit genetic markers by specialized transduction. mating studies show that met + enters the recipient last, and
so met + recombinants are selected on a medium containing
SOLVED PROBLEM 2. In E. coli, four Hfr strains donate the supplements that satisfy only the pur and thi requirements.
following genetic markers, shown in the order donated: These recombinants are tested for the presence of the thi +
Strain 1: Q W D M T and pur + alleles. The following numbers of individuals are
Strain 2: A X P T M found for each genotype:
Strain 3: B N C A X met+ thi+ pur+ 280
Strain 4: B Q W D M
met+ thi+ pur- 0
All these Hfr strains are derived from the same F+ met+ thi- pur+ 6
strain. What is the order of these markers on the circular met+ thi- pur- 52
chromosome of the original F+?
a. Why was methionine (Met) left out of the selection
Solution medium?
A two-step approach works well: (1) determine the underly- b. What is the gene order?
ing principle and (2) draw a diagram. Here the principle is c. What are the map distances in recombination units?
clearly that each Hfr strain donates genetic markers from a
fixed point on the circular chromosome and that the earli- Solution
est markers are donated with the highest frequency. Because a. Methionine was left out of the medium to allow selection
not all markers are donated by each Hfr, only the early for met + recombinants because met + is the last marker to
markers must be donated for each Hfr. Each strain allows us enter the recipient. The selection for met + ensures that all the
to draw the following circles: loci that we are considering in the cross will have already
Q B Q entered each recombinant that we analyze.
W W
b. Here, a diagram of the possible gene orders is helpful.
D B D
N
Because we know that met enters the recipient last, there are
M M C M only two possible gene orders if the first marker enters on the
A T A
T X P X right: met, thi, pur or met, pur, thi. How can we distinguish
Strain 1 Strain 2 Strain 3 Strain 4 between these two orders? Fortunately, one of the four pos-
sible classes of recombinants requires two additional cross-
From this information, we can consolidate each circle into
overs. Each possible order predicts a different class that arises
one circular linkage map of the order Q, W, D, M, T, P, X, A,
by four crossovers rather than two. For instance, if the order
C, N, B, Q.
were met, thi, pur, then met + thi − pur + recombinants would be
SOLVED PROBLEM 3. In an Hfr × F − cross, leu+ enters as the very rare. On the other hand, if the order were met, pur, thi,
first marker, but the order of the other markers is unknown. If then the four-crossover class would be met + pur− thi +. From
the Hfr is wild type and the F − is auxotrophic for each marker the information given in the table, the met + pur− thi + class is
in question, what is the order of the markers in a cross where clearly the four-crossover class and therefore the gene order
leu+ recombinants are selected if 27 percent are ile+, 13 per- met, pur, thi is correct.
cent are mal+, 82 percent are thr +, and 1 percent are trp+? c. Refer to the following diagram:
Solution met 15.4 m.u. pur 1.8 m.u. thi
Recall that spontaneous breakage creates a natural gradient Hfr
of transfer, which makes it less and less likely for a recipient met pur thi
to receive later and later markers. Because we have selected
for the earliest marker in this cross, the frequency of recom-
binants is a function of the order of entry for each marker. F
Therefore, we can immediately determine the order of the To compute the distance between met and pur, we com-
genetic markers simply by looking at the percentage of pute the percentage of met + pur− thi −, which is 52/338 =
Problems 207
15.4 m.u. Similarly, the distance between pur and thi is responsible for the gene transfer displayed by cultures of
6/338 =1.8 m.u. F + cells. In the Hfr−- and F +-mediated gene transfer, inheri-
tance requires the incorporation of a transferred fragment
SOLVED PROBLEM 5. Compare the mechanism of transfer by recombination (recall that two crossovers are needed)
and inheritance of the lac + genes in crosses with Hfr, F +, and into the F − chromosome. Therefore, an F − strain that can-
F ′ lac + strains. How would an F − cell that cannot undergo not undergo recombination cannot inherit donor chromo-
normal homologous recombination (rec−) behave in crosses somal markers even though they are transferred by Hfr
with each of these three strains? Would the cell be able to strains or Hfr cells in F + strains. The fragment cannot be
inherit the lac + gene? incorporated into the chromosome by recombination.
Because these fragments do not possess the ability to repli-
Solution cate within the F − cell, they are rapidly diluted out during
Each of these three strains donates genes by conjugation. In cell division.
the Hfr and F + strains, the lac + genes on the host chromo- Unlike Hfr cells, F ′ cells transfer genes carried on the F ′
some are donated. In the Hfr strain, the F factor is integrated factor, a process that does not require chromosome transfer.
into the chromosome in every cell, and so chromosomal In this case, the lac + genes are linked to the F ′ factor and are
markers can be efficiently donated, particularly if a marker transferred with it at a high efficiency. In the F− cell, no re-
is near the integration site of F and is donated early. The F + combination is required because the F ′ lac + strain can repli-
cell population contains a small percentage of Hfr cells, in cate and be maintained in the dividing F− cell population.
which F is integrated into the chromosome. These cells are Therefore, the lac + genes are inherited even in a rec− strain.
p r obl e m s
Most of the problems are also available for review/grading through the http://www.whfreeman.com/
launchpad/iga11e.
Percentage of cells transformed into d. Based on your answer to part c, explain the relative
proportions of genotypes observed in experiment II.
Transforming DNA strr mtl- strs mtl+ strr mtl+
36. Although most λ-mediated gal+ transductants are induc-
strr mtl+ 4.3 0.40 0.17 ible lysogens, a small percentage of these transductants
strr mtl- + strs mtl+ 2.8 0.85 0.0066 in fact are not lysogens (that is, they contain no integrat-
ed λ). Control experiments show that these transduc-
a. What does the first row of the table tell you? Why? tants are not produced by mutation. What is the likely
b. What does the second row of the table tell you? Why? origin of these types?
33. Recall that, in Chapter 4, we considered the possibility 37. An ade+ arg+ cys+ his+ leu+ pro+ bacterial strain is known
that a crossover event may affect the likelihood of to be lysogenic for a newly discovered phage, but the site
another crossover. In the bacteriophage T4, gene a is of the prophage is not known. The bacterial map is
1.0 m.u. from gene b, which is 0.2 m.u. from gene c. The arg
gene order is a, b, c. In a recombination experiment, you his
recover five double crossovers between a and c from cys
100,000 progeny viruses. Is it correct to conclude that in- leu
terference is negative? Explain your answer.
ade
34. E. coli cells were infected with two strains of T4 virus. pro
One strain is minute (m), rapid lysis (r), and turbid (t); The lysogenic strain is used as a source of the phage, and
the other is wild type for all three markers. The lytic the phages are added to a bacterial strain of genotype
products of this infection were plated and classified. The ade - arg - cys - his- leu- pro-. After a short incubation,
resulting 10,342 plaques were distributed among eight samples of these bacteria are plated on six different
genotypes as follows: media, with the supplementations indicated in the
m r t 3469 m + + 521 following table. The table also shows whether colonies
were observed on the various media.
+ + + 3727 + r t 475
m r + 854 + r + 171 Nutrient supplementation in medium
Presence
m + t 163 + + t 963 Medium Ade Arg Cys His Leu Pro of colonies
a. What are the linkage distances between m and r, 1 - + + + + + N
between r and t, and between m and t. 2 + - + + + + N
b. Determine the linkage order for the three genes. 3 + + - + + + C
c. What is the coefficient of coincidence (see Chapter 4) 4 + + + - + + N
in this cross? What does it signify? 5 + + + + - + C
6 + + + + + - N
35. With the use of P22 as a generalized transducing phage
grown on a pur+ pro+ his+ bacterial donor, a recipient
(In this table, a plus sign indicates the presence of a
strain of genotype pur− pro− his− was infected and incu-
nutrient supplement, a minus sign indicates that a
bated. Afterward, transductants for pur+, pro+, and his+
supplement is not present, N indicates no colonies, and C
were selected individually in experiments I, II, and III,
indicates colonies present.)
respectively.
a. What genetic process is at work here?
a. What medium is used in each of these selection
experiments? b. What is the approximate locus of the prophage?
b. The transductants were examined for the presence of 38. In a generalized-transduction system using P1 phage, the
unselected donor markers, with the following results: donor is pur + nad + pdx − and the recipient is pur − nad −
pdx +. The donor allele pur + is initially selected after
I II III transduction, and 50 pur + transductants are then scored
for the other alleles present. Here are the results:
pro- his- 86% pur- his- 44% pur- pro- 20%
pro+ his- 0% pur+ his- 0% pur+ pro- 14% Genotype Number of colonies
pro- his+ 10% pur- his+ 54% pur- pro+ 61% nad+ pdx+ 3
pro+ his+ 4% pur+ his+ 2% pur+ pro+ 5% nad+ pdx- 10
nad- pdx+ 24
nad- pdx- 13
What is the order of the bacterial genes?
50
c. Which two genes are closest together?
210 CHA P TER 5 The Genetics of Bacteria and Their Viruses
a. What is the cotransduction frequency for pur and and 320 of the cultures are found to be able to grow but
nad? the remainder cannot.
b. What is the cotransduction frequency for pur and a. Deduce the genotypes of the two types of cultures.
pdx? b. Draw the crossover events required to produce these
c. Which of the unselected loci is closest to pur? genotypes.
d. Are nad and pdx on the same side or on opposite sides c. Calculate the distance between the pro and thi genes
of pur? Explain. in recombination units.
(Draw the exchanges needed to produce the various
transformant classes under either order to see which
www
requires the minimum number to produce the results Unpacking Problem 41
obtained.) www
1. What type of organism is E. coli?
39. In a generalized-transduction experiment, phages are
2. What does a culture of E. coli look like?
collected from an E. coli donor strain of genotype cys +
leu + thr + and used to transduce a recipient of genotype 3. On what sort of substrates does E. coli generally grow in
cys- leu- thr -. Initially, the treated recipient population its natural habitat?
is plated on a minimal medium supplemented with leu- 4. What are the minimal requirements for E. coli cells to
cine and threonine. Many colonies are obtained. divide?
a. What are the possible genotypes of these colonies? 5. Define the terms prototroph and auxotroph.
b. These colonies are then replica plated onto three 6. Which cultures in this experiment are prototrophic,
different media: (1) minimal plus threonine only, (2) and which are auxotrophic?
minimal plus leucine only, and (3) minimal. What
7. Given some strains of unknown genotype regarding
genotypes could, in theory, grow on these three media?
thiamine and proline, how would you test their geno-
c. Of the original colonies, 56 percent are observed to types? Give precise experimental details, including
grow on medium 1, 5 percent on medium 2, and no equipment.
colonies on medium 3. What are the actual genotypes of
8. What kinds of chemicals are proline and thiamine?
the colonies on media 1, 2, and 3?
Does it matter in this experiment?
d. Draw a map showing the order of the three genes and
which of the two outer genes is closer to the middle gene. 9. Draw a diagram showing the full set of manipulations
performed in the experiment.
40. Deduce the genotypes of the following E. coli strains 1 10. Why do you think the experiment was done?
through 4:
11. How was it established that pro enters after thi? Give
Minimal Minimal plus precise experimental steps.
arginine
12. In what way does an interrupted-mating experiment
1 2 1 2 differ from the experiment described in this problem?
3 4 3 4
13. What is an exconjugant? How do you think that excon-
jugants were obtained? (It might include genes not de-
Minimal plus Minimal plus scribed in this problem.)
methionine arginine and methionine
14. When the pro gene is said to enter after thi, does it mean
1 2 1 2 the pro allele, the pro+ allele, either, or both?
3 4 3 4
15. What is “fully supplemented medium” in the context of
this question?
16. Some exconjugants did not grow on minimal medium.
41. In an interrupted-conjugation experiment in E. coli, the On what medium would they grow?
pro gene enters after the thi gene. A pro + thi + Hfr is
crossed with a pro - thi - F - strain, and exconjugants are 17. State the types of crossovers that take part in Hfr × F-
plated on medium containing thiamine but no proline. A recombination. How do these crossovers differ from
total of 360 colonies are observed, and they are isolated crossovers in eukaryotes?
and cultured on fully supplemented medium. These cul- 18. What is a recombination unit in the context of the pres-
tures are then tested for their ability to grow on medium ent analysis? How does it differ from the map units
containing no proline or thiamine (minimal medium), used in eukaryote genetics?
Problems 211
42. A generalized transduction e xperiment uses a metE+ b. The following table shows the number of colonies on
pyrD+ strain as donor and metE - pyrD- as recipient. each type of agar for samples taken at various times after
metE+ transductants are selected and then tested for the the strains are mixed. Use this information to determine
pyrD+ allele. The following numbers were obtained: the order of genes a, b, and c.
metE+ pyrD- 857 Time
metE+ pyrD+ 1 of sampling Number of colonies on agar of type
(minutes) 1 2 3
Do these results suggest that these loci are closely
linked? What other explanations are there for the lone 0 0 0 0
“double”? 5 0 0 0
43. An argC -strain was infected with transducing phage, 7.5 102 0 0
and the lysate was used to transduce metF - recipients
10 202 0 0
on medium containing arginine but no methionine.
The metF+ transductants were then tested for arginine 12.5 301 0 74
requirement: most were argC+ but a small percentage 15 400 0 151
were found to be argC -. Draw diagrams to show the
17.5 404 49 225
likely origin of the argC+ and argC - strains.
20 401 101 253
C h a ll e n g i n g P r obl e m s 25 398 103 252
44. Four E. coli strains of genotype a+ b- are labeled 1, 2, 3,
and 4. Four strains of genotype a- b+ are labeled 5, 6, 7, c. From each of the 25-minute plates, 100 colonies are
and 8. The two genotypes are mixed in all possible combi- picked and transferred to a petri dish containing agar
nations and (after incubation) are plated to determine the with all the nutrients except D. The numbers of colonies
frequency of a+ b+ recombinants. The following results that grow on this medium are 90 for the sample from
are obtained, where M = many recombinants, L = low agar type 1, 52 for the sample from agar type 2, and 9 for
numbers of recombinants, and 0 = no recombinants: the sample from agar type 3. Using these data, fit gene d
into the sequence of a, b, and c.
1 2 3 4
d. At what sampling time would you expect colonies to
5 0 M M 0 first appear on agar containing C and streptomycin but
6 0 M M 0 no A or B?
7 L 0 0 M 46. In the cross Hfr aro+ arg + ery r str s × F - aro - arg - ery s
8 0 L L 0 str r, the markers are transferred in the order given
(with aro + entering first), but the first three genes
On the basis of these results, assign a sex type (either
are very close together. Exconjugants are plated on
Hfr, F+, or F-) to each strain.
a medium containing Str (streptomycin, to kill Hfr
45. An Hfr strain of genotype a+ b+ c+ d - str s is mated with cells), Ery (erythromycin), Arg (arginine), and Aro
a female strain of genotype a- b- c- d+ str r. At various (aromatic amino acids). The following results are ob-
times, mating pairs are separated by vigorously shaking tained for 300 colonies isolated from these plates and
the culture. The cells are then plated on three types of tested for growth on various media: on Ery only,
agar, as shown below, where nutrient A allows the growth 263 strains grow; on Ery + Arg, 264 strains grow; on
of a - cells; nutrient B, of b - cells; nutrient C, of c - cells; Ery + Aro, 290 strains grow; on Ery + Arg + Aro,
and nutrient D, of d - cells. (A plus indicates the presence 300 strains grow.
of streptomycin or a nutrient, and a minus indicates its a. Draw up a list of genotypes, and indicate the number
absence.) of individuals in each genotype.
Agar type Str A B C D b. Calculate the recombination frequencies.
1 + + + - + c. Calculate the ratio of the size of the arg-to-aro region
2 + - + + + to the size of the ery-to-arg region.
3 + + - + + 47. A bacterial transformation is performed with a donor
strain that is resistant to four drugs, A, B, C, and D, and a
a. What donor genes are being selected on each type of recipient strain that is sensitive to all four drugs. The re-
agar? sulting recipient cell population is divided and plated on
212 CHA P TER 5 The Genetics of Bacteria and Their Viruses
media containing various combinations of the drugs. an F - derivative of strain A that has lost the F ′ ). How
The following table shows the results. would you determine whether trp1 and trp2 are alleles of
the same locus? (Describe the crosses and the results
Drugs Number Drugs Number
expected.)
added of colonies added of colonies
50. A generalized transducing phage is used to transduce an
None 10,000 BC 50 a - b - c - d - e - recipient strain of E. coli with an a + b + c +
A 1155 BD 48 d + e + donor. The recipient culture is plated on various
B 1147 CD 785 media with the results shown in the following table.
C 1162 ABC 31 (Note that a - indicates a requirement for A as a nutrient,
D 1140 ABD 43 and so forth.) What can you conclude about the linkage
AB 47 ACD 631 and order of the genes?
AC 641 BCD 35
AD 941 ABCD 29 Compounds added Presence (+) or
to minimal medium absence (-) of colonies
a. One of the genes is distant from the other three, which
appear to be closely linked. Which is the distant gene? CDE -
BDE -
b. What is the likely order of the three closely linked BCE +
genes? BCD +
48. You have two strains of λ that can lysogenize E. coli; their ADE -
linkage maps are as follows: ACE -
ACD -
Strain X Strain Y
c b c b ABE -
ABD +
ABC -
transducing particles of phage φ80 that carried the lac with the use of a DNA replication primer identical with
region. part of the end of the transposon sequence, and the se-
quence adjacent to the transposon was found to corre-
52. Wild-type E. coli takes up and concentrates a certain red spond to a gene of unknown function called atoE, span-
food dye, making the colonies blood red. Transposon ning positions 2.322 through 2.324 Mb on the map
mutagenesis was used, and the cells were plated on food (numbered from an arbitrary position zero). Propose a
dye. Most colonies were red, but some colonies did not function for atoE. What biological process could be inves-
take up dye and appeared white. In one white colony, the tigated in this way, and what other types of white colo-
DNA surrounding the transposon insert was sequenced, nies might be expected?
This page intentionally left blank
344
6
C h a p t e r
Gene Interaction
Learning Outcomes
After completing this chapter, you will be able to
outline
215
216 CHA P TER 6 Gene Interaction
T
he thrust of our presentation in the book so far has been to show how genet-
icists identify a gene that affects some biological property of interest. We
have seen how the approaches of forward genetics can be used to identify
individual genes. The researcher begins with a set of mutants, and then crosses
each mutant with the wild type to see if the mutant shows single-gene inheri-
tance. The cumulative data from such a research program would reveal a set of
genes that all have roles in the development of the property under investigation.
In some cases, the researcher may be able to identify specific biochemical func-
tions for many of the genes by comparing gene sequences with those of other
organisms. The next step, which is a greater challenge, is to deduce how the genes
in a set interact to influence phenotype.
How are the gene interactions underlying a property deduced? One molecular
approach is to analyze protein interactions directly in vitro by using one protein
as “bait” and observing which other cellular proteins attach to it. Proteins that are
found to bind to the bait are candidates for interaction in the living cell. Another
molecular approach is to analyze mRNA transcripts. The genes that collaborate in
some specific developmental process can be defined by the set of RNA transcripts
present when that process is going on, a type of analysis now carried out with the
use of genome chips (see Chapter 14). Finally, gene interactions and their signifi-
cance in shaping phenotype can be deduced by genetic analysis, which is the focus
of this chapter.
Gene interactions can be classified broadly into two categories. The first
category consists of interactions between alleles of one locus, broadly speaking
variations on dominance. In earlier chapters, we dealt with full dominance and
full recessiveness, but as we shall see in this chapter, there are other types of
dominance, each with their own underlying cell biology. Although this informa-
tion does not address the range of genes affecting a function, a great deal can be
learned of a gene’s role by considering allelic interactions. The second category
consists of interactions between two or more loci. These interactions reveal the
number and types of genes in the overall program underlying a particular biologi-
cal function.
mRNA
Chromosome
+ + m
+ m m
Chromosome
mRNA
Protein
be distinguished from the heterozygote; that is, at the phenotypic level, A/A = F i g u r e 6 -1 In the heterozygote, even
A/a. As mentioned earlier, phenylketonuria (PKU) and many other single-gene though the mutated copy of the gene pro-
duces nonfunctional protein, the wild-type
human diseases are fully recessive, whereas their wild-type alleles are dominant.
copy generates enough functional protein
Other single-gene diseases such as achondroplasia are fully dominant, whereas, in to produce the wild-type phenotype.
those cases, the wild-type allele is recessive. How can these dominance relations
be interpreted at the cellular level?
The disease PKU is a good general model for recessive mutations. Recall that
PKU is caused by a defective allele of the gene encoding the enzyme phenylala-
nine hydroxylase (PAH). In the absence of normal PAH, the phenylalanine enter-
ing the body in food is not broken down and hence accumulates. Under such
conditions, phenylalanine is converted into phenylpyruvic acid, which is trans-
ported to the brain through the bloodstream and there impedes normal develop-
ment, leading to mental retardation. The reason that the defective allele is
recessive is that one “dose” of the wild-type allele P produces enough PAH to
break down the phenylalanine entering the body. Thus, the PAH wild-type allele
is said to be haplosufficient. Haplo means a haploid dose (one) and sufficient refers
to the ability of that single dose to produce the wild-type phenotype. Hence, both
P/P (two doses) and P/p (one dose) have enough PAH activity to result in the nor-
mal cellular chemistry. People with p/p have zero doses of PAH activity. Figure 6-1
illustrates this general notion.
How can we explain fully dominant mutations? There are several molecular
mechanisms for dominance. A regularly encountered mechanism is that the wild-
type allele of a gene is haploinsufficient. In haploinsufficiency, one wild-type dose
is not enough to achieve normal levels of function. Assume that 16 units of a
gene’s product are needed for normal chemistry and that each wild-type allele
218 CHA P TER 6 Gene Interaction
can make 10 units. Two wild-type alleles will produce 20 units of product, well
over the minimum. But consider what happens if one of the mutations is a null
mutation, which produces a nonfunctional protein. A null mutation in combina-
tion with a single wild-type allele would produce 10 + 0 = 10 units, well below the
minimum. Hence, the heterozygote (wild type/null) is mutant, and the mutation
is, by definition, dominant. In mice, the gene Tbx1 is haploinsufficient. This gene
encodes a transcription-regulating protein (a transcription factor) that acts on
genes responsible for the development of the pharynx. A knockout of one wild-
type allele results in an inadequate concentration of the regulatory protein, which
results in defects in the development of the pharyngeal arteries. The same haplo-
insufficiency is thought to be responsible for DiGeorge syn-
Two models for dominance drome in humans, a condition with cardiovascular and
of a mutation craniofacial abnormalities.
Another important type of dominant mutation is called
a dominant negative. Polypeptides with this type of muta-
Model 1: Model 2: Phenotype tion act as “spoilers” or “rogues.” In some cases, the gene
Haploinsufficiency Dominant negative product is a unit of a homodimeric protein, a protein com-
posed of two units of the same type. In the heterozygote
(+/M), the mutant polypeptide binds to the wild-type poly-
+/+ peptide and acts as a spoiler by distorting it or otherwise
interfering with its function. The same type of spoiling can
also hinder the functioning of a heterodimer composed of
2 “doses” of product Dimer Wild type
polypeptides from different genes. In other cases, the gene
product is a monomer, and, in these situations, the mutant
binds the substrate, and acts as a spoiler by hindering the
M/M Mutant ability of the wild-type protein to bind to the substrate.
An example of mutations that can act as dominant nega-
0 “dose” tives is found in the gene for collagen protein. Some muta-
tions in this gene give rise to the human phenotype
osteogenesis imperfecta (brittle-bone disease). Collagen is a
+/M Mutant connective-tissue protein formed of three monomers inter-
twined (a trimer). In the mutant heterozygote, the abnormal
protein wraps around one or two normal ones and distorts
1 “dose” (inadequate)
the trimer, leading to malfunction. In this way, the defective
collagen acts as a spoiler. The difference between haploin-
F i g u r e 6 -2 A mutation may be sufficiency and the action of a dominant negative as causes of dominance of a
dominant because (left) a single wild-type mutation is illustrated in Figure 6-2.
gene does not produce enough protein
product for proper function or (right) the
K e y C o n c e p t For most genes, a single wild-type copy is adequate for full
mutant allele acts as a dominant negative
expression (such genes are haplosufficient), and their null mutations are fully
that produces a “spoiler” protein product.
recessive. Harmful mutations of haploinsufficient genes are often dominant.
Mutations in genes that encode units in homo- or heterodimers can behave as
dominant negatives, acting through “spoiler” proteins.
Incomplete dominance
Four-o’clocks are plants native to tropical America. Their name comes from the
fact that their flowers open in the late afternoon. When a pure-breeding wild-type
four-o’clock line having red petals is crossed with a pure line having white petals,
the F1 has pink petals. If an F2 is produced by selfing the F1, the result is
1
4 of the plants have red petals
1
2 of the plants have pink petals
1
4 of the plants have white petals
6.1 Interactions Between the Alleles of a Single Gene: Variations on Dominance 219
In this allelic series, the alleles determine the presence and form of a complex
sugar molecule present on the surface of red blood cells. This sugar molecule is an
antigen, a cell-surface molecule that can be recognized by the immune system.
The alleles IA and IB determine two different forms of this cell-surface molecule.
However, the allele i results in no cell-surface molecule of this type (it is a null
allele). In the genotypes IA/i and I B/i, the alleles IA and I B are fully dominant over
i. However, in the genotype IA/I B, each of the alleles produces its own form of the
cell-surface molecule, and so the A and B alleles are codominant.
The human disease sickle-cell anemia illustrates the somewhat arbitrary ways
in which we classify dominance. The gene concerned encodes the molecule hemo-
globin, which is responsible for transporting oxygen in blood vessels and is the
220 CHA P TER 6 Gene Interaction
major constituent of red blood cells. There are two main alleles HbA and HbS, and
Sickled and normal the three possible genotypes have different phenotypes, as follows:
red blood cells
HbA/HbA: normal; red blood cells never sickle
Hb /Hb : severe, often fatal anemia; abnormal hemoglobin
S S
oxygen concentrations
Figure 6-4 shows an electron micrograph of blood cells including some sickled
cells. In regard to the presence or absence of anemia, the Hb A allele is dominant. In
the heterozygote, a single Hb A allele produces enough functioning hemoglobin to
prevent anemia. In regard to blood-cell shape, however, there is incomplete domi-
nance, as shown by the fact that, in the heterozygote, many of the cells have a slight
sickle shape. Finally, in regard to hemoglobin itself, there is codominance. The
alleles Hb A and Hb S encode two different forms of hemoglobin that differ by a sin-
gle amino acid, and both forms are synthesized in the heterozygote. The A and S
F i g u r e 6 - 4 The sickle-shaped cell is forms of hemoglobin can be separated by electrophoresis because it happens that
caused by a single mutation in the gene they have different charges (Figure 6-5). We see that homozygous Hb A/HbA people
for hemoglobin. [ Eye of Science/Science have one type of hemoglobin (A), and anemics have another (type S), which moves
Source.] more slowly in the electric field. The heterozygotes have both types, A and S. In
other words, there is codominance at the molecular level. The fascinating popula-
tion genetics of the Hb A and Hb S alleles will be considered in Chapter 20.
Sickle-cell anemia illustrates the arbitrariness of the terms dominance, incom-
plete dominance, and codominance. The type of dominance inferred depends on
the phenotypic level at which the assay is made—organismal, cellular, or molecu-
lar. Indeed, caution should be applied to many of the categories that scientists use
to classify structures and processes; these categories are devised by humans for
the convenience of analysis.
Model Mouse
Organism
The laboratory mouse is descended from the house mouse A major limitation of mouse genetics is its cost. Whereas
Mus musculus. The pure lines used today as standards are working with a million individuals of E. coli or S. cerevisiae is
derived from mice bred in past centuries by mouse “fanci- a trivial matter, working with a million mice requires a fac-
ers.” Among model organisms, it is the one whose genome tory-size building. Furthermore, although mice do breed
most closely resembles the human genome. Its diploid chro- rapidly (compared with humans), they cannot compete with
mosome number is 40 (compared with 46 in humans), and microorganisms for speedy life cycle. Hence, the large-scale
the genome is slightly smaller than that of humans (the selections and screens necessary to detect rare genetic events
human genome being 3000 Mb) and contains approxi- are not possible.
mately the same number of genes (current estimate 25,000).
Furthermore, all mouse genes seem to have counterparts in
humans. A large proportion of genes are arranged in blocks
in exactly the same positions as those of humans.
Research on the Mendelian genetics of mice began
early in the twentieth century. One of the most important
early contributions was the elucidation of the genes that
control coat color and pattern. Genetic control of the
mouse coat has provided a model for all mammals, includ-
ing cats, dogs, horses, and cattle. A great deal of work was
also done on mutations induced by radiation and chemi-
cals. Mouse genetics has been of great significance in medi-
cine. A large proportion of human genetic diseases have
mouse counterparts useful for experimental study (they
are called “mouse models”). The mouse has played a par-
ticularly important role in the development of our current
understanding of the genes underlying cancer.
The mouse genome can be modified by the insertion of
specific fragments of DNA into a fertilized egg or into
somatic cells. The mice in the photograph have received a Green-glowing genetically modified mice. The jellyfish gene for
jellyfish gene for green fluorescent protein (GFP) that green fluorescent protein has been inserted into the
makes them glow green under special lights. Gene knock- chromosomes of the glowing mice. The other mice are normal.
outs and replacements also are possible. [ Eye of Science/Science Source.]
double dose. In general, the term pleiotropic is used for any allele that affects
several properties of an organism.
The tailless Manx phenotype in cats (Figure 6-8) also is produced by an allele
that is lethal in the homozygous state. A single dose of the Manx allele, ML,
severely interferes with normal spinal development, resulting in the absence of a
tail in the M L/M heterozygote. But in the M L/M L homozygote, the double dose of
the gene produces such an extreme abnormality in spinal development that the
embryo does not survive.
The yellow and ML alleles have their own phenotypes in a heterozygote, but
most recessive lethals are silent in the heterozygote. In such a situation, recessive
lethality is diagnosed by observing the death of 25 percent of the progeny at some
stage of development.
Whether an allele is lethal or not often depends on the environment in which
the organism develops. Whereas certain alleles are lethal in virtually any environ-
ment, others are viable in one environment but lethal in another. Human heredi-
tary diseases provide some examples. Cystic fibrosis and sickle-cell anemia are
diseases that would be lethal without treatment. Furthermore, many of the alleles
favored and selected by animal and plant breeders would almost certainly be
eliminated in nature as a result of competition with the members of the natural
6.2 Interaction of Genes in Pathways 223
responsible for black urine was homogentisic acid, which is present in high
amounts and secreted into the urine in AKU patients. He knew that, in unaffected
people, homogentisic acid is converted into maleylacetoacetic acid; so he pro-
posed that, in AKU, there is a defect in this conversion. Consequently, homogen-
tisic acid builds up and is excreted. Garrod’s observations raised the possibility
that the cell’s chemical pathways were under the control of a large set of interact-
ing genes. However, the direct demonstration of this control was provided by the
later work of Beadle and Tatum on the fungus Neurospora.
Their second step was to classify the specific nutritional requirement of each
auxotroph. Some would grow only if proline was supplied, others methionine, oth-
ers pyridoxine, others arginine, and so on. Beadle and Tatum decided to focus on
Arginine and its analogs arginine auxotrophs. They found that the genes that mutated to give arginine auxo-
trophs mapped to three different loci on three separate chromosomes. Let’s call the
NH2 NH2 genes at the three loci the arg-1, arg-2, and arg-3 genes. A key breakthrough was
C"O C " NH
Beadle and Tatum’s discovery that the auxotrophs for each of the three loci differed
in their response to the structurally related compounds ornithine and citrulline
NH2 NH NH (Figure 6-10). The arg-1 mutants grew when supplied with any one of the chemicals
(CH2)3 (CH2)3 (CH2)3 ornithine, citrulline, or arginine. The arg-2 mutants grew when given arginine or
CHNH2 CHNH2 CHNH2
citrulline but not ornithine. The arg-3 mutants grew only when arginine was sup-
plied. These results are summarized in Table 6-1.
COOH COOH COOH Cellular enzymes were already known to interconvert such related compounds.
Ornithine Citrulline Arginine On the basis of the properties of the arg mutants, Beadle and Tatum and their col-
leagues proposed a biochemical pathway for such conversions in Neurospora:
F i g u r e 6 -10 The chemical structures
of arginine and the structurally related enzyme X enzyme Y enzyme Z
compounds citrulline and ornithine. precursor ornithine citrulline arginine
6.2 Interaction of Genes in Pathways 225
This pathway nicely explains the three classes of mutants shown in Table 6-1.
Under the model, the arg-1 mutants have a defective enzyme X, and so they are
unable to convert the precursor into ornithine as the first step in producing argi-
nine. However, they have normal enzymes Y and Z, and so the arg-1 mutants are
able to produce arginine if supplied with either ornithine or citrulline. Similarly,
the arg-2 mutants lack enzyme Y, and the arg-3 mutants lack enzyme Z. Thus, a
mutation at a particular gene is assumed to interfere with the production of a
single enzyme. The defective enzyme creates a block in some biosynthetic path-
way. The block can be circumvented by supplying to the cells any compound that
normally comes after the block in the pathway.
We can now diagram a more complete biochemical model:
Different chromosomes:
a1 1
1 a2
1 a2
However, if the progeny are not wild type, then the recessive mutations must
be alleles of the same gene. Because both alleles of the gene are mutants, there is
no wild-type allele to help distinguish between two different mutant alleles of a
gene whose wild-type allele is a+. These alleles could have different mutant sites
228 CHA P TER 6 Gene Interaction
within the same gene, but they would both be nonfunctional. The heterozygote
a′/a″ would be
a
a
= mutation
How does complementation work at the molecular level? The normal blue
Flowers of the harebell plant (Campanula color of the harebell flower is caused by a blue pigment called anthocyanin. Pig-
species). [ Gregory G. Dimijian/Science ments are chemicals that absorb certain colors of light; in regard to the harebell,
Source.] the anthocyanin absorbs all wavelengths except blue, which is reflected into the
eye of the observer. However, this anthocyanin is made from chemical precursors
that are not pigments; that is, they do not absorb light of any specific wavelength
and simply reflect back the white light of the sun to the observer, giving a white
appearance. The blue pigment is the end product of a series of biochemical con-
versions of nonpigments. Each step is catalyzed by a specific enzyme encoded by
a specific gene. We can explain the results with a pathway as follows:
6.3 Inferring Gene Interactions 229
enzyme 1 enzyme 2
$
w1$/w1$ • w2+/w2+
£
w1£/w1£ • w2+/w2+
¥
w1+/w1+ • w2¥/w2¥
However, in practice, the subscript symbols would be dropped and the geno-
types would be written as follows:
w1/w1 • w2+/w2+
$
£
w1/w1 • w2+/w2+
¥
w1+/w1+ • w2/w2
Hence, an F1 from $ × £ will be
w1/w1 • w2+/w2+
These F1 plants will have two defective alleles for w1 and will therefore be
blocked at step 1. Even though enzyme 2 is fully functional, it has no substrate on
which to act; so no blue pigment will be produced and the phenotype will be
white.
The F1 plants from the other crosses, however, will have the wild-type alleles
for both of the enzymes needed to take the intermediates to the final blue prod-
uct. Their genotypes will be
w1+/w1 • w2+/w2
Nucleus 1 is arg-1 • arg-2+
Nucleus 2 is arg-1+ • arg-2
23 0 CHA P TER 6 Gene Interaction
F i g u r e 6 -12 Three
phenotypically identical white
The molecular basis of genetic complementation
harebell mutants—$, £, and ¥—are
intercrossed. Mutations in the same
Wild type
gene (such as $ and £) cannot
complement because the F1 has one
gene with two mutant alleles. The
pathway is blocked and the flowers
are white. When the mutations are in
different genes (such as £ and ¥),
+ +
there is complementation by the
wild-type alleles of each gene in the
F1 heterozygote. Pigment is + +
synthesized and the flowers are blue. w1 w2
(What would you predict to be the gene gene
result of crossing $ and ¥ ?)
F1
No
complementation Complementation
$ + + ¥
£ + £ +
Enzyme 1 Enzyme 2
No substrate Enzyme 2
Colorless No Colorless Colorless
White Blue
precursor 1 precursor 2 precursor 1 precursor 2
Because gene products are made in a common cytoplasm, the two wild-type
alleles can exert their dominant effect and cooperate to produce a heterokaryon
of wild-type phenotype. In other words, the two mutations complement, just as
they would in a diploid. If the mutations had been alleles of the same gene, there
would have been no complementation.
6.3 Inferring Gene Interactions 231
Fusion
Heterokaryon grows
without arginine
The 9 : 3 : 3 : 1 ratio: no gene interaction As a baseline, let’s start with the case
in which two mutated genes do not interact, a situation where we expect the
9 : 3 : 3 : 1 ratio. Let’s look at the inheritance of skin coloration in corn snakes. The
snake’s natural color is a repeating black-and-orange camouflage pattern, as
shown
Introductionin Figure 6-14a.
to Genetic The 11e
Analysis, phenotype is produced by two separate pigments,
both
Figureof which
06.13 #619are under genetic control. One gene determines the orange pig-
04/29/14
ment, and the alleles that we will consider are o+ (presence of orange pigment)
Dragonfly Media Group
and o (absence of orange pigment). Another gene determines the black pigment,
and its alleles are b+ (presence of black pigment) and b (absence of black pig-
ment). These two genes are unlinked. The natural pattern is produced by the
genotype o+/− ; b+/−. (The dash represents the presence of either allele.) A snake
that is o/o ; b+/− is black because it lacks the orange pigment (Figure 6-14b), and a
snake that is o+/− ; b/b is orange because it lacks the black pigment (Figure 6-14c).
The double homozygous recessive o/o ; b/b is albino (Figure 6-14d). Notice, however,
232 CHA P TER 6 Gene Interaction
that the faint pink color of the albino is from yet another
Independently synthesized
and inherited pigments pigment, the hemoglobin of the blood that is visible
through this snake’s skin when the other pigments are
absent. The albino snake also clearly shows that there is
another element to the skin-pigmentation pattern in
addition to pigment: the repeating motif in and around
which pigment is deposited.
If a homozygous orange and a homozygous black
snake are crossed, the F1 is wild type (camouflaged),
demonstrating complementation:
/ o+/o+ ; b/b × ? o/o ; b+/b+
(orange) (black)
↓
F1 o+/o ; b+/b
(a) (b)
(camouflaged)
r+ a+
(b) r a+
Mutation in
the gene that No protein A
encodes the produced
regulatory Nonfunctional
protein regulatory protein
(c) r+ a
Mutation in
the gene that Mutant protein A
encodes the produced
structural
protein
r a
(d)
Mutation in No protein A
both genes produced
F i g u r e 6 -15 The r + gene encodes a regulatory protein, and the a + gene encodes a
structural protein. Both must be normal for a functional (“active”) structural protein to be
synthesized.
Clearly, in this case, the only way in which a 9 : 7 ratio is possible is if the dou-
ble mutant has the same phenotypes as the two single mutants. Hence, the modi-
fied ratio constitutes a way of identifying the double mutant’s phenotype.
Furthermore, the identical phenotypes of the single and double mutants suggest
that each mutant allele controls a different step in the same pathway. The results
show that a plant will have white petals if it is homozygous for the recessive
mutant allele of either gene or both genes. To have the blue phenotype, a plant
must have at least one copy of the dominant allele of both genes because both are
needed to complete the sequential steps in the pathway. No matter which is
absent, the same pathway fails, producing the same phenotype. Thus, three of the
genotypic classes will produce the same phenotype, and so, overall, only two phe-
notypes result.
Introduction to Genetic Analysis, 11e
Figure 06.15
The #620 in harebells entailed different steps in a synthetic pathway. Similar
example
04/29/14
results can come from gene regulation. A regulatory gene often functions by produc-
Dragonfly Media Group
ing a protein that binds to a regulatory site upstream of a target gene, facilitating the
transcription of the gene (Figure 6-15). In the absence of the regulatory protein, the
target gene would be transcribed at very low levels, inadequate for cellular needs.
Let’s cross a pure line r/r defective for the regulatory protein to a pure line a/a defec-
tive for the target protein. The cross is r/r ; a+/a+ × r+/r+ ; a/a. The r+/r ; a+/a dihy-
brid will show complementation between the mutant genotypes because both r+ and
23 4 CHA P TER 6 Gene Interaction
a+ are present, permitting normal transcription of the wild-type allele. When selfed,
the F1 dihybrid will also result in a 9 : 7 phenotypic ratio in the F2:
Functional a+
Proportion Genotype protein Ratio
9
16 r+/- ; a+/- Yes 9
3
16
r+/- ; a/a No
3
16
r/r ; a+/- No 7
u
1
16
r/r ; a/a No
4 1 w/w ; m/m (white)
1
16
w/w ; m/m Blocked at first enzyme
In the F2, the 9 : 3 : 4 phenotypic ratio is diagnostic of
recessive epistasis. As in the preceding case, we see, again,
that the ratio tells us what the phenotype of the double must
be, because the 164 component of the ratio must be a grouping
F i g u r e 6 -16 Wild-type alleles of two genes (w + and m +)
of one single mutant class ( 163 ) plus the double mutant class
encode enzymes catalyzing successive steps in the synthesis of a
blue petal pigment. Homozygous m/m plants produce magenta
( 161 ). Hence, the double mutant expresses only one of the two
flowers, and homozygous w/w plants produce white flowers. The mutant phenotypes; so, by definition, white must be epistatic
double mutant w/w ; m/m also produces white flowers, indicating to magenta. (To find the double mutant within the group,
that white is epistatic to magenta. white F2 plants would have to be individually testcrossed.)
6.3 Inferring Gene Interactions 23 5
Notice that the epistatic mutation occurs in a step in the pathway leading to blue
pigment; this step is upstream of the step that is blocked by the masked mutation.
Another informative case of recessive epistasis is the yellow coat color of some
Labrador retriever dogs. Two alleles, B and b, stand for black and brown coats,
respectively. The two alleles produce black and brown melanin. The allele e of
another gene is epistatic on these alleles, giving a yellow coat (Figure 6-17). There-
fore, the genotypes B/− ; e/e and b/b ; e/e both produce a yellow phenotype, whereas
B/− ; E/− and b/b ; E/− are black and brown, respectively. This case of epistasis is
not caused by an upstream block in a pathway leading to dark pigment. Yellow dogs
can make black or brown pigment, as can be seen in their noses and lips. The action
of the allele e is to prevent the deposition of the pigment in hairs. In this case, the
epistatic gene is developmentally downstream; it represents a kind of developmental
target that must be of E genotype before pigment can be deposited.
Hence, the double mutant must be the non-wild-type genotype and can be
assessed accordingly. If the phenotype is the a phenotype, then b is being overrid-
den; if the phenotype is the b phenotype, then a is being overridden. If both phe-
notypes are present, then there is no epistasis.
The ratio tells us that the dominant allele W is epistatic, producing the 12 : 3 : 1
9
ratio. The 12
16 component of the ratio must include the double mutant class ( 16 ),
which is clearly white in phenotype, establishing the epistasis of the dominant
allele W. The two genes act in a common developmental pathway: W prevents the
synthesis of red pigment but only in a special class of cells constituting the main
area of the petal; synthesis is allowed in the throat spots. When synthesis is
allowed, the pigment can be produced in either high or low concentrations.
F2 9 pd+/- ; su+/- red
3 pd+/- ; su/su red u 13
1 pd/pd ; su/su red
3 pd/pd ; su+/- purple 3
Progeny Phenotype
a+ • b+ wild type
a+ • b defective (low transcription)
a • b+ defective (defective protein A)
a • b extremely defective (low transcription
of defective protein)
Hence, the action of the modifier is seen in the appearance of two grades of
mutant phenotypes within the a progeny.
6.4 Penetrance and Expressivity 23 9
A summary of some of the ratios that reveal gene interac- F i g u r e 6 -2 0 Two interacting proteins
tion is shown in Table 6-2. perform some essential function on some
substrate such as DNA but must first bind
to it. Reduced binding of either protein
Table 6-2 Some Modified F2 Ratios allows some functions to remain, but
reduced binding of both is lethal.
9 : 3 : 3 : 1 No interaction
9 : 7 Genes in same pathway
9 : 3 : 4 Recessive epistasis
12 : 3 : 1 Dominant epistasis
13 : 3 Suppressor has no phenotype
14 : 2 Suppressor is like mutant
Note: Some of these ratios can be produced with other mechanisms of interaction.
1. The influence of the environment. Individuals with the same genotype may show
Inferring incomplete
penetrance a range of phenotypes, depending on the environment. The range of phenotypes
for mutant and wild-type individuals may overlap: the phenotype of a mutant indi-
vidual raised in one set of circumstances may match the phenotype of a wild-type
individual raised in a different set of circumstances. Should this matching happen,
the mutant cannot be distinguished from the wild type.
Q
2. The influence of other interacting genes. Uncharacterized modifiers, epistatic
genes, or suppressors in the rest of the genome may act to prevent the expression
R of the typical phenotype.
F i g u r e 6 -2 1 In this human pedigree of 3. The subtlety of the mutant phenotype. The subtle effects brought about by the
a dominant allele that is not fully penetrant, absence of a gene function may be difficult to measure in a laboratory situation.
person Q does not display the phenotype
but passed the dominant allele to at least
A typical encounter with incomplete penetrance is shown in Figure 6-21. In
two progeny. Because the allele is not fully
penetrant, the other progeny (for example,
this human pedigree, we see a normally dominantly inherited phenotype disap-
R) may or may not have inherited the pearing in the second generation only to reappear in the next.
dominant allele. Another measure for describing the range of phenotypic expression is called
expressivity. Expressivity measures the degree to which a given allele is expressed
at the phenotypic level; that is, expressivity measures the intensity of the pheno-
type. For example, “brown” animals (genotype b/b) from different stocks might
show very different intensities of brown pigment from light to dark. As for pene-
trance, variable expressivity may be due to variation in the allelic constitution of
the rest of the genome or to environmental factors. Figure 6-22 illustrates the
distinction between penetrance and expressivity. An example of variable expres-
sivity in dogs is found in Figure 6-23.
The phenomena of incomplete penetrance and variable expressivity can make
any kind of genetic analysis substantially more difficult, including human pedi-
gree analysis and predictions in genetic counseling. For example, it is often the
case that a disease-causing allele is not fully penetrant. Thus, someone could have
the allele but not show any signs of the disease. If that is the case, it is difficult to
give a clean genetic bill of health to any person in a disease pedigree (for example,
person R in Figure 6-21). On the other hand, pedigree analysis can sometimes
identify persons who do not express but almost certainly do have a disease geno-
type (for example, individual Q in Figure 6-21). Similarly, variable expressivity
Phenotypic expression
(each oval represents an individual)
1 2 3 4
5 6 7 8
9 10
s u m m a ry
A gene does not act alone; rather, it acts in concert with Genetic dissection of gene interactions begins by the experi-
many other genes in the genome. In forward genetic analy- menter amassing mutants affecting a character of interest.
sis, deducing these complex interactions is an important The complementation test determines whether two distinct
stage of the research. Individual mutations are first tested recessive mutations are of one gene or of two different
for their dominance relations, a type of allelic interaction. genes. The mutant genotypes are brought together in an F1
Recessive mutations are often a result of haplosufficiency individual, and if the phenotype is mutant, then no comple-
of the wild-type allele, whereas dominant mutations are mentation has occurred and the two alleles must be of the
often the result either of haploinsufficiency of the wild type same gene. If the phenotype is wild type, then complemen-
or of the mutant acting as a dominant negative (a rogue tation has occurred, and the alleles must be of different
polypeptide). Some mutations cause severe effects or even genes.
death (lethal mutations). Lethality of a homozygous reces- The interaction of different genes can be detected by test-
sive mutation is a way to assess if a gene is essential in the ing double mutants because allele interaction implies interac-
genome. tion of gene products at the functional level. Some key types
The interaction of different genes is a result of their par- of interaction are epistasis, suppression, and synthetic lethal-
ticipation in the same or connecting pathways of various ity. Epistasis is the replacement of a mutant phenotype pro-
kinds—synthetic, signal transduction, or developmental. duced by one mutation with a mutant phenotype produced
242 CHA P TER 6 Gene Interaction
by mutation of another gene. The observation of epistasis sug- The different types of gene interactions produce F2 dihy-
gests a common developmental or chemical pathway. A sup- brid ratios that are modifications of the standard 9 : 3 : 3 : 1. For
pressor is a mutation of one gene that can restore wild-type example, recessive epistasis results in a 9 : 3 : 4 ratio.
phenotype to a mutation at another gene. Suppressors often In more general terms, gene interaction and gene-envi-
reveal physically interacting proteins or nucleic acids. Some ronment interaction are revealed by variable penetrance
combinations of viable mutants are lethal, a result known as (the ability of a genotype to express itself in the phenotype)
synthetic lethality. Synthetic lethals can reveal a variety of and expressivity (the quantitative degree of phenotypic
interactions, depending on the nature of the mutations. manifestation of a genotype).
key terms
allelic series (multiple alleles) full (complete) dominance (p. 216) penetrance (p. 239)
(p. 216) functional RNA (p. 225) permissive temperature (p. 223)
codominance (p. 219) heterokaryon (p. 229) pleiotropic allele (p. 222)
complementation (p. 228) incomplete dominance (p. 219) restrictive temperature (p. 223)
complementation test (p. 227) lethal allele (p. 220) revertant (p. 236)
dominant negative mutation (p. 218) modifier (p. 238) suppressor (p. 236)
double mutants (p. 227) multiple alleles (p. 216) synthetic lethal (p. 239)
epistasis (p. 234) null mutation (p. 218) temperature-sensitive (ts) mutations
essential gene (p. 221) one-gene–one-polypeptide hypothesis (p. 223)
expressivity (p. 240) (p. 225)
s olv e d p r obl e m s
SOLVED PROBLEM 1. Most pedigrees show polydactyly (see a. What irregularity does this pedigree show?
Figure 2-25) inherited as a rare autosomal dominant, but the b. What genetic phenomenon does this pedigree illustrate?
pedigrees of some families do not fully conform to the pat- c. Suggest a specific gene-interaction mechanism that could
terns expected for such inheritance. Such a pedigree is shown produce such a pedigree, showing genotypes of pertinent
here. (The unshaded diamonds stand for the specified num- family members.
ber of unaffected persons of unknown sex.)
I
1 2
II
1 2 3 4 5 6 7 8 9 10 11
III 4
5 6 7 8 9 10 11 12 13 14 15 16 17
IV 4
5 6 7 8 9
Solution
a. The normal expectation for an autosomal dominant is for expectation is not seen in this pedigree, which constitutes the
each affected individual to have an affected parent, but this irregularity. What are some possible explanations?
Solved Problems 24 3
Could some cases of polydactyly be caused by a different We cannot rule out the possibilities that II-2 and II-4 have
gene, one that is an X-linked dominant gene? This suggestion the genotype P/p • S/s and that by chance none of their
is not useful, because we still have to explain the absence of descendants are affected.
the condition in persons II-6 and II-10. Furthermore, postu-
lating recessive inheritance, whether autosomal or sex- SOLVED PROBLEM 2. Beetles of a certain species may have
linked, requires many people in the pedigree to be heterozy- green, blue, or turquoise wing covers. Virgin beetles were
gotes, which is inappropriate because polydactyly is a rare selected from a polymorphic laboratory population and
condition. mated to determine the inheritance of wing-cover color. The
b. Thus, we are left with the conclusion that polydactyly crosses and results were as given in the following table:
must sometimes be incompletely penetrant. As described in
this chapter, some individuals who have the genotype for a Cross Parents Progeny
particular phenotype do not express it. In this pedigree, II-6 1 blue × green all blue
and II-10 seem to belong in this category; they must carry the
polydactyly gene inherited from I-1 because they transmit it 2 blue × blue 3
4 blue : 41 turquoise
to their progeny. 3 green × green 3
4 green : 41 turquoise
c. As discussed in this chapter, environmental suppression of 4 blue × turquoise 1
blue : 21 turquoise
2
gene expression can cause incomplete penetrance, as can sup-
pression by another gene. To give the requested genetic expla- 5 blue × blue 3
4 blue : 41 green
nation, we must come up with a genetic hypothesis. What do 6 blue × green 1
2 blue : 21 green
we need to explain? The key is that I-1 passes the mutation on 7 blue × green 1
2 blue : 41 green
to two types of progeny, represented by II-1, who expresses the 1
turquoise
4
mutant phenotype, and by II-6 and II-10, who do not. (From
8 turquoise × turquoise all turquoise
the pedigree, we cannot tell whether the other children of I-1
have the mutant allele.) Is genetic suppression at work? I-1
does not have a suppressor allele because he expresses poly-
dactyly. So the only person from whom a suppressor could a. Deduce the genetic basis of wing-cover color in this
come is I-2. Furthermore, I-2 must be heterozygous for the species.
suppressor allele because at least one of her children does b. Write the genotypes of all parents and progeny as com-
express polydactyly. Therefore, the suppressor allele must be pletely as possible.
dominant. We have thus formulated the hypothesis that the
mating in generation I must have been Solution
a. These data seem complex at first, but the inheritance
(I-1) P/p • s/s × (I-2) p/p • S/s pattern becomes clear if we consider the crosses one at a
time. A general principle of solving such problems, as we
where S is the suppressor and P is the allele responsible for
have seen, is to begin by looking over all the crosses and by
polydactyly. From this hypothesis, we predict that the prog-
grouping the data to bring out the patterns.
eny will comprise the following four types if the genes assort:
One clue that emerges from an overview of the data is that
all the ratios are one-gene ratios: there is no evidence of two
Genotype Phenotype Example separate genes taking part at all. How can such variation be
P/p • S/s normal (suppressed) II-6, II-10 explained with a single gene? The answer is that there is varia-
tion for the single gene itself—that is, multiple allelism.
P/p s/s polydactylous
• II-1
Perhaps there are three alleles of one gene; let’s call the gene
p/p • S/s normal w (for wing-cover color) and represent the alleles as w g, w b,
p/p • s/s normal and w t. Now we have an additional problem, which is to deter-
mine the dominance of these alleles.
Cross 1 tells us something about dominance because all of
If S is rare, the progeny of II-6 and II-10 are: the progeny of a blue × green cross are blue; hence, blue ap-
Progeny genotype Example pears to be dominant over green. This conclusion is support-
ed by cross 5, because the green determinant must have been
P/p · S/s III-13 present in the parental stock to appear in the progeny. Cross
P/p · s/s III-8 3 informs us about the turquoise determinants, which must
p/p · S/s have been present, although unexpressed, in the parental
stock because there are turquoise wing covers in the progeny.
p/p · s/s
So green must be dominant over turquoise. Hence, we have
24 4 CHA P TER 6 Gene Interaction
formed a model in which the dominance is w b > w g > w t. ratio. How do we know this ratio? Well, there are simply not
Indeed, the inferred position of the w t allele at the bottom of that many complex ratios in genetics, and trial and error
the dominance series is supported by the results of cross 7, brings us to the 12 : 3 : 1 quite quickly. In the 128 progeny
where turquoise shows up in the progeny of a blue × green total, the numbers of 96 : 24 : 8 are expected, but the actual
cross. numbers fit these expectations remarkably well.
b. Now it is just a matter of deducing the specific geno- One of the principles of this chapter is that modified
types. Notice that the question states that the parents were Mendelian ratios reveal gene interactions. Cross 3 gives F2
taken from a polymorphic population, which means that numbers appropriate for a modified dihybrid Mendelian ra-
they could be either homozygous or heterozygous. A parent tio, and so it looks as if we are dealing with a two-gene interac-
with blue wing covers, for example, might be homozygous tion. It seems the most promising place to start; we can return
(w b/w b) or heterozygous (w b/w g or w b/w t). Here, a little trial to crosses 1 and 2 and try to fit them in later.
and error and common sense are called for, but, by this Any dihybrid ratio is based on the phenotypic proportions
stage, the question has essentially been answered, and all 9 : 3 : 3 : 1. Our observed modification groups them as follows:
that remains is to “cross the t’s and dot the i’s.” The follow-
ing genotypes explain the results. A dash indicates that the 9 A/- ; B/-
u 12 piping
genotype may be either homozygous or heterozygous in 3 A/- ; b/b
having a second allele farther down the allelic series. 3 a/a ; B/- 3 spiny tip
Cross Parents Progeny 1 a/a ; b/b 1 spiny
1 wb/wb × wg/- wb/wg or wb/- So, without worrying about the name of the type of gene
3
2 wb/wt × wb/wt 4 wb/- : 41 wt/wt interaction (we are not asked to supply this anyway), we can
3 wg/wt × wg/wt 3
4 wg/- : 41 wt/wt already define our three pineapple-leaf phenotypes in rela-
4 wb/wt × wt/wt 1
wb/wt : 21 wt/wt tion to the proposed allelic pairs A/a and B/b:
2
3
5 wb/wg × wb/wg 4 wb/- : 41 wg/wg piping = A/- (B/b irrelevant)
6 wb/wg × wg/wg 2 w /w : 2 w /w
1 b g 1 g g
spiny tip = a/a ; B/-
7 wb/wt × wg/wt 2 w /- : 4 w /w : 4 w /w
1 b 1 g t 1 t t
spiny = a/a ; b/b
8 wt/wt × wt/wt all wt/wt
What about the parents of cross 3? The spiny parent must
be a/a ; b/b, and, because the B gene is needed to produce F2
SOLVED PROBLEM 3. The leaves of pineapples can be classi- spiny-tip leaves, the piping parent must be A/A ; B/B. (Note
fied into three types: spiny (S), spiny tip (ST), and piping that we are told that all parents are pure, or homozygous.)
(nonspiny; P). In crosses between pure strains followed by The F1 must therefore be A/a ; B/b.
intercrosses of the F1, the following results appeared: Without further thought, we can write out cross 1 as
Phenotypes follows:
Cross Parental F1 F2
a/a ; B/B a/a ; b/b
1 ST × S ST 99 ST : 34 S 3
a/a ; B/–
4
2 P × ST P 120 P : 39 ST a/a ; B/b
3 P × S P 95 P : 25 ST : 8 S 1
a/a ; b/b
4
a. Assign gene symbols. Explain these results in regard to Cross 2 can be partly written out without further thought by
the genotypes produced and their ratios. using our arbitrary gene symbols:
b. Using the model from part a, give the phenotypic ratios
that you would expect if you crossed (1) the F1 progeny A/A ; –/– a/a ; B/B
from piping × spiny with the spiny parental stock and (2) 3
A/– ; –/–
4
the F1 progeny of piping × spiny with the F1 progeny of A/a ; B/–
spiny × spiny tip. 1
a/a ; B/–
4
p r obl e m s
Most of the problems are also available for review/grading through the http://www.whfreeman.com/
launchpad/iga11e.
Working with the Figures c. in the situation in panel b, would protein from the
1. In Figure 6-1, active protein gene be made?
a. what do the yellow stars represent? 9. In Figure 6-16, if you selfed 10 different F2 pink plants,
b. explain in your own words why the heterozygote is would you expect to find any white-flowered plants
functionally wild type. among the offspring? Any blue-flowered plants?
2. In Figure 6-2, explain how the mutant polypeptide acts 10. In Figure 6-19,
as a spoiler and what its net effect on phenotype is. a. what do the square/triangular pegs and holes
represent?
3. In Figure 6-6, assess the allele V f with respect to the V by
allele: is it dominant? recessive? codominant? incom- b. is the suppressor mutation alone wild type in
pletely dominant? phenotype?
4. In Figure 6-11, 11. In Figure 6-21, propose a specific genetic explanation
for individual Q (give a possible genotype, defining the
a. in view of the position of HPA oxidase earlier in the
alleles).
pathway compared to that of HA oxidase, would you
expect people with tyrosinosis to show symptoms of
alkaptonuria? B a s i c P r obl e m s
b. if a double mutant could be found, would you expect 12. In humans, the disease galactosemia causes mental re-
tyrosinosis to be epistatic to alkaptonuria? tardation at an early age. Lactose (milk sugar) is broken
down to galactose plus glucose. Normally, galactose is
5. In Figure 6-12,
broken down further by the enzyme galactose-1-phos-
a. what do the dollar, pound, and yen symbols represent? phate uridyltransferase (GALT). However, in patients
b. why can’t the left-hand F1 heterozygote synthesize with galactosemia, GALT is inactive, leading to a buildup
blue pigment? of high levels of galactose, which, in the brain, causes
6. In Figure 6-13, explain at the protein level why this het- mental retardation. How would you provide a secondary
erokaryon can grow on minimal medium. cure for galactosemia? Would you expect this disease
phenotype to be dominant or recessive?
7. In Figure 6-14, write possible genotypes for each of the
four snakes illustrated. 13. In humans, PKU (phenylketonuria) is a recessive disease
caused by an enzyme inefficiency at step A in the follow-
8. In Figure 6-15,
ing simplified reaction sequence, and AKU (alkapton-
a. which panel represents the double mutant? uria) is another recessive disease due to an enzyme inef-
b. state the function of the regulatory gene. ficiency in one of the steps summarized as step B here:
246 CHA P TER 6 Gene Interaction
22. Black, sepia, cream, and albino are coat colors of guinea popular with buyers of fox coats, but the breeders could
pigs. Individual animals (not necessarily from pure not develop a pure-breeding platinum strain. Every time
lines) showing these colors were intercrossed; the re- two platinums were crossed, some normal foxes ap-
sults are tabulated as follows, where the abbreviations A peared in the progeny. For example, the repeated mat-
(albino), B (black), C (cream), and S (sepia) represent ings of the same pair of platinums produced 82 platinum
the phenotypes: and 38 normal progeny. All other such matings gave
Phenotypes of progeny similar progeny ratios. State a concise genetic hypothesis
Parental that accounts for these results.
Cross phenotypes B S C A
26. For several years, Hans Nachtsheim investigated an in-
1 B × B 22 0 0 7 herited anomaly of the white blood cells of rabbits. This
2 B × A 10 9 0 0 anomaly, termed the Pelger anomaly, is the arrest of
3 C × C 0 0 34 11 the segmentation of the nuclei of certain white cells. This
4 S × C 0 24 11 12 anomaly does not appear to seriously burden the rabbits.
5 B × A 13 0 12 0
6 B × C 19 20 0 0
a. When rabbits showing the Pelger anomaly were mated
7 B × S 18 20 0 0 with rabbits from a true-breeding normal stock,
8 B × S 14 8 6 0 Nachtsheim counted 217 offspring showing the Pelger
9 S × S 0 26 9 0 anomaly and 237 normal progeny. What is the genetic
10 C × A 0 0 15 17 basis of the Pelger anomaly?
b. When rabbits with the Pelger anomaly were mated
a. Deduce the inheritance of these coat colors, and use with each other, Nachtsheim found 223 normal progeny,
gene symbols of your own choosing. Show all parent and 439 with the Pelger anomaly, and 39 extremely abnormal
progeny genotypes. progeny. These very abnormal progeny not only had
defective white blood cells, but also showed severe
b. If the black animals in crosses 7 and 8 are crossed,
deformities of the skeletal system; almost all of them
what progeny proportions can you predict by using your
died soon after birth. In genetic terms, what do you
model?
suppose these extremely defective rabbits represented?
23. In a maternity ward, four babies become accidentally Why were there only 39 of them?
mixed up. The ABO types of the four babies are known to
c. What additional experimental evidence might you
be O, A, B, and AB. The ABO types of the four sets of
collect to test your hypothesis in part b?
parents are determined. Indicate which baby belongs to
each set of parents: (a) AB × O, (b) A × O, (c) A × AB, d. In Berlin, about 1 human in 1000 shows a Pelger
(d) O × O. anomaly of white blood cells very similar to that
24. Consider two blood polymorphisms that humans have in described for rabbits. The anomaly is inherited as a
addition to the ABO system. Two alleles LM and LN deter- simple dominant, but the homozygous type has not been
mine the M, N, and MN blood groups. The dominant al- observed in humans. Based on the condition in rabbits,
lele R of a different gene causes a person to have the Rh+ why do you suppose the human homozygous has not
(rhesus positive) phenotype, whereas the homozygote for been observed?
r is Rh− (rhesus negative). Two men took a paternity dis- e. Again by analogy with rabbits, what phenotypes and
pute to court, each claiming three children to be his own. genotypes would you expect among the children of a
The blood groups of the men, the children, and their man and woman who both show the Pelger anomaly?
mother were as follows: (Data from A. M. Srb, R. D. Owen, and R. S. Edgar,
Person Blood group General Genetics, 2nd ed. W. H. Freeman and Company,
1965.)
husband O M Rh+
wife’s lover AB MN Rh- 27. Two normal-looking fruit flies were crossed, and, in the
wife A N Rh+ progeny, there were 202 females and 98 males.
child 1 O MN Rh+
a. What is unusual about this result?
child 2 A N Rh+
child 3 A MN Rh- b. Provide a genetic explanation for this anomaly.
c. Provide a test of your hypothesis.
From this evidence, can the paternity of the children be 28.
You have been given a virgin Drosophila female. You
established? notice that the bristles on her thorax are much shorter
25. On a fox ranch in Wisconsin, a mutation arose that gave than normal. You mate her with a normal male (with
a “platinum” coat color. The platinum color proved very long bristles) and obtain the following F1 progeny:
248 CHA P TER 6 Gene Interaction
1
3 short-bristled females, 13 long-bristled females, and 32. In roses, the synthesis of red pigment is by two steps in
1
3 long-bristled males. A cross of the F1 long-bristled fe- a pathway, as follows:
males with their brothers gives only long-bristled F2. gene P
A cross of short-bristled females with their brothers gives colorless intermediate
gene Q
3 short-bristled females, 3 long-bristled females, and
1 1
magenta intermediate red pigment
3 long-bristled males. Provide a genetic hypothesis to
1
account for all these results, showing genotypes in every a. What would the phenotype be of a plant homozygous
cross. for a null mutation of gene P?
29. A dominant allele H reduces the number of body bristles b. What would the phenotype be of a plant homozygous
that Drosophila flies have, giving rise to a “hairless” phe- for a null mutation of gene Q?
notype. In the homozygous condition, H is lethal. An in- c. What would the phenotype be of a plant homozygous
dependently assorting dominant allele S has no effect on for null mutations of genes P and Q?
bristle number except in the presence of H, in which case
a single dose of S suppresses the hairless phenotype, thus d. Write the genotypes of the three strains in parts a, b,
restoring the hairy phenotype. However, S also is lethal and c.
in the homozygous (S/S) condition. e. What F2 ratio is expected from crossing plants from
a. What ratio of hairy to hairless flies would you find in parts a and b? (Assume independent assortment.)
the live progeny of a cross between two hairy flies both 33. Because snapdragons (Antirrhinum) possess the pig-
carrying H in the suppressed condition? ment anthocyanin, they have reddish purple petals.
b. When the hairless progeny are backcrossed with a Two pure anthocyaninless lines of Antirrhinum were
parental hairy fly, what phenotypic ratio would you developed, one in California and one in Holland. They
expect to find among their live progeny? looked identical in having no red pigment at all, mani-
fested as white (albino) flowers. However, when petals
30. After irradiating wild-type cells of Neurospora (a hap- from the two lines were ground up together in buffer in
loid fungus), a geneticist finds two leucine-requiring the same test tube, the solution, which appeared color-
auxotrophic mutants. He combines the two mutants in less at first, gradually turned red.
a heterokaryon and discovers that the heterokaryon is
prototrophic. a. What control experiments should an investigator
conduct before proceeding with further analysis?
a. Were the mutations in the two auxotrophs in the same
gene in the pathway for synthesizing leucine or in two b. What could account for the production of the red
different genes in that pathway? Explain. color in the test tube?
b. Write the genotype of the two strains according to c. According to your explanation for part b, what would
your model. be the genotypes of the two lines?
c. What progeny and in what proportions would you d. If the two white lines were crossed, what would you
predict from crossing the two auxotrophic mutants? predict the phenotypes of the F1 and F2 to be?
(Assume independent assortment.) 34. The frizzle fowl is much admired by poultry fanciers. It
31. A yeast geneticist irradiates haploid cells of a strain that gets its name from the unusual way that its feathers
is an adenine-requiring auxotrophic mutant, caused by curl up, giving the impression that it has been (in the
mutation of the gene ade1. Millions of the irradiated memorable words of animal geneticist F. B. Hutt)
cells are plated on minimal medium, and a small num- “pulled backwards through a knothole.” Unfortunately,
ber of cells divide and produce prototrophic colonies. frizzle fowl do not breed true: when two frizzles are in-
These colonies are crossed individually with a wild- tercrossed, they always produce 50 percent frizzles,
type strain. Two types of results are obtained: 25 percent normal, and 25 percent with peculiar woolly
feathers that soon fall out, leaving the birds naked.
(1) prototroph × wild type : progeny all prototrophic
a. Give a genetic explanation for these results, showing
(2) prototroph × wild type : progeny 75% prototrophic,
genotypes of all phenotypes, and provide a statement of
25% adenine-requiring auxotrophs
how your explanation works.
a. Explain the difference between these two types of b. If you wanted to mass-produce frizzle fowl for sale,
results. which types would be best to use as a breeding pair?
b. Write the genotypes of the prototrophs in each case. 35. The petals of the plant Collinsia parviflora are normally
c. What progeny phenotypes and ratios do you predict blue, giving the species its common name, blue-eyed
from crossing a prototroph of type 2 by the original ade1 Mary. Two pure-breeding lines were obtained from col-
auxotroph? or variants found in nature; the first line had pink pet-
Problems 249
als, and the second line had white petals. The following 22. Draw chromosomes representing the meioses in the
crosses were made between pure lines, with the results parents in the cross blue × white and representing mei-
shown: osis in the F1.
Parents F1 F2 23. Repeat step 22 for the cross blue × pink.
9. In what way do the first two rows in the table differ white 50%
from the third row? solid purple 25%
10. Which phenotypes are dominant? spotted purple 25%
11. What is complementation?
What were the genotypes of the F2 plants crossed?
12. Where does the blueness come from in the progeny of
the pink × white cross? 38. Most flour beetles are black, but several color variants
are known. Crosses of pure-breeding parents produced
13. What genetic phenomenon does the production of a
the following results (see table) in the F1 generation, and
blue F1 from pink and white parents represent?
intercrossing the F1 from each cross gave the ratios
14. List any ratios that you can see. shown for the F2 generation. The phenotypes are abbre-
15. Are there any monohybrid ratios? viated Bl, black; Br, brown; Y, yellow; and W, white.
16. Are there any dihybrid ratios?
Cross Parents F1 F2
17. What does observing monohybrid and dihybrid ratios
tell you? 1 Br × Y Br 3 Br : 1 Y
2 Bl × Br Bl 3 Bl : 1 Br
18. List four modified Mendelian ratios that you can think of.
3 Bl × Y Bl 3 Bl : 1 Y
19. Are there any modified Mendelian ratios in the problem? 4 W × Y Bl 9 Bl : 3 Y : 4 W
20. What do modified Mendelian ratios indicate generally? 5 W × Br Bl 9 Bl : 3 Br : 4 W
21. What is indicated by the specific modified ratio or ra- 6 Bl × W Bl 9 Bl : 3 Y : 4 W
tios in this problem?
250 CHA P TER 6 Gene Interaction
a. From these results, deduce and explain the inheritance type mice are crossed with cinnamons, all of the F1 are
of these colors. wild type and the F2 has a 3 : 1 ratio of wild type to
b. Write the genotypes of each of the parents, the F1, and cinnamon. Diagram this cross as in part a, letting B stand
the F2 in all crosses. for the wild-type black allele and b stand for the
cinnamon brown allele.
39. Two albinos marry and have four normal children. How
is this possible? c. When mice of a true-breeding cinnamon line are
crossed with mice of a true-breeding nonagouti (black)
40. Consider the production of flower color in the Japanese
line, all of the F1 are wild type. Use a genetic diagram to
morning glory (Pharbitis nil ). Dominant alleles of either
explain this result.
of two separate genes (A/− • b/b or a/a • B/−) produce
purple petals. A/− • B/− produces blue petals, and d. In the F2 of the cross in part c, a fourth color called
a/a • b/b produces scarlet petals. Deduce the genotypes chocolate appears in addition to the parental cinnamon
of parents and progeny in the following crosses: and nonagouti and the wild type of the F1. Chocolate
mice have a solid, rich brown color. What is the genetic
Cross Parents Progeny constitution of the chocolates?
1 blue × scarlet 1
4 blue : 21 purple : 41 scarlet e. Assuming that the A/a and B/b allelic pairs assort
2 purple × purple 1
blue : 21 purple : 41 scarlet independently of each other, what do you expect to be
4
the relative frequencies of the four color types in the F2
3 blue × blue 3
4
blue : 41 purple described in part d ? Diagram the cross of parts c and d,
4
4 blue × purple 3
8 blue : 8 purple : 81 scarlet showing phenotypes and genotypes (including gametes).
1
5 purple × scarlet 2 purple : 21 scarlet f. What phenotypes would be observed in what
proportions in the progeny of a backcross of F1 mice
41. Corn breeders obtained pure lines whose kernels turn from part c with the cinnamon parental stock? With
sun red, pink, scarlet, or orange when exposed to sun- the nonagouti (black) parental stock? Diagram these
light (normal kernels remain yellow in sunlight). Some backcrosses.
crosses between these lines produced the following re- g. Diagram a testcross for the F1 of part c. What colors
sults. The phenotypes are abbreviated O, orange; P, pink; would result and in what proportions?
Sc, scarlet; and SR, sun red. h. Albino (pink-eyed white) mice are homozygous for
Phenotypes the recessive member of an allelic pair C/c, which assorts
independently of the A/a and B/b pairs. Suppose that
Cross Parents F1 F2 you have four different highly inbred (and therefore
1 SR × P all SR 66 SR : 20 P presumably homozygous) albino lines. You cross each of
2 O × SR all SR 998 SR : 314 O these lines with a true-breeding wild-type line, and you
3 O × P all O 1300 O : 429 P raise a large F2 progeny from each cross. What genotypes
4 O × Sc all Y 182 Y : 80 O : 58 Sc for the albino lines can you deduce from the following F2
phenotypes?
Analyze the results of each cross, and provide a unifying
hypothesis to account for all the results. (Explain all sym- Phenotypes of progeny
bols that you use.) F2 of Wild Cinna- Choco-
42. Many kinds of wild animals have the agouti coloring pat- line type Black mon late Albino
tern, in which each hair has a yellow band around it. 1 87 0 32 0 39
a. Black mice and other black animals do not have the 2 62 0 0 0 18
yellow band; each of their hairs is all black. This absence 3 96 30 0 0 41
of wild agouti pattern is called nonagouti. When mice of 4 287 86 92 29 164
a true-breeding agouti line are crossed with nonagoutis,
the F1 is all agouti and the F2 has a 3 : 1 ratio of agoutis to (Adapted from A. M. Srb, R. D. Owen, and R. S. Edgar,
nonagoutis. Diagram this cross, letting A represent the General Genetics, 2nd ed. W. H. Freeman and Company,
allele responsible for the agouti phenotype and a, 1965.)
nonagouti. Show the phenotypes and genotypes of the 43. An allele A that is not lethal when homozygous causes
parents, their gametes, the F1, their gametes, and the F2. rats to have yellow coats. The allele R of a separate gene
b. Another inherited color deviation in mice substitutes that assorts independently produces a black coat.
brown for the black color in the wild-type hair. Such Together, A and R produce a grayish coat, whereas a and
brown-agouti mice are called cinnamons. When wild- r produce a white coat. A gray male is crossed with a yel-
Problems 251
1
low female, and the F1 is 83 yellow, 83 gray, 8 black, and
1
8
An individual of genotype td grows only when its medi-
white. Determine the genotypes of the parents. um supplies tryptophan. The allele su assorts indepen-
dently of td; its only known effect is to suppress the td
44. The genotype r/r ; p/p gives fowl a single comb, R/− ; P/− phenotype. Therefore, strains carrying both td and su do
gives a walnut comb, r/r ; P/− gives a pea comb, and not require tryptophan for growth.
R/− ; p/p gives a rose comb (see the illustrations).
Assume independent assortment. a. If a td ; su strain is crossed with a genotypically wild-
type strain, what genotypes are expected in the progeny
and in what proportions?
b. What will be the ratio of tryptophan-dependent to
tryptophan-independent progeny in the cross of part a?
48. Mice of the genotypes A/A ; B/B ; C/C ; D/D ; S/S and
a/a ; b/b ; c/c ; d/d ; s/s are crossed. The progeny are inter-
Single Walnut Pea Rose
crossed. What phenotypes will be produced in the F2 and
in what proportions? [The allele symbols stand for the
a. What comb types will appear in the F1 and in the F2 following: A = agouti, a = solid (nonagouti); B = black
and in what proportions if single-combed birds are pigment, b = brown; C = pigmented, c = albino; D =
crossed with birds of a true-breeding walnut strain? nondilution, d = dilution (milky color); S = unspotted,
b. What are the genotypes of the parents in a walnut × s = pigmented spots on white background.]
rose mating from which the progeny are 83 rose, 83 walnut, 49. Consider the genotypes of two lines of chickens: the
1 1
8 pea, and 8 single? pure-line mottled Honduran is i/i ; D/D ; M/M ; W/W, and
c. What are the genotypes of the parents in a walnut × the pure-line leghorn is I/I ; d/d ; m/m ; w/w, where
rose mating from which all the progeny are walnut?
d. How many genotypes produce a walnut phenotype? I = white feathers, i = colored feathers
Write them out. D = duplex comb, d = simplex comb
45. The production of eye-color pigment in Drosophila re- M = bearded, m = beardless
quires the dominant allele A. The dominant allele P of a
W = white skin, w = yellow skin
second independent gene turns the pigment to purple,
but its recessive allele leaves it red. A fly producing no These four genes assort independently. Starting with
pigment has white eyes. Two pure lines were crossed these two pure lines, what is the fastest and most
with the following results: convenient way of generating a pure line that has colored
feathers, has a simplex comb, is beardless, and has yellow
P red-eyed female white-eyed male skin? Make sure that you show
a. the breeding pedigree.
F1 purple-eyed females
red-eyed males b. the genotype of each animal represented.
F1 F1 c. how many eggs to hatch in each cross, and why this
number.
3
F2 both males and females: 8 purple eyed d. why your scheme is the fastest and the most
3
red eyed convenient.
8
2
8 white eyed 50. The following pedigree is for a dominant phenotype gov-
erned by an autosomal allele. What does this pedigree
Explain this mode of inheritance, and show the genotypes suggest about the phenotype, and what can you deduce
of the parents, the F1, and the F2. about the genotype of individual A?
46. When true-breeding brown dogs are mated with certain
true-breeding white dogs, all the F1 pups are white. The
F2 progeny from some F1 × F1 crosses were 118 white,
32 black, and 10 brown pups. What is the genetic basis for A
these results?
47. Wild-type strains of the haploid fungus Neurospora can
make their own tryptophan. An abnormal allele td ren- 51. Petal coloration in foxgloves is determined by three genes.
ders the fungus incapable of making its own tryptophan. M encodes an enzyme that synthesizes anthocyanin,
252 CHA P TER 6 Gene Interaction
the purple pigment seen in these petals; m/m produces 54. In minks, wild types have an almost black coat. Breeders
no pigment, resulting in the phenotype albino with yel- have developed many pure lines of color variants for the
lowish spots. D is an enhancer of anthocyanin, resulting mink-coat industry. Two such pure lines are platinum
in a darker pigment; d/d does not enhance. At the third (blue gray) and aleutian (steel gray). These lines were
locus, w/w allows pigment deposition in petals, but W pre- used in crosses, with the following results:
vents pigment deposition except in the spots and so re-
sults in the white, spotted phenotype. Consider the fol- Cross Parents F1 F2
lowing two crosses: 1 wild × platinum wild 18 wild, 5 platinum
Cross Parents Progeny 2 wild × aleutian wild 27 wild, 10 aleutian
3 platinum × aleutian wild 133 wild
1 dark × white with 1
2 dark purple :
41 platinum
purple yellowish 1
2 light purple
spots 46 aleutian
2 white with × light 1
white with purple 17 sapphire (new)
2
yellowish purple spots : 41 dark a. Devise a genetic explanation of these three crosses.
spots purple : 41 light purple Show genotypes for the parents, the F1, and the F2 in
the three crosses, and make sure that you show the
In each case, give the genotypes of parents and progeny alleles of each gene that you hypothesize for every
with respect to the three genes. mink.
52. In one species of Drosophila, the wings are normally b. Predict the F1 and F2 phenotypic ratios from crossing
round in shape, but you have obtained two pure lines, one sapphire with platinum and with aleutian pure lines.
of which has oval wings and the other sickle-shaped 55. In Drosophila, an autosomal gene determines the shape
wings. Crosses between pure lines reveal the following of the hair, with B giving straight and b giving bent
results: hairs. On another autosome, there is a gene of which a
Parents F1 dominant allele I inhibits hair formation so that the fly
is hairless (i has no known phenotypic effect).
Female Male Female Male
a. If a straight-haired fly from a pure line is crossed with
sickle round sickle sickle a fly from a pure-breeding hairless line known to be an
round sickle sickle round inhibited bent genotype, what will the genotypes and
phenotypes of the F1 and the F2 be?
sickle oval oval sickle
b. What cross would give the ratio 4 hairless : 3 straight : 1
bent?
a. Provide a genetic explanation of these results, defining
all allele symbols. 56. The following pedigree concerns eye phenotypes in
Tribolium beetles. The solid symbols represent black
b. If the F1 oval females from cross 3 are crossed with the
eyes, the open symbols represent brown eyes, and the
F1 round males from cross 2, what phenotypic pro-
cross symbols (X) represent the “eyeless” phenotype, in
portions are expected for each sex in the progeny?
which eyes are totally absent.
53. Mice normally have one yellow band on each hair, but
variants with two or three bands are known. A female I
mouse having one band was crossed with a male having 1 2 3
three bands. (Neither animal was from a pure line.) The
II
progeny were 1 2 3 4 5
57. A plant believed to be heterozygous for a pair of alleles one another. Luxuriant growth is noted at both ends of
B/b (where B encodes yellow and b encodes bronze) was the trpE streak and at one end of the trpD streak (see the
selfed, and, in the progeny, there were 280 yellow and figure below).
120 bronze plants. Do these results support the hypoth-
esis that the plant is B/b?
58. A plant thought to be heterozygous for two indepen-
dently assorting genes (P/p ; Q/q) was selfed, and the
progeny were
88
P/- ; Q/- 25
p/p ; Q/-
32
P/- ; q/q 14
p/p ; q/q
II
III
Symptoms
Unknown, presumed normal Eye lens displacement Long fingers and toes
Examined, normal Congenital heart disease Very long, thin fingers and toes
Questionably affected
a. Use the pedigree above to propose a mode of in- and b. Both genes are autosomal. In the following pedi-
heritance for Marfan’s syndrome. gree, black symbols indicate a black coat, pink symbols
b. What genetic phenomenon is shown by this pedigree? indicate brown, and white symbols indicate beige.
c. Speculate on a reason for such a phenomenon. I
1 2
(Data from J. V. Neel and W. J. Schull, Human Heredity.
University of Chicago Press, 1954.)
64. In corn, three dominant alleles, called A, C, and R,
II
must be present to produce colored seeds. Genotype 1 2 3 4 5 6
A/− ; C/− ; R/− is colored; all others are colorless.
A colored plant is crossed with three tester plants of
known genotype. With tester a/a ; c/c ; R/R, the colored III
plant produces 50 percent colored seeds; with a/a ; 1 2 3 4 5 6 7
C/C ; r/r, it produces 25 percent colored; and with
A/A ; c/c ; r/r, it produces 50 percent colored. What is a. What is the name given to the type of gene
the genotype of the colored plant? interaction in this example?
65. The production of pigment in the outer layer of seeds b. What are the genotypes of the individual mice in the
of corn requires each of the three independently as- pedigree? (If there are alternative possibilities, state
sorting genes A, C, and R to be represented by at least them.)
one dominant allele, as specified in Problem 64. The 67. A researcher crosses two white-flowered lines of
dominant allele Pr of a fourth independently assorting Antirrhinum plants as follows and obtains the following
gene is required to convert the biochemical precursor results:
into a purple pigment, and its recessive allele pr makes pure line 1 × pure line 2
the pigment red. Plants that do not produce pigment ↓
have yellow seeds. Consider a cross of a strain of geno-
F1 all white
type A/A ; C/C ; R/R ; pr/pr with a strain of genotype
a/a ; c/c ; r/r ; Pr/Pr. F1 × F1
↓
a. What are the phenotypes of the parents?
F2 131 white
b. What will be the phenotype of the F1?
29 red
c. What phenotypes, and in what proportions, will
appear in the progeny of a selfed F1? a. Deduce the inheritance of these phenotypes; use
d. What progeny proportions do you predict from the clearly defined gene symbols. Give the genotypes of
testcross of an F1? the parents, F1, and F2.
66. The allele B gives mice a black coat, and b gives a brown b. Predict the outcome of crosses of the F1 with each
one. The genotype e/e of another, independently assort- parental line.
ing gene prevents the expression of B and b, making the 68. Assume that two pigments, red and blue, mix to give the
coat color beige, whereas E/− permits the expression of B normal purple color of petunia petals. Separate bio-
Problems 255
chemical pathways synthesize the two pigments, as a. all the genotypes in each of the six rows.
shown in the top two rows of the accompanying diagram. b. the proposed origin of the superdouble.
“White” refers to compounds that are not pigments.
(Total lack of pigment results in a white petal.) Red pig- 70. In a certain species of fly, the normal eye color is red (R).
ment forms from a yellow intermediate that is normally Four abnormal phenotypes for eye color were found: two
at a concentration too low to color petals. were yellow (Y1 and Y2), one was brown (B), and one
was orange (O). A pure line was established for each phe-
E
pathway I white1 blue notype, and all possible combinations of the pure lines
A B were crossed. Flies of each F1 were intercrossed to pro-
pathway II white2 yellow red duce an F2. The F1 and the F2 flies are shown within the
C
following square; the pure lines are given at the top and
at the left-hand side.
white3
D
white4 Y1 Y2 B O
pathway III
F 1 all Y all R all R all R
A third pathway, whose compounds do not contribute Y1 F2 all Y 9 R 9 R 9R
pigment to petals, normally does not affect the blue and 7 Y 4 Y 4O
red pathways, but, if one of its intermediates (white3) 3 B 3Y
should build up in concentration, it can be converted into F1 all Y all R all R
the yellow intermediate of the red pathway. Y2 F2 all Y 9 R 9R
In the diagram, the letters A through E represent 4 Y 4Y
enzymes; their corresponding genes, all of which are 3 B 3O
unlinked, may be symbolized by the same letters. F1 all B all R
Assume that wild-type alleles are dominant and B F2 all B 9R
encode enzyme function and that recessive alleles result 4 O
in a lack of enzyme function. Deduce which combinations 3 B
of true-breeding parental genotypes could be crossed to F1 all O
produce F2 progeny in the following ratios: O F2 all O
a. 9 purple : 3 green : 4 blue
a. Define your own symbols, and list the genotypes of
b. 9 purple : 3 red : 3 blue : 1 white all four pure lines.
c. 13 purple : 3 blue b. Show how the F1 phenotypes and the F2 ratios are
d. 9 purple : 3 red : 3 green : 1 yellow produced.
(Note: Blue mixed with yellow makes green; assume that c. Show a biochemical pathway that explains the
no mutations are lethal.) genetic results, indicating which gene controls which
enzyme.
69. The flowers of nasturtiums (Tropaeolum majus) may be 71. In common wheat, Triticum aestivum, kernel color is de-
single (S), double (D), or superdouble (Sd). Superdoubles termined by multiply duplicated genes, each with an R
are female sterile; they originated from a double-flow- and an r allele. Any number of R alleles will give red, and
ered variety. Crosses between varieties gave the progeny a complete lack of R alleles will give the white phenotype.
listed in the following table, in which pure means “pure In one cross between a red pure line and a white pure
breeding.” 63 1
line, the F2 was 64 red and 64 white.
Cross Parents Progeny a. How many R genes are segregating in this system?
1 pure S × pure D All S b. Show the genotypes of the parents, the F1, and the F2.
2 cross 1 F1 × cross 1 F1 78 S : 27 D c. Different F2 plants are backcrossed with the white
3 pure D × Sd 112 Sd : 108 D parent. Give examples of genotypes that would give the
following progeny ratios in such backcrosses: (1) 1 red :
4 pure S × Sd 8 Sd : 7 S 1 white, (2) 3 red : 1 white, (3) 7 red : 1 white.
5 pure D × cross 4 Sd progeny 18 Sd : 19 S d. What is the formula that generally relates the
6 pure D × cross 4 S progeny 14 D : 16 S number of segregating genes to the proportion of red
individuals in the F2 in such systems?
Using your own genetic symbols, propose an explanation 72. The following pedigree shows the inheritance of deaf-
for these results, showing mutism.
256 CHA P TER 6 Gene Interaction
1 2 3 4 5 6 7 8 9 10 11 12
1 w + + + w + + + + + + +
2 w + + + w + w + w + +
3 w w + + + + + + + +
4 w + + + + + + + +
5 w + + + + + + +
6 w + w + w + +
7 w + + + w w
8 w + w + +
9 w + + +
10 w + +
11 w w
12 w
a. Explain what this experiment was designed to test. 79. In the nematode C. elegans, some worms have blistered
b. Use this reasoning to assign genotypes to all 12 cuticles due to a recessive mutation in one of the bli
mutants. genes. Someone studying a suppressor mutation that
suppressed bli-3 mutations wanted to know if it would
c. Explain why the phenotype of the F1 hybrids between
also suppress mutations in bli-4. They had a strain that
mutants 1 and 2 differed from that of the hybrids between
was homozygous for this recessive suppressor mutation,
mutants 1 and 5.
and its phenotype was wild type.
78. A geneticist working on a haploid fungus makes a cross
a. How would they determine whether this recessive
between two slow-growing mutants called mossy and spi-
suppressor mutation would suppress mutations in bli-4?
der (referring to the abnormal appearance of the colo-
In other words, what is the genotype of the worms
nies). Tetrads from the cross are of three types (A, B, C),
required to answer the question?
but two of them contain spores that do not germinate.
b. What cross(es) would they do to make these worms?
Spore A B C
c. What results would they expect in the F2 if
1 wild type wild type spider
(1) it did act as a suppressor of bli-4?
2 wild type spider spider
3 no germination mossy mossy (2) it did not act as a suppressor of bli-4?
4 no germination no germination mossy
7
C h a p t e r
DNA: Structure
and Replication
Learning Outcomes
After completing this chapter, you will be able to
outline
J
ames Watson (an American microbial geneticist) and Francis Crick (an Eng-
A sculpture of DNA
lish physicist) solved the structure of DNA in 1953. Their model of the struc-
ture of DNA was revolutionary. It proposed a definition for the gene in
chemical terms and, in doing so, paved the way for an understanding of gene
action and heredity at the molecular level. A measure of the importance of their
discovery is that the double-helical structure has become a cultural icon that is
seen more and more frequently in paintings, in sculptures, and even in play-
grounds (Figure 7-1).
The story begins in the first half of the twentieth century, when the results of
several experiments led scientists to conclude that DNA is the genetic material,
not some other biological molecule such as a carbohydrate, protein, or lipid. DNA
is a simple molecule made up of only four different building blocks (the four
nucleotide bases). It was thus necessary to understand how this very simple mol-
ecule could be the blueprint for the incredible diversity of organisms on Earth.
The model of the double helix proposed by Watson and Crick was built upon
the results of scientists before them. They relied on earlier discoveries of the
chemical composition of DNA and the ratios of its bases. In addition, X-ray dif-
fraction pictures of DNA revealed to the trained eye that DNA is a helix of precise
dimensions. Watson and Crick concluded that DNA is a double helix composed of
two strands of linked nucleotide bases that wind around each other.
The proposed structure of the hereditary material immediately suggested
F i g u r e 7-1 [ Neil Grant/Alamy.]
how it could serve as a blueprint and how this blueprint could be passed down
through the generations. First, the information for making an organism is encoded
in the sequence of the nucleotide bases composing the two DNA strands of the
helix. Second, because of the rules of base complementarity discovered by Wat-
son and Crick, the sequence of one strand dictates the sequence of the other
strand. In this way, the genetic information in the DNA sequence can be passed
down from one generation to the next by having each of the separated strands of
DNA serve as a template for producing new copies of the molecule.
In this chapter, we focus on DNA, its structure, and the production of DNA cop-
ies in a process called replication. Precisely how DNA is replicated is still an active
area of research more than 50 years after the discovery of the double helix. Our cur-
rent understanding of the mechanism of replication gives a central role to a protein
machine, called the replisome. This complex of proteins coordinates the numerous
reactions that are necessary for the rapid and accurate replication of DNA.
bacterial cells that express one phenotype can be transformed into cells that ex-
press a different phenotype and that the transforming agent is DNA.
Discovery of transformation
Frederick Griffith made a puzzling observation in the course of experiments per-
formed in 1928 on the bacterium Streptococcus pneumoniae. This bacterium,
which causes pneumonia in humans, is normally lethal in mice. However, some
strains of this bacterial species have evolved to be less virulent (less able to cause
disease or death). Griffith’s experiments are summarized in Figure 7-2. In these
experiments, Griffith used two strains that are distinguishable by the appearance
of their colonies when grown in laboratory cultures. One strain was a normal viru-
lent type deadly to most laboratory animals. The cells of this strain are enclosed
in a polysaccharide capsule, giving colonies a smooth appearance; hence, this
strain is identified as S. Griffith’s other strain was a mutant nonvirulent type that
grows in mice but is not lethal. In this strain, the polysaccharide coat is absent,
giving colonies a rough appearance; this strain is called R.
Griffith killed some virulent cells by boiling them. He then injected the heat-
killed cells into mice. The mice survived, showing that the carcasses of the cells do
not cause death. However, mice injected with a mixture of heat-killed virulent
cells and live nonvirulent cells did die. Furthermore, live cells could be recovered
from the dead mice; these cells gave smooth colonies and were virulent on
(a) (b)
R strain
Mouse dies
(c) (d)
R strain
F i g u r e 7-2 The presence of heat-killed S cells transforms live R cells into live S cells.
(a) Mouse dies after injection with the virulent S strain. (b) Mouse survives after injection with the
R strain. (c) Mouse survives after injection with heat-killed S strain. (d) Mouse dies after injection
with a mixture of heat-killed S strain and live R strain. Live S cells were isolated from the dead
mouse, indicating that the heat-killed S strain somehow transforms the R strain into the virulent
S strain.
262 CHAPTER 7 DNA: Structure and Replication
S strain
extract
R strain
Mouse dies Mouse dies Mouse dies Mouse dies Mouse dies Mouse lives
F i g u r e 7- 3 DNA is the agent transforming the R strain into virulence. If the DNA in an
extract of heat-killed S-strain cells is destroyed, then mice survive when injected with a
mixture of the heat-killed cells and the live nonvirulent R-strain cells.
subsequent injection. Somehow, the cell debris of the boiled S cells had converted
the live R cells into live S cells. The process, already discussed in Chapter 5, is
called transformation.
The next step was to determine which chemical component of the dead donor
cells had caused this transformation. This substance had changed the genotype of
the recipient strain and therefore might be a candidate for the hereditary mate-
rial. This problem was solved by experiments conducted in 1944 by Oswald Avery
and two colleagues, Colin MacLeod and Maclyn McCarty (Figure 7-3). Their
approach to the problem was to chemically destroy all the major categories of
chemicals in an extract of dead cells one at a time and find out if the extract had
lost the ability to transform. The virulent cells had a smooth polysaccharide coat,
whereas the nonvirulent cells did not; hence, polysaccharides were an obvious
candidate for the transforming agent. However, when polysaccharides were
destroyed, the mixture could still transform. Proteins, fats, and ribonucleic acids
(RNAs) were all similarly shown not to be the transforming agent. The mixture
lost its transforming ability only when the donor mixture was treated with the
enzyme deoxyribonuclease (DNase), which breaks up DNA. These results strongly
implicate DNA as the genetic material. It is now known that fragments of the
transforming DNA that confer virulence enter the bacterial chromosome and
replace their counterparts that confer nonvirulence.
Hershey–Chase experiment
The experiments conducted by Avery and his colleagues were definitive, but
many scientists were very reluctant to accept DNA (rather than proteins) as the
genetic material. After all, how could such a low-complexity molecule as DNA
encode the diversity of life on this planet? Alfred Hershey and Martha Chase pro-
vided additional evidence in 1952 in an experiment that made use of phage T2, a
virus that infects bacteria. They reasoned that the infecting phage must inject
into the bacterium the specific information that dictates the reproduction of new
viral particles. If they could find out what material the phage was injecting into
the bacterial host, they would have determined the genetic material of phages.
The phage is relatively simple in molecular constitution. The T2 structure is
similar to T4 shown in Figures 5-22 to 5-24. Most of its structure is protein, with
DNA contained inside the protein sheath of its “head.” Hershey and Chase
decided to give the DNA and protein distinct labels by using radioisotopes so that
they could track the two materials during infection. Phosphorus is not found in
the amino acid building blocks of proteins but is an integral part of DNA; con-
versely, sulfur is present in proteins but never in DNA. Hershey and Chase incor-
porated the radioisotope of phosphorus (32P) into phage DNA and that of sulfur
(35S) into the proteins of a separate phage culture. As shown in Figure 7-4, they
then infected two E. coli cultures with many virus particles per cell: one E. coli
culture received phage labeled with 32P, and the other received phage labeled with
35S. After allowing sufficient time for infection to take place, they sheared the
empty phage carcasses (called ghosts) off the bacterial cells by agitation in a
kitchen blender. They separated the bacterial cells from the phage ghosts in a
Most of
E. coli Phage radioactivity
ghosts recovered in
T2 phage + phage ghosts
35S
Blender
and
centrifuge
F i g u r e 7- 4 The Hershey–Chase
Phage
ghosts experiment demonstrated that the genetic
material of phages is DNA, not protein.
The experiment uses two sets of T2
32P + bacteriophage. In one set, the protein coat
is labeled with radioactive sulfur (35S), not
Most of found in DNA. In the other set, the DNA is
radioactivity labeled with radioactive phosphorus (32P),
Blender recovered in not found in amino acids. Only the 32P is
and bacteria
recovered from the E. coli, indicating that
centrifuge DNA is the agent necessary for the
production of new phages.
26 4 CHAPTER 7 DNA: Structure and Replication
centrifuge and then measured the radioactivity in the two fractions. When the
32P-labeled phages were used to infect E. coli, most of the radioactivity ended up
inside the bacterial cells, indicating that the phage DNA entered the cells. When
the 35S-labeled phages were used, most of the radioactive material ended up in the
phage ghosts, indicating that the phage protein never entered the bacterial cell.
The conclusion is inescapable: DNA is the hereditary material. The phage pro-
teins are mere structural packaging that is discarded after delivering the viral
DNA to the bacterial cell.
The building blocks of DNA The first piece of the puzzle was knowledge of the
basic building blocks of DNA. As a chemical, DNA is quite simple. It contains three
types of chemical components: (1) phosphate, (2) a sugar called deoxyribose,
and (3) four nitrogenous bases—adenine, guanine, cytosine, and thymine. The
sugar in DNA is called “deoxyribose” because it has only a hydrogen atom (H) at
the 2′-carbon atom, unlike ribose (a component of RNA), which has a hydroxyl
(OH) group at that position. Two of the bases, adenine and guanine, have a dou-
ble-ring structure characteristic of a type of chemical called a purine. The other
two bases, cytosine and thymine, have a single-ring structure of a type called a
pyrimidine.
The carbon atoms in the bases are assigned numbers for ease of reference.
The carbon atoms in the sugar group also are assigned numbers—in this case, the
number is followed by a prime (1′, 2′, and so forth).
The chemical components of DNA are arranged into groups called nucleo-
tides, each composed of a phosphate group, a deoxyribose sugar molecule, and
any one of the four bases (Figure 7-5). It is convenient to refer to each nucleotide
7.2 DNA Structure 26 5
Purine nucleotides
NH2 O
H
Phosphate N 6 N 6
7 5 N 7 5 N
1 1
8 4 2
Nitrogenous base 8 4 2
Guanine (G)
9 3 9
O N (Adenine, A) O N
3
N N NH2
5′
−O P O CH2 O −O P O CH2 O
−O 4′ H H 1′ Deoxyribose sugar −O H H
H H H H
3′ 2′
OH H OH H
Deoxyadenosine 5′-monophosphate (dAMP) Deoxyguanosine 5′-monophosphate (dGMP)
Pyrimidine nucleotides
NH2 O
CH3 H
4
3N
4
5 3N 5
Cytosine (C) 6 2
Thymine (T)
6 2 1
1
N O O N O
O
−O –O P O
P O CH2 O CH2 O
−O H H –O H H
H H H H
OH H OH H
Deoxycytidine 5′-monophosphate (dCMP) Deoxythymidine 5′-monophosphate (dTMP)
F i g u r e 7- 5 These nucleotides, two with purine bases and two with pyrimidine bases,
are the fundamental building blocks of DNA. The sugar is called deoxyribose because it is
a variation of a common sugar, ribose, that has one more oxygen atom (position indicated
by the red arrow).
by the first letter of the name of its base: A, G, C, or T. The nucleotide with the
adenine base is called deoxyadenosine 5′-monophosphate, where the 5′ refers to
the position of the carbon atom in the sugar ring to which the single (mono) phos-
phate group is attached.
Chargaff’s rules of base composition The second piece of the puzzle used by
Watson and Crick came from work done several years earlier by Erwin Chargaff.
Studying a large selection of DNAs from different organisms (Table 7-1), Chargaff
established certain empirical rules about the amounts of each type of nucleotide
found in DNA:
1. The total amount of pyrimidine nucleotides (T + C) always equals the total
amount of purine nucleotides (A + G).
2. The amount of T always equals the amount of A, and the amount of C always
equals the amount of G. But the amount of A + T is not necessarily equal to the
amount of G + C, as can be seen in the right-hand column of Table 7-1. This ratio
varies among different organisms but is virtually the same in different tissues of
the same organism.
26 6 CHAPTER 7 DNA: Structure and Replication
X-ray diffraction analysis of DNA The third and most controversial piece of the
puzzle came from X-ray diffraction data on DNA structure that were collected by
Rosalind Franklin when she was in the laboratory of Maurice Wilkins (Figure 7-6).
In such experiments, X rays are fired at DNA fibers, and the scatter of the rays
from the fibers is observed by catching the rays on photographic film, on which
the X rays produce spots. The angle of scatter represented by each spot on the
film gives information about the position of an atom or certain groups of atoms in
the DNA molecule. This procedure is not simple to carry out (or to explain), and
the interpretation of the spot patterns requires complex mathematical treatment
that is beyond the scope of this text. The available data suggested that DNA is long
and skinny and that it has two similar parts that are parallel to each other and run
along the length of the molecule. The X-ray data showed the molecule to be heli-
cal (spiral-like). Unknown to Rosalind Franklin, her best X-ray picture was shown
to Watson and Crick by Maurice Wilkins, and it was this crucial piece of the puzzle
F i g u r e 7- 6 Rosalind Franklin ( left ) and her X-ray diffraction pattern of DNA ( right ).
[ Left ) Science Source; ( right ) Rosalind Franklin/Science Source.]
7.2 DNA Structure 267
F i g u r e 7- 8 (a) A simplified model showing the helical structure of DNA. The sticks
represent base pairs, and the ribbons represent the sugar–phosphate backbones of the two
antiparallel chains. (b) An accurate chemical diagram of the DNA double helix, unrolled to
show the sugar–phosphate backbones (blue) and base-pair rungs (purple, orange). The
backbones run in opposite directions; the 5 ′ and 3 ′ ends are named for the orientation of
the 5 ′ and 3 ′ carbon atoms of the sugar rings. Each base pair has one purine base, adenine
(A) or guanine (G), and one pyrimidine base, thymine (T) or cytosine (C), connected by
hydrogen bonds (red dashed lines).
(known from the X-ray data) would be explained if a purine base always pairs (by
hydrogen bonding) with a pyrimidine base (Figure 7-10). Such pairing would
account for the (A + G) = (T + C) regularity observed by Chargaff, but it would
predict four possible pairings: T…A, T…G, C…A, and C…G. Chargaff’s data,
however, indicate that T pairs only with A, and C pairs only with G. Watson
and Crick concluded that each base pair consists of one purine base and one
pyrimidine base, paired according to the following rule: G pairs with C, and A
pairs with T.
Note that the G–C pair has three hydrogen bonds, whereas the A–T pair has only
two (see Figure 7-8b). We would predict that DNA containing many G–C pairs
7.2 DNA Structure 26 9
Major
groove
C in phosphate
5ʹ
ester chain
3ʹ
P
3ʹ Minor
groove C and N
in bases
5ʹ
Base
pairs
Sugar–phosphate
backbone
(a) (b)
F i g u r e 7- 9 The ribbon diagram (a) highlights the stacking of the base pairs, whereas the
space-filling model (b) shows the major and minor grooves.
would be more stable than DNA containing many A–T Base pairing in DNA
pairs. In fact, this prediction is confirmed. Heat causes the
two strands of DNA double helix to separate (a process
called DNA melting or DNA denaturation); DNAs with Pyrimidine + pyrimidine: DNA too thin
higher G + C content can be shown to require higher tem-
peratures to melt because of the greater attraction of the
G–C pairing.
AT
CG model, in addition to being consistent with earlier data about
CG DNA structure, fulfilled the three requirements for a heredi-
tary substance:
TA 1. The double-helical structure suggested how the genetic
CG material might determine the structure of proteins. Perhaps
GC the sequence of nucleotide pairs in DNA dictates the sequence
of amino acids in the protein specified by that gene. In other
words, some sort of genetic code may write information in
TA
DNA as a sequence of nucleotides and then translate it into a
TA
different language of amino acid sequences in protein. Just
AT
how it is done is the subject of Chapter 9.
2. If the base sequence of DNA specifies the amino acid se-
TA
quence, then mutation is possible by the substitution of one
AT
CG
type of base for another at one or more positions. Mutations
will be discussed in Chapter 16.
3. As Watson and Crick stated in the concluding words of
TA their 1953 Nature paper that reported the double-helical
T A
G C structure of DNA: “It has not escaped our notice that the
G
C G specific pairing we have postulated immediately suggests a
The two strands of the
C
parental double helix unwind,
T possible copying mechanism for the genetic material.”2 To
A and each specifies a new daughter A geneticists at the time, the meaning of this statement was
T strand by base-pairing rules. clear, as we see in the next section.
GC
GC GC
GC AT
AT
TA
TA
7.3 Semiconservative Replication
The copying mechanism to which Watson and Crick referred is
GC
GC called semiconservative replication and is diagrammed in
TA TA
TA TA Figure 7-11. The sugar–phosphate backbones are represented
GC GC by thick ribbons, and the sequence of base pairs is random.
Let’s imagine that the double helix is analogous to a zipper that
AT unzips, starting at one end. We can see that, if this zipper anal-
AT GC
ogy is valid, the unwinding of the two strands will expose single
GC GC
GC TA
bases on each strand. Each exposed base has the potential to
TA Old pair with free nucleotides in solution. Because the DNA struc-
ture imposes strict pairing requirements, each exposed base
TA will pair only with its complementary base, A with T and G
TA
AT
with C. Thus, each of the two single strands will act as a tem-
AT plate, or mold, to direct the assembly of complementary bases
GC GC
New to re-form a double helix identical with the original. The newly
added nucleotides are assumed to come from a pool of free
TA TA nucleotides that must be present in the cell.
CG
CG If this model is correct, then each daughter molecule
should contain one parental nucleotide chain and one newly
Meselson–Stahl experiment
The first problem in understanding DNA replication was to figure
out whether the mechanism of replication was semiconservative,
conservative, or dispersive. In 1958, two young scientists, Matthew Dispersive replication
Meselson and Franklin Stahl, set out to discover which of these pos-
sibilities correctly described DNA replication. Their idea was to allow
parental DNA molecules containing nucleotides of one density to
replicate in medium containing nucleotides of different density. If
DNA replicated semiconservatively, the daughter molecules should
be half old and half new and therefore of intermediate density.
To carry out their experiment, Meselson and Stahl grew E. coli
cells in a medium containing the heavy isotope of nitrogen (15N)
F i g u r e 7-12 Of three alternative
rather than the normal light (14N) form. This isotope was inserted into the nitro- models for DNA replication, the Watson–
gen bases, which then were incorporated into newly synthesized DNA strands. Crick model of DNA structure would
After many cell divisions in 15N, the DNA of the cells were well labeled with the produce the first (semiconservative)
heavy isotope. The cells were then removed from the 15N medium and put into a model. Gold lines represent the newly
14N medium; after one and two cell divisions, samples were taken and the DNA synthesized strands.
was isolated from each sample.
Meselson and Stahl were able to distinguish DNA of different densities because
the molecules can be separated from one another by a procedure called cesium
chloride gradient centrifugation. If cesium chloride (CsCl) is spun in a centrifuge at
tremendously high speeds (50,000 rpm) for many hours, the cesium and chloride
ions tend to be pushed by centrifugal force toward the bottom of the tube. Ulti-
mately, a gradient of ions is established in the tube, with the highest ion concentra-
tion, or density, at the bottom. DNA centrifuged with the cesium chloride forms a
band at a position identical with its density in the gradient (Figure 7-13). DNA of
different densities will form bands at different places. Cells initially grown in the
heavy isotope 15N showed DNA of high density. This DNA is shown in blue at the
left-hand side of Figure 7-13a. After growing these cells in the light isotope 14N for
one generation, the researchers found that the DNA was of intermediate density,
shown half blue (15N) and half gold (14N) in the middle of Figure 7-13a. Note that
Meselson and Stahl continued the experiment through two E. coli generations so
that they could distinguish semiconservative replication from dispersive. After
two generations, both intermediate- and low-density DNA was observed (right-
hand side of Figure 7-13a), precisely as predicted by Watson–Crick’s semiconserva-
tive replication model.
14N/14N
(light) DNA
14N/15N The replication fork
(hybrid) DNA
Another prediction of the Watson–Crick model of
15N/15N
DNA replication is that a replication zipper, or fork,
(heavy) DNA
will be found in the DNA molecule during replication.
This fork is the location at which the double helix
is unwound to produce the two single strands that
serve as templates for copying. In 1963, John Cairns
tested this prediction by allowing replicating DNA
(b) Predictions of conservative model
in bacterial cells to incorporate tritiated thymidine
Parental 1st generation 2nd generation ([3H]thymidine)—the thymine nucleotide labeled
with a radioactive hydrogen isotope called tritium.
Theoretically, each newly synthesized daughter mole-
cule should then contain one radioactive (“hot”) strand
(with 3H) and another nonradioactive (“cold”) strand.
After varying intervals and varying numbers of replica-
tion cycles in a “hot” medium, Cairns carefully lysed
14N/14N
(light) DNA
the bacteria and allowed the cell contents to settle onto
grids designed for electron microscopy. Finally, Cairns
covered the grid with photographic emulsion and
exposed it in the dark for 2 months. This procedure,
15N/15N
(heavy) DNA called autoradiography, allowed Cairns to develop a
picture of the location of 3H in the cell material. As 3H
decays, it emits a beta particle (an energetic electron).
The photographic emulsion detects a chemical reac-
tion that takes place wherever a beta particle strikes the
(c) Predictions of dispersive model emulsion. The emulsion can then be developed like a
photographic print so that the emission track of the
Parental 1st generation 2nd generation
beta particle appears as a black spot or grain.
After one replication cycle in [3H]thymidine, a ring
of dots appeared in the autoradiograph. Cairns inter-
preted this ring as a newly formed radioactive strand in
a circular daughter DNA molecule, as shown in Figure
7-14a. It is thus apparent that the bacterial chromo-
some is circular—a fact that also emerged from genetic
analysis described earlier (see Chapter 5). In the sec-
14N/15N 14N/15N ond replication cycle, the forks predicted by the model
(hybrid) DNA (hybrid) DNA were indeed seen. Furthermore, the density of grains
15N/15N in the three segments was such that the interpretation
(heavy) DNA shown in Figure 7-14b could be made: the thick curve of
dots cutting through the interior of the circle of DNA
would be the newly synthesized daughter strand, this
time consisting of two radioactive strands. Cairns saw
7.3 Semiconservative Replication 273
F i g u r e 7-14 A replicating bacterial chromosome has two replication forks. (a) Left:
Autoradiograph of a bacterial chromosome after one replication in tritiated thymidine.
According to the semiconservative model of replication, one of the two strands should be
radioactive. Right: Interpretation of the autoradiograph. The gold helix represents the
tritiated strand. (b) Left: Autoradiograph of a bacterial chromosome in the second round of
replication in tritiated (3H) thymidine. In this molecule, the newly replicated double helix that
crosses the circle could consist of two radioactive strands (if the parental strand were the
radioactive one). Right: The double thickness of the radioactive tracing on the
autoradiogram appears to confirm the interpretation shown here.
DNA polymerases
A problem confronted by scientists was to understand just how the bases are
brought to the double-helix template. Although scientists suspected that enzymes
played a role, that possibility was not proved until 1959, when Arthur Kornberg
isolated DNA polymerase from E. coli and demonstrated its enzymatic activity in
vitro. This enzyme adds deoxyribonucleotides to the 3′ end of a growing nucleo-
tide chain, using for its template a single strand of DNA that has been exposed by
localized unwinding of the double helix (Figure 7-15). The substrates for DNA
polymerase are the triphosphate forms of the deoxyribonucleotides, dATP, dGTP,
dCTP, and dTTP. The addition of each base to the growing polymer is accompa-
nied by the removal of two of the three phosphates in the form of pyrophosphate
(PPi). The energy produced by cleaving this high-energy bond and the subsequent
hydrolysis of pyrophosphate to two inorganic phosphate molecules helps drive
the endergonic process of building a DNA polymer.
There are now known to be five DNA polymerases in E. coli. The first enzyme
that Kornberg purified is now called DNA polymerase I, or pol I. This enzyme has
three activities, which appear to be located in different parts of the molecule:
1. a polymerase activity, which catalyzes chain growth in the 5′-to-3′ direction;
2. a 3′-to-5′ exonuclease activity, which removes mismatched bases; and
3. a 5′-to-3′ exonuclease activity, which degrades single strands of DNA or RNA.
We will return to the significance of the two exonuclease activities later in this
chapter.
Although pol I has a role in DNA replication (see next section), some scientists
suspected that it was not responsible for the majority of DNA synthesis because it
274 CHAPTER 7 DNA: Structure and Replication
DNA DNA
template strand template strand
5' 3' 5' 3'
–O P O –O P O
O O
H2C O G C H 2C O G C
H H H H
H H H H
3'
HO
••
H PPi O H
–O P O
O O
O
O P O P O P O
O O– H 2C O C G
O– O–
H2C 5' O C G H H
H H
H H
H H HO H
HO H
T T
5' 5'
Figure 7-15 DNA polymerase catalyzes the chain-elongation reaction. Energy for the reaction
comes from breaking the high-energy phosphate bond of the triphosphate substrate.
ANIMATED ART: The nucleotide polymerization process
was too slow (~20 nucleotides/second) and too abundant (~400 molecules/cell)
and because it dissociated from the DNA after incorporating from only 20 to
50 nucleotides. In 1969, John Cairns and Paula DeLucia settled this matter when
they demonstrated that an E. coli strain harboring a mutation in the gene that
Introduction to Genetic Analysis, 11e
Figure 07.15 #719 encodes DNA pol I was still able to grow normally and replicate its DNA. They
05/02/14 concluded that another DNA polymerase, now called pol III, catalyzes DNA syn-
Dragonfly Media Group thesis at the replication fork.
Synthesis on the other template also takes place at 3′ DNA replication at the growing fork
growing tips, but this synthesis is in the “wrong” direc-
tion, because, for this strand, the 5′-to-3′ direction of syn- is 5′
hes
thesis is away from the replication fork (see Figure 7-16). f s ynt
o 3′
ion
As we will see, the nature of the replication machinery ect
Dir
requires that synthesis of both strands take place in the Template strands
region of the replication fork. Therefore, synthesis mov-
Lagging strand
ing away from the growing fork cannot go on for long. It 5′
5′
must be in short segments: polymerase synthesizes a
segment, then moves back to the segment’s 5′ end, where 3′ 3′ Leading strand
the growing fork has exposed new template, and begins Fork movement
the process again. These short (1000–2000 nucleotides)
stretches of newly synthesized DNA are called Okazaki Dir
ect
fragments. ion
of s 5′
ynt
Another problem in DNA replication arises because hes
is 3′
DNA polymerase can extend a chain but cannot start a
chain. Therefore, synthesis of both the leading strand
and each Okazaki fragment must be initiated by a primer, or short chain of nucle- F i g u r e 7-16 The replication fork
otides, that binds with the template strand to form a segment of duplex nucleic moves in DNA synthesis as the double
acid. The primer in DNA replication can be seen in Figure 7-17. The primers are helix continuously unwinds. Synthesis of
the leading strand can proceed smoothly
synthesized by a set of proteins called a primosome, of which a central compo-
without interruption in the direction of
nent is an enzyme called primase, a type of RNA polymerase. Primase synthesizes movement of the replication fork, but syn-
a short (~8–12 nucleotides) stretch of RNA complementary to a specific region of thesis of the lagging strand must proceed
the chromosome. On the leading strand, only one initial primer is needed because, in the opposite direction, away from the
after the initial priming, the growing DNA strand serves as the primer for continu- replication fork.
ous addition. However, on the lagging strand, every Okazaki fragment needs its
own primer. The RNA chain composing the primer is then extended as a DNA
chain by DNA pol III.
A different DNA polymerase, pol I, removes the RNA primers with its 5′ to 3′
exonuclease activity and fills in the gaps with its 5′-to-3′ polymerase activity. As
mentioned earlier, pol I is the enzyme originally purified by Kornberg. Another
enzyme, DNA ligase, joins the 3′ end of the gap-filling DNA to the 5′ end of the F i g u r e 7-17 Steps in the synthesis of
downstream Okazaki fragment. The new strand thus formed is called the lagging the lagging strand. DNA synthesis
strand. DNA ligase joins broken pieces of DNA by catalyzing the formation of a proceeds by continuous synthesis on the
leading strand and discontinuous
synthesis on the lagging strand.
1. Primase synthesizes short RNA oligonucleotides (primers) 3. DNA polymerase I removes RNA at 5' end of neighboring
copied from DNA. fragment and fills gap.
5'
3' 5' 3'
3'
3' 5' RNA primer
5' 5'
3'
2. DNA polymerase III elongates RNA primers with new DNA. 4. DNA ligase connects adjacent fragments.
3' 3'
New DNA Okazaki fragment Ligation
5' 5'
276 CHAPTER 7 DNA: Structure and Replication
phosphodiester bond between the 5′-phosphate end of one fragment and the
adjacent 3′-OH group of another fragment.
A hallmark of DNA replication is its accuracy, also called fidelity: overall, less
than one error per 1010 nucleotides is inserted. Part of the reason for the accuracy
of DNA replication is that both DNA pol I and DNA pol III possess 3′-to-5′ exo-
nuclease activity, which serves a “proofreading” function by excising erroneously
inserted mismatched bases.
Given the importance of proofreading, let’s take a closer look at how it works.
A mismatched base pair occurs when the 5′-to-3′ polymerase activity inserts, for
example, an A instead of a G next to a C. The addition of an incorrect base is often
due to a process called tautomerization. Each of the bases in DNA can appear in
one of several forms, called tautomers, which are isomers that differ in the posi-
tions of their atoms and in the bonds between the atoms. The forms are in equi-
librium. The keto form of each base is normally present in DNA, but in rare
instances a base may shift to the imino or enol form. The imino and enol forms
may pair with the wrong base, forming a mispair (Figure 7-18). When a C shifts to
N O
5 5
6 4 H 6 4 H H
1 3 O 1 3 N
N 2
N N 2
N
H 6 N H N
N1 5 7 N1 6
7
O 8 O 5
2 4 8
H 9 2 4 9
Cytosine N 3
N Thymine 3
N
N N
H Adenine
Guanine
its rare imino form, the polymerase adds an A rather than a G (Figure 7-19). For- Proofreading removes
tunately, such a mismatch is usually detected and removed by the 3′-to-5′ exonu- mispaired bases
clease activity. Once the mismatched base is removed, the polymerase has another
chance to add the correct complementary G base. DNA
As you would expect, mutant strains lacking a functional 3′-to-5′ exonuclease polymerase I and III A
have a higher rate of mutation. In addition, because primase lacks a proofreading 5′ 3′ G
T G G A C T
function, the RNA primer is more likely than DNA to contain errors. The need to A C C T G A C G G
maintain the high fidelity of replication is one reason that the RNA primers at the 3′ 5′
ends of Okazaki fragments must be removed and replaced with DNA. Only after the
RNA primer is gone does DNA pol I catalyze DNA synthesis to replace the primer. Extension: incorrect
The subject of DNA repair will be covered in detail in Chapter 16. base (A) bonded to
imino form of C
3′ 5′
Topoisomerase
Replication
fork
movement
Helicase
Next Okazaki fragment will start here.
RNA primer
Primase
RNA primer
Okazaki
fragment
Single-strand-
β clamp binding proteins
DNA
polymerase DNA
Leading
III dimer polymerase I
strand
Lagging
strand
Ligase
3′ 5′ 5′ 3′
F i g u r e 7-2 0 The replisome and accessory proteins carry out a number of steps at the
replication fork. Topoisomerase and helicase unwind and open the double helix in preparation
for DNA replication. When the double helix has been unwound, single-strand-binding proteins
prevent the double helix from re-forming. The illustration is a representation of the so-called
trombone model (named for its resemblance to a trombone owing to the looping of the
lagging strand) showing how the two catalytic cores of the replisome are envisioned to interact
to coordinate the numerous events of leading- and lagging-strand replication.
ANIMATED ART: Leading- and lagging-strand synthesis
Note that primase, the enzyme that synthesizes the RNA primer, is not touch-
ing the clamp protein. Therefore, primase acts as a distributive enzyme—it adds
only a few ribonucleotides before dissociating from the template. This mode of
action makes sense because the primer need be only long enough to form a suit-
able duplex starting point for DNA pol III.
7.5 The Replisome: A Remarkable Replication Machine 279
Unwound
parental
duplex
Over-
wound
region
(a)
2 DNA rotates
to remove the coils.
5′ 3′ 5′ 3′
3′ 5′ 3′ 5′
Growth Growth
Chromosome
DNA
Sister chromatids
DNA replicas (daughter molecules)
yeast research (see the yeast Model Organism box in Chapter 12). The origins of F i g u r e 7-2 3 DNA replication
replication in yeast are very much like oriC in E. coli. The 100- to 200-bp origins proceeds in both directions from an
have a conserved DNA sequence that includes an AT-rich region that melts when origin of replication. Black arrows
indicate the direction of growth of
an initiator protein binds to adjacent binding sites. Unlike prokaryotic chromo-
daughter DNA molecules. (a) Starting at
somes, each eukaryotic chromosome has many replication origins to replicate the the origin, DNA polymerases move
much larger eukaryotic genomes quickly. Approximately 400 replication origins outward in both directions. Long yellow
are dispersed throughout the 16 chromosomes of yeast, and there are estimated arrows represent leading strands and
to be thousands of growing forks in the 23 chromosomes of humans. Thus, in short joined yellow arrows represent
eukaryotes, replication proceeds in both directions from multiple points of origin lagging strands. (b) How replication
(Figure 7-23). The double helices that are being produced at each origin of repli- proceeds at the chromosome level.
Three origins of replication are shown
cation elongate and eventually join one another. When replication of the two
in this example.
strands is complete, two identical daughter molecules of DNA result.
ANIMATED ART: DNA
replication: replication of a chromosome
K e y C o n c e p t Where and when replication takes place are carefully controlled
by the ordered assembly of the replisome at a precise site called the origin. Replication
proceeds in both directions from a single origin on the circular prokaryotic
chromosome. Replication proceeds in both directions from hundreds or thousands of
origins on each of the linear eukaryotic chromosomes.
Original Daughter
cell cells
G2 G1
recruit the helicase, called the MCM complex, and the other components of the
replisome.
Replication is linked to the cell cycle through the availability of Cdc6 and Cdt1.
In yeast, these proteins are synthesized during late mitosis and gap 1 (G1) and are
destroyed by proteolysis after synthesis has begun. In this way, the replisome can
be assembled only before the S phase. When replication begins, new replisomes
cannot form at the origins, because Cdc6 and Cdt1 are degraded during the S phase
and are no longer available.
ORC
mechanism may have evolved so that higher eukaryotes can regulate Origin recognition
the timing of DNA replication during S phase (see Chapter 12 for
more about euchromatin and heterochromatin). Gene-rich regions of
the chromosome (the euchromatin) have been known for some time
to replicate early in S phase, whereas gene-poor regions, including the
ORC
densely packed heterochromatin, replicate late in S phase. DNA repli-
cation could not be timed by region if ORCs were to bind to related
sequences scattered throughout the chromosomes. Instead, ORCs
may, for example, have a higher affinity for origins in open chromatin
and bind to these origins first and then bind to condensed chromatin Loading of helicase,
Cdc6, and Cdt1
only after the gene-rich regions have been replicated.
Unwinding of helix
and sliding of helicase
7.7 Telomeres and Telomerase: Replication
Termination
Replication of the linear DNA molecule in a eukaryotic chromosome
proceeds in both directions from numerous replication origins, as
shown in Figure 7-23. This process replicates most of the chromo-
somal DNA, but there is an inherent problem in replicating the two
ends of linear DNA molecules, the regions called telomeres. Con-
tinuous synthesis on the leading strand can proceed right up to the
Recruitment of
very tip of the template. However, lagging-strand synthesis requires DNA polymerase
primers ahead of the process; so, when the last primer is removed,
sequences are missing at the end of that strand. As a consequence, a
single-stranded tip remains in one of the daughter DNA molecules
(Figure 7-26). If the daughter chromosome with this DNA molecule
were replicated again, the strand missing sequences at the end would
become a shortened double-stranded molecule after replication. At
each subsequent replication cycle, the telomere would continue to
shorten, until eventually essential coding information would be lost.
Cells have evolved a specialized system to prevent this loss. The
solution involves the addition of multiple copies of a simple noncod-
ing sequence to the DNA at the chromosome tips. Thus, every time aIntroduction
chromo- to Genetic Analysis, 11e
some is duplicated, it is shortened and only these repeating sequences, which
Figure 07.25 #729
contain no information, are lost. The lost repeats are then added back to06/25/14
the chro-
07/18/14
mosome ends. Dragonfly Media Group
The discovery that the ends of chromosomes are made up of sequences
repeated in tandem was made in 1978 by Elizabeth Blackburn and Joe Gall, who
were studying the DNA in the unusual macronucleus of the single-celled ciliate
28 4 CHAPTER 7 DNA: Structure and Replication
3′ 5′
5′ 3′
3′ overhang
3′ AACCC 5′ 3′ AACCCCAAC
AACCCCAAC 5′
5′ TTGGGGTTGGGGTTGGGG 3′
DNA polymerase then use the very long 3′ overhang as a
template to fill in the end of the other DNA strand. Work-
ing with Elizabeth Blackburn, a third researcher, Jack Elongation
Szostak, went on to show that telomeres also exist in the
less unusual eukaryote yeast. For contributing to the dis-
covery of how telomeres protect chromosomes from short- 3′ AACCC 5′ 3′ AACCCCAAC 5′
ening, Blackburn, Grieder, and Szostak were awarded the 5′ TTGGGGTTGGGGTTGGGGTTG
TTG 3′
2009 Nobel Prize in Medicine or Physiology.
One notable feature of this reaction is that RNA is serv-
ing as the template for the synthesis of DNA. As you saw in Translocation
Chapter 1 (and will revisit in Chapter 8), DNA normally
serves as the template of RNA synthesis in the process
3′ AACCC 5′ 3′ AACCCCAAC 5′
called transcription. It is for this reason that the poly- 5′ TTG 3′
TTGGGGTTGGGGTTGGGGTTG
merase of telomerase is said to have reverse transcriptase
activity. We will revisit reverse transcriptase in Chapters 10
and 15. Elongation
In addition to preventing the erosion of genetic mate-
rial after each round of replication, telomeres preserve
chromosomal integrity by associating with proteins to 3′ AACCC 5′ 3′ AACCCCAAC 5′
form protective caps. These caps sequester the 3′ single- 5′ TTGGGGT T G 3′
TTGGGGTTGGGGTTGGGGTTGGGGT
stranded overhang, which can be as much as 100 nucleo-
tides long (Figure 7-28). Without this protective cap, the
(b) Replication of complementary
double-stranded ends of chromosomes would be mistaken strand
for double-stranded breaks by the cell and dealt with
accordingly. As you will see later, in Chapter 16, double- A primer is
stranded breaks are potentially very dangerous because synthesized
Primase
they can result in chromosomal instability that can lead to
cancer and a variety of phenotypes associated with aging. 3′ AACCC 5′ 3′ AAC 5′
For this reason, when a double-stranded break is detected, 5′ TTGGGGTTGGGGTTGGGGTTGGGGT
TTGGGGT T G 3′
the cell responds in a variety of ways, depending, in part,
on the cell type and the extent of the damage. For example, Polymerase fills
the double-stranded break can be fused to another break in the gap
or the cell can limit the damage to the organism by stop- DNA polymerase
ping further cell division (called senescence) or by initiat- 3′ AACCC CAACCCCAAC CCCAACCCCAAC 5′
ing a cell-death pathway (called apoptosis). 5′ TTGGGGTTGGGGTTGGGGTTGGGGT T G 3′
WRN
TRF1
TRF2
5′
3′
s u m m a ry
Experimental work on the molecular nature of hereditary RNA primer (synthesized by primase) that provides a 3′ end
material has demonstrated conclusively that DNA (not pro- for deoxyribonucleotide addition.
tein, lipids, or carbohydrates) is indeed the genetic material. The multiple events that have to occur accurately and rap-
Using data obtained by others, Watson and Crick deduced a idly at the replication fork are carried out by a biological
double-helical model with two DNA strands, wound around machine called the replisome. This protein complex includes
each other, running in antiparallel fashion. The binding of two DNA polymerase units, one to act on the leading strand
the two strands together is based on the fit of adenine (A) to and one to act on the lagging strand. In this way, the more
thymine (T) and guanine (G) to cytosine (C). The former time-consuming synthesis and joining of the Okazaki frag-
pair is held by two hydrogen bonds; the latter, by three. ments into a continuous strand can be temporally coordinated
The Watson–Crick model shows how DNA can be repli- with the less complicated synthesis of the leading strand.
cated in an orderly fashion—a prime requirement for genetic Where and when replication takes place is carefully controlled
material. Replication is accomplished semiconservatively in by the ordered assembly of the replisome at certain sites on
both prokaryotes and eukaryotes. One double helix is repli- the chromosome called origins. Eukaryotic genomes may have
cated to form two identical helices, each with their nucleo- tens of thousands of origins. The assembly of replisomes at
tides in the identical linear order; each of the two new double these origins can take place only at a specific time in the cell
helices is composed of one old and one newly polymerized cycle.
strand of DNA. The ends of linear chromosomes (telomeres) present a
The DNA double helix is unwound at a replication fork, problem for the replication system because there is always a
and the two single strands thus produced serve as templates short stretch on one strand that cannot be primed. The
for the polymerization of free nucleotides. Nucleotides are enzyme telomerase adds a number of short, repetitive
polymerized by the enzyme DNA polymerase, which adds sequences to maintain length. Telomerase carries a short RNA
new nucleotides only to the 3′ end of a growing DNA chain. that acts as the template for the synthesis of the telomeric
Because addition is only at 3′ ends, polymerization on one repeats. These noncoding telomeric repeats associate with
template is continuous, producing the leading strand, and, on proteins to form a telomeric cap. Telomeres shorten with age
the other, it is discontinuous in short stretches (Okazaki frag- in somatic cells because telomerase is not made in those cells.
ments), producing the lagging strand. Synthesis of the leading Individuals who have defective telomeres experience prema-
strand and of every Okazaki fragment is primed by a short ture aging.
key terms
antiparallel (p. 267) keto (p. 276) processive enzyme (p. 277)
base (p. 264) lagging strand (p. 275) purine (p. 264)
complementary base (p. 270) leading strand (p. 274) pyrimidine (p. 264)
conservative replication major groove (p. 267) replication fork (p. 274)
(p. 271) minor groove (p. 267) replisome (p. 277)
daughter molecule (p. 281) nucleotide (p. 264) semiconservative replication
deoxyribose (p. 264) Okazaki fragment (p. 275) (p. 271)
distributive enzyme (p. 277) origin (p. 280) single-strand-binding (SSB) protein
DNA ligase (p. 275) phosphate (p. 264) (p. 279)
double helix (p. 267) polymerase III (pol III) holoenzyme tautomerization (p. 276)
enol (p. 276) (p. 277) telomerase (p. 284)
genetic code (p. 270) primase (p. 275) telomere (p. 283)
helicase (p. 279) primer (p. 275) template (p. 270)
imino (p. 276) primosome (p. 275) topoisomerase (p. 279)
28 8 CHAPTER 7 DNA: Structure and Replication
s olv e d p r obl e m s
SOLVED PROBLEM 1. Mitosis and meiosis were presented in Solution
Chapter 2. Considering what has been covered in this chapter Refer to Figure 7-13 for an additional explanation. In con-
concerning DNA replication, draw a graph showing DNA servative replication, if bacteria are grown in the presence
content against time in a cell that undergoes mitosis and then of 15N and then shifted to 14N, one DNA molecule will be all
meiosis. Assume a diploid cell. 15N after the first generation and the other molecule will be
all 14N, resulting in one heavy band and one light band in
Solution
4
the gradient. After the second generation, the 15N DNA will
yield one molecule with all 15N and one molecule with all
3 14N, whereas the 14N DNA will yield only 14N DNA. Thus,
Solution
If the GC content is 56 percent, then, because G = C, the con-
tent of G is 28 percent and the content of C is 28 percent. The 15 N
p r obl e m s
Most of the problems are also available for review/grading through the http://www.whfreeman.com/
launchpad/iga11e.
Working with the Figures 5. In Figure 7-23(a), label all the leading and lagging
1. In Table 7-1, why are there no entries for the first four strands.
tissue sources? For the last three entries, what is the most
B a s i c P r obl e m s
likely explanation for the slight differences in the compo-
sition of human DNA from the three tissue sources? 6. Describe the types of chemical bonds in the DNA double
2. In Figure 7-7, do you recognize any of the components helix.
used to make Watson and Crick’s DNA model? Where 7. Explain what is meant by the terms conservative and
have you seen them before? semiconservative replication.
3. Referring to Figure 7-20, answer the following questions: 8. What is meant by a primer, and why are primers neces-
a. What is the DNA polymerase I enzyme doing? sary for DNA replication?
b. What other proteins are required for the DNA 9. What are helicases and topoisomerases?
polymerase III on the left to continue synthesizing 10. Why is DNA synthesis continuous on one strand and dis-
DNA? continuous on the opposite strand?
c. What other proteins are required for the DNA 11. If the four deoxynucleotides showed nonspecific base
polymerase III on the right to continue synthesizing pairing (A to C, A to G, T to G, and so on), would the
DNA? unique information contained in a gene be maintained
4. What is different about the reaction catalyzed by the through round after round of replication? Explain.
green helicase in Figure 7-20 and the yellow gyrase in 12. If the helicases were missing during replication, what
Figure 7-21? would happen to the replication process?
Problems 28 9
13. What would happen if, in the course of replication, the 19. If thymine makes up 15 percent of the bases in a specific
topoisomerases were unable to reattach the DNA frag- DNA molecule, what percentage of the bases is cytosine?
ments of each strand after unwinding (relaxing) the
DNA molecule?
www
14. Which of the following is not a key property of hereditary Unpacking the Problem
material? www
20. If the GC content of a DNA molecule is 48 percent, what
a. It must be capable of being copied accurately. are the percentages of the four bases (A, T, G, and C) in
b. It must encode the information necessary to form this molecule?
proteins and complex structures. 21. Bacteria called extremophiles are able to grow in hot
c. It must occasionally mutate. springs such as Old Faithful at Yellowstone National Park
in Wyoming. Do you think that the DNA of extremo-
d. It must be able to adapt itself to each of the body’s
philes would have a higher content of GC or AT base
tissues.
pairs? Justify your answer.
15. It is essential that RNA primers at the ends of Okazaki 22. Assume that a certain bacterial chromosome has one ori-
fragments be removed and replaced by DNA because gin of replication. Under some conditions of rapid cell
otherwise which of the following events would result? division, replication could start from the origin before the
a. The RNA would interfere with topoisomerase preceding replication cycle is complete. How many repli-
function. cation forks would be present under these conditions?
b. The RNA would be more likely to contain errors 23. A molecule of composition
because primase lacks a proofreading function.
5′-AAAAAAAAAAA-3′
c. The β-clamp of the DNA pol II dimer would release
the DNA and replication would stop. 3′-TTTTTTTTTTTTT-5′
d. The RNA primers would be likely to hydrogen bond is replicated in a solution containing unlabeled (not
to each other, forming complex structures that might radioactive) GTP, CTP, and TTP plus adenine nucleo-
interfere with the proper formation of the DNA helix. side triphosphate with all its phosphorus atoms in the
16. Polymerases usually add only about 10 nucleotides to form of the radioactive isotope 32P. Will both daughter
a DNA strand before dissociating. However, during rep- molecules be radioactive? Explain. Then repeat the
lication, DNA pol III can add tens of thousands of question for the molecule
nucleotides at a moving fork. How is this addition
accomplished? 5′-ATATATATATATAT-3′
5′.…ATTCGTACGATCGACTGACTGACAGTC….3′
3′.…TAAGCATGCTAGCTGACTGACTGTCAG….5′
Normal Mutant
If the DNA polymerase starts replicating this segment
a. No change at all in replication. from the right,
b. Replication would take place only on one half of the a. which will be the template for the leading strand?
chromosome.
b. Draw the molecule when the DNA polymerase is
c. Replication would be complete only on the leading halfway along this segment.
strand.
c. Draw the two complete daughter molecules.
d. Replication would take twice as long.
d. Is your diagram in part b compatible with bidirec-
18. In a diploid cell in which 2n = 14, how many telomeres tional replication from a single origin, the usual mode
are there in each of the following phases of the cell of replication?
cycle?
26. The DNA polymerases are positioned over the following
(a) G1 (c) prophase of mitosis DNA segment (which is part of a much larger molecule)
(b) G2 (d) telophase of mitosis and moving from right to left. If we assume that an
29 0 CHAPTER 7 DNA: Structure and Replication
Okazaki fragment is made from this segment, what will a. Prepare a model for the structure of Raman DNA.
be the fragment’s sequence? Label its 5′ and 3′ ends. b. On Rama, mitosis produces three daughter cells.
5′.…CCTTAAGACTAACTACTTACTGGGATC….3′ Bearing this fact in mind, propose a replication pattern
for your DNA model.
3′.…GGAATTCTGATTGATGAATGACCCTAG….5′ c. Consider the process of meiosis on Rama. What
27. E. coli chromosomes in which every nitrogen atom is la- comments or conclusions can you suggest?
beled (that is, every nitrogen atom is the heavy isotope 30. If you extract the DNA of the coliphage φ X174, you will
15N instead of the normal isotope 14N) are allowed to rep- find that its composition is 25 percent A, 33 percent T,
licate in an environment in which all the nitrogen is 14N. 24 percent G, and 18 percent C. Does this composition
Using a solid line to represent a heavy polynucleotide make sense in regard to Chargaff’s rules? How would
chain and a dashed line for a light chain, sketch each of you interpret this result? How might such a phage rep-
the following descriptions: licate its DNA?
a. The heavy parental chromosome and the products 31. In Chapter 5 you saw that bacteria transfer DNA from
of the first replication after transfer to a 14N medium, one member of their species to another in a process called
assuming that the chromosome is one DNA double conjugation. Recently it has been shown that the transfer
helix and that replication is semiconservative. of DNA from one bacterial cell to another is not limited to
b. Repeat part a, but now assume that replication is members of the same species. A microbiologist studying
conservative. the bacteria Diplococcus pneumonia hypothesizes that a
region of its chromosome was in fact transferred from
c. If the daughter chromosomes from the first division
Mycobacterium tuberculosis. Based on the data presented
in 14N are spun in a cesium chloride density gradient
in Table 7-1, what distinguishing feature of the trans-
and a single band is obtained, which of the possibilities
ferred DNA would provide support for this hypothesis?
in parts a and b can be ruled out? Reconsider the
Meselson and Stahl experiment: What does it prove? 32. Given what you know about the structure and function
of telomerase, provide a plausible model to explain how
C h a ll e n g i n g P r obl e m s a species could exist with a combination of two different
28. If a mutation that inactivated telomerase occurred in a repeats (for example, TTAGGG and TTGTGG) on each
cell (telomerase activity in the cell = zero), what do you of their telomeres.
expect the outcome to be? 33. Do bacteria require telomerase? Explain why or why not.
29. On the planet Rama, the DNA is of six nucleotide types: 34. Watson and Crick used an approach called model build-
A, B, C, D, E, and F. Types A and B are called marzines, ing to deduce the structure of the DNA double helix.
C and D are orsines, and E and F are pirines. The follow- How does this differ from the more conventional experi-
ing rules are valid in all Raman DNAs: mental approach that is undertaken in a research labora-
Total marzines = total orsines = total pirines tory? In this regard, why was the experiment of Meselson
and Stahl considered to be of such critical importance?
A=C=E
B=D=F
344
8
C h a p t e r
RNA: Transcription
and Processing
Learning Outcomes
After completing this chapter, you will be able to
RNA polymerase in action. A very small RNA polymerase (blue), made by the bacte-
riophage T7, transcribes DNA into a strand of RNA (red). The enzyme separates the
DNA double helix (yellow, orange), exposing the template strand to be copied into
RNA. [ David S. Goodsell, Scripps Research Institute.]
outline
8.1 RNA
8.2 Transcription
291
292 CHAPTER 8 RNA: Transcription and Processing
U
sing their newly acquired knowledge of the DNA sequences of entire
genomes, scientists have been able to determine the approximate number of
genes in several organisms, both simple and complex. At first there were no
surprises: the bacterium Escherichia coli has about 4400 genes, the unicellular
eukaryote yeast Saccharomyces cerevisiae has about 6300 genes, and the multicellu-
lar fruit fly Drosophila melanogaster has about 13,600 genes. Scientists assumed that
more complex organisms would require more genes, and so early estimates were
that our genome would have 100,000 genes. At a conference focused on genome
research in 2000, scientists started an informal betting pool called GeneSweep that
would be won by the person who most closely predicted the actual number of genes
in the human genome. The entries ranged from ~26,000 to ~150,000 genes.
With the release of the first draft sequence, a winner was announced. Surpris-
ingly, the winner was the entrant with the very lowest estimate, 25,947 genes.
How could Homo sapiens with their complex brains and sophisticated immune
systems have only twice as many genes as the roundworm and approximately the
same number of genes as the first sequenced plant genome, the mustard weed
Arabidopsis thaliana? Part of the answer to this question has to do with a remark-
able discovery made in the late 1970s. At that time, the proteins of many eukary-
otes were found to be encoded in DNA not as continuous stretches (as they are in
bacteria and yeast) but in pieces. Thus, the genes of higher eukaryotes are usually
composed of pieces called exons (for expressed region) that encode parts of pro-
teins and pieces called introns (for intervening regi ) that separate exons. As
you will learn in this chapter, an RNA copy containing both exons and introns is
synthesized from a gene. A biological machine (called a spliceosome) removes
the introns and joins the exons (in a process called RNA splicing) to produce
a mature RNA that contains the continuous information needed to synthesize
a protein.
What do exons and introns have to do with the low human gene count? For
now, suffice it to say that the RNA transcribed from a gene can be spliced in alter-
native ways. Although we have only about 21,000 genes, these genes encode more
than 100,000 proteins, thanks to the process of alternative splicing of RNA.
Even more surprising is the finding that only a small fraction of the genome
actually codes for proteins (a little more than 2 percent for most complex multi-
cellular organisms). The content of genomes will be the subject of future chap-
ters. For now it is important to note that despite having such a small proportion of
coding DNA, most of the genome still encodes RNA. The story of this aptly named
non-protein-coding RNA (ncRNA) is a work in progress. That story will be
introduced in this chapter and developed in succeeding chapters.
In this chapter, we see the first steps in the transfer of information from genes
to gene products. Within the DNA sequence of any organism’s genome is encoded
information specifying each of the gene products that the organism can make.
These DNA sequences also contain information specifying when, where, and how
much of the product is made. To utilize the information, an RNA copy of the gene
must be synthesized in a process called transcription.
The transfer of information from gene to gene product takes place in several
steps. The first step, which is the focus of this chapter, is to copy (transcribe) the
information into a strand of RNA with the use of DNA as a template. In prokary-
otes, the information in protein-encoding RNA is almost immediately converted
into an amino acid chain (protein) by a process called translation. This second
step is the focus of Chapter 9. In eukaryotes, transcription and translation are
spatially separated: transcription takes place in the nucleus and translation in the
cytoplasm. However, before RNAs are ready to be transported into the cytoplasm
for translation or other uses, they undergo extensive processing, including the
removal of introns and the addition of a special 5′ cap and a 3′ tail of adenine
8.1 RNA 29 3
nucleotides. One fully processed type of RNA, called messenger RNA (mRNA), is
the intermediary in the synthesis of proteins. In addition, in both prokaryotes and
eukaryotes, there are other types of RNAs that are never translated. These ncRNAs
perform many essential roles.
DNA and RNA function during transcription is based on two principles:
1. Complementarity of bases is responsible for determining the sequence of the
RNA transcript in transcription. Through the matching of complementary bases,
the information encoded in the DNA passes into RNA, and protein complexes as-
sociated with ncRNAs are guided to specific regions in the RNA to regulate their
expression.
2. Certain proteins recognize particular base sequences in DNA and RNA. These
nucleic-acid-binding proteins bind to these sequences and act on them.
We will see these principles at work throughout the detailed discussions of
transcription and translation that follow in this chapter and in chapters to come.
Early investigators had good reason for thinking that information is not transferred
directly from DNA to protein. In a eukaryotic cell, DNA is found in the nucleus,
whereas protein is synthesized in the cytoplasm. An intermediate is needed.
more, this phage-induced RNA “turns over” rapidly; that is, its lifetime is brief, Chased with
nonradioactive
on the order of minutes. Its rapid appearance and disappearance suggested that RNA precursors
RNA might play some role in the expression of the T2 genome necessary to make
more virus particles.
Volkin and Astrachan demonstrated the rapid turnover of RNA by using a
protocol called a pulse–chase experiment. To conduct a pulse–chase experi-
ment, the infected bacteria are first fed (pulsed with) radioactive uracil (a mole-
cule needed for the synthesis of RNA but not DNA). Any RNA synthesized in the
bacteria from then on is “labeled” with the readily detectable radioactive uracil.
After a short period of incubation, the radioactive uracil is washed away and
After the
replaced (chased) by uracil that is not radioactive. This procedure “chases” the chase
label out of the RNA because, as the pulse-labeled RNA breaks down, only the
unlabeled precursors are available to synthesize new RNA molecules (the labeled F i g u r e 8 -1 The pulse–chase
nucleotides are “diluted” by the huge excess of unlabeled uracil added in the experiment showed that mRNA moves
chase). The RNA recovered shortly after the pulse is labeled, but RNA recovered into the cytoplasm. Cells are grown briefly
somewhat later is unlabeled, indicating that the RNA has a very short lifetime. in radioactive uracil to label newly
synthesized RNA (pulse). Cells are washed
A similar experiment can be done with eukaryotic cells. Cells are first pulsed
to remove the radioactive uracil and are
with radioactive uracil and, after a short time, they are transferred to medium then grown in excess nonradioactive uracil
with unlabeled uracil. In samples taken after the pulse, most of the label is in the (chase). The red dots indicate the location
nucleus. In samples taken after the chase, the labeled RNA is also found in of the RNA containing radioactive uracil
the cytoplasm (Figure 8-1). Apparently, in eukaryotes, the RNA is synthesized in over time.
29 4 CHAPTER 8 RNA: Transcription and Processing
the nucleus and then moves into the cytoplasm, where proteins are synthesized.
Thus, RNA is a good candidate for an information-transfer intermediary between
DNA and protein.
Properties of RNA
Let’s consider the general features of RNA. Although both RNA and DNA are
nucleic acids, RNA differs from DNA in several important ways:
CH2 O Base CH2 O Base 1. RNA has ribose sugar in its nucleotides, rather than the deoxyribose found in
4′ 1′ 4′ 1′ DNA. As the names suggest, the two sugars differ in the presence or absence of
H H H
H H H H
H just one oxygen atom. The RNA sugar contains a hydroxyl group (OH) bound to
3′ 2′ 3′ 2′
the 2 ′-carbon atom, whereas the DNA sugar has only a hydrogen atom bound
OH OH OH H to the 2 ′-carbon atom.
Ribose Deoxyribose 2. RNA is usually a single-stranded nucleotide chain, not a double helix like DNA.
A consequence is that RNA is more flexible and can form a much greater variety
of complex three-dimensional molecular shapes than can double-stranded DNA.
An RNA strand can bend in such a way that some of its own bases pair with each
other. Such intramolecular base pairing is an important determinant of RNA shape.
As you will see later in this chapter, the presence of the hydroxyl group at the
2 ′-carbon atom facilitates the action of RNA in many important cellular
processes.
Like an individual DNA strand, a strand of RNA is formed of a sugar–phosphate
backbone, with a base covalently linked at the 1 ′ position on each ribose. The
sugar–phosphate linkages are made at the 5 ′ and 3 ′ positions of the sugar, just as
in DNA; so an RNA chain will have a 5 ′ end and a 3 ′ end.
3. RNA nucleotides (called ribonucleotides) contain the bases adenine, guanine,
and cytosine, but the pyrimidine base uracil (abbreviated U) is present instead of
thymine.
O Uracil forms two hydrogen bonds with adenine just as thymine does. Figure
Introduction to Genetic Analysis, 11e O 8-2 shows the four ribonucleotides found in RNA.
Figure 08UN1 #8UN1 C
In addition, uracil is capable of base pairing with G. The bases U and G form
05/05/14 HC 5 4 3 N H NH
Dragonfly Media Group base pairs only during RNA folding and not during transcription. The two hydro-
HC 6 1
2C O gen bonds that can form between U and G are weaker than the two that form
N N O
between U and A. The ability of U to pair with both A and G is a major reason why
H H RNA can form extensive and complicated structures, many of which are impor-
tant in biological processes.
Uracil
4. RNA—like protein, but unlike DNA—can catalyze biological reactions. The name
ribozyme was coined for the RNA molecules that function like protein enzymes.
Classes of RNA
RNAs can be grouped into two general classes. One class of RNA encodes the
information necessary to make polypeptide chains (proteins). We refer to this
class as messenger RNA (mRNA) because, like a messenger, these RNAs serve as
the intermediary that passes information from DNA to protein. We refer to the
other class as functional RNA because the RNA does not encode information to
make protein. Instead, the RNA itself is the final functional product.
Messenger RNA The steps through which a gene influences phenotype are called
gene expression. For the vast majority of genes, the RNA transcript is only an
intermediate necessary for the synthesis of a protein, which is the ultimate func-
tional product that influences phenotype.
8.1 RNA 29 5
Purine ribonucleotides
NH2 O
H
Phosphate N 6 N 6
7 5 N 7 5 N
1 1
8
9 4 2 Adenine (A) 8 4 2
Guanine (G)
3 9 3
O N N O N N NH2
5′
−O P O CH2 O −O P O CH2 O
4′ 1′
−O H H Ribose sugar −O H H
H H H H
3′ 2′
OH OH OH OH
Adenosine 5′-monophosphate (AMP) Guanosine 5′-monophosphate (GMP)
Pyrimidine ribonucleotides
NH2 O
H
4
3N
4
5 3N 5
Cytosine (C) 6 2
Uracil (U)
6 2 1
1
N O O N O
O
−O P O CH2 O –O P O CH2 O
−O H H –O H H
H H H H
OH OH OH OH
Cytidine 5′-monophosphate (CMP) Uridine 5′-monophosphate (UMP)
F i g u r e 8 -2
Functional RNA As more is learned about the intimate details of gene expres-
sion and regulation, it becomes apparent that functional RNAs fall into a variety
of classes and play diverse roles. Again, it is important to emphasize that func-
tional RNAs are active as RNA; they are never translated into polypeptides.
The main classes of functional RNAs contribute to various steps in the transfer of
information from DNA to protein, in the processing of other RNAs, and in the regu-
lation of RNA and protein levels in the cell. Two such classes of functional RNAs are
found in both prokaryotes and eukaryotes: transfer RNAs and ribosomal RNAs.
• Transfer RNA (tRNA) molecules are responsible for bringing the correct ami-
no acid to the mRNA in the process of translation.
• Ribosomal RNA (rRNA) molecules are the major components of ribosomes,
which are large macromolecular machines that guide the assembly of the ami-
no acid chain by the mRNAs and tRNAs.
The entire collection of tRNAs and rRNAs are encoded by a small number of
genes (a few tens to a few hundred at most). However, though the genes that
encode them are few in number, rRNAs account for a very large percentage of the
RNA in the cell because they are both stable and transcribed into many copies.
Another class of functional RNAs participate in the processing of RNA and are
specific to eukaryotes:
29 6 CHAPTER 8 RNA: Transcription and Processing
• Small nuclear RNAs (snRNAs) are part of a system that further processes
RNA transcripts in eukaryotic cells. Some snRNAs unite with several protein
subunits to form the ribonucleoprotein processing complex (the spliceosome)
that removes introns from eukaryotic mRNAs.
8.2 Transcription
The first step in the transfer of information from gene to protein is to produce an
RNA strand whose base sequence matches the base sequence of a DNA segment,
sometimes followed by modification of that RNA to prepare it for its specific cel-
lular roles. Hence, RNA is produced by a process that copies the nucleotide
sequence of DNA. Because this process is reminiscent of transcribing (copying)
written words, the synthesis of RNA is called transcription. The DNA is said to
be transcribed into RNA, and the RNA is called a transcript.
Stages of transcription
The protein-encoding sequence in a gene is a relatively small seg-
ment of DNA embedded in a much longer DNA molecule (the
chromosome). How is the appropriate segment transcribed into a
single-stranded RNA molecule of correct length and nucleotide
sequence? Because the DNA of a chromosome is a continuous
unit, the transcriptional machinery must be directed to the start of
F i g u r e 8 - 5 This electromicrograph a gene to begin transcribing at the right place, continue transcribing the length of
shows the transcription of ribosomal RNA the gene, and finally stop transcribing at the other end. These three distinct stages
genes repeated in tandem in the nucleus of transcription are called initiation, elongation, and termination. Although the
of the amphibian Triturus viridiscens.
overall process of transcription is remarkably similar in prokaryotes and eukary-
Along each gene, many RNA polymerases
are transcribing in one direction. The
otes, there are important differences. For this reason, we will follow the three stages
growing RNA transcripts appear as first in prokaryotes (by using the gut bacterium E. coli as an example) and then in
threads extending outward from the DNA eukaryotes.
backbone. The shorter transcripts are
close to the start of transcription; the Initiation in prokaryotes How does RNA polymerase find the correct starting
longer ones are near the end of the gene. point for transcription? In prokaryotes, RNA polymerase usually binds to a spe-
The “Christmas tree” appearance is the
cific DNA sequence called a promoter, located close to the start of the transcribed
result. [Photograph from O. L. Miller, Jr.,
and Barbara A. Hamkalo.]
region. A promoter is an important part of the regulatory region of a gene.
Remember that, because the synthesis of an RNA transcript begins at its 5 ′ end
and continues in the 5 ′-to-3 ′ direction, the convention is to draw and refer to the
orientation of the gene in the 5 ′-to-3 ′ direction, too. For this reason, the nontem-
plate DNA strand is usually shown. Generally, the 5 ′ end is drawn at the left and
the 3 ′ at the right. With this view, because the promoter must be near the end of
the gene where transcription begins, it is said to be at the 5 ′ end of the gene; thus,
F i g u r e 8 - 6 The mRNA sequence is the promoter region is also called the 5 ′ regulatory region (Figure 8-7a).
complementary to the DNA template The first transcribed base is always at the same location, designated the initia-
strand from which it is transcribed and
therefore matches the sequence of the
tion site. The promoter is referred to as upstream of the initiation site because it
nontemplate strand (except that the RNA is located ahead of the initiation site (5 ′ of the gene), in the direction opposite the
has U where the DNA has T). This direction of transcription. A downstream site would be located later in the direc-
sequence is from the gene for the enzyme tion of transcription. By convention, the first DNA base to be transcribed is
β-galactosidase.
Nontemplate strand
5′ — CTGCCATTGTCA GA CA TGT A TA CCCCGTA CGTCTTCCCGA GCGA A A A CGA TCTGCGCTGC — 3′
Coding strand
DNA
Template strand
3′ — GACGGTA A CAGTCTGTA CA T A TGGGGCA TGCAGAAGGGCTCGCTTTTGCTA GACGCGACG — 5′
Noncoding strand
5′ — CUGCCAUUGUCAGACAUGUAUACCCCGUACGUCUUCCCGAGCGAAAACGAUCUGCGCUGC — 3′ mRNA
8.2 Transcription 29 9
5′ UTR
AUG Transcription
(a) 5′
Promoter Coding sequence of gene
+1
ATG
(b) Strong E. coli promoters
F i g u r e 8 -7 (a) The promoter lies “upstream” (toward the 5′ end) of the initiation point
and coding sequences. (b) Promoters have regions of similar sequences, as indicated by
the yellow shading in seven different promoter sequences in E. coli. Spaces (dots) are
inserted in the sequences to optimize the alignment of the common sequences. Numbers
refer to the number of bases before (−) or after (+) the RNA synthesis initiation point.
The consensus sequence for most E. coli promoters is at the bottom. [ Data from H. Lodish,
D. Baltimore, A. Berk, S. L. Zipursky, P. Matsudaira, and J. Darnell, Molecular Cell Biology, 3rd ed.]
numbered +1. Nucleotide positions upstream of the initiation site are indicated
by a negative (−) sign and those downstream by a positive (+) sign.
Figure 8-7b shows the promoter sequences of seven different genes in the E.
coli genome. Because the same RNA polymerase binds to the promoter sequences
of these different genes, the similarities among the promoters are not surprising.
In particular, two regions of great similarity appear in virtually every case. These
regions have been termed the −35 (minus 35) and −10 regions because they are
located 35 base pairs and 10 base pairs, respectively, upstream of the first tran-
scribed base. They are shown in yellow in Figure 8-7b. As you can see, the −35
and −10 regions from different genes do not have to be identical to perform a
similar function. Nonetheless, it is possible to arrive at a sequence of nucleotides,
called a consensus sequence, that is in agreement with most sequences. The
E. coli promoter consensus sequence is shown at the bottom of Figure 8-7b. An
RNA polymerase holoenzyme (see next paragraph) binds to the DNA at this
point, then unwinds the DNA double helix and begins the synthesis of an RNA
molecule. Note in Figure 8-7a that the protein-encoding part of the gene usually
begins at an ATG sequence, but the initiation site, where transcription begins, is
usually well upstream of this sequence. The intervening part is referred to as the
5 ′ untranslated region (5 ′ UTR).
The bacterial RNA polymerase that scans the DNA for a promoter sequence is
called the RNA polymerase holoenzyme (Figure 8-8). This multisubunit com-
plex is composed of the five subunits of the core enzyme (two subunits of a, one
of β, one of β ′, and one of ω) plus a subunit called sigma factor (σ).The two
a subunits help assemble the enzyme and promote interactions with regulatory
proteins, the β subunit is active in catalysis, the β ′ subunit binds DNA, and the
3 0 0 CHAPTER 8 RNA: Transcription and Processing
Transcription initiation in prokaryotes ω subunit has roles in enzyme assembly and the regulation
of gene expression. The σ subunit binds to the −10 and −35
(a) RNA polymerase binding (b) Initiation regions, thus positioning the holoenzyme to initiate tran-
to promoter
70 scription correctly at the start site (see Figure 8-8a). The σ
subunit also has a role in separating (melting) the DNA
α β α β strands around the −10 region so that the core enzyme can
DNA bind tightly to the DNA in preparation for RNA synthesis.
α β′ After the core enzyme is bound, transcription begins and
70
ω ω the σ subunit dissociates from the rest of the complex (see
5′ Figure 8-8b).
–35 –10
E. coli, like most other bacteria, has several different σ
F i g u r e 8 - 8 The σ subunit positions factors. One, called σ70 because its mass in kilodaltons is 70,
prokaryotic RNA polymerase for is the primary σ subunit used to initiate the transcription of the vast majority of
transcription initiation. (a) Binding of the E. coli genes. Other σ factors recognize different promoter sequences. Thus, by
σ subunit to the −10 and −35 regions associating with different σ factors, the same core enzyme can recognize different
positions the other subunits for correct
promoter sequences and transcribe different sets of genes.
initiation. (b) Shortly after RNA synthesis
begins, the σ subunit dissociates from
the other subunits, which continue Elongation As the RNA polymerase moves along the DNA, it unwinds the DNA
transcription. ahead of it and rewinds the DNA that has already been transcribed. In this way, it
maintains a region of single-stranded DNA, called a transcription bubble, within
which the template strand is exposed. In the bubble, polymerase monitors the
binding of a free ribonucleoside triphosphate to the next exposed base on the
DNA template and, if there is a complementary match, adds it to the chain. The
energy for the addition of a nucleotide is derived from splitting the high-energy
triphosphate and releasing inorganic diphosphate, according to the following gen-
Figure 8-9 The five subunits of RNA eral formula:
polymerase are shown as a single
ellipse-like shape surrounding the DNA
NTP (NMP)n (NMP)n1 PPi
transcription bubble. (a) Elongation: Mg2
Synthesis of an RNA strand complementary RNA polymerase
to the single-strand region of the DNA
template strand is in the 5′-to-3′ direction. Figure 8-9a gives a physical picture of elongation. Inside the bubble, the last
DNA that is unwound ahead of RNA eight or nine nucleotides added to the RNA chain form an RNA–DNA hybrid by
polymerase is rewound after it has been complementary base pairing with the template strand. As the RNA chain length-
transcribed. (b) Termination: The intrinsic ens at its 3′ end, the 5′ end is further extruded from the polymerase. The comple-
mechanism shown here is one of two ways mentary base pairs are broken at the point of exit, leaving the extruding strand
used to end RNA synthesis and release the single stranded.
completed RNA transcript and RNA
polymerase from the DNA. In this case, the
formation of a hairpin loop sets off their Termination The transcription of an individual gene continues beyond the
release. For both the intrinsic and the protein-encoding segment of the gene, creating a 3 ′ untranslated region (3 ′ UTR)
rho-mediated mechanism, termination first at the end of the transcript. Elongation proceeds until RNA polymerase recog-
requires the synthesis of certain RNA nizes special nucleotide sequences that act as a signal for chain termination. The
sequences.
3′ 3′ 3′ 3′ 3′ 3′
C U
5′ AU G
AUCG G G U T A 5′ 5′ UU
A
5′
TAGCCCA AA
RNA polymerase
5′ Hairpin loop
5′
8.3 Transcription in Eukaryotes 3 01
encounter with the signal nucleotides initiates the release of the nascent RNA and
A bacterial transcription-
the enzyme from the template (Figure 8-9b). The two major mechanisms for ter-
termination site
mination in E. coli (and other bacteria) are called intrinsic and rho dependent.
In the intrinsic mechanism, the termination is direct. The terminator sequences
contain about 40 base pairs, ending in a GC-rich stretch that is followed by a string U C C
of six or more A’s. Because G and C in the template will give C and G, respectively, U G
in the transcript, the RNA in this region also is GC rich. These C and G bases are able G= C
A= U
to form complementary hydrogen bonds with each other, resulting in an RNA hair- C=G
pin stem-loop (Figure 8-10). Recall that the G−C base pair is more stable than the C=G
A−T pair because it is hydrogen bonded at three sites, whereas the A−T (or A−U) G==C
pair is held together by only two hydrogen bonds. RNA hairpins with stems that are C=G
largely G−C pairs are more stable than hairpins with stems that are largely A−U C=G
5′ UAAUCCCACAG = CUUUUUUUU 3′
pairs. The hairpin structure is followed by a string of about eight U’s that are com- RNA transcript
plementary to the A residues on the DNA template.
Normally, in the course of transcription elongation, RNA polymerase will F i g u r e 8 -10 The structure of a
pause if the short DNA–RNA hybrid in the transcription bubble is weak and will termination site for RNA polymerase in
backtrack to stabilize the hybrid. Like that of hairpins, the strength of the hybrid bacteria. The hairpin structure forms by
is determined by the relative number of G−C base pairs compared with A−U complementary base pairing within a
GC-rich RNA strand. Most of the RNA
base pairs (or A−T base pairs in RNA–DNA hybrids). In the intrinsic mechanism, base pairing is between G and C, but
the polymerase is believed to pause after synthesizing the U’s (A−U forms a weak there is one A–U pair.
DNA–RNA hybrid). However, the backtracking polymerase encounters the hair-
pin loop. This roadblock sets off the release of RNA from the polymerase and the
release of the polymerase from the DNA template.
The second type of termination mechanism requires the help of a protein
called the rho factor. This protein recognizes the nucleotide sequences that act as
termination signals for RNA polymerase. RNAs with rho-dependent termination
signals do not have the string of U residues at their 3 ′ end and usually do not have
hairpin loops. Instead, they have a sequence of about 40 to 60 nucleotides that is
rich in C residues and poor in G residues and includes an upstream segment
called the rut (rho utilization) site. Rho is a hexamer consisting of six identical
subunits that bind a nascent RNA chain at the rut site. These sites are located just
upstream from (recall that upstream means 5′ of) sequences at which the RNA
polymerase tends to pause. After binding, rho facilitates the release of the RNA
from RNA polymerase. Thus, rho-dependent termination entails the binding of
rho to rut, the pausing of polymerase, and rho-mediated dissociation of the RNA
from the RNA polymerase.
DNA Nucleus
DNA
Transcription
and processing
3′
5′
Processing
mRNA Cytoplasm
F i g u r e 8 -11 Transcription 5′ 3′
and translation take place in the
same cellular compartment in Transport
prokaryotes but in different mRNA
compartments in eukaryotes. 3′
5′
Moreover, unlike prokaryotic RNA
transcripts, eukaryotic transcripts 5′
Growing
undergo extensive processing chain of
before they can be translated PROKARYOTE amino acids EUKARYOTE
into proteins.
8.3 Transcription in Eukaryotes 3 0 3
Before the RNA leaves the nucleus, it must be modified in several ways. These
modifications are collectively referred to as RNA processing. To distinguish the
RNA before and after processing, newly synthesized RNA is called the primary
transcript or pre-mRNA, and the term mRNA is reserved for the fully processed
transcript that can be exported out of the nucleus. As you will see, the 5′ end of
the RNA undergoes processing while the 3′ end is still being synthesized. Thus,
RNA polymerase II must synthesize RNA while simultaneously coordinating a
diverse array of processing events. For this reason, among others, RNA poly-
merase II is a more complicated multisubunit enzyme than prokaryotic RNA
polymerase. In fact, it is considered to be another molecular machine. The coor-
dination of RNA processing and synthesis by RNA polymerase II will be discussed
in the section on transcription elongation in eukaryotes.
3. Finally, the template for transcription, genomic DNA, is organized into chroma-
tin in eukaryotes (see Chapter 1), whereas it is virtually “naked” in prokaryotes. As
you will learn in Chapter 12, certain chromatin structures can block the access of
RNA polymerase to the DNA template. This feature of chromatin has evolved into
a very sophisticated mechanism for regulating eukaryotic gene expression. However,
a discussion of the influence of chromatin on the ability of RNA polymerase II to
initiate transcription will be put aside until Chapter 12 as we focus on the events
that take place after RNA polymerase II gains access to the DNA template.
Transcription initiation in eukaryotes different species are aligned, the sequence TATA can
often be seen to be located about 30 base pairs (−30 bp)
30 bp from the transcription start site (Figure 8-12). This
TATA sequence, called the TATA box, is the site of the first
event in transcription: the binding of the TATA-binding
protein (TBP). The TBP is part of the TFIID complex,
Start site one of the six GTFs. When bound to the TATA box, TBP
attracts other GTFs and the RNA polymerase II core to the
Binding of TBP and TFIID promoter, thus forming the PIC. After transcription has
been initiated, RNA polymerase II dissociates from most
of the GTFs to elongate the primary RNA transcript. Some
TFIID
of the GTFs remain at the promoter to attract the next
TBP
RNA polymerase core. In this way, multiple RNA poly-
TATA merase II enzymes can simultaneously synthesize tran-
scripts from a single gene.
How is the RNA polymerase II core able to separate
Formation of preinitiation complex from the GTFs and start transcription? Although the
details of this process are still being worked out, what is
known is that the β subunit of RNA polymerase II con-
TFIIF tains a protein tail, called the carboxy terminal domain
TFIIH (CTD), that plays a key role. The CTD is strategically
TATA located near the site at which nascent RNA will emerge
TFIII
from the polymerase. The initiation phase ends and the
elongation phase begins after the CTD has been phos-
phorylated by one of the GTFs. This phosphorylation is
RNA polymerase II thought to somehow weaken the connection of RNA
begins elongation polymerase II to the other proteins of the PIC and per-
mit elongation. The CTD also participates in several
other critical phases of RNA synthesis and processing.
TFIID
The CTD of eukaryotic RNA polymerase II plays a cen- Cotranscriptional processing of RNA
tral role in coordinating all processing events. The CTD is
composed of many repeats of a sequence of seven amino (a) Capping
acids. These repeats serve as binding sites for some of the
Transcription
enzymes and other proteins that are required for RNA cap-
start site
ping, splicing, and cleavage followed by polyadenylation.
The CTD is located near the site where nascent RNA emerges 5ʹ
3ʹ
from the polymerase, and so it is in an ideal place to orches- 3ʹ
5ʹ 3ʹ
trate the binding and release of proteins needed to process P
the nascent RNA transcript while RNA synthesis continues. P
P
P
In the various phases of processing, the amino acids of the P
P
Processing 5′ and 3′ ends Figure 8-13a depicts the process- P
P
ing of the 5′ end of the transcript of a protein-encoding gene. P
When the nascent RNA first emerges from RNA polymerase P
P
II, a special structure, called a cap, is added to the 5′ end by P
P
several proteins that interact with the CTD. The cap consists CTD
of a 7-methylguanosine residue linked to the transcript by RNA splicing 5ʹ
three phosphate groups. The cap has two functions. First, it machinery
protects the RNA from degradation in its long journey to the
site of translation. Second, as you will see in Chapter 9, the
cap is required for translation of the mRNA. (c) Cleavage and polyadenylation
RNA elongation continues until the conserved sequence
AAUAAA or AUUAAA is reached, marking the 3′ end of the 3ʹ
transcript. An enzyme recognizes that sequence and cuts off
P
the end of the RNA approximately 20 bases farther down. To P
P P
this cut end, a stretch of 150 to 200 adenine nucleotides P
Poly(A) tail added
called a poly(A) tail is added (see Figure 8-13c). Hence, the P P
AAUAAA sequence of the mRNA from protein-encoding
P
genes is called a polyadenylation signal. CTD
Introns are removed from the primary transcript while RNA is still being
transcribed and after the cap has been added but before the transcript is trans-
ported into the cytoplasm. The removal of introns and the joining of exons is
called splicing because it is reminiscent of the way in which videotape or movie
film can be cut and rejoined to delete a specific segment. Splicing brings together
the coding regions, or exons, so that the mRNA now contains a coding sequence
that is completely colinear with the protein that it encodes.
The number and size of introns vary from gene to gene and from species to spe-
cies. For example, only about 200 of the 6300 genes in yeast have introns, whereas
typical genes in mammals, including humans, have several. The average size of a
mammalian intron is about 2000 nucleotides, and the average exon is about
200 nucleotides; thus, a larger percentage of the DNA in mammals encodes introns
than exons. An extreme example is the human Duchenne muscular dystrophy
gene. This gene has 79 exons and 78 introns spread across 2.5 million base pairs.
When spliced together, its 79 exons produce an mRNA of 14,000 nucleotides, which
means that introns account for the vast majority of the 2.5 million base pairs.
Alternative splicing At this point, you might be wondering about the utility of
having genes organized into exons and introns. Recall that this chapter began with
a discussion of the number of genes in the human genome. This number (now
estimated at ~21,000 genes) is less than twice the number of genes in the round-
worm, yet the spectrum of human proteins (called the proteome; see Chapter 9)
is in excess of 70,000. That proteins so outnumber genes indicates that a gene can
encode the information for more than one protein. One way that a gene can
encode multiple proteins is through a process called alternative splicing. In this
process, different mRNAs and, subsequently, different proteins are produced
from the same primary transcript by splicing together different combinations of
Primary 1a 2a 2b 1b 3 4 5 6a 6b 7 8 9a 9b 9c 9d
pre-mRNA
transcript
A A A A A
Striated
muscle
Smooth
muscle
Brain
TMBr-1
Brain
TMBr-2
F i g u r e 8 -14 The pre-mRNA
Brain
transcript of the rat a-tropomyosin TMBr-3
gene is alternatively spliced in
Fibroblast
different cell types. The light green TM-2
boxes represent introns; the other
colors represent exons. Fibroblast
Polyadenylation signals are indicated TM-3
by an A. Dashed lines in the mature Fibroblast
mRNAs indicate regions that have TM-5a
been removed by splicing. TM = Fibroblast
tropomyosin. [ Data from J. P. Lees et al., TM-5b
Mol. Cell. Biol. 10, 1990, 1729 –1742.]
8.4 Intron Removal and Exon Splicing 3 07
exons. For reasons that are currently unknown, the proportion of alternatively
spliced genes varies from species to species. Although alternative splicing is rare
in plants, more than 70 percent of human genes are alternatively spliced. Many
mutations with serious consequences for the organism are due to splicing defects.
The consequences of alternative splicing on protein structure and function
will be presented later in the book. For now, suffice it to say that proteins pro-
duced by alternative splicing are usually related (because they usually contain
subsets of the same exons from the primary transcript) and that they are often
used in different cell types or at different stages of development. Figure 8-14
shows the myriad combinations produced by alternative splicing of the primary
RNA transcript of the a-tropomyosin gene. The mechanism of splicing is consid-
ered in the next section.
Intron
2′ A
HO
5′ O O 3′
− −
O P O O P O
5′ O O 3′
Exon 1 3′ 5′ Exon 2
First transesterification
5′ O A
2′
O P O
O− O 3′
−
O P O
5′ O H O 3′
3′ 5′
Second transesterification
O−
+ 5′ O P O 3′
5′ O A
2′ O
O P O Spliced exons
3′
O− OH
Excised lariat intron
O = 3′ oxygen of exon 1
O = 2′ oxygen of branch point A
O = 3′ oxygen of intron
been the genetic material in the first cells because only RNA is known to both
encode genetic information and catalyze biological reactions.
Processing
Nucleus
Cytoplasm
Dicer
Dicing
RISC
Strand separation
Repression of Targeting
translation Deadenylation
and degradation F i g u r e 8 -19 miRNAs are synthesized by
pol II as longer RNAs that are processed in
Coding sequence of gene AAAA
several steps to their mature form. Once fully
Ribosome processed, miRNAs bind to RISC and direct its
Repressed gene expression activities to reduce the expression of
(e.g., lin-14 or lin-28 ) complementary mRNAs by either repressing their
translation or promoting their degradation.
312 CHAPTER 8 RNA: Transcription and Processing
(a) Fire/Mello: injection of dsRNA (b) Jorgensen: insertion of transgene (c) Baulcombe: insertion of viral gene
1. unc-22 dsRNA synthesized in lab. 1. Transgene inserted into petunia cells. 1. Viral gene inserted into tobacco plant.
Antisense
Transgene Gene
dsRNA
Sense
Endogenous pigment gene
2. dsRNA injected into C. elegans embryos. 2. Adults grown from transformed cells 2. Plant exposed to virus but remains
have white sectors in flowers. healthy.
Micropipette
with dsRNA
solution
Conclusion: unc-22 gene silenced. Conclusion: transgene and Conclusion: viral gene silenced.
endogenous pigment gene silenced.
mRNA
Transgene Gene …
Endogenous
gene
Antisense of
transgene
F i g u r e 8 -2 2 The insertion of a transgene can
lead to the production of double-stranded RNA
(dsRNA) if the transgene is inserted at the end of
mRNA of transgene mRNA of endogenous gene a gene in the opposite orientation. The antisense
RNA produced when the neighboring gene is
OR transcribed can bind to the mRNA of either the
dsRNA dsRNA transgene itself or the endogenous gene to
produce dsRNA.
Viral gene
or transposon Gene
314 CHAPTER 8 RNA: Transcription and Processing
sense strand of the gene and the antisense strand of the transgene. Double-
siRNAs degrade mRNA from
stranded RNA will then form when the antisense part of the long RNA hybridizes
viral genes or transposons
with sense RNA produced by either the transgene or the endogenous gene.
Thus, dsRNA is a common feature of this form of gene silencing. However, the
function of this process is clearly not to shut off genes introduced by scientists.
Viral gene
or transposon Gene
What is the normal role of this form of gene silencing in the cell? An important
clue came from the experiments conducted by another plant scientist, David
Baulcombe, who was investigating the reason why tobacco plants that were engi-
dsRNA forms and
neered to express a viral gene were resistant to subsequent infection by the virus.
Dicer binds to it. In this experiment the viral gene is another example of a transgene, which was, in
this case, introduced into the tobacco genome (see Figure 8-20c). A key difference
Dicer between this and the petunia experiment was that tobacco plants do not normally
dsRNA
have a viral gene in their genome. So this experiment suggested that this form of
gene silencing functions to silence invading viruses.
Baulcombe and his co-workers found that the resistant plants, and only the
Dicer chops
up dsRNA.
resistant plants, produced large amounts of short RNAs, 25 nucleotides in length,
that were complementary to the viral genome. Significantly, short RNAs related to
the endogenous genes have also been found to be present during gene silencing in
siRNA the worm and in the petunia. The short RNAs generated during viral resistance and
gene silencing associated with either injected dsRNAs or transgenes are now col-
siRNA binds to
RISC and is lectively called small interfering RNAs (siRNAs). The phenomena that results in
separated into gene silencing and viral resistance through the production of siRNAs is called RNA
single strands. interference (RNAi). The short RNAs (21–31 nucleotides in length) are now clas-
RISC sified as one of three types depending on their biogenesis: miRNAs or siRNAs (both
21–25 nucleotides) or the recently discovered piwi-interacting RNAs (piRNAs,
24–31 nucleotides). Because the mechanism of piRNA synthesis is still under inves-
tigation, we will focus on the better characterized miRNAs and siRNAs.
Complementary
mRNA binds Similar mechanisms generate siRNA and miRNA
to RISC.
As we have seen, siRNAs can arise from an antisense copy of any source of mRNA
Target in the genome: from endogenous genes to transgenes to invading viruses. How-
mRNA
ever, the most likely source of antisense RNA is not an organism’s own genes, but
AA A rather foreign DNA that inserts into the genome. In this regard, it would be cor-
rect to think of siRNAs as the product of a genome immune system that detects
RISC degrades
the insertion of foreign DNA by, in some cases, promoting the synthesis of anti-
bound mRNA. sense mRNA. Complementarity between sense and antisense RNAs produces
dsRNAs, which, as in the miRNA pathway, are recognized by Dicer and cleaved
into short double-stranded products that are bound by RISC (Figure 8-23). As
with miRNAs, RISC unwinds the product into the biologically active single-
stranded siRNA that targets RISC to complementary mRNAs so they can be
degraded. Unlike miRNAs, complementarity between siRNAs and mRNAs is per-
F i g u r e 8 -2 3 In the RNA interference
pathway, double-stranded RNA (dsRNA) fect; there are no mismatches. This is because of their different origins: siRNAs
specifically interacts with the Dicer are derived from the same gene, whereas miRNAs come from a different gene.
complex, which chops the dsRNA up. This difference is probably responsible for the different outcomes: miRNAs direct
The RNA-induced silencing complex RISC to repress the translation of an mRNA or degrade mRNAs when they are
(RISC) uses the small dsRNAs to find and being translated, whereas siRNAs direct RISC to degrade the mRNA directly.
destroy homologous mRNA transcribed As discussed above, the production of siRNAs probably plays an important
from the target DNA, thereby repressing
role in viral defense. However, its most important role may be to protect the
gene expression.
hereditary material of an organism from genetic elements in its own genome. In
Chapter 15, you will learn about the transposable elements that constitute a huge
fraction of the genomes of multicellular eukaryotes, including humans. These ele-
ments can amplify themselves and move to new locations, creating an obvious
threat to the integrity of the genome. Just like the introduction of transgenes by
scientists, the movement of transposable elements into new chromosomal loca-
tions can trigger the production of siRNAs by generating dsRNA. The siRNAs
Summary 315
s u m m a ry
We know that information is not transferred directly from the phases of initiation, elongation, and termination of RNA
DNA to protein, because, in a eukaryotic cell, DNA is in the synthesis in eukaryotes resemble those in prokaryotes.
nucleus, whereas protein is synthesized in the cytoplasm. However, there are important differences. RNA polymerase
Information transfer from DNA to protein requires an inter- II does not bind directly to promoter DNA, but rather to
mediate. That intermediate is RNA. GTFs, one of which recognizes the TATA sequence in most
Although DNA and RNA are nucleic acids, RNA differs eukaryotic promoters. RNA polymerase II is a much larger
from DNA in that (1) it is usually single stranded rather molecule than its prokaryotic counterpart. It contains numer-
than a double helix, (2) its nucleotides contain the sugar ous subunits that function not only to elongate the primary
ribose rather than deoxyribose, (3) it has the pyrimidine RNA transcript, but also to coordinate the extensive process-
base uracil rather than thymine, and (4) it can serve as a ing events that are necessary to produce the mature mRNA.
biological catalyst. These processing events are 5 ′ capping, intron removal and
The similarity of RNA to DNA suggests that the flow of exon joining by spliceosomes, and 3 ′ cleavage followed by
information from DNA to RNA relies on the complementar- polyadenylation. Part of the RNA polymerase II core, the car-
ity of bases, which is also the key to DNA replication. A tem- boxy terminal domain (CTD), is positioned ideally to interact
plate DNA strand is copied, or transcribed, into either a with the nascent RNA as it emerges from polymerase.
functional RNA (such as transfer RNA or ribosomal RNA), Through the CTD, RNA polymerase II coordinates the
which is never translated into polypeptides, or a messenger numerous events of RNA synthesis and processing.
RNA, from which proteins are synthesized. Discoveries of the past 20 years have revealed the impor-
In prokaryotes, all classes of RNA are transcribed by a tance of new classes of functional RNAs. Once thought to be
single RNA polymerase. This multisubunit enzyme initiates a lowly messenger, RNA is now recognized as a versatile and
transcription by binding to the DNA at promoters that con- dynamic participant in many cellular processes. The discov-
tain specific sequences at −35 and −10 bases before the tran- ery of self-splicing introns demonstrated that RNA can
scription start site at +1. After being bound, RNA polymerase function as a catalyst, much like proteins. Since the discov-
locally unwinds the DNA and begins incorporating ribonu- ery of these ribozymes, the scientific community has begun
cleotides that are complementary to the template DNA to pay more attention to RNA. Small nuclear RNAs, the
strand. The chain grows in the 5′-to-3′ direction until one of noncoding RNAs in the spliceosome, are now recognized to
two mechanisms, intrinsic or rho dependent, leads to the dis- provide the catalytic activity to remove introns and join
sociation of the polymerase and the RNA from the DNA tem- exons. The twentieth century ended with the discovery
plate. As we will see in Chapter 9, in the absence of a nucleus, that two other classes of functional RNA, miRNA and
prokaryotic RNAs that encode proteins are translated while siRNA, associate with RNA-induced silencing complexes
they are being transcribed. (RISC) and target complementary cellular mRNA for
In eukaryotes, there are three different RNA polymer- repression (in the case of miRNA) or for destruction (in the
ases; only RNA polymerase II transcribes mRNAs. Overall, case of siRNA).
key terms
alternative splicing (p. 306) coding strand (p. 298) Dicer (p. 310)
antisense RNA strand (p. 312) consensus sequence (p. 299) double-stranded RNA (dsRNA)
cap (p. 305) constitutive (p. 296) (p. 310)
carboxy terminal domain (CTD) cosuppression (p. 313) downstream (p. 298)
(p. 304) cotranscriptional processing (p. 304) elongation (p. 298)
316 CHAPTER 8 RNA: Transcription and Processing
endogenous gene (p. 313) post-transcriptional processing small interfering RNA (siRNA)
exon (p. 292) (p. 304) (pp. 296, 314)
functional RNA (p. 294) preinitiation complex (PIC) (p. 303) small nuclear RNA (snRNA) (p. 296)
gene expression (p. 294) primary transcript (pre-mRNA) spliceosome (p. 292)
general transcription factor (GTF) (p. 303) splicing (p. 306)
(p. 302) promoter (p. 298) TATA-binding protein (TBP)
gene silencing (p. 311) proteome (p. 306) (p. 304)
genetically modified organism (GMO) pulse–chase experiment (p. 293) TATA box (p. 304)
(p. 312) ribose (p. 294) template (p. 296)
GU–AG rule (p. 307) ribosomal RNA (rRNA) (p. 295) termination (p. 298)
initiation (p. 298) ribozyme (p. 294) transcript (p. 296)
intron (p. 292) RISC (RNA-induced silencing transcription (p. 296)
long noncoding RNAs (lncRNAs) complex) (p. 310) transcription bubble (p. 300)
(p. 296) RNA interference (RNAi) (p. 314) transfer RNA (tRNA) (p. 295)
messenger RNA (mRNA) (p. 294) RNA polymerase (p. 297) transgene (p. 312)
microRNA (miRNA) (p. 296) RNA polymerase holoenzyme (p. 299) 3′ untranslated region (3′ UTR)
non-protein-coding RNA (ncRNA) RNA processing (p. 303) (p. 300)
(p. 292) RNA splicing (p. 292) 5′ untranslated region (5′ UTR)
piwi-interacting RNAs (piRNAs) RNA world (p. 308) (p. 299)
(pp. 296, 314) self-splicing intron (p. 308) upstream (p. 298)
poly(A) tail (p. 305) sigma factor (σ) (p. 299) uracil (U) (p. 294)
problems
Most of the problems are also available for review/grading through the http://www.whfreeman.com/
launchpad/iga11e.
Working with the Figures 10. In Figure 8-22, show how the double-stranded RNA is
1. In Figure 8-3, why are the arrows for genes 1 and 2 point- able to silence the transgene. What would have to hap-
ing in opposite directions? pen for the transgene to also silence the flanking cellular
gene (in yellow)?
2. In Figure 8-5, draw the “one gene” at much higher reso-
lution with the following components: DNA, RNA poly-
Basic Problems
merase(s), RNA(s).
11. In prokaryotes and eukaryotes, describe what else is hap-
3. In Figure 8-6, describe where the gene promoter is
pening to the RNA while RNA polymerase is synthesiz-
located.
ing a transcript from the DNA template.
4. In Figure 8-9b, write a sequence that could form the
12. List three examples of proteins that act on nucleic acids
hairpin loop structure.
during transcription
5. How do you know that the events in Figure 8-13 are oc-
13. What is the primary function of the sigma factor? Is there
curring in the nucleus?
a protein in eukaryotes analogous to the sigma factor?
6. In Figure 8-15, what do you think would be the effect of
14. You have identified a mutation in yeast, a unicellular eu-
a G to A mutation in the first G residue of the intron?
karyote, that prevents the capping of the 5 ′ end of the
7. By comparing Figures 8-16 and 8-17, evaluate what is/are RNA transcript. However, much to your surprise, all the
the function(s) of proteins U1–U6. enzymes required for capping are normal. You deter-
8. By comparing Figures 8-16 and 8-17 with Figure 8-18, mine that the mutation is, instead, in one of the subunits
speculate what features of RNA permit self-splicing (that of RNA polymerase II. Which subunit is mutant, and
is, in the absence of proteins). how does this mutation result in failure to add a cap to
9. In Figure 8-20, three very different situations are shown yeast RNA?
that all result in gene silencing. What do these situations 15. Why is RNA produced only from the template DNA
have in common to make this is possible? strand and not from both strands?
Problems 317
16. A linear plasmid contains only two genes, which are 22. Draw a two-intron eukaryotic gene and its pre-mRNA
transcribed in opposite directions, each one from the and mRNA products. Be sure to include all the features
end, toward the center of the plasmid. Draw diagrams of of the prokaryotic gene included in your answer to
a. the plasmid DNA, showing the 5′ and 3′ ends of the Problem 19, plus the processing events required to pro-
nucleotide strands. duce the mRNA.
b. the template strand for each gene. 23. A certain Drosophila protein-encoding gene has one in-
tron. If a large sample of null alleles of this gene is exam-
c. the positions of the transcription-initiation sites.
ined, will any of the mutant sites be expected
d. the transcripts, showing the 5′ and 3′ ends. a. in the exons?
17. Are there similarities between the DNA replication bub- b. in the intron?
bles and the transcription bubbles found in eukaryotes?
c. in the promoter?
Explain.
d. in the intron–exon boundary?
18. Which of the following statements are true about eukary-
otic mRNA? 24. What are self-splicing introns, and why does their exis-
tence support the theory that RNA evolved before
a. The sigma factor is essential for the correct initiation protein?
of transcription.
25. Antibiotics are drugs that selectively kill bacteria with-
b. Processing of the nascent mRNA may begin before out harming animals. Many antibiotics act by selectively
its transcription is complete. binding to certain proteins that are critical for bacterial
c. Processing takes place in the cytoplasm. function. Explain why some of the most successful anti-
d. Termination is accomplished by the use of a hairpin biotics target bacterial RNA polymerase.
loop or the use of the rho factor. 26. Describe four types of RNA that perform distinct
e. Many RNAs can be transcribed simultaneously from functions.
one DNA template.
C h a ll e n g i n g P r o b l e m s
19. A researcher was mutating prokaryotic cells by inserting
segments of DNA. In this way, she made the following 27. The following data represent the base compositions of
mutation: double-stranded DNA from two different bacterial spe-
cies and their RNA products obtained in experiments
Original TTGACAT 15 to 17 bp TATAAT conducted in vitro:
Mutant TATAAT 15 to 17 bp TTGACAT
Species (A + T) (A + U) (A + G)
a. What does this sequence represent? (G + C) (G + C) (U + C)
b. What do you predict will be the effect of such a Bacillus subtilis 1.36 1.30 1.02
mutation? Explain. E. coli 1.00 0.98 0.80
20. You will learn more about genetic engineering in Chapter
10, but for now, put on your genetic engineer’s cap and a. From these data, determine whether the RNA of
try to solve this problem. E. coli is widely used in labora- these species is copied from a single strand or from both
tories to produce proteins from other organisms. strands of the DNA. Draw a diagram to show how you
a. You have isolated a yeast gene that encodes a meta- solve this problem.
bolic enzyme and want to produce this enzyme in E. coli. b. How can you tell if the RNA itself is single stranded or
You suspect that the yeast promoter will not work in double stranded?
E. coli. Why?
b. After replacing the yeast promoter with an E. coli 28. A human gene was initially identified as having three ex-
promoter, you are pleased to detect RNA from the yeast ons and two introns. The exons are 456, 224, and 524 bp,
gene but are confused because the RNA is almost twice whereas the introns are 2.3 kb and 4.6 kb.
the length of the mRNA from this gene isolated from a. Draw this gene, showing the promoter, introns, exons,
yeast. Explain why this result might have occurred. and transcription start and stop sites.
21. Draw a prokaryotic gene and its RNA product. Be sure to b. Surprisingly, this gene is found to encode not one
include the promoter, transcription start site, transcrip- but two mRNAs that have only 224 nucleotides in
tion termination site, untranslated regions, and labeled common. The original mRNA is 1204 nucleotides, and
5 ′ and 3 ′ ends. the new mRNA is 2524 nucleotides. Use your drawing
318 CHAPTER 8 RNA: Transcription and Processing
to show how this one region of DNA can encode these the transgene in their chromosomes. Draw a picture to
two transcripts. help him understand.
29. While working in your laboratory, you isolate an mRNA 31. Many human cancers result when a normal gene mu-
from C. elegans that you suspect is essential for embryos tates and leads to uncontrolled growth (a tumor). Genes
to develop successfully. With the assumption that you that cause cancer when they mutate are called onco-
are able to turn mRNA into double-stranded RNA, de- genes. Chemotherapy is effective against many tumors
sign an experiment to test your hypothesis. because it targets rapidly dividing cells and kills them.
Unfortunately, chemotherapy has many side effects,
30. Glyphosate is an herbicide used to kill weeds. It is the
such as hair loss or nausea, because it also kills many of
main component of a product made by the Monsanto
our normal cells that are rapidly dividing, such as those
Company called Roundup. Glyphosate kills plants by in-
in the hair follicles or stomach lining.
hibiting an enzyme in the shikimate pathway called
Many scientists and large pharmaceutical companies
EPSPS. This herbicide is considered safe because animals
are excited about the prospects of exploiting the
do not have the shikimate pathway. To sell even more of
RNAi pathway to selectively inhibit oncogenes in life-
their herbicide, Monsanto commissioned its plant geneti-
threatening tumors. Explain in very general terms how
cists to engineer several crop plants, including corn, to be
gene-silencing therapy might work to treat cancer and
resistant to glyphosate. To do so, the scientists had to in-
why this type of therapy would have fewer side effects
troduce an EPSPS enzyme that was resistant to inhibition
than chemotherapy.
by glyphosate into crop plants and then test the trans-
formed plants for resistance to the herbicide. 32. Would you expect self-splicing introns to be on average
Imagine that you are one of these scientists and that longer or shorter than introns spliced by spliceosomes?
you have managed to successfully introduce the re- Justify your answer.
sistant EPSPS gene into the corn chromosomes. You 33. A scientist who inserted a plant gene into human chro-
find that some of the transgenic plants are resistant to mosomes was not able to detect any transcription from
the herbicide, whereas others are not. Your supervisor the plant gene. Propose an explanation based on what
is very upset and demands an explanation of why some you have learned about transcription. Now devise an ex-
of the plants are not resistant even though they have periment to test your hypothesis.
344
9
C h a p t e r
outline
9.4 Ribosomes
319
320 CHA P TER 9 Proteins and Their Synthesis
I
n an address to Congress in 1969, William Stewart, Surgeon General of the United
States, said, “It is time to close the book on infectious diseases. The war against
pestilence is over.” At the time, his claim of victory was not an unreasonable
boast. In the preceding two decades, three infectious diseases that had plagued
humankind for centuries—polio, smallpox, and tuberculosis—had been virtually
eliminated throughout the world. A major contributing factor to the eradication of
tuberculosis and some other infectious diseases was the discovery and widespread
use of antibiotics, a diverse group of chemical compounds that kill specific bacte-
rial pathogens without harming the animal host. Antibiotics such as penicillin, tet-
racycline, ampicillin, and chloramphenicol, to name but a few, have saved hundreds
of millions of lives.
Unfortunately, William Stewart’s claim of victory in the battle against infec-
tious disease was premature. The overuse of antibiotics worldwide has spurred
the evolution of resistant bacterial strains. For example, each year, more than
2 million hospital patients in the United States acquire an infection that is resis-
tant to antibiotics and 90,000 die as a result. How did resistance develop so
quickly? Will infectious disease be, once again, a significant cause of human mor-
tality? Or will scientists be able to use their understanding of resistance mecha-
nisms to develop more durable antibiotics?
To answer these questions, scientists have focused on the cellular machinery
that is targeted by antibiotics. More than half of all antibiotics currently in use
target the bacterial ribosome, the site of protein synthesis in prokaryotes. In this
chapter, you will learn that scientists have had incredible success with the use of
a technique called X-ray crystallography to visualize the ribosomal RNAs (rRNAs)
and the ~50 proteins that make up the large and small ribosomal subunits of bac-
terial ribosomes. Although the ribosomes of prokaryotes and eukaryotes are very
similar, there are still subtle differences. Because of these differences, antibiotics
are able to target bacterial ribosomes but leave eukaryotic ribosomes untouched.
Using X-ray crystallography, scientists have also succeeded in visualizing antibiot-
ics bound to the ribosome (Figure 9-1). From these studies, they have determined
that mutations in bacterial rRNA and/or ribosomal proteins are responsible for
antibiotic resistance. With this knowledge of the points of contact between cer-
tain antibiotics and the ribosome, drug designers are attempting to design a new
generation of antibiotics that, for example, will be able to bind to multiple nearby
sites. Resistance to such a drug would be less likely to evolve because it would
require the occurrence of two mutations, which is a very unlikely event even for
bacteria.
Chapters 7 and 8 described how DNA is copied from generation to generation
and how RNA is synthesized from specific regions of DNA. We can think of these
processes as two stages of information transfer: replication (the synthesis of DNA)
and transcription (the synthesis of an RNA copy of a part of the DNA). In this
chapter, you will learn about the final stage of information transfer: translation
(the synthesis of a polypeptide directed by the RNA sequence).
As you learned in Chapter 8, RNA transcribed from genes is classified as either
messenger RNA (mRNA) or functional RNA. In this chapter, we will see the fate
of both RNA classes. The vast majority of genes encode mRNAs whose function is
to serve as an intermediate in the synthesis of the ultimate gene product, protein.
In contrast, recall that functional RNAs are active as RNAs; they are never trans-
lated into proteins. The main classes of functional RNAs are important actors in
protein synthesis. They include transfer RNAs and ribosomal RNAs.
• Transfer RNA (tRNA) molecules are the adapters that translate the three-
nucleotide codon in the mRNA into the corresponding amino acid, which is
brought by the tRNA to the ribosome in the process of translation. The tRNAs
are general components of the translation machinery; a tRNA molecule can
bring an amino acid to the ribosome for the purpose of translating any mRNA.
Proteins and Their Synthesis 321
Although most genes encode mRNAs, functional RNAs make up, by far, the
largest fraction of total cellular RNA. In a typical actively dividing eukaryotic cell,
rRNA and tRNA account for almost 95 percent of the total RNA, whereas mRNA
accounts for only about 5 percent. Two factors explain the abundance of rRNAs
and tRNAs. First, they are much more stable than mRNAs, and so these molecules
remain intact much longer. Second, because an actively dividing eukaryotic cell
has tens of thousands of ribosomes, the transcription of rRNA and tRNA genes
constitutes more than half of the total nuclear transcription in active eukaryotic
cells and almost 80 percent of transcription in yeast cells.
The components of the translational machinery and the process of translation
are very similar in prokaryotes and eukaryotes. The major feature that distin-
guishes translation in prokaryotes from that in eukaryotes is the location where
transcription and translation take place in the cell: the two processes take place in
the same compartment in prokaryotes, whereas they are physically separated in
eukaryotes by the nuclear membrane. After extensive processing, eukaryotic
mRNAs are exported from the nucleus for translation on ribosomes that reside in
the cytoplasm. In contrast, transcription and translation are coupled in prokary-
otes: translation of an RNA begins at its 5′ end while the rest of the mRNA is still
being transcribed.
322 CHA P TER 9 Proteins and Their Synthesis
H2N C COOH
R
All amino acids have two functional groups (the carboxyl and amino, shown
above) bonded to the same carbon atom (called the α carbon). Also attached to
the α carbon are an H atom and a side chain, or R (reactive) group. There are 20
amino acids known to exist in proteins, each having a different R group that gives
the amino acid its unique properties. The side chain can be anything from a
(a) aa1 aa aa3
hydrogen atom (as in the amino acid glycine) to a complex2 ring (as in the amino
acid tryptophan). In proteins, the Hamino
R1 acids are linked
H R2 together by H covalent
R3
H R1 H R2 H R3
O
H N C C OH H N C C OH H N C C OH
1.24
H O H O H O H R
Introduction to Genetic Analysis, 11e
Figure 09UN1 #927
C 1. C
05/12/14 1 32 6
Amino DragonflyCarboxyl
Media Group 1.5 1.4
H R1 H R2 H R3
end end
C
H N C C N C C N C C OH + 2(H2O) N
H O H O H O
R H
aa1 aa2 aa3
Peptide Peptide H
bond bond
u r e 9 -2 (a) A polypeptide is formed by the removal of water between amino acids to form peptide bonds. Each aa indicates an
F i g(b)
Peptide bond
amino acid. R1, R2, and R3 represent R groups (side chains) that differentiate the amino acids. (b) The peptide bond is a rigid planar unit
with the R groups projecting O out from the C–N backbone. Standard bond distances (in angstroms) are shown.
ANIMATED ART: Translation: peptide-bond formation
1.24
H R
C 1. C
1 32 6
1.5 1.4
9.1 Protein Structure 323
C H O H O H O H O
C C C C
C N C N C N C N C N C N C N C N
O C H C H C C
O O H O H
R R R R
R R R R
H O H O H O H O
C N C C N C C N C C N C C
C N C N C N C N
C C C C
O H O H O H O H
R R R R
Pleated sheet
Heme group
β polypeptide α
α
Figure 9-3 A protein can have four levels of structure. (a) Primary structure. The sequence of amino acids
defined by their R groups. (b) Secondary structure. The polypeptide can form a helical structure (an a helix) or a
zigzag structure (a b-pleated sheet). The b-pleated sheet has two polypeptide segments arranged in opposite
polarity, as indicated by the arrows. (c) Tertiary structure. The heme group is a nonprotein ring structure with an
iron atom at its center. (d) Quaternary structure illustrated by hemoglobin, which is composed of four polypeptide
subunits: two a subunits and two b subunits.
324 CHA P TER 9 Proteins and Their Synthesis
fold into specific shapes, called the protein’s secondary structure. Each shape arises
from the bonding forces between amino acids that are close together in the linear
sequence. These forces include several types of weak bonds, notably hydrogen
bonds, electrostatic forces, and van der Waals forces. The most common secondary
structures are the α helix and the b-pleated sheet. Different proteins show either one
or the other or sometimes both within their structures. Tertiary structure is pro-
duced by the folding of the secondary structure. Some proteins have quaternary
structure: such a protein is composed of two or more separate folded polypep-
tides, also called subunits, joined by weak bonds. The quaternary association
can be between different types of polypeptides (resulting in a heterodimer if
there are two subunits) or between identical polypeptides (making a homodi-
mer). Hemoglobin is an example of a heterotetramer, a four-subunit protein; it is
composed of two copies each of two different polypeptides, shown in green and
purple in Figure 9-3d.
Many proteins are compact structures; they are called globular proteins.
Enzymes and antibodies are among the best-known globular proteins. Proteins
with linear shape, called fibrous proteins, are important components of such
structures as skin, hair, and tendons.
Shape is all-important to a protein because a protein’s specific shape enables it
to do its specific job in the cell. A protein’s shape is determined by its primary amino
acid sequence and by conditions in the cell that promote the folding and bonding
necessary to form higher-level structures. The folding of proteins into their correct
conformation will be discussed at the end of this chapter. The amino acid sequence
also determines which R groups are present at specific positions and thus available
to bind with other cellular components. The active sites of enzymes are good illus-
trations of the precise interactions of R groups. Each enzyme has a pocket called the
active site into which its substrate or substrates can fit. Within the active site, the
R groups of certain amino acids are strategically positioned to interact with a sub-
strate and catalyze a specific chemical reaction.
At present, the rules by which primary structure is converted into higher-level
structure are imperfectly understood. However, from knowledge of the primary
amino acid sequence of a protein, the functions of specific regions can be pre-
dicted. For example, some characteristic protein sequences are the contact points
with membrane phospholipids that position a protein in a membrane. Other char-
acteristic sequences act to bind the protein to DNA. Amino acid sequences or
protein folds that are associated with particular functions are called domains. A
protein may contain one or more separate domains.
grow on an E. coli K host. Mutations causing this rII phenotype were induced by
using a chemical called proflavin, which was thought to act by the addition or
deletion of single nucleotide pairs in DNA. (This assumption is based on experi-
mental evidence not presented here.) The following examples illustrate the action
of proflavin on double-stranded DNA.
–ATCTAGTCT–
rtion –TAGATCAGA–
inse
–ATCTGTCT–
–TAGACAGA–
del
etio
n –ATCTGTT–
–TAGACAA–
“Reversion”
Double mutant
rII mutant plaques wild-type plaques
The insertion suppresses the effect of the deletion by restoring most of the
sense of the sentence. By itself, however, the insertion also disrupts the sentence:
THE FAT CAT AAT ETH EBI GRA T
9.2 The Genetic Code 327
If we assume that the FCO mutant is caused by an addition, then the second
(suppressor) mutation would have to be a deletion because, as we have seen, only
a deletion would restore the reading frame of the resulting message (a second
insertion would not correct the frame). In the following diagrams, we use a hypo-
thetical nucleotide chain to represent RNA for simplicity. We also assume that the
code words are three letters long and are read in one direction (from left to right
in our diagrams).
1. Wild-type message
CAU CAU CAU CAU CAU
2. rIIa message: Words after the addition are changed (×) by frameshift muta-
tion (words marked ✓ are unaffected).
Addition
CAU ACA UCA UCA UCA U
✓
3. rIIarIIb message: Few words are wrong, but reading frame is restored for later
words.
Deletion
CAU ACA UCU CAU CAU
✓ ✓ ✓
The few wrong words in the suppressed genotype could account for the fact
that the “revertants” (suppressed phenotypes) that Crick and his associates recov-
ered did not look exactly like the true wild types in phenotype.
We have assumed here that the original frameshift mutation was an addition,
but the explanation works just as well if we assume that the original FCO muta-
tion is a deletion and the suppressor is an addition. You might want to verify it on
your own. Very interestingly, combinations of three additions or three deletions
have been shown to act together to restore a wild-type phenotype. This observa-
tion provided the first experimental confirmation that a word in the genetic code
consists of three successive nucleotides, or a triplet. The reason is that three addi-
tions or three deletions within a gene automatically restore the reading frame in
the mRNA if the words are triplets.
–Phe–Phe–Phe–Phe–Phe –Phe–
Third letter
First letter
found that the head protein of each mutant was a shorter CUG CCG CAG
Gln
CGG G
polypeptide chain than that of the wild type. Brenner exam-
AUU ACU AAU AGU U
ined the ends of the shortened proteins and compared them AUC Ile ACC AAC Asn AGC Ser C
with the wild-type protein. For each mutant, he recorded A
AUA ACA
Thr
AAA AGA A
the next amino acid that would have been inserted to con- AUG Met ACG AAG Lys AGG Arg G
tinue the wild-type chain. The amino acids for the six muta- GUU GCU GAU GGU U
tions were glutamine, lysine, glutamic acid, tyrosine, GUC GCC GAC Asp GGC C
G Val Ala Gly
tryptophan, and serine. These results present no immedi- GUA GCA GAA GGA A
Glu
ately obvious pattern, but Brenner deduced that certain GUG GCG GAG GGG G
codons for each of these amino acids are similar. Specifically,
each of these codons can mutate to the codon UAG by a F i g u r e 9 - 5 The genetic code
single change in a DNA nucleotide pair. He therefore postulated that UAG is a designates the amino acids specified by
stop (termination) codon—a signal to the translation mechanism that the protein each codon.
is now complete.
UAG was the first stop codon deciphered; it is called the amber codon (amber is
the English translation of the last name of the codon’s discoverer, Bernstein).
Mutants that are defective owing to the presence of an abnormal amber codon are
called amber mutants. Two other stop codons are UGA and UAA. Analogously to the
amber codon, and continuing the theme of naming for colors and gems, UGA is
called the opal codon and UAA is called the ochre codon. Mutants that are defective
because they contain abnormal opal or ochre codons are called opal and ochre
mutants, respectively. Stop codons are often called nonsense codons because they
designate no amino acid.
In addition to a shorter head protein, Brenner’s phage mutants had another
interesting feature in common: the presence of a suppressor mutation (su—) in the
host chromosome would cause the phage to develop a head protein of normal (wild-
type) chain length despite the presence of the m mutation. We will consider stop
codons and their suppressors further after we have dealt with the process of protein
synthesis.
1958.
3 3 0 CHA P TER 9 Proteins and Their Synthesis
(a) 3′ (b)
OH 5′
Amino acid A
attachment site 3′
C
C
A Amino acid
C G 5′ p attachment
C G
U G
G C
C G
U U
U C G
A U G
C C G G A U
G G C G U A C
C G G C CU
T C G C G A G
C
G DHU loop
G G A G C
A U
G C UH2
G C Anticodon loop
G C
Anticodon loop
ψ U
ml U
3′ CC G IU 5′
Anticodon
G C A
5′ Codon for alanine 3′ mRNA
20 amino acids. Each amino acid has a specific synthetase that links it only to An aminoacyl-tRNA
those tRNAs that recognize the codons for that particular amino acid. To catalyze synthetase attaches
this reaction, synthetases have two binding sites, one for the amino acid and the an amino acid to its tRNA
other for its cognate tRNA (Figure 9-7). An amino acid is attached at the free 3 ′
end of its tRNA, the amino acid alanine in the case shown in Figures 9-6a and 9-7. ATP
AMP + PPi
The tRNA with an attached amino acid is said to be charged. 3′
OH
A tRNA normally exists as an L-shaped folded cloverleaf, as shown in Figure 9-6b,
rather than the “flattened” cloverleaf shown in Figure 9-6a. The three-dimensional H H O
5′
structure of tRNA was determined with the use of X-ray crystallography. In the years N C C
since this technique was used to deduce the double-helical structure of DNA, it has H OH
CH3
been refined so that it can now be used to determine the structure of very complex
macromolecules such as the ribosome. Although tRNAs differ in their primary
nucleotide sequence, all tRNAs fold into virtually the same L-shaped conformation
except for differences in the anticodon loop and aminoacyl end. This similarity of C G U
structure can be easily seen in Figure 9-8, which shows two different tRNAs superim-
posed. Conservation of structure tells us that shape is important for tRNA function.
What would happen if the wrong amino acid were covalently attached to a Binding site Binding site
tRNA? A convincing experiment answered this question. The experiment used for tRNAAla for alanine
cysteinyl-tRNA (tRNACys), the tRNA specific for cysteine. This tRNA was “charged” Aminoacyl-tRNA synthetase
with cysteine, meaning that cysteine was attached to the tRNA. The charged specific for alanine
tRNA was treated with nickel hydride, which converted the cysteine (while still
F i g u r e 9 -7 Each aminoacyl-tRNA
bound to tRNACys) into another amino acid, alanine, without affecting the tRNA:
synthetase has binding pockets for a
nickel hydride specific amino acid and its cognate tRNA.
cysteinetRNACys alanine tRNACys By this means, an amino acid is covalently
attached to the tRNA with the
Protein synthesized with this hybrid species had alanine wherever we would
corresponding anticodon.
expect cysteine. The experiment demonstrated that the amino acids are “illiter-
ate”; they are inserted at the proper position because the tRNA “adapters” recog-
nize the mRNA codons and insert their attached amino acids appropriately. Thus,
the attachment of the correct amino acid to its cognate tRNA is a critical step in Two superimposed tRNAs
ensuring that a protein is synthesized correctly. If the wrong amino acid is
attached, there is no way to prevent it from being incorporated into a growing
protein chain.
Degeneracy revisited
As can be seen in Figure 9-5, the number of codons for a single amino acid varies,
ranging from one codon (UGG for tryptophan) to as many as six (UCC, UCU, UCA,
UCG, AGC, or AGU for serine). Why the genetic code contains this variation is not
exactly clear, but two facts account for it:
1. Most amino acids can be brought to the ribosome by several alternative tRNA
types. Each type has a different anticodon that base-pairs with a different codon
in the mRNA.
2. Certain charged tRNA species can bring their specific amino acids to any one Anticodon
of several codons. These tRNAs recognize and bind to several alternative codons,
not just the one with a complementary sequence, through a loose kind of base Figure 9-8 When folded into their correct
pairing at the 3 ′ end of the codon and the 5 ′ end of the anticodon. This loose pair- three-dimensional structures, the yeast
Introduction to Genetic Analysis, 11e
wobble.
ing is calledFigure tRNA for glutamine (blue) almost completely
09UN7 #933
overlaps the yeast tRNA for phenylalanine
05/12/14
(red) except for the anticodon loop and
Wobble Dragonfly
is a situation
Media in which the third nucleotide of an anticodon (at the
Group
aminoacyl end. [ Data from M. A. Rould, J. J.
5 ′ end) can form either of two alignments (Figure 9-9). This third nucleotide can Perona, D. Soll, and T. A. Steitz, “Structure of
form hydrogen bonds either with its normal complementary nucleotide in the E. coli Glutaminyl–tRNA Synthetase Complexed
third position of the codon or with a different nucleotide in that position. “Wobble with tRNA(Gln) and ATP at 2.8 Å Resolution,”
rules” dictate which nucleotides can and cannot form hydrogen bonds with Science 246, 1989, 1135–1142.]
3 32 CHA P TER 9 Proteins and Their Synthesis
tRNA anticodon
loop
Anticodon Anticodon Wobble
position
3´ A G G 5´ 3´ A G G 5´
U C C U C U
5´ Codon 3´ mRNA 5´ Codon 3´ mRNA
Cytosine Uracil
H O
N
N N
mRNA H H
N N
O Guanine O Guanine
H O
H
O
N N tRNA N N
H H
N N N N
N N
H H
Table 9-1 Codon–Anticodon alternative nucleotides through wobble (Table 9-1). In Table 9-1, the letter I stands
Pairings Allowed by for inosine, one of the rare bases found in tRNA, often in the anticodon.
the Wobble Rules
K e y C o nce p t The genetic code is said to be degenerate because, in many
5 end of
′ 3′ end of cases, more than one codon is assigned to a single amino acid; in addition, several
anticodon codon codons can pair with more than one anticodon (wobble).
G C or U
C G only
A U only
9.4 Ribosomes
U A or G Protein synthesis takes place when tRNA and mRNA molecules associate with
I U, C, or A ribosomes. The task of the tRNAs and the ribosome is to translate the sequence of
nucleotide codons in mRNA into the sequence of amino acids in protein. The
term biological machine was used in preceding chapters to characterize multisub-
unit complexes that perform cellular functions. The replisome, for example, is a
biological machine that can replicate DNA with precision and speed. The site of
protein synthesis, the ribosome, is much larger and more complex than the
machines described thus far. Its complexity is due to the fact that it has to perform
several jobs with precision and speed. For this reason, it is better to think of the
ribosome as a factory containing many machines that act in concert. Let’s see how
this factory is organized to perform its numerous functions.
In all organisms, the ribosome consists of one small and one large subunit,
each made up of RNA (called ribosomal RNA or rRNA) and protein. Each subunit
is composed of one to three rRNA types and as many as 50 proteins. Ribosomal
subunits were originally characterized by their rate of sedimentation when spun
in an ultracentrifuge, and so their names are derived from their sedimentation
coefficients in Svedberg (S) units, which is an indication of molecular size. In
prokaryotes, the small and large subunits are called 30S and 50S, respectively, and
9.4 Ribosomes 3 3 3
28S rRNA
23S rRNA
16S rRNA 5.8S rRNA 18S rRNA
5S rRNA
5S rRNA
31 proteins 21 proteins 49 proteins 33 proteins
F i g u r e 9 -10 A ribosome contains a large and a small subunit. Each subunit contains
both rRNA of varying lengths and a set of proteins. There are two principal rRNA molecules
in all ribosomes. Prokaryotic ribosomes also contain one 120-base-long rRNA that
sediments at 5S, whereas eukaryotic ribosomes have two small rRNAs: a 5S RNA molecule
similar to the prokaryotic 5S and a 5.8S molecule 160 bases long.
they associate to form a 70S particle (Figure 9-10a). The eukaryotic counterparts rRNA folds up by
are called 40S and 60S, and the complete ribosome is called 80S (Figure 9-10b). intramolecular base pairing
Although eukaryotic ribosomes are bigger owing to their larger and more numer-
ous components, the components and the steps in protein synthesis are similar
overall. The similarities clearly indicate that translation is an ancient process that
originated in the common ancestor of eukaryotes and prokaryotes.
When ribosomes were first studied, the fact that almost two-thirds of their
mass is RNA and only one-third is protein was surprising. For decades, rRNAs had
been assumed to function as the scaffold or framework necessary for the correct
assembly of the ribosomal proteins. That role seemed logical because rRNAs fold
up by intramolecular base pairing into stable secondary structures (Figure 9-11). 5′
According to this model, the ribosomal proteins were solely responsible for carry- 3′
ing out the important steps in protein synthesis. This view changed with the dis-
covery in the 1980s of catalytic RNAs (see Chapter 8). As you will see, scientists
now believe that the rRNAs, assisted by the ribosomal proteins, carry out most of
the important steps in protein synthesis.
Ribosome features
The ribosome brings together the other important players in protein synthesis— F i g u r e 9 -11 The folded structure of
tRNA and mRNA molecules—to translate the nucleotide sequence of an mRNA the prokaryotic 16S ribosomal RNA of the
into the amino acid sequence of a protein. The tRNA and mRNA molecules are small ribosomal subunit.
3 3 4 CHA P TER 9 Proteins and Their Synthesis
Translation initiation The main task of initiation is to place the first aminoacyl-
tRNA in the P site of the ribosome and, in this way, establish the correct reading
frame of the mRNA. In most prokaryotes and all eukaryotes, the first amino acid
in any newly synthesized polypeptide is methionine, specified by the codon AUG.
It is inserted not by tRNAMet but by a special tRNA called an initiator, symbolized
tRNAMeti. In bacteria, a formyl group is added to the methionine while the amino
acid is attached to the initiator, forming N-formylmethionine. (The formyl group
on N-formylmethionine is removed later.)
How does the translation machinery know where to begin? In other words,
how is the initiation AUG codon selected from among the many AUG codons in
an mRNA molecule? Recall that, in both prokaryotes and eukaryotes, mRNA has
a 5 ′ untranslated region consisting of the sequence between the transcriptional
start site and the translational start site (see Figure 8-7). As you will see below, the
nucleotide sequence of the 5 ′ UTR adjacent to the AUG initiator is critical for
ribosome binding in prokaryotes but not in eukaryotes.
Shine–Dalgarno sequence
30S
Shine–Dalgarno Start
sequence codon F i g u r e 9 -13 In bacteria, base
mRNA complementarity between the 3′ end of
C A U C C U A GG A GGU U UG A T C C U A UGCG the 16S rRNA of the small ribosomal
5′ subunit (30S) and the Shine–Dalgarno
16S rRNA U U C C U C C A sequence of the mRNA positions the
A
ribosome to correctly initiate translation
3′
at the downstream AUG codon.
3 3 6 CHA P TER 9 Proteins and Their Synthesis
IF3
30S
AUG
IF2·fMet-tRNAi ATP
+ mRNA
40S subunit
with initiation
IF3 components ADP + Pi
fMet Met
IF2
IF1
mRNA AUG
5′ AUG
Initiation factors
IF1 + IF2
Met
fMet
fMet
AUG
AUG
70S ribosome
F i g u r e 9 -15 The initiation complex forms at the 5′ end of the
mRNA and then scans in the 3′ direction in search of a start codon.
Figure 9-14 Initiation factors assist the assembly of the Recognition of the start codon triggers the assembly of the
ribosome at the translation start site and then dissociate complete ribosome and the dissociation of initiation factors (not
before translation. shown). The hydrolysis of ATP provides energy to drive the
scanning process.
Three proteins—IF1, IF2, and IF3 (for initiation factor)—are required for cor-
Introduction to Genetic Analysis, 11e rect initiation (Figure 9-14). IF3 is necessary to keep the 30S subunit dissociated from
Figure 09.14 #916
05/12/14 the 50S subunit, and IF1 and IF2 act to ensure that only the initiator tRNA enters the
Dragonfly Media Group P site. The 30S subunit, mRNA, and initiator tRNA constitute the initiation complex.
The complete 70S ribosome is formed by the association of the 50S large subunit
with the initiation complex and the release of the initiation factors.
Because a prokaryote lacks a nuclear compartment that separates transcrip-
tion and translation, the prokaryotic initiation complex is able to form at a Shine–
Dalgarno sequence near the 5 ′ end of an RNA that is still being transcribed. Thus,
9.4 Ribosomes 3 37
translation can begin on prokaryotic RNAs even before they are completely
transcribed.
Elongation It is during the process of elongation that the ribosome most resem-
bles a factory. The mRNA acts as a blueprint specifying the delivery of cognate
tRNAs, each carrying as cargo an amino acid. Each amino acid is added to the
growing polypeptide chain while the deacylated tRNA is recycled by the addition
of another amino acid. Figure 9-16 details the steps in elongation. Two protein fac-
tors called elongation factor Tu (EF-Tu) and elongation factor G (EF-G) assist the
elongation process.
Ternary
complex EF-Tu
EF-Tu
E P A Aminoacyl-tRNA Peptide
binds to A site. bond
forms. GTP
EF-G
Translocation
F i g u r e 9 -16 A ternary
Pi
complex consisting of an
aminoacyl-tRNA attached to an
EF-Tu factor binds to the A site.
When its amino acid has joined
the growing polypeptide chain,
GDP an EF-G factor binds to the A
Aminoacyl- site while nudging the tRNAs
tRNA binds
and their mRNA codons into the
to A site.
E and P sites. See text for
tRNA in details.
E site leaves.
ANIMATED ART:
The three steps of translation
3 3 8 CHA P TER 9 Proteins and Their Synthesis
(a) Wild type: no mutations. The tyrosine (b) Amber mutation introduces UAG stop (c) A further mutation changes the tyrosine
tRNA binds to the codon UAC. codon. Translation stops. tRNA codon to AUC. Tyrosine tRNA reads
the UAG codon. Translation continues.
Tyr Tyr
Gly Gly Gly
Gln Gln Gln
RF1
AU RF1 AU
G C
GUC GUC GUC
5' G G G C A G UAC A AG 3' 5' G G G C A G UAG A A G 3' 5' G G G C A G UAG A A G 3'
Alternative splicing
mRNA
Ligands: Ligands:
FGF10 FGF2
FGF7 FGF9
FGF4
FGF8
Exterior FGF6
Cell membrane
Cytoplasm
Posttranslational events
When released from the ribosome, most newly synthesized proteins are unable to
function. As you will see in this section and in subsequent chapters of this book,
DNA sequence is only part of the story of how organisms function. In this case, all
newly synthesized proteins need to fold up correctly and the amino acids of some
proteins need to be chemically modified. Because some protein folding and modi-
fication take place after protein synthesis, they are called posttranslational events.
Protein folding inside the cell The most important posttranslational event is the
folding of the nascent (newly synthesized) protein into its correct three-dimen-
sional shape. A protein that is folded correctly is said to be in its native conforma-
tion (in contrast with an unfolded or misfolded protein that is nonnative). As we
saw at the beginning of this chapter, proteins exist in a remarkable diversity of
structures. The distinct structures of proteins are essential for their enzymatic
activity, for their ability to bind to DNA, or for their structural roles in the cell.
9.5 The Proteome 3 41
Although it has been known since the 1950s that the amino acid sequence of a pro-
Phosphorylation and
tein determines its three-dimensional structure, it is also known that the aqueous dephosphorylation
environment inside the cell does not favor the correct folding of most proteins. of proteins
Given that proteins do in fact fold correctly in the cell, a long-standing question has
been, How is this correct folding accomplished? SIGNAL IN
The answer seems to be that nascent proteins are folded correctly with the
help of chaperones—a class of proteins found in all organisms from bacteria to Inactive
plants to humans. One family of chaperones, called the GroE chaperonins, form enzyme
large multisubunit complexes called chaperonin folding machines. Although the
Kinase
precise mechanism is not yet understood, newly synthesized, unfolded proteins
are believed to enter a chamber in the folding machine that provides an electri- ATP Pi
cally neutral microenvironment within which the nascent protein can success- ADP Phosphatase
fully fold into its native conformation.
Active P
Posttranslational modification of amino acid side chains As already stated,
enzyme
proteins are polymers of amino acids made from any of the 20 different types.
However, biochemical analysis of many proteins reveals that a variety of mole-
cules can be covalently attached to amino acid side chains. More than 300 modi- SIGNAL OUT
fications of amino acid side chains are possible after translation. Two of the more
commonly encountered posttranslational modifications—phosphorylation and Figure 9-20 Proteins can be activated
ubiquitination—are considered next. through the enzymatic attachment of
phosphate groups to their amino acid side
groups and inactivated by the removal of
Phosphorylation Enzymes called kinases attach phosphate groups to the hydroxyl those phosphate groups.
groups of the amino acids serine, threonine, and tyrosine, whereas enzymes
called phosphatases remove these phosphate groups. Because phosphate groups
are negatively charged, their addition to a protein usually changes protein confor-
mation. The addition and removal of phosphate groups serves as a reversible
switch to control a variety of cellular events, including enzyme activity, protein–
protein interactions, and protein–DNA interactions (Figure 9-20).
One measure of the importance of protein phosphorylation is the number of
genes encoding kinase activity in the genome. Even a simple organism such as
yeast has hundreds of kinase genes, whereas the mustard plant Arabidopsis thali-
ana has more than 1000. Another measure of the significance of protein phos-
phorylation is that most of the numerous protein–protein interactions that take
place in a typical cell are regulated by phosphorylation.
Recent analyses of the protein–protein interactions of the proteome indicate
that most proteins function by interacting with other proteins. The interactome is
the name given to the complete set of protein–protein interactions in an organism,
organ, tissue, or cell. One way to display the network of protein–protein interactions
that constitute an interactome is shown in Figure 9-21. To generate this figure,
researchers determined the 3186 protein interactions among 1705 human proteins.
However, these interactions constitute only a tiny fraction of the protein–protein
interactions that are taking place in all human cells under all growth conditions.
What is the biological significance of these interactions? In this chapter and
preceding ones, you have seen that protein–protein interactions are central to the
function of large biological machines such as the replisome, the spliceosome, and
the ribosome. Another set of significant interactions is the associations between
human proteins and the proteins of human pathogens. For example, the interac-
tome of 40 Epstein–Barr virus (EBV) proteins and 112 human proteins consists of
173 interactions (Figure 9-22). Understanding of this web of interactions may lead
to new therapies for mononucleosis, a disease caused by EBV infection.
F i g u r e 9 -2 1 Proteins (represented by
circles) interact with other proteins Some of the protein interactions
(connected by lines) to form simple or in the human interactome
large protein complexes. This interactome
shows 3186 interactions among 1705
human proteins. [Data from Ulrich Steizl
et al., Max Delbrück Center for Molecular
Medicine (MDC) Berlin-Buch. Copyright MDC.]
called the 26S proteasome (Figure 9-23). The modification targeting a protein
for degradation is the addition of chains of multiple copies of a protein called
ubiquitin to the ε-amine of lysine residues (called ubiquitination). Ubiquitin con-
tains 76 amino acids and is found only in eukaryotes, where it is highly conserved in
plants and animals. Two broad classes of proteins are targeted for destruction by
ubiquitination: short-lived proteins such as cell-cycle regulators or proteins that
have become damaged or mutated.
s u m m a ry
This chapter has described the translation of information upstream of the AUG start codon. The initiation complex
encoded in the nucleotide sequence of an mRNA into the in eukaryotes is assembled at the 5 ′ cap structure of the
amino acid sequence of a protein. Our proteins, more than mRNA and moves in a 3 ′ direction until the start codon is
any other macromolecule, determine who we are and what recognized. The longest phase of translation is the elonga-
we are. They are the enzymes responsible for cell metabo- tion cycle; in this phase, the ribosome moves along the
lism, including DNA and RNA synthesis, and are the regula- mRNA, revealing the next codon that will interact with its
tory factors required for the expression of the genetic pro- cognate-charged tRNA so that the charged tRNA’s amino
gram. The versatility of proteins as biological molecules is acid can be added to the growing polypeptide chain. This
manifested in the diversity of shapes that they can assume. cycle continues until a stop codon is encountered. Release
Furthermore, even after they are synthesized, they can be factors facilitate translation termination.
modified in a variety of ways by the addition of molecules In the past few years, new imaging techniques have
that can alter their function. revealed ribosomal interactions at the atomic level. With
Given the central role of proteins in life, it is not surpris- these new “eyes,” we can now see that the ribosome is an
ing that both the genetic code and the machinery for trans- incredibly dynamic machine that changes shape in
lating this code into protein have been highly conserved response to the contacts made with tRNAs and with pro-
from bacteria to humans. The major components of transla- teins. Furthermore, imaging at atomic resolution has
tion are three classes of RNA: tRNA, mRNA, and rRNA. The revealed that the ribosomal RNAs, not the ribosomal pro-
accuracy of translation depends on the enzymatic linkage of teins, are intimately associated with the functional centers
an amino acid with its cognate tRNA, generating a charged of the ribosome.
tRNA molecule. As adapters, tRNAs are the key molecules The proteome is the complete set of proteins that can
in translation. In contrast, the ribosome is the factory where be expressed by the genetic material of an organism.
mRNA, charged tRNAs, and other protein factors come Whereas a typical multicellular eukaryote has about 20,000
together for protein synthesis. genes, the typical proteome is probably 10- to 50-fold larger.
The key decision in translation is where to initiate This difference is in part the result of posttranslational
translation. In prokaryotes, the initiation complex assem- modifications such as phosphorylation and ubiquitination,
bles on mRNA at the Shine–Dalgarno sequence, just which influence protein activity and stability.
key terms
active site (p. 324) globular protein (p. 324) release factor (RF) (p. 338)
amino acid (p. 322) initiation factor (p. 336) ribosomal RNA (rRNA) (p. 321)
aminoacyl-tRNA synthetase (p. 330) initiator (p. 335) ribosome (p. 321)
amino end (p. 322) interactome (p. 341) secondary structure (p. 324)
anticodon (p. 330) isoform (p. 339) Shine–Dalgarno sequence (p. 335)
A site (p. 334) nuclear localization sequence signal sequence (p. 343)
carboxyl end (p. 322) (NLS) (p. 343) structure-based drug design (p. 335)
charged tRNA (p. 331) peptidyltransferase center (p. 334) subunit (p. 324)
codon (p. 325) polypeptide (p. 322) tertiary structure (p. 324)
decoding center (p. 334) primary structure (p. 322) transfer RNA (tRNA) (p. 321)
degenerate code (p. 327) protein targeting (p. 343) triplet (p. 325)
domain (p. 324) proteome (p. 339) ubiquitin (p. 342)
E site (p. 334) P site (p. 334) ubiquitination (p. 342)
fibrous protein (p. 324) quaternary structure (p. 324) wobble (p. 331)
Problems 3 4 5
s olv e d p r obl e m s
SOLVED PROBLEM 1. Using Figure 9-5, show the conse- – His – Thr – G lu – Asp – Trp – Leu – His – Gln – Asp
quences on subsequent translation of the addition of an ade- U U
U A U U A U
nine base to the beginning of the following coding sequence: –CA –ACC–GA –GA –UGG–CUC–CA –CA –GA
C A G C C G C
A A
G G
UUA
–CGA–UCG–GAA–CCA–CGU–GAU–AAG–CAU– G
– Arg – Ser – Glu – Pro – Arg – Asp – Lys – His – Because the protein-sequence change given to us at the
beginning of the problem begins after the first amino acid
Solution (His) owing to a single nucleotide addition, we can deduce
With the addition of A at the beginning of the coding that a Thr codon must change to an Asp codon. This change
sequence, the reading frame shifts, and a different set of must result from the addition of a G directly before the Thr
amino acids is specified by the sequence, as shown here codon (indicated by a box), which shifts the reading frame, as
(note that a set of nonsense codons is encountered, which shown here:
results in chain termination): –A
U A G U
–ACG–AUC–GGA–ACC–ACG–UGA–UAA–GCA– –CA – G AC–UGA – GA– U UG–G C U–UCA– U CAq–GA –
C G C
C C U C C
– Thr – Ile – Gly – Thr – Thr – stop – stop A A
G G
SOLVED PROBLEM 2. A single nucleotide addition followed Introduction
– His – to
Asp – ArgAnalysis,
Genetic – Gly – 11e
Leu – Ala – Thr – Ser – Asp –
by a single nucleotide deletion approximately 20 bp apart in Figure 09UN9 #935
DNA causes a change in the protein sequence from 05/12/14
Additionally,because a deletion of a nucleotide must
Dragonfly Media Group
–His–Thr–Glu–Asp–Trp–Leu–His–Gln–Asp– restore the final Asp codon to the correct reading frame,
an A or G must have been deleted from the end of the
to
original next-to-last codon, as shown by the arrow. The
–His–Asp–Arg–Gly–Leu–Ala–Thr–Ser–Asp– original protein sequence permits us to draw the mRNA
Which nucleotide has been added and which nucleotide has with a number of ambiguities. However, the protein se-
been deleted?
Introduction to GeneticWhat are11e
Analysis, the original and the new mRNA quence resulting from the frameshift allows us to deter-
Figuresequences?
09UN8 #934(Hint: Consult Figure 9-5.) mine which nucleotide was in the original mRNA at most
05/12/14 of these points of ambiguity. Nucleotides that could have
Solution
Dragonfly Media Group appeared in the original sequence are circled. In only a
We can draw the mRNA sequence for the original protein few cases does the ambiguity remain.
sequence (with the inherent ambiguities at this stage):
p r obl e m s
Most of the problems are also available for review/grading through the http://www.whfreeman.com/
launchpad/iga11e.
Working with the Figures 3. Using the quarternary structure of hemoglobin shown in
1. The primary protein structure is shown in Figure 9-3(a). Figure 9-3(d), explain in structural terms how a muta-
Where in the mRNA (near the 5′ or 3′ end) would a mu- tion in the β subunit protein could be suppressed by a
tation in R2 be encoded? mutation in the a subunit gene.
2. In this chapter you were introduced to nonsense sup- 4. Transfer RNAs (tRNAs) are examples of RNA molecules
pressor mutations in tRNA genes. However, suppressor that do not encode protein. Based on Figures 9-6 and 9-8,
mutations also occur in protein-coding genes. Using the what is the significance of the sequence of tRNA mole-
tertiary structure of the β subunit of hemoglobin shown cules? What do you predict would be the impact on
in Figure 9-3(c), explain in structural terms how a muta- translation of a mutation in one of the bases of one of the
tion could cause the loss of globin protein function. Now stems in the tRNA structure? On the mutant organism?
explain how a mutation at a second site in the same pro- 5. The components of prokaryotic and eukaryotic ribo-
tein could suppress this mutation and lead to a normal somes are shown in Figure 9-10. Based on this figure, do
or near-normal protein. you think that the large prokaryotic ribosomal RNA
3 46 CHA P TER 9 Proteins and Their Synthesis
(23S rRNA) would be able to substitute for the eukary- 14. A mutational event inserts an extra nucleotide pair into
otic 28S rRNA? Justify your answer. DNA. Which of the following outcomes do you expect?
6. Ribosomal RNAs (rRNAs) are another example of a func- (1) No protein at all; (2) a protein in which one amino
tional RNA molecule. Based on Figure 9-11, what do you acid is changed; (3) a protein in which three amino acids
think is the significance of the secondary structure of are changed; (4) a protein in which two amino acids are
rRNA? changed; (5) a protein in which most amino acids after
the site of the insertion are changed.
7. In Figure 9-12, is the terminal amino acid emerging from
the ribosome encoded by the 5′ or 3′ end of the mRNA? 15. Before the true nature of the genetic coding process was
fully understood, it was proposed that the message might
8. In Figure 9-12(b), what do you think happens to the
be read in overlapping triplets. For example, the se-
tRNA that is released from the E site?
quence GCAUC might be read as GCA CAU AUC:
9. In Figure 9-17, what do you think happens next to the
ribosomal subunits after they are finished translating G C A U C
that mRNA?
10. Based on Figure 9-19, can you predict the position of a
mutation that would affect the synthesis of one isoform Devise an experimental test of this idea.
but not the other? 16. If tRNA is the adaptor for translation, what is the
11. Based on Figure 9-24, can you predict the position of a ribosome?
mutation that would produce an active protein that was 17. Which anticodon would you predict for a tRNA species
not directed to the correct location? carrying isoleucine? Is there more than one possible an-
swer? If so, state any alternative answers.
B ASIC P r obl e m s
18. a. In how many cases in the genetic code would you
www
fail to know the amino acid specified by a codon if you
Unpacking the Problem knew only the first two nucleotides of the codon?
www
12. a. Use the codon dictionary in Figure 9-5 to complete b. In how many cases would you fail to know the first
the following table. Assume that reading is from left to two nucleotides of the codon if you knew which amino
right and that the columns represent transcriptional and acid is specified by it?
translational alignments. 19. Deduce what the six wild-type codons may have been in
the mutants that led Brenner to infer the nature of the
C UAG codon.
DNA double helix
T G A 20. If a polyribonucleotide contains equal amounts of
C A U mRNA transcribed randomly positioned adenine and uracil bases, what pro-
Appropriate tRNA portion of its triplets will encode (a) phenylalanine, (b)
G C A isoleucine, (c) leucine, (d) tyrosine?
anticodon
Amino acids 21. You have synthesized three different messenger RNAs
Trp incorporated into with bases incorporated in random sequence in the fol-
protein lowing ratios: (a) 1 U : 5 C’s, ( b) 1 A : 1 C : 4 U’s, (c) 1 A : 1
C : 1 G : 1 U. In a protein-synthesizing system in vitro, in-
b. Label the 5′ and 3′ ends of DNA and RNA, as well as dicate the identities and proportions of amino acids that
the amino and carboxyl ends of the protein. will be incorporated into proteins when each of these
mRNAs is tested. (Refer to Figure 9-5.)
13. Consider the following segment of DNA:
22. In the fungus Neurospora, some mutants were obtained
5′ GCTTCCCAA 3′ that lacked activity for a certain enzyme. The mutations
3′ CGAAGGGTT 5′ were found, by mapping, to be in either of two unlinked
Assume that the top strand is the template strand used genes. Provide a possible explanation in reference to qua-
by RNA polymerase. ternary protein structure.
a. Draw the RNA transcribed. 23. What is meant by the statement “The genetic code is uni-
versal”? What is the significance of this finding?
b. Label its 5′ and 3′ ends.
24. The enzyme tryptophan synthetase is produced in two
c. Draw the corresponding amino acid chain.
sizes, large and small. Some mutants with no enzyme ac-
d. Label its amino and carboxyl ends. tivity produced exactly the same size enzymes as the
Repeat parts a through d, assuming the bottom strand to wild type. Other mutants with no activity produced just
be the template strand. the large enzyme; still others, just the small enzyme.
Problems 3 47
a. Explain the different types of mutants at the level 34. What evidence supports the view that ribosomal RNAs
of protein structure. are a more important component of the ribosome than
b. Why do you think there were no mutants that the ribosomal proteins?
produced no enzyme? 35. Explain why antibiotics, such as erythromycin and
25. In the Crick–Brenner experiments described in this chap- Zithromax, that bind the large ribosomal subunit do not
ter, three insertions or three deletions restored the nor- harm us.
mal reading frame and the deduction was that the code 36. Why do multicellular eukaryotes need to have hundreds
was read in groups of three. Is this deduction really of kinase-encoding genes?
proved by the experiments? Could a codon have been 37. Our immune system makes many different proteins that
composed of six bases, for example? protect us from viral and bacterial infection. Biotechnology
26. A mutant has no activity for the enzyme isocitrate lyase. companies must produce large quantities of these im-
Does this result prove that the mutation is in the gene mune proteins for human testing and eventual sale to the
encoding isocitrate lyase? public. To this end, their scientists engineer bacterial or
27. A certain nonsense suppressor corrects a nongrowing human cell cultures to express these immune proteins.
mutant to a state that is near, but not exactly, wild type Explain why proteins isolated from bacterial cultures are
(it has abnormal growth). Suggest a possible reason why often inactive, whereas the same proteins isolated from
the reversion is not a full correction. human cell cultures are active (functional).
38. Would you expect to find nuclear localization sequences
28. In bacterial genes, as soon as any partial mRNA transcript
(NLSs) in the proteins that make up prokaryotic and eu-
is produced by the RNA polymerase system, the ribo-
karyotic DNA and RNA polymerases? Explain why or
some assembles on it and starts translating. Draw a dia-
why not.
gram of this process, identifying 5′ and 3′ ends of mRNA,
the COOH and NH2 ends of the protein, the RNA poly-
C h a ll e n g i n g P r obl e m s
merase, and at least one ribosome. Why couldn’t this sys-
tem work in eukaryotes? 39. A single nucleotide addition and a single nucleotide de-
letion approximately 15 bases apart in the DNA cause a
29. In a haploid, a nonsense suppressor su1 acts on mutation
protein change in sequence from
1 but not on mutation 2 or 3 of gene P. An unlinked non-
sense suppressor su2 works on P mutation 2 but not on 1 Phe–Ser–Pro–Arg–Leu–Asn–Ala–Val–Lys
or 3. Explain this pattern of suppression in regard to the to
nature of the mutations and the suppressors. Phe–Val–His–Ala–Leu–Met–Ala–Val–Lys
30. In vitro translation systems have been developed in a. What are the old and new mRNA nucleotide se-
which specific RNA molecules can be added to a test quences? (Use the codon dictionary in Figure 9-5.)
tube containing a bacterial cell extract that includes all
b. Which nucleotide has been added? Which has been
the components needed for translation (ribosomes,
deleted?
tRNAs, amino acids). If a radioactively labeled amino
acid is included, any protein translated from that RNA 40. You are studying an E. coli gene that specifies a protein. A
can be detected and displayed on a gel. If a eukaryotic part of its sequence is
mRNA is added to the test tube, would radioactive pro- –Ala–Pro–Trp–Ser–Glu–Lys–Cys–His–
tein be produced? Explain.
You recover a series of mutants for this gene that show
31. An in vitro translation system contains a eukaryotic cell no enzymatic activity. By isolating the mutant enzyme
extract that includes all the components needed for products, you find the following sequences:
translation (ribosomes, tRNAs, amino acids). If bacterial
RNA is added to the test tube, would a protein be pro- Mutant 1:
duced? If not, why not? –Ala–Pro–Trp–Arg–Glu–Lys–Cys–His–
Mutant 2:
32. Would a chimeric translation system containing the –Ala–Pro–
large ribosomal subunit from E. coli and the small ribo- Mutant 3:
somal subunit from yeast (a unicellular eukaryote) be –Ala–Pro–Gly–Val–Lys–Asn–Cys–His–
able to function in protein synthesis? Explain why or Mutant 4:
why not. –Ala–Pro–Trp–Phe–Phe–Thr–Cys–His–
33. Mutations that change a single amino acid in the active
site of an enzyme can result in the synthesis of wild-type What is the molecular basis for each mutation? What is
amounts of an inactive enzyme. Can you think of other the DNA sequence that specifies this part of the protein?
regions in a protein where a single amino acid change 41. Suppressors of frameshift mutations are now known.
might have the same result? Propose a mechanism for their action.
3 48 CHA P TER 9 Proteins and Their Synthesis
42. Consider the genes that specify the structure of hemoglo- sequence used. Khorana found that sometimes differ-
bin. Arrange the following events in the most likely se- ent polypeptides were made from the same synthetic
quence in which they would take place. mRNA, suggesting that the initiation of protein synthe-
a. Anemia is observed. sis in the system in vitro does not always start at the first
nucleotide of the messenger. For example, from (CAA)n,
b. The shape of the oxygen-binding site is altered.
three polypeptides may have been made: aa1 homopoly-
c. An incorrect codon is transcribed into hemoglobin mer (abbreviated aa1-aa1), aa2 homopolymer (aa2-aa2),
mRNA.
and aa3 homopolymer (aa3-aa3). These polypeptides
d. The ovum (female gamete) receives a high radiation probably correspond to the following readings derived
dose. by starting at different places in the sequence:
e. An incorrect codon is generated in the DNA of a
CAA CAA CAA CAA . . .
hemoglobin gene.
ACA ACA ACA ACA . . .
f. A mother (an X-ray technician) accidentally steps
in front of an operating X-ray generator. AAC AAC AAC AAC . . .
g. A child dies. The following table shows the results of Khorana's
h. The oxygen-transport capacity of the body is severely experiment.
impaired. Synthetic
i. The tRNA anticodon that lines up is one of a type mRNA Polypeptide(s) synthesized
that brings an unsuitable amino acid.
(UC)n (Ser–Leu)
j. Nucleotide-pair substitution occurs in the DNA of a
gene for hemoglobin. (UG)n (Val–Cys)
(AC)n (Thr–His)
43. What structural features are shared by spliceosomes
(Figures 8-16 and 8-17) and ribosomes? Why are both (AG)n (Arg–Glu)
structures used to support the RNA World theory? (UUC)n (Ser–Ser) and (Leu–Leu) and (Phe–Phe)
44. A double-stranded DNA molecule with the sequence (UUG)n (Leu–Leu) and (Val–Val) and (Cys–Cys)
shown here produces, in vivo, a polypeptide that is five (AAG)n (Arg–Arg) and (Lys–Lys) and (Glu–Glu)
amino acids long. (CAA)n (Thr–Thr) and (Asn–Asn) and (Gln–Gln)
TACATGATCATTTCACGGAATTTCTAGCATGTA (UAC)n (Thr–Thr) and (Leu–Leu) and (Tyr–Tyr)
ATGTACTAGTAAAGTGCCTTAAAGATCGTACAT (AUC)n (Ile–Ile) and (Ser–Ser) and (His–His)
(GUA)n (Ser–Ser) and (Val–Val)
a. Which strand of DNA is the template strand, and in
which direction is it transcribed? (GAU)n (Asp–Asp) and (Met–Met)
b. Label the 5′ and the 3′ ends of each strand. (UAUC)n (Tyr–Leu–Ser–Ile)
c. If an inversion occurs between the second and the (UUAC)n (Leu–Leu–Thr–Tyr)
third triplets from the left and right ends, respectively, (GAUA)n None
and the same strand of DNA is transcribed, how long (GUAA)n None
will the resultant polypeptide be?
Note: The order in which the polypeptides or amino acids
d. Assume that the original molecule is intact and that
are listed in the table is not significant except for (UAUC)n
the bottom strand is transcribed from left to right. Give
and (UUAC)n.
the RNA base sequence, and label the 5′ and 3′ ends
of the anticodon that inserts the fourth amino acid into a. Why do (GUA)n and (GAU)n each encode only two
the nascent polypeptide. What is this amino acid? homopolypeptides?
45. One of the techniques Khorana used to decipher the ge- b. Why do (GAUA)n and (GUAA)n fail to stimulate
netic code was to synthesize polypeptides in vitro, using synthesis?
synthetic mRNA with various repeating base sequences. c. Using Khorana’s results, assign an amino acid to each
For example, (AGA)n , which can be written out as triplet in the following list. Remember that there are
AGAAGAAGAAGAAGA. . . . Sometimes the resulting often several codons for a single amino acid and that the
polypeptide contained just one amino acid (a homopol- first two letters in a codon are usually the important
ymer), and sometimes it contained more than one ami- ones (but that the third letter is occasionally significant).
no acid (a heteropolymer), depending on the repeating Also keep in mind that some very different-looking
Problems 3 49
10
C h a p t e r
Gene Isolation
and Manipulation
Learning Outcomes
After completing this chapter, you will be able to
Injection of foreign DNA into an animal cell. The microneedle used for
injection is shown at the right, and a cell-holding pipette is shown at the left.
[ Rapho Agence/Science Source.]
outline
351
3 52 CHAPTER 1 0 Gene Isolation and Manipulation
G
enes are the central focus of genetics, and so, clearly, it would be desirable
to isolate a gene of interest (or any DNA region) from the genome in an
amount suitable to study. Isolating individual genes and producing enough
of them to analyze can be a daunting task because a single gene is a tiny fraction of
an entire genome. For example, the haploid human genome contains over 3 billion
base pairs, whereas the coding region of an average gene contains only a few thou-
sand base pairs. How do scientists find the proverbial needle in the haystack, the
gene, and then produce quantities of it for analysis?
Many investigations in genetics begin with the desire to study a trait or a dis-
ease. Using forward genetics, as described in Chapter 2, we search for mutants that
exhibit an altered phenotype, and then perform crosses or analyze pedigrees to
determine whether that phenotype is determined by a single gene. In Chapter 4, we
discussed how mapping by recombination helps locate the gene at the DNA level.
In this chapter, we continue by presenting molecular methods for identifying a gene
of interest and studying its molecular function.
The first step in studying gene function is to isolate its DNA and reproduce it in
quantities suitable for study.
Just like a construction worker, a genetic engineer needs tools. Most toolboxes
that we are familiar with are filled with tools like hammers, screwdrivers, and
wrenches that are designed by people and manufactured in factories. In contrast,
the tools of the genetic engineer are molecules isolated from cells. Most of these
tools were the product of scientific discovery—where the objective was to answer a
biological question. Only later did some scientists appreciate the potential practical
value of some of these molecules and invent ways to integrate them into protocols
with the goal of isolating and amplifying DNA fragments. We have already been
introduced to some of these molecules in previous chapters, and in this chapter you
will see how they have become the foundation of the biotechnology revolution.
One way to separate our gene of interest from the rest of the genome is to cut
the genome with “molecular scissors” and isolate the small fragment containing the
gene. Werner Arber discovered these molecular scissors, and for this discovery he
was awarded the Nobel Prize in Physiology or Medicine in 1978. However, Arber
was not looking for a tool to cut DNA precisely. Rather, he was trying to understand
why some bacteria are resistant to infection by bacterial viruses. By answering this
biological question, he discovered that resistant bacteria possess a previously
unknown enzyme—a restriction endonuclease—that cuts DNA at specific sequences.
The enzyme he discovered, EcoRI, became the first commercially available molecu-
lar scissors.
As another example, it is unlikely that anyone would have predicted that DNA
polymerase, the enzyme discovered by Arthur Kornberg, a discovery for which he
received the Nobel Prize in Physiology or Medicine in 1959, could be fashioned into
two powerful tools for DNA isolation and analysis (see Chapter 7). To this day, many
of the techniques used to determine the nucleotide sequence of DNA rely on syn-
thesizing it with DNA polymerase. Similarly, most of the protocols used to isolate
and amplify specific regions of DNA from sources as disparate as a crime scene to a
fossil embedded in amber rely on the activity of DNA polymerase.
DNA technology is a term that describes the collective techniques for obtaining,
amplifying, and manipulating specific DNA fragments. Since the mid-1970s, the
development of DNA technology has revolutionized the study of biology, opening
many areas of research to molecular investigation. Genetic engineering, the appli-
cation of DNA technology to specific biological, medical, or agricultural problems, is
now a well-established branch of technology. Genomics is the ultimate extension of
the technology to the global analysis of the nucleic acids present in a nucleus, a cell,
an organism, or a group of related species (see Chapter 14). Later in this chapter, we
will see how the techniques of DNA technology and genomics, along with methods
presented in Chapters 2 and 4, can be used together to isolate and identify a gene.
10.1 Overview: Isolating and Amplifying Specific DNA Fragments 3 5 3
Vector Ligase
DNA
polymerase
ORI
DNA
polymerase
Repeated
Bacterial cycles of
genome DNA Figure 10-1 Two methods of
synthesis isolating and amplifying a gene are (a)
in vivo, by tricking the replication
machinery of a bacterium into
amplifying recombinant DNA
containing the gene, and (b) in vitro, in
the test tube using the polymerase-
chain-reaction technique. Both
methods employ the basic principles
of molecular biology: the ability of
specific proteins (red and green) to
bind to DNA and the ability of
Clone of bacterial cells complementary single-stranded
nucleic acid segments to hybridize
Ligase DNA polymerase Primer for DNA polymerization together (the primer used in the
test-tube method).
354 CHAPTER 1 0 Gene Isolation and Manipulation
into a cut vector chromosome to form recombinant DNA molecules. The recom-
binant DNA molecules are transferred into bacterial cells, and, generally, only
one recombinant molecule is taken up by each cell. Within each bacterial cell, the
recombinant molecule is amplified along with the vector during cell division.
From a single cell, this process results in a clone of identical cells, each containing
the recombinant DNA molecule, and so this technique of amplification is called
DNA cloning. Because many fragments of DNA are inserted into the vector, the
resulting mix of cells collectively contains as much as the entire genome of the
donor organism. The next stage is to find the rare clone among the many cells
containing the DNA of interest.
In the in vitro approach, called the polymerase chain reaction (PCR) (Fig-
ure 10-1b), a specific gene or DNA region of interest is isolated and amplified by
DNA polymerase. PCR “finds” the DNA region of interest (called the target DNA)
by the complementary binding of specific short synthetic DNA molecules called
primers to the ends of that sequence. These primers then guide the replication
process directed by the DNA polymerase, which cycles exponentially, resulting in
the production of large quantities of the target DNA as an isolated DNA fragment.
Even larger quantities of target DNA can be obtained by inserting the PCR prod-
uct into a plasmid (that is, the in vivo method), thus generating a recombinant
DNA molecule like that described above.
We will see repeatedly that DNA technology depends on two basic founda-
tions of molecular biology research:
• The ability of specific proteins to recognize and bind to specific base sequences
within the DNA double helix (examples are shown in green and red in Figure
10-1).
• The ability of complementary single-stranded DNA or RNA sequences to an-
neal to form double-stranded molecules (an example is the binding of the
primers shown in yellow in Figure 10-1).
The remainder of the chapter will explore some of the uses to which we put
amplified DNA. These uses range from routine gene isolation for basic biological
research to gene therapy to treat human disease to the production of herbicides and
pesticides by crop plants. To illustrate how recombinant DNA is made, let’s consider
the cloning of the gene for human insulin, a protein hormone used in the treatment
of diabetes. Diabetes is a disease in which blood sugar levels are abnormally high
either because the body does not produce enough insulin (type I diabetes) or because
cells are unable to respond to insulin (type II diabetes). Mild forms of type I diabetes
can be treated by dietary restrictions, but, for many patients, daily insulin treatments
are necessary. Until about 30 years ago, cows were the major source of insulin pro-
tein. The protein was harvested from the pancreases of animals slaughtered in meat-
packing plants and purified on a large scale to eliminate the majority of proteins and
other contaminants in the pancreas extracts. Then, in 1982, the first recombinant
human insulin came on the market. Human insulin could be made in purer form, at
lower cost, and on an industrial scale because it was produced in greater quantities
in bacteria by recombinant DNA techniques using the actual human insulin gene
sequence. Furthermore, there is no risk of introducing bovine viruses or exciting an
immune response against the cows’ insulin. We will use the isolation and production
of recombinant insulin as an example of the general steps necessary for making any
recombinant DNA.
DNA molecules that can be constructed from a variety of donor DNAs and vectors.
We begin by discussing sources of donor DNA:
• If the experimenter wants a collection of inserts that represents the entire
genome of an organism, the genomic DNA can be cut up before cloning.
• Alternatively, if the goal is to isolate a single gene, the polymerase chain reac-
tion can be used to amplify selected regions of DNA in vitro.
• Finally, if the researcher desires only the coding sequences of genes, without
introns, DNA copies of the mRNA products, called cDNA, can be synthesized
and inserted into a vector.
This type of segment is called a DNA palindrome, which means that both
strands have the same nucleotide sequence but in antiparallel orientation (read-
ing 5′ to 3′ produces the same sequence on either strand). Different restriction
enzymes cut at different palindromic sequences. Sometimes the cuts are in the
same position on each of the two antiparallel strands, leaving blunt ends. However,
the most useful restriction enzymes make cuts that are offset, or staggered. The
enzyme EcoRI makes cuts only between the G and the A nucleotides on each
strand of the palindrome:
These staggered cuts leave a pair of ends that each have an identical four base-
pair single-stranded end (AATT). The ends are called sticky because, being single
stranded, they can base-pair (that is, stick) to a complementary sequence. Com-
bining complementary single-strands so that they pair is called hybridization.
3 56 CHAPTER 1 0 Gene Isolation and Manipulation
Figure 10-2 illustrates the restriction enzyme EcoRI making staggered double-strand
cuts in a circular DNA molecule such as a plasmid; the cut opens up the circle, and
the resulting linear molecule has two sticky ends. It can now hybridize with a frag-
ment of a different DNA molecule having the same complementary sticky ends.
Digesting human genomic DNA with EcoRI generates approximately 500,000
fragments. You will see later in this section how scientists sift through all of these
fragments to find the needle in the haystack—the one or two fragments out of the
500,000 that contain the DNA sequence of interest (in our example, the human
insulin gene).
G
G CTTAA tending from a primer. The Taq DNA polymerase, from the
AATTC G bacterium Thermus aquaticus, is one such enzyme com-
monly used. (To survive in the extreme heat of thermal
vents, this bacterium has evolved proteins that are ex-
tremely heat resistant. Its DNA polymerase thus survives
Hybridization
the high temperatures required to denature the DNA
duplex, which would denature and inactivate DNA poly-
G C merase from most other species.) Complementary new
A C G T
T A strands are synthesized as in normal DNA replication in
A
T A TC
C TT
T
Recombinant
AG
5′
heat-denatured to generate single-stranded templates, and a
second cycle of replication is carried out by lowering the tem- Repeat step 4 .
perature in the presence of all the components necessary for
the polymerization to produce four identical duplexes. (e)
Repeated cycles of denaturation, annealing, and synthesis 3′
result in an exponential increase in the number of segments
replicated. Because a typical cycle lasts five minutes, amplifica-
tion by as much as a billion-fold can be readily achieved within
2.5 hours. As you will see later in this section, the PCR products
5′
can be further amplified by cloning them in bacterial cells.
PCR is a powerful technique that is routinely used to isolate Repeat steps 2 , 3 , and 4 .
specific genes or DNA fragments when there is prior knowl- (f)
edge of the sequence to be amplified. In fact, if the sequences
3′
corresponding to the primers are each present only once in the
genome and are sufficiently close together, the only DNA seg-
ment that can be amplified is the one between the two primers.
PCR is a very sensitive technique with numerous applications
in biology. It can amplify target sequences that are present in
extremely low copy numbers in a sample as long as primers
specific to this rare sequence are used. For example, crime
investigators can amplify segments of human DNA from the
few follicle cells surrounding a single pulled-out hair. If the
investigators chose to do so, they could amplify the insulin 5′
gene from this DNA sample using its precise location on chro-
mosome 11 to design flanking primers for direct PCR. After 25 cycles, the target sequence has
It would not be an overstatement to say that PCR has revo- been amplified about 106-fold.
lutionized the study of many fields of biology where DNA
358 CHAPTER 1 0 Gene Isolation and Manipulation
5' 3'
AAAA
Oligo-dT primer
anneals to polyA tail
AAAA
TTTT
Reverse transcriptase
copies mRNA into cDNA
AAAA
TTTT
mRNA
Single-stranded
cDNA
mRNA removed DNA polymerase copies
cDNA strand
Nick hairpin
F i g u r e 10 - 4 The formation of cDNA for the insulin gene. The insulin gene (with its two
introns) is transcribed in the pancreas into pre-mRNA. The introns are removed by splicing,
and A residues are added to the 3′ end to form polyadenylated mRNA. In the laboratory,
mRNAs are isolated from pancreatic cells and a short oligo-dT primer is hybridized to the
poly(A) tail of all mRNAs to prime synthesis of complementary DNA from the RNA template
by reverse transcriptase. Reverse transcriptase synthesizes a stem-loop structure that acts
as a primer for synthesis of the second cDNA strand, after the mRNA strand has been
degraded (by treatment with NaOH or with RNAseH).
Cloning DNA fragments with sticky ends Recall that the original scientists
who isolated theto
Introduction human
Geneticinsulin gene
Analysis, 11edid not know the gene sequence, and so they
needed to create
Figure a library of human genome DNA fragments from which to iso-
10.04 #1010
05/21/14
late the specific gene. To make recombinant DNA molecules containing donor
06/09/14
genomic DNA fragments, both donor and vector DNAs are digested by a restriction
06/23/14
enzymeDragonflyproduces
that the same complementary sticky ends (see Figure 10-2). The
Media Group
resulting fragments are then mixed to allow the sticky ends of vector and donor
3 6 0 CHAPTER 1 0 Gene Isolation and Manipulation
(a) Cleavage
by EcoRI
endonuclease Cleavage Cleavage by EcoRI
sites endonuclease
TT
A
A AATT AATT AATT
A
A Cleaved
T
T TTAA TTAA TTAA plasmid
Hybridization
AAT
T
AA
TT
(b) TTAA
TT
AA
TT
TT
A
AA
AA
A
TT
AA
TT
DNA ligase DNA ligase
AAT
(c) T
AA
TT
TTAA
TT
AA
TT
TT
AA
AA
AA
TT
AA
TT
Recombinant plasmid Recombinant plasmid
DNA to hybridize with each other and form recombinant molecules. Figure 10-5a
shows a bacterial plasmid DNA that carries a single EcoRI restriction site, so diges-
tion with the restriction enzyme EcoRI converts the circular DNA into a single
linear molecule with sticky ends. Donor DNA from any other source, such as
human DNA, also is treated with the EcoRI enzyme to produce a population of
fragments carrying the same sticky ends. When the two populations are mixed
under the proper physiological conditions, DNA fragments from the two sources
can hybridize because double helices form between their sticky ends (Figure 10-5b).
In any cloning reaction, there are many linearized plasmid molecules in the solu-
tion, as well as many different EcoRI fragments of donor DNA, a tiny fraction of
which will have the target DNA. Therefore, a diverse array of plasmids recom-
bined with different donor fragments will be produced. At this stage, the hybrid-
ized molecules do not have covalently joined sugar–phosphate backbones and are
likely to fall apart because eight hydrogen bonds provide only weak links between
the sequences. However, the backbones can be covalently sealed by the addition
of the enzyme DNA ligase, which creates phosphodiester linkages at the junc-
tions (Figure 10-5c).
Cloning DNA fragments with blunt ends Knowing the human insulin gene se-
quence helps us zero in on the gene, but it adds a small complication in the cloning
reaction. Some
Introduction restriction
to Genetic enzymes
Analysis, 11e produce blunt ends rather than staggered cuts.
In addition,
Figure cDNA and the DNA fragments that arise from PCR have blunt or near-
10.05 #1011
blunt ends. While blunt end fragments from all these sources can be joined to the
05/21/14
Dragonfly
vector Media
with theGroup
use of ligase alone, this is a very inefficient reaction because blunt
ends cannot stick together. One alternative method is to create PCR products with
sticky ends by using specially designed PCR primers that contain restriction
10.2 Generating Recombinant DNA Molecules 3 61
Ligate oligonucleotide
linkers containing
EcoRI sequence
Cut with
EcoRI F i g u r e 10 -7 Adding EcoRI sites to the
ends of cDNA molecules. The cDNA
AATTCNNN AAAAANNNG molecules come from the last step in
GNNN T T T T TNNNCTTAA Figure 10-4. Adapters (boxed region) are
added at both ends of the cDNA
molecules. These adapters are double-
stranded oligonucleotides that contain a
Ligate into vector cut with restriction site (EcoRI is shown in red) and
EcoRI random DNA sequence at both ends
(represented by N).
3 62 CHAPTER 1 0 Gene Isolation and Manipulation
the sequence for a restriction site that is not in the amplified DNA can be added
to the primers or the linkers.
K e y C o n c e p t Donor and vector DNAs with the same sticky ends can be
joined efficiently and ligated. Alternatively, donor DNA that is the product of PCR or
cDNA synthesis requires the addition of sticky ends prior to insertion into a vector.
1 2
Recombinant vector
with insert 1 or 2
Bacterial
genome
Transformation
1 2
Replication, 1 2 2
amplification,
and cell 1 2
division
bacteria will typically contain billions of copies of the single-donor DNA insert
fused to its vector. This set of amplified copies of the single-donor DNA fragment
within the cloning vector is the recombinant DNA clone.
The amplification of donor DNA inside a bacterial cell entails the following
steps:
• Choosing a cloning vector and introducing the insert (see the preceding section
for a discussion of the latter)
• Introducing the recombinant DNA molecule inside a bacterial cell
• Recovering the amplified recombinant molecules
Choice of cloning vectors Vectors must be small molecules for convenient manip-
ulation, but they may vary in many ways to suit the goal of the experiment. Some
vectors need to be capable of prolific replication in a living cell in order to amplify
the inserted donor fragment. In contrast, others are designed to be present in only
a single copy to maintain the integrity of the inserted DNA (see below). All vectors
must have convenient restriction sites at which the DNA to be cloned may be
inserted (called a polylinker or multiple cloning site). Ideally, the restriction site
should be present only once in the vector because then restriction fragments of
donor DNA will insert only at that one location in the vector. Having a way to iden-
tify and recover the desired recombinant molecule quickly also is important.
Numerous cloning vectors that meet a wide range of experimental needs are in cur-
rent use. Some general classes of cloning vectors follow.
Plasmid vectors As described earlier, bacterial plasmids are small circular DNA mol-
ecules that commonly replicate their DNA independent of the bacterial chromo-
some. The plasmids routinely used as vectors carry a gene for drug resistance and a
gene to distinguish plasmids with and without DNA inserts. These drug-resistance
genes provide a convenient way to select for bacterial cells transformed by plas-
mids: those cells still alive after exposure to the drug must carry the plasmid vec-
tors. However, not all the plasmids in these transformed cells will contain DNA
inserts. Some plasmid vectors also have a system that allows researchers to identify
bacterial colonies with plasmids containing DNA inserts. For this reason, it is desir-
able to be able to identify bacterial colonies with plasmids containing DNA inserts.
Such a feature is part of the pUC18 plasmid vector shown in Figure 10-9; DNA
inserts disrupt a gene (lacZ) in the plasmid that encodes an enzyme (β-galactosidase)
necessary to cleave a compound added to the petri plate agar (X-gal) so that it pro-
duces a blue pigment. Thus, the colonies that contain the plasmids with the DNA
insert will be white rather than blue (they cannot cleave X-gal because they do not
produce β-galactosidase).
Vectors for larger DNA inserts The standard plasmid and phage λ vector just
described can accept donor DNA of sizes as large as 10 to 15 kb. However, many
experiments require inserts well in excess of this upper limit. To meet these needs,
special vectors that require more sophisticated methods for transferring the DNA
3 6 4 CHAPTER 1 0 Gene Isolation and Manipulation
Bam+,
HinGOOO
Eco5O
β-galactosidase function of lacZ, resulting in
SmaO
SphO
XbaO
KpnO
SacO
PstO
SalO
an inability to convert the artificial substrate
X-gal into a blue dye. The polylinker has
several alternative restriction sites into which 3RO\OLQNHU
donor DNA can be inserted. [ Photo: Dr. James
M. Burnette III and Dr. Leslie Bañuelos.]
lacZ
&XWIRUHLJQ'1$DQG
YHFWRUZLWKUHVWULFWLRQHQ]\PH
7UDQVIRUPEDFWHULD
1RLQVHUW :LWKLQVHUW
LacZ LacZ ²
3ODFHRQDPSLFLOOLQDQG;JDO
%OXHFRORQ\ :KLWHFRORQ\
(Q]\PH 1RFOHDYDJH
FOHDYHV;JDO RI;JDO'1$
1R'1$LQVHUW LQVHUWLV
LVSUHVHQW SUHVHQW
into the host cell have been engineered. In each case, the DNAs replicate as large
plasmids after they have been delivered into the bacterium.
Fosmids are vectors that can carry 35- to 45-kb inserts (Figure 10-10). They are
engineered hybrids of λ phage DNA and bacterial F plasmid DNA (see Chapter 5).
Fosmids are packaged into λ phage particles, which act as the syringes that
10.2 Generating Recombinant DNA Molecules 3 6 5
Fosmids and BACs are cloning vectors that carry large inserts
AmpR
COS
COS Human
site
site insert
comprises
Fosmid Poly- 35–45 kb –80% of Single ~75,000
linker fosmid copy
F-factor CamR
origin of replication
CamR
Human
insert
Poly- comprises
BAC 100–200 kb –90% of Single 15,000–
linker
BAC copy 30,000
F-factor AmpR
origin of replication
Entry of recombinant molecules into the bacterial cell Three methods are used
to introducetorecombinant
Introduction DNA
Genetic Analysis, 11emolecules into bacterial cells: transformation, trans-
duction, and#1016
Figure 10.10 infection (Figure 10-11; see Sections 5.3 and 5.4).
05/21/14
• In transformation, bacteria are bathed in a solution containing the recombi-
06/09/14
Dragonfly
nant Media Group
DNA molecule. Because bacterial cells used in research cannot take up
DNA molecules as large as recombinant plasmids, they must be made compe-
tent (that is, able to take up the DNA from the surrounding media) by either
incubation in a calcium solution or exposure to a high-voltage electrical pulse
3 6 6 CHAPTER 1 0 Gene Isolation and Manipulation
F i g u r e 10 -11 Recombinant
DNA can be delivered into bacterial Modes of delivering recombinant DNA into bacterial cells
cells by transformation,
transduction, or infection with a (a) Plasmids, BACs
phage. (a) Plasmid and BAC Transformation
vectors are delivered by DNA- + Bacterial colony
mediated transformation. (b) Certain
vectors such as fosmids are
delivered within bacteriophage
(b) Fosmids
heads (transduction); however, after
having been injected into the Transduction
bacterium, they form circles and + Bacterial colony
replicate as large plasmids.
(c) Bacteriophage vectors such as
phage λ infect and lyse the Phage plaque
(c) Bacteriophage vectors
bacterium, releasing a clone of
progeny phages, all carrying the Infection
identical recombinant DNA +
molecule within the phage genome.
Lysis
ensure that we have cloned the DNA segment of interest, we have to make large col-
lections of DNA segments that are all-inclusive. For example, we take all the DNA
from a genome, break it up into segments of the right size for our cloning vector,
and insert each segment into a different copy of the vector, thereby creating a col-
lection of recombinant DNA molecules that, taken together, represent the entire
genome. We then transform or infect these molecules into separate bacterial recipi-
ent cells, where they are amplified. The resulting collection of recombinant-DNA-
bearing bacteria or bacteriophages is called a genomic library. If we are using a
cloning vector that accepts an average insert size of 10 kb and if the entire genome
is 100,000 kb in size (the approximate size of the genome of the nematode
Caenorhabditis elegans), then at least 10,000 independent recombinant clones would
be required to represent one genome’s worth of DNA. To ensure that all sequences
of the genome that can be cloned are contained within a collection, genomic librar-
ies typically represent an average segment of the genome at least five times (and so,
in our example, there will be 50,000 independent clones in the genomic library).
This multifold representation makes it highly unlikely that, by chance, a sequence
is not represented at least once in the library.
Similarly, representative collections of cDNA inserts require tens or hundreds
of thousands of independent cDNA clones; these collections are cDNA libraries
and represent only the protein-coding regions of the genome. A comprehensive
cDNA library includes mRNA samples from different tissues, different develop-
mental stages, or from organisms grown in different environmental conditions.
Whether we choose to construct a genomic DNA library or a cDNA library
depends on the situation. If we are seeking a specific gene that is active in a spe-
cific type of tissue in a plant or animal, then it makes sense to construct a cDNA
library from a sample of that tissue. For example, suppose we want to identify
cDNAs corresponding to insulin mRNAs. The β-islet cells of the pancreas are the
most abundant source of insulin, and so mRNAs from pancreas cells are the
appropriate source for a cDNA library because these mRNAs should be enriched
for the gene in question. A cDNA library represents a subset of the transcribed
regions of the genome; so it will inevitably be smaller than a complete genomic
library. Although genomic libraries are bigger, they do have the benefit of contain-
ing genes in their native form, including introns and untranscribed regulatory
sequences. A genomic library is necessary at some stage as a prelude to cloning an
entire gene or an entire genome.
Transfer colonies to
absorbent membrane.
is accomplished by using a specific probe that will find and mark only
the desired clone. There are two types of probes: (1) those that recog-
nize a specific nucleic acid sequence and (2) those that recognize a spe-
cific protein.
Incubate membrane
Membrane with radioactive probe. Probes for finding DNA Probing for DNA makes use of the power of
base complementarity. Two single-stranded nucleic acids with full or par-
tial complementary base sequence will “find” each other in solution by
random collision. After being united, the double-stranded hybrid so
formed is stable. This approach provides a powerful means of finding spe-
cific sequences of interest. Probing for DNA requires that all molecules be
made single stranded by heating. A single-stranded probe labeled radio-
actively or chemically is sent out to find its complementary target se-
quence in a population of DNAs such as a library. Probes as small as 15 to
Autoradiograph to
locate desired clone.
20 base pairs will hybridize to specific complementary sequences within
Film much larger cloned DNAs.
The identification of a specific clone in a library is a multistep proce-
dure. In Figure 10-12, these steps are shown for a library cloned into a
fosmid vector. The steps are similar for libraries of plasmids or BACs. For
libraries of phages, plaques are screened rather than colonies. First, col-
onies of the library on a petri dish are transferred to an absorbent mem-
brane by simply laying the membrane on the surface of the medium.
The membrane is peeled off, colonies clinging to the surface are lysed in
Desired place on the membrane, and the DNA is simultaneously denatured so
clone
that it is single-stranded. Second, the membrane is bathed with a solu-
tion of a single-stranded probe that is specific for the DNA sequence
being sought. Generally, the probe is itself a cloned piece of DNA that
has a sequence that is complementary to that of the desired gene. The
probe must be labeled with either a radioactive isotope or a fluorescent
Grow bacterial dye. Thus, the position of a concentrated radioactive or fluorescent label
colony.
will indicate the position of the positive clone. For radioactive probes, the
membrane is placed on a piece of X-ray film, and the decay of the radioiso-
tope produces subatomic particles that “expose” the film, producing a dark
spot on the film adjacent to the location of the radioisotope concentration.
Amplify Such an exposed film is called an autoradiogram. If a fluorescent dye is
desired gene. used as a label, the membrane is exposed to the correct wavelength of
light to activate the dye’s fluorescence, and a photograph is taken of the
membrane to record the location of the fluorescing dye.
Where does the DNA to make a probe come from? The DNA can
come from one of several sources.
• One can use a homologous gene or a cDNA from a related organism.
This method depends on the fact that organisms descended from a
recent common ancestor will have similar DNA sequences. Even
10.3 Using Molecular Probes to Find and Analyze a Specific Clone of Interest 3 6 9
though the probe DNA and the DNA of the desired clone might not be identi-
cal, they are often similar enough to promote hybridization.
• One can use the protein product of the gene of interest. If part or all of the
protein sequence is known, one can back-translate, by using the table of the
genetic code in reverse (from amino acid to codon), to deduce the DNA se-
quences that may have encoded it. A synthetic DNA probe that matches that
sequence is then designed. Recall, however, that the genetic code is degener-
ate—that is, most amino acids are encoded by multiple codons. Thus, several
possible DNA sequences could in theory encode the protein in question, but
only one of these DNA sequences is present in the gene that actually encodes
the protein. To get around this problem, a short stretch of amino acids with
minimal degeneracy is selected. A mixed set of probes is then designed con-
taining all possible DNA sequences that can encode this amino acid sequence.
This “cocktail” of oligonucleotides is used as a probe, and a correct or very simi-
lar strand within this cocktail will hybridize with the gene of interest.
Oligonucleotides of about 20 nucleotides in length embody enough specificity
to hybridize to one unique complementary DNA sequence in the library.
Probes for finding proteins If the protein product of a gene is known and isolated
in pure form, then this protein can be used to detect the clone of the corresponding
gene in a library. The process, described in Figure 10-13, requires two components.
First, it requires an expression library, made by using expression vectors that will
direct the host cell to produce the protein. To make the library, cDNA is inserted
into the vector in the correct triplet reading frame with a bacterial promoter, and
cells containing the vector and its insert produce a translation of the cDNA insert.
Second, the process requires an antibody that binds to the specific protein product
of the gene of interest. (An antibody is a protein made by an animal’s immune
system that binds with high affinity to a given molecule.) The antibody is used to
screen the expression library for that protein. A membrane is lain over the surface
of the medium and removed so that some of the cells of each colony are now
attached to the membrane at locations that correspond to their positions on the
original petri dish (see Figure 10-13). The imprinted membrane is then dried and
bathed in a solution of the antibody, which will bind to the imprint of any colony
that contains the fusion protein of interest. Positive clones are revealed by a labeled
secondary antibody that binds to the first antibody. By detecting the correct pro-
tein, the antibody identifies the clone containing the gene that must have synthe-
sized that protein and therefore contains the desired cDNA.
We can see how this type of probe works in practice by returning to the human
insulin example. To clone a cDNA corresponding to human insulin, we first syn-
thesize cDNA using mRNA isolated from pancreas cells as the template. The
cDNA molecules are then inserted into a bacterial expression vector and the vec-
tor is transformed into bacteria. Bacterial colonies containing insulin cDNA will
express insulin protein. The insulin protein is identified by its binding with an
insulin antibody as described above.
In vitro packaging
Plate on bacterial lawn.
Overlay
membrane.
Remove membrane.
Master plate
Autoradiography
125
I Labeled
secondary
Antibody
antibody X-ray film
identifies
Fusion protein Primary specific
bound to antibody plaques.
membrane
F i g u r e 10 -13 To find the clone of mutant organism and restore the wild-type phenotype. Thus, if we are able to intro-
interest, an expression library made with duce the library into the species bearing the recessive mutation (see Section 10.6),
special phage λ vector called λgt11 is we can detect specific clones in the library through their ability to restore the func-
screened with a protein-specific tion eliminated by the recessive mutation. This procedure is called functional
antibody. After the unbound antibodies
have been washed off the membrane,
complementation or mutant rescue. The general outline of the procedure is as
the bound antibodies are visualized follows:
through the binding of a radioactive • Make a library containing wild-type a+ recombinant-donor DNA inserts.
secondary antibody.
• Transform cells of recessive-mutant-cell-line a− with this library of DNA
inserts.
Introduction to Genetic Analysis, 11e
Figure 10.13 #1019
05/21/14 • Identify clones from the library that produce transformed cells with the domi-
06/09/14 nant a+ phenotype.
Dragonfly Media Group
• Recover the a+ gene from the successful bacterial or phage clone.
So far, we have described techniques to transform only bacterial cells. You will
see later in this chapter that DNA can be introduced into many genetic model
10.3 Using Molecular Probes to Find and Analyze a Specific Clone of Interest 371
Gel electrophoresis
M M
well separated, an individual band can be cut from the gel, and the DNA sample can
be purified from the gel matrix. Therefore, DNA electrophoresis can be either diag-
nostic (showing sizes and relative amounts of the DNA fragments present) or pre-
parative (useful in isolating specific DNA fragments).
Genomic DNA digested by restriction enzymes generally yields so many frag-
ments that electrophoresis produces a continuous smear of DNA and no discrete
bands. A probe can identify one fragment in this mixture, with the use of a tech-
nique developed by E. M. Southern called Southern blotting (Figure 10-15a). Like
clone identification (see Figure 10-12), this technique entails getting an imprint of
DNA molecules on a membrane by using the membrane to blot the gel after elec-
trophoresis is complete. The DNA must be denatured first, which allows it to stick
to the membrane. Then the membrane is hybridized with a labeled probe. An auto-
radiogram or a photograph of fluorescent bands will reveal the presence of any
bands on the gel that are complementary to the probe. To detect the insulin gene,
we can apply this protocol to human genomic DNA digested with restriction
enzymes on the membrane, using insulin cDNA as the labeled probe.
The Southern-blotting technique can be modified to detect a specific RNA mol-
ecule from a mixture of RNAs fractionated on a gel. This technique is called RNA
blotting or, more commonly, Northern blotting (thanks to some scientist’s sense
of humor) to contrast it with the Southern-blotting technique used for DNA analy-
sis. The RNA separated by electrophoresis can be a sample of the total RNA isolated
from a tissue or from an entire organism. In the example shown in Figure 10-15b,
the gel was run with RNA isolated from the seeds of various plants. Unlike DNA
that is loaded onto a gel, there is no need to digest the RNA sample as it is produced
in discrete transcript-size molecules. RNA gels are blotted onto a membrane and
probed in the same way as DNA is blotted and probed for Southern blotting. One
application of Northern analysis is to determine whether a specific gene is tran-
scribed in a certain tissue or under certain environmental conditions. Another is to
determine the size of the mRNA and whether an RNA of similar size can be detected
in closely related plants (as in Figure 10-15b).
We started this section on blot analysis by posing questions about the insulin
gene and its mRNA in human populations and in related species. Based on the
techniques above, can you design Southern- and Northern-blot experiments to
answer these questions? You can assume that you have access to samples of the
required genomic DNAs and RNAs.
Hence, we see that cloned DNA finds widespread application as a probe used
for detecting a specific clone, a DNA fragment, or an RNA molecule. In all these
cases, note that the technique again exploits the ability of nucleic acids with com-
plementary nucleotide sequences to find and bind to each other.
Salt
solution Membrane
(b)
Filter
Sorghum
Soybean
Gel
Cotton
Wheat
Maize
Millet
Rice
DNA
Hybridize with unique transferred
nucleic acid probe. to membrane
2.4 kb
Filter in
“seal-a-meal”
bag.
Probe
Wash away hybridized to
unbound probe. complementary
sequence
O O O
—
O
O—
—P—O—P—O—P—O— base
—
O O O
H H
F i g u r e 10 -16 2′,3′-Dideoxynucleotides, Cannot form a
which are employed in the Sanger DNA- phosphodiester bond
sequencing method, are missing the ribose with next incoming dNTP
hydroxyl group present in DNA.
10.4 Determining the Base Sequence of a DNA Segment 375
synthesize the complementary DNA strand, starting from the primer, but will
stop at any point at which the dideoxynucleotide triphosphate is incorporated
into the growing DNA chain in place of the normal deoxynucleotide triphosphate.
Suppose the DNA segment that we’re trying to sequence is
5′ ACGGGATAGCTAATTGTTTACCGCCGGAGCCA 3′
5 ACGGGATAGCTAATTGTTTACCGCCGGAGCCA 3
3 CGGCC TCGGT 5
Direction of DNA synthesis
Using the special DNA-synthesis cocktail “spiked” with a small amount of ddATP,
for example, we will create a nested set of DNA fragments that have the same
starting point but different end points because the fragments stop at whatever
point the insertion of ddATP instead of dATP halted DNA replication. The array
of different ddATP-arrested DNA chains looks like the list of sequences below.
(*A indicates the dideoxynucleotide.)
5 ATGGGATAGCTAATTGTTTACCGCCGGAGCCA 3 Template DNA clone
3 CGGCC TCGGT 5 Primer for synthesis
Direction of DNA synthesis
3 *ATGGCGGCC TCGGT 5 Dideoxy fragment 1
3 *AATGGCGGCC TCGGT 5 Dideoxy fragment 2
3 *AAATGGCGGCC TCGGT 5 Dideoxy fragment 3
3 *ACAAATGGCGGCC TCGGT 5 Dideoxy fragment 4
3 *AACAAATGGCGGCC TCGGT 5 Dideoxy fragment 5
3 *ATTAACAAATGGCGGCC TCGGT 5 Dideoxy fragment 6
3 *ATCGATTAACAAATGGCGGCC TCGGT 5 Dideoxy fragment 7
3 *ACCCTATCGATTAACAAATGGCGGCC TCGGT 5 Dideoxy fragment 8
We can generate an array of such fragments for each of the four possible dide-
oxynucleotide triphosphates in four separate cocktails (one spiked with ddATP,
one with ddCTP, one with ddGTP, and one with ddTTP). Each will produce a dif-
ferent array of fragments, with no two spiked cocktails producing fragments of
the same size. Next, the DNA fragments generated in the four cocktails are sepa-
rated and displayed in order using gel electrophoresis. By running the fragments
in four adjacent lanes of a polyacrylamide gel that can resolve fragments of DNA
that vary by only one nucleotide in length, we see that the fragments can be
ordered by length with the lengths increasing by one base at a time.
The newly synthesized strands must be labeled in some way to make the
bands visible on the gel. Strands are labeled as they are made either by using a
primer that is radioactively labeled or having one of the regular dNTPs carry a
radioactive label. Fluorescent labels can also be used, and in this case they are
carried by each ddNTP (see below).
The products of such dideoxy sequencing reactions are shown in Figure 10-17.
That result is a ladder of labeled DNA chains increasing in length by one, and so
all we need do is read up the gel to read the DNA sequence of the synthesized
strand in the 5′-to-3′ direction.
If the tag is a fluorescent dye, a different fluorescent color emitter is used for
each of the four ddNTP reactions, and a detector at the end of the gel can distin-
guish each color. The four reactions take place in the same test tube, and the four
sets of nested DNA chains can undergo electrophoresis together. Thus, four times
as many sequences can be produced in the same amount of time as can be
376 CHAPTER 1 0 Gene Isolation and Manipulation
A T
A T
T A
C G
T A
G C
G C
G C
C G
Acrylamide T A
gel A T
T A
T A
C G
G C
G C
G C
C G
G C
T A
F i g u r e 10 -18 Printout from an automatic sequencer that uses fluorescent dyes. Each of
the four colors represents a different base. The letter N represents a base that cannot be
assigned because peaks are too low. Note that, if this were a gel as in Figure 10-17c, each
of these peaks would correspond to one of the dark bands on the gel; in other words, these
colored peaks represent a different readout of the same sort of data as are produced on a
sequencing gel.
of these procedures directly identifies the gene. Thus, regions delimited by the
molecular landmarks such as SNPs and RFLPs usually contain many genes spread
over hundreds of thousands or even millions of base pairs. To identify the correct
gene responsible for a particular trait, researchers need to be able to analyze the
whole neighborhood for the gene of interest. In model organisms for which the
entire genome sequence is available, the local neighborhood with its numerous
genes can simply be obtained from a computer database (see Chapter 14). From
sequence analysis of these “neighborhood” genes, most likely candidates are cho-
sen that might represent the gene being sought.
Below, we will briefly discuss how positional cloning was used before the avail-
ability of the human genome sequence to isolate the gene responsible for the
devastating human disease cystic fibrosis (the CF gene). While this technique is
no longer necessary for identifying human genes, it is still used to identify specific
genes in organisms where a genome sequence is yet to be determined.
7q31.1
7q36.3
large inserts representing an entire
7p233
7p112
CFTR
7q22
eukaryotic genome. In the example
shown, the molecular marker 7q22 is used
Chromosome 7 to probe a human genomic library. Only
the insert DNAs are shown. The insert
DNA selected by the probe is then used to
isolate another recombinant phage or BAC
7q22
Collection of
overlapping
clones covers
7q31.2
7q22 7q31.2
Direction of
chromosome walk
none of the normal individuals. This mutation was a deletion of three base pairs,
eliminating a phenylalanine from the protein. In turn, from the inferred sequence,
the three-dimensional structure of the protein was predicted. This protein is struc-
turally similar to ion-transport proteins in other systems, suggesting that a transport
defect is the primary cause of CF. When used to transform mutant cell lines from CF
patients, the wild-type gene restored normal function; this phenotypic “rescue” was
the final confirmation that the isolated sequence was in fact the CF gene.
Other human genes isolated by positional cloning include those involved in
several heritable diseases, including Huntington disease, breast cancer, Werner
syndrome (see Chapter 7), and susceptibility to asthma. Because of the relative
ease of crossing plants, positional cloning has been a very powerful technique to
isolate genes involved in many processes, including the identification of genes
that contributed to crop domestication.
(a)
Starting Starting
marker 1 (M1) marker 2 (M2)
A C E F
B D G
(b)
Individuals Phenotype
a Disease (d )
b Normal (D)
c Normal (D)
d Normal (D)
e Normal (D)
f Disease (d )
g Disease (d )
h Normal (D)
i Normal (D)
j Normal (D)
k Normal (D)
l Normal (D)
10.5 Aligning Genetic and Physical Maps to Isolate Specific Genes 3 81
Starting Starting
marker 1 marker 2
M1 M2 M1 M2
D d
Parent 1, Parent 2,
dominant
mutant
normal
D d
M1 M2
D
F1
d
M1 M2 M1 M2
D D
Cross with
self or sib
(recombination can
occur between M1 and M2) d d
M1 M2 M1 M2 M1 M2 M1 M2 M1 M2
F2 D d D D d
D d d d d
Key
gene. For example, with only individuals a and b in Figure 10-20b, an investigator
could only narrow the search to four genes (D, E, F, G). Second, although online
databases contain lists of DNA-based markers such as SNPs, not all of these have
alleles that are only present in individuals with the trait of interest in a particular
pedigree or cross, so researchers must first screen a large number of markers to find
those that do. Finally, investigators need to determine the complete DNA sequence
of the disease allele of the gene to identify the causative lesion. In most cases, the
online genome sequence will contain the wild-type allele. The disease allele is most
easily sequenced using PCR to amplify that allele from the DNA of affected indi-
viduals and without actually cloning the DNA. The precise mutation can then be
deduced from the DNA sequence of the PCR product.
(a) (b)
DNA-coated
tungsten
2 Projectile
gun
3 Injection
1 Transformation
4 Virus
Transgene
Gene X +
Plasmid Nucleus
Marker
Gene X +
Chromosome 1 2
–
Gene X
Single crossover at 1
F i g u r e 10 -2 3 A plasmid bearing an expressed concern that the introduction of GMOs into the food supply may pro-
active allele (gene X+) inserts into a duce unexpected health problems. The concern about GMOs is one facet of an
recipient yeast strain bearing a defective ongoing public debate about complex public health, safety, ethical, and educa-
gene (X−) by homologous recombination.
tional issues raised by the new genetic technologies.
The result can be replacement of the
defective gene X− by X+ (top) or its
A vector routinely used to produce transgenic plants is derived from the
retention along with the new allele Ti plasmid, a natural plasmid from a soil bacterium called Agrobacterium tumefa-
(bottom). The mutant site of gene X− is ciens. This bacterium causes what is known as crown gall disease, in which the
represented as a vertical black bar. Single infected plant produces uncontrolled growths called tumors or galls. The key to
crossovers at position 2 also are possible tumor production is a large (200-kb) circular DNA plasmid—the Ti (tumor-inducing)
but are not shown.
T-DNA
Opine
utilization
T-DNA
transfer Ti plasmid
functions
F i g u r e 10 -2 4 Simplified
representation of the major regions of
the Ti plasmid of A. tumefaciens
containing an engineered T-DNA.
10.6 Genetic Engineering 3 8 5
Engineered
T-DNA
Tobacco-plant
cell
Transformed
cell
kanR cells
selected
Transgenic Cell of
Cultured tobacco transgenic
cells Plantlet plant plant
C. elegans
Extrachromosomal array
Integrated array
Chromosome
(a) Micropipette
with DNA
solution
Nucleus
extrachromosomal arrays (Figure 10-27b) that exist as independent units outside the
chromosomes. More rarely, the transgenes will become integrated into an ectopic
position in a chromosome, still as a multicopy array. Unfortunately, sequences may
become scrambled within the arrays, complicating the work of the researcher.
Transgenesis in M. musculus Mice are the most important models for mamma-
lian genetics. Most exciting, much of the technology developed in mice is poten-
tially applicable to humans. There are two strategies for transgenesis in mice,
each having its advantages and disadvantages:
• Ectopic insertions. Transgenes are inserted randomly in the genome, usually as
multicopy arrays.
• Gene targeting. The transgene sequence is inserted into a location occupied by
a homologous sequence in the genome. That is, the transgene replaces its nor-
mal homologous counterpart.
Targeting
vector
tk+
Cloned
gene
neoR
Exon 2
Add targeting
Add tk+ gene. vector.
Insert neoR
into exon 2.
Exon 1 Cultured
mouse embryonic
stem cells
Possible outcomes
Ectopic
Vector (random)
insertion
No insertion
Vector Vector
Nontarget Unchanged
gene in Nontarget chromosome
Chromosome Chromosome gene in
with targeted chromo- with random
some chromosome
Target gene in insertion insertion
chromosome
Neomycin
analog Ganciclovir Cell with no
insertion
Add to medium.
Black female
ES cells a / a ; M / M plus
Blastocyst- A /A ; M /m Surrogate
from brown
stage embryo Altered embryo mother
mouse
Embryo
A /A ; M /M
Mature chimera A /a ; M /m A /– ; m /m
a /a ; M /M A /a ; M /m A /– ; M /–
A /a ; M /M a /a ; M /–
Figure 10-30 A knockout mouse is produced by inserting ES cells carrying the targeted mutation. (a) Embryonic stem (ES)
cells are isolated from an agouti (brown) mouse strain ( A /A) and altered to carry a targeted mutation (m) in one chromosome. The
ES cells are then inserted into young embryos, one of which is shown. Coat color of the future newborns is a guide to whether
the ES cells have survived in the embryo. Hence, ES cells are typically put into embryos that, in the absence of the ES cells,
would acquire a totally black coat. Such embryos are obtained from a black strain that lacks the dominant agouti allele (a /a). The
embryos containing the ES cells grow to term in surrogate mothers. Agouti shading intermixed with black indicates those
newborns in which the ES cells have survived and proliferated. (Such mice are called chimeras because they contain cells
derived from two different strains of mice.) Solid black coloring, in contrast, indicates that the ES cells have perished, and these
mice are excluded. A represents agouti; a, black; m is the targeted mutation; and M is its wild-type allele. (b) Chimeric males are
mated with black (nonagouti) females. Progeny are screened for evidence of the targeted mutation (green in inset) in the gene of
interest. Direct examination of the genes in the agouti mice reveals which of those animals (boxed ) inherited the targeted
mutation. Males and females carrying the mutation are mated with one another to produce mice whose cells carry the chosen
mutation in both copies of the target gene ( inset ) and thus lack a functional gene. Such animals (boxed ) are identified definitively
by direct analyses of their DNA. The knockout in this case results in a curly-tail phenotype.
will be chimeric—that is, it will contain cells from two different mouse strains. When
the chimeric mouse reaches adulthood, it is mated with a normal mouse. If the
chimeric mouse had taken up ES cells (with the knockout gene) into germ-line cells,
then some of the resulting offspring will inherit the gene knockout in all their cells.
Sibling mice that are identified as being heterozygous for the knockout version of
the gene of interest are then mated in order to produce mice that are homozygous
for the knockout allele. (If the gene is essential, homozygotes will be lethal and none
will be obtained from this cross.) (Figure 10-30b).
summary
Recombinant DNA is constructed in the laboratory to allow amplification of plasmids, phages, and BACs is clones contain-
researchers to amplify and analyze DNA segments (donor ing multiple copies of each recombinant DNA construct. In
DNA) from any genome or from DNA copies of mRNAs. contrast, only a single fosmid is present in each bacterial cell.
Three sources of donor DNA are (1) the entire genome Often, finding a specific clone with a gene of interest
digested with a restriction enzyme, (2) PCR products of spe- requires screening a full genomic library. A genomic library is
cific DNA regions defined by the flanking primer sequences, a set of clones, ligated in the same vector, that together repre-
and (3) cDNA copies of mRNAs. sent all regions of the genome of the organism in question.
The polymerase chain reaction is a powerful method for The number of clones that constitute a genomic library
the direct amplification of a relatively small sequence of depends on (1) the size of the genome in question and (2) the
DNA from within a complex mixture of DNA, without the insert size tolerated by the particular cloning-vector system.
need for a host cell or very much starting material. The key Similarly, a cDNA library is a representation of the total
is to have primers that are complementary to flanking mRNA set produced by a given tissue or developmental stage
regions on each of the two DNA strands. These regions act as in a given organism.
sites for polymerization. Multiple rounds of denaturation, Labeled single-stranded DNA or RNA probes are impor-
priming, and polymerization amplify the sequence of inter- tant for fishing out similar or identical sequences from com-
est exponentially. plex mixtures of molecules, either in genomic or cDNA librar-
To insert donor DNA into vectors, donor DNA and vector ies or in Southern (DNA) and Northern (RNA) blotting. The
DNA are cut by the same restriction endonuclease at specific general principle of the technique for identifying clones or
sequences. Vector and donor DNA are joined by annealing the gel fragments is to create an “image” of the colonies or
sticky ends that result from digestion, followed by ligation to plaques on an agar petri-dish culture or of the nucleic acids
covalently join the molecules. PCR and cDNA molecules are that have been separated in an electric field passed through a
inserted into vectors by first adding restriction-endonuclease- gel matrix. The DNA or RNA is then denatured and mixed
recognition sequences to the 5′ end of PCR primers or by ligat- with a denatured probe that has been labeled with a fluores-
ing short adapters containing restriction sites to their ends cent dye or a radioactive label. After the unbound probe has
before insertion into the vector. been washed off, the location of the probe is detected either
There are a wide variety of bacterial vectors. The choice of by observing its fluorescence or, if radioactive, by exposing
vector depends largely on the size of the DNA fragment to be the sample to X-ray film. The locations of the probe corre-
cloned. Plasmids are used to clone small restriction fragments, spond to the locations of the relevant DNA or RNA in the
PCR molecules, or cDNA molecules. Intermediate-size frag- original petri dish or electrophoresis gel. Labeled antibodies
ments, such as those resulting from the digestion of genomic are important probes for fishing out specific proteins from
DNA, can be cloned into modified versions of λ bacteriophage complex mixtures produced either by expression libraries
(for inserts of 10–15 kb) or into phage–plasmid hybrids called (with cDNA inserts) or in Western blotting.
fosmids (for inserts of 35–45 kb). Finally, bacterial artificial Vast genomic resources are making it increasingly pos-
chromosomes (BACs) are used routinely to clone very large sible to isolate genes solely from knowledge of their position
genomic fragments (~100–200 kb). on a genetic map. Two overall procedures are forward
The vector–donor DNA construct is amplified inside bac- genetic strategies called positional cloning and fine-struc-
terial host cells as extrachromosomal molecules that are repli- ture mapping. With the sequencing of the human genome
cated when the host is replicating its genome. The result of and the availability of families with inherited disorders,
3 92 CHAPTER 1 0 Gene Isolation and Manipulation
fine-structure-mapping strategies have led to isolation of extrachromosomal molecules or they can be integrated into a
genes that when mutated produce human disease. chromosome, either in random (ectopic) locations or in place
Transgenes are engineered DNA molecules that are intro- of the homologous gene, depending on the system. Typically,
duced and expressed in eukaryotic cells. They can be used to the mechanisms used to introduce a transgene depend on an
engineer a novel mutation or to study the regulatory sequences understanding and exploitation of the reproductive biology of
that constitute part of a gene. Transgenes can be introduced as the organism.
k e y t erms
amplification (p. 353) ectopically (p. 382) Northern blotting (p. 372)
antibody (p. 369) fine mapping (p. 381) polymerase chain reaction (PCR)
autoradiogram (p. 368) fosmid (p. 364) (p. 354)
bacterial artificial chromosome functional complementation (mutant positional cloning (p. 377)
(BAC) (p. 365) rescue) (p. 370) position effect (p. 387)
cDNA library (p. 367) gel electrophoresis (p. 371) probe (p. 368)
chromosome walk (p. 378) gene knockout (p. 388) recombinant DNA (p. 354)
complementary DNA (cDNA) gene replacement (p. 388) restriction enzyme (p. 355)
(p. 358) genetically modified organism restriction fragment (p. 355)
dideoxy (Sanger) sequencing (GMO) (p. 383) RNA blotting (p. 372)
(p. 374) genetic engineering (p. 352) Sanger (dideoxy) sequencing (p. 374)
DNA cloning (p. 354) genomic library (p. 367) Southern blotting (p. 372)
DNA ligase (p. 360) genomics (p. 352) Ti plasmid (p. 384)
DNA palindrome (p. 355) hybridization (p. 355) transgene (p. 382)
DNA technology (p. 352) mutant rescue (functional transgenic organism (p. 382)
donor DNA (p. 353) complementation) (p. 370) vector (p. 353)
so lv ed prob l ems
SOLVED PROBLEM 1. In Chapter 9, we studied the structure these clones with the tRNA to confirm that these clones
of tRNA molecules. Suppose that you want to clone a fungal contain the gene of interest.
gene that encodes a certain tRNA. You have a sample of the
SOLVED PROBLEM 2. You have isolated a yeast gene that
purified tRNA and an E. coli plasmid that contains a single
functions in the synthesis of the amino acid leucine, and you
EcoRI cutting site in a tet R (tetracycline-resistance) gene, as
hypothesize that it has the same function as a related gene
well as a gene for resistance to ampicillin (amp R). How can
from E. coli. How would you use functional complementa-
you clone the gene of interest?
tion/mutant rescue to test your theory?
Solution Solution
You can use the tRNA itself or a cloned cDNA copy of it to First, let’s assume that the yeast gene of interest is on an
probe for the DNA containing the gene. One method is to EcoRI-digested fragment. You will use recombinant DNA
digest the genomic DNA with EcoRI and then mix it with the techniques to insert this fragment into the polylinker site of
plasmid, which you also have cut with EcoRI. After transfor- a bacterial plasmid that has been cut with EcoRI and treated
mation of an amp S tet S recipient, select AmpR colonies, indi- with ligase to reform the circular plasmid. This plasmid
cating successful transformation. Of these AmpR colonies, should also contain a selectable marker for antibiotic resis-
select the colonies that are TetS. These TetS colonies will con- tance, like ampR (see Figure 10-9). Next, you need to trans-
tain vectors with inserts in the tet R gene, and a great number form this recombinant plasmid into leu− E. coli mutants.
of them are needed to make the library. Test the library by However, because there are four genes needed for leucine
using the tRNA as the probe. Those clones that hybridize to biosynthesis and you don’t know which one your yeast gene
the probe will contain the gene of interest. may be, you need to test for complementation in four mutant
Alternatively, you can subject EcoRI-digested genom- E. coli strains, each with a mutation in a different one of the
ic DNA to gel electrophoresis and then identify the cor- four genes (called leuA, leuB, leuC, and leuD). Grow the four
rect band by probing with the tRNA. This region of the E. coli strains separately, and perform a transformation
gel can be cut out and used as a source of enriched DNA experiment where the same recombinant plasmid is intro-
to clone into the plasmid cut with EcoRI. You then probe duced into each of the four strains. Next plate out the
Problems 3 9 3
transformants on agar plates that contain the antibiotic ampi- only can you conclude that you isolated a yeast gene involved
cillin (so only cells that have taken up the plasmid can grow) in leucine biosynthesis, but you can also determine which leu
but do not contain leucine. If you see colonies growing on one gene (leuA, leuB, leuC, or leuD). That is, if the yeast gene is
of the four plates containing transformed E. coli cells, not leuA, it will only complement the leuA− E. coli mutant.
prob l ems
Most of the problems are also available for review/grading through the http://www.whfreeman.com/
launchpad/iga11e.
W or k ing wi t h t h e F igures Do all of the cells of a transgenic plant grown from one
1. Figure 10-1 shows that specific DNA fragments can be clump of cells contain T-DNA? Justify your answer.
synthesized in vitro prior to cloning. What are two ways 11. In Figure 10-27, what is the difference between extra-
to synthesize DNA inserts for recombinant DNA in chromosomal DNA and integrated arrays of DNA? Are
vitro? the latter ectopic? What is distinctive about the syncytial
2. In Figure 10-4, why is cDNA made only from mRNA region that makes it a good place to inject DNA?
and not also from tRNAs and ribosomal RNAs?
B asic P rob l ems
3. Redraw Figure 10-6 with the goal of adding one EcoRI
end and one XhoI end. Below is the Xhol recognition 12. From this chapter, make a list of all the examples of (a)
sequence. the hybridization of single-stranded DNAs and (b) pro-
teins that bind to DNA and then act on it.
Recognition sequence: After cut: 13. Compare and contrast the use of the word recombinant as
used in the phrases (a) “recombinant DNA” and (b) “re-
. . . C T C G A G . . . . . . C T C G A G . . . combinant frequency.”
. . . G A G C T C . . . . . . G A G C T C . . .
14. Why is ligase needed to make recombinant DNA? What
would be the immediate consequence in the cloning pro-
4. Redraw Figure 10-7 so that the cDNA can insert into an cess if someone forgot to add it?
XhoI site of a vector rather than into an EcoRI site as
15. In the PCR process, if we assume that each cycle takes
shown
5 minutes, how manyfold amplification would be accom-
5. In Figure 10-10, determine approximately how many plished in 1 hour?
BAC clones are needed to provide 1× coverage of
16. The position of the gene for the protein actin in the hap-
a. the yeast genome (12 Mbp). loid fungus Neurospora is known from the complete ge-
b. the E. coli genome (4.1 Mbp). nome sequence. If you had a slow-growing mutant that
c. the fruit-fly genome (130 Mbp). you suspected of being an actin mutant and you wanted
to verify that it was one, would you (a) clone the mutant
6. In Figure 10-14, why does DNA migrate to the anode
by using convenient restriction sites flanking the actin
(+ pole)?
gene and then sequence it or (b) amplify the mutant
7. In Figure 10-17a, why are DNA fragments of different gene by using PCR and then sequence it?
length and all ending in an A residue synthesized?
17. You obtain the DNA sequence of a mutant of a 2-kb gene
8. As you will see in Chapter 15, most of the genomes of in which you are interested and it shows base differences
higher eukaryotes (plants and animals) are filled with at three positions, all in different codons. One is a silent
DNA sequences that are present in hundreds, even thou- change, but the other two are missense changes (they en-
sands, of copies throughout the chromosomes. In the code new amino acids). How would you demonstrate
chromosome-walking procedure shown in Figure 10-19, that these changes are real mutations and not sequenc-
how would the experimenter know whether the frag- ing errors? (Assume that sequencing is about 99.9 per-
ment he or she is using to “walk” to the next BAC or cent accurate.)
phage is repetitive? Can repetitive DNA be used in a
18. In a T-DNA transformation of a plant with a transgene
chromosome walk?
from a fungus (not found in plants), the presumptive
9. Redraw Figure 10-23 to include the positions of the sin- transgenic plant does not express the expected pheno-
gle and double crossovers. type of the transgene. How would you demonstrate that
10. In Figure 10-25, why do only plant cells that have T-DNA the transgene is in fact present? How would you demon-
inserts in their chromosomes grow on the agar plates? strate that the transgene was expressed?
39 4 CHAPTER 1 0 Gene Isolation and Manipulation
19. How would you produce a mouse that is homozygous for site). Explain what a polylinker is and why it is such an
a rat growth-hormone transgene? important feature.
20. Why was cDNA and not genomic DNA used in the com- 30. A second feature that virtually all plasmid vectors have
mercial cloning of the human insulin gene? in common is the selectable marker. Explain what this is
21. After Drosophila DNA has been treated with a restriction and why it is such an important feature.
enzyme, the fragments are inserted into plasmids and se-
lected as clones in E. coli. With the use of this “shotgun” C h a l l enging P rob l ems
technique, every DNA sequence of Drosophila in a library 31. Prototrophy is often the phenotype selected to detect
can be recovered. transformants. Prototrophic cells are used for donor
a. How would you identify a clone that contains DNA DNA extraction; then this DNA is cloned and the clones
encoding the protein actin, whose amino acid sequence are added to an auxotrophic recipient culture. Successful
is known? transformants are identified by plating the recipient
b. How would you identify a clone encoding a specific culture on minimal medium and looking for colonies.
tRNA? What experimental design would you use to make sure
that a colony that you hope is a transformant is not, in
22. In any particular transformed eukaryotic cell (say, of
fact,
Saccharomyces cerevisiae), how could you tell if the trans-
forming DNA (carried on a circular bacterial vector) a. a prototrophic cell that has entered the recipient
culture as a contaminant?
a. replaced the resident gene of the recipient by double
crossing over or single crossing over? b. a revertant (mutation back to prototrophy by a second
mutation in the originally mutated gene) of the
b. was inserted ectopically?
auxotrophic mutation?
23. In an electrophoretic gel across which is applied a power-
ful electrical alternating pulsed field, the DNA of the 32. A cloned fragment of DNA was sequenced by using the
haploid fungus Neurospora crassa (n = 7) moves slowly dideoxy chain-termination method. A part of the autora-
but eventually forms seven bands, which represent DNA diogram of the sequencing gel is represented here.
fractions that are of different sizes and hence have ddA ddG ddT ddC
moved at different speeds. These bands are presumed to
be the seven chromosomes. How would you show which
band corresponds to which chromosome?
24. The protein encoded by the cystic fibrosis gene is 1480
amino acids long, yet the gene spans 250 kb. How is this
difference possible?
25. In yeast, you have sequenced a piece of wild-type DNA
and it clearly contains a gene, but you do not know what
gene it is. Therefore, to investigate further, you would
like to find out its mutant phenotype. How would you
use the cloned wild-type gene to do so? Show your ex-
perimental steps clearly.
26. Why is it necessary to use a special DNA polymerase
(Taq polymerase) in PCR?
a. Deduce the nucleotide sequence of the DNA
27. For each of the following experimental goals, is PCR or
nucleotide chain synthesized from the primer. Label the
gene cloning preferable and why?
5′ and 3′ ends.
a. Isolate the same gene from 20 individuals.
b. Deduce the nucleotide sequence of the DNA
b. Isolate 100 genes from the same individual. nucleotide chain used as the template strand. Label the
c. Isolate a mouse gene when you have a rat gene 5′ and 3′ ends.
fragment. c. Write out the nucleotide sequence of the DNA double
28. In Northern blotting, electrophoresis is used to resolve helix (label the 5′ and 3′ ends).
which biological molecules? What type of probe is used 33. The cDNA clone for the human gene encoding tyrosi-
to identify the target molecule(s)? nase was radioactively labeled and used in a Southern
29. One feature that virtually all plasmid vectors have in analysis of EcoRI-digested genomic DNA of wild-type
common is the polylinker (also called a multiple cloning mice. Three mouse fragments were found to be radio-
Problems 3 9 5
active (were bound by the probe). When albino mice also is a heterozygote for cystic fibrosis, but, in his case, it
were used in this Southern analysis, no genomic frag- is a different mutation in the same gene. How would you
ments bound to the probe. Explain these results in rela- counsel this couple about the risks of a child’s having cys-
tion to the nature of the wild-type and mutant mouse tic fibrosis?
alleles. 36. Bacterial glucuronidase converts a colorless substance
34. Transgenic tobacco plants were obtained in which the called X-Gluc into a bright blue indigo pigment. The
vector Ti plasmid was designed to insert the gene of gene for glucuronidase also works in plants if given a
interest plus an adjacent kanamycin-resistance gene. plant promoter region. How would you use this gene as
The inheritance of chromosomal insertion was fol- a reporter gene to find the tissues in which a plant gene
lowed by testing progeny for kanamycin resistance. that you have just cloned is normally active? (Assume
Two plants typified the results obtained generally. that X-Gluc is easily taken up by the plant tissues.)
When plant 1 was backcrossed with wild-type tobacco, 37. The plant Arabidopsis thaliana was transformed by using
50 percent of the progeny were kanamycin resistant the Ti plasmid into which a kanamycin-resistance gene
and 50 percent were sensitive. When plant 2 was back- had been inserted in the T-DNA region. Two kanamycin-
crossed with the wild type, 75 percent of the progeny resistant colonies (A and B) were selected, and plants
were kanamycin resistant and 25 percent were sensi- were regenerated from them. The plants were allowed to
tive. What must have been the difference between the self-pollinate, and the results were as follows:
two transgenic plants? What would you predict about
the situation regarding the gene of interest? Plant A selfed → 3
progeny resistant to kanamycin
4
35. A cystic-fibrosis mutation in a certain pedigree is due to 1 progeny sensitive to kanamycin
4
a single nucleotide-pair change. This change destroys an
EcoRI restriction site normally found in this position. Plant B selfed → 15
16
progeny resistant to kanamycin
How would you use this information in counseling mem- 1 progeny sensitive to kanamycin
16
bers of this family about their likelihood of being carri-
ers? State the precise experiments needed. Assume that a. Draw the relevant plant chromosomes in both
you find that a woman in this family is a carrier, and it plants.
transpires that she is married to an unrelated man who b. Explain the two different ratios.
This page intentionally left blank
344
11
Regulation of Gene
C h a p t e r
Expression in Bacteria
and Their Viruses
Learning Outcomes
After completing this chapter, you will be able to
outline
397
39 8 CHAPTER 1 1 Regulation of Gene Expression in Bacteria and Their Viruses
I
n December 1965, the king of Sweden presented the Nobel Prize in Physiology
or Medicine to François Jacob, Jacques Monod, and André Lwoff of the Pasteur
Institute for their discoveries of how gene expression is regulated (Figure 11-1).
The prizes were the fruit of an exceptional collaboration among three superb sci-
entists. They were also triumphs over great odds. The chances were slim that each
of these three men would have lived to see that day, let alone earn such honors.
Twenty-five years earlier, Monod had been a doctoral student at the Sorbonne
in Paris, working on a phenomenon in bacteria called “enzymatic adaptation” that
seemed so obscure to some that the director of the zoological laboratory where he
worked stated, “What Jacques Monod is doing is of no interest whatever to the
Sorbonne.” Jacob was a 19-year-old medical student intent on becoming a surgeon.
Lwoff was by that time a well-established member of the Pasteur Institute in Paris,
chief of its department of microbial physiology.
Then came World War II.
As France was invaded and quickly defeated, Jacob raced for the coast to join
the Free French forces assembling in England. He served as a medic in North Africa
and in Normandy until badly wounded. Monod joined the French Resistance while
continuing his work. After a Gestapo raid on his Sorbonne laboratory, Monod
decided that working there was too dangerous (his predecessor in the Resistance
was arrested and executed), and André Lwoff offered him space at the Pasteur.
Monod, in turn, connected Lwoff with the Resistance.
After the liberation of Paris, Monod served in the French army and happened
on an article by Oswald Avery and colleagues demonstrating that DNA is the hered-
itary material in bacteria (see Chapter 7). His interest in genetics was rekindled, and
he rejoined Lwoff after the war. Meanwhile, Jacob’s injuries were too severe for him
to pursue a career in surgery. Inspired by the enormous impact of antibiotics intro-
duced late in the war, Jacob eventually decided to pursue scientific research. Jacob
approached Lwoff several times for a position in his laboratory but was declined. He
made one last try and caught Lwoff in a jovial mood. The senior scientist told Jacob,
“You know, we have just found the induction of the prophage. Would you be inter-
ested in working on the phage?” Jacob had no idea what Lwoff was talking about. He
stammered, “That’s just what I would like to do.”
The cast was set. What unfolded in the subsequent decade was
Pioneers of gene regulation one of the most creative and productive collaborations in the his-
tory of genetics, whose discoveries still reverberate throughout biol-
ogy today.
One of the most important insights arrived not in the laboratory
but in a movie theater. Struggling with a lecture that he had to pre-
pare, Jacob opted instead to take his wife, Lise, to a Sunday matinee.
Bored and daydreaming, Jacob drew a connection between the work
he had been doing on the induction of prophage and that of Monod
on the induction of enzyme synthesis. Jacob became “involved by a
sudden excitement mixed with a vague pleasure. . . . Both experi-
ments . . . on the phage . . . and that done with Pardee and Monod on
the lactose system . . . are the same! Same situation. Same result . . .
In both cases, a gene governs the formation . . . of a repressor block-
ing the expression of other genes and so preventing either the syn-
thesis of the galactosidase or the multiplication of the virus. . . .
Where can the repressor act to stop everything at once? The only
simple answer . . . is on the DNA itself!” 1
And so was born the concept of a repressor acting on DNA to
F i g u r e 11-1 François Jacob, Jacques Monod, and André repress the induction of genes. It would take many years before the
Lwoff were awarded the 1965 Nobel Prize in Physiology or hypothesized repressors were isolated and characterized biochemi-
Medicine for their pioneering work on how gene expression is
regulated. [ The Pasteur Institute.] 1 F. Jacob, The Statue Within: An Autobiography, 1988.
11.1 Gene Regulation 3 9 9
cally. The concepts worked out by Jacob and Monod and explained in this chapter—
messenger RNA, promoters, operators, regulatory genes, operons, and allosteric
proteins—were deduced entirely from genetic evidence, and these concepts shaped
the future field of molecular genetics.
Walter Gilbert, who isolated the first repressor and was later awarded a Nobel
Prize in Chemistry for co-inventing a method of sequencing DNA, explained the
effect of Jacob and Monod’s work at that time: “Most of the crucial discoveries in
science are of such a simplifying nature that they are very hard even to conceive
without actually having gone through the experience involved in the discovery. . . .
Jacob’s and Monod’s suggestion made things that were utterly dark, very simple.” 2
The concepts that Jacob and Monod illuminated went far beyond bacterial
enzymes and viruses. They understood, and were able to articulate with exceptional
eloquence, how their discoveries about gene regulation pertained to the general mys-
teries of cell differentiation and embryonic development in animals. The two men
once quipped, “anything found to be true of E. coli must also be true of Elephants.” 3
In the next three chapters, we will see to what degree that assertion is true. We’ll start
in this chapter with bacterial examples that illustrate key themes and mechanisms in
the regulation of gene expression. We will largely focus on single regulatory proteins
and the genetic “switches” on which they act. Then, in Chapter 12, we’ll tackle gene
regulation in eukaryotic cells, which entails more complex biochemical and genetic
machinery. And, finally, in Chapter 13, we’ll examine the role of gene regulation in
the development of multicellular animals. There we will see how sets of regulatory
proteins act on arrays of genetic switches to control gene expression in time and
space and choreograph the building of bodies and body parts.
2 H. F. Judson, The Eighth Day of Creation: Makers of the Revolution in Biology, 1979.
3 F. Jacob and J. Monod, Cold Spring Harbor Quant. Symp. Biol. 26, 1963, 393.
40 0 CHAPTER 1 1 Regulation of Gene Expression in Bacteria and Their Viruses
produce the enzymes than it could ever derive from breaking down prospective
carbon sources. The cell has devised mechanisms to shut down (repress) the tran-
scription of all genes encoding enzymes that are not needed at a given time and to
turn on (activate) those genes encoding enzymes that are needed. For example, if
only lactose is in the environment, the cell will shut down the transcription of the
genes encoding enzymes needed for the import and metabolism of glucose, galac-
tose, xylose, and other sugars. Conversely, E. coli will initiate the transcription of the
genes encoding enzymes needed for the import and metabolism of lactose. In sum,
cells need mechanisms that fulfill two criteria:
1. They must be able to recognize environmental conditions in which they
should activate or repress the transcription of the relevant genes.
2. They must be able to toggle on or off, like a switch, the transcription of each
specific gene or group of genes.
Let’s preview the current model for prokaryotic transcriptional regulation and
then use a well-understood example—the regulation of the genes in the metabo-
lism of the sugar lactose—to examine it in detail. In particular, we will focus on
how this regulatory system was dissected with the use of the tools of classical
genetics and molecular biology.
Activator- Activator-
RNA
binding binding
No polymerase
site site
transcription Transcription
Both activator and repressor proteins must be able to recognize when environ- F i g u r e 11-2 The binding of regulatory
mental conditions are appropriate for their actions and act accordingly. Thus, for proteins can either activate or block
activator or repressor proteins to do their job, each must be able to exist in two states: transcription.
one that can bind its DNA targets and another that cannot. The binding state must
be appropriate to the set of physiological conditions present in the cell and its envi-
ronment. For many regulatory proteins, DNA binding is effected through the inter-
action of two different sites in the three-dimensional structure of the protein. One
site is the DNA-binding domain. The other site, the allosteric site, acts as a sensor
that sets the DNA-binding domain in one of two modes: functional or nonfunctional.
The allosteric site interacts with small molecules called allosteric effectors.
In lactose metabolism, it is actually an isomer of the sugar lactose (called allo-
lactose) that is an allosteric effector: the sugar binds to a regulatory protein that
inhibits the expression of genes needed for lactose metabolism. In general, an
allosteric effector binds to the allosteric site of the regulatory protein in such a
way as to change its activity. In this case, allolactose changes the shape and struc-
ture of the DNA-binding domain of a regulatory protein. Some activator or repres- F i g u r e 11- 3 Allosteric effectors
sor proteins must bind to their allosteric effectors before they can bind DNA. influence the DNA-binding activities of
Others can bind DNA only in the absence of their allosteric effectors. Two of these activators and repressors.
situations are shown in Figure 11-3.
Introduction to Genetic Analysis, 11e
K e y C o n c e p t Allosteric effectors control the Allosteric effectors bind to regulatory proteins
Figure 11.02 #1105
ability of activator or repressor proteins to bind to their
05/28/14
DNA target sites.
06/25/14 No effector Effector present
Dragonfly Media Group
Allosteric site
Effector
A first look at the lac regulatory circuit
Regulatory
The pioneering work of François Jacob and Jacques protein
Monod in the 1950s showed how lactose metabolism is
genetically regulated. Let’s examine the system under DNA-binding site
two conditions: the presence and the absence of lactose. Activator Activator-binding site
Figure 11-4 is a simplified view of the components of this
Effector
system. The cast of characters for lac operon regulation
includes protein-coding genes and sites on the DNA that
are targets for DNA-binding proteins.
F i g u r e 11- 4 A simplified
lac operon model. Repressor protein controls the lac operon
Coordinate expression of the
Z, Y, and A genes is under lac
I P O
DNA Z Y A DNA
the negative control of the
product of the I gene, the
repressor. When the inducer
binds the repressor, the
operon is fully expressed.
Inducer
I Z Y A
mRNA mRNA
allolactose and to cleave the lactose molecule to yield glucose and galactose
(Figure 11-5). The structures of the β -galactosidase and permease proteins are
encoded by two adjacent sequences, Z and Y, respectively. A third contiguous
sequence encodes an additional enzyme, termed transacetylase, which is not
required for lactose metabolism. We will call Z, Y, and A structural genes—in other
words, segments encoding proteins—while reserving judgment on this categoriza-
tion until later. We will focus mainly on the Z and Y genes. All three genes are
F i g u r e 11- 5 The metabolism of
transcribed into a single messenger RNA molecule. Regulation of the production
lactose. (a) The enzyme β-galactosidase
catalyzes a reaction in which water is
of this mRNA coordinates the synthesis of all three enzymes. That is, either all or
added to the β-galactoside linkage to none of the three enzymes are synthesized. Genes whose transcription is con-
break lactose into separate molecules of trolled by a common means are said to be coordinately controlled.
glucose and galactose. (b) The enzyme
also modifies a smaller proportion of
K e y C o n c e p t If the genes encoding proteins constitute a single transcription
lactose into allolactose, which acts as an
unit, the expression of all these genes will be coordinately regulated.
inducer of the lac operon.
I P O Z Y A
Structural genes
RNA polymerase
DNA
mRNA
Polypeptide
Folding
Repressor
protein
I P O Z Y A
Structural genes
RNA polymerase
DNA
mRNA
mRNA
Polypeptide
Folding
Repressor
protein Lactose
mRNA
Medium
operon, defined
Introduction as aAnalysis,
to Genetic segment of DNA that encodes a multigenic mRNA as well as
11e F i g u r e 11- 6 Regulation of the lac
an adjacent
Figure common promoter and regulatory region. The lacI gene, encoding the
11.06 #1109 operon. The I gene continually makes
05/28/14
Lac repressor, is not considered part of the lac operon itself, but the interaction repressor. (a) In the absence of lactose, the
Dragonfly Media Group repressor binds to the O (operator) region
between the Lac repressor and the lac operator site is crucial to proper regulation
and blocks transcription. (b) The binding of
of the lac operon. The Lac repressor has a DNA-binding site that can recognize the lactose changes the shape of the
operator DNA sequence and an allosteric site that binds allolactose or analogs of repressor so that the repressor no longer
lactose that are useful experimentally. The repressor will bind tightly only to the binds to O and falls off the DNA. The RNA
O site on the DNA near the genes that it is controlling and not to other sequences polymerase is then able to transcribe the
distributed throughout the chromosome. By binding to the operator, the repres- Z, Y, and A structural genes, and so the
sor prevents transcription by RNA polymerase that has bound to the adjacent three enzymes are produced.
promoter site; the lac operon is switched “off.” ANIMATED ART: Assaying
When allolactose or its analogs bind to the repressor protein, the protein lactose presence or absence through the Lac
repressor
undergoes an allosteric transition, a change in shape. This slight alteration in
shape in turn alters the DNA-binding site so that the repressor no longer has high
40 4 CHAPTER 1 1 Regulation of Gene Expression in Bacteria and Their Viruses
affinity for the operator. Thus, in response to binding allolactose, the repressor
falls off the DNA, allowing RNA polymerase to proceed (transcribe the gene): the
lac operon is switched “on.” The repressor’s response to allolactose satisfies one
requirement for such a control system—that the presence of lactose stimulates
the synthesis of genes needed for its processing. The relief of repression for sys-
tems such as lac is termed induction. Allolactose and its analogs that allosteri-
cally inactivate the repressor, leading to the expression of the lac genes, are termed
inducers.
Let’s summarize how the lac switch works. In the absence of an inducer (allo-
lactose or an analog), the Lac repressor binds to the lac operator site and prevents
transcription of the lac operon by blocking the movement of RNA polymerase. In
this sense, the Lac repressor acts as a roadblock on the DNA. Consequently, all the
structural genes of the lac operon (the Z, Y, and A genes) are repressed, and there
are very few molecules of β -galactosidase, permease, or transacetylase in the cell.
In contrast, when an inducer is present, it binds to the allosteric site of each Lac
repressor subunit, thereby inactivating the site that binds to the operator. The Lac
repressor falls off the DNA, allowing the transcription of the structural genes of the
lac operon to begin. The enzymes β -galactosidase, permease, and transacetylase
now appear in the cell in a coordinated fashion. So, when lactose is present in the
environment of a bacterial cell, the cell produces the enzymes needed to metabo-
lize it. But when no lactose is present, resources are not wasted.
O +/O C heterozygote
I+ P+ O+ Y+
Z+
R Repressor
Repressor
cannot bind
Expression blocked
to altered
operator
I+ P+ OC Y+
Z+
F i g u r e 11- 8 O+/OC heterozygotes β -galactosidase (like the wild-type haploid strain 1 in this table) even though it is
demonstrate that operators are cis-acting. heterozygous for mutant and wild-type Z alleles. This demonstrates that the
Because a repressor cannot bind to OC Z+ allele is dominant over its Z− counterpart.
operators, the lac structural genes linked Jacob and Monod first identified two classes of regulatory mutations, called OC
to an OC operator are expressed even in
the absence of an inducer. However, the
and I−. These were called constitutive mutations because they caused the lac
lac genes adjacent to an O+ operator are operon structural genes to be expressed regardless of whether inducer was present.
still subject to repression. Jacob and Monod identified the existence of the operator on the basis of their analy-
ANIMATED ART: OC lac
sis of the OC mutations. These mutations make the operator incapable of binding to
operator mutations repressor; they damage the switch such that the operon is always “on” (Table 11-1,
strain 3). Importantly, the constitutive effects of OC mutations were restricted solely
to those lac structural genes on the same chromosome as the OC mutation. For this
reason, the operator mutant was said to be cis-acting, as demonstrated by the phe-
notype of strain 4 in Table 11-1. Here, because the wild-type permease (Y+) gene is
cis to the wild-type operator, permease is expressed only when lactose or an analog
is present. In contrast, the wild-type β -galactosidase (Z+) gene is cis to the OC mutant
operator; hence, β -galactosidase is expressed constitutively. This unusual property of
cis action suggested that the operator is a segment of DNA that influences only the
expression of the structural genes linked to it (Figure 11-8). The operator thus acts
simply as a protein-binding site and makes no gene product.
Jacob and Monod did comparable genetic tests with the I− mutations (Table
11-2). A comparison of the inducible wild-type I+ (strain 1) with I− strains shows
that I− mutations are constitutive (strain 2). That is, they cause the structural genes
to be expressed at all times. Strain 3 demonstrates that the inducible phenotype of
I+ is dominant over the constitutive phenotype of I−. This finding showed Jacob and
Monod that the amount of wild-type protein encoded by one copy of the gene is
I +/I – heterozygote
I– P+ O+ Y+
Z+
No active
repressor
R
Repressor
I+ P+ O+ Y+
Z+
sufficient to regulate both copies of the operator in a diploid cell. Most significantly, F i g u r e 11- 9 The recessive nature of
strain 4 showed them that the I+ gene product is trans-acting, meaning that the I − mutations demonstrates that the re-
gene product can regulate all structural lac operon genes, whether residing on the pressor is trans-acting. Although no active
repressor is synthesized from the I− gene,
same DNA molecule or on different ones (in cis or in trans, respectively). Unlike the
the wild-type ( I +) gene provides a func-
operator, the I gene behaves like a standard protein-coding gene. The protein prod- tional repressor that binds to both opera-
uct of the I gene is able to diffuse throughout a cell and act on both operators in the tors in a diploid cell and blocks lac operon
partial diploid (Figure 11-9). expression (in the absence of an inducer).
ANIMATED ART: I − Lac
K e y C o n c e p t Operator mutations reveal that such a site is cis-acting; that is, repressor mutations
it regulates the expression of an adjacent transcription unit on the same DNA
molecule. In contrast, mutations in the gene encoding a repressor protein reveal that
this protein is trans-acting; that is, it can act on any copy of the target DNA.
Table 11-3 Synthesis of β-Galactosidase and Permease by the Wild Type and by Strains
Carrying Different Alleles of the λ Gene
I +/I S heterozygote
IS P+ O+ Y+
Z+
RS I S repressor
I cannot bind inducer I
I+ P+ O+ Y+
Z+
R + I = R I
F i g u r e 11-10 The dominance of I S strain 1). Unlike I− mutations, I S mutations are dominant over I+ (see Table 11-3,
mutation is due to the inactivation of the strain 3). This key observation led Jacob and Monod to speculate that IS mutations
allosteric site on the Lac repressor. In an alter the allosteric site so that it can no longer bind to an inducer. As a conse-
I S/ I − diploid cell, none of the lac structural
quence, IS-encoded repressor protein continually binds to the operator—prevent-
genes are transcribed. The I S repressor
lacks a functional allolactose-binding site
ing transcription of the lac operon even when the inducer is present in the cell. On
(the allosteric site) and thus is not this basis, we can see why I S is dominant over I+. Mutant I S protein will bind to
inactivated by an inducer. Therefore, even both operators in the cell, even in the presence of an inducer and regardless of the
in the presence of an inducer, the I S fact that I+-encoded protein may be present in the same cell (Figure 11-10).
repressor binds irreversibly to all operators
in a cell, thereby blocking transcription of
the lac operon.
Genetic analysis of the lac promoter
ANIMATED ART: I S Lac
Mutational analysis also demonstrated that an element essential for lac transcrip-
superrepressor mutations tion is located between the gene for the repressor I and the operator site O. This
element, termed the promoter (P ), serves as the initiation site for transcription by
RNA polymerase, as described in Chapter 8. There are two binding regions for RNA
polymerase in a typical prokaryotic promoter, shown in Figure 11-11 as the two
highly conserved regions at −35 and −10. Promoter mutations are cis-acting in that
they affect the transcription of all adjacent structural genes in the operon. Like
operators and other cis-acting elements, promoters are sites on the DNA molecule
that are bound by proteins and themselves produce no protein product.
GT ∆ AT AT A Mild effects
CGCA on transcription
One-base Two-base G
deletion change
GC AGT CG AT C Severe effects
A TC AT on transcription
strands that had been shielded from the enzyme activity by the repressor mole- F i g u r e 11-11 Specific DNA
cule. These short strands presumably constituted the operator sequence. The sequences are important for the efficient
base sequence of each strand was determined, and each operator mutation was transcription of E. coli genes by RNA
polymerase. Only the non-template strand
shown to be a change in the sequence (Figure 11-12). These results showed that
is shown here (see Figure 8-4).
the operator locus is a specific sequence of 17 to 25 nucleotides situated just Transcription would proceed from left to
before (5 ′ to) the structural Z gene. They also showed the incredible specificity of right (5 ′ to 3 ′ ), and the mRNA transcript
repressor–operator recognition, which can be disrupted by a single base substitu- would be homologous to the sequence
tion. When the sequence of bases in the lac mRNA (transcribed from the lac shown. The boxed sequences are highly
operon) was determined, the first 21 bases on the 5 ′ initiation end proved to be conserved in all E. coli promoters, an
complementary to the operator sequence that Gilbert had determined, showing indication of their role as contact sites on
the DNA for RNA polymerase binding, and
that the operator sequence is transcribed.
contacts are made with both strands (not
The results of these experiments provided crucial confirmation of the mecha- shown). Mutations in these regions have
nism of repressor action formulated by Jacob and Monod. mild (gold) and severe (brown) effects on
transcription. The mutations may be
changes of single nucleotides or pairs of
nucleotides, or a deletion ( ∆) may occur.
11.3 Catabolite Repression of the lac Operon: [ Data from J. D. Watson, M. Gilman,
Positive Control J. Witkowski, and M. Zoller, Recombinant
DNA, 2nd ed.]
Through a long evolutionary process, the existing lac system has been selected to
operate for the optimal energy efficiency of the bacterial cell. Presumably to maxi-
mize energy efficiency, two environmental conditions have to
be satisfied for the lactose metabolic enzymes to be expressed.
One condition is that lactose must be present in the envi- The operator is a specific DNA sequence
ronment. This condition makes sense because it would be
inefficient for the cell to produce the lactose metabolic 5′ T G G A A T T G T G A G C G G A T A A C A A T T 3′
enzymes if there is no lactose to metabolize. We have already 3′ A C C T T A A C A C T C G C C T A T T G T T A A 5′
seen that the cell is able to respond to the presence of lactose
through the action of a repressor protein.
The other condition is that glucose cannot be present in O c mutations
A T G T T A C T
the cell’s environment. Because the cell can capture more T A C A A T G A
energy from the breakdown of glucose than it can from the
F i g u r e 11-12 The DNA base sequence of the lactose operator
breakdown of other sugars, it is more efficient for the cell to
and the base changes associated with eight OC mutations.
metabolize glucose rather than lactose. Thus, mechanisms Regions of twofold rotational symmetry are indicated by color and
have evolved that prevent the cell from synthesizing the by a dot at their axis of symmetry. [ Data from W. Gilbert, A. Maxam,
enzymes for lactose metabolism when both lactose and glu- and A. Mirzabekov, in N. O. Kjeldgaard and O. Malløe, eds., Control of
cose are present together. The repression of the transcription Ribosome Synthesis. Academic Press, 1976.]
410 CHAPTER 1 1 Regulation of Gene Expression in Bacteria and Their Viruses
K e y C o n c e p t The lac operon has an added level of control so that the operon
is inactive in the presence of glucose even if lactose also is present. An allosteric
effector, cAMP, binds to the activator CAP to permit the induction of the lac operon.
However, high concentrations of glucose catabolites inhibit production of cAMP, thus
failing to produce cAMP–CAP and thereby failing to activate the lac operon.
5′ T G G A AT T G T G AG C G G ATA AC A AT T 3′
3′ A C C T TA AC AC T C G C C TAT T G T TAA 5′
5′ GTGAG T T A G C TC AC 3′
3′ CAC T C A AT CGAGTG 5′
F i g u r e 11-14 The DNA base sequences of (a) the lac operator, to which the Lac
repressor binds, and (b) the CAP-binding site, to which the CAP–cAMP complex binds.
Sequences exhibiting twofold rotational symmetry are indicated by the colored boxes and by
a dot at the center point of symmetry. [ (a) Data from W. Gilbert, A. Maxam, and A. Mirzabekov, in
N. O. Kjeldgaard and O. Malløe, eds., Control of Ribosome Synthesis. Academic Press, 1976.]
underlie the specificity of DNA binding by these very different regulatory proteins.
One property that these sequences do have in common and that is common to many
other DNA-binding sites is rotational twofold symmetry. In other words, if we rotate
the DNA sequence shown in Figure 11-14 by 180 degrees within the plane of the page,
the sequence of the highlighted bases of the binding sites will be identical. The high-
lighted bases are thought to constitute the important contact sites for protein–DNA
interactions. This rotational symmetry corresponds to symmetries within the DNA-
binding proteins, many of which are composed of two or four identical subunits. We Binding of CAP bends DNA
will consider the structures of some DNA-binding proteins later in the chapter.
How does the binding of the cAMP–CAP complex to the operon further the (a)
DNA −90
binding of RNA polymerase to the lac promoter? In Figure 11-15, the DNA is shown −35
as being bent when CAP is bound. This bending of DNA may aid the binding of
−10
RNA polymerase to the promoter. There is also evidence that CAP makes direct
contact with RNA polymerase that is important for the CAP activation effect. The 5′
CAP
base sequence shows that CAP and RNA polymerase bind directly adjacent to each 3′
P
RNA polymerase
CAP site interaction site
E. coli chromosome
Stop codon
Glu
Ser
Gln
Gly
5′ GAAAGCGGGCAGTGAGCGCAACGCAATTAATGTGAGTTAGCTCACTCATTAGGCACCCCAGGCTTTA CACTTT
CTTTCGCCCGTCACTCGCGTTGCGTTAATTACACTCAATCGAGTGAGTAATCCGTGGGGTCCGAAATGTGAAA
Repressor Promoter
CAP contact
region
CAP
I P O A
Z Y
R
Repressor
CAP
I P O A
Z Y
R + I R I Very little lac mRNA
Lactose Inducer–
repressor
O Z
fMet
mRNA
Met
Thr
ATGCTTCCGGCTCGTATGTTGTGTGGAATTGTGAGCGGATAACAATTTCACACAGGAAACAGCTATGACCATG
TACGAAGGCCGAGCATACAACACACCTTAACACTCGCCTATTGTTAAAGTGTGTCCTTTGTCGATACTGGTAC 5′
Operator
regulator genes. R P O A B C
R
Active
11.4 Dual Positive and Negative repressor No transcription
Gene order in the trp operon corresponds to reaction order in the biosynthetic pathway
H H H H
O O NH2
COOH COOH COOH COOH O O H
O C C CH2O P C C COOH
HO C C C CH2O P H H H H
HO L-Glutamine
NH2 PRPP N H H N N
H CH2O P N CH H
O H H
O O L-Tryptophan
H2C C COOH H H
Anthranilic CDRP Indole-3-glycerol
Phosphoribosyl phosphate N
Chorismic acid acid H L-Serine
anthranilic acid
Indole
binds an operator, preventing the initiation of transcription. This repressor is the F i g u r e 11-2 1 The chromosomal order
Trp repressor, the product of the trpR gene. The Trp repressor binds tryptophan of genes in the trp operon of E. coli and the
when adequate levels of the amino acid are present, and only after binding tryp- sequence of reactions catalyzed by the
tophan will the Trp repressor bind to the operator and switch off transcription of enzyme products of the trp structural
genes. The products of genes trpD and
the operon. This simple mechanism ensures that the cell does not waste energy
trpE form a complex that catalyzes specific
producing tryptophan when the amino acid is sufficiently abundant. E. coli strains steps, as do the products of genes trpB
with mutations in trpR continue to express the trp mRNA and thus continue to and trpA. Tryptophan synthetase is a
produce tryptophan when the amino acid is abundant. tetrameric enzyme formed by the products
In studying these trpR mutant strains, Charles Yanofsky discovered that, when of trpB and trpA. It catalyzes a two-step
tryptophan was removed from the medium, the production of trp mRNA further process leading to the formation of
increased several-fold. This finding was evidence that, in addition to the Trp tryptophan. Abbreviations: PRPP,
phosphoribosylpyrophosphate; CDRP,
repressor, a second control mechanism existed to negatively regulate transcrip-
1-(o-carboxyphenylamino)-1-deoxyribulose
tion. This mechanism is called attenuation because mRNA production is nor- 5-phosphate. [ Data from S. Tanemura and
mally attenuated, meaning “decreased,” when tryptophan is plentiful. Unlike the R. H. Bauerle, Genetics 95, 1980, 545.]
other bacterial control mechanisms described thus far, attenuation acts at a step
after transcription initiation.
The mechanisms governing attenuation were discovered by identifying muta-
tions that reduced or abolished attenuation. Strains with these mutations pro-
duce trp mRNA at maximal levels even in the presence of tryptophan. Yanofsky
mapped the mutations to a region between the trp operator and the trpE gene; this
F i g u r e 11-2 2 In the trp mRNA
region, termed the leader sequence, is at the 5 ′ end of the trp operon mRNA
leader sequence, the attenuator region
before the first codon of the trpE gene (Figure 11-22). The trp leader sequence is precedes the trpE coding sequence.
unusually long for a prokaryotic mRNA, 160 bases, and detailed analyses have Farther upstream, at bases 54 through 59,
revealed how a part of this sequence works as an attenuator that governs the are the two tryptophan codons (shown in
further transcription of trp mRNA. red) of the leader peptide.
The key observations are that, in the absence of the TrpR repressor protein,
the presence of tryptophan halts transcription after the first 140 bases or so,
whereas, in the absence of tryptophan, transcription of the operon continues. The
F i g u r e 11-2 3 Model for attenuation in mechanism for terminating or continuing transcription consists of two key ele-
the trp operon. (a) Proposed secondary ments. First, the trp mRNA leader sequence encodes a short, 14-amino-acid pep-
structures in the conformation of trp leader tide that includes two adjacent tryptophan codons. Tryptophan is one of the least
mRNA that favors termination of
abundant amino acids in proteins, and it is encoded by a single codon. This pair
transcription. Four regions can base-pair
to form three stem-and-loop structures, of tryptophan codons is therefore an unusual feature. Second, parts of the trp
but only two regions base-pair with one mRNA leader form stem-and-loop structures that are able to alternate between
another at a given time. Thus, region 2 two conformations. One of these conformations favors the termination of tran-
can base-pair with either region 1 or scription (Figure 11-23a).
region 3. (b) When tryptophan is The regulatory logic of the operon pivots on the abundance of tryptophan.
abundant, segment 1 of the trp mRNA is When tryptophan is abundant, there is a sufficient supply of aminoacyl-tRNATrp
translated. Segment 2 enters the ribosome
to allow translation of the 14-amino-acid peptide. Recall that transcription and
(although it is not translated), which
enables segments 3 and 4 to base-pair. translation in bacteria are coupled; so ribosomes can engage mRNA transcripts
This base-paired region causes RNA and initiate translation before transcription is complete. The engagement of the
polymerase to terminate transcription. ribosome alters trp mRNA conformation to the form that favors termination of
(c) In contrast, when tryptophan is scarce, transcription (where segments 3 and 4 form base pairs; Figure 11-23b). However,
the ribosome is stalled at the codons of when tryptophan is scarce, the ribosome is stalled at the tryptophan codons, seg-
segment 1. Segment 2 interacts with ments 2 and 3 base-pair, and transcription is able to continue (Figure 11-23c).
segment 3 instead of being drawn into the
Other operons for enzymes in biosynthetic pathways have similar attenuation
ribosome, and so segments 3 and 4
cannot pair. Consequently, transcription controls. One signature of amino acid biosynthesis operons is the presence of
continues. [ Data from D. L. Oxender, multiple codons for the amino acid being synthesized in a separate peptide
G. Zurawski, and C. Yanofsky, Proc. Natl. encoded by the 5 ′ leader sequence. For instance, the phe operon has seven phe-
Acad. Sci. USA 76, 1979, 5524.]
(a) trp operon Met - Lys - Ala - Ile - Phe - Val - Leu - Lys - Gly - Trp - Trp - Arg - Thr - Ser - Stop
5′ AUG - AAA - GCA - AUU - UUC - GUA - CUG - AAA - GGU - UGG - UGG - CGC - ACU - UCC - UGA 3′
(b) phe operon Met - Lys - His - Ile - Pro - Phe - Phe - Phe - Ala - Phe - Phe - Phe - Thr - Phe - Pro - Stop
5′ AUG - AAA - CAC - AUA - CCG - UUU - UUU - UUC - GCA - UUC - UUU - UUU - ACC - UUC - CCC - UGA 3′
(c) his operon Met - Thr - Arg - Val - Gln - Phe - Lys - His - His - His - His - His - His - His - Pro - Asp
5′ AUG - ACA - CGC - GUU - CAA - UUU - AAA - CAC - CAC - CAU - CAU - CAC - CAU - CAU - CCU - GAC 3′
nylalanine codons in a leader peptide and the his operon has seven tandem histi- F i g u r e 11-2 4 (a) The translated part
dine codons in its leader peptide (Figure 11-24). of the trp leader region contains two
consecutive tryptophan codons, (b) the
phe leader sequence contains seven
K e y C o n c e p t A second level of regulation in amino acid biosynthesis phenylalanine codons, and (c) the his
operons is attenuation of transcription mediated by the abundance of the amino leader sequence contains seven
acid and translation of a leader peptide. consecutive histidine codons.
E. coli cell
λ phage
Chromosome + λ DNA
(in head)
Infection
1. Lytic 2. Lysogenic
cycle cycle
λ prophage
Cell lysis
Cell lysis
Lysogenic growth
the cro gene (for control of repressor and other things). The deci-
Clear and cloudy bacteriophage plaques
sion between the lytic and the lysogenic pathways hinges on the on a lawn of E. coli host bacteria
activity of the proteins encoded by the four genes cI, cII, cIII, and
cro, three of which are DNA-binding proteins.
We will first focus on the two genes cI and cro and the proteins
that they encode (Table 11-4). The cI gene encodes a repressor,
often referred to as l repressor, that represses lytic growth and
promotes lysogeny. The cro gene encodes a repressor that re- Cloudy plaque
presses lysogeny, thereby permitting lytic growth. The genetic
switch controlling the two l phage life cycles has two states: in the
lysogenic state, cI is on, but cro is off, and in the lytic cycle, cro is Clear plaque
on, but cI is off. Therefore, l repressor and Cro are in competition,
and whichever repressor prevails determines the state of the
switch and of the expression of the l genome.
The race between l repressor and Cro is initiated when phage
l infects a normal bacterium. The sequence of events in the race
is critically determined by the organization of genes in the l
genome and of promoters and operators between the cI and the
cro genes. The roughly 50-kb l genome encodes proteins having F i g u r e 11-2 6 Plaques are clear where
roles in DNA replication, recombination, assembly of the phage particle, and cell host cell lysis has occurred; they are
lysis (Figure 11-27). These proteins are expressed in a logical sequence such that cloudy where cells have survived infection
copies of the genome are made first, these copies are then packaged into viral and continued to grow as a lysogen. [ From
Microbiology An Evolving Science, 1st ed.,
particles, and, finally, the host cell is lysed to release the virus and begin the infec- Figure 10.22. © John Foster.]
tion of other host cells (see Figure 11-25). The order of viral gene expression flows
from the initiation of transcription at two promoters, PL and PR (for leftward and
rightward promoter with respect to the genetic map). On infection, RNA poly-
merase initiates transcription at both promoters. Looking at the genetic map (see
Figure 11-27), we see that from PR , cro is the first gene transcribed, and from PL, N
is the first gene transcribed.
The N gene encodes a positive regulator, but the mechanism of this protein dif-
fers from those of other regulators that we have considered thus far. Protein N works
by enabling RNA polymerase to continue to transcribe through regions of DNA that
would otherwise cause transcription to terminate. A regulatory protein such as N
that acts by preventing transcription termination is called an antiterminator.
Thus, N allows the transcription of cIII and other genes to the left of N, as well as cII
and other genes to the right of cro. The cII gene encodes an activator protein that
binds to a site that promotes transcription leftward from a different promoter, PRE
(for promoter of repressor establishment), which activates transcription of the
cI gene. Recall that the cI gene encodes l repressor, which will prevent lytic growth.
Before the expression of the rest of the viral genes takes place, a “decision”
must be made—whether to continue with viral-gene expression and lyse the cell
or to repress that pathway and lysogenize the cell. The decision whether to lyse or
lysogenize a cell pivots on the activity of the cII protein. The cII protein is unsta-
ble because it is sensitive to bacterial proteases—enzymes that degrade proteins.
420 CHAPTER 1 1 Regulation of Gene Expression in Bacteria and Their Viruses
Excisionase xis PL
Integrase int
att
Head
genes
Tail genes
Then:
The cII protein, protected by the cIII protein, turns on cI and int by activating
transcription at PRE.
Then:
If resources and proteases are abundant, cII is degraded, Cro represses cI, and
the lytic cycle continues.
If resources and proteases are not abundant, cII is active, cI transcription pro-
ceeds at a high level, Int protein integrates the phage chromosome, and cI (l
repressor) shuts off all genes except itself.
Lysogen λ repressor
RNA polymerase
cl OR3 OR2 OR1 cro
Lytic growth
Cro
F i g u r e 11-2 8 The binding of
cl OR3 OR2 OR1 cro λ repressor and Cro to operator sites.
Lysogeny is promoted by λ repressor
PRM PR binding to OR1 and OR2, which prevents
transcription from PR. On induction or in
the lytic cycle, the binding of Cro to OR3
prevents transcription of the cI gene.
[ Data from M. Ptashne and A. Gann, Genes
and Signals, p. 30, Fig. 1-13.]
422 CHAPTER 1 1 Regulation of Gene Expression in Bacteria and Their Viruses
After a lysogen has been established, it is generally stable. But the lysogen can
be induced to enter the lytic cycle by various environmental changes. Ultraviolet
light induces the expression of host genes. One of the host genes encodes a pro-
tein, RecA, that stimulates cleavage of the l repressor, thus crippling maintenance
of lysogeny and resulting in lytic growth. Prophage induction, just as Jacob and
Monod surmised, requires the release of a repressor from DNA. The physiological
role of ultraviolet light in lysogen induction makes sense in that this type of radia-
tion damages host DNA and stresses the bacteria; the phage replicates and leaves
the damaged, stressed cell for another host.
Helix-turn-helix is a common
DNA-binding motif
R
R
F i g u r e 11-2 9 The binding of a helix-turn-helix motif to
DNA. The purple cylinders are alpha helices. Many regulatory
proteins bind as dimers to DNA. In each monomer, the
recognition helix (R) makes contact with bases in the major
DNA-binding site
groove of DNA.
11.7 Alternative Sigma Factors Regulate Large Sets of Genes 423
(a)
Forespore
σF
σG
σAσH σE σK Spore
Vegetative Mother
cell cell
(b)
σE–regulated promoters
–35 –10
ybaN T C G GT T A T A T T C AAT T GT – C C AT GCT C A T A A G A T…
ydcC GT C T GCATAT T A GGGAAA–C C C CACT C A T A T A T T…
ydcA T AC G T ACTATT T AAA T G G – T T T T TCTC A T A A A C G…
σF–regulated promoters
yrrR A T C T G T T T A GCAGCGAAACACCTCGTCCACAATG…
ytfT CCGGG T T T AT T T T T T T –AGGAAT TGGCGATA A T G…
yuiC T T T T GAATA AT GCT C T CTCCACTTGGGAACAATG…
s u mmary
Gene regulation is often mediated by proteins that react to In negative regulatory control, a repressor protein blocks
environmental signals by raising or lowering the transcrip- transcription by binding to DNA at the operator site. Negative
tion rates of specific genes. The logic of this regulation is regulatory control is exemplified by the lac system. Negative
straightforward. In order for regulation to operate appropri- regulation is one very straightforward way for the lac system
ately, the regulatory proteins have built-in sensors that con- to shut down genes in the absence of appropriate sugars in the
tinually monitor cellular conditions. The activities of these environment. In positive regulatory control, protein factors
proteins would then depend on the right set of environmen- are required to activate transcription. Some prokaryotic gene
tal conditions. control, such as that for catabolite repression, operates
In bacteria and their viruses, the control of several through positive gene control.
structural genes may be coordinated by clustering the genes Many regulatory proteins are members of families of
together into operons on the chromosome so that they are proteins that have very similar DNA-binding motifs, such as
transcribed into multigenic mRNAs. Coordinated control sim- the helix-turn-helix domain. Other parts of the proteins,
plifies the task for bacteria because one cluster of regula- such as their protein–protein interaction domains, tend to
tory sites per operon is sufficient to regulate the expression be less similar. The specificity of gene regulation depends
of all the operon’s genes. Alternatively, coordinate control on chemical interactions between the side chains of amino
can also be achieved through discrete σ factors that regulate acids and chemical groups on DNA bases.
dozens of independent promoters simultaneously.
k e y t e rms
activator (p. 400) constitutive mutation (p. 406) lysogenic cycle (p. 417)
allosteric effector (p. 401) coordinately controlled genes lytic cycle (p. 417)
allosteric site (p. 401) (p. 402) negative control (p. 412)
allosteric transition (p. 403) cyclic adenosine monophosphate operator (p. 400)
antiterminator (p. 419) (cAMP) (p. 410) operon (p. 403)
attenuation (p. 415) DNA-binding domain (p. 401) partial diploid (p. 405)
attenuator (p. 415) genetic switch (p. 400) positive control (p. 412)
catabolite activator protein (CAP) inducer (p. 404) promoter (p. 400)
(p. 410) induction (p. 404) regulon (p. 424)
catabolite repression (p. 410) initiator (p. 413) repressor (p. 400)
cis-acting (p. 406) leader sequence (p. 415) trans-acting (p. 407)
426 CHAPTER 1 1 Regulation of Gene Expression in Bacteria and Their Viruses
solv e d probl e ms
This set of four solved problems, which are similar to scription is repressed by the repressor supplied from the first
Problem 10 in the Basic Problems at the end of this chapter, chromosome, which can act in trans through the cytoplasm.
is designed to test understanding of the operon model. Here, However, only the Z gene from this chromosome is intact.
we are given several diploids and are asked to determine Therefore, in the absence of an inducer, no enzyme is made;
whether Z and Y gene products are made in the presence or in the presence of an inducer, only the Z gene product,
absence of an inducer. Use a table similar to the one in β-galactosidase, is generated. The symbols to add to the table
Problem 11 as a basis for your answers, except that the col- are “−, +, −, −.”
umn headings will be as follows: RNA polymerase
cannot bind:
Z gene Y gene no transcription
SOLVED PROBLEM 1.
R
I P OC Z Y Repression in
I PO Z Y absence of IPTG
I– P+ O+ Z+ Y–
Solution
One way to approach these problems is first to consider each R I Active No
chromosome separately and then to construct a diagram. Induction enzyme in active
The following illustration diagrams this diploid: in presence presence enzyme
of IPTG of IPTG
RNA polymerase
cannot bind: SOLVED PROBLEM 3.
no transcription
I P OC Z Y
I P O Z Y
I– P– OC Z+ Y+
No active
repressor Solution
Because the second chromosome is P −, we need consider
R
only the first chromosome. This chromosome is OC, and so
I+ P+
enzyme is made in the absence of an inducer, although,
O+ Z– Y–
because of the Z − mutation, only active permease (Y ) is gen-
erated. The entries in the table should be “−, −, +, +.”
No No
RNA polymerase active active
can bind enzyme enzyme
Solution
In the presence of an I S repressor, all wild-type operators are
shut off, both with and without an inducer. Therefore, the
No transcription
first chromosome is unable to produce any enzyme. How- P+
IS O+ Z+ Y–
ever, the second chromosome has an altered (O C ) operator
and can produce enzyme in both the absence and the pres- I S repressor binds
to operator even in
ence of an inducer. Only the Y gene is wild type on the O C RS
presence of IPTG I
chromosome, and so only permease is produced constitu- No active Repressor cannot
tively. The entries in the table should be “−, −, +, +.”
repressor bind to O C operator
I– P+ OC Z– Y+
No Active enzyme
active in presence and
enzyme absence of IPTG
probl e ms
Most of the problems are also available for review/grading through the http://www.whfreeman.com/
launchpad/iga11e.
W or k i ng w i th th e F i g u r e s
www
1. C ompare the structure of IPTG shown in Figure 11-7 Unpacking the Problem
with the structure of galactose shown in Figure 11-5. www
10. The symbols a, b, and c in the table below represent the
Why is IPTG bound by the Lac repressor but not broken
E. coli lac system genes for the repressor (I ), the operator
down by β-galactosidase?
(O) region, and the β-galactosidase (Z ), although not
2. Looking at Figure 11-9, why were partial diploids essen- necessarily in that order. Furthermore, the order in
tial for establishing the trans-acting nature of the Lac which the symbols are written in the genotypes is not
repressor? Could one distinguish cis-acting from trans- necessarily the actual sequence in the lac operon.
acting genes in haploids?
Activity (+) or inactivity (−) of Z gene
3. Why do promoter mutations cluster at positions −10
and −35 as shown in Figure 11-11? Which protein-DNA Genotype Inducer absent Inducer present
interaction is disrupted by these mutations?
a- b+ c+ + +
4. Looking at Figure 11-16, note the large overlap between a+ b+ c- + +
the operator and the region of the lac operon that is a+ b- c- - -
transcribed. Which protein binds specifically to this a+ b- c+/a- b+ c- + +
overlapping sequence, and what effect does it have on a+ b+ c+/a- b- c- - +
transcription? a+ b+ c-/a- b- c+ - +
5. Examining Figure 11-21, what effect do you predict trpA a- b+ c+/a+ b- c- + +
mutations will have on tryptophan levels?
6. Examining Figure 11-21, what effect do you predict trpA a. Which symbol (a, b, or c) represents each of the lac
mutations will have on trp mRNA expression? genes I, O, and Z ?
b. In the table, a superscript minus sign on a gene symbol
B as i c P robl e ms merely indicates a mutant, but some mutant behaviors
7. Which of the following molecules is an inducer of the lac in this system are given special mutant designations.
operon: Using the conventional gene symbols for the lac operon,
designate each genotype in the table.
a. Galactose d. Isothiocyanate
b. Glucose e. cAMP 11. The map of the lac operon is
c. Allolactose f. Lactose POZY
8. Explain why alleles in the lac system are normally re-
I − The promoter (P ) region is the start site of transcription
cessive to I+ alleles and why I+ alleles are recessive to I S through the binding of the RNA polymerase molecule
alleles. before actual mRNA production. Mutationally altered
9. What do we mean when we say that O C mutations in the promoters (P −) apparently cannot bind the RNA poly-
lac system are cis-acting? merase molecule. Certain predictions can be made about
428 CHAPTER 1 1 Regulation of Gene Expression in Bacteria and Their Viruses
β-Galactosidase Permease
Genotype No lactose Lactose No lactose Lactose
I+ P+ O+ Z+ Y+/I+ P+ O+ Z+ Y+ - + - +
a. I- P+ OC Z+ Y-/ I+ P+ O+ Z- Y+
b. I+ P- OC Z- Y+/I- P+ OC Z+ Y-
c. IS P+ O+ Z+ Y-/I+ P+ O+ Z- Y+
d. IS P+ O+ Z+ Y+/I- P+ O+ Z+ Y+
e. I- P+ OC Z+ Y-/I- P+ O+ Z- Y+
f. I- P- O+ Z+ Y+/I- P+ OC Z+ Y-
g. I+ P+ O+ Z- Y+/I- P+ O+ Z+ Y-
14. What are the analogies between the mechanisms con- synthesis of daughter strands, the repressor must find
trolling the lac operon and those controlling phage λ the new operators by searching along the DNA, rapidly
genetic switches? binding to nonoperator sequences and dissociating from
them.)
15. Compare the arrangement of cis-acting sites in the con-
trol regions of the lac operon and phage λ. 19. Certain lacI mutations eliminate operator binding by
the Lac repressor but do not affect the aggregation of
16. Which regulatory protein induces the lytic phase genes subunits to make a tetramer, the active form of the re-
of the phage λ life cycle? pressor. These mutations are partly dominant over wild
a. cI type. Can you explain the partly dominant I − pheno-
b. Cro type of the I −/I + heterodiploids?
c. Lac repressor 20. You are examining the regulation of the lactose operon in
the bacterium Escherichia coli. You isolate seven new inde-
d. Lactose
pendent mutant strains that lack the products of all three
17. Predict the effect of a mutation that eliminates the DNA- structural genes. You suspect that some of these muta-
binding activity of the σ E protein on spore formation in tions are lacI S mutations and that other mutations are al-
Bacillus subtilis. terations that prevent the binding of RNA polymerase to
the promoter region. Using whatever haploid and partial
C hall e ng i ng P robl e ms diploid genotypes that you think are necessary, describe a
18. An interesting mutation in lacI results in repressors set of genotypes that will permit you to distinguish be-
with 110-fold increased binding to both operator and tween the lacI and lacP classes of uninducible mutations.
nonoperator DNA. These repressors display a “reverse” 21. You are studying the properties of a new kind of regula-
induction curve, allowing β-galactosidase synthesis in tory mutation of the lactose operon. This mutation,
the absence of an inducer (IPTG) but partly repressing called S, leads to the complete repression of the lacZ,
β-galactosidase expression in the presence of IPTG. How lacY, and lacA genes, regardless of whether lactose is
can you explain this? (Note that, when IPTG binds a re- present. The results of studies of this mutation in partial
pressor, it does not completely destroy operator affinity, diploids demonstrate that this mutation is completely
but rather it reduces affinity 110-fold. Additionally, as dominant over wild type. When you treat bacteria of the
cells divide and new operators are generated by the S mutant strain with a mutagen and select for mutant
Problems 429
bacteria that can express the enzymes encoded by lacZ, when the cells are grown in the presence of tryptophan?
lacY, and lacA genes in the presence of lactose, some of In its absence?
the mutations map to the lac operator region and others
(1)
R+ O+ A+ (wild type)
to the lac repressor gene. On the basis of your knowl-
edge of the lactose operon, provide a molecular genetic (2)
R- O+ A+/R+ O+ A-
explanation for all these properties of the S mutation. (3)
R+ O- A+/R+ O+ A-
Include an explanation of the constitutive nature of the
“reverse mutations.” 23. The activity of the enzyme β-galactosidase produced by
22. The trp operon in E. coli encodes enzymes essential for wild-type cells grown in media supplemented with dif-
the biosynthesis of tryptophan. The general mechanism ferent carbon sources is measured. In relative units, the
for controlling the trp operon is similar to that observed following levels of activity are found:
with the lac operon: when the repressor binds to the op-
erator, transcription is prevented; when the repressor Glucose Lactose Lactose + glucose
does not bind to (as in, to the operator), like phrase just 0 100 1
before the operator, transcription proceeds. The regula-
tion of the trp operon differs from the regulation of the
lac operon in the following way: the enzymes encoded by Predict the relative levels of β-galactosidase activity in
the trp operon are not synthesized when tryptophan is cells grown under similar conditions when the cells are
present but rather when it is absent. In the trp operon, lacI −, lacI S, lacO +, and crp −.
the repressor has two binding sites: one for DNA and the 24. A bacteriophage λ is found that is able to lysogenize its
other for the effector molecule, tryptophan. The trp re- E. coli host at 30°C but not at 42°C. What genes may be
pressor must first bind to a molecule of tryptophan be- mutant in this phage?
fore it can bind effectively to the trp operator.
25. What would happen to the ability of bacteriophage λ to
a. Draw a map of the tryptophan operon, indicating the lyse a host cell if it acquired a mutation in the OR binding
promoter (P ), the operator (O), and the first structural site for the Cro protein? Why?
gene of the tryptophan operon (trpA). In your drawing,
26. Contrast the effects of mutations in genes encoding spor-
indicate where on the DNA the repressor protein binds
ulation-specific σ factors with mutations in the −35 and
when it is bound to tryptophan.
−10 regions of the promoters of genes in their regulons.
b. The trpR gene encodes the repressor; trpO is the
operator; trpA encodes the enzyme tryptophan syn- a. Would functional mutations in the σ-factor genes or
thetase. A trpR− repressor cannot bind tryptophan, a in the individual promoters have the greater effect on
trpO− operator cannot be bound by the repressor, and sporulation?
the enzyme encoded by a trpA− mutant gene is com- b. On the basis of the sequences shown in Figure 11-30b,
pletely inactive. Do you expect to find active tryptophan would you expect all point mutations in −35 or −10
synthetase in each of the following mutant strains regions to affect gene expression?
This page intentionally left blank
344
12
C h a p t e r
Regulation of Gene
Expression in
Eukaryotes
Learning Outcomes
After completing this chapter, you will be able to
Xist RNA Xi
• Explain how eukaryotes generate many
different patterns of gene expression with a
limited number of regulatory proteins.
outline
T
he cloning of Dolly, a sheep, was reported worldwide in 1996 (Figure 12-1).
Dolly developed from adult somatic nuclei that had been implanted into
enucleated eggs (eggs with the nuclei removed). More recently, cows, pigs,
mice, and other mammals have been cloned as well with the use of similar tech-
nology. The successful cloning of Dolly was a great surprise to the scientific com-
munity because gamete (sperm and egg cells) formation was known to include
sex-specific modifications to the respective genomes that resulted in sex-specific
patterns of gene expression. Dolly is symbolic of how far we have progressed in
understanding aspects of eukaryotic gene regulation such as the global control of
gene expression exemplified by gamete development. However, for every success-
ful clone, including Dolly, there are many more, perhaps hundreds of embryos
that fail to develop into viable progeny. The extremely high failure rate under-
scores how much remains to be deciphered about eukaryotic gene regulation.
In this chapter, we will examine gene regulation in eukaryotes. In many ways,
our look at gene regulation will be a study of contrasts. In Chapter 11, you learned
how the activities of genetic switches in bacteria were
often governed by single activator or repressor proteins
The first cloned mammal and how the control of sets of genes was achieved by
their organization into operons or by the activity of spe-
cific factors. Initial expectations were that eukaryotic
gene expression would be regulated by similar means. In
eukaryotes, however, most genes are not found in oper-
ons. Furthermore, we will see that the proteins and DNA
sequences participating in eukaryotic gene regulation
are more numerous. Often, many DNA-binding proteins
act on a single switch, with many separate switches per
gene, and the regulatory sequences of these switches are
often located far from promoters. A key additional dif-
ference between bacteria and eukaryotes is that the
access to eukaryotic gene promoters is restricted by chro-
matin. Gene regulation in eukaryotes requires the activ-
ity of large protein complexes that promote or restrict
access to gene promoters by RNA polymerase. This
chapter will provide an essential foundation for under-
standing the regulation of gene expression in time and
F i g u r e 12 -1 The first cloned mammal was a sheep named Dolly. space that choreographs the process of development
[© Roslin Institute/Phototake.] described in Chapter 13.
level (through alterations in splicing or the stability of the mRNA) and after transla-
tion (by modifications of proteins). The varied ways that eukaryotic genes can be
regulated have been divided into two general categories that reflect when they act:
transcriptional gene regulation and post-transcriptional gene regulation. While the
former will be the primary focus of this chapter, the latter is receiving increasing
attention. In particular, the role of RNA to act post-transcriptionally to repress gene
expression (called gene silencing; see Chapter 8) is one of the hottest areas of current
research. Three of the RNA players that participate in silencing genes, miRNA,
ncRNA, and siRNA, were introduced in Chapter 8. Later in this chapter we will
explore the mechanisms of gene regulation mediated by miRNA and ncRNA.
Most regulation characterized to date takes place at the level of gene tran-
scription; so, in this chapter, the primary focus is on transcriptional gene regula-
tion. The basic mechanism at work is that molecular signals from outside or inside
the cell lead to the binding of regulatory proteins to specific DNA sites outside of
protein-encoding regions, and the binding of these proteins modulates the rate of
transcription. These proteins may directly or indirectly assist RNA polymerase in
binding to its transcription initiation site—the promoter—or they may repress tran-
scription by preventing the binding of RNA polymerase.
Although bacteria and eukaryotes have much of the logic of gene regulation in
common, there are some fundamental differences in the underlying mechanisms
and machinery. Both use sequence-specific DNA-binding proteins to modulate
the level of transcription. However, eukaryotic genomes are bigger, and their
range of properties is larger than that of bacteria. Inevitably, the regulation of
eukaryotic genomes is more complex, reflecting their structural and functional
complexity. This requires more types of regulatory proteins and more types of
interactions with the adjacent regulatory regions in DNA. One notable difference
is that eukaryotic DNA is packaged into nucleosomes, forming chromatin, whereas
bacterial DNA lacks nucleosomes. In eukaryotes, chromatin structure is dynamic
and is an essential ingredient in gene regulation.
In general, the ground state of a bacterial gene is “on.” Thus, RNA polymerase
can usually bind directly to a promoter when no other regulatory proteins are
around to bind to the DNA. In bacteria, transcription initiation is prevented or
reduced if the binding of RNA polymerase is blocked, usually through the binding
of a repressor regulatory protein. Activator regulatory proteins increase the binding
of RNA polymerase to promoters where a little help is needed. In contrast, the
ground state in eukaryotes is “off.” Therefore, the transcriptional machinery
(including RNA polymerase II and associated general transcription factors) cannot
bind to the promoter in the absence of other regulatory proteins (Figure 12-2). In
many cases, the binding of the transcriptional apparatus is not possible because
nucleosomes are positioned to block the promoter. Thus, chromatin structure usu-
ally has to be changed to activate eukaryotic transcription. Those changes generally
depend on the binding of sequence-specific DNA-binding regulatory proteins. The
structure of chromatin around activated or repressed genes within cells can be
quite stable and inherited by daughter cells. The inheritance of chromatin states is
a form of inheritance that does not directly entail DNA sequence; it provides a
means of epigenetic regulation.
The unique features of eukaryotic transcriptional regulation are the focus of
this chapter. Some differences from bacteria in the process of transcription and
that affect regulation were already noted in Chapter 8:
1. In bacteria, all genes are transcribed into RNA by the same RNA polymerase,
whereas three RNA polymerases function in eukaryotes. RNA polymerase II,
which transcribes DNA into mRNA, was the focus of Chapter 8 and will be the
only polymerase discussed in this chapter.
4 3 4 CHA P TER 1 2 Regulation of Gene Expression in Eukaryotes
BACTERIAL EUKARYOTIC
Activator
protein
RNA pol
TATA
Promoter Operator Coding
region
Ground state: on Ground state: off
Transcription Enhancer
factors
Repressor
protein
RNA pol II
TATA
F i g u r e 12 -2 In bacteria, RNA 2. RNA transcripts are extensively processed during transcription in eukaryotes;
polymerase can usually begin transcription the 5′ and 3′ ends are modified and introns are spliced out.
unless a repressor protein blocks it. In
eukaryotes, however, the packaging of 3. RNA polymerase II is much larger and more complex than its bacterial coun-
DNA with nucleosomes prevents terpart. One reason it is more complex is that RNA polymerase II must synthesize
transcription unless other regulatory RNA and coordinate the special processing events unique to eukaryotes.
proteins are present. These regulatory
proteins expose promoter sequences by
Multicellular eukaryotes may have as many as 25,000 genes, several-fold more
altering nucleosome density or position.
They may also recruit RNA polymerase II
than the average bacterium. Moreover, patterns of eukaryotic gene expression can
more directly through binding. be extraordinarily complex. That is, the timing of gene expression, and the amount
of transcript produced, vary widely among eukaryotic genes. For example, one
gene may be transcribed only during one stage of development and another only
in the presence of a viral infection. Finally, the majority of the genes in a eukary-
otic cell are off at any one time. On the basis of these considerations alone, eukary-
otic gene regulation must be able to
1. ensure that the expression of most genes in the genome is off at any one time
while activating a subset of genes.
2. generate thousands of patterns of gene expression.
As you will see later in the chapter, mechanisms have evolved to ensure that
most of the genes in a eukaryotic cell are not transcribed. Before considering how
genes are kept transcriptionally inactive, we will focus on the second point: How
are eukaryotic genes able to exhibit an enormous number and diversity of expres-
sion patterns? The machinery required for generating so many patterns of gene
12.1 Transcriptional Regulation in Eukaryotes: An Overview 4 3 5
3.5
Relative transcription level
3.0
1.0
Model
Organism Yeast
Saccharomyces cerevisiae, or budding yeast, has emerged in Fusion
a
recent years as the premier eukaryotic genetic system.
Humans have grown yeast for centuries because it is an (n) (n)
(2n)
essential component of beer, bread, and wine. Yeast has a /a
many features that make it an ideal model organism. As a Mitosis
unicellular eukaryote, it can be grown on agar plates and,
with yeast’s life cycle of just 90 minutes, large quantities of +
it can be cultured in liquid media. It has a very compact a /a
genome with only about 12 mega–base pairs of DNA (com- a /a (2n)
Ascus Meiosis (2n)
pared with almost 3000 mega–base pairs for humans) con-
(n) a (n)
taining approximately 6000 genes that are distributed on
(n) a (n)
16 chromosomes. It was the first eukaryote to have its
genome sequenced.
The yeast life cycle makes it very versatile for labora- Mitosis Mitosis
tory studies. Cells can be grown as either diploid or hap-
loid. In both cases, the mother cell produces a bud contain-
ing an identical daughter cell. Diploid cells either continue (n) (n) a
to grow by budding or are induced to undergo meiosis,
which produces four haploid spores held together in an
Culture Culture
ascus (also called a tetrad). Haploid spores of opposite colony colony
mating type (a or a) will fuse and form a diploid. Spores of
the same mating type will continue growth by budding. The life cycle of baker’s yeast. The nuclear alleles MATa and
Yeast has been called the E. coli of eukaryotes because MATa determine mating type.
of the ease of forward and reverse mutant analysis. To
isolate mutants by using a forward genetic approach,
haploid cells are mutagenized (with X rays, for example)
and screened on plates for mutant phenotypes. This pro-
cedure is usually done by first plating cells on a rich
medium on which all cells grow and by copying, or rep-
lica plating, the colonies from this master plate onto rep-
lica plates containing selective media or special growth
conditions. (See also Chapter 16.) For example, tempera-
ture-sensitive mutants will grow on the master plate at
the permissive temperature but not on a replica plate at
a restrictive temperature. Comparison of the colonies on
the master and replica plates will reveal the tempera-
ture-sensitive mutants. Using reverse genetics, scientists
can also replace any yeast gene (of known or unknown
function) with a mutant version (synthesized in a test
Electron micrograph of budding yeast cells. [SciMAT/ tube) to understand the nature of the gene product.
Science Source.]
start site is a single 118-bp region that contains four Gal4-binding sites (Figure 12-6).
Each Gal4-binding site is 17 base pairs long and is bound by one Gal4 protein dimer.
There are two Gal4-binding sites upstream of the GAL2 gene as well, and another
two upstream of the GAL7 gene. These binding sites are required for gene activa-
tion in vivo. If they are deleted, the genes are silent, even in the presence of galac-
tose. These regulatory sequences are enhancers. The presence of enhancers located
4 3 8 CHA P TER 1 2 Regulation of Gene Expression in Eukaryotes
Gal4
F i g u r e 12 - 6 The Gal4 protein at a considerable linear distance from a eukaryotic gene’s promoter is typical.
activates target genes through upstream- Because the Gal4-activated enhancers are located upstream (5′) of the genes they
activating-sequence (UAS) elements. The regulate, they are also called upstream activation sequences (UAS).
Gal4 protein has two functional domains:
a DNA-binding domain (pink square) and
an activation domain (orange oval). The K e y C o n c e p t The binding of sequence-specific DNA-binding proteins to
protein binds as a dimer to specific regions outside the promoters of target genes is a common feature of eukaryotic
sequences upstream of the promoters of transcriptional regulation.
Gal-pathway genes. Some of the GAL
genes are adjacent (GAL1, GAL10),
whereas others are on different
chromosomes. The GAL1 UAS element
The Gal4 protein has separable DNA-binding
contains four Gal4-binding sites. and activation domains
After Gal4 is bound to the UAS element, how is gene expression induced? A dis-
tinct domain of the Gal4 protein, the activation domain, is required for regula-
tory activity. Thus, the Gal4 protein has at least two domains: one for DNA binding
and another for activating transcription. A similar modular organization has been
found to be a common feature of other DNA-binding transcription factors as well.
The modular organization of the Gal4 protein was demonstrated in a series of
simple, elegant experiments. The strategy was to test the DNA binding and gene
activation of mutant forms of the protein in which parts had been either deleted
or fused to other proteins. By this means, investigators could determine whether
a part of the protein was necessary for a particular function. To carry out these
studies, experimenters needed a simple means of assaying the expression of the
enzymes encoded by the GAL genes.
The expression of GAL genes and other targets of transcription factors is typi-
cally monitored by using a reporter gene whose level of expression is easily mea-
sured. In reporter-gene constructs, the reporter gene is linked to the regulatory
sequences that govern the expression of the gene being investigated. The expres-
sion of the reporter gene reflects the activity of the regulatory element being
investigated. Often, the reporter gene is the lacZ gene of E. coli. LacZ is an effective
reporter gene because the products of its activity are easily measured. Another
common reporter gene is the gene that encodes the green fluorescent protein
(GFP) of jellyfish. As its name suggests, the concentration of reporter protein is
easily measured by the amount of light that it emits. To investigate the control of
GAL gene expression, the coding region of one of these reporter genes and a pro-
moter are placed downstream of a UAS element from a GAL gene. Reporter
expression is then a readout of Gal4 activity in cells (Figure 12-7a).
Let’s see what happens when a form of the Gal4 protein lacking the activation
domain is expressed in yeast. In this case, the binding sites of the UAS element are
12.2 Lessons from Yeast: The GAL System 4 3 9
+ Galactose
+ Gal3 Now we look at how activators and other regulatory proteins inter-
act with the transcriptional machinery to control gene expression.
MAT locus
a1 a1
α2 α1 α2 α1
Expressed
a1 α2 a1
regulatory
proteins α1 α2
MCM1 MCM1 MCM1
α2 α2 α2 α2
a-specific
ON OFF OFF
genes
MCM1 MCM1 MCM1
α1
α-specific
OFF ON OFF
genes
MCM1
a1 α2
Haploid-specific
ON ON OFF
genes
how this mechanism works, we need to first understand chromatin structure and
then consider how it can change and how these changes affect gene expression.
The recruitment of transcriptional machinery by activators may appear to be
somewhat similar in eukaryotes and bacteria, with the major difference being the
number of interacting proteins in the transcriptional machinery. Indeed, two
decades ago, many biologists pictured eukaryotic regulation simply as a biochemi-
cally more complicated version of what had been discovered in bacteria. However,
F i g u r e 12 -11 (a) The nucleosome in
this view has changed dramatically as biologists have considered the effect of the
decondensed and condensed chromatin.
organization of genomic DNA in eukaryotes. (b) End view of the coiled chain of
Compared with eukaryotic DNA, bacterial DNA is relatively “naked,” making it nucleosomes. (c) Chromatin structure
readily accessible to RNA polymerase. In contrast, eukaryotic chromosomes are varies along the length of a chromosome.
packaged into chromatin, which is composed of DNA and proteins (mostly his- The least-condensed chromatin
tones). The basic unit of chromatin is the nucleosome, which contains ~150 bp of (euchromatin) is shown in yellow, regions
DNA wrapped 1.7 times around a core of histone proteins (Figure 12-11). The of intermediate condensation are in orange
and blue, and heterochromatin coated
nucleosome core contains eight histones, two subunits of each of the four histones:
with special proteins (purple) is in red.
histones 2A, 2B, 3, and 4 (called H2A, H2B, H3, and H4) organized as two dimers of [ (c) From P. J. Horn and C. L. Peterson,
H2A and H2B and a tetramer of H3 and H4. Surrounding the nucleosome core is a “Chromatin Higher Order Folding: Wrapping
linker histone, H1, which can compact the nucleosomes into higher-order struc- Up Transcription,” Science 297, 2002, 1827,
tures that further condense the DNA. Fig. 3. Copyright 2002, AAAS.]
(a)
Short region of
2 nm
DNA double helix
Nucleosomes:
the basic unit 11 nm
of chromatin
Chromatin fiber
30 nm
of packed 30 nm
nucleosomes DNA wrapped
10 nm
around histone
(b) H1 histone core
Octameric DNA
histone core
H1 histone Histone
octamer
(c) Nucleosome H2A, H2B,
H3, H4
30 nm
The packaging of eukaryotic DNA into chromatin means that much of the
DNA is not readily accessible to regulatory proteins and the transcriptional appa-
ratus. Thus, whereas prokaryotic genes are generally accessible and “on” unless
repressed, eukaryotic genes are inaccessible and “off” unless activated. Therefore,
the modification of chromatin structure is a distinctive feature of many eukary-
otic processes including gene regulation (discussed in this chapter), DNA replica-
tion (Chapter 7), and DNA repair (Chapter 16). There are three major mechanisms
to alter chromatin structure that will be discussed in depth in this section:
1. moving nucleosomes along the DNA, also called chromatin remodeling.
2. histone modification in the nucleosome core.
3. replacing the common histones in a nucleosome with histone variants.
cycle or in certain cell types (in multicellular eukaryotes). For example, genes are
less accessible during mitosis, when chromatin is more condensed. At that stage,
Gal4 must recruit the chromatin-remodeling complex, whereas at other times, it
might not be necessary to use the complex.
A second reason is that many transcription factors act in combinations to con-
trol gene expression synergistically. We will see shortly that this combinatorial
synergy is a result of the fact that chromatin-remodeling complexes and the tran-
scriptional machinery are recruited more efficiently when multiple transcription
factors act together.
Modification of histones
Let’s look at the nucleosome more closely to see if any part of this structure could
carry the information necessary to influence nucleosome position, nucleosome
density, or both.
As already stated, most nucleosomes are composed of a histone octamer made
up of two dimers of H2A and H2B and a tetramer of H3 and H4. Histones are
known to be the most conserved proteins in nature; that is, histones are almost
identical in all eukaryotic organisms from yeast to plants to animals. In the past,
this conservation contributed to the view that histones could not take part in any-
thing more complicated than the packaging of DNA to fit in the nucleus. However,
recall that DNA with just its four bases was once considered too simple a molecule
to carry the blueprint for all organisms on Earth.
Figure 12-13a shows a model of nucleosome structure that represents contribu-
tions from many studies. Of note is that the histone proteins are organized into the
core octamer with some of their amino-terminal ends making electrostatic contacts
with the phosphate backbone of the surrounding DNA. These protruding ends are
(a)
H4 H3 H4 H3 H4 H3
A
(b) A A
A A
Lys
Lys Lys
Glu Lys Lys
H2B H2A
M F i g u r e 12 -13 (a) Histone tails
A protrude from the nucleosome core
A A Lys H4 H3
A Lys M A M (red). (b) Examples of histone tail
Ser A A
Lys Lys
Lys A A modifications are shown. Circles with
Lys
Ser Lys
Lys Lys
Ser
A represent acetylation while circles
Lys
with M represent methylation. See
text for details.
4 46 CHA P TER 1 2 Regulation of Gene Expression in Eukaryotes
called histone tails. Since the early 1960s, it has been known that specific basic
amino acid residues (lysine and arginine) in the histone tails can be covalently mod-
ified by the attachment of acetyl and methyl groups (Figure 12-13b). These reac-
tions, which take place after the histone protein has been translated and even after
the histone has been incorporated into a nucleosome, are called post-translational
modifications (PTMs).
There are now known to be at least 150 different histone modifications that uti-
lize a wide variety of molecules in addition to the acetyl and methyl groups already
mentioned, including phosphorylation, ubiquitination, and ADP ribosylation. The
covalent modification of histone tails is said to contribute to a histone code. Scien-
tists coined the expression “histone code” because the covalent modification of his-
tone tails is reminiscent of the genetic code. For the histone code, information is
stored in the patterns of histone modification rather than in the sequence of nucleo-
tides. With more than 150 known histone modifications, there are a huge number of
possible patterns, and scientists are just beginning to decipher their effects on chro-
matin structure and transcriptional regulation. To add to this complexity, the code is
likely not interpreted in precisely the same way in all organisms. The role of histone
acetylation and methylation in gene expression is described below.
NH+3 NH
CH2 CH2
Acetylation
CH2 by HATs CH2
CH2 CH2
Deacetylation
CH2 by HDACs CH2
C C
N H C N H C
H O H O
Note that the reaction is reversible, which means that acetyl groups can be
added by the enzyme histone acetyltransferase (HAT) and removed by the
enzyme histone deacetylase (HDAC) from the same histone residue. For now,
let’s see how the acetylation and deacetylation of histone amino acids influences
chromatin structure and gene expression.
Evidence had been accumulating for years that the histones associated with the
nucleosomes of active genes are rich in acetyl groups (said to be hyperacetylated),
whereas inactive genes are underacetylated (hypoacetylated). The HAT enzyme
proved very difficult to isolate. When it was finally isolated and its protein sequence
deduced, it was found to be an ortholog of a yeast transcriptional activator called
GCN5 (meaning that it was encoded by the same gene in a different organism).
Thus, the conclusion was that GCN5 is a histone acetyltransferase. It binds to the
DNA in the regulatory regions of some genes and activates transcription by acety-
lating nearby histones. Various protein complexes that are recruited by transcrip-
tional activators are now understood to possess HAT activity.
How does histone acetylation alter chromatin structure and, in the process,
facilitate changes in gene expression? The addition of acetyl groups to histone resi-
dues neutralizes the positive charge of lysine residues and reduces the interaction
of the histone tails with the negatively charged DNA backbone. This results in more
Introduction to Genetic Analysis, 11e
open chromatin as Figure
the electrostatic
12UN1 #1217interactions between adjacent nucleosomes and
between nucleosomes and adjacent DNA are reduced (Figure 12-14). In addition,
06/04/14
histone acetylation,Dragonfly Media Group
in conjunction with other histone modifications, influences the
12.3 Dynamic Chromatin 4 47
A A A A
A A A
A A A
A A
A A A
A
A
A
A A A
A A A
A A A A
A A A A
binding of regulatory proteins to the DNA. A bound regulatory protein may take F i g u r e 12 -14 Acetylation of lysine
part in one of several functions that either directly or indirectly increase the fre- amino acids in histone tails opens the
quency of transcription initiation. chromatin, exposing DNA to the activity of
Like other histone modifications, acetylation is reversible, and HDACs play proteins that regulate transcription.
key roles in gene repression. For example, in the presence of galactose and glu-
cose, the activation of GAL genes is prevented by the Mig1 protein. Mig1 is a
sequence-specific DNA-binding repressor that binds to a site between the UAS
element and the promoter of the GAL1 gene (Figure 12-15). Mig1 recruits a pro-
tein complex called Tup1 that contains a histone deacetylase and that represses
Introduction to Genetic Analysis, 11e
gene transcription. The Tup1 complex is an example of a corepressor, which
Figure 12.14 #1218
facilitates
06/04/14 gene repression but is not itself a DNA-binding repressor. The Tup1
06/23/14 is also recruited by other yeast repressors, such as MATα 2 (see pages
complex
07/23/14 and counterparts of this complex are found in all eukaryotes.
441–442),
08/01/14
Dragonfly Media Group
Gal4
F i g u r e 12 -15 Recruitment of a repressing
complex leads to repression of transcription. In the
Tup1 presence of glucose, GAL1 transcription is repressed
Mig1 by the Mig1 protein, regardless of the presence of
GAL1
OFF Gal4 at the UAS. Mig1 binds to a site between the
UAS Mig1 UAS and the promoter of the GAL1 gene and recruits
site the Tup1 repressing complex, which recruits a histone
deacetylase, turning gene transcription off.
4 48 CHA P TER 1 2 Regulation of Gene Expression in Eukaryotes
Key
H3C = M
The amino acid lysine is abbreviated with a “K.” As such, these post-translational
modifications of lysine 9 of histone H3 are referred to as H3K9me1, H3K9me2, and
H3K9me3, respectively.
Unlike acetylation, the addition of methyl groups can either activate or repress
gene expression. Recall that acetylation of lysine acts to neutralize the positive
histone charge and, in this direct way, activates gene expression by reducing inter-
actions between nucleosomes and DNA, opening chromatin. In contrast, methyla-
tion of specific lysine residues, which does not affect charge, creates binding sites
for other proteins that either activate or repress gene expression depending on the
residues modified. For example, methylation of H3 lysine residue 4 [H3K4(me)]
is associated with the activation of gene expression and is enriched in nucleosomes
near the start of transcription. There is a very different outcome when H3K9 or
H3K27 are methylated. These modifications, which are associated with gene
repression and tightly packed chromatin, will be discussed in greater detail later
in this chapter.
Newly synthesized
histones, no histone code
new histones are delivered to the replisome. The randomly distributed old histones F i g u r e 12 -16 In replication, old
serve as templates to guide the modification of new histones. In this way, the old histones (purple) with their histone codes
histones with their modified tails and the new histones with unmodified tails are are distributed randomly to the daughter
strands, where they direct the coding of
assembled into nucleosomes that become associated with both daughter strands adjacent newly assembled histones
(Figure 12-16). The modifications carried by the old histones are responsible in part (orange) to form complete nucleosomes.
for epigenetic inheritance. As such, these old modifications are called epigenetic
marks because they guide the modification of the new histones.
Histone variants
Unlike the common (also called consensus) histones that are added during DNA
replication, eukaryotes also have other histones, called histone variants, that can
replace the consensus histones that have already been assembled into nucleosomes.
For example, two variants for histone H2 are called H2A.Z and H2A.X, and one H3
variant is called CENP-A. Given that histones can be modified in so many ways, why
might it be necessary to replace one histone with a variant? While scientists are just
beginning to understand the different roles of histone variants, a common theme is
that they provide a quick way to change chromatin by replacing one histone code
with another. CENP-A, for example, replaces H3 in centromeric DNA, and its pres-
ence is thought to define centromere function. The role of histone variant H2A.Z in
identifying damaged
Introduction to Genetic DNA for11e
Analysis, rapid repair is discussed in detail in Chapter 16.
Figure 12.16 #1221
06/04/14
DNA
06/23/14methylation: another heritable mark that influences
chromatin structure
07/23/14
08/01/14
There
08/07/14is another important epigenetic mark in most (but not all) eukaryotes. This
mark is not
Dragonfly a histone
Media Group modification; rather, it is the addition of methyl groups to
DNA residues after replication. An enzyme usually attaches these methyl groups
to the carbon-5 position of a specific cytosine residue.
Methyl group
NH2 NH2
C C CH3
N3 4 4
5C DNA Methyltransferase N3 5C
C2 1
6C C2 1
6C
O N O N
Cytosine 5-Methylcytosine (5meC)
4 50 CHA P TER 1 2 Regulation of Gene Expression in Eukaryotes
M M
C*G
G C*
regulate genes during their lifetime, it is necessary to see how chromatin changes
during transcriptional activation. In addition, the development of a complex organ-
ism requires that transcription levels be regulated over a wide range of activities.
Think of a regulation mechanism as more like a rheostat that controls sound from an
iPod than an on-or-off switch: rather than a gene producing either many or no pro-
teins, it may produce a number anywhere in between depending on the transcrip-
tion level. In eukaryotes, transcription levels are made finely adjustable in a
chromatin environment by clustering binding sites into enhancers. Several different
transcription factors or several molecules of the same transcription factor may bind
to adjacent sites. The binding of these factors to sites that are the correct distance
apart leads to an amplified, or superadditive, effect on activating transcription.
When an effect is greater than additive, it is said to be synergistic.
The binding of multiple regulatory proteins to the multiple binding sites in an
enhancer can catalyze the formation of an enhanceosome, a large protein com-
plex that acts synergistically to activate transcription. In Figure 12-18 you can see
how architectural proteins bend the DNA to promote cooperative interactions
between the other DNA-binding proteins. In this mode of enhanceosome action,
transcription is activated to very high levels only when all the proteins are present
and touching one another in just the right way.
To better understand what an enhanceosome is and how it acts synergistically,
let’s look at a specific example.
DNA-
bending
proteins
CBP
RNA pol II
ON OFF
Promoter 2 Enhancer Enhancer- Promoter 1
blocking
insulator
to this model, insulators act by moving a promoter into a new loop, where it is
shielded from the enhancer.
As you will see later in the chapter, enhancer-blocking insulators are a funda-
mental component of a phenomenon called genomic imprinting.
Promoter 1 Promoter 2
ON
OFF
Enhancer
Enhancer-blocking
insulator
Figure 12-21 One proposal is that enhancer-blocking insulators create new loops that
physically separate a promoter from its enhancer.
4 54 CHA P TER 1 2 Regulation of Gene Expression in Eukaryotes
F i g u r e 12 -2 2 S. cerevisiae
Mating-type switching is controlled chromosome III encodes three mating-
by recombination of DNA cassettes type loci, but only the genes at the MAT
locus are expressed. HML encodes a
(a) silent cassette of the α genes, and HMR
encodes a silent cassette of the a genes.
HMLα MATa HMRa
Copying of a silent cassette and insertion
Silent Active Silent
through recombination at the MAT locus
switches mating type.
a mating type
α mating type
a mating type
chromosomes align in preparation for cell division. After cell division, regions
forming heterochromatin remain condensed, especially around the centromeres
and telomeres (called constitutive heterochromatin), whereas the regions form-
ing euchromatin become less condensed. As seen for the β-interferon example
(see Figure 12-19), the chromatin of active genes can change in response to devel-
opmental stage or environmental conditions.
The major distinction between heterochromatin and euchromatin is that the
former contains few genes while the latter is rich in genes. But what is heterochro-
matin if not genes? Most of the eukaryotic genome is composed of repetitive
sequences that do not make protein or structural RNA (see Chapter 14). Thus, the
densely packed nucleosomes of heterochromatin (organized into 30-nm chromatin
fibers; see Figure 12-11a) are said to form a “closed” structure that is largely inacces-
sible to regulatory proteins and inhospitable to gene activity. In contrast, euchroma-
tin, with its more widely spaced nucleosomes (organized into 10-nm fibers; see
Figure 12-11a), assumes an “open” structure that permits transcription.
Chromosome white+
white+ gene
Wild-type eye
expressed
Telomere Centromere
white+
white+ gene
Red facet
expressed
Geneticists reasoned that PEV could be exploited to identify the proteins neces-
sary for forming heterochromatin. To this end, they isolated mutations at a sec-
ond chromosomal locus that either suppressed or enhanced the variegated
pattern (Figure 12-24). Suppressors of variegation [called Su(var) ] are genes that,
when mutated, reduce the spread of heterochromatin, meaning that the wild-
type products of these genes are required for spreading. In fact, the Su(var) alleles
have proved to be a treasure trove for scientists interested in the proteins that are
required to establish and maintain the inactive, heterochromatic state. Among
more than to
Introduction 50Genetic
Drosophila gene
Analysis, 11eproducts identified by these screens was hetero-
Figure 12.23protein-1
chromatin #1229 (HP-1), which had previously been found associated with the
06/04/14
06/23/14
Dragonfly Media Group
4 58 CHA P TER 1 2 Regulation of Gene Expression in Eukaryotes
Second-site
mutations that affect
the spreading of
heterochromatin
Spreading
suppressed.
Su(var) Fewer white+
are silenced.
OFF
ON
OFF
OFF
OFF
ON
HP-1 HMTase
HAT
M M M M A A A A
Paternal allele
M M M M
Igf2 H19
ON >50 kb ICR OFF Enhancer
Maternal allele
F i g u r e 12 -2 7 Genomic imprinting in
the mouse. The imprinting control region
CTCF
Let’s turn again to the mouse Igf2 and H19 genes to see how imprinting works
at the molecular level. These two genes are located in a cluster of imprinted genes Unusual inheritance
on mouse chromosome 7. There are an estimated 100 imprinted genes in the of imprinted genes
mouse, and most are found in clusters comprising from 3 to 11 imprinted genes.
No mutations
(Humans have most of the same clustered imprinted genes as those in the mouse.)
In all cases examined, there is a specific pattern of DNA methylation for each
A B
gene copy of an imprinted gene. For the Igf2–H19 cluster, a specific region of DNA
lying between the two genes (Figure 12-27) is methylated in male germ cells and M
unmethylated in female germ cells. This region is called the imprinting control A B
region (ICR). Thus, methylation of the ICR leads to Igf2 being active and H19 ICR
being inactive, whereas lack of methylation leads to the reverse.
How does methylation control which of the two genes is active? Methylation Mutation in imprinted gene
acts as a block to the binding of proteins needed for transcription. Only the
unmethylated (female) ICR can bind a regulatory protein called CTCF. When A B
bound, CTCF acts as an enhancer-blocking insulator that prevents enhancer acti-
vation of Igf2 transcription. However, the enhancer in females can still activate M
H19 transcription. In males, CTCF cannot bind to the ICR and the enhancer can A B
activate Igf2 transcription (recall that enhancers can act at great distances). The ICR
enhancer cannot activate H19, however, because the methylated region extends OUTCOME UNAFFECTED
into the H19 promoter. The methylated promoter cannot bind proteins needed
for the transcription of H19. Mutation in nonimprinted gene
Thus, we see how an enhancer-blocking insulator (in this case, CTCF bound to
part of the ICR) prevents the enhancer from activating a distant gene (in this case,
A B
Igf2). Furthermore, we see that the CTCF-binding site is methylated only in chro-
mosomes derived from the male parent. The methylation of the CTCF-binding M
site prevents CTCF binding in males and permits the enhancer to activate Igf2. A B
Note that parental imprinting can greatly affect pedigree analysis. Because the ICR
inherited allele from one parent is inactive, a mutation in the allele inherited from OUTCOME AFFECTED
the other parent will appear to be dominant, whereas, in fact, the allele is expressed
because only one of the two homologs is active for this gene. Figure 12-28 shows F i g u r e 12 -2 8 A mutation (represented
how a mutation in an imprinted gene can have different outcomes on the pheno- by an orange star) in gene A will have no
type of the organism if inherited from the male or from the female parent. effect if inherited from the male.
Many steps are required for imprinting (Figure 12-29). Soon after fertilization, Abbreviations: M, methylation; ICR,
mammals set aside cells that will become their germ cells. Imprints are erased imprinting control region.
before the germ cells form. Without their distinguishing mark of DNA methylation,
these genes are now said to be epigenetically equivalent. As these primordial germ
cells become fully formed gametes, imprinted genes receive the sex-specific mark
that will determine whether the gene will be active or silent after fertilization.
Primordial
germ cells 1 Imprints
erased
Primordial
germ cells 2 Imprints
initiated
Gametes
Sperm Oocyte
3 Propagation
of imprints
Silent allele
Fertilization and
development
Active allele
AAAA
Protein-coding region
Possible mechanisms
used by RISC
AAAA AAAA
RISC RISC
(c) Interference with translation initiation: (e) Removal of the polyA tail
RISC prevents ribosome assembly
AAAA AAAA
RISC
F i g u r e 12 - 31 See text for details. Two types of small functional RNAs were introduced in Chapter 8: siRNAs
and miRNAs. In this section we explore how miRNAs assist in the regulation of
eukaryotic gene expression. The function of siRNAs will be considered further in
Chapter 15.
Recall from Chapter 8 that miRNAs are synthesized by RNA pol II as longer
RNAs that are processed into the smaller (~22 nt) biologically active miRNAs (see
Figure 8-20). Organisms contain hundreds of miRNAs that regulate thousands of
genes. Of these, about 1/3 are organized into clusters that are transcribed into a
single transcript, which is later processed to form several miRNAs. In contrast,
about 1/4 of all miRNAs are processed from transcripts derived from spliced
introns. The final steps in the processing of miRNAs occurs in the cytoplasm.
In Chapter 8 we saw how the active single-stranded miRNAs bind to the RNA-
inducing silencing complex (RISC) and hybridize to mRNAs that are complemen-
tary to the miRNAs. Specifically, the binding region of the miRNA consists of
nucleotides 2 through 8 of the ~22-nt miRNA, called the seed region. The nucleo-
tides of the seed region bind to the 3′ UTR of an mRNA that is being translated by
ribosomes (Figure 12-31). While the miRNA–RISC complex is known to inhibit
translation, the precise mechanism is still under investigation. Models for how
Summary 46 5
s u m m ary
Many aspects of eukaryotic gene regulation resemble the Histone modifications are epigenetic marks that, along with
regulation of bacterial operons. Both operate largely at the the methylation of cytosine bases, can be altered by tran-
level of transcription, and both rely on trans-acting proteins scription factors. These factors bind to regulatory regions and
that bind to cis-acting regulatory target sequences on the recruit protein complexes that enzymatically modify adja-
DNA molecule. These regulatory proteins determine the cent nucleosomes. These large multisubunit protein com-
level of transcription from a gene by controlling the binding plexes use the energy of ATP hydrolysis to move nucleosomes
of RNA polymerase to the gene’s promoter. and remodel chromatin.
There are three major distinguishing features of the con- DNA replication faithfully copies both the DNA sequence
trol of transcription in eukaryotes. First, eukaryotic genes and the chromatin structure from parent to daughter cells.
possess enhancers, which are cis-acting regulatory elements Newly formed cells inherit both genetic information, inher-
located at sometimes great linear distances from the pro- ent in the nucleotide sequence of DNA, and epigenetic infor-
moter. Many genes possess multiple enhancers. Second, mation, which is in the histone code and the pattern of DNA
these enhancers are often bound by more transcription fac- methylation.
tors than are bacterial operons. Multicellular eukaryotes The existence of epigenetic phenomena such as genetic
must generate thousands of patterns of gene expression with imprinting and X-chromosome inactivation demonstrates
a limited number of regulatory proteins (transcription fac- that eukaryotic gene expression can be silenced without
tors). They do so through combinatorial interactions among changing the DNA sequence of the gene. Another epigenetic
transcription factors. Enhanceosomes are complexes of reg- phenomenon, position-effect variegation, revealed the exis-
ulatory proteins that interact in a cooperative and synergis- tence of repressive heterochromatic domains that are associ-
tic fashion to promote high levels of transcription through ated with highly condensed nucleosomes and contain
the recruitment of RNA polymerase II to the transcription few genes. Barrier insulators maintain the integrity of the
start site. genome by preventing the conversion of euchromatin into
Third, eukaryotic genes are packaged in chromatin. Gene heterochromatin.
activation and repression require specific modifications to There is a growing appreciation for the role of functional
chromatin. The vast majority of the tens of thousands of RNAs, such as ncRNAs and miRNAs, in the regulation of
genes in a typical eukaryotic genome are turned off at any eukaryotic gene expression. These RNAs serve to target pro-
one time. Genes are maintained in a transcriptionally inac- tein complexes to complementary DNA or RNA in the cell.
tive state through the participation of nucleosomes, which For some RNAs (like Xist), the act of transcription may tether
serve to compact the chromatin and prevent the binding of the RNA to a chromosomal region where proteins will bind
RNA polymerase II. The position of nucleosomes and the and alter chromatin. In contrast, the translation of hundreds
extent of chromatin condensation are instructed by the pat- of mRNAs is repressed when RISC bound to complementary
tern of post-translational modifications of the histone tails. miRNAs is targeted to their 3′ UTRs.
46 6 CHA P TER 1 2 Regulation of Gene Expression in Eukaryotes
k ey ter m s
activation domain (p. 438) epigenetic mark (p. 449) paternal imprinting (p. 460)
Barr body (p. 463) euchromatin (p. 455) position-effect variegation (PEV)
barrier insulator (p. 459) genomic imprinting (p. 460) (p. 456)
chromatin remodeling (p. 444) hemimethylation (p. 450) post-transcriptional gene silencing
co-activator (p. 440) heterochromatin (p. 455) (p. 454)
constitutive heterochromatin histone code (p. 446) post-translational modification
(p. 456) histone deacetylase (HDAC) (p. 446) (PTM) (p. 446)
corepressor (p. 447) histone modification (p. 444) promoter-proximal element (p. 435)
CpG island (p. 450) histone variant (p. 444) reporter gene (p. 438)
dosage compensation (p. 463) histone tail (p. 446) synergistic effect (p. 451)
enhanceosome (p. 451) hyperacetylation (p. 446) transcriptional gene silencing
enhancer (p. 435) hypoacetylation (p. 446) (p. 454)
enhancer-blocking insulator (p. 452) maternal imprinting (p. 460) upstream activation sequence (UAS)
epigenetic inheritance (p. 448) mediator complex (p. 440) (p. 438)
proble m s
Most of the problems are also available for review/grading through the http://www.whfreeman.com/
launchpad/iga11e.
W or k i n g wit h t h e F ig u res tissue or both. If you need it, you have access to radioac-
1. In Figure 12-4, certain mutations decrease the relative tive white-gene DNA.
transcription rate of the β-globin gene. Where are these 9. In Figure 12-26, provide a biochemical mechanism for
mutations located, and how do they exert their effects on why HP-1 can bind to the DNA only on the left side of the
transcription? barrier insulator. Similarly, why can HMTase bind only
2. Based on the information in Figure 12-6, how does to the DNA on the left of the barrier insulator?
Gal4 regulate four different GAL genes at the same 10. In reference to Figure 12-28, draw the outcome if there is
time? Contrast this mechanism with how the Lac re- a mutation in gene B.
pressor controls the expression of three genes.
3. In any experiment, controls are essential in order to de- B asic P roble m s
termine the specific effect of changing some parameter. 11. What analogies can you draw between transcriptional
In Figure 12-7, which constructs are the “controls” that trans-acting factors that activate gene expression in eu-
serve to establish the principle that activation domains karyotes and the corresponding factors in bacteria? Give
are modular and interchangeable? an example.
4. Contrast the role of the MCM1 protein in different yeast 12. Contrast how the ground states of genes within DNA in
cell types shown in Figure 12-10. How are the a-specific bacteria and eukaryotes differ with respect to gene acti-
genes controlled differently in different cell types? vation (see Figure 12-2).
5. In Figure 12-11b, in what chromosomal region are you 13. Predict and explain the effect on GAL1 transcription, in
likely to find the most H1 histone protein? the presence of galactose alone, of the following
6. What is the conceptual connection between Figures 12- mutations:
12 and 12-19? a. Deletion of one Gal4-binding site in the GAL1 UAS
7. In Figure 12-19, where is the TATA box located before the element.
enhanceosome forms at the top of the figure? b. Deletion of all four Gal4-binding sites in the GAL1
8. Let’s say that you have incredible skill and can isolate the UAS element.
white and red patches of tissue from the Drosophila eyes c. Deletion of the Mig1-binding site upstream of GAL1.
shown in Figure 12-24 in order to isolate mRNA from d. Deletion of the Gal4 activation domain.
each tissue preparation. Using your knowledge of DNA
e. Deletion of the GAL80 gene.
techniques from Chapter 10, design an experiment that
would allow you to determine whether RNA is tran- f. Deletion of the GAL1 promoter.
scribed from the white gene in the red tissue or the white g. Deletion of the GAL3 gene.
Problems 467
14. How is the activation of the GAL1 gene prevented in the b. genes near heterochromatin tend to be silenced.
presence of galactose and glucose? c. enhancers can activate transcription whether they
15. What are the roles of histone deacetylation and histone are upstream or downstream of a gene.
acetylation in gene regulation, respectively? d. genes on the X chromosome have twice as many
16. An a strain of yeast that cannot switch mating type is copies in females as in males.
isolated. What mutations might it carry that would ex- 30. Which of the following is true of chromatin in
plain this phenotype? prokaryotes?
17. What genes are regulated by the a1 and a2 proteins in a. Bacterial chromosomes are not organized into
an a cell? chromatin.
18. What are Sir proteins? How do mutations in SIR genes b. It is found in the nucleus.
affect the expression of mating-type cassettes? c. It is found in the cytoplasm.
19. What is meant by the term epigenetic inheritance? What d. It is very similar to chromatin in eukaryotes.
are two examples of such inheritance? 31. Which of the following statements is/are true of the var-
20. What is an enhanceosome? Why could a mutation in any iegated eye color phenotype in Figure 12-23?
one of the enhanceosome proteins severely reduce the a. The white gene is active in the white sectors and
transcription rate? inactive in the red sectors.
21. Usually, the deletion of a gene results in a recessive mu- b. The white gene is inactive in the white sectors and
tation. Explain how the deletion of an imprinted gene active in the red sectors.
can be a dominant mutation. c. The white gene is inactive in the white sectors due
22. What has to happen for the expression of two different to the spread of heterochromatin.
genes on two different chromosomes to be regulated by d. b and c are both true.
the same miRNA?
32. Which of the following are epigenetic marks?
23. What mechanisms are thought to be responsible for the
a. Methylated basic amino acids in histone tails
inheritance of epigenetic information?
b. Methylated cytosine residues in DNA
24. What is the fundamental difference in how bacterial and
c. Acetylated amino acids in histone tails
eukaryotic genes are regulated?
d. All of the above are epigenetic marks.
25. Why is it said that transcriptional regulation in eukary-
otes is characterized by combinatorial interactions?
C h alle n gi n g P roble m s
26. Which of the following statements about histones is
33. The transcription of a gene called YFG (your favorite
true?
gene) is activated when three transcription factors (TFA,
a. They are proteins whose sequence is highly TFB, TFC) interact to recruit the co-activator CRX. TFA,
conserved in all eukaryotes. TFB, TFC, and CRX and their respective binding sites
b. They are the building blocks of nucleosomes. constitute an enhanceosome located 10 kb from the tran-
c. a and b are correct. scription start site. Draw a diagram showing how you
d. None of the above are correct. think the enhanceosome functions to recruit RNA poly-
merase to the promoter of YFG.
27. Nucleosomes are
34. A single mutation in one of the transcription factors in
a. composed of DNA and protein. Problem 33 results in a drastic reduction in YFG tran-
b. necessary to condense DNA/chromosomes. scription. Diagram what this mutant interaction might
c. essential for the correct regulation of eukaryotic look like.
gene expression. 35. Diagram the effect of a mutation in the binding site for
d. All of the above are true. one of the transcription factors in Problem 33.
28. The regions of chromosomes that form heterochromatin 36. Null alleles (mutant genes) produce no protein prod-
a. contain highly expressed genes. uct. This is a genetic change. However, epigenetically
b. contain few genes. silenced genes also produce no protein product. How
does one determine experimentally whether a gene
c. are associated with the nuclear envelope.
has been silenced by mutation or has been silenced
d. are abundant in prokaryotes. epigenetically?
29. Dosage compensation is necessary because 37. What are epigenetic marks? Which are associated with
a. some regions of the genome contain more genes heterochromatin? How are epigenetic marks thought to
than others. be responsible for determining chromatin structure?
46 8 CHA P TER 1 2 Regulation of Gene Expression in Eukaryotes
38. You receive four strains of yeast in the mail, and the No
accompanying instructions state that each strain con- treatment FGF Cort EP
Clone
tains a single copy of transgene A. You grow the four
A
strains and determine that only three strains express
the protein product of transgene A. Further analysis B
reveals that transgene A is located at a different posi- C
tion in the yeast genome in each of the four strains.
D
Provide a hypothesis to explain this result.
39. You wish to find the cis-acting regulatory DNA ele-
ments responsible for the transcriptional responses of The levels of transcripts produced from the introduced
two genes, c-fos and globin. Transcription of the c-fos genes in response to various treatments are shown; the
gene is activated in response to fibroblast growth factor intensity of these bands is proportional to the amount of
(FGF), but it is inhibited by cortisol (Cort). On the oth- transcript made from a particular clone. (The failure of a
er hand, transcription of the globin gene is not affected band to appear indicates that the level of transcript is
by either FGF or cortisol, but it is stimulated by the undetectable.)
hormone erythropoietin (EP). To find the cis-acting a. Where is the DNA element that permits activation
regulatory DNA elements responsible for these tran- by FGF?
scriptional responses, you use the following clones of
b. Where is the DNA element that permits repression
the c-fos and globin genes, as well as two “hybrid” com-
by Cort?
binations (fusion genes), as shown in the diagram be-
low. The letter A represents the intact c-fos gene, D c. Where is the DNA element that permits induction by
represents the intact globin gene, and B and C repre- EP? Explain your answer.
sent the c-fos–globin gene fusions. The c-fos and globin 40. Using the experimental system shown in Figure 12-26, a
exons (E) and introns (I) are numbered. For example, geneticist is able to insert a barrier insulator between the
E3(f) is the third exon of the c-fos gene and I2(g) is the white gene and the heterochromatin. The eye pheno-
second intron of the globin gene. (These labels are pro- type of the transgenic Drosophila would most likely be
vided to help you make your answer clear.) The tran- a. all white because the barrier element will prevent
scription start sites (black arrows) and polyadenylation the spread of heterochromatin.
sites (red arrows) are indicated.
b. all red because the barrier element will prevent the
↓ E1(f) E2(f) E3(f) ↓ spread of heterochromatin.
A c. still variegated because barrier elements don’t stop
I1(f) I2(f)
the spread of heterochromatin.
↓ ↓
B d. still variegated because barrier elements don’t work
in Drosophila.
↓ ↓ 41. Which of the following is an example of post-transcrip-
C
tional gene repression?
↓ I1(g) I2(g) ↓ a. The amount of RNA transcribed from a gene is
D reduced because the DNA is methylated.
E1(g) E2(g) E3(g)
b. The amount of RNA is reduced because it is rapidly
degraded.
You introduce all four of these clones simultaneously into
tissue-culture cells and then stimulate individual aliquots c. The amount of protein made is reduced by action of
of these cells with one of the three factors. Gel analysis of an miRNA.
the RNA isolated from the cells gives the following results. d. b and c are both correct.
344
13
C h a p t e r
13.6 From flies to fingers, feathers, and floor plates: the many
roles of individual toolkit genes
469
470 CHA P TER 1 3 The Genetic Control of Development
O
f all the phenomena in biology, few if any inspire more awe than the for-
mation of a complex animal from a single-celled egg. In this spectacular
transformation, unseen forces organize the dividing mass of cells into a
form with a distinct head and tail, various appendages, and many organs. The
great geneticist Thomas Hunt Morgan was not immune to its aesthetic appeal:
Yet, for all its beauty and fascination, biologists were stumped for many
decades concerning how biological form is generated during development. Mor-
Homeotic mutants of gan also said that “if the mystery that surrounds embryology is ever to come
Drosophila melanogaster within our comprehension, we must . . . have recourse to other means than
description of the passing show.”
(a) The long drought in embryology lasted well beyond Morgan’s heyday in the
1910s and 1920s, but it was eventually broken by geneticists working very much
in the tradition of Morgan-style genetics and with his favorite, most productive
genetic model, the fruit fly Drosophila melanogaster.
The key catalysts to understanding the making of animal forms were the discov-
eries of genetic monsters—mutant fruit flies with dramatic alterations of body
structures (Figure 13-1). In the early days of Drosophila genetics, rare mutants arose
spontaneously or as by-products of other experiments with spectacular transforma-
(b) tions of body parts. In 1915, Calvin Bridges, then Morgan’s student, isolated a fly
having a mutation that caused the tiny hind wings (halteres) of the fruit fly to
resemble the large forewings. He dubbed the mutant bithorax. The transformation
in bithorax mutants is called homeotic (Greek homeos, meaning same or similar)
because one part of the body (the hind wing) is transformed to resemble another
(the forewing), as shown in Figure 13-1b. Subsequently, several more homeotic
mutants were identified in Drosophila, such as the dramatic Antennapedia mutant
in which legs develop in place of the antennae (Figure 13-1c).
(c) The spectacular effects of homeotic mutants inspired what would become a
revolution in embryology, once the tools of molecular biology became available to
understand what homeotic genes encoded and how they exerted such enormous
influence on the development of entire body parts. Surprisingly, these strange
fruit-fly genes turned out to be a passport to the study of the entire animal king-
dom, as counterparts to these genes were discovered that played similar roles in
almost all animals.
The study of animal and plant development is a very large and still-growing
F i g u r e 13 -1 In homeotic mutants, discipline. As such, we do not attempt a comprehensive overview of embryology.
the identity of one body structure has Rather, in this chapter, we will focus on a few general concepts that illustrate the
been changed into another. (a) Normal logic of the genetic control of animal development. We will explore how the infor-
fly with one pair of forewings on the mation for building complex structures is encoded in the genome. In contrast to
second thoracic segment and one pair the control of gene regulation in single bacterial or eukaryotic cells, the genetic
of small hind wings on the third thoracic control of body formation and body patterning is fundamentally a matter of gene
segment. (b) Triple mutant for three
regulation in three-dimensional space and over time. Yet we will see that the prin-
mutations in the Ultrabithorax gene.
Ubx function is lost in the posterior ciples governing the genetic control of development are connected to those
thorax, which causes the development already presented in Chapters 11 and 12, governing the physiological control of
of forewings in place of the hind wings. gene expression in bacteria and single-celled eukaryotes.
(c) Antennapedia mutant in which the
antennae are transformed into legs.
[Sean Carroll.] 1 T. H. Morgan, Experimental Embryology. Columbia University Press, 1927.
13.1 The Genetic Approach to Development 471
(a)
Transplant
(b)
Model
Organism Drosophila
Mutational Analysis of Early Drosophila The development of the Drosophila adult body pat-
Development tern takes a little more than a week (see diagram). Small
populations of cells set aside during embryogenesis prolif-
The initial insights into the genetic control of pattern for- erate during three larval stages (instars) and differentiate
mation emerged from studies of the fruit fly Drosophila in the pupal stage into adult structures. These set-aside
melanogaster. Drosophila development has proved to be a cells include the imaginal disks, which are disk-shaped
gold mine to researchers because developmental prob- regions that give rise to specific appendages and tissues in
lems can be approached by the use of genetic and molecu- each segment as the leg, wing, eye, and antennal disks.
lar techniques simultaneously. Imaginal disks are easy to remove for analysis of gene
The Drosophila embryo has been especially important expression (see Figure 13-7).
in understanding the formation of the basic animal body Genes that contribute to the Drosophila body plan can
plan. One important reason is that an abnormality in the be cloned and characterized at the molecular level with
body plan of a mutant is easily identified in the larval exo- ease. The analysis of the cloned genes often provides valu-
skeleton in the Drosophila embryo. The larval exoskeleton is able information on the function of the protein product—
a noncellular structure, made of a polysaccharide polymer usually by identifying close relatives in amino acid sequence
called chitin that is produced as a secretion of the epidermal of the encoded polypeptide through comparisons with all
cells of the embryo. Each structure of the exoskeleton is the protein sequences stored in public databases. In addi-
formed from epidermal cells or cells immediately underly- tion, one can investigate the spatial and temporal patterns
ing that structure. With its intricate pattern of hairs, inden- of expression of (1) an mRNA, by using histochemically
tations, and other structures, the exoskeleton provides tagged single-stranded DNA sequences complementary to
numerous landmarks to serve as indicators of the fates the mRNA to perform RNA in situ hybridization, or (2) a
assigned to the many epidermal cells. In particular, there protein, by using histochemically tagged antibodies that
are many distinct anatomical structures along the antero- bind specifically to that protein.
posterior (A–P) and dorsoventral (D–V) axes (see p. 481).
Furthermore, because all the nutrients necessary to develop
to the larval stage are prepackaged in the egg, mutant
Using Knowledge from One Model Organism
embryos in which the A–P or D–V cell fates are drastically to Fast-Track Developmental Gene Discovery
altered can nonetheless develop to the end of embryogene- in Others
sis and produce a mutant larva in about 1 day (see diagram). With the discovery that there are numerous homeobox
The exoskeleton of such a mutant larva mirrors the mutant genes within the Drosophila genome, similarities among
fates assigned to subsets of the epidermal cells and can thus the DNA sequences of these genes could be exploited in
identify genes worthy of detailed analysis. treasure hunts for other members of the homeotic-gene
molecules required for a process. Second, the (limited) quantity of a gene product
is no impediment: all genes can be mutated regardless of the amount of product
made by a gene. And, third, the genetic approach can uncover phenomena for
which there is no biochemical or other bioassay.
From the genetic viewpoint, there are four key questions concerning the num-
ber, identity, and function of genes taking part in development:
1. Which genes are important in development?
2. Where in the developing animal and at what times are these genes active?
3. How is the expression of developmental genes regulated?
4. Through what molecular mechanisms do gene products affect development?
Egg
Embry
ogen
esis
(1 da
y)
Adult
Eclosion Larva
Imaginal disks
4 days
1st instar (1day)
family. These hunts depend on DNA base-pair comple- (Southern blots of restriction-enzyme-digested DNA from
mentarity. For this purpose, DNA hybridizations were car- different animals), by using radioactive Drosophila homeo-
ried out under moderate stringency conditions, in which box DNA as the probe. This approach led to the discovery
there could be some mismatch of bases between the of homologous homeobox sequences in many different
hybridizing strands without disrupting the proper hydro- animals, including humans and mice. (Indeed, it is a very
gen bonding of nearby base pairs. Some of these treasure powerful approach for “fishing” for relatives of almost any
hunts were carried out in the Drosophila genome itself, in gene in your favorite organism.) Now homologous genes
looking for more family members. Others searched for are typically identified by computational searches of
homeobox genes in other animals, by means of zoo blots genome sequences (Chapter 14).
of rearing, rapid life cycle, cytogenetics, and decades of classical genetic analysis
(including the isolation of many very dramatic mutants) provided important
experimental advantages (see the Model Organism box on Drosophila above). The
nematode worm Caenorhabditis elegans also presented many attractive features,
most particularly its simple construction and well-studied cell lineages. Among
vertebrates, the development of targeted gene disruption techniques opened up
the laboratory mouse to more systematic genetic study, and the zebrafish Danio
rerio has recently become a favorite model owing to the transparency of the
embryo and to advances in its genetic study. Among plants, Arabidopsis thaliana
has played a similar role as Drosophila in illuminating fundamental mechanisms
in plant development.
Through systematic and targeted genetic analysis, as well as comparative
genomic studies, much of the genetic toolkit for the development of the bodies,
body parts, and cell types of several different animal species has been defined. We
will first focus on the genetic toolkit of Drosophila melanogaster because its identi-
fication was a source of major insights into the genetic control of development; its
discovery catalyzed the identification of the genetic toolkit of other animals,
including humans.
474 CHA P TER 1 3 The Genetic Control of Development
(a) (b)
Figure 13-3 A late-nineteenth-century drawing from one of the first studies of homeotic
transformations in nature. (a) Homeosis in a sawfly, with the left antenna transformed into a
leg. (b) Homeosis in a frog. The middle specimen is normal. The specimen on the left has
extra structures growing out of the top of the vertebral column. The specimen on the right has
an extra set of vertebrae. [From W. Bateson, Material for the Study of Variation. Macmillan, 1894.]
tions have been observed in many species in nature, including sawflies in which a
leg forms in place of an antenna and frogs in which a thoracic vertebra forms in
place of a cervical vertebra (Figure 13-3). Whereas only one member of a bilateral
pair of structures is commonly altered in many naturally occurring variants, both
members of a bilateral pair of structures are altered in homeotic mutants of fruit
flies. In the former case, the alteration is not heritable, but homeotic mutants
breed true from generation to generation.
The scientific fascination with homeotic mutants stems from three properties.
First, it is amazing that a single gene mutation can alter a developmental pathway
so dramatically. Second, it is striking that the structure formed in the mutant is a
well-developed likeness of another body part. And, third, it is important to note that
homeotic mutations transform the identity of serially reiterated structures.
Insect and many animal bodies are made of repeating parts of similar structure, like
building blocks, arranged in a series. The forewings and hind wings, the segments,
and the antennae, legs, and mouthparts of insects are sets of serially reiterated body
parts. Homeotic mutations transform identities within these sets.
A mutation may cause a loss of homeotic gene function where the gene nor-
mally acts or it may cause a gain of homeotic function where the homeotic gene
does not normally act. For example, the Ultrabithorax (Ubx) gene acts in the devel-
oping hind wing to promote hind-wing development and to repress forewing
development. Loss-of-function mutations in Ubx transform the hind wing into a
forewing. Dominant gain-of-function mutations in Ubx transform the forewing
into a hind wing. Similarly, the antenna-to-leg transformations of Antennapedia
(Antp) mutants are caused by the dominant gain of Antp function in the antenna.
In addition to these transformations in appendage identity, homeotic mutations
can transform segment identity, causing one body segment of the adult or larva to
resemble another.
476 CHA P TER 1 3 The Genetic Control of Development
Extract antibodies
Fixed embryos (IgG) to protein.
or dissected tissue
Add enzyme
substrate.
In the developing embryo, the Hox genes are expressed in spatially restricted,
sometimes overlapping domains within the embryo (Figure 13-6). The genes are
also expressed in the larval and pupal tissues that will give rise to the adult body
parts.
The patterns of Hox-gene expression (and other toolkit genes) generally cor-
relate with the regions of the animal affected by gene mutations. For example, the
dark blue shading in Figure 13-6 indicates where the Ubx gene is expressed. This
478 CHA P TER 1 3 The Genetic Control of Development
(c) (d)
lab NNSGRTNFTNKQLTELEKEFHFNRYLTRARRIEIANTLQLNETQVKIWFQNRRMKQKKRV
pb PRRLRTAYTNTQLLELEKEFHFNKYLCRPRRIEIAASLDLTERQVKVWFQNRRMKHKRQT
Dfd PKRQRTAYTRHQILELEKEFHYNRYLTRRRRIEIAHTLVLSERQIKIWFQNRRMKWKKDN
Scr TKRQRTSYTRYQTLELEKEFHFNRYLTRRRRIEIAHALCLTERQIKIWFQNRRMKWKKEH
Antp RKRGRQTYTRYQTLELEKEFHFNRYLTRRRRIEIAHALCLTERQIKIWFQNRRMKWKKEN
Ubx RRRGRQTYTRYQTLELEKEFHTNHYLTRRRRIEMAHALCLTERQIKIWFQNRRMKLKKEI
abd-A RRRGRQTYTRFQTLELEKEFHFNHYLTRRRRIEIAHALCLTERQIKIWFQNRRMKLKKEL
abd-B VRKKRKPYSKFQTLELEKEFLFNAYVSKQKRWELARNLQLTERQVKIWFQNRRMKNKKNS
Consensus -RRGRT-YTR-QTLELEKEFHFNRYLTRRRRIEIAHALCLTERQIKIWFQNRRMK-KKE-
sequence Helix 1 Helix 2 Helix 3
F i g u r e 13 - 8 Sequences of fly
homeodomains. All eight Drosophila Hox
found that all eight Hox genes of the two complexes were similar enough to genes encode proteins containing a highly
hybridize to each other. This hybridization was found to be due to a short region conserved 60 amino acid domain, the
of sequence in each gene, 180 bp in length. This stretch of DNA sequence similar- homeodomain, composed of three α
ity, because of its presence in homeotic genes, was dubbed the homeobox. The helices. Helices 2 and 3 form a helix-turn-
helix motif similarly to the Lac repressor,
homeobox encodes a protein domain, the homeodomain, containing 60 amino
Cro, and other DNA-binding proteins.
acids. The amino acid sequence of the homeodomain is very similar among the Residues common to the Hox genes are
Hox proteins (Figure 13-8). shaded in yellow; divergent residues are
Although the discovery of a common protein motif in each of the Hox proteins shaded in red; those common to subsets
was very exciting, further analysis of the structure of the homeodomain revealed of proteins are shaded in blue or green.
that it forms a helix-turn-helix motif—the structure common to the Lac repressor, [ S. B. Carroll, J. K. Grenier, and S. D.
the λ repressor, Cro, and the α2 and a1 regulatory proteins of the yeast mating-type Weatherbee, From DNA to Diversity: Molecular
Genetics and the Evolution of Animal Design,
loci! This similarity suggested immediately (and it was subsequently borne out)
2nd ed. Blackwell, 2005.]
that Hox proteins are sequence-specific DNA-binding proteins and that they exert
their effects by controlling the expression of genes within developing segments and
appendages. Thus, the products of these remarkable genes function through prin-
ciples that are already familiar from Chapters 11 and 12—by binding to regulatory
elements of other genes to activate or repress their expression. We will see that it is
also true of many other toolkit genes: a significant fraction of these genes encode
transcription factors that control the expression of other genes.
We will examine how Hox proteins and other toolkit proteins orchestrate gene
expression in development a little later. First, there is one more huge discovery to
describe, which revealed that what we learn from fly Hox genes has very general
implications for the animal kingdom.
Mouse
Hoxb b-1 b-2 b-3 b-4 b-5 b-6 b-7 b-8 b-9 b-13
Mouse
Hoxc c-4 c-5 c-6 c-8 c-9 c-10 c-11 c-12 c-13
Mouse
Hoxd
F i g u r e 13 -10 Like those of the fruit fly, d-1 d-3 d-4 d-8 d-9 d-10 d-11 d-12 d-13
vertebrate Hox genes are organized in
clusters and expressed along the (b)
anteroposterior axis. (a) In the mouse, four
complexes of Hox genes, comprising Mouse embryo
39 genes in all, are present on four
different chromosomes. Not every gene is
represented in each complex; some have
been lost in the course of evolution.
(b) The Hox genes are expressed in
distinct domains along the anteroposterior
axis of the mouse embryo. The color
shading represents the different groups of
genes shown in part a. [ S. B. Carroll,
“Homeotic Genes and the Evolution of
Arthropods and Chordates,” Nature 376, 1995,
479–485.]
13.2 The Genetic Toolkit for Drosophila Development 4 81
the order of their most related counterparts in the fly Hox complexes, as well as in Dorsoventral
each of the other mouse Hox clusters (Figure 13-10a). This correspondence indi- axis
cates that the Hox complexes of insects and vertebrates are related and that some Anteroposterior
form of Hox complex existed in their distant common ancestor. The four Hox com- axis
plexes in the mouse arose by duplications of entire Hox complexes (perhaps of
entire chromosomes) in vertebrate ancestors.
Why would such different animals have these sets of genes in common? Their
deep, common ancestry indicates that Hox genes play some fundamental role in
the development of most animals. That role is apparent from analyses of how the
Hox genes are expressed in different animals. In vertebrate embryos, adjacent Dorsal
Hox genes also are expressed in adjacent or partly overlapping domains along the Anteroposterior Dorsoventral
anteroposterior body axis. Furthermore, the order of the Hox genes in the com- axis axis
plexes corresponds to the head-to-tail order of body regions in which the genes
are expressed (Figure 13-10b). Anterior Posterior
The Hox-gene expression patterns of vertebrates suggested that they also
specify the identity of body regions, and subsequent analyses of Hox-gene mutants Ventral
have borne this suggestion out. For example, mutations in the Hoxa11 and Hoxd11
The relationship between adult and
genes cause the homeotic transformation of sacral vertebrae to lumbar vertebrae embryonic body axes.
(Figure 13-11). Thus, as in the fly, the loss or gain of function of Hox genes in ver-
tebrates causes transformation of the identity of serially repeated structures.
Such results have been obtained in several classes, including mammals, birds,
amphibians, and fish. Furthermore, clusters of Hox genes have been shown to
govern the patterning of other insects and to be deployed in regions along the
anteroposterior axis in annelids, molluscs, nematodes, various arthropods, primi-
tive chordates, flatworms, and other animals. Therefore, despite enormous
differences in anatomy, the possession of one or more clusters of Hox genes that
are deployed in regions along the main body axis is a common, fundamental fea-
ture of at least all bilateral animals. Indeed, the surprising lessons from the Hox
genes portended what turned out to be a general trend among toolkit genes; that
is, most toolkit genes are common to different animals.
Now let’s take an inventory of the rest of the toolkit to see what other general
principles emerge.
Parents Offspring
m /+ m /+ m /m, m /+, +/+ all normal
m /m m /+ m /m, m /+ all normal
+/+, m /+, or m /m m /m m /+, m /m all mutant
phenotype
F i g u r e 13 -12 Genetic screens identify
whether a gene product functions in the ZYGOTICALLY REQUIRED GENES
egg or in the zygote. The phenotypes of
Parents Offspring
offspring depend on either (top) the
maternal genotype for maternal-effect m /+, +/+ normal
genes or (bottom) the offspring (zygotic) m /+ m /+
m /m mutant
genotype for zygotically required genes phenotype
( m, mutant; +, wild type).
13.3 Defining the Entire Toolkit 4 8 3
The power of the genetic screens was their systematic nature. By saturating each of
a fly’s chromosomes (except the small fourth chromosome) with chemically Bicoid mutants are
induced mutations, the researchers were able to identify most genes that were missing the anterior region
required for the building of the fly. For their pioneering efforts, Nüsslein-Volhard,
Wieschaus, and Lewis shared the 1995 Nobel Prize in Physiology or Medicine.
The most striking and telling features of the newly identified mutants were
that they showed dramatic but discrete defects in embryo organization or pattern-
ing. That is, the dead larva was not an amorphous carcass but exhibited specific,
often striking patterning defects. The Drosophila larval body has various features
whose number, position, or pattern can serve as landmarks to diagnose or classify
the abnormalities in mutant animals. Each locus could thus be classified accord-
ing to the body axis that it affected and the pattern of defects caused by muta-
tions. Genetic crosses reveal whether the locus is active in the maternal egg or the
zygote. Each class of genes appeared to represent different steps in the progres-
sive refinement of the embryonic body plan—from those that affect large regions
of the embryo to those with more limited realms of influence.
For any toolkit gene, three pieces of information are key toward understand-
ing gene function: (1) the mutant phenotype, (2) the pattern of gene expression,
and (3) the nature of the gene product. Extensive study of a few dozen genes has F i g u r e 13 -13 The Bicoid (bcd )
maternal-effect gene affects the anterior
led to a fairly detailed picture of how each body axis is established and subdivided
part of the developing larva. These
into segments or germ layers. photomicrographs are of Drosophila
larvae that have been prepared to show
The anteroposterior and dorsoventral axes their hard exoskeletons. Dense
structures, such as the segmental
A few dozen genes are required for proper organization of the anteroposterior denticle bands, appear white. (Left ) A
body axis of the fly embryo. The genes are grouped into five classes on the basis normal larva. (Right ) A larva from a
homozygous bcd mutant female. Head
of their realm of influence on embryonic pattern.
and anterior thoracic structures are
• The first class sets up the anteroposterior axis and consists of the maternal- missing. [From C. H. Nüsslein-Volhard,
effect genes. A key member of this class is the Bicoid gene. Embryos from Bicoid G. Frohnhöfer, and R. Lehmann,
“Determination of Anteroposterior Polarity in
mutant mothers are missing the anterior region of the embryo (Figure 13-13),
Drosophila,” Science 238, 1987, 1678.
telling us that the gene is required for the development of that region. Reprinted with permission from AAAS.]
The next three classes are zygotically active genes required for the develop-
ment of the segments of the embryo.
• The second class contains the gap genes. Each of these genes affects the for-
mation of a contiguous block of segments; mutations in gap genes lead to large
gaps in segmentation (Figure 13-14, left).
• The third class comprises the pair-rule genes, which act at a double-segment
periodicity. Pair-rule mutants are missing part of each pair of segments, but
different pair-rule genes affect different parts of each double segment. For ex-
ample, the even-skipped gene affects one set of segmental boundaries, and the
odd-skipped gene affects the complementary set of boundaries (Figure 13-14,
middle).
• The fourth class consists of the segment-polarity genes, which affect pattern-
ing within each segment. Mutants of this class display defects in segment polar-
ity and number (Figure 13-14, right).
to four cells wide (pair-rule proteins), and then in stripes from one to two cells
Expression of
wide (segment-polarity proteins).
anteroposterior-axis-
In addition to what we have learned from the spatial patterns of toolkit-gene
patterning proteins
expression, the order of toolkit-gene expression over time is logical. The maternal-
effect Bicoid protein appears before the zygotic gap proteins, which are expressed (a)
before the 7-striped patterns of pair-rule proteins appear, which in turn precede
the 14-striped patterns of segment-polarity proteins. The order of gene expression
and the progressive refinement of domains within the embryo reveal that the
making of the body plan is a step-by-step process, with major subdivisions of the
body outlined first and then refined until a fine-grain pattern is established.
The order of gene action further suggests that the expression of one set of genes (b)
might govern the expression of the succeeding set of genes.
One clue that this progression is indeed the case comes from analyzing the
effects of mutations in toolkit genes on the expression of other toolkit genes. For
example, in embryos from Bicoid mutant mothers, the expression of several gap
genes is altered, as well as that of pair-rule and segment-polarity genes. This find-
ing suggests that the Bicoid protein somehow (directly or indirectly) influences (c)
the regulation of gap genes.
Another clue that the expression of one set of genes might govern the expres-
sion of the succeeding set of genes comes from examining the protein products.
Inspection of the Bicoid protein sequence reveals that it contains a homeodomain,
related to but distinct from those of Hox proteins. Thus, Bicoid has the properties of
a DNA-binding transcription factor. Each gap gene also encodes a transcription fac-
(d)
tor, as does each pair-rule gene, several segment-polarity genes, and, as described
earlier, all Hox genes. These transcription factors include representatives of most
known families of sequence-specific DNA-binding proteins; so, although there is no
restriction concerning to which family they may belong, many early-acting toolkit
proteins are transcription factors. Those that are not transcription factors tend to be
components of signaling pathways (Table 13-1). These pathways, shown in generic
form in Figure 13-16, mediate ligand-induced signaling processes between cells, and F i g u r e 13 -15 Patterns of
their output generally leads to gene activation or repression. Thus, most toolkit pro- toolkit-gene expression correspond to
teins either directly (as transcription factors) or indirectly (as components of signal- mutant phenotypes. Drosophila
ing pathways) affect gene regulation. embryos have been stained with
antibodies to the (a) maternally
derived Bicoid protein, (b) Krüppel
K e y C o n c e p t Most toolkit proteins are transcription factors or components of gap protein, (c) Hairy pair-rule protein,
ligand-mediated signal-transduction pathways. and (d) Engrailed segment-polarity
protein and visualized by
immunoenzymatic (staining is brown)
The same principles apply to the making of the dorsoventral body axis as (a) or immunofluorescence (staining is
apply to the anteroposterior axis. The dorsoventral axis also is subdivided into green) (b–d ) methods. Each protein is
localized to nuclei in regions of the
regions. Several maternal-effect genes, such as dorsal, are required to establish embryo that are affected by mutations
these regions at distinct positions from the dorsal (top) to ventral (bottom) side of in the respective genes.
the embryo. Dorsal mutants are “dorsalized” and lack ventral structures (such as [ Photomicrographs courtesy of (a) Ruth
the mesoderm and nervous system). A handful of genes that are activated in the Lehmann and (b–d) James Langeland.]
zygote are also required for the subdivision of the dorsoventral axis.
The product of the dorsal maternal-effect gene is a transcription factor—the
Dorsal protein. This protein is expressed in a gradient along the dorsoventral axis,
with its highest level of accumulation in ventral cells (Figure 13-17a). The gradient
establishes subregions of differing Dorsal concentration. In each subregion, a differ-
ent set of zygotic genes are expressed that contribute to dorsoventral patterning.
The sets of zygotic genes expressed define regions that give rise to particular tissue
layers, such as the mesoderm and neuroectoderm (the part of the ectoderm that
gives rise to the ventral nervous system), as shown in Figure 13-17b.
48 6 CHA P TER 1 3 The Genetic Control of Development
Table 13-1
Examples of Drosophila A–P Axis Genes That Contribute to Pattern Formation
Role(s) in early
Gene symbol Gene Name Protein function development
hb-z hunchback-zygotic Transcription factor—zinc-finger protein Gap gene
Kr Krüppel Transcription factor—zinc-finger protein Gap gene
kni knirps Transcription factor—steroid receptor-type protein Gap gene
eve even-skipped Transcription factor—homeodomain protein Pair-rule gene
ftz fushi tarazu Transcription factor—homeodomain protein Pair-rule gene
opa odd-paired Transcription factor—zinc-finger protein Pair-rule gene
prd paired Transcription factor—PHOX protein Pair-rule gene
en engrailed Transcription factor—homeodomain protein Segment-polarity gene
wg wingless Signaling WG protein Segment-polarity gene
hh hedgehog Signaling HH protein Segment-polarity gene
ptc patched Transmembrane protein Segment-polarity gene
lab labial Transcription factor—homeodomain protein Segment-identity gene
Dfd Deformed Transcription factor—homeodomain protein Segment-identity gene
Antp Antennapedia Transcription factor—homeodomain protein Segment-identity gene
Ubx Ultrabithorax Transcription factor—homeodomain protein Segment-identity gene
Receptor
Extracellular
Cell membrane
Cytoplasm
Inactive
F i g u r e 13 -16 Most signaling TF
pathways operate through similar logic but
have different protein components and
Activation of Activation or translocation
signal-transduction mechanisms. Signaling phosphorylation of transcription factor
begins when a ligand binds to a cascade to nucleus
membrane-bound receptor, leading to the
P
release or activation of intracellular
Active TF Active TF
proteins. Receptor activation often leads
to the modification of inactive transcription Nuclear envelope
factors (TF). The modified transcription
factors are translocated to the cell Nucleus
nucleus, where they bind to cis-acting
regulatory DNA sequences or to DNA-
Binding to cis-acting regulatory sequences
binding proteins and regulate the level of
target-gene transcription. [S. B. Carroll, J. K. TF
Grenier, and S. D. Weatherbee, From DNA to
Diversity: Molecular Genetics and the Evolution
Enhancer Promoter
of Animal Design, 2nd ed. Blackwell, 2005.]
13.4 Spatial Regulation of Gene Expression in Development 4 87
Fragments
containing
A, B, or C
Promoter Reporter gene
Fly embryos
the eve transcription unit. (d) Within the eve Hunchback protein
stripe 2 element, several binding sites exist Krüppel protein
for each transcription factor (repressors are Eve stripe 2
shown above the element, activators protein
below). The net output of this combination
of activators and repressors is expression
of the narrow eve stripe.
Activators
Bcd-5 Bcd-4 Bcd-3 Bcd-2 Hb-3 Bcd-1
Specifically, the eve stripe 2 element contains multiple sites for the maternal
Bicoid protein and the Hunchback, Giant, and Krüppel gap proteins (Figure 13-20d).
Mutational analyses of different combinations of binding sites revealed that
Bicoid and Hunchback activate the expression of the eve stripe 2 element over a
broad region. The Giant and Krüppel proteins are repressors that sharpen the
boundaries of the stripe to just a few cells wide. The eve stripe 2 element acts,
then, as a genetic switch, integrating multiple regulatory protein activities to pro-
duce one stripe from three to four cells wide in the embryo.
The entire seven-striped periodic pattern of even-skipped expression is the sum
of different sets of inputs into separate cis-acting regulatory elements. The enhanc-
ers for other stripes contain different combinations of protein binding sites.
Cl
A2
An A1
Md Mx T2 T3
Lb T1
Dll derepressed in A1–A7
A8 A7 A6 A5
A4
(a) Wild type
A3
Slp Hox1 Exd En Hth Hox2 Cl
An
A2
Md Mx A1
Lb T1 T2 T3
Repressed in A1–A7
A8 A7 A6 A5
A4
(b) Hox mutations
A3
X X Cl
An
A2
Md Mx A1
Lb T1 T2 T3
Derepressed in A1–A7
A8 A7 A6 A5
(c) Slp mutation A4
A3
X Cl
An
A2
Md Mx Lb T1 T2 T3 A1
Derepressed in aA1–aA7
A8 A7 A6 A5
(d) En mutation A4
A3
X Cl
An
A2
Md Mx Lb T1 T2 T3 A1
Derepressed in pA1–pA7
A8 A7 A6 A5
(e) Slp, En mutations A4
A3
X X Cl
An
A2
Md Mx A1
Lb T1 T2 T3
Derepressed in A1–A7
A8 A7 A6 A5
A4
(f) Exd, Hth mutations
A3
X X Cl
An
A2
Md Mx A1
Lb T1 T2 T3
Derepressed in A1–A7
the product is a specific, longer isoform, DsxM, that contains a unique C-terminal
region of 150 amino acids not found in the female-specific isoform DsxF, which
instead contains a unique 30 amino acid sequence at its carboxyl terminus. Each
form of the Dsx protein is a DNA-binding transcription factor that apparently
binds the same DNA sequences. However, the activities of the two isoforms differ:
DsxF activates certain target genes in females that DsxM represses in males.
The alternative forms of the Dsx protein are generated by alternative splicing
of the primary dsx RNA transcript. Thus, in this case, the choice of splice sites
must be regulated to produce mature mRNAs that encode different proteins. The F i g u r e 13 -2 3 Three pre-mRNAs of
various genetic factors that influence Dsx expression and sex determination have major Drosophila sex-determining genes
are alternatively spliced. The female-
been identified by mutations that affect the sexual phenotype.
specific pathway is shown on the left and
One key regulator is the product of the transformer (tra) gene. Whereas null the male-specific pathway shown on the
mutations in tra have no effect on males, XX female flies bearing tra mutations right. The pre-mRNAs are identical in both
are transformed into the male phenotype. The Tra protein is an alternative splic- sexes and shown in the middle. In the
ing factor that affects the splice choices in the dsx RNA transcript. In the presence male Sex-lethal and transformer mRNAs,
of Tra (and a related protein Tra2), a splice occurs that incorporates exon 4 of the there are stop codons that terminate
dsx gene into the mature dsxF transcript (Figure 13-23), but not exons 5 and 6. translation. These sequences are removed
by splicing to produce functional proteins
Males lack the Tra protein; so this splice does not occur, and exons 5 and 6 are
in the female. The Transformer and Tra-2
incorporated into the dsxM transcript, but not exon 4. proteins then splice the female doublesex
The Tra protein explains how alternative forms of Dsx are expressed, but how pre-mRNA to produce the female-specific
is Tra expression itself regulated to differ in females and males? The tra RNA itself isoform of the Dsx protein, which differs
is alternatively spliced. In females, a splicing factor encoded by the Sex-lethal (Sxl ) from the male-specific isoform by the
alternative splicing of several exons.
Female-specific “Default”
Sex-lethal
splicing splicing
1 2 4 5 6 7 8 AAA 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 AAA
Male exon
Stop codon
Sex-lethal
transformer
1 3 4 AAA 1 2 3 4 1 2 3 4 AAA
Stop codon
Tra-2
Transformer
doublesex
1 2 3 4 AAA 1 2 3 4 5 6 1 2 3 5 6 AAA
gene is present. This splicing factor binds to the tra RNA and prevents a splicing
event that would otherwise incorporate an exon that contains a stop codon. In
males, no Tra protein is made because this stop codon is present.
The production of the Sex-lethal protein is, in turn, regulated both by RNA
splicing and by factors that alter the level of transcription. The level of Sxl transcrip-
tion is governed by activators on the X chromosome and repressors on the auto-
somes. In females, Sxl activation prevails and the Sxl protein is produced, which
regulates tra RNA splicing and feeds back to regulate the splicing of Sxl RNA itself.
In females, a stop codon is spliced out so that Sxl protein production can continue.
However, in males, where no Sxl protein is present, the stop codon is still present in
the unspliced Sxl RNA transcript and no Sxl protein can be produced.
This cascade of sex-specific RNA splicing in D. melanogaster illustrates one
way that the sex-chromosome genotype leads to different forms of regulatory pro-
teins being expressed in one sex and not the other. Interestingly, the genetic regu-
lation of sex determination differs greatly between animal species, in that sexual
genotype can lead to differential expression of regulatory genes through distinctly
different paths. However, proteins related to Dsx do play roles in sexual differen-
tiation in a wide variety of animals, including humans. Thus, although there are
many ways to generate differential expression of transcription factors, a family of
similar proteins appear to underlie much sexual differentiation.
Model
Organism Caenorhabditis elegans
The Nematode Caenorhabditis elegans as a
Model for Cell-Lineage-Fate Decisions of this animal led Sydney Brenner to advance its use as a
In the past 20 years, studies of the nematode worm model organism. The adult worm contains about 1000
Caenorhabditis elegans (see Diagram 1) have greatly ad- somatic cells, and researchers, led by John Sulston, have
vanced our understanding of the genetic control of cell-lin- carefully mapped out the entire series of somatic-cell deci-
eage decisions. The transparency and simple construction sions that produce the adult animal.
Pharynx Ovary
Intestine Eggs Rectum
Anus
Oviduct Oocytes
Uterus
Vulva
Diagram 1 An adult hermaphrodite Caenorhabditis elegans, showing various organs.
Some of the lineage decisions, such as the formation of Diagram 2). Exhaustive genetic screens have identified
the vulva, have been key models of so-called inductive many components participating in signaling and signal
interactions in development, where signaling between transduction in the formation of the vulva.
cells induces cell-fate changes and organ formation (see
Uterus
1°
3° 2° 2° 3°
Hypodermis Hypodermis
Vulva
l Left
r Right
l r l r a p a p N No division
l r l r l r a Anterior
N p Posterior
Diagram 2 Production of the vulval-cell lineages. (a) The parts of the vulval anatomy that are occupied
by so-called primary (1°), secondary (2°), and tertiary (3°) cells. (b) The lineages or pedigrees of the
primary, secondary, and tertiary cells are distinguished by their cell-division patterns.
For some of the embryonic and larval cell divisions, led by Robert Horvitz, has revealed many components of
particularly those that will contribute to a worm’s nervous programmed-cell-death pathways common to most ani-
system, a progenitor cell gives rise to two progeny cells, one mals. Sydney Brenner, John Sulston, and Robert Horvitz
of which then undergoes programmed cell death. Analysis shared the 2002 Nobel Prize in Physiology or Medicine for
of mutants in which programmed cell death is aberrant, their pioneering work based on C. elegans.
49 8 CHA P TER 1 3 The Genetic Control of Development
Mutated SCR
SCR ABp
lacZ
XX ABa P2
Mutations EMS
(c)
(Figure 13-24a). This localized expression of GLP-1 is critical for establishing dis-
tinct fates. Mutations that abolish glp-1 function at the four-cell stage alter the fates
of ABp and ABa descendants.
GLP-1 is localized to the anterior cells by repressing its translation in the pos-
terior cells. The repression of GLP-1 translation requires sequences in the 3 ′ UTR
of the glp-1 mRNA—specifically, a 61-nucleotide region called the spatial control
region (SCR). The importance of the SCR has been demonstrated by linking
mRNA transcribed from reporter genes to different variants of the SCR. Deletion
of this region or mutation of key sites within it causes the reporter gene to be
expressed in all four blastomeres of the early embryo (Figure 13-24b).
On the basis of how we have seen transcription controlled, we might guess
that one or more proteins bind(s) to the SCR to repress translation of the glp-1
mRNA. To identify these repressor proteins, researchers isolated proteins that
bound to the SCR. One protein, GLD-1, binds specifically to a region of the SCR.
Furthermore, the GLD-1 protein is enriched in posterior blastomeres, just where
13.5 Post-transcriptional Regulation of Gene Expression in Development 49 9
subdivision of the brain hemispheres and the subdivision of the developing eye
into the left and right sides. When the function of the Shh gene is eliminated by
mutation in the mouse, these hemispheres and eye regions do not separate, and
the resulting embryo is cyclopic, with one central eye and a single forebrain (it also
lacks limb structures).
The dramatic and diverse roles of Shh are a striking example of the different
roles played by toolkit genes at different places and times in development. The
outcomes of Shh signaling are different in each case: the Shh signaling pathway
will induce the expression of one set of genes in the developing limb, a different
set in the feather bud, and yet another set in the floor plate. How are different cell
types and tissues able to respond differently to the same signaling molecule? The
outcome of Shh signaling depends on the context provided by other toolkit genes
that are acting at the same time.
Polydactyly
A fairly common syndrome in humans is the development of extra partial or com-
plete digits on the hands and feet. This condition, called polydactyly, arises in
about 5 to 17 of every 10,000 live births. In the most dramatic cases, the condition
is present on both hands and feet (Figure 13-27). Polydactyly occurs widely
throughout vertebrates—in cats, chickens, mice, and other species.
The discovery of the role of Shh in digit patterning led geneticists to investi-
gate whether the Shh gene was altered in polydactylous humans and other spe-
cies. In fact, certain polydactyly mutations are mutations of the Shh gene.
Importantly, the mutations are not in the coding region of the Shh gene; rather,
they lie in a cis-acting regulatory element, far from the coding region, that
Polydactyly in humans
controls Shh expression in the developing limb bud. The extra digits are induced
by the expression of Shh in a part of the limb where the gene is not normally
expressed. Mutations in cis-acting regulatory elements have two important prop-
erties that are distinct from mutations in coding regions. First, because they affect
regulation in cis, the phenotypes are often dominant. Second, because only one of
several cis-acting regulatory elements may be affected, other gene functions may
be completely normal. Polydactyly can occur without any collateral developmen-
tal problems. Coding mutations in Shh, however, tell a different story, as we will
see in the next section.
Holoprosencephaly
Mutations in the human Shh coding region also have been identified. The conse-
quent alterations in the Shh protein are associated with a syndrome termed holo-
prosencephaly, in which abnormalities occur in brain size, in the formation of the
nose, and in other midline structures. These abnormalities appear to be less severe
counterparts of the developmental defects observed in homozygous Shh mutant
mice. Indeed, the affected children seen in clinics are heterozygous. One copy of a
normal Shh gene appears to be insufficient for normal midline development (the
gene is haploinsufficient). Human fetuses homozygous for loss-of-function Shh
mutations very likely die in gestation with more severe defects.
Holoprosencephaly is not caused exclusively by Shh mutations. Shh is a ligand
in a signal-transduction pathway. As might be expected, mutations in genes
encoding other components of the pathway affect the efficiency of Shh signaling
and are also associated with holoprosencephaly. Several components of the
human Shh pathway were first identified as homologs of members of the fly path-
way, demonstrating once again both the conservation of the genetic toolkit and
the power of model systems for biomedical discovery.
Table 13-2
Some Toolkit Genes Having Roles in Cancer
Fly gene Mammalian gene Cancer type
Signaling-Pathway
Components
Wingless armadillo β-catenin Colon and skin
D.TCF TLF Colon
Hedgehog cubitus interruptus Gli1 Basal-cell carcinoma
patched patched Basal-cell-carcinoma, medulloblastoma
smoothened smoothened Basal-cell carcinoma
Notch Notch hNotch1 Leukemia, lymphoma
EGF receptor torpedo C-erbB-2 Breast and colon
Decapentaplegic/TFG-β Medea DPC4 Pancreatic and colon
Toll dorsal NF-κB Lymphoma
Other extradenticle Pbx1 Acute pre-B-cell leukemia
cancer and the development of new therapies. For example, about 30 percent of
mice heterozygous for a targeted mutation in the patched gene develop medullo-
blastoma. These mice therefore serve as an excellent model for the biology of
human disease and a testing platform for therapy. Many of the newest anticancer
agents employed today are in fact targeted toward components of signal-transduc-
tion pathways that are disrupted in certain types of tumors.
It is fair to say that even the most optimistic and farsighted researchers did not
expect that the discovery of the genetic toolkit for building a fly would have such
far-ranging effects on understanding human development and disease. But such
huge unforeseen dividends are familiar in the recent history of basic genetic
research. The advent of genetically engineered medicines, monoclonal antibodies
for diagnosis and therapy, and forensic DNA testing all had similar origins in
seemingly unrelated investigations.
s u mmary
In Chapter 11, we mentioned the quip from Jacques Monod functions of introns, RNA splicing, distant and multiple cis-
and François Jacob that “anything found to be true of E. coli acting regulatory elements, chromatin, alternative splicing,
must also be true of Elephants.” 2 Now that we have seen the and, more recently, miRNAs. Still, central to the genetic con-
regulatory processes that build worms, flies, mice, and ele- trol of development is the control of differential gene
phants, would we say that they were right? If Monod and expression.
Jacob were referring to the principle that gene transcription This chapter has presented an overview of the logic and
is controlled by sequence-specific regulatory proteins, we mechanisms for the control of gene expression and develop-
have seen that the bacterial Lac repressor and the fly Hox ment in fruit flies and a few other model species. We have
proteins do indeed act similarly. Moreover, their DNA- concentrated on the toolkit of animal genes for developmen-
binding proteins have the same type of motif. The funda- tal processes and the mechanisms that control the organiza-
mental insights that Jacob and Monod had concerning the tion of major features of the body plan—the establishment of
central role of the control of gene transcription in bacterial body axes, segmentation, and segment identity. Although we
physiology and that they expected would apply to cell dif- explored only a modest number of regulatory mechanisms in
ferentiation and development in complex multicellular depth, and just a few species, similarities in regulatory logic
organisms have been borne out in many respects in the and mechanisms allow us to identify some general themes
genetic control of animal development. concerning the genetic control of development.
Many features in single-celled and multicellular eukary-
otes, however, are not found in bacteria and their viruses. 1. Despite vast differences in appearance and anatomy, ani-
Geneticists and molecular biologists have discovered the mals have in common a toolkit of genes that govern development.
This toolkit is a small fraction of all genes in the genome,
2 F. Jacob and J. Monod, Cold Spring Harbor Quant. Symp. Biol. 26, 1963, 393. and most of these toolkit genes control transcription factors
50 4 CHA P TER 1 3 The Genetic Control of Development
and components of signal-transduction pathways. Individual regard to specificity, combinatorial mechanisms provide the
toolkit genes typically have multiple functions and affect the means to localize gene expression to discrete cell populations
development of different structures at different stages. by using inputs that are not specific to cell type or tissue type.
The actions of toolkit proteins can thus be quite specific in
2. The development of the growing embryo and its body parts
different contexts. In regard to diversity, combinatorial
takes place in a spatially and temporally ordered progression.
mechanisms provide the means to generate a virtually limit-
Domains within the embryo are established by the expres-
less variety of gene-expression patterns.
sion of toolkit genes that mark out progressively finer subdi-
visions along both embryonic axes.
4. The modularity of cis-acting regulatory elements allows for
3. Spatially restricted patterns of gene expression are products independent spatial and temporal control of toolkit-gene expres-
of combinatorial regulation. Each pattern of gene expression sion and function. Just as the operators and UAS elements of
has a preceding causal basis. New patterns are generated by prokaryotes and simple eukaryotes act as switches in the phys-
the combined inputs of preceding patterns. In the examples iological control of gene expression, the cis-acting regulatory
presented in this chapter, the positioning of pair-rule stripes elements of toolkit genes act as switches in the developmental
and the restriction of appendage-regulatory-gene expres- control of gene expression. The distinguishing feature of tool-
sion to individual segments requires the integration of nu- kit genes is the typical presence of numerous independent cis-
merous positive and negative regulatory inputs by cis-acting acting regulatory elements that govern gene expression in dif-
regulatory elements. ferent spatial domains and at different stages of development.
Post-transcriptional regulation at the RNA level adds The independent spatial and temporal regulation of gene ex-
another layer of specificity to the control of gene expres- pression enables individual toolkit genes to have different but
sion. Alternative RNA splicing and translational control by specific functions in different contexts. In this light, it is not
proteins and miRNAs also contribute to the spatial and tem- adequate or accurate to describe a given toolkit-gene function
poral control of toolkit-gene expression. solely in relation to the protein (or miRNA) that it encodes
Combinatorial control is key to both the specificity and because the function of the gene product almost always de-
the diversity of gene expression and toolkit-gene function. In pends on the context in which it is expressed.
k e y terms
gap gene (p. 483) homeodomain (p. 479) pair-rule gene (p. 483) serially reiterated structure
gene complex (p. 476) housekeeping gene (p. 474) positional information (p. 475)
genetic toolkit (p. 473) Hox gene (p. 476) (p. 487) zygote (p. 482)
homeobox (p. 479) maternal-effect gene (p. 482) segment-polarity gene (p. 483)
s o lv ed pr o b l ems
SOLVED PROBLEM 1. The Bicoid gene (bcd ) is a maternal- mothers with three copies of bcd+, it is even more posterior.
effect gene required for the development of the Drosophila As additional gene doses are added, the cephalic furrow
anterior region. A mother heterozygous for a bcd deletion has moves more and more posteriorly, until, in the progeny of
only one copy of the bcd gene. With the use of P elements to mothers with six copies of bcd+, it is midway along the A–P
insert copies of the cloned bcd+ gene into the genome by axis of the embryo. Explain the gene-dosage effect of bcd+ on
transformation, it is possible to produce mothers with extra the formation of the cephalic furrow in light of the contribu-
copies of the gene. The early Drosophila embryo develops an tion that bcd makes to A–P pattern formation.
indentation called the cephalic furrow that is more or less
perpendicular to the longitudinal, anteroposterior (A–P) Solution
body axis. In the progeny of mothers with only a single copy The determination of anterior–posterior parts of the embryo
of bcd+, this furrow is very close to the anterior tip, lying at a is governed by a concentration gradient of Bicoid protein.
position one-sixth of the distance from the anterior to the The furrow develops at a critical concentration of bcd. As
posterior tip. In the progeny of standard wild-type diploids bcd+ gene dosage (and, therefore, Bicoid protein concentra-
(having two copies of bcd+), the cephalic furrow arises more tion) decreases, the furrow shifts anteriorly; as the gene dos-
posteriorly, at a position one-fifth of the distance from the age increases, the furrow shifts posteriorly.
anterior to the posterior tip of the embryo. In the progeny of
Problems 50 5
pr o b l ems
Most of the problems are also available for review/grading through the http://www.whfreeman.com/
launchpad/iga11e.
W o r k i n g with the F ig u res mally. When the gap gene kni is mutant, the fifth and
1. In Figure 13-2, the transplantation of certain regions of sixth ftz stripes do not form normally. Explain these re-
embryonic tissue induces the development of structures sults in regard to how segment number is established in
in new places. What are these special regions called, and the embryo.
what are the substances they are proposed to produce? 14. Some of the mammalian Hox genes have been shown to
2. In Figure 13-5, two different methods are illustrated for be more similar to one of the insect Hox genes than to the
visualizing gene expression in developing animals. others. Describe an experimental approach that would
Which method would allow one to detect where within a enable you to demonstrate this finding in a functional
cell a protein is localized? test in living flies.
3. Figure 13-7 illustrates the expression of the Ultrabithorax 15. The three homeodomain proteins Abd-B, Abd-A, and
(Ubx) Hox protein in developing flight appendages. Ubx are encoded by genes within the Bithorax complex
What is the relationship between where the protein is of Drosophila. In wild-type embryos, the Abd-B gene is
expressed and the phenotype resulting from the loss of expressed in the posterior abdominal segments, Abd-A
its expression (shown in Figure 13-1)? in the middle abdominal segments, and Ubx in the ante-
4. In Figure 13-11, what is the evidence that vertebrate Hox rior abdominal and posterior thoracic segments. When
genes govern the identity of serially repeated structures? the Abd-B gene is deleted, Abd-A is expressed in both the
5. As shown in Figure 13-14, what is the fundamental dis- middle and the posterior abdominal segments. When
tinction between a pair-rule gene and a segment-polarity Abd-A is deleted, Ubx is expressed in the posterior thorax
gene? and in the anterior and middle abdominal segments.
When Ubx is deleted, the patterns of Abd-A and Abd-B
6. In Table 13-1, what is the most common function of pro- expression are unchanged from wild type. When both
teins that contribute to pattern formation? Why is this Abd-A and Abd-B are deleted, Ubx is expressed in all seg-
the case? ments from the posterior thorax to the posterior end of
7. In Figure 13-20, which gap protein regulates the posteri- the embryo. Explain these observations, taking into con-
or boundary of eve stripe 2? Describe how it does so in sideration the fact that the gap genes control the initial
molecular terms. expression patterns of the homeotic genes.
8. As shown in Figure 13-22, how many different transcrip- 16. What genetic tests allow you to tell if a gene is required
tion factors govern where the Distal-less (Dll ) gene will zygotically or if it has a maternal effect?
be expressed?
17. In considering the formation of the A–P and D–V axes in
9. As shown in Figure 13-26, the Sonic hedgehog gene is ex- Drosophila, we noted that, for mutations such as bcd,
pressed in many places in a developing chicken. Is the homozygous mutant mothers uniformly produce mu-
identical Sonic hedgehog protein expressed in each tis- tant offspring with segmentation defects. This outcome
sue? If so, how do the tissues develop into different struc- is always true regardless of whether the offspring them-
tures? If not, how are different Sonic hedgehog proteins selves are bcd+/bcd or bcd/bcd. Some other maternal-
produced? effect lethal mutations are different, in that the mutant
phenotype can be “rescued” by introducing a wild-type
Basic P r o b l ems allele of the gene from the father. In other words, for
10. Engrailed, even-skipped, hunchback, and Antennapedia. such rescuable maternal-effect lethals, mut+/mut ani-
To a Drosophila geneticist, what are they? How do they mals are normal, whereas mut/mut animals have the mu-
differ? tant defect. Explain the difference between rescuable
11. Describe the expression pattern of the Drosophila gene and nonrescuable maternal-effect lethal mutations.
eve in the early embryo. 18. Suppose you isolate a mutation affecting A–P patterning
12. Contrast the function of homeotic genes with that of of the Drosophila embryo in which every other segment
pair-rule genes. of the developing mutant larva is missing.
13. When an embryo is homozygous mutant for the gap a. Would you consider this mutation to be a mutation in
gene Kr, the fourth and fifth stripes of the pair-rule gene a gap gene, a pair-rule gene, a segment-polarity gene, or
ftz (counting from the anterior end) do not form nor- a segment-identity gene?
50 6 CHA P TER 1 3 The Genetic Control of Development
b. You have cloned a piece of DNA that contains four examine whether the mouse and human genes are
genes. How could you use the spatial-expression pattern functionally equivalent to the fly eyeless gene.
of their mRNA in a wild-type embryo to identify which 22. Gene X is expressed in the developing brain, heart, and
represents a candidate gene for the mutation described? lungs of mice. Mutations that selectively affect gene X
c. Assume that you have identified the candidate gene. function in these three tissues map to three different
If you now examine the spatial-expression pattern of its regions (A, B, and C, respectively) 5 ′ of the X coding
mRNA in an embryo that is homozygous mutant for the region.
gap gene Krüppel, would you expect to see a normal a. Explain the nature of these mutations.
expression pattern? Explain. b. Draw a map of the X locus consistent with the pre-
19. How does the Bicoid protein gradient form? ceding information.
20. In an embryo from a homozygous Bicoid mutant female, c. How would you test the function of the A, B, and C
which class(es) of gene expression is (are) abnormal? regions?
a. Gap genes 23. Why are regulatory mutations at the mouse Sonic hedge-
b. Pair-rule genes hog gene dominant and viable? Why do coding muta-
tions cause more widespread defects?
c. Segment-polarity genes
24. A mutation occurs in the Drosophila doublesex gene that
d. Hox genes
prevents Tra from binding to the dsx RNA transcript.
What would be the consequences of this mutation for
C ha l l e n gi n g P r o b l ems Dsx protein expression in males? In females?
21. a. The eyeless gene is required for eye formation in 25. You isolate a glp-1 mutation of C. elegans and discover
Drosophila. It encodes a homeodomain. What would you that the DNA region encoding the spatial control region
predict about the biochemical function of the Eyeless (SCR) has been deleted. What will the GLP-1 protein ex-
protein? pression pattern be in a four-cell embryo in mutant het-
b. Where would you predict that the eyeless gene is erozygotes? In mutant homozygotes?
expressed in development? How would you test your 26. Assess the validity of Monod and Jacob’s remark that
prediction? “anything found to be true of E. coli must also be true of
c. The Small eye and Aniridia genes of mice and humans, Elephants.”
respectively, encode proteins with very strong sequence a. Compare the structures and mechanisms of action of
similarity to the fly Eyeless protein, and they are named animal Hox proteins and the Lac repressor. In what ways
for their effects on eye development. Devise one test to are they similar?
344
14
C h a p t e r
Genomes
and Genomics
Learning Outcomes
After completing this chapter, you will be able to
outline
507
50 8 CHAPTER 1 4 Genomes and Genomics
I
n the summer of 2009, Dr. Alan Mayer, a pediatrician at Children’s Hospital of
Wisconsin in Milwaukee, wrote to a colleague about the heartbreaking and baf-
fling case of a four-year-old patient of his (Figure 14-1). For two years, little Nich-
olas Volker had endured over 100 trips to the operating room as doctors tried to
manage a mysterious disease that was destroying his intestines, leaving him vul-
nerable to dangerous infections, severely underweight, and often unable to eat.
Neither Mayer nor any other doctors had ever seen a disease like Nicholas’s;
they were unable to diagnose it, or to stem its ravages by any medical, surgical, or
nutritional treatment. It was difficult to treat a disease that no one could identify.
So, Dr. Mayer asked his colleague, Dr. Howard Jacob at the Medical College of
Wisconsin, “if there is some way we can get his genome sequenced. There is a
good chance Nicholas has a genetic defect, and it is likely to be a new disease.
Furthermore, a diagnosis soon could save his life and truly showcase personalized
genomic medicine.” 1
Dr. Jacob knew that it would be a longshot. Finding a single mutation responsi-
ble for a disease would require sifting through thousands of variations in Nicholas’s
DNA. One key decision was to narrow the search to just the exon sequences in
Nicholas’s DNA. The rationale was that if the causal mutation was a protein-coding
change, then it could be identified by sequencing all of the exons, or Nicholas’s
exome, which comprise a little over 1 percent of the entire human genome. Still, it
would be an expensive search—the sequencing would cost about $75,000 with the
technology available at the time. Nevertheless, the money was raised from donors,
and Jacob and a team of collaborators undertook the task.
As Jacob expected, they found more than 16,000 possible candidate variations
in Nicholas’s DNA. They narrowed this long list by focusing on those mutations
that had not been previously identified in humans, and that caused amino acid
replacements that were not found in other species. Eventually, they identified a
single base substitution in a gene called the X-linked inhibitor of apoptosis (XIAP )
that changed one amino acid at position 203 of the protein—an amino acid that
1 M. Johnson and K. Gallagher, “A Baffling Illness,” Milwaukee Journal Sentinel. Published
Nicholas Volker
Figure 14-1 DNA sequencing of all the exons of Nicholas Volker’s genome
revealed a single mutation responsible for his debilitating, but previously
unidentified, disease. [Gary Porter/MCT/Newscom.]
Genomes and Genomics 50 9
was invariant among mammals, fish, and even the fruit-fly counterparts of the
XIAP gene.
Fortunately, the identification of Nicholas’s XIAP mutation suggested a thera-
peutic approach. The XIAP gene was previously known to have a role in the
inflammatory response, and mutations in the gene were associated with a very
rare but potentially fatal immune disorder (although not Nicholas’s intestinal
symptoms). Based on that knowledge, Nicholas’s doctors boosted his immune sys-
tem with an infusion of umbilical-cord blood from a well-matched donor. Over
the next several months, Nicholas’s health improved to the point where he was
able to eat steak and other foods. And over the next two years, Nicholas did not
require any further intestinal surgeries.
The diagnosis and treatment of Nicholas Volker illustrate the dramatic
advances in the technology and impact of genomics—the study of genomes in
their entirety. The long-awaited promise that genomics would shape clinical med-
icine is now very much a reality. The technological and biological progress from
what started as a trickle of data in the 1990s has been astounding. In 1995, the
1.8-Mb (1.8-megabase) genome of the bacterium Haemophilus influenzae was the
first genome of a free-living organism to be sequenced. In 1996 came the 12-Mb
genome of Saccharomyces cerevisiae; in 1998, the 100-Mb genome of C. elegans; in
2000, the 180-Mb genome of Drosophila melanogaster; in 2001, the first draft of
the 3000-Mb human genome; and, in 2005, the first draft of our closest living rela-
tive, the chimpanzee. These species are just a small sample. By the end of 2013,
the sequences of almost 27,000 bacterial genomes, and more than 6600 eukaryotic
species (including fungi, plants, and animals) had been deciphered.
It is no hyperbole to say that genomics has revolutionized how genetic analysis
is performed and has opened avenues of inquiry that were not conceivable just a
few years ago. Most of the genetic analyses that we have so far considered employ
a forward approach to analyzing genetic and biological processes. That is, the
analysis begins by first screening for mutants that affect some observable pheno-
type, and the characterization of these mutants eventually leads to the identifica-
tion of the gene and the function of DNA, RNA, and protein sequences. In
contrast, having the entire DNA sequences of an organism’s genome allows genet-
icists to work in both directions—forward from phenotype to gene, and in reverse
from gene to phenotype. Without exception, genome sequences reveal many
genes that were not detected from classical mutational analysis. Using so-called
reverse genetics, geneticists can now systematically study the roles of such for-
merly unidentified genes. Moreover, a lack of prior classical genetic study is no
longer an impediment to the genetic investigation of organisms. The frontiers of
experimental analysis are growing far beyond the bounds of the very modest
number of long-explored model organisms.
Analyses of whole genomes now contribute to every corner of biological
research. In human genetics, genomics is providing new ways to locate genes that
contribute to many genetic diseases, like Nicholas’s, that had previously eluded
investigators. The day is soon approaching when a person’s genome sequence is a
standard part of his or her medical record. The availability of genome sequences
for long-studied model organisms and their relatives has dramatically accelerated
gene identification, the analysis of gene function, and the characterization of non-
coding elements of the genome. New technologies for the global, genome-wide
analysis of the physiological role of all gene products are driving the development
of the new field called systems biology. From an evolutionary perspective, genom-
ics provides a detailed view of how genomes and organisms have diverged and
adapted over geological time.
The DNA sequence of the genome is the starting point for a whole new set of
analyses aimed at understanding the structure, function, and evolution of the
510 CHAPTER 1 4 Genomes and Genomics
genome and its components. In this chapter, we will focus on three major aspects
of genomic analysis:
• Bioinformatics, the analysis of the information content of entire genomes. This
information includes the numbers and types of genes and gene products as
well as the location, number, and types of binding sites on DNA and RNA that
allow functional products to be produced at the correct time and place.
• Comparative genomics, which considers the genomes of closely and distantly
related species for evolutionary insight.
• Functional genomics, the use of an expanding variety of methods, including re-
verse genetics, to understand gene and protein function in biological processes.
The first successes in genome sequencing set off waves of innovation that led
to faster and much less expensive sequencing technologies. Now, individual
machines can produce as much sequence in a day as centers used to accomplish
in months. New technologies can now obtain more than 100 billion bases of
sequence in a working day on a single instrument. This figure represents an
approximately 100,000-fold increase in throughput over earlier instruments used
to obtain the first human genome sequence.
Genomics, aided by the explosive growth in information technology, has
encouraged researchers to develop ways of experimenting on the genome as a
whole rather than simply one gene at a time. Genomics has also demonstrated the
value of collecting large-scale data sets in advance so that they can be used later
to address specific research problems. In the last sections of this chapter, we will
explore some ways that genomics now drives basic and applied genetics research.
In subsequent chapters, we will see how genomics is catalyzing advances in under-
standing the dynamics of mutation, recombination, and evolution.
F i g u r e 14 -2 To obtain a
genome sequence, multiple The logic of obtaining a genome sequence
copies of the genome are cut
into small pieces that are Genome
sequenced. The resulting
sequence reads are
overlapped by matching
identical sequences in 1 Cut many genome copies into random fragments.
different fragments until a
consensus sequence of each
DNA double helix in the
genome is produced. 2 Sequence each fragment.
Contig
4 Overlap contigs
for complete
sequence.
Rather, automated sequencing is the current state of the art in DNA sequencing
technology. Initially based on the pioneering Sanger dideoxy chain-termination
method (discussed in Chapter 10), automated sequencing now employs a variety
of chemistries and optical-detection methods. The methods now available vary in
the length of DNA sequence obtained, the bases determined per second, and raw
accuracy. For large-scale sequencing projects that seek to analyze large individual
genomes or the genomes of many different individuals or species, choosing a
method requires balancing speed, cost, and accuracy.
Individual sequencing reactions (called sequencing reads) provide letter
strings that, depending on the sequencing technique employed, range on average
from about 100 to 5000 bases long. Such lengths are tiny compared with the DNA
of a single chromosome. For example, an individual read of 300 bases is only
0.0001 percent of the longest human chromosome (about 3 × 108 bp of DNA) and
only about 0.00001 percent of the entire human genome. Thus, one major chal-
lenge facing a genome project is sequence assembly—that is, building up all of
the individual reads into a consensus sequence, a sequence for which there is
consensus (or agreement) that it is an authentic representation of the sequence
for each of the DNA molecules in that genome.
Let’s look at these numbers in a somewhat different way to understand the
scale of the problem. As with any experimental observation, automated sequenc-
ing machines do not always give perfectly accurate sequence reads. Indeed,
newer, higher-throughput sequencing technologies generate a greater frequency
of errors than older methods; the error rate may range from less than 1 percent to
14.2 Obtaining the Sequence of a Genome 513
Whole-genome sequencing
The current general strategy for obtaining and assembling the sequence of a
genome is called whole-genome shotgun (WGS) sequencing. This approach is based
on determining the sequence of many segments of genomic DNA that have been
generated by breaking the long chromosomes of DNA into many short segments.
Two approaches to whole-genome shotgun sequencing are responsible for most
genome sequences obtained to date. The fundamental differences between them
are in how the short segments of DNA are obtained and prepared for sequencing
and the sequencing chemistry employed. The first method, used to sequence the
first human genome, relied on the cloning of DNA in microbial cells and employed
the Sanger dideoxy sequencing technique. We will refer to this approach as
“traditional WGS.” Methods in the second group are generally cell-free methods
that employ new techniques for sequencing and are designed for very high
throughput (referring to the number of reads per machine per unit time). We will
refer to this group of methods as “next-generation WGS.”
Traditional WGS
The traditional WGS approach begins with the construction of genomic libraries,
which are collections of these short segments of DNA, representing the entire
genome. The short DNA segments in such a library have been inserted into one of
a number of types of accessory chromosomes (nonessential elements such as plas-
mids, modified bacterial viruses, or artificial chromosomes) and propagated in
514 CHAPTER 1 4 Genomes and Genomics
(a)
1 Single DNA strands are 2 These molecules are amplified
immobilized on individual beads. by PCR.
Pyrosequencing is based
on detecting synthesis reactions
Repeat additions
dGTP
Single-stranded
T A T G A C G C T A G C C DNA template
A T A C T G C G
DNA polymerase
PPi
ATP-
Sulfurylase
ATP
Luciferase
Stage 3. The sequencing of each bead is performed using a novel “sequencing-by-
synthesis” chemistry termed pyrosequencing (Figure 14-5). DNA polymerase and a
primer are added to the wells to prime the synthesis of a complementary DNA
strand. Each of the four deoxyribonucleotides dATP, dGTP, dTTP, and dCTP are
made to flow through all of the wells, one at a time, in a specific order. When a
nucleotide is added that is complementary to the next base in the template strand in
a given well, it is incorporated and the reaction releases a pyrophosphate molecule.
Two enzymes, sulfurylase and luciferase, which are also present, then act to convert
the pyrophosphate signal to a visible-light signal (see Figure 14-5). The light is
Introduction to Genetic Analysis, 11e
detected by14.05
Figure a special
#1408camera. Hence, growing DNA strands that have A as the first
base after the
06/18/14 primer will yield a signal only when dATP is made to flow through the
well and not when
Dragonfly Mediathe other deoxynucleotides are made to flow through. The reac-
Group
tion is repeated for at least 100 cycles, and the signals from each well over all of the
cycles are integrated to generate the sequence reads from each well.
Other widely used platforms such as the Illumina sequencing systems and the
Pacific Bioscience systems also detect the synthesis of DNA, but by different
means. The Illumina system detects the incorporation of individual, fluorescently
labeled dNTPs, while the Pacific Bioscience process detects bases being incorpo-
14.2 Obtaining the Sequence of a Genome 517
rated into a single, immobilized DNA molecule. The method chosen by investiga-
tors depends a great deal on the application. The Illumina system produces a
larger number of shorter reads than the 454 system, while the Pacific Bioscience
system provides the advantage of much longer individual reads than any other
system, but with a higher error rate. The high throughput of each approach is the
product of the massively parallel sequencing: several hundred thousand to more
than 1 million reactions can be run simultaneously. Earlier sequencing machines
were able to achieve just 384 sequencing reactions per run.
Whole-genome-sequence assembly
Whichever method of obtaining raw sequence is used, the challenge remains to
assemble the contigs into the entire genome sequence. The difficulty of that pro-
cess depends strongly on the size and complexity of the genome.
For instance, the genomes of bacterial species are relatively easy to assemble.
Bacterial DNA is essentially single-copy DNA, with no repeating sequences. There-
fore, any given DNA sequence read from a bacterial genome will come from one
unique place in that genome. Owing to these properties, contigs within bacterial
genomes can often be assembled into larger contigs representing most or all of the
genome sequence in a relatively straightforward manner. In addition, a typical
bacterial genome is only a few megabase pairs of DNA in size.
For eukaryotes, genome assembly often presents some difficulties. A big stum-
bling block is the existence of numerous classes of repeated sequences, some
arranged in tandem and others dispersed. Why are they a problem for genome
sequencing? In short, because a sequencing read of repetitive DNA fits into many
places in the draft of the genome. Not infrequently, a tandem repetitive sequence
is in total longer than the length of a maximum sequence read. In that case, there
is no way to bridge the gap between adjacent unique sequences. Dispersed repeti-
tive elements can cause reads from different chromosomes or different parts of
the same chromosome to be mistakenly aligned together.
Long-insert
vector
Scaffold A–B
it came from a library containing genomic inserts of uniform size, either the 2-kb,
100-kb, or 150-kb library), the distance between the end reads was known. Fur-
ther, aligning the sequences of the two contigs by using paired-end reads auto-
matically determines the relative orientation of the two contigs. In this manner,
single-copy contigs could be joined together, albeit with gaps where the repetitive
elements reside. These gapped collections of joined-together sequence contigs
are called scaffolds (sometimes also referred to as supercontigs). Because most
Drosophila repeats are large (3–8 kb) and widely spaced (one repeat approxi-
mately every 150 kb), this technique was extremely effective at producing a cor-
rectly assembled draft sequence of the single-copy DNA. A summary of the logic
of this approach is shown in Figure 14-7.
Next-generation WGS does not circumvent the problem of repetitive sequences
and gaps. Since this approach is intended to circumvent the construction of libraries,
which would otherwise facilitate the bridging of gaps between contigs via paired-end
reads, next-generation WGS researchers had to devise a way to bridge these gaps
without building genomic libraries in vectors. One solution was to build a library of
circularized genomic DNA fragments of desired sizes. The circularization allows for
short segments of previously distant sequences located at the ends of each fragment
Scaffold
Sequenced Sequenced Sequenced
contig 1 GAP contig 2 GAP contig 3
Direct evidence from cDNA sequences Another means of identifying ORFs and
exons is through the analysis of mRNA expression. This analysis can be done in two
14.3 Bioinformatics: Meaning from Genomic Sequence 521
ways. Both methods involve the synthesis of libraries of DNA molecules that are
complementary to mRNA sequences, called cDNA (see Chapter 10). The longest
established method entails the cloning and amplification of these cDNA molecules
in a vector. However, NGS technologies allow for the direct sequencing of short
cDNA molecules without the cloning step (called RNA sequencing or “RNA-seq”
for short). Whichever method is utilized, complementary DNA sequences are
extremely valuable in two ways. First, they are direct evidence that a given segment
of the genome is expressed and may thus encode a gene. Second, because the cDNA
is complementary to the mature mRNA, the introns of the primary transcript have
been removed, which greatly facilitates the identification of the exons and introns
of a gene (Figure 14-10).
The alignment of cDNAs with their corresponding genomic sequence clearly
delineates the exons, and hence introns are revealed as the regions falling between
the exons. In the assembled cDNA sequence, the ORF should be continuous from
initiation codon through stop codon. Thus, cDNA sequences can greatly assist in
identifying the correct reading frame, including the initiation and stop codons.
Full-length cDNA evidence is taken as the gold-standard proof that one has iden-
tified the sequence of a transcription unit, including its exons and its location in
the genome.
In addition to full-length cDNA sequences, there are large data sets of cDNAs
for which only the 5′ or the 3′ ends or both have been sequenced. These short
cDNA sequence reads are called expressed sequence tags (ESTs). Expressed
sequence tags can be aligned with genomic DNA and thereby used to determine
the 5′ and 3′ ends of transcripts—in other words, to determine the boundaries of
the transcript as shown in Figure 14-10.
Genomic DNA
Nontemplate strand
Splicing
Ribosome-binding site
mRNA
5′ UTR ORF 3′ UTR
Translation
Polypeptide
Using polypeptide and DNA similarity Because organisms have common ances-
tors, they also have many genes with similar sequences in common. Hence, a gene
will likely have relatives among the genes isolated and sequenced in other organ-
isms, especially in the closely related ones. Candidate genes predicted by the pre-
ceding techniques can often be verified by comparing them with all the other gene
sequences that have ever been found. A candidate sequence is submitted as a
“query sequence” to public databases containing a record of all known gene
sequences. This procedure is called a BLAST search (BLAST stands for Basic Local
Alignment Search Tool). The sequence can be submitted as a nucleotide sequence
(a BLASTn search) or as a translated amino acid sequence (BLASTp). The com-
puter scans the database and returns a list of full or partial “hits,” starting with the
closest matches. If the candidate sequence closely resembles that of a gene previ-
ously identified from another organism, then this resemblance provides a strong
indication that the candidate gene is a real gene. Less-close matches are still useful.
14.3 Bioinformatics: Meaning from Genomic Sequence 523
For example, an amino acid identity of only 35 percent, but at identical positions, is
a strong indicator that two proteins have a common three-dimensional structure.
BLAST searches are used in many other ways, but always the goal is to find
out more about some identified sequence of interest.
Predictions based on codon bias Recall from Chapter 9 that the triplet code for
amino acids is degenerate; that is, most amino acids are encoded by two or more
codons (see Figure 9-5). The multiple codons for a single amino acid are termed
synonymous codons. In a given species, not all synonymous codons for an amino acid
are used with equal frequency. Rather, certain codons are present much more fre-
quently in mRNAs (and hence in the DNA that encodes them). For example, in
D. melanogaster, of the two codons for cysteine, UGC is used 73 percent of the time,
whereas UGU is used 27 percent. This usage is a diagnostic for Drosophila because,
in other organisms, this “codon bias” pattern is quite different. Codon biases are
thought to be due to the relative abundance of the tRNAs complementary to these
various codons in a given species. If the codon usage of a predicted ORF matches
that species’ known pattern of codon usage, then this match is supporting evidence F i g u r e 14 -12 The different forms of
that the proposed ORF is genuine. gene-product evidence—cDNAs, ESTs,
BLAST-similarity hits, codon bias, and
motif hits—are integrated to make gene
Putting it all together A summary of how different sources of information are predictions. Where multiple classes of
combined to create the best-possible mRNA and gene predictions is depicted in evidence are found to be associated with
Figure 14-12. These different kinds of evidence are complementary and can cross- a particular genomic DNA sequence, there
validate one another. For example, the structure of a gene may be inferred from is greater confidence in the likelihood that
a gene prediction is accurate.
Predictions
BLAST similarity
from protein
Codon bias
Predictions Sequence
from mRNA motif
and its
properties EST
cDNA
Predictions
from binding- Promoter Splice Translation Splice sites Translation Polyadenylation
site analysis site sites start site termination site site
programs
Predicted gene
524 CHAPTER 1 4 Genomes and Genomics
Let’s consider some of the insights from our first view of the overall genome
structures and global parts lists of a few species whose genomes have been
sequenced. We will start with ourselves. What can we learn by looking at the
human genome by itself? Then we will see what we can learn by comparing our
genome with others.
Chromosome 20
Figure 14-13 Numerous genes have been identified on human chromosome 20. The
recombinational and cytogenetic map coordinates are shown in the top lines of the figure.
Various graphics depicting gene density and different DNA properties are shown in the
middle sections. The identifiers of the predicted genes are shown at the bottom of the panel.
[Courtesy of Jim Kent, Ewan Birney, Darryl Leja, and Francis Collins. After the International Human
Genome Sequencing Consortium, “Initial Sequencing and Analysis of the Human Genome,” Nature
409, 2001, 860–921.]
chromosome from the human genome is shown in Figure 14-13. Such predictions
are being revised continually as new data become available. The current state of the
predictions can be viewed at many Web sites, most notably at the public DNA data-
bases in the United States and Europe (see Appendix B). These predictions are the
current best inferences of the protein-coding genes present in the sequenced spe-
cies and, as such, are works in progress.
(a) (b)
Phylogenetic inference
The first step in comparing species’ genomes is to decide which species to com-
pare. In order for comparisons to be informative, it is crucial to understand the
528 CHAPTER 1 4 Genomes and Genomics
Turtles
Diapsids Tuatara
Lepidosaurs
Lizards,
snakes
Crocodilians
Archosaurs
Amniotes Birds
Monotremes
Synapsids Marsupials
Mammals
Humans
Eutherians
Chimpanzees
Mice
Figure 14-15 Phylogeny of living mammals and other amniotes. The phylogenetic tree
depicts the evolutionary relationships among the three major groups of mammals (monotremes,
marsupials, and eutherians) and other amniotes, including birds and various reptiles. By
mapping the presence or absence of genes in particular groups onto known phylogenies, one
can infer the direction of evolutionary change (gain or loss) in particular lineages.
Chicken
Human
Monotreme
Figure 14-16 Strings of genes along chicken chromosome 8 and human chromosome 1
and in the platypus are in the same relative order (boxes). Whereas the chicken genome has
three genes that encode egg-yolk proteins, the egg-laying platypus has one functional gene
and two pseudogenes, and humans have fragmented, very short remnants of the yolk genes.
Of course, evolution is also about the acquisition of new traits. For example,
milk production is a shared trait among all mammals. A family of genes encoding
the casein milk proteins are unique to mammals and tightly clustered together in
their genomes, including that of the platypus. Just this brief glance at a few mam-
malian genomes informs us that, yes indeed, some mammals have genes that oth-
ers do not, some genes are shared by all mammals, and the presence or absence of
certain genes correlates with mammals’ lifestyle. The latter is a pervasive finding
in comparative genomics.
Let’s look at a few more examples that illuminate the evolutionary history of
our genome and how we are different from, and similar to, other mammals.
The similarities between the genomes extend well beyond the inventory of
protein-coding genes to overall genome organization. More than 90 percent of the
mouse and human genomes can be partitioned into corresponding regions of con-
served synteny, where the order of genes within variously sized blocks is the
same as their order in the most recent common ancestor of the two species. This
synteny is very helpful in relating the maps of the two genomes. For example,
human chromosome 17 is orthologous to a single mouse chromosome (chromo-
some 11). Although there have been extensive intrachromosomal rearrangements
in the human chromosome, there are 23 segments of colinear sequences more
than 100 kb in size (Figure 14-17).
There are some detectable differences between the inventories of mouse and
human genes. In one family of genes involved in color vision, the opsins, humans
possess one additional paralog. The presence of this opsin has equipped humans
with so-called trichromatic vision, so that we can perceive colors across the entire
spectrum of visible light—violet, blue, green, red—whereas mice cannot. But
again, the presence of this additional paralog in humans and its absence in mice
does not alone tell us whether it was gained in the human lineage or lost in the
mouse lineage. Analysis of other primate and mammalian genomes has revealed
that Old World primates such as chimpanzees, gorillas, and the colobus monkey
possess this gene but that all nonprimate mammals lack it. We can safely infer
from this phylogenetic distribution of the additional opsin gene that it evolved in
an ancestor of Old World primates (that includes humans).
On the other hand, the mouse genome contains more functional copies of
some genes that reflect its lifestyle. Mice have about 1400 genes involved in olfac-
tion—this is the largest single functional category of genes in its genome. Dogs,
too, have a large number of olfactory genes. This certainly makes sense for the
species’ lifestyles. Mice and dogs rely heavily on their sense of smell, and they
encounter different odors from those encountered by humans. And the set of
The mouse and human genome have large syntenic blocks of genes in common
Mouse
chromosome 11 60 65 70 75 80 85 90 95 100 105 110 115 120
Ancestral
chromosome 0 5 10 15 20 25 30 35 40 45 50 55 60 65 70
Human
chromosome 17 0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75
Figure 14-17 Synteny between human chromosome 17 and mouse chromosome 11.
Large conserved syntenic blocks 100 kb or greater in size are shown in human chromosome
17, mouse chromosome 11, and the inferred chromosome of their last common ancestor
(reconstructed by analysis of other mammalian genomes). Direct blocks of synteny are
shown in light purple; inverted blocks are shown in green. Chromosome sizes are indicated
in megabases (Mb). [ Data from M. C. Zody et al., “DNA Sequence of Human Chromosome 17 and
Analysis of Rearrangement in the Human Lineage,” Nature 440, 2006, 1045–1049, Fig. 2.]
532 CHAPTER 1 4 Genomes and Genomics
human olfactory genes, compared to that of mice and dogs, is strikingly inferior.
We have a lot of olfactory genes, but a very large fraction of them are pseudogenes
that bear inactivating mutations. For example, in just one class of olfactory genes
called V1r genes, mice have about 160 functional genes, but just 5 out of the 200 or
so V1r genes in the human genome are functional.
Still, these differences in gene content are relatively modest in light of the vast
differences in anatomy and behavior. The overall similarity in the mouse and
human genomes corresponds to the picture we get from examining the genetic
toolkit controlling development in different taxa (see Chapter 13)—that great dif-
ferences can evolve from genomes containing similar sets of genes. This same
theme is illustrated by comparing our genome with that of our closest living rela-
tive, the chimpanzee.
Overall, any two unrelated humans’ genomes are 99.9 percent identical. That
difference of just 0.1 percent still corresponds to roughly 3 million bases. The
challenge today is to decipher which of those base differences are meaningful
with respect to physiology, development, or disease.
Once the sequence of the first human genome was advanced, that accomplish-
ment opened the door to much more rapid and less costly analysis of other indi-
viduals. The reason is that with a known genome assembly as a reference, it is
much easier to align the raw sequence reads of additional individuals, and to
design approaches to studying and comparing parts of the genome.
One of the first and greatest surprises that has emerged from comparing indi-
vidual human genomes is that humans differ not merely at one base in a thou-
sand, but also in the number of copies of parts of individual genes, entire genes,
or sets of genes. These copy number variations (CNVs) include repeats and
duplications that increase copy number and deletions that reduce copy number.
Between any two unrelated individuals, there may be 1000 or more segments of
DNA greater than 500 bp in length that differ in copy number. Some CNVs can be
quite large and span over 1 million base pairs.
How such copy numbers may play a role in human evolution and disease is of
intense interest. One case where increased copy number appears to have been
adaptive concerns diet. People with high-starch diets have, on average, more cop-
ies of a salivary amylase (an enzyme that breaks down starch) gene than people
with traditionally low-starch diets. In other cases, copy number variations have
been associated with syndromes such as autism.
F i g u r e 14 -18 In order to
sequence just the exon fraction of Exome sequencing
the genome, genomic DNA is
fragmented and denatured, and Genomic DNA
exon-containing fragments are Intron Intron Exon Intron Exon Intron
hybridized with biotin-labeled
probes. Duplexes containing
annealed probes are then purified
and prepared for sequencing.
1 Shear DNA.
A B C D
a
d
b
c
Introduction to Genetic
cancers, Analysis,
or to subsets of11e
cancers, will not only further our understanding of cancer,
Figure 14.18 #1424
but
06/18/14 also promises to impact diagnosis and treatment in powerful ways. Researchers
across
07/07/14 the world are collaborating to create an “atlas” of cancer genomes that com-
piles our expanding knowledge of the genetic mutations associated with many can-
07/23/14
08/04/14
cers. (See http://cancergenome.nih.gov/ for further information.)
Dragonfly Media Group
The ability to rapidly analyze organisms’ genomes is also impacting other
dimensions of medicine. We will look at one such case next.
were not known before sequencing and would not be known today had research-
ers relied solely on E. coli K-12 as a guide to all E. coli.
The surprising level of diversity between two members of the same species
shows how dynamic genome evolution can be. Most new genes in E. coli strains
are thought to have been introduced by horizontal transfer from the genomes of
viruses and other bacteria (see Chapter 5). Differences can also evolve owing to
gene deletion. Other pathogenic E. coli and bacterial species also exhibit many
differences in gene content from their nonpathogenic cousins. The identification
of genes that may contribute directly to pathogenicity opens new avenues to the
understanding, prevention, and treatment of infectious disease.
“’Omics”
In addition to the genome, other global data sets are of interest. Following the
example of the term genome, for which “gene” plus “-ome” becomes a word for “all
genes,” genomics researchers have coined a number of terms to describe other
global data sets on which they are working. This ’ome wish list includes
The transcriptome. The sequence and expression patterns of all RNA tran-
scripts (which kinds, where in tissues, when, how much).
The proteome. The sequence and expression patterns of all proteins (where,
when, how much).
The interactome. The complete set of physical interactions between proteins
and DNA segments, between proteins and RNA segments, and between
proteins.
We will not consider all of these omes in this section but will focus on some of the
global techniques that are beginning to be exploited to obtain these data sets.
Figure 14-20 The key steps in a microarray analysis are (1) extraction of
mRNA from cells or tissues, (2) synthesis of fluorescent-dye-labeled cDNA Microarrays can detect differences
probes, (3) hybridization to the microarray, (4) detection of the fluorescent in gene expression
signal from hybridized probes, and (5) image analysis to identify relative levels
of hybridized probe. The relative levels reveal those genes whose expression 1 Extract mRNA.
is increased or decreased under the conditions analyzed. 2 Make cDNA reverse transcript.
ANIMATED ART: DNA microarrays: using an oligonucleotide array to Label cDNAs with fluorescent dyes.
analyze patterns of gene expression Control Experiment
lar cell type grown under typical conditions. The second set of
probes might be made from RNA extracted from cells grown under
3 Hybridize
some experimental condition. Fluorescent labels are attached to to microarray. 4 Laser excitation
the probes, and the probes are hybridized to the microarray. The at dye-specific Hz
relative binding of the probe molecules to the microarray is mon-
itored automatically with the use of a laser-beam-illuminated
microscope. In this manner, genes whose levels of expression are
increased or decreased under the given experimental condition
are identified. Similarly, genes that are active in a given cell type or
at a given stage of development can be identified.
With an understanding of which genes are active or inactive at
a given developmental stage, in a particular cell type, or in various
environmental conditions, the sets of genes that may respond to
similar regulatory inputs can be identified. Furthermore, gene- 5 Detect laser
expression profiles can paint a picture of the differences between emission.
normal and diseased cells. By identifying genes whose expression
is altered by mutations, in cancer cells, or by a pathogen, research-
ers may be able to devise new therapeutic strategies.
the DNA-binding domain, and this fusion protein acts as “bait.” On the other plas-
mid, a gene for another protein under investigation is spliced next to the activa-
tion domain and this fusion protein is said to be the “target” (Figure 14-21). The
two hybrid plasmids are then introduced into the same yeast cell—perhaps by
mating haploid cells containing bait and target plasmids. The final step is to look
for activation of transcription by a GAL4-regulated reporter gene construct, which
would be proof that bait and target bind to each other. The two-hybrid system can
be automated to make it possible to hunt for protein interactions throughout the
proteome.
“Bait” “Target”
protein protein
Trp 1+ Leu 2+
Unite
Interaction
Target Bait
Gal4 AD
Gal4 BD
Transcription
GAL Reporter
promoter lacZ
F i g u r e 14 -2 1 The system uses the binding of two proteins, a “bait” protein and a
“target” protein, to restore the function of the Gal4 protein, which activates a reporter gene.
Cam, Trp, and Leu are components of the selection systems for moving the plasmids
around between cells. The reporter gene is lacZ, which resides on a yeast chromosome
(shown in blue).
14.7 Functional Genomics and Reserve Genetics 5 3 9
Antibody
Amplify and
sequence
Let’s say that you have isolated a gene from yeast and suspect that it encodes
a protein that binds to DNA when yeast is grown at high temperature. You want
to know whether this protein binds to DNA and, if so, to what yeast sequence. One
way to address this question is first to treat yeast cells that have been grown at
high temperature with a chemical that will cross-link proteins to the DNA. In this
way proteins bound to the DNA at the time of chromatin isolation will remain
bound through subsequent treatments. The next step is to break the chromatin
into small pieces. To separate the fragment containing your protein–DNA com-
plex from others, you use an antibody that reacts specifically with the encoded
protein. You add your antibody to the mixture so that it forms an immune com-
plex that can be purified. The DNA bound in the immune complex can be ana-
lyzed after cross-linking is reversed. DNA bound by the protein may be sequenced
directly or amplified into many copies by PCR to prepare for DNA sequencing.
As you saw in Chapter 12, regulatory proteins often activate transcription of
many genes simultaneously by binding to several promoter regions. A variation
of the ChIP procedure, called ChIP-chip, has been devised to identify multiple
binding sites in a sequenced genome. Proteins that bind to many genomic regions
are immunoprecipitated as described above. Then, after cross-linking is reversed,
the DNA fragments are labeled and used to probe microarray chips that contain
the entire genomic sequence of the species under study.
Reverse genetics
The kinds of data obtained from microarray experiments and protein-interaction
screens are suggestive of interactions within the genome and proteome, but they
do not allow one to draw firm conclusions about gene functions and interactions
540 CHAPTER 1 4 Genomes and Genomics
in vivo. For example, finding out that the expression of certain genes is lost in
some cancers is not proof of cause and effect. The gold standard for establishing
the function of a gene or genetic element is to disrupt its function and to under-
stand phenotypes in native conditions. Starting from available gene sequences,
researchers can now use a variety of methods to disrupt the function of a specific
gene. These methods are referred to as reverse genetics. Reverse-genetic analy-
sis starts with a known molecule—a DNA sequence, an mRNA, or a protein—and
then attempts to disrupt this molecule to assess the role of the normal gene prod-
uct in the biology of the organism.
There are several approaches to reverse genetics and new technologies are
constantly being developed and refined. One approach is to introduce random
mutations into the genome but then to home in on the gene of interest by molecu-
lar identification of mutations in the gene. A second approach is to conduct a
targeted mutagenesis that produces mutations directly in the gene of interest. A
third approach is to create phenocopies—effects comparable to mutant pheno-
types—usually by treatment with agents that interfere with the mRNA transcript
of the gene.
Each approach has its advantages. Random mutagenesis is well established,
but it requires that one sift through all the mutations to find those that include
the gene of interest. Targeted mutagenesis can also be labor intensive but, after
the targeted mutation has been obtained, its characterization is more straightfor-
ward. Creating phenocopies can be very efficient, especially as libraries of tools
have been developed for particular model species. We will consider examples of
each of these approaches.
Chromosome
5′ Gene A 3′
Recombination between
mutant transgene
and chromosomal gene
Mutant gene A
Chromosome
5′ 3′
targeted gene mutations were first developed using genetic techniques for model
organisms, new technologies are enabling the disruption and manipulation of
genes in nonmodel species.
Gene-specific mutagenesis usually requires the replacement of a resident
wild-type copy of an entire gene by a mutated version of that gene. The mutated
gene inserts into the chromosome by a mechanism resembling homologous
recombination, replacing the normal sequence with the mutant (Figure 14-23).
This approach can be used for targeted gene knockout, in which a null allele
replaces the wild-type copy. Some techniques are so efficient that, in E. coli and S.
cerevisiae, for example, it has been possible to mutate every gene in the genome to
try to ascertain its biological function.
5′ 3′
3′ 5′
dsRNA
2 RNA transcript forms
2 Complementary RNA
a self-complementary
molecules are transcribed
stem and loop.
and hybridize.
2 dsRNA is injected 5′ 5′ 3′ dsRNA
dsRNA
into cell. 3′ 3′ 5′
The technique has been widely applied in model systems such as C. elegans, Dro-
Inserting transgenes
sophila, zebrafish, and several plant species.
into a nonmodel organism
But what makes RNAi especially powerful is that it can be applied to non-
Eye enhancer– Transposase model organisms. First, target genes of interest can be identified by comparative
promoter–GFP genomics. Then RNAi sequences are produced to target the inhibition of the spe-
cific target genes. This technique has been applied, for example, to a mosquito
that carries malaria (Anopheles gambiae). Using these techniques, scientists can
TIR Donor TIR Helper
plasmid plasmid better understand the biological mechanisms relating to the medical or economic
effect of such species. The genes that control the complicated life cycle of the
malaria parasite, partly inside a mosquito host and partly inside the human body,
Nuclei
can be better understood, revealing new ways to control the single most common
infectious disease in the world.
Microinjection of
plasmid mixture
Embryo into embryo K e y C o n c e p t RNAi-based methods provide general ways of experimentally
interfering with the function of a specific gene without changing its DNA sequence
(generally called phenocopying).
Primordial germ
cells (some have
taken up plasmids)
Functional genomics with nonmodel organisms Much of our consideration of
mutational dissection and phenocopying has focused on genetic model organisms.
Transgene construct One current focus of many geneticists is the broader application of these tech-
transposes into genome
of some germ-line cells niques, including to species that have negative effects on human society, such as
parasites, disease carriers, or agricultural pests. Classical genetic techniques are not
readily applicable to most of these species, but the roles of specific genes can be
Next generation
assessed through the insertion of transgenes and the generation of phenocopies.
The insertion of transgenes into a beetle is shown in Figure 14-25. Transgenic
beetles can be produced by using methodology similar to that used to produce
transgenic Drosophila (see Chapter 10). However, some way is needed to identify
successful transgenesis. Therefore, the technique depends on using a reporter
gene that can be expressed in a wild-type recipient. The green fluorescent protein
(GFP), originally isolated from a jellyfish, is a useful reporter for this application.
Beetles with green fluorescent eyes have As in Drosophila, transgenes are inserted as parts of transposons, and a helper
inherited a genome-integrated transgene
plasmid encoding a transposase facilitates insertion of the transposon bearing the
F i g u r e 14 -2 5 Creation of transgenic transgene. Figure 14-25 shows the use of GFP transgenes driven by an enhancer
beetles expressing a green fluorescent element that drives expression in the insect eye. This method has also been effec-
protein. TIR, terminal inverted repeat. tively used to create GFP-expressing transgenes in the mosquito species that car-
Key Terms 5 4 3
ries yellow fever and dengue fever (Aedes aegypti), a flour beetle (Tribolium
Examples of nonmodel insects
castaneum), and the silkworm moth (Bombyx mori) (Figure 14-26). Often, the GFP
expressing a transgene
transgene is used simply as a genetic marker for experiments in which an RNAi
construct or other transgene is co-inserted in order to manipulate gene function.
s u mm a ry
Genomic analysis takes the approaches of genetic analysis and applies them to the
collection of global data sets to fulfill goals such as the mapping and sequencing
of whole genomes and the characterization of all transcripts and proteins.
Genomic techniques require the rapid processing of large sets of experimental
material, all dependent on extensive automation.
The key problem in compiling an accurate sequence of a genome is to take short
sequence reads and relate them to one another by sequence identity to build up a
consensus sequence of an entire genome. This can be done straightforwardly for
bacterial or archaeal genomes by aligning overlapping sequences from different
sequence reads to compile the entire genome, because there are few or no DNA seg-
ments that are present in more than one copy in such organisms. The problem is that
complex genomes of plants and animals are replete with such repetitive sequences.
These repetitive sequences interfere with accurate sequence-contig production. The
problem is resolved in whole-genome shotgun (WGS) sequencing with the use of
paired-end reads.
Having a genomic sequence map provides the raw, encrypted text of the genome.
The job of bioinformatics is to interpret this encrypted information. For the analysis
of gene products, computational techniques are used to identify ORFs and noncod-
ing RNAs, then to integrate these results with available experimental evidence for
transcript structures (cDNA sequences), protein similarities, and knowledge of char-
acteristic sequence motifs.
One of the most powerful means to advance the analysis and annotation of
genomes is by comparing with the genomes of related species. Conservation of F i g u r e 14 -2 6 Examples of a
sequences among species is a reliable guide to identifying functional sequences in transgenic green fluorescent protein
reporter expressed in the eyes of some
the complex genomes of many animals and plants. Comparative genomics can also
nonmodel insects. Expression is driven
reveal how genomes have changed in the course of evolution and how these changes from one single promoter active in the eye.
may relate to differences in physiology, anatomy, or behavior among species. The insects are mosquito ( Aedes aegypti,
Comparisons of human genomes are accelerating the discovery of rare disease muta- top) and silkworm moth ( Bombyx mori,
tions. In bacterial genomics, comparisons of pathogenic and nonpathogenic strains bottom). [ ( Top) AP Photo/Jacquelyn Martin;
have revealed many differences in gene content that contribute to pathogenicity. (bottom) Marek Jindra.]
Functional genomics attempts to understand the working of the genome as a
whole system. Two key elements are the transcriptome, the set of all transcripts pro-
duced, and the interactome, the set of interacting gene products and other molecules
that together enable a cell to be produced and to function. The function of individual
genes and gene products for which classical mutations are not available can be tested
through reverse genetics—by targeted mutation or phenocopying.
k e y t e rms
annotation (p. 520) exome (p. 508) ortholog (p. 528)
bioinformatics (p. 519) expressed sequence tag (EST) (p. 521) outgroup (p. 528)
ChIP (chromatin functional genomics (p. 536) paired-end read (p. 517)
immunoprecipitation assay) genome project (p. 510) paralog (p. 528)
(p. 538) genomic library (p. 513) parsimony (p. 528)
comparative genomics (p. 527) genomics (p. 509) phylogeny (p. 528)
consensus sequence (p. 512) homolog (p. 528) phylogenetic inference (p. 528)
copy number variation (CNV) (p. 533) microarray (p. 536) processed pseudogene (p. 524)
DNA template library (p. 515) open reading frame (ORF) (p. 520) proteome (p. 520)
54 4 CHAPTER 1 4 Genomes and Genomics
pseudogene (p. 524) RNA sequencing (p. 521) supercontig (p. 518)
pyrosequencing (p. 516) scaffold (p. 518) synteny (p. 531)
reverse genetics (p. 540) sequence assembly (p. 512) two-hybrid test (p. 537)
RNA interference (RNAi) (p. 541) sequence contig (p. 514) vector (p. 514)
so lv e d p rob l e ms
SOLVED PROBLEM 1. You want to study the development of DNA chips containing sequences that correspond to all
the olfactory (smell-reception) system in the mouse. You known mRNAs in the mouse. For example, you may choose
know that the cells that sense specific chemical odors (odor- to first examine mRNAs that are expressed in the nasal-
ants) are located in the lining of the nasal passages of the passage lining but nowhere else in the mouse as important
mouse. Describe some approaches for using reverse genetics candidates for a specific role in olfaction. (Many of the
to study olfaction. important molecules may also have other jobs elsewhere in
the body, but you have to start somewhere.) Alternatively,
Solution you may choose to start with those genes whose protein
There are many approaches that can be imagined. For products are candidate proteins for binding the odorants
reverse genetics, you would want to identify candidate genes themselves. Regardless of your choice, the next step would
that are expressed in the lining of the nasal passages. Given be to engineer a targeted knockout of the gene that encodes
the techniques of functional genomics, this identification each mRNA or protein of interest or to use RNA interfer-
could be accomplished by purifying RNA from isolated ence to attempt to phenocopy the loss-of-function pheno-
nasal-passage-lining cells and using this RNA as a probe of type of each of the candidate genes.
p rob l e ms
Most of the problems are also available for review/grading through the http://www.whfreeman.com/
launchpad/iga11e.
over five regions on different chromosomes. How is this region, but there is a two-base-pair deletion that disrupts
result possible? the reading frame.
16. In an in situ hybridization experiment, a certain clone a. How would you determine whether the deletion was
bound to only the X chromosome in a boy with no disease correct or an error in the sequencing?
symptoms. However, in a boy with Duchenne muscular b. You find that the exact same deletion exists in the
dystrophy (X-linked recessive disease), it bound to the chimpanzee homolog of the gene but that the gorilla
X chromosome and to an autosome. Explain. Could this gene reading frame is intact. Given the phylogeny of
clone be useful in isolating the gene for Duchenne mus- great apes below, what can you conclude about when in
cular dystrophy? ape evolution the mutation occurred?
17. In a genomic analysis looking for a specific disease gene, Human
one candidate gene was found to have a single-base-pair
substitution resulting in a nonsynonymous amino acid Chimp
change. What would you have to check before conclud- Bonobo
ing that you had identified the disease-causing gene?
Gorilla
18. Is a bacterial operator a binding site?
Orangutan
19. A certain cDNA of size 2 kb hybridized to eight genomic
fragments of total size 30 kb and contained two short Siamang
ESTs. The ESTs were also found in two of the genomic Green monkey
fragments each of size 2 kb. Sketch a possible explana-
Owl monkey
tion for these results.
20. A sequenced fragment of DNA in Drosophila was used in 25. In browsing through the chimpanzee genome, you find
a BLAST search. The best (closest) match was to a kinase that it has three homologs of a particular gene, whereas
gene from Neurospora. Does this match mean that the humans have only two.
Drosophila sequence contains a kinase gene? a. What are two alternative explanations for this
21. In a two-hybrid test, a certain gene A gave positive re- observation?
sults with two clones, M and N. When M was used, it gave b. How could you distinguish between these two
positives with three clones, A, S, and Q. Clone N gave possibilities?
only one positive (with A). Develop a tentative interpre-
26. The platypus is one of the few venomous mammals. The
tation of these results.
male platypus has a spur on the hind foot through which
22. You have the following sequence reads from a genomic it can deliver a mixture of venom proteins. Looking at
clone of the Drosophila melanogaster genome: the phylogeny in Figure 14-15, how would you go about
Read 1: TGGCCGTGATGGGCAGTTCCGGTG determining whether these venom proteins are unique
to the platypus?
Read 2: TTCCGGTGCCGGAAAGA
27. You have sequenced the genome of the bacterium
Read 3: CTATCCGGGCGAACTTTTGGCCG Salmonella typhimurium, and you are using BLAST anal-
Read 4: CGTGATGGGCAGTTCCGGTG ysis to identify similarities within the S. typhimurium
genome to known proteins. You find a protein that
Read 5: TTGGCCGTGATGGGCAGTT
is 100 percent identical in the bacterium Escherichia
Read 6: CGAACTTTTGGCCGTGATGGGCAGTTCC coli. When you compare nucleotide sequences of the
Use these six sequence reads to create a sequence S. typhimurium and E. coli genes, you find that their nu-
contig of this part of the D. melanogaster genome. cleotide sequences are only 87 percent identical.
23. Sometimes, cDNAs turn out to be “chimeras”; that is, fu- a. Explain this observation.
sions of DNA copies of two different mRNAs accidental- b. What do these observations tell you about the merits
ly inserted adjacently to each other in the same clone. of nucleotide- versus protein-similarity searches in
You suspect that a cDNA clone from the nematode identifying related genes?
Caenorhabditis elegans is such a chimera because the se- 28. To inactivate a gene by RNAi, what information do you
quence of the cDNA insert predicts a protein with two need? Do you need the map position of the target gene?
structural domains not normally observed in the same
protein. How would you use the availability of the entire 29. What is the purpose of generating a phenocopy?
genomic sequence to assess if this cDNA clone is a mon- 30. What is the difference between forward and reverse
ster or not? genetics?
24. In browsing through the human genome sequence, you 31. Why might exome sequencing fail to identify a disease-
identify a gene that has an apparently long coding causing mutation in an affected person?
5 46 CHAPTER 1 4 Genomes and Genomics
15
C h a p t e r
The Dynamic
Genome: Transposable
Elements
Learning Outcomes
After completing this chapter, you will be able to
outline
547
548 CHAPTER 1 5 The Dynamic Genome: Transposable Elements
A
boy is born with a disease that makes his immune system ineffective. Diag-
nostic testing determines that he has a recessive genetic disorder called
SCID (severe combined immunodeficiency disease), more commonly
known as bubble-boy disease. This disease is caused by a mutation in the gene
encoding the blood enzyme adenosine deaminase (ADA). As a result of the loss of
this enzyme, the precursor cells that give rise to one of the cell types of the
immune system are missing. Because this boy has no ability to fight infection, he
has to live in a completely isolated and sterile environment—that is, a bubble in
which the air is filtered for sterility (Figure 15-1). No pharmaceutical or other
conventional therapy is available to treat this disease. Giving the boy a tissue
transplant containing the precursor cells from another person would not work in
the vast majority of cases because a precise tissue match between donor and
patient is extremely rare. Consequently, the donor cells would end up creating an
immune response against the boy’s own tissues (graft-versus-host disease).
In the past two decades, techniques have been developed that offer the possibil-
ity of a different kind of transplantation therapy—gene therapy—that could help
people with SCID and other incurable genetic diseases. In regard to SCID, a normal
ADA gene is “transplanted” into cells of a patient’s immune system, thereby permit-
ting these cells to survive and function normally. In the earliest human gene-therapy
trials, scientists modified a type of virus called a retrovirus in the laboratory (“engi-
neered”) so that it could insert itself and a normal ADA gene into chromosomes of
the immune cells taken from patients with SCID. In this chapter, you will see that
retroviruses have many biological properties in common with a type of mobile ele-
ment called a retrotransposon, which is present in our genome and the genomes of
most eukaryotes. Lessons learned about the behavior of retrotransposons and other
mobile elements from model organisms such as yeast are sources of valuable insights
into the design of a new generation of biological vectors for human gene therapy.
Starting in the 1930s, genetic studies of maize yielded results that greatly upset
the classical genetic picture of genes residing only at fixed loci on chromosomes.
The research literature began to carry reports suggesting that certain genetic ele-
ments present in the main chromosomes can somehow move from one location to
another. These findings were viewed with skepticism for many years, but it is now
clear that such mobile elements are widespread in nature.
A variety of colorful names (some of which help to describe their respective
properties) have been applied to these genetic elements: controlling elements, jump-
ing genes, mobile genes, mobile elements, and transposons. Here we
use the terms transposable elements and mobile elements, which
A boy with SCID embrace the entire family of types. Transposable elements can move
to new positions within the same chromosome or even to a different
chromosome. They have been detected genetically in model organ-
isms such as E. coli, maize, yeast, C. elegans, and Drosophila through
the mutations that they produce when they insert into and inacti-
vate genes.
DNA sequencing of genomes from a variety of microbes, plants,
and animals indicates that transposable elements exist in virtually
all organisms. Surprisingly, they are by far the largest component of
the human genome, accounting for almost 50 percent of our chro-
mosomes. Despite their abundance, the normal genetic role of
these elements is not known with certainty.
In their studies, scientists are able to exploit the ability of trans-
posable elements to insert into new sites in the genome. Transpos-
able elements engineered in the test tube are valuable tools, both in
F i g u r e 15 -1 A patient with SCID must live in a protective prokaryotes and in eukaryotes, for genetic mapping, creating
bubble. [ © Bettmann/Corbis.] mutants, cloning genes, and even producing transgenic organisms.
15.1 Discovery of Transposable Elements in Maize 5 49
GENOTYPES PHENOTYPES
c sh wx
Colorless (c)
Shrunken (sh)
Not shiny (wx)
Ac-activated Pigmented (C )
Acentric fragment
chromosome Plump (Sh)
is lost.
breakage at Ds Shiny (Wx)
C
Ds
Sh
Wx
c sh wx
c-Ds Sh Wx
c sh wx
Colorless
background
Ac activates
Ds loss from
C gene.
Ds
C Sh Wx
c sh wx
are small spots when Ds leaves the C gene later in kernel development. Unstable
mutant phenotypes that revert to wild type are a clue to the participation of
mobile elements.
Model
Organism Maize
Maize, also known as corn, is actually Zea mays, a member
of the grass family. Grasses—also including rice, wheat, and
barley—are the most important source of calories for
humanity. Maize was domesticated from the wild grass teo-
sinte by Native Americans in Mexico and Central America
and was first introduced to Europe by Columbus on his
return from the New World.
In the 1920s, Rollins A. Emerson set up a laboratory at
Cornell University to study the genetics of corn traits,
including kernel color, which were ideal for genetic analysis.
In addition, the physical separation of male and female
flowers into the tassel and ear, respectively, made controlled
genetic crosses easy to accomplish. Among the outstanding
geneticists attracted to the Emerson laboratory were Mar-
cus Rhoades, Barbara McClintock, and George Beadle (see The maize laboratory of Rollins A. Emerson at Cornell University,
Chapter 6). Before the advent of molecular biology and the 1929. Standing from left to right: Charles Burnham, Marcus Rhoades,
rise of microorganisms as model organisms, geneticists per- Rollins Emerson, and Barbara McClintock. Kneeling is George
formed microscopic analyses of chromosomes and related Beadle. Both McClintock and Beadle were awarded a Nobel Prize.
their behavior to the segregation of traits. The large root-tip [ Department of Plant Breeding, Cornell University.]
chromosomes of maize and the salivary-gland chromo-
somes of Drosophila made them the organisms of choice for Maize still serves as a model genetic organism. Molecular
cytogenetic analyses. The results of these early studies led biologists continue to exploit its beautiful pachytene chromo-
to an understanding of chromosome behavior during meio- somes with new antibody probes (see photograph b below)
sis and mitosis, including such events as recombination and and have used its wealth of genetically well-characterized
the consequences of chromosome breakage such as inver- transposable elements as tools to identify and isolate impor-
sions, translocations, and duplication. tant genes.
(a) (b)
Analysis of maize chromosomes, then and now. Maize chromosomes are large and easily visualized by light microscopy. (a) An
image from Marcus Rhoades (1952). (b) This image is comparable to that in part a except that the spindle is shown in blue (stained
with antibodies to tubulin), the centromeres are shown in red (stained with antibodies to a centromere-associated protein), and the
chromosomes are shown in green. [(a) James A. Birchler, R. Kelly Dawe, and John F. Doebley, “Perspectives Anecdotal, Historical and Critical
Commentaries on Genetics: Marcus Rhoades, Preferential Segregation and Meiotic Drive.” © 2003 Genetics Society of America, p. 836. (b) R. K.
Dawe, L. Reed, H.-G. Yu, M. G. Muszynski, and E. N. Hiatt, “A Maize Homolog of Mammalian CENPC Is a Constitutive Component of the Inner
Kinetochore,” Plant Cell 11, 1999, 1227–1238.]
552 CHAPTER 1 5 The Dynamic Genome: Transposable Elements
to the next question: Are the segments of DNA that insert into genes merely ran-
dom DNA fragments or are they distinct genetic entities? The answer to this ques-
tion came from the results of hybridization experiments showing that many
different insertion mutations are caused by a small set of insertion sequences.
These experiments are performed with the use of λdgal phages that contain the
gal− operon from several independently isolated gal mutant strains. Individual
phage particles from the strains are isolated, and their DNA is used to synthesize
radioactive RNA in vitro. Certain fragments of this RNA are found to hybridize
with the DNA from other gal− mutations containing large DNA insertions but not
with wild-type DNA. These results were interpreted to mean that independently
isolated gal mutants contain the same extra piece of DNA. These particular RNA
fragments also hybridize to DNA from other mutants containing IS insertions in
other genes, showing that the same bit of DNA can insert in different places in the
bacterial chromosome.
Prokaryotic transposons
In Chapter 5, you learned about R plasmids, which carry genes that encode resis-
tance to several antibiotics. These R plasmids (for resistance), also known as
R factors, are transferred rapidly on cell conjugation, much like the F factor in
E. coli.
The R factors proved to be just the first of many similar F-like factors to be
discovered. R factors have been found to carry many different kinds of genes in
bacteria. In particular, R factors pick up genes conferring resistance to different
antibiotics. How do they acquire their new genetic abilities? It turns out that the
drug-resistance genes reside on a mobile genetic element called a transposon
(Tn). There are two types of bacterial transposons. Composite transposons con-
tain a variety of genes that reside between two nearly identical IS elements that
are oriented in opposite direction (Figure 15-6a) and, as such, form what is called
an inverted repeat sequence. Transposase encoded by one of the two IS ele-
ments is necessary to catalyze the movement of the entire transposon. An exam-
15.2 Transposable Elements in Prokaryotes 555
Transposon Tn3
IR IR
Transposase Resolvase amp R
Ampicillin
resistance
ple of a composite transposon is Tn10, shown in Figure 15-6a. Tn10 carries a gene
that confers resistance to the antibiotic tetracycline and is flanked by two IS10
elements in opposite orientation. The IS elements that make up composite trans-
posons are not capable of transposing on their own because of mutations in their
inverted repeats.
Simple transposons also consist of bacterial genes flanked by inverted repeat
sequences, but these sequences are short (< 50 bp) and do not encode the trans-
posase enzyme that is necessary for transposition. Thus, their mobility is not due
to an association with IS elements. Instead, simple transposons encode their own
transposase in the region between the inverted repeat sequences in addition to
carrying bacterial genes. An example of a simple transposon is Tn3, shown in
Figure 15-6b.
To review, IS elements are short mobile sequences that encode only those pro-
teins necessary for their mobility. Composite transposons and simple transposons
contain additional genes that confer new functions to bacterial cells. Whether com-
posite or simple, transposons are usually just called transposons, and different
transposons are designated Tn1, Tn2, Tn505, and so forth.
A transposon can jump from a plasmid to a bacterial chromosome or from one
plasmid to another plasmid. In this manner, multiple-drug-resistant plasmids are
generated. Figure 15-7 is a composite diagram of an R factor, indicating the vari-
ous places at which transposons can be located. We next consider the question of
how such transposition or mobilization events occur.
Tn5 Tn3
kan R IS50 sm R su R am R
IS50 p
R hg R
cm Tn4 IS1
IS1
10
Resistance-determinant segment
tet R IS
Tn10
10
IS
Figure 15-7 A schematic map of a plasmid with several insertions of simple and composite
transposons carrying resistance genes. Plasmid sequences are in blue. Genes encoding
resistance to the antibiotics tetracycline (tetR ), kanamycin (kanR ), streptomycin (smR ),
sulfonamide (suR ), and ampicillin (ampR ) and to mercury (hgR ) are shown. The resistant-
determinant segment can move as a cluster of resistance genes. Tn3 is within Tn4. Each
transposon can be transferred independently. [Data from S. N. Cohen and J. A. Shapiro,
“Transposable Genetic Elements.” Copyright 1980 by Scientific American, Inc. All rights reserved.]
Mechanism of transposition
As already stated, the movement of a transposable element depends on the action
of a transposase. This enzyme plays key roles in the two stages of transposition:
excision (leaving) from the original location and inserting into the new location.
Replicative
... A Tn3 B ... ... C D ...
Target site (empty)
Conservative
... A Tn10 B ... ... C D ...
generated in the transposition event. The results of the transposition are that one
Replicative
copy appears at the new site and one copy remains at the old site. In the conserva- transposition of Tn3
tive pathway (as shown for Tn10), there is no replication. Instead, the element is
excised from the chromosome or plasmid and is integrated into the new site. The
conservative pathway is also called “cut and paste.” Donor
Tn3 plasmid
Replicative transposition Because this mechanism is a bit complicated, it will be
described here in more detail. As Figure 15-8 illustrates, one copy of Tn3 is pro-
+ Target sequence
duced from an initial single copy, yielding two copies of Tn3 altogether.
Figure 15-9 shows the details of the intermediates in the transposition of Tn3
from one plasmid (the donor) to another plasmid (the target). During transposi- Target
tion, the donor and recipient plasmids are temporarily fused together to form a plasmid
double plasmid. The formation of this intermediate is catalyzed by Tn3-encoded
transposase, which makes single-strand cuts at the two ends of Tn3 and staggered Transposase Formation of cointegrate
cuts at the target sequence and joins the free ends together, forming a fused circle
called a cointegrate.The transposable element is duplicated in the fusion event.
The cointegrate then resolves by a recombination-like event that turns a cointe-
grate into two smaller circles, leaving one copy of the transposable element in
each plasmid. The result is that one copy remains at the original location of the
element, whereas the other is integrated at a new genomic position.
Conservative transposition Some transposons, such as Tn10, excise from the chromo-
some and integrate into the target DNA. In these cases, the DNA of the element is
not replicated, and the element is lost from the site of the original chromosome (see DNA synthesis
Figure 15-8). Like replicative transposition, this reaction is initiated by the element-
encoded transposase, which cuts at the ends of the transposon. However, in contrast
with replicative transposition, the transposase cuts the element out of the donor site.
It then makes a staggered cut at a target site and inserts the element into the target Cointegrate
site. We will revisit this mechanism in more detail in a discussion of the transposition
of eukaryotic transposable elements, including the Ac/Ds family of maize.
Insertion into a new location We have seen that transposase continues to play an
important role in insertion. In one of the first steps of insertion, the transposase
makes a staggered cut in the target-site DNA (not unlike the staggered breaks cata- Resolution of cointegrate
lyzed by restriction endonucleases in the sugar–phosphate backbone of DNA). Fig-
ure 15-10 shows the steps in the insertion of a generic transposable element. In this
case, the transposase makes a five-base-pair staggered cut. The transposable ele-
Tn3 Original
ment inserts between the staggered ends, and the host DNA repair machinery (see
Chapter 16) fills in the gap opposite each single-strand overhang by using the bases
in the overhang as a template. There are now two duplicate sequences, each five +
base pairs in length, at the sites of the former overhangs. These sequences are called
a target-site duplication. Virtually all transposable elements (in both prokaryotes Tn3 New
and eukaryotes) are flanked by a target-site duplication, indicating that all use a
mechanism of insertion similar to that shown in Figure 15-10. What differs is the
length of the duplication; a particular type of transposable element has a character-
F i g u r e 15 - 9 Replicative transposition
istic length for its target-site duplication—as small as two base pairs for some ele-
of Tn3 takes place through a cointegrate
ments. It is important to keep in mind that the transposable elements have inverted intermediate.
repeats at their ends and that the inverted repeats are flanked by the target-site
ANIMATED ART: Replicative
duplication—which is a direct repeat.
transposition
TCCATTCC ATC
Class 1: retrotransposons
Transposable The laboratory of Gerry Fink was among the first to use
element inserts. yeast as a model organism to study eukaryotic gene
regulation. Through the years, he and his colleagues
AGG TAAGGTAG isolated thousands of mutations in the HIS4 gene,
which encodes one of the enzymes in the pathway lead-
TCCATTCC ATC
ing to the synthesis of the amino acid histidine.
They isolated more than 1500 spontaneous HIS4
Host repairs
gaps.
mutants and found that two of them had an unstable
mutant phenotype. The unstable mutants (called pseu-
dorevertants) were more than 1000 times as likely to
AGGTAAGG TAAGGTAG
revert to a phenotype that was similar to wild type as the
T C C AT T C C ATTCCATC other HIS4 mutants. Symbolically, we say that these
unstable mutants reverted from His − to His+ (wild types
Five-base-pair direct have a superscript plus sign, whereas mutants have a
repeat flanks element. superscript minus sign). Like the E. coli gal − mutants,
these yeast mutants were found to harbor a large DNA
F i g u r e 15 -10 A short sequence of insertion in the HIS4 gene. The insertion turned out to
DNA is duplicated at the transposon be very similar to one of a group of transposable elements already characterized in
insertion site. The recipient DNA is cleaved yeast, called the Ty elements. There are, in fact, about 35 copies of the inserted ele-
at staggered sites (a 5-bp staggered cut is ment, called Ty1, in the yeast genome.
shown), leading to the production of two
Cloning of the elements from these mutant alleles led to the surprising discov-
copies of the five-base-pair sequence
flanking the inserted element. ery that the insertions did not look at all like bacterial IS elements or transposons.
Instead, they resembled a well-characterized class of animal viruses called retrovi-
ruses. A retrovirus is a single-stranded RNA virus that employs a double-stranded
DNA intermediate for replication. The RNA is copied into DNA by the enzyme
reverse transcriptase. The double-stranded DNA is integrated into host chromo-
somes, from which it is transcribed to produce the RNA viral genome and proteins
that form new viral particles. When integrated into host chromosomes as double-
stranded DNA, the double-stranded DNA copy of the retroviral genome is called a
provirus. The life cycle of a typical retrovirus is shown in Figure 15-11. Some retro-
viruses, such as mouse mammary tumor virus (MMTV) and Rous sarcoma virus
(RSV), are responsible for the induction of cancerous tumors. For MMTV, this hap-
pens when it inserts randomly into the genome next to a gene whose altered expres-
sion leads to cancer.
Figure 15-12 shows the similarity in structure and gene content of a retrovirus
and the Ty1 element isolated from the HIS4 mutants. Both are flanked by long ter-
minal repeat (LTR) sequences that are several hundred base pairs long. Retrovi-
ruses encode at least three proteins that take part in viral replication: the products
of the gag, pol, and env genes. The gag-encoded protein has a role in the maturation
of the RNA genome, pol encodes the all-important reverse transcriptase, and env
encodes the structural protein that surrounds the virus. This protein is necessary
for the virus to leave the cell to infect other cells. Interestingly, Ty1 elements have
15.3 Transposable Elements in Eukaryotes 559
Host cell
DNA
RNA
Double-stranded viral
DNA is integrated into
the DNA of the host
Proviral DNA chromosome. DNA
genes related to gag and pol but not env. These features
A retrotransposon is transposed through an
led to the hypothesis that, like retroviruses, Ty1 elements
RNA intermediate
are transcribed into RNA transcripts that are copied
into double-stranded DNA by the reverse transcriptase.
5-bp direct repeat of target DNA However, unlike retroviruses, Ty1 elements cannot leave
TAATC TAATC the cell because they do not encode env. Instead, the
LTR gag pol LTR
ATTAG ATTAG double-stranded DNA copies are inserted back into the
Another genome of the same cell. These steps are diagrammed in
Transcription chromosome
Figure 15-13.
In 1985, David Garfinkel, Jef Boeke, and Gerald Fink
5' 3'
showed that, like retroviruses, Ty elements do in fact
Ty1 copy
transpose through an RNA intermediate. Figure 15-14 dia-
Nu
Cy
cle
us inserts grams their experimental design. They began by altering a
top yeast Ty1 element, cloned on a plasmid. First, near one
las
m
end of an element, they inserted a promoter that can be
activated by the addition of galactose to the medium. Sec-
ond, they introduced an intron from another yeast gene
Translation into the coding region of the Ty transposon.
The addition of galactose greatly increases the fre-
Reverse
transcription quency of transposition of the altered Ty element. This
Reverse increased frequency suggests the participation of RNA
transcriptase because galactose stimulates the transcription of Ty DNA
into RNA, beginning at the galactose-sensitive promoter.
The key experimental result, however, is the fate of the
transposed Ty DNA. The researchers found that the intron
Figure 15-13 An RNA transcript from had been removed from the transposed Ty DNA. Because introns are spliced only in
the retrotransposon undergoes reverse the course of RNA processing (see Chapter 8), the transposed Ty DNA must have
transcription into DNA, by a reverse been copied from an RNA intermediate. The conclusion was that RNA is transcribed
transcriptase encoded by the from the original Ty element and spliced. The spliced mRNA undergoes reverse tran-
retrotransposon. The DNA copy is
inserted at a new location in the genome.
scription back into double-stranded DNA, which is then integrated into the yeast
chromosome. Transposable elements that employ reverse transcriptase to transpose
through an RNA intermediate are termed retrotransposons. They are also known
as class 1 transposable elements. Retrotransposons such as Ty1 that have long ter-
minal repeats at their ends are called LTR-retrotransposons, and the mechanism
they use to transpose is called “copy and paste” to distinguish them from the cut-
and-paste mechanism that characterizes most DNA transposable elements.
Several spontaneous mutations isolated through the years in Drosophila also
were shown to contain retrotransposon insertions. The copia-like elements of Dro-
sophila are structurally similar to Ty1 elements and appear at 10 to 100 positions in
the Drosophila genome (see Figure 15-12c). Certain classic Drosophila mutations
result from the insertion of copia-like and other elements. For example, the white-
apricot (w a) mutation for eye color is caused by the insertion of an element of the
copia family into the white locus. The insertion of LTR-retrotransposons into plant
genes (including maize) also has been shown to contribute to spontaneous muta-
tions in this kingdom.
Before we leave retrotransposons (we will return to them later in this chap-
ter), there is one question that needs to be answered. Recall that the first LTR-
retrotransposon was discovered in an unstable His− strain of yeast that reverted
frequently to His+. However, we have just seen that LTR-retrotransposons, unlike
most DNA transposable elements, do not excise when they transpose. What, then,
is responsible for this allele’s ~1000-fold increase in reversion frequency when com-
pared to other His− alleles? The answer is shown in Figure 15-15, which shows that
the Ty1 element in the His− allele is located in the promoter region of the His gene,
where it prevents gene transcription. In contrast, the revertants contain a single
copy of the LTR, called a solo LTR. This much smaller insertion does not interfere
with the transcription of the His gene. The solo LTR is the product of recombination
15.3 Transposable Elements in Eukaryotes 561
Coding region
LTR LTR
Ty element
Add galactose.
Primary transcript
Splicing
mRNA
Reverse transcription
Insertion
between the identical LTRs, which results in the deletion of the rest of the element
(see Chapters 4 and 16 for more on recombination). Solo LTRs are a very common
feature in the genomes of virtually all eukaryotes, indicating the importance of this
process. The sequenced yeast genome contains more than fivefold as many solo
LTRs as complete Ty1 elements.
No
transcription
His –
LTR gag pol LTR His
Recombination
F i g u r e 15 -15 His+ revertants contain a
Transcription solo LTR that results from recombination
Solo
between the identical DNA sequences in the
His +
LTR His two LTRs of the LTR-retrotransposon in the
HIS promoter.
562 CHAPTER 1 5 The Dynamic Genome: Transposable Elements
P-element structure
P element
Transposase gene
× ×
F1 F1
No offspring
genes, thereby rendering them inactive. According to this view, reversion would
usually result from the excision of these inserted sequences. This hypothesis has
been critically tested by isolating unstable dysgenic mutations at the eye-color
locus white. Most of the mutations were found to be caused by the insertion of a
transposable element into the white+ gene. The element, called the P element, was
found to be present in from 30 to 50 copies per genome in P strains but to be com-
pletely absent in M strains.
Why do P elements not cause trouble in P strains? The simple answer is that
P-element transposition is repressed in P strains. At first it was thought that repres-
sion was due to a protein repressor that was in P but not M strains. This model is no
longer favored. Instead, geneticists now think that all the transposase genes in
P elements are silenced in P strains. The genes are activated in the F1 generation as
shown in Figure 15-18. Gene silencing has been discussed previously (see Chapters
8 and 12) and will be revisited at the end of this chapter. For some reason, most labo-
ratory strains have no P elements, and consequently the silencing mechanism is not
activated. In hybrids from the cross M (female, no P elements) × P (male, P ele-
ments), the P elements in the newly formed zygote are in a silencing-free environ-
ment. The P elements derived from the male genome can now transpose throughout
the diploid genome, causing a variety of damage as they insert into genes and cause
mutations. These molecular events are expressed as the various manifestations of
hybrid dysgenesis. On the other hand, P (female) × M (male) crosses do not result
in dysgenesis because, presumably, the egg cytoplasm contains the components
required for silencing the P-element transposase.
56 4 CHAPTER 1 5 The Dynamic Genome: Transposable Elements
F i g u r e 15 -2 0 P-element-mediated
gene transfer in Drosophila. The rosy+ P elements can be engineered to transfer genes
(ry+) eye-color gene is engineered into a
deleted P element carried on a bacterial Deleted P element
vector. At the same time, a helper P element Transposase
plasmid bearing an intact P element is
used. Both are injected into an ry− ry +
embryo, where ry+ transposes with the
P element into the chromosomes of the Bacterial
vector
germ-line cells.
ry + plasmid Helper
plasmid
Nuclei
Anterior Posterior
Eventual location
Drosophila embryo of germ cells
ry –/ry –
Grow to adult.
rosy + eyes
Chromosome ry + gene
Some
ry +/ry –
progeny
Fraction
Element Transposition Structure Length Copy number
of genome
ORF1 ORF2 (pol )
LINEs Autonomous AAA 1– 5 kb 20,000 – 40,000 21%
transposase
DNA Autonomous 2 – 3 kb
transposons 300,000 3%
Nonautonomous 80 – 3000 bp
F i g u r e 15 -2 1 Several general
sophical thought experiment, If a tree falls in the forest and no one is around to classes of transposable elements are
hear it, does it make a sound? found in the human genome. [ Data from
Nature 409, 880 (15 February 2001), “Initial
Sequencing and Analysis of the Human
Large genomes are largely transposable elements Genome,” The International Human Genome
Long before the advent of DNA-sequencing projects, scientists using a variety of Sequencing Consortium.]
biochemical techniques discovered that DNA content (called C-value) varied dra-
matically in eukaryotes and did not correlate with biological complexity. For
example, the genomes of salamanders are 20 times as large as the human genome,
whereas the genome of barley is more than 10 times as large as the genome of rice,
a related grass. The lack of correlation between genome size and the biological
complexity of an organism is known as the C-value paradox.
Barley and rice are both cereal grasses, and, as such, their gene content should
be similar. However, if genes are a relatively constant component of the genomes
of multicellular organisms, what is responsible for the additional DNA in the
larger genomes? On the basis of the results of additional experiments, scientists
were able to determine that DNA sequences that are repeated thousands, even
hundreds of thousands, of times make up a large fraction of eukaryotic genomes
and that some genomes contain much more repetitive DNA than others.
Thanks to many recent projects to sequence the genomes of a wide variety of
organisms (including Drosophila, humans, the mouse, Arabidopsis, and rice), we
now know that there are many classes of repetitive sequences in the genomes of
higher organisms and that some are similar to the DNA transposons and ret-
rotransposons shown to be responsible for mutations in plants, yeast, and insects.
Most remarkably, these sequences make up most of the DNA in the genomes of
multicellular eukaryotes.
Rather than correlating with gene content, genome size frequently correlates
with the amount of DNA in the genome that is derived from transposable ele-
ments. Organisms with big genomes have lots of sequences that resemble trans-
posable elements, whereas organisms with small genomes have many fewer. Two
examples, one from the human genome and the other from a comparison of the
grass genomes, illustrate this point. The structural features of the transposable
elements that are found in human genomes are summarized in Figure 15-21 and
will be referred to in the next section.
10 kb
1 2 3 4 5 6 7 8 9 10 11 12 13 14
Exons
Alu s
SINEs
LINEs
15.4 The Dynamic Genome: More Transposable Elements Than Ever Imagined 56 9
inactive by host regulatory mechanisms (see Section 15.5). There are, however, a
few active LINEs and Alus that have managed to escape host control and have
inserted into important genes, causing several human diseases. Three separate
insertions of LINEs have disrupted the factor VIII gene, causing hemophilia A. At
least 11 Alu insertions into human genes have been shown to cause several diseases,
including hemophilia B (in the factor IX gene), neurofibromatosis (in the NF1
gene), and breast cancer (in the BRCA2 gene).
The overall frequency of spontaneous mutation due to the insertion of class 2
elements in humans is quite low, accounting for less than 0.2 percent (1 in 500) of
all characterized spontaneous mutations. Surprisingly, retrotransposon inser-
tions account for about 10 percent of spontaneous mutations in another mammal,
the mouse. The approximately 50-fold increase in this type of mutation in the
mouse most likely corresponds to the much higher activity of these elements in
the mouse genome than in the human genome.
Safe havens
The abundance of transposable elements in the genomes of multicellular organ-
isms led some investigators to postulate that successful transposable elements
(those that are able to attain very high copy numbers) have evolved mechanisms
to prevent harm to their hosts by not inserting into host genes. Instead, successful
transposable elements insert into so-called safe havens in the genome. For the
grasses, a safe haven for new insertions appears to be into other retrotransposons.
Another safe haven is the heterochromatin of centromeres, where there are very
few genes but lots of repetitive DNA (see Chapter 12 for more on heterochroma-
tin). Many classes of transposable elements in both plant and animal species tend
to insert into the centric heterochromatin.
Safe havens in small genomes: targeted insertions In contrast with the genomes
of multicellular eukaryotes, the genome of unicellular yeast is very compact, with
closely spaced genes and very few introns. With almost 70 percent of its genome as
exons, there is a high probability that new insertions of transposable elements will
disrupt a coding sequence. Yet, as we have seen earlier in this chapter, the yeast
genome supports a collection of LTR-retrotransposons called Ty elements.
570 CHAPTER 1 5 The Dynamic Genome: Transposable Elements
Chromosomes
F i g u r e 15 -2 3 The grasses,
including barley, rice, sorghum, and How are transposable elements able to spread to new sites in genomes with
maize, diverged from a common few safe havens? Investigators have identified hundreds of Ty elements in the
ancestor about 70 million years ago. sequenced yeast genome and have determined that they are not randomly distrib-
Since that time, the transposable uted. Instead, each family of Ty elements inserts into a particular genomic region.
elements have accumulated to different
For example, the Ty3 family inserts almost exclusively near but not in tRNA genes,
levels in each species. Chromosomes
are larger in maize and barley, whose at sites where they do not interfere with the production of tRNAs and, presum-
genomes contain large amounts of ably, do not harm their hosts. Ty elements have evolved a mechanism that allows
LTR-retrotransposons. Green in the them to insert into particular regions of the genome: Ty proteins necessary for
partial genome at the bottom represents integration interact with specific yeast proteins bound to genomic DNA. Ty3 pro-
a cluster of transposons, whereas teins, for example, recognize and bind to subunits of the RNA polymerase com-
orange represents genes. plex that have assembled at tRNA promoters (Figure 15-24a).
The ability of some transposons to insert preferentially into certain sequences
or genomic regions is called targeting. A remarkable example of targeting is illus-
trated by the R1 and R2 elements of arthropods, including Drosophila. R1 and R2
are LINEs (see Figure 15-21) that insert only into the genes that produce ribo-
somal RNA. In arthropods, several hundred rRNA genes are organized in tandem
arrays (Figure 15-24b). With so many genes encoding the same product, the host
tolerates insertion into a subset. However, too many insertions of R1 and R2 have
been shown to decrease insect viability, presumably by interfering with ribosome
assembly.
(a) Ty3 integration into tRNA-gene safe haven (b) R1 and R2 in rRNA safe havens
rRNA gene R1 R1 R1
Integration 2 kb
protein encoded
byTy3
R2
RT
18S 5.8S 28S
RNA pol III
1 kb
RT
tRNA promoter R1
expression is associated with leukemia. A likely scenario is that insertion of the F i g u r e 15 -2 4 Some transposable
retroviral vector near the cellular gene altered its expression and, indirectly, elements are targeted to specific safe
caused the leukemia. havens. (a) The yeast Ty3
retrotransposon inserts into the promoter
Clearly, the serious risk associated with this form of gene therapy might be
region of transfer RNA genes. (b) The
greatly improved if doctors were able to control where the retroviral vector inte- Drosophila R1 and R2 non-LTR-
grates into the human genome. We have already seen that there are many simi- retrotransposons (LINEs) insert into the
larities between LTR-retrotransposons and retroviruses. It is hoped that, by genes encoding ribosomal RNA that are
understanding Ty targeting in yeast, we can learn how to construct retroviral vec- found in long tandem arrays on the
tors that insert themselves and their transgene cargo into safe havens in the chromosome. Only the reverse
human genome. transcriptase (RT) genes of R1 and R2
are noted.
piRNAs in animals In the germ lines of several animal species including Drosoph-
ila and mammals, active transposons are repressed through the action of piRNAs
(short for Piwi-interacting RNAs). Like siRNAs, piRNAs are short single-stranded
RNAs (26–30 nt in mammals) that interact with a protein complex (in this case, one
that contains the protein piwi-Argonaute from which it derives its name). Once
associated, piRNAs guide piwi-Argonaute to degrade complementary mRNAs (Fig-
ure 15-27). Unlike siRNAs, piRNAs do not originate from the double-stranded RNA
pathway shown in Figure 15-26 for Tc1. Instead, and quite ingeniously, animal
genomes contain several long (often >100 kb) loci called pi-clusters that serve as
traps to ensnare active transposons. A pi-cluster is comprised of remnants of many
different transposons that represent a historical record of prior insertions of active
transposons into that locus.
The first step in host genome surveillance is the insertion of a transposon into
one of several pi-cluster loci scattered around the genome. Transcription of
TIR
All transposase
Dicer mRNA targeted
for degradation F i g u r e 15 -2 6 The production of dsRNA from
only a single Tc1 element is sufficient to silence all
of the Tc1 transposase genes and thereby repress
siRNA transposition in the germ line. The siRNA derived
RISC from Tc1 dsRNA is bound to RISC and targets all
complementary RNA for degradation.
574 CHAPTER 1 5 The Dynamic Genome: Transposable Elements
Transcription Transcription
Inactive element
not transcribed
Processing Processing
piRNA mRNA
piwi-
Argonaute Translation
Transposition of
Genome surveillance “yellow” elements in
genome
pi-clusters produce long RNAs that may include antisense RNA from the newly
inserted element. These long RNAs are then processed into the final piRNAs that
associate with piwi-Argonaute and go on to degrade transposon-derived mRNAs
transcribed from anywhere in the genome. Thus, an active TE inserting randomly
throughout the genome is recognized by genome surveillance only when it hap-
pens to insert into a pi-cluster and become a permanent part of the locus.
Acquisition of DNA by the CRISPR
locus in some bacterial species crRNAs in bacteria Nucleic acid molecules usually invade bacterial species dur-
ing bacteriophage infection (see Chapter 5) when viral DNA genomes are injected
Phage into bacteria (see Figure 5-22). In an antiviral pathway that is still being elucidated,
fragments of the invading virus genome are captured by bacterial loci called
CRISPR (clustered regularly interspaced short palindromic repeats) (Figure 15-28),
DNA where they are transcribed into long RNAs that are processed into short crRNAs.
Introduction to Genetic Analysis, 11e
invader
Figure 15.27 #1531 Much like siRNAs and piRNAs, crRNAs interact with and guide bacterial protein
06/30/14 complexes to degrade complementary RNAs from the invading viral genome.
07/23/14
08/04/14
One shared feature of the pi-clusters and CRISPR loci is that new insertions of
08/25/14 transposon or viral DNA fragments, respectively, result in a permanent, genetic
CRISPR locus 08/26/14 change in these loci that is inherited by their progeny.
Dragonfly Media Group
F i g u r e 15 -2 8 Acquisition of DNA by the
K e y C o n c e p t Like siRNAs, piRNAs in animals and crRNAs in bacteria interact
CRISPR locus in some bacterial species. Part
with protein complexes and guide them to degrade complementary sequences in
of the DNA from an invading phage genome
transposons and viruses, respectively. These small noncoding RNAs have their origin
(shown in yellow) is incorporated into the
in long RNAs transcribed from loci that capture fragments of invasive DNA.
CRISPR locus by an unknown mechanism.
15.5 Regulation of Transposable Element Movement by the Host 575
Much like planes that evade radar by flying close to the ground, some trans- W hat Geneticists
posons have evolved mechanisms that allow them to evade the RNAi silencing A re D o ing t o day
pathway. These transposons can attain very high copy numbers. Evidence for
these mechanisms can be found in the genomes of all characterized plants and
animals containing transposon families (such as Alu in humans) that have thou-
sands of members.
How do some transposons avoid detection by the RNAi silencing pathway?
The short answer is that in most cases, we do not know. To understand how a
transposon avoids detection, it is necessary to study actively transposing ele-
ments. To date, scientists have detected very few transposon families with high
copy numbers that are still actively transposing. One of the best-characterized
elements among this small group is a special type of nonautonomous DNA trans-
poson called a miniature inverted repeat transposable element (abbreviated
MITE). Like other nonautonomous elements, MITEs can form by deletion of the
transposase gene from an autonomous element. However, unlike most nonauto-
nomous elements, MITEs can attain very high copy numbers, particularly in the
genomes of some grasses (see Figure 15-23). Some MITEs in grasses have been
amplified to thousands of copies.
The only actively transposing MITE isolated to date is the mPing element of
rice, which is formed from the autonomous Ping element by deletion of the entire
transposase gene (Figure 15-29). This element was discovered in the laboratory of
Susan Wessler by Ning Jiang. Another member of the Wessler laboratory, Ken Naito,
documented that in individuals of some rice strains there are only 3 to 7 copies of
transposase
Ping (about 3
to 7 copies)
Nonautonomous
elements, including
mPing
Only mPing
amplifies to high
copy numbers
Ping and over 1000 copies of mPing. Remarkably, the copy number of mPing in these
strains is increasing by almost 40 new insertions per plant per generation.
Two questions about the rapid increase in mPing copy number immediately
come to mind. First, how does a rice strain survive a transposable-element burst
of this magnitude? To address this question, the Wessler laboratory used next-
generation sequencing technology (see Chapter 14) to determine the insertion
sites of over 1700 mPing elements in the rice genome. Surprisingly, they found
that the element avoided inserting into exons, thus minimizing the impact of
insertion on rice gene expression. The mechanism underlying this preference is
currently being investigated.
The second question is, why does the rice host apparently fail to repress mPing
transposition? While this question is also an active area of current research, a
reasonable hypothesis is that mPing can fly under the hosts’ RNAi radar because
it does not contain any part of the transposase gene that resides on the Ping ele-
ment (Figure 15-29). Thus, read-through transcription into mPing elements
inserted throughout the rice genome will produce lots of dsRNA and siRNA. How-
ever, because siRNAs derived from mPing share no sequence with the source of
transposase, siRNAs produced from mPing will not induce silencing mechanisms
aimed at transposase. Instead, the transposase gene will remain active and will
continue to catalyze the movement of mPing. According to this hypothesis, mPing
transposition will be repressed only when a much rarer Ping insertion generates
dsRNA that triggers the silencing of its transposase gene.
summary
Transposable elements were discovered in maize by Barbara transposase that binds to the ends of autonomous and non-
McClintock as the cause of several unstable mutations. An autonomous elements and catalyzes excision of the element
example of a nonautonomous element is Ds, the transposi- from the donor site and reinsertion into a new target site
tion of which requires the presence of the autonomous Ac elsewhere in the genome.
element in the genome. Retrotransposons were first molecularly isolated from
Bacterial insertion-sequence elements were the first yeast mutants, and their resemblance to retroviruses was
transposable elements isolated molecularly. There are many immediately apparent. Retrotransposons are class 1 elements,
different types of IS elements in E. coli strains, and they are as are all transposable elements that use RNA as their trans-
usually present in at least several copies. Composite transpo- position intermediate.
sons contain IS elements flanking one or more genes, such as The active transposable elements isolated from such
genes conferring resistance to antibiotics. Transposons with model organisms as yeast, Drosophila, E. coli, and maize con-
resistance genes can insert into plasmids and are then trans- stitute a very small fraction of all the transposable elements
ferred by conjugation to nonresistant bacteria. in the genome. DNA sequencing of whole genomes, includ-
There are two major groups of transposable elements in ing the human genome, has led to the remarkable finding
eukaryotes: class 1 elements (retrotransposons) and class that almost half of the human genome is derived from trans-
2 elements (DNA transposons). The P element was the first posable elements. Despite having so many transposable ele-
eukaryotic class 2 DNA transposon to be isolated molecu- ments, eukaryotic genomes are extremely stable as transposi-
larly. It was isolated from unstable mutations in Drosophila tion is relatively rare because of two factors. First, most of the
that were induced by hybrid dysgenesis. P elements have transposable elements in eukaryotic genomes cannot move
been developed into vectors for the introduction of foreign because inactivating mutations prevent the production of
DNA into Drosophila germ cells. normal transposase and reverse transcriptase. Second,
Ac, Ds, and P are examples of DNA transposons, so expression of the vast majority of the remaining elements is
named because the transposition intermediate is the DNA silenced by the RNAi pathway in plants and C. elegans and
element itself. Autonomous elements such as Ac encode a the piRNA pathway in animals. Silencing of transposable
Solved Problems 577
elements depends on the ability of the host to detect new elements, such as MITEs, may evade silencing because they
insertions in the genome and generate small noncoding do not trigger the silencing of the transposase that catalyzes
RNAs that guide protein complexes to degrade complemen- their transposition.
tary transposon-encoded RNAs. Some high-copy-number
k e y terms
Activator (Ac) (p. 549) genome surveillance (p. 573) replicative transposition
Alu (p. 568) hybrid dysgenesis (p. 562) (p. 556)
autonomous element (p. 552) insertion-sequence (IS) element retrotransposon (p. 560)
class 1 element (retrotransposon) (p. 553) retrovirus (p. 558)
(p. 560) inverted repeat (p. 554) reverse transcriptase (p. 558)
class 2 element (DNA transposon) long interspersed element (LINE) R plasmid (p. 554)
(p. 562) (p. 568) safe haven (p. 569)
cointegrate (p. 557) long terminal repeat (LTR) (p. 558) short interspersed element (SINE)
composite transposon (p. 554) LTR-retrotransposon (p. 560) (p. 568)
conservative transposition (p. 556) M cytotype (p. 562) simple transposon (p. 555)
copia-like element (p. 560) miniature inverted repeat solo LTR (p. 560)
“copy and paste” (p. 560) transposable element (MITE) synteny (p. 569)
CRISPR loci (p. 574) (p. 575) targeting (p. 570)
crRNA (p. 574) negative selection (p. 568) target-site duplication (p. 557)
“cut and paste” (p. 557) nonautonomous element transposase (p. 554)
C-value (p. 567) (p. 552) transpose (p. 552)
C-value paradox (p. 567) P cytotype (p. 562) transposition (p. 555)
Dissociation (Ds) (p. 549) P element (p. 562) transposon (Tn) (p. 554)
DNA transposon (p. 562) pi-cluster (p. 573) transposon tagging (p. 565)
excise (p. 552) piRNAs (p. 573) Ty element (p. 558)
gene therapy (p. 548) provirus (p. 558) unstable phenotype (p. 549)
s o lv ed pr o b lems
SOLVED PROBLEM 1. Transposable elements have been transposition, how appropriate is the term “jumping genes”
referred to as “jumping genes” because they appear to jump for the vast majority of transposable elements in the human
from one position to another, leaving the old locus genome and in the genomes of most other mammals?
and appearing at a new locus. In light of what we now know
concerning the mechanism of transposition, how appro- Solution
priate is the term “jumping genes” for bacterial transposable The vast majority of transposable elements in the charac-
elements? terized mammalian genomes are retrotransposons. In
humans, two retrotransposons (the LINE called L1 and the
Solution SINE called Alu) account for a whopping one-third of our
In bacteria, transposition takes place by two different modes. entire genome. Like bacterial elements, retrotransposons
The conservative mode results in true jumping genes do not excise from the original site, so they are not really
because, in this case, the transposable element excises from jumping genes. Instead, the element serves as a template for
its original position and inserts at a new position. The other the transcription of RNAs that can be reverse-transcribed
mode is the replicative mode. In this pathway, a transposable by the enzyme reverse transcriptase into double-stranded
element moves to a new location by replicating into the tar- cDNA. Each cDNA can potentially insert into target sites
get DNA, leaving behind a copy of the transposable element throughout the genome. Note that while both bacterial ele-
at the original site. When operating by the replicative mode, ments and retrotransposons do not leave the original site,
transposable elements are not really jumping genes because their respective mechanisms of transposition are dramati-
a copy does remain at the original site. cally different. Finally, while LTR-retrotransposons do not
excise, they can become much shorter insertions due to the
SOLVED PROBLEM 2. Following from the question above, in production of solo LTRs by recombination.
light of what we now know concerning the mechanism of
578 CHAPTER 1 5 The Dynamic Genome: Transposable Elements
pr o b lems
Most of the problems are also available for review/grading through the http://www.whfreeman.com/
launchpad/iga11e.
W o r k ing with the F igures 13. What are safe havens? Are there any places in the much
1. In the chapter-opening photograph of kernels on an ear more compact bacterial genomes that might be a safe ha-
of corn, what is the genetic basis of the following (Hint: ven for insertion elements?
Refer to Figure 15-4 for some clues): 14. Nobel prizes are usually awarded many years after the
a. the fully pigmented kernel? actual discovery. For example, James Watson, Francis
Crick, and Maurice Wilkens were awarded the Nobel
b. the unpigmented kernels? Note that they can arise in
Prize in Medicine or Physiology in 1962, almost a decade
two different ways.
after their discovery of the double-helical structure of
2. In Figure 15-3a, what would the kernel phenotype be if DNA. However, Barbara McClintock was awarded the
the strain were homozygous for all dominant markers on Nobel Prize in 1983, almost four decades after her dis-
chromosome 9? covery of transposable elements in maize. Why do you
3. F
or Figure 15-7, draw out a series of steps that could ex- think it took this long for the significance of her discov-
plain the origin of this large plasmid containing many ery to be recognized in this manner?
transposable elements. 15. Transposase protein can
4. D
raw a figure for the mode of transposition not shown in a. bind to DNA.
Figure 15-8, retrotransposition. b. catalyze the excision of a transposable element from
5. I n Figure 15-10, show where the transposase would have a donor site.
to cut to generate a 6-bp target-site duplication. Also c. catalyze the insertion of a transposable element into
show the location of the cut to generate a 4-bp target-site a target site.
duplication. d. All of the above
6. I f the transposable element in Figure 15-14 were a DNA 16. Which of the following are safe havens for transposable
transposon that had an intron in its transposase gene, element insertions?
would the intron be removed during transposition?
a. Introns
Justify your answer.
b. Exons
7. F
or Figure 15-22, draw the pre-mRNA that is transcribed
from this gene and then draw its mRNA. c. Other transposable elements
d. a and c are both correct.
Basic P r o b lems 17. Why can’t retrotransposons move from one cell to an-
8. Describe the generation of multiple-drug-resistant other like retroviruses?
plasmids. a. Because they do not encode the Env protein
9. Briefly describe the experiment that demonstrates that b. Because they are nonautonomous elements
the transposition of the Ty1 element in yeast takes place c. Because they require reverse transcriptase
through an RNA intermediate.
d. a and b are both true.
10. Explain how the properties of P elements in Drosophila 18. Unlike retrotransposons, DNA transposons
make gene-transfer experiments possible in this
organism. a. have terminal inverted repeats.
b. generate a target-site duplication upon insertion.
11. Although DNA transposons are abundant in the ge-
nomes of multicellular eukaryotes, class 1 elements usu- c. transpose via an RNA intermediate.
ally make up the largest fraction of very large genomes d. are not found in prokaryotes.
such as those from humans (~2500 Mb), maize (~2500 19. The major difference between retrotransposons and ret-
Mb), and barley (~5000 Mb). Given what you know roviruses is
about class 1 and class 2 elements, what is it about their
a. retrotransposons encode reverse transcriptase.
distinct mechanisms of transposition that would account
for this consistent difference in abundance? b. retroviruses move from one site in the genome to
another.
12. As you saw in Figure 15-22, the genes of multicellular eu-
karyotes often contain many transposable elements. c. retroviruses encode the env gene, which allows them
Why do most of these elements not affect the expression to move from one cell to another.
of the gene? d. None of the above are correct.
Problems 579
20. Which of the following is true of reverse transcriptase? a. A LINE inserts into an enhancer of a human gene.
a. It is required for the movement of DNA transposons. b. A transposable element contains a binding site for a
b. It catalyzes the synthesis of DNA from RNA. transcriptional repressor and inserts adjacent to a
promoter.
c. It is required for the transposition of retrotrans-
posons. c. An Alu element inserts into the 3′ splice (AG) site of
an intron in a human gene.
d. b and c are correct.
d. A Ds element that was inserted into the exon of a
21. Which transposable element is used to introduce foreign
maize gene excises imperfectly and leaves three base
DNA into the fruit fly Drosophila melanogaster?
pairs behind in the exon.
a. Ac element
e. Another excision by that same Ds element leaves
b. P element two base pairs behind in the exon.
c. Alu element f. A Ds element that was inserted into the middle of
d. Composite transposons an intron excises imperfectly and leaves five base pairs
22. What is the major reason why the maize genome is much behind in the intron.
larger than the rice genome? 27. Before the integration of a transposon, its transposase
a. Maize has more genes than rice. makes a staggered cut in the host target DNA. If the
staggered cut is at the sites of the arrows below, draw
b. Rice has more genes than maize.
what the sequence of the host DNA will be after the
c. Maize has more DNA transposons than rice. transposon has been inserted. Represent the transposon
d. Maize has more retrotransposons than rice. as a rectangle.
23. Why are transposable elements found much more often T
in introns than in exons? AATTTGGCCTAGTACTAATTGGTTGG
a. Because transposable elements prefer to insert into TTAAACCGGATCATGATTAACCAACC
introns c
b. Because transposable elements prefer to insert into 28. In Drosophila, M. Green found a singed allele (sn) with
exons some unusual characteristics. Females homozygous for
c. Because transposable elements insert into both this X-linked allele have singed bristles, but they have
exons and introns but selection removes exon insertions numerous patches of sn+ (wild-type) bristles on their
d. None of the above are true. heads, thoraxes, and abdomens. When these flies are
mated with sn males, some females give only singed
24. Approximately what percentage of the human genome is progeny, but others give both singed and wild-type prog-
derived from transposable elements? eny in variable proportions. Explain these results.
a. 10% 29. Consider two maize plants:
b. 25% a. Genotype C/c m ; Ac/Ac +, where c m is an unstable
c. 50% allele caused by a Ds insertion
d. 75% b. Genotype C/c m, where c m is an unstable allele caused
25. Why do plants and animals thrive with so many trans- by Ac insertion
posable elements in their genomes? What phenotypes would be produced and in what
a. Most of the transposable elements are inactive due proportions when (1) each plant is crossed with a base-
to mutation. pair-substitution mutant c /c and (2) the plant in part a is
b. Active transposable elements are silenced by the crossed with the plant in part b? Assume that Ac and c
host. are unlinked, that the chromosome-breakage frequency
is negligible, and that mutant c /C is Ac+.
c. Most transposable elements are inserted in safe
havens. 30. You meet your friend, a scientist, at the gym and she be-
gins telling you about a mouse gene that she is studying
d. All of the above are true.
in the lab. The product of this gene is an enzyme re-
quired to make the fur brown. The gene is called FB, and
C hallenging P r o b lems the enzyme is called FB enzyme. When FB is mutant and
26. The insertion of transposable elements into genes can cannot produce the FB enzyme, the fur is white. The sci-
alter the normal pattern of expression. In the following entist tells you that she has isolated the gene from two
situations, describe the possible consequences on gene mice with brown fur and that, surprisingly, she found
expression. that the two genes differ by the presence of a 250-bp
58 0 CHAPTER 1 5 The Dynamic Genome: Transposable Elements
SINE (like the human Alu element) in the FB gene of one 35. You have isolated a transposable element from the hu-
mouse but not in the gene of the other. She does not un- man genome and have determined its DNA sequence.
derstand how this difference is possible, especially given How would you use this sequence to determine the
that she determined that both mice make the FB en- copy number of the element in the human genome if
zyme. Can you help her formulate a hypothesis that ex- you just had a computer with an Internet connection?
plains why the mouse can still produce FB enzyme with (Hint: see Chapter 14.)
a transposable element in its FB gene? 36. Following up on the previous question, how would you
31. The yeast genome has class 1 elements (Ty1, Ty2, and so determine whether other primates had a similar ele-
forth) but no class 2 elements. What is a possible reason ment in their genomes?
why DNA elements have not been successful in the 37. Of all the genes in the human genome, the ones with
yeast genome? the most characterized Alu insertions are those that
32. In addition to Tc1, the C. elegans genome contains other cause hemophilia, including several insertions in the
families of DNA transposons such as Tc2, Tc3, Tc4, and factor VIII and factor IX genes. Based on this fact, your
Tc5. Like Tc1, their transposition is repressed in the colleague hypothesizes that the Alu element prefers to
germ line but not in somatic cells. Predict the behavior insert into these genes. Do you agree? What other rea-
of these elements in the mutant strains where Tc1 is no son can you provide that also explains these data?
longer repressed due to mutations in the RNAi pathway. 38. If all members of a transposable element family can be
Justify your answer. silenced by dsRNA synthesized from a single family
33. Based on the mechanism of gene silencing, what features member, how is it possible for one element family (like
of transposable elements does the RNAi pathway exploit Tc1) to have 32 copies in the C. elegans genome while
to ensure that the host’s own genes are not also silenced? another family (Tc2) has fewer than 5 copies?
34. What are the similarities and differences between retro- 39. How can the CRISPR and pi-cluster loci change over
viruses and retrotransposons? It has been hypothesized time?
that retroviruses evolved from retrotransposons. Do you
agree with this model? Justify your answer.
344
16
C h a p t e r
Mutation, Repair,
and Recombination
Learning Outcomes
After completing this chapter, you will be able to
outline
581
582 CHAPTER 1 6 Mutation, Repair, and Recombination
A
young patient develops a great many small, frecklelike, precancerous skin
growths and is extremely sensitive to sunlight (Figure 16-1). A family his-
tory is taken, and the patient is diagnosed with an autosomal recessive
disease called xeroderma pigmentosum. Throughout her life, she will be prone to
developing pigmented skin cancers. Several different genes can be mutated to
generate this disease phenotype. In a person without the disease, each of these
genes contributes to the biochemical processes in the cell that respond to chemi-
cal damage to DNA and repair this damage before it leads to the formation of new
mutations. Later in this chapter, we will see how mutations in the repair systems
lead to genetic diseases such as xeroderma pigmentosum.
Persons with this disease are examples of genetic variants—individuals that
show phenotypic differences in one or more particular characters. Because genet-
ics is the study of inherited differences, genetic analysis would not be possible
without variants. In preceding chapters you saw many analyses of the inheritance
of such variants; now, we consider their origin. How do genetic variants arise?
Two major processes are responsible for genetic variation: mutation and recom-
bination. We have seen that mutation is a change in the DNA sequence of a gene.
Mutation is especially significant because it is the ultimate source of evolutionary
change; new alleles arise in all organisms, some spontaneously and others result-
ing from exposure to radiation or chemicals in the environment. The new alleles
produced by mutation become the raw material for a second level of variation,
effected by recombination. As its name suggests, recombination is the outcome of
cellular processes that cause alleles of different genes to become grouped in new
combinations (see Chapter 4). To use an analogy, mutation occasionally produces
a new playing card, but it is recombination that shuffles the cards and deals them
out as different hands.
In the cellular environment, DNA molecules are not absolutely stable: each base
pair in a DNA double helix has a certain probability of mutating. As we will see, the
term mutation covers a broad array of different kinds of changes. These changes
range from the simple swapping of one base pair for another to the disappearance of
an entire chromosome. In Chapter 17, we will consider mutational changes
Skin cancer in that affect entire chromosomes or large pieces of chromosomes. In the pres-
xeroderma pigmentosum ent chapter, we focus on mutational events that take place within individual
genes. We call such events gene mutations.
Cells have evolved sophisticated systems to identify and repair dam-
aged DNA, thereby preventing the occurrence of most but not all muta-
tions. We can view DNA as being subjected to a dynamic tug-of-war between
the chemical processes that damage DNA and lead to new mutations and
the cellular repair processes that constantly monitor DNA for such damage
and correct it. However, this tug-of-war is not straightforward. As already
mentioned, mutations provide the raw material for evolution and thus the
introduction of a low level of mutation must be tolerated. We will see that
DNA-replication and DNA-repair systems can actually introduce muta-
tions. Others turn potentially devastating mutations (such as double-strand
breaks) into mutations that may affect only a single gene product.
We will see that the most potentially serious class of DNA damage, a
double-strand break, is also an intermediate step in the normal cellular
process of recombination through meiotic crossing over. Thus, we can
draw parallels between mutation and recombination at two levels. First,
F i g u r e 16 -1 The recessive hereditary disease
as mentioned earlier, mutation and recombination are the major sources
xeroderma pigmentosum is caused by deficiencies in
one of several proteins that help correct damaged
of variation. Second, mechanisms of DNA repair and recombination have
DNA. These enzyme deficiencies lead to the some features in common, including the use of some of the same pro-
formation of skin cancers on exposure of the skin to teins. For this reason, we will explore mechanisms of DNA repair first and
ultraviolet rays in sunlight. [ KOKEL/BSIP/SuperStock.] then compare them with the mechanism of DNA recombination.
16.1 The Phenotypic Consequences of DNA Mutations 58 3
Types of mutations
Results at the molecular level
at the DNA level
Thr STOP
Nonsense Altered codon signals
mutation A C A T A G A G A G G T chain termination.
Indel
Thr Glu Glu Arg
Base insertion Frameshift
mutation A C A G A A G A G A G G T
Alters all codons
from indel until a
stop codon
Thr Arg Glu Val is encountered.
Base deletion Frameshift
mutation A C A A G A G A G G T
A
F i g u r e 16 -2 Point mutations within the coding region of a gene vary in their effects on protein function.
Proteins with synonymous and missense mutations are usually still functional.
58 4 CHAPTER 1 6 Mutation, Repair, and Recombination
replaced by another. Base substitutions can be divided into two subtypes: transi-
tions and transversions. To describe these subtypes, we consider how a mutation
alters the sequence on one DNA strand (the complementary change will take
place on the other strand). A transition is the replacement of a base by the other
base of the same chemical category. Either a purine is replaced by a purine (from
A to G or from G to A) or a pyrimidine is replaced by a pyrimidine (from C to T or
from T to C). A transversion is the opposite—the replacement of a base of one
chemical category by a base of the other. Either a pyrimidine is replaced by a
purine (from C to A, C to G, T to A, or T to G) or a purine is replaced by a pyrimi-
dine (from A to C, A to T, G to C, or G to T). In describing the same changes at the
double-stranded level of DNA, we must represent both members of a base pair in
the same relative location. Thus, an example of a transition is G • C → A • T; that
of a transversion is G • C → T • A. Insertion or deletion mutations are actually
insertions or deletions of nucleotide pairs; nevertheless, the convention is to call
them base-pair insertions or deletions. Collectively, they are termed indel muta-
tions (for insertion-deletion). The simplest of these mutations is the addition or
deletion of a single base pair. Mutations sometimes arise through the simultane-
ous addition or deletion of multiple base pairs at once. As we will see later in this
chapter, mechanisms that selectively produce certain kinds of multiple-base-pair
additions or deletions are the cause of certain human genetic diseases.
Synonymous substitutions never alter the amino acid sequence of the polypep-
tide chain. The severity of the effect of missense and nonsense mutations on the
polypeptide differs from case to case. For example, a missense mutation may replace
one amino acid with a chemically similar amino acid, called a conservative substitu-
tion. In this case, the alteration is less likely to affect the protein’s structure and
function severely. Alternatively, one amino acid may be replaced by a chemically
different amino acid in a nonconservative substitution. This type of alteration is
more likely to produce a severe change in protein structure and function. Nonsense
mutations will lead to the premature termination of translation. Thus, they have a
considerable effect on protein function. The closer a nonsense mutation is to the 3′
end of the open reading frame (ORF), the more plausible it is that the resulting
protein might possess some biological activity. However, many nonsense mutations
produce completely inactive protein products.
Single-base-pair changes that inactivate proteins are often due to splice site
mutations. As seen in Figure 16-3, such changes can dramatically change the
16.1 The Phenotypic Consequences of DNA Mutations 58 5
GT AG GT AG
mRNA transcript by leading to large insertions or deletions that may or may not
be in frame.
Like nonsense mutations, indel mutations may have consequences on poly-
peptide sequence that extend far beyond the site of the mutation itself (see
Figure 16-2). Recall that the sequence of mRNA is “read” by the translational
apparatus in register (“in frame”), three bases (one codon) at a time. The addition
or deletion of a single base pair of DNA changes the reading frame for the remain-
der of the translation process, from the site of the base-pair mutation to the next
stop codon in the new reading frame. Hence, these lesions are called frameshift F i g u r e 16 - 4 Point mutations in
coding regions can alter protein
mutations. These mutations cause the entire amino acid sequence translation-
structure with or without altering mRNA
ally downstream of the mutant site to bear no relation to the original amino acid size. Point mutations in regulatory
sequence. Thus, frameshift mutations typically result in complete loss of normal regions can prevent the synthesis of
protein structure and function (Figure 16-4). mRNA (and protein).
mRNA No mRNA
Protein
No protein
N W N W N W N W N W
Errors in DNA replication An error in DNA replication can result when an ille-
gitimate nucleotide pair (say, A–C) forms in DNA synthesis, leading to a base substi-
tution that may be either a transition or a transversion. Other errors may add or
subtract base pairs such that a frameshift mutation is created.
Transitions You saw in Chapter 7 that each of the bases in DNA can appear in one
of several tautomeric forms that can pair to the wrong base. Mismatches can also
16.2 The Molecular Basis of Spontaneous Mutations 58 9
P O P
O
O N H O H
N N N
H H C H H
C N H
N N N N N
O O O+
H H H
Guanine Guanine
O O
A mammalian cell spontaneously loses about 10,000 purines from its DNA in a
20-hour cell-cycle period at 37°C. If these lesions were to persist, they would result
in significant genetic damage because, in replication, the resulting apurinic sites
cannot specify a base complementary to the original purine. However, as we will see
later in the chapter, efficient repair systems remove apurinic sites. Under certain
conditions (to be described later), a base can be inserted across from an apurinic
site; this insertion will frequently result in a mutation.
59 0 CHAPTER 1 6 Mutation, Repair, and Recombination
5′ — CGTTTT 5′ — CTGAGAGA
3′ — GCAAAAACGTAC — 3′ — GACTCTCTCTCTGCA —
T
5′ — CG TTTT 5′ — CT GAGAGA
3′ — GC AAAAACGTAC — 3′ — GA CTCTCTCTGCA —
CT
Loop stabilized by Loop stabilized by
repetitive sequences repetitive sequences
T
5′ — CG TTTTTGCATG 5′ — CT GAGAGAGACGT
3′ — GC AAAAACGTAC — 3′ — GA CTCTCTCTGCA —
CT
Next round of Next round of
replication replication
Figure 16-8 Base additions and deletions (indel mutations) cause frameshift mutations
through the slipped mispairing of repeated sequences in the course of replication.
ANIMATED ART: Molecular mechanism of mutation
H H
N O
H
N N
Deamination
N O H2O NH3 N O
Cytosine Uracil
Unrepaired uracil residues will pair with adenine in replication, resulting in the
conversion of a G • C pair into an A • T pair (a G • C → A • T transition).
Oxidatively damaged bases represent a third type of spontaneous lesion that
generates mutations. Active oxygen species, such as superoxide radicals (O2 • − ),
hydrogen peroxide (H2O2), and hydroxyl radicals ( • OH), are produced as by-prod-
ucts of normal aerobic metabolism. They can cause oxidative damage to DNA, as
well as to precursors of DNA (such as GTP), resulting in mutation. Mutations from
oxidative damage have been implicated in a number of human diseases. Figure 16-9
shows two products of oxidative damage. The 8-oxo-7-hydrodeoxyguanosine (8-oxo
dG, or GO) product frequently mispairs with A, resulting in a high level of G • T
transversions. The thymidine glycol product blocks DNA replication if unrepaired.
(a) (CGG)n
ATG TAA
1 234 5 6 7 8 9 10 11 12 13 14 15 16 17
Normal
(CGG)6–59 ATG Normal Stable No Yes
...
Premutation Unstable,
(CGG)60–200 ATG Largely
prone to No Yes
... normal
expansion
Full mutation
(CGG)>200 ATG Affected Unstable Yes No
...
X
592 CHAPTER 1 6 Mutation, Repair, and Recombination
Mechanisms of mutagenesis
Mutagens induce mutations by at least three different mechanisms. They can
replace a base in the DNA, alter a base so that it specifically mispairs with another
base, or damage a base so that it can no longer pair with any base under normal
conditions. Mutagenizing genes and observing the phenotypic consequences is
one of the primary experimental strategies used by geneticists.
Specific mispairing Some mutagens are not incorporated into the DNA but F i g u r e 16 -12 (a) An analog of
instead alter a base in such a way that it will form a specific mispair. Certain alkyl- adenine, 2-aminopurine (2-AP) can pair
ating agents, such as ethylmethanesulfonate (EMS) and the widely used nitroso- with thymine. (b) In its protonated state,
guanidine (NG), operate by this pathway. 2-AP can pair with cytosine.
CH3
O N N H
C N
O
NO2
H5C2 O S CH3 N
O H
EMS NG
Such agents add alkyl groups (an ethyl group in EMS and a methyl group
in NG) to many positions on all four bases. However, the formation of a mutation
is best correlated with an addition to the oxygen at position 6 of guanine to create
an O-6-alkylguanine. This addition leads to direct mispairing with thymine, as
shown in Figure 16-13, and would result in G • C → A • T transitions at the next
round of replication.
59 4 CHAPTER 1 6 Mutation, Repair, and Recombination
O O
N N O CH3 Intercalating agents The intercalating agents form
6 6 another important class of DNA modifiers. This group
N 1NH N G•C A•T
EMS
1N H N3
1 of compounds includes proflavin, acridine orange, and
N N N a class of chemicals termed ICR compounds (Figure
N H N H O
16-14a). These agents are planar molecules that mimic
H H base pairs and are able to slip themselves in (interca-
Guanine O -6-Ethylguanine Thymine
late) between the stacked nitrogen bases at the core of
the DNA double helix (Figure 16-14b). In this interca-
lated position, such an agent can cause an insertion or
H3C CH3 deletion of a single nucleotide pair.
H3C O EMS N
H3C O O
4
4 T•A Base damage A large number of mutagens damage
C•G
1 3NH 3N H N N one or more bases, and so no specific base pairing is
1
N N
N possible. The result is a replication block because DNA
O O H N synthesis will not proceed past a base that cannot
H specify its complementary partner by hydrogen bond-
Thymine O-4-Ethylthymine Guanine ing. Replication blocks can cause further mutation—as
will be explained later in the chapter (see the section
F i g u r e 16 -13 Treatment with EMS on base-excision repair).
alters the structure of guanine and thymine Ultraviolet light usually causes damage to nucleotide bases in most organisms.
and leads to mispairings. Ultraviolet light generates a number of distinct types of alterations in DNA, called
photoproducts (from the word photo, for “light”). The most likely of these products
to lead to mutations are two different lesions that unite adjacent pyrimidine resi-
dues in the same strand. These lesions are the cyclobutane pyrimidine photodi-
mer and the 6-4 photoproduct (Figure 16-15).
Ionizing radiation results in the formation of ionized and excited molecules
that can damage DNA. Because of the aqueous nature of biological systems, the
molecules generated by the effects of ionizing radiation on water produce the
most damage. Many different types of reactive oxygen species are produced, but
the most damaging to DNA bases are • OH, O2−, and H2O2. These species lead to
the formation of different adducts and degradation products. Among the most
F i g u r e 16 -14 Structures of common
intercalating agents (a) and their interaction
prevalent, pictured in Figure 16-9, are thymidine glycol and 8-oxo dG, both of
with DNA (b). [ Data from L. S. Lerman, “The which can result in mutations.
Structure of the DNA–Acridine Complex,” Proc. Ionizing radiation can also damage DNA directly rather than through reactive
Natl. Acad. Sci. USA 49, 1963, 94.] oxygen species. Such radiation may cause breakage of the N-glycosydic bond,
Intercalating agents
H CH2CH2Cl
N
CH2
CH2
H CH2
H3C CH3 N Nitrogenous
bases
H2N N NH2 N N N OCH3
H3C CH3 Intercalated
Cl molecule
N
Proflavin Acridine orange ICR-191
(a) (b)
16.3 The Molecular Basis of Induced Mutations 59 5
UV-light-generated photoproducts
5′ 5′
O
O H 2C
O H
O P O 2
N
3
N1 4 O
O 3′ 6 5
H2C H CH3
O
O O
O
O P
O O
O O CH3
H2C
O P CH2
O 2 3
N1 5 4 O
O O 6
N
H2C O P O O H
O N C O
O C4
3 2
1N O O
5 6 O
C C O P
H3C
O O
O O
O P CH2 3′
O O
H2C P O
O
O N C O
O C4
3 2
1N
O
5 6
C C
H3C H2C
O O O H
O O N
O P Cyclobutyl ring CH2 N O
O O
O H CH3
H2C P O H C
O O 3 OH
O
O O P N
O
H2C N O
O O
O
H CH2
T (6-4) T
3′ O
P O O
O O
O O P
(a) Cyclobutane pyrimidine dimer 5′ O (b) 6-4 Photoproduct
leading to the formation of apurinic or apyrimidinic sites, and it can cause strand F i g u r e 16 -15 Photoproducts that
breaks. In fact, strand breaks are responsible for most of the lethal effects of ion- unite adjacent pyrimidines in DNA are
izing radiation. strongly correlated with mutagenesis.
Aflatoxin B1 is a powerful carcinogen that attaches to guanine at the N-7 posi-
tion (Figure 16-16). The formation of this addition product leads to the breakage
of the bond between the base and the sugar, thereby liberating the base and gen-
erating an apurinic site. Aflatoxin B1 is a member of a class of chemical carcino-
gens known as bulky addition products when they bind covalently to DNA. Other
examples include the diol epoxides of benzo(a)pyrene, a compound produced by
internal combustion engines. All compounds of this class induce mutations,
although by what mechanisms is not always clear.
S9 S9 Strain X
2
+ +
are able to excise mismatched bases that have been inserted erro-
Ames test of aflatoxin B1 mutagenicity
neously. Let’s now examine some of the other repair pathways,
beginning with error-free repair. 2000
TA100
Direct reversal of damaged DNA
The most straightforward way to repair a lesion is to reverse it
directly, thereby regenerating the normal base (Figure 16-19).
1500
Although most types of damage are essentially irreversible, lesions
Revertant colonies per plate
No mutation
CPD photolyase
UV light
Light
>300 nm
Photodimer
hot spots in the lacI gene showed that 5-methylcytosine residues are present at each
hotspot. Recall from Chapter 12 that eukaryotic DNA may be methylated to inactive
genes. Similarly, E. coli and other bacteria also methylate their DNA, although for
different purposes. Some of the data from this lacI study are shown in Figure 16-21.
The height of each bar on the graph represents the frequency of mutations at each
site. The positions of 5-methylcytosine residues can be seen to correlate nicely with
the most mutable sites.
Why are 5-methylcytosines hot spots for mutations? The deamination of
5-methylcytosine generates thymine (5-methyluracil):
H H
N O
H
H3C H3C
4
5 3N Deamination N
6 2
1
N O H2O NH3 N O
5-Methylcytosine Thymine
Thymine is not recognized by the enzyme uracil-DNA glycosylase and thus is not
repaired. Therefore, C → T transitions generated by deamination are seen more
frequently at 5-methylcytosine sites because they escape this repair system. A
consequence of the frequent mutation of 5-methylcytosine to thymine is that
methylated regions of the genome (which are usually transcriptionally inactive;
see Chapter 12) are converted, over evolutionary time, to AT-rich regions. In con-
trast, coding and regulatory regions, which are less methylated, remain GC rich.
15
*
* G .C A .T
Number of occurrences
*
10
5
*
correct bulky adducts that distort the DNA helix, adducts such as the cyclobutane
pyrimidine dimers caused by UV light (see Figure 16-15), nor correct damage to
more than one base. A DNA polymerase cannot continue DNA synthesis past
such lesions, and so the result is a replication block. A blocked replication fork can
cause cell death. Similarly, an abnormal or damaged base can stall the transcrip-
tion complex. To cope with both of these situations, prokaryotes and eukaryotes
utilize an extremely versatile pathway called nucleotide-excision repair (NER)
that is able to relieve replication and transcription blocks and repair the damage.
Interestingly, two autosomal recessive diseases in humans, xeroderma
pigmentosum (XP) and Cockayne syndrome, are caused by defects in nucleotide-
excision repair. Although patients with either XP or Cockayne syndrome are
exceptionally sensitive to UV light, other symptoms are dramatically different.
Xeroderma pigmentosum was introduced at the beginning of this chapter and is
characterized by the early development of cancers, especially skin cancer and, in
some cases, neurological defects. In contrast, patients afflicted with Cockayne
syndrome have a variety of developmental disorders including dwarfism, deaf-
ness, and retardation. In broad terms, XP patients get early cancer, whereas Cock-
ayne syndrome patients age prematurely. How can defects in the same repair
pathway lead to such different disease symptoms? Although there is no simple
answer to this question, work on the genetic basis of these diseases has led to the
identification of important proteins in the NER pathway.
Nucleotide-excision repair is a complex process that requires dozens of pro-
teins. Despite this complexity, the repair process can be divided into four phases:
1. Recognition of damaged base(s)
2. Assembly of a multiprotein complex at the site
3. Cutting of the damaged strand several nucleotides upstream and downstream
of the damage site and removal of the nucleotides (~30) between the cuts
4. Use of the undamaged strand as a template for DNA polymerase followed by
strand ligation
The fact that, as already mentioned, both stalled replication forks and stalled
transcription complexes activate this repair pathway implies that there are two
types of nucleotide-excision repair that differ in damage recognition (step 1). We
now know that one type, called global genomic nucleotide-excision repair
(GG-NER) corrects lesions anywhere in the genome and is activated by stalled
replication forks. The other type repairs transcribed regions of DNA and is called,
not surprisingly, transcription-coupled nucleotide-excision repair (TC-NER).
As can be seen in Figure 16-22, although the recognition step differs, both GG-
NER and TC-NER share the last four steps.
At this point, you might be asking yourself, What if differences in the disease
symptoms of XP and Cockayne syndrome were due to mutations in different
classes of recognition proteins? You would be on the right track in asking that
question. Patients with XP fall into 8 complementation groups, carrying muta-
tions in one of 8 genes encoding proteins XPA through XPG (Figure 16-22).
Patients with Cockayne syndrome have a mutation in one of two proteins called
CSA and CSB, which are thought to recognize stalled transcription complexes.
GG-NER is initiated when a protein complex of XPC and RAD23B recognizes
a distorted double helix caused by a damaged base and binds to the opposite
strand. In contrast, TC-NER is initiated when an RNA polymerase complex is
stalled by a DNA lesion in the transcribed strand and CSA and CSB bind at this
site to form a recognition complex. After lesion recognition, the GG-NER and TC-
NER pathways utilize largely the same proteins to remove and repair the dam-
aged DNA because the role of both XPC-RAD23B and CSA/CSB is to attract the
multiprotein TFIIH complex. Two of its 10 subunits, XPB and XPD, are helicases
16.4 Biological Repair Mechanisms 6 01
Step 1: The nucleotide-excision repair (NER) pathway is activated when bulky adducts or multiple damaged bases are recognized in
nontranscribed (GG-NER) or transcribed (TC-NER) regions of the genome. In the GG-NER pathway, XPA-RAD23 binds to a lesion.
In the TC-NER pathway, CSA and CSB bind to RNA polymerase II complexes that are stalled at a lesion.
Global genomic nucleotide-excision repair (GG-NER) Transcription-coupled NER (TC-NER)
RNA
Recognition of polymerase Recognition of stalled
damaged base transcription complex
XPC-RAD23B
CSA
CSB
ERCC1
XPF XPG
XPA
(3′-to-5′ and 5′-to-3′, respectively) that unwind and open the DNA helix around
the lesion. Subsequent steps common to GG-NER and TC-NER mediate the cleav-
age and excision of the damaged base and as many as 30 adjacent nucleotides
followed by DNA synthesis to fill the gap (see details in Figure 16-22). In addition
to XPC-RAD23D, XPB, and XPD, XP patients harbor mutations in other proteins
that participate in the common steps of NER. As shown in Figure 16-22, XPA pro-
motes the release of the CAK subunit and the binding of RPA while endonucle-
ases XPF (with ERCCI) and XPG cut 5′ and 3′, respectively, of the DNA damage.
After removal of the damaged base and surrounding DNA, the gap is filled by a
DNA polymerase assisted by the RFC and PCNA proteins. The last step of NER
involves ligation of the new strand to the surrounding DNA by one of two ligation
complexes (XRCC1/LIG3 or FEN1/LIG1).
Can the molecular differences between GG-NER and TC-NER provide an
explanation for the different symptoms displayed by patients with XP and Cock-
ayne syndrome? Recall that XP patients develop early cancers, whereas Cockayne
syndrome patients have a variety of symptoms associated with premature aging.
We have seen that the repair system of Cockayne syndrome patients cannot rec-
ognize stalled transcription complexes. A consequence of this defect is that the
cell is more likely to activate the apoptosis suicide pathway. In a healthy person,
cell death is often preferable to the propagation of a cell that has sustained DNA
damage. However, according to this theory, the cell-death pathway would be acti-
vated more frequently in a Cockayne syndrome patient, thus leading to a variety
of premature-aging symptoms. In contrast, XP patients can recognize stalled tran-
scription complexes (they have normal CSA and CSB proteins) and prevent cell
death when transcription is restarted. However, they cannot repair the original
damage because of mutations in one of their XP proteins. Thus, mutations will
accumulate in the cells of patients with XP and, as stated earlier in this chapter,
the presence of mutations, whether caused by mutagens or the failure of repair
pathways, increases the risk of developing many types of cancer.
E. coli Model Organism box on page 180). Especially noteworthy Mismatch repair corrects
was the reconstitution of the mismatch-repair system in the test replicative errors
tube in the laboratory of Paul Modrich. Conservation of many of
the mismatch-repair proteins from bacteria to yeast to human
indicates that this pathway is both ancient and important in all 5′
living organisms. Recently, the human mismatch-repair system 3′ G M
T GATC
also was reconstituted in the test tube in the Modrich laboratory. CTAG 3′
The ability to study the details of the reaction will spur future
GATC 5′
studies of the human pathway. However, for now we will focus on G CTAG
5′ C
the very well characterized E. coli system (Figure 16-23). M
3′
The first step in mismatch repair is the recognition of the
damage in newly replicated DNA by the MutS protein. The bind- MutS recognizes
ing of this protein to distortions in the DNA double helix caused mismatched pair.
by mismatched bases initiates the mismatch-repair pathway by
5′ MutS
attracting three other proteins to the site of the lesion [MutL,
3′ M
MutH, and UvrD (not shown)]. The key protein is MutH, which G
GATC
T
performs the crucial function of cutting the strand containing the CTAG 3′
incorrect base. Without this ability to discriminate between the GATC 5′
correct and the incorrect bases, the mismatch-repair system G CTAG
5′ C
could not determine which base to excise to prevent a mutation M
3′
from arising. If, for example, a G–T mismatch occurs as a replica-
MutH recognizes
tion error, how can the system determine whether G or T is incor- methylated parent
rect? Both are normal bases in DNA. But replication errors strand and nicks
produce mismatches on the newly synthesized strand, and so the daughter strand.
mismatch-repair system replaces the base on that strand.
How does mismatch repair distinguish the newly synthesized MutL MutH
5′
strand from the old one? Recall from Chapter 12 that cytosine 3′
G M
bases are often methylated in eukaryotes and that this so-called T GATC
CTAG 3′
epigenetic mark is propagated from parent to daughter strand
soon after replication. E. coli DNA also is methylated, but the GATC 5′
G CTAG
methyl groups relevant to mismatch repair are added to adenine 5′ C
M
bases. To distinguish the old template strand from the newly syn- 3′
thesized strand, the bacterial repair system takes advantage of a New strand is excised
delay in the methylation of the following sequence: and replaced between
nick and mismatched pair.
5′-G-A-T-C-3′
3′-C-T-A-G-5′ 5′
3′ G M
The methylating enzyme is adenine methylase, which creates C GATC
6-methyladenine on each strand. However, adenine methylase CTAG 3′
requires several minutes to recognize and modify the newly syn- G A T C 5′
thesized GATC stretches. In that interval, the MutH protein nicks G CTAG
5′ C
the methylation site on the strand containing the A that has not M
3′
yet been methylated. This site can be several hundred base pairs
away from the mismatched base. After the site has been nicked,
the UrvD protein binds at the nick and uses its helicase activity to
unwind the DNA. A protective single-strand-binding protein Figure 16-23 Model for mismatch repair
coats the unwound parental strand while the part of the new strand between the in E. coli. DNA is methylated (Me) at the
mismatch and the nick is excised. A residue in the sequence GATC. DNA
Many of the proteins in E. coli mismatch repair are conserved in human mis- replication yields a hemimethylated duplex
that exists until methylase can modify the
match repair. Nonetheless, how eukaryotes recognize and repair only the newly
newly synthesized strand. The mismatch-
replicated strand is still unknown. The problem is particularly perplexing in organ- repair system makes any necessary
isms that lack most or all DNA methylation such as yeast, Drosophila, and C. elegans. corrections based on the sequence found
A popular model proposes that discrimination is based on the recognition of free on the methylated strand (original template).
3′ ends that characterize the newly synthesized leading and lagging strands. MutS,
Introduction to Genetic MutH,11e
Analysis, and MutL are proteins.
An important target of the human mismatch system is short repeatFiguresequences
16.23 #1630
that can be expanded or deleted in replication by the slipped-mispairing07/01/14
mechanism
Dragonfly Media Group
6 0 4 CHAPTER 1 6 Mutation, Repair, and Recombination
described previously (see Figure 16-8). Mutations in some of the components of this
pathway have been shown to be responsible for several human diseases, especially
cancers. There are thousands of short repeats (microsatellites) located throughout
the human genome (see Chapter 4). Although most are located in noncoding regions
(given that most of the genome is noncoding), a few are located in genes that are
critical for normal growth and development.
Therefore, defects in the human mismatch-repair pathway would be pre-
dicted to have very serious disease consequences. This prediction has turned out
to be true, a case in point being a syndrome called hereditary nonpolyposis
colorectal cancer (HNPCC), which, despite its name, is not a cancer itself but
increases cancer risk. One of the most common inherited predispositions to can-
cer, the disease affects as many as 1 in 200 people in the Western world. Studies
have shown that HNPCC results from a loss of the mismatch-repair system due in
large part to inherited mutations in genes that encode the human counterparts
(and homologs) of the bacterial MutS and MutL proteins (see Figure 16-23). The
inheritance of HNPCC is autosomal dominant. Cells with one functional copy of
the mismatch-repair genes have normal mismatch-repair activity, but tumor cell
lines arise from cells that have lost the one functional copy and are thus mismatch
deficient. These cells display high mutation rates owing in part to an inability to
correct the formation of indels in replication.
active form of this protein. In this situation, RecA acts as a signal Translesion synthesis bypasses
that leads to the induction of several genes that are now known to lesions at stalled replication forks
encode members of a newly discovered family of DNA polymer-
ases that can bypass the replication block and are distinct from
Beta clamp Pol III DNA damage
replicative polymerases. DNA polymerases that can bypass repli-
cation stalls have also been found in diverse taxa of eukaryotes
ranging from yeast to human. These eukaryotic polymerases con-
tribute to a damage-tolerance mechanism called translesion Pol III stalls at site of damage.
DNA synthesis that resembles the SOS bypass system in E. coli. Rec A
These translesion, or bypass, polymerases, as they have
come to be known, differ from the main replicative polymerases in
several ways. First, they can tolerate unusually large adducts on the Bypass polymerase replaces Pol III.
bases. Whereas the replicative polymerase stalls if a base does not
fit into an active site, the bypass polymerases have much larger
pockets that can accommodate damaged bases. Second, in some
situations, the bypass polymerases have a much higher error rate, Pol V
in part because they lack the 3′-to-5′ proofreading activity of the
main replicative polymerases. Third, they can only add a few
nucleotides before falling off. This feature is attractive because the
main function of an error-prone polymerase is to unblock the rep-
lication fork, not to synthesize long stretches of DNA that could
Bypass polymerase continues DNA synthesis.
contain many mismatches.
Several bypass polymerases that appear to be always present
in eukaryotic cells are now known. Because they are always pres-
ent, their access to DNA must be regulated so that they are used
Bypass polymerase falls off.
only when needed. The cell has evolved a neat solution to this
problem. Recall that an integral part of the replisome is the PCNA
(proliferating cell nuclear antigen) protein that functions as a
sliding clamp to orchestrate the myriad events at the replication Pol V Pol III
fork (see Figure 7-20). One critical protein present at a stalled
replication fork is Rad6, which, curiously, is an enzyme that adds Pol III continues synthesis.
ubiquitin to proteins (Figure 16-25). As described in Chapter 9,
the addition of chains of many ubiquitin monomers serves to tar-
get a protein for degradation (see Figure 9-23). In contrast, the
binding of a single ubiquitin monomer to PCNA changes its con-
formation so that it can now bind the bypass polymerase and
orchestrate translesion synthesis. Enzymatic removal of the ubiq-
uitin tag on PCNA leads to the dissociation of the bypass polymerase and the F i g u r e 16 -2 4 A model for translesion
eventual restoration of normal replication. Any base mismatch due to translesion synthesis in E. coli. In the course of
synthesis still has a chance of detection and correction by the mismatch-repair replication, DNA polymerase III is
temporarily replaced by a bypass
pathway.
polymerase (pol V) that can continue
The regulation of PCNA function by the addition and removal of ubiquitin replicating past a lesion. Bypass
monomers illustrates the importance of post-translational modifications in polymerases are error prone. The bacterial
eukaryotes. If base damage in the template strand is not corrected quickly, β clamp (red protein) is equivalent to the
the stalled replication fork will signal the activation of the cell-death pathway. A eukaryotic PCNA. [ Data from E. C. Friedberg,
eukaryotic cell cannot wait for the de novo synthesis of bypass polymerases fol- A. R. Lehmann, and R. P. Fuchs, “Trading
lowing transcription and translation as occurs in the E. coli SOS system. Instead, Places: How Do DNA Polymerases Switch
During Translesion DNA Synthesis?” Molec.
eukaryotic bypass polymerases are constitutively transcribed and are always pres-
Cell 18, 2005, 499–505.]
ent; their access to the replication fork is controlled by rapid and reversible post-
translational modifications.
derived from a child with a rare inherited disorder were in for a surprise. Although
they were able to demonstrate that cell line 2BN was defective for double-strand-
break repair, they were not able to restore the repair system and produce the
wild-type phenotype by genetic complementation with any of the genes encod-
ing NHEJ proteins. That is, when they introduced wild-type genes encoding
known NHEJ proteins (for example, KU70, KU80, ligase IV) into the 2BN line, the
cell line was still defective in the repair of double-strand breaks. This negative result
indicated that cell line 2BN carried a mutation in an unknown NHEJ protein.
In the era of genomics, the identification of proteins linked to diseases is becom-
ing more common because of the wide availability of cell lines from persons with
disease phenotypes. When we humans have a health problem, we go to a doctor and
tell him or her about our symptoms, including information about relatives with
similar problems. Such information is of increasing importance in the genomics era
with its ever-expanding genetic toolbox that can often be used to identify mutant
genes associated with inherited disorders (see Chapters 10 and 14).
What can geneticists do in the laboratory to find a protein, such as the W h at Geneticists
unknown NHEJ protein, that has not yet been identified? Two laboratories using A r e D o in g t o day
very different approaches succeeded in identifying the new NHEJ component;
one approach will be described here because it has been successfully employed to
discover several other proteins. As noted in preceding chapters, many cellular
proteins perform their jobs by interacting with other proteins. Chapter 14, for
example, described the yeast two-hybrid test used to identify proteins that inter-
act with a protein of interest. In the case under consideration now, the protein of
interest was the NHEJ component XRCC4 (see Figure 16-26), and the two-hybrid
test identified a 33-kD interacting protein that was encoded by an uncharacterized
human open reading frame. That two proteins interact in the yeast two-hybrid test Error-prone nonhomologous
does not necessarily mean that these proteins interact in human cells. To establish end joining repairs
a connection between the 33-kD protein and the NHEJ pathway, the geneticists double-strand breaks
used another valuable technique from their toolbox, RNAi (see Chapter 8). In this
case, they demonstrated that normal cells expressing antisense RNA from the ORF Double-strand break
that encodes the 33-kD protein, which would prevent translation of this gene into
protein, were now defective in the execution of the NHEJ pathway.
This story came full circle when the 2BN cells defective in double-strand KU80 End binding by
repair were shown to lack the 33-kD protein. Expression of this protein corrected and protein complex
the cellular defects. KU70
DNA synthesis. This process is called strand invasion. The 3′ end of the invading
Error-free repair of
strand displaces one of the undamaged sister chromatids, which forms a D-loop (for
double-strand breaks by SDSA
displacement), and primes DNA synthesis from its free 3′ end. New DNA synthesis
continues from both 3′ ends until both strands unwind from their templates and
anneal. Ligation seals the nicks, leaving a repaired patch of DNA that has one very
distinctive feature: it has been replicated by a conservative process. That is, both
strands are newly synthesized, which stands in marked contrast to the semiconser-
End trimming vative replication of most DNA (see Chapter 7).
End trimming
(a) (b)
F i g u r e 16 - 3 0 Scanning electron chapter, many mutagenic agents such as chemicals and radiation cause cancer,
micrographs of (a) normal cells and suggesting that they produce cancer by introducing mutations into genes. Second,
(b) cells transformed by Rous sarcoma and most importantly, mutations that are frequently associated with particular
virus, which infects cells with the src
kinds of cancers have been identified.
oncogene. (a) A normal cell line called
3T3. Note the organized monolayer
Two general kinds are associated with tumors: oncogene mutations and muta-
structure of the cells. (b) A transformed tions in tumor-suppressor genes. Oncogene mutations act in the cancer cell as gain-
derivative of 3T3. Note how the cells are of-function dominant mutations (see Chapter 6 for a discussion of dominant
rounder and piled up on one another. mutations). That statement suggests two key characteristics of oncogene mutations.
[ From Victor R. Ambros, Lan Bo Chen, and First, the proteins encoded by oncogenes are usually activated in tumor cells, and,
John M. Buchanan, “Surface Ruffles as second, the mutation need be present in only one allele to contribute to tumor for-
Markers for Studies of Cell Transformation by
mation. The gene in its normal, unmutated form is called a proto-oncogene.
Rous Sarcoma Virus,” Proc. Nat. Acad. Sci.
USA 72, No. 8, 3144–3148, August 1975, Cell
Mutations in tumor-suppressor genes that promote tumor formation are
Biology, p. 3144, Figure 1A and 1B.] loss-of-function recessive mutations. That is, this type of mutation causes the
encoded gene products to lose much or all of their activity (that is, the mutation
is a null mutation). Moreover, for cancer to develop, the mutation must be pres-
ent in both alleles of the gene.
Inactive Ras
GDP
GTP
GDP
ras oncogene DNA GGC GCC GTC GGT GTG GGC GTP
Val Ras oncoprotein
(a)
is blocked here.
Signal remains on.
Continuously
activates signal
(b) to proliferate.
Mutations in the p53 gene are associated with many types of tumors. In fact,
estimates are that 50 percent of human tumors lack a functional p53 gene. The
active p53 protein is a transcriptional regulator that is activated in response to
DNA damage. Activated wild-type p53 serves double duty: it prevents the pro-
gression of the cell cycle until the DNA damage is repaired, and, under some cir-
cumstances, it induces apoptosis. If no functional p53 gene is present, the cell
cycle progresses even if damaged DNA has not been repaired. The progression of
the cell cycle into mitosis elevates the overall frequency of mutations, chromo-
somal rearrangements, and aneuploidy and thus increases the chances that other
mutations that promote cell proliferation or block apoptosis will arise.
It is now clear that mutations able to elevate the mutation rate are important
contributors to the progression of tumors in humans. These mutations are reces-
sive mutations in tumor-suppressor genes that normally function in DNA-repair
pathways. Mutations in these genes thus interfere with DNA repair. They pro-
mote tumor growth indirectly by elevating the mutation rate, which makes it
much more likely that a series of oncogene and tumor-suppressor mutations will
arise, corrupting the normal regulation of the cell cycle and programmed cell
death. Large numbers of such tumor-suppressor-gene mutations have been iden-
tified, including some associated with heritable forms of cancer in specific tissues.
Examples are the BRCA1 and BRCA2 mutations and breast cancer.
s u mmary
DNA change within a gene (point mutation) generally entails occur in somatic cells are the source of many human cancers.
one or a few base pairs. Single-base-pair substitutions can cre- Many biological pathways have evolved to correct the broad
ate missense codons or nonsense (translation termination) spectrum of spontaneous and induced mutations. Some path-
codons. A purine replaced by the other purine (or a pyrimi- ways, such as base- and nucleotide-excision repair and mis-
dine replaced by the other pyrimidine) is a transition. A match repair, use the information inherent in base comple-
purine replaced by a pyrimidine (or vice versa) is a transver- mentarity to execute error-free repair. Other pathways that
sion. Single-base-pair additions or deletions (indels) produce use bypass polymerases to correct damaged bases can intro-
frameshift mutations. Certain human genes that contain tri- duce errors in the DNA sequence.
nucleotide repeats—especially those that are expressed in The correction of double-strand breaks is particularly
neural tissue—become mutated through the expansion of important because these lesions can lead to destabilizing
these repeats and can thus cause disease. The formation of chromosomal rearrangements. Nonhomologous end joining
monoamino acid repeats within the polypeptides encoded by is a pathway that ligates broken ends back together so that a
these genes is often responsible for the mutant phenotypes. stalled replication fork does not result in cell death. In rep-
Mutations can occur spontaneously as a by-product of licating cells, double-strand breaks can be repaired in an
normal cellular processes such as DNA replication or error-free manner by the synthesis-dependent strand-
metabolism, or they can be induced by mutagenic radiation annealing pathway, which utilizes the sister chromatid to
or chemicals. Mutagens often result in a specific type of repair the break.
change because of their chemical specificity. For example, Hundreds of programmed double-strand breaks initiate
some produce exclusively G • C → A • T transitions; others, meiotic crossing over between nonsister chromatids. Just
exclusively frameshifts. like other double-strand breaks, the meiotic breaks must be
Although mutations are necessary to generate diversity, processed quickly and efficiently to prevent serious conse-
many mutations are associated with inherited genetic diseases quences such as cell death and cancer. Just how this repair
such as xeroderma pigmentosum. In addition, mutations that is done is still being explored.
Problems 613
k e y t e rms
Ames test (p. 596) mismatch repair (p. 602) spontaneous lesion (p. 589)
apoptosis (p. 609) missense mutation (p. 584) spontaneous mutation (p. 586)
apurinic site (p. 589) mutagen (p. 586) synonymous mutation (p. 584)
base analog (p. 593) mutagenesis (p. 593) synthesis-dependent strand annealing
base-excision repair (p. 598) nonconservative substitution (SDSA) (p. 607)
bypass (translesion) polymerase (p. 584) transcription-coupled nucleotide-
(p. 605) nonhomologous end joining (NHEJ) excision repair (TC-NER)
cancer (p. 609) (p. 606) (p. 600)
Cockayne syndrome (p. 600) nonsense mutation (p. 584) transition (p. 584)
double-strand break (p. 606) nucleotide-excision repair (NER) translesion (bypass) polymerase
fluctuation test (p. 587) (p. 600) (p. 605)
frameshift mutation (p. 585) oncogene (p. 610) translesion DNA synthesis (p. 605)
global genomic nucleotide-excision oncoprotein (p. 610) transversion (p. 584)
repair (GG-NER) (p. 600) point mutation (p. 583) trinucleotide repeat (p. 591)
indel mutation (p. 589) proto-oncogene (p. 610) tumor-suppressor gene (p. 610)
induced mutation (p. 586) replica plating (p. 588) xeroderma pigmentosum (XP)
intercalating agent (p. 594) SOS system (p. 604) (p. 600)
s o lv e d pr o b l e ms
SOLVED PROBLEM 1. In Chapter 9, we learned that UAG and site in the protein, aflatoxin B1 could revert UAG codons.
UAA codons are two of the chain-terminating nonsense trip- Aflatoxin B1 would not revert UAA codons because no
lets. On the basis of the specificity of aflatoxin B1 and ethyl- G • C base pairs appear at the corresponding position in
methanesulfonate (EMS), describe whether each mutagen the DNA.
would be able to revert these codons to wild type
SOLVED PROBLEM 2. Explain why mutations induced by
Solution acridines in phage T4 or by ICR-191 in bacteria cannot be
EMS induces primarily G • C → A • T transitions. UAG reverted by 5-bromouracil.
codons could not be reverted to wild type because only the
UAG → UAA change would be stimulated by EMS and that Solution
generates a nonsense (ochre) codon. UAA codons would not Acridines and ICR-191 induce mutations by deleting or add-
be acted on by EMS. Aflatoxin B1 induces primarily G • C → ing one or more base pairs, which results in a frameshift.
T • A transversions. Only the third position of UAG codons However, 5-bromouracil induces mutations by causing the
would be acted on, resulting in a UAG → UAU change (on substitution of one base for another. This substitution can-
the mRNA level), which produces tyrosine. Therefore, if not compensate for the frameshift resulting from ICR-191
tyrosine were an acceptable amino acid at the corresponding and acridines.
pr o b l e ms
Most of the problems are also available for review/grading through the http://www.whfreeman.com/
launchpad/iga11e.
W o r k in g wit h t h e F i g u r e s
1. In Figure 16-3a, what is the consequence of the new 5′ 3. In the Ames test shown in Figure 16-17, what is the rea-
splice site on the open reading frame? In 16-3b, how big son for adding the liver extract to each sample?
could the intron be to maintain the reading frame (let’s 4. Based on the mode of action of aflatoxin (Figure 16-16),
say between 75 and 100 bp)? propose a scenario that explains its response in the Ames
2. Using Figure 16-4 as an example, compare the migration test (Figure 16-18).
of RNA and protein for the wild-type gene and the muta- 5. In Figure 16-22, point out the mutant protein(s) in pa-
tion shown in Figure 16-3b. Assume that the retained tients with Cockayne syndrome. What protein(s) is/are
intron maintains the reading frame. mutant in patients with XP? How are these different
614 CHAPTER 1 6 Mutation, Repair, and Recombination
mutations thought to account for the different disease 18. Under what circumstances could nonhomologous end
symptoms? joining be said to be error prone?
6. The MutH protein nicks the newly synthesized strand 19. Why are many chemicals that test positive by the Ames
(Figure 16-23). How does it “know” which strand this is? test also classified as carcinogens?
7. What features of the bypass polymerase make it ideal for 20. The Spo11 protein is conserved in eukaryotes. Do you
its role in translesion synthesis, shown in Figure 16-24? think it is also conserved in bacterial species? Justify your
answer.
B asic P r o b l e ms
21. Differentiate between the elements of the following
8. Consider the following wild-type and mutant sequences: pairs:
Wild-type ....CTTGCAAGCGAATC.... a. Transitions and transversions
Mutant ....CTTGCTAGCGAATC.... b. Synonymous and neutral mutations
The substitution shown seems to have created a stop c. Missense and nonsense mutations
codon. What further information do you need to be d. Frameshift and nonsense mutations
confident that it has done so? 22. Describe two spontaneous lesions that can lead to
9. What type of mutation is depicted by the following se- mutations.
quences (shown as mRNA)? 23. What are bypass polymerases? How do they differ from
Wild type ....5′ AAUCCUUACGGA 3′.... the replicative polymerases? How do their special fea-
tures facilitate their role in DNA repair?
Mutant ....5′ AAUCCUACGGA 3′....
24. In adult cells that have stopped dividing, what types of
10. Can a missense mutation of proline to histidine be repair systems are possible?
made with a G • C → A • T transition-causing mutagen?
25. A certain compound that is an analog of the base cytosine
What about a proline-to-serine missense mutation?
can become incorporated into DNA. It normally hydro-
11. By base-pair substitution, what are all the synonymous gen bonds just as cytosine does, but it quite often isomer-
changes that can be made starting with the codon CGG? izes to a form that hydrogen bonds as thymine does. Do
12. a. What are all the transversions that can be made start- you expect this compound to be mutagenic, and, if so,
ing with the codon CGG? what types of changes might it induce at the DNA level?
b. Which of these transversions will be missense? Can 26. Two pathways, homologous recombination and nonho-
you be sure? mologous end joining (NHEJ), can repair double-strand
breaks in DNA. If homologous recombination is an error-
13. a. Acridine orange is an effective mutagen for produc- free pathway whereas NHEJ is not always error free, why
ing null alleles by mutation. Why does it produce null is NHEJ used most of the time in eukaryotes?
alleles?
27. Which repair pathway recognizes DNA damage during
b. A certain acridine-like compound generates only transcription? What happens if the damage is not
single insertions. A mutation induced with this repaired?
compound is treated with the same compound, and
some revertants are produced. How is this outcome 28. Where in a gene would a 4-bp insertion mutation have
possible? the least effect on gene expression?
a. Introns
14. Defend the statement “Cancer is a genetic disease.”
b. Exons
15. Give an example of a DNA-repair defect that leads to
cancer. c. Regulatory regions
16. In mismatch repair in E. coli, only a mismatch in the d. Introns and exons
newly synthesized strand is corrected. How is E. coli able 29. Which of the following gene mutations is most likely to
to recognize the newly synthesized strand? Why does have the most severe impact on gene expression?
this ability make biological sense? a. A nonsense mutation in the last exon
17. A mutational lesion results in a sequence containing a b. A point mutation in an exon
mismatched base pair: c. A point mutation in the splice donor site of an
5′ AGCTGCCTT 3′ intron
3′ ACGATGGAA 5′ d. A point mutation in the middle of an intron
Codon 30. Which of the following is not possible?
If mismatch repair occurs in either direction, which a. A nonsynonymous mutation in an intron
amino acids could be found at this site? b. A nonsynonymous mutation in an exon
Problems 615
c. An indel mutation in an intron When this type is crossed with a standard wild-type
d. An indel mutation in an exon strain, the progeny consist of 90 percent prototrophs and
10 percent auxotrophs. Give a full explanation for these
31. Which of the following is/are associated with spontane-
results, including a precise reason for the frequencies
ous mutation?
observed.
a. An occurrence of lung cancer due to smoking
35. You are using nitrosoguanidine to “revert” mutant nic-2
b. A nonsense mutation in an exon caused by an error (nicotinamide-requiring) alleles in Neurospora.
in DNA replication
You treat cells, plate them on a medium without nico-
c. An indel mutation in an intron caused by replication
tinamide, and look for prototrophic colonies. You obtain
slippage
the following results for two mutant alleles. Explain
d. A nonsense mutation in an exon caused by an error these results at the molecular level, and indicate how you
in DNA replication, and an indel mutation in an intron would test your hypotheses.
caused by replication slippage
a. With nic-2 allele 1, you obtain no prototrophs at all.
32. Which of the following statements best describe the mis-
match repair pathway? b. With nic-2 allele 2, you obtain three prototrophic
colonies A, B, and C, and you cross each separately with a
a. It is part of the 3′ to 5′ proofreading function of DNA
wild-type strain. From the cross prototroph A × wild
polymerase.
type, you obtain 100 progeny, all of which are prototrophic.
b. It acts after DNA replication by recognizing mis- From the cross prototroph B × wild type, you obtain
matched base pairs. 100 progeny, of which 78 are prototrophic and 22 are
c. It is activated by stalled replication forks. nicotinamide requiring. From the cross prototroph
d. It is coupled to transcription. C × wild type, you obtain 1000 progeny, of which 996 are
prototrophic and 4 are nicotinamide requiring.
C h a l l e n g in g P r o b l e ms 36. You are working with a newly discovered mutagen, and
33. a. Why is it impossible to induce nonsense mutations you wish to determine the base change that it introduc-
(represented at the mRNA level by the triplets UAG, es into DNA. Thus far, you have determined that the
UAA, and UGA) by treating wild-type strains with muta- mutagen chemically alters a single base in such a way
gens that cause only A • G → T • C transitions in DNA? that its base-pairing properties are altered permanently.
b. Hydroxylamine (HA) causes only G • C → A • T tran- To determine the specificity of the alteration, you exam-
sitions in DNA. Will HA produce nonsense mutations in ine the amino acid changes that take place after muta-
wild-type strains? genesis. A sample of what you find is shown here:
c. Will HA treatment revert nonsense mutations? Original: Gln–His–Ile–Glu–Lys
34.
Several auxotrophic point mutants in Neurospora Mutant: Gln–His–Met–Glu–Lys
are treated with various agents to see if reversion will Original: Ala–Val–Asn–Arg
take place. The following results were obtained (a plus
sign indicates reversion; HA causes only G • C → A • T Mutant: Ala–Val–Ser–Arg
transitions). Original: Arg–Ser–Leu
Mutant: Arg–Ser–Leu–Trp–Lys–Thr–Phe
Spontaneous
Mutant 5-BU HA Proflavin reversion What is the base-change specificity of the mutagen?
1 - - - - 37. You now find an additional mutant from the experiment
2 - - + + in Problem 31:
3 + - - + Original: Ile–Leu–His–Gln
4 - - - + Mutant: Ile–Pro–His–Gln
5 + + - + Could the base-change specificity in your answer to
Problem 31 account for this mutation? Why or why not?
a. For each of the five mutants, describe the nature of 38. You are an expert in DNA-repair mechanisms. You re-
the original mutation event (not the reversion) at the ceive a sample of a human cell line derived from a wom-
molecular level. Be as specific as possible. an who has symptoms of xeroderma pigmentosum. You
b. For each of the five mutants, name a possible mutagen determine that she has a mutation in a gene that has not
that could have caused the original mutation event. been previously associated with XP. How is this
(Spontaneous mutation is not an acceptable answer.) possible?
c. In the reversion experiment for mutant 5, a parti- 39. Ozone (O3) is an important naturally occurring compo-
cularly interesting prototrophic derivative is obtained. nent in our atmosphere, where it forms a layer
616 CHAPTER 1 6 Mutation, Repair, and Recombination
that absorbs UV radiation. A hole in the ozone layer was skin cancer in the beach communities in Australia. A
discovered in the 1970s over Antarctica and Australia. newspaper reporter friend offers to let you publish a
The hole appears seasonally and was found to be due to short note (a paragraph) in which you are to describe the
human activity. Specifically, ozone is destroyed by a possible connection between the ozone hole and the
class of chemicals (called CFCs for chlorofluorocarbons) increased skin cancers. On the basis of what you have
that are found in refrigerants, air-conditioning systems, learned about DNA repair in this chapter, write a
and aerosols. paragraph that explains the mechanistic connection.
As a scientist working on DNA-repair mechanisms,
you discover that there has been a significant increase in
344
17
C h a p t e r
Large-Scale
Chromosomal
Changes
Learning Outcomes
After completing this chapter, you will be able to
outline
A
young couple is planning to have children. The husband knows that his
grandmother had a child with Down syndrome by a second marriage.
Down syndrome is a set of physical and mental disorders caused by the
presence of an extra chromosome 21 (Figure 17-1). No records of the birth, which
occurred early in the twentieth century, are available, but the couple knows of no
other cases of Down syndrome in their families.
The couple has heard that Down syndrome results from a rare chance mistake
in egg production and therefore decide that they stand only a low chance of having
such a child. They decide to have children. Their first child is unaffected, but the
next conception aborts spontaneously (a miscarriage), and their second child is
born with Down syndrome. Was their having a Down syndrome child a coinci-
dence, or did a connection between the genetic makeup of the child’s father and
that of his grandmother lead to their both having Down syndrome children? Was
the spontaneous abortion significant? What tests might be necessary to investigate
this situation? The analysis of such questions is the topic of this chapter.
We have seen throughout the book that gene mutations are an important
source of change in the genomic sequence. However, the genome can also be
remodeled on a larger scale by alterations to chromosome structure or by changes
in the number of copies of chromosomes in a cell. These large-scale variations are
termed chromosome mutations to distinguish them from gene mutations. Broadly
speaking, gene mutations are defined as changes that take place within a gene,
whereas chromosome mutations are changes in a chromosome region encompass-
ing multiple genes. Gene mutations are never detectable microscopically; a chro-
Child with Down syndrome mosome bearing a gene mutation looks the same under the microscope as one
carrying the wild-type allele. In contrast, many chromosome mutations can be
detected by microscopy, by genetic or molecular analysis, or by a combination of all
techniques. Chromosome mutations have been best characterized in eukaryotes,
and all the examples in this chapter are from that group.
Chromosome mutations are important from several biological perspectives.
First, they can be sources of insight into how genes act in concert on a genomic
scale. Second, they reveal several important features of meiosis and chromosome
architecture. Third, they constitute useful tools for experimental genomic manip-
ulation. Fourth, they are sources of insight into evolutionary processes. Fifth, chro-
mosomal mutations are regularly found in humans, and some of these mutations
cause genetic disease.
Many chromosome mutations cause abnormalities in cell and organismal func-
tion. Most of these abnormalities stem from changes in gene number or gene posi-
tion. In some cases, a chromosome mutation results from chromosome breakage.
If the break occurs within a gene, the result is functional disruption of that gene.
For our purposes, we will divide chromosome mutations into two groups:
changes in chromosome number and changes in chromosome structure. These two
groups represent two fundamentally different kinds of events. Changes in chromo-
some number are not associated with structural alterations of any of the DNA mol-
ecules of the cell. Rather, it is the number of these DNA molecules that is changed,
and this change in number is the basis of their genetic effects. Changes in chromo-
some structure, on the other hand, result in novel sequence arrangements within
one or more DNA double helices. These two types of chromosome mutations are
illustrated in Figure 17-2, which is a summary of the topics of this chapter. We begin
by exploring the nature and consequences of changes in chromosome number.
Relocation of Loss of
Deletion
genetic material genetic material
Translocation
Missing chromosome(s)
From another
chromosome Wild-type sequence
Extra chromosome(s)
Inversion
Gain of
Duplication genetic material
Aberrant euploidy
Organisms with multiples of the basic chromosome set (genome) are referred to as
euploid. You learned in earlier chapters that familiar eukaryotes such as plants, ani-
mals, and fungi carry in their cells either one chromosome set (haploidy) or two
chromosome sets (diploidy). In these species, both the haploid and the diploid states
are cases of normal euploidy. Organisms that have more or fewer than the normal
number of sets are aberrant euploids. Polyploids are individual organisms that
have more than two chromosome sets. They can be represented by 3n (triploid),
4n (tetraploid), 5n (pentaploid), 6n (hexaploid), and so forth. (The number of
chromosome sets is called the ploidy or ploidy level.) An individual member of a
normally diploid species that has only one chromosome set (n) is called a mono-
ploid to distinguish it from an individual member of a normally haploid species
(also n). Examples of these conditions are shown in the first four rows of Table 17-1.
Monoploids Male bees, wasps, and ants are monoploid. In the normal life cycles of
these insects, males develop by parthenogenesis (the development of a specialized
type of unfertilized egg into an embryo without the need for fertilization). In most
other species, however, monoploid zygotes fail to develop. The reason is that virtu-
ally all members of a diploid species carry a number of deleterious recessive muta-
tions, together called a genetic load. The deleterious recessive alleles are masked
by wild-type alleles in the diploid condition but are automatically expressed in a
monoploid derived from a diploid. Monoploids that do develop to advanced stages
are abnormal. If they survive to adulthood, their germ cells cannot proceed through
meiosis normally, because the chromosomes have no pairing partners. Thus,
620 CHAPTER 1 7 Large-Scale Chromosomal Changes
monoploids are characteristically sterile. (Male bees, wasps, and ants bypass meio-
sis; in these groups, gametes are produced by mitosis.)
Polyploids Polyploidy is very common in plants but rarer in animals (for reasons
that we will consider later). Indeed, an increase in the number of chromosome
sets has been an important factor in the origin of new plant species. The evidence
for this benefit is that above a haploid number of about 12, even numbers of chro-
mosomes are much more common than odd numbers. This pattern is a conse-
quence of the polyploid origin of many plant species, because doubling and
redoubling of a number can give rise only to even numbers. Animal species do not
show such a distribution, owing to the relative rareness of polyploid animals.
In aberrant euploids, there is often a correlation between the number of cop-
ies of the chromosome set and the size of the organism. A tetraploid organism, for
example, typically looks very similar to its diploid counterpart in its proportions,
except that the tetraploid is bigger, both as a whole and in its component parts.
The higher the ploidy level, the larger the size of the organism (Figure 17-3).
Pairing possibilities
Two diploid
Mitosis in a diploid, 2n = 4 al cells
Norm
Figure 17-5 Colchicine may be applied to generate a tetraploid from a diploid. Colchicine
added to mitotic cells during metaphase and anaphase disrupts spindle-fiber formation,
preventing the migration of chromatids after the centromere has split. A single cell is created
that contains pairs of identical chromosomes that are homozygous at all loci.
much more tolerant of polyploidy. Note that all alleles in the genotype are dou-
bled. Therefore, if a diploid cell of genotype A /a ; B / b is doubled, the resulting
autotetraploid will be of genotype A/A /a /a ; B /B/b /b.
Because four is an even number, autotetraploids can have a regular meiosis,
although this result is by no means always the case. The crucial factor is how the
four chromosomes of each set pair and segregate. There are several possibilities,
as shown in Figure 17-6. If the chromosomes pair as bivalents or quadrivalents,
the chromosomes segregate normally, producing diploid gametes. The fusion of
gametes at fertilization regenerates the tetraploid state. If trivalents form, segre-
gation leads to nonfunctional aneuploid gametes and, hence, sterility.
Pairing
possibilities
F i g u r e 17- 6 There are three different pairing possibilities at meiosis in tetraploids. The
four homologous chromosomes may pair as two bivalents or as a quadrivalent, and each
can yield functional gametes. A third possibility, a trivalent plus a univalent, yields
nonfunctional gametes. ANIMATED ART: Autotetraploid meiosis
17.1 Changes in Chromosome Number 623
Raphanus
of the sterile hybrid, presumably in tissue that eventually
S
2n = 18
became a flower and underwent meiosis to produce gametes.
In 2n1 + 2n2 tissue, there is a pairing partner for each chromo- Parents ×
+
some, and functional gametes of the type n1 n2 are pro- Sterile F1 hybrid
Brassica n+n=9+9
duced. These gametes fuse to give 2n1 + 2n2 allopolyploid 2n = 18
Raphanobrassica
2n = 18
progeny, which also are fertile. This kind of allopolyploid Fertile amphidiploid
is sometimes called an amphidiploid, or doubled diploid 2n + 2n = 18 + 18
(Figure 17-7). Treating a sterile hybrid with colchicine greatly 4n = 36
increases the chances that the chromosome sets will double.
Amphidiploids are now synthesized routinely in this manner.
(Unfortunately for Karpechenko, his amphidiploid had the
roots of a cabbage and the leaves of a radish.) F i g u r e 17-7 In the progeny of a cross of cabbage (Brassica) and
When Karpechenko’s allopolyploid was crossed with radish (Raphanus), the fertile amphidiploid arose from spontaneous
either parental species—the cabbage or the radish—sterile doubling in the 2n = 18 sterile hybrid.
624 CHAPTER 1 7 Large-Scale Chromosomal Changes
B. oleracea,
2n = 18
Cabbage
Cauliflower
Broccoli
Kale
Kohlrabi
Brussels sprouts
n=9 n=9
B. carinata, B. napus,
2n = 34 2n = 38
Abyssinian mustard Rutabaga
Oil rape
n=8 n = 10
F i g u r e 17- 8 Allopolyploidy is
important in the production of new offspring resulted. The offspring of the cross with cabbage were 2n1 + n2, consti-
species. In the example shown, three tuted from an n1 + n2 gamete from the allopolyploid and an n1 gamete from the
diploid species of Brassica (light green cabbage. The n2 chromosomes had no pairing partners; hence, a normal meiosis
boxes) were crossed in different could not take place, and the offspring were sterile. Thus, Karpechenko had effec-
combinations to produce their
allopolyploids (tan boxes). Some of the
tively created a new species, with no possibility of gene exchange with either cab-
agricultural derivatives of some of the bage or radish. He called his new plant Raphanobrassica.
species are shown within the boxes. In nature, allopolyploidy seems to have been a major force in the evolution of
new plant species. One convincing example is shown by the genus Brassica, as illus-
trated in Figure 17-8. Here, three different parent species have hybridized in all pos-
sible pair combinations to form new amphidiploid species. Natural polyploidy was
once viewed as a somewhat rare occurrence, but recent work has shown that it is a
recurrent event in many plant species. The use of DNA markers has made it possible
to show that polyploids in any population or area that appear to be the same are the
result of many independent past fusions between genetically distinct individuals of
the same two parental species. An estimated 50 percent of all angiosperm plants are
polyploids, resulting from auto- or allopolyploidy. As a result of multiple polyploidi-
zations, the amount of allelic variation within a polyploid species is much higher
than formerly thought, perhaps contributing to its potential for adaptation.
A particularly interesting natural allopolyploid is bread wheat, Triticum aesti-
vum (6n = 42). By studying its wild relatives, geneticists have reconstructed a
probable evolutionary history of this plant. Figure 17-9 shows that bread wheat is
17.1 Changes in Chromosome Number 625
×
AA gamete BB gamete
(10,000 yr ago)
(~8500 yr ago)
Monoploids Diploidy is an inherent nuisance for plant breeders. When they want to
induce and select new recessive mutations that are favorable for agricultural pur-
poses, the new mutations cannot be detected unless they are homozygous. Breeders
may also want to find favorable new combinations of alleles at different loci, but
such favorable allele combinations in heterozygotes will be broken up by recombina-
tion at meiosis. Monoploids provide a way around some of these problems.
Monoploids can be artificially derived from the products of meiosis in a plant’s
anthers. A haploid cell destined to become a pollen grain can instead be induced
by cold treatment (subjected to low temperatures) to grow into an embryoid, a
small dividing mass of monoploid cells. The embryoid can be grown on agar to
form a monoploid plantlet, which can then be potted in soil and allowed to mature
(Figure 17-10).
Plant monoploids can be exploited in several ways. In one approach, they are
first examined for favorable allelic combinations that have arisen from the recom-
bination of alleles already present in a heterozygous diploid parent. Hence, from
a parent that is A/a ; B/b might come a favorable monoploid combination a ; b.
The monoploid can then be subjected to chromosome doubling to produce homo-
zygous diploid cells, a/a ; b/b, that are capable of normal reproduction.
Another approach is to treat monoploid cells basically as a population of hap-
loid organisms in a mutagenesis-and-selection procedure. A population of mono-
ploid cells is isolated, their walls are removed by enzymatic treatment, and they
are exposed to a mutagen. They are then plated on a medium that selects for
some desirable phenotype. This approach has been used to select for resistance to
toxic compounds produced by a plant parasite as well as to select for resistance to
herbicides being used by farmers to kill weeds. Resistant plantlets eventually
grow into monoploid plants, whose chromosome number can then be doubled
with the use of colchicine, leading to a resistant homozygous diploid. These pow-
erful techniques can circumvent the normally slow process of meiosis-based plant
breeding. They have been successfully applied to important crop plants such as
soybeans and tobacco.
Anthers
F i g u r e 17-10 Monoploid plants can be artificially derived from cells destined to become
pollen grains by exposing the cells to cold treatment in tissue culture.
17.1 Changes in Chromosome Number 627
Autotriploids The bananas that are widely available commercially are sterile trip-
loids with 11 chromosomes in each set (3n = 33). The most obvious expression of
the sterility of bananas is the absence of seeds in the fruit that we eat. (The black
specks in bananas are not seeds; banana seeds are rock hard—real tooth breakers.)
Seedless watermelons are another example of the commercial exploitation of trip-
loidy in plants.
Aneuploidy
Aneuploidy is the second major category of chromosomal aberrations
in which the chromosome number is abnormal. An aneuploid is an
individual organism whose chromosome number differs from the wild
type by part of a chromosome set. Generally, the aneuploid chromo-
some set differs from the wild type by only one chromosome or by a
small number of chromosomes. An aneuploid can have a chromosome
number either greater or smaller than that of the wild type. Aneuploid
nomenclature (see Table 17-1) is based on the number of copies of the
specific chromosome in the aneuploid state. For autosomes in diploid
organisms, the aneuploid 2n + 1 is trisomic, 2n − 1 is monosomic,
and 2n − 2 (the “− 2” represents the loss of both homologs of a
628 CHAPTER 1 7 Large-Scale Chromosomal Changes
F i g u r e 17-12 Aneuploid products Nondisjunction occurs spontaneously. Like most gene mutations, it is an
of meiosis (that is, gametes) are example of a chance failure of a basic cellular process. The precise molecular
produced by nondisjunction at the first processes that fail are not known but, in experimental systems, the frequency of
or second meiotic division. Note that all nondisjunction can be increased by interference with microtubule polymeriza-
other chromosomes are present in
normal number, including in the cells in
tion, thereby inhibiting normal chromosome movement. Disjunction appears to
which no chromosomes are shown. be more likely to go awry in meiosis I. This failure is not surprising, because nor-
mal anaphase I disjunction requires that the homologous chromatids of the tetrad
ANIMATED ART: Meiotic
remain paired during prophase I and metaphase I, and it requires crossovers. In
nondisjunction
contrast, proper disjunction at anaphase II or at mitosis requires that the centro-
mere split properly but does not require chromosome pairing or crossing over.
Crossovers are a necessary component of the normal disjunction process.
Somehow the formation of a chiasma helps to hold a bivalent together and ensures
that the two dyads will go to opposite poles. In most organisms, the amount of
crossing over is sufficient to ensure that all bivalents will have at least one chi-
asma per meiosis. In Drosophila, many of the nondisjunctional chromosomes seen
in disomic (n + 1) gametes are nonrecombinant, showing that they arise from
meioses in which there is no crossing over on that chromosome. Similar observa-
tions have been made in human trisomies. In addition, in several different experi-
mental organisms, mutations that interfere with recombination have the effect of
massively increasing the frequency of meiosis I nondisjunction. All these observa-
tions provide evidence for the role of crossing over in maintaining chromosome
pairing; in the absence of these associations, chromosomes are vulnerable to ana-
phase I nondisjunction.
17.1 Changes in Chromosome Number 629
F i g u r e 17-16 Down syndrome results from the presence of an F i g u r e 17-17 Older mothers have a higher proportion of babies
extra copy of chromosome 21. with Down syndrome than younger mothers do. [ Data from L. S. Penrose
and G. F. Smith, Down’s Anomaly. Little, Brown and Company, 1966.]
short stature; short hands with a crease across the middle; and a large, wrinkled
tongue. Females may be fertile and may produce normal or trisomic progeny, but
males are sterile with very few exceptions. Mean life expectancy is about 17 years,
and only 8 percent of persons with Down syndrome survive past age 40.
The incidence of Down syndrome is related to maternal age: older mothers
run a greatly elevated risk of having a child with Down syndrome (Figure 17-17).
For this reason, fetal chromosome analysis (by amniocentesis or by chorionic vil-
lus sampling) is now recommended for older expectant mothers. A less-pro-
nounced paternal-age effect also has been demonstrated.
Even though the maternal-age effect has been known for many years, its cause
is still not known. Nonetheless, there are some interesting biological correlations.
With age, possibly the chromosome bivalent is less likely to stay together during
prophase I of meiosis. Meiotic arrest of oocytes (female meiocytes) in late pro-
phase I is a common phenomenon in many animals. In female humans, all oocytes
are arrested at diplotene before birth. Meiosis resumes at each menstrual period,
which means that the chromosomes in the bivalent must remain properly associ-
ated for as long as five or more decades. If we speculate that these associations
have an increasing probability of breaking down by accident as time passes, we
can envision a mechanism contributing to increased maternal nondisjunction
with age. Consistent with this speculation, most nondisjunction related to the
effect of maternal age is due to nondisjunction at anaphase I, not anaphase II.
The only other human autosomal trisomics to survive to birth are those with
trisomy 13 (Patau syndrome) and trisomy 18 (Edwards syndrome). Both have
severe physical and mental abnormalities. The phenotypic syndrome of trisomy
13 includes a harelip; a small, malformed head; “rocker-bottom” feet; and a mean
life expectancy of 130 days. That of trisomy 18 includes “faunlike” ears, a small
jaw, a narrow pelvis, and rocker-bottom feet; almost all babies with trisomy 18 die
within the first few weeks after birth. All other trisomics die in utero.
6 32 CHAPTER 1 7 Large-Scale Chromosomal Changes
(a) (b)
1 2 3 4
1 2 3 4 1 4
Deletion 3 2 Loss
3 2 4 1 4
Loss 1
1 2 3 4 1 2 4 1 2 3 4 1 4
Deletion
and
duplication 1 2 3 4 1 2 3 3 4 1 2 3 4 1 2 3 2 3 4
1 2 3 4
1 2 3 4 1 3 2 4
Inversion 2 3
1 3 2 4
1 4
1 2 3 4 1 2 8 9 10 1 2 3 4 1 2 8 9 10
Translocation
5 6 7 8 9 10 5 6 7 3 4 5 6 7 8 9 10 5 6 7 3 4
A B E
A B C C D E
However, the duplicate segment can end up at a different position on the same
chromosome or even on a different chromosome.
Balanced rearrangements change the chromosomal gene order but do not
remove or duplicate any DNA. The two simple classes of balanced rearrange-
ments are inversions and reciprocal translocations. An inversion is a rearrange-
ment in which an internal segment of a chromosome has been broken twice,
flipped 180 degrees, and rejoined.
A B C D
A C B D
17.2 Changes in Chromosome Structure 6 37
Reciprocal
translocation
Deletions
A deletion is simply the loss of a part of one chromosome arm. The process of
deletion requires two chromosome breaks to cut out the intervening segment.
The deleted fragment has no centromere; consequently, it cannot be pulled to a
spindle pole in cell division and is lost. The effects of deletions depend on their
size. A small deletion within a gene, called an intragenic deletion, inactivates the
gene and has the same effect as that of other null mutations of that gene. If the
homozygous null phenotype is viable (as, for example, in human albinism), the
homozygous deletion also will be viable. Intragenic deletions can be distinguished
from mutations caused by single nucleotide changes because genes with such
deletions never revert to wild type.
For most of this section, we will be dealing with multigenic deletions, in
which several to many genes are missing. The consequences of these deletions are
more severe than those of intragenic deletions. If such a deletion is made homo-
zygous by inbreeding (that is, if both homologs have the same deletion), the com-
bination is always lethal. This fact suggests that all regions of the chromosomes
are essential for normal viability and that complete elimination of any segment
from the genome is deleterious. Even an individual organism heterozygous for a
multigenic deletion—that is, having one normal homolog and one that carries the
deletion—may not survive. Principally, this lethal outcome is due to disruption of
normal gene balance. Alternatively, the deletion may “uncover” deleterious reces-
sive alleles, allowing the single copies to be expressed.
1 2 3 4 5 1 2 3 1 2 3 4 5 6 1 2 3 4 5 6 7 8 9101 2 3 4 1 2 3 4 5 6 7 8 9101112 1 2 3 4 5 6 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 9 1 2
2D 2E 2F 3A 3B 3C 3D 3E 3F 4A
258-11
258-14
N-8Mohr
264-38
264-36
264-30
Extent of 264-31
13 deletions
264-32
264-33
264-37
264-39
264-2
264-19
17 genes
PMS PMS
Williams syndrome
deletion
PMS
plus
Duplication
Duplications
The processes of chromosome mutation sometimes produce an extra copy of
some chromosome region. The duplicate regions can be located adjacent to each
other—called a tandem duplication—or the extra copy can be located elsewhere
in the genome—called an insertional duplication. A diploid cell containing a
duplication will have three copies of the chromosome region in question: two in
one chromosome set and one in the other—an example of a duplication heterozy-
gote. In meiotic prophase, tandem-duplication heterozygotes show a loop consist-
ing of the unpaired extra region.
Synthetic duplications of known coverage can be used for gene mapping. In hap-
loids, for example, a chromosomally normal strain carrying a new recessive muta-
tion m may be crossed with strains bearing a number of duplication-generating
rearrangements (for example, translocations and pericentric inversions). In any
one cross, if some duplication progeny have the recessive phenotype, the duplica-
tion does not span gene m, because, if it did, its extra segment would mask the
recessive m allele.
17.2 Changes in Chromosome Structure 6 41
Chr 1
Chr 2
Chr 3
Common ancestor
1 2 34 5 6 7 8 9 10 11 12 13 14 15 16
1 2 34 5 6 7 8 9 10 11 12 13 14 15 16
1 2 34 5 6 7 8 9 10 11 12 13 14 15 16
2. Some genes were lost.
1 2 34 5 6 7 8 9 10 11 12 13 14 15 16
1 2 34 5 6 7 8 9 10 11 12 13 14 15 16
1 2 34 5 6 7 8 9 10 11 12 13 14 15 16
S. cerevisiae
copy2 2 3 5 7 8 11 13 15
Inversions
We have seen that, to create an inversion, a segment of a chromosome is cut out,
flipped, and reinserted. Inversions are of two basic types. If the centromere is
outside the inversion, the inversion is said to be paracentric. Inversions spanning
the centromere are pericentric.
A B C D E F
Normal sequence
A B C E D F
Paracentric
A D C B E F
Pericentric
Because inversions are balanced rearrangements, they do not change the overall
amount of genetic material, and so they do not result in gene imbalance. Individu-
als with inversions are generally normal, if there are no breaks within genes. A
break that disrupts a gene produces a mutation that may be detectable as an abnor-
mal phenotype. If the gene has an essential function, then the break point acts as a
lethal mutation linked to the inversion. In such a case, the inversion cannot be bred
to homozygosity. However, many inversions can be made homozygous, and, fur-
thermore, inversions can be detected in haploid organisms. In these cases, the break
17.2 Changes in Chromosome Structure 6 4 3
Breaks in DNA
A P B C D
5´ 3´ 5´ 3´ 5´ 3´
3´ 5´ 3´ 5´ 3´ 5´
P P P
Inverted alignment
A P C P B P D
5´ 3´ 5´ 3´ 5´ 3´
3´ 5´ 3´ 5´ 3´ 5´
P
Joining of breaks to complete inversion
A P C P B P D
5´ 3´
3´ 5´
P
Inversion
Inversion
A B C D E
Normal product
A B C D
Deletion product
F i g u r e 17-2 8 A crossover in the loop of a paracen-
A tric-inversion heterozygote gives rise to chromosomes
Deletion product
containing deletions.
A D C B E
Inversion product ANIMATED ART: Chromosome rearrangements:
meiotic behavior of paracentric inversions
17.2 Changes in Chromosome Structure 6 4 5
( ) A D
+ Inversion +
Segregation
Pericentric inversions also can be detected microscopi-
cally through new arm ratios. Consider the following peri- End of meiosis l End of meiosis ll
centric inversion: A B C D
A B C D Normal product
Normal Arm ratio, long : short 4:1 A B C A Duplication A arm
A B C A
Deletion D arm
Inversion ( ) Arm ratio, long : short 1:1
D B C D Duplication D arm
Note that the length ratio of the long arm to the short D B C D Deletion A arm
arm has been changed from about 4 : 1 to about 1 : 1 by the D B C A
D B C A Inversion product
inversion. Paracentric inversions do not alter the arm ratio,
but they may be detected microscopically by observing
changes in banding or other chromosomal landmarks, if F i g u r e 17-2 9 A crossover in the loop
available. of a pericentric-inversion heterozygote
gives rise to chromosomes containing
duplications and deletions.
K e y C o n c e p t The main diagnostic features of heterozygous inversions are
inversion loops, reduced recombinant frequency, and reduced fertility because of
unbalanced or deleted meiotic products.
Reciprocal translocations
There are several types of translocations, but here we consider only reciprocal
translocations, the simplest type. Recall that, to form a reciprocal translocation, two
chromosomes trade acentric fragments created by two simultaneous chromosome
breaks. As with other rearrangements, meiosis in heterozygotes having two translo-
cated chromosomes and their normal counterparts produces characteristic config-
urations. Figure 17-30 illustrates meiosis in an individual that is heterozygous for a
reciprocal translocation. Note the cross-shaped pairing configuration. Because the
law of independent assortment is still in force, there are two common patterns of
6 46 CHAPTER 1 7 Large-Scale Chromosomal Changes
N1 T2
Alternate
Up T1 + T2 Translocation genotype
Down N1 + N2 Normal Both complete and viable
segregation. Let us use N1 and N2 to represent the normal chromosomes and T1 and
T2 the translocated chromosomes. The segregation of each of the structurally nor-
mal chromosomes with one of the translocated ones (T1 + N2 and T2 + N1) is called
adjacent-1 segregation. Each of the two meiotic products is deficient for a differ-
ent arm of the cross and has a duplicate of the other. These products are inviable.
Normal and aborted pollen of On the other hand, the two normal chromosomes may segregate together, as will
a translocation heterozygote the reciprocal parts of the translocated ones, to produce N1 + N2 and T1 + T2 prod-
ucts. This segregation pattern is called alternate segregation. These products are
both balanced and viable.
Adjacent-1 and alternate segregations are equal in number, and so half the
overall population of gametes will be nonfunctional, a condition known as semis-
terility or “half sterility.” Semisterility is an important diagnostic tool for identify-
ing translocation heterozygotes. However, semisterility is defined differently for
plants and animals. In plants, the 50 percent of meiotic products that are from the
adjacent-1 segregation generally abort at the gametic stage (Figure 17-31). In ani-
mals, these products are viable as gametes but lethal to the zygotes that they
produce on fertilization.
Remember that heterozygotes for inversions also may show some reduction in
fertility but by an amount dependent on the size of the affected region. The pre-
Figure 17-31 Pollen of a semisterile corn cise 50 percent reduction in viable gametes or zygotes is usually a reliable diag-
plant. The clear pollen grains contain nostic clue for a translocation.
chromosomally unbalanced meiotic Genetically, genes on translocated chromosomes act as though they are linked
products of a reciprocal-translocation
heterozygote. The opaque pollen grains,
if their loci are close to the translocation break point. Figure 17-32 shows a trans-
which contain either the complete location heterozygote that has been established by crossing an a/a ; b/b individual
translocation genotype or normal with a translocation homozygote bearing the wild-type alleles. When the hetero-
chromosomes, are functional in fertilization zygote is testcrossed, recombinants are created but do not survive because they
and development. [ William Sheridan.] carry unbalanced genomes (duplication-and-deletions). The only viable progeny
17.2 Changes in Chromosome Structure 6 47
Robertsonian translocations
b a a b
Let’s return to the family with the Down syndrome child,
Phenotype ab Phenotype a b
introduced at the beginning of the chapter. The birth can
indeed be a coincidence—after all, coincidences do hap-
pen. However, the miscarriage gives a clue that something F i g u r e 17- 3 2 When a translocated
else might be going on. A large proportion of spontaneous abortions carry chro- fragment carries a marker gene, this
mosomal abnormalities, so perhaps that is the case in this example. If so, the marker can show linkage to genes on the
couple may have had two conceptions with chromosome mutations, which other chromosome.
would be very unlikely unless there was a common cause. However, a small pro- ANIMATED ART: Chromosome
portion of Down syndrome cases are known to result from a translocation in one rearrangements: pseudolinkage of genes
of the parents. We have seen that translocations can produce progeny that have
extra material from part of the genome, and so a translocation concerning chro-
mosome 21 can produce progeny that have extra material from that chromo- F i g u r e 17- 3 3 In a small minority of
some. In Down syndrome, the translocation responsible is of a type called a cases, the origin of Down syndrome is a
Robertsonian translocation. It produces progeny carrying an almost complete parent heterozygous for a Robertsonian
extra copy of chromosome 21. The translocation and its segregation are illus- translocation concerning chromosome
trated in Figure 17-33. Note that, in addition to complements causing Down syn- 21. Meiotic segregation results in some
gametes carrying a chromosome with a
drome, other aberrant chromosome complements are produced, most of which large additional segment of chromosome
abort. In our example, the man may have this translocation, which he may have 21. In combination with a normal
inherited from his grandmother. To confirm this possibility, his chromosomes chromosome 21 provided by the gamete
are checked. His unaffected child might have normal chromosomes or might from the opposite sex, the symptoms of
have inherited his translocation. Down syndrome are produced even
though there is not full trisomy 21.
Gametes from
translocation
Gametes carrier
from
normal parent
thereby preventing its expression and thereby allowing the expression of w. If the
positions of the w+ and w alleles are exchanged by a crossover, then position-effect
variegation is not detected (see Figure 17-34a, bottom section).
RegIG MYC
Translocation
Translocation
break point
Chr 22
BCR1
Hybrid oncogene
Translocation
F i g u r e 17- 3 5 The two main ways that translocations can cause cancer in a body
(somatic) cell are illustrated by the cancers Burkitt lymphoma (a) and chronic myelogenous
leukemia (b). The genes MYC, BCR1, and ABL are proto-oncogenes.
6 50 CHAPTER 1 7 Large-Scale Chromosomal Changes
The other mechanism by which translocations can cause cancer is the forma-
tion of a hybrid gene. An example is provided by the disease chronic myelogenous
leukemia (CML), a cancer of white blood cells. This cancer can result from the
formation of a hybrid gene between the two proto-oncogenes BCR1 and ABL (Fig-
ure 17-35b). The abl proto-oncogene encodes a protein kinase in a signaling path-
way. The protein kinase passes along a signal initiated by a growth factor that
leads to cell proliferation. The Bcr1-Abl fusion protein has a permanent protein
kinase activity. The altered protein continually propagates its growth signal
onward, regardless of whether the initiating signal is present.
10
Ratio of binding of fluorescent probes (mutant/wild type)
NORMAL CHROMOSOME
2
1
0
10
Duplication
2
1
0
10
Deletion
2
1
0
10 Tandem
amplification
2
1
0
Map positions of cDNA clones in chromosomal microarray
1,000,000
conceptions
850,000 150,000
live births spontaneous abortions
39,000 trisomics
(3,510 trisomy 21)
5,165
chromosome abnormalities
13,500 XO
758 balanced
Robertsonian
translocations
758 balanced
reciprocal
translocations
117 inversions
500 unbalanced
structural aberrations
s u mma ry
Polyploidy is an abnormal condition in which there is a larger- and are useful in engineering special strains of organisms for
than-normal number of chromosome sets. Polyploids such as experimental and applied genetics. In organisms with one
triploids (3n) and tetraploids (4n) are common among plants normal chromosome set plus a rearranged set (heterozygous
and are represented even among animals. Organisms with an rearrangements), there are unusual pairing structures at
odd number of chromosome sets are sterile because not every meiosis resulting from the strong pairing affinity of homolo-
chromosome has a partner at meiosis. Unpaired chromo- gous chromosome regions. For example, heterozygous inver-
somes attach randomly to the poles of the cell in meiosis, lead- sions show loops, and reciprocal translocations show cross-
ing to unbalanced sets of chromosomes in the resulting gam- shaped structures. Segregation of these structures results in
etes. Such unbalanced gametes do not yield viable progeny. In abnormal meiotic products unique to the rearrangement.
polyploids with an even number of sets, each chromosome A deletion is the loss of a section of chromosome, either
has a potential pairing partner and hence can produce bal- because of chromosome breaks followed by loss of the inter-
anced gametes and progeny. Polyploidy can result in an organ- vening segment or because of segregation in heterozygous
ism of larger dimensions; this discovery has permitted impor- translocations or inversions. If the region removed in a dele-
tant advances in horticulture and in crop breeding. tion is essential to life, a homozygous deletion is lethal.
In plants, allopolyploids (polyploids formed by combin- Heterozygous deletions may be lethal because of chromo-
ing chromosome sets from different species) can be made somal imbalance or because they uncover recessive deleteri-
by crossing two related species and then doubling the prog- ous alleles, or they may be nonlethal. When a deletion in one
eny chromosomes through the use of colchicine or through homolog allows the phenotypic expression of recessive alleles
somatic cell fusion. These techniques have potential appli- in the other, the unmasking of the recessive alleles is called
cations in crop breeding because allopolyploids combine pseudodominance.
the features of the two parental species. Duplications are generally produced from other rear-
When cellular accidents change parts of chromosome rangements or by aberrant crossing over. They also unbal-
sets, aneuploids result. Aneuploidy itself usually results in an ance the genetic material, producing a deleterious pheno-
unbalanced genotype with an abnormal phenotype. Examples typic effect or death of the organism. However, duplications
of aneuploids include monosomics (2n − 1) and trisomics can be a source of new material for evolution because func-
(2n + 1). Down syndrome (trisomy 21), Klinefelter syndrome tion can be maintained in one copy, leaving the other copy
(XXY), and Turner syndrome (XO) are well-documented free to evolve new functions.
examples of aneuploid conditions in humans. The spontane- An inversion is a 180-degree turn of a part of a chromo-
ous level of aneuploidy in humans is quite high and accounts some. In the homozygous state, inversions may cause little
for a large proportion of genetically based ill health in human problem for an organism unless heterochromatin brings
populations. The phenotype of an aneuploid organism about a position effect or one of the breaks disrupts a gene.
depends very much on the particular chromosome affected. On the other hand, inversion heterozygotes show inversion
In some cases, such as human trisomy 21, there is a highly loops at meiosis, and crossing over within the loop results in
characteristic constellation of associated phenotypes. inviable products. The crossover products of pericentric
Most instances of aneuploidy result from accidental inversions, which span the centromere, differ from those of
chromosome missegregation at meiosis (nondisjunction). paracentric inversions, which do not, but both show reduced
The error is spontaneous and can occur in any particular recombinant frequency in the affected region and often
meiocyte at the first or second division. In humans, a mater- result in reduced fertility.
nal-age effect is associated with nondisjunction of chromo- A translocation moves a chromosome segment to another
some 21, resulting in a higher incidence of Down syndrome position in the genome. A simple example is a reciprocal
in the children of older mothers. translocation, in which parts of nonhomologous chromo-
The other general category of chromosome mutations somes exchange positions. In the heterozygous state, translo-
comprises structural rearrangements, which include dele- cations produce duplication-and-deletion meiotic products,
tions, duplications, inversions, and translocations. These which can lead to unbalanced zygotes. New gene linkages can
changes result either from breakage and incorrect reunion or be produced by translocations. The random segregation of
from crossing over between repetitive elements (nonallelic centromeres in a translocation heterozygote results in 50 per-
homologous recombination). Chromosomal rearrangements cent unbalanced meiotic products and, hence, 50 percent
are an important cause of ill health in human populations sterility (semisterility).
Solved Problems 6 5 3
k e y t e r ms
acentric chromosome (p. 636) duplication (p. 634) paracentric inversion (p. 642)
acentric fragment (p. 643) embryoid (p. 626) parthenogenesis (p. 619)
adjacent-1 segregation (p. 646) euploid (p. 619) pentaploid (p. 619)
allopolyploid (p. 620) gene balance (p. 632) pericentric inversion (p. 642)
alternate segregation (p. 646) gene-dosage effect (p. 633) polyploid (p. 619)
amphidiploid (p. 623) genetic load (p. 619) polytene chromosome (p. 637)
anaphase bridge (p. 636) hexaploid (p. 619) position-effect variegation (p. 648)
aneuploid (p. 621) homeologous chromosomes (p. 620) pseudodominance (p. 638)
autopolyploid (p. 620) insertional duplication (p. 640) pseudolinkage (p. 647)
balanced rearrangement (p. 636) intragenic deletion (p. 637) rearrangement (p. 634)
balancer (p. 645) inversion (p. 634) segmental duplication (p. 641)
bivalent (p. 621) inversion heterozygote (p. 643) semisterility (p. 646)
chromosome mutation (p. 618) inversion loop (p. 643) tandem duplication (p. 640)
deletion (p. 634) Klinefelter syndrome (p. 630) tetraploid (p. 619)
deletion loop (p. 637) monoploid (p. 619) translocation (p. 635)
deletion mapping (p. 638) monosomic (p. 627) triploid (p. 619)
dicentric bridge (p. 644) multigenic deletion (p. 637) trisomic (p. 627)
dicentric chromosome (p. 636) nonallelic homologous recombination trivalent (p. 621)
disomic (p. 628) (NAHR) (p. 636) Turner syndrome (p. 629)
dosage compensation (p. 634) nondisjunction (p. 628) unbalanced rearrangement (p. 636)
Down syndrome (p. 630) nullisomic (p. 628) univalent (p. 621)
so lv e d p r ob l e ms
SOLVED PROBLEM 1. A corn plant is heterozygous for a To simplify the diagram, we do not show the chromosomes
reciprocal translocation and is therefore semisterile. This divided into chromatids (although they would be at this
plant is crossed with a chromosomally normal strain that is stage of meiosis). We then diagram the first cross:
homozygous for the recessive allele brachytic (b), located on
Translocation strain
chromosome 2. A semisterile F1 plant is then backcrossed to
the homozygous brachytic strain. The progeny obtained
b
show the following phenotypes:
Nonbrachytic Brachytic b
d. There may have been a viable deletion spanning V in the c. Nondisjunction in the normal parent
meiocyte from her normal parent.
This explanation would give a nullisomic gamete that would
Which of these explanations are possible, and which are combine with v to give the F1 waltzer the hemizygous geno-
eliminated by the genetic analysis? Explain in detail. type v. The subsequent matings would be
Solution • v × v/v, which gives v/v and v progeny, all waltzers. This fits.
The best way to answer the question is to take the explana- • v × V/V, which gives V/v and V progeny, all normals. This
tions one at a time and see if each fits the results given. also fits.
a. Mutation V to v • First intercrosses of normal progeny: V × V. These inter-
This hypothesis requires that the exceptional waltzer female crosses give V and V/V, which are normal. This fits.
be homozygous v /v . This assumption is compatible with the • Second intercrosses of normal progeny: V × V/v. These
results of mating her both with waltzer males, which would, intercrosses give 25 percent each of V/V, V/v, V (all normals),
if she is v /v, produce all waltzer offspring (v /v), and with nor- and v (waltzers). This also fits.
mal males, which would produce all normal offspring (V /v).
However, brother–sister matings within this normal progeny This hypothesis is therefore consistent with the data.
should then produce a 3 : 1 normal-to-waltzer ratio. Because
d. Deletion of V in normal parent
some of the brother–sister matings actually produced no
waltzers, this hypothesis does not explain the data. Let’s call the deletion D. The F1 waltzer would be D/v, and
the subsequent matings would be
b. Epistatic mutation s to S
• D/v × v/v, which gives v/v and D/v, which are waltzers. This
Here the parents would be V/V • s /s and v /v • s /s, and a germi- fits.
nal mutation in one of them would give the F1 waltzer the
• D/v ×V/V, which gives V/v and D/V, which are normal. This
genotype V/ v • S /s. When we crossed her with a waltzer male,
fits.
who would be of the genotype v /v • s /s, we would expect some
V/ v • S /s progeny, which would be phenotypically normal. • First intercrosses of normal progeny: D/V × D/V , which
However, we saw no normal progeny from this cross, and so give D/V and V/V, all normal. This fits.
the hypothesis is already overthrown. Linkage could save the • Second intercrosses of normal progeny: D/V × V/v , which
hypothesis temporarily if we assumed that the mutation was give 25 percent each of V/V, V/v, D/V (all normals), and D/v
in the normal parent, giving a gamete V S. Then the F1 waltzer (waltzers). This also fits.
would be V S / v s, and, if linkage were tight enough, few or no
V s gametes would be produced, the type that are necessary to Once again, the hypothesis fits the data provided; so we
combine with the v s gamete from the male to give V s / v s are left with two hypotheses that are compatible with the re-
normals. However, if the linkage hypothesis were true, the sults, and further experiments are necessary to distinguish
cross with the normal males would be V S / v s × V s / V s, and them. One way of doing so would be to examine the chromo-
this would give a high percentage of V S / V s progeny, which somes of the exceptional female under the microscope: aneu-
would be waltzers, none of which were seen. ploidy should be easy to distinguish from deletion.
p r ob l e ms
Most of the problems are also available for review/grading through the http://www.whfreeman.com/
launchpad/iga11e.
6. In Figure 17-12, what would be the constitution of an in- autopolyploid? How would you test that it was a poly-
dividual formed from the union of a monosomic from a ploid and not just growing in rich soil?
first-division nondisjunction in a female and a disomic 23. Is a trisomic an aneuploid or a polyploid?
from a second-division nondisjunction in a male, assum-
24. In a tetraploid B /B / b /b, how many quadrivalent possible
ing the gametes were functional?
pairings are there? Draw them (see Figure 17-5).
7. In Figure 17-14, what would be the expected percentage
25. Someone tells you that cauliflower is an amphidiploid.
of each type of segregation?
Do you agree? Explain.
8. In Figure 17-19, is there any difference between the in-
26. Why is Raphanobrassica fertile, whereas its progenitor
version products formed from breakage and those
wasn’t?
formed from crossing over?
27. In the designation of wheat genomes, how many chro-
9. Referring to Figure 17-19, draw a diagram showing the
mosomes are represented by the letter B?
process whereby an inversion formed from crossing over
could generate a normal sequence. 28. How would you “re-create” hexaploid bread wheat from
Triticum tauschii and Emmer?
10. In Figure 17-21, would the recessive fa allele be expressed
when paired with deletion 264-32? 265-11? 29. How would you make a monoploid plantlet by starting
with a diploid plant?
11. Look at Figure 17-22 and state which bands are missing
in the cri du chat deletion. 30. A disomic product of meiosis is obtained. What is its
likely origin? What other genotypes would you expect
12. In Figure 17-25, which species is most closely related to among the products of that meiosis under your
the ancestral yeast strain? Why are genes 3 and 13 re- hypothesis?
ferred to as duplicate?
31. Can a trisomic A/A/a ever produce a gamete of geno-
13. Referring to Figure 17-26, draw the product if breaks oc- type a?
curred within genes A and B.
32. Which, if any, of the following sex-chromosome aneu-
14. In Figure 17-26, the bottom panel shows that genes B and ploids in humans are fertile: XXX, XXY, XYY, XO?
C are oriented in a different direction (note the promot-
ers). Do you think this difference in orientation would 33. Why are older expectant mothers routinely given amnio-
affect their functionality? centesis or CVS?
15. In Figure 17-28, what would be the consequence of a 34. In an inversion, is a 5′ DNA end ever joined to another 5′
crossover between the centromere and locus A? end? Explain.
16. Based on Figure 17-30, are normal genomes ever formed 35. If you observed a dicentric bridge at meiosis, what rear-
from the two types of segregation? Are normal genomes rangement would you predict had taken place?
ever formed from an adjacent-1 segregation? 36. Why do acentric fragments get lost?
17. Referring to Figure 17-32, draw an inviable product from 37. Diagram a translocation arising from repetitive DNA.
the same meiosis. Repeat for a deletion.
18. Based on Figure 17-35, write a sentence stating how 38. From a large stock of Neurospora rearrangements avail-
translocation can lead to cancer. Can you think of anoth- able from the fungal genetics stock center, what type
er genetic cause of cancer? would you choose to synthesize a strain that had a dupli-
cation of the right arm of chromosome 3 and a deletion
19. Looking at Figure 17-36, why do you think the signal ra-
for the tip of chromosome 4?
tio is so much higher in the bottom panel?
39. You observe a very large pairing loop at meiosis. Is it
20. Using Figure 17-37, calculate what percentage of concep-
more likely to be from a heterozygous inversion or het-
tions are triploid. The same figure shows XO in the spon-
erozygous deletion? Explain.
taneous-abortion category; however, we know that many
XO individuals are viable. In which of the viable catego- 40. A new recessive mutant allele doesn’t show pseudodomi-
ries would XO be grouped? nance with any of the deletions that span Drosophila
chromosome 2. What might be the explanation?
B asic P r ob l e ms 41. Compare and contrast the origins of Turner syndrome,
Williams syndrome, cri du chat syndrome, and Down
21. In keeping with the style of Table 17-1, what would you
syndrome. (Why are they called syndromes?)
call organisms that are MM N OO; MM NN OO; MMM
NN PP? 42. List the diagnostic features (genetic or cytological) that
are used to identify these chromosomal alterations:
22. A large plant arose in a natural population. Qualitatively,
it looked just the same as the others, except much larg- a. Deletions
er. Is it more likely to be an allopolyploid or an b. Duplications
Problems 6 57
unusual waltzers, however, one member of a chromo- q3.1 q2.2 q2.1 q1.2 q1.1 p1.1 p1.2
some pair was abnormally short. Interpret these obser- Borneo
vations as completely as possible, both genetically and
cen
d. What fraction of the gametes produced by a hybrid 13. What were the gametes from the unusual plant, and
orangutan will give rise to viable progeny, if these what were their proportions?
chromosomes are the only ones that differ between the 14. Which gametes were in the majority?
parents? (Problem 48 is from Rosemary Redfield.)
15. Which gametes were in the minority?
49. In corn, the genes for tassel length (alleles T and t) and
16. Which of the progeny types seem to be recombinant?
rust resistance (alleles R and r) are known to be on sepa-
rate chromosomes. In the course of making routine 17. Which allelic combinations appear to be linked in some
crosses, a breeder noticed that one T/t ; R/r plant gave way?
unusual results in a testcross with the double-recessive 18. How can there be linkage of genes supposedly on sepa-
pollen parent t/t ; r/r. The results were rate chromosomes?
19. What do these majority and minority classes tell us about
Progeny: T/t ; R/r 98
the genotypes of the parents of the unusual plant?
t/ t ; r/r 104
20. What is a corncob?
T/t ; r/r 3
21. What does a normal corncob look like? (Sketch one and
t/t ; R/r 5 label it.)
Corncobs: Only about half as many seeds as usual 22. What do the corncobs from this cross look like? (Sketch
one.)
a. What key features of the data are different from the
23. What exactly is a kernel?
expected results?
24. What effect could lead to the absence of half the kernels?
b. State a concise hypothesis that explains the results.
25. Did half the kernels die? If so, was the female or the male
c. Show genotypes of parents and progeny.
parent the reason for the deaths?
d. Draw a diagram showing the arrangement of alleles
on the chromosomes. Now try to solve the problem.
e. Explain the origin of the two classes of progeny
having three and five members. 50. A yellow body in Drosophila is caused by a mutant allele
y of a gene located at the tip of the X chromosome (the
wild-type allele causes a gray body). In a radiation ex-
www
Unpacking Problem 49 periment, a wild-type male was irradiated with X rays
www
1. What do a “gene for tassel length” and a “gene for rust and then crossed with a yellow-bodied female. Most of
resistance” mean? the male progeny were yellow, as expected, but the scan-
2. Does it matter that the precise meaning of the allelic ning of thousands of flies revealed two gray-bodied (phe-
symbols T, t, R, and r is not given? Why or why not? notypically wild-type) males. These gray-bodied males
were crossed with yellow-bodied females, with the fol-
3. How do the terms gene and allele, as used here, relate to
lowing results:
the concepts of locus and gene pair?
Progeny
4. What prior experimental evidence would give the corn
geneticist the idea that the two genes are on separate gray male 1 × yellow female females all yellow
chromosomes? males all gray
5. What do you imagine “routine crosses” are to a corn gray male 2 × yellow female 1 females yellow
2
breeder? 1 females gray
6. What term is used to describe genotypes of the type 2
T/t ; R/r?
1 males yellow
2
7. What is a “pollen parent”? 1 males gray
2
8. What are testcrosses, and why do geneticists find them
so useful? a. Explain the origin and crossing behavior of gray
9. What progeny types and frequencies might the breeder male 1.
have been expecting from the testcross? b. Explain the origin and crossing behavior of gray
10. Describe how the observed progeny differ from male 2.
expectations. 51. In corn, the allele Pr stands for green stems, pr for purple
11. What does the approximate equality of the first two stems. A corn plant of genotype pr/pr that has standard
progeny classes tell you? chromosomes is crossed with a Pr/Pr plant that is homo-
12. What does the approximate equality of the second two zygous for a reciprocal translocation between chromo-
progeny classes tell you? somes 2 and 5. The F1 is semisterile and phenotypically
Problems 6 59
Pr. A backcross with the parent with standard chromo- However, you occasionally find some rare types as
somes gives 764 semisterile Pr, 145 semisterile pr, 186 follows (all other chromosomes are normal):
normal Pr, and 727 normal pr. What is the map distance a. 4K/4K/4
between the Pr locus and the translocation point?
b. 4K/4S/4
52. Distinguish among Klinefelter, Down, and Turner syn-
dromes. Which syndromes are found in both sexes? c. 4K
53. Show how you could make an allotetraploid between two Explain the rare types that you have found. Give, as
related diploid plant species, both of which are 2n = 28. precisely as possible, the stages at which they originate,
54. In Drosophila, trisomics and monosomics for the tiny and state whether they originate in the male parent,
chromosome 4 are viable, but nullisomics and tetraso- the female parent, or the zygote. (Give brief reasons.)
mics are not. The b locus is on this chromosome. Deduce 58. A cross is made in tomatoes between a female plant that
the phenotypic proportions in the progeny of the follow- is trisomic for chromosome 6 and a normal diploid male
ing crosses of trisomics. plant that is homozygous for the recessive allele for po-
a. b+/b/ b × b /b tato leaf (p/p). A trisomic F1 plant is backcrossed to the
b. b+/b+/b × b /b potato-leaved male.
a. What is the ratio of normal-leaved plants to potato-
c. b+/b+/b × b+/b
leaved plants when you assume that p is located on
55. A woman with Turner syndrome is found to be color- chromosome 6?
blind (an X-linked recessive phenotype). Both her mother
and her father have normal vision. b. What is the ratio of normal-leaved to potato-leaved
plants when you assume that p is not located on
a. Explain the simultaneous origin of Turner syndrome
chromosome 6?
and color blindness by the abnormal behavior of
chromosomes at meiosis. 59. A tomato geneticist attempts to assign five recessive
mutations to specific chromosomes by using trisomics.
b. Can your explanation distinguish whether the
She crosses each homozygous mutant (2n) with each of
abnormal chromosome behavior occurred in the father
three trisomics, in which chromosomes 1, 7, and 10 take
or the mother?
part. From these crosses, the geneticist selects trisomic
c. Can your explanation distinguish whether the progeny (which are less vigorous) and backcrosses them
abnormal chromosome behavior occurred at the first or to the appropriate homozygous recessive. The diploid
second division of meiosis? progeny from these crosses are examined. Her results, in
d. Now assume that a color-blind Klinefelter man has which the ratios are wild type : mutant, are as follows:
parents with normal vision, and answer parts a, b, and c.
Trisomic Mutation
56. a. How would you synthesize a pentaploid?
chromosome d y c h cot
b. How would you synthesize a triploid of genotype
A/a/a? 1 48:55 72:29 56:50 53:54 32:28
c. You have just obtained a rare recessive mutation a* 7 52:56 52:48 52:51 58:56 81:40
in a diploid plant, which Mendelian analysis tells you is 10 45:42 36:33 28:32 96:50 20:17
A/a*. From this plant, how would you synthesize a
tetraploid (4n) of genotype A/A/a*/a*? Which of the mutations can the geneticist assign to
d. How would you synthesize a tetraploid of genotype which chromosomes? (Explain your answer fully.)
A/a/a/a? 60. A petunia is heterozygous for the following autosomal
homologs:
57. Suppose you have a line of mice that has cytologically
distinct forms of chromosome 4. The tip of the chromo- A B C D E F G H I
some can have a knob (called 4K) or a satellite (4S) or
neither (4). Here are sketches of the three types: a b c d h g f e i
4K
a. Draw the pairing configuration that you would see
at metaphase I, and identify all parts of your diagram.
4S Number the chromatids sequentially from top to
bottom of the page.
4 b. A three-strand double crossover occurs, with one
crossover between the C and D loci on chromatids 1 and
You cross a 4K/4S female with a 4/4 male and find that 3, and the second crossover between the G and H loci on
most of the progeny are 4K/4 or 4S/4, as expected. chromatids 2 and 3. Diagram the results of these
6 6 0 CHAPTER 1 7 Large-Scale Chromosomal Changes
recombination events as you would see them at anaphase 63. The following corn loci are on one arm of chromosome
I, and identify all parts of your diagram. 9 in the order indicated (the distances between them
c. Draw the chromosome pattern that you would see are shown in map units):
at anaphase II after the crossovers described in part b. c-bz-wx-sh-d-centromere
d. Give the genotypes of the gametes from this meiosis 12 8 10 20 10
that will lead to the formation of viable progeny. Assume
C gives colored aleurone ; c, white aleurone.
that all gametes are fertilized by pollen that has the gene
order A B C D E F G H I. Bz gives green leaves; bz, bronze leaves.
61. Two groups of geneticists, in California and in Chile, be- Wx gives starchy seeds; wx, waxy seeds.
gin work to develop a linkage map of the medfly. They Sh gives smooth seeds; sh, shrunken seeds.
both independently find that the loci for body color D gives tall plants; d, dwarf.
(B = black, b = gray) and eye shape (R = round, r =
star) are linked 28 m.u. apart. They send strains to each A plant from a standard stock that is homozygous for all
other and perform crosses; a summary of all their find- five recessive alleles is crossed with a wild-type plant
ings is shown here: from Mexico that is homozygous for all five dominant
Progeny of alleles. The F1 plants express all the dominant alleles
Cross F1 F1× any b r/b r and, when backcrossed to the recessive parent, give the
following progeny phenotypes:
B R/B R (Calif.) B R/b r B R/b r 36%
× b r/b r (Calif.) b r/b r 36 colored, green, starchy, smooth, tall 360
B r/b r 14 white, bronze, waxy, shrunk, dwarf 355
b R/b r 14 colored, bronze, waxy, shrunk, dwarf 40
B R/B R (Chile) B R/b r B R/b r 36 white, green, starchy, smooth, tall 46
× b r/b r (Chile) b r/b r 36 colored, green, starchy, smooth, dwarf 85
B r/b r 14 white, bronze, waxy, shrunk, tall 84
b R/b r 14 colored, bronze, waxy, shrunk, tall 8
B R/B R (Calif.) B R/b r B R/b r 48 white, green, starchy, smooth, dwarf 9
× b r/b r (Chile) or b r/b r 48 colored, green, waxy, smooth, tall 7
b r/b r (Calif.) B r/b r 2
× B R/B R (Chile) white, bronze, starchy, shrunk, dwarf 6
b R/b r 2
Propose a hypothesis to explain these results. Include
a. Provide a genetic hypothesis that explains the three
sets of testcross results. a. a general statement of your hypothesis, with dia-
b. Draw the key chromosomal features of meiosis in the grams if necessary;
F1 from a cross of the Californian and Chilean lines. b. why there are 10 classes;
62. An aberrant corn plant gives the following RF values c. an account of the origin of each class, including its
when testcrossed: frequency; and
Interval d. at least one test of your hypothesis.
d-f f-b b-x x-y y-p 64. Chromosomally normal corn plants have a p locus on
chromosome 1 and an s locus on chromosome 5.
Control 5 18 23 12 6
P gives dark green leaves; p, pale green leaves.
Aberrant plant 5 2 2 0 6
S gives large ears; s, shrunken ears.
(The locus order is centromere-d–f–b–x–y–p.) The aberrant An original plant of genotype P/p ; S/s has the expected
plant is a healthy plant, but it produces far fewer normal phenotype (dark green, large ears) but gives unexpected
ovules and pollen than does the control plant. results in crosses as follows:
a. Propose a hypothesis to account for the abnormal • On selfing, fertility is normal, but the frequency of
recombination values and the reduced fertility in the p/p ; s/s types is 1/4 (not 1/16 as expected).
aberrant plant. • When crossed with a normal tester of genotype p/p;
b. Use diagrams to explain the origin of the recombinants s/s, the F1 progeny are 1 ; P/p ; S/s and 1 ; p/p ; s/s;
2 2
according to your hypothesis. fertility is normal.
Problems 6 61
G. hirsutum 13 large bivalents 73. In Drosophila, a cross (cross 1) was made between two
mutant flies, one homozygous for the recessive mutation
× G. herbaceum + 13 small univalents
bent wing (b) and the other homozygous for the recessive
G. thurberi 13 large univalents mutation eyeless (e). The mutations e and b are alleles of
two different genes that are known to be very closely
× G. herbaceum + 13 small univalents
linked on the tiny autosomal chromosome 4. All the prog-
eny had a wild-type phenotype. One of the female prog-
Interpret these observations phylogenetically, using eny was crossed with a male of genotype b e / b e ; we will
diagrams. Clearly indicate the relationships between the call this cross 2. Most of the progeny of cross 2 were of the
species. How would you prove that your interpretation is expected types, but there was also one rare female of
correct? wild-type phenotype.
71. There are six main species in the Brassica genus: B. cari- a. Explain what the common progeny are expected to
nata, B. campestris, B. nigra, B. oleracea, B. juncea, and be from cross 2.
B. napus. You can deduce the interrelationships among b. Could the rare wild-type female have arisen by (1)
these six species from the following table: crossing over or (2) nondisjunction? Explain.
Problems 6 6 3
c. The rare wild-type female was testcrossed to a male producing beige ascospores, is in a gene just to the left of
of genotype b e / b e (cross 3). The progeny were the same centromere. In a cross of fawn and beige parents
1 1 (+f × b+), most octads showed four fawn and four beige
6 wild type 3 bent
1 1
ascospores, but three rare exceptional octads were found,
6 bent, eyeless 3 eyeless as shown in the accompanying illustration. In the sketch,
Which of the explanations in part b is compatible with black is the wild-type phenotype, a vertical line is fawn, a
this result? Explain the genotypes and phenotypes of the horizontal line is beige, and an empty circle represents an
progeny of cross 3 and their proportions. aborted (dead) ascospore.
www
Unpacking Problem 73
www
1. Define homozygous, mutation, allele, closely linked, reces-
sive, wild type, crossing over, nondisjunction, testcross, phe- Double mutants (bf )
notype, and genotype.
2. Does this problem concern sex linkage? Explain.
3. How many chromosomes does Drosophila have? 1 2 3
4. Draw a clear pedigree summarizing the results of crosses a. Provide reasonable explanations for these three ex-
1, 2, and 3. ceptional octads.
5. Draw the gametes produced by both parents in cross 1. b. Diagram the meiosis that gave rise to octad 2.
6. Draw the chromosome 4 constitution of the progeny of 75. The life cycle of the haploid fungus Ascobolus is similar to
cross 1. that of Neurospora. A mutational treatment produced
7. Is it surprising that the progeny of cross 1 are wild-type two mutant strains, 1 and 2, both of which when crossed
phenotype? What does this outcome tell you? with wild type gave unordered tetrads, all of the following
8. Draw the chromosome 4 constitution of the male tester type (fawn is a light brown color; normally, crosses pro-
used in cross 2 and the gametes that he can produce. duce all black ascospores):
9. With respect to chromosome 4, what gametes can the spore pair 1 black spore pair 3 fawn
female parent in cross 2 produce in the absence of non-
spore pair 2 black spore pair 4 fawn
disjunction? Which would be common and which rare?
10. Draw first- and second-division meiotic nondisjunction a. What does this result show? Explain.
in the female parent of cross 2, as well as in the resulting The two mutant strains were crossed. Most of the un-
gametes. ordered tetrads were of the following type:
11. Are any of the gametes from part 10 aneuploid? spore pair 1 fawn spore pair 3 fawn
12. Would you expect aneuploid gametes to give rise to via- spore pair 2 fawn spore pair 4 fawn
ble progeny? Would these progeny be nullisomic, mono-
somic, disomic, or trisomic? b. What does this result suggest? Explain.
13. What progeny phenotypes would be produced by the When large numbers of unordered tetrads were screened
various gametes considered in parts 9 and 10? under the microscope, some rare ones that contained
14. Consider the phenotypic ratio in the progeny of cross 3. black spores were found. Four cases are shown here:
Many genetic ratios are based on halves and quarters, Case A Case B Case C Case D
but this ratio is based on thirds and sixths. To what might
spore pair 1 black black black black
this ratio point?
spore pair 2 black fawn black abort
15. Could there be any significance to the fact that the
crosses concern genes on a very small chromosome? spore pair 3 fawn fawn abort fawn
When is chromosome size relevant in genetics? spore pair 4 fawn fawn abort fawn
16. Draw the progeny expected from cross 3 under the two
(Note: Ascospores with extra genetic material survive,
hypotheses, and give some idea of relative proportions.
but those with less than a haploid genome abort.)
c. Propose reasonable genetic explanations for each of
74. In the fungus Ascobolus (similar to Neurospora), asco-
these four rare cases.
spores are normally black. The mutation f, producing
fawn-colored ascospores, is in a gene just to the right of d. Do you think the mutations in the two original
the centromere on chromosome 6, whereas mutation b, mutant strains were in one single gene? Explain.
This page intentionally left blank
344
18
Ch a p t e r
Population
Genetics
Learning Outcomes
After completing this chapter, you will be able to
Artist Lynn Fellman’s conception of the “Eurasian Adam,” an African man with a
Y chromosome belonging to a haplotype group that was ancestral to all
Y chromosomes of men outside Africa and arose within Africa approximately
70,000 years ago. [ Lynn Fellman www.Fellmanstudio.com.]
outline
665
6 6 6 C H APTER 1 8 Population Genetics
I
n 2009, Sean Hodgson was released from a British prison after serving 27 years
behind bars for the murder of Teresa De Simone, a clerk and part-time barmaid.
Hodgson, who suffers from mental illness, initially confessed to the crime but
withdrew his confession during the trial. Throughout his years in prison, he main-
tained his innocence. More than two decades after the crime, the courts analyzed
DNA of the assailant found at the crime scene and determined that it did not come
from Mr. Hodgson. His conviction was overturned, and the police have now
reopened the investigation of Ms. De Simone’s murder. As you will learn in this
chapter, the DNA-based analysis used to exonerate Mr. Hodgson and hundreds of
other wrongly convicted prisoners was dependent on population genetic analysis.
The principles of population genetics are at the heart of many questions facing
society today. What are the risks that a couple will have a child with a genetic dis-
ease? Have the practices of plant and animal breeding caused a loss of genetic
diversity on the farm, and does this loss of diversity place our food supply at risk?
As the human population continues to expand and wildlife retreats into smaller
and smaller parts of the earth, will wildlife species be able to avoid inbreeding and
survive? The principles of population genetics are also fundamental to under-
standing many historical and evolutionary questions. How are human populations
from different regions of the world related to one another? How has the human
genome responded as humans have spread out across the globe and become
adapted to different environments and lifestyles? How do populations and species
evolve over time?
A population is a group of individuals of the same species. Population genetics
analyzes the amount and distribution of genetic variation in populations and the
forces that control this variation. It has its roots in the early 1900s, when geneticists
began to study how Mendel’s laws could be extended to understand genetic varia-
tion within whole populations of organisms. While Mendel’s laws explain how genes
are passed from parent to offspring in the cases of controlled crosses and known
pedigrees, these laws are insufficient to understand the transmission of genes from
one generation to the next in natural populations, in which not all individuals pro-
duce offspring and not all offspring survive. Geneticists began developing the prin-
ciples of population genetics in the early 1900s, but at the time, they had rather
limited tools to actually measure genetic variation. With the development of DNA-
based technologies over the past three decades, geneticists now have the ability to
observe directly differences between the DNA sequences of individuals throughout
their genomes, and they can measure these differences in large samples of individu-
als in many species. The result has been a revolution in our understanding of
genetic variation in populations.
In this chapter, we will consider the concept of the gene pool and how geneti-
cists estimate allele and genotype frequencies in populations. Next, we will exam-
ine the impact that mating systems have on the frequencies of genotypes in a
population. We will also discuss how geneticists measure variation using DNA-
based technologies. We will then discuss the forces that modulate the levels of
genetic variation within populations. Finally, we will look at some case studies
involving the application of population genetics to questions of interest to society.
Nucleotide position
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35
Chromosomes from
seven individuals
1 G G C A T C G C G C C G T T A C G T A G A G A G A G A G G T G A A T C
2 G G C A A C G C G C C G T T A C G T A G A G A G A G A G G T G A A T C
3 G G G A T C G C G C C G T T A C G T A G A G A G A G A G G T G A A T C
4 G C C A T C G C T C C G T T A C T T A G A G A G - - - - G T T A G T C
5 G C C A T C G C T C - - - T A C T T A G A G A G - - - - G T T A G T C
6 G C C A T C G C T C - - - T A C T T A G A G A G - - - - G T T A G T C
7 G C C A T C G C T C C G T T A C T T A G A G A G - - - - C T T A G T C
* * * * * * * *
Indel Microsatellite
group in humans can be detected using antibody probes. From these protein differ-
ences, investigators can infer differences in the DNA sequence of this gene among
individuals. Over the past three decades, new technologies, such as DNA sequenc-
ing, DNA microarrays, and PCR (see Chapters 10 and 14), have been developed that
allow geneticists to observe differences in the DNA sequences directly. As a result,
population genetic analyses are no longer confined to a small set of genes such as
ABO but have expanded to include every nucleotide in the genome.
In population genetics, a locus is simply a location in the genome; it can be a
single nucleotide site or a stretch of many nucleotides. The simplest form of varia-
tion one might observe among individuals at a locus is a difference in the nucleo-
tide present at a single nucleotide site, whether adenine, cytosine, guanine, or
thymine. These types of variants are called single nucleotide polymorphisms
(SNPs), and they are the most widely studied variants in human population
genetics (Figure 18-1; see also Chapter 4). Population genetics also makes exten-
sive use of microsatellite loci (see Chapter 4). These loci have a short sequence
motif, 2 to 6 base pairs long, that is repeated multiple times with different alleles
having different numbers of repeats. For example, the 2-bp-sequence motif AG at
a locus might be tandemly repeated five times in one allele (AGAGAGAGAG) but
three times in another (AGAGAG) (see Figure 18-1).
(b)
with a probe for the motif of interest (for example, AG repeats), and determines
Detecting variation
the DNA sequence of the selected clones to identify the microsatellites and the
in microsatellites
sequences that flank them. The molecular methods for doing this type of work
were discussed in Chapter 10.
Individuals
Once a microsatellite and its flanking sequences have been identified, DNA 1 2 3
samples from a set of individuals in the population can be analyzed to determine
the number of repeats that are present in each individual. To carry out the analy-
sis, oligonucleotide primers are designed that match the flanking sequences for Locus 1
use in PCR. If the primers are labeled with a fluorescent tag, then the sizes of the
PCR products can be determined on the same apparatus used to determine the
sequence of DNA molecules (Figure 18-3). These sizes reveal the number of
Locus 2
Migration
repeats in a microsatellite allele. For example, the PCR product of a microsatellite
allele containing seven AG repeats will be 8 bp longer than an allele containing
three AG repeats. Heterozygous individuals will possess products of two different Locus 3
sizes. Since PCR, the sizing of PCR products, and scoring of the alleles can all be
automated, it is possible to determine the genotypes of large samples of individu-
als for large numbers of microsatellites relatively rapidly. Locus 4
Haplotypes Locus 5
A B
A b
a B
a b
A more complex, but more realistic, example is shown in Figure 18-4. In Figure
18-4a, there are seven chromosome segments but only six haplotypes because
chromosome segments 5 and 6 have the same haplotype (E).
Haplotypes are most often used in population genetics for loci that are physi-
cally close. For example, the variable-nucleotide sites in a single gene can be used to
define haplotypes for that gene. However, the haplotype concept works for larger
regions when there is little or no recombination over the region. It can even be
applied to an entire chromosome such as the human Y chromosome. Finally, it is
sometimes useful to group haplotypes into classes. As shown in Figure 18-4a, there
are two major classes of haplotypes (I and II) that differ at five nucleotide sites plus
a microsatellite. However, each class contains several subtypes (I-a, I-b, . . .). The
haplotype network shows the relationships among the haplotypes, placing each
mutation on one of the branches (Figure 18-4b).
What insights can we gain from haplotype analysis? Population geneticists
studying the human Y chromosome among Asian men discovered one highly preva-
lent haplotype, termed the “star-cluster” haplotype (Figure 18-5a). Typically, most
men have a rare Y chromosome haplotype, but the “star-cluster” haplotype is pres-
ent in 8 percent of Asian men. Using the known mutation rate, the researchers
estimated that this common haplotype arose between 700 and 1300 years ago.
(Later in this chapter, we will discuss mutation rates and their use in population
670 C H APTER 1 8 Population Genetics
Haplotype
Haplotype
(a) Haplotypes
Nucleotide position
class
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35
Chromosomes from
seven individuals
1 G G C A T C G C G C C G T T A C G T A G A G A G A G A G G T G A A T C A I-a
2 G G C A A C G C G C C G T T A C G T A G A G A G A G A G G T G A A T C B I-b
3 G G G A T C G C G C C G T T A C G T A G A G A G A G A G G T G A A T C C I-c
4 G C C A T C G C T C C G T T A C T T A G A G A G - - - - G T T A G T C D II-a
5 G C C A T C G C T C - - - T A C T T A G A G A G - - - - G T T A G T C E II-b
6 G C C A T C G C T C - - - T A C T T A G A G A G - - - - G T T A G T C E II-b
7 G C C A T C G C T C C G T T A C T T A G A G A G - - - - C T T A G T C F II-c
* * * * * * * *
Indel Microsatellite
C F
SN
P3
P2
SN
9
SN
A D
l
de
P5
In
Microsatellite
SNP2
SNP9
SNP17
SNP31
SNP33
B
E
F i g u r e 18 - 4 (a) There are a total of six haplotypes (A–F) in the aligned DNA sequences
from seven individual chromosomes from different people. (b) These six haplotypes are
joined in a haplotype network showing the relationships among the haplotypes. Each circle
represents one of the six haplotypes. Any two haplotypes differ at the loci noted on all of the
branches connecting them. The asterisks show the location of SNPs.
(b)
RUSSIA
Ewenki Han
Ossetian MONGOLIA Daur (heilongjiang)
Georgian
Han
Aral (Inner-
Lezgi Sea Mongolian) Chinese
Korean
Armenian Azeri TURKMENISTAN Inner
Kazak Uyghur Mongolian
Japanese
Chinese Mongolian
Uzbek Uyghur Manchu
Korean
Kyrgyz Xibe
Kurd Tajik
Turkmen Chinese Huizu
Kazak
Han
Kalash xingjiang
Han
AFGHANISTAN Balti (Gansu)
CHINA PACIFIC OCEAN
IRAN Hazara
Brahui Baloch
Burusho Shezu
Tibetan Han
Qiangzu (Sichuan)
PAKISTAN
Makrani
Baloch Makrani
n>60
Buyi Yaozu
Negroid NEPAL (Liannan)
Parsi
BHUTAN Han
Hani Yaozu (Guangdong)
(Bamma)
BANGLADESH Lizu
INDIA n=30
Star-cluster
chromosomes
and microsatellites are found in these organelle genomes. Since mtDNA and
cpDNA are usually maternally inherited, their analysis can be used to follow the
history of female lineages. In 1987, a prominent study of the human mitochon-
drial lineage traced the history of the human mtDNA haplotypes and determined
that the mitochondrial genomes of all modern humans trace back to a single
woman who
Introduction tolived in Analysis,
Genetic Africa about
11e 150,000 years ago (Figure 18-6). She was dubbed
Figure 18.05 #1810
the “mitochondrial Eve” in the popular press. This study of mtDNA was the first
07/08/14
thorough genetic analysis to suggest that all modern humans came from Africa.
07/23/14
Dragonfly Media Group
The HapMap Project
A major advance in human population genetics over the past decade was the cre-
ation of a genome-wide haplotype map, or HapMap. A consortium of scientists
around the world genotyped thousands of people representing the diversity of our
species for hundreds of thousands of SNPs and microsatellites. The result is a highly
672 C H APTER 1 8 Population Genetics
C, D, G
A
H, J, T, U, Uk, V C, D
A
I, W, X
B
B
N
F
B
L3
L2
M
L1
L0
A
C, D
Q
S B
P
F i g u r e 18 - 6 The haplotype network detailed picture of variation in our species. The data are available to the public at
for human mtDNA haplotype groups several Web sites, including that of the International HapMap Project (www.
drawn onto a world map. The ancestral hapmap.org) and the Human Genome Diversity Project (hgdp.uchicago.edu). In this
L haplotype group appears in Africa, and
the derived groups (A, B, and so on) are
chapter, we will use these data to present the principles of population genetics.
dispersed throughout the world. [Data from Although first developed for humans, HapMaps have since been developed for sev-
www.mitomap.org.] eral other species, including Drosophila, mouse, Arabidopsis, rice, and maize.
At a locus with two alleles A and a, let’s define the fre- q = fa / a + 21 f A / a = frequency of a
quencies of the three genotypes A /A, A/a, and a /a as
fA/A, fA/a, and fa/a, respectively. We can use these genotype Therefore,
frequencies to calculate the allele frequencies: p is the
frequency of the A allele, and q is the frequency of the a p + q = fA/A + fA/a + fa/a = 1.0
allele. Because each homozygote A/A consists only of A
alleles and because half the alleles of each heterozygote and
A /a are A alleles, the total frequency p of A alleles in the q=1−p
population is calculated as
If there are more than two different allelic forms, the fre-
p = f A / A + 21 f A / a = frequency of A quency for each allele is simply the frequency of its
homozygote plus half the sum of the frequencies for all
Similarly, the frequency q of the a allele is given by the heterozygotes in which it appears.
674 C H APTER 1 8 Population Genetics
an egg or a sperm. Knowing this, we can calculate the probability that a frog in the
next generation will be an A/A homozygote. If we reach into the frog gene pool
(see Figure 18-7) and pick the first allele, the probability that it will be an A is
p = 0.56, and similarly the probability that the second allele we pick is also an A
is p = 0.56. The product of these two probabilities, or p2 = 0.3136, is the probabil-
ity that a frog in the next generation will be A/A. The probability that a frog in the
next generation will be a/a is q2 = 0.44 × 0.44 = 0.1936. There are two ways to
make a heterozygote. We might first pick an A with probability p and then pick an
a with probability q, or we might pick the a first and the A second. Thus, the prob-
ability that a frog in the next generation will be heterozygous A/a is pq + qp = 2pq
= 0.4928. Overall, the frequencies (f ) of the genotypes are
fA/A = p2
fa/a = q2
fA/a = 2pq
Finally, as expected, the sum of the probability of being A/A plus the probability
of being A/a plus the probability of being a/a is 1.0:
p2 + 2pq + q2 = 1.0
This simple equation is the Hardy–Weinberg law, and it is part of the foundation
for the theory of population genetics.
The process of reaching into the gene pool to pick an allele is called sampling
the gene pool. Since any individual that contributes to the gene can produce many
eggs or sperm that carry exactly the same copy of an allele, it is possible to pick a
particular copy and then reach back into the gene pool and pick exactly the same
copy again. There is also an element of chance involved when sampling the gene
A form of albinism pool. Just by chance, some copies may be picked more than once and other copies
common among some may not be picked at all. Later in the chapter, we will look at how these properties
African ethnic groups of sampling the gene pool can lead to changes in the gene pool over time.
We used the Hardy–Weinberg law to calculate genotype frequencies in the
next generation from the allele frequencies in the current generation. We can also
use the Hardy–Weinberg law to calculate allele frequencies from the genotype
frequencies within a single generation. For example, some forms of albinism in
humans are due to recessive alleles at the OCA2 locus. In Africa, a form of albi-
nism called brown oculocutaneous albinism results from a recessive allele of
OCA2 (Figure 18-8). Individuals with this condition are present at frequencies as
high as 1 in 1100 among some ethnic groups in Africa. We can use the Hardy–
Weinberg law to calculate the allele frequencies:
and
p = 1 - q = 0.97
Using the allele frequencies, we can also calculate the frequency of heterozygotes
F i g u r e 18 - 8 Individual of African in the population as
ancestry with brown oculocutaneous
albinism (BOCA), a condition defined 2pq = 2 × 0.97 × 0.03 = 0.06
by light tan skin and beige to light
brown hair. [Dr. Michele Ramsay,
The latter number predicts that about 6 percent of this population are heterozy-
Department of Human Genetics, School of gotes, or carriers of the recessive allele at OCA2.
Pathology, the National Health Laboratory When we use the Hardy–Weinberg law to calculate allele or genotype frequen-
Service University of Witwatersrand.] cies, we make some critical assumptions.
18.2 The Gene-Pool Concept and the Hardy–Weinberg Law 675
• First, we assume that mating is random in the population with respect to the
gene in question. Deviation from random mating violates this assumption,
making it inappropriate to apply Hardy–Weinberg. For example, a tendency for
individuals who are phenotypically similar to mate with each other violates the
Hardy–Weinberg law. If albinos mated more frequently with other albinos than
with non-albinos, then the Hardy–Weinberg law would overestimate the fre-
quency of the recessive allele.
• Second, if one of the genotypes has reduced viability such that some individuals
with that genotype die before the genotype frequencies are counted, then the
estimate of the gene frequencies will be inaccurate.
• Third, for the Hardy–Weinberg law to apply, the population must not be divid-
ed into subpopulations that are partially or fully genetically isolated. If there
are separate subpopulations, alleles may be present at different frequencies in
the different subpopulations. If so, using genotypic counts from the overall
population may not give an accurate estimate of the overall allele frequencies.
• Finally, the Hardy–Weinberg law strictly applies only to infinitely large popula-
tions. For finite populations, there will be deviations from the frequencies pre-
dicted by the Hardy–Weinberg law due to chance when sampling the gene pool
to produce the next generation.
We have seen how we can use the Hardy–Weinberg law and the gene frequen-
cies in the current generation (t0) to calculate genotype frequencies in the next gen-
eration (t1) by randomly sampling the gene pool for the production of eggs and
sperm. Similarly, the predicted genotype frequencies for generation t1 can be used
in turn to calculate gene frequencies for the next generation (t2). The gene frequen-
cies in generation t2 will remain the same as generation t1. Under the Hardy–Wein-
berg law, neither gene nor genotype frequencies change from one generation to the
next when an infinitely large population is randomly sampled for the formation of
eggs and sperm. Thus, an important lesson from the Hardy–Weinberg law is that, in
large populations, genetic variation is neither created nor destroyed by the process
of transmitting genes from one generation to the next. Populations that adhere to
this principle are said to be at Hardy–Weinberg equilibrium.
Genotype frequencies Gene frequencies
Generation A/A A/a a/a A a
t0 0.64 0.32 0.04 0.8 0.2
t1 0.64 0.32 0.04 0.8 0.2
...
...
...
...
...
...
and the frequencies of the different heterozygous classes are two times the prod-
uct of the frequencies of the first and second allele. Table 18-1 gives an example
with p1 = 0.5, p2 = 0.3, and p3 = 0.2.
Table 18-1 H
ardy–Weinberg Genotype
Frequencies for a Locus with Three
Alleles A1, A2, and A3 with Frequencies
0.5, 0.3, and 0.2, Respectively
Table 18-2
Frequencies of SNP rs9272426 Genotypes in HLA-DQA1 of the MHC
Locus for People from Tuscany, Italy
Genotypes
A/A A/G G/G Sum
Observed number 17 55 12 84
Observed frequency 0.202 0.655 0.143 1
Expected frequency 0.281 0.498 0.221 1
Expected number 23.574 41.851 18.574 84
(Observed – expected)2/expected 1.833 4.131 2.327 8.29
Source: International HapMap Project (www.hapmap.org).
random. It also assumes that all genotypes are equally fit—that is, that they are all
equally viable and have the same success at reproduction. Real populations deviate
from this idealized one. In the rest of the chapter, we will examine how factors such
as nonrandom mating, finite population size, and the unequal fitness of different
genotypes cause deviations from Hardy–Weinberg expectations. We will also see
how the Hardy–Weinberg law can be modified to compensate for these factors.
Assortative mating
Assortative mating occurs if individuals choose mates based on resemblance to
themselves. Positive assortative mating occurs when similar types mate; for
example, if tall individuals preferentially mate with other tall individuals and short
individuals mate with other short individuals. In these cases, genes controlling the
difference in height will not follow the Hardy–Weinberg law. Rather, we’d expect to
see an excess of homozygotes for the “tall” alleles among the progeny of tall mating
pairs and an excess of homozygotes for “short” alleles among the progeny of short
mating pairs. In humans, there is positive assortative mating for height. F i g u r e 18 -10 Disassortative mating
Negative assortative or disassortative mating occurs when unlike individu- caused by the self-incompatibility locus
als mate—that is, when opposites attract. One example of negative assortative mat- (S) of the flowering plant genus Brassica.
(a) A self-pollinated S1/S2 stigma shows
ing is provided by the self-incompatibility, or S, locus in plants such as Brassica
no pollen-tube growth. (b) There is
(broccoli and its relatives). There are numerous alleles at the S locus, S1, S2, S3, and pollen-tube growth for an S1/S2 stigma
so forth. The stigma of a plant will not be receptive to pollen that carries either of its cross-pollinated with pollen from an S3 /S4
own two alleles (Figure 18-10). For example, the stigma of an S1/S2 heterozygote will heterozygote. [June Bowman Nasrallah.]
678 C H APTER 1 8 Population Genetics
not allow pollen grains carrying either an S1 or S2 allele to germinate and fertilize its
ovules, although pollen grains carrying the S3 or S4 alleles can do so. This mecha-
nism blocks self-fertilization, thereby enforcing cross-pollination. The S locus vio-
lates the Hardy–Weinberg law since homozygous genotypes at S are not formed.
A second example of negative assortative mating is provided by the major
histocompatibility complex (MHC), which is known to influence mate choice in
vertebrates. MHC affects body odor in mice and rats, providing a basis for mate
choice. In what are known as the “sweaty T-shirt experiments,” researchers asked
a group of men to wear T-shirts for two days. Then they asked a group of women
to smell the T-shirts and rate them for “pleasantness.” Women preferred the scent
of men whose MHC haplotypes were different from their own. Data from the
human HapMap project have since confirmed that American couples are signifi-
cantly more heterozygous at the MHC than expected by chance. The MHC plays
a central role in our immune response to pathogens, and heterozygotes may be
more resistant to pathogens. Therefore, our offspring benefit if we mate disassor-
tatively with respect to our MHC genotype. This mechanism may explain why the
SNP in the MHC gene HLA-DQA1 that we discussed above does not follow the
Hardy–Weinberg law among residents of Tuscany. Look back at Table 18-2 and
you will notice that there are more heterozygotes than expected, 55 versus 42.
Tuscans appear to be practicing disassortative mating with respect to this SNP.
Isolation by distance
Another form of bias in mate choice arises from the amount of geographic dis-
tance between individuals. Individuals are more apt to mate with a neighbor than
another member of their species on the opposite side of the continent—that is,
individuals can show isolation by distance. As a consequence, allele and geno-
type frequencies often differ between fish in separate lakes or between pine trees
in different regions of a continent. Species or populations exhibiting such pattern-
ing of genetic variation are said to show population structure. A species can be
divided into a series of subpopulations such as frogs in different ponds or people
in different cities.
If a species has population structure, the proportion of homozygotes will be
greater species-wide than expected under the Hardy–Weinberg law. Consider a
hypothetical example of a species of wild sunflowers distributed across Kansas
with a gradient in the frequency of the A allele from 0.9 near Kansas City to 0.1
near Elkhart (Figure 18-11a). We sample 100 sunflower plants from each of these
two cities plus 100 from Hutchinson, in the middle of the state, and we calculate
allele frequencies. Each city represents a subpopulation. For any of the three cit-
ies, the Hardy–Weinberg law works fine. For example, in Elkhart, we expect Nq2 =
100 × (0.9)2 = 81 a/a homozygotes, and that is what we observe. However, state-
wide, we’d predict Nq2 = 300 × (0.5)2 = 75 a/a homozygotes, yet we observed 107.
Because of population structure, there are more homozygous sunflower plants
than expected.
Number of individuals
N A/A A/a a/a p q
Kansas City 100 81 18 1 0.90 0.10
Hutchinson 100 25 50 25 0.50 0.50
Elkhart 100 1 18 81 0.10 0.90
State-wide (observed) 300 107 86 107 0.50 0.50
State-wide (expected) 300 75 150 75 — —
Here is a real example of population structure from our own species. In Africa,
the FY null allele of the Duffy blood group shows a gradient with a low frequency in
18.3 Mating Systems 679
Hutchinson
Elkhart
(b)
Frequency of FY null
10 – 50
50 – 70
70 – 75
75 – 80
80 – 85
85 – 90
90 – 95
95 – 100
eastern and northern Africa, moderate frequency in southern Africa, and high fre-
quency across central Africa (Figure 18-11b). This allele is rare outside of Africa.
Because of this gradient, we cannot use overall allele frequencies in Africa to calcu-
late genotype frequencies using the Hardy–Weinberg law. Later in the chapter and
in Chapter 20, we will discuss the relationship between FYnull and malaria.
Inbreeding
The third type of bias in mating is inbreeding, or mating between relatives. Long
before anyone knew about deleterious recessive alleles, some societies recognized
that disorders such as muteness, deafness, and blindness were more frequent
among the children of marriages between relatives. Accordingly, brother–sister
and first-cousin marriages were either outlawed or discouraged. Nevertheless,
many famous individuals have married a cousin, including Charles Darwin, Albert
Einstein, J. S. Bach, Edgar Allan Poe, Jesse James, and Queen Victoria. As we will
see, the offspring of marriages between relatives are at higher risk of having an
inherited disorder.
Progeny of inbreeding are more likely to be homozygous at any locus than
progeny of non-inbred matings. Thus, they are more likely to be homozygous for
6 8 0 C H APTER 1 8 Population Genetics
deleterious recessive alleles. For this reason, inbreeding can lead to a reduction in
vigor and reproductive success called inbreeding depression. However, inbreed-
ing can have advantages too. Many plant species are highly self-pollinating and
highly inbred. These include the model plant Arabidopsis, a successful weed, and
the productive cereal crops rice and wheat. Since most plant species bear male
and female organs on the same individual, self-pollination can be accomplished
more easily than outcrossing. Another advantage of self-pollination is that when
a single seed is dispersed to a new location, the plant that grows from the seed has
a ready mate—itself, enabling a new population to be established from a single
seed. Finally, if an individual plant has a beneficial combination of alleles at dif-
ferent loci, then inbreeding preserves that combination. In selfing plant species,
benefits such as these offer advantages that outweigh the cost associated with
Pedigrees show when genes inbreeding depression.
are identical by descent
The inbreeding coefficient
(a) A Inbreeding increases the risk that an individual will be homozygous for a reces-
sive deleterious allele and exhibit a genetic disease. The amount that risk increases
depends on two factors: (1) the frequency of the deleterious allele in the popula-
tion and (2) the degree of inbreeding. To measure the degree of inbreeding,
geneticists use the inbreeding coefficient (F), which is the probability that two
alleles in an individual trace back to the same copy in a common ancestor. Let’s
B C first consider how to calculate F using pedigrees and then examine how F can be
used to determine the increase in risk of inheriting a recessive disease condition.
Consider a simple pedigree for a mating between half-sibs, individuals who have
one parent in common (Figure 18-12a). In the figure, B and C are half-sibs who have
the same mother, A, but different fathers; B and C have a daughter, I. Notice that
there is a closed loop from I through B and A and back to I through C. The presence
I of a closed loop in the pedigree informs us that I is inbred. The two copies of the
gene in A are colored blue and pink—the blue from A’s father and pink from her
(b)
mother. As drawn, I has inherited the pink copy both through her father (B) and her
A mother (C). Since I’s two copies of the gene trace back to the same copy in her
grandmother, her two copies are identical by descent (IBD). More generally, if
z y
the two copies of a gene in an individual trace back to the same copy in an ancestor,
then the copies are IBD. We’d like a way to calculate the probability that I’s two
alleles will be IBD. This probability is the inbreeding coefficient for I, which is in
B C symbol form as FI.
First, since we are only interested in tracing the path of IBD alleles, we can sim-
plify the pedigree to contain only the individuals in the closed loop and still follow
w x the transmission of any IBD alleles (Figure 18-12b). Also, since the sex of the indi-
vidual doesn’t matter, we use circles for both sexes. The alleles transmitted with
I
each mating are labeled w, x, y, and z. We use “~” to symbolize IBD. We’d like to cal-
culate the probability that w and x are IBD, but let’s take this calculation step by
step. First, what is the probability that x and y are IBD or, symbolically, what is
F i g u r e 18 -12 (a) Pedigree for a
P(x ~ y)? This is the probability that C transmits the copy inherited from A to I,
half-sib mating drawn in the standard
format. Small colored balls represent a which is 1/2, or P(x ~ y) = 1/2. Similarly, the probability that B transmits the copy
single copy of a gene. Within individual A, inherited from A to I is 1/2, or P(w ~ z) = 1/2.
the pink and blue copies represent the Now we need to calculate the probability that z and y are IBD. There are two
copies of the gene that she inherited from ways that z and y can be IBD. The first way is when z and y are both the same copy
her mother and father, respectively. (both pink or both blue). This happens 1/2 of the time since 1/4 of the time they
(b) Pedigree for a half-sib mating drawn in are both blue and 1/4 both pink. The second way is when z and y are different
the simplified format used for the analysis
copies (one pink and the other blue) but individual A was inbred. If individual A
of inbreeding. Only lines connecting parent
to offspring are drawn, and only individuals is inbred, then there is a probability that her two copies of the gene are IBD. The
in the “closed inbreeding loop” are probability that A’s two copies are IBD is the inbreeding coefficient of A, FA. The
included. w, x, y and z are symbols for the probability that z and y are different copies (one pink, the other blue) is 1/2. So,
allele transmitted from parent to offspring. the probability that z and y are different copies that are IBD is 1/2 multiplied by
18.3 Mating Systems 6 81
the inbreeding coefficient (FA) to give 21 FA. Altogether, the probability that z and
y are IBD is the probability that they are the same copy (1/2) plus the probability
that they are different copies that are IBD ( 21 FA). Symbolically, we write
P ( z ∼ y ) = 21 + 21 FA
P(x ~ y), P(w ~ z), and P(z ~ y) are independent probabilities, so we can use the
product rule and put it all together to obtain
In the analysis of inbred pedigrees, we can substitute the value of FA into the
equation above if it is known. Otherwise, we can assume FA is zero if there is no
information to suggest that individual A is inbred. In the current example, if we
assume FA = 0, then
3
FI = ( 21 ) = 1
8
This calculation tells us that the offspring of half-sib matings will be homozygous
for alleles that are IBD for at least 1/8 of their genes. It could be more than 1/8 if
FA is greater than zero. Additional inbred pedigrees and a general formula for
calculating F can be found in Box 18-2.
14
Frequency of disorders (%)
12
10
0
United States France Sweden Japan Average
probability that the second allele we pick will be exactly the same copy is 1/2N
and the inbreeding coefficient for this individual is 1.0. The probability that the
second allele we pick will be a different copy from the first allele is 1 − 1/2N and
the level of inbreeding for the resulting individual would be Ft , the average
inbreeding coefficient for the initial population at generation t. The level of
inbreeding in the next generation is the sum of these two possible outcomes or
Ft +1 = ( 21N )1 + (1 − 21N )F
t
This equation informs us that F will increase over time as a function of popu-
lation size. When N is large, F increases slowly over time. When N is small,
F increases rapidly over time. For example, suppose Ft in the initial population is
0.1 and N = 10,000. Then Ft +1 would be 0.10005, just a slightly higher value. How-
ever, if N = 10, then Ft +1 would be 0.145, a much higher value. We can also use this
equation recursively to calculate Ft +2 by using Ft +1 in place of Ft on the right side.
The result with N = 10 and Ft = 0.1 would be Ft +2 = 0.188. The effects of popula-
tion size on inbreeding in populations are further explored in Box 18-3.
A consequence of the increased inbreeding is that individuals in small popula-
tions are more likely to be homozygous for deleterious alleles just as the offspring
of first-cousin marriages are more likely to be homozygous for such alleles. This
effect is seen in ethnic groups that live in small, reproductively isolated communi-
ties. For example, a form of dwarfism in which affected individuals have six fin-
gers occurs at a frequency of more than 1 in 200 among a population of about
13,000 Amish in Lancaster County, Pennsylvania, although its frequency in the
general U.S. population is only 1 in 60,000.
Introduction to Genetic Analysis, 11e
Figure 18.13 #1817
K e y C o n c e p t Inbreeding increases the frequency of homozygotes in a
07/08/14
population, and can result in a higher frequency of recessive genetic disorders. The
07/23/14
Dragonfly Media
inbreeding Group ( F ) is the probability that two alleles in an individual trace back
coefficient
to the same copy in a common ancestor.
6 8 4 C H APTER 1 8 Population Genetics
In the main text, we derived the formula for the increase and the change in F over t generations is given by
in inbreeding between generations in finite populations as t
(
Ft = 1 − 1 −
1
)
(1 − F0 )
Ft +1 = ( 21N )1 + (1 − 21N )F t
2N
As shown in the figure below, inbreeding will increase
which can be rewritten as with time in a finite population even when there is no
(1 − Ft +1 ) = 1 − ( 1
2N
)
(1 − Ft ) inbreeding in the initial population.
1
We also presented the formula for the frequency of het-
erozygotes (H) with inbreeding as
N = 10
H = fA/a = 2pq - 2pqF
Inbreeding (F )
which can be rewritten as
N = 50
(1 - F ) = H/2pq 0.5
H t +1 / 2 pq = 1 −( 1
2N
)
H t / 2 pq
N = 500
and then
( )
1 0
H t +1 = 1 − Ht 0 50 100
2N Time in generations
Thus, for each generation, the level of heterozygosity is Increase in inbreeding ( F ) over time for several different population
reduced by the fraction (1 − 1/2N). The reduction in H sizes.
over t generations is t
Ht = 1 −( 1
2N
H0)
F i g u r e 18 -14 Nucleotide variation for 5102 bp of the G6PD gene for a worldwide
sample of 47 men. Only the 18 variable sites are shown. The functional allele class (A−, A+,
or B) is shown for each sequence. SNP2 is a nonsynonymous SNP that causes a valine-to-
methionine change that underlies differences in enzyme activity associated with the
A− allele. SNP3 is a nonsynonymous SNP that causes an aspartic-acid-to-asparagine
amino acid change. [ Data from M. A. Saunders et al., Genetics 162, 2002, 1849–1861.]
18.4 Genetic Variation and Its Measurement 6 8 5
Haplotype
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
A G A C C G C C C C C G G C T C A C
1 Southern African A- G A G G T T C G 1
2 Central African A- G A G G T T C G 1
3 Central African A- G A G G T T C G 1
4 African American A- G A G G T T C G 1
5 African American A- G A G G T T C G 1
6 Central African A- G A G G T T C G 1
7 Central African A+ G G T C G 2
8 Central African A+ G G T C G 2
9 Central African B C G 3
10 Southern African B A C T G 4
11 Southern African B A C T G 4
12 Southern African B T C T G 5
13 Southern African B C T G 6
14 Southern African B T A C G 7
15 Central African B T C G 8
16 European B T C G 8
17 European B T C G 8
18 European B T C G 8
19 Southwest Asian B T C G 8
20 East Asian B C G 3
21 Native American B A T C G 9
22 Southern African B 10
23 Native American B 10
24 Native American B 10
25 Native American B 10
26 Native American B 10
27 Native American B 10
28 Native American B 10
29 Native American B 10
30 Native American B 10
31 Native American B 10
32 European B 10
33 European B 10
34 European B 10
35 European B 10
36 European B 10
37 European B 10
38 Southwest Asian B 10
39 East Asian B 10
40 East Asian B 10
41 East Asian B 10
42 East Asian B 10
43 East Asian B 10
44 East Asian B 10
45 East Asian B 10
46 Pacific Islander B T 11
47 East Asian B T 12
6 8 6 C H APTER 1 8 Population Genetics
remaining 5084 sites were fixed, or invariant: only a single allele (nucleotide)
exists in the entire sample for each of these sites. By sampling only males, we
observe just one allele and one haplotype for each individual because the gene is
X linked. The A+ allele differs from B by a single amino acid change (aspartic acid
in place of asparagine) at SNP3 in Figure 18-14. The A− allele differs from the
B allele at two amino acids: it contains both the “aspartic acid in place of aspara-
gine” change found in the A+ allele and a second amino acid difference (methio-
nine in place of valine) at SNP2.
How can we quantify variation at the G6PD locus? One simple measure is the
number of polymorphic or segregating sites (S). For the G6PD data, S is 18 for
the total sample, 14 for the African sample, and 7 for the non-African sample.
Africans contain twice the number of segregating sites despite the fact that our
sample has fewer Africans. Another simple measure is the number of haplo-
types (NH). The value of NH is 12 for the total sample, 9 for the African sample,
and 6 for the non-African sample. Again, the African sample has greater variation.
One shortcoming of measures such as S and NH is that the values we observe
depend heavily on sample size. If one samples more individuals, then the values
of S and NH are apt to increase. For example, our sample has 16 Africans com-
pared to 31 non-Africans. Although S is twice as large in Africans as non-Africans,
the difference would likely be even greater if we had an equal number (31) of
Africans and non-Africans.
In place of S and NH, we can calculate allele frequencies, which are not biased
by differences in sample size. For the G6PD data, B, A−, and A+ have worldwide
frequencies of 0.83, 0.13, and 0.04, respectively. However, you’ll note that A− has
a frequency of 0.0 outside of Africa and 0.38 in our African sample, which is a
substantial difference. We can use allele frequency data to calculate a statistic
called gene diversity (GD), which is the probability that two alleles drawn at
random from the gene pool will be different. The probability of drawing two dif-
ferent alleles is equal to 1 minus the probability of drawing two copies of the same
allele summed over all alleles at the locus. Thus,
GD = 1 − ∑ pi2
where pi is the frequency of the ith allele and ∑ is the summation sign, indicating
that we add the squares of all n observed values of p for i = 1, 2, through the nth
allele. The value of GD can vary from 0 to 1. It will approach 1 when there is a
large number of alleles of roughly equal frequencies. It is 0 when there is a single
allele, and it is near 0 whenever there is a single very common allele with a fre-
quency of 0.99 or higher. Table 18-4 shows that gene diversity is quite high in
Africans (0.47). Since non-Africans have only the B allele, gene diversity is 0.0.
Sample size 47 16 31
Number of segregating sites 18 14 7
Number of haplotypes 12 9 6
Gene diversity (GD) at SNP2 0.22 0.47 0.00
Nucleotide diversity 0.0006 0.0008 0.0002
18.5 The Modulation of Genetic Variation 6 87
DNA sequence of the founding individual to the DNA sequences of the descen-
dants several generations later and record any new mutations that have occurred.
The number of observed mutations per genome per generation provides an esti-
mate of the rate. Because one is looking for rather rare events, it is necessary to
sequence billions of nucleotides to find just a few SNP mutations. In 2009, the
SNP mutation rate for a part of the human Y chromosome was estimated by this
approach to be 3.0 × 10−8 mutations/nucleotide/generation, or about one muta-
tion every 30 million bp. If we extrapolate to the entire human genome (3 billion
bp), then each of us has inherited 100 new mutations from each of our parents.
Luckily, the vast majority of mutations are not detrimental since they occur in
regions of the genome that are not critical.
Table 18-5 lists the mutation rates for SNPs and microsatellites in several
model organisms. The SNP mutation rate is several orders of magnitude lower
than the microsatellite rate. Their higher mutation rate and greater variation
make microsatellites particularly useful in population genetics and DNA foren-
sics. The SNP mutation rate per generation appears to be lower for unicellular
organisms than for large multicellular organisms. This difference can be explained
at least partially by the number of cell divisions per generation. There are about
200 cell divisions from zygote to gamete in humans but only 1 in E. coli. If the
human rate is divided by 200, then the rate per cell division in humans is remark-
ably close to the rate in E. coli.
Other than mutation, the only other means for new variation to enter a popu-
lation is through migration or gene flow, the movement of individuals (or gam-
etes) between populations. Most species are divided into a set of small local
populations or subpopulations. Physical barriers such as oceans, rivers, or moun-
tains may reduce gene flow between subpopulations, but often some degree of
gene flow occurs despite such barriers. Within subpopulations, an individual may
have a chance to mate with any other member of the opposite sex; however, indi-
viduals from different subpopulations cannot mate unless there is migration.
Isolated subpopulations tend to diverge as each accumulates its own unique
mutations. Gene flow limits genetic divergence between subpopulations. One of the
genetic consequences of migration is genetic admixture, the mix of genes that
results when individuals have ancestry from more than one subpopulation. This
phenomenon is common in human populations. It is readily observed in South
Africa, where migrants from around the world were brought together. As shown in
Figure 18-16, the genomes of South Africans of mixed ancestry are complex and
include parts from the indigenous people of southern Africa plus contributions of
migrants from western Africa, Europe, India, East Asia, and other regions.
18.5 The Modulation of Genetic Variation 6 8 9
K e y C o n c e p t Mutation is the ultimate source of all genetic variation. Migrants from around the
Migration can add genetic variation to a population via gene flow from another world have contributed to the
population of the same species. genomes of some South Africans
A B
Percent
50
a b
If a crossover occurs in this individual, then gametes with two new haplo-
types, Ab and aB, could be formed and enter the population in generation t1.
A b a B
Thus, recombination can create variation that takes the form of new hap-
lotypes. The new haplotypes can have unique properties that alter protein
function. For example, suppose an amino acid variant in a protein on one
haplotype increases the enzyme activity of the protein twofold and a sec- 0
Individuals
ond amino acid variant on another haplotype also increases activity two-
fold. A recombination event that combines these two variants would yield Southern Africa
Western Africa
a protein with fourfold higher activity. Europe
Let’s now consider the observed and expected frequencies of the four East Asia
possible haplotypes for two loci, each with two alleles. Linked loci, A and India
B, have alleles A and a and B and b with frequencies pA, pa, pB , and pb,
respectively. The four possible haplotypes are AB, Ab, aB, and ab with
observed frequencies PAB , PAb , PaB , and Pab. At what frequency do we expect to F i g u r e 18 -16 Graphical
find each of these four haplotypes? If there is a random relationship between the representation of genetic admixture for
alleles at the two loci, then the frequency of any haplotype will be the product of 39 people of mixed ancestry from South
Africa. Each column represents one
the frequencies of the two alleles that compose that haplotype:
person’s genome, and the colors
PAB = pA × pB represent the parts of their genome
PAb = pA × pb contributed by their ancestors, who came
from many regions of the world. The figure
PaB = pa × pB is based on the population genetic
Pab = pa × pb analysis of over 800 microsatellites and
500 indels that were scored for nearly
For example, suppose that the frequency of each of the alleles is 0.5; that is, pA = pa 4000 people from around the world,
= pB = pb = 0.5. When we sample the gene pool, the probability of drawing a chro- including the 39 of mixed ancestry from
mosome with an A allele is 0.5. If the relationship between the alleles at locus A and South Africa. [ Data from S. A. Tishkoff et al.,
the alleles at locus B is random, then the probability that the selected chromosome Science 324, 2009, 1035–1044.]
has the B allele is also 0.5. Thus, the probability that we draw a chromosome with
the AB haplotype is
PAB = pA × pB = 0.5 × 0.5 = 0.25
If the association between the alleles at two loci is random as just described, then
the two loci are said to be at linkage equilibrium. In this case, the observed and
expected frequencies will be the same. Figure 18-17a diagrams a case of two loci
at linkage equilibrium.
If the association between the alleles at two loci is nonrandom, then the loci
are said to be in linkage disequilibrium (LD). In this case, a specific allele at the
first locus is associated with a specific allele at the second locus more often than
expected by chance. Figure 18-17b diagrams a case of complete LD between two
6 9 0 C H APTER 1 8 Population Genetics
loci. The A allele is always associated with the B allele, while the a
Linkage disequilibrium is the nonrandom allele is always associated with the b allele. There are no chromo-
association between two loci
somes with haplotypes Ab or aB. In this case, the observed and
expected frequencies will not be the same.
(a) Linkage equilibrium (b) Linkage disequilibrium
We can quantify the level of LD between two loci as the differ-
A B A B ence (D) between the observed frequency of a haplotype and the
A B A B expected frequency given a random association among alleles at
A b A B the two loci. If both loci involved have just two alleles, then
A b A B
a a b
D = PAB - pApB
B
a B a b In Figure 18-17a, D = 0 since there is no LD, and in Figure 18-17b,
a b a b D = 0.25, which is greater than 0, indicating the presence of LD.
a b a b How does LD arise? Whenever a new mutation occurs at a locus,
pA = 0.5 PAB = 0.25 pA = 0.5 PAB = 0.5 the mutation appears on a single specific chromosome and so it is
pa = 0.5 PAb = 0.25 pa = 0.5 PAb = 0.0 instantly linked to (or associated with) the specific alleles at any
pB = 0.5 PaB = 0.25 pB = 0.5 PaB = 0.0 neighboring loci on that chromosome. Consider a population in
pb = 0.5 Pab = 0.25 pb = 0.5 Pab = 0.5 which there are just two haplotypes: AB and Ab. If a new mutation
(a) arises at the A locus on a chromosome that already possesses
F i g u r e 18 -17 (a) Linkage equilibrium the b allele at the B locus, then a new ab haplotype would be
and (b) linkage disequilibrium for two loci formed. Over time, this new ab haplotype might rise in frequency in the popula-
( A and B ). tion. Other chromosomes in the population would possess the AB or Ab haplo-
types at these two loci, but no chromosomes would possess aB. Thus, the loci
would be in LD. Migration can also cause LD when one subpopulation possesses
only the AB haplotype and another only the ab haplotype. Any migrants between
the subpopulations would give rise to LD within the subpopulation that receives
the migrants.
LD between two loci will decline over time as crossovers between them ran-
domize the relationship between their alleles. The rate of decline in LD depends
on the rate at which crossing over occurs. The frequency of recombinants (RF)
between the two loci among the gametes that form the next generation (see Chap-
ter 4) provides an estimate of recombination rate, which in population genetics is
symbolized by the lowercase letter r. If D0 is the value for linkage disequilibrium
between two loci in the current generation, then the value in the next generation
(D1) is given by this equation:
D1 = D0(1 - r)
Consider a population of N diploid individuals segregating outcomes for all possible values of k and obtain a proba-
for two alleles A and a at the A locus with frequencies p bility distribution, shown in the figure below.
and q, respectively. The population is random mating, and
the size of the population remains the same (N ) in each 0.2
generation. When the gene pool is sampled to create the
next generation, the exact number of copies of the A allele
that are drawn cannot be strictly predicted because of sam-
pling error. However, the probability that a specific num-
Probability
ber of copies of A will be drawn can be calculated using the 0.1
binomial formula. Let k be a specific number of copies of
the A allele. The probability of drawing k copies is
0 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
If we set N = 10 and p = q = 0.5, then the probability of
drawing 10 copies of the A allele is Copies of A
Probability distribution showing the likelihood that different numbers
Prob(10 ) = ( 20!
)
10! ( 20 − 10 )!
0.510 0.5( 20–10 ) = 0.176
of A will be present after one generation.
Third, notice that when N = 10, the populations became fixed (either p = 1 or
p = 0) before generation 20 in five of the six trials. However, when N = 500, the
populations retained both alleles in all six trials even after 30 generations.
In addition to population size, the fate of an allele is determined by its fre-
quency in the population. Specifically, the probability that an allele will drift to
fixation in a future generation is equal to its frequency in the present generation.
An allele that is at a frequency of 0.5 has a 50 : 50 chance of fixation or loss from
the population in a future generation. You can see the effect of allele frequency on
the fate of an allele in Figure 18-18c. For ten populations with an initial frequency
of p = 0.1, eight populations experienced the loss of the A allele, one its fixation,
and one population retained both alleles after 30 generations. That’s very close to
the expectation that A will go to fixation 10 percent of the time when p = 0.1.
The fact that the frequency of an allele is equal to its probability of fixation
means that most newly arising mutations will ultimately be lost from a popula-
tion because of drift. The initial frequency of a new mutation in the gene pool is
1
2N
If N is even modestly large, such as 10,000, then the probability that a new muta-
tion will ultimately reach fixation is extremely small: 1/2N = 1/20,000 =
5 × 10−5. The probability that a new mutation will ultimately be lost from the
population is
2N − 1 1
=1−
2N 2N
which is close to 1.0 in large populations. It is 0.99995 in a population of 10,000.
18.5 The Modulation of Genetic Variation 6 9 3
0.5
0.0
0 10 20 30
Generations (t)
(b)
1.0
N = 500
Allele frequency (p)
0.5
0.0
0 10 20 30
Generations (t)
(c)
1.0
N = 10
Allele frequency (p)
0.5
0.0
0 10 20 30
Generations (t)
(a) 4N
2N
Number of allele copies
0
Time
(b)
4N
Number of allele copies
2N
Time
As species diverge over time, their DNA sequences become The expected value for d will be the product of the rate
increasingly different as mutations arise and become (k) at which substitutions occur and two times the time
fixed in the population. At what rate do sequences diverge? in generations (2t) during which substitution accumu-
To answer this question, consider a population at genera- lated. The 2 is required because there are two lineages
tion t0. The number of mutations that will appear in gen- leading away from the common ancestor. Thus, we have
eration t1 is the product of the number of copies of the
sequence in the gene pool (2N) times the rate at which d = 2tk
they mutate ( μ); that is, 2Nμ. If a mutation is neutral, then This equation can be rewritten as
the probability that it drifts to fixation is 1/2N. So each
generation, 2Nμ new mutations enter the gene pool, and d
t=
1/2N of these will become fixed. The product of these two 2k
numbers is the rate (k) at which sequences evolve: showing how we can calculate the time in generations
1 since the divergence of two species if we know d and k.
k = 2Nµ × =µ The SNP mutation rate per generation (μ) is known for
2N
many groups of organisms (see Table 18-5), and it is the
The value k is called the substitution rate, and it is equal to same as the substitution rate (k) for neutral mutations.
the mutation rate for neutral mutations. If the mutation One can sequence one or more genes from two species
rate remains constant over time, then the substitution rate and determine the proportion of silent (neutral) nucleo-
will “tick” regularly like a clock, the molecular clock. tide sites at which they differ and use this proportion as an
Consider two species A and B and their common estimate for d. Thus, one can calculate the time since two
ancestor. Let’s define d (divergence) as the number of sequences (two species) diverged using the molecular
neutral substitutions at nucleotide sites in the DNA clock. Between humans and chimps, there are about 0.018
sequence of a gene that have occurred since the diver- base differences at synonymous sites in coding sequences.
gence of A and B from their ancestor. The SNP mutation rate for humans is 3 × 10−8, and the
Common generation time is about 20 years. Using these values and
ancestor the equation above, the estimated divergence time for
Time in generations
Species Species
A B
land bridge from Asia to the Americas during the ice age about 15,000 to 30,000
years ago. As a result, genetic diversity among Native Americans is lower than
among people in other regions of the world (Figure 18-20).
Population size can also change within a single location. A period of one or
several consecutive generations of contraction in population size is known as a
population bottleneck. Bottlenecks occur in natural populations because of envi-
ronmental fluctuations such as a reduction in the food supply or increase in pre-
dation. The gray wolf, American bison, bald eagle, California condor, whooping
crane, and many whale species are some familiar examples of species that have
experienced recent bottlenecks because of hunting by humans or encroachment
by humans on their habitat. The reduction in population size during a bottleneck
increases the level of drift in a population. As explained earlier in the chapter, the
level of inbreeding in populations is also dependent on population size. Thus,
bottlenecks also cause an increase in the level of inbreeding.
The California condor presents a remarkable example of a bottleneck. This spe-
cies was once wide ranging but in the 1980s declined to a breeding population of
only 14 captive birds. The population is now above 400 individuals, but the average
6 9 6 C H APTER 1 8 Population Genetics
Box 18-6 discusses the well-characterized bottleneck that occurred during the
0.7 domestication of crop species. This bottleneck explains why our crop plants pos-
sess much less genetic diversity than their wild ancestors.
0.6
K e y C o n c e p t Population size is a key factor affecting genetic variation in
populations. Genetic drift is a stronger force in small populations than in large ones.
0.5 The probability that an allele will become fixed in (or lost from) a population by drift is a
0.5 0.6 0.7 0.8 function of its frequency in the population and population size. Most new neutral
Microsatellite heterozygosity mutations are lost from populations by drift.
Before 10,000 years ago, our ancestors around the world Since there are fewer alleles per locus, crops have a smaller
provided for themselves by hunting wild animals and col- repertoire of alleles at disease-resistance genes and poten-
lecting wild plant foods. At about that time, human societ- tially greater susceptibility to emerging pathogens. To
ies began to develop farming. People took local wild plants reduce this vulnerability, breeders make crosses between
and animals and bred them into crop plants and domesti- modern varieties and the wild relatives (or traditional
cated animals. Some of the major crops that were domes- varieties) to reintroduce critically important alleles into
ticated at this time include wheat in the Middle East, rice modern crops.
in Asia, sorghum in Africa, and maize in Mexico. Wild
When the first farmers collected seeds from the wild
to begin domestication, they drew a sample of the wild
gene pool. This sample possessed only a subset of the
genetic variation found in the wild. The domesticated
populations were put through a bottleneck. As a conse- Domestication
quence, crop plants and domesticated animals typically Traditional
have less genetic variation than their wild progenitors.
Modern scientific plant breeding aimed at crop
improvement has created a second bottleneck. By sam-
pling the gene pool of the traditional crop varieties, mod- Improvement
ern plant breeders have created elite varieties with traits
Elite
of commercial value such as high yield and suitability for
mechanical harvesting and processing. As a conse-
quence, elite or modern varieties have even less genetic
variation than traditional varieties.
Crop domestication and improvement bottlenecks. Colored dots
The loss of genetic variation resulting from the domes- represent different alleles. [M. Yamasaki et al., Plant Cell 17, 2005,
tication and improvement bottlenecks can pose a threat. 2859–2872.]
and so they use a measure called relative fitness (symbolized by a lowercase w),
which is the fitness of an individual relative to some other individual, usually the
most fit individual in the population. If individual X has two offspring and the most
fit individual, Y, has 10 offspring, then the relative fitness of X is w = 2/10 = 0.2. The
relative fitness of Y is w = 10/10 = 1. For every 10 alleles Y contributes to the next
generation, X will contribute 2.
The concept of fitness applies to genotypes as well as to individuals. The abso-
lute fitness for the A/A genotype (WA/A) is the average number of offspring left by
individuals with that genotype. If we know the absolute fitnesses for all genotypes
at a locus, we can calculate the relative fitnesses for each of the genotypes.
Let’s now look at how allele frequencies can change over time when different
genotypes have different fitnesses; that is, when natural selection is at work.
Below are the fitnesses and genotype frequencies for the three genotypes at the A
locus in a population. In this case, A is a favored dominant allele since the fit-
nesses of the A/A and A/a individuals are the same and superior to the fitness of
the a/a individuals. We are assuming that this population follows the Hardy–
Weinberg law, with p = 0.1 and q = 0.9.
A/A A/a a/a
Average number of offspring (W) 10 10 5
Relative fitness (w) 1.0 1.0 0.5
Genotype frequency 0.01 0.18 0.81
6 9 8 C H APTER 1 8 Population Genetics
The relative contribution of each genotype to the gene pool is determined by the
product of its fitness and its frequency. The more fit and the higher the frequency
of a genotype, the more it contributes.
Using these expected genotype frequencies and the Hardy–Weinberg law, we can
calculate the frequencies of the alleles in the next generation:
The difference between p ′ and p (∆p = p ′ − p) is 0.17 − 0.1 = 0.07, so we conclude
that the A allele has climbed 7 percent in one generation due to natural selection.
Box 18-7 presents the standard equations for calculating changes in allele fre-
quencies over time due to natural selection.
We could go through this process recursively, using the allele frequencies from
the first generation to calculate those in the second generation, then using those
Allele frequencies change under from the second to calculate the third, and so forth. If we then plotted p by time
the force of natural selection measured in number of generations (t), we’d have a picture of the tempo with which
allele frequencies change under the force of natural selection. Figure 18-21 shows
1.0
such a plot for both a favored dominant and a favored recessive allele. The domi-
nant allele rises rapidly to start but then hits a plateau and only slowly approaches
fixation. Once the favored dominant allele is at a high frequency, the unfavored
recessive allele occurs mostly in heterozygotes and rarely as homozygotes with
Favored
dominant reduced fitness, so selection is ineffective at purging it from the population. The
favored recessive behaves in the opposite manner—it rises slowly in frequency at
Allele frequency
first since a/a homozygotes with enhanced fitness are rare but proceeds more rap-
0.5 idly to fixation later. Since the heterozygous class has reduced fitness, the unfa-
vored dominant allele can eventually be purged from the population.
Favored
Forms of selection
recessive Natural selection can operate in several different ways. Directional selection,
which we have been discussing, moves the frequency of an allele in one direction
until it reaches fixation or loss. Directional selection can be either positive or purify-
0.0 ing. Positive selection works to bring a new, favorable mutation or allele to a higher
0 200 400 600
frequency. This type of selection is at work when new adaptations evolve. A selective
Time in generations
sweep occurs when a favorable allele reaches fixation. Directional selection can also
work to remove deleterious mutations from the population. This form of selection
F i g u r e 18 -2 1 Change in allele
frequency of a favored dominant allele
is called purifying selection, and it prevents existing adaptive features from being
(red) and a favored recessive allele (blue) degraded or lost. Selection does not always proceed directionally until loss or fixa-
driven by natural selection over the course tion of an allele. If the heterozygous class has a higher fitness than either of the
of 600 generations. homozygous classes, then natural selection will favor the maintenance of both
18.5 The Modulation of Genetic Variation 6 9 9
Selection causes change in allele frequencies between Notice the expression pWA/A + qWA/a. This is called the
generations because some genotypes contribute more allelic fitness or mean fitness of A alleles (WA):
alleles to the gene pool than others. Let’s describe a set of
equations to predict gene frequencies in the next genera- WA = pWA/A + qWA/a
tion when selection is operating. The genotype frequen-
From the Hardy–Weinberg law, we know that a propor-
cies and absolute fitnesses are symbolized as follows:
tion p of all A alleles are present in homozygotes with
genotype A/A A/a a/a another A, in which case they have a fitness of WA/A,
whereas a proportion q of all the A alleles are present in
frequency p2 2pq q2
heterozygotes with a and have a fitness of WA/a.
absolute fitness WA/A WA/a Wa/a Substituting WA into the equation above, we obtain
alleles in the population. In this case, the locus is under balancing selection and
natural selection will move the population to an equilibrium point at which both
alleles are maintained in the population (see Chapter 20).
The different forms of selection each leave a distinct signature on the
DNA sequence near the target locus in a population. For example, positive selec-
tion can be detected in DNA sequences by its effects on genetic diversity and
70 0 C H APTER 1 8 Population Genetics
F i g u r e 18 -2 2 Schematic of
haplotypes found in a population before
Positive selection leaves a distinct signature
and after a favored allele (red) is swept to
fixation. There are 11 loci altogether. There Haplotypes before selection Haplotypes after selection
are two alleles (red and gray) at the locus
that was the target of selection. There are
two alleles (black and gray) at each locus
that is linked to the target locus. After
selection, the target and some neighboring
sites have all been swept to fixation.
A B A B
Selective sweep
linkage disequilibrium. Figure 18-22 shows schematic haplotypes before and after
an episode of positive selection. In the panel showing the haplotypes before selec-
tion, the bracketed region has many polymorphisms and multiple haplotypes.
However, after selection, there is only a single haplotype in this region and thus
no polymorphism. When selection is applied to the target site (shown in red), the
target and neighboring sites can all be swept to fixation before recombination
breaks up the haplotype in which the favorable mutation first occurred. The
result is lower diversity and higher LD near the target. As distance from the target
increases, there is more opportunity for recombination, and so diversity goes
gradually back up.
Figure 18-23 shows the pattern of diversity in the region surrounding the
SLC24A5 gene in humans. This gene influences the deposition of the melanin in
the skin. When people migrated from Africa to Europe, a selective sweep at
SLC24A5 caused a loss of all diversity at this locus. As a consequence, there is a
single allele and a single haplotype at this locus in Europe. The single allele that
was selected for in Europe produces lighter skin color. Moving away from the gene
in either direction, the number of haplotypes rises in European populations since
recombination disrupted the linkage disequilibrium between
In Europe, a selective sweep caused a loss SLC25A5 and more distance sites. Light skin may be adaptive
of all diversity at the SLC24A5 locus in northern latitudes. People are able to synthesize vitamin D,
but to do so they need to absorb UV radiation through the skin.
0.3
In the equatorial latitudes, people are exposed to high levels of
Native Americans UV light and can synthesize vitamin D even with heavily pig-
Gene diversity (GD)
Table 8-6 Some Genes Showing Evidence for Natural Selection in Specific Human Populations
Gene Presumed Trait Population
Another group of selected genes in Table 18-6 adapts people to regional diets.
Before 10,000 years ago, all humans were hunter–gatherers. More recently, most
humans switched to agricultural foods, but there are regional differences in diet. In
northern Europe and parts of Africa, milk products are a substantial part of the
diet. In most populations, the lactase enzyme for digesting milk sugar (lactose) is
expressed during childhood but is switched off in adults. In parts of Europe and
Africa where adults drink milk, however, special alleles of the lactase gene that con-
tinue to express the lactase enzyme during adulthood have risen in frequency due
to natural selection. Finally, Table 18-6 includes some genes for physiological adap-
tations to climate. Among these are the genes for skin pigmentation such as
SLC24A5, discussed above.
Whereas directional selection causes a loss of genetic variation in the region
Figure 18-24 Number of segregating sites
surrounding the target locus, balancing selection can prevent the loss of diversity
(S) or SNPs in 20-kilo-basepair windows
by random genetic drift, leading to regions of unusually high genetic diversity in along the short arm of human chromosome
the genome. One region of high genetic diversity surrounds the major histocom- 6. There is a spike of high diversity at the
patibility complex (MHC) gene complex on chromosome 6. Figure 18-24 shows a MHC locus. [Data from International HapMap
distinct spike in the number of SNPs at the MHC. This complex includes the Project, www.hapmap.org.]
human leukocyte antigen (HLA) genes, which are involved
in immune system recognition of (and response to) patho-
gens. Balancing selection is one hypothesis proposed to Balancing selection can lead to regions
explain the high diversity observed at the MHC. Since het- of unusually high genetic diversity
erozygotes have two alleles, they may be resistant to a
greater repertoire of pathogen types, giving heterozygotes a MHC
SNPs/20 kb
fitness advantage.
Finally, selection can be imposed by an agent other than 100
nature. Humans have imposed selection in the process of
domesticating and improving cultivated plants and animals. 0
0 10 20 30 40 50
This form of selection is called artificial selection. In this
Distance in mega–base pairs
case, individuals with traits that humans prefer contribute
702 C H APTER 1 8 Population Genetics
more alleles to the gene pool than individuals with unfavored traits. Over time,
the alleles that confer the favored traits rise in frequency in the population. The
many breeds of dogs and dairy cows and varieties of garden vegetables and cereal
crops are all the products of artificial selection.
H′= 1−( 1
2N
H)
This equation applies to the effects of drift as well as those of inbreeding. From
this equation, it follows that the change in H between generations due to drift is
1
∆H = H − H ′ = H
2N
DH = 2m(1 - H)
This equation gives the equilibrium value of Ĥ when the loss by drift and gain by
mutation are balanced. This equation applies only to neutral variation; that is, we
are assuming selection is not at work. We are also assuming that each new muta-
tion yields a unique allele.
Expressions such as this are useful when we have estimates for two of the vari-
ables and would like to know the third. For example, nucleotide diversity (H at the
nucleotide level) for noncoding sequences, which are largely neutral, is about
0.0013 in humans, and µ for humans is 3 × 10−8 (see Table 18-5). Using these values
and solving the equation above for N yields an estimate of the human population
size of 10,498 humans. This estimate is far below the 7.2 billion of us alive today.
What’s up? This is an estimate for the equilibrium value. Modern humans are a
young group, only about 150,000 years old. Over the last 150,000 years, our popula-
tion has grown dramatically as we filled the globe, but mutation is a slow process, so
genetic diversity has not kept up and the human population is not at equilibrium.
The population size of 10,498 represents an estimate of our historical size, or how
many breeding members there were about 150,000 years ago.
If we let q be the frequency of the deleterious allele a and Equilibrium means that the increase in the allele fre-
p = 1 − q be the frequency of the normal allele A, then the quency due to mutation exactly balances the decrease in
change in allele frequency due to the mutation rate μ is the allele frequency due to selection, so
Δqmut = μp ˆˆ2
− spq
µ ˆp =
1 − sqˆ2
A simple way to express the fitnesses of the genotypes in
the case of a recessive deleterious allele a is wA/A = wA/a The frequency of a recessive deleterious allele (qˆ) at equi-
= 1.0 and wa/a = 1 − s, where s, the selection coefficient, librium will be quite small, so 1 − sq̂2 ≈ 1, and we have
is the loss of fitness in the recessive homozygotes. We now
µ pˆ = −spq
ˆˆ2
can substitute these fitnesses in our general expression for
allele frequency change (see Box 18-7) and obtain µ
qˆ =
s
− pq(sq) −spq2
∆qsel = =
1 − sq2 1 − sq2 at equilibrium.
Here is an example. If µ = 10−6 and the lethal allele is not totally recessive but
causes a 5 percent reduction in fitness in heterozygotes (s = 1.0, h = 0.05), then
µ
q̂ = = 2 × 10 −5
hs
This result is smaller by two orders of magnitude than the equilibrium frequency
for the purely recessive case described above. In general, then, we can expect
deleterious, completely recessive alleles to have frequencies much higher than
those of partly dominant alleles because the recessive alleles are protected in
heterozygotes.
Conservation genetics
Conservation biologists attempting to save endangered wild species, and zookeep-
ers attempting to maintain small populations of captive animals, often perform
population genetic analyses. Above, we discussed how a genetic bottleneck caused
a loss of genetic variation in the California condor and an increase in the fre-
18.6 Biological and Social Applications 70 5
quency of a lethal form of dwarfism. Bottlenecks may also increase the level of
inbreeding in a population, perhaps leading to a decline in fitness through
inbreeding depression. The issue is complex, however, because inbreeding is not
always associated with a decline in fitness. Inbreeding can sometimes help purge
deleterious recessive alleles from a population. Purifying selection is more effec-
tive at eliminating deleterious recessive alleles since the homozygous recessive
class becomes more frequent in inbred populations. Thus, conservation biologists
have debated whether they should attempt to maximize genetic diversity and
minimize inbreeding or deliberately subject zoo populations to inbreeding with
the goal of purging deleterious alleles.
To help address this question, researchers looked for evidence of successful
purging among zoo populations. Let’s define inbreeding depression as delta (δ )
wf
δ =1−
w0
where wf is the fitness of inbred individuals and w0 the fitness of non-inbred indi-
viduals. The value of δ will be positive when there is a decline in fitness with
inbreeding but negative when fitness improves with inbreeding. Researchers cal-
culated δ for 119 zoo populations, including 88 species, and they found evidence
that purging had improved fitness (negative values for δ ) in 14 populations. Still,
it is not clear that deliberate inbreeding of zoo animals is advisable. For one thing,
although 14 of the 119 populations improved, the majority of the populations
declined in fitness when inbred. Thus, if one starts with a small zoo population and
purposely inbreeds the animals, a decline in fitness is the most likely outcome.
I
1 2 3 4
II
1 2 3
III
1
III-1 could inherit the cystic fibrosis allele from his mother, II-3. Individual II-3
does not have CF, but we are not sure whether or not she is a carrier. If the fre-
quency (q) of the disease allele in the population is 0.025, then the probability
that an unaffected individual such as II-3 is a carrier is 2pq/(1 − q2) = 0.049. If II-3
1
is a carrier, then there is a 2 chance she will transmit the disease allele to III-1.
70 6 C H APTER 1 8 Population Genetics
These are all independent probabilities, so we can use the product rule. The prob-
ability that III-1 will have cystic fibrosis is
1 1
× × 0.049 = 0.003
8 2
The frequency of cystic fibrosis among Caucasians is p2 = (0.025)2 = 0.000625.
These calculations tell us that individuals who have a first cousin with cystic
fibrosis have a 0.003 ÷ 0.000625 = 4.9-fold higher risk of having a child with the
disease than members of the general population.
Here is another application of population genetics to assessing disease risk.
Sickle-cell anemia, a recessive disease, has a frequency of about 0.25 percent, or
1 in 400, among African Americans (see Chapter 6). Applying the Hardy–Wein-
berg law, we estimate the frequency of the disease allele (HbS) as 0.05. What
would be the expected frequency of this disease among the offspring of African
Americans who are first cousins? Using the method described in Box 18-2, we
calculate that the inbreeding coefficient (F ) for the offspring of first-cousin mar-
riages is 1/16. In the section on inbreeding above, we saw that the frequency of the
homozygotes when there is inbreeding is increased, as shown by this equation:
fa/a = q2 + pqF
Using this equation, we obtain
2 1
f ( Hb S /Hb S ) = (0.05) + (0.05 × 0.95) = 0.0055
16
This represents a 2.2-fold increase in the risk of having a child with the disease
for first-cousin marriages compared to that in a marriage between unrelated
individuals.
DNA forensics
Criminals can leave DNA evidence at the scene of a crime in the form of blood,
semen, hairs, or even buccal cells from saliva on a cigarette butt. The polymerase
chain reaction (PCR) enables forensic scientists to amplify very tiny amounts of
DNA and determine the genotype of the individual who left the specimen. If the
DNA found at the crime scene matches that of the suspect, then they “may be”
the same individual. The key phrase here is “may be,” and this is where popula-
tion genetics comes into play. Let’s see how this works.
Consider two microsatellite loci, each with multiple alleles: A1, A2, . . . An and
B1, B2, . . . Bn. Forensic scientists determine that a DNA specimen from a crime
scene and the suspect are both A3/A8 B1/B7. They have determined that there is a
“match” between the evidence and the suspect. Does the match prove that the
DNA evidence came from the suspect? Does it prove that that the suspect was at
the crime scene?
What population geneticists do with this type of evidence is to test a specific
hypothesis: The evidence came from someone other than the suspect. This is what
statisticians call the “null hypothesis,” or the hypothesis that is considered true
unless the evidence shows that it is very unlikely (see Chapter 4). To perform the
test, we calculate the probability of observing a match between the evidence and
the suspect, given that the suspect and the person who left the evidence are dif-
ferent individuals. Symbolically, we write
Prob(match | different individuals)
where “|” means “given.” If this probability is very small, then we can reject the
null hypothesis and argue in favor of an alternative hypothesis: The evidence was
left by the suspect. We never formally prove the suspect left the evidence since
18.6 Biological and Social Applications 707
there could be alternative hypotheses such as The evidence was left by the suspect’s
identical twin.
To calculate the probability of observing a match between the evidence and
the suspect if the evidence is from a different individual, we need to know the
frequencies of the microsatellite alleles in the population.
A4 0.03
A6 0.05
B1 0.01
B7 0.12
Prob(match | different individuals) is the same as the probability that the evi-
dence came from a randomly chosen individual. We can calculate this probability
using the allele frequencies above. First, we will assume that the Hardy–Weinberg
law applies and calculate the probability of being A4 /A6 at the first locus and B1/B7
at the second:
Prob(A4/A6) = 2pq = 2 × 0.03 × 0.05 = 0.003
Prob(B1/B7) = 2 × 0.01 × 0.12 = 0.0024
Thus, the probability under the null hypothesis that the evidence came from
someone other than the suspect is 7.2 × 10−6, or about 7 in a million. That’s a small
probability, and so the null hypothesis seems unlikely in this case. However, if
Prob(match | different individuals) were 0.1, then 10 percent of the population
would be a match and could have left the evidence. In that case, we would not
want to reject the null hypothesis.
Two microsatellites do not provide very much power to discriminate, so the FBI
in the United States uses a set of 13 microsatellites. Microsatellite loci typically have
large numbers of alleles (10 to 20 or more); therefore, the number of possible geno-
types based on 13 microsatellites is astronomically large. With 10 alleles per locus,
there are 55 possible genotypes at each locus and 5513, or 4.2 × 1022, possible multi-
locus genotypes for 13 loci. The FBI has also assembled a database called CODIS
(Combined DNA Index System) that contains the frequencies of different alleles at
these loci in the population, including data specific to different ethnic groups and
regions of the country.
mouth with a Q-tip and deposit the sample in a kiosk while out to hear a band.
Weeks later, she would be able to view her genome sequence on a Web site, com-
pare it to those of her friends and relatives, and learn about her ancestry. As we’ll
see in the next chapter, our ability to predict a person’s disease risks, talents, and
other traits from his or her genotype is improving. To the extent that someone’s
taste in music or love of extreme sports has genetic underpinnings, a person could
in theory “Google” DNA mates likely to share similar interests. The DNA technol-
ogy and population genetics theory are already in place; however, there are social
and ethical questions to be addressed. Can and how will the information be kept
private? Are there any limits on what a person should know about his or her own
sequence? Should the government sequence everyone’s genome when he or she
is born? Could medical insurance providers require that their clients submit their
genome sequences? An understanding of the science can aid in determining how
these questions are answered.
s u m m a ry
Population genetics seeks to understand the laws that govern new variation into a population. Migration results in some
and forces that influence the amount of genetic variation individuals who are genetically admixed, having ancestry
within populations and changes in genetic variation over time. from multiple populations. Genetic recombination can also
The concept of the gene pool provides a model for thinking add variation to populations by recombining alleles into new
about the transmission of genetic variation from one genera- haplotypes.
tion to the next for an entire population. Basic population Two forces control the fate of genetic variation in popula-
genetic theory starts with an idealized population that is infi- tions. First, genetic drift is a random force that can lead to the
nite in size and in which mating is random. In such a popula- loss or fixation of an allele as a result of sampling error in
tion, the Hardy–Weinberg law defines the relationship finite populations. Drift is a strong force in small populations
between allele frequencies in the gene pool and genotype fre- and a weak force in large ones. Second, natural selection
quencies in the population. drives changes in allele frequencies in populations over time.
Real populations usually deviate to a small or large Alleles that enhance the fitness of the individuals that carry
degree from the Hardy–Weinberg model. One source of them will rise in frequency and can become fixed, while del-
deviation comes in the form of nonrandom or assortative eterious alleles that reduce fitness will be purged from the
mating. If individuals preferentially mate with others who population.
share a similar phenotype, then there will be an excess of The fundamental goal of population genetics is to under-
homozygotes at genes controlling that phenotype compared stand the relative contributions made by mating systems,
to Hardy–Weinberg expectations. When individuals mate mutation, migration, recombination, drift, and natural selec-
more frequently with relatives than expected by chance, tion to the amount and distribution of genetic variation in
then there will be an excess of homozygous genotypes populations. In this chapter, we have seen how research in
throughout the entire genome and the population becomes population genetics has both developed the basic theory and
inbred. Even when local populations of a species conform to collected a vast amount of data to achieve this goal. Our
Hardy–Weinberg expectations, those populations are apt to understanding of the population genetics of our own species
be isolated from other populations at distant locations. is remarkably detailed.
Thus, a species often consists of a series of genetically dis- Finally, the methods and results of population genetics
tinct subpopulations; that is, species show population both inform us about the evolutionary process and have prac-
genetic structure. tical applications to issues facing modern societies. Population
Several forces can add new variation to a population or genetic theory and analyses play important roles in the man-
remove existing variation from it. Mutation is the ultimate agement of endangered species, the identification of perpe-
source of all genetic variation. Population geneticists have trators of crimes, plant and animal breeding, and assessing
determined reasonably precise estimates of the rate at which the risks that a couple will have a child with a disease
new mutations arise in populations. Migration can also bring condition.
SOLVED PROBLEMS 70 9
key terms
absolute fitness (p. 696) haplotype network (p. 669) natural selection (p. 696)
adaptation (p. 696) HapMap (p. 671) negative assortative mating (p. 677)
allele frequency (p. 673) Hardy–Weinberg equilibrium neutral allele (p. 694)
artificial selection (p. 701) (p. 675) neutral evolution (p. 694)
balancing selection (p. 699) Hardy–Weinberg law (p. 674) nucleotide diversity (p. 687)
bottleneck (p. 695) heterozygosity (H) (p. 687) number of haplotypes (NH) (p. 686)
common SNP (p. 667) identical by descent (IBD) (p. 680) population (p. 666)
Darwinian fitness (p. 696) inbreeding (p. 679) population genetics (p. 666)
directional selection (p. 698) inbreeding coefficient (p. 680) population structure (p. 678)
disassortative mating (p. 677) inbreeding depression (p. 680) positive assortative mating (p. 677)
discovery panel (p. 668) isolation by distance (p. 678) positive selection (p. 698)
fixed (p. 686) linkage disequilibrium (LD) purifying selection (p. 698)
founder effect (p. 694) (p. 689) random genetic drift (p. 691)
gene diversity (GD) (p. 686) linkage equilibrium (p. 689) rare SNP (p. 667)
gene flow (p. 688) locus (p. 667) relative fitness (p. 697)
gene pool (p. 672) microsatellite (p. 667) segregating sites (S) (p. 686)
genetic admixture (p. 688) migration (p. 688) selection coefficient (s) (p. 703)
genotype frequency (p. 673) molecular clock (p. 694) single nucleotide polymorphism
haplotype (p. 669) mutation rate (μ) (p. 687) (SNP) (p. 667)
s olv e d p r obl e m s
SOLVED PROBLEM 1. About 70 percent of all Caucasians can Solution
taste the chemical phenylthiocarbamide, and the remainder Here, mutation and selection are working in opposite direc-
cannot. The ability to taste this chemical is determined by the tions, and so an equilibrium is predicted. Such an equilib-
dominant allele T, and the inability to taste is determined by rium is described by the formula
the recessive allele t. If the population is assumed to be in
Hardy–Weinberg equilibrium, what are the genotype and µ
q̂ =
allele frequencies in this population? s
p r obl e m s
Most of the problems are also available for review/grading through the http://www.whfreeman.com/
launchpad/iga11e.
Working with the Figures c. Calculate the linkage disequilibrium parameter (D)
1. Which individual in Figure 18-3 has the most heterozy- between the SNPs at positions 29 and 33.
gous loci, and which individual has the fewest? 3. Looking at Figure 18-6, can you count how many mito-
2. Suppose that the seven chromosomes in Figure 18-4a chondrial haplotypes were carried from Asia into the
represent a random sample of chromosomes from a Americas?
population. 4. In Figure 18-13, the “unrelated” (blue) column for Japan
a. Calculate gene diversity (GD) separately for the is higher than the “unrelated” column for France. What
indel, the microsatellite locus, and the SNP at position 3. does this tell you?
b. If the sequence was shortened so that you had data 5. In Figure 18-14, some individuals have unique SNP
only for positions 1 through 24, how many haplotypes alleles—for example, the T allele at SNP4 occurs only in
would there be?
PROBLEMS 711
individual 12. Can you identify two individuals each of 15. A/A and A/a individuals are equally fertile. If 0.1 per-
whom have unique alleles at two SNPs? cent of the population is a /a, what selection pressure
6. Looking at Figure 18-20, do people of the Middle East exists against a /a if the A → a mutation rate is 10−5 ?
tend to have higher or lower levels of heterozygosity Assume that the frequencies of the alleles are at their
compared to the people of East Asia? Why might this be equilibrium values.
the case? 16. When alleles at a locus act in a semidominant fashion
on fitness, the relative fitness of the heterozygote is
B a s i c P r obl e m s midway between the two homozygous classes. For ex-
7. What are the forces that can change the frequency of an ample, genotypes with semidominance at the A locus
allele in a population? might have these relative fitnesses: wA/A = 1.0, wA/a =
0.9, and wa/a = 0.8.
8. What assumptions are made when using the Hardy–
a. Change one of these fitness values so that a /a be-
Weinberg formula to estimate genotypic frequencies
comes a deleterious recessive allele.
from allele frequencies?
b. Change one of these fitness values so that A/A be-
9. In a population of mice, there are two alleles of the A comes a favored dominant allele.
locus (A1 and A2). Tests showed that, in this population,
17. If the recessive allele for an X-linked recessive disease
there are 384 mice of genotype A1 /A1, 210 of A1 /A2 , and
in humans has a frequency of 0.02 in the population,
260 of A2/A2. What are the frequencies of the two alleles
what proportion of individuals in the population will
in the population?
have the disease? Assume that the population is 50 : 50
10. In a natural population of Drosophila melanogaster, the male:female.
alcohol dehydrogenase gene has two alleles called 18. Red-green color blindness is an X-linked recessive dis-
F (fast) and S (slow) with frequencies of Adh-F at 0.75 order in humans caused by mutations in one of the
and Adh-S at 0.25. In a sample of 480 flies from this genes that encodes the light-sensitive protein, opsin. If
population, how many individuals of each genotypic the mutant allele has a frequency of 0.08 in the popu-
class would you expect to observe under Hardy– lation, what proportion of females will be carriers?
Weinberg equilibrium? Assume that the population is 50 : 50 male : female.
11. In a randomly mating laboratory population of Droso- 19. Is a new neutral mutation more likely to reach fixation
phila, 4 percent of the flies have black bodies (encoded in a large or small population?
by the autosomal recessive b), and 96 percent have
20. It seems clear that inbreeding causes a reduction in fit-
brown bodies (the wild type, encoded by B). If this pop-
ness. Can you explain why?
ulation is assumed to be in Hardy–Weinberg equilibri-
um, what are the allele frequencies of B and b and the 21. In a population of 50,000 diploid individuals, what is
genotypic frequencies of B/B and B/b? the probability that a new neutral mutation will ulti-
mately reach fixation? What is the probability that it
12. In a population of a beetle species, you notice that there will ultimately be lost from the population?
is a 3 : 1 ratio of shiny to dull wing covers. Does this ratio
22. Inbreeding in a population causes a deviation from
prove that the shiny allele is dominant? (Assume that
Hardy–Weinberg expectations such that there are
the two states are caused by two alleles of one gene.) If
more homozygotes than expected. For a locus with a
not, what does it prove? How would you elucidate the
rare deleterious allele at a frequency of 0.04, what
situation?
would be the frequency of homozygotes for the delete-
13. Cystic fibrosis (CF) is an autosomal recessive disorder rious allele in populations with inbreeding coefficients
that occurs relatively frequently among people of of F = 0.0 and F = 0.125?
European descent. In an Amish community in Ohio,
23. Sickle-cell anemia is a recessive autosomal disorder
medical researchers reported the occurrence of cystic
that is caused by an amino acid substitution in the
fibrosis (CF) as being 1/569 live births. Using the
β-hemoglobin protein. The DNA mutation underlying
Hardy–Weinberg rule, estimate the frequency of carri-
this substitution is a SNP that alters a GAG codon for
ers of the disease allele in this Amish population.
the amino acid glutamate to a GTG that codes a valine.
14. The relative fitness values of three genotypes are wA/A = The frequency of sickle-cell anemia among African
1.0, wA/a = 1.0, and wa/a = 0.7. Americans is about 1/400. What is the frequency of
a. If the population starts at the allele frequency p = this GTG codon in the β-hemoglobin gene among
0.5, what is the value of p in the next generation? African Americans?
b. What is the predicted equilibrium allele frequency if 24. You have a sample of 10 DNA sequences of 100 bp in
the rate of mutation of A to a is 2 × 10−5? length from a section of highly conserved gene from
712 C H APTER 1 8 Population Genetics
10 individuals of a species. The 10 sequences are al- fitness of 0.6. If there is no mutation, what will p and q be
most entirely identical; however, each sequence car- in the next generation?
ries one unique SNP not found in any of the others. 28. The hemoglobin B gene (Hb) has a common allele (A) of
What is the nucleotide diversity for this sample of a SNP (rs334) that encodes the HbA form of (adult) he-
sequences? moglobin and a rare allele (T ) that encodes the sickling
form of hemoglobin, HbS. Among 571 residents of a vil-
Ch a ll e n g i n g P r obl e m s lage in Nigeria, 440 were A/A and 129 were A/T, and 2
25. Figure 18-14 presents haplotype data for the G6PD gene were T/T individuals were observed. Use the χ2 test to
in a worldwide sample of people. determine whether these observed genotypic frequen-
cies fit Hardy–Weinberg expectations.
a. Draw a haplotype network for these haplotypes.
Label the branches on which each SNP occurs. 29. A population has the following gametic frequencies at
two loci: AB = 0.4, Ab = 0.1, aB = 0.1, and ab = 0.4. If
b. Which of the haplotypes has the most connections
the population is allowed to mate at random until link-
to other haplotypes?
age equilibrium is achieved, what will be the expected
c. On what continents is this haplotype found? frequency of individuals that are heterozygous at both
d. Counting the number of SNPs along the branches loci?
of your network, how many differences are there 30. Two species of palm trees differ by 50 bp in a 5000-bp
between haplotypes 1 and 12? stretch of DNA that is thought to be neutral. The muta-
26. Figure 18-12 shows a pedigree for the offspring of a half- tion rate for these species is 2 × 10−8 substitutions per
sib mating. site per generation. The generation time for these spe-
a. If the inbreeding coefficient for the common cies is five years. Estimate the time since these species
ancestor (A) in Figure 18-12 is 1/2, what is the in- had a common ancestor.
breeding coefficient of I? 31. Color blindness in humans is caused by an X-linked re-
b. If the inbreeding coefficient of individual I in cessive allele. Ten percent of the males of a large and
Figure 18-12 is 1/8, what is the inbreeding coefficient randomly mating population are color-blind. A repre-
of the common ancestor, A? sentative group of 1000 people from this population mi-
grates to a South Pacific island, where there are already
27. Consider 10 populations that have the genotype fre-
1000 inhabitants and where 30 percent of the males are
quencies shown in the following table:
color-blind. Assuming that Hardy–Weinberg equilibri-
Population A/A A/a a/a um applies throughout (in the two original populations
before the migration and in the mixed population im-
1 1.0 0.0 0.0
mediately after the migration), what fraction of males
2 0.0 1.0 0.0 and females can be expected to be color-blind in the gen-
3 0.0 0.0 1.0 eration immediately after the arrival of the migrants?
4 0.50 0.25 0.25 32. Using pedigree diagrams, calculate the inbreeding
5 0.25 0.25 0.50 coefficient (F ) for the offspring of (a) parent–offspring
matings; (b) first-cousin matings; (c) aunt–nephew
6 0.25 0.50 0.25 or uncle–niece matings; (d) self-fertilization of a
7 0.33 0.33 0.33 hermaphrodite.
8 0.04 0.32 0.64 33. A group of 50 men and 50 women establish a colony on
9 0.64 0.32 0.04 a remote island. After 50 generations of random mating,
how frequent would a recessive trait be if it were at a
10 0.986049 0.013902 0.000049
frequency of 1/500 back on the mainland? The popula-
a. Which of the populations are in Hardy–Weinberg tion remains the same size over the 50 generations, and
equilibrium? the trait has no effect on fitness.
b. What are p and q in each population? 34. Figure 18-22 shows 10 haplotypes from a population be-
fore a selective sweep and another 10 haplotypes many
c. In population 10, the A → a mutation rate is dis- generations later after a selective sweep has occurred for
covered to be 5 × 10 −6. What must be the fitness of the this chromosomal region. There are 11 loci defining each
a /a phenotype if the population is at equilibrium? haplotype, including one with a red allele that was the
d. In population 6, the a allele is deleterious; fur- target of selection. In the figure, two loci are designated
thermore, the A allele is incompletely dominant; so A/A as A and B. These loci each have two alleles: one black
is perfectly fit, A/a has a fitness of 0.8, and a/a has a and the other gray. Calculate the linkage disequilibrium
PROBLEMS 713
parameter (D) between A and B, both before and after 40. The sd gene causes a lethal disease of infancy in humans
the selective sweep. What effect has the selective sweep when homozygous. One in 100,000 newborns die each
had on the level of linkage disequilibrium? year of this disease. The mutation rate from Sd to sd is
35. The recombination rate (r) between linked loci A and B 2 × 10 −4. What must the fitness of the heterozygote be to
is 0.10. In a population, we observe the following haplo- explain the observed gene frequency in view of the mu-
typic frequencies: tation rate? Assign a relative fitness of 1.0 to Sd /Sd ho-
mozygotes. Assume that the population is at equilibrium
AB 0.40 with respect to the frequency of sd.
aB 0.10 41. If we define the total selection cost to a population of del-
Ab 0.10 eterious recessive genes as the loss of fitness per indi-
ab 0.40 vidual affected (s) multiplied by the frequency of affect-
ed individuals (q 2), then selection cost = sq 2.
a. What is the level of linkage disequilibrium as
a. Suppose that a population is at equilibrium be-
measured by D in the present generation?
tween mutation and selection for a deleterious
b. What will D be in the next generation? recessive allele, where s = 0.5 and μ = 10 −5. What is the
c. What is the expected frequency of the Ab haplotype equilibrium frequency of the allele? What is the
in the next generation? selection cost?
d. Using a spreadsheet computer software program, b. Suppose that we start irradiating individual
make a graph of the decline in D over 10 generations. members of the population so that the mutation rate
36. Allele B is a deleterious autosomal dominant. The fre- doubles. What is the new equilibrium frequency of the
quency of affected individuals is 4.0 × 10−6. The repro- allele? What is the new selection cost?
ductive capacity of these individuals is about 30 percent c. If we do not change the mutation rate but we lower
that of normal individuals. Estimate μ, the rate at which the selection coefficient to 0.3 instead, what happens
b mutates to its deleterious allele B. Assume that the fre- to the equilibrium frequency and the selection cost?
quencies of the alleles are at their equilibrium values. 42. Balancing selection acts to maintain genetic diversity at
37. What is the equilibrium heterozygosity for a SNP in a a locus since the heterozygous class has a greater fitness
population of 50,000 when the mutation rate is 3 × 10 −8 ? than the homozygous classes. Under this form of selec-
38. Of 31 children born of father–daughter matings, 6 died tion, the allele frequencies in the population approach
in infancy, 12 were very abnormal and died in child- an equilibrium point somewhere between 0 and 1.
hood, and 13 were normal. From this information, calcu- Consider a locus with two alleles A and a with frequen-
late roughly how many recessive lethal genes we have, cies p and q, respectively. The relative genotypic fitness-
on average, in our human genomes. (Hint: If the answer es are shown below, where s and g are the selective disad-
were 1, then a daughter would stand a 50 percent chance vantages of the two homozygous classes.
of carrying the lethal allele, and the probability of the
union’s producing a lethal combination would be 1/2 × Genotype A/A A/a a/a
1/4 = 1/8. So 1 is not the answer.) Consider also the pos- Relative fitness 1 - s 1 1 - g
sibility of undetected fatalities in utero in such matings.
How would they affect your result? a. At equilibrium, the mean fitness of the A alleles
39. The B locus has two alleles B and b with frequencies of (wA) will be equal to the mean fitness of the a alleles
0.95 and 0.05, respectively, in a population in the current (wa) (see Box 18-7). Set the mean fitness of the A alleles
generation. The genotypic fitnesses at this locus are wB /B (wA) equal to the mean fitness of the a alleles (wa).
= 1.0, wB / b = 1.0, and wb / b = 0.0 Solve the resulting equation for the frequency of the
a. What will the frequency of the b allele be in two A allele. This is the expression for the equilibrium fre-
generations? quency of A ( ˆp).
b. What will the frequency of the b allele be in two b. Using the expression that you just derived, find p̂
generations if the fitnesses were wB /B = 1.0, wB / b = 0.0, when s = 0.2 and g = 0.8.
and wb / b = 0.0?
c. Explain why there is a difference in the rate of
change for the frequency of the b allele under parts (a)
and (b) of this problem.
This page intentionally left blank
344
19
C h a p t e r
The Inheritance
of Complex Traits
Learning Outcomes
After completing this chapter, you will be able to
outline
715
716 CHAPTER 1 9 The Inheritance of Complex Traits
L
ook at almost any large group of men or women and you’ll notice a consider-
able range in their heights—some are short, some tall, and some about average.
Kareem Abdul-Jabbar, a star basketball center of the 1970s and 1980s, was a
towering 7 feet, 2 inches tall, whereas Willie Shoemaker, a renowned jockey who
won the Kentucky Derby four times, was a mere 4 feet, 11 inches. You might also
have noticed that in some families, the parents and their adult children are all on
the tall side, whereas in other families, the parents and adult children are all fairly
short. Such observations suggest that genes play a role in determining our heights.
Still, people do not segregate cleanly into tall and short categories as we saw for
Mendel’s pea plants. At first inspection, continuous traits, such as height, do not
appear to follow Mendel’s laws despite the fact that they are heritable.
Traits such as height that show a continuous range of variation and do not behave
in a simple Mendelian fashion are known as quantitative or complex traits. The
term complex trait is often preferred because variation for such traits is governed by
a “complex” of genetic and environmental factors. How tall you are is partly explained
by the genes you inherited from your parents and partly by environmental factors
such as how well you were nourished as a child. Teasing apart the genetic and envi-
ronmental contributions to an individual phenotype is a substantial challenge, but
geneticists have a powerful set of tools to meet it.
In the early 1900s, when Mendel’s laws were rediscovered, controversy arose
about whether these laws were applicable to continuous traits. A group known as
the biometricians discovered that there are correlations between relatives for con-
tinuous traits such that tall parents tend to have tall children. However, the biome-
tricians saw no evidence that such traits followed Mendel’s laws. Some biometricians
concluded that Mendelian loci do not control continuous traits. On the other hand,
some adherents of Mendelism thought continuous variation was unimportant and
could be ignored when studying inheritance. By 1920, this controversy was resolved
with the formulation of the multifactorial hypothesis. This hypothesis proposed
that continuous traits are governed by a combination of multiple Mendelian loci,
each with a small effect on the trait, and environmental factors. The multifactorial
hypothesis brought quantitative traits into the realm of Mendelian genetics.
Although the multifactorial hypothesis provided a sensible explanation for con-
tinuous variation, classic Mendelian analysis is inadequate for the study of complex
traits. If progeny cannot be sorted into categories with expected ratios, then the
Mendelian approach has little utility for the analysis of complex traits. In response
to this problem, geneticists developed a set of mathematical models and statistical
methods for the analysis of complex traits. Through the application of these analyti-
cal methods, geneticists have made great strides in understanding complex traits.
The subfield of genetics that develops and applies these methods to understand the
inheritance of complex traits is called quantitative genetics.
At the heart of the field of quantitative genetics is the goal of defining the
genetic architecture of complex traits. Genetic architecture is a description of all
of the genetic factors that influence a trait. It includes the number of genes affect-
ing the trait and the relative contribution of each gene. Some genes may have a large
effect on the trait, while others have only a small effect. As we will see in this chap-
ter, genetic architecture is the property of a specific population and can vary among
populations of a species. For example, the genetic architecture of a trait such as
systolic blood pressure in humans differs among different populations. This is
because different alleles segregate in different populations and different popula-
tions experience different environments; therefore, different populations are apt to
have different architectures for many traits.
Understanding the inheritance of complex traits is one of the most important
challenges facing geneticists in the twenty-first century. Complex traits are of para-
mount importance in medical and agricultural genetics. For humans, blood
19.1 Measuring Quantitative Variation 717
The mean
When quantitative geneticists study the inheritance of a trait, they work with a
particular group of individuals, or population. For example, we might be inter-
ested in the inheritance of height for the population of adult men in Shanghai,
China. Here, we are using “population” to denote a group that shares certain fea-
tures in common such as age, sex, ethnicity, or geographic origin. Since there are
more than 5 million adult men in Shanghai, determining each of their heights
would be a herculean task. Therefore, quantitative geneticists typically study just
a subset or sample of the full population. The sample should be randomly chosen
such that each of the 5 million men has an equal chance of being included in the
sample. If the sample meets this criterion, then we can use measurements made
on the sample to make inferences about the entire population.
Using the example of height for men from Shanghai, we can describe the pop-
ulation using the mean or average value for the trait. We select a random sample
of 100 men from the population and measure their heights. Some of the men
might be 166 cm tall, others 172 cm tall, and so forth. To calculate the mean, we
simply sum all the individual measurements and divide the sum by the size of the
sample (n), which in this case is 100. For the data in Table 19-1, the result would
be 170 cm, or 5 feet, 7 inches. Since we have a random sample, we can infer that
the average height in the entire population is 170 cm.
Height is a random variable which means it can take on different values, and
when we select someone at random from the population, the value we observe is
governed by an element of chance. Random variables are usually represented by
the letter X in statistics. We have measurements for X1, X2, X3, . . . X100 for the n =
100 men in the sample. Symbolically, we can express the mean as
n
1
X= ∑ Xi
n i =1
where X represents the sample mean. The uppercase Greek letter sigma ( S) is
the summation sign, indicating that we add all n observed values of X for i = 1, 2,
through n. (Often, the n above S and the i = 1 below S are omitted to simplify the
appearance of equations.)
There is a distinction made between the mean of a sample ( X ) and the true
mean of the population. To learn the true mean for the height of men in Shang-
hai, we would need to determine the height of each and every man. The true
mean is symbolized by the Greek letter µ, so that we have different symbols for
the sample and population means.
19.1 Measuring Quantitative Variation 719
Here is another way to calculate the mean, which is often Table 19-1 Simulated
Data for the Heights
quite useful. We can add the products of each class of values of of 100 Men from Shanghai, China
X in the data set times the frequency of that class in the data
set. This operation is symbolized as Height (cm) Count Frequency × Height
k 156 1 1.56
X = ∑ fi X i
i =1
157 2 3.14
158 1 1.58
where fi is the frequency of the ith class of observations, Xi is 159 2 3.18
the value of the ith class, and there are a total of k classes. For 160 1 1.60
the data in Table 19-1, one man of the 100 ( f = 0.01) is 156 cm 161 1 1.61
tall, two men ( f = 0.02) are 157 cm tall, and so forth, so we can
162 2 3.24
calculate the sample mean as
164 7 11.48
X = (0.01 × 156) + (0.02 × 157) + … + (0.02 × 184) = 170 165 7 11.55
166 1 1.66
The mean is useful for both describing populations and 167 6 10.02
comparing differences between populations. For example, 168 9 15.12
men in urban areas of China are on average 170 cm tall, while 169 7 11.83
men in rural areas of China are 166 cm tall. These values were 170 9 15.30
calculated using samples drawn from each region. One ques- 171 5 8.55
tion that a quantitative geneticist might ask about the observed 172 5 8.60
difference in height between rural and urban Chinese men is 173 6 10.38
the following: Is the difference due to genetic factors, or is it 174 5 8.70
due to differences in nutrition, health care, or other environ- 175 6 10.50
mental factors? Later in the chapter, we’ll see how quantitative
176 3 5.28
geneticists tease apart genetic versus environmental contribu-
177 4 7.08
tions to a trait.
Lastly, here is another helpful notation from statistics that 178 2 3.56
can be used to define the mean. The mean of a random variable, 179 2 3.58
X, is the expectation or expected value of that random variable. 180 2 3.60
The expected value is the average of all the values we would 181 2 3.62
observe if we measured X many times. The expectation is sym- 184 2 3.68
bolized by E, and we write E(X) to signify “the expected value of Sum 100 170.00
X.” Symbolically, we write
E(X) = X
We will use the notation of expectation in several places in this chapter.
The variance
Besides the mean, we also need a measure of how much variation exists in popula-
tions. We can create a visual representation of the variation by plotting the count
or frequency of each height class. Figure 19-1 shows such a plot for our simulated
height data for 100 men from Shanghai. The x-axis shows different height classes,
and the y-axis shows the count or frequency of each class. In this figure, the men
were binned into 4-cm groups, for example, between 155 and 158 cm. This type of
graph is called a frequency histogram. If the values are clustered tightly around
the mean, then there is less variation, and if the values are spread out along the
x-axis, there is greater variation.
We can quantify the amount of variation in a population using a statistical
measure called the variance. The variance measures the extent to which indi-
viduals in the population deviate from the population mean. If all 100 men in our
sample had heights very close to the mean, then the variance would be small. If
their heights deviated greatly from the mean, the variance would be large.
720 CHAPTER 1 9 The Inheritance of Complex Traits
25
Some individuals will have X values above the mean, and they
Frequency
20 0.2
will have a positive deviation. Others will have X values below
15 the mean, and they will have a negative deviation. For the pop-
ulation overall, the expected value of x is 0, or E(x) = 0.
10 0.1 To measure the amount of variation for X in the popula-
tion, we use the variance, which is the mean of the squared
5
deviations. First, we calculate the sum of the squared devia-
0 0
tions (or sum of squares for short) as
2
sum of squares = ∑( X i − X )
8
6
15
16
16
17
17
17
18
18
5–
9–
3–
7–
1–
5–
9–
3–
i
15
15
16
16
17
17
17
18
Height (cm) 2
= ∑( x i )
i
F i g u r e 19 -1 Frequency histogram of Since deviations with negative values form positive squares, both negative and
simulated data for the height of adult men positive deviations will contribute positively to the sum of squares. The variance
from Shanghai, China. is the mean of the squared deviations (or the sum of squares divided by n). Sym-
bolically, we express the population variance as
1 2
VX = ∑
n i
(X i − X )
1 2
= ∑( x i )
n i
where VX denotes the variance of X. The population variance is sometimes sym-
bolized using the lowercase Greek letter sigma squared (s2 ). In statistics, there is
also a distinction made between the population variance (s2 ) and the sample vari-
ance (s 2 ). The latter is calculated by dividing the sums of squares by n - 1 rather
than n to correct a bias caused by small sample size. For simplicity, we will use the
population variance and the formula above throughout this chapter.
There are several points to understand about the variance. First, the variance
provides a measure of dispersion about the mean. When the variance is high, the
individual values are spread farther apart from the mean; when it is low, then the
individual values cluster closer to the mean. Second, the variance is measured in
squared units such that if we measure human height in centimeters, then the
variance would be in centimeters2. Third, the variance can range from 0.0 to infin-
ity. Fourth, the variance is equal to the expected value of the squared deviation
( x 2 ) or E ( x 2 ).
The variance of quantitative traits is measured in squared units. These squared
units have desirable mathematical properties as we will see below; however, they do
not make intuitive sense. If we measure weight in kilograms, then the variance
would be in kilograms2, which has no clear meaning. Therefore, another statistic
used to quantify the extent of deviation from the mean in a population is the
standard deviation (s), which is the square root of the variance:
σ = σ2
The standard deviation is expressed in the same units as the trait itself, so its
meaning is more intuitive. We will use the standard deviation in the description
of traits below.
19.1 Measuring Quantitative Variation 721
Number of individuals
the frequency distribution for many biological traits approximates
a normal curve. For this reason, geneticists can take advantage of 40
several features of the normal distribution to describe quantitative 30
traits and dissect the underlying genetics.
20
The normal distribution is a continuous frequency distribution
similar to the frequency histogram shown in Figure 19-1. The normal 10
distribution applies to continuous traits. As mentioned above, con-
tinuous traits can take on an infinite number of values. A person 140 150 160 170 180
might be 170 cm tall or 170.2 or 170.002 and so forth. For such traits, Height (cm)
the expected frequency of different trait values is better represented (b)
by a curve than by a frequency histogram. For the normal distribu-
tion, the shape of the curve is determined by two factors—the mean 0.08
99.7%
95.5%
and the standard deviation. 68.2%
Here is an example using height data for 660 women from the 0.06
Frequency
United States collected by the Centers for Disease Control and Pre-
vention. The frequency histogram shows the classic “bell curve” 0.04
shape with the peak near the mean value of 164.4 cm and the off-
mean values distributed symmetrically around the mean (Figure 0.02
19-2a). We can fit a normal curve to this distribution using just two
0.00
pieces of information—the mean and the standard deviation. The –3 –2 –1 +1 +2 +3
shape of the curve is defined by an equation called the normal
probability density function, into which the mean and the stan-
dard deviation are plugged. The normal distribution allows us to (c)
predict the percentage of the observations that will fall within a 0.10
=4
certain distance from the mean (Figure 19-2b). If we measure dis-
tance along the x-axis in standard deviations, then 68 percent of 0.08
=6.18
the observations are expected to fall within 1 standard deviation
Frequency
0.06
(s) of the mean and 95.5 percent within 2 standard deviations. For
the height data for U.S. women, 71 percent (449 women) fall within 0.04
1 standard deviation of the mean and 96 percent (633 women)
=8
within 2 standard deviations. These values are very close to the 0.02
predictions of 68.2 percent and 95.5 percent based on the normal
curve. 0.00
140 150 160 170 180
If we know just the mean and the standard deviation for a trait,
Height (cm)
we can predict the shape of the distribution of the trait in the popula-
tion, and we can predict how likely we are to observe certain values
when sampling the population. For example, if the mean height for U.S. women is Figure 19-2 (a) Frequency histogram of
actual data for the height of adult women
164.4 cm (5 feet, 5 inches) and the standard deviation is 6.18 cm, we can predict that
from the United States. The red line
only 2 percent of women will be more than 177 cm tall, or five feet, 10 inches. As represents the normal curve fit to these data
shown in Figure 19-2c, if the standard deviation is greater (for example, 8), then the with a mean of 164.4 cm and standard
curve would be flatter and a greater percentage would fall above 177 cm. However, deviation of 6.18 cm. (b) Normal curve for
it would still be true that only 2 percent would be more than 2s above the mean, or the height of U.S. women showing the
180.4 cm [(164.4 + (2 × 8)]. predicted percentages of women who will
fall within different numbers of standard
deviations from the mean. (c) Normal
Key Concept A complex trait is any trait that does not show simple Mendelian curves with the same mean (164.4 cm) but
inheritance. A complex trait can be either a discontinuous trait such as the presence or different standard deviations, showing the
absence of a disease condition or a continuously variable trait such as height in humans. effect of the standard deviation on the
The field of quantitative genetics studies the inheritance of complex traits using some shape of the curve.
basic statistical tools, including the mean, variance, and normal distribution.
722 CHAPTER 1 9 The Inheritance of Complex Traits
environment in which the real Yao Ming was raised. Plugging these values into
the equation above, we obtain
229 = 170 + 42 + 17
We conclude that Yao Ming’s exceptional height is mostly due to exceptional
genetics, but he also experienced an environment that boosted his height.
Although our imaginary experiment of cloning Yao Ming is far-fetched, many
plant species and some animal species can be clonally propagated with ease. For
example, one can use “cuttings” of an individual plant to produce multiple geneti-
cally identical individuals. Another way of creating genetically identical individu-
als is by producing inbred lines or strains (Box 19-1). All individuals in such
strains are genetically identical because they are fully inbred from a common
parent or parents. By using clones or inbred lines, geneticists can estimate the
genetic and environmental contributions to a trait by rearing the clones in ran-
domly assigned environments. Here is an example.
Table 19-2 (experiment I) shows simulated data for 10 inbred strains of maize
that were grown in three different environments and scored for the number of
days between planting and the time that the plants first shed pollen. The overall
mean is 70 days. Let’s consider line A when grown in environment 1. The mean
for all lines in environment 1 is 68, or 2 less than the overall mean, so e for envi-
ronment 1 is −2. The mean line A over all three environments is 64, or 6 less than
Table 19-2 Simulated Data for Days to Pollen Shed for 10 Inbred Lines of Maize Grown in Two Experiments
Experiment I
Inbred lines A B C D E F G H I J Mean
Environment 1 62 64 66 66 68 68 70 70 72 74 68
Environment 2 64 66 68 68 70 70 72 72 74 76 70
Environment 3 66 68 70 70 72 72 74 74 76 78 72
Mean 64 66 68 68 70 70 72 72 74 76 70
Experiment II
Inbred lines A B C D E F G H I J Mean
Environment 4 58 60 62 62 64 64 66 66 68 70 64
Environment 5 64 66 68 68 70 70 72 72 74 76 70
Environment 6 70 72 74 74 76 76 78 78 80 82 76
Mean 64 66 68 68 70 70 72 72 74 76 70
the overall mean, so g for line A is −6. Putting these two values together, we
decompose the phenotype of line A when grown in environment 1 as
62 = 70 + (−6) + (−2)
We could do the same calculations for the other nine inbred lines, and then we’d
have a complete description of all the phenotypes in each environment in terms
of the extent to which their deviation from the overall mean is due to genetic and
environmental factors.
1 = E[g2 + e2 + 2ge]
COVX , Y = ∑(X i − X )(Yi − Y )
n i = E(g2) + E(e2) + E(2ge)
1
= ∑( x i y i )
n i The first term [E(g2)] is the genetic variance, the mid-
dle term [E(e2)] is the environmental variance, and the
where x and y are the deviations of X and Y from their
last term is twice the covariance between genotype and
respective means as described in the main text. The term
environment.
( X i − X )(Yi − Y ) or (xi yi) is referred to as the cross prod-
In controlled experiments, different genotypes are
uct. The covariance is obtained by summing all the cross
placed into different environments at random. In oth-
products together and dividing by n. The covariance is the
er words, genotype and environment are independent.
average or expected value, E(xy), of the cross products.
If genotype and environment are independent, then
The covariance can vary from negative infinity to positive
the covariance between genotype and environment
infinity. If large values of X are associated with large values
E( ge) = 0, and the equation reduces to
of Y, then the covariance will be positive. If large values of
X are associated with small values of Y, then the covari- VX = E(g2) + E(e2)
ance will be negative. If there is no association between X
and Y, then the covariance will be zero. For independent = Vg + Ve
traits, the covariance will be zero.
In the main text, we saw that the variance is the ex- Thus, the phenotypic variance is the sum of the
pected value of the squared deviations: variance due to the different genotypes in the popula-
tion and the variance due to the different environments
VX = E(x2) within which the organisms are reared.
in this manner. Below, we will see how this property of the variance is helpful for
quantifying the extent to which trait variation is heritable versus environmental.
Finally, let’s look at what would happen to the variances if genotype and envi-
ronment are correlated. To do this, imagine that we knew the genetic deviations
(g) for nine Thoroughbred horses for the time it takes them to run the Kentucky
Derby. We also know the environmental deviations (e) that their trainers contrib-
ute to the time it takes each horse to run this race. We will suppose that besides
training, there are no other sources of environmental variation. The population
mean for this set of Thoroughbreds is 123 seconds to run the Derby. We assign the
best horses to the best trainers and the worst horses to the worst trainers. By doing
this, we have created a nonrandom relationship or correlation between horses
(genotypes) and trainers (environments).
Table 19-3 shows the data for this imaginary experiment. You’ll notice that VX
(6.67) is not equal to the sum of Vg (2.22) and Ve (1.33). Because genotype and
environment are correlated, we violated the assumption of the equation that
states VX = Vg + Ve. The equation only works when genotype and environment
are uncorrelated.
The term VX VY is used to scale the covariance to vary between −1 and +1. The
expanded equation for the correlation coefficient is
Σ( X i − X )(Yi − Y )
rX ,Y =
2 2
Σ( X i − X ) Σ(Yi − Y )
158
158 178 198
When all of the variation in a population is due to genetic sources, then Vg equals
VX and H 2 is 1.0. H 2 is called “broad sense” because it encompasses several differ-
ent ways by which genes contribute to variation. For example, some of the varia-
tion will be due to the contributions of individual genes. Additional genetic
variation can be contributed by the way genes work together, the interactions
between genes, or epistasis.
In Section 19.2, we showed how we can calculate the genetic and environmen-
tal variances when we have inbred lines or clones. For the imaginary example of
days to pollen shed for maize inbred lines in Table 19-2 (experiment I), we saw
that Vg is 12.0 and VX is 14.67. Using these values, the heritability of the trait is
12.0/14.67 = 0.82, or 82 percent. This estimate of H 2 tells us that genes contribute
most of the variation and environmental factors contribute a more modest share
of the variation. Thus, we might conclude that days to pollen shed is a highly heri-
table trait in maize.
Let’s look at the data for experiment II in Table 19-2. The genotypes are exactly
the same as in experiment I; these are the genotypes of the inbred lines A through
J. In this case, however, the lines are reared in more extreme environments. If we
calculate the variance for the means of the inbred line in experiment II, Vg will be
12.0 days2 as in experiment I. Since the genotypes are the same in both experi-
ments, the genetic variance is the same. If we calculate the variance for the means
of the different environments (Ve) in experiment II, we will obtain 24.0 days2,
which is much larger than the value for Ve in experiment I (2.67). Since the envi-
ronments are more extreme, the environmental variance is larger. Finally, if we
calculate H 2 for experiment II, we obtain
Vg 12
H2 = = = 0.33
Vg + Ve 12 + 24
If we had many sets of identical twins who were reared between the environments to which the X′ and X″ twin
apart, how could we use them to measure H 2 ? Let’s arbi- of each pair are assigned. Accordingly, the covariance
trarily represent the trait value for one member of each between the environments [E(e ′e ″)] will be 0.0. Simi-
pair of twins as X′ and the trait value for the other as X ″. larly, because the assignment of twins to households is
We have many (n) sets of twins: X1′ X1″, X2′ X2″… random, we expect no correlation between the genetic
Xn′ Xn″. We can express the phenotypic deviations for deviation of twins ( g) and the household to which they
one set of twins as the sum of their genetic and environ- are assigned, so E( ge ′) and E( ge ″) will be 0.0. Therefore,
mental deviations, the equation for the covariance among twins reduces to
using x ′ as the deviation for one twin and x ″ for the other In other words, the covariance among identical twins
twin. Notice that g is the same because the twins are ge- reared apart is equal to the genetic variance. If we have
netically identical, but e ′ and e ″ are different because the a large set of identical twins who were reared apart, we
twins were reared in separate households. Next, we de- can use the covariance between the twins to estimate the
velop an expression for the covariance between the twins. amount of genetic variation for a trait in the general pop-
In Box 19-2, we saw that the covariance is the average or ulation. If we divide this covariance by the phenotypic
expected value of the cross products E(xy). Using our nota- variance, then we have an estimate of H 2:
tion for twins, x ′ and x ″, in place of x and y, we get COVX ′, X ′′
H2 =
COVX′, X″ = E(x ′x ″) VX
We can substitute ( g + e ′) for x ′ and (g + e ″) for x ″, giv- This equation is essentially the correlation coefficient
ing us between the twins. The variance for the twin of each set
designated X′ and that for the twin designated X″ are ex-
COVX′,X″ = E[(g × e′)(g × e″)] pected to be the same over a large sample. Thus, we can
rewrite the denominator of the equation as follows:
= E(g2 + ge′ + ge″ + e′e″)
COVX ′, X ′′
= E(g2) + E(ge′) + E(ge″) + E(e′e″) rX ′,X ′′ = = H2
VX ′VX ′′
Let’s consider the last three terms of this expres-
sion. Under our model, the twins are assigned randomly and we will see that H 2 is equivalent to the correlation
to households, and thus there should be no correlation between twins.
variance (Vg ). Thus, we can estimate H 2 in humans by using this covariance as the
numerator and the trait variance (VX) as the denominator:
COVX ′, X ′′
H2 =
VX
Here’s how it’s done. For each set of twins, let’s designate the trait value for one
twin as X′ and the other as X″. If we have n sets of twins, then the trait values for
the n sets could be designated X1′ X1″, X2′ X2″, … Xn′ Xn″.
Suppose we had IQ measurements for five sets of twins as follows:
Twin
X′ X″
1 100 110
2 125 118
3 97 90
4 92 104
5 86 89
73 0 CHAPTER 1 9 The Inheritance of Complex Traits
Using these data and the formula for the covariance from Box 19-2, we calculate
that COVX′, X ″ is 119.2 points2. Using the formula for trait variance, we would cal-
culate that the value of VX is 154.3 points2. Thus, we obtain
119.2 points2
H2 = = 0.77
154.3 points2
The points2 in the numerator and denominator cancel out, and we are left with a
unitless measure that is the proportion of the total variance that is due to
genetics.
Box 19-3 provides some additional details about estimating H2 from twin data,
including the derivation of the formula we just used. It also discusses the relation-
ship between the ratio COVX′, X ″/VX and the correlation coefficient. Quantitative
geneticists have developed several means for estimating heritability using the
correlation among relatives. Identical twins share
Table 19-4
Broad-Sense Heritability for Some Traits in 100 percent of their genes, while brothers, sisters,
Humans as Determined by Twin Studies and dizygotic twins share 50 percent of their genes.
The strength of the correlation between different
Trait H 2 types of relatives can be scaled for the proportion of
Physical attributes their genes that they share and the results used to
Height 0.88 estimate the genetic and environmental contribu-
tions to trait variation.
Chest circumference 0.61
Over the last 100 years, there have been exten-
Waist circumference 0.25
sive genetic studies of twins and other sets of rela-
Fingerprint ridge count 0.97
tives. A great deal has been learned about heritable
Systolic blood pressure 0.64 variation in humans from these studies. Table 19-4
Heart rate 0.49 lists some results from twin studies. It may or may
Mental attributes not be surprising to you, but there is a genetic con-
IQ 0.69 tribution to the variance for many different traits,
Speed of spatial processing 0.36 including physique, physiology, personality attri-
Speed of information acquisition 0.20 butes, psychiatric disorders, and even our social atti-
Speed of information processing 0.56 tudes and political beliefs. We readily observe that
Personality attributes traits such as hair and eye color run in families, and
we know these traits are the manifestation of geneti-
Extraversion 0.54
cally controlled biochemical, developmental pro-
Conscientiousness 0.49
cesses. In this context, it is not so surprising that
Neuroticism 0.48 other aspects of who we are as people also have a
Positive emotionality 0.50 genetic influence.
Antisocial behavior in adults 0.41 Twin studies and the estimates of heritability that
Psychiatric disorders they provide can easily be over- or misinterpreted.
Autism 0.90 Here are a few important points to keep in mind.
Schizophrenia 0.80 First, H 2 is a property of a particular population and
Major depression 0.37 environment. For this reason, estimates of H 2 can dif-
Anxiety disorder 0.30 fer widely among different populations and environ-
Alcoholism 0.50-0.60 ments. We saw this phenomenon above in the case of
the days to pollen shed for maize inbred lines. Sec-
Beliefs and political attitudes
ond, the twin sets used in many studies were sepa-
Religiosity among adults 0.30-0.45 rated at birth and placed into adoptive homes.
Conservatism among adults 0.45-0.65 Adoption agencies do not assign babies randomly to
Views on school prayer 0.41 the full range of households in a society; rather, they
Views on pacifism 0.38 place babies in economically, socially, and emotion-
ally stable households. As a result, Ve is smaller than
Sources: J. R. Alford et al., American Political Science Review 99, 2005,
1-15; T. Bouchard et al., Science 250, 1990, 223–228; T. Bouchard, in the general population, and the estimate of H 2 will
Curr. Dir. Psych. Sci. 13, 2004, 148–151; P. J. Clark, Am. J. Hum. Genet. 7, be inflated. Accordingly, the published estimates
1956, 49–54; C. M. Freitag, Mol. Psychiatry 12, 2007, 2–22. likely lead us to underestimate the importance of
19.4 Narrow-Sense Heritability: Predicting Phenotypes 731
environment and overestimate the importance of genetics. Third, for twins, pre-
natal effects could cause a positive correlation between genotype and environ-
ment. As we saw in the case of Thoroughbreds and jockeys above, such a cor-
relation violates our model and will bias H 2 upward.
Finally, heritability is not useful for interpreting differences between groups.
Table 19-4 shows that the heritability for height in humans can be very high: 0.88.
However, this high value for heritability does not tell us anything about whether
groups with different heights differ because of genetics or the environment. For
example, men in the Netherlands today average 184 cm in height, while around
1800, men in the Netherlands were about 168 cm tall on average, a 16-cm differ-
ence. The gene pool of the Netherlands has probably not changed appreciably
over that time, so genetics cannot explain the huge difference in height between
the current population and the one of 200 years ago. Rather, improvements in
health and nutrition are the likely cause. Thus, even though height is highly heri-
table and the past and present Dutch populations differ greatly in height, the
difference has an environmental basis.
locus has two alleles B1 and B2 and three genotypes—B1/B1, B1 /B2, and B2 /B2. As
diagrammed in Figure 19-6a, plants with the B1/B1 genotype have 1 flower, B1/B2 2
A=1
plants have 2 flowers, and B2 /B2 plants have 3 flowers. In a case like this, when the
D=1
heterozygote’s trait value is midway between those of the two homozygous classes, 1
gene action is defined as additive. In Figure 19-6b, the heterozygote has 3 flow-
B1/B1 B1/B2 B2/B2
ers, the same as the B2 /B2 homozygote. Here, the B2 allele is dominant to the B1
allele. In this case, the gene action is defined as dominant. (We could also define
F i g u r e 19 - 6 Plot of genotype ( x-axis)
this gene action as recessive with the B1 allele being recessive to the B2 allele.) Gene by phenotype ( y-axis) for a hypothetical
action need not be purely additive or dominant but can show partial dominance. locus, B, that regulates number of flowers
For example, if B1/B2 heterozygotes had 2.5 flowers on average, then we would say per plant. (a) Additive gene action.
that the B2 allele shows partial dominance. (b) Dominant gene action.
732 CHAPTER 1 9 The Inheritance of Complex Traits
Since the heterozygote has a phenotype that is midway between the two
homozygous classes, gene action is additive. There are no environmental effects,
and the genotype alone determines the number of flowers, so H 2 is 1.0. If the
plant breeder selects 3-flowered plants (B2 /B2), intermates them, and grows the
offspring, then all the offspring will be B2B2 and the mean number of flowers per
plant among the offspring will be 3.0. When gene action is completely additive
and there are no environmental effects, the phenotype is fully heritable. Selec-
tion as practiced by the plant breeder works perfectly.
Now let’s consider the case diagrammed in Figure 19-6b, in which the B2 allele
is dominant to the B1. In this case, the B1 B2 heterozygote is 3-flowered. The fre-
quency of the B1 and B2 alleles are both 0.5, and the frequencies of the B1/B1, B1 /B2,
and B2 /B2 genotypes are 0.25, 0.50, and 0.25, respectively. Again, there is no envi-
ronmental contribution to the differences among individuals, so H 2 is 1.0. The
mean number of flowers per plant in the starting population is 2.5.
If the plant breeder selects a group of 3-flowered plants, 2/3 will be B1/B2 and
1/3 B2 /B2. When the breeder intermates the selected plants, 0.44 (2/3 × 2/3) of
the crosses would be between heterozygotes, and 1/4 of the offspring from these
crosses would be B1 /B1 and thus 1-flowered. The remainder of the offspring would
be either B1/B2 or B2 /B2 and thus 3-flowered. The overall mean for the offspring
would be 2.78, although the mean of their parents was 3.0. Hence, when there is
dominance, the phenotype is not fully heritable. Selection as practiced by the
plant breeder worked but not perfectly because some of the differences among
individuals are due to dominance.
In conclusion, when there is dominance, we cannot strictly predict the off-
spring’s phenotypes from the parents’ phenotypes. Some of the differences (vari-
ation) among the individuals in the parental generation are due to the dominance
interactions between alleles. Since parents transmit their genes but not their
19.4 Narrow-Sense Heritability: Predicting Phenotypes 73 3
Using these values and the formulas above, we can calculate the additive and
dominance effects. The additive effect (A) is
0.024/0.04 = 0.6
The 0.6 value for the ratio indicates that the long (l ) allele
of Pitx1 is partially dominant to the short (s) allele.
One can also calculate additive and dominance
(b) Pelvic spine
effects averaged over all the genes in the genome that
affect the trait. Here is an example using cave fish
(Astyanax mexicanus) and their surface relatives (Fig-
ure 19-7b). The cave populations have highly reduced
(small-diameter) eyes compared to the surface popula-
tions. Populations colonizing lightless caves do not ben-
efit from having eyes. Since there are physiological and
neurological costs to forming and maintaining eyes,
evolution may have favored a reduction in the size of
the eye in cave populations.
Horst Wilkins at the University of Hamburg mea-
sured mean eye diameter (in mm) for the cave and sur-
face populations and their F1 hybrid:
Cave F1 Surface
2.10 5.09 7.05
individual, the offspring will all be B2 /B2, and the mean trait value of these off-
spring would be 3.0. Even though the B1/B2 and B2 /B2 individuals have the same
trait value and the same value for their genotypic deviation ( g), they do not pro-
duce the equivalent offspring because the underlying basis of their phenotypes is
different. The phenotype of the B1/B2 individual depends on the dominance effect
(D), while that of the B2/B2 individual does not involve dominance.
We can expand the simple model (x = g + e) to incorporate the additive and
dominance contributions. The genotypic deviation ( g) is the sum of two compo-
nents—a the additive deviation, which is transmitted to offspring, and d the domi-
nance deviation, which is not transmitted to offspring. We can rewrite the simple
model and separate out these two components as follows:
x=g+e
x=a+d+e
The genotypic deviations (g) are simply calculated by subtracting the population
mean (2.5) from the trait value for each genotype. Each genotypic deviation is
then decomposed into the additive (a) and dominance (d ) deviations using for-
mulas that are beyond the scope of this book. These formulas include the additive
(A) and dominance (D) effects as well as the frequencies of the B1 and B2 allele in
the population. You’ll notice that a + d sum to g. The additive (a) and dominance
(d ) deviations are dependent on the allele frequencies because the phenotype of
an offspring receiving a B1 allele from one parent will depend on whether that
allele combines with a B1 or B2 allele from the other parent, and that outcome
depends on the frequencies of the alleles in the population.
The additive deviation (a) has an important meaning in plant and animal
breeding. It is the breeding value, or the part of an individual’s deviation from
the population mean that is due to additive effects. This is the part that is trans-
mitted to its progeny. Thus, if we wanted to increase the number of flowers per
plant in the population, the B2 /B2 individuals have the highest breeding value.
Breeding values can also be calculated for the genome overall for an individual.
Animal breeders estimate the genomic breeding values of individual animals, and
these estimates can determine the economic value of the animal.
We have partitioned the genetic deviation ( g) into the additive (a) and domi-
nance (d ) deviations. Using algebra similar to that described in Box 19-2, we can
also partition the genetic variance into the additive and dominance variances as
follows:
Vg = Va + Vd
where Va is the additive genetic variance and Vd is the dominance variance. Va
is the variance of the additive deviations or the variance of the breeding values. It
73 6 CHAPTER 1 9 The Inheritance of Complex Traits
is the part of the genetic variation that is transmitted from parents to their offspring.
Vd is the variance of the dominance deviations. Finally, we can substitute these terms
in the equation for the phenotypic variance presented earlier in the chapter:
VX = Vg + Ve
VX = Va + Vd + Ve
where Ve is the environmental variance. This equation assumes that the additive
and dominance components are not correlated with the environmental effects.
This assumption will be true in experiments in which individuals are randomly
assigned to environments.
Thus far, we have described models with genetic, environmental, additive,
and dominance deviations and variances. In quantitative genetics, the models can
get even more complex. In particular, the models can be expanded to include
interaction between factors. If one factor alters the effect of another factor, then
there is an interaction. Box 19-4 briefly reviews how interactions are factored into
quantitative genetic models.
Narrow-sense heritability
We can now define narrow-sense heritability, which is symbolized by a lowercase h
squared (h2), as the ratio of the additive variance to the total phenotypic variance:
Va Va
h2 = =
VX Va + Vd + Ve
This form of heritability measures the extent to which variation among individuals
in a population is predictably transmitted to their offspring. Narrow-sense heritabil-
ity is the form of heritability of interest to plant and animal breeders because it
provides a measure of how well a trait will respond to selective breeding.
To estimate h2, we need to measure Va , but how can this be accomplished?
Using algebra and logic similar to that we used to show that Vg can be estimated
using the covariance between monozygotic twins reared separately (see Box 19-3),
it can also be shown that the covariance between a parent and its offspring is
equal to one-half the additive variance:
COVP ,O = 21 Va
The simple model for decomposing traits into genetic and and
environmental deviations, x = g + e, assumes that there is
VX = Vg + Ve + Vg×e
no genotype–environment interaction. By this statement, we
mean that the differences between genotypes do not change where Vg×e is the variance of the genotype–environment
across environments. In other words, a genotype–environment interaction. If the interaction term is not included in the
interaction occurs when the performance of different geno- model, then there is an implicit assumption that there
types is unequally affected by a change in the environment. are no genotype–environment interactions.
Here’s an example. Consider two inbred lines, IL1 and IL2, Interactions can also occur between the alleles at sep-
that have different genotypes. We rear both of these inbred arate genes. This type of interaction is called epistasis.
lines in two environments, E1 or E2. We can visualize the per- Let’s look at how epistatic interactions affect variation in
formance of these two lines in the two environments using a quantitative traits.
graph (below). This type of graph, which shows the pattern Consider two genes, A with alleles A1 and A2 and
of trait values of different genotypes across two or more envi- B with alleles B1 and B2. The left side of the table below
ronments is called a reaction norm. shows the case of no interaction between these genes.
If there is no interaction, then the difference in trait Starting with the A1/A1; B1/B1 genotype, whenever you
value between the inbred lines will be the same in both substitute an A2 allele for an A1 allele, the trait value goes
environments, as shown by the graph on the left. up by 1 regardless of the genotype at the B locus. The
same is true when substituting alleles at the B locus. The
No interaction Interaction
effects of alleles at the A locus are independent of those
at the B locus and vice versa. There is no interaction or
Trait value
3 3
IL1 epistasis.
2 2
No interaction Interaction
1 IL2 1
0 0 B1/B1 B1/B2 B2/B2 B1/B1 B1/B2 B2/B2
E1 E2 E1 E2 A1/A1 0 1 2 A1/A1 0 1 2
With no interaction, the difference between the two inbreds A1/A2 1 2 3 A1/A2 0 1 3
is 1.0 in both environments, and so the difference between A2/A2 2 3 4 A2/A2 0 1 4
the lines averaged over the two environments is 1.0.
Environment 1: IL1 - IL2 = 2 - 1 = 1.0 Now look at the right side of the table. Starting with
Environment 2: IL1 - IL2 = 3 - 2 = 1.0 the A1/A1; B1/B1 genotype, substituting an A2 allele
for an A1 allele only has an effect on the trait value
The difference in the overall mean shows that the lines when the genotype at the B locus is B2/B2. The effects
are genetically different. The mean over both environ- of alleles at the A locus are dependent of those at the
ments is 2.5 for IL1 and 1.5 for IL2. B locus. There is an interaction or epistasis between
The graph on the right shows a case of an interac- the genes.
tion between genotype and environment. IL1 does well The genetic model can be expanded to include an
in Environment 1 but poorly in Environment 2. The op- epistatic or interaction term (i):
posite is true for IL2. The difference in the trait value
between the two lines is +1.0 in Environment 1 but x=a+d+i+e
-1.0 in Environment 2. and
Environment 1: IL1 - IL2 = 2 - 1 = +1.0 VX = Va + Vd + Vi + Ve
Environment 2: IL1 - IL2 = 1 - 2 = -1.0
where Vi is the interaction or epistatic variance.
The difference between the lines averaged over the two If the interaction term is not included in the model,
environments is 0.0, so we might incorrectly conclude then there is an implicit assumption that the genes
that these inbreds are genetically equivalent if we looked work independently; that is, there is no epistasis. The
just at the overall mean. interaction variance (Vi), like the dominance variance,
The simple model can be expanded to include a geno- is not transmitted from parents to their offspring since
type–environment interaction term (g×e): new genotypes and thus new epistatic relationships are
x = g + e + g×e formed with each generation.
73 8 CHAPTER 1 9 The Inheritance of Complex Traits
estimated using the covariance between half-sibs, in which case all individuals in
the experiment can be reared at the same time in the same environment. Half-
sibs share one-fourth of their genes, so Va equals 4 × the covariance between
half-sibs.
If you compare the equation for h2 to the one for H 2 (see Box 19-3), you will
see that both involve the ratio of a covariance to a variance. The correlation coef-
ficient introduced earlier in the chapter is also the ratio of a covariance to a vari-
ance. We are using the degree of correlation among relatives to infer the extent to
which traits are heritable.
Here is an exercise that your class can try. Have each student submit his or her
height and the height of their same-sex parent. Using these data and spreadsheet
computer software, calculate the covariance between parents and their offspring
(the students). Then estimate h2 as two times the covariance divided by the phe-
notypic variance. For the total phenotypic variance (VX) in the denominator of
the equation, you can use the variance among the parents. Data
for male and female students should be analyzed separately.
The heights of individuals and their
Typically, values for narrow-sense heritability of height in
same-sex parent are correlated
humans are about 0.8, meaning that about 80 percent of the
variance is additive, or transmissible, from parent to offspring.
Female students The results for your class could deviate from this value for sev-
72 h = 0.86
2 eral reasons. First, if your class is small, sampling error can
affect the accuracy of your estimate of h2. Second, you will not
Height of students (inches)
x=a+d+e
The additive part is the heritable part that is transmitted to the offspring. Let’s
look at a set of parents with phenotypic deviations x ′ for the mother and x ″ for the
father. The parents’ dominance deviations (d ′ and d ″) are not transmitted to their
offspring since new genotypes and new dominance interactions are created with
each generation. Similarly, the parents do not transmit their environmental devi-
ations (e ′ and e ″) to their offspring.
Mom Dad
x′ = a′ + d′ + e′ x″ = a″+ d″ + e″
Offspring
a ′ + a ′′
xo = = ap
2
740 CHAPTER 1 9 The Inheritance of Complex Traits
Thus, the only factors that parents transmit to their offspring are their additive
deviations (a ′ and a ″). Accordingly, we can estimate the offspring’s phenotypic
deviation (xo) as the mean of the additive deviations of its parents ( a p).
So to predict the offspring’s phenotype, we need to know its parents’ additive
deviations. We cannot directly observe the parents’ additive deviations, but we
can estimate them. The additive deviation of an individual is the heritable part of
its phenotypic deviation; that is,
â = h 2x
where â signifies an estimate of the additive deviation or breeding value. Thus, we
can estimate the mean of the parents’ additive deviations as the product of h2
times the mean of their phenotypic deviation and this product will be an estimate
of the phenotypic deviation of the offspring ( x̂ o):
x̂ o = h 2 ( x ′ +2 x ′′)
or
x̂ o = h 2 x p
The offspring will have its own dominance and environmental deviations. How-
ever, these cannot be predicted. Since they are deviations, they will be zero on
average over a large number of offspring.
Here is an example. Icelandic sheep are prized for the quality of their fleece.
The average adult sheep in a particular population produces 6 lb of fleece per
year. A sire that produces 6.5 lb per year is mated with a dam that produces 7.0 lb
per year. The narrow-sense heritability of fleece production in this population is
0.4. What is the predicted fleece production for offspring of this mating? First,
calculate the phenotypic deviations for the parents by subtracting the population
mean from their phenotypic values:
Finally, add the population mean (6.0) to the predicted phenotypic deviation of
the offspring (0.3) and obtain the result that the predicted phenotype of the off-
spring is 6.3 lb of fleece per year.
It may seem surprising that the offspring are predicted to produce less fleece
than either parent. However, this outcome is expected for a trait with a modest
heritability of 0.4. Most (60 percent) of the superior performance of the parents
is due to dominance and environmental factors that are not transmitted to the
offspring. If the heritability were 1.0, then the predicted value for the offspring
would be midway between the parents’. If the heritability were 0.0, then the pre-
dicted value for the offspring would be at the population mean since all the varia-
tion would be due to nonheritable factors.
x̂ o = h 2 x p
and rewrite it as
xo
h2 = F i g u r e 19 - 9 Distribution of trait values
xp for provitamin A in maize kernels in a
starting population (a) and offspring
x p is the mean deviation of the parents (the selected plants) from the population population (b) after one generation of
mean. This is known as the selection differential (S), the difference between the selection. The starting population had a
mean of the selected group and that of the base population. For our example, mean of 1.25 µg/g, the selected
individuals a mean of 1.63 µg/g, and the
x p = 1.63 − 1.25 = 0.38 offspring population a mean of 1.44 µg/g.
x o is the mean deviation of the offspring from the population mean. This is known
as the selection response (R), the difference between the mean of the offspring
and that of the base population. For our example,
Now we can calculate the narrow-sense heritability for this trait in this population
as
R x 0.19
h2 = = o = = 0.5
S xp 0.38
The underlying logic of this calculation is that the response represents the heri-
table or additive part of the selection differential.
Over the last century, quantitative geneticists have conducted a large number of
selection experiments like this. Typically, these experiments are performed over
many generations and are referred to as long-term selection studies. Each genera-
tion, the best individuals are selected to produce the subsequent generation. Such
studies have been performed in economically important species such as crop plants
and livestock and in many model organisms such as Drosophila, mice, and nema-
todes. This work has shown that virtually any species will respond to selection for
742 CHAPTER 1 9 The Inheritance of Complex Traits
sec, and neither the flies nor the gains made by selection showed
any signs of slowing down after 100 generations. In the second
100 experiment, mice were selected over 10 generations for the
amount of “wheel running” they did per day (Figure 19-10b).
There was a 75 percent increase over just 10 generations. These
studies and many more like them demonstrate the tremendous
power of artificial selection and deep pools of additive genetic
0
variation in species.
0 50 100
Generation K e y C o n c e p t Narrow-sense heritability ( h2) is the proportion
of the phenotypic variance that is attributable to additive effects.
(b)
This form of heritability measures the extent to which variation
among individuals in a population is predictably transmitted to their
offspring. The value of h2 can be estimated in two ways: (1) using
10,000 the correlation between parents and offspring and (2) using the ratio
Revolutions per day
Frequency
SNPs or microsatellites can be scored unambiguously as 0.04
homozygous P1 or heterozygous for each BC1 individual. If b/b B/B
there is a QTL linked to the marker locus, then the mean 0.02
trait value for individuals that are homozygous P1 at the
marker locus will be different from the mean trait value for
0.00
the heterozygous individuals. Based on such evidence, one Low Intermediate High
can infer that a QTL is located near the marker locus. Let’s
Trait value
look in more detail at how this works.
F i g u r e 19 -11 Frequency distributions showing how the
The basic method distributions for the different genotypic classes at QTL locus B relate
to the overall distribution for the population (black line).
There are a variety of experimental designs that can be
used in QTL mapping experiments. We will begin by
describing a simple design. Let’s say we have two inbred lines of tomato that differ
in fruit weight—Beefmaster with fruits of 230 g in weight and Sungold with fruits
of 10 g in weight (Figure 19-12). We cross the two lines to produce an F1 hybrid
and then backcross the F1 to the Beefmaster line to produce a BC1 generation. We
grow several hundred BC1 plants to maturity and measure the weight of the fruit
on each. We also extract DNA from each of the BC1 plants. We use these DNA
samples to determine the genotype of each plant at a set of marker loci (SNPs or
SSRs) that are distributed across all of the chromosomes such that we have a
marker locus every 5 to 10 centimorgans.
Beefmaster Sungold
Beefmaster F1
From this process, we would assemble a data set for several hundred plants and
100 or more marker loci distributed around the genome. Table 19-6 shows part of
such a data set for just 20 plants and 5 marker loci that are linked on a single chro-
mosome. For each BC1 plant, we have the weight of its fruit and the genotypes at the
marker loci. You’ll notice that trait values for the BC1 plants are intermediate
between the two parents as expected but closer to the Beefmaster value because
this is a BC1 population and Beefmaster was the backcross parent. Also, since this is
a backcross population, the genotypes at each marker locus are either homozygous
for the Beefmaster allele (B/B) or heterozygous (B/S). In Table 19-6, you can see the
positions of crossovers between the marker loci that occurred during meiosis in the
F1 parent of the BC1 generation. For example, plant BC1-001 has a recombinant
chromosome with a crossover between marker loci M3 and M4.
The overall mean fruit weight for the BC1 population is 175.7. We can also cal-
culate the mean for the two genotypic classes at each marker locus as shown in
Table 19-6. For marker M1, the means for the B /B (176.3) and B/S (175.3) geno-
typic classes are very close to the overall mean (175.7). This is the expectation if
there is no QTL affecting fruit weight near M1. For marker M3, the means for the
B/B (180.7) and B/S (169.6) genotypic classes are quite different from the overall
mean (175.7) and from each other. This is the expectation if there is a QTL affect-
ing fruit weight near M3. Thus, we have evidence for a QTL affecting fruit weight
near marker M3. Also notice that the B/B class has heavier fruit than the B/S class
of M3. Plants that inherited the S allele from the small-fruited Sungold line have
Distinct distributions
smaller fruits than those that inherited the B allele from the Beefmaster line.
for genotypic classes
Figure 19-13 is a graphical representation of QTL-mapping data for many plants
at a marker locus signal
along one chromosome. The phenotypic data for the B /B and B /S genotypic classes the location of a QTL near
are represented as frequency distributions so we can see the distributions of the the marker
trait values. At marker M1, the distributions are fully overlapping and the means for
the B/B and B/S distributions are very close. It appears that the B/B and B/S classes
have the same underlying distribution. At marker M3, the distributions are only B/B
partially overlapping and the means for the B/B and B/S distributions are quite dif-
M1
ferent. The B/B and B/S classes have different underlying distributions similar to the
situation in Figure 19-11. We have evidence for a QTL near M3. B/S
As shown in Figure 19-13, the trait means for the B/B and B/S groups at some
markers are nearly the same. At other markers, these means are rather different.
How different do they need to be before we declare that a QTL is located near a
M2
marker? The statistical details for answering this question are beyond the scope of
this text. However, let’s review the basic logic behind the statistics. The statistical
analysis involves calculating the probability of observing the data (the specific
fruit weights and marker-locus genotypes for all the plants) given that there is a
QTL near the marker locus and the probability of observing the data given that
M3
there is not a QTL near the marker locus. The ratio of these two probabilities is
called the “odds”:
Prob (data|QTL)
odds =
Prob (data|no QTL)
M4
The vertical line | means “given,” and the term Prob(data|QTL) reads “the prob-
ability of observing the data given that there is a QTL.” If the probability of the
data when there is a QTL is 0.1 and the probability of the data when there is no
QTL is 0.001, then the odds are 0.1/0.001 = 100. That is, the odds are 100 to 1 in
favor of there being a QTL. Researchers report the log10 of the odds, or the Lod M5
score. So, if the odds ratio is 100, then the log10 of 100, or Lod score, is 2.0.
If there is a QTL near the marker, then the data were drawn from two underly-
ing distributions—one distribution for the B/B class and one for the B/S class. Each 160 190
of these distributions has its own mean and variance. If there is no QTL, then the Fruit weight (g)
data were drawn from a single distribution for which the mean and the variance
are those of the entire BC1 population. At marker locus M1 in Figure 19-13, the F i g u r e 19 -13 A tomato chromosomal
segment with marker loci M1 through M5.
distributions for the B/B and B/S classes are nearly identical. Thus, there is a high
At each marker locus, the frequency
probability that the data were drawn from a single underlying distribution. At distributions for fruit weight from a BC1
marker M3, the distributions for the B/B and B/S classes are quite different. Thus, population of a Beefmaster × Sungold
there is a higher probability of observing our data if we infer that the B/B plants cross are shown. The red distributions are
were drawn from one distribution and B/S plants from another. for the homozygous Beefmaster (B /B)
In addition to testing for QTL at the marker loci where the genotypes are genotypic class at the marker; the gray
known, Lod scores can be calculated for points between the markers. This can be distributions are for the heterozygous
(B /S) genotypic class. Yellow lines
done by using the genotypes of the flanking markers to infer the genotypes at
represent the mean of each distribution.
points between the markers. For example, in Table 19-6, plant BC1-001 is B/B at
markers M1 and M2, and so it has a high probability of being B/B at all points in
between. Plant BC1-003 is B /B at marker M1 but B/S at M2, and so the plant might
be either B/B or B /S at points in between. The odds equation incorporates this
uncertainty when one calculates the Lod score at points between the markers.
The Lod scores can be plotted along the chromosome as shown by the blue
line in Figure 19-14. Such plots typically show some peaks of various heights as
well as stretches that are relatively flat. The peaks represent putative QTL, but
how high does a peak need to be before we declare that it represents a QTL? As
discussed in Chapters 4 and 18, we can set a statistical threshold for rejecting the
“null hypothesis.” In this case, the null hypothesis is that “there is not a QTL at a
746 CHAPTER 1 9 The Inheritance of Complex Traits
Lod score
score exceeds the threshold value, there
is statistical evidence for a QTL. 5
Threshold
value
M1 M2 M3 M4 M5 M6 M7 M8 M9 M10
specific position along the chromosome.” The greater the Lod score, then the
lower the probability under the null hypothesis. There are different statistical
procedures for setting a “threshold value” for the Lod score. Where the Lod score
exceeds the threshold value, then we reject the null hypothesis in favor of the
alternative hypothesis that a QTL is located at that position. In Figure 19-14, the
Lod score exceeds the threshold value (red line) near marker locus M3. We con-
clude that a QTL is located near M3.
In addition to backcross populations, QTL mapping can be done with F2 popu-
lations and other breeding designs. An advantage of using an F2 population is that
one gets estimates of the mean trait values for all three QTL genotypes: homozy-
gous parent-1, homozygous parent-2, and heterozygous. With these data, one can
get estimates of the additive (A) and dominance (D) effects of the QTL as dis-
cussed earlier in this chapter. Thus, QTL mapping enables us to learn about gene
action, whether dominant or additive, for each QTL.
Here is an example. Suppose we studied an F2 population from a cross of Beef-
master and Sungold tomatoes and we identified two QTL for fruit weight. The
mean fruit weights for the different genotypic classes at the QTL might look
something like this:
Fruit weights Effects
B/B B/S S/S A D
QTL 1 180 170 160 10 0
QTL 2 200 185 110 45 30
We can use these fruit weight values for the QTL to calculate the additive and
dominance effects. QTL 1 is purely additive (D = 0), but QTL 2 has a large domi-
nance effect. Also, notice that the additive effect of QTL 2 is more than 4 times
that of QTL 1 (45 versus 10). Some QTL have large effects, and others have rather
small effects.
What can be learned from QTL mapping? With the most powerful QTL-map-
ping designs, geneticists can estimate (1) the number of QTL (genes) affecting a
trait, (2) the genomic locations of these genes, (3) the size of the effects of each
QTL, (4) the mode of gene action for the QTL (dominant versus additive), and (5)
whether one QTL affects the action of another QTL (epistatic interaction). In
other words, one can get a rather complete description of the genetic architecture
for the trait.
Much has been learned about genetic architecture from QTL-mapping studies
in diverse organisms. Here are two examples. First, flowering time in maize is a
classic quantitative or continuous trait. Flowering time is a trait of critical impor-
tance in maize breeding since the plants must flower and mature before the end
19.5 Mapping QTL in Populations with Known Pedigrees 747
of the growing season. Maize from Canada is adapted to flower within 45 days
after planting, while maize from Mexico can require 120 days or longer. QTL map-
ping has shown that the genetic architecture for flowering time in maize involves
more than 50 genes. Results from one experiment are shown in Figure 19-15a;
these results show evidence for 15 QTL. QTL for maize flowering time generally
have a small effect, such that substituting one allele for another at a QTL alters
flowering time by only one day or less. Thus, the difference in flowering time
between tropical and temperate maize involves many QTL.
Second, mice have been used to map QTL for many disease-susceptible traits.
What one learns about disease-susceptibility genes in mice is often true in humans
as well. Figure 19-15b shows the results of a genomic scan in mice for QTL for
bone mineral density (BMD), the trait underlying osteoporosis. This scan identi-
fied two QTL, one on chromosome 9 and one on chromosome 12. From studies
such as this, researchers have indentified over 80 QTL in mice that may contrib-
ute to susceptibility to osteoporosis. Similar studies have been done on dozens of
other disease conditions.
10
0
0 100 200
centimorgans
(b) QTL for bone mineral density in mice
Lod score
10
0
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 x
Chromosome
F i g u r e 19 -15 Plot of Lod scores from genomic scans for QTL. (a) Results from a scan for
flowering time QTL in maize. (b) Results from a scan for bone-mineral-density QTL in mice.
[ (a) Data from E. S. Buckler et al., Science 325, 2009, 714 –718; (b) Data from N. Ishimori et al., J. Bone
Min. Res. 23, 2008, 1529 –1537. ]
74 8 CHAPTER 1 9 The Inheritance of Complex Traits
1 181.4
2 182.2
3 180.6
4 169.3
5 171.2
6 180.7
7 181.8
8 169.3
9 170.7
10 171.4
F i g u r e 19 -16 A tomato chromosomal segment for a set of 10 congenic lines that have
crossovers near a QTL for fruit weight. Red chromosomal segments are derived from the
Beefmaster line and yellow segments from the Sungold line. Differences in fruit weight
among the lines make it possible to identify the kin1 gene as the gene underlying this QTL.
differs between the congenic lines. Thus, the use of congenic lines eliminates the
complications caused by having multiple QTL segregate at the same time.
Using the tomato fruit weight example from above, the chromosome region
for a set of such congenic lines is shown in Figure 19-16. The genes (flc, arf4,. . .)
are shown at the top, and the location for each crossover is indicated by the switch
in color from red (Beefmaster genotype) to yellow (Sungold genotype). The mean
fruit weight for the congenic lines carrying these recombinant chromosomes is
indicated on the right. By inspection of Figure 19-16, you’ll notice that all lines
with the Beefmaster allele of kin1 (a kinase gene) have fruit of ~180 g, while those
with the Sungold allele of kin1 have fruit of about ~170 g. None of the other genes
are associated with fruit weight in this way. If confirmed by appropriate statistical
tests, this result allows us to identify kin1 as the gene underlying this QTL.
Table 19-7 lists a small sample of the hundreds of genes or QTL affecting
quantitative variation from different species that have been identified. The list
includes the gene for maize flowering time, Vgt, that underlies one of the Lod
peaks in Figure 19-15a. One notable aspect of this list is the diversity of gene func-
tions. There does not appear to be a rule that only particular types of genes can be
a QTL. Most, if not all, genes in the genomes of organisms are likely to contribute
to quantitative variation in populations.
Table 19-7 Some Genes Contributing to Quantitative Variation that Were First Identified Using QTL Mapping
Organism Trait Gene Gene function
Yeast High-temperature growth RHO2 GTPase
Arabidopsis Flowering time CRY2 Cryptochrome
Maize Branching Tb1 Transcription factor
Maize Flowering time Vgt Transcription factor
Rice Photoperiod sensitivity Hd1 Transcription factor
Rice Photoperiod sensitivity CK2a Casein kinase a subunit
Tomato Fruit-sugar content Brix9-2-5 Invertase
Tomato Fruit weight Fw2.2 Cell-cell signaling
Drosophila Bristle number Scabrous Secreted glycoprotein
Cattle Milk yield DGAT1 Diacylglycerol acyltransferase
Mice Colon cancer Mom1 Modifier of a tumor-suppressor gene
Mice Type 1 diabetes I-Ab Histocompatibility antigen
Humans Asthma ADAM33 Metalloproteinase-domain-containing protein
Humans Alzheimer’s disease ApoE Apolipoprotein
Humans Type 1 diabetes HLA-DQA MHC class II surface glycoprotein
Source: A. M. Glazier et al., Science 298, 2002, 2345-2349.
Association mapping offers several advantages over QTL mapping. First, since
it is performed with random-mating populations, there is no need to make con-
trolled crosses or work with human families with known parent–offspring rela-
tionships. Second, it tests many alleles at a locus at once. In QTL-mapping studies,
there are two parents (Beefmaster and Sungold tomatoes in the example above)
Strong disequilibrium
No disequilibrium
F i g u r e 19 -17 (top) Diagram of the distribution of SNPs and haplotypes for a chromosomal segment
from 18 individuals. Haplotypes often occur in blocks (regions of lower recombination) separated from one
another by recombination hotspots (different colors indicate haplotype blocks). (The column of S’s and D’s
at the right are for Problem 19-4.) SNP8 (bold) controls a difference in trait values. ( bottom) You can tell
whether two SNPs show disequilibrium by noting the color of the square where the rows for the markers
intersect. Within a haplotype block, SNPs show strong disequilibrium. SNPs in different haplotype blocks
show weak or
Introduction to no disequilibrium.
Genetic [ Data from David Altshuler et al., Science 322, 2008, 881–888.]
Analysis, 11e
Figure 19.17 #1917
08/27/14
Dragonfly Media Group
19.6 Association Mapping in Random-Mating Populations 751
and so only two alleles are being compared. With association mapping, all the
alleles in the population are being assayed at the same time. Finally, association
mapping can lead to the direct identification of the genes at the QTL without the
need for subsequent fine-mapping studies. This is possible because the SNPs in
any gene that influences the trait will show stronger associations with the trait
than SNPs in other genes. Let’s take a look at how it works.
15
10
APOE
5
0
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 2122 x
Crohn’s disease
CARD15
15 IL23R ATG16L1 IRGM
5
(P value)
0
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 2122 x
Rheumatoid arthritis
–log10
HLA-DRB1
15
PTPN22
10
0
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 2122 x
Type 1 diabetes
HLA-DRB1
15 PTPN22
10
0
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 2122 x
Currently, it is unknown why this is the case. Researchers had expected that com-
mon diseases such as type 2 diabetes would be caused by common alleles; that is,
alleles with frequencies between 5 and 95 percent. GWA studies are designed to
detect the effects of common alleles, but they are not designed to detect the
effects of rare alleles. Thus, one hypothesis is that susceptibility for many com-
mon diseases (or height variation) is caused by a large number of rare alleles. In
other words, the disease-susceptibility alleles that segregate in one family are dif-
ferent from those in another, unrelated family.
Despite the inability of GWA studies to explain all the heritable variation for
traits, this approach has provided a major advance in understanding quantitative
genetic variation. Hundreds of new genes contributing to quantitative variation
for disease risk have been identified. These genes are now the targets for the
development of new therapies. Beyond humans, GWA studies have advanced our
understanding of the inheritance of quantitative traits in Arabidopsis, Drosophila,
yeast, and maize.
summary
Quantitative genetics seeks to understand the inheritance of heritability (H 2) of the trait. H 2 is the ratio of the genetic vari-
complex traits—traits that are influenced by a mix of genetic ance to the phenotypic variance. Broad-sense heritability
and environmental factors and do not segregate in simple expresses the degree to which the differences in the pheno-
Mendelian ratios. Complex traits can be categorical traits, types among the individuals in a population are determined
threshold traits, counting (meristic) traits, or continuously by differences in their genotypes. The measurement of H 2 in
variable traits. Any trait for which we cannot directly infer humans has revealed that many traits have genetic influ-
genotype from phenotype is a target for quantitative genetic ences, including physical attributes, mental functions, per-
analysis. sonality features, psychiatric disorders, and even political
The genetic architecture of a trait is the full description attitudes.
of the number of genes affecting the trait, their relative con- Parents transmit genes but not genotypes to their off-
tributions to the phenotype, the contribution of environ- spring. At each generation, new dominance interactions
mental factors to the phenotype, and an understanding of between the alleles at a locus are created. To incorporate
how the genes interact with one another and with environ- this phenomenon into the mathematical model for quanti-
mental factors. To decipher the genetic architecture of com- tative variation, the genetic deviation (g) is decomposed
plex traits, quantitative geneticists have developed a simple into the additive (a) and dominance (d) deviations. Only
mathematical model that decomposes the phenotypes of the additive deviation is transmitted from parents to off-
individuals into differences that are due to genetic factors spring. The additive deviation represents the heritable part
(g) and those that are due to environmental factors (e). of the phenotype in the narrow sense. The additive part of
The differences in trait values among members of a pop- the variance in a population is the heritable part of the vari-
ulation can be summarized by a statistical measure called ance. Narrow-sense heritability (h2) is the ratio of the addi-
the variance. The variance measures the extent to which tive variance to the phenotypic variance. Narrow-sense
individuals deviate from the population mean. The vari- heritability provides a measure of the degree to which the
ance for a trait can be partitioned into a part that is due to phenotypes of individuals are determined by the genes they
genetic factors (the genetic variance) and a part that is due inherit from their parents.
to environmental factors (the environmental variance). A A knowledge of the narrow-sense heritability of a trait is
key assumption behind partitioning the trait variance into fundamental to understanding how a trait will respond to
genetic and environmental components is that genetic and selective breeding or the force of natural selection. Plant and
environmental factors are uncorrelated or independent. animal breeders use their knowledge of narrow-sense herita-
The degree to which variation for a trait in a population is bility for traits of interest to guide plant and animal improve-
explained by genetic factors is measured by the broad-sense ment programs. Narrow-sense heritability is used to predict
Solved Problems 755
the phenotypes of offspring and estimate the breeding value between the genotypes at marker loci and trait values in ran-
of individual members of the breeding population. dom-mating populations. Association mapping can allow
The genetic loci underlying variation in complex traits researchers to identify the genes that underlie the QTL.
are known as quantitative trait loci, or QTL for short. There Genome-wide association (GWA) studies use markers blan-
are two experimental methods for characterizing QTL and keting the entire genome.
determining their locations in the genome. First, QTL map- Most traits of importance in medicine, agriculture, and
ping looks for statistical correlations between the genotypes evolutionary biology show complex inheritance. Examples
at marker loci and trait values in populations with known include disease risk in humans, yield in soybeans, milk pro-
pedigrees such as a BC1 population. QTL mapping provides duction in dairy cows, and the full spectrum of phenotypes
estimates of the number of genes controlling a trait, whether that differentiate all the species of plants, animals, and
the alleles at the QTL exhibit additivity or dominance, and microbes on earth. Quantitative genetic analyses are at
whether each QTL has a small or large effect on the trait. the forefront of understanding the genetic basis of these
Second, association mapping looks for statistical correlations critical traits.
ke y terms
additive effect (A) (p. 733) dominance effect (D) (p. 733) nearly isogenic line (p. 747)
additive gene action (p. 731) dominant gene action (p. 731) normal distribution (p. 721)
additive genetic variance (p. 735) environmental variance (p. 725) partial dominance (p. 731)
association mapping (p. 749) fine-map (p. 747) personal genomics (p. 752)
breeding value (p. 735) frequency histogram (p. 719) population (p. 718)
broad-sense heritability (H 2 ) gene action (p. 731) QTL mapping (p. 742)
(p. 727) genetic architecture (p. 716) quantitative genetics (p. 716)
candidate gene (p. 749) genetic variance (p. 725) quantitative trait (p. 716)
categorical trait (p. 717) genome-wide association (GWA quantitative trait loci (QTL)
complex inheritance (p. 717) or GWAS) (p. 749) (p. 742)
complex trait (p. 716) inbred line or strain (p. 723) sample (p. 718)
congenic line (p. 747) isogenic (p. 747) selection differential (S) (p. 741)
continuous trait (p. 717) mean (p. 718) selection response (R) (p. 741)
correlation (p. 726) meristic trait (p. 718) simple inheritance (p. 717)
correlation coefficient (p. 726) multifactorial hypothesis (p. 716) standard deviation (p. 720)
covariance (p. 725) narrow-sense heritability threshold trait (p. 717)
deviation (p. 720) (h2) (p. 731) variance (p. 719)
s o lved pr o blems
SOLVED PROBLEM 1. In a flock of 100 broiler chickens, the population will fall within 2 standard deviations of the mean
mean weight is 700 g and the standard deviation is 100 g. and the remaining 4.5 percent will lie more than 2 standard
Assume the trait values follow the normal distribution. deviations from the mean. Of this 4.5 percent, one-half (2.25
percent) will be more than 2 standard deviations less than
a. How many of the chickens are expected to weigh more
the mean, and the other half (2.25 percent) will be more
than 700 g?
than 2 standard deviations greater than the mean. Thus, we
b. How many of the chickens are expected to weigh more expect about 2.25 percent of the 100 chickens (or roughly 2
than 900 g? chickens) to weigh more than 900 g.
c. If H 2 is 1.0, what is the genetic variance for this population? c. When H 2 is 1.0, then all of the variance is genetic. We
know that the standard deviation is 100, and the variance is
Solution the square of the standard deviation.
a. Since the normal distribution is symmetrical about the
Variance = s2
mean, 50 percent of the population will have a trait value
above the mean and the other 50 percent will have a trait Thus, the genetic variance would be (100)2 = 10,000 g2.
value below the mean. In this case, 50 of the 100 chickens are
expected to weigh more than 700 g. SOLVED PROBLEM 2. Two inbred lines of beans are inter-
b. The value of 900 g is 2 standard deviations greater than crossed. In the F1, the variance in bean weight is measured at
the mean. Under the normal distribution, 95.5 percent of the 15 g2. The F1 is selfed; in the F2, the variance in bean weight is
756 CHAPTER 1 9 The Inheritance of Complex Traits
61 g2. Estimate the broad heritability of bean weight in the F2 SOLVED PROBLEM 4. One research team reports that the
population of this experiment. broad-sense heritability for height in humans is 0.5 based on
a study of identical twins reared apart in Iceland. Another
Solution team reports that the narrow-sense heritability for human
The key here is to recognize that all the variance in the F1 height is 0.8 based on a study of parent–offspring correlation
population must be environmental because all individuals in the United States. What seems unexpected about these
have the same genotype. Furthermore, the F2 variance must results? How could the unexpected results be explained?
be a combination of environmental and genetic components,
because all the genes that are heterozygous in the F1 will seg- Solution
regate in the F2 to give an array of different genotypes that Broad-sense heritability is the ratio of the total genetic vari-
relate to bean weight. Hence, we can estimate ance (Vg) to the phenotypic variance (VX). The total genetic
variance includes both the additive (Va) and the dominance
Ve = 15 g2
(Vd) variance
Vg + Ve = 61 g2
Vg Va + Vd
Therefore, h2 = =
VX VX
Vg = 61 - 15 = 46 g2
and broad heritability is Narrow-sense heritability is the ratio of the additive
variance (Va) to the phenotypic variance (VX).
46
H2 = = 0.75 (75%)
61 Va
H2 =
SOLVED PROBLEM 3. In an experimental population of
VX
Tribolium (flour beetles), body length shows a continuous
Thus, all other variables being equal, H 2 should be greater
distribution with a mean of 6 mm. A group of males and
than or equal to h2. It will be equal to h2 when Vd is 0.0. It
females with a mean body length of 9 mm are removed and
is unexpected that h2 should be greater than H 2. However,
interbred. The body lengths of their offspring average 7.2 mm.
the two research teams studied different populations—in
From these data, calculate the heritability in the narrow
Iceland and in the United States. Estimates of heritability
sense for body length in this population.
apply only to the population and environment in which
Solution they were measured. Estimates made in one population
The selection differential (S) is 9 - 6 = 3 mm, and the selec- can be different from those made in another population
tion response (R) is 7.2 - 6 = 1.2 mm. Therefore, the herita- because the two populations may segregate for different
bility in the narrow sense is alleles at numerous genes and the two populations experi-
ence different environments.
R 1.2
h2 = = = 0.4 (40%)
S 3.0
pr o blems
Most of the problems are also available for review/grading through the http://www.whfreeman.com/
launchpad/iga11e.
W o rking with the F igures 3. Figure 19-16 shows the results of a QTL fine-mapping ex-
1. Figure 19-9 shows the trait distributions before and after periment. Which gene would be implicated as control-
a cycle of artificial selection. Does the variance of the ling fruit weight if the mean fruit weight for each line
trait appear to have changed as a result of selection? was as follows?
Explain. Line Fruit weight (g)
2. Figure 19-11 shows the expected distributions for the 1 181.4
three genotypic classes if the B locus is a QTL affecting 2 169.3
the trait value. 3 170.7
a. As drawn, what is the dominance/additive (D/A) 4 171.2
ratio? 5 171.4
b. How would you redraw this figure if the B locus had 6 182.2
no effect on the trait value? 7 180.6
8 180.7
c. How would the positions along the x-axis of the
9 181.8
curves for the different genotypic classes of the B locus
change if D/A = 1.0? 10 169.3
Problems 757
4. Figure 19-17 shows a set of haplotypes. Suppose these are a. What are the heights of very tall and very short
haplotypes for a chromosomal segment from 18 haploid women?
yeast strains. On the right edge of the figure, the S and D b. In a population of 10,000 women, how many are
indicate whether the strain survives (S) or dies (D) at high expected to be very tall and how many very short?
temperature (40°C). Using the χ2 test (see Chapter 3) and
12. A bean breeder is working with a population in which the
Table 3-1, does either SNP1 or SNP6 show evidence for an
mean number of pods per plant is 50 and the variance is
association with the growth phenotype? Explain.
10 pods2. The broad-sense heritability is known to be 0.8.
5. Figure 19-18a shows a plot of P values (represented by Given this information, can the breeder be assured that
the dots) along the chromosomes of the dog genome. the population will respond to selection for an increase in
Each P value is the result of a statistical test of associa- the number of pods per plant in the next generation?
tion between a SNP and body size. Other than the cluster
13. The table below shows the number of piglets per litter
of small P values near IGF1, do you see any chromosomal
for a group of 60 sows. What is the mean number of pig-
regions with evidence for a significant association be-
lets per litter? What is the relative frequency of litters
tween a SNP and body size? Explain.
with at least 12 piglets?
6. Figure 19-19 shows plots of P values (represented by the
dots) along the chromosomes of the human genome. Number of litters Piglets/litter
Each P value is the result of a statistical test of association 1 6
between a SNP and a disease condition. There is a cluster, 3 7
or spike, of statistically significant P values (green dots) at 7 8
the gene HLA-DRB1 for two diseases. Why might this par- 12 9
ticular gene contribute to susceptibility for the autoim- 18 10
mune diseases rheumatoid arthritis and type 1 diabetes? 20 11
17 12
B asic P r o blems
14 13
7. Distinguish between continuous and discontinuous varia- 6 14
tion in a population, and give some examples of each. 2 15
8. What are the central assumptions of the multifactorial
hypothesis?
14. A chicken breeder is working with a population in which
9. The table below shows a distribution of bristle number in the mean number of eggs laid per hen in one month is 28
a Drosophila population. Calculate the mean, variance, and the variance is 5 eggs2. The narrow-sense heritability
and standard deviation for these data. is known to be 0.8. Given this information, can the breed-
Bristle number Number of individuals er expect that the population will respond to selection
for an increase in the number of eggs per hen in the next
1 1
generation?
2 4
3 7 a. No, applying selection is always risky and a breeder
4 31 never knows what to expect.
5 56 b. No, a breeder needs to know the broad-sense
6 17 heritability to know what to expect.
7 4 c. Yes, since the narrow-sense heritability is close to
1 (0.8), then we would expect selective breeding could
10. Suppose that the mean IQ in the United States is roughly lead to increased egg production in the next generation.
100 and the standard deviation is 15 points. People with d. Yes, since the variance is greater than 0.
IQs of 145 or higher are considered “geniuses” on some
scales of measurement. What percentage of the popula- e. Both c and d are correct.
tion is expected to have an IQ of 145 or higher? In a coun- 15. The narrow-sense heritability of the number of peas per
try with 300 million people, how many geniuses are pod in a population of sugar snap peas is 0.5. The mean
there expected to be? of the population is 6.2 peas per pod. A plant breeder
11. In a sample of adult women from the United States, the selects one plant with 6.8 peas per pod and crosses with
average height was 164.4 cm and the standard deviation a second plant that has 8.0 peas per pod. What is the ex-
was 6.2 cm. Women who are more than 2 standard devia- pected numbers of peas per pod among the offspring of
tions above the mean are considered very tall, and wom- this cross?
en who are more than 2 standard deviations below the 16. QTL mapping and GWA (association) mapping are two
mean are considered very short. Height in women is nor- different methods used to identify genes that affect com-
mally distributed. plex traits. For each of the following statements, choose
758 CHAPTER 1 9 The Inheritance of Complex Traits
whether it applies to QTL mapping, association map- variance in adult weight is measured at 3 g2. The F1 ani-
ping, or both. mals are intercrossed to create an F2 in which the vari-
ance in adult weight is 16 g2. Estimate the broad herita-
Statement QTL GWA Both bility of adult weight in the F2 population of this
This method requires that the experiment. (The environments in which the F1 and F2
experimenter make crosses animals were reared were equivalent.)
between different strains to produce
a mapping population. 20. The table below shows the weights of 100 individual
mice of the same inbred strain reared on different diets.
This method can scan the entire For an individual mouse that weights 27 g, how much of
genome to find QTL for a trait. its weight is due to its genetics and how much to the spe-
cific diet it was fed (environment)? (Other than diet, the
This method can often identify the
mice were reared in equivalent environments.)
specific genes that represent the QTL.
Number of mice Weight (g)
This method may sample a large
5 21
number of individuals from a 13 22
random-mating population that has
18 23
variation for the trait being studied.
21 24
This method typically tests two 22 25
alleles that differ between the two 16 26
parents of the mapping population. 5 27
C hallenging P r o blems
21. The table below contains measurements of total serum
cholesterol (mg/dl) for 10 sets of monozygotic twins
17. In a large herd of cattle, three different characters show- who were reared apart. Calculate the following: overall
ing continuous distribution are measured, and the vari- mean, overall variance, covariance between the twins,
ances in the following table are calculated: and broad-sense heritability (H 2).
Characters
X′ X″
Shank Neck Fat
228 222
Variance length length content
186 152
Phenotypic 310.2 730.4 106.0 204 220
Environmental 248.1 292.2 53.0 142 185
Additive genetic 46.5 73.0 42.4 226 210
Dominance genetic 15.6 365.2 10.6 217 190
207 226
a. Calculate the broad- and narrow-sense heritabilities
185 213
for each character.
179 159
b. In the population of animals studied, which
170 129
character would respond best to selection? Why?
c. A project is undertaken to decrease mean fat 22. The table below contains the height in centimeters for
content in the herd. The mean fat content is currently 10 sets of adult women twins. Calculate the correlation
10.5 percent. Animals with a mean of 6.5 percent fat coefficient (r) between the heights of the sisters for the
content are interbred as parents of the next generation. twin pairs.
What mean fat content can be expected in the Twin 1 Twin 2
descendants of these animals?
18. In a species of the Darwin’s finches (Geospiza fortis), the 158 163
narrow-sense heritability of bill depth has been estimat- 156 150
ed to be 0.79. Bill depth is correlated with the ability of 172 173
the finches to eat large seeds. The mean bill depth for the 156 154
population is 9.6 mm. A male with a bill depth of 10.8 160 163
mm is mated with a female with a bill depth of 9.8 mm. 159 153
What is the expected value for bill depth for the offspring 170 174
of this mating pair? 177 174
165 168
19. Two inbred lines of laboratory mice are intercrossed. In 172 165
the F1 (which have identical genotypes at all loci), the
Problems 759
23. Population A consists of 100 hens that are fully isogenic c. Which of these has the largest additive effect?
and that are reared in a uniform environment. The d. The mean pulse rate for L. kohalensis is 3.72, and it is
average weight of the eggs they lay is 52 g, and the vari-
0.71 for L. paranigra. Do all six QTL act in the expected
ance is 3.5 g2. Population B consists of 100 genetically
direction with the L. kohalensis allele conferring a higher
variable hens that produce eggs with a mean weight of
pulse rate than the L. paranigra allele?
52 g and a variance of 21.0 g2. Population B is raised in an
environment that is equivalent to that of Population A. 27. Question 26 refers to QTL on the cricket autosomes. For
What is the environmental variance (Ve) for egg weight? the sex chromosomes, females crickets are XX and males
What is the genetic variance in Population B? What is the crickets are XO, having just one X chromosome but no
broad-sense heritability in Population B? Y chromosome. Can QTL for pulse rate be mapped on
cricket X chromosomes? If the song is only sung by male
24. Maize plants in a population are on average 180 cm tall. crickets, can the dominance effects of QTL on the X be
Narrow-sense heritability for plant height in this popu- estimated?
lation is 0.5. A breeder selects plants that are 10 cm
28. GWA studies reveal statistical correlations between the
taller on average than the population mean to produce
genotypes at marker loci in genes and complex traits. Do
the next generation, and the breeder continues applying
GWA studies prove that allelic variation in a gene actu-
this level of selection for eight generations. What will be
ally causes the variation in the trait? If not, what experi-
the average height of the plants after eight generations
ments could prove that allelic variants in a gene in a
of selection? Assume that h2 remains 0.5 and Ve does not
population are responsible for variation in a trait?
change over the course of the experiment.
29. The ocular albinism-2 (OCA2) gene and the melanocor-
25. In a population of Drosophila melanogaster reared in the
tin-1-receptor (MC1R) gene are both involved in mela-
laboratory, the mean wing length is 0.55 mm and the
nin metabolism in skin cells in humans. To test whether
range is 0.35 to 0.65. A geneticist selects a female with
variation at these genes contributes to sun sensitivity
wings that are 0.42 mm in length and mates her with a
and the associated risk of being afflicted with skin
male that has wings that are 0.56 mm in length.
cancer, you perform association analyses. A sample of
a. What is the expected wing length of their offspring 1000 people from Iceland were asked to classify them-
if wing length has a narrow-sense heritability of 1.0? selves as having tanning or burning (nontanning) skin
b. What is the expected wing length of their offspring if when exposed to the sun. The individuals were also
wing length has a narrow-sense heritability of 0.0? genotyped for a SNP in each gene (rs7495174 and
rs1805007). The table shows the number of individuals
26. Different species of crickets have distinct songs, and
in each class.
they use these songs for mate recognition. Researchers
crossed two species of Hawaiian crickets (Laupala para- OCA2 (rs7495174) MC1R (rs1805007)
nigra and L. kohalensis) whose songs are distinguished
A/A A/G G/G C/C C/T T/T
by pulse rate (the number of pulses per second; Shaw et
Burning 245 56 1 192 89 21
al., Molecular Ecology 16, 2007, 2879–2892.) Then, they
Tanning 555 134 9 448 231 19
mapped QTL in the F2 population derived from this
cross. Six autosomal QTL were detected. The mean trait
values (pulses per second) at the three genotypic classes a. What are the frequencies of tanning and burning
in the F2 for each QTL are shown in the table below, phenotypes in Iceland?
where P indicates the L. paranigra allele and K indicates b. What are the allelic frequencies at each locus (SNP)?
the L. kohalensis allele. c. Using the χ2 test (see Chapter 3) and Table 3-1, test
QTL P/P P/K K/K the null hypothesis that there is no association between
these SNPs and sun-sensitive skin. Does either SNP
1 1.54 1.89 2.10 show evidence for an association?
2 1.75 1.87 1.94 d. If you find evidence for an association between the
gene and the trait, what is the mode of gene action?
3 1.72 1.88 1.92
4 1.70 1.82 2.02 e. If the P value is greater than 0.05, does that prove
that the gene does not contribute to variation for sun
5 1.67 1.80 2.13
sensitivity? Why?
6 1.57 1.88 2.19
20
C h a p t e r
Evolution of Genes
and Traits
Learning Outcomes
After completing this chapter, you will be able to
outline
761
762 CHAPTER 2 0 Evolution of Genes and Traits
C
harles Darwin (1809–1882) arrived in the Galápagos Islands in 1835, well
into the fourth year of what was supposed to be a two-year voyage. One
might think that these islands, now inextricably linked with Darwin’s name,
were the young naturalist’s paradise. Far from it. Darwin found the islands hellishly
hot, their broken black volcanic rock scorching under the hot sun. In his diary he
observed that “the stunted trees show little signs of life . . . the plants also smell
unpleasantly. . . . The black lava rocks on the beach are frequented by large (2–3 ft.)
most disgusting clumsy lizards. . . . They assuredly well become the land they
inhabit.”1 Other than the lizards and the tortoises, the animal life on the islands was
scant and unimpressive. He could not wait to leave the place. The 26-year-old
explorer did not know that his five weeks in the Galápagos would inspire a series of
radical ideas that, some 24 years later with the publication of his On the Origin of
Species (1859), would change our perception of the world and our place in it.
Several months after leaving the islands, on the last leg of the voyage home to
England, Darwin had his first flash of insight. He had begun to organize his copious
field notes from his nearly five years of exploration and collecting. His plan was for
experts back in England to lead the study of his collections of fossils, plants, animals,
and rocks. Turning to his observations on the birds of the Galápagos, he recalled that
he had found slightly different forms of mockingbirds on three different islands.
Now, there was a puzzle. The prevailing view of the origin of species in 1835, held by
most of Darwin’s teachers and much of the scientific establishment, was that species
were specially created by God in their present form, unchangeable, and placed in the
habitat to which they were best suited. Why, then, would there be slightly different
birds on such similar islands? Darwin jotted in his ornithology notebook:
When I see these Islands in sight of each other and possessed of but a scanty
stock of animals, tenanted by these birds but slightly differing in structure
filling the same place in Nature, I must suspect they are only varieties.
. . . If there is the slightest foundation for these remarks, the zoology of
Archipelagoes will be well worth examining; for such facts would undermine
the stability of species [emphasis added].2
Darwin’s insight was that species might change. This was not what he had learned
at Cambridge University. This was heresy. Although Darwin decided to keep such
dangerous thoughts to himself, he was gripped by the idea. After arriving home in
England, he filled a series of notebooks with thoughts about species changing. Within
a year he had convinced himself that species arise naturally from preexisting species,
as naturally as children are born from parents and parents from grandparents. He
then pondered how species change and adapt to their particular circumstances. In
1838, just two years after the conclusion of his voyage and before he had yet turned
30, he conceived his answer—natural selection. In this competitive process, indi-
viduals bearing some relative advantage over others live longer and produce more
offspring, which in turn inherit the advantage.
Darwin knew that to convince others of these two ideas—the descent of species
from ancestors and natural selection—he would need more evidence. He spent the
next two decades marshaling all of the facts he could from botany, zoology, embryol-
ogy, and the fossil record.
He received crucial information from experts who helped to sort out and charac-
terize his collections. Ornithologist John Gould pointed out to Darwin that what the
young naturalist thought were blackbirds, grosbeaks, and finches from the Galápa-
gos were actually 12 (now recognized as 13) new and distinct species of ground
finches (Figure 20-1). The Galápagos species, though clearly finches, exhibit an
immense variation in feeding behavior and in the bill shape that corresponds to their
1 C. Darwin, Charles Darwin's Beagle Diary, Ed. R. D. Keynes, Cambridge University Press, 2001.
2 C. Darwin, Charles Darwin's Beagle Diary, Ed. R. D. Keynes, Cambridge University Press, 2001.
20.1 Evolution by Natural Selection 76 3
Seed eaters
Seed eaters have bills that are adapted
for collecting and crushing seeds.
Bud eater
The bud eater has a heavy bill adapted
for pulling buds from branches.
Vegetarian finch
(Platyspiza crassirostris)
ANCESTOR FINCH
from South American
mainland
Insect eaters
Insect eaters have a variety of bills adapted
to eating different sizes and types of insects,
which they capture in different ways.
Mangrove finch
(C. heliobates) The woodpecker finch has a long
beak it uses to search for insects
Woodpecker finch in crevices in dead wood and bark.
(C. pallidus)
food sources. For example, the vegetarian tree finch uses its heavy bill to eat fruits
and leaves, the insectivorous finch has a bill with a biting tip for eating large insects,
and, most remarkable of all, the woodpecker finch grasps a twig in its bill and uses it
to obtain insect prey by probing holes in trees.
This diversity of species, Darwin deduced, must have arisen from an original
population of finch that arrived in the Galápagos from the mainland of South Amer-
ica and populated the islands. The descendants of the original colonizers spread to
the different islands and formed local populations that diverged from one another
and eventually formed different species.
The finches illustrate the process of adaptation, in which the characteristics of
a species become modified to suit the environments in which they live. Darwin pro-
vided one level of explanation for the process, natural selection, but he could not
explain how traits varied or how they changed with time because he did not under-
stand the mechanisms of inheritance. Understanding the genetic basis of adaptation
has been one of the long-standing goals of evolutionary biology.
A first step toward this goal was taken when Mendel’s work pointing to the exis-
tence of genes was rediscovered two decades after Darwin died. Another key emerged
a half century later, when the molecular basis of inheritance and the genetic code
were deciphered. For many decades since, biologists have known that species and
traits evolve through changes in DNA sequence. However, the elucidation of specific
changes in DNA sequence underlying physiological or morphological evolution has
posed considerable technical challenges. Advances in molecular genetics, develop-
mental genetics, and comparative genomics are now revealing the diverse mecha-
nisms underlying the evolution of genes, traits, and organismal diversity.
The study of evolution is a very large and expanding discipline. As such, we will
not attempt a comprehensive overview of all facets of evolutionary analysis. Rather,
in this chapter, we will examine the molecular genetic mechanisms underlying the
variation in and evolution of traits and the adaptation of organisms to their environ-
ments. We will first examine the evolutionary process in general and then focus on
specific examples for which the genetic and molecular bases of the phenotypic dif-
ferences between populations or species have been pinpointed. All of the examples
will focus on the evolution of relatively simple traits controlled by a single gene.
These relatively simple examples are sufficient to illustrate the fundamental process
of evolution at the DNA level and the variety of ways in which the evolution of genes
affects the gain, loss, and modification of traits.
types change over time. Thus, the three critical ingredients to evolutionary change
Darwin put forth were variation, selection, and time:
Can it, then, be thought improbable. . . that variations useful in some way to
each being in the great and complex battle of life, should sometimes occur in
the course of thousands of generations? . . . Can we doubt (remembering that
many more individuals are born than can possibly survive) that individuals
having any advantage, however slight, over others, would have the best chance
of surviving and of procreating their kind? On the other hand, we may feel
sure that any variation in the least degree injurious would be rigidly destroyed.
This preservation of favorable variations and the rejection of injurious
variations I call Natural Selection. (On the Origin of Species, Chapter IV)3
Darwin’s writings and ideas are well known, and justifiably so, but it is very
important to note that he was not alone in arriving at this concept of natural
selection. Alfred Russel Wallace (1823–1913), a fellow Englishman who explored
the jungles of the Amazon and the Malay Archipelago for a total of 12 years,
reached a very similar conclusion in a paper that was co-published with an excerpt
from Darwin in 1858:
The life of wild animals is a struggle for existence. . . . Perhaps all the
variations from the typical form of a species must have some definite effect,
however slight, on the habits or capacities of the individuals. . . . It is also
evident that most changes would affect, either favourably or adversely, the
powers of prolonging existence. . . . If, on the other hand, any species should
produce a variety having slightly increased powers of preserving existence,
that variety must inevitably in time acquire a superiority in numbers.4
3 C. Darwin, On the Origin of Species by Means of Natural Selection, or the Preservation of Favored
for this evolutionary process in the wild, he had in mind the selection that breed-
ers exercise on successive generations of domestic plants and animals.
We can summarize the theory of evolution by natural selection in three
principles:
1. Principle of variation. Among individuals within any population, there is varia-
tion in morphology, physiology, and behavior.
2. Principle of heredity. Offspring resemble their parents more than they resem-
ble unrelated individuals.
3. Principle of selection. Some forms are more successful at surviving and repro-
ducing than other forms in a given environment.
Balanced polymorphism
F i g u r e 2 0 - 3 A colorized electron
Red blood cells in someone with sickle-cell trait micrograph showing sickle cells among
normal red blood cells. [ Eye of Science/
Science Source.]
the mosquitoes reproduce. Allison surmised that the HbS allele might, by altering
red blood cells, confer some degree of resistance to malarial infection.
The proportion of individuals with sickle cells in any population, then, will
be the result of a balance between two factors: the severity of malaria, which
will tend to increase the frequency of the gene, and the rate of elimination
of the sickle-cell genes in individuals dying of sickle-cell anaemia. . . .
Genetically speaking, this is a balanced polymorphism [emphasis added],
where the heterozygote has an advantage over either homozygote.5
F i g u r e 2 0 - 4 A blood smear of an
individual infected with malarial parasites. A
red blood cell sample was treated with Giemsa
stain to reveal parasites within cells (red dots).
[CDC/Dr. Mae Melvin.]
20.2 Natural Selection in Action: An Exemplary Case 76 9
In other words, the sickle-cell mutation was under balancing selection (see Chap-
ter 18) in areas where malaria was present. Positive selection operating on AS indi-
viduals is balanced by natural selection operating against AA individuals susceptible
to malaria and SS individuals who would succumb to sickle-cell anemia.
How much of an advantage do AS individuals experience? This can be calcu-
lated by measuring the frequency of the HbS allele in populations and examining
how these frequencies differ from the frequencies expected under the assump-
tions of the Hardy–Weinberg equation (see Chapter 18). A large-scale survey of
12,387 West Africans has revealed an HbS allele frequency (q) of 0.123. The fre-
quencies calculated from the Hardy–Weinberg equation are lower for the homo-
zygous phenotypes and higher for the heterozygous phenotype (Table 20-2). If it
is assumed that the AS heterozygote has a fitness of 1.0, then the relative fitness
of the other genotypes can be estimated from these differences. The relative fit- F i g u r e 2 0 - 5 The relative survival of
ness of the heterozygous AS genotype is 1.0/0.88 = 1.136, which corresponds to a approximately 1000 children from Kisumu
is plotted from birth through the first few
selective advantage of approximately 14 percent.
years of life. Sickle-cell heterozygotes
This selective advantage has been well documented by long-term survival experienced a significant advantage in
studies of AA, AS, and SS children in Kenya. These studies have found that AS overall survival from ages 2 to 16 months.
individuals have a pronounced survival advantage over AA and SS individuals in [ Data from M. Aidoo et al., The Lancet 359,
the first few years of life (Figure 20-5). 2002, 1311–1312.]
1.02
0.97
HbAS
Estimate of relative survival
0.92
0.87
HbAA
0.82
0.77 HbSS
0.72
0 30 180 360 540 720 900 1080 1260 1440 1620 1800 2140
Time until death (days)
770 CHAPTER 2 0 Evolution of Genes and Traits
Americans, the frequency of HbS is on the decline because of selection against the
allele in the absence of malaria in North America.
3. Natural selection acts on whatever variation is available, and not necessarily by the
best means imaginable. The HbS mutation, while protective against malaria, also
causes a life-threatening condition. In areas where malaria is prevalent, where over
40 percent of the world’s population lives, the imperative of combating malaria
counterbalances the deleterious effect of the sickle-cell mutation.
ey
200
al s
s
ep s/re
sh
s/
m tile
pr
s te
/fi
m
m
am p
ct ra
es
M s/re
am
la
se eb
til
p/
in ert
rd
ar
Bi
The debate was resolved by an onslaught of empirical data and the decipher-
ing of the genetic code. Because multiple codons encode the same amino acid, a
mutation that changes, say, CAG to CAC does not change the amino acid encoded.
Therefore, variation can exist at the DNA level that has no effect on protein
sequences, and thus neutral alleles do exist. But even more important for popula-
tion genetics was the development of the “neutral theory of molecular evolution”
by Motoo Kimura, Jack L. King, and Thomas Jukes. These authors proposed that
most, but not all, mutations that are fixed are neutral or nearly neutral and any
differences between species at such sites in DNA evolve by random genetic drift.
The “neutral theory” marked a profound conceptual shift away from a view of
evolution as always guided by natural selection. Moreover, it provided a baseline
assumption of how DNA should change over time if no other agent such as natu-
ral selection intervened.
That is, we expect that, in every generation, there will be μ substitutions in the
population, purely from the genetic drift of neutral mutations.
different orders in which these mutations could have occurred over evolutionary
time. First, site A may have been fixed in the population, then D, then C, then E,
and, finally, B. On the other hand, the order of fixation might have been E, D, A,
B, C. For five sites there are 5 × 4 × 3 × 2 × 1 = 120 possible orders. Two impor-
tant questions in understanding evolution are: How many of these alternative
evolutionary pathways are possible? And, what are the probabilities of the differ-
ent possible pathways relative to one another?
Daniel Weinreich and his colleagues have characterized in detail such a set of
adaptive walks through genetic space in their study of the evolution of antibiotic
resistance in the bacterium E. coli. Resistance to the antibiotic cefotaxime is
acquired through the accumulation of five mutations at different sites in the bac-
terial β-lactamase gene. Four of the mutations lead to amino acid changes, and
the fifth is a noncoding mutation. When all five mutations are present, the mini-
mum concentration of antibiotic required to inhibit bacterial growth increases by
a factor of 100,000. The experimenters first measured the resistance conferred by
a mutation at a given site in the presence of all 24 = 16 possible combinations of
mutants and nonmutants at the other four sites. In most combinations, but not
all, a mutant at one site was more resistant, irrespective of the state of the other
four sites. For example, a mutant at site G238S showed significant resistance, irre-
spective of the mutant or nonmutant state of the other four sites (Table 20-3). On
the other hand, the mutation at the noncoding site g4205a conferred significant
resistance in eight combinations, negligible change in resistance in six cases, and
a decrease in resistance in two combinations. This dependency of the fitness
advantage or disadvantage of a new mutation on the mutations that have previ-
ously been fixed is what the experimenters call sign epistasis.
Weinreich and colleagues measured the resistance at every stage in the tempo-
ral sequence of adding mutations one site after another. If a mutation in one of the
120 possible orderings did not confer a higher resistance, then presumably that
evolutionary path would terminate, because there would be no selection either in
favor of the mutation or even against it. They found that, of the 120 possible path-
ways through the mutational history, only 18 provided increased resistance at each
mutational step. Thus, 102/120 = 85 percent of the possible mutational pathways
to maximum resistance were not accessible to evolution by natural selection.
Finally, we assume that, in a population evolving resistance, the likelihood that a
particular accessible pathway will actually be followed is proportional to the magni-
tude of the increased resistance at each step. Under that assumption, only 10 of the
18 accessible pathways will account for 90 percent of the cases of evolution of bacte-
rial resistance to the antibiotic (Figure 20-9).
+++++
––+–– –++–– – + + – + M182T – + + + + g4205a Max. resistance
A42G G238S
allele
0.13 G2 1.41 2050 2900 4100
38 T
S 2G 82
K 1
04 A4 M
K
E1
04
E1
–––––
2G
G
––+–+ ++––+ +++–+
2
Wild type
A4
E104K
A4
0.09 G2 362 362 1650
38 5 a
S 20
4K
g4
G
0
E1
A4 2
––––+ – + – – + M182T –+–++ –++++
K
– – – + + E104T ––+++
04
E1
32.0 g4 362
20
5a
+––++
362
First mutation Second mutation Third mutation Fourth mutation Fifth mutation
F i g u r e 2 0 - 9 The mutational steps for the 10 most probable trajectories from wild-type
susceptibility to the antibiotic cefotaxime to maximal resistance. Each circle represents an
allele whose identity is denoted by a string of five + or − symbols corresponding ( left to
right) to the presence or absence of mutations g4205a, A42G, E104K, M182T, and G238S,
respectively. Numbers indicate degree of cefotaxime resistance in micrograms per milliliter.
The relative probability of each beneficial mutation is represented by the color and width of
arrows: green—wide, highest; blue—medium, moderate; purple—narrow, low; and
orange—very narrow, lowest. [ Data from D. Weinreich et al., Science 312, 2006, 111–114.]
The bacteriophage has only 11 genes, and so the experimenters were able to
record the successive changes in the DNA for all these genes and in the proteins
encoded by them during the selection process. There were 15 DNA changes in
strain TX, located in six different genes; in strain ID, there were 14 changes located
in four different genes. In seven cases, the changes to the two strains were identi-
cal, including a large deletion, but even these identical changes appeared in each
line in a different order (Table 20-4). So, for example, the change at DNA site
1533, causing a substitution of isoleucine for threonine, was the third change in
the ID strain but the 14th change in the TX strain.
Thus, the course of evolution followed by the initially identical viruses
depended on the mutations available at any given time in the cumulative selec-
tion process. Contrast this situation with the repeated origin of the sickle-cell
allele HbS: in this case, the same mutation arose and spread five times. Clearly, in
some cases there are many molecular “solutions” to selective conditions and in
others just one or very few.
Fixed species
differences Polymorphisms
Nonsynonymous a c
Synonymous b d
Ratio a/b c/d
conclude that some of the amino acid replacements in the enzyme were adaptive
changes driven by natural selection.
usually light colored (Figure 20-11). Field studies suggest that color matching of
coat color and environment protects mice against being seen by predators.
The rock pocket mice give an example of melanism—the occurrence of a dark
form within a population or species. Melanism is one of the most common types
of phenotypic variation in animals. The dark color of the fur is due to heavy depo-
sition of the pigment melanin, the most widespread pigment in the animal king-
dom. In mammals, two types of melanin are produced in melanocytes (the
pigment cells of the epidermis and hair follicles): eumelanin, which forms black
or brown pigments, and phaeomelanin, which forms yellow or red pigments. The
relative amounts of eumelanin and phaeomelanin are controlled by the products
of several genes. Two key proteins are the melanocortin 1 receptor (MC1R) and
the agouti protein. During the hair-growth cycle, the α-melanocyte-stimulating
hormone (α-MSH) binds to the MC1R protein, which triggers the induction of
pigment-producing enzymes. The agouti protein blocks MC1R activation and
inhibits the production of eumelanin.
Michael Nachman and his colleagues examined the DNA sequences of the
mc1r genes of light- and dark-colored pocket mice. They found the presence of
four mutations in the mc1r gene in dark mice that cause the MC1R protein to dif-
fer at four amino acid residues from the corresponding protein in light mice.
Findings from biochemical studies suggest that such mutations cause the MC1R
protein to be constitutively active (active at all times), bypassing the regulation of
receptor activity by the agouti protein. Indeed, mutations in mc1r are associated
with melanism in all sorts of wild and domesticated vertebrates. Many of these
mutations alter residues in the same part of the MC1R protein, and the same
mutations have occurred independently in some species (Figure 20-12).
In many ways, we can think of these dark mice as analogs of Darwin’s finches
and the lava outcrops as new “island” habitats produced by the same volcanic activ-
ity that produced the Galápagos Islands. The sandy-colored form of the mouse
appears to be the ancestral type, akin to the continental ancestral finch that colo-
nized the Galápagos. The advantage of being less visible to predators resulted in
20.5 Morphological Evolution 781
natural selection for coat color, and the invasion of the lava-rock islands Amino acid replacements cluster in
by the mice led to the spread of an allele that was favored on the black- one part of the MC1R protein
rock background and selected against on the sandy-colored background.
New mutations in the mc1r gene were essential to this adaptation to the MC1R
changing landscape. NH
The evolution of melanism in the pocket mice illustrates how fitness Extracellular
depends on the conditions in which an organism lives. The new black
mutation was favored on the lava outcrops but disfavored in the ances-
tral population living on sandy-colored terrain. Intracellular
7 C. Darwin, On the Origin of Species by Means of Natural Selection, p. 137. John Murray, London, 1859.
782 CHAPTER 2 0 Evolution of Genes and Traits
vision. Some other pigmentation genes, when mutated in fish, cause dramatic
Evolution of albinism in reductions in viability. The effects of Oca2 mutations appear, then, to be less pleio-
blind cave fish
tropic and have effects on overall fitness that are less harmful than those of muta-
tions in other fish pigmentation genes. Second, the Oca2 locus is very large,
spanning some 345 kb in humans and containing 24 exons. It presents a very large
target for random mutations that would disrupt gene function; Oca2 mutations
are therefore more likely to arise than are mutations at smaller loci.
The loss of gene function is not what we usually think about when we think
Molino cave about evolution. But gene inactivation is certainly what we should predict to hap-
pen when selective conditions change or when populations or species shift their
habitats or lifestyles and certain gene functions are no longer necessary.
Regulatory-sequence evolution
As discussed above, a major constraint on gene evolution is the potential for
harmful side effects caused by mutations in coding regions that alter protein
Surface function. These effects can be circumvented by mutations in regulatory sequences,
which play a major role in the evolution of gene regulation and body form.
The examples of body-coloration evolution we have looked at thus far have the
F i g u r e 2 0 -13 Surface forms of the
fish Astyanu mexicanus appear normal, coat or scale pattern changing over the entire body. The evolution of solid black
but cave populations, such as those from or entirely unpigmented body coloration can arise through mutations in pigmen-
the Molino and Pachón caves in Mexico, tation genes. However, many color schemes are often made up of two or more
have repeatedly evolved blindness and colors in some spatial pattern. In such cases, the expression of pigmentation genes
albinism. [Courtesy of Richard Borowsky.] must differ in areas of the body that will be of different colors. In different popu-
lations or species, the regulation of pigmentation genes must evolve by some
mechanism that does not disrupt the function of pigmentation proteins.
The species of the fruit-fly genus Drosophila display extensive diversity of body
Wing spots on fruit flies and wing markings. A common pattern is the presence of a black spot near the tip
of the wing in males (Figure 20-14). The production of the black spots requires
enzymes that synthesize melanin, the same pigment made in pocket mice. Many
genes controlling the melanin synthesis pathway have been well studied in the
model organism Drosophila melanogaster. One gene is named yellow because muta-
tions in the gene cause darkly pigmented areas of the body to appear yellowish or
tan. The yellow gene plays a central role in the development of divergent melanin
patterns. In species with spots, the Yellow protein is expressed at high levels in wing
cells that will produce the black spot, whereas in species without spots, Yellow is
expressed at a low level throughout the wing blade (Figure 20-15a).
The difference in Yellow expression between spotted and unspotted species
could be due to differences in how the yellow gene is regulated in the two species.
Either or both of two possible mechanisms could be at play: the species could dif-
fer in the spatial deployment of transcription factors that regulate yellow (that is,
changes in trans-acting sequences to the yellow gene), or they could differ in cis-
acting regulatory sequences that govern how the yellow gene is regulated. To
examine which mechanisms are involved, investigators examined the activity of
F i g u r e 2 0 -14 Drosophila yellow cis-acting regulatory sequences from different species by placing them
melanogaster males lack wing spots ( top), upstream of a reporter gene and introducing them into D. melanogaster.
whereas Drosophila biarmipes males
The yellow gene is regulated by an array of separate cis-acting regulatory
( bottom) have dark wing spots that are
displayed in a courtship ritual. This simple
sequences that govern gene transcription in different tissues and cell types and at
morphological difference is due to different times in development (Figure 20-15b). These regulatory sequences include
differences in the regulation of those controlling transcription in the larval mouthparts, the pupal thorax and abdo-
pigmentation genes. [Nicolas Gompel.] men, and the developing wing blade. It was discovered that, whereas the wing-blade
20.5 Morphological Evolution 78 3
(a)
Shallow water Deep water
Predator:
insect larva Predator:
big fish
(b)
Adult
sticklebacks
(c)
Stickleback
larvae
Loss of Pitx1 Pitx1
expression expression
(d)
Olfactory Touch Olfactory Touch
organs organs Thymus organs organs Thymus
Pitx1 Pitx1
Pelvic Caudal Pelvic Caudal
fin fin fin fin
Malarial Parasites
parasite cannot bind
Blood Blood
Purkinje Endothelial cells Purkinje Endothelial cells
F i g u r e 2 0 -17 A regulatory mutation is the second most prevalent form of malarial parasite in most tropical and sub-
in a human Duffy gene enhancer is tropical regions of the world but at present is absent from sub-Saharan Africa.
associated with resistance to malaria. The parasite gains entry to red blood cells and red-blood-cell precursors by bind-
(a) The Duffy protein (dark blue) is typically
ing to the Duffy glycoprotein (see Figure 20-17). The very high frequency of Fynull
expressed on blood cells as well as on
Purkinje cells in the brain and endothelial
homozygotes in Africa prevents P. vivax from being common there. Moreover, if
cells. (b) A high proportion of West we suppose that P. vivax was common in Africa in the past, then the Fynull allele
Africans lack Duffy expression on their red would have been selected for.
blood cells due to a mutation in a The complete absence of Duffy protein on the red blood cells of a large sub-
blood-cell enhancer (the GATA sequence population raises the question of whether the Duffy protein has any necessary
is mutated to GACA). Since the Duffy function, because it is apparently dispensable. But it is not the case that these
protein is part of the receptor for the
individuals lack Duffy protein expression altogether. The protein is expressed on
P. vivax malarial parasite (orange),
individuals with the regulatory mutation are
endothelial cells of the vascular system and the Purkinje cells of the cerebellum.
resistant to infection but have normal Duffy As with the evolution of Yellow expression in wing-spotted fruit flies and of Pitx
expression elsewhere in the body. expression in stickleback fish, the regulatory mutation at the Fy locus allows one
aspect of gene expression (in red blood cells) to change without disrupting others
(see Figure 20-17).
Modifications to coding and regulatory sequences are common means to evo-
lutionary change. They illustrate how diversity can arise without the number of
genes in a species changing. However, larger-scale mutational changes can and do
happen in DNA that result in the expansion of gene number, and this expansion
provides raw material for evolutionary innovation.
Where does the DNA for new genes come from? What are the fates of new genes?
And how do new protein functions evolve?
8
1100 11
1000
900
800 14
700
Number of species
600
16 18
500
400
20
24
300 22
200
28
26
100 30 32 34 36
F i g u r e 2 0 -18 Frequency
38 40 distribution of haploid chromosome
numbers in dicotyledonous plants.
0 1 5 10 20 30 40 50 60 [Data from Verne Grant, The Origin of
Haploid number Adaptations. Columbia University
Press, 1963.]
78 8 CHAPTER 2 0 Evolution of Genes and Traits
it may carry along additional host genetic material and insert a copy of some part
of the genome into another location (see Chapter 15).
A fourth mechanism that can expand gene number is retrotransposition.
Many animal genomes harbor retroviral-like genetic elements (see Chapter 15)
that encode reverse-transcriptase activity. Retrotransposons themselves make up
approximately 40 percent of the human genome. Occasionally, host genome
mRNA transcripts are reverse transcribed into cDNA and inserted back in the
genome, producing an intronless gene duplicate.
(a)
Duplication
(b) Gene inactivated (c) Evolution of new function (d) Function divided between genes
(pseudogenization) (neofunctionalization) (subfunctionalization)
***
F i g u r e 2 0 -19 The alternative fates of duplicated genes. (a) The duplication of a gene.
The orange, yellow, and pink boxes denote cis-regulatory elements; the beige box denotes
the coding region. After duplication, several alternative fates of the duplicates are possible:
(b) any inactivating mutation in a coding region will render that duplicate into a pseudogene,
and purifying selection will then operate on the remaining paralog; (c) mutations may arise
that alter the function of a protein and may be favored by positive selection
(neofunctionalization); (d) mutations may affect a subfunction of either duplicate, and so long
as the two paralogs together provide the ancestral functions, different subfunctions may be
retained, resulting in the evolution of two complementary loci (subfunctionalization).
(ε, β, δ, and γ) (Figure 20-20). Each cluster contains a pseudogene, Ψα and Ψb, respec-
tively, that has accumulated random, inactivating mutations.
Each cluster contains genes that have evolved distinct expression profiles, a
distinct function, or both. Of greatest interest are the two γ genes. These genes are
expressed during the last seven months of fetal development to produce fetal hemo-
globin (also known as hemoglobin F), which is composed of two α chains and two γ
chains. Fetal hemoglobin has a greater affinity for oxygen than does adult hemoglo-
F i g u r e 2 0 -2 0 Chromosomal
bin, which allows the fetus to extract oxygen from the mother’s circulation via the distribution of the genes for the a family of
placenta. At birth, up to 95 percent of hemoglobin is the fetal type, then expression globins on chromosome 16 and the β
of the adult β form replaces γ and a small amount of δ globin is also produced. The family of globins on chromosome 11 in
order of appearance of globin chains during development is orchestrated by a humans. Gene structure is shown by black
bars (exons) and colored bars (introns).
60 50 40 30 20 10 kb
Chromosome 16
ζ2 ζ1 ψα α2 α1
5′ 3′
Chromosome 11
Gγ Aγ ψβ δ β
5′ 3′
79 0 CHAPTER 2 0 Evolution of Genes and Traits
complex set of cis-acting regulatory sequences and, remarkably, follows the order
of genes on each chromosome.
The γ genes are restricted to placental mammals. Their distinct developmen-
tal regulation and protein products mean that these duplicates have evolved dif-
ferences in function that have contributed to the evolution of the placental
lifestyle. Interestingly, regulatory variants of these genes are known that cause
expression of the fetal hemoglobin to persist into childhood and adulthood. These
naturally occurring variants appear to moderate the severity of sickle-cell anemia
by suppressing the levels of HbS produced. One widespread strategy for the treat-
ment of sickle-cell anemia is to administer drugs that stimulate the reactivation of
fetal-hemoglobin expression.
s u mma ry
The theory of evolution by natural selection explains the previous steps taken may affect whether a new mutation is
changes that take place in populations of organisms as being favored, disfavored, or neutral.
the result of changes in the relative frequencies of different Before the advent of molecular genetics, it was not possible
variants in the population. If there is no variation within a to know whether independent evolutionary events might have
species for some trait, there can be no evolution. Moreover, given rise to the same adaptation multiple times. By pinpoint-
that variation must be influenced by genetic differences. If ing the genes and exact mutations involved in changes in func-
differences are not heritable, they cannot evolve, because tion, we now appreciate that evolution can and does repeat
the reproductive advantage of a variant will not carry across itself by acting on the same genes to produce similar results in
generational lines. It is crucial to understand that the muta- independent cases. For example, changes to the same genes
tional processes that generate variation within the genome are responsible for independently arising cases of melanism
act at random, but that the selective process that sorts out and albinism in some vertebrates or for the loss of pelvic spines
the advantageous and disadvantageous variants is not in different stickleback-fish populations. Evolution may repeat
random. itself by altering the very same nucleotide in the case of inde-
The ability to study evolution at the level of DNA and pendently arising sickle-cell mutations that lead to adaptive
proteins has transformed our understanding of the evolu- resistance to malaria.
tionary process. Before we had the ability to study evolu- An important constraint on the evolution of coding
tion at the molecular level, there was no inkling that much sequences is the potentially harmful side effects of mutations.
of evolution was in fact a result of genetic drift and not If a protein serves multiple functions in different tissues, as is
natural selection. A great deal of molecular evolution the case for many genes involved in the regulation of develop-
seems to be the replacement of one protein sequence mental processes, mutations in coding sequences may affect
by another one of equivalent function. Among the evi- all functions and decrease fitness. The potential pleiotropic
dence for the prevalence of neutral evolution is that the effects of coding mutations can be circumvented by muta-
number of amino acid differences between two different tions in noncoding regulatory sequences. Mutations in these
species in some molecule—for example, hemoglobin—is sequences may selectively change gene expression in only one
directly proportional to the number of generations since tissue or body part and not others. The evolution of cis-acting
their divergence from a common ancestor in the evolution- regulatory sequences is central to the evolution of morpho-
ary past. We would not expect such a “molecular clock” logical traits and the expression of toolkit genes that control
with a constant rate of change to exist if the selection of development.
differences were dependent on particular changes in the New protein functions often arise through the duplication
environment. of genes and subsequent mutation. New DNA may arise by
So much sequence evolution is neutral that there is no duplication of the entire genome (polyploidy), a frequent
simple relation between the amount of change in a gene’s occurrence in plants, or by various mechanisms that produce
DNA sequence and the amount of change, if any, in the duplicates of individual genes or sets of genes. The fate of
encoded protein’s function. Some protein functions can duplicate genes depends a great deal on the nature of muta-
change through a single amino acid substitution, whereas tions acquired after duplication. Possible fates are the inacti-
others require a suite of substitutions brought about vation of one duplicate, the splitting of function between two
through cumulative selection. Such multistep, adaptive duplicates, or the gain of new functions.
walks may follow different paths even when the conditions Overall, genetic evolution is subject to historical contin-
of natural selection are the same. This is because the paths gency and chance, but it is constrained by the necessity of
available to any population at any given moment depend on organisms to survive and reproduce in a constantly changing
the chance occurrence of mutations that may not arise in world. “The fittest” is a conditional status, subject to change as
the same order in different populations. Furthermore, the the planet and habitats change.
Problems 791
ke y te r ms
adaptation (p. 764) molecular clock (p. 773) retrotransposition (p. 788)
adaptive walk (p. 774) natural selection (p. 762) sign epistasis (p. 775)
cumulative selection (p. 774) neofunctionalization (p. 788) subfunctionalization (p. 788)
gene duplication (p. 787) nonsynonymous substitution (p. 773) synonymous substitution (p. 773)
gene family (p. 786) pseudogene (p. 788)
so lv ed p r ob l ems
SOLVED PROBLEM 1. Two closely related species of bacteria species grew as well on the particular nutrient that the
are found to be fixed for two different electrophoretically enzyme broke down.
detected alleles at a locus encoding an enzyme involved in Ideally, in order to measure fitness differences, one
breaking down a nutrient. How could you test experimen- would replace the enzyme-coding region of one species
tally whether the divergence in enzyme sequences may with the enzyme-coding region from the second species
have caused differences in function and fitness? and vice versa. Then the growth of each wild-type and
transgenic strain could be compared on the same nutri-
Solution ent-containing media, with growth being an indicator of
In order to test whether the enzymes have different func- fitness. If there are differences in the relative fitness of
tional properties, one could devise both in vitro and in vivo the transgenic and wild-type strains, then it is possible
experiments. If the substrates and properties of the enzyme that the enzymes have diverged under natural selec-
are known, one could purify the enzyme from each species tion. If not, then it is likely that the enzymes have
and measure directly whether there are functional differ- evolved neutrally or that the effect of selection is too
ences. Alternatively, an indirect test would be whether each small to measure experimentally.
p r ob l ems
Most of the problems are also available for review/grading through the http://www.whfreeman.com/
launchpad/iga11e.
W o r king with the F ig u r es 7. I n Figure 20-18, what is the evidence that polyploid for-
1. In Figure 20-5, note that the difference in survival rates be- mation has been important in plant evolution?
tween AS and AA genotypes declines as children get old- B asic P r ob l ems
er. Offer one possible explanation for this observation.
8. Compare Darwin’s description of natural selection as
2. Examining Figure 20-8, explain why the rate of evolution quoted on page 765 with Wallace’s description of the ten-
at nonsynonymous sites is lower. Do you expect this to be dency of varieties to depart from the original type quoted
true only of globin genes or of most genes? below it, on the same page. What ideas do they have in
3. From Table 20-3, would you expect the noncoding muta- common?
tion g4205a to be fixed before or after the coding muta- 9. What are the three principles of the theory of evolution
tion G238S in a population of bacteria evolving resis- by natural selection?
tance to the antibiotic cefotaxime? Give at least two 10. Why was the neutral theory of molecular evolution a
reasons for your answer. revolutionary idea?
4. E
xamining Table 20-4, what do you think would be the 11. What would you predict to be the relative rate of synony-
order of mutations fixed during selection in a third evolv- mous and nonsynonymous substitutions in a globin
ing virus line? Would the mutations become fixed in the pseudogene?
same order as the TX or ID virus? 12. Are AS heterozygotes completely resistant to malarial
5. Examining Table 20-5, how would the interpretation of infection? Explain the evidence for your answer.
the McDonald–Kreitman test results differ if the number
C ha l l enging P r ob l ems
of nonsynonymous observed species differences was
1 instead of 7? www
13. Unpacking Problem If the mutation rate to a new
6. U
sing Figure 20-17, explain how the mutation in the www is 10 –5, assuming there is no migration, how large
allele
GATA sequence of the Duffy gene imparts resistance to must isolated populations be to prevent chance differ-
P. vivax infection. entiation among them in the frequency of this allele?
792 CHAPTER 2 0 Evolution of Genes and Traits
14. Glucose-6-phosphate dehydrogenase (G6PD) is a criti- loss of spot formation to entail coding or noncoding
cal enzyme involved in the metabolism of glucose, changes in pigmentation genes? How would you test
especially in red blood cells. Deficiencies in the enzyme which is the case?
are the most common human enzyme defect and occur 21. It has been claimed that “evolution repeats itself.” What
at a high frequency in certain populations of East African is the evidence for this claim from
children.
a. the analysis of HbS alleles?
a. Offer one hypothesis for the high incidence of G6PD
b. the analysis of antibiotic resistance in bacteria?
mutations in East African children.
c. the analysis of experimentally selected bacteriophage
b. How would you test your hypothesis further?
φX174?
c. Scores of different G6PD mutations affecting en-
d. the analysis of Oca2 mutations in cave fish?
zyme function have been found in human populations.
Offer one explanation for the abundance of different e. the analysis of stickleback Pitx1 loci?
G6PD mutations. 22. What is the molecular evidence that natural selection in-
15. Large differences in HbS frequencies among Kenyan and cludes the “rejection of injurious change”?
Ugandan tribes had been noted in surveys conducted by 23. What are three alternative fates of a new gene
researchers other than Tony Allison. These researchers duplicate?
offered alternative explanations different from the ma- 24. What is the evidence that gene duplication has been the
larial linkage proposed by Allison. Offer one counterar- source of the α and β gene families for human
gument to, or experimental test for, the following alter- hemoglobin?
native hypotheses: 25. DNA-sequencing studies for a gene in two closely related
a. The mutation rate is higher in certain tribes. species produce the following numbers of sites that vary:
b. There is a low degree of genetic mixing among tribes,
so the allele rose to high frequency through inbreeding Synonymous polymorphisms 50
in certain tribes. Nonsynonymous polymorphisms 20
Synonymous species differences 18
16. How many potential evolutionary paths are there for an
Nonsynonymous species differences 2
allele to evolve six different mutations? Seven different
mutations? Ten different mutations?
Does this result support neutral evolution of the gene?
17. The MC1R gene affects skin and hair color in humans. Does it support an adaptive replacement of amino
There are at least 13 polymorphisms of the gene in acids? What explanation would you offer for the
European and Asian populations, 10 of which are non- observations?
synonymous. In Africans, there are at least 5 polymor-
26. In humans, two genes encoding the opsin visual pigments
phisms of the gene, none of which are nonsynonymous.
that are sensitive to green and red wavelengths of light
What might be one explanation for the differences in
are found adjacent to one another on the X chromosome.
MC1R variation between Africans and non-Africans?
They encode proteins that are 96 percent identical.
18. Opsin proteins detect light in photoreceptor cells of the Nonprimate mammals possess just one gene encoding an
eye and are required for color vision. The nocturnal owl opsin sensitive to the red/green wavelength.
monkey, the nocturnal bush baby, and the subterranean
a. Offer one explanation for the presence of the two
blind mole rat have different mutations in an opsin gene
opsin genes on the human X chromosome.
that render it nonfunctional. Explain why all three spe-
cies can tolerate mutations in this gene that operates in b. How would you test your explanation further and
most other mammals. pinpoint when in evolutionary history the second gene
arose?
19. Full or partial limblessness has evolved many times in
vertebrates (snakes, lizards, manatees, whales). Do you 27. About 9 percent of Caucasian males are color-blind and
expect the mutations that occurred in the evolution of cannot distinguish red-colored from green-colored
limblessness to be in the coding or noncoding sequences objects.
of toolkit genes? Why? a. Offer one genetic model for color blindness.
20. Several Drosophila species with unspotted wings are de- b. Explain why and how color blindness has reached a
scended from a spotted ancestor. Would you predict the frequency of 9 percent in this population.
A Brief Guide to Model Organisms
Escherichia coli • Saccharomyces cerevisiae • Neurospora crassa • Arabidopsis thaliana
Caenorhabditis elegans • Drosophila melanogaster • Mus musculus
793
Escherichia coli Genetic “Vital Statistics”
Genome size: 4.6 Mb
Chromosomes: 1, circular
Number of genes: 4000
Percentage with
Key organism for studying: human homologs: 8%
• Transcription, translation, replication, recombination Average gene size: 1 kb, no introns
• Mutation Transposons: Strain specific, , 60 copies per
• Gene regulation genome
• Recombinant DNA technology Genome sequenced in: 1997
794
A Brief Guide to Model Organisms 795
The ascomycete S. cerevisiae, alias “baker’s yeast,” “budding yeast,” or simply “yeast,” has
been the basis of the baking and brewing industries since antiquity. In nature, it proba-
bly grows on the surfaces of plants, using exudates as nutrients, although its precise niche
is still a mystery. Although laboratory strains are mostly haploid, cells in nature can be
diploid or polyploid. In approximately 70 years of genetic research, yeast has become “the
E. coli of the eukaryotes.” Because it is haploid and unicellular, and forms compact colonies
on plates, it can be treated in much the same way as a bacterium. However, it has eukary-
otic meiosis, cell cycle, and mitochondria, and these features have been at the center of the
yeast success story.
Yeast cells, Saccharomyces cerevisiae. [SciMAT/Science Source.]
796
A Brief Guide to Model Organisms 797
Neurospora crassa, the orange bread mold, was one of the first eukaryotic microbes to be
adopted by geneticists as a model organism. Like yeast, it was originally chosen because of its
haploidy, its simple and rapid life cycle, and the ease with which it can be cultured. Of particu-
lar significance was the fact that it will grow on a medium with a defined set of nutrients, mak-
ing it possible to study the genetic control of cellular chemistry. In nature, it is found in many
parts of the world growing on dead vegetation. Because fire activates its dormant ascospores, it
is most easily collected after burns—for example, under the bark of burnt trees and in fields of
crops such as sugar cane that are routinely burned before harvesting.
ability to grow on defined medium, has made it an or- asexual spore (called a conid-
ganism of choice for studying biochemical genetics of ium) germinates to produce a
nutrition and nutrient uptake. germ tube that extends at its
Asci
Another unique feature of Neurospora (and re- tip. Progressive tip growth and
lated fungi) allows geneticists to trace the steps of branching produce a mass of
single meioses. The four haploid products of one branched threads (called hy-
meiosis stay together in a sac called an ascus. Each phae), which forms a compact
Meiosis
of the four products of meiosis undergoes a further colony on growth medium.
mitotic division, resulting in a linear octad of eight Because hyphae have no cross
ascospores (see page 103). This feature makes Neu- walls, a colony is essentially
rospora an ideal system in which to study crossing one cell containing many hap-
over, gene conversion, chromosomal rearrange- loid nuclei. The colony buds
ments, meiotic nondisjunction, and the genetic con- off millions of asexual spores, Synchronous division and fusion
to form diploid meiocytes
trol of meiosis itself. Chromosomes, although small, which can disperse in air and
are easily visible, and so meiotic processes can be repeat the asexual cycle.
studied at both the genetic and the chromosomal In N. crassa’s sexual Cross-
fertilization
levels. Hence, in Neurospora, fundamental studies cycle, there are two identical-
have been carried out on the mechanisms underly- looking mating types MAT-A
ing these processes (see page 133). and MAT-a, which can be
viewed as simple “sexes.” As
in yeast, the two mating types
Genetic analysis
are determined by two alleles
Genetic analysis is straightforward (see page 103). of one gene. When colonies of Maternal nucleus Maternal nucleus
Stock centers provide a wide range of mutants af- different mating type come Mating type A Mating type a
fecting all aspects of the biology of the fungus. into contact, their cell walls
Neurospora genes can be mapped easily by crossing and nuclei fuse. Many transient diploid nuclei arise, each of which undergoes
them with a bank of strains with known mutant loci meiosis, producing an octad of ascopores. The ascospores germinate and pro-
or known RFLP alleles. Strains of opposite mating duce colonies exactly like those produced by asexual spores.
type are crossed simply by growing them together.
A geneticist with a handheld needle can pick out a Length of life cycle: 4 weeks for sexual cycle
single ascospore for study. Hence, analyses in which
798
A Brief Guide to Model Organisms 799
Main contributions
George Beadle and Edward Tatum used Neurospora as the model
organism in their pioneering studies on gene–enzyme relations,
Wild-type (left) and mutant (right) Neurospora grown in a petri in which they were able to determine the enzymatic steps in the
dish. [Courtesy of Anthony Griffiths/Olivera Gavric.] synthesis of arginine (see page 224). Their work with Neurospora
established the beginning of molecular genetics. Many compa-
either complete asci or random ascospores are used are rapid and rable studies on the genetics of cell metabolism with the use of
straightforward. Neurospora followed.
Because Neurospora is haploid, newly obtained mutant
phenotypes are easily detected with the use of various types of Mutant for Mutant for Wild type
gene 1 gene 2
screens and selections. A favorite system for study of the mecha-
nism of mutation is the ad-3 gene, because ad-3 mutants are pur-
ple and easily detected.
Although vegetative diploids of Neurospora are not readily ob-
tainable, geneticists are able to create a “mimic diploid,” useful for
complemenation tests and other analyses requiring the presence
of two copies of a gene (see page 229). Namely, the fusion of two
different strains produces a heterokaryon, an individual containing
two different nuclear types in a common cytoplasm. Heterokaryons
also enable the use of a version of the specific-locus test, a way to
recover mutations in a specific recessive allele. (Cells from a 1/m gene 1 gene 2
heterokaryon are plated and m/m colonies are sought.) Nonpigment enzyme 1 Yellow enzyme 2 Normal
precursor precursor orange
carotenoid
pigment
800
A Brief Guide to Model Organisms 801
Arabidopisis mutants. (Left ) Wild-type flower of Arabidopsis. (Middle ) The agamous mutation
(ag), which results in flowers with only petals and sepals (no reproductive structures). (Right ) A
double-mutant ap1, cal, which makes a flower that looks like a cauliflower. (Similar mutations in
cabbage are probably the cause of real cauliflowers.) [George Haughn.]
Caenorhabditis elegans may not look like much under a microscope, and, indeed, this
1-mm-long soil-dwelling roundworm (a nematode) is relatively simple as animals go. But
that simplicity is part of what makes C. elegans a good model organism. Its small size, rapid
growth, ability to self, transparency, and low number of body cells have made it an ideal
choice for the study of the genetics of eukaryotic development.
Pharynx Ovary
Intestine Eggs Rectum
Anus
Oviduct Oocytes
Uterus
Vulva
Special features simply by watching the worms under a light microscope. The re-
Geneticists can see right through C. elegans. Unlike other multicel- sults of such studies have shown that C. elegans’s development is
lular model organisms, such as fruit flies or Arabidopsis, this tiny tightly programmed and that each worm has a surprisingly small
worm is transparent, making it efficient to screen large populations and consistent number of cells (959 in hermaphrodites and 1031 in
for interesting mutations affecting virtually any aspect of anatomy males). In fact, biologists have tracked the fates of specific cells as
or behavior. Transparency also lends itself well to studies of devel- the worm develops and have determined the exact pattern of cell
opment: researchers can directly observe all stages of development division leading to each adult organ. This effort has yielded a lin-
eage pedigree for every adult cell (see page 497).
P3,P4,P5,
W P1 P2 P6,P7,P8,P9 P10 Genetic analysis
Because the worms are small and reproduce quickly and pro-
lifically (selfing produces about 300 progeny and crossing yields
L1
× ×
P1.p;vh
P2.p;vh
W.pa
W.pp
Pn.p;vh
Life Cycle
P1.apa
P1.app
W.aaa
W.aap
P2.aap
P2.apa
P2.app
Pn.apa
Pn.app
P1.aaaa
P1.aaap
P2.aaaa
P2.aaap
P10.apa
P10.app
Pn.aaaa
Pn.aaap
P10.aaaa
P10.aaap
P10.aapa
P10.aapp
P10.ppaa
P10.ppap
P10.pppa
P10.paaa
P10.paap
P10.papa
P10.papp
P10.ppppa
P10.ppppp
802
A Brief Guide to Model Organisms 803
about 1000), they produce large populations of progeny that can (a) Syncitial region Gonad Nuclei
Integrated array
of these nuclei will incorporate the DNA (see Chapter 10). Hypodermis
Vulva
Hypodermis
The fruit fly Drosophila melanogaster (loosely translated as “dusky syrup-lover”) was one of
the first model organisms to be used in genetics. It was chosen in part because it is readily
available from ripe fruit, has a short life cycle of the diploid type, and is simple to culture and
cross in jars or vials containing a layer of food. Early genetic analysis showed that its inheri-
tance mechanisms have strong similarities to those of other eukaryotes, underlining its role as
a model organism. Its popularity as a model organism went into decline during the years when
E. coli, yeast, and other microorganisms were being developed as molecular tools. However,
Drosophila has experienced a renaissance because it lends itself so well to the study of the
genetic basis of development, one of the central questions of biology. Drosophila’s importance
Polytene chromosomes. [William
as a model for human genetics is demonstrated by the discovery that approximately 60 per-
M. Gelbart, Harvard University.]
cent of known disease-causing genes in humans, as well as 70 percent of cancer genes, have
counterparts in Drosophila.
Special features
Drosophila came into vogue as an experimental organism in the
early twentieth century because of features common to most Life Cycle
model organisms. It is small (3 mm long), simple to raise (origi-
nally, in milk bottles), quick to reproduce (only 12 days from egg Drosophila has a short diploid life cycle that lends itself well
to adult), and easy to obtain (just leave out some rotting fruit). It to genetic analysis. After hatching from an egg, the fly de-
proved easy to amass a large range of interesting mutant alleles velops through several larval stages and a pupal stage before
that were used to lay the ground rules of transmission genetics. emerging as an adult, which soon becomes sexually mature.
Early researchers also took advantage of a feature unique to the Sex is determined by X and Y sex chromosomes (XX is fe-
fruit fly: polytene chromosomes (see page 638). In salivary glands male, XY is male), although, in contrast with humans, the
and certain other tissues, these “giant chromosomes” are pro- number of X’s in relation to the number of autosomes deter-
duced by multiple rounds of DNA replication without chromo- mines sex (see page 54).
somal segregation. Each polytene chromosome displays a unique Total length of life cycle: 12 days from egg to adult
banding pattern, providing geneticists with landmarks that could
be used to correlate recombination-based maps with actual chro- Adult
mosomes. The momentum provided by these early advances,
along with the large amount of accumulated knowledge about
the organism, made Drosophila an attractive genetic model.
1 day
1 1
Genetic analysis 3 2 –4 2 days
1 day
1
2 2 –3 days
Second instar
Third instar
Wild type Bar eyes Vestigial wings
1 day
Two morphological mutants of Drosophila, with the wild type for
comparison.
804
A Brief Guide to Model Organisms 805
To perform a cross, males and females are placed together in The normal thoracic and
a jar, and the females lay eggs in semisolid food covering the jar’s abdominal segments of
Wing Drosophila.
bottom. After emergence from the pupae, offspring can be anes-
thetized to permit counting members of phenotypic classes and to Notum Haltere
distinguish males and females (by their different abdominal stripe (rudimentary wing)
patterns). However, because female progeny stay virgin for only
T2 A5-A8
a few hours after emergence from the pupae, they must immedi- T1 T3
A4
ately be isolated if they are to be used to make controlled crosses. A1 A2 A3
Crosses designed to build specific gene combinations must be
carefully planned, because crossing over does not take place in
Drosophila males. Hence, in the male, linked alleles will not re-
combine to help create new combinations.
For obtaining new recessive mutations, special breeding pro- Their discoveries opened the door to other pioneering studies:
grams (of which the prototype is Muller’s ClB test) provide con- • Early studies on the kinetics of mutation induction and
venient screening systems. In these tests, mutagenized flies are the measurement of mutation rates were performed
crossed with a stock having a balancer chromosome (see page 645). with the use of Drosophila. Muller’s ClB test and similar
Recessive mutations are eventually brought to homozygosity by in- tests provided convenient screening methods for recessive
breeding for one or two generations, starting with single F1 flies. mutations.
• Chromosomal rearrangements that move genes adjacent to
heterochromatin were used to discover and study position-
Techniques of Genetic Modification effect variegation.
• In the last part of the twentieth century, after the
Standard mutagenesis: identification of certain key mutational classes such as
Chemical (EMS) and radiation Random germ-line and homeotic and maternal-effect mutations, Drosophila
somatic mutations assumed a central role in the genetics of development, a
role that continues today (see Chapter 13). Maternal-effect
Transgenesis:
mutations that affect the development of embryos, for
P element mediated Random insertion
example, have been crucial in the elucidation of the genetic
Targeted gene knockouts: determination of the Drosophila body plan; these mutations
Induced replacement Null ectopic allele exits are identified by screening for abnormal developmental
and recombines with phenotypes in the embryos from a specific female.
wild-type allele Techniques such as enhancer trap screens have enabled the
RNAi Mimics targeted knockout discovery of new regulatory regions in the genome that
affect development. Through these methods and others,
Drosophila biologists have made important advances in
understanding the determination of segmentation and of
Genetic engineering the body axes. Some of the key genes discovered, such as
Transgenesis. Building transgenic flies requires the help of a the homeotic genes, have widespread relevance in animals
Drosophila transposon called the P element. Geneticists construct generally.
a vector that carries a transgene flanked by P-element repeats.
The transgene vector is then injected into a fertilized egg along
with a helper plasmid containing a transposase. The transposase
allows the transgene to jump randomly into the genome in ger-
minal cells of the embryo (see Chapter 15).
Targeted knockouts. Targeted gene knockouts can be accom-
plished by, first, introducing a null allele transgenically into an (a) Bicoid protein (b) Hairy pair-rule protein
Because humans and most domesticated animals are mammals, the genetics of mammals is of
great interest. However, mammals are not ideal for genetics: they are relatively large in size
compared with other model organisms, thereby taking up large and expensive facilities, their
life cycles are long, and their genomes are large and complex. Compared with other mam-
mals, however, mice (Mus musculus) are relatively small, have short life cycles, and are easily
obtained, making them an excellent choice for a mammal model. In addition, mice had a head
start in genetics because mouse “fanciers” had already developed many different interesting
lines of mice that provided a source of variants for genetic analysis. Research on the Mende-
lian genetics of mice began early in the twentieth century.
Special features the mouse’s success as a model organism; these similarities allow
Mice are not exactly small, furry humans, but their genetic mice to be treated as “stand-ins” for their human counterparts in
makeup is remarkably similar to ours. Among model organisms, many ways. Potential mutagens and carcinogens that we suspect
the mouse is the one whose genome most closely resembles the of causing damage to humans, for example, are tested on mice,
human genome. The mouse genome is about 14 percent smaller and mouse models are essential in studying a wide array of
than that of humans (the human genome is 3000 Mb), but it has human genetic diseases.
approximately the same number of genes (current estimates
are just under 30,000). A surprising 99 percent of mouse genes Genetic analysis
seem to have homologs in humans. Furthermore, a large propor-
Mutant and “wild type” (though not actually from the wild) mice
tion of the genome is syntenic with that of humans; that is, there
are easy to come by: they can be ordered from large stock centers
are large blocks containing the same genes in the same relative
that provide mice suitable for crosses and various other types of ex-
positions (see page 531). Such genetic similarities are the key to
periments. Many of these lines are derived from mice bred in past
centuries by mouse fanciers. Controlled crosses can be performed
simply by pairing a male with a nonpregnant female. In most cases,
Human chromosomes
the parental genotypes can be provided by male or female.
Life Cycle
Mice have a familiar di 2n 2n
Adult Adult
ploid life cycle, with an XY
sex-determination system Meiosis Meiosis
similar to that of humans.
Litters are from 5 to 10 pups; n n n n Gametes n n n n
however, the fecundity of
2n 2n
females declines after about Zygote Zygote
9 months, and so they rarely
have more than five litters. Many Many
1 2 3 4 5 6 7 8 9 10 11 12 mitoses mitoses
Total length of life cycle:
A mouse–human synteny map of 12 chromosomes from the 10 weeks from birth to giv- 2n 2n
human genome. Color coding is used to depict the regional matches of ing birth, in most labora- Adult Adult
each block of the human genome to the corresponding sections of the tory strains
mouse genome. Each color represents a different mouse chromosome.
806
A Brief Guide to Model Organisms 807
Most of the standard estimates of mammalian mutation rates Production of ES cells with a gene knockout
809
Appendix B
Bioinformatic Resources for Genetics This outstanding site contains the reference sequence and working
draft assemblies for a large collection of genomes and a number of tools
and Genomics for exploring those genomes. The Genome Browser zooms and scrolls
over chromosomes, showing the work of annotators worldwide. The
“You certainly usually find something, if you look, but it is not always Gene Sorter shows expression, homology, and other information on
quite the something you were after.” — The Hobbit, J. R. R. Tolkien groups of genes that can be related in many ways. Blat is an alignment
tool that quickly maps sequences to the genome. The Table Browser
The field of bioinformatics encompasses the use of computational tools provides access to the underlying database.
to distill complex data sets. Genetic and genomic data are so diverse • EBI http://www.ncbi.nlm.nih.gov/
that it has become a considerable challenge to identify the authoritative • DDBJ http://www.ddbj.nig.ac.jp/
site(s) for a specific type of information. Furthermore, the landscape of
Web-accessible software for analyzing this information is constantly The harsh reality is that, with so much biological information, the goal of
changing as new and more powerful tools are developed. This appen making these online resources “transparent” to the user is not fully
dix is intended to provide some valuable starting points for exploring achieved. Thus, exploration of these sites will entail familiarizing your
the rapidly expanding universe of online resources for genetics and self with the contents of each site and exploring some of the ways the site
genomics. helps you to focus your queries so you get the right answer(s). For one
example of the power of these sites, consider a search for a nucleotide
sequence at NCBI. Databases typically store information in separate bins
called “fields.” By using queries that limit the search to the appropriate
1. Finding Genetic and Genomic Web Sites field, a more directed question can be asked. Using the “Limits” option, a
Here are listed several central resources that contain large lists of rele query phrase can be used to identify or locate a specific species, type of
vant Web sites: sequence (genomic or mRNA), gene symbol, or any of several other data
fields. Query engines usually support the ability to join multiple query
• T
he scientific journal called Nucleic Acids Research (NAR) statements together. For example: retrieve all DNA sequence records
publishes a special issue every January listing a wide variety of that are from the species Caenorhabditis elegans AND that were pub
online database resources at http://nar.oupjournals.org/. lished after January 1, 2000. Using the “History” option, the results of
• The Virtual Library has Model Organisms and Genetics subdivi multiple queries can be joined together, so that only those hits common
sions with rich arrays of Internet resources at http://ceolas.org/VL/ to multiple queries will be retrieved. By proper use of the available query
mo/. options on a site, a great many false positives can be computationally
eliminated while not discarding any of the relevant hits.
• The National Human Genome Research Institute (NHGRI)
Because protein sequence predictions are a natural part of the
maintains a list of genome Web sites at http://www.nhgri.nih.gov/
analysis of DNA and mRNA sequences, these same sites provide access
10000375/. to a variety of protein databases. One important protein database is
• The Department of Energy (DOE) maintains a Human Genome Project SwissProt/TrEMBL. TrEMBL sequences are automatically predicted
site at http://genomicscience.energy.gov/. from DNA and/or mRNA sequences. SwissProt sequences are curated,
• SwissProt maintains Amos’ WWW links page at http://www.expasy.ch/ meaning that an expert scientist reviews the output of computational
alinks.html. analysis and makes expert decisions about which results to accept or
reject. In addition to the primary protein sequence records, SwissProt
also offers databases of protein domains and protein signatures (amino
acid sequence strings that are characteristic of proteins of a particular
2. General Databases type). The SwissProt home page is http://www.ebi.ac.uk/swissprot/.
Nucleic Acid and Protein Sequence Databases By international
agreement, three groups collaborate to house the p rimary DNA and Protein Domain Databases The functional units within proteins are
mRNA sequences of all species: the National Center for Biotechnology thought to be local folding regions called domains. Prediction of
Information (NCBI) houses GenBank, the European Bioinformatics domains within newly discovered proteins is one way to guess at their
Institute (EBI) houses the European Molecular Biology Laboratory function. Numerous protein domain databases have emerged that pre
(EMBL) Data Library, and the National Institute of Genetics in Japan dict domains in somewhat different ways. Some of the individual
houses the DNA DataBase of Japan (DDBJ). domain databases are Pfam, PROSITE, PRINTS, SMART, ProDom,
TIGRFAMs, BLOCKS, and CDD. InterPro allows querying multiple pro
Primary DNA sequence records, called accessions, are submitted
tein domain databases simultaneously and presents the combined
by individual research groups. In addition to providing access to these
results. Web sites for some domain databases are
DNA sequence records, these sites provide many other data sets. For
example, NCBI also houses RefSeq, a summary synthesis of informa • InterPro http://www.ebi.ac.uk/interpro/
tion on the DNA sequences of fully sequenced genomes and the gene • Pfam http://www.sanger.ac.uk/Software/Pfam/index.shtml
products that are encoded by these sequences. • PROSITE http://www.expasy.ch/prosite/
Many other important features can be found at the NCBI, EBI, and • PRINTS http://www.bioinf.man.ac.uk/dbbrowser/PRINTS/
DDBJ sites. Home pages and some other key Web sites are
• SMART http://smart.embl-heidelberg.de/
• NCBI http://www.ncbi.nlm.nih.gov/ • ProDom http://prodom.prabi.fr/
• NCBI-Genomes http://www.ncbi.nlm.nih.gov/Genomes/index.html • TIGRFAMs http://www.tigr.org/TIGRFAMs/
• NCBI-RefSeq http://www.ncbi.nlm.nih.gov/refseq • BLOCKS http://blocks.fhcrc.org/
• The UCSC Genome Bioinformatics Site http://genome.ucsc.edu/ • CDD http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml
810
Appendix B 811
Protein Structure Databases The representation of three-dimensional • Washington University School of Medicine Genome
protein structures has become an important aspect of global molecular Sequencing Center http://genome.wustl.edu/
analysis. Three-dimensional structure databases are available from the • Baylor College of Medicine Human Genome Sequencing
major DNA/protein sequence database sites and from independent Center http://www.hgsc.bcm.tmc.edu
protein structure databases, notably the Protein DataBase (PDB). NCBI • Sanger Institute http://www.sanger.ac.uk/
has an application called Cn3D that helps in viewing PDB data.
• DOE Joint Genomics Institute http://www.jgi.doe.gov/
• PDB http://www.rcsb.org/pdb/ • International HapMap Project http://hapmap.ncbi.nlm.nih.gov/
• Cn3D http://www.ncbi.nlm.nih.gov/Structure/CN3D/cn3d.shtml
4. Relationships of Genes Within
3. Specialized Databases
and Between Databases
Organism-Specific Genetic Databases In order to mass some classes of
genetic and genomic information, especially phenotypic information, Gene products may be related by virtue of sharing a common evolu
expert knowledge of a particular species is required. Thus, MODs (model tionary origin, sharing a common function, or participating in the same
organism databases) have emerged to fulfill this role for the major pathway.
genetic systems. These include databases for Saccharomyces cerevisiae
(SGD), Caenorhabditis elegans (WormBase), Drosophila melanogaster BLAST: Identification of Sequence Similarities Evidence for a com
(FlyBase), the zebra fish Danio rerio (ZFIN), the mouse Mus musculus mon evolutionary origin comes from the identification of sequence
(MGI), the rat Rattus norvegicus (RGD), Zea mays (MaizeGDB), and Ara- similarities between two or more sequences. One of the most important
bidopsis thaliana (TAIR). Home pages for these MODs can be found at tools for identifying such similarities is BLAST (Basic Local Alignment
Search Tool), which was developed by NCBI. BLAST is really a suite of
• SGD http://genome-www.stanford.edu/Saccharomyces/ related programs and databases in which local matches between long
• WormBase http://www.wormbase.org/ stretches of sequence can be identified and ranked. A query for similar
• FlyBase http://flybase.org/ DNA or protein sequences through BLAST is one of the first things that
• ZFIN http://zfin.org/ a researcher does with a newly sequenced gene. Different sequence
databases can be accessed and organized by type of sequence (refer
• MGI http://www.informatics.jax.org/
ence genome, recent updates, nonredundant, ESTs, etc.), and a particu
• RGD http://rgd.mcw.edu/ lar species or taxonomic group can be specified. One BLAST routine
• MaizeGDB http://www.maizegdb.org matches a query nucleotide sequence translated in all six frames to a
• TAIR http://www.arabidopsis.org/ protein sequence database. Another matches a protein query sequence
to the six-frame translation of a nucleotide sequence database. Other
Human Genetics and Genomics Databases Because of the impor BLAST routines are customized to identify short sequence pattern
tance of human genetics in clinical as well as basic research, a diverse matches or pair-wise alignments, to screen genome-sized DNA segments,
set of human genetic databases has emerged. This set includes a human and so forth, and can be accessed through the same top-level page:
genetic disease database called Online Mendelian Inheritance in Man
• NCBI-BLAST http://www.ncbi.nlm.nih.gov/BLAST/
(OMIM), a database of brief descriptions of human genes called Gene
Cards, a compilation of all known mutations in human genes called
Human Gene Mutation Database (HGMD), a database of the current Function Ontology Databases Another approach to developing rela
sequence map of the human genome called the Golden Path, and some tionships among gene products is by assigning these products to func
links to human genetic disease databases: tional roles based on experimental evidence or prediction. Having a
common way of describing these roles, regardless of the experimental
• OMIM http://www.omim.org system, is then of great importance. A group of scientists from different
• GeneCards http://genecards.org databases are working together to develop a common set of hierarchi
• HGMD http://www.hgmd.org cally arranged terms—an ontology—for function (biochemical event),
process (the cellular event to which a protein contributes), and subcel-
• Golden Path http://genome.ucsc.edu/
lular location (where a product is located in a cell) as a way of describing
• Online Genetic Support Groups http://mostgene.org/genetics/ the activities of a gene product. This particular ontology is called the
genetic-support-groups-directory/ Gene Ontology (GO), and many different databases of gene products
• Genetic Disease Information http://www.geneticalliance.org/ now incorporate GO terms. A full description can be found at
diseaseinfo/search.html
• http://www.geneontology.org/
• The Encyclopedia of Genome Elements (ENCODE) http://
genome.ucsc.edu/ENCODE/
Pathway Databases Still another way to relate products to one another is
• Cancer Genome Project http://cancergenome.nih.gov/ by assigning them to steps in biochemical or cellular pathways. Pathway
diagrams can be used as organized ways of presenting relationships of
Genome Project Databases The individual genome projects also have these products to one another. Some of the more advanced attempts at
Web sites, where they display their results, often including information producing such pathway databases include Kyoto Encyclopedia of Genes
that doesn’t appear on any other Web site in the world. The largest of and Genomes (KEGG) and Signal Transduction Database (TRANSPATH):
the publicly funded genome centers include
• Whitehead Institute/MIT Center for Genome Research • KEGG http://www.genome.ad.jp/kegg/
http://www-genome.wi.mit.edu/ • TRANSPATH http://genexplain.com/transpath-1
This page intentionally left blank
Glossary
A See adenine; adenosine. allopolyploid See amphidiploid.
absolute fitness The number of offspring an individual has. allosteric effector A small molecule that binds to an allosteric
site.
accessory protein A protein associated with DNA polymerase
III of E. coli that is not part of the catalytic core. allosteric site A site on a protein to which a small molecule
binds, causing a change in the conformation of the protein that
Ac element See Activator element.
modifies the activity of its active site.
acentric chromosome A chromosome having no centromere.
allosteric transition A change from one conformation of a
acentric fragment A chromosome fragment having no protein to another.
centromere.
alternate segregation In a reciprocal translocation, the passage
activation domain A part of a transcription factor required of both normal chromosomes to one pole and both translocated
for the activation of target-gene transcription; it may bind to chromosomes to the other pole.
components of the transcriptional machinery or may recruit alternative splicing A process by which different messenger
proteins that modify chromatin structure or both. RNAs are produced from the same primary transcript, through
activator A protein that, when bound to a cis-acting regulatory variations in the splicing pattern of the transcript. Multiple
DNA element such as an operator or an enhancer, activates mRNA “isoforms” can be produced in a single cell or the
transcription from an adjacent promoter. different isoforms can display different tissue-specific patterns
of expression. If the alternative exons fall within the open
Activator (Ac) element A class 2 DNA transposable element reading frames of the mRNA isoforms, different proteins will be
so named by its discoverer, Barbara McClintock, because it is produced by the alternative mRNAs.
required to activate chromosome breakage at the Dissociation
(Ds) locus. Alu A short transposable element that makes up more than 10
percent of the human genome. Alu elements are retroelements
active site The part of a protein that must be maintained in a that do not encode proteins and as such are nonautonomous
specific shape if the protein is to be functional—for example, in elements.
an enzyme, the part to which the substrate binds.
Ames test A way to test whether a chemical compound is
adaptation In the evolutionary sense, some heritable feature of mutagenic by exposing special mutant bacterial strains to the
an individual’s phenotype that improves its chances of survival product formed by that compound’s digestion by liver extract
and reproduction in the existing environment. and then counting the number of colonies. Only new mutations,
adaptive walk A pathway of stepwise evolutionary change that presumably produced by the compound, can produce revertants
results from natural selection and the accumulation of a series of to wild type able to form colonies.
mutations. amino acid A peptide; the basic building block of proteins (or
additive effect Half the difference between the mean of the polypeptides).
phenotypic values for the homozygous genotypic classes at a QTL. aminoacyl-tRNA binding site A site in the ribosome that binds
additive gene action When the trait value for the heterozygous incoming aminoacyl-tRNAs. The anticodon of each incoming
class at a QTL is exactly intermediate between the trait values for aminoacyl-tRNA matches the codon of the mRNA. Also called
the two homozygous classes. the A site.
additive genetic variation The part of the genetic variance for aminoacyl-tRNA synthetase An enzyme that attaches an
a trait in a population that is predictably transmitted from parent amino acid to a tRNA before its use in translation. There are 20
to offspring. different aminoacyl-tRNAs, one for each amino acid.
adenine (A) A purine base that pairs with thymine in the DNA amino end The end of a protein having a free amino group. A
double helix. protein is synthesized from the amino end at the 5′ end of an
mRNA to the carboxyl end near the 3′ end of the mRNA during
adenosine (A) A nucleoside containing adenine as its base. translation.
adjacent-1 segregation In a reciprocal translocation, the amphidiploid An allopolyploid; a polyploid formed from the
passage of a translocated and a normal chromosome to each of union of two separate chromosome sets and their subsequent
the poles. doubling.
allele One of the different forms of a gene that can exist at a amplification The production of many DNA copies from one
single locus. master region of DNA.
allele frequency A measure of the commonness of an allele anaphase bridge In a dicentric chromosome, the segment
in a population; the proportion of all alleles of that gene in the between centromeres being drawn to opposite poles at nuclear
population that are of this specific type. division.
allelic fitness The average fitness of individuals carrying a aneuploid A genome having a chromosome number that differs
particular allele. It is expressed by the equation WA = from the normal chromosome number for the species by a small
pWA/A + qWA/a . number of chromosomes.
allelic series The set of known alleles of one gene. See also annotation The identification of all the functional elements of a
multiple alleles. particular genome.
813
814 Glossary
antibody A protein (immunoglobulin) molecule, produced classes of balanced rearrangements are inversions and reciprocal
by the immune system, that recognizes a particular substance translocations.
(antigen) and binds to it.
balancer A chromosome with multiple inversions, used to retain
anticodon A nucleotide triplet in a tRNA molecule that aligns favorable allele combinations in the uninverted homolog.
with a particular codon in mRNA under the influence of the balancing selection Natural selection that results in an
ribosome; the amino acid carried by the tRNA is inserted into a equilibrium with intermediate allele frequencies.
growing protein chain.
Barr body A densely staining mass that represents an
antiparallel A term used to describe the opposite orientations inactivated X chromosome.
of the two strands of a DNA double helix; the 5′ end of one strand
aligns with the 3′ end of the other strand. barrier insulator A DNA element that prevents the spread of
heterochromatin by serving as a binding site for proteins that
antisense RNA strand An RNA strand having a sequence maintain euchromatic chromatin modifications such as histone
complementary to a transcribed RNA strand. acetylation.
antiterminator A protein that promotes the continuation of base See nucleotide base.
transcription by preventing the termination of transcription at
specific sites on DNA. base analog A chemical whose molecular structure mimics that
of a DNA base; because of the mimicry, the analog may act as a
apoptosis The cellular pathways responsible for programmed mutagen.
cell death.
base-excision repair One of several excision-repair pathways.
apurinic site A DNA site that has lost a purine residue. In this pathway, subtle base-pair distortions are repaired by the
artificial selection Breeding of successive generations by the creation of apurinic sites followed by repair synthesis.
deliberate human selection of certain phenotypes or genotypes β (beta) clamp A protein that encircles the DNA like a donut,
as the parents of each generation. keeping the enzyme DNA pol III attached to the DNA molecule
ascus In a fungus, a sac that encloses a tetrad or an octad of at the replication fork.
ascospores. bioinformatics Computational information systems and
A site See aminoacyl-tRNA binding site. analytical methods applied to biological problems such as
genomic analysis.
association mapping A method for locating quantitative trait
loci in the genome based on linkage disequilibrium between a bivalents Two homologous chromosomes paired at meiosis.
marker locus and the quantitative trait locus in a random-mating bottleneck A period of one or several consecutive generations of
population. contraction in population size.
attachment site Region at which prophage integrates. breeding value The part of an individual’s deviation from the
population mean that is due to additive effects and is transmitted
attenuation A regulatory mechanism in which the level of
to its progeny.
transcription of an operon (such as trp) is reduced when the end
product of a pathway (for example, tryptophan) is plentiful; the broad-sense heritability (H2) The proportion of total
regulated step is after the initiation of transcription. phenotypic variance at the population level that is contributed by
genetic variance.
attenuator A region of RNA sequence that forms alternative
secondary structures that govern the level of transcription of bypass (translesion) polymerase A DNA polymerase that
attenuated operons. can continue to replicate DNA past a site of damage that would
halt replication by the normal replicative polymerase. Bypass
autonomous element A transposable element that encodes the polymerases contribute to a damage-tolerance mechanism called
protein(s)—for example, transposase or reverse transcriptase— translesion DNA synthesis.
necessary for its transposition and for the transposition of
nonautonomous elements in the same family.
autopolyploid Polyploid formed from the doubling of a single C See cytidine; cytosine.
genome. cAMP See cyclic adenosine monophosphate.
autoradiogram A pattern of dark spots in a developed cancer A class of disease characterized by the rapid and
photographic film or emulsion in the technique of uncontrolled proliferation of cells within a tissue of a multitissue
autoradiography. eukaryote. Cancers are generally thought to be genetic diseases
auxotroph A strain of microorganisms that will proliferate only of somatic cells, arising through sequential mutations that create
when the medium is supplemented with a specific substance not oncogenes and inactivate tumor-suppressor genes.
required by wild-type organisms (compare prototroph). candidate gene A gene that, because of its chromosomal
position or some other property, becomes a candidate for a
particular function such as disease risk.
BAC Bacterial artificial chromosome; an F plasmid engineered
to act as a cloning vector that can carry large inserts. CAP See catabolite activator protein.
bacterial artificial chromosome See BAC. cap A special structure, consisting of a 7-methylguanosine
residue linked to the transcript by three phosphate groups, that
bacteriophage (phage) A virus that infects bacteria. is added in the nucleus to the 5′ end of eukaryotic mRNA. The
balanced rearrangement A change in the chromosomal gene cap protects an mRNA from degradation and is required for
order that does not remove or duplicate any DNA. The two translation of the mRNA in the cytoplasm.
Glossary 815
carboxyl end The end of a protein having a free carboxyl chromosome walk A method for the dissection of large
group. The carboxyl end is encoded by the 3′ end of the segments of DNA, in which a cloned segment of DNA, usually
mRNA and is the last part of the protein to be synthesized in eukaryotic, is used to screen recombinant DNA clones from
translation. the same genome bank for other clones containing neighboring
sequences.
carboxy terminal domain (CTD) The protein tail of the ß
subunit of RNA polymerase II; it coordinates the processing cis-acting element A site on a DNA (or RNA) molecule that
of eukaryotic pre-mRNAs including capping, splicing, and functions as a binding site for a sequence-specific DNA- (or
termination. RNA-) binding protein. The term cis-acting indicates that protein
binding to this site affects only nearby DNA (or RNA) sequences
catabolite activator protein (CAP) A protein that unites on the same molecule.
with cAMP at low glucose concentrations and binds to the lac
promoter to facilitate RNA polymerase action. cis conformation In a heterozygote having two mutant sites
within a gene or within a gene cluster, the arrangement A1A2 /a1a2.
catabolite repression The inactivation of an operon caused by
the presence of large amounts of the metabolic end product of class 1 element A transposable element that moves through an
the operon. RNA intermediate. Also called an RNA or retro element.
categorical trait A trait for which individuals can be sorted into class 2 element A transposable element that moves directly from
discrete or discontinuous groupings, such as tall versus short one site in the genome to another. Also called a DNA element.
stems for Mendel’s pea plants. cloning vector In cloning, the plasmid or phage chromosome
cDNA See complementary DNA. used to carry the cloned DNA segment.
cDNA library A library composed of cDNAs, not necessarily CNV See copy number variation.
representing all mRNAs. co-activator A special class of eukaryotic regulatory complex
cell clone Members of a colony that have a single genetic that serves as a bridge to bring together regulatory proteins and
ancestor. RNA polymerase II.
centimorgan (cM) See map unit. c.o.c. See coefficient of coincidence.
centromere A specialized region of DNA on each eukaryotic Cockayne syndrome A genetic disorder caused by defects
chromosome that acts as a site for the binding of kinetochore in the nucleotide-excision-repair system and leading to
proteins. symptoms of premature aging. Individuals with Cockayne
syndrome have a mutation in one of two proteins thought to
character An attribute of individual members of a species for recognize transcription complexes that are stalled owing to
which various heritable differences can be defined. DNA damage.
charged tRNA A transfer RNA molecule with an amino acid coding strand The nontemplate strand of a DNA molecule
attached to its 3′ end. Also called aminoacyl-tRNA. having the same sequence as that in the RNA transcript.
ChIP See chromatin immunoprecipitation. codominance A situation in which a heterozygote shows the
phenotypic effects of both alleles equally.
chi-square (χ2) test A statistical test used to determine the
probability of obtaining observed proportions by chance, under a codon A section of RNA (three nucleotides in length) that
specific hypothesis. encodes a single amino acid.
chloroplast DNA (cpDNA) The small genomic component coefficient of coincidence (c.o.c.) The ratio of the observed
found in the chloroplasts of plants, concerned with number of double recombinants to the expected number.
photosynthesis and other functions taking place within that cointegrate The product of the fusion of two circular
organelle. transposable elements to form a single, larger circle in replicative
chromatid One of the two side-by-side replicas produced by transposition.
chromosome division. colony A visible clone of cells.
chromatin The substance of chromosomes; now known to common SNP A single nucleotide polymorphism (SNP) for
include DNA and chromosomal proteins. which the less common allele occurs at a frequency of about
chromatin immunoprecipitation (ChIP) The use of antibodies 5 percent or greater.
to isolate specific regions of chromatin and to identify the regions comparative genomics Analysis of the relations of the genome
of DNA to which regulatory proteins are bound. sequences of two or more species.
chromatin remodeling Changes in nucleosome position along complementary (base pairs) Refers to specific pairing between
DNA. adenine and thymine and between guanine and cytosine.
chromosome A linear end-to-end arrangement of genes and complementary DNA (cDNA) DNA transcribed from a
other DNA, sometimes with associated protein and RNA. messenger RNA template through the action of the enzyme
reverse transcriptase.
chromosome map A representation of all chromosomes in the
genome as lines, marked with the positions of genes known from complementation Production of wild-type phenotype when two
their mutant phenotypes, plus molecular markers. Based on full or partial haploid genomes are united in the same cell.
analysis of recombinant frequency.
complementation test A test for determining whether two
chromosome mutation Any type of change in chromosome mutations are in different genes (they complement) or the same
structure or number. gene (they do not complement).
816 Glossary
complete dominance Describes an allele that expresses correlation coefficient A statistical measure of association that
itself the same in single copy (heterozygote) as in double copy signifies the extent to which two variables vary together.
(homozygote).
cosuppression An epigenetic phenomenon whereby a transgene
complex inheritance The type of inheritance exhibited by becomes reversibly inactivated along with the gene copy in the
traits affected by a mix of genetic and environmental factors. chromosome.
Continuous traits, such as height, typically have complex
cotranscriptional processing The simultaneous transcription
inheritance.
and processing of eukaryotic pre-mRNA.
complex trait A trait exhibiting complex inheritance.
cotransductants Two donor alleles that simultaneously
composite transposon A type of bacterial transposable element transduce a bacterial cell; their frequency is used as a measure of
containing a variety of genes that reside between two nearly closeness of the donor genes on the chromosome map.
identical insertion sequence (IS) elements.
covariance A statistical measure of the extent to which two
congenic lines Strains or stocks of a species that are identical variables change together. It is used in computing the correlation
throughout their genomes except for a small region of interest. coefficient between two variables.
conjugation The union of two bacterial cells during which cpDNA Chloroplast DNA.
chromosomal material is transferred from the donor to the
recipient cell. CpG island Unmethylated CG dinucleotides found in clusters
near gene promoters.
consensus sequence The nucleotide sequence of a segment of
DNA that is derived by aligning similar sequences (either from CRISPR loci Regions in bacterial chromosomes containing
the same or different organisms) and determining the most clustered regularly interspaced short palindromic repeats
common nucleotide at each position. involved in immunity to viruses.
consensus sequence The nucleotide sequence of a segment of cross The deliberate mating of two parental types of organisms
DNA that is in agreement with most sequence reads of the same in genetic analysis.
segment from different individuals. crossing over The exchange of corresponding chromosome
conservative replication A disproved model of DNA synthesis parts between homologs by breakage and reunion.
suggesting that one-half of the daughter DNA molecules should crossover products Meiotic product cells with chromosomes
have both strands composed of newly polymerized nucleotides. that have engaged in a crossover.
conservative transposition A mechanism of transposition that crRNA RNA transcribed from CRISPR loci that guide a protein
moves a mobile element to a new location in the genome as it complex to degrade complementary invading viral nucleic acid.
removes it from its previous location.
CTD See carboxy terminal domain.
constitutive expression Refers to genes that are expressed
continuously regardless of biological conditions. cumulative selection The situation when natural selection
promotes multiple substitutions that alter the function of a
constitutive heterochromatin Chromosomal regions of protein or regulatory element through repeated rounds of
permanently condensed chromatin usually around the telomeres mutation and selection.
and centromeres.
“cut and paste” A descriptive term for a transposition
constitutive mutation A change in a DNA sequence that causes mechanism in which a class 2 (DNA) transposon is excised (cut)
a gene that is repressed at times to be expressed continuously, or from the donor site and inserted (pasted) into a new target site.
“constitutively.” See also copy and paste.
continuous trait A trait that can take on a potentially infinite C-value The DNA content of a haploid genome.
number of states over a continuous range, such as height in
humans. C-value paradox The discrepancy (or lack of correlation)
between the DNA content of an organism and its biological
coordinately controlled genes Genes whose products are complexity.
simultaneously activated or repressed in parallel.
cyclic adenosine monophosphate (cAMP) A molecule
copia-like element A transposable element (retrotransposon) of
containing a diester bond between the 3′ and 5′ carbon atoms
Drosophila that is flanked by long terminal repeats and typically
of the ribose part of the nucleotide. This modified nucleotide
encodes a reverse transcriptase.
cannot be incorporated into DNA or RNA. It plays a key role as
“copy and paste” A descriptive term for a transposition an intracellular signal in the regulation of various processes.
mechanism in which a class 1 retrotransposon is copied from the
cytidine (C) A nucleoside containing cytosine as its base.
donor site and a double-stranded DNA copy is inserted (pasted)
into a new target site. See also cut and paste. cytoplasmic segregation Segregation in which genetically
different daughter cells arise from a progenitor that is a cytohet.
copy number variation (CNV) Variation for a large DNA
segment among homologous chromosomes caused by cytosine (C) A pyrimidine base that pairs with guanine.
differences in the numbers of tandem copies of a single or
multiple genes.
Darwinian fitness The relative probability of survival and
corepressor A repressor that facilitates gene repression but is reproduction for a genotype.
not itself a DNA-binding repressor.
daughter molecule One of the two products of DNA replication
correlation The tendency of one variable to vary in proportion composed of one template strand and one newly synthesized
to another variable, either positively or negatively. strand.
Glossary 817
decoding center The region in the small ribosomal subunit DNA (deoxyribonucleic acid) A chain of linked nucleotides
where the decision is made whether an aminoacyl-tRNA can bind (having deoxyribose as their sugars). Two such chains in double-
in the A site. This decision is based on complementarity between helical form are the fundamental substance of which genes are
the anticodon of the tRNA and the codon of the mRNA. composed.
degenerate code A genetic code in which some amino acids DNA-binding domain The site on a DNA-binding protein that
may be encoded by more than one codon each. directly interacts with specific DNA sequences.
deletion The removal of a chromosomal segment from a DNA clone A section of DNA that has been inserted into
chromosome set. a vector molecule, such as a plasmid or a phage, and then
deletion loop The loop formed at meiosis by the pairing of a replicated to produce many copies.
normal chromosome and a deletion-containing chromosome. DNA element A class 2 transposable element found in both
deletion mapping The use of a set of known deletions to map prokaryotes and eukaryotes and so named because the DNA
new recessive mutations by pseudodominance. element participates directly in transposition.
donor DNA Any DNA to be used in cloning or in DNA-mediated tissue-specific enhancers can determine spatial patterns of gene
transformation. expression in higher eukaryotes. Enhancers can act on promoters
over many tens of kilobases of DNA and can be 5′ or 3′ of the
dosage compensation The process in organisms using a
promoters that they regulate.
chromosomal sex-determination mechanism (such as XX versus
XY) that allows standard structural genes on the sex chromosome enol form See tautomeric shift.
to be expressed at the same levels in females and males,
environmental variance The part of the phenotypic variation
regardless of the number of sex chromosomes. In mammals,
among individuals in a population that is due to the different
dosage compensation operates by maintaining only a single
environments the individuals have experienced.
active X chromosome in each cell; in Drosophila, it operates by
hyperactivating the male X chromosome. epigenetic Nongenetic chemical changes in histones or DNA
that alter gene function without altering the DNA sequence.
double helix The structure of DNA first proposed by James
Watson and Francis Crick, with two interlocking helices joined by epigenetic inheritance Heritable modifications in gene
hydrogen bonds between paired bases. function not due to changes in the base sequence of the
DNA of the organism. Examples of epigenetic inheritance
double (mixed) infection Infection of a bacterium with two
are paramutation, X-chromosome inactivation, and parental
genetically different phages.
imprinting.
double mutant Genotype with mutant alleles of two different
epigenetic mark A heritable alteration, such as DNA
genes.
methylation or a histone modification, that leaves the DNA
double-strand break A DNA break cleaving the sugar– sequence unchanged.
phosphate backbones of both strands of the DNA double helix.
epigenetic silencing The repression of the expression of a
double-stranded RNA (dsRNA) An RNA molecule comprised gene by virtue of its position in the chromosome rather than
of two complementary strands. by a mutation in its DNA sequence. Epigenetic silencing can be
inherited from one cell generation to the next.
double transformation Simultaneous transformation by two
different donor markers. epistasis A situation in which the differential phenotypic
expression of a genotype at one locus depends on the genotype
downstream A way to describe the relative location of a site in a
at another locus; a mutation that exerts its expression while
DNA or RNA molecule. A downstream site is located closer to the
canceling the expression of the alleles of another gene.
3′ end of a transcription unit.
E site See exit site.
Down syndrome An abnormal human phenotype, including
mental retardation, due to a trisomy of chromosome 21; more essential gene A gene without at least one copy of which the
common in babies born to older mothers. organism dies.
drift See random genetic drift. EST See expressed sequence tag.
dsRNA See double-stranded RNA. euchromatin A less-condensed chromosomal region, thought to
contain most of the normally functioning genes.
duplication More than one copy of a particular chromosomal
segment in a chromosome set. euploid A cell having any number of complete chromosome sets
or an individual organism composed of such cells.
dyad A pair of sister chromatids joined at the centromere, as in
the first division of meiosis. excise Describes what a transposable element does when it
leaves a chromosomal location. Also called transpose.
exconjugant A female bacterial cell that has just been in
ectopic integration In a transgenic organism, the insertion of
conjugation with a male and contains a fragment of male DNA.
an introduced gene at a site other than its usual locus.
exit (E) site The site on the ribosome where the deacylated
elongation The stage of transcription that follows initiation and
tRNA can be found.
precedes termination.
exogenote See merozygote.
embryoid A small dividing mass of monoploid cells, produced
from a cell destined to become a pollen cell by exposing it to cold. exon Any nonintron section of the coding sequence of a gene;
together, the exons correspond to the mRNA that is translated
endogenote See merozygote.
into protein.
endogenous gene A gene that is normally present in an
expressed sequence tag (EST) A cDNA clone for which only
organism, in contrast with a foreign gene from a different
the 5′ or the 3′ ends or both have been sequenced; used to
organism that might be introduced by transgenic techniques.
identify transcript ends in genomic analysis.
enhanceosome The macromolecular assembly responsible for
interaction between enhancer elements and the promoter regions expressivity The degree to which a particular genotype is
of genes. expressed in the phenotype.
enhancer A set of regulatory proteins consisting of transcription extranuclear Refers to a small specialized fraction of eukaryotic
factors that bind to cis-acting regulatory sequences in the DNA. genomes found in mitochondria or chloroplasts.
F′ factor A fertility factor into which a part of the bacterial GD See gene diversity.
chromosome has been incorporated.
gel electrophoresis A method of molecular separation in
F ′ plasmid See F ′ factor. which DNA, RNA, or proteins are separated in a gel matrix
according to molecular size, with the use of an electrical
F1 generation The first filial generation, produced by crossing field to draw the molecules through the gel in a predetermined
two parental lines. direction.
F2 generation The second filial generation, produced by selfing gene The fundamental physical and functional unit of heredity,
or intercrossing the F1 generation. which carries information from one generation to the next;
fertility factor (F factor) A bacterial episome whose presence a segment of DNA composed of a transcribed region and a
confers donor ability (maleness). regulatory sequence that makes transcription possible.
fibrous protein A protein with a linear shape such as the gene action Interaction among alleles at a locus.
components of hair and muscle. gene balance The idea that a normal phenotype requires a 1:1
fine mapping Finding the genomic location of a gene of interest relative proportion of genes in the genome.
(or a functional region within a gene) with marker loci that are gene complex A group of adjacent functionally and structurally
very tightly linked to it. related genes that typically arise by gene duplication in the
first-division segregation pattern (MI pattern) A linear course of evolution.
pattern of spore phenotypes within an ascus for a particular gene discovery The process whereby geneticists find a set of
allele pair, produced when the alleles go into separate nuclei at genes affecting some biological process of interest by the single-
the first meiotic division, showing that no crossover has taken gene inheritance patterns of their mutant alleles or by genomic
place between the allele pair and the centromere. analysis.
first filial generation (F1) The progeny individuals arising gene diversity (GD) The probability that two alleles drawn at
from a cross of two homozygous diploid lines. random from the gene pool will be different.
5′ untranslated region (5′ UTR) The region of the RNA gene-dosage effect (1) Proportionality of the expression of
transcript at the 5′ end upstream of the translation start site. some biological function to the number of copies of an allele
fixed allele An allele for which all members of the population present in the cell. (2) A change in phenotype caused by an
under study are homozygous, and so no other alleles for this abnormal number of wild-type alleles (observed in chromosomal
locus exist in the population. mutations).
fluctuation test A test used in microbes to establish the random gene duplication The duplication of genes or segments of DNA
nature of mutation or to measure mutation rates. through misreplication of DNA.
forward genetics The classical approach to genetic analysis, in gene expression The process by which a gene’s DNA sequence
which genes are first identified by mutant alleles and mutant is transcribed into RNA and, for protein-coding genes, into a
phenotypes and later cloned and subjected to molecular analysis. polypeptide.
fosmid A vector that can carry a 35- to 45-kb insert of foreign gene family A set of genes in one genome, all descended from
DNA. the same ancestral gene.
founder effect A random difference in the frequency of an gene flow See migration.
allele or a genotype in a new colony as compared to the parental gene knockout The inactivation of a gene by either a naturally
population that results from a small number of founders. occurring mutation or through the integration of a specially
frameshift mutation The insertion or deletion of a nucleotide engineered introduced DNA fragment. In some systems, such
pair or pairs, causing a disruption of the translational reading inactivation is random, with the use of transgenic constructs
frame. that insert at many different locations in the genome. In other
systems, it can be carried out in a directed fashion. See also
frequency histogram A “step curve” in which the frequencies targeted gene knockout.
of various arbitrarily bounded classes are graphed.
gene locus The specific place on a chromosome where a gene is
full dominance See complete dominance. located.
functional complementation (mutant rescue) The use of a gene pair The two copies of a particular type of gene present in
cloned fragment of wild-type DNA to transform a mutant into wild a diploid cell (one in each chromosome set).
type; used in identifying a clone containing one specific gene.
gene pool The sum total of all alleles in the breeding members
functional genomics The study of the patterns of transcript and of a population at a given time.
protein expression and of molecular interactions at a genome-
wide level. generalized transduction The ability of certain phages to
transduce any gene in the bacterial chromosome.
functional RNA An RNA type that plays a role without being
translated. general transcription factor (GTF) A eukaryotic protein
complex that does not take part in RNA synthesis but binds
to the promoter region to attract and correctly position RNA
G See guanine; guanosine. polymerase II for transcription initiation.
gap gene In Drosophila, a class of cardinal genes that are gene replacement The insertion of a genetically engineered
activated in the zygote in response to the anterior–posterior transgene in place of a resident gene; often achieved by a double
gradients of positional information. crossover.
820 Glossary
gene silencing A gene that is not expressed owing to epigenetic genomic library A library encompassing an entire genome.
regulation. Unlike genes that are mutant due to DNA sequence
genomics The cloning and molecular characterization of entire
alterations, genes inactivated by silencing can be reactivated.
genomes.
gene therapy The correction of a genetic deficiency in a cell
genotype The allelic composition of an individual or of a cell—
by the addition of new DNA and its insertion into the genome.
either of the entire genome or, more commonly, of a certain gene
Different techniques have the potential to carry out gene therapy
or a set of genes.
only in somatic tissues or, alternatively, to correct the genetic
deficiency in the zygote, thereby correcting the germ line as well. genotype frequency The proportion of individuals in a
population having a particular genotype.
genetic admixture The mix of genes that results when
individuals have ancestry from more than one subpopulation. GGR See global genomic repair.
genetically modified organism (GMO) A popular term for a global genomic nucleotide-excision repair (GG-NER) A type
transgenic organism, especially applied to transgenic agricultural of nucleotide-excision repair that takes place at nontranscribed
organisms. sequences.
genetic architecture All of the genetic and environmental globular protein A protein with a compact structure, such as an
factors that influence a trait. enzyme or an antibody.
genetic code A set of correspondences between nucleotide GMO See genetically modified organism.
triplets in RNA and amino acids in protein. GU-AG rule So named because the GU and AG dinucleotides are
genetic dissection The use of recombination and mutation almost always at the 5′ and 3′ ends, respectively, of introns, where
to piece together the various components of a given biological they are recognized by components of the splicosome.
function. guanine (G) A purine base that pairs with cytosine.
genetic drift The change in the frequency of an allele in a GWA See genome-wide association.
population resulting from chance differences in the actual
numbers of offspring of different genotypes produced by
different individual members.
H See heterozygosity.
genetic engineering The process of producing modified DNA in
a test tube and reintroducing that DNA into host organisms. haploid A cell having one chromosome set or an organism
composed of such cells.
genetic load The total set of deleterious alleles in an individual
genotype. haplosufficient Describes a gene that, in a diploid cell, can
promote wild-type function in only one copy (dose).
genetic map unit (m.u.) A distance on the chromosome map
corresponding to 1 percent recombinant frequency. haplotype The type (or form) of a haploid segment of a
chromosome as defined by the alleles present at the loci within
genetic marker An allele used as an experimental probe to
that segment.
keep track of an individual organism, a tissue, a cell, a nucleus, a
chromosome, or a gene. haplotype network A network that shows relationships among
haplotypes and the positions of the mutations defining the
genetics (1) The study of genes. (2) The study of inheritance.
haplotypes on the branches.
genetic switch A segment of regulatory DNA and the regulatory
HapMap A genome-wide haplotype map.
protein(s) that binds to it that govern the transcriptional state of
a gene or set of genes. Hardy–Weinberg equilibrium The stable frequency distribution
of genotypes A/A, A/a, and a/a, in the proportions of p2, 2pq, and
genetic toolkit The set of genes responsible for the regulation
q2, respectively (where p and q are the frequencies of the alleles A
of animal development, largely comprised of members of cell-cell
and a), that is a consequence of random mating in the absence of
signaling pathways and transcription factors.
mutation, migration, natural selection, or random drift.
genetic variance The part of the phenotypic variation among
Hardy–Weinberg law An equation used to describe the
individuals in a population that is due to the genetic differences
relationship between allelic and genotypic frequencies in a
among the individuals.
random-mating population.
genome The entire complement of genetic material in a
helicase An enzyme that breaks hydrogen bonds in DNA and
chromosome set.
unwinds the DNA during movement of the replication fork.
genome project A large-scale, often multilaboratory effort
hemimethylated DNA DNA sequence with one methylated
required to sequence a complex genome.
strand and one unmethylated strand.
genome surveillance A collection of mechanisms that recognize
hemizygous gene A gene present in only one copy in a diploid
and destroy invading nucleic acids or active transposons. See also
organism—for example, an X-linked gene in a male mammal.
crRNA and piRNA.
heterochromatin Densely staining condensed chromosomal
genome-wide association (GWA) Association mapping that
regions, believed to be for the most part genetically inert.
uses marker loci throughout the entire genome.
heteroduplex DNA DNA in which there is one or more
genomic imprinting A phenomenon in which a gene
mismatched nucleotide pairs in a gene under study.
inherited from one of the parents is not expressed, even
though both gene copies are functional. Imprinted genes are heterogametic sex The sex that has heteromorphic sex
methylated and inactivated in the formation of male or female chromosomes (for example, XY) and hence produces two
gametes. different kinds of gametes with respect to the sex chromosomes.
Glossary 821
heterokaryon A culture of cells composed of two different homozygous recessive Refers to a genotype such as a/a.
nuclear types in a common cytoplasm.
horizontal transmission Inheritance of DNA from another
heteroplex DNA DNA in which there is a mismatched member of the same generation.
nucleotide pair in a gene under study.
housekeeping gene An informal term for a gene whose product
heterozygote An individual organism having a heterozygous is required in all cells and carries out a basic physiological
gene pair. function.
heterozygosity A measure of the genetic variation in a Hox genes Members of this gene class are the clustered
population; with respect to one locus, stated as the frequency of homeobox-containing, homeotic genes that govern the identity
heterozygotes for that locus. of body parts along the anterior–posterior axis of most bilateral
animals.
heterozygous gene pair A gene pair having different alleles in
the two chromosome sets of the diploid individual—for example, hybrid dysgenesis A syndrome of effects including sterility,
A/a or A1/A2. mutation, chromosome breakage, and male recombination in the
hybrid progeny of crosses between certain laboratory and natural
hexaploid A cell having six chromosome sets or an organism
isolates of Drosophila.
composed of such cells.
hybridize (1) To form a hybrid by performing a cross. (2) To
Hfr See high frequency of recombination cell.
anneal complementary nucleic acid strands from different sources.
high frequency of recombination (Hfr) cell In E. coli, a
hybrid vigor A situation in which an F1 is larger or healthier
cell having its fertility factor integrated into the bacterial
than its two different pure parental lines.
chromosome; a donor (male) cell.
hyperacetylation An overabundance of acetyl groups attached
histone A type of basic protein that forms the unit around which
to certain amino acids of the histone tails. Transcriptionally
DNA is coiled in the nucleosomes of eukaryotic chromosomes.
active chromatin is usually hyperacetylated.
histone code Refers to the pattern of modification (for example,
hypoacetylation An underabundance of acetyl groups on
acetylation, methylation, phosphorylation) of the histone tails
certain amino acids of the histone tails. Transcriptionally inactive
that may carry information required for correct chromatin
chromatin is usually hypoacetylated.
assembly.
histone deacetylase The enzymatic activity that removes an
acetyl group from a histone tail, which promotes the repression IBD See identical by descent.
of gene transcription.
identical by descent (IBD) When two copies of a gene in an
histone modification Covalent alteration of one or more amino individual trace back to the same copy in an ancestor.
acid residues of the histone protein. Modifications include
imino form See tautomeric shift.
acetylation, phosphorylation, and methylation.
inbred line A stock consisting of genetically identical
histone tail The end of a histone protein protruding from
individuals that were fully inbred from a common parent(s).
the core nucleosome and subjected to post-translational
modification. See also histone code. inbreeding Mating between relatives.
homeobox (homeotic box) A family of quite similar 180-bp inbreeding coefficient (F) The probability that the two alleles
DNA sequences that encode a polypeptide sequence called a at a locus in an individual are identical by descent.
homeodomain, a sequence-specific DNA-binding sequence.
inbreeding depression A reduction in vigor and reproductive
Although the homeobox was first discovered in all homeotic
success from inbreeding.
genes, it is now known to encode a much more widespread DNA-
binding motif. incomplete dominance A situation in which a heterozygote
shows a phenotype quantitatively (but not exactly) intermediate
homeodomain A highly conserved family of sequences, 60
between the corresponding homozygote phenotypes. (Exact
amino acids in length and found within a large number of
intermediacy means no dominance.)
transcription factors, that can form helix-turn-helix structures
and bind DNA in a sequence-specific manner. indel mutation A mutation in which one or more nucleotide
pairs is added or deleted.
homeologous chromosomes Partly homologous chromosomes,
usually indicating some original ancestral homology. independent assortment See Mendel’s second law.
homogametic sex The sex with homologous sex chromosomes induced mutation A mutation that arises through the action of
(for example, XX). an agent that increases the rate at which mutations occur.
homolog A member of a pair of homologous chromosomes. inducer An environmental agent that triggers transcription from
an operon.
homologous chromosomes Chromosomes that pair with each
other at meiosis or chromosomes in different species that have induction (1) The relief of repression of a gene or set of genes
retained most of the same genes during their evolution from a under negative control. (2) An interaction between two or more
common ancestor. cells or tissues that is required for one of those cells or tissues to
change its developmental fate.
homozygote An individual organism that is homozygous.
initiation The first stage of transcription or translation. Its main
homozygous Refers to the state of carrying a pair of identical
function in transcription is to correctly position RNA polymerase
alleles at one locus.
before the elongation stage, and in translation it is to correctly
homozygous dominant Refers to a genotype such as A/A. position the first aminoacyl-tRNA in the P site.
822 Glossary
initiation factor A protein required for the correct initiation of λ (lambda) attachment site Where the λ prophage inserts in
translation. the E. coli chromosome.
initiator A special tRNA that inserts the first amino acid of law of equal segregation (Mendel’s first law) The production
a polypeptide chain into the ribosomal P site at the start of of equal numbers (50 percent) of each allele in the meiotic
translation. The amino acid carried by the initiator in bacteria is products (for example, gametes) of a heterozygous meiocyte.
N-formylmethionine.
law of independent assortment (Mendel’s second
insertional duplication A duplication in which the extra copy law) Unlinked or distantly linked segregating gene pairs assort
is not adjacent to the normal one. independently at meiosis.
insertional mutagenesis The situation when a mutation arises LD See linkage disequilibrium.
by the interruption of a gene by foreign DNA, such as from a
transgenic construct or a transposable element. leader sequence The sequence at the 5′ end of an mRNA that is
not translated into protein.
insertion sequence (IS) element A mobile piece of bacterial
DNA (several hundred nucleotide pairs in length) capable of leading strand In DNA replication, the strand that is made
inactivating a gene into which it inserts. in the 5′-to-3′ direction by continuous polymerization at the 3′
growing tip.
interactome The entire set of molecular interactions within
cells, including in particular protein–protein interactions. leaky mutation A mutation that confers a mutant
phenotype but still retains a low but detectable level of wild-type
intercalating agent A mutagen that can insert itself between function.
the stacked bases at the center of the DNA double helix, causing
an elevated rate of indel mutations. lethal allele An allele whose expression results in the death of
the individual organism expressing it.
interference A measure of the independence of crossovers
from each other, calculated by subtracting the coefficient of LINE See long interspersed element.
coincidence from 1. linkage disequilibrium (LD) Deviation in the frequencies
interrupted mating A technique used to map bacterial genes by of different haplotypes in a population from the frequencies
determining the sequence in which donor genes enter recipient expected if the alleles at the loci defining the haplotypes are
cells. associated at random.
intervening sequence An intron; a segment of largely unknown linkage equilibrium A perfect fit of haplotype frequencies in
function within a gene. This segment is initially transcribed, but a population to the frequencies expected if the alleles at the loci
the transcript is not found in the functional mRNA. defining the haplotypes are associated at random.
intragenic deletion A deletion within a gene. linkage map A chromosome map; an abstract map of
chromosomal loci that is based on recombinant frequencies.
intron See intervening sequence.
linked The situation in which two genes are on the same
inversion A chromosomal mutation consisting of the removal chromosome as deduced by recombinant frequencies less than
of a chromosome segment, its rotation through 180°, and its 50 percent.
reinsertion in the same location.
lncRNA See long noncoding RNA.
inversion heterozygote A diploid with a normal and an
inverted homolog. locus (plural, loci) See gene locus.
inversion loop A loop formed by meiotic pairing of homologs in long interspersed element (LINE) A type of class 1
an inversion heterozygote. transposable element that encodes a reverse transcriptase. LINEs
are also called non-LTR retrotransposons.
inverted repeat (IR) sequence A sequence found in identical
(but inverted) form—for example, at the opposite ends of a DNA long noncoding RNA (lncRNA) Nonprotein-coding transcripts
transposon. that are over approximately 200 nucleotides in length.
IR sequence See inverted repeat sequence. long terminal repeat (LTR) A direct repeat of DNA sequence
at the 5′ and 3′ ends of retroviruses and retrotransposons.
IS element See insertion sequence element.
LTR See long terminal repeat.
isoforms Related by different proteins. They can be generated
by alternative splicing of a gene. LTR-retrotransposon A type of class 1 transposable element
that terminates in long terminal repeats and encodes several
isolation by distance A bias in mate choice that arises from
proteins including reverse transcriptase.
the amount of geographic distance between individuals, causing
individuals to be more apt to mate with a neighbor than another lysate Population of phage progeny.
member of their species farther away.
lysis The rupture and death of a bacterial cell on the release of
phage progeny.
keto form See tautomeric shift. lysogen See lysogenic bacterium.
Klinefelter syndrome An abnormal human male phenotype lysogenic bacterium A bacterial cell containing an inert
due to an extra X chromosome (XXY). prophage integrated into, and that is replicated with, the host
chromosome.
lagging strand In DNA replication, the strand that is lysogenic cycle The life cycle of a normal bacterium when
synthesized apparently in the 3′-to-5′ direction by the ligation of it is infected by a wild-type λ phage and the phage genome is
short fragments synthesized individually in the 5′-to-3′ direction. integrated into the bacterial chromosome as an inert prophage.
Glossary 823
lytic cycle The bacteriophage life cycle that leads to lysis of the microsatellite marker A difference in DNA at the same locus
host cell. in two genomes that is due to different repeat lengths of a
microsatellite.
MI pattern See first-division segregation pattern.
migration The movement of individuals (or gametes) between
MII pattern See second-division segregation pattern. populations.
major groove The larger of the two grooves in the DNA double miniature inverted repeat transposable element (MITE)
helix. A type of nonautonomous DNA transposon that can form by
deletion of the transposase gene from an autonomous element
mapping function A formula expressing the relation between
and attain very high copy numbers.
distance in a linkage map and recombinant frequency.
minimal medium Medium containing only inorganic salts, a
map unit (m.u.) The “distance” between two linked gene pairs
carbon source, and water.
where 1 percent of the products of meiosis are recombinant; a
unit of distance in a linkage map. minisatellite marker Heterozygous locus representing
a variable number of tandem repeats of a unit 15 to 100
maternal-effect gene A gene that produces an effect only when
nucleotides long.
present in the mother.
minor groove The smaller of the two grooves in the DNA
maternal imprinting The expression of a gene only when
double helix.
inherited from the father, because the copy of the gene inherited
from the mother is inactive due to methylation in the course of miRNA See microRNA.
gamete formation.
mismatch-repair system A system for repairing damage to
maternal inheritance A type of uniparental inheritance in DNA that has already been replicated.
which all progeny have the genotype and phenotype of the
missense mutation Nucleotide-pair substitution within a
parent acting as the female.
protein-coding region that leads to the replacement of one amino
M cytotype Laboratory stocks of Drosophila melanogaster that acid by another amino acid.
completely lack the P element transposon, which is found in
MITE See miniature inverted repeat transposable element.
stocks from the wild (P cytotype).
mitochondrial DNA (mtDNA) The subset of the genome found
mean The arithmetic average.
in the mitochondrion, specializing in providing some of the
mean fitness The mean of the fitness of all individual members organelle’s functions.
of a population.
mitosis A type of nuclear division (occurring at cell division) that
mediator complex A protein complex that acts as an adaptor produces two daughter nuclei identical with the parent nucleus.
that interacts with transcription factors bound to regulatory
mixed (double) infection The infection of a bacterial culture
sites and with general initiation factors for RNA polymerase II–
with two different phage genotypes.
mediated transcription.
modifier A mutation at a second locus that changes the degree
meiocyte A cell in which meiosis takes place.
of expression of a mutated gene at a first locus.
meiosis Two successive nuclear divisions (with corresponding
molecular clock The constant rate of substitution of amino
cell divisions) that produce gametes (in animals) or sexual spores
acids in proteins or nucleotides in nucleic acids over long
(in plants and fungi) that have one-half of the genetic material of
evolutionary time.
the original cell.
molecular genetics The study of the molecular processes
meiotic recombination Recombination from assortment or
underlying gene structure and function.
crossing over at meiosis.
monohybrid A single-locus heterozygote of the type A/a.
Mendel’s first law (law of equal segregation) The two
members of a gene pair segregate from each other in meiosis; monohybrid cross A cross between two individuals identically
each gamete has an equal probability of obtaining either member heterozygous at one gene pair—for example, A/a × A/a.
of the gene pair.
monoploid A cell having only one chromosome set (usually as
Mendel’s second law (law of independent assortment) an aberration) or an organism composed of such cells.
Unlinked or distantly linked segregating gene pairs assort
monosomic A cell or individual organism that is basically
independently at meiosis.
diploid but has only one copy of one particular chromosome type
meristic trait A counting trait, taking on a range of discrete values. and thus has chromosome number 2n + 1.
merozygote A partly diploid E. coli cell formed from a complete morph One form of a genetic polymorphism; the morph can be
chromosome (the endogenote) plus a fragment (the exogenote). either a phenotype or a molecular sequence.
messenger RNA See mRNA. mRNA (messenger RNA) An RNA molecule transcribed
from the DNA of a gene; a protein is translated from this RNA
microarray A set of DNAs containing all or most genes in a
molecule by the action of ribosomes.
genome deposited on a small glass chip.
mtDNA Mitochondrial DNA.
microRNA (miRNA) A class of functional RNA that regulates
the amount of protein produced by a eukaryotic gene. m.u. See map unit.
microsatellite A locus composed of several to many copies multifactorial hypothesis A hypothesis that explains
(repeats) of a short (about 2 to 6 bp) sequence motif. Different quantitative variation by proposing that traits are controlled by a
alleles have different numbers of repeats. large number of genes, each with a small effect on the trait.
824 Glossary
multigenic deletion A deletion of several adjacent genes. non-protein-coding RNA (ncRNA) RNA that is not translated
into protein.
multiple alleles The set of forms of one gene, differing in their
DNA sequence or expression or both. nonsense mutation Nucleotide-pair substitution within a
protein-coding region that changes a codon for an amino acid
mutagen An agent capable of increasing the mutation rate.
into a termination (nonsense) codon.
mutagenesis An experiment in which experimental organisms
nonsynonymous substitution A change in the DNA of a
are treated with a mutagen and their progeny are examined for
protein-coding sequence that causes an amino acid change.
specific mutant phenotypes.
normal distribution A continuous distribution defined by the
mutant An organism or cell carrying a mutation.
normal density function with a specified mean and standard
mutant rescue See functional complementation. deviation showing the expected frequencies for different values
of a random variable (the “bell curve”).
mutation (1) The process that produces a gene or a
chromosome set differing from that of the wild type. (2) The Northern blotting The transfer of electrophoretically separated
gene or chromosome set that results from such a process. RNA molecules from a gel onto an absorbent sheet, which is
then immersed in a labeled probe that will bind to the RNA of
mutation rate (µ) The probability that a copy of an allele
interest.
changes to some other allelic form in one generation.
nuclear localization sequence (NLS) Part of a protein required
for its transport from the cytoplasm to the nucleus.
NAHR See nonallelic homologous recombination.
nucleosome The basic unit of eukaryotic chromosome structure;
narrow-sense heritability (h2) The proportion of phenotypic a ball of eight histone molecules that is wrapped by two coils of
variance that can be attributed to additive genetic variance. DNA.
natural selection The differential rate of reproduction nucleotide A molecule composed of a nitrogen base, a sugar,
of different types in a population as the result of different and a phosphate group; the basic building block of nucleic acids.
physiological, anatomical, or behavioral characteristics of the types.
nucleotide diversity Heterozygosity or gene diversity averaged
ncRNA See non-protein-coding RNA. over all the nucleotide sites in a gene or any other stretch of DNA.
nearly isogenic line See congenic line. nucleotide-excision-repair (NER) system An excision-repair
negative assortative mating Preferential mating between pathway that breaks the phosphodiester bonds on either side of
phenotypically unlike partners. a damaged base, removing that base and several on either side
followed by repair replication.
negative control Regulation mediated by factors that block or
turn off transcription. null allele An allele whose effect is the absence either of normal
gene product at the molecular level or of normal function at the
negative selection The elimination of a deleterious trait from a phenotypic level.
population by natural selection.
null hypothesis In statistics, the hypothesis being tested that
neofunctionalization The evolution of a new function by a gene. makes a prediction about the expected results of an experiment.
NER See nucleotide-excision-repair system. If the probability of observing the results under the null
hypothesis is less than 0.05, then the null hypothesis is rejected.
neutral allele An allele that has no effect on the fitness of
individuals that possess it. nullisomic Refers to a cell or individual organism with one
chromosomal type missing, with a chromosome number such as
neutral evolution Nonadaptive evolutionary changes due to n 1 or 2n 2.
random genetic drift.
null mutation A mutation that results in complete absence of
NH See number of haplotypes. function for the gene.
NHEJ See nonhomologous end joining. number of haplotypes (NH) A simple count of the number of
haplotypes at a locus in a population.
NLS See nuclear localization sequence.
nonallelic homologous recombination (NAHR) Crossing over
between short homologous units found at different chromosomal O See origin of replication.
loci. octad An ascus containing eight ascospores, produced in species
nonautonomous element A transposable element that relies in which the tetrad normally undergoes a postmeiotic mitotic
on the protein products of autonomous elements for its mobility. division.
Dissociation (Ds) is an example of a nonautonomous transposable Okazaki fragment A small segment of single-stranded DNA
element. synthesized as part of the lagging strand in DNA replication.
nonconservative substitution Nucleotide-pair substitution oncogene A gain-of-function mutation that contributes to the
within a protein-coding region that leads to the replacement of production of a cancer.
an amino acid by one having different chemical properties.
oncoprotein The protein product of an oncogene mutation.
nondisjunction The failure of homologs (at meiosis) or sister
one-gene–one-polypeptide hypothesis A mid-twentieth-
chromatids (at mitosis) to separate properly to opposite poles.
century hypothesis that originally proposed that each gene
nonhomologous end joining (NHEJ) A mechanism used by (nucleotide sequence) encodes a polypeptide sequence; generally
eukaryotes to repair double-strand breaks. true, with the exception of untranslated functional RNA.
Glossary 825
open reading frame (ORF) A gene-sized section of a sequenced penetrance The proportion of individuals with a specific
piece of DNA that begins with a start codon and ends with a stop genotype that manifest that genotype at the phenotype level.
codon; it is presumed to be the coding sequence of a gene.
pentaploid An individual organism with five sets of
operator A DNA region at one end of an operon that acts as the chromosomes.
binding site for a repressor protein.
peptidyl (P) site The site in the ribosome to which a tRNA with
operon A set of adjacent structural genes whose mRNA is the growing polypeptide chain is bound.
synthesized in one piece, plus the adjacent regulatory signals that
affect transcription of the structural genes. peptidyltransferase center The site in the large ribosomal
subunit at which the joining of two amino acids is catalyzed.
ORF See open reading frame.
pericentric inversion An inversion that includes the
origin (O) See origin of replication. centromere.
origin of replication (O) The point of a specific sequence at permissive temperature The temperature at which a
which DNA replication is initiated. temperature-sensitive mutant allele is expressed the same as the
orthologs Genes in different species that evolved from a wild-type allele.
common ancestral gene by speciation. personal genomics The analysis of the genome of an individual
outgroup Taxa outside of a group of organisms among which to better understand his or her ancestry or the genetic basis of
evolutionary relationships are being determined. phenotypic traits such as his or her risk of developing a disease.
PEV See position-effect variegation.
paired-end reads In whole-genome shotgun sequence assembly, phage See bacteriophage.
the DNA sequences corresponding to both ends of a genomic phage recombination The production of recombinant phage
DNA insert in a recombinant clone. genotypes as a result of doubly infecting a bacterial cell with
pair-rule gene In Drosophila, a member of a class of zygotically different “parental” phage genotypes.
expressed genes that act at an intermediary stage in the process phenotype (1) The form taken by some character (or group of
of establishing the correct numbers of body segments. Pair-rule characters) in a specific individual. (2) The detectable outward
mutations have half the normal number of segments, owing to manifestations of a specific genotype.
the loss of every other segment.
phosphate An ion formed of four oxygen atoms attached
paracentric inversion An inversion not including the
to a phosphorus atom or the chemical group formed by the
centromere.
attachment of a phosphate ion to another chemical species by an
paralogs Genes that are related by gene duplication in a ester bond.
genome.
phylogenetic inference Determining the state of a character or
parental generation The two strains or individual organisms the direction of change in a character based on the distribution of
that constitute the start of a genetic breeding experiment; their that character within a phylogeny of organisms.
progeny constitute the F1 generation.
phylogeny The evolutionary history of a group.
parsimony To favor the simplest explanation involving the
smallest number of evolutionary changes. physical map The ordered and oriented map of cloned DNA
fragments on the genome.
parthenogenesis The production of offspring by a female with
no genetic contribution from a male. PIC See preinitiation complex.
partial diploid See merozygote. pi-cluster Region in vertebrate and invertebrate genomes that
codes for clusters of piRNAs.
partial dominance Gene action under which the phenotype of
heterozygotes is intermediate between the two homozygotes but piRNA See piwi-interacting RNA.
more similar to that of one homozygote than the other. piwi-interacting RNA (piRNA) An RNA transcribed from
paternal imprinting The expression of a gene only when pi-clusters that helps to protect the integrity of plant and animal
inherited from the mother, because the allele of the gene genomes and to prevent the spread of transposable elements to
inherited from the father is inactive due to methylation in the other chromosomal loci. piRNAs restrain transposable elements
course of gamete formation. in animals.
PCNA See proliferating cell nuclear antigen. P site See peptidyl site.
PCR See polymerase chain reaction. plaque A clear area on a bacterial lawn, left by lysis of the
bacteria through progressive infections by a phage and its
P cytotype Natural stocks of Drosophila melanogaster that descendants.
contain 20 to 50 copies of the P element. Laboratory stocks have
none. See M cytotype. plasmid An autonomously replicating extrachromosomal DNA
molecule.
pedigree analysis Deducing single-gene inheritance of human
phenotypes by a study of the progeny of matings within a family, plating Spreading the cells of a microorganism (bacteria, fungi)
often stretching back several generations. on a dish of nutritive medium to allow each cell to form a visible
colony.
P element A DNA transposable element in Drosophila that has
been used as a tool for insertional mutagenesis and for germ-line pleiotropic allele An allele that affects several different
transformation. properties of an organism.
826 Glossary
point mutation A mutation that alters a single base position positive selection The process by which a favorable allele
in a DNA molecule by converting it to a different base or by the is brought to a higher frequency in a population because
insert/deletion of a single base in a DNA molecule. individuals carrying that allele have more viable offspring than
other individuals.
Poisson distribution A mathematical distribution giving the
probability of observing various numbers of a particular event in post-transcriptional gene silencing Occurs when the mRNA
a sample when the mean probability of an event on any one trial of a particular gene is destroyed or its translation blocked. The
is very small. mechanism of silencing usually involves RNAi or miRNA.
pol III holoenzyme See DNA polymerase III holoenzyme. post-transcriptional processing Modifications of amino acid
side groups after a protein has been translated.
poly(A) tail A string of adenine nucleotides added to mRNA
after transcription. post-translational modification (PTM) An alteration of amino
acid residues that occurs after the protein has been translated.
polygene (quantitative trait locus) A gene whose alleles are
capable of interacting additively with alleles at other loci to affect preinitiation complex (PIC) A very large eukaryotic protein
a phenotype (trait) showing continuous distribution. complex comprising RNA polymerase II and the six general
transcription factors (GTFs), each of which is a multiprotein
polymerase chain reaction (PCR) An in vitro method for
complex.
amplifying a specific DNA segment that uses two primers that
hybridize to opposite ends of the segment in opposite polarity pre-mRNA See primary transcript.
and, over successive cycles, prime exponential replication of that
primary structure of a protein The sequence of amino acids in
segment only.
the polypeptide chain.
polymerase III holoenzyme See DNA polymerase III
primary transcript (pre-mRNA) Eukaryotic RNA before it has
holoenzyme.
been processed.
polymorphism The occurrence in a population of multiple
primase An enzyme that makes RNA primers in DNA replication.
forms of a trait or multiple alleles at a genetic locus.
primer An RNA or DNA oligonucleotide that can serve as a
polypeptide A chain of linked amino acids; a protein.
template for DNA synthesis by DNA polymerase when annealed
polyploid A cell having three or more chromosome sets or an to a longer DNA molecule.
organism composed of such cells.
primosome A protein complex at the replication fork whose
polytene chromosome A giant chromosome in specific tissues central component is primase.
of some insects, produced by an endomitotic process in which
probe Labeled nucleic acid segment that can be used to identify
the multiple DNA sets remain bound in a haploid number of
specific DNA molecules bearing the complementary sequence,
chromosomes.
usually through autoradiography or fluorescence.
population (1) A group of individuals that mate with one
processed pseudogene A pseudogene that arose by the reverse
another to produce the next generation. (2) A group of
transcription of a mature mRNA and its integration into the
individuals from which a sample is drawn.
genome.
population genetics The study of genetic variation in
processive enzyme As used in Chapter 7, describes the behavior
populations and changes over time in the amount or patterning
of DNA polymerase III, which can perform thousands of rounds
of that variation resulting from mutation, migration,
of catalysis without dissociating from its substrate (the template
recombination, random genetic drift, natural selection, and
DNA strand).
mating systems.
product of meiosis One of the (usually four) cells formed by
population structure The division of a species or population
the two meiotic divisions.
into multiple genetically distinct subpopulations.
product rule The probability of two independent events
positional cloning The identification of the DNA sequences
occurring simultaneously is the product of the individual
encoding a gene of interest based on knowledge of its genetic or
probabilities.
cytogenetic map location.
prokaryote An organism composed of a prokaryotic cell, such as
positional information The process by which chemical cues
a bacterium or a blue-green alga.
that establish cell fate along a geographic axis are established in a
developing embryo or tissue primordium. proliferating cell nuclear antigen (PCNA) Part of the
replisome, PCNA is the eukaryotic version of the prokaryotic
position effect Describes a situation in which the phenotypic
sliding clamp protein.
influence of a gene is altered by changes in the position of the
gene within the genome. promoter A regulator region that is a short distance from the 5′
end of a gene and acts as the binding site for RNA polymerase.
position-effect variegation (PEV) Variegation caused by
the inactivation of a gene in some cells through its abnormal promoter-proximal element The series of transcription-factor
juxtaposition with heterochromatin. binding sites located near the core promoter.
positive assortative mating A situation in which like property A characteristic feature of an organism, such as size,
phenotypes mate more commonly than expected by chance. color, shape, or enzyme activity.
positive control Regulation mediated by a protein that is prophage A phage “chromosome” inserted as part of the linear
required for the activation of a transcription unit. structure of the DNA chromosome of a bacterium.
Glossary 827
propositus In a human pedigree, the person who first came to rare SNP A single nucleotide polymorphism (SNP) for which
the attention of the geneticist. the less common allele occurs at a frequency below 5 percent.
proteome The complete set of protein-coding genes in a genome. rearrangement The production of abnormal chromosomes by
the breakage and incorrect rejoining of chromosomal segments;
proto-oncogene The normal cellular counterpart of a gene that examples are inversions, deletions, and translocations.
can be mutated to become a dominant oncogene.
recessive allele An allele whose phenotypic effect is not
prototroph A strain of organisms that will proliferate on expressed in a heterozygote.
minimal medium (compare auxotroph).
recipient The bacterial cell that receives DNA in a unilateral
provirus The chromosomally inserted DNA genome of a transfer between cells; examples are F − in a conjugation or the
retrovirus. transduced cell in a phage-mediated transduction.
pseudoautosomal regions 1 and 2 Small regions at the ends of recombinant Refers to an individual organism or cell having a
the X and Y sex chromosomes; they are homologous and undergo genotype produced by recombination.
pairing and crossing over at meiosis.
recombinant DNA A novel DNA sequence formed by the
pseudodominance The sudden appearance of a recessive combination of two nonhomologous DNA molecules.
phenotype in a pedigree, due to the deletion of a masking
dominant gene. recombinant frequency (RF) The proportion (or percentage)
of recombinant cells or individuals.
pseudogene A mutationally inactive gene for which no
functional counterpart exists in wild-type populations. recombination (1) In general, any process in a diploid or
partly diploid cell that generates new gene or chromosomal
pseudolinkage The appearance of linkage of two genes on combinations not previously found in that cell or in its
translocated chromosomes. progenitors. (2) At meiosis, the process that generates a haploid
pulse–chase experiment An experiment in which cells are product of meiosis whose genotype is different from either of the
grown in radioactive medium for a brief period (the pulse) and two haploid genotypes that constituted the meiotic diploid.
then transferred to nonradioactive medium for a longer period recombination map A chromosome map in which the positions
(the chase). of loci shown are based on recombinant frequencies.
pure line A population of individuals all bearing the identical regulon Genes that are transcribed in a manner that is coordinated
fully homozygous genotype. by the same regulatory protein (for example, sigma factor).
purifying selection Natural selection that removes deleterious relative fitness A measure of the fitness of an individual or
variants of a DNA or protein sequence, thus reducing genetic genotype relative to some other individual or genotype, usually
diversity. the most fit individual or genotype in the population.
purine A type of nitrogen base; the purine bases in DNA are release factor (RF) A protein that binds to the A site of the
adenine and guanine. ribosome when a stop codon is in the mRNA.
pyrimidine A type of nitrogen base; the pyrimidine bases in replica plating In microbial genetics, a way of screening
DNA are cytosine and thymine. colonies arrayed on a master plate to see if they are mutant
under other environments; a felt pad is used to transfer the
pyrosequencing DNA sequencing technology that is based on colonies to new plates.
the generation and detection of a pyrophosphate group liberated
from a nucleotide triphosphate. replication fork The point at which the two strands of DNA are
separated to allow the replication of each strand.
replicative transposition A mechanism of transposition that
QTL See quantitative trait locus. generates a new insertion element integrated elsewhere in the
genome while leaving the original element at its original site of
quantitative genetics The subfield of genetics that studies the insertion.
inheritance of complex or quantitative traits.
replisome The molecular machine at the replication fork that
quantitative trait Any trait exhibiting complex inheritance coordinates the numerous reactions necessary for the rapid and
because it is controlled by a mix of genetic and/or environmental accurate replication of DNA.
factors.
reporter gene A gene whose phenotypic expression is easy to
quantitative trait locus (QTL) A gene contributing to the monitor; used to study tissue-specific promoter and enhancer
phenotypic variation in a trait that shows complex inheritance, activities in transgenes.
such as height and weight.
repressor A protein that binds to a cis-acting element such as an
quantitative trait locus mapping A method for locating QTL operator or a silencer, thereby preventing transcription from an
in the genome and characterizing the effects of QTL on trait adjacent promoter.
variation.
resistant mutant A mutant that can grow in a normally toxic
quaternary structure of a protein The multimeric constitution environment.
of a protein.
restriction enzyme An endonuclease that will recognize specific
random genetic drift Changes in allele frequency that result target nucleotide sequences in DNA and break the DNA chain at
because the genes appearing in offspring are not a perfectly those points; a variety of these enzymes are known, and they are
representative sampling of the parental genes. extensively used in genetic engineering.
828 Glossary
restriction fragment A DNA fragment resulting from cutting RNA polymerase holoenzyme The bacterial multisubunit
DNA with a restriction enzyme. complex composed of the four subunits of the core enzyme plus
the σ factor.
restriction fragment length polymorphism (RFLP) A
difference in DNA sequence between individuals or haplotypes RNA processing The collective term for the modifications
that is recognized as different restriction fragment lengths. For to eukaryotic RNA, including capping and splicing, that are
example, a nucleotide-pair substitution can cause a restriction- necessary before the RNA can be transported into the cytoplasm
enzyme-recognition site to be present in one allele of a gene and for translation.
absent in another. Consequently, a probe for this DNA region will
hybridize to different-sized fragments within restriction digests RNA sequencing A method used to determine the transcribed
of DNAs from these two alleles. regions of a genome within some specific cell population, tissue
sample, or organism.
restrictive temperature The temperature at which a
temperature-sensitive mutation expresses the mutant phenotype. RNA splicing A reaction found largely in eukaryotes that
removes introns and joins together exons in RNA.
retrotransposition A mechanism of transposition characterized
by the reverse flow of information from RNA to DNA. RNA world The name of a popular theory that RNA must have
been the genetic material in the first cells because only RNA is
retrotransposon A transposable element that uses reverse known to both encode genetic information and catalyze biological
transcriptase to transpose through an RNA intermediate. See reactions.
class 1 element.
rolling circle replication A mode of replication used by some
retrovirus An RNA virus that replicates by first being converted circular DNA molecules in bacteria (such as plasmids) in which
into double-stranded DNA. the circle seems to rotate as it reels out one continuous leading
reverse genetics An experimental procedure that begins strand.
with a cloned segment of DNA or a protein sequence and uses R plasmid A plasmid containing one or several transposons that
it (through directed mutagenesis) to introduce programmed bear resistance genes.
mutations back into the genome to investigate function.
rRNA (ribosomal RNA) A class of RNA molecules, encoded
reverse transcriptase An enzyme that catalyzes the synthesis in the nucleolar organizer, that have an integral (but poorly
of a DNA strand from an RNA template. understood) role in ribosome structure and function.
revertant An allele with wild-type function arising by the
mutation of a mutant allele; caused either by a complete reversal
of the original event or by a compensatory second-site mutation. s See selection coefficient.
RF See recombinant frequency; release factor. S See segregating sites or selection differential.
R factors Plasmids carrying genes that encode resistance to safe haven A site in the genome where the insertion of a
several antibiotics. transposable element is unlikely to cause a mutation, thus
RFLP See restriction fragment length polymorphism. preventing harm to the host.
ribonucleic acid See RNA. sample A small group of individual members or observations
meant to be representative of a larger population from which the
ribose The pentose sugar of RNA. group has been taken.
ribosomal RNA See rRNA. Sanger sequencing See dideoxy sequencing.
ribosome A complex organelle that catalyzes the translation scaffold (1) The central framework of a chromosome to which
of messenger RNA into an amino acid sequence; composed of the DNA solenoid is attached as loops; composed largely of
proteins plus rRNA. topoisomerase. (2) In genome projects, an ordered set of contigs
ribozyme An RNA with enzymatic activity—for instance, the in which there may be unsequenced gaps connected by paired-
self-splicing RNA molecules in Tetrahymena. end sequence reads.
RISC (RNA-induced silencing complex) A multisubunit screen A mutagenesis procedure in which essentially all
protein complex that associates with siRNAs and is guided to mutagenized progeny are recovered and are individually
a target mRNA by base complementarity. The target mRNA is evaluated for mutant phenotype; often the desired phenotype is
cleaved by RISC activity. marked in some way to enable its detection.
RNA (ribonucleic acid) A single-stranded nucleic acid similar SDSA See synthesis-dependent strand annealing.
to DNA but having ribose sugar rather than deoxyribose sugar
secondary structure of a protein A spiral or zigzag
and uracil rather than thymine as one of the bases.
arrangement of the polypeptide chain.
RNA blotting See Northern blotting.
second-division segregation pattern (MII pattern) A pattern
RNAi See RNA interference. of ascospore genotypes for a gene pair showing that the two
alleles separate into different nuclei only at the second meiotic
RNA interference (RNAi) A system in eukaryotes to control
division, as a result of a crossover between that gene pair and its
the expression of genes through the action of siRNAs and
centromere; can be detected only in a linear ascus.
miRNAs. See gene silencing.
second filial generation (F2) The progeny of a cross between
RNA polymerase An enzyme that catalyzes the synthesis of an
two individuals from the F1 generation.
RNA strand from a DNA template. Eukaryotes possess several
classes of RNA polymerase; structural genes encoding proteins segmental duplication Presence of two or more large
are transcribed by RNA polymerase II. nontandem repeats.
Glossary 829
segment-polarity gene In Drosophila, a member of a class of transcription correctly at the start site. The σ factor dissociates
genes that contribute to the final aspects of establishing the from the holoenzyme before RNA synthesis.
correct number of segments. Segment-polarity mutations cause
signal sequence The amino-terminal sequence of a secreted
a loss of or change in a comparable part of each of the body
protein; it is required for the transport of the protein through the
segments.
cell membrane.
segregating sites (S) The number of variable or polymorphic
nucleotide sites in a set of homologous DNA sequences. sign epistasis The dependency of the fitness advantage or
disadvantage of a new mutation on the mutations that have been
selection (1) An experimental procedure in which only a previously fixed.
specific type of mutant can survive. (2) The production of
different average numbers of offspring by different genotypes in simple inheritance A form of inheritance in which only one (or
a population as a result of the different phenotypic properties of a few) genes are involved and the environment has little or no
those genotypes. effect on the phenotype; categorical traits often exhibit simple
inheritance.
selection coefficient (s) The loss of fitness in (or selective
disadvantage of) one genotype relative to another genotype. simple sequence length polymorphism (SSLP) The existence in
the population of individuals showing different numbers of copies
selection differential (S) The difference between the mean of of a short simple DNA sequence at one chromosomal locus.
a population and the mean of the individual members selected to
be parents of the next generation. simple transposon A type of bacterial transposable element
containing a variety of genes that reside between short inverted
selection response (R) The amount of change in the average repeat sequences.
value of some phenotypic character between the parental
generation and the offspring generation as a result of the SINE See short interspersed element.
selection of parents. single nucleotide polymorphism (SNP) (snip) A nucleotide-
selective system A mutational selection technique that enriches pair difference at a given location in the genomes of two or more
the frequency of specific (usually rare) genotypes by establishing naturally occurring individuals.
environmental conditions that prevent the growth or survival of single-strand-binding (SSB) protein A protein that binds to
other genotypes. DNA single strands and prevents the duplex from re-forming
self To fertilize eggs with sperms from the same individual. before replication.
self-splicing intron The first example of catalytic RNA; in this siRNA See small interfering RNA.
case, an intron that can be removed from a transcript without the small interfering RNA (siRNA) Short double-stranded RNAs
aid of a protein enzyme. produced by the cleavage of long double-stranded RNAs by Dicer.
semiconservative replication The established model of DNA small nuclear RNA (snRNA) Any of several short RNAs found
replication in which each double-stranded molecule is composed in the eukaryotic nucleus, where they assist in RNA processing
of one parental strand and one newly polymerized strand. events.
semisterility (half-sterility) The phenotype of an organism SNP See single nucleotide polymorphism.
heterozygotic for certain types of chromosome aberration;
expressed as a reduced number of viable gametes and hence snRNA See small nuclear RNA.
reduced fertility. solo LTR A single copy of an LTR.
sequence assembly The compilation of thousands or millions SOS (repair) system An error-prone process whereby a bypass
of independent DNA sequence reads into a set of contigs and polymerase replicates past DNA damage at a stalled replicating
scaffolds. fork by inserting nonspecific bases.
sequence contig A group of overlapping cloned segments.
Southern blot The transfer of electrophoretically separated
serially reiterated structures Body parts that are members of fragments of DNA from a gel to an absorbent sheet such as paper;
repeated series, such as digits, ribs, teeth, limbs, and segments. this sheet is then immersed in a solution containing a labeled
probe that will bind to a fragment of interest.
sex chromosome A chromosome whose presence or absence is
correlated with the sex of the bearer; a chromosome that plays a specialized transduction The situation in which a particular
role in sex determination. phage will transduce only specific regions of the bacterial
chromosome.
sex linkage The location of a gene on a sex chromosome.
spliceosome The ribonucleoprotein processing complex that
Shine–Dalgarno sequence A short sequence in bacterial RNA
removes introns from eukaryotic mRNAs.
that precedes the initiation AUG codon and serves to correctly
position this codon in the P site of the ribosome by pairing splicing A reaction that removes introns and joins together
(through base complementarity) with the 3′ end of the 16S RNA exons in RNA.
in the 30S ribosomal subunit.
spontaneous lesion DNA damage occurring in the absence of
short interspersed element (SINE) A type of class 1 exposure to mutagens; due primarily to the mutagenic action of
transposable element that does not encode reverse transcriptase the by-products of cellular metabolism.
but is thought to use the reverse transcriptase encoded by LINEs.
spontaneous mutation A mutation occurring in the absence of
See also Alu.
exposure to mutagens.
sigma (σ) factor A bacterial protein that, as part of the RNA
SRY gene The maleness gene, residing on the Y chromosome.
polymerase holoenzyme, recognizes the 10 and 35 regions of
bacterial promoters, thus positioning the holoenzyme to initiate SSB See single-strand-binding protein.
830 Glossary
SSLP See short-sequence-length polymorphism. telomerase An enzyme that, with the use of a special small
RNA as a template, adds repetitive units to the ends of linear
standard deviation The square root of the variance.
chromosomes to prevent shortening after replication.
structure-based drug design The use of basic information
telomere The tip, or end, of a chromosome.
about cellular processes and machinery to develop drugs.
subfunctionalization A path of gene duplication and mutation temperate phage A phage that can become a prophage.
that produces paralogs with complementary functions. temperature-sensitive mutation A conditional mutation that
subunit As used in Chapter 9, a single polypeptide in a protein produces the mutant phenotype in one temperature range and
containing multiple polypeptides. the wild-type phenotype in another temperature range.
sum rule The probability that one or the other of two mutually template A molecular “mold” that shapes the structure or
exclusive events will occur is the sum of their individual sequence of another molecule; for example, the nucleotide
probabilities. sequence of DNA acts as a template to control the nucleotide
sequence of RNA during transcription.
supercontig See scaffold (2).
termination The last stage of transcription; it results in the
suppressor A secondary mutation that can cancel the effect of a release of the RNA and RNA polymerase from the DNA template.
primary mutation, resulting in wild-type phenotype.
terminus The end represented by the last added monomer
synergistic effect A feature of eukaryotic regulatory proteins in the unidirectional synthesis of a polymer such as RNA or a
for which the transcriptional activation mediated by the polypeptide.
interaction of several proteins is greater than the sum of the
effects of the proteins taken individually. tertiary structure of a protein The folding or coiling of the
secondary structure to form a globular molecule.
synonymous mutation A mutation that changes one codon for
an amino acid into another codon for that same amino acid. Also testcross A cross of an individual organism of unknown
called silent mutation. genotype or a heterozygote (or a multiple heterozygote) with a
tester.
synonymous substitution See synonymous mutation.
tester An individual organism homozygous for one or more
synteny A situation in which genes are arranged in similar recessive alleles; used in a testcross.
blocks in different species.
tetrad (1) Four homologous chromatids in a bundle in the first
synthesis-dependent strand annealing (SDSA) An error-free meiotic prophase and metaphase. (2) The four haploid product
mechanism for correcting double-strand breaks that occur after cells from a single meiosis.
the replication of a chromosomal region in a dividing cell.
tetraploid A cell having four chromosome sets; an organism
synthetic lethal Refers to a double mutant that is lethal, composed of such cells.
whereas the component single mutations are not.
theta (θ) structure An intermediate structure in the replication
of a circular bacterial chromosome.
tandem duplication Adjacent identical chromosome segments.
three-point testcross (three-factor testcross) A testcross in
targeted gene knockout The introduction of a null mutation which one parent has three heterozygous gene pairs.
into a gene by a designed alteration in a cloned DNA sequence
3′ untranslated region (3′ UTR) The region of the RNA
that is then introduced into the genome through homologous
transcript at the 3′ end downstream of the site of translation
recombination and replacement of the normal allele.
termination.
targeting A feature of certain transposable elements that
threshold trait A categorical trait for which the expression
facilitates their insertion into regions of the genome where they
of the different phenotypic states depends on a combination
are not likely to insert into a gene causing a mutation.
of multiple genetic and/or environmental factors that place an
target-site duplication A short direct-repeat DNA sequence individual above or below a critical value for trait expression.
(typically from 2 to 10 bp in length) adjacent to the ends of a
thymine (T) A pyrimidine base that pairs with adenine.
transposable element that was generated during the element’s
integration into the host chromosome. Ti plasmid A circular plasmid of Agrobacterium tumifaciens that
enables the bacterium to infect plant cells and produce a tumor
TATA-binding protein (TBP) A general transcription factor
(crown gall tumor).
that binds to the TATA box and assists in attracting other general
transcription factors and RNA polymerase II to eukaryotic Tn See transposon.
promoters.
topoisomerase An enzyme that can cut and re-form
TATA box A DNA sequence found in many eukaryotic genes that polynucleotide backbones in DNA to allow it to assume a more
is located about 30 bp upstream of the transcription start site. relaxed configuration.
tautomeric shift The spontaneous isomerization of a nitrogen trait More or less synonymous with phenotype.
base from its normal keto form to an alternative hydrogen-
trans-acting factor A diffusible regulatory molecule (almost
bonding enol (or imino) form.
always a protein) that binds to a specific cis-acting element.
tautomerization See tautometic shift.
trans conformation In a heterozygote with two mutant sites
TBP See TATA-binding protein. within a gene or gene cluster, the arrangement a1 +/+ a2.
TC-NER See transcription-coupled nucleotide-excision transcript The RNA molecule copied from the DNA template
repair. strand by RNA polymerase.
Glossary 831
transcription The synthesis of RNA from a DNA template. is responsible for a number of genetic diseases, such as fragile X
syndrome and Huntington disease.
transcriptional gene silencing Occurs when a gene cannot be
transcribed because it is located in heterochromatin. triploid A cell having three chromosome sets or an organism
composed of such cells.
transcription bubble The site at which the double helix is
unwound so that RNA polymerase can use one of the DNA trisomic Basically a diploid with an extra chromosome of one
strands as a template for RNA synthesis. type, producing a chromosome number of the form 2n + 1.
transcription-coupled nucleotide-excision repair trivalent Refers to the meiotic pairing arrangement of three
(TC-NER) A form of nucleotide-excision repair that is activated homologs in a triploid or trisomic.
by stalled transcription complexes and corrects DNA damage in tRNA (transfer RNA) A class of small RNA molecules that bear
transcribed regions of the genome. specific amino acids to the ribosome in the course of translation;
transduction The movement of genes from a bacterial donor to an amino acid is inserted into the growing polypeptide chain
a bacterial recipient with a phage as the vector. when the anticodon of the corresponding tRNA pairs with a
codon on the mRNA being translated.
transfer RNA See tRNA.
tumor-suppressor gene A gene encoding a protein that
transformation The directed modification of a genome by the suppresses tumor formation. The wild-type alleles of tumor-
external application of DNA from a cell of different genotype. suppressor genes are thought to function as negative regulators
transgene A gene that has been modified by externally applied of cell proliferation.
recombinant DNA techniques and reintroduced into the genome Turner syndrome An abnormal human female phenotype
by germ-line transformation. produced by the presence of only one X chromosome (XO).
transgenic organism An organism whose genome has been two-hybrid test A method for detecting protein–protein
modified by externally applied new DNA. interactions, typically performed in yeast.
transition A type of nucleotide-pair substitution in which a Ty element A yeast LTR-retrotransposon; the first isolated from
purine replaces another purine or in which a pyrimidine replaces any organism.
another pyrimidine—for example, G–C to A–T.
translation The ribosome- and tRNA-mediated production of
a polypeptide whose amino acid sequence is derived from the U See uracil; uridine.
codon sequence of an mRNA molecule. UAS See upstream activation sequence.
translesion DNA synthesis A damage-tolerance mechanism in ubiquitin A protein that, when attached as a multicopy chain
eukaryotes that uses bypass polymerases to replicate DNA past a to another protein, targets that protein for degradation by a
site of damage. protease called the 26S proteasome. The addition of single
translesion polymerases A family of DNA polymerases that ubiquitin residues to a protein can change protein–protein
can continue to replicate DNA past a site of damage that would interactions, as in the case of PCNA and bypass polymerases.
halt replication by the normal replicative polymerase. Also ubiquitinization The process of adding ubiquitin chains to a
known as bypass polymerases. protein targeted for degradation.
translocation The relocation of a chromosomal segment to a unbalanced rearrangement A rearrangement in which
different position in the genome. chromosomal material is gained or lost in one chromosome set.
transposase An enzyme encoded by transposable elements that uniparental inheritance Inheritance pattern in which the
undergo conservative transposition. progeny have the genotype and phenotype of one parent only, for
transpose To move from one location in the genome to another; example, inheritance of mitochrondrial genomes.
said of a mobile genetic element. univalent A single unpaired meiotic chromosome, as is often
transposition A process by which mobile genetic elements move found in trisomics and triploids.
from one location in the genome to another. unselected marker In a bacterial recombination experiment,
transposon (Tn) A mobile piece of DNA that is flanked by an allele scored in progeny for the frequency of its cosegregation
terminal repeat sequences and typically bears genes encoding with a linked selected allele.
transposition functions. Bacterial transposons can be simple or unstable phenotype A phenotype characterized by frequent
composite. reversion either somatically or germinally or both due to the
interaction of transposable elements with a host gene.
transposon tagging A method used to identify and isolate a
host gene through the insertion of a cloned transposable element upstream Refers to a DNA or RNA sequence located on the 5′
in the gene. side of a point of reference.
transversion A type of nucleotide-pair substitution in which a upstream activation sequence (UAS) A DNA sequence of
pyrimidine replaces a purine or vice versa—for example, G–C to yeast located 5′ of the gene promoter; a transcription factor binds
T–A. to the UAS to positively regulate gene expression.
trinucleotide repeat See triplet expansion. uracil (U) A pyrimidine base in RNA in place of the thymine
found in DNA.
triplet Three nucleotide pairs that compose a codon.
uridine (U) A nucleoside having uracil as its base.
triplet expansion The expansion of a 3-bp repeat from a
relatively low number of copies to a high number of copies that UTR See 3′ untranslated region; 5′ untranslated region.
832 Glossary
variable number tandem repeat (VNTR) A chromosomal X chromosome One of a pair of sex chromosomes, distinguished
locus at which a particular repetitive sequence is present in from the Y chromosome.
different numbers in different individuals or in the two different
xeroderma pigmentosum (XP) A disorder caused by mutations
homologs in one diploid individual.
in the transcription-coupled nucleotide-excision-repair system
variance A statistical measure used to quantify the degree to that leads to the frequent development of skin cancers.
which the trait values of individuals deviate from the population
X linkage The inheritance pattern of genes found on the
mean.
X chromosome but not on the Y chromosome.
vector See cloning vector.
XP See xeroderma pigmentosum.
vertical transmission Inheritance of DNA from a member of a
previous generation.
virulent phage A phage that cannot become a prophage; Y chromosome One of a pair of sex chromosomes, distinguished
infection by such a phage always leads to lysis of the host cell. from the X chromosome.
virus A particle consisting of nucleic acid and protein that must Y linkage The inheritance pattern of genes found on the
infect a living cell to replicate and reproduce. Y chromosome but not on the X chromosome (rare).
833
834 Answers to Selected Problems
b. Map distance can be calculated by using the formula RF = A–B: 100%(130 + 140 + 17 + 13)/1000 = 30 m.u.
[NPD + (1/2)T]100%. In this case, the frequency of NPD is 6/200, or B–D: 100%(31 + 39 + 17 + 13)/1000 = 10 m.u.
3%, and the frequency of T is 82/200, or 41%. Map distance between Cross 2 reduces to
these two loci is therefore 23.5 cM.
P A/A · C/C · E/E × a/a · c/c · e/e
F1 A/a · C/c · E/e × a/a · c/c · e/e
23.5 cM The testcross progeny indicate that these three genes are linked.
pro his
c. To correct for multiple crossovers, the Perkins formula can be Testcross ACE 243 parental
used. Thus, map distance = (T + 6NPD)50%, or (0.41 + 0.18)50% = progeny ace 237 parental
29.5 cM. Ace 62 CO A–C
47. a. The cross is W e F/W e F × w E f/w E f and the F1 are W e F/w E f. aCE 58 CO A–C
Progeny that are ww ee ff from a testcross of this F1 must have inherited ACe 155 CO C–E
one of the double-crossover recombinant chromosomes (w e f ). With acE 165 CO C–E
the assumption of no interference, the expected percentage of double aCe 46 DCO
crossovers is 8% × 24% = 1.92%, half of which is 0.96%. AcE 34 DCO
b. To obtain a ww ee ff progeny from a self cross of this F1 requires
the independent inheritance of two doubly recombinant w e f chromo- A–B: 100%(62 + 58 + 46 + 34)/1000 = 20 m.u.
somes. Its chances of happening, based on the answer to part a of this B–D: 100%(155 + 165 + 46 + 34)/1000 = 40 m.u.
problem, are 0.96 × 0.96 = 0.009%.
The map that accommodates all the data is
53. The short answer is that the results tell us little about linkage.
Although the number of recombinants (3) is less than the number of E C A B D
parentals (5), one can have no confidence in the fact that the RF is
<50%. The main problem is that the sample size is small, so just one
individual more or less in a genotypic class can dramatically affect the 40 m.u. 20 m.u. 30 m.u. 10 m.u.
ratios. Even the chi-square test is unreliable at such small sample sizes. b. Interference (I) = 1 - [(observed DCO)/(expected DCO)]
It is probably safe to say that there is not tight linkage because several
recombinants were found in a relatively small sample. However, one For cross 1: I = 1 - {30/[(0.30)(0.10)(1000)]} = 1 - 1 = 0,
cannot distinguish between more distant linkage and independent no interference
assortment. A larger sample size is required. For cross 2: I = 1 - {80/[(0.20)(0.40)(1000)]} = 1 - 1 = 0,
58. The data given for each of the three-point testcrosses can be used no interference
to determine the gene order when one realizes that the rarest recom- 69. a. and b. The data support the independent assortment of two
binant classes are the result of double-crossover events. A comparison genes (call them arg1 and arg2). The cross becomes arg1 ; arg2+ ×
of these chromosomes with the “parental” types reveals that the alleles arg1+ ; arg2 and the resulting tetrads are
that have switched represent the gene in the middle.
For example, in data set 1, the most common phenotypes (+ + + 4 : 0 (PD) 3 : 1 (T) 2 : 2 (NPD)
and a b c) represent the parental-allele combinations. A comparison of arg1 ;arg2+ arg1 ; arg2+ arg1 ; arg2
these phenotypes with the rarest phenotypes of this data set (+ b c and
arg1 ; arg2+ arg1+ ; arg2 arg1 ; arg2
a + +) indicates that the a gene is recombinant and must be in the
arg1+ ; arg2 arg1 ; arg2 arg1+ ; arg2+
middle. The gene order is b a c.
For data set 2, + b c and a + + (the parentals) should be com- arg1+ ; arg2 arg1+ ; arg2+ arg1+ ; arg2+
pared with + + + and a b c (the rarest recombinants) to indicate that Because PD = NPD, the genes are unlinked.
the a gene is in the middle. The gene order is b a c.
For data set 3, compare + b + and a + c with a b + and + + c, Chapter 5
which gives the gene order b a c.
19. An Hfr strain has the fertility factor F integrated into the chromo-
For data set 4, compare + + c and a b + with + + + and a b c,
some. An F+ strain has the fertility factor free in the cytoplasm. An
which gives the gene order a c b.
F- strain lacks the fertility factor.
For data set 5, compare + + + and a b c with + + c and a b +,
23. Although the interrupted-mating experiments will yield the
which gives the gene order a c b.
gene order, it will be relative only to fairly distant markers. Thus, the
64. a. Cross 1 reduces to
mutation cannot be precisely located with this technique. Generalized
P A/A · B/B · D/D × a/a · b/b · d/d transduction will yield information with regard to very close markers,
F1 A/a · B/b · D/d × a/a · b/b · d/d which makes it a poor choice for the initial experiments because of the
massive amount of screening that would have to be done. Together, the
The testcross progeny indicate that these three genes are linked two techniques allow, first, for localization of the mutant (interrupted
(CO = crossover, DCO = double crossover). mating) and, second, for the precise determination of the location of
Testcross ABD 316 parental the mutant (generalized transduction) within the general region.
28. The best explanation is that the integrated F factor of the Hfr
progeny abd 314 parental looped out of the bacterial chromosome abnormally and is now an F′
ABd 31 CO B–D that contains the pro+ gene. This F′ is rapidly transferred to F- cells,
abD 39 CO B–D converting them into pro+ (and F+).
Abd 130 CO A–B 33. The expected number of double recombinants is
aBD 140 CO A–B (0.01)(0.002)(100,000) = 2. Interference = 1 - (observed double cross-
AbD 17 DCO over/expected double crossover) = 1 - 5/2 = -1.5. By definition, the
aBd 13 DCO interference is negative.
836 Answers to Selected Problems
37. a. This process appears to be specialized transduction. It is char- 27. a. The sex ratio is expected to be 1 : 1.
acterized by the transduction of specific markers based on the position b. The female parent was heterozygous for an X-linked recessive
of the integration of the prophage. Only those genes near the integra- lethal allele, which would result in 50% fewer males than females.
tion site are possible candidates for misincorporation into phage par- c. Half of the female progeny should be heterozygous for the
ticles that then deliver this DNA to recipient bacteria. lethal allele, and half should be homozygous for the nonlethal allele.
b. The only media that supported colony growth were those lacking Individually mate the F1 females and determine the sex ratio of their
either cysteine or leucine. These media selected for cys+ or leu+ transduc- progeny.
tants and indicate that the prophage is located in the cys–leu region. 30. a. The mutations are in two different genes because the hetero-
42. No. Closely linked loci would be expected to be cotransduced; the karyon is prototrophic (the two mutations complemented each other).
greater the contransduction frequency, the closer the loci are. Because b. leu1+ ; leu2- and leu1- ; leu2+
only 1 of 858 metE+ was also pyrD+, the genes are not closely linked. c. With independent assortment, expect
The lone metE+ pyrD+ could be the result of cotransduction, it could be 1
4
leu1+ ; leu2-
a spontaneous mutation of pyrD to pyrD+, or it could be the result of 1
leu1- ; leu2+
4
coinfection by two separate transducing phages.
47. a. To determine which genes are close, compare the frequencies 1
4
leu1- ; leu2-
of double transformants. Pair-by-pair testing gives low values when- 1
4
leu1+ ; leu2+
ever B is included but fairly high rates when any drug but B is included. 34. a. P A/a (frizzle) × A/a (frizzle)
This finding suggests that the gene for B resistance is not close to the F1 1 A/A (normal) : 2 A/a (frizzle) : 1 a/a (woolly)
other three genes. b. If A/A (normal ) is crossed with a/a (woolly), all offspring will
b. To determine the relative order of genes for resistance to A, C, be A/a (frizzle).
and D, compare the frequencies of double and triple transformants. The 36. The production of black offspring from two pure-breeding reces-
frequency of resistance to AC is approximately the same as that of resis- sive albino parents is possible if albinism results from mutations in two
tance to ACD, which strongly suggests that D is in the middle. Addition- different genes. If the cross is designated
ally, the frequency of AD coresistance is higher than AC (suggesting that
the gene for A resistance is closer to D than to C) and the frequency of CD A/A ; b/b × a/a ; B/B
is higher than AC (suggesting that C is closer to D than to A). all offspring would be
51. To isolate the specialized transducing particles of phage φ80 that A/a ; B/b
carried lac+, the researchers would have had to lysogenize the strain and they would have a black phenotype because of complementation.
with φ80, induce the phage with UV, and then use these lysates to 40. The purple parent can be either A/a ; b/b or a/a ; B/b for this
transduce a Lac- strain to Lac+. The Lac+ colonies would then have been answer. Assume that the purple parent is A/a ; b/b. The blue parent
used to make a new lysate, which should have been highly enriched for must be A/a ; B/b.
the lac+ transducing phage. 43. The cross is gray × yellow, or A/– ; R/– × A/– ; r/r. The F1
progeny are
3 1 3 1
Chapter 6 8 yellow 8 black 8 gray 8 white
13. With the assumption of homozygosity for the normal gene, the mat- For white progeny, both parents must carry an r and an a allele.
ing is A/A · b/b × a/a · B/B. The children would be normal, A/a · B/b. Now the cross can be rewritten as A/a ; R/r × A/a ; r/r.
16. a. Red b. Purple 46. The original brown dog is w/w ; b/b and the original white dog is
W/W ; B/B. The F1 progeny are W/w ; B/b and the F2 progeny are
c. 9 M1/– ; M2/– purple
9 W/– ; B/– white
3 m1/m1 ; M2/– blue 3 w/w ; B/– black
3 M1/– ; m2/m2 red 3 W/– ; b/b white
1 m1/m1 ; m2/m2 white 1 w/w ; b/b brown
50. Pedigrees such as this one are quite common. They indicate lack
d. The mutant alleles do not produce functional enzyme. However,
of penetrance due to epistasis or environmental effects. Individual A
enough functional enzyme must be produced by the single wild-type
must have the dominant autosomal gene.
allele of each gene to synthesize normal levels of pigment.
52. a. Let WO = oval, WS = sickle, and WR = round. The three
20. a. The original cross was a dihybrid cross. Both oval and purple
crosses are
must represent an incomplete dominant phenotype.
Cross 1: WS/WS × WR/Y → WS/WR and WS/Y
b. A long, purple × oval, purple cross is as follows:
Cross 2: WS/WS × WS/Y → WS/WR and WR/Y
P L/L ; R/R′ × L/L′ ; R/R′′ Cross 3: WS/WS × WO/Y → WO/WS and WS/Y
1
1
4
R/R 8 long, red b. WO/WS × WR/Y
F1 1
L/L × 21 R/R′ 1
long, purple 1
4
WO/WR female oval
2 4
1 1
WS/WR female sickle
1
4
R′/R′ 8 long, white 4
1
4
WO/Y male oval
1
1
4
R/R 8 oval, red 1
WS/Y male sickle
4
21 L/L′ × 21 R/R′ 1
4
oval, purple 55. a. The genotypes are
1
R′/R′ 1
oval, white
2 4 P B/B ; i/i × b/b ; I/I
23. Parents Child F1 B/b ; I/i hairless
a. AB × O B F2 9 B/– ; I/– hairless
b. A × O A 3 B/– ; i/i straight
c. A × AB AB 3 b/b ; I/– hairless
d. O × O O 1 b/b ; i/i bent
Answers to Selected Problems 837
b. The genotypes are B/b ; I/i × B/b ; i/i. (2) To obtain a 3 red : 1 white ratio, two alleles must be segregat-
58. There are a total of 159 progeny that should be distributed in a ing and they cannot be within the same gene. A representative cross is
9 : 3 : 3 : 1 ratio if the two genes are assorting independently. You can see R1/r1 ; R2/r2 ; r3/r3 × r1/r1 ; r2/r2 ; r3/r3.
that (3) To obtain a 7 red : 1 white ratio, three alleles must be segre-
Observed Expected gating, and they cannot be within the same gene. The cross is R1/r1 ;
R2/r2 ; R3/r3 × r1/r1 ; r2/r2 ; r3/r3.
88 P/– ; Q/– 90 d. The formula is 1 - ( 41 )n, where n = the number of loci that are
32 P/– ; q/q 30 segregating in the representative crosses in part c.
25 p/p ; Q/– 30
75. a. and b. Epistasis is implicated, and the homozygous recessive
14 p/p ; q/q 10
white genotype seems to block the production of color by a second gene.
Assume the following dominance relations: red > orange > yellow.
61. a. Cross-feeding is taking place, whereby a product made by one
Let the alleles be designated as follows:
strain diffuses to another strain and allows growth of the second
strain. red AR
b. For cross-feeding to take place, the growing strain must have a orange AO
block that is present earlier in the metabolic pathway than the block in
the strain from which the growing strain is obtaining the product for yellow AY
growth. Crosses 1 through 3 now become
c. The data suggest that the metabolic pathway is
trpE → trpD → trpB P AO/AO × AY/AY AR/AR × AO/AO AR/AR × AY/AY
d. Without some tryptophan, there would be no growth at all, F1 AO/AY AR/AO AR/AY
and the cells would not have lived long enough to produce a product F2 3 AO/– : 1 AY/AY 3 AR/– : 1 AO/AO 3 AR/– : 1 AY/AY
that could diffuse
Cross 4: To do this cross, you must add a second gene. You must
63. a. The best explanation is that Marfan’s syndrome is inherited as also rewrite crosses 1 through 3 to include the second gene. Let B allow
a dominant autosomal trait. color expression and b block its expression, producing white. The first
b. The pedigree shows both pleiotropy (multiple affected traits) three crosses become
and variable expressivity (variable degree of expressed phenotype).
c. Pleiotropy indicates that the gene product is required in a P AO/AO ; B/B × AY/AY ; B/B
number of different tissues, organs, or processes. When the gene is AR/AR ; B/B × AO/AO ; B/B
mutant, all tissues needing the gene product will be affected. Variable AR/AR ; B/B × AY/AY ; B/B
expressivity of a phenotype for a given genotype indicates modifica-
tion by one or more other genes, random noise, or environmental F1 AO/AY ; B/B
effects. AR/AO ; B/B
66. a. This type of gene interaction is called epistasis. The phenotype AR/AY ; B/B
of e/e is epistatic to the phenotypes of B/– or b/b. F2 3 AO/– ; B/B : 1 AY/AY ; B/B
b. The inferred genotypes are as follows: 3 AR/– ; B/B : 1 AO/AO ; B/B
I 1 (B/b E/e) 2 (B/b E/e) 3 AR/– ; B/B : 1 AY/AY ; B/B
II 1 (b/b E/e) 2 (B/b E/e) 3 (–/– e/e) 4 (b/b E/–) The fourth cross is
5 (B/b E/e) 6 (b/b E/e)
III 1 (B/b E/–) 2 (–/b e/e) 3 (b/b, E/–) 4 (B/b E/–) P AR/AR ; B/B × AR/AR ; b/b
5 (b/b E/–) 6 (B/b E/–) 7 (–/b e/e) F1 AR/AR ; B/b
69. a. A multiple-allelic series has been detected: superdouble > F2 3 AR/AR ; B/– : 1 AR/AR ; b/b
single > double. Cross 5: To do this cross, note that there is no orange. Therefore,
b. Although the explanation for part a does rationalize all the the two parents must carry the alleles for red and yellow, and the
crosses, it does not take into account either the female sterility or the expression of red must be blocked.
origin of the superdouble plant from a double-flowered variety.
71. a. A trihybrid cross would give a 63 : 1 ratio. Therefore, there are P AY/AY ; B/B × AR/AR ; b/b
three R loci segregating in this cross. F1 AR/AY ; B/b
b.
F2 9 AR/– ; B/– red
P R1/R1 ; R2/R2 ; R3/R3 × r1/r1 ; r2/r2 ; r3/r3 3 AR/– ; b/b white
F1 R1/r1 ; R2/r2 ; R3/r3 3 AY/AY ; B/– yellow
F2 27 R1/– ; R2/– ; R3/– red 1 AY/AY ; b/b white
9 R1/– ; R2/– ; r3/r3 red Cross 6: This cross is identical with cross 5 except that orange
9 R1/– ; r2/r2 ; R3/– red replaces yellow.
9 r1/r1 ; R2/– ; R3/– red
3 R1/– ; r2/r2 ; r3/r3 red P AO/AO ; B/B × AR/AR ; b/b
3 r1/r1 ; R2/– ; r3/r3 red
F1 AR/AO ; B/b
3 r1/r1 ; r2/r2 ; R3/– red
1 r1/r1 ; r2/r2 ; r3/r3 white F2 9 AR/– ; B/– red
3 AR/– ; b/b white
c. (1) To obtain a 1 : 1 ratio, only one of the genes can be hetero- 3 AO/AO ; B/– orange
zygous. A representative cross is R1/r1 ; r2/r2 ; r3/r3 × r1/r1 ; r2/r2 ; r3/r3. 1 AO/AO ; b/b white
838 Answers to Selected Problems
Cross 7: In this cross, yellow is suppressed by b/b. single stranded. The phage would first have to synthesize a comple-
mentary strand before it could begin to make multiple copies of itself.
P AR/AR ; B/B × AY/AY ; b/b
F1 AR/AY ; B/b Chapter 8
F2 9AR/– ; B/– red 11. In prokaryotes, translation is beginning at the 5′ end while the
3 AR/– ; b/b white 3′ end is still being synthesized. In eukaryotes, processing (capping,
3 AY/AY ; B/– yellow splicing) is taking place at the 5′ end while the 3′ end is still being
1 AY/AY ; b/b white synthesized.
17. Yes. Both replication and transcription are performed by large, mul-
77. a. Intercrossing mutant strains that all have a common recessive tisubunit molecular machines (the replisome and RNA polymerase II,
phenotype is the basis of the complementation test. This test is respectively), and both require helicase activity at the fork of the
designed to identify the number of different genes that can mutate to bubble. However, transcription proceeds in only one direction and
a particular phenotype. In this problem, if the progeny of a given cross only one DNA strand is copied.
still express the wiggle phenotype, the mutations fail to complement 19. a. The original sequence represents the -35 and -10 consensus
and are considered alleles of the same gene; if the progeny are wild sequences (with the correct number of intervening spaces) of a bacte-
type, the mutations complement and the two strains carry mutant rial promoter. The s factor, as part of the RNA polymerase holoen-
alleles of separate genes. zyme, recognizes and binds to these sequences.
b. These data identify five complementation groups (genes). b. The mutated (transposed) sequences will not be a binding site
c. mutant 1: a1/a1 · b+/b+ · c+/c+ · d+/d+ · e+/e+ (although for the s factor. The orientation of the two regions with respect to each
only the mutant alleles are usually listed) other is not correct; therefore, they will not be recognized as a promoter.
mutant 2: a+/a+ · b2/b2 · c+/c+ · d+/d+ · e+/e+ 24. Self-splicing introns are capable of excising themselves from a pri-
mutant 5: a5/a5 · b+/b+ · c+/c+ · d+/d+ · e+/e+ mary transcript without the need of additional enzymes or energy
1
5 hybrid: a1/a5 · b+/b+ · c+/c+ · d+/d+ · e+/e+ source. They are one of many examples of RNA molecules that are
phenotype: wiggles catalytic, and, for this property, they are also known as ribozymes.
Conclusion: 1 and 5 are both mutant for gene A. With this additional function, RNA is the only known biological mole-
(The relevant cross a+/a+ · b2/b2 × a2/a2 · b5/b5 gives the follow- cule to encode genetic information and catalyze biological reactions.
ing hybrid.) In simplest terms, life possibly began with an RNA molecule or group
2 of molecules that evolved the ability to self-replicate.
5 hybrid: a+/a5 · b+/b2 · c+/c+ · d+/d+ · e+/e+ 29. Double-stranded RNA, composed of a sense strand and a comple-
phenotype: wild type mentary antisense strand, can be used in C. elegans (and likely all organ-
Conclusion: 2 and 5 are mutant for different genes. isms) to selectively prevent the synthesis of the encoded gene product (a
discovery for which the 2006 Nobel Prize in Physiology or Medicine was
Chapter 7 awarded). This process, called gene silencing, blocks the synthesis of the
encoded protein from the endogenous gene and is thus equivalent to
6. The DNA double helix is held together by two types of bonds: “knocking out” the gene. To test whether a specific mRNA encodes an
covalent and hydrogen. Covalent bonds are found within each linear essential embryonic protein, inject the double-stranded RNA produced
strand and strongly bond the bases, sugars, and phosphate groups from the mRNA into eggs or very early embryos, thus activating the
(both within each component and between components). Hydrogen RNAi pathway. The effects of knocking out the specified gene product
bonds are found between the two strands; a hydrogen bond forms can then be followed by observing what happens in these embryos com-
between a base in one strand and a base in the other strand in comple- pared with controls. If the encoded protein is essential, embryonic devel-
mentary pairing. These hydrogen bonds are individually weak but col- opment should be perturbed when your gene is silenced.
lectively quite strong.
9. Helicases are enzymes that disrupt the hydrogen bonds that hold
the two DNA strands together in a double helix. This breakage is Chapter 9
required for both RNA and DNA synthesis. Topoisomerases are 13. a. and b. 5′ UUG GGA AGC 3′
enzymes that create and relax supercoiling in the DNA double helix. c. and d. With the assumption that the reading frame starts at the
The supercoiling itself is a result of the twisting of the DNA helix when first base,
the two strands separate. NH3 - Leu - Gly - Ser - COOH
11. No. The information of DNA depends on a faithful copying mech-
For the bottom strand, the mRNA is 5′ GCU UCC CAA 3′ and, with
anism. The strict rules of complementarity ensure that replication and
the assumption that the reading frame starts at the first base, the cor-
transcription are reproducible.
responding amino acid chain is
13. The chromosome would become hopelessly fragmented.
15. b. The RNA would be more likely to contain errors. NH3 - Ala - Ser - Gln - COOH
19. If the DNA is double stranded, A = T, G = C, and A + T + C + G = 17. There are three codons for isoleucine: 5′ AUU 3′, 5′ AUC 3′, and 5′
100%. If T = 15%, then C = [100 - 15(2)]/2 = 35%. AUA 3′. Possible anticodons are 3′ UAA 5′ (complementary), 3′ UAG 5′
20. If the DNA is double stranded, G = C = 24% and A = T = 26%. (complementary), and 3′ UAI 5′ (wobble). Although complementary,
24. Yes. DNA replication is also semiconservative in diploid eukaryotes. 5′ UAU 3′ also would base-pair with 5′ AUG 3′ (methionine) owing to
26. 5′ . . . .CCTTAAGACTAACTACTTACTGGGATC. . . . 3′ wobble and therefore would not be an acceptable alternative.
28. Without functional telomerase, the telomeres would shorten at 22. Quaternary structure is due to the interactions of subunits of a
each replication cycle, leading to eventual loss of essential coding infor- protein. In this example, the enzyme activity being studied may be
mation and death. In fact, some current observations indicate that that of a protein consisting of two different subunits. The polypeptides
decline or loss of telomerase activity plays a role in the mechanism of of the subunits are encoded by separate and unlinked genes.
aging in humans. 26. No. The enzyme may require posttranslational modification to be
30. Chargaff’s rules are that A = T and G = C. Because these equali- active. Mutations in the enzymes required for these modifications
ties are not observed, the most likely interpretation is that the DNA is would not map to the isocitrate lyase gene.
Answers to Selected Problems 839
29. With the assumption that all three mutations of gene P are non- In other words, it is a mutation that inactivates the allosteric site that
sense mutations, three different possible stop codons (amber, ochre, or binds to inducer but does not affect the ability of the repressor to bind
opal) might be the cause. A suppressor mutation would be specific to to the operator site. The dominance of the S mutation is due to the
one type of nonsense codon. For example, amber suppressors would binding of the mutant repressor, even under circumstances when nor-
suppress amber mutants but not opal or ochre. mal repressor does not bind to DNA (i.e., in the presence of inducer).
33. Single amino acid changes can result in changes in protein fold- The constitutive reverse mutations that map to lacI are mutational
ing, protein targeting, or post-translational modifications. Any of these events that inactivate the ability of this repressor to bind to the operator.
changes could give the results indicated. The constitutive reverse mutations that map to the operator alter the
41. If the anticodon on a tRNA molecule was altered by mutation to operator DNA sequence such that it will not permit binding to any
be four bases long, with the fourth base on the 5′ side of the anticodon, repressor molecules (wild-type or mutant repressor).
it would suppress the insertion. Alterations in the ribosome also can 24. Mutations in cI, cII, and cIII would all affect lysogeny: cI encodes
induce frameshifting. the repressor, cII encodes an activator of PRE, and cIII encodes a pro-
42. f, d, j, e, c, i, b, h, a, g. tein that protects cII from degradation. Mutations in N (an antitermi-
nator) also would affect lysogeny because its function is required for
Chapter 10 transcription of the cII and cIII genes, but it is also necessary for genes
having roles in lysis. Mutations in the gene encoding the integrase
14. Ligase is an essential enzyme within all cells that seals breaks in (int) also would affect the ability of a mutant phage to lysogenize.
the sugar–phosphate backbone of DNA. In DNA replication, ligase
joins Okazaki fragments to create a continuous strand, and, in cloning,
Chapter 12
it is used to join the various DNA fragments with the vector. If it were
not added, the vector and cloned DNA would simply fall apart. 12. In general, the ground state of a bacterial gene is “on.” Thus, tran-
15. Each cycle takes 5 minutes and doubles the DNA. In 1 hour, there scription initiation is prevented or reduced if the binding of RNA poly-
would be 12 cycles; so the DNA would be amplified 212 = 4096-fold. merase is blocked. In contrast, the ground state of eukaryotes is “off.”
18. You could isolate DNA from the suspected transgenic plant and Thus, the transcriptional machinery (including RNA polymerase II
probe for the presence of the transgene by Southern hybridization. and associated general transcription factors) cannot bind to the pro-
22. a. The transformed phenotype will map to the same locus. If gene moter in the absence of other regulatory proteins.
replacement was due to double crossing over, the transformed cells 16. Among the mutations that might prevent a strain of yeast from
will not contain vector DNA. If a single crossing over took place, the switching mating type would be mutations in the HO and HMRa genes.
entire vector will now be part of the linear Neurospora chromosome. The HO gene encodes an endonuclease that cuts the DNA to initiate
b. The transformed phenotype will map to a different locus from switching and the HMRa locus contains the “cassette” of unexpressed
that of the auxotroph if the transforming gene was inserted ectopically genetic information for the MATa mating type.
(i.e., at another location). Ecotopic incorporation could also be inferred 19. The term epigenetic inheritance is used to describe heritable alter-
by reverse PCR. ations in which the DNA sequence itself is not changed. It can be
23. Size, translocations between known chromosomes, and hybridiza- defined operationally as the inheritance of chromatin states from one
tion to probes of known location can all be useful in identifying which cell generation to the next. Genomic imprinting, X-chromosome inac-
band on a pulsed-field gel corresponds to a particular chromosome. tivation, and position-effect variegation are several such examples.
33. The region of DNA that encodes tyrosinase in “normal” mouse 23. The inheritance of chromatin structure is thought to be respon-
genomic DNA contains two EcoRI sites. Thus, after EcoRI digestion, three sible for the inheritance of epigenetic information. This inheritance is
different-size fragments hybridize to the cDNA clone. When genomic due to the inheritance of the histone code and may also include the
DNA from certain albino mice is subjected to similar analysis, no DNA inheritance of DNA methylation patterns.
fragments contain complementary sequences to the same cDNA. This 36. A gene not expressed owing to alteration of its DNA sequence will
result indicates that these mice lack the ability to produce tyrosinase never be expressed and will be inherited from generation to genera-
because the DNA that encodes the enzyme must have been deleted. tion. An epigenetically inactivated gene may still be regulated.
36. The promoter and control regions of the plant gene of interest Chromatin structure can change in the course of the cell cycle; for
must be cloned and joined in the correct orientation with the gluc- example, when transcription factors modify the histone code.
uronidase gene, which places the reporter gene under the same 38. Chromatin structure greatly affects gene expression. Transgenes
transcriptional control as the gene of interest. The text describes the inserted into regions of euchromatin would more likely be capable of
methodology used to create transgenic plants. Transform plant cells with expression than those inserted into regions of heterochromatin.
the reporter gene construct, and, as discussed in the text, grow them
into transgenic plants. The glucuronidase gene will now be expressed Chapter 13
in the same developmental pattern as that of the gene of interest, and 12. The primary pair-rule gene eve (even-skipped) would be expressed
its expression can be easily monitored by bathing the plant in an in seven stripes along the A–P axis of the late blastoderm.
X-Gluc solution and assaying for the blue reaction product. 15. If you diagram these results, you will see that the deletion of a gene
that functions posteriorly allows the next most anterior segments to
Chapter 11 extend in a posterior direction. Deletion of an anterior gene does not allow
9. OC mutants are changes in the DNA sequence of the operator that extension of the next most posterior segment in an anterior direction. The
impair the binding of the lac repressor. Because an operator controls gap genes activate Ubx in both thoracic and abdominal segments, whereas
only the genes on the same DNA strand, it is cis (on the same strand). the abd-A and Abd-B genes are activated only in the middle and posterior
12. A gene is turned off or inactivated by the “modulator” (usually abdominal segments. The functioning of the abd-A and Abd-B genes in
called a repressor) in negative control, and the repressor must be removed those segments somehow prevents Ubx expression. However, if the abd-A
for transcription to take place. A gene is turned on by the “modulator” and Abd-B genes are deleted, Ubx can be expressed in these regions.
(usually called an activator) in positive control, and the activator must be 18. a. A pair-rule gene.
added or converted into an active form for transcription to take place. b. Look for expression of the mRNA from the candidate gene in a
21. The S mutation is an alteration in lacI such that the repressor repeating pattern of seven stripes along the A–P axis of the developing
protein binds to the operator, regardless of whether inducer is present. embryo.
840 Answers to Selected Problems
c. No. An embryo mutant for the gap gene Krüppel would be miss- Chapter 15
ing many anterior segments. This effect would be epistatic to the 9. Boeke, Fink, and their coworkers demonstrated that transposition
expression of a pair-rule gene. of the Ty element in yeast is through an RNA intermediate. They con-
21. a. The homeodomain is a conserved protein domain containing structed a plasmid by using a Ty element into which they inserted not
60 amino acids found in a significant number of transcription factors. only a promoter that can be activated by galactose but also an intron
Any protein that contains a functional homeodomain is almost cer- into the Ty element’s coding region. First, the frequency of transposi-
tainly a sequence-specific DNA-binding transcription factor. tion was greatly increased by the addition of galactose, indicating that
b. The eyeless gene (named for its mutant phenotype) regulates an increase in transcription (and production of RNA) was correlated to
eye development in Drosophila. You would expect that it is expressed rates of transposition. More importantly, after transposition, they
only in those cells that will give rise to the eyes. To test this prediction, found that the newly transposed Ty DNA lacked the intron sequence.
visualization of the location of eyeless mRNA expression by in situ Because intron splicing takes place only during RNA processing, there
hybridization and the Eyeless protein by immunological methods must have been an RNA intermediate in the transposition event.
should be performed. Through genetic manipulation, the eyeless gene 13. Some transposable elements have evolved strategies to insert into
can be expressed in tissues in which it is not ordinarily expressed. For safe havens, regions of the genome where they will do minimal harm.
example, when eyeless is turned on in cells destined to form legs, eyes Safe havens include duplicate genes (such as tRNA or rRNA genes) and
form on the legs. other transposable elements. Safe havens in bacterial genomes might
c. Transgenic experiments have shown that the mouse Small eye be very specific sequences between genes or the repeated rRNA genes.
gene and the Drosophila eyeless gene are so similar that the mouse gene 27. The staggered cut will lead to a nine-base-pair target-site duplica-
can substitute for eyeless when introduced into Drosophila. As in the tion that flanks the inserted transposon.
answer to part b, when the mouse Small eye gene is expressed in Dro-
sophila, even in cells destined to form legs, eyes form on the legs. (How- AATTTGGCC TAGTACTAATTGGTTGG
ever, the “eyes” are not mouse eyes, because Small eye and eyeless act as TTAAACC GGATCATGATT AACCAACC
master switches that turn on the entire cascade of genes needed to
build the eye—in this case, the Drosophila set to build a Drosophila eye.)
25. GLP-1 protein is localized to the two anterior cells of the four-
cell C. elegans embryo by repression of its translation in the two pos-
AATTTGGCC TAGTAC TAA TAGTACTAATTGGTTGG
terior cells. The repression of GLP-1 translation requires the 3′ UTR transposon
TTAAACC GGATCATGATT ATC ATGATTAACC AAC C
spatial control region (SCR). Deletion of the SCR will allow glp-1
expression in both anterior and posterior cells. In both heterozygous 30. It would not be surprising to find a SINE element in an intron of
and homozygous mutants, you would expect GLP-1 protein expres- a gene rather than in an exon. Processing of the pre-mRNA would
sion in all cells. remove the transposable element as part of the intron and translation
of the FB enzyme would not be effected.
Chapter 14
11. Because bacteria have small genomes (roughly 3-Mb pairs) and Chapter 16
essentially no repeating sequences, the whole-genome shotgun approach 8. You need to know the reading frame of the possible message.
would be used. 11. With the assumption of single-base-pair substitutions, CGG can
13. A scaffold is also called a supercontig. A contig is a sequence of be changed to CGU, CGA, CGC, or AGG and will still encode arginine.
overlapping reads assembled into a unit, and a scaffold is a collection 14. The following list of observations argues “cancer is a genetic
of joined-together contigs. disease”:
18. Yes. The operator is the location at which repressor functionally (1) Certain cancers are inherited as highly penetrant simple
binds through interactions between the DNA sequence and the repressor Mendelian traits.
protein. (2) Most carcinogenic agents are also mutagenic.
23. You can determine whether the cDNA clone is a monster or not (3) Various oncogenes have been isolated from tumor viruses.
by the alignment of the cDNA sequence against the genomic sequence. (4) A number of genes that lead to the susceptibility to particular
(Computer programs for doing such alignments are available.) Is the types of cancer have been mapped, isolated, and studied.
sequence derived from two different sites? Does the cDNA map within (5) Dominant oncogenes have been isolated from tumor cells.
one (gene-size) region in the genome or to two different regions? Introns (6) Certain cancers are highly correlated to specific chromosomal
may complicate the matter. rearrangements.
27. a. Because the triplet code is redundant, changes in the DNA 17. The mismatched T would be corrected to C and the resulting
nucleotide sequence (especially at those nucleotides encoding the third ACG, after transcription, would be 5′ UGC 3′ and encode cysteine. Or,
position of a codon) can occur without changing its encoded protein. if the other strand were corrected, ATG would be transcribed to 5′ UAC
b. Protein sequences can be expected to evolve and diverge more 3′ and encode tyrosine.
slowly than do the genes that encode them. 24. Many repair systems are available: direct reversal, excision repair,
33. The correct assembly of large and nearly identical regions is prob- transcription-coupled repair, and nonhomologous end joining.
lematic with either method of genomic sequencing. However, the whole- 25. Yes, it is mutagenic. It will cause CG-to-TA transitions.
genome shotgun method is less effective at finding these regions than 35. a. A lack of revertants suggests either a deletion or an inversion
the clone-based method. This method also has the added advantage of within the gene.
easy access to the suspect clone(s) for further analysis. b. To understand these data, recall that half the progeny should
36. 15 percent are essential gene functions (such as enzymes required come from the wild-type parent.
for DNA replication or protein synthesis). Prototroph A: Because 100 percent of the progeny are prototro-
25 percent are auxotrophs (enzymes required for the synthesis of phic, a reversion at the original mutant site may have occurred.
amino acids or the metabolism of sugars, etc.). Prototroph B: Half the progeny are parental prototrophs, and the
60 percent are redundant or pathways not tested (genes for his- remaining prototrophs, 28 percent, are the result of the new mutation.
tones, tubulin, ribosomal RNAs, etc., are present in multiple copies; Notice that 28 percent is approximately equal to the 22 percent auxo-
the yeast may require many genes under only unique or special situa- trophs. The suggestion is that an unlinked suppressor mutation
tions or in other ways that are not necessary for life in the laboratory). occurred, yielding independent assortment with the nic-2 mutant.
Answers to Selected Problems 841
Prototroph C: There are 496 “revertant” prototrophs (the other c. The easiest way is to expose the A/a* plant cells to colchicine
500 are parental prototrophs) and four auxotrophs. This suggests that for one cell division, which will result in a doubling of chromosomes to
a suppressor mutation occurred in a site very close to the original yield A/A/a*/a*.
mutation and was infrequently separated from the original mutation d. Cross a hexaploid (a/a/a/a/a/a) with a diploid (A/A) to obtain
by recombination [100%(4 × 2)/1000 = 0.8 m.u.]. A/a/a/a.
38. Xeroderma pigmentosum is a heterogeneous genetic disorder 58. a. The ratio of normal-leaved to potato-leaved plants will be 5 : 1.
and is caused by mutations in any one of several genes taking part in b. If the gene is not on chromosome 6, there should be a 1 : 1 ratio
the process of NER (nucleotide excision repair). As the discovery of yet of normal-leaved to potato-leaved plants.
another protein in the NHEJ pathway through research on cell line 62. a. The aberrant plant is semisterile, which suggests an inversion.
2BN attests, this patient could have a mutation in an as yet unknown Because the d–f and y–p frequencies of recombination in the aberrant
gene that encodes a protein necessary for NER. plant are normal, the inversion must implicate b through x.
b. To obtain recombinant progeny when there has been an inver-
Chapter 17 sion requires the occurrence of either a double crossover within the
inverted region or single crossovers between f and the inversion, which
21. MM N OO would be classified as 2n - 1; MM NN OO would be
occurred someplace between f and b.
classified as 2n; and MMM NN PP would be classified as 2n + 1.
64. The original plant is homozygous for a translocation between
24. There would be one possible quadrivalent.
chromosomes 1 and 5, with break points very close to genes P and S.
27. Seven chromosomes.
Because of the close linkage, a ratio suggesting a monohybrid cross,
29. Cells destined to become pollen grains can be induced by cold
instead of a dihybrid cross, was observed, both with selfing and with a
treatment to grow into embryoids. These embryoids can then be grown
testcross. All gametes are fertile because of homozygosity.
on agar to form monoploid plantlets.
31. Yes. original plant: P S/p s
34. No.
36. An acentric fragment cannot be aligned or moved in meiosis (or tester: p s/p s
mitosis) and is consequently lost. F1 progeny: heterozygous for the translocation:
39. Very large deletions tend to be lethal, likely owing to genomic
imbalance or the unmasking of recessive lethal genes. Therefore, the
observed very large pairing loop is more likely to be from a heterozy- P
gous inversion.
41. Williams syndrome is the result of a deletion of the 7q11.23
region of chromosome 7. Cri du chat syndrome is the result of a dele-
tion of a significant part of the short arm of chromosome 5 (specifically
P
bands 5p15.2 and 5p15.3). Both Turner syndrome (XO) and Down S S
syndrome (trisomy 21) result from meiotic nondisjunction. The term
syndrome is used to describe a set of phenotypes (often complex and
varied) that are generally present together. The easiest way to test this hypothesis is to look at the chromo-
46. The order is b a c e d f. somes of heterozygotes in meiosis I.
70. The original parents must have had the following chromosome
Allele Band constitution:
b 1
a 2 G. hirsutum 26 large, 26 small
c 3 G. thurberi 26 small
e 4 G. herbaceum 26 large
d 5
G. hirsutum is a polyploid derivative of a cross between the two
f 6
Old World species, which could easily be checked by looking at the
47. The data suggest that one or both breakpoints of the inversion are chromosomes.
located within an essential gene, causing a recessive lethal mutation. 72. a. Loss of one X in the developing fetus after the two-cell stage.
50. a. When crossed with yellow females, the results would be b. Nondisjunction leading to Klinefelter syndrome (XXY), fol-
lowed by a nondisjunctive event in one cell for the Y chromosome after
Xe/Ye+ gray males
the two-cell stage, resulting in XX and XXYY.
Xe/Xe yellow females c. Nondisjunction of the X at the one-cell stage.
b. If the e+ allele was translocated to an autosome, the progeny d. Fused XX and XY zygotes (from the separate fertilizations
would be as follows, where “A” indicates autosome: either of two eggs or of an egg and a polar body by one X-bearing and
one Y-bearing sperm).
P Ae+/A ; Xe/Y × A/A ; Xe/Xe e. Nondisjunction of the X at the two-cell stage or later.
75. a. Each mutant is crossed with wild type, or
F1 Ae+/A ; Xe/Xe gray female
Ae+/A ; Xe/Y gray male m × m+
A/A ; Xe/Xe yellow female The resulting tetrads (octads) show 1 : 1 segregation, indicating
A/A ; Xe/Y yellow male that each mutant is the result of a mutation in a single gene.
b. The results from crossing the two mutant strains indicate
52. Klinefelter syndrome XXY male
either that both strains are mutant for the same gene:
Down syndrome trisomy 21
Turner syndrome XO female m1 × m2
56. a. If a hexaploid were crossed with a tetraploid, the result would
or that they are mutant in different but closely linked genes:
be pentaploid.
b. Cross A/A with a/a/a/a to obtain A/a/a. m1 m2+ × m1+ m2
842 Answers to Selected Problems
c. and d. Because phenotypically black offspring can result from 29. pA = pa = pB = pb = 0.5. At equilibrium, the frequency of doubly
nondisjunction (notice that, in cases C and D, black appears in con- heterozygous individuals is 2(pApa) × 2(pBpb) = 0.25
junction with aborted spores), mutant 1 and mutant 2 are likely to be 31. Before migration, qA = 0.1 and qB = 0.3 in the two populations.
mutant in different but closely linked genes. The cross is therefore Because the two populations are equal in number, immediately after
migration qA+B = 21 (qA + qB) = 21 (0.1 + 0.3) = 0.2. At the new equilib-
m1 m2+ × m1+ m2
rium, the frequency of affected males is q = 0.2, and then frequency of
Case A is an NPD tetrad and would be the result of a four-strand affected females is q2 = (0.2)2 = 0.04. (Color blindness is an X-linked
double crossover. trait.)
33. q2 = 0.002. q = 0.045. Assuming F in the founders is 0.0, the F50 =
m1+ m2+ black 0.222 (see Box 18-3). fa/a = q2 + pqF = 0.012.
m1+ m2+ black 37. Ĥ = [4 × 50,000 × (3 × 10-8)]/[4 × 50,000 × (3 × 10-8) + 1]
m1 m2 fawn = 5.96 × 10-3
m1 m2 fawn 41. a. qˆ = 1.0 × 10 -5 /0.5 = 4.47 × 10 -3
Genetic cost = sq2 = 0.5(4.47 × 10-3)2 = 10-5
Case B is a tetratype and would be the result of a single crossover b. qˆ = 6.32 × 10 -3
between one of the genes and the centromere. Genetic cost = sq2 = 0.5(6.32 × 10-3)2 = 2 × 10-5
c. Genetic cost = sq2 = 0.3(5.77 × 10-3)2 = 10-5
m1+ m2+ black
Chapter 19
m1+ m2 fawn
m1 m2+ fawn 7. Many traits vary more or less continuously over a wide range. For
m1 m2 fawn example, height, weight, shape, color, reproductive rate, metabolic
activity, etc., vary quantitatively rather than qualitatively. Continuous
variation can often be represented by a bell-shaped curve, where the
Case C is the result of nondisjunction in meiosis I.
“average” phenotype is more common than the extremes. Discontinu-
ous variation describes the easily classifiable, discrete phenotypes of
m1+ m2+ ; m1+ m2+ black simple Mendelian genetics: seed shape, auxotrophic mutants, sickle-
m1+ m2+ ; m1+ m2+ black cell anemia, etc. These traits often show a simple relation between
no chromosome abort genotype and phenotype, although discontinuous traits such as
no chromosome abort affected versus not-affected for a disease condition can also exhibit
complex inheritance.
Case D is the result of recombination between one of the genes 9. The mean is 4.7 bristles, the variance is 1.11 bristles2, and the
and the centromere followed by nondisjunction in meiosis II. For standard deviation is 1.05 bristles.
example, 12. The breeder cannot be assured that this population will respond
to selective breeding even though the broad-sense heritability is high.
m 1 m 2+ m 1+ m 2 Broad-sense heritability is the ratio of the genetic variance to the phe-
m 1 m 2+
m 1 m 2+
MII notypic variance. The genetic variance is the sum of the additive and
MI m 1+ m 2 chromosome dominance variances. Only additive variance is transmitted from par-
x m 1 m 2+
m 1+ m 2
m 1 m 2+ MII ent to offspring. Dominance variance is not transmitted from parent to
m 1+ m 2 offspring. If all the genetic variance in the population is dominance
m 1+ m 2
variance, then selective breeding will not succeed.
18. x par = [(9.8 + 10.8)/2] - 9.6 = 0.7 mm, a par = 0.79 × 0.7 = 0.55 mm,
m1+ m2 ; m1 m2+ black x̂ off = 9.6 + 0.55 = 10.15 mm
no chromosome abort m1 m2 m1m2 23. Ve = 3.5 g2, Vg in population B is 21.0 - 3.5 = 17.5 g2, H2 =
m1 m2 MII
m1 m2 m1 m2+ fawn 17.5/21.0 = 0.83.
MI m m chromosome
m1 m21 2
+ fawn
x m1 m2
m1 m2 MII
Chapter 20
m1m2
9. The three principles are (1) individuals within any one population
Chapter 18 m1m2 m1m2 vary from one another, (2) offspring resemble their parents more than
7. The frequency of an allele in a population can be altered by natu- they resemble unrelated individuals, and (3) some forms are more
ral selection, mutation, migration, and genetic drift. successful at surviving and reproducing than other forms in a given
11. The frequency of b is q = 0.04 = 0.2, and the frequency of B is environment.
p = 1 - q = 0.8. The frequency of B/B is p2 = 0.64, and the frequency 11. The relative rate of synonymous and nonsynonymous substitu-
of B/b is 2pq = 0.32. tions would not be higher than expected in a globin pseudogene
14. a. p′ = 0.5[(0.5)(1.0) + 0.5(1.0)]/[(0.25)(1.0) + (0.5)(1.0) because a pseudogene is inactive and has no function to be preserved.
+ (0.25)(0.7)] = 0.54 13. A population will not differentiate from other populations by
b. 0.008 local inbreeding if:
m1 m21 m ≥ 1/N
13. 2
(0.02) + 21 (0.02)2 = 0.0102
x and so
1 1
m21. probability of fixation = = ;
1 m2
2N 100, 000 N ≥ 1/m
m1 m2 of M 1 m1 m299
, 999 m1m2 N ≥ 105
probability loss
II
=1- =
2N 100, 000 17. When amino acid changes have been driven by positive adaptive
m m chromosome
MI 26 a. FI = ( 2 ) × (1 + 2 ) = 3/16
1 1 3
2 1
selection, there should be an excess of nonsynonymous changes. The
b. 1/8 = ( 21 )3 × (1 + FA), so m =
FA m 0 MC1R gene (melanocortin 1 receptor) encodes a key protein control-
m1 m2 M 1 2
II
m1m2 m1m2
Answers to Selected Problems 843
ling the amount of melanin in skin and hair. Asian and European popu- 25. For polymorphic sites with a species, let nonsynonymous = a and
lations appear to have experienced positive adaptive selection for more synonymous = b. For polymorphic sites between the species, let non-
lightly pigmented skin relative to their African counterparts. synonymous = c and synonymous = d. If divergence is due to neutral
19. Noncoding sequences. A major constraint on gene evolution com- evolution, then
prises the potential pleiotropic effects of mutations in coding regions. a/b = c/d
These effects can be circumvented by mutations in regulatory
sequences, which play a major role in the evolution of body form. If divergence is due to selection, then
Changes in noncoding sequences provide a mechanism for altering one a/b < c/d
aspect of gene expression while preserving the role of pleiotropic pro-
However, in this example, a/b = 20/50 > c/d = 2/18, which fits
teins in other essential developmental processes.
neither expectation. Because the ratio of nonsynonymous to synony-
21 a. The HbS mutation has arisen independently in five different
mous polymorphisms (a/b) is relatively high, the gene being studied
haplotypes in different regions and then increased to high frequency.
may encode a protein tolerant of relatively fewer species differences.
c. Two independent lines of bacteriophage evolved the ability to
The relatively fewer species differences may suggest that speciation
reproduce at high temperatures in a new host.
was a recent event so new polymorphisms have been fixed in one spe-
23. A new gene duplicate can (1) evolve a new function, (2) become
cies that are not variants in the other.
inactivated, or (3) perform part of the original function, sharing full
function with the original gene.
This page intentionally left blank
Index
Note: Page numbers followed by f indicate figures; those followed by t indicate tables.
Boldface page numbers indicate Key Terms
845
846 Index
Autonomous elements, 552 fertility factor in, 179–184. See also in RNA, 294, 295f, 296–298, 297f. See also
Autopolyploidy, 620–623, 622f, 623f Fertility factor (F) Transcription
Autoradiogram, 368 Hfr strain in, 179–184, 181f, 184f tautomeric shifts in, 588–589
Autosomal chromosomes, 54 integration sites in, 183, 184f, 185f Base substitution mutations, 583–584
independent assortment of, 104 merozygote in, 186 Base-excision repair, 598–599, 598f
Autosomal dominant disorders, pedigree origin in, 182–183, 184f Bateson, William, 5, 5f, 129–130
analysis of, 61–63, 63f pili in, 179, 180f Baulcombe, David, 314
Autosomal polymorphisms, pedigree plasmid transfer in, 178–179, 179f, 180f Bcr-Abl fusion protein, 650
analysis of, 63–64, 66f recipient in, 178 Beadle, George, 7, 8, 224–225, 324, 551, 551f
Autosomal recessive disorders, pedigree rolling circle replication in, 179 Beadle-Tatum experiment, 224–225, 324
analysis of, 59–60, 60f sequence of events in, 183–184, 185f Beall, Cynthia, 24
Autotetraploidy, 621–623 terminus in, 183, 184f Beck, Harry, 128
in plant breeding, 627 vs. transformation, 191 Beetles, transgenic, 542, 542f, 543f
Autotriploidy, in plant breeding, 627 Bacterial genetics, 173–205 Bell curve, 108, 109f
Auxotrophic bacteria, 176 Bacterial insertion sequences, 553–554, 555 Benge, Louise, 14, 14f
Auxotrophic mutations, 176, 224–225 Bacterial mutants Benzer, Seymour, 195
Avery, Oswald, 262, 398 auxotrophic, 176 b clamp, 277
resistant, 177 Bicoid gene, 483, 483f, 485, 488, 488f, 490f
BAC (bacterial artificial chromosome) Bacterial transduction, 175, 175f, 196–200. Binding sites, identification of, 521–522
vectors, 365, 365f, 366f See also Transduction Bioinformatics, 510, 519–524
Bacillus subtilis, sporulation in, 423–424, Bacterial transformation, 175, 175f, in binding site prediction, 521–522, 522f
424f 191–192, 192f BLAST search and, 522–523, 523f
Bacteria. See also Escherichia coli; in DNA technology, 365–366, 366f codon prediction in, 523
Prokaryotes double, 191 in exon detection, 520–524
antibiotic resistance in, 188–191, 191f, 320 in Streptococcus pneumoniae, 261–262, genomic sequencing and, 519–524
evolution of, 775 261f information sources for, 523–524, 523f
auxotrophic, 176 vs. conjugation, 191 in open reading frame detection, 520
classification of, 174 Bacteriophages (bacterial viruses). See in protein inventory, 520–524
colonies of, 176, 176f Phage(s) Biological machines
culture of, 176f Bailey-Serres, Julia, 21 Dicer, 310
culturing of, 176 Balanced rearrangements, 636 protein-protein interactions in, 341
DNA exchange between, 175–176, 175f Balancer chromosomes, 645 replisomes, 277–280, 332
gene exchange in, 175–176, 175f Balancing selection, 699, 701f, 769 ribosomes, 308, 332–335
gene regulation in, 397–425 Baldness, male pattern, 676, 676f RISC, 310
genome of, 178–179 Barr body, 463 spliceosomes, 292
insertion sequences in, 553–554, 555 Barrier insulators, 459, 459f Biological properties, 32
hereditary processes in, 174–175 Basal cell nevus syndrome, 502 Biotechnology, definition of, 382
interrupted mating in, 181, 183f Base(s), 264–265, 266t Bithorax complex, 475
lysis of, 193, 194f, 199, 199f–201f Chargaff’s rules for, 265 Bivalents, 41, 621
lysogenic, 196 complementary, 270 Black urine disease, 223–224
in specialized transduction, 199 damage to Blackburn, Elizabeth, 283–284
mapping of, 184–188, 187f, 201–204, excision of, 598–599, 598f BLAST search, 522–523, 523f
202f–204f mutagenic, 594–595 Blending theory, 2–3
physical, 201–204, 202f–204f enol, 276, 276f Blood groups
recombinant-based, 184–188, 187f imino, 276–277, 276f evolution of, 785–786
as model organism. See Escherichia coli keto, 276, 276f polymorphisms and, inheritance of, 219
mutant, 176–177, 176t molar properties of, 266t racial variation in, 785–786
phage infection of, 192–193, 193f–195f oxidatively damaged, mutations due to, Blue-eyed Mary, flower color in, 234–235,
prototrophic, 176 590 234f
recombination in, 175f, 176. See also in physical mapping, 154–155 Body coloration
Bacterial conjugation; Bacterial purine, 264 in mice, 221–222, 221f, 780–781, 780f
transformation; Transduction pyrimidine, 264 regulatory evolution and, 782–783, 782f,
replication in, 270–280 in RNA, 294, 295f 783f
sporulation in, 423–424, 424f tautomerization and, 276–277, 276f Boeke, Jef, 560
Bacterial artificial chromosomes, 365 Base analogs, as mutagens, 593, 593f Bonds
Bacterial chromosome Base insertion mutations, 583–584 hydrogen, in double helix, 267, 268f
circular shape of, 183, 184f, 185f, 272, 273f Base pairs/pairing peptide, 322, 322f, 323f
mapping of, 184–188, 187f apurinic sites and, 589 Bottlenecks, population, 695–696, 697
Bacterial conjugation, 175f, 177–191 complementarity in, 7 Boveri, Theodor, 5
crossing over in, 179–182, 184f, 185f, 186 in DNA, 268–271, 268f–270f. See also bph2 mutation, 88
definition of, 177 Replication Branch diagrams, 90, 93
discovery of, 177–179 mismatched Brassica, allopolyploidy in, 623–624, 624f
donor in, 178 base analogs and, 593, 594f Bread mold. See Neurospora crassa
endogenote in, 186 mutations due to, 588–590, 590f, 593 Breeding
exogenote in, 186 specific mispairing and, 593–594, 594f additive deviation and, 735–736
Index 847
embryoids in, 626 CD73 deficiency, 14–17, 15f formation of, 41, 42f–43f, 46, 46f, 47f
flood-tolerant rice, 20–23, 22f, 23f cDNA (complementary DNA), 358, in synthesis-dependent strand annealing,
in Green Revolution, 88 520–521 607–608
hybrid vigor in, 99–100 expressed sequence tags and, 521 Chromatin, 433, 442–459
independent assortment in, 88–89 cDNA library, 366–367 condensation of, 443–444, 443f, 455–456
narrow-sense heritability in, 738–740, Cech, Tom, 308 DNA packaging in, 443–444, 443f
739t Cell clones, 176. See also Clone(s) epigenetic inheritance and, 448–449,
phenotype prediction in, 739–740 Cell cycle 449f
polyploidy in, 627 meiosis in, 40–45, 41f–43f. See also euchromatin and, 282, 443f, 455–456. See
population bottlenecks and, 695–696, 697 Meiosis also Euchromatin
pure lines in, 35, 98–99 mitosis in, 40, 41f–43f. See also Mitosis heterochromatin and, 282, 443f, 455–456.
single-gene mutations in, 88 replication in, 41–44, 42f–43f, 46–48, 46f, See also Heterochromatin
Breeding value, 735 47f, 281–282, 282f histones in, 443–449. See also Histone(s)
Brenner, Sydney, 325, 329, 497 S phase of, 40f, 41, 47f mating-type switching and, 444–445
Bridges, Calvin, 470 synapsis in, 41 nucleosomes in, 433, 443–444. See also
Broad-sense heritability, 727–731, 730t. See yeast replication and, 281–282 Nucleosomes
also Heritability Cell division remodeling of, 444–445, 444f, 456
Brown oculocutaneous albinism, 674, 674f crossing over in, 42, 132–135, 132f. See structure of, 443–444, 443f
Bubble-boy disease, 548, 548f also Crossing over inheritance of, 448–449, 449f
Bulbar muscular atrophy, 592 in meiosis, 40–45, 46f, 47f. See also Chromatin immunoprecipitation (ChIP)
Burkitt lymphoma, 649, 649f Meiosis assay, 538–539, 539f
Burnham, Charles, 551, 551f in mitosis, 40–41, 46f, 47f. See also Mitosis Chromosomal polymorphisms. See
Bypass (translesion) polymerases, 605 at molecular level, 46, 53 Polymorphism(s)
stages of, 42f–43f, 47f Chromosome(s)
Caenorhabditis elegans, 10, 11, 11f Cell lineage, in Caenorhabditis elegans, 496, acentric, 636
cell-lineage fate in, 496–499 498f anaphase bridge, 636
development in, 496–499 Cell signaling. See Signaling autosomal, 54
gene expression in, 310–312 Centimorgan (cM), 136 independent assortment of, 104
gene silencing in, 310–315, 311–312 Central dogma of molecular biology, 8, 9–10 bacterial
as model organism, 473, 497, 802–803 Centrifugation, cesium chloride gradient, artificial, as vector, 365, 366f
transgenesis in, 386–387, 386f 271, 272f circular shape of, 183, 184f, 185f
transposable elements silencing in, Centromeres balancer, 645
571–572 in crossing over, 148–150, 149f breakage of, 635–636. See also
Cairns, John, 272, 274 mapping of, 148–150, 149f Chromosome rearrangements
cAMP, in gene regulation, 410–413, in meiosis, 41–42, 148–150, 149f transposable elements and, 549–552,
410f–412f Cesium chloride gradient centrifugation, 549f, 550f
Campbell, Allan, 183, 199 271, 272f dicentric, 636
Cancer, 502–503, 503t, 609–612 Chaetodipus intermedius, coat color in, early studies of, 34–39, 35f, 36f
Burkitt lymphoma, 649, 649f 780–781, 780f gene location on. See Gene loci
chromosome rearrangements in, Chan, Frank, 784 homeologous, 620
649–650, 649f Chaperones, 341 homologous, 622f
chronic myelogenous leukemia, hybrid Chaperonin folding machines, 341 mapping of, 129–148. See also Maps/
genes in, 650 Characters, 34–35. See also Traits mapping
colorectal, 604 Chargaff, Erwin, 4 organelle
mutations and, 595–596, 597f, 609–612. Chargaff’s rules, 265 characteristics of, 111, 112f
See also Carcinogens Charged tRNA, 331 independent assortment of, 111–115
telomeres in, 287 Chase, Martha, 263 polytene, 637–638, 638f
Candidate gene, 749 Chiasmata, 132, 132f, 628 replication of, 40–42, 46f, 47f
in cystic fibrosis, 378–379 Chimpanzee genome, vs. human genome, segregation of. See Segregation
Cap, 5′ end, 305 532 sex, 54
CAP-cAMP complex, in gene regulation, ChIP assay, 458, 538–539, 539f in D. melanogaster, 54
410–413, 410f–412f Chi-square test (χ2 test), 96–98, 97t in dioecious plants, 55
Carboxy terminal domain (CTD), 304, 305, in linkage analysis, 150–151 pseudoautosomal regions of, 55
305f Chloroplast DNA (cpDNA), 111 structure of, 54–55, 55f
Carboxyl end, 322, 322f, 323f variation in, 670–671 structure of, mutation-induced changes
Carcinogens Chloroplast genes in, 634–650
Ames test for, 595–596, 597f characteristics of, 111, 112f tandem repeats in, 283–284
mutations as, 595–596, 597f, 609–612 inheritance of, 111–115 unpaired, aneuploidy and, 621
Carothers, Elinor, 101 Chromatids, 40, 41, 47f Chromosome maps, 129–148. See also
Catabolite activator protein (CAP), in crossing over, 132–135, 134f, 135f Maps/mapping
410–413, 410f–412f nonsister, in double-strand break repair, Chromosome mutations, 618. See also
Catabolite repression, 410–413, 410f–412f 608f Mutation(s)
Categorical traits, 717 sister, 41, 42f–43f, 47f incidence of, 651, 651f
Cats, tailless, 222, 223f crossing over of, 135. See also Crossing numerical changes and, 618–627
Cavalli-Sforza, Luca, 179–180 over structural changes and, 634–650
848 Index
Chromosome number with gel electrophoresis, 371–373, 371f, Comparative genomic hybridization, 650,
aneuploid, 620t, 621, 627–631, 632–634 373f 650f
changes in, 618–634 mutant rescue in, 370 Comparative genomics, 510, 526, 527–536,
agricultural applications of, 626–627 Northern blotting in, 372, 373f 527f
incidence of in humans, 651, 651f positional cloning in, 377–379 homologs in, 528
in parts of chromosome sets, 629–631 with probes, 367–369 human-chimpanzee, 532
in whole chromosome sets, 618–627 Southern blotting in, 372, 373f human-human, 532–533
diploid, 619, 620t. See also Diploids libraries of, 366–367, 369, 513–514 human-mouse, 530–532
euploid, 619, 620t mapping of. See Maps/mapping medical applications of, 532–536
gene balance and, 632–634 sequencing of. See Genome sequencing noncoding functional elements and,
gene-dosage effect and, 633 Cloning, 12, 13f, 343f, 354, 355–356 525–526, 527f
haploid, 619, 620t of animals, 432, 432f, 461–462 nonpathogenic vs. pathogenic E. coli,
hexaploid, 619, 620t of blunt-end DNA, 360–361 534–536, 535f
monoploid, 619–620, 620t cDNA in, 358 orthologs in, 528
monosomy, 620t, 627, 629, 629f cutting and pasting in, 355–356 paralogs in, 528
organism size and, 620, 621f, 627 DNA amplification in, 362–366, 362f parsimony in, 528–529
pentaploid, 619, 620t DNA ligase in, 360 phylogenies and, 527–530, 528, 529f
phenotype and, 632–634, 632f gene tagging in, 564 synteny in, 531, 531f, 569
tetraploid, 619, 620t general strategy in, 362f Complementarity, 7
triploid, 619, 620t genomic DNA in, 355–356 Complementary bases, 270. See also Base
trisomy, 61, 620t, 627, 629–631, 631f, 632f, genomic imprinting and, 461–462 pairs/pairing
647, 647f P elements in, 565 Complementary DNA (cDNA), 358,
Chromosome rearrangements, 634–650 p elements in, 565 520–521
balanced, 636 polymerase chain reaction in, 356–358, expressed sequence tags and, 521
by breakage, 635–636 356f Complementary DNA (cDNA) library,
in cancer, 649–650, 650f positional, 377–379, 379f 366–367
during crossing over, 132–135, 134f–136f restriction enzymes in, 355, 356f Complementation, 227–230, 228
deletions, 634, 635f, 636, 637–640. See of sticky-end DNA, 359–362 definition of, 227
also Deletion(s) transduction in, 366, 366f in diploids, 227–229, 228f
inversions and, 644, 644f transformation in, 365–366, 366f functional, 370
duplications, 634, 635f, 636. See also transposable elements in, 565 in haploids, 229, 231f
Duplications vectors in. See Vectors Complementation test, 227–230
inversions and, 644, 644f Clover, leaf patterns in, 220, 221f Complete (full) dominance, 216–218
identification of, by comparative genomic CMP (cytidine 5′-monophosphate), 295f Complex inheritance, 717
hybridization, 650, 650f Co-activators, 440, 441f Complex traits, 716
incidence of in humans, 651, 651f Coat color Composite transposons, 554, 555f
inversions, 634, 635f, 636. See also in dogs, 235, 235f, 240, 241f Congenic lines, 747–748
Inversion(s) in mice, 221–222, 221f, 780–781, 780f Conjugation, 177, 177–191. See also
mapping and, 648 regulatory evolution and, 782, 782f, Bacterial conjugation
position-effect variegation and, 648–649, 783f Conjugative plasmids, 191
648f Cockayne syndrome, 600–602 Consanguinity, 60
translocations, 635, 635f, 636–637. Coding strand, 298 Consensus sequence, 299, 299f, 512
See also Translocations CODIS, 707 Conservation genetics, 704–705
transposable elements and, 549–552, Codominance, 219–220 Conservative replication, 271
549f Codon(s), 10, 325 Conservative substitution, 584
unbalanced, 636 amber, 329 Conservative transposition, 556–557, 556f
Chromosome theory of inheritance, 5, 58 degeneracy for, 327, 331–332 Constitutive heterochromatin, 456
Chromosome walk, 378, 379f in genetic code, 325 Constitutive mutations, 406
Chronic mountain sickness, 24 stop, 329, 329f, 338–339, 338f Constitutive transcription, 296
Cis conformation, 132 synonymous, 523 Contigs
Cis-acting sequences Codon bias, 523 sequence, 514
evolutionary changes in, 782–783, 783f Codon-anticodon pairing, in translation, supercontigs, 518
in gene regulation, 406 330–332, 330f, 331f Continuous traits, 717
in eukaryotes, 435 wobble in, 331–332, 332f, 332t Continuous variation, 108–110, 109f
in prokaryotes, 406, 406f Coefficient of coincidence (c.o.c.), 142 Coordinately controlled genes, 402
Citrulline, structure of, 224f Cointegrates, 557 Copia-like elements, 559f, 560
Class I transposable elements, 560 Colchicine, in autopolyploid production, Copy and paste transposition, 560
Class II transposable elements, 562–564 621–622, 623f Copy number variations, 533
clear mutants, 418–419 Collinsia parviflora, flower color in, 234–235, Core enzyme, 299–300
Clone(s) 234f Corepressor, 447
cell, 176 Colonies, bacterial, 176, 176f Corn. See Maize (Zea mays)
DNA sequencing for, 374–376, 374f, 376f, Color blindness, 66 Correlation, 725–727, 726, 727f
377f Colorectal cancer, 604 Correlation coefficient, 726–727
identification of, 367–372 Coloring. See Pigmentation Cosuppression, 313, 313f
chromosome walk in, 378, 379f Common single nucleotide polymorphisms, Cotranscriptional RNA processing,
functional complementation in, 370 667 304–305, 305f
Index 849
Development (continued) double-stranded breaks in, 156–157, 156f, in proofreading, 276–277, 277f, 598
screens for, 482–483, 482f 285 in SOS system, 604–605, 605f, 606f
specificity of, 500–501 chromosome rearrangements and, in translesion DNA synthesis, 605
transcription factors in, 484–487, 486f, 635–636, 635f DNA probes, 367–369, 368f
486t in meiotic recombination, 608–609, DNA repair, 596–609
transplantation model of, 471, 471f 609f base-excision, 598–599, 598f
zone of polarizing activity in, 500 repair of, 606–609 chromosome rearrangements and,
Deviation, 720 duplication of. See Duplications 635–636
additive, 734–736 in forensics, 706–707 by direct reversal, 597
breeding value and, 735 genomic, in cloning, 355–356 DNA glycolases in, 598–599, 598f
dominance, 734–736 heteroduplex, 156 of double-strand breaks, 606–609
genetic vs. environmental, 722–724, 724t information content of, 519–520, 520f error-free, 606
standard, 720 information storage in, 8 error-prone, 604–605, 606f
Diakinesis, 85 key properties of, 264 global genomic nucleotide-excision,
Dicentric bridge, 643, 644f meiotic segregation of, 46–48 600–602, 601f
Dicentric chromosomes, 636 melting of, 269 homologous recombination in, 607–608
Dicer, 310, 314f mitochondrial, 111, 112f impaired, cancer and, 612
Dideoxy (Sanger) sequencing, 374–376, in evolutionary studies, 116–117 mismatch, 602–604, 603f
374f, 376f, 377f variation in, 670–671 nonhomologous end joining in, 607–608,
Digitalis purpurea, flower color in, 236, 236f new, sources of, 787–790 607f
Digits, extra, 63, 64f, 501–502, 501f in nucleoids, 111, 111f nucleotide-excision, 599–602, 600
Dihybrid, 89–93, 103 nucleotides in, 264–265, 265f global genomic, 600–602, 601f
Dihybrid crosses, 89–93, 103. See also overview of, 260 by photoreactivation, 597
Crosses palindromic, 355 postreplication, 602–604
Dimorphisms, 63–64 in phages, 263–264 proofreading in, 276–277, 277f, 598
autosomal, pedigree analysis of, 63–64 phosphate in, 264, 265f SOS system in, 604–605, 605f
Dioecious species, 54, 55f recombinant. See also DNA technology synthesis-dependent strand annealing in,
Diploids, 619, 620t production of, 355–356 607–608, 609f
double, 623–624 recovery from vectors, 366 translesion DNA synthesis and, 605
independent assortment in, 101–102 repetitive, 524 DNA replication. See Replication
meiosis in, 41, 42f–43f, 44–45, 47f chromosome rearrangements and, DNA sequencing, 13, 374–376
mitosis in, 40, 42f–43f, 447f 635f, 636 automated, 376, 377f
partial, 405 conserved/ultraconserved, 526 dideoxy (Sanger), 374–376, 374f, 376f,
recombination analysis for, 106–108, 107f replication of. See Replication 377f
single-gene inheritance in, 40–44, 40f–43f structure of, 7, 8f, 264–270. See also DNA technology, 352
Diplotene, 85 Double helix amplification in
Directional selection, 698–701 elucidation of, 264–270 in vitro, 353, 353f, 354
Disassortive mating, 677–678, 677f supercoiled, 279–280, 279f in vivo, 362–366
Discovery panel, 668 template strand of, 296–297, 297f, 298f cDNA libraries in, 366–367
Disjunction, 628 transcription of. See Transcription chromosome walk in, 378, 379f
Disomy, 628 in transformation, 191–192, 192f chromosome walking in, 378, 379f
Dissociation (Ds) elements, 549–552, 550f, variation in, 144. See also Variation cloning in, 355–356. See also Cloning
552f, 564, 565f X-ray diffraction studies of, 266–267, 266f dideoxy (Sanger) sequencing in, 374–376,
Distal-less gene, 491–494, 492f DNA amplification, 353 374f, 376f, 377f
Distribution, Poisson, 151–152, 151f in vitro, 353, 353f, 354 donor DNA in, 355
Distribution curves, 103f, 108–109, 109f in vivo, 353–354, 353f, 362–366 functional complementation in, 370
Distributive enzymes, 277–278 DNA cloning. See Cloning gel electrophoresis in, 371–373, 371f
Diversification. See also Variation DNA fingerprints, 146 gene knockout in, 388, 389f
adaptation and, 764 DNA glycolases, in base-excision repair, gene replacement in, 388
DNA 598–599, 598f gene tagging in, 565
bacterial uptake of, 191–192, 192f DNA gyrase, 279f gene targeting in, 388–391, 389f
bases in. See Base(s) DNA ligase, 275–276, 360 genetic engineering and, 382–391. See also
chloroplast, 111 in cloning, 360 Genetic engineering
variation in, 670–671 DNA matching, 706–707 genomic libraries in, 366–367, 513–514
in chromatin, 443–444, 443f DNA mates, 707–708 mutant rescue in, 370
cloning of, 362–366 DNA methylation, 460–461 Northern blotting in, 317f, 372
coding strand of, 298 inheritance of, 449–450, 450f overview of, 353–354
complementary, 520–521 DNA microarrays, 527f, 536–537, 537f polymerase chain reaction in, 146, 147f,
expressed sequence tags and, 521 in comparative genomic hybridization, 353f, 354, 356–358, 356f
components of, 264–265, 265f, 266t 650, 650f probes in, 367–369
daughter molecules of, 281 for single nucleotide polymorphisms, recombinant DNA production in,
denaturation of, 269 667–668 355–356
discovery of, 260–264 DNA palindromes, 355 restriction enzymes in, 355
donor, 353 DNA polymerases, 12, 273–276, 274f Southern blotting in, 371f, 372
amplification of, 362–366 bypass (translesion), 605 transduction in, 366, 366f
Index 851
transformation in, 365–366, 366f in meiotic recombination, 608–609, 609f Ectopic insertion, 382
in vivo, 353–354, 353f repair of, 606–609 Edwards syndrome, 631
DNA template library, 515 Double-stranded RNA (dsRNA), 310–314, Elongation
DNA transposons, 562–566, 562f–566f. See 313f in transcription, 298
also Transposon(s) in transposition repression, 572 in eukaryotes, 304–307
DnaA, in replication, 280 Down syndrome, 618, 630–631, 631f in prokaryotes, 300, 300f
DNA-binding domains/proteins, 401 Robertsonian translocations and, 647, in translation, in eukaryotes, 304–307
in eukaryotes, 438–439, 439f, 440–442, 647f Elongation factor G (EF-G), 337–338, 337f
442f Downstream promoters, 298 Elongation factor Tu (EF-Tu), 337–338, 337f
in genetic switching, 417–423, 421f Drosophila melanogaster, 11, 11f Embryoids, 626
helix-turn-helix motif and, 422, 422f aneuploidy in, 632 Embryology. See Development
in prokaryotes, 417–423, 421f Copia-like elements in, 559f, 560 Embryonic stem (ES) cells, in gene
specificity of, 422–423, 423f crossing over in, 139–141 targeting, 388–391, 389f
Dolly (cloned sheep), 432, 432f, 461 development in, 474–487. See also Emerson, Rollins, 551, 551f
Domains, protein, 324 Development ENCODE project, 525–526
Dominance, 4, 37, 50, 216–218, 731, 731f eye color in, 104, 456, 457f, 458f Encyclopedia of DNA elements (ENCODE)
codominance and, 219–220 position-effect variegation and, project, 525–526
full (complete), 216–218 648–649, 648f Endogenote, 186, 186f
haploinsufficiency and, 217–218, 217f X-linked inheritance of, 55–58, Endogenous genes, 313
homozygous, 38 57f engrailed gene, 492–494, 493f
incomplete, 218–219 gene linkage in, 127f, 130–131, 131f Enhanceosome(s), 451–452
partial, 731 homeotic genes in, 474–482 β-interferon, 451–452, 451f
Dominance deviation, 734–736 Hox genes in, 476f–481f Enhancer-blocking insulators, 452–453,
Dominance effect, heritability and, 733–736 independent assortment in, 104 453f
Dominant gene action, 731, 731f life cycle of, 56f, 472–473 in genomic imprinting, 461
Dominant mutations, 50, 216–218, 217f as model organism, 56, 472–473, 804–805 Enhancer/Inhibitor elements, 552
Dominant negative mutations, 218, 218f polytene chromosomes in, 637–638, 6 Enhancers, 435, 437–438, 451–453, 451f,
Donor, in conjugation, 178 38f 452f
Donor DNA, 353 position-effect variegation in, 456–458, En/In elements, 552
amplification of, 362–366 457f, 458f Enol bases, 276, 276f
Dorsoventral axis, in development, 483, regulatory evolution in, 782–783, 782f, Environment, vs. genetics, 727–731, 730t.
484f, 485, 487f 783f See also Gene-environment
Dosage compensation, 463, 634 sex determination in, 54, 494–496, 495f interactions
Dotted elements, 552 transposable elements in, 558–565 Environmental deviations, 722–724, 724t
Double crossovers, 134, 135f, 140, 140f, wing development in, 470 Environmental variance, 725. See also
141f discovery for, 51–52 Variance; Variation
bacterial, 186 wing patterns in, 782–783, 782f, 783f Enzymes
coefficient of coincidence for, 141–142 Drug design, structure-based, 335 active sites of, 324
gene order and, 141, 141f Drug-resistant bacteria, 188–191, 191f, 320 core, 299–300
interference and, 141–142 Ds elements, 549–552, 550f, 552f, 564, 565f distributive, 277
mapping function for, 151–152, 151f dsRNA (double-stranded RNA), 310–314, in gene regulation, in prokaryotes,
Perkins formula for, 152–153 312f, 313f 399–400
recombinant frequency and, 151–153 in transposition repression, 572 in lactose metabolism, 401–402, 402f
Double diploid, 623–624 Dt elements, 552 operons for. See Operon(s)
Double helix, 7, 8f, 267–270, 267f–270f Duffy antigen, 785–786 processive, 277
antiparallel orientation in, 267, 268f Duplications, 634, 635f, 636, 640–642, 641f, in replication, 273–276, 274f
base pairing in, 268–271, 268f–270f 642f restriction, 355
discovery of, 264–270, 266f in evolution, 787–790 in cloning, 355, 356f
major groove of, 267, 269f fate of, 788–790, 789f sticky ends of, 355–356, 356f
minor groove of, 267, 269f gene number and, 787–788 structure of, 322–324
unwinding of, 270–271, 279–280, 279f identification of, 650 EPAS1, high-altitude adaptation and,
Watson-Crick model of, 267–270, insertional, 640 24–26
267f–269f pericentric inversions and, 644, 644f Epigenetic inheritance, 448
Double infections, 194, 195f segmental, 641 histone modifications and, 448–449
Double mutants, 227, 231–239 sources of, 787–788 Epigenetic marks, 449, 460, 461
epistasis and, 234–236 tandem, 640 Epigenetically equivalent genes, 461
genetic analysis of, 231–239 target-site, 557 Epistasis, 234–236, 737
lethal, 239, 239f whole-genome, 641, 641f dominant, 236, 236f
Double transformation, 191 Dwarfism, 17, 62, 62f recessive, 234–235, 234f, 235f
doublesex gene, 494–495 Dyad, 41 sign, 775
Double-stranded breaks, 156–157, 156f, 285, Dysgenic mutations, 562 vs. suppression, 237
606 Dyskeratosis congenita, 286 Equal segregation. See Segregation, equal;
chromosome rearrangements and, Single-gene inheritance
635–636, 635f E site, 334, 334f, 337f, 338 Error-prone repair, 604–605, 606f
in DNA, 156–157, 156f Ears, hairy, 68, 69f ES cells, in gene targeting, 388–391
852 Index
Escherichia coli. See also Bacteria Exomes, 508 mosaicism and, 553f
drug resistance in, 775 personal genomics and, 508–509, 533–534 recessive epistasis and, 234–235, 234f
genome of, 178–179 Exon(s), 292, 306 transposable elements and, 553f
mapping of, 201–204, 202f, 203f identification of, 520–524 Fluctuation test, 586–588, 587, 587f
in nonpathogenic vs. pathogenic size of, 524 Forensic genetics, 706–707
strains, 534–536, 535f splicing of, 307–308, 308f, 309f Forward genetics, 33, 352
lac system in, 402f, 404. See also under lac transposable elements in, 568 Beadle-Tatum experiment in, 224–225
as model organism, 180, 794–795 Exon-intron junctions, 307 Fosmid vectors, 364–365, 365f, 366f
properties of, 180 Expressed sequence tags (ESTs), 521, 521f Founder effect, 694, 696f
replisome of, 277–280, 278f, 279f Expression library, 369 Four-o-clocks
transcription in, 298–301, 299f–301f Expressivity, 240–241, 240f, 241f cytoplasmic segregation in, 113–114, 113f,
transposable elements in, 553–557, Extrachromosomal arrays, 386f, 387 114f
555f–558f Extranuclear inheritance, 111–115 incomplete dominance in, 218–219, 218f
Essential genes, 221 Eye color, in D. melanogaster, 104, 456, 457f, Foxgloves, flower color in, 236, 236f
Ethylene response factors, 21 458f Fragile X syndrome, 591–592
Euchromatin, 282, 443f, 455–456 position-effect variegation and, 648–649, Frameshift mutations, 326–327, 585, 585f,
barrier insulators for, 459, 459f 648f 589, 590f
vs. heterochromatin, 456 X-linked inheritance of, 55–58, 57f Franklin, Rosalind, 266, 266f, 269
Eukaryotes Free radicals, mutations and, 590
gene regulation in, 431–465 F+ (donor), 179. See also Fertility factor (F) Frequency histogram, 719, 720f
transcription in, 301–307, 302f, 304f–307f F (recipient), 179. See also Fertility factor Fruit flies. See Drosophila melanogaster
Euploidy, 619, 620t (F) Full (complete) dominance, 216–218
aberrant, 618–627 F plasmid, 179–184, 180, 180f, 181f Functional complementation, 370
eve stripe 2 element, 490–491, 490f F ′ plasmid, 188 Functional genomics, 510, 536–543
even-skipped gene, 483, 489–491, 490f Fertility, reduced DNA microarray in, 527f, 536–537, 537f
Evolution in aneuploids, 621 with nonmodel organisms, 542–543, 543f
adaptation in, 696 heterozygous inversions and, 646 two-hybrid test in, 537–538, 538f
adaptive walks and, 774–777 reciprocal translocations and, 646 Functional RNA, 225, 294, 295–296, 307
Darwin’s theory of, 764–766 in triploidy, 627 Fungi. See also Neurospora crassa;
DNA sources in, 787–790 Fertility factor (F), 179–184 Saccharomyces cerevisiae
founder effect in, 694, 696f discovery of, 178–179 ascus of, 44, 45f, 103
of gene families, 787–790 F+ strain of, 179 cell division in, 41f
gene inactivation and, 781–782 F strain of, 179 equal segregation in, 44–45, 45f
genetic drift in, 115 Hfr strain of, 179–184, 181f hyphal branching in, gene discovery for,
molecular, 771–774 in integrated state, 183–184 52–53
molecular clock and, 773, 773 in plasmid state, 184 maternal inheritance in, 112–113, 113f
morphological, 779–786, 779f–783f, 785f, rolling circle replication of, 179 mating types of, 44, 45f, 105
786f Fibrous proteins, 324 as model organisms, 436
multistep pathways in, 774–777 Fine mapping, 379–382, 381 nonidentical spore pairs in, 155–156
narrow-sense heritability in, 738–739 QTL, 747 spore production in, 103
natural selection in, 696–702, 762, Fingers, extra, 63, 64f, 501–502, 501f
764–771. See also Natural selection Fink, Gerry, 558, 560 G protein, in signaling, 226
neofunctionalization and, 788 Fire, Andrew, 311–312 G6PD gene, variation in, 684–687, 685f
neutral, 694 First filial generation (F1), 35 GAL system, 436–440, 537–538, 538f
neutral theory of, 771–772 First-division segregation (MI) patterns, Galactose, metabolism in, 436–440, 436f
origin of new genes in, 786–790 149, 149f Galàpagos finches, 762–764, 763f
principles of, 766 Fisher, Ronald, 5–6 Gall, Joe, 283–284
purifying selection in, 772–773 Fitness Gametes, 4
recent human, 23–27 absolute, 696–697 formation of, 39–40
adaptation to high altitude and, 24–26 Darwinian, 696 Gametic ratio, 39–40, 39f
lactose intolerance and, 26–27, 26f, 27f genotype, 697–698 Gap genes, 483, 484f, 488
regulatory-sequence, 782–786 natural selection and, 697–698 Garfinkel, David, 560
in D. melanogaster, 782–783, 782f, 783f relative, 697 Garrod, Archibald, 223–224, 223f, 226
in humans, 785–786 5′ end cap, 305 GCN5, 446
loss of characters through, 783–784 5′ regulatory region, 298, 299f, 300f Gel electrophoresis, 371–373, 371f
RNA world and, 308–309 5′ untranslated region (5′ UTR), 299, Gender, gene silencing and, 460–463
subfunctionalization and, 788 300f Gene(s), 37. See also Allele(s)
variation in, heritability of, 765–766 Fixed sites, 686 candidate, 749
Evolutionary genetics, 761–790 Floor plate, in vertebral development, chloroplast, 111, 112f
historical perspective on, 762–764 500–501 coordinately controlled, 402
overview of, 762–764 Flower color definition of, 4
Evolutionary studies, mtDNA in, 116–117 dominant epistasis and, 236, 236f density, in human genome, 524
Excision, of transposable elements, 552, gene discovery for, 51 endogenous, 313
552f gene silencing and, 312–313, 312f, 313f epigenetically equivalent, 461
Exconjugant, 181 incomplete dominance and, 218–219, essential, 221
Exogenote, 186, 186f 218f, 219f gap, 483, 484f, 488
Index 853
Gene regulation (continued) triplet, 325–327, 325f bottlenecks and, 695–696, 697
in sporulation, 423–424 universality of, 328 founder effect and, 694, 696f
trans-acting elements in, 406f, 407 Genetic databases, 523–524 mutations and, 692–693, 694f, 702–703
trp operon and, 414–416, 415f–417f Genetic deviations, 722–724, 724t population size and, 694–696
vs. in eukaryotes, 433–434, 434f Genetic disorders random, 115
repressors in albinism, 60, 674, 674f Genetic engineering, 352, 382–391. See also
early studies of, 398–399 alkaptonuria, 223–224 DNA technology
in eukaryotes, 447, 447f autosomal dominant, 61–63, 62f in animals, 384–391, 389f, 390f
in prokaryotes, 400–401, 405–409, 407t, autosomal recessive, 59–60, 60f definition of, 382
410f, 412, 414f basal cell nevus syndrome, 502 ectopic insertions in, 373
ultraconserved elements in, 526, 527f cancer, 502–503 gene knockout in, 388
variation and, 782–783, 783f Cockayne syndrome, 600–602 gene replacement in, 388
Gene replacement, 388 cri du chat syndrome, 638, 639, 639f gene targeting in, 388–391, 389f
Gene repression, miRNA in, 463–465 cystic fibrosis, 146, 705–706 in plants, 383–386, 384f, 385f
Gene silencing, 311–315, 312f de novo mutations and, 17–20, 18f–20f position effect in, 387
dsRNA and, 310–315, 312f–314f Down syndrome, 618, 630–631, 631f, 647, transgene introduction in, 382
in entire chromosome, 462–463 647f in yeast, 383, 384f
in evolution, 780f, 781–782 due to single nucleotide polymorphisms, Genetic load, 619
gender-specific, 460–463 16 Genetic map unit (m.u.), 136
mating-type switching and, 454–455 dwarfism, 17 Genetic markers, 177
miRNA and, 311, 311f, 463–465 dyskeratosis congenita, 286 unselected, 187
siRNA and, 311, 314f Edwards syndrome, 631 Genetic polymorphisms. See
transgenes and, 312–315 fragile X syndrome, 591–592 Polymorphism(s)
viral resistance and, 314–315 gene identification in. See Genetic Genetic screens, 195
vs. gene repression, 455 analysis for genetic disorders, 378–379
in X chromosome inactivation, 462–463, gene therapy for, 548 for genomic libraries, 513–514
463f genetic screens for, 378–379 for mutations, 32, 32f, 378–379
Gene tagging, 375–376, 377f, 565 genome-wide association studies of, for shotgun libraries, 514, 518f
Gene therapy, 548 752–753, 753f for toolkit genes, 482–483, 482f
for severe combined immunodeficiency, hemophilia, 17, 66, 67f Genetic switches, 400–401, 417–423
570–571 heritability of, 752–753 Genetic toolkit. See Development, toolkit
viruses in, 570–571 holoprosencephaly, 502 genes for
Gene vectors, 180 Huntington’s disease, 146, 592 Genetic variance, 725. See also Variance;
Gene-dosage effect, 633 inborn errors of metabolism, 223–224 Variation
Gene-environment interactions. See also inbreeding and, 679–683, 682t, 683f Genetically modified organisms (GMOs),
under Epigenetic Klinefelter syndrome, 630, 630f 13, 13f, 312, 383–386. See also
expressivity and, 240–241, 240f, 241f lethal alleles in, 222–223 Genetic engineering
penetrance and, 239–241, 240f linkage analysis for, 146–147 animal, 386–391, 386f, 389f, 390f
General transcription factors (GTFs), 302, monosomic, 629, 629f plant, 383–386, 384f, 385f
303–304 osteogenesis imperfecta, 218 yeast, 383, 384f
Generalized transduction, 196, 197–198, Patau syndrome, 631 Genetics, 5
197f, 198f paternal age and, 17–20 bacterial, 173–205
Gene-specific mutagenesis, 540–541, 541f pedigree analysis of, 58–70, 60f conservation, 704–705
Genetic admixture, 688 phenylketonuria (PKU), 48–49, 49f, 59, evolutionary, 761–790
Genetic analysis, 12–13 217, 226, 226f in forensics, 706–707
complementation test in, 227–230 point mutations in, 17–20 forward, 34, 352, 509
DNA cloning in. See Cloning polydactyly, 501–502, 501f Beadle-Tatum experiment in, 224–225
of double mutants, 227, 231–239 risk calculation for, 705–706 history of, 2–10, 6t
forensic, 706–707 severe combined immunodeficiency, 548, in medicine. See Genetic disorders
forward, Beadle-Tatum experiment in, 570–571 population, 665–708
224–225 sickle-cell anemia, 219–220, 220f quantitative, 716–755
Mendelian ratios in, 231–236, 233, 239t single nucleotide polymorphisms in, recent advances in, 14–27
model organisms in. See Model organisms 752–753, 753f reverse, 34, 509, 539–542
mutational. See Mutational analysis single-gene, 34, 508–509 by phenocopying, 541–542
positional cloning in, 377–379, 379f transposable elements and, 569 by random mutagenesis, 540
techniques of, 12–13 trinucleotide repeat diseases, 591–592 by targeted mutagenesis, 541f
Genetic architecture, 716 trisomic, 629–631 vs. environment, 727–731, 730t. See also
Genetic code, 8, 270, 324–329 Turner syndrome, 629, 629f Gene-environment interactions
codons in, 325, 329, 329f undiagnosed, 14–17 Genome
cracking of, 328 Werner syndrome, 286, 286f binding sites in, 519–520, 520f
degeneracy of, 327, 331–332 Williams syndrome, 638–639, 640f duplications of, 641, 641f
number of letters in, 325 xeroderma pigmentosum, 582, 582f, human
overlapping vs. nonoverlapping, 600–602 contents of, 524
325 Genetic dissection, 33 gene density/number in, 301–302, 306,
reversions and, 326–327 Genetic diversity. See Variation 524–525
suppressors and, 326–327 Genetic drift, 691–696 splice variants in, 524
Index 855
structure of, 524–526, 525f Genomics, 13, 352, 507–543, 509. See also in population genetics, 669–670, 671–672
transposable elements in, 567f, Genome sequencing star-cluster, 669–670, 671f
568–569, 568f applications of, 509–510 Haplotype network, 641, 670f, 671f
vs. chimpanzee genome, 532 in bioinformatics, 510 Haplotype number, 686
vs. mouse genome, 530–532 comparative, 510, 526, 527–536, 527f. See HapMap, 671–672
information in, 519–520, 520f. See also also Comparative genomics Hardy-Weinberg equilibrium, 675
Bioinformatics forward approach in, 509 Hardy-Weinberg law, 674–675, 677t
mapping of. See Maps/mapping functional, 510, 536–543 Harebell plant, flower color in, 228–229,
noncoding elements of, conserved/ DNA microarray in, 536–537, 537f 228f
ultraconserved, 525–526, 527f with nonmodel organisms, 542–543, Hayes, William, 178–179
noncoding functional elements in, 543f Hbs allele
525–526 two-hybrid test in, 537–538, 538f malaria and, 767–771
organelle, 111, 112f history of, 509–510 molecular origins of, 783
of prokaryotes, insertion sequences in, impact of, 509 hedgehog gene, 500
553–554, 555 medical applications of, 508–509 Height
protein-coding regions of, 292–293 personal, 508–509, 527, 533–534, 752–753 genome-wide association study of,
relative sizes of, 292–293 Genotype, 38 752–754
repetitive DNA in, 524 fitness of, 697–698. See also Fitness heritability of, 722–723, 724t, 731, 738,
safe havens in, 569–571, 571f Genotype frequency, 673–674 738f, 752–754
sequencing of. See Genome sequencing; gene pool and, 673–674 Helicases, 279
Maps/mapping Hardy-Weinberg law and, 674–677 Helix
size of Genotype-phenotype interactions, 737 alpha, 324, 324f
C-value paradox and, 567 Genotypic ratio, 39, 39f double. See Double helix
transposable elements and, 567 giant gene, 490–491, 490f Helix-turn-helix motif, 422, 422f
ultraconserved elements in, 526, 527f Gilbert, Walter, 399, 408–409 Hox proteins and, 479
viral, mapping of, 192–196 Global genomic nucleotide-excision repair, lac repressor and, 422, 422f
Genome projects, 510 600–602, 601f Hemimethylation, DNA, 450, 450f
Genome sequencing, 511–519, 525f. See also Globular proteins, 324 Hemizygous genes, 55
Genomics glp-1 gene, 496–498 Hemoglobin
accuracy in, 512–513 Glucose, lactose metabolism and, 409–410, evolution of, 788–790, 789f
annotation in, 520 410–411 structure of, 324f
automation of, 510–511, 512f Glucose-6-phosphate dehydrogenase gene, Hemophilia, 17, 66, 67f
in binding site prediction, 521–522, 522f variation in, 684–687, 685f Hereditary nonpolyposis colorectal cancer,
bioinformatics and, 519–524 GMP (guanosine 5′-monophosphate), 295f 604
BLAST search in, 522–523, 523f Gould, John, 762 Heritability. See also Inheritance
clone-based, 514 Grasses, transposable elements in, 569, 570f, additive effect and, 733–736
complete, 513 575 breeding and, 738–740, 739t
data interpretation in, 520–524. See also Green Revolution, 88 breeding value and, 735
Bioinformatics Green-red color blindness, 66 broad-sense, 727–731, 730t
draft-quality, 513, 519 Grieder, Carol, 284–285 definition of, 728
in E. coli, 534–536 Griffith, Frederick, 261 dominance effect and, 733–736
ENCODE project for, 525–526 GU-AG rule, 307, 307f gene action and, 732–733
filling gaps in, 519 Guanine (G), 7. See also Base(s) of genetic disorders, 752–753
finished-quality, 513, 519 in DNA, 266t, 268–269, 268f, 270f genotype-phenotype interactions and,
gene prediction from, 520–524 in RNA, 294, 295f 737
goals of, 513 Guanosine 5′-monophosphate (GMP), 295f group differences and, 731
library construction in, 513–514 of height, 722–723, 724t, 731
open reading frames in, 520, 523 Hairpin loop, 301, 301f of intelligence, 729–730
paired-end reads in, 517–518, 518f Hairy ears, 68, 69f interaction effects in, 737
polypeptide prediction from, 520–524 hairy gene, 489, 490f measurement of, 728–731
primers in, 514 Haldane, J. S. B., 17 narrow-sense, 731, 736–739
protein inventory in, 520–524 Haploid(s) phenotype prediction and, 731–742
proteome in, 520–524 independent assortment in, 92, 103, twin studies of, 728–731
scaffolds in, 518, 518f 104f of variation, in evolution, 765–766
sequence assembly in, 511–513, 512 meiosis in, 44–45 Hershey, Alfred, 194, 263
sequence contigs in, 514 as model organisms, 44–45 Hershey-Chase experiment, 263–264, 263f
steps in, 511, 512f recombination analysis for, 106 Heterochromatin, 282, 443f, 455–456
vectors in, 514 single-gene inheritance in, 44–45 barrier insulators for, 459, 459f
whole-gene shotgun, 514–517, 518f Haploidy, 619, 620t constitutive, 456
Genome surveillance, 573–574 Haploinsufficient genes, 50 in gene silencing, 456, 459f
Genome-wide association studies, 749–754 dominant mutations and, 217–218, 217f, spreading of, 458–459, 459f
Genomic DNA, in cloning, 355–356 218f vs. euchromatin, 456
Genomic imprinting, 460–461, 460f, 461f Haplosufficient genes, 50 Heterochromatin protein-1 (HP-1), 457–458
Genomic library, 366–367, 513–514 recessive mutations and, 217, 217f Heteroduplex DNA, 156
construction of, 513–514 Haplotype, 669–670 Heterogametic sex, 54
Genomic maps. See Maps/mapping mapping of, 671–672 Heterokaryons, 229–230, 231f
856 Index
Heteromorphic gene pairs, independent Hox genes, 476–482, 476f–481f, 491–494, pure lines and, 98–99
assortment of, 101, 101f 492f, 493f. See also Development recombination and, 104–108, 106f–108f
Heteroplasmons, 113 hunchback gene, 488, 488f, 489–490, 490f Indian hedgehog gene, 500
dihybrid, 115 in cancer, 503t Induced mutations, 586
Heterozygosity, 38, 98, 687 huntingtin, 146 molecular basis of, 593–596
variation and, 687 Huntington’s disease, 63, 63f, 146, 592 Inducer(s), 404
Heterozygote, 38 Hybrid dysgenesis, 562, 563, 563f Lac, 404
Heterozygous inversions, 643, 644f Hybrid genes, in cancer, 649f, 650 Induction, 404
Hexaploidy, 619 Hybrid seed, 99–100 Infertility
Hfr strain, of fertility factor, 179–184, 181f, Hybrid vigor, 99–100 in aneuploids, 621
184f Hybridization half sterility and, 646
Histograms, 109–110, 110f comparative genomic, 650, 650f heterozygous inversions and, 646
frequency, 719, 720f in recombinant DNA production, reciprocal translocations and, 646
Histone(s), 443. See also Chromatin 355 in triploids, 627
acetylation of, 446–447, 447f Hydrogen bonds, in double helix, 7, 267, Inheritance. See also Heritability
in chromatin remodeling, 444–445 268f blending theory of, 2–3
consensus, 449 Hydrogen peroxide, mutations and, 590 chromosome theory of, 5, 58
deacetylation of, 446 Hydroxyl radicals, mutations and, 590 complex, 717
hyperacetylated, 446 Hyperacetylated histones, 446 cytoplasmic, 113–116, 113f, 114f
hypoacetylated, 446 Hypertrichosis, 68 epigenetic, 463
linker, 443 Hyphae, 105 histone modifications and, 448–449
methylation of, 448 Hyphal branching, gene discovery for, extranuclear, 111–116
modifications of, 444, 445–449, 445f, 447f 52–53 maternal, 111
epigenetic inheritance and, 448–449 Hypoacetylated histones, 446 Mendelian, 34–39, 36f. See also Single-
post-translational, 446 Hypophosphatemia, 68 gene inheritance
structure of, 443f, 445–446, 445f Hypostatic mutations, 234 first law of, 37
Histone code, 446 second law of, 90–91
Histone deacetylase (HDAC), 446, 447 I gene, in lac operon, 405t–407t, 406–408, Mendel’s laws of, 3–5, 3f, 4f
Histone H3 methyltransferase, 458 406f of mitochondrial diseases, 115–116, 116f,
Histone tails, 443f, 446 ICR compounds, as mutagens, 594, 594f 117f
modification of, 445–446, 445f Identical by descent, 680 monoallelic, X-chromosome inactivation
Histone variants, 444, 449 Ikeda, H., 197 as, 462–463, 463f
HMTase, 458 Illumina system, 516–517 of organelle genes, 111–116
Hodges, Sean, 666 Imino bases, 276–277, 276f polygenic, 108–110, 108f, 111f
Holliday junctions, 157 Imprinting, genomic, 460–461, 460f, 461f sex-linked, 55–58
Holliday, Robin, 157 Inborn errors of metabolism, 223–224 early studies of, 58
Holoprosencephaly, 502 Inbred lines (strains), 723 X-linked, 55–58, 57f
Homeobox, 479 Inbreeding, 679–683, 680f Y-inked, 55
Homeobox genes, in D. melanogaster, autosomal recessive disorders and, 60, Y-linked, 68
472–473 60f simple, 717
Homeodomain, 479, 479f genetic disorders and, 679–683, 682t, 683f single-gene, 31–71. See also Single-gene
Homeologous chromosomes, 620 in model organisms, 723 inheritance
Homeotic development, 474–476, 475f pedigree analysis for, 680–682, 680f uniparental, 111
Homeotic genes, in Drosophila melanogaster, population size and, 682–684 Initiation, transcriptional, 298
474–482 Inbreeding coefficient (F), 680, 681 in eukaryotes, 303–304, 304f
Homodimeric proteins, 218 Inbreeding depression, 680 in prokaryotes, 298–300, 299f, 300f
Homogametic sex, 54 Incomplete dominance, 218–219, 219f Initiation factors, in translation, 335–337,
Homologous chromosomes, 621f Incomplete penetrance, 240 336f
Homologous recombination, 382, Indel mutations, 584, 585, 589, 670 Initiator
607–608 Independent assortment, 88–118. See also in arabinose operon, 413
nonallelic, 636 Crosses in translation, 335–337, 336f
Homologs, 528 chromosomal basis of, 101–108, 102f Insects. See also Drosophila melanogaster
Homothorax gene, 493f, 494 definition of, 88 as model organisms, 56, 470, 472–473
Homozygosity in diploid organisms, 101–102, 102f transgenic, 542–543, 542f, 543f
dominant, 38 in haploid organisms, 92, 103, 104f Insertional duplications, 640
recessive, 38 of heteromorphic gene pairs, 101 Insertional mutagenesis, in mapping,
Homozygote, 38 hybrid vigor and, 99–100 203–204, 204f
Homozygous dominant, 38 Mendel’s law of, 89–93 Insertion-sequence (IS) elements, 184,
Homozygous inversions, 642–643 of polygenes, 108–110, 109f 553–554, 555. See also Transposable
Homozygous recessive, 38 progeny ratios in elements
Horizontal transmission, 175–176 branch diagrams for, 90, 93 Insulators
Horvitz, Robert, 497 chi square test for, 96–98, 97t barrier, 459, 459f
Host range, phage, 193 Mendelian, 89–93, 90f, 92f in genomic imprinting, 460f, 461
Hot spots, mutational, 598–599, 599f prediction of, 93–95 Intelligence, heritability of, 729–730
Housekeeping genes, 474. See also product rule for, 94 Interactome, 341, 342f, 536, 537–538
Development sum rule for, 94 ChIP assay for, 538–539, 539f
Index 857
two-hybrid test for, 537–538, 538f lac operator site, 402 generalized transduction in, 197–198
Intercalating agents, 594, 594f lac operon, 402f, 403, 411–413 microsatellite markers in, 145
Interference, in crossing over, 141–142 activation of, 410–411, 410f–412f minisatellite markers in, 145
β-Interferon enhanceosome, 451–452, 451f CAP-cAMP complex and, 410–413, molecular markers in, 146–148, 147f
Interphase, in meiosis, 85 410f–412f phenotypic ratios in, 142, 144f
Interrupted mating, 181, 183 catabolite repression of, 410–413 polymerase chain reaction in, 146, 147f
in mapping, 186–188, 187f discovery of, 404–409 recombinant frequency in, 129–132
Intragenic deletions, 637 DNA binding sites of, 410–411, 410f–412f restriction fragment length
Intrinsic mechanism, in transcription negative regulation of, 403f, 404–409, 412, polymorphisms in, 145
termination, 300f, 301 412f symbols used in, 132–133
Introns, 292 positive regulation of, 400, 401f, 409–413, testcrosses in, 130–131, 146–147
identification of, 520–524 412, 412f three-point, 139–140
removal of, 305–306, 305f–310f repression of, 404–409, 406f–409f Linkage disequilibrium, variation from,
self-splicing, 308–309, 310f lac promoter, 402, 408, 409f 689–690, 690f
size of, 524 RNA polymerase binding to, 409–411 Linkage equilibrium, 689
transposable elements in, 568 lac repressor, 405–409, 406f–409f Linkage maps, 136. See also Maps/mapping
Inversion(s), 634, 635f, 636, 642–645 lac system, 177, 401–413 vs. physical maps, 201–204
balancer, 645 components of, 402 Linked genes, 129–135
deletions and, 644, 644f discovery of, 404–409 in cis conformation, 132
heterozygous, 643, 644f induction of, 402–404 inheritance of, 129–132, 131f
homozygous, 642–643 Lactose intolerance, 26–27, 26f, 27f in trans conformation, 132
paracentric, 642 Lactose metabolism, 402f, 410–411 lncRNA (long noncoding RNA), 296
pericentric, 642, 644–645, 645f lacZ gene, 405–406, 405t–407t, 406f, 438, Loci, gene. See Gene loci
Inversion heterozygote, 643, 644f 488 Long interspersed elements (LINEs), 567f,
Inversion loops, 643, 644f Lagging strand, 275, 275f 568–569, 568f
Inverted repeat (IR) sequences, 554 Lamarck, Jean-Baptiste, 764 Long noncoding RNA (lncRNA), 296
Ionizing radiation, as mutagen, 594–595 Lambda (λ) phage. See also Phage(s) Long terminal repeats (LTRs), 558,
IPTG, 405, 405f life cycle of, 417, 418f, 421f 569–571
IS elements, 184, 553–554. See also map of, 420f retrotransposons and, 560–561
Transposable elements in specialized transduction, 199–200, solo, 560–561
ISL1 gene, 526 199f–201f The Lord of the Rings (Tolkien), 204
Isoforms, 339–340 Law of equal segregation, 37, 44. See also LTR-retrotransposons, 560–561
Isogenic lines, 747 Segregation, equal Luria, Salvador, 586–588
Isolation by distance, 678–679 Law of independent assortment, 89–93. See Luria-Delbrück fluctuation test, 586–588,
Isopropyl-β-D-thiogalactoside (IPTG), 405, also Independent assortment 587, 587f
405f Leader sequence, 415 Lwoff, André, 398, 398f
Leading strand, 274 Lymphoma, Burkitt, 649, 649f
Jackpot mutation, 587 Leaky mutations, 49, 49f Lysate, 193
Jacob, François, 181–183, 184, 188, 398–399, Lederberg, Esther, 199, 588 Lysis, bacterial, 193
398f, 401, 404–408, 408–409, 417 Lederberg, Joshua, 179, 180, 196, 199 specialized transduction and, 198–200,
Jacob, Howard, 508 Leeuwenhoek, Antony van, 180 199f–201f
Jimsonweed (Datura stramonium), Leptotene, 84 Lysogenic bacteria, 196
aneuploidy in, 632, 632f let-7 gene, 499 in specialized transduction, 198–200
Jorgensen, Richard, 312 Lethal alleles, 220–223 Lysogenic cycle, bacteriophage, 417–422,
Jumping genes. See Transposable elements Lewis, Edward, 478, 483 418f
Libraries Lysogenic phages, in specialized
Karpechenko, Georgi, 623–624 cDNA, 366–367 transduction, 198–200
Kearns-Sayre syndrome, 115 clone identification in, 367–372. See also Lytic cycle, bacteriophage, 417–422, 418f
Kennedy disease, 592 Clone(s), identification of
Keto base, 276, 276f DNA template, 515 M cytotype, 562
Khorana, H. Gobind, 328 expression, 369 M phase, of cell cycle. See Mitosis
Kidwell, Margaret, 562 genomic, 366–367, 513–514 Mackill, David, 21
Kinases, in phosphorylation, 341, 341f shotgun, 514, 518f MacLeod, Colin, 7, 262
Kingsley, David, 784 Ligases, 12 Maize (Zea mays)
Klinefelter syndrome, 630, 630f Lilium regale, mitosis in, 83f as model organism, 551
Kluyveromyces, duplications in, 641, 642f Limb development, 491–494, 492f, 500–501 transposable elements in, 549–553,
Knockout mutations, 388, 389f lin-4 microRNA, 310 550f–552f
Kornberg, Arthur, 273, 352 lin-41 gene, 499 Major groove, 267, 269f
Kreitman, Martin, 778 Linear tetrads, in centromere mapping, Major histocompatibility complex, 678
Kruglyak, Leonid, 14 148–150, 149f Malaria
Krüppel gene, 484f, 490, 490f, 491 LINEs (long interspersed elements), 567f, Duffy antigen and, 785–786
568–569, 568f G6PD gene and, 684–687
lac genes, 401–402, 403f Linkage analysis, 129–135 Hbs allele and, 767–771
F′ plasmid and, 188, 189f chi-square test in, 150–151 sickle-cell anemia and, 767–771, 767f
Lac inducer, 404 development of, 129–131 Male pattern baldness, 676, 676f
Lac operator, 408–409, 409f gene order determination in, 141 Mammals, monotreme, 528–529
858 Index
Map units (m.u.), 136, 149–150 Mayer, Alan, 508–509 as model organisms, 222, 222f ,
Mapping function, 151–152, 151f McCarty, Maclyn, 7, 262 806–807
Maps/mapping, 128–129 McClintock, Barbara, 133, 549–553, 551f transgenesis in, 387–391, 387f
applications of, 128–129 McDonald, John, 778 rocket pocket, coat color in, 780–781,
association, 749–754 Mean, 718–719 780f
bacterial, 184–188, 187f, 201–204, Mediator complex, 440, 441f Microarrays, 536–537, 537f
202f–204f Medical genetics. See Genetic disorders in comparative genomic hybridization,
centromere, 148–150, 149f Meiocytes, 40, 41f 650, 650f
of chromosomal rearrangements, 648 formation of, 106, 106f–108f MicroRNA (miRNA), 296, 310, 311, 311f,
chromosome, 129–148 Meiosis, 40–45, 46f, 47f 463–465
chromosome walk in, 378 centromeres in, 41–42, 148–150, 149f in development, 499, 499f
chromosome walking in, 378, 379f crossing over in. See Crossing over in gene repression, 463–465
deletion, 638, 639f disjunction in, 628 Microsatellite(s), 667, 667
experimental basis for, 129–131 DNA replication in, 46–48, 47f mutation rate in, 687–688, 688t
fine, 379–382, 381 first-division segregation (MI) patterns in, in population genetics, 667, 669f
QTL, 747 149, 149f Microsatellite markers, 145, 147f
genetic distances in, 136–138, 137f, 140 gene transmission in, 47f in DNA matching, 707
haplotype, 671–672 independent assortment in, 89–93, Migration, variation from, 688
insertional mutagenesis in, 203–204, 204f 101–108. See also Independent Miller, Jeffrey, 598
by interrupted mating, 186–188, 187f assortment ming element, 575–576
interrupted mating and, 181, 183 at molecular level, 46 Ming, Yao, 722–723, 722f
linkage, 136 nondisjunction in, 628, 628f Miniature inverted repeat transposable
vs. physical maps, 201–204 products of, 42 elements (MITEs), 575–576, 575–576,
map units in, 136, 149–150 second-division segregation (MII) 575f
microsatellite markers in, 145 patterns in, 149–150, 149f Minimal medium, 176
minisatellite markers in, 145 stages of, 42f–43f, 47f, 83f Minisatellite markers, 145
molecular markers in, 144–148, 147f Meiotic recombination, 105–108, 106f–108f Minor groove, 267, 269f
multiple crossovers and, 151–153, 151f double-strand breaks in, 606–609, 10 regions, 299
Perkins formula for, 152–153 608–609, 609f 35 regions, 299
physical, 129, 154–155, 201–204. See also initiation of, 608 Mirabilis jalapa, cytoplasmic segregation in,
Genome sequencing Melanism, 780–781 113–114, 113f, 114f
recombination-based maps and, Mello, Craig, 311–312, 312 miRNA (microRNA), 296, 310, 311, 311f,
154–155, 155f Mendel, Gregor, 34–39, 764 463–465
vs. linkage maps, 201–204 genetic studies of, 34–39, 35f, 36f. See also in development, 499, 499f
polymerase chain reaction in, 146 Single-gene inheritance in gene repression, 463–465
by pseudodominance, 638, 639f plant breeding experiments of, 3–5, Miscarriage, chromosomal rearrangements
QTL, 742–748, 743f, 744t, 745f–748f, 749t 3f, 4f and, 651, 651f
by recombinant frequency, 135–143, 138f Mendelian inheritance, 34–39, 36f. See also Mismatch repair, 602–604, 603f
recombination-based, 129–155 Single-gene inheritance Missense mutations, 584, 585f
for bacteria, 184–188 Mendelian ratios, 89–93, 90f, 92f. See also MITEs (miniature inverted repeat
physical maps and, 154–155, 155f Phenotypic ratios transposable elements), 575–576,
restriction fragment length in genetic analysis, 231–236, 233, 239t 575–576, 575f
polymorphisms in, 145 Mendel’s first law, 37, 44. See also Mitochondria. See also Organelle(s)
sequence. See Genome sequencing Segregation, equal cytoplasmic segregation in, 114–116, 115f
single nucleotide polymorphisms in, Mendel’s second law, 89–93. See also Mitochondrial diseases, inheritance of,
144–145, 147 Independent assortment 115–116, 116f, 117f
transposons in, 203–204, 204f Meristic traits, 718 Mitochondrial DNA (mtDNA), 111, 112f
Marker gene, 177 Merozygote, 186, 186f evolutionary trees and, 116–117
unselected, 187 MERRF, inheritance of, 115 variation in, 670–671
MAT locus, 441–442, 447, 455f Meselson-Stahl experiment, 271, 272f Mitochondrial Eve, 671
Maternal imprinting, 460 Messenger RNA. See mRNA (messenger Mitochondrial genes
Maternal inheritance, 111 RNA) characteristics of, 111, 112f
Maternal-effect genes, 482–483, 484f Metaphase inheritance of, 111–116
Bicoid, 483, 484, 484f, 485 in meiosis, 84–85 mutations in, 115–116, 116f
Mating, interrupted, 181, 183 in mitosis, 83 Mitosis, 34, 41f–43f
in mapping, 186–188, 187f Methylation disjunction in, 628
Mating systems, 677–684 DNA, 460–461 DNA replication in, 46–48, 47f
assortive mating, 677–678 inheritance of, 449–450, 450f gene transmission in, 47f
inbreeding, 679–683, 680f. See also histone, 448 at molecular level, 46
Inbreeding 5-Methylcytosine, as mutational hot spot, nondisjunction in, 628
isolation by distance, 678–679 598–599, 599f stages of, 42f–43f, 47f, 84–85f
Mating types, fungal, 44, 45f, 105 Metric phenotypes, 108 Mixed infections, 194, 195f
Mating-type switching Mice Mobile elements. See Transposable
chromatin remodeling and, 444–445 house (Mus musculus) elements
gene silencing and, 454–455 coat color in, 221–222, 221f Model building, 264–265, 267f
in yeast, 441–442, 454–455 genome of, vs. human genome, Model organisms, 10–12, 11f, 794–807
Matthaei, Heinrich, 328 530–532 Arabidopsis thaliana, 10, 11f , 800–801
Index 859
Caenorhabditis elegans, 10, 11, 11f, 497, synthetic, 328 epistatic, 234–236, 234f–236f
802–803 in translation, 332–335. See also fate of, 693, 694f
Drosophila melanogaster, 11, 11f, 56, 470, Translation frameshift, 326–327, 585, 585f, 589, 590f
472–473, 804–805 mtDNA (mitochondrial DNA), 111, 112f gene
Escherichia coli, 180, 794–795 in evolutionary studies, 116–117 definition of, 582
genetic engineering in, 383–391 variation in, 670–671 vs. chromosome mutations, 618
haploids as, 44–45 Muller, Hermann, 456 genetic drift and, 692–693, 694f, 702–703
inbred lines for, 723 Müller-Hill, Benno, 408 in haploinsufficient genes, 217–218
Mus musculus, 11, 11f, 222, 222f, 806–807 Mullis, Kary, 358 hot spots for, 598–599, 599f
Neurospora crassa, 105, 798–799 Multifactorial hypothesis, 7, 716 hypostatic, 234
Pisum sativum, 34–39, 35f, 36f Multigenic deletions, 637 indel, 584, 585, 589, 670
Saccharomyces cerevisiae, 11, 11f, 436, Multiple alleles, 216 induced, 586
796–797 Mus musculus. See Mice, house (Mus molecular basis of, 593–596
Zea mays, 551 musculus) insertional mutagenesis and, 203–204,
Modifiers, 238 Muscular dystrophy, 66 204f
Modrich, Paul, 603 Mutagen(s), 586 jackpot, 587
Mold. See Fungi; Neurospora crassa acridine orange, 594, 594f knockout, 388, 389f
Molecular clock, 694, 695, 773 aflatoxin B1 as, 595, 596f, 597f leaky, 49, 49f
Molecular evolution, 771–774 Ames test for, 595–596, 597f missense, 584, 585f
Molecular machines carcinogenic, 595–596, 597f, 609–612 modifier, 238
Dicer, 310 ICR compounds as, 594, 594f at molecular level, 45–46
protein-protein interactions in, 341 ionizing radiation as, 594–595 mutagens and, 586. See also Mutagen(s);
repliome, 277 proflavin, 594, 594f Mutagenesis
replisomes, 277–280, 332 ultraviolet light as, 594 neutral, 771–774
ribosomes, 332–335 SOS repair system and, 604–605, 605f nondisjunction, 628, 628f
RISC, 310 Mutagenesis, 593–596 nonsense, 584, 585f
spliceosomes, 292 base analogs in, 593, 593f null, 49, 49f, 218
Molecular markers base damage in, 594–595 oncogene, 610–611
in linkage analysis, 146–148, 147f intercalating agents in, 594, 594f oxidative damage and, 590
in mapping, 144–148, 470f mechanisms of, 593–595 penetrance of, 239–241, 240f
Molecular mechanisms random, 540 phenotypic consequences of, 583–586
in allelic functions, 48–50 specific base mispairing in, 593–594, 594f in plant breeding, 88
in cell division, 46 targeted, 541f point, 583–586, 583f, 585f
in crossing over, 155–157, 156f Mutant(s), 32 in coding regions, 584–585, 585f
in meiosis, 46 amber, 329, 338–339 functional consequences of, 583f,
in mitosis, 46 bacterial, 176–177, 176t 584–585, 585f
in recombination, 155–157 auxotrophic, 176 genetic disorders and, 17–20
in replication, 46–48, 46f, 47f resistant, 177 in noncoding regions, 586
in segregation, 46–48 double, 227, 231–239 at splice sites, 584–585, 585f
Monoallelic inheritance, X-chromosome epistasis and, 234–236 types of, 583–584
inactivation as, 462–463 genetic analysis of, 231–239 premutations and, 592
Monod, Jacques, 10, 398–399, 398f, 401, lethal, 239, 239f proto-oncogene, 610
404–408, 409 Mutant rescue, 370 rate of, 688t
Monohybrid, 38, 89 Mutation(s), 32 variation and, 687–688, 771–772, 771f
Monohybrid cross, 38, 89 apoptosis and, 609 recessive, 50, 217, 217f
Monoploidy, 619, 619–620, 620t auxotrophic, 176, 224–225 in regulatory sequences, 782–783
in plant breeding, 626, 626f base insertion, 583–584 replication errors and, 588–589
Monosomy, 620t, 627, 629, 629f base substitution, 583–584 replication slippage and, 589
Monotremes, 528–529 cancer-causing, 595–596, 597f, 609–612 repressor, 400–401, 401f, 405–409, 407t
Morgan, Thomas Hunt, 5, 58, 130–131, 135, chromosome, 618 early studies of, 398–399
136, 470 incidence of, 651, 651f screen for, 32, 32f
Morphogens, 471 numerical changes and, 618–634 segregation ratios for, 38–39, 39f, 50–53
Morphological evolution, 779–786, structural changes and, 634–650 selection and, 703–704
779f–783f, 785f, 786f complemented, 227–230 single-gene, 32–34
Morphs, 63. See also Polymorphism(s) constitutive, 406 SOS system and, 604–605, 605f
Mosaicism cytoplasmic sources of, 588–590
flower color and, 553f in humans, 115–116, 116f spontaneous, 586–592
transposable elements and, 553f in plants, 114–115 molecular basis of, 586–592
Mosquitoes, transgenic, 542–543, 543f de novo, 17–20, 18f, 19f, 20f spontaneous lesions and, 589–590
Mouse. See Mice deamination and, 590, 598–599 substitution
mRNA (messenger RNA), 10, 294, 303. depurination and, 589 conservative vs. nonconservative,
See also Pre-mRNA; RNA dominant, 50, 216–218, 217f 584
binding sites for, 519, 520f dominant negative, 218, 218f neutral theory and, 771–774
ribosome and, 332–335 double-strand breaks, 606–609, 607f, 609f superrepressor, 407–408, 407t
splicing of, 305–306 dysgenic, 562 suppressor, 236–238, 238f
synthesis of. See Transcription; effects on genes/proteins, 45–46, 48–50, genetic code and, 326–327
Translation 49f nonsense, 238, 338–339
860 Index
Peterson, Peter, 552 Punnett square for, 91–92, 92f Ti, 384–386, 384f, 385f
Phage(s), 174, 192–200 in sex-linked inheritance, 55 transfer of, in bacterial conjugation,
amber mutant, 329, 338–339 sum rule for, 94 178–179, 179f, 180f
bacterial infection by, 192–193, 193f–195f Phenotypic variance. See Variance; Variation yeast integrative, 383, 384f
DNA in, 263–264 Phenylketonuria (PKU), 59 Plasmid vectors, 363, 364f
function of, 263 mutations in, 48–49, 49f, 217, 226, 226f Ti, 384–386, 384f, 385f
gene exchange in, 175–176, 175f Phenylthiocarbamide (PTC) tasters, 64, 66f Plasmodium falciparum, 767
hereditary processes in, 174–175 Pheromones, 226, 441 Plasmodium vivax, Duffy antigen and,
host range and, 193 Phosphatases, 341, 341f 785–786, 786f
lambda (λ) Phosphate, in DNA, 264, 265f, 267, 268f Plasterk, Ron, 571–572
life cycle of, 417, 418f, 421f Phosphorylation, of proteins, 341–342, 341f Plating, 176, 176f
map of, 420f Photoproducts, 594 Platypus genome, 528
in specialized transduction, 199–200, Phylogenetic inference, 528 Pleated sheets, 323f, 324
199f–201f Phylogeny, 527–530, 528, 529f Pleiotropic alleles, 222
lysogenic, 196 Physical maps, 129, 154–155, 201–204. See Point mutations, 583–586, 583f, 585f
in specialized transduction, 198–200 also Genome sequencing; Maps/ in coding regions, 584–585, 585f
lysogenic cycle in, 417–422, 418f mapping functional consequences of, 583f,
lytic cycle in, 417–422, 418f recombination-based maps and, 154–155, 584–585, 585f
mapping of, 194–196 155f genetic disorders and, 17–20
plaque morphology of, 193, 195f vs. linkage maps, 201–204 in noncoding regions, 586
prophages and, 196, 417 Pi-clusters, 573–574, 574f at splice sites, 584–585, 585f
temperate, 196 Piebaldism, 63, 65f types of, 583–584
in transduction, 196–200, 197f–201f. See Pigment gene, 313 Poisson distribution, 151–152, 151f
also Transduction Pigmentation poky gene, 112–113, 113f
in viral genome mapping, 192–196 body/coat Pol I (DNA polymerase I), 273–274, 274f,
virulent, 196 in D. melanogaster, 782–783, 782f, 783f 275, 276
Phage crosses, in recombination analysis, in mice, 221–222, 221f, 780–781, 780f Pol III holoenzyme, 274, 275, 276, 277, 278f
194–196, 195f regulatory evolution and, 782–783, Polyadenylation signal, 305
Phage recombination, 176 782f, 783f Polydactyly, 63, 64f, 501–502, 501f
in mapping, 194–196 flower. See Flower color Polygenes, 88–89, 108–110
Phage vectors, 363 Pili, 179, 180f continuous variation and, 108–110, 109f
Pharmaceutical development, structure- Ping element, 575–576 identification of, 110
based drug design in, 335 piRNA (piwi-interacting RNA), 296, 314, Polygenic inheritance, 108–110, 108f, 111f
phe operon, 416–417 573–574, 574f Polymerase chain reaction (PCR), 146, 147f,
Phenocopying, 541–542 Pisum sativum, as model organism, 34–39, 353f, 354, 356–358, 356f
Phenotype, 32. See also Traits 35f, 36f Polymerase III holoenzyme, 274, 275, 276,
expression of Pitx1 gene, 784 277, 278f
expressivity and, 240–241, 240f, 241f Plants Polymorphism(s), 32–33, 63–64. See also
factors affecting, 239–241 allopolyploid, 623–625, 623f, 624f, 627 Variation
penetrance of, 239–241, 240f aneuploid, 621, 632 autosomal, pedigree analysis of, 63–64,
gene dosage and, 633 breeding of. See Breeding 66f
gene interaction and, 215–242 cell division in, 41f, 83f blood group, 219
intensity of, 240 chloroplasts of. See under Chloroplast frequency of, 64
lethal alleles and, 220–223 chromosome deletions in, 639–640 restriction fragment length, 145
mutant, 32. See also Mutation(s) cytoplasmic segregation in, 113–114, 113f, in linkage analysis, 145
quantitative, 108 114f single-nucleotide, 145
unstable, 549–550 dioecious, 54, 55f in mapping, 144–145, 147
transposable elements and, 549–550, embryoid, 626 Polypeptide, 322. See also Protein(s)
552f gene silencing in, 312–313, 313f Polypeptide chain
wild type, 32 independent assortment in, 88 amino end of, 322, 322f, 323f
Phenotype prediction, 731–742 Mendel’s breeding experiments with, 3–5, carboxyl end of, 322, 322f, 323f
additive vs. dominance effects and, 3f, 4f Polyploidy, 619, 620–626, 622f, 623f
733–736 pigment gene in, 313 allopolyploidy, 620, 623–625
narrow-sense heritability and, 736–740 polyploidy in, 623–625, 787–788, 787f in animal breeding, 627
Phenotype-genotype interactions, 737 single-gene mutations in, 88 autopolyploidy, 620–623, 621f, 622f
Phenotypic ratios, 39, 39f, 51, 142 transgenic, 383–386, 384f, 385f in evolution, 787–788, 787f
branch diagrams for, 90, 93 transposing elements in, 569, 570f, 575 gene balance and, 632–634, 632–634
chi square test for, 96–98, 97t viral resistance in, 314–315 gene number and, 787–788
in independent assortment, 89–93, 90f, Plaque, phage, 193, 195f gene-dosage effect and, 633
92f Plasmid(s), 179 induced, 622f
lethal alleles and, 223 bacterial segments of, 189f organism size and, 620, 621f, 627
in linkage analysis, 142, 144f conjugative, 191 in plants, 623–625, 787–788, 787f
Mendelian, 89–93, 90f, 92f F, 179–184, 180, 180f, 181f. See also in breeding, 627
in genetic analysis, 231–236 Fertility factor (F) vs. aneuploidy, 632–634
prediction of, 53, 93–95 genetic determinants of, 190t Poly(A) tail, 305
product rule for, 94 R, 188–191, 189, 189f, 190t, 554–555 Polytene chromosomes, 637–638, 638f
862 Index
Rare single nucleotide polymorphisms, 667 Recombination maps, 129–155. See also template for, 270
ras oncogene, 611, 611f Maps/mapping termination of, 283–287
Ratios for bacteria, 184–188 3′ growing tip in, 274, 275f
gametic, 38–39, 39, 39f physical maps and, 154–155, 155f topoisomerases in, 279–280
genotypic, 39, 39f Red-green color blindness, 66 of transposable elements, 556–557, 556f
Mendelian, 89–93, 90f, 92f Regulatory elements, 7 Watson-Crick model of, 267–270,
for crosses, 89–93 Regulatory-sequence evolution, 782–786 267f–269f
in genetic analysis, 231–236 in humans, 785–786 Replication fork, 266f, 272–273, 274–275
phenotypic. See Phenotypic ratios loss of characters through, 783–784 stalled, 600
progeny, chi square test for, 96–98, 97t Regulons, 424 SOS system and, 604–605, 605f
segregation, 38–39, 39f, 50–53 Relative fitness, 697 Replication slippage, 589
Reaction norm, 737 Release factors (RF), 338 Replicative transposition, 556–557, 556f
Rearrangements. See Chromosome Repetitive DNA, 524 Replisome, 332
rearrangements chromosome rearrangements and, 635f, eukaryotic, 280–281
RecA, 604–605, 607 636 prokaryotic, 277–280
Recessive alleles, lethal, 220–223 conserved/ultraconserved, 526 Reporter genes, 438
Recessive epistasis, 234–235, 234f, 235f crossing over of, 635f, 636 in development, 488, 489f
Recessive mutations, 50, 217, 217f Replica plating, 588, 588f transgenes and, 542
Recessiveness, 37, 50, 731, 731f Replication, 9–10 Repressor(s), 400–401, 401f
haplosufficiency and, 217, 217f accessory proteins in, 277, 278f in development, 489–491, 489f
homozygous, 38 accuracy (fidelity) of, 276–277 DNA binding of, 417–423, 421f, 423f
Recipient, in conjugation, 178 β clamp in, 277 early studies of, 398–399
Reciprocal translocations, 637, 645–647, blocked, 600 in eukaryotes, 447, 447f
646f SOS system and, 604–605, 605f in genetic switching, 421–422, 421f
Recombinant(s), 106 in cell cycle, 41–42, 42f–43f, 46–48, 46f, lac, 405–409, 406f–409f
bacterial, 175f, 176, 184–188 47f, 281–282, 282f. See also Cell cycle; lambda (λ), 412f, 417–423
detection of, 106–108 Meiosis; Mitosis in prokaryotes, 400–401, 401f, 407–409,
double, 134, 135f, 140, 141f at chromosome ends, 283–287, 284f 407t, 413–414, 414f
in gene mapping, 135–143, 138f conservative, 271 Resistant mutants, bacterial, 177
Recombinant DNA, 354. See also DNA definition of, 9 Restriction enzymes, 355
technology direction of, 274–275, 275f, 281, 281f in cloning, 355, 356f
production of, 355–356. See also Cloning in DNA amplification, 40, 353–354, 353f sticky ends of, 355–356, 356f
Recombinant frequency (RF), 107, 107f, DNA ligase in, 275–276 Restriction fragment, 355
134–144 DNA polymerase in, 273–276, 275f Restriction fragment length
definition of, 136 DnaA in, 280 polymorphisms, 145
in gene mapping, 129–155. See also Maps/ errors in, mutations and, 588–590 in linkage analysis, 153
mapping in eukaryotes, 280–283 Restriction sites, 355
inversions and, 644 hairpin loop in, 301, 301f Restrictive temperature, 223
with multiple crossovers, 151–153 helicases in, 279 Retrotransposition, 788
Recombination, 104–108, 106f–108f helix unwinding in, 279–280, 279f Retrotransposons, 548, 558–562, 560–561
bacterial, 175f, 176, 184–188, 185f–187f. initiation of, 280, 280f, 284f Alu sequences, 567f, 568–569, 568f
See also Bacterial conjugation; lagging strand in, 275, 275f long interspersed elements (LINEs), 567f,
Bacterial transformation; leading strand in, 274 568–569, 568f
Transduction Meselson-Stahl experiment and, 271, 272f long terminal repeats (LTRs), 558,
crossing over and, 132–135, 134f, 135f, molecular aspects of, 46–48, 46f, 47f 569–571
136f. See also Crossing over Okazaki fragments in, 275, 275f, 284f short interspersed elements (SINEs),
in diploid organisms, 106–108, 106f–108f origins in, 280 567f, 568–569, 568f
double-stranded DNA breaks in, 156–157, in eukaryotes, 280–283, 281f, 283f Retroviruses, 548, 558–560, 559f
156f in prokaryotes, 280, 280f in gene therapy, 570–571
in haploid organisms, 106 overview of, 274–277 Reverse genetics, 34, 509, 539–541
heteroduplex DNA in, 156–157, 156f pol III holoenzyme in, 277, 278f by phenocopying, 541–542
homologous, 382, 607–608 primase in, 275, 278 by random mutagenesis, 540
nonallelic, 636 primer in, 275, 275f by targeted mutagenesis, 585f
inversions and, 644–645 primosome in, 275 Reverse transcriptase, 358, 558
meiotic, 105–108, 106f–108f in prokaryotes, 270–280 Reversions, genetic code and, 326–327
crossovers/noncrossovers in. See proofreading in, 276–277, 277f Revertants, 236–237
Crossovers replication fork in, 266f, 272–273, Rhoades, Marcus, 551, 551f
double-strand breaks in, 608–609, 609f 274–275 Rho-dependent mechanism, in
initiation of, 608 replisome in, 277–280 transcription termination, 300f, 301
molecular mechanism in, 155–157 semiconservative, 270–274, 270–274 Ribonucleic acid. See RNA
nonallelic homologous, segmental single-strand-binding (SSB) proteins in, Ribonucleoproteins, small nuclear, 308
duplications in, 641 279 Ribonucleotides, 294, 295f
phage, 176 speed of, 277, 280 Ribose, 294, 295f
in mapping, 194–196 strand melting in, 269 Ribosomal RNA (rRNA), 295, 321, 333. See
variation from, 689–690 supercoiling in, 279–280, 279f also RNA
Recombination hotspots, 750f, 751 tautomerization in, 276, 276f functions of, 333
864 Index
Short interspersed elements (SINEs), 567f, Small interfering RNA (siRNA), 296, 314 Stop codons, 329, 329f, 338–339, 338f
568, 568f Small nuclear ribonucleoproteins (snRNPs), Strand invasion, 608
Shotgun approach, 514–517, 518f 308 Streptococcus pneumoniae, transformation
Shotgun libraries, 514 Small nuclear RNA (snRNA), 296, 308 in, 261–262, 261f
Sickle-cell anemia, 219–220, 220f Small RNAs, 310–315 Stripe formation, 489–491, 490f
malaria and, 767–769 Snakes, pigmentation in, 231–232, 232f Structure-based drug design, 335
Side chains, amino acid, 322 Snb1 mutation, 88 Sturtevant, Alfred, 136
post-translational modification of, Snips. See Single-nucleotide polymorphisms Su(var) genes, 457–458, 458f
341–342 (SNPs) SUB1, in flood-tolerant rice, 21–23
Sigma factor (s), 299 snRNA (small nuclear RNA), 296, 308 Subfunctionalization, 788
in sporulation, 423–424, 424f snRNPs (small nuclear ribonucleoproteins), Sublethal alleles, 223
Sign epistasis, 775 308 Substitution mutations
Signal sequence, 343 Solo long terminal repeat (LTR), 560–561 conservative, 584
Signaling Somatic cells, 4 neutral theory and, 771–774
in cancer, 502–503, 503t Sonic hedgehog gene, 500–501, 500f nonconservative, 584
in development, 485, 486f, 486t in holoprosencephaly, 502 synonymous, 773
gene interaction in, 226 in polydactyly, 501–502 Subunits, 324
in gene regulation, 417–423 SOS system, 604–605, 605f Sulston, John, 496, 497
in protein targeting, 343, 343f Southern blotting, 371f, 372 Sum of squares, 720
Simple inheritance, 717 Specialized transduction, 198–200, 199, Sum rule, 94
Simple sequence length polymorphisms 199f–201f Supercoiled DNA, 279–280, 279f
(SSLPs), 145–146 Spemann organizer, 471f Supercontigs, 518, 518f
detection of, 146 Splice variants, 524 Superoxide radicals, mutations and, 590
mapping of, 146 Spliceosomes, 292, 305f, 308, 308f Superrepressor mutations, 407–408, 407t
Simple transposons, 555, 555f Splicing, 292, 305–306, 305f–310f, 306f, Suppressor/mutator elements, 552
SINES (short interspersed elements), 567f, 308f, 309f, 310f Suppressors, 236–238, 238f
568, 568f alternative, 306–307, 306f genetic code and, 326–327
Single nucleotide polymorphisms (SNPs), protein isoforms and, 339–340, 340f nonsense, 238, 338–339
16, 145, 667–668, 667f exon joining in, 308, 308f in translation, 338–339
in association mapping, 749–754 intron removal in, 305f, 306–309, 306f of variegation genes, 457–458, 458f
common, 667 self, 308–309, 310f Sutton, Walter, 5
detection of, 145 in sex determination, 494–496, 495f Synapsis, 41
in genetic disorders, 752–753, 753f of transposable elements, 568 Synaptonemal complex, 41
mutation rate in, 687–688, 688t Spm elements, 552 Synergistic transcription factors, 445,
rare, 667 Spo11, in meiotic recombination, 608, 609f 451–452, 451f, 452f
Single-gene disorders, 34, 508–509 Spontaneous abortion, chromosomal Synergyistic effect, 451
Single-gene inheritance, 31–71 rearrangements and, 651 Synonymous codons, 523
chromosomal basis of, 39–45 Spontaneous lesions, 589 Synonymous mutations, 584
in diploids, 40–44, 40f–43f Spontaneous mutations, 586–592. See also Synonymous substitutions, 773
equal segregation in, 34–39, 38, 39f, 43 Mutation(s) Synteny, 531, 531f
in meiocyte, 44–45, 45f molecular basis of, 586–592 Synthesis-dependent strand annealing,
gametic ratio in, 38–39 Spore pairs, identical vs. nonidentical, 607–608, 608f
gene discovery by, 53 155–156 Synthetic lethals, 239, 239f
genetic basis of, 41–45 Sporulation, 423–424, 424f Systems biology, 509
genotypic ratio in, 39 Spradling, Allan, 565
in haploids, 44–45 SRY gene, 68 Tabin, Cliff, 500
in humans, 58–70 Stahl, Franklin, 271 Tandem duplications, 640
Mendel’s studies of, 34–39, 35f, 36f Standard deviation, 720 Tandem repeats, 283–284
patterns of, 34–39 Star-cluster haplotype, 669–670, 671f Targeted mutagenesis, 541f
pedigree analysis of, 58–70. See also Statistical distributions. See Distribution Targeting, of transposable elements,
Pedigree analysis Steitz, Joan, 308 569–570
phenotypic ratio in, 39 Steitz, Thomas, 334 Target-site duplication, 557
in plant breeding, 88 Stem cells, embryonic, in gene targeting, TATA box, 304, 304f, 451–452
sex-linked, 54–58 388–391, 389f TATA-binding protein, 304, 304f, 440, 452
Single-nucleotide polymorphisms (SNPs), Sterility Tatum, Edward, 7, 8, 179, 180, 224–225, 324
145 in aneuploids, 621 Tautomeric shifts, 588–589
in mapping, 144–145, 147 half, 646 Tautomerization, 276, 276f
Single-strand-binding (SSB) proteins, 279 heterozygous inversions and, 646 Tay-Sachs disease, 69–70
siRNA (small interfering RNA), 296, 314 reciprocal translocations and, 646, 646f Telomerase, 284–286, 285f
Sister chromatids, 40, 41, 42f–43f, 47f in triploids, 627 Telomeres, 283–287
crossing over in, 135 Stewart, William, 320 in cancer, 287
formation of, 41, 42f–43f, 46, 46f, 47f Stickleback fish, evolution in, 784, 785f in Werner syndrome, 286
in synthesis-dependent strand annealing, Sticky ends Telophase
607–608, 608f of DNA cloning fragments, 359–360, in meiosis, 84–85
6-4 photoproduct, 594, 595f 361–362 in mitosis, 83
sloppy-paired gene, 492–494, 493f of restriction enzymes, 355–356, 356f Temperate phages, 196
866 Index
class 1, 558–562, 559f, 560, 561f Trifolium, leaf patterns in, 220, 221f Variation. See also Polymorphism(s)
class 2, 562–566, 562f–566f Trinucleotide repeat diseases, 591–592 allele frequency and, 673–677
in cloning, 565, 566f Triplet genetic code, 325–327, 325f analysis of, 667–672
cointegrates and, 557 Triploidy, 619, 620–621, 620t, 621f bottlenecks and, 695–696, 697
C-value paradox and, 567 in plant breeding, 627 conservative vs. nonconservative
discovery of, 549–553, 551f sterility and, 621, 627 substitutions and, 584
disease-causing, 569 Trisomy, 620t, 627, 629–631 continuous, 108–110, 109f, 110f
Ds, 549–552, 550f, 552f, 564, 565f Trisomy 13, 631 detection of, 666–672
Dt, 552 Trisomy 18, 631 founder effect and, 695–696
in E. coli, 553–557, 555f–558f Trisomy 21, 618, 630–631, 631f in G6PD gene, 684–687, 685f
En/In, 552 Robertsonian translocations and, 647, gene diversity and, 686–687
in eukaryotes, 558–565 647f gene regulation and, 782–783, 783f
excision of, 552, 552f Triticum aestivum, alloploidy in, 624–625 from genetic admixture, 688
families of, 552 Trivalents, 621 from genetic drift, 691–696
functions of, 549–553 tRNA (transfer RNA), 296, 320, 329–332. genetics vs. environment in, 727–731, 730t
genome size and, 567 See also RNA genotype frequency and, 673–677
genome surveillance and, 573–574 adapter hypothesis and, 329–330 heritability of, in evolution, 765–766
in grasses, 569, 570f aminoacyl, 330–331, 330f, 331f heterozygosity and, 687
in human genome, 567f, 568–569, 568f ribosomal binding of, 334 from linkage disequilibrium, 689–690,
in hybrid dysgenesis, 562, 563f, 564f in ternary complex, 337f 690f
inactivation of, 568–569 charged, 331 mating systems and, 677–684
insertion of, 549–552, 549f, 550f, 568–569, codon translation by, 330–332 measurement of, 684–687
569–571, 571f ribosomes and, 332–335, 333–334 microsatellites and, 667, 669f
insertion-sequence, 553–554, 555 structure of, 329–330, 330f, 331f from migration, 688
long interspersed elements, 567f, suppressor mutations and, 338–339, 340f modulation of, 687–704
568–569, 568f trp operon, 414–416, 415f–417f from mutation, 582, 687–689, 771–772,
in maize, 549–553, 550f–552f, 564 ts mutations, 223 771f. See also Mutation(s)
mobilization of. See Transposition Tumor-suppressor genes, 610, 611–612 natural selection and, 696–702
mosaicism and, 553f Tup1 complex, 447 nucleotide diversity and, 687, 687f
negative selection of, 568 Turner syndrome, 629, 629f, 634 polymorphism and. See Polymorphism(s)
nonautonomous, 552 Twin studies, 728–731 population size and, 694–696
P, 562–564, 562f–564f, 563f, 566f Two-hybrid test, 537–538, 538f position-effect variegation and, 648–649,
prevalence of, 567 Ty elements, 558–561, 561f 648f
in prokaryotes, 553–557, 555f–558f protein, sources of, 339–343
regulation of, 571–576, 572f–575f Ubiquitin, 342, 343f quantitative. See also Quantitative
replication of, 556–557, 556f Ubiquitination, 341–342, 343f genetics
repression of, 571–576, 572f–575f Ultrabithorax genes, 475, 491, 492f measurement of, 717–721
retrotransposons and, 558–562, 559f, Ultraconserved elements, 526, 527f racial, in blood groups, 785–786
560–561 Ultraviolet light, as mutagen, 594 from recombination, 689–690
safe havens for, 569–571, 571f SOS system and, 604–605, 605f recombination and, 104–108, 582
splicing of, 568 UMP (uridine 5′-monophosphate), single nucleotide polymorphisms (SNPs)
Spm, 552 295f and, 667–668, 667f, 677t
targeting of, 569–570 Uniparental inheritance, 111 sources of, 582
unstable phenotypes and, 549–550, 552f Univalents, 621 transmission of, 732–733. See also
as vectors, 565, 566f Unselected markers, 187 Heritability
in yeast, 558–562 Unstable phenotypes, 549–550 Variegated leaves, cytoplasmic segregation
Transposase, 554 transposable elements and, 549–550, and, 113–114, 113f
Transpose, 552 552f Variegation, position-effect, 648–649, 648f
Transposition, 555–557, 5556f Untranslated regions, 300, 300f Vectors, 353, 358–362, 360f–362f, 364f, 514
conservative (cut and paste), 556–557, Upstream activating sequences, 438, BAC, 365, 366f
556f 451–452, 452f delivery of, 365–366, 366f
copy and paste, 560 Upstream promoters, 298 DNA insertion in, 358–362, 360f–362f,
definition of, 191 Uracil (U), 294, 295f 364f
duplications and, 787–788 Uridine 5′-monophosphate (UMP), 295f DNA recovery from, 366
replicative, 556–557, 556f fosmid, 364–365, 365f, 366f
Transposon(s), 190–191, 192f, 554–557 Van Leeuwenhoek, Antony, 180 gene, 180
composite, 554, 555f Variable expressivity, 240 in genome sequencing, 514
definition of, 190–191 Variable number tandem repeats (VNTRs), P elements as, 565, 566f
DNA, 562–566, 562f–566f 145–146 phage, 363
in insertional mutagenesis, 203–204, 204f detection of, 146 plasmid, 363, 364f, 383–386, 384f, 385f
in mapping, 203–204, 204f Variance, 719. See also Variation selection of, 362–366
in prokaryotes, 554–555, 555f additive vs. dominance, 734–736 Ti, 384–386, 384f, 385f
simple, 555, 555f environmental, 725 yeast, 383, 384f
targeting of, 569–570 genetic, 725 Vertical transmission, 176
Transposon tagging, 375–376, 377f, 565 genetic vs. environmental factors in, Viral resistance, gene silencing and, 314–315
Transversions, 584 722–725, 726t Virulent phages, 196
868 Index
Bacterium Baker’s yeast Bread mold Mustard weed Roundworm Fruit fly Mouse
(E. coli) (S. cerevisiae ) (N. crassa) (A. thaliana) (C. elegans ) (D. melanogaster ) (M. musculus)
CHAPTERS
1. The Genetics Revolution evolutionary tree of evolutionary tree of Beadle and Tatum’s “one-gene— evolutionary tree of evolutionary tree evolutionary tree of life, p. 11 evolutionary tree of life,
life, p. 11 life, p. 11 one-enzyme” experiment, p. 7 life, p. 11 of life, p. 11 p. 11
mouse strain stocks, p. 12
2. Single-Gene genetic analysis using mycelium development flower development identifying a gene for wing
Inheritance ascus, p. 44 mutants, p. 32 mutants, p. 32 development, pp. 51–52
sex determination, pp. 54–58
3. Independent meiotic recombination, life cycle, p. 103 X-linked inheritance (eye color),
Assortment of pp. 106–108 observation of independent p. 104
Genes assortment, pp. 103
maternal inheritance (poky
mutants), pp. 112–113
cytoplasmic segregation,
pp. 113–115
( Continued )
[Photos from left to right: Biophoto Associates/Science Photo Library/Science Source; SciMAT/Science Source; Courtesy of Anthony Griffiths/Olivera Gavric; Darwin Dale/Science Source; Sinclair
Stammers/Science Source; Eye of Science/Science Source; R. L. Brinster, School of Veterinary Medicine, University of Pennsylvania.]
(E. coli) (S. cerevisiae) (N. crassa) (A. thaliana) (C. elegans) (D. melanogaster) (M. musculus)
6. Gene suppression, pp. 235–236 Beadle–Tatum, experiments, eye-color suppression, p. 237 example of
Interaction modifier mutations, pp. 224–225 haploinsufficient gene,
pp. 237–238 complementation, p. 217
pp. 229–231 coat coloration,
pp. 221–222
8. RNA: Transcription number of genes, p. 292 number of genes, p. 292 number of genes, p. 292
and Processing Volkin–Astrachan pulse- RNA polymerase II, p. 302 gene density, p. 302
chase experiment, frequency of introns, p. 306
pp. 293–294
stages of transcription,
pp. 298–301
promoter sequences,
pp. 298–299
sigma factors, p. 299
termination, p. 300
gene density, p. 302
9. Proteins and Crick codon length abundance of RNA number of kinase interactome,
Their Synthesis experiment, p. 325 transcripts, p. 321 genes, p. 342 p. 341
Nirenberg genetic code
elucidation, p. 328
10. Gene Isolation restriction enzymes from, genetic engineering genome size transgenesis in, p. 387
and Manipulation pp. 355, 359–362 with yeast vectors, and cloning, targeted gene
cloning with p. 383 p. 367 knockout, p. 388
bacteriophage transgenesis in,
vectors, p. 363 p. 386
cloning with fosmids,
pp. 364–365
12. Regulation of GAL system, pp. 436–440 position effect variegation, genomic imprinting,
Gene Expression SWI-SNF mutants, p. 444 pp. 456–459 pp. 460–461
in Eukaryotes histone modification, dosage compensation.
pp. 445–44 7 pp. 462–463
control of mating type,
pp. 441–442
gene silencing and mating-
type switching, pp. 454–455
(E. coli) (S. cerevisiae) (N. crassa) (A. thaliana) (C. elegans) (D. melanogaster) (M. musculus)
13. The Genetic Control as model homeotic mutants, p. 470 as model organism,
of Development organism, p. 473 as model organism, pp. 472–473 p. 470
cell lineage fates, homeotic genes, pp. 474–479 Hox gene clusters, p. 481
pp. 496–499 early development, pp. 474–479
development sex determination, pp. 494–496
timing, p. 499 multiple roles of hedgehog gene,
pp. 500–501
14. Genomes and comparative genomics of year sequenced, p. 509 year sequenced, year sequenced, p. 509 human–mouse
Genomics pathogenic and filing sequence gaps, p. 509 method of sequencing, p. 518 comparative genomics
nonpathogenic, p. 518 repeats in, p. 518 pp. 530–532
pp. 534– 536 two-hybrid test, codon bias, p. 523 use in identifying
systematic targeted conserved coding
pp. 537–538
mutagenesis, p. 540 elements, pp. 526–527
15. The Dynamic IS elements, pp. 553–554 Ty elements, pp. 558–561 copia-like elements, p. 560 mouse mammary
Genome: lack of DNA transposons, P elements, pp. 562–564 tumor virus, p. 558
Transposable p. 564 hybrid dysgenesis, pp. 562–563
Elements transposon targeting, P element for tagging/
p. 570 transgenesis, p. 565
transposon targeting, pp. 569–570
16. Mutation, Repair, Luria-Delbrück fluctuation number of double-strand meiotic recombination, p. 606
and Recombination test, pp. 586– 589 breaks, p. 608
methyltransferase, p. 597
mutational hotspots,
pp. 598–599
mismatch repair, pp. 602–604
SOS repair, p. 604
18. Population mutation rate, p. 688 nucleotide diversity, nucleotide diversity, p. 687 nucleotide diversity, nucleotide nucleotide diversity, p. 687 nucleotide diversity,
Genetics p. 687 p. 687 diversity, p. 687 mutation rate, p. 688 p. 687
mutation rate, p. 688 mutation rate, p. 688 mutation rate, mutation rate, p. 688
p. 688