Fall 2012
We gratefully acknowledge David Micklos of the DNA Learning Center at Cold Spring Harbor Laboratory for his
generous help. Some materials for this exercise were adapted, by permission, from the Genomic Biology:
Advanced Instructional Technology for High School and College Biology Faculty laboratory manual, Cold
Spring Harbor Laboratory, copyright 1999.
Introduction
We are humans. We are bipedal and stand upright. We have hands, feet, fingers, and toes. You can look at the
student next to you and easily recognize that person to be human too. What makes us look similar to each other
while different from frogs, fish, or fuchsias is the molecule deoxyribonucleic acid (DNA).
The basic building block of DNA is the nucleotide comprising a deoxyribose sugar, a phosphate, and one of the
four bases A (adenine), C (cytosine), G (guanine), or T (thymine). In the DNA molecule, nucleotides are linked
together in a chain. DNA is a double helix; two chains of nucleotides are wound around each other to form a spiral
structure. Interactions (hydrogen bonds) between the bases on the opposing strands hold the double helix together.
The A's on one strand hydrogen bond with the T's on the other strand. The G's on one strand interact with the C's
on the other. Therefore, A's and T's are said to be complementary, as are G's and C's. Complementary bases,
when hydrogen bonded in the double helix, are called base pairs (bp). It is the order of the bases along the strands
of the DNA molecule that makes each species unique.
Our bodies are caldrons for thousands of chemical reactions carried out to support the process of life. We ingest
food for energy and for the raw materials needed to build the structures of the cell. We breathe oxygen; it assists in
the moving of electrons from one molecule to another. We manufacture protein molecules called enzymes needed
for the building or breakdown of still other molecules. We all look like humans because we all share the same
cellular makeup.
The information for the construction of all the enzymes in the cell and all the proteins giving the cell its shape and
function is stored within DNA's sequence of bases. One particular base sequence may carry the information for the
assembly of hemoglobin, a protein that carries oxygen to your cells. Another sequence of bases may direct the
manufacture of an actin molecule, a protein found in muscle. The region of bases on DNA that holds the
information needed for the construction of a particular protein is called a gene. The average gene is approximately
10,000 base pairs long. There are approximately 23,000 genes in human DNA.
The human genome (the total sum of our genetic makeup) is made up of approximately 3 billion base pairs
distributed on 23 chromosomes. All cells in your body, except red blood cells, sperm, and eggs, contain 23
pairs of these chromosomes, 46 in all (sperm and egg cells contain only 23 chromosomes). Only 1 to 5% of this enormous amount of
DNA is used directly to code for the proteins required for supporting cellular metabolism, growth, and reproduction.
The protein-encoding regions are scattered throughout the genome. Genes may be separated by several thousand
bases. Furthermore, most genes in the human organism are themselves broken into smaller protein-encoding
segments called exons, which, in many cases, have hundreds or thousands of base pairs intervening between
them. These intervening regions are called introns, and they make up between 90 and 97% of the entire genome.
Because non-coding regions such as introns had no defined role, they were once referred to as "junk DNA".
Whatever their function may entail in the genome, closer examination of these intervening DNA regions has
revealed the presence of unique genetic elements that are found in a number of different locations. One of the first
such repeating elements identified was Alu.
Alu repeats are approximately 300 base pairs in length. They got their name from the fact that most carry within
them the base sequence AGCT which is the recognition site for the Alu I restriction endonuclease, a type of
enzyme that cuts DNA at a specific site. There are over 500,000 Alu repeats scattered throughout the human
genome. On average, one can be found every 4,000 base pairs along a human DNA molecule. How they arose is
still a matter of speculation but evidence suggests that the first one may have appeared in the genome of higher
primates about 60 million years ago. Approximately every 100 years since then, a new Alu repeat has inserted itself
in an additional location in the human genome. Alu repeats are inherited in a stable manner and they come intact in
the DNA your mother and father contributed to your genome at the time you were conceived. Some Alu repeats are
fixed in a population, meaning all humans have that particular Alu repeat. Others are said to be dimorphic;
different individuals may or may not carry a particular Alu sequence at a particular chromosomal location.
The Polymerase Chain Reaction
Objectives - student should be able to:
1. List and explain the importance of each component of PCR
2. Compare PCR to cellular DNA replication
3. Associate the temperature changes with the cycling steps of PCR
The polymerase chain reaction (PCR) is a method used by scientists to rapidly copy, in vitro, specific segments of
DNA. By mimicking some of the DNA replication strategies employed by living cells, PCR has the capacity for
churning out millions of copies of a particular DNA region. It has found use in forensic science, in the diagnosis of
genetic disease, and in the cloning of rare genes. One of the reasons PCR has become such a popular technique
is that it doesn't require much starting material. It can be used to amplify DNA recovered from a plucked hair, from
a small spot of blood, or from the back of a licked postage stamp.
There are some essential reaction components and conditions needed to amplify DNA by PCR. First and foremost,
it is necessary to have a sample of DNA containing the segment you wish to amplify. This DNA is called the
template because it provides the pattern of base sequence to be duplicated during the PCR process. Along with
template DNA, PCR requires two short single-stranded pieces of DNA called primers. These are usually about 20
bases in length and are complementary to opposite strands of the template at the ends of the target DNA segment
being amplified. Primers attach (anneal) to their complementary sites on the template and are used as initiation
sites for synthesis of new DNA strands. Deoxynucleoside triphosphates containing the bases A, C, G, and T
(dNTPs) are also added to the reaction. The enzyme DNA polymerase binds to one end of each annealed primer
and strings the deoxynucleotides together to form new DNA chains complementary to the template. The DNA
polymerase enzyme requires the metal ion magnesium (Mg++) for its activity. It is supplied to the reaction in the
form of MgCl2 salt. A buffer is used to maintain an optimal pH level for the DNA polymerase reaction.
PCR is accomplished by cycling a reaction through several temperature steps. In the first step, the two strands of
the template DNA molecule are separated, or denatured, by exposure to a high temperature (usually 94 to 96 °C).
Once in a single-stranded form, the bases of the template DNA are exposed and are free to interact with the
primers. In the second step of PCR, called annealing, the reaction is brought down to a temperature usually
between 37 °C and 65 °C. At this lower temperature, stable hydrogen bonds can form between the complementary
bases of the primers and template. Although human genomic DNA is billions of base pairs in length, the primers
require only seconds to locate and anneal to their complementary sites. In the third step of PCR, called extension,
the reaction temperature is raised to an intermediate level (65 °C to 72 °C). During this step, the DNA polymerase
starts adding nucleotides to the ends of the annealed primers. These three phases are repeated over and over
again, doubling the number of DNA molecules with each cycle. After 25 to 40 cycles, millions of copies of target
DNA are produced. The PCR process taken through four cycles is illustrated on the following page (Figure 1).
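The doubling described above can be sketched in a few lines of Python. This is an idealized model, not part of the original protocol: the function name `copies_after_cycles` and the `efficiency` parameter are assumptions made for illustration, and real reactions fall short of perfect doubling, especially in later cycles.

```python
def copies_after_cycles(starting_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized PCR growth.

    Each cycle multiplies the number of target molecules by (1 + efficiency).
    efficiency = 1.0 is perfect doubling, the assumption made in the text.
    """
    if starting_copies < 0 or cycles < 0:
        raise ValueError("starting_copies and cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return starting_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # One starting template, followed through the cycle counts mentioned in the text.
    for n in (1, 2, 3, 4, 25, 40):
        print(f"{n:>2} cycles: {copies_after_cycles(1, n):,.0f} copies")
```

With perfect doubling, a single template gives 16 copies after four cycles and more than 33 million after 25 cycles, which is why so little starting material is needed.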
In the following laboratory exercise, you will use PCR to amplify a dimorphic Alu repeat (designated Alu PV92). If
you have it, it will be found on your number 16 chromosome. You will use your own DNA as template for this
experiment. DNA is easily obtained from the human body. A simple saltwater mouth rinse will release cheek cells,
from which you will extract the DNA. After you amplify the Alu repeat region, you will determine whether or not you
carry this particular Alu sequence on one or both or none of your number 16 chromosomes. This will be
accomplished by separating the DNA in your PCR sample on an agarose gel via electrophoresis, a process that
separates DNA by size. Finally, using a program developed by the DNA Learning Center at Cold Spring Harbor
Laboratory, you will determine how rare this Alu sequence is in the human population and make some assessment
as to when and where it arose.
An excellent animated tutorial showing the steps of PCR is available at the DNA Learning Center website:
http://www.dnalc.org/ddnalc/resources/pcr.html
Note: You will need Macromedia Flash plug-in to view this online and to download the animation files to
your computer.
Laboratory Exercise
The protocol outlined below describes a procedure for isolating DNA from cheek cells. In the first step, you will rinse
your mouth with a salt solution. This step typically dislodges hundreds of cells from the cheek epithelium. An aliquot
of the mouthwash solution is centrifuged to collect the dislodged cells, which are then resuspended in a small
volume of saline. The resuspended cells are then added to a solution of Chelex to remove any metal ions (such
as magnesium) which might promote degradation of your genomic DNA. Magnesium (and other metal ions) can act
as cofactors for DNA-degrading nucleases present in saliva and the environment. The Chelex/cell sample is then
boiled to break open the cells. Since the sample is heated at a high temperature, the DNA, following this step, will
be in a single-stranded form. The sample is then centrifuged briefly to collect the Chelex and an aliquot of the
supernatant containing released DNA is used for PCR.
Objectives - student should be able to:
1. Successfully isolate DNA from cheek cells
2. Prepare a PCR reaction for amplification of an Alu insert
6. Observe your cell pellet at the bottom of the tube. If you do not
have one, you may need to start over with another 11.5 mL
saline rinse.
Pour off the supernatant into your cup, being careful NOT to
lose your cell pellet.
Note: There will be about 100 µL of saline remaining in the
tube after you pour.
7. Check to make sure you can see your cell pellet and that
there is about 100 µL of saline covering it. You may need to
add more saline to get up to about 100 µL.
Rack or flick the tube to mix, which will resuspend the cells and
make an evenly mixed solution.
Note: To rack your sample, be sure the top of
the tube is closed, hold the tube firmly at the top, and pull it
across a microfuge rack 2 to 3 times.
8. Obtain a tube of Chelex from your instructor. Label with your
PIN.
11. After heating, gently remove the cap lock and open the tube
to release the pressure. Caution: the tube will be hot! Close
and then rack or shake the tube well and place it in a
centrifuge to spin for 1 minute.
12. Obtain another clean microfuge tube and label it with your
PIN. Also write "DNA" on this tube.
15. Place your DNA tube in the class rack. Your teacher will
refrigerate your isolated DNA until you are ready to prepare
your PCR amplification.
3. Change your pipet tip and add 20 µL of Primer Mix into your
PCR tube.
[Figure labels: 20 µL of Primer Mix; 10 µL of DNA]
Note: Make sure that all the liquids are settled into the
bottom of the tube and not on the side of the tube or in the
cap. If not, you can give the tube a quick spin in the
centrifuge. Do not pipette up and down; it introduces error.
5. Setting up the controls:
a. Two students will be asked to set up the positive
control reactions (+C) for the class. They will use the
positive control DNA provided in the kit. There should
be enough +C PCR sample for one lane on each gel.
b. Another two students will set up negative control
reactions (-C) for the whole class. They will use sterile
water. There should be enough -C PCR sample for
one lane on each gel.
Control | Master Mix | Primer Mix | DNA | Total Volume
+C | 20 µL | 20 µL | 10 µL +C DNA | 50 µL
-C | 20 µL | 20 µL | 10 µL sterile H2O | 50 µL
Note: If the volume of your tube does not match, see your
instructor to troubleshoot. You may need to set up the
reaction again.
[Figure: PCR Tube and Reference Tube]
7. Place your reaction into the thermal cycler and record the
location of your tube on the grid provided by your teacher.
[Figure: sample thermal cycler grid, with numbered columns and lettered rows, for recording tube locations by PIN]
The gel material to be used for this experiment is called agarose, a gelatinous substance derived from a
polysaccharide in red algae. When agarose granules are placed in a buffer solution and heated to boiling
temperatures, they dissolve and the solution becomes clear. A comb is placed in the casting tray to provide a mold
for the gel. The agarose is allowed to cool slightly and is then poured into the casting tray. Within about 15 minutes,
the agarose solidifies into an opaque gel having the look and feel of coconut Jell-O. The gel, in its casting tray, is
placed in a buffer chamber connected to a power supply and running buffer is poured into the chamber until the gel
is completely submerged. The comb can then be withdrawn to form the wells into which your PCR sample will be
loaded.
Loading dye is a colored, viscous liquid containing dyes (making it easy to see) and sucrose, Ficoll, or glycerol
(making it dense). To a small volume of your total PCR reaction, you will add loading dye, mix and then pipet an
aliquot of the mixture into one of the wells of your agarose gel. When all wells have been loaded with sample, you
will switch on the power supply. The samples should be allowed to electrophorese until the dye front (either yellow
or blue, depending on the dye used) is 1 to 2 cm from the bottom of the gel. The gel can then be moved, stained
and photographed.
Calculations for Preparing 2% Agarose Gel
You will need a 2%, mass/volume agarose gel for electrophoresis of your PCR products. If your agarose gel
casting tray holds 50 mL, then how much agarose and buffer would you need? The definition of m/v % in biology
is grams (mass) / 100 mL (volume). Therefore, for 2% agarose, it will be 2 g /100 mL buffer.
Step 1: Calculate the mass of agarose needed for 50 mL total volume of agarose solution.
$\frac{2\ \text{g}}{100\ \text{mL}} = \frac{X\ \text{g}}{50\ \text{mL}}$
X = 1 gram
Step 2: Calculate the amount of buffer needed to bring the agarose solution to 50 mL. By standard definition, 1
gram of H2O = 1 mL of H2O. The amount of buffer for the 2% agarose solution will be 49 mL (50 mL - 1 mL (1 gram
of agarose)).
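The same two-step calculation can be written as a small Python helper for other gel percentages or tray sizes. This is a sketch only: the function name `agarose_recipe` is hypothetical, and it follows the handout's simplification that 1 gram of agarose displaces 1 mL of buffer.

```python
from typing import Tuple


def agarose_recipe(percent_mv: float, total_volume_ml: float) -> Tuple[float, float]:
    """Return (grams of agarose, mL of buffer) for a mass/volume percent gel.

    m/v % is grams per 100 mL, so grams = percent * volume / 100.
    Buffer volume uses the handout's simplification: 1 g is treated as 1 mL.
    """
    if percent_mv <= 0 or total_volume_ml <= 0:
        raise ValueError("percent and volume must be positive")
    agarose_g = percent_mv * total_volume_ml / 100.0
    buffer_ml = total_volume_ml - agarose_g
    return agarose_g, buffer_ml


if __name__ == "__main__":
    grams, buffer_ml = agarose_recipe(2.0, 50.0)
    print(f"Agarose: {grams} g, buffer: {buffer_ml} mL")
```

For the 2% gel in a 50 mL tray this reproduces the values worked out above: 1 gram of agarose and 49 mL of buffer.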
5. When all samples are loaded, attach the electrodes from the
gel box to the power supply. Have your teacher check your
connections and then electrophorese your samples at 150
Volts for 25 to 40 minutes.
Your teacher may stain your agarose gel and take a photograph for you so that you may analyze your Alu results.
Gel staining is done as follows:
1. Place the agarose gel in a staining tray.
2. Pour enough ethidium bromide (0.5 µg/mL) to cover the gel.
3. Wait 20 minutes.
4. Pour the ethidium bromide solution back into its storage bottle.
5. Pour enough water into the staining tray to cover the gel and wait 5 minutes.
6. Pour the water out of the staining tray into a hazardous waste container and place the stained gel on a
UV light box.
7. Place the camera over the gel and take a photograph.
8. Check with your district on how to dispose of hazardous waste liquid and solids.
CAUTION: Ethidium bromide is considered a carcinogen and neurotoxin. Always wear gloves and
appropriate PPE (personal protective equipment) like safety glasses when handling. Students should
NEVER handle EtBr.
CAUTION: Ultraviolet light can damage your eyes and skin. Always wear protective clothing and UV safety
glasses when using a UV light box.
Figure 4. After staining an agarose gel with ethidium bromide, DNA bands are visible upon exposure to UV light.
Results
By examining the photograph of your agarose gel, you will determine whether or not you carry the Alu repeat on
one, both, or neither of your number 16 chromosomes. PCR amplification of this Alu site will generate a 415 bp
fragment if the repeat is not present. If the repeat is present, a 715 bp fragment will be made. Figure 5 shows the
structure of an individual's two number 16 chromosomes in a case where one carries the Alu repeat and the other
does not.
Figure 5. The chromosomes you inherit from your parents may or may not carry the Alu repeat on Chromosome 16.
When you examine the photograph of your gel, it should be readily apparent that there are differences between
people at the level of their DNA. Even though you amplified only one site, a site that everyone has in their DNA,
you will notice that not all students have the same pattern of bands. Some students will have only one band, while
others will have two.
We use the term allele to describe different forms of a gene or genetic site. For those who have the Alu repeat
(they have at least one 715 bp band), we can say that they are positive for the insertion and denote that allele
configuration with a "+" sign. If the Alu repeat is absent (a 415 bp band is generated in the PCR), we assign a "-"
allele designation. If a student has a single band, whether it is a single 415 bp band or a single 715 bp band, then
both of their number 16 chromosomes must be the same with regard to the Alu insertion. They are said to be
homozygous and can be designated with the symbols "-/-" or "+/+", respectively. If a student's DNA generates a
415 bp band and a 715 bp band during PCR, the student is said to be heterozygous at this site and the
designation "+/-" is assigned. A person's particular combination of alleles is called their genotype. See the table
below for a quick summary of the allele designations.
Possible Bands | Allele Designation | Genotype | Alu Insert
1. One band at 415 bp | -/- | homozygous | No Alu insert
2. One band at 715 bp | +/+ | homozygous | Alu insert present on both chromosomes
3. One band at 415 bp and a second band at 715 bp | +/- | heterozygous | Alu insert on one of the chromosomes
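The band-to-genotype rules summarized above can be expressed as a short Python function. This is an illustrative sketch, not part of the lab protocol: the function name `call_genotype` and the `tolerance_bp` parameter are assumptions, added because band sizes read off a gel against a ladder are only approximate.

```python
ABSENT_BP = 415   # expected fragment when the Alu repeat is absent ("-" allele)
PRESENT_BP = 715  # expected fragment when the Alu repeat is present ("+" allele)


def call_genotype(band_sizes_bp, tolerance_bp=50):
    """Map estimated band sizes from one lane to a PV92 genotype.

    Returns "+/+", "+/-", "-/-", or None when no expected band is seen
    (for example, a failed reaction or the negative control).
    tolerance_bp is a hypothetical allowance for reading error on the gel.
    """
    has_minus = any(abs(size - ABSENT_BP) <= tolerance_bp for size in band_sizes_bp)
    has_plus = any(abs(size - PRESENT_BP) <= tolerance_bp for size in band_sizes_bp)
    if has_plus and has_minus:
        return "+/-"
    if has_plus:
        return "+/+"
    if has_minus:
        return "-/-"
    return None


if __name__ == "__main__":
    for lane in ([415], [715], [410, 720], []):
        print(lane, "->", call_genotype(lane))
```

An empty lane returns None rather than a genotype, which matches the expectation that the negative control shows no bands.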
Figure 6 below shows a representation of a possible experimental outcome on a gel, where all possible allele
combinations have been generated.
Figure 6. Agarose gel of homozygous and heterozygous individuals for the PV92 Alu insertion. A 100 base pair
ladder is loaded in the first lane and is used as a size marker, where these bands differ by 100 bp in length. The
500 bp band and the 1,000 bp band are purposely spiked to be more intense than are the other bands of the ladder
when stained with ethidium bromide. The next 5 lanes contain the results of homozygous and heterozygous
individuals. A negative control (-C) does not contain any template DNA and should therefore contain no bands. The
positive control (+C) is heterozygous for the Alu insertion; it contains both the 415 bp and 715 bp bands.
Genotype | Number of Students
+/+ | 20
+/- | 50
-/- | 30
Since each person in your class has two number 16 chromosomes (they are diploid for chromosome 16), there
must be twice as many total alleles as there are people:
$100\ \text{students} \times 2\ \text{alleles/student} = 200\ \text{total alleles}$
To calculate allele frequencies for the class, therefore, 200 will be used as the denominator value. To calculate the
"+" allele frequency, we must look at all those students who have a "+" in their genotype. There are 20 students
who are +/+; they are homozygous for the insertion. Since these 20 students have two copies of the Alu insert on
their chromosomes, they contribute 40 "+" alleles to the overall frequency:
$20\ \text{homozygous +/+ students} \times 2\ \text{"+" alleles/student} = 40\ \text{"+" alleles}$
There are 50 students heterozygous (+/-) for the Alu insertion. Each heterozygous individual
contributes one "+" allele to the overall frequency, for a total of 50 "+" alleles. Adding all "+" alleles together gives us:
$\text{"+" allele frequency} = \frac{40 + 50}{200} = \frac{90\ \text{"+" alleles}}{200\ \text{total alleles}} = 0.45$
The frequency for the PV92 "-" allele is calculated in a similar manner. There are 30 students homozygous for the
"-" allele. This group, then, contributes 60 "-" alleles to the frequency. There are 50 students heterozygous for the
Alu insertion. They contribute 50 "-" alleles to the frequency. Adding all "-" alleles together gives us:
$\text{"-" allele frequency} = \frac{60 + 50}{200} = \frac{110\ \text{"-" alleles}}{200\ \text{total alleles}} = 0.55$
Notice that the sum of the frequencies for the "+" and "-" alleles should always be 1.0.
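The counting procedure above translates directly into Python. The sketch below is for illustration; the function name `allele_frequencies` is hypothetical. It takes the number of students with each genotype and returns the "+" and "-" allele frequencies.

```python
def allele_frequencies(n_plus_plus: int, n_plus_minus: int, n_minus_minus: int):
    """Return (frequency of "+", frequency of "-") from genotype counts.

    Each student carries two alleles, so the denominator is twice the class size.
    Homozygotes contribute two copies of their allele; heterozygotes one of each.
    """
    n_students = n_plus_plus + n_plus_minus + n_minus_minus
    if n_students == 0:
        raise ValueError("at least one student is required")
    total_alleles = 2 * n_students
    plus_alleles = 2 * n_plus_plus + n_plus_minus
    minus_alleles = 2 * n_minus_minus + n_plus_minus
    return plus_alleles / total_alleles, minus_alleles / total_alleles


if __name__ == "__main__":
    # The fictitious class from the text: 20 +/+, 50 +/-, 30 -/-.
    f_plus, f_minus = allele_frequencies(20, 50, 30)
    print(f'"+" frequency: {f_plus:.2f}, "-" frequency: {f_minus:.2f}')
```

For the fictitious class this gives 0.45 and 0.55, and the two frequencies always sum to 1.0 because every allele is counted exactly once.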
Use the spaces below to calculate the "+" and "-" allele frequencies for your class.
Number of total alleles:
______ students × 2 alleles/student = ______ alleles
Genotype | Number of Students | Number of "+" Alleles | Number of "-" Alleles
+/+ | ______ | ______ | 0
+/- | ______ | ______ | ______
-/- | ______ | ______ | ______
Total: | ______ | ______ | ______
Allele Frequencies:
"+" allele frequency = total "+" alleles / total alleles = ______
"-" allele frequency = total "-" alleles / total alleles = ______
Genotype Frequencies
How does the distribution of Alu genotypes in your class compare with the distribution in other populations? For this
analysis, you need to calculate a genotype frequency, the percentage of individuals within a population having a
particular genotype. Remember that the term allele refers to one of several different forms of a particular genetic
site whereas the term genotype refers to the specific alleles that an organism carries. You can calculate the
frequency of each genotype in your class by counting how many students have a particular genotype and dividing
that number by the total number of students. For example, in a class of 100 students, let's say that there are 20
students who have the +/+ genotype. The genotype frequency for +/+, then, is 20/100 = 0.2. Given the ethnic
makeup of your class, might you expect something different? How can you estimate what the expected frequency
should be?
If within an infinitely large population no mutations are acquired, no genotypes are lost or gained, mating is random,
and all genotypes are equally viable, then that population is said to be in Hardy-Weinberg equilibrium. In such
populations, the allele frequencies will remain constant generation after generation. Genotype frequencies within
this population can then be calculated from allele frequencies by using the equation:
$p^2 + 2pq + q^2 = 1.0$
where p and q are the allele frequencies for two alternate forms of a genetic site. The genotype frequency of the
homozygous condition is either $p^2$ or $q^2$ (depending on which allele you assign to p and which to q). The
heterozygous genotype frequency is 2pq.
Let's use our fictitious class again (see page 16) to calculate expected genotype frequencies. We determined the
following allele frequencies (we will assign p to the "+" allele and q to the "-" allele):
p = 0.45 for "+" allele frequency
q = 0.55 for "-" allele frequency
We expect, therefore, that the genotype frequency for +/+ is equal to $p^2$, which is
$p^2 = (0.45)^2 = 0.2025$
The frequency for the +/- genotype is
$2pq = 2(0.45)(0.55) = 0.495$
The frequency for the -/- homozygous genotype is expected to be
$q^2 = (0.55)^2 = 0.3025$
To convert these decimal numbers into numbers of students, we multiply each by the total number of students.
Since there are 100 students in this fictitious class, the number of students in the class expected to have the +/+
genotype is
100 x 0.2025 = 20.25 students who should be +/+
The number of students who should be +/- is
100 x 0.495 = 49.5
The number of students who should be -/- is
100 x 0.3025 = 30.25
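The expected genotype frequencies and student counts worked out above can be reproduced with a short Python sketch. The function name `hardy_weinberg_expected` is hypothetical and chosen only for this illustration; p is the "+" allele frequency, as in the text.

```python
def hardy_weinberg_expected(p: float, n_students: int):
    """Expected genotype frequencies and counts under Hardy-Weinberg equilibrium.

    p is the "+" allele frequency; q = 1 - p is the "-" allele frequency.
    Returns two dictionaries keyed by genotype: frequencies and expected counts.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    q = 1.0 - p
    frequencies = {"+/+": p * p, "+/-": 2 * p * q, "-/-": q * q}
    counts = {genotype: freq * n_students for genotype, freq in frequencies.items()}
    return frequencies, counts


if __name__ == "__main__":
    # The fictitious class from the text: p = 0.45, 100 students.
    freqs, counts = hardy_weinberg_expected(0.45, 100)
    for genotype in ("+/+", "+/-", "-/-"):
        print(f"{genotype}: frequency {freqs[genotype]:.4f}, expected students {counts[genotype]:.2f}")
```

The three frequencies sum to 1.0 for any value of p, which is a quick check that the allele frequencies were entered correctly.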
On page 16, you calculated the allele frequencies found in your class. Use these frequencies to determine the
expected class genotype frequencies. (Let p represent the + allele and q the - allele.)
Expected +/+ genotype frequency:
$p^2$ = _________ = ______
Expected +/- genotype frequency:
2pq = _________________ = ______
Expected -/- genotype frequency:
$q^2$ = _________ = ______
Use the table below to calculate how many students in your class should have each genotype.
Genotype | Expected Genotype Frequency | Total Number of Students in Class | Expected Number of Students with Specific Genotype
+/+ | ______ | ______ | ______
+/- | ______ | ______ | ______
-/- | ______ | ______ | ______
Now, calculate the actual genotype frequencies for this class (hint: use data on page 16).
Actual +/+ genotype = ______________________________
Actual +/- genotype = ______________________________
Actual -/- genotype = ______________________________
Is your class in Hardy-Weinberg equilibrium?
Name________________________________________
Date _________________ Period_________________
Review Questions: Allele and Genotype Frequencies
1. A class is looking at a dimorphic Alu insert on chromosome number 16. How many total alleles are there in a class
of 34 students for this Alu site?
2. The "-" allele frequency for the class is 0.3. What is the "+" allele frequency?
3. A class in Hardy-Weinberg equilibrium has a +/+ genotype frequency of 0.64. What is the "+" allele frequency?
4. The +/+ genotype frequency for a class is 0.49 and the -/- genotype frequency is 0.09. What is the +/-
genotype frequency if the class is in Hardy-Weinberg equilibrium?
"+" Allele Frequency: ______
"-" Allele Frequency: ______
+/+ Genotype Frequency: ______
+/- Genotype Frequency: ______
-/- Genotype Frequency: ______
3. Record the "+" allele frequency for that
population on the world map provided. Close the
window by clicking the close box at the top of the
window and open the next population group.
Record that + allele frequency on your map.
Repeat this process for all the population groups
listed on your workspace.
When you are finished, clear the workspace by
checking each box to the left of each population
group and then pressing the CLEAR button.
Name________________________________________
Date _________________ Period_________________
Exercise: Global Map and Table for Analyzing Alu PV92 Allele Frequencies
On page 25 you will find a world map with a variety of different populations highlighted. Follow the directions on
pages 22-23 to find the + allele frequencies on the Allele Server and write them directly on the map next to the
number for each population.
In addition, you can use the table provided on page 26 to fill in the allele frequencies for each population. You can
then use this table to plot the values on the map. This table also has a column for number of samples tested. Why
do you think it is important to consider the number of samples in each data set?
Look at the + allele frequencies for the various world populations that you entered on your world map and/or table
and consider the following questions:
1. Which ethnic groups are most likely to have the Alu insertion?
2. Do you notice any pattern in the allele frequencies? Explain. You may use the map to diagram with arrows.
3. Where do you think the Alu PV92 insert originated? Formulate an explanation for where you believe the Alu
PV92 insert originated and how it spread throughout different world populations.
4. Make a prediction about where you think the future directions of Alu will be and why.
Name ________________________________________
Date __________________ Period_________________
World Map for Plotting Global Allele Frequencies
Name_________________________________________
Date __________________ Period_________________
Table for Recording Allele Frequencies
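ID # | Group Name | "+" frequency | "-" frequency | Sample Size
1 | African American | ______ | ______ | ______
2 | Alaska Native | ______ | ______ | ______
3 | Australia Aborigine | ______ | ______ | ______
4 | Breton (France) | ______ | ______ | ______
5 | Cajun | ______ | ______ | ______
6 | Chinese | ______ | ______ | ______
7 | Euro-American | ______ | ______ | ______
8 | Filipino | ______ | ______ | ______
9 | French | ______ | ______ | ______
10 | German | ______ | ______ | ______
11 | Greek, Cyprus | ______ | ______ | ______
12 | Hispanic American | ______ | ______ | ______
13 | Hungarian | ______ | ______ | ______
14 | India Christian | ______ | ______ | ______
15 | India Hindu | ______ | ______ | ______
16 | Indian Muslim | ______ | ______ | ______
17 | Italian | ______ | ______ | ______
18 | Java | ______ | ______ | ______
19 | !Kung ("Bushmen") | ______ | ______ | ______
20 | Malay | ______ | ______ | ______
21 | Maya (Central America) | ______ | ______ | ______
22 | Moluccas (Indonesia) | ______ | ______ | ______
23 | Mvskoke (Seminole) | ______ | ______ | ______
24 | Nguni (Southern Africa) | ______ | ______ | ______
25 | Nigerian | ______ | ______ | ______
26 | Pakistani | ______ | ______ | ______
27 | Papua New Guinea | ______ | ______ | ______
28 | Papua New Guinea, Costal | ______ | ______ | ______
29 | Pushtoon (Afgani) | ______ | ______ | ______
30 | Pygmy (Central African Republic) | ______ | ______ | ______
31 | Pygmy (Zaire) | ______ | ______ | ______
32 | Sardinian (Aritzo) | ______ | ______ | ______
33 | Sardinian (Marrubiu) | ______ | ______ | ______
34 | Sardinian (Ollolai) | ______ | ______ | ______
35 | Sardinian (San Teodoro) | ______ | ______ | ______
36 | Sotho (Southern Africa) | ______ | ______ | ______
37 | South India | ______ | ______ | ______
38 | Swiss | ______ | ______ | ______
39 | Syrian | ______ | ______ | ______
40 | Taiwanese | ______ | ______ | ______
41 | Turkish, Cyprus | ______ | ______ | ______
42 | United Arab Emirates | ______ | ______ | ______
43 | Yanomamo (Amazon) | ______ | ______ | ______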
Part 3: Using Allele Server to Test if Your Class is in Hardy-Weinberg Equilibrium
On page 16, you calculated the expected genotype frequencies for your class using the Hardy-Weinberg equation.
Are the expected genotype frequencies you calculated similar to the actual class frequencies? If they are different,
then it may mean that the population in your class is not in Hardy-Weinberg equilibrium. If we do observe
differences, how can we account for them? How do we even know when there is actually a significant difference
between the observed genotype frequencies and the expected genotype frequencies? You will use the Allele
Server program to address these questions.
Chi Square Analysis of Your Class Data
1. In the MANAGE GROUPS Classes window,
locate your class and place a check mark in the
box to its left. Click the OK button at the bottom
of the window. This will bring you back to the
ALLELE SERVER workspace window. Your
class data will have been placed in the
workspace.
By following the above steps, you have directed Allele Server to use a test called Chi-square, a statistical test used
for comparing observed frequencies with expected frequencies. The Allele Server analysis gives you a Chi-square
value and a p-value. The larger the chi-square value, the greater is the difference between the observed and the
expected values. When using the Chi-square analysis, we test the null hypothesis that there is no difference
between samples (observed and expected) and we assume that if there is any difference, then it arose simply by
chance and is not real. For this study, our null hypothesis is that your class is in Hardy-Weinberg equilibrium.
Whether or not we can reject the null hypothesis is indicated by the p-value. If the calculated p-value is less than 0.05,
the null hypothesis is rejected; the population is judged not to be in Hardy-Weinberg equilibrium. If the p-value is greater than
0.05, the population may be in Hardy-Weinberg equilibrium; we cannot show that it is not in Hardy-Weinberg
equilibrium.
As an example, let's say that Chi-square analysis of your data gives a p-value of 0.17. This means that, if your class
really were in Hardy-Weinberg equilibrium, a difference between the observed and the expected values at least this large
would arise by chance about 17% of the time. Because 0.17 is greater than 0.05, the difference is not considered
statistically significant and the null hypothesis is not rejected.
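For readers who want to see what a Chi-square test of this kind involves, here is a minimal Python sketch using only the standard library. It is not the Allele Server's own code: the function name `chi_square_hwe` is hypothetical, and the use of one degree of freedom (three genotype classes, minus one, minus one allele frequency estimated from the data) is an assumption that may differ from what Allele Server does internally.

```python
import math


def chi_square_hwe(n_plus_plus: int, n_plus_minus: int, n_minus_minus: int):
    """Chi-square test of observed genotype counts against Hardy-Weinberg expectations.

    Returns (chi_square, p_value). The p-value assumes 1 degree of freedom,
    for which the upper-tail probability equals erfc(sqrt(chi_square / 2)).
    """
    observed = (n_plus_plus, n_plus_minus, n_minus_minus)
    n = sum(observed)
    if n == 0:
        raise ValueError("at least one individual is required")
    p = (2 * n_plus_plus + n_plus_minus) / (2 * n)  # "+" allele frequency
    q = 1.0 - p                                     # "-" allele frequency
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    if any(e == 0 for e in expected):
        raise ValueError("test is undefined when only one allele is present")
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


if __name__ == "__main__":
    # The fictitious class from the text: 20 +/+, 50 +/-, 30 -/-.
    chi2, p_value = chi_square_hwe(20, 50, 30)
    print(f"Chi-square = {chi2:.4f}, p-value = {p_value:.3f}")
```

For the fictitious class the observed counts (20, 50, 30) are very close to the expected counts (20.25, 49.5, 30.25), so the Chi-square value is small and the p-value is well above 0.05.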
What is the Chi-square value for your class? ____________
What is the p-value for your class data? ___________
Name ________________________________________
Date __________________ Period________________
2. Based on the Chi-square and p-values, do you believe your class is in Hardy-Weinberg equilibrium? Why or why
not?
5. The CHI SQUARE window will display the Chi-square and p-value for these two population
groups.
The population group you are comparing your
class to is ______________________________.
What is the p-value for the Chi-square test? ______
Based on the p-value, are the genotype frequencies
of your class and the other population most
probably identical or significantly different?
________________________
6. When you are finished, close the CHI SQUARE window. Follow the above steps again to identify a human
population that your class data most resembles. Choose at least five populations to compare with your class.
Record your p-values below.
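Population | p-value
______________ | ______
______________ | ______
______________ | ______
______________ | ______
______________ | ______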
The population group your class most resembles is _______________. Chi-square analysis gives a p-value of __________ when these two populations are compared.
You may want to compare your class with another class in the database. Other classes can be found in the
Classes section of the MANAGE GROUPS window.
Name ________________________________________
Date __________________ Period_________________
2. From the population data, the Yanomamo tribe in the Amazon rainforest has the highest genotype frequency for
the PV92 insert. How would you explain this?