
Mathematics and Statistics for AP Biology

AP Biology Testing
You will be given six ‘grid in’ calculation problems on the national AP exam, as
well as on the class final. There may also be calculation and graphing components to
both the multiple choice and the free response sections. To answer those questions a
simple calculator is needed. Buy a cheap five-function calculator (+, −, ×, ÷, √). You will not be
allowed to use a graphing calculator on the test.
You will be provided with a formula sheet on the day of the test. Do not bring
any formulas or written notes with you on the testing day. Mr. Ballog will provide you
with the most recent official formula sheet for class tests as well as for completing this
set of exercises.

Practice Exercises
This set of exercises includes sample problems that are, for the purpose of
instruction, fairly simple. The problems on your exams will not involve any more
complex mathematics but the concepts they are associated with may be of a more
complex nature.
PART 1 – Measures of Central Tendency

Identifying what is happening in the middle of any data set often offers researchers a lot
of information. Many sets of biologically derived data fit a normal curve. A normal curve
shows the distribution of the range of data. If you were to measure everyone’s shoe size
in the class there would be a few students who wear very small shoes and a few who
wear very large shoes, with most students wearing shoes somewhere near the middle of
the range. The mean, median, and mode are the most widely used measures to describe
how collected data cluster in the middle of a normal distribution (central tendency).

Figure 1 Normal Distribution Curve (horizontal axis runs from small shoes to large shoes)

As a general rule, use:
Mean when the data closely fit a normal curve,
Median when data are skewed to one end of the distribution or the other, or when there
are extreme outliers in the data,
Mode is not often used in biological research but is valuable for identifying data
patterns that are bimodal.

Document1 5/17/2018 P a g e | 1 of 8
Mean (average)

$\bar{x} = \frac{\sum x_i}{n}$

Where;
$\bar{x}$ = average
N = total number of individuals in the entire population
n = total number of individuals in a sample
i = the number of measurements
$x_i$ = any given single measurement
$\Sigma$ = sum of
In words, this formula says: add up all of the data values and divide by the number of
data points. But you already knew that!
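As a quick check of the arithmetic, the mean formula can be sketched in Python. The shoe sizes below are made-up illustration values, not class data, and the name `shoe_sizes` is hypothetical. Remember that on the exam you will only have a simple calculator, so practice the calculation by hand as well.

```python
from statistics import mean

# Hypothetical shoe sizes for a sample of n = 7 students (illustration only)
shoe_sizes = [6, 7, 8, 8, 9, 10, 15]

# Mean by the formula: add up every measurement, then divide by n
n = len(shoe_sizes)
mean_by_formula = sum(shoe_sizes) / n

# The standard library function should give the same answer
mean_by_library = mean(shoe_sizes)

print(mean_by_formula, mean_by_library)
```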

Below is a table of data recorded during a behavioral study of fruit flies. The data were
collected over a 10-minute period by counting, every 30 seconds, the number of flies
found in each of two chambers. The left (treatment) chamber had a cotton ball saturated
with a substance selected by the students. Ask Mr. Ballog to show you the set-up of the
experiment.

Time         Number of Drosophila    Number of Drosophila
(minutes)    in right chamber        in left (treated) chamber
0            5                       5
0.5          9                       1
1.0          8                       2
1.5          8                       2
2.0          9                       1
2.5          10                      0
3.0          9                       1
3.5          7                       3
4.0          9                       1
4.5          9                       1
5.0          8                       2
5.5          7                       3
6.0          9                       1
6.5          7                       3
7.0          7                       3
7.5          7                       3
8.0          8                       2
8.5          9                       1
9.0          7                       3
9.5          8                       2
10.0         9                       1

1) What is the value for N for this experiment? _______


2) What is the value for i for this experiment? _______
3) Calculate the mean of the data for both chambers over the course of the experiment.
right chamber = _________ left chamber = _________

Median
The median is the data value that lies in the very middle of a set of data. Half of the data
will be below the median while the other half will lie above the median. Unlike the
mean, whose value may not even be represented in the data, the median is one of the
data values – well, usually. In a data set with an even number of data points the median
will be the average (mean) of the two central data points. The median is used when
there are a few extreme values in the data set that might give an erroneous view of the
central value of the data set. It has the advantage of showing what value the data set
‘revolves’ around.
To find the median you arrange the data points in ascending numerical order. The
middle data point in this arrangement is the median.
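The sorting procedure described above, including the even-numbered case, can be sketched in Python. The function name `median_by_hand` and the sample values are hypothetical and for illustration only.

```python
from statistics import median

def median_by_hand(values):
    """Sort the data, then take the middle value.

    With an even number of data points, average the two central values.
    """
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

# Hypothetical samples (illustration only)
odd_sample = [15, 8, 6, 9, 7, 10, 8]   # 7 data points
even_sample = [15, 6, 9, 7, 10, 8]     # 6 data points

print(median_by_hand(odd_sample))
print(median_by_hand(even_sample))
```

Note that the single large value (15) pulls the mean of `odd_sample` up to 9, while the median stays at the central value.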

4) What is the median of each data set collected in the Drosophila (fruit fly) experiment?

Median (right) = _______________ Median (left) = ________________

Mode
The mode is the data value that occurs most frequently in a set of data. At times it may
be useful to describe a data set as being bimodal, meaning that it has two peaks. This
can occur in populations under disruptive selection. Neither the mean nor the median
would show this tendency in a data set.
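A short Python sketch shows why the mean and median can hide a bimodal pattern. The measurements are invented for illustration; `multimode` is the standard library function (Python 3.8 or later) that returns every most-frequent value.

```python
from statistics import mean, median, multimode

# Hypothetical measurements with two common values (illustration only)
measurements = [8, 8, 8, 9, 10, 11, 12, 12, 12]

modes = multimode(measurements)    # every value tied for most frequent
center_mean = mean(measurements)
center_median = median(measurements)

print("modes:", modes)
print("mean:", center_mean, "median:", center_median)
```

Here the mean and median both sit between the two peaks, at a value that occurs only once in the data.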

Figure 2 Bimodal distribution pattern

5) What is the mode of each data set collected in the Drosophila experiment?

Mode (right) = _______________ Mode (left) = ________________

PART 2 – Measures of Variability

While the measures of central tendency show how the collected data clusters, measures
of variability describe how data spreads out. These measures give an idea of the shape
of the normal distribution and how much variation individual data points exhibit. Range,
standard deviation and variance are the most widely used measures of variability.

Range

The range in a data set simply shows how far apart the smallest and largest data points
are. These data values populate the two extreme tails of the full data set.
To determine the range, identify the smallest data value and subtract it from the
largest data value.
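The range calculation is a single subtraction, sketched here in Python with made-up values (the list `values` is hypothetical).

```python
# Hypothetical data set (illustration only)
values = [6, 7, 8, 8, 9, 10, 15]

# Range = largest data value minus smallest data value
smallest = min(values)
largest = max(values)
data_range = largest - smallest

print(data_range)
```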
Figure 3 Data range (marks the smallest value and the largest value)

6) What is the range of the two data sets collected on Drosophila behavior?

Range (right) = _______________ Range (left) = ________________

Standard Deviation and Variance

Variance ($s^2$ or $\sigma^2$) and standard deviation ($s$ or $\sigma$) are two closely
related measures of variability. In order to calculate the standard deviation of a data
set you must first calculate the variance of the same data set. Standard deviation
basically tells us how far data points deviate from the mean. You square the distance of
each data point from the mean, add up the squared distances, and divide by n-1 to get the
variance. The square root of the variance is the standard deviation. The formula sheet
provides the following algebraic definition:

$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$

Where;
s = the standard deviation
$s^2$ = variance
$\bar{x}$ = average
n = total number of individuals in a sample
n-1 = the degrees of freedom
i = the number of measurements
$x_i$ = any given single measurement
$\Sigma$ = sum of
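The calculation can be broken into steps, sketched below in Python. The sample values are invented for illustration, and the variable names are hypothetical; the last lines compare the step-by-step result with the standard library's sample `variance` and `stdev`, which also divide by n-1.

```python
from math import sqrt
from statistics import stdev, variance

# Hypothetical sample of n = 8 measurements (illustration only)
sample = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(sample)
x_bar = sum(sample) / n                                   # step 1: the mean
squared_deviations = [(x - x_bar) ** 2 for x in sample]   # step 2: (x_i - mean)^2
s_squared = sum(squared_deviations) / (n - 1)             # step 3: variance
s = sqrt(s_squared)                                       # step 4: standard deviation

print("variance:", round(s_squared, 3))
print("standard deviation:", round(s, 3))

# Cross-check against the standard library (sample statistics, n-1 denominator)
print(round(variance(sample), 3), round(stdev(sample), 3))
```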

7) Go ahead and calculate the variance and then the standard deviation of the Drosophila
data. By this time you should notice that the two data sets are in essence mirror images
of each other (if a fly is not in one chamber it is in the other), so the standard
deviation you determine for one side's data set will be the same as for the other side's.

Standard Deviation (s) = ________________

What does the standard deviation tell us about the distribution of the data?
In a normal curve the distribution of the data is determined by the standard deviation,
as shown in Figure 4. About 68% of the data is within 1 standard deviation of the mean,
95% is within 2 standard deviations of the mean, and 99.7% of the collected data will
fall within 3 standard deviations of the mean. As the standard deviation becomes smaller
the data cluster more closely around the mean, so the distribution curve is narrower and
more tightly centered about the mean (Figure 5).
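The fraction of a normal distribution that lies within 1, 2, or 3 standard deviations of the mean can be checked numerically. This is a minimal sketch using the standard library's `NormalDist` (Python 3.8 or later); the helper name `fraction_within` is hypothetical.

```python
from statistics import NormalDist

# A normal curve with mean 0 and standard deviation 1
standard_normal = NormalDist(mu=0, sigma=1)

def fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {fraction_within(k):.1%}")
```

The same fractions hold for any mean and standard deviation, because the distances are measured in units of standard deviations.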

Figure 4 Normal distribution w/ standard deviation
Figure 5 Normal distribution w/ changing standard deviation

To gain further practice and to make this a bit less repetitive we will introduce a
different data set. This is the data collected by researchers Peter and Rosemary Grant
on Daphne Major in the Galápagos Islands. The data show the change in beak depth of a
population of finches following a drought year (1977).

Figure 6 Daphne Major

Died in Drought              Survived Drought
Band    Beak Depth (mm)      Band    Beak Depth (mm)
283     11.2                 1019    11.21
278     10.6                 1919    11.2
294     10.5                 2244    11.01
609     10.5                 8191    10.86
674     10.5                 1659    10.78
422     10.3                 1861    10.7
428     10.2                 1599    10.7
561     10.2                 2249    10.68
605     10.2                 1426    10.61
461     9.8                  2206    10.5
611     9.8                  1850    10.4
343     9.7                  1418    10.38
420     9.7                  572     10.3
676     9.7                  3642    10.28
456     9.6                  673     10.1
458     9.6                  1477    10.1
293     9.5                  2887    10.1
512     9.4                  2210    10.08
347     9.3                  1592    10
564     9.3                  710     10
619     9.3                  618     9.9
311     9.2                  2378    9.86
506     9.2                  316     9.85
509     9.2                  309     9.8
288     9.1                  354     9.8
503     9.1                  1452    9.8
468     9                    2211    9.7
610     9                    678     9.7
452     8.9                  616     9.6
315     8.8                  1372    9.48
462     8.8                  2242    9.45
507     8.8                  1797    9.31
511     8.8                  1587    9.3
307     8.6                  8190    9.28
687     8.6                  1884    9.15
321     8.5                  1635    9.1
356     8.5                  943     9.1
298     8.4                  8136    8.9
345     8.4                  2226    8.9
522     8.4                  931     8.9
9       8.3                  1643    8.8
519     8.3                  623     8.8
413     8.2                  2940    8.78
276     8                    1528    8.55
342     8                    1248    8.5
346     7.9                  560     8.5
457     7.85                 1527    8.38
352     7.7                  2939    8.31
621     7.6                  685     8.2
12      7.5                  891     8

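8) Calculate statistics for the Galápagos Finch data.

                      Died in Drought      Survived Drought
Mean                  _______________      _______________
Median                _______________      _______________
Mode                  _______________      _______________
Standard Deviation    _______________      _______________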
9) What trend does the data show?
________________________________________________________________________

10) Propose an explanation for the data.


________________________________________________________________________
________________________________________________________________________

11) Using the standard deviation calculations, sketch the normal distribution curve for
each set of data. (Use the percentages of data that fall within 1, 2, and 3 standard
deviations of the mean, given in the Standard Deviation and Variance section, to help you
plot the curve.) Label the graph and axes.

PART 3 – Measures of Confidence

When you sample a population it is just that, a sample, and may not give accurate
information concerning the entire population (here a population also refers to a set of
any recorded data).
The measurements taken of the finches represent only a sample of the entire population
of finches. Statistics provides a way to communicate how much error may be present in
collected data due to sampling. The more closely the sample size approaches the size of
the entire population, the smaller the sampling error, until the point where the entire
population is sampled and no sampling error is present. Two measures of confidence will
be presented here: the Standard Error of the Mean and the 95% Confidence Interval.

Standard Error of the Mean


The standard error of the mean uses the standard deviation of the sample and the sample
size to estimate how closely the sample mean approximates the mean that would be obtained
if the entire population were measured. The formula for the standard error of the mean
is:

$SE = \frac{s}{\sqrt{n}}$

Where;
s = the standard deviation
n = total number of individuals in a sample
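The standard error calculation can be sketched in Python as the sample standard deviation divided by the square root of the sample size. The function name `standard_error` and the sample values are hypothetical and for illustration only.

```python
from math import sqrt
from statistics import stdev

def standard_error(sample):
    """Standard error of the mean: s divided by the square root of n."""
    s = stdev(sample)   # sample standard deviation (n-1 in the denominator)
    n = len(sample)
    return s / sqrt(n)

# Hypothetical sample of n = 8 measurements (illustration only)
sample = [2, 4, 4, 4, 5, 5, 7, 9]

print(round(standard_error(sample), 3))
```

Because n is in the denominator, a larger sample with the same standard deviation gives a smaller standard error.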

The standard error of the mean tells you that 68.3% of sample means fall within ±1
standard error of the mean of the entire population. This idea can be extended to a 95%
level of confidence using the 95% Confidence Interval.

Figure 7 t table at the .05 (95%) level

95% Confidence Interval


Most research aims to have samples model the entire population with 95% confidence. The
95% confidence interval is the range around the sample mean, roughly 2 standard errors
on either side, that is expected to contain the mean of the entire population 95% of the
time. To adjust for sample size, a table of t values based on the degrees of freedom is
referenced. The following equation is used for determining the 95% confidence limit and
the placement of error bars on data graphs.

$95\%\ CL = SE \times t_{P(n-1)}$

Where;
SE = Standard Error of the Mean
$t_{P(n-1)}$ = value from the t table at the .05 level for n-1 degrees of freedom.

For large sample sizes (n ≥ 30) the t value approximates 2, so the following equation
may be used without first calculating the standard error.

$95\%\ CL = \frac{2.0\,s}{\sqrt{n}}$

The sample mean plus or minus the 95% CL gives the range that you can be 95% confident
contains the mean of the entire population.
(Note: The AP exam will only address the simpler Standard Error of the Mean calculation)
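The 95% CL calculation can be sketched in Python as follows. The function name `confidence_limit_95`, its `t_value` parameter, and the sample values are hypothetical. The default of 2.0 is the large-sample approximation from the text; for a small sample the t value must be looked up in a t table, and 2.365 is the standard two-tailed value at the .05 level for 7 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

def confidence_limit_95(sample, t_value=2.0):
    """95% CL = SE x t.

    t_value defaults to 2.0, which is only appropriate for large samples
    (n >= 30). For smaller samples pass the t-table value for n-1 degrees
    of freedom at the .05 level.
    """
    se = stdev(sample) / sqrt(len(sample))
    return se * t_value

# Hypothetical small sample: n = 8, so n-1 = 7 degrees of freedom
small_sample = [2, 4, 4, 4, 5, 5, 7, 9]
t_for_7_df = 2.365   # from a standard t table, .05 level, 7 degrees of freedom

cl = confidence_limit_95(small_sample, t_for_7_df)
lower = mean(small_sample) - cl
upper = mean(small_sample) + cl

print(f"mean = {mean(small_sample)}, 95% CL = {cl:.2f}")
print(f"95% confidence interval: {lower:.2f} to {upper:.2f}")
```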
The Standard Error of the Mean and the 95% Confidence Interval are used to provide error
bars for graphs showing the mean values of data sets. Standard error bars show the range
from 1 standard error below the mean value to 1 standard error above it.
The graph below shows data means graphed with error bars for a calculated SE = 5. Bars
are drawn 5 units above and below the sample means. The AP exam may have you include
error bars in graphs.
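The placement of error bars is simple arithmetic, sketched here in Python. The group names and means are invented for illustration, and `error_bar_limits` is a hypothetical helper; the SE of 5 matches the example in the text.

```python
def error_bar_limits(sample_mean, half_width):
    """Return the (lower, upper) ends of an error bar around a mean.

    half_width is the SE for standard error bars, or the 95% CL for
    95% confidence bars.
    """
    return (sample_mean - half_width, sample_mean + half_width)

# Hypothetical group means (illustration only), with SE = 5 as in the text
group_means = {"Group A": 40, "Group B": 55}
se = 5

for name, group_mean in group_means.items():
    lower, upper = error_bar_limits(group_mean, se)
    print(f"{name}: bar from {lower} to {upper}")
```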

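12) Calculate the standard error of the mean and the 95% confidence interval for the Finch data.

           Died in Drought      Survived Drought
s          _______________      _______________
SEM        _______________      _______________
95% CL     _______________      _______________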
13) Graph the data as a bar chart of the means showing error bars for both SEM and 95% CL.

