
Maurie Moline

Term Project
Statistics 1040
Fall 2014

Math 1040 Skittles Term Project


This project began with each student in Math 1040 purchasing a 2.17-ounce bag
of Skittles. Each student used their own bag as a sample and recorded the number
of candies of each color. The instructor then compiled the data from every
student, along with the total number of candies in each bag, the total number of
bags, and the total number of candies. An Excel file containing this data was
posted on the class website. This data set was then used throughout the project.

Categorical Data: Colors


Number and Color of Skittles in Individual Sample

Red   Orange   Yellow   Green   Purple   Total   Number of Bags
10    11       8        16      11       56      1

Number and Color of Skittles in Class Sample

Red   Orange   Yellow   Green   Purple   Total   Number of Bags
500   446      474      503     512      2435    38

Both the pie chart and the Pareto chart represent what I anticipated to find in
this project. The overall data collected by the whole class showed similar ratios
of each color of candy. In my individual sample, however, I was surprised to find
that while the numbers of red, orange, and purple candies were all similar, I had
twice as many green candies as yellow.
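
The ratios described above can be checked directly from the two tables. The following is a minimal sketch, with Python as an illustrative choice (the report does not say which software produced the charts); the function name `relative_frequencies` is hypothetical.

```python
# Color counts copied from the individual and class sample tables.
individual = {"Red": 10, "Orange": 11, "Yellow": 8, "Green": 16, "Purple": 11}
class_sample = {"Red": 500, "Orange": 446, "Yellow": 474, "Green": 503, "Purple": 512}


def relative_frequencies(counts):
    """Return each color's share of the total count (hypothetical helper)."""
    total = sum(counts.values())
    return {color: count / total for color, count in counts.items()}


ind_freq = relative_frequencies(individual)
class_freq = relative_frequencies(class_sample)

# Print the two sets of proportions side by side for comparison.
for color in individual:
    print(f"{color:<7} individual {ind_freq[color]:.3f}   class {class_freq[color]:.3f}")
```

In the class sample every color's share falls within about three percentage points of the others, while in the individual bag the green count (16) is exactly twice the yellow count (8).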

Quantitative Data: the Number of Candies per Bag

Summary statistics:

Column                 Mean   Std. dev.   Min   Max   Q1   Q3   Median
Total Candies in Bag   64.1   13.12       45    114   59   62   61


The number of candies per bag is not normally distributed. The mean (64.1) is
larger than the median (61) and the maximum (114) lies far above the third
quartile (62), so the distribution is skewed to the right. I was surprised to see
several rather extreme outliers. My own bag, with 56 candies, is consistent with
the class data: it falls within the main body of the distribution rather than
among the outliers.
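
One common way to make "outlier" concrete is the 1.5 x IQR rule. The sketch below applies it to the summary statistics above; the choice of this rule is an assumption for illustration, since the report does not state how outliers were identified, and the helper name `is_outlier` is hypothetical.

```python
# Summary statistics quoted in the table above.
q1, q3 = 59, 62
minimum, maximum = 45, 114
my_bag = 56  # total from the individual sample table

# 1.5 x IQR rule (an assumed convention, not stated in the report).
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr


def is_outlier(value):
    """Return True if the value lies outside the 1.5 x IQR fences."""
    return value < lower_fence or value > upper_fence


for label, value in [("minimum", minimum), ("maximum", maximum), ("my bag", my_bag)]:
    print(f"{label}: {value} outlier={is_outlier(value)}")
```

Under this rule the fences are 54.5 and 66.5, so both the smallest and largest bags count as outliers while the individual bag of 56 does not.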

Reflection
Both quantitative and qualitative data are used in statistical analysis.
Quantitative data consists of numbers representing counts or measurements,
while qualitative data consists of names and labels.
Different types of graphs are used to represent these different types of data.
Graphs such as dot plots, stem-and-leaf plots, histograms, line graphs, and box
plots portray distributions of quantitative variables.
Qualitative data is best expressed using graphs such as pie charts, bar graphs,
and Pareto charts, which show the frequencies and relative frequencies of the
various response categories.
Quantitative data can be summarized with frequency distributions and percent
distributions, as well as with the mean, median, and mode. The data can also be
analyzed to find the variance or the correlation between two variables. These
calculations work well for quantitative data but do not make sense for
qualitative data, which is best expressed in ratios and proportions.

Confidence Interval Estimates


After data is collected from a random sample and a statistic is computed, a
confidence interval extends that information to the population: it gives a range
of values that, at a stated level of confidence, is likely to contain the
population parameter of interest.
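Construct a 99% confidence interval estimate for the true proportion of yellow candies.

$n = 2435$, $x = 474$, $\hat{p} = x/n = 0.195$, $\hat{q} = 1 - \hat{p} = 0.805$

$\alpha = 0.01$, $z_{\alpha/2} = 2.575$

$E = z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n} = 2.575\sqrt{(0.195)(0.805)/2435} = 0.021$

$\hat{p} - E < p < \hat{p} + E$

$0.195 - 0.021 < p < 0.195 + 0.021$

$0.174 < p < 0.216$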
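
The same interval can be reproduced in a few lines. This is a minimal sketch using only the Python standard library; the function name `proportion_ci` is hypothetical, and the critical value comes from `statistics.NormalDist` rather than a printed table.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(x, n, confidence):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = x / n
    # Two-sided critical value z_{alpha/2} from the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Yellow candies in the class sample: 474 out of 2435.
low, high = proportion_ci(474, 2435, 0.99)
print(f"{low:.3f} < p < {high:.3f}")
```

Because this version does not round the sample proportion or the margin of error at intermediate steps, its endpoints can differ from the hand calculation in the third decimal place.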

Construct a 95% confidence interval estimate for the true mean number of candies per
bag.

$\bar{x} = 64.1$, $s = 13.2$, $n = 38$, $\alpha = 0.05$, $t_{\alpha/2} = 2.026$

$E = t_{\alpha/2}\, s/\sqrt{n} = 2.026(13.2/\sqrt{38}) = 4.338$

$\bar{x} - E < \mu < \bar{x} + E$

$64.1 - 4.338 < \mu < 64.1 + 4.338$

$59.76 < \mu < 68.44$
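
The arithmetic for this interval can be sketched as follows. The Python standard library has no Student t distribution, so the critical value 2.026 is passed in as quoted above rather than computed; the function name `mean_ci` is hypothetical.

```python
from math import sqrt


def mean_ci(x_bar, s, n, t_crit):
    """Confidence interval for a mean, given a t critical value from a table."""
    margin = t_crit * s / sqrt(n)
    return x_bar - margin, x_bar + margin


# Values quoted in the text: x-bar = 64.1, s = 13.2, n = 38, t = 2.026 (df = 37).
low, high = mean_ci(64.1, 13.2, 38, 2.026)
print(f"{low:.2f} < mu < {high:.2f}")
```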

Construct a 98% confidence interval estimate for the standard deviation of the number of
candies per bag.

$n = 38$, $s = 13.2$, $\alpha = 0.02$

$\chi^2_R = 49.588$ (df = 37, area = 0.01)

$\chi^2_L = 14.257$ (df = 37, area = 0.99)

$\sqrt{(37)(13.2^2)/49.588} < \sigma < \sqrt{(37)(13.2^2)/14.257}$

$11.40 < \sigma < 21.26$
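
The calculation above can be sketched in Python. The two chi-square critical values are taken exactly as quoted in the text and are not recomputed here (the standard library has no chi-square distribution), so they are worth re-checking against a chi-square table for 37 degrees of freedom. The function name `std_dev_ci` is hypothetical.

```python
from math import sqrt


def std_dev_ci(s, n, chi2_right, chi2_left):
    """Confidence interval for a standard deviation from chi-square critical values."""
    df = n - 1
    low = sqrt(df * s**2 / chi2_right)   # larger critical value gives the lower bound
    high = sqrt(df * s**2 / chi2_left)   # smaller critical value gives the upper bound
    return low, high


# s = 13.2 and n = 38 from the text; critical values as quoted, not verified.
low, high = std_dev_ci(13.2, 38, chi2_right=49.588, chi2_left=14.257)
print(f"{low:.2f} < sigma < {high:.2f}")
```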

The first confidence interval estimate tells us, with 99% confidence, that the
proportion of yellow candies in the population of Skittles is between 17.4% and
21.6%.
The second confidence interval estimate tells us, with 95% confidence, that the
mean number of candies in a 2.17-ounce bag of Skittles is between 59.76 and
68.44, or roughly 60 to 68 candies.
The third confidence interval estimate tells us, with 98% confidence, that the
standard deviation of the number of candies per bag of Skittles is between 11.40
and 21.26.


Hypothesis Tests
Use a 0.05 significance level to test the claim that 20% of all Skittles candies are red.

$H_0: p = 0.20$

$H_1: p \neq 0.20$

$n = 2435$, $x = 500$, $\hat{p} = 0.2053$, $\alpha = 0.05$

$z = 0.6586$ (found using technology)

P-value $= 0.51$

Because the P-value of 0.51 is greater than the significance level of 0.05, we
fail to reject the null hypothesis. There is not sufficient evidence to support
the alternative hypothesis that the proportion of red candies differs from 20%.
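
The test statistic and P-value reported above can be reproduced with the standard library. This is a minimal sketch of a two-sided one-proportion z-test; the function name `one_proportion_z_test` is hypothetical, and the report does not say which technology was originally used.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(x, n, p0):
    """Two-sided z-test of H0: p = p0 using the normal approximation."""
    p_hat = x / n
    # The standard error uses the hypothesized proportion p0, not p_hat.
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Red candies in the class sample: 500 out of 2435, claimed proportion 0.20.
z, p_value = one_proportion_z_test(500, 2435, 0.20)
print(f"z = {z:.4f}, P-value = {p_value:.2f}")
```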

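Use a 0.01 significance level to test the claim that the mean number of candies in a bag of
Skittles is 55.

$H_0: \mu = 55$

$H_1: \mu \neq 55$

$n = 38$, $\bar{x} = 64.1$, $s = 13.2$, $\alpha = 0.01$

Test statistic: $t = 4.25$

Critical value: $t_{\alpha/2} = 2.715$ (df = 37, $\alpha = 0.01$)

This test results in the rejection of the null hypothesis. The test statistic of
4.25 exceeds the critical value of 2.715, so there is sufficient evidence to
support the alternative hypothesis that the mean number of candies per bag is
not 55.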
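
The test statistic for this hypothesis test can be checked with a short sketch. The critical value 2.715 is taken as quoted in the text, since the Python standard library does not provide a t distribution; the function name `one_sample_t` is hypothetical.

```python
from math import sqrt


def one_sample_t(x_bar, mu0, s, n):
    """Test statistic for a one-sample t-test of H0: mu = mu0."""
    return (x_bar - mu0) / (s / sqrt(n))


# Values quoted in the text: x-bar = 64.1, claimed mean 55, s = 13.2, n = 38.
t_stat = one_sample_t(64.1, 55, 13.2, 38)
t_crit = 2.715  # two-sided critical value for df = 37, alpha = 0.01, as quoted

# Reject H0 when the statistic falls beyond the critical value in either tail.
reject = abs(t_stat) > t_crit
print(f"t = {t_stat:.2f}, reject H0: {reject}")
```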

Reflection
Interval estimates and hypothesis tests assume that the data comes from a
random sample drawn from a large population. The sample data collected for this
study was random, but the sample size could have been larger. The sampling
method could be improved by combining the data from other statistics classes to
produce a larger sample, which would reduce the influence of outliers on the
estimates.
This project has shown that we can, with varying levels of confidence, use
sample data to estimate the parameters of a population. We can also assess the
validity of claims made about the population given the sample data.

