Skittles Project
For this project, each member of my class bought a 2.17-ounce bag of Skittles. We each
recorded how many whole Skittles were in our bag and how many there were of each color. We
then compared each other's data with our own to see how the color counts and the total number
of Skittles varied from bag to bag. Our goal for this assignment was to collect data and analyze
how each bag compares to the next.
Skittles Data
[Chart of candy counts by color: Yellow (314; 18%), Red, Green, Purple, Orange]
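Student: counts in the order Red, Orange, Yellow, Green, Purple (single-digit counts missing); total
NA: 12, 15, 13; total 54
GA: 12, 14, 17; total 59
IC: 14, 10, 12, 17; total 60
(no initials): 10, 18, 13, 10; total 58
BC: 13, 13, 15; total 58
LD: 11, 11, 17, 14; total 60
SL: 13, 15, 10, 13; total 60
AF: 10, 14, 10, 19; total 62
JF: 11, 16, 11, 12; total 59
ZF: no counts listed
SL: 17, 12, 11, 11; total 59
MM: 10, 10, 12, 10, 11; total 53
KM: 13, 12, 15, 13; total 58
NR: 20, 11, 14; total 60
CT: 14, 16, 10, 16; total 62
ZW: 16, 18, 10, 12; total 60
MW: 13, 21, 10; total 58
LA: 15, 12, 13, 11; total 60
YA: 14, 10, 17; total 58
MA: 10, 13, 14, 10; total 56
CB: 26, 12; total 58
LC: 12, 15, 11, 11, 12; total 61
LH: 12, 13, 14, 13; total 60
RJ: 18, 20, 12; total 61
BJ: 13, 10, 10, 15, 14; total 62
RK: 13, 19, 10, 12; total 61
DL: 14, 13, 14, 10; total 59
LN: 15, 10, 14, 16; total 64
DT: 10, 14, 15, 12, 11; total 62
ST: 12, 12, 11, 14, 13; total 62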
Total: Red 366, Orange 343, Yellow 314, Green 355, Purple 346; total 1724

Student  Red  Orange  Yellow  Green  Purple  Total
LC       12   15      11      11     12      61
When I started this Skittles project, I assumed that each color would appear in approximately
equal amounts in every packet. Looking at my data, I realized that the colors are not equally
represented. In my packet, for example, there were 12 reds but 15 oranges. The data was
inconsistent throughout the class, with people getting different total numbers of Skittles and
different amounts of each color in their packets. The data we collected was not what I originally
expected. Although some people's data have similar parts, the data as a whole shows a wide
variety of differences between packets.
Mean: 57.5
Standard Deviation: 11.10
Five-Number Summary:
Minimum: 0
Q1: 58
Median: 60
Q3: 61
Max: 64
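The summary statistics above can be checked against the bag totals in the class table. The sketch below is only an illustration: it assumes the statistics describe the total number of candies per bag, and it assumes student ZF, whose row lists no counts, is entered as 0 (which would match the reported minimum). The quartiles use the default method of Python's `statistics.quantiles`, which may differ from a calculator's method on other data sets.

```python
import statistics

# Bag totals read from the class table, one per student (30 students).
# Assumption: ZF, whose row lists no counts, is entered as 0.
bag_totals = [
    54, 59, 60, 58, 58, 60, 60, 62, 59, 0,
    59, 53, 58, 60, 62, 60, 58, 60, 58, 56,
    58, 61, 60, 61, 62, 61, 59, 64, 62, 62,
]

mean = statistics.mean(bag_totals)
sd = statistics.stdev(bag_totals)  # sample standard deviation (divides by n - 1)
q1, median, q3 = statistics.quantiles(bag_totals, n=4)  # default "exclusive" method

print(f"Mean: {mean:.1f}")
print(f"Standard deviation: {sd:.2f}")
print(
    "Five-number summary: "
    f"{min(bag_totals)}, {q1:g}, {median:g}, {q3:g}, {max(bag_totals)}"
)
```

Because the single 0 sits far below every other bag, it pulls the mean down and inflates the standard deviation, while the median and quartiles barely move.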
When looking at this data, I realized how spread out the color counts were from each other.
In my own bag, the number of Skittles of each color varied noticeably from one color to the
next. When I added the data from the rest of the class, the counts for each color evened out,
but there are still gaps between colors. If we were to add more data, I suspect the counts would
even out more. The class's data gives a more accurate picture of the amount of each color in
every bag. It does not match up with my data because I have far fewer Skittles than the class as
a whole. There is no shape to this graph because of the randomness between colors and because
the colors have no natural order.
Quantitative data is data that can be counted or measured and expressed as numbers. Categorical
data is data that is put into categories, such as color, rather than measured. Categorical data can
be summarized with numbers by counting how many items fall in each category. Categorical data
is often not as informative as quantitative data, because you do not have an exact numerical
value for each item; you usually just know which category has the most. Graphs that work with
categorical data are pie charts and bar graphs, because they compare the share or count of each
category. Graphs that work best with quantitative data include histograms, boxplots
(box-and-whisker plots), scatter plots, and line graphs, among others, because they require exact
numerical values. A five-number summary, mean, and standard deviation all make sense for
quantitative data because they are calculations that require numbers. They would not make sense
for categorical data because its values are categories, not numbers.
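As a small illustration of summarizing categorical data, candy color can be described with counts and shares instead of a mean. The sketch below uses the class totals by color from the table; the dictionary name `color_counts` is just a label chosen for this example.

```python
# Class totals by color, read from the "Total" row of the class table.
color_counts = {"Red": 366, "Orange": 343, "Yellow": 314, "Green": 355, "Purple": 346}

total = sum(color_counts.values())

# For a categorical variable, report each category's count and share of the whole.
for color, count in color_counts.items():
    print(f"{color:<7}{count:>5}{count / total:>8.1%}")
print(f"{'Total':<7}{total:>5}")
```

These shares are the kind of values a pie chart or bar graph of the colors would display.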
A confidence interval is a range of values, calculated from sample data, that is likely to contain
the true value of a population parameter. The purpose of a confidence interval is to estimate that
parameter and to show how precise the estimate is at a chosen confidence level.
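Confidence interval estimate for the proportion of yellow candies
$E = 1.650\sqrt{\dfrac{(0.18)(0.82)}{314}} = 0.0358$
$0.18 + 0.0358 = 0.216$
$0.18 - 0.0358 = 0.144$
$0.144 < p < 0.216$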
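The margin of error above can be reproduced with a few lines of Python. This is a minimal sketch, not the author's calculator steps: `proportion_margin` is a hypothetical helper name, and the values 0.18, 1.650, and 314 are taken as written in the text. The second call is an assumption for comparison only: it treats all 1724 candies counted by the class as the sample size.

```python
import math

def proportion_margin(p_hat: float, n: int, z: float) -> float:
    """Margin of error E = z * sqrt(p_hat * (1 - p_hat) / n)."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

p_hat = 0.18  # sample proportion used in the text
z = 1.650     # critical value used in the text

# n = 314 is the sample size as written in the text.
e_text = proportion_margin(p_hat, 314, z)
print(f"n = 314:  E = {e_text:.4f}, {p_hat - e_text:.3f} < p < {p_hat + e_text:.3f}")

# Assumption for comparison: n is the 1724 candies counted by the whole class.
e_class = proportion_margin(p_hat, 1724, z)
print(f"n = 1724: E = {e_class:.4f}, {p_hat - e_class:.3f} < p < {p_hat + e_class:.3f}")
```

The comparison shows how strongly the width of the interval depends on which count is used as the sample size.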
95% confidence interval estimate for the mean number of candies per bag
$E = 1.961 \cdot \dfrac{2.455}{\sqrt{27}} = 0.927$
$55.207 + 0.927 = 56.134$
$55.207 - 0.927 = 54.280$
$54.280 < \mu < 56.134$
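The same arithmetic can be sketched in Python. All four inputs are the values used in the text; the variable names are chosen only for this example.

```python
import math

x_bar = 55.207  # sample mean used in the text
s = 2.455       # sample standard deviation used in the text
n = 27          # number of bags used in the text
crit = 1.961    # critical value used in the text

margin = crit * s / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"E = {margin:.3f}")
print(f"{lower:.3f} < mu < {upper:.3f}")
```

If the critical value were instead taken from Student's t distribution with 26 degrees of freedom (about 2.056 at the 95% level), the interval would come out slightly wider.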
98% confidence interval estimate for the standard deviation of candies per bag
$\sqrt{\dfrac{26 \cdot 2.455^2}{45.642}} = 1.853$
$\sqrt{\dfrac{26 \cdot 2.455^2}{12.189}} = 3.586$
$1.853 < \sigma < 3.586$
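A short Python sketch makes it easier to see which critical value belongs to which bound: dividing by the larger chi-square value gives the lower limit, and dividing by the smaller one gives the upper limit. The critical values are used exactly as written in the text (standard tables list the left-tail value for 26 degrees of freedom as 12.198, which would change the upper limit only in the third decimal place).

```python
import math

n = 27               # number of bags used in the text
s = 2.455            # sample standard deviation used in the text
chi2_right = 45.642  # right-tail critical value used in the text (df = 26)
chi2_left = 12.189   # left-tail critical value as written in the text

df = n - 1
lower = math.sqrt(df * s**2 / chi2_right)  # larger divisor gives the lower limit
upper = math.sqrt(df * s**2 / chi2_left)   # smaller divisor gives the upper limit

print(f"{lower:.3f} < sigma < {upper:.3f}")
```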
A hypothesis test allows you to test a claim about a certain property of a population.
Using a 0.05 significance level to test the claim that 20% of the Skittles are red
Using a 0.01 significance level to test the claim that the mean number of candies per bag of
Skittles is 55
The result of the first test is that we fail to reject the null hypothesis, because the P-value is
greater than the significance level. There is not sufficient evidence to warrant rejection of the
claim that 20% of the Skittles are red. The second test concluded that we reject the null
hypothesis. There is sufficient evidence to warrant rejection of the claim that the mean number
of candies per bag is 55.
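The first test can be sketched as a one-proportion z test. The exact sample behind the author's test is not shown, so this example assumes it was the class totals from the table (366 red candies out of 1724); `one_proportion_z_test` is a hypothetical helper written for this illustration, and the P-value uses the normal approximation.

```python
import math

def one_proportion_z_test(successes: int, n: int, p0: float) -> tuple:
    """Return (z, two-tailed P-value) for H0: p = p0, using the normal approximation."""
    p_hat = successes / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value

# Assumption: the sample is the class total of 366 red candies out of 1724.
z, p_value = one_proportion_z_test(366, 1724, 0.20)

alpha = 0.05
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"z = {z:.2f}, P-value = {p_value:.3f}: {decision}")
```

Under that assumption the P-value is well above 0.05, which agrees with the decision to fail to reject the null hypothesis for the claim about red candies.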
The conditions for doing interval estimates and a hypothesis test for a population proportion are
as follows. First, the sample needs to be reasonably random. Second, the data needs to come
from a normal distribution and a large sample, and it cannot be biased. Third, there cannot be
any outliers in the data, or else the results will be off. The sample also needs to be less than
10% of the population. The standard deviation must also be known. For this Skittles test we did
not meet all of the conditions. The data was not normally distributed. The sample was less than
10% of the population when you consider all the Skittles packs in the world. It was also
reasonably random.
The conditions for doing interval estimates and a hypothesis test for a population mean are as
follows. First, the sample needs to be reasonably random, and it cannot be biased. The data needs
to come from a normal distribution and a large sample, and there cannot be any outliers in the
data, or else the results will be off. The sample also needs to be less than 10% of the population.
The population standard deviation does not need to be known. For this Skittles test we did not
meet all of the conditions. The data was not normally distributed. The sample was less than 10%
of the population when you consider all the Skittles packs in the world. It was also reasonably
random.
The conditions for doing interval estimates and a hypothesis test for a population standard
deviation are that the sample must be random, if you want the results to be reasonably accurate,
and that the data must not be biased. The Skittles data for this sample does meet these
conditions. It was random, and it was also without bias.
There is a high possibility of error in this sample, first because of human error and second
because the sample often did not meet all of the conditions needed for a good sample. There was
also not as much data as is needed from the Skittles packets, so we cannot correctly apply these
results to all bags of Skittles. To improve this sampling method, we could have added more bags
of Skittles and checked the conditions to make sure the sample met all of them.