
Lauren Christiansen

Skittles Project
For this project, each member of my class bought a 2.17-ounce bag of Skittles. We each
recorded how many whole Skittles were in our bag and how many there were of each color. We
then compared our own data with the rest of the class to see how the color counts and the total
number of candies varied from bag to bag. Our goal for this assignment was to collect data and
analyze how each bag compares to the next.

Skittles Data (pie chart of class totals by color)

Color    Count   Percent
Yellow   314     18%
Orange   343     20%
Red      366     21%
Purple   346     20%
Green    355     21%
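The percentages in the chart can be checked from the class totals. The following is a minimal sketch, assuming the color totals from the data table (366 red, 343 orange, 314 yellow, 355 green, 346 purple); the function name `color_percentages` is only illustrative.

```python
# Class totals by color, taken from the totals row of the class data table.
color_counts = {"Red": 366, "Orange": 343, "Yellow": 314, "Green": 355, "Purple": 346}


def color_percentages(counts):
    """Return each color's share of all candies as a percentage."""
    total = sum(counts.values())
    return {color: 100 * n / total for color, n in counts.items()}


percentages = color_percentages(color_counts)
for color, pct in percentages.items():
    print(f"{color}: {pct:.0f}%")
```

Rounding each share to a whole percent gives the figures shown in the chart.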

Class data (counts listed in the order Red, Orange, Yellow, Green, Purple; some rows list fewer than five counts)

No.  Student  Color counts              Number in each bag
1    NA       12, 15, 13                54
2    GA       12, 14, 17                59
3    IC       14, 10, 12, 17            60
4             10, 18, 13, 10            58
5    BC       13, 13, 15                58
6    LD       11, 11, 17, 14            60
7    SL       13, 15, 10, 13            60
8    AF       10, 14, 10, 19            62
9    JF       11, 16, 11, 12            59
10   ZF
11   SL       17, 12, 11, 11            59
12   MM       10, 10, 12, 10, 11        53
13   KM       13, 12, 15, 13            58
14   NR       20, 11, 14                60
15   CT       14, 16, 10, 16            62
16   ZW       16, 18, 10, 12            60
17   MW       13, 21, 10                58
18   LA       15, 12, 13, 11            60
19   YA       14, 10, 17                58
20   MA       10, 13, 14, 10            56
21   CB       26, 12                    58
22   LC       12, 15, 11, 11, 12        61
23   LH       12, 13, 14, 13            60
24   RJ       18, 20, 12                61
25   BJ       13, 10, 10, 15, 14        62
26   RK       13, 19, 10, 12            61
27   DL       14, 13, 14, 10            59
28   LN       15, 10, 14, 16            64
29   DT       10, 14, 15, 12, 11        62
30   ST       12, 12, 11, 14, 13        62
     Total    366, 343, 314, 355, 346   1724

Student  Red  Orange  Yellow  Green  Purple  Number in each bag
LC       12   15      11      11     12      61

When I started this Skittles project, I assumed the number of each color in a packet of
Skittles would be approximately equal. Looking at my data, I realized that the colors in a
packet are not evenly split. In my packet, for example, there were 12 reds but 15 oranges. The
data was inconsistent throughout the class, with people getting different total numbers of
Skittles and different amounts of each color in their packets. The data we collected was not
what I originally expected. Although some people's data has similar parts, the data as a whole
shows a wide variety of differences between packets of Skittles.

Mean: 57.5
Standard Deviation: 11.10
Five-Number Summary:
Minimum: 0
Q1: 58
Median: 60
Q3: 61
Max: 64
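These summary values can be reproduced from the "number in each bag" column of the class table. The sketch below is only an illustration: it assumes student ZF's bag, which has no counts in the table, is entered as 0 (the reported minimum of 0 suggests this), and it uses the quartile method built into Python's `statistics` module, which may differ slightly from a calculator's method on other data sets.

```python
import statistics

# Bag totals from the class table, in row order. Student ZF's row has no
# counts; entering it as 0 is an assumption based on the reported minimum.
bag_totals = [
    54, 59, 60, 58, 58, 60, 60, 62, 59, 0,
    59, 53, 58, 60, 62, 60, 58, 60, 58, 56,
    58, 61, 60, 61, 62, 61, 59, 64, 62, 62,
]


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    ordered = sorted(values)
    # quantiles() with n=4 returns the three quartile cut points.
    q1, median, q3 = statistics.quantiles(ordered, n=4)
    return ordered[0], q1, median, q3, ordered[-1]


mean = statistics.mean(bag_totals)
sd = statistics.stdev(bag_totals)  # sample standard deviation (divides by n - 1)
summary = five_number_summary(bag_totals)

print(f"Mean: {mean:.1f}")
print(f"Standard deviation: {sd:.2f}")
print("Five-number summary:", summary)
```

The single 0 pulls the mean down and inflates the standard deviation, which is why the mean sits below Q1.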

When looking at this data, I realized how spread out the colors were from each other. In my
own bag, the number of Skittles of each color was noticeably different from the next color.
When I added the rest of the class's data, the counts for each color began to even out, but
there are still gaps between colors. If we were to add more data, I suspect the counts would
even out further. The class data gives a more accurate picture of the amount of each color in
a bag. It does not match my data because my bag has far fewer Skittles than the class as a
whole. There is no shape to this graph because the colors vary randomly and have no natural
order.

Quantitative data is data that can be counted or measured and expressed as numbers.
Categorical data is data that is put into categories, such as color, rather than measured.
Categorical data can be summarized with numbers by counting how many items fall into each
category. Categorical data is often not as detailed as quantitative data, because each
observation only tells you which category it belongs to. A graph that works well with
categorical data is a pie chart, because you can compare the categories at a glance. Another
graph that works is a bar graph, because it compares the counts between categories. Graphs that
work best with quantitative data include histograms, boxplots (box-and-whisker plots), scatter
plots, and line graphs, because they require numerical values. A five-number summary, mean,
and standard deviation all make sense for quantitative data because they are calculations that
require numbers. They would not make sense for categorical data because the values are
categories, not numbers.

A confidence interval is a range of values, calculated from sample data, that is likely to
contain the true value of a population parameter. The purpose of a confidence interval is to
estimate that parameter while showing how much uncertainty comes with the estimate.

99% confidence interval estimate for the proportion of yellow candy

p-hat = 314/1724 = 0.182
E = 2.576 * sqrt((0.182 * 0.818) / 1724) = 0.0239
0.182 + 0.0239 = 0.206
0.182 - 0.0239 = 0.158
0.158 < p < 0.206
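As a cross-check, the interval for a proportion can be computed directly. This is a minimal sketch, assuming the class totals from the data table (314 yellow out of 1,724 candies) and the usual normal-approximation formula; the function name is illustrative, and the critical value is obtained from the standard normal distribution rather than a printed table.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    # Two-sided critical value, about 2.576 for 99% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Assumed inputs: 314 yellow candies out of 1,724 in the class sample.
low, high = proportion_confidence_interval(314, 1724, 0.99)
print(f"{low:.3f} < p < {high:.3f}")
```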

95% confidence interval estimate for the mean number of candies per bag

E = 1.961 * (2.455 / sqrt(27)) = 0.927
55.207 + 0.927 = 56.134
55.207 - 0.927 = 54.280
54.280 < μ < 56.134
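The same arithmetic can be written as a short function. This sketch uses the values stated in the report (sample mean 55.207, sample standard deviation 2.455, n = 27) and the critical value 1.961 used above; the function name is illustrative. When the population standard deviation is unknown, a t critical value with n - 1 = 26 degrees of freedom (about 2.056 from a t table) could be passed in instead, which would give a slightly wider interval.

```python
from math import sqrt


def mean_confidence_interval(sample_mean, sample_sd, n, critical_value):
    """Confidence interval for a mean: sample_mean +/- critical_value * s / sqrt(n)."""
    margin = critical_value * sample_sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Values as stated in the report; 1.961 is the critical value it uses.
low, high = mean_confidence_interval(55.207, 2.455, 27, critical_value=1.961)
print(f"{low:.3f} < mean < {high:.3f}")
```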

98% confidence interval estimate for the standard deviation of candies per bag

sqrt((26 * 2.455^2) / 45.642) = 1.853
sqrt((26 * 2.455^2) / 12.198) = 3.584
1.853 < σ < 3.584
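The interval for a standard deviation divides by the larger chi-square critical value to get the lower limit and by the smaller one to get the upper limit. The sketch below assumes the report's values (s = 2.455, n = 27) and chi-square critical values for 26 degrees of freedom read from a standard table (12.198 and 45.642 for 98% confidence); the function name is illustrative.

```python
from math import sqrt


def sd_confidence_interval(sample_sd, n, chi2_left, chi2_right):
    """Confidence interval for a population standard deviation.

    chi2_left and chi2_right are the lower and upper chi-square critical
    values for n - 1 degrees of freedom, taken from a table.
    """
    df = n - 1
    lower = sqrt(df * sample_sd ** 2 / chi2_right)  # larger divisor, smaller limit
    upper = sqrt(df * sample_sd ** 2 / chi2_left)   # smaller divisor, larger limit
    return lower, upper


# Table values for 26 degrees of freedom at 98% confidence (assumed inputs).
low, high = sd_confidence_interval(2.455, 27, chi2_left=12.198, chi2_right=45.642)
print(f"{low:.3f} < sigma < {high:.3f}")
```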

A hypothesis test allows you to test a claim about a certain property of the population.

Using a 0.05 significance level to test the claim that 20% of the Skittles are red
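One way to carry out this test is a two-tailed z test for a proportion. The sketch below is an illustration under stated assumptions, not the calculation used in the report: it assumes the class totals from the data table (366 red out of 1,724 candies), and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, claimed_p):
    """Two-tailed z test of the null hypothesis p = claimed_p."""
    p_hat = successes / n
    # The standard error uses the claimed proportion from the null hypothesis.
    z = (p_hat - claimed_p) / sqrt(claimed_p * (1 - claimed_p) / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Assumed inputs: 366 red candies out of 1,724 in the class sample.
z, p_value = one_proportion_z_test(366, 1724, 0.20)
alpha = 0.05
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"z = {z:.2f}, P-value = {p_value:.3f}, decision: {decision}")
```

With these inputs the P-value is larger than 0.05, so the test fails to reject the claim that 20% of the Skittles are red.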

Using a 0.01 significance level to test the claim that the mean number of candies per bag of
Skittles is 55

The result of the first test is that we fail to reject the null hypothesis, because the P-value
is greater than the significance level. There is not sufficient evidence to warrant rejection of
the claim that 20% of the Skittles are red, although failing to reject the claim does not prove
it. The second test concluded that we reject the null hypothesis, because there is sufficient
evidence to warrant rejection of the claim that the mean number of candies per bag is 55. We do
not have enough evidence to conclude that the claim is correct.

The conditions for doing interval estimates and a hypothesis test for a population proportion
are as follows. First, the sample needs to be reasonably random. Second, the data needs to come
from a normal distribution and a large sample, and cannot be biased. Third, there cannot be any
outliers in the data, or else the results will be off. The sample also needs to be less than 10%
of the population. The standard deviation must also be known. For this Skittles test we did not
meet all of the conditions. It was not a normal distribution. It was less than 10% of the
population when you consider all the packs of Skittles in the world. It was also reasonably
random.

The conditions for doing interval estimates and a hypothesis test for a population mean are as
follows. First, the sample needs to be reasonably random, and it cannot be biased. The data
needs to come from a normal distribution and a large sample, and there cannot be any outliers in
the data, or else the results will be off. The sample also needs to be less than 10% of the
population. The population standard deviation does not need to be known. For this Skittles test
we did not meet all of the conditions. It was not a normal distribution. It was less than 10% of
the population when you consider all the packs of Skittles in the world. It was also reasonably
random.

The conditions for doing interval estimates and a hypothesis test for a population standard
deviation are that the sample must be random if you want the results to be reasonably accurate,
and the data must not be biased. The Skittles data for this sample does meet these conditions.
It was random and it was also without bias.

There is a high possibility of errors in this sample, first because of human error and also
because it often did not meet all of the conditions needed for a good sample. There was also not
as much evidence as is needed from the Skittles packets, so we cannot correctly apply these
results to all bags of Skittles. To improve this sampling method, we could have added more bags
of Skittles and checked the conditions to make sure the sample met all of them.
