Valerie Whitehead

August 5, 2015
Math 1040
Term Project Part 7
PART 1 & 2
Table for Skittle Project
Columns: Skittle Color | Skittle Color in My Bag | Total Skittle Color

When I looked at the data in this table, I expected the numbers to be
somewhat random. Everyone bought their bag of Skittles from a different
place at a different time, so I didn't really expect my numbers to
reflect the class totals. However, I was surprised to find that the mean
number of Skittles per bag was approximately 60, which was exactly what
I had. I was expecting a lot more variety in those numbers. In this table
you can see that green was the least common color in my bag, yet in the
class total it was one of the more frequent colors. Red and yellow were
the most frequent colors in my bag, yet they were the lowest in the
class total. As you can see, the color counts for the class did not agree
with what I found in my own individual bag. They were in fact quite
different!

1. My findings about the variable "Total candies in each bag" were as
follows. The distribution started at a minimum, increased to a peak, and
then decreased again; however, it was skewed to the left rather than
bell shaped. It also had an outlier of 80, and it is hard to believe
that number is accurate. Maybe that student got the wrong size of bag,
or made up the number. The graph peaked at 60, which is surprising
because 60 is the exact number I had in my own bag of Skittles. I
figured the mean total of the 61 bags had to be close to what I had in
mine, but it surprised me how exact it was. Because of this I would say
that the overall data (61 bags)
collected by the whole class agrees with my own data from my own

2. The difference between categorical data and quantitative data is
that categorical data is grouped according to common properties, while
quantitative data is measured numerically. For example, categorical data
would be things such as race, sex, age group, and educational level. You
can't really measure this kind of data, but you can count the number of
members in each category or group. Quantitative data are things such as
length and weight. These things are measurable and you can compare one
to another. When looking at categorical data you would want to use
frequency tables, bar graphs, pie charts, or Pareto charts. Since this
data cannot be measured, you would make a table that pairs each property
with the number of members that have that property. Since quantitative
data can be measured, you can use graphs such as histograms, which show
the measurements and the frequency of those measurements. You can also
use boxplots or stemplots for quantitative data. For categorical data
there aren't really any calculations that make sense beyond counts and
proportions. You could code the categories as numbers and find the mean,
but the result wouldn't mean anything. For quantitative data you can do
several calculations. You can find the mean, median, and mode. You can
also find the range. The five-number summary is also something you can
calculate: the minimum, first quartile, median, third quartile, and
maximum. You can calculate the standard deviation as well. All of these
calculations help you to understand the data better. Although
categorical and quantitative data are different, you actually need both
for the data to be complete and meaningful. The quantitative data gives
us the numbers and the categorical data gives us the labels that tell us
what the numbers measure.
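
The calculations listed above for quantitative data can be sketched in a few lines of Python. This is only an illustration: the bag totals below are made-up values, not the class data, and the quartiles use the "inclusive" method, which may differ slightly from the method a textbook or calculator uses.

```python
import statistics

# Hypothetical bag totals for illustration only (NOT the class data).
bag_totals = [55, 58, 59, 60, 60, 61, 62, 64, 80]

def describe(values):
    """Return common summary statistics for a list of numbers."""
    ordered = sorted(values)
    # quantiles(n=4) returns the three cut points Q1, median, Q3.
    q1, q2, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "mode": statistics.mode(ordered),
        "range": ordered[-1] - ordered[0],
        "five_number": (ordered[0], q1, q2, q3, ordered[-1]),
        "stdev": statistics.stdev(ordered),  # sample standard deviation
    }

summary = describe(bag_totals)
for name, value in summary.items():
    print(name, value)
```

Notice that the single large value of 80 pulls the mean above the median, which is one reason to look at both.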

When we estimate a population parameter, instead of having just a
single value, or point estimate, we can find a confidence interval. We
use confidence intervals because they give us a range of values in
which we believe, with a stated degree of confidence, the true
population value falls. The actual definition of a confidence interval
is a range of values so defined that there is a specified probability
that the value of a parameter lies within it. When you are using a
confidence interval, say a 98% confidence interval, you first examine
the data from your sample. Let's say we did a survey of 987 people and
found that 98% of them know what Harry Potter is. We could then find the
confidence interval and say that we are 98% confident that the interval
actually does contain the true value of the population proportion p.
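
As a rough sketch, the Harry Potter example can be worked out in Python using the usual normal-approximation interval for a proportion. The function name is my own, and the sketch assumes the sample is large enough for the normal approximation to be reasonable.

```python
from math import sqrt
from statistics import NormalDist

def proportion_confidence_interval(p_hat, n, confidence):
    """Normal-approximation confidence interval for a population proportion."""
    alpha = 1 - confidence
    # Critical value z for a two-sided interval, e.g. about 2.326 for 98%.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin_of_error = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin_of_error, p_hat + margin_of_error

# Survey from the text: 987 people, 98% know what Harry Potter is.
low, high = proportion_confidence_interval(0.98, 987, 0.98)
print(f"98% confidence interval: {low:.4f} to {high:.4f}")
```

With these numbers the interval runs from roughly 0.970 to 0.990, so we would be 98% confident that between about 97% and 99% of the population knows what Harry Potter is.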

1. A hypothesis is a claim or statement about a property of a
population. A hypothesis test is a procedure for testing a claim about a
property of a population. So when a statement, or hypothesis, is made
about a property of a population, we use hypothesis testing to determine
whether or not there is enough statistical evidence in favor of that
statement. These tests do not prove that a statement is true or false;
they tell us whether the sample data are consistent with it. If the data
are consistent with the null hypothesis, we fail to reject it. However,
if the data would be very unlikely under the null hypothesis, we reject
it.

4. In problem number 2 I failed to reject the null hypothesis. There is
not sufficient sample evidence to warrant rejection of the claim that
20% of all Skittles candies are red. I was able to come to this
conclusion by comparing the p-value to the significance level. I got
0.2802 for my p-value, which is greater than the significance level of
0.01. Therefore, we have to fail to reject. In problem 3 I rejected the
null hypothesis. There is sufficient sample evidence to warrant
rejection of the claim that the mean number of Skittles candies per bag
is 55. I did this by finding the t-value, which is 8.22644. This value
is greater than the critical value of 2.660, so it falls in the critical
region, and we must reject.
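
The p-value comparison used for the proportion claim can be sketched in Python as a two-tailed one-proportion z-test. The counts below are hypothetical and only there so the code runs; they are not the class counts, so the p-value they give is not the 0.2802 reported above.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0):
    """Two-tailed z-test of H0: p = p0, returning (z, p_value)."""
    p_hat = successes / n
    standard_error = sqrt(p0 * (1 - p0) / n)  # computed under H0
    z = (p_hat - p0) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

def decide(p_value, alpha):
    """Reject H0 only when the p-value is at or below the significance level."""
    return "reject" if p_value <= alpha else "fail to reject"

# Hypothetical counts for illustration: 780 red candies out of 3660.
z, p_value = one_proportion_z_test(780, 3660, 0.20)
decision = decide(p_value, alpha=0.01)
print(f"z = {z:.3f}, p-value = {p_value:.4f}, decision: {decision}")

# The same rule applied to the p-value reported in the text.
print(decide(0.2802, alpha=0.01))
```

Keeping the decision rule in its own function makes it easy to see that the conclusion depends only on how the p-value compares with the significance level.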

5. The first hypothesis test tested the claim that 20% of all Skittles
candies are red, and I got a p-value that is greater than the
significance level. We presume the null hypothesis of p = 0.20 to be
true. Since my p-value was greater than the significance level, we
cannot reject the null hypothesis that p = 0.20 because there is not
enough evidence against the claim. However, this doesn't necessarily
mean that it is true, just plausible. The second hypothesis test tested
the claim that the mean number of Skittles candies per bag is 55. I got
a critical value of 2.660 and a t-score of 8.22644. When comparing the
two on the chart, the t-score lies beyond the critical value, so we come
to the conclusion that we must reject the null hypothesis. There is
sufficient sample evidence to warrant rejection of the claim. So this
time the null hypothesis is not plausible.

The Skittles Term Project helped develop my problem solving
skills in many ways. While doing the project I had to think of the
Skittles as more than just a bag of Skittles. I have eaten Skittles
several times in my life, but never once did I look at them and consider
them a statistical math problem. It opened my mind to see that a lot of
things in my daily life could be turned into or looked at as a statistical

Before this project I knew the basics like how to find the mean,
median and mode of a data set, but I had never heard of a standard
deviation, z-score, hypothesis testing, etc. All of this was new
information for me. This new information also helped to develop my
problem solving skills because it took me deeper into the data set. I
had to analyze it even more. Not only that, but because this class is
online I had to teach myself how to do it.
I think this is what helped develop my problem solving skills the
most. I didn't just have to analyze the data more deeply for this
project; I had to find out how to analyze it more deeply. I had to teach
myself how to find the z-score and what it meant. I had to teach myself
how to calculate a standard deviation and the significance it gives to a
data set. I had to teach myself what hypothesis testing is and what it
does.
When things became confusing during the project I had to use
my problem solving skills to figure out a way to understand them
better. The hardest part wasn't learning the calculations, but learning
the actual meaning behind those calculations and what they meant in
regard to the data set. I think this project definitely helped me to
develop my problem solving skills.