
Nadia Vedadi

Math 1040 Skittles Term Project


Have you ever wondered how many candies of each color are in a bag of Skittles? Not all
bags of Skittles are equal; the counts can vary a lot from bag to bag. This term project is about
organizing and analyzing data collected from bags of Skittles. These bite-sized, colorful candies help
us apply the statistics skills we have learned so far this semester to a real-life situation.
Each member of the class purchased a 2.17-ounce bag of Original Skittles. Each student then
counted the number of red, orange, yellow, green, and purple candies in the bag. We then submitted
our data online to Professor Christensen. He combined the data into an easy-to-read chart so we
could see how many Skittles were in everyone else's bags and how many of
each color there were.

Above are the pie chart and Pareto chart I created to show the number of candies of each color
in the Skittles bags for the entire class sample. These graphs reflect what I expected to see: I
hypothesized that the color counts in my own bag would be similar, but not identical, to those of
the rest of the class. However, I did think that there would be more variation in the counts
for each color and in the number of Skittles in each person's bag. Overall, my data closely
resemble the class data that I received.
Personal Skittles Bag
Color                Red   Orange   Yellow   Green   Purple
Number of candies      9       13       12      14       11

Classroom Sample of Skittles Bags
Color                Red   Orange   Yellow   Green   Purple
Number of candies    184      191      177     173      180
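As a rough illustration of what the pie chart and Pareto chart summarize, the sketch below turns the class totals from the table above into proportions and sorts them from most to least frequent, which is the ordering a Pareto chart uses. The variable names are my own and are only for illustration.

```python
# Class totals copied from the classroom sample table.
class_counts = {"red": 184, "orange": 191, "yellow": 177, "green": 173, "purple": 180}

total = sum(class_counts.values())  # all candies in the 15 bags combined

# Pareto order: categories sorted from most to least frequent.
pareto = sorted(class_counts.items(), key=lambda item: item[1], reverse=True)

cumulative = 0.0
for color, count in pareto:
    share = count / total          # the slice of the pie chart for this color
    cumulative += share            # running total shown on a Pareto chart
    print(f"{color:<7} {count:>4} {share:6.1%}  cumulative {cumulative:6.1%}")
```

Each color comes out close to one fifth of the total, which is why the slices of the pie chart look so similar.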

Summary statistics:
Column: Number of Candies Per Bag

n    Mean   Std. dev.   Median   Range   Min   Max   Q1   Q3   IQR
15   60.3   2.29        60       8       55    63    59   63   4

The histogram and boxplot above show the distribution of candies per bag for the
classroom sample. In both graphs the distribution is skewed to the left. Most bags in the
classroom sample held between 60 and 63 candies. One bag had only 55 candies, which pulled
the mean to the left. I did not expect that much spread, with one bag as low as 55; I
thought that, to be fair, the counts would differ by only one or two candies. I had 59 candies in
my bag, which is about one candy below the mean of 60.3. My bag is therefore close to the
classroom sample, but it falls just below the range of 60 to 63 where most of the data
lie.
Total # of Bags in Sample: 15
Skittle Candies in my Bag: 59

Categorical data consist of names or labels that sort items into groups; the labels are not numbers
that measure or count anything. Quantitative data consist of numbers that represent counts or
measurements. In this project, the categorical data are the colors of the Skittles in each
bag, and the quantitative data are the numbers of candies per bag. Graphs that suit categorical data are
the Pareto chart, bar graph, and pie chart. Scatterplots, histograms, boxplots, stem-and-leaf plots,
and dot plots suit quantitative data. The pie chart and Pareto chart make sense for color because
they show how often each category occurs without requiring the categories to have numerical values.
The histogram and boxplot require numerical values, so they make sense for the number of candies
per bag.

A confidence interval is a range of values used to estimate the value of a population parameter.
Confidence intervals are constructed to estimate a population parameter based on the
sample information gathered. Statisticians use a confidence interval to describe the amount of
uncertainty associated with a sample estimate of a population parameter. For a proportion or a mean,
the interval is the sample statistic plus or minus the margin of error. A confidence interval
lets us estimate the range in which the true population parameter falls, given what we know
about the population from the sample we've observed. In order to construct a confidence interval,
certain conditions need to be met. First, the sample must be a simple random sample. Second, for a
proportion, the sample needs to meet the binomial distribution conditions, with at least 5 successes
and at least 5 failures. Third, for a mean, the population must be normally distributed or the sample
size must be greater than 30. Lastly, for a standard deviation, the population must be normally
distributed. Three things affect the margin of error: sample size, confidence level, and standard
deviation.

Construct a 99% confidence interval estimate for the true proportion of yellow candies.

The first confidence interval estimate that I calculated was a 99% confidence interval for the true
proportion of yellow candies. The result was 0.176 < p < 0.246. This means
that I am 99% confident that the interval 0.211 plus or minus the margin of error of 0.0349 contains the
true proportion of all yellow candies.
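The calculation behind an interval like this can be sketched as follows. This is a minimal sketch, assuming the usual normal-approximation formula, p-hat plus or minus z times sqrt(p-hat(1 - p-hat)/n); the function name `proportion_ci` is my own. Run on the yellow total from the class table (177 of 905), it gives roughly 0.162 to 0.230. The reported sample proportion of 0.211 matches 191/905, the orange total in that table, and entering 191 reproduces the interval reported above.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    # Two-sided critical value, e.g. about 2.576 for 99% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, margin, p_hat - margin, p_hat + margin


# Yellow total from the class table.
p_hat, margin, low, high = proportion_ci(177, 905, 0.99)
print(f"yellow: {low:.3f} < p < {high:.3f}  (p-hat {p_hat:.3f}, E {margin:.4f})")

# Same call with 191, the orange total in the class table, for comparison.
p_alt, margin_alt, low_alt, high_alt = proportion_ci(191, 905, 0.99)
print(f"191/905: {low_alt:.3f} < p < {high_alt:.3f}  (p-hat {p_alt:.3f}, E {margin_alt:.4f})")
```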

Construct a 95% confidence interval estimate for the true mean number of candies per bag.

The second confidence interval estimate was a 95% confidence interval for the true mean
number of candies per bag. The result was 59.0 < μ < 61.6. This means that I am 95% confident that
the interval 60.3 plus or minus the margin of error of 1.3 contains the true mean number of
candies per bag for all Skittles bags.
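The arithmetic for this interval can be checked with a short sketch. It assumes the t-based formula, sample mean plus or minus t times s/sqrt(n), and uses the summary statistics reported earlier (n = 15, mean 60.3, standard deviation 2.29). The critical value 2.145 is taken from a standard t table for 14 degrees of freedom and is hard-coded here as an assumption, because Python's standard library has no t distribution.

```python
from math import sqrt

n, mean, s = 15, 60.3, 2.29   # summary statistics for candies per bag
t_crit = 2.145                # t table value: 95% confidence, df = n - 1 = 14

margin = t_crit * s / sqrt(n)          # margin of error E
low, high = mean - margin, mean + margin

print(f"E = {margin:.2f}")
print(f"{low:.1f} < mu < {high:.1f}")
```

The margin of error comes out near 1.3, in line with the interval reported above.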
Construct a 98% confidence interval estimate for the standard deviation of the number of
candies per bag.

The last was a 98% confidence interval estimate for the standard deviation of the number of candies per
bag. The result was 1.05 < σ < 2.62. This means that I am 98% confident that the interval from 1.05 to
2.62 contains the true standard deviation of the number of candies per bag for all bags of Skittles.
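A sketch of the chi-square method for this interval is shown below. It assumes the formula sqrt((n - 1)s²/χ²_R) < σ < sqrt((n - 1)s²/χ²_L) with the reported s = 2.29 and n = 15. The critical values 4.660 and 29.141 are taken from a standard chi-square table for 14 degrees of freedom and are hard-coded as an assumption.

```python
from math import sqrt

n, s = 15, 2.29
df = n - 1
chi2_left, chi2_right = 4.660, 29.141   # chi-square table values: 98% confidence, df = 14

# The formula uses the sample variance s**2, not s itself.
low = sqrt(df * s**2 / chi2_right)
high = sqrt(df * s**2 / chi2_left)
print(f"{low:.2f} < sigma < {high:.2f}")

# For comparison only: the same formula with s in place of s**2.
low_unsquared = sqrt(df * s / chi2_right)
high_unsquared = sqrt(df * s / chi2_left)
print(f"with s instead of s**2: {low_unsquared:.2f} to {high_unsquared:.2f}")
```

With these inputs the sketch gives roughly 1.59 < σ < 3.97. The bounds 1.05 and 2.62 reported above appear when s is used in place of s², which suggests the standard deviation may not have been squared in the original calculation.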

A hypothesis test is a procedure for testing a claim about a population parameter or characteristic.
A statistical hypothesis is a claim about a population parameter. Hypothesis testing refers to the
procedures statisticians use to decide whether to reject or fail to reject a statistical hypothesis. If the
sample data are not consistent with the null hypothesis, it is rejected. A null and an alternative
hypothesis need to be identified. The null hypothesis states that the population parameter equals the
claimed value. The alternative hypothesis states that the parameter differs from that value. The
significance level is the probability of rejecting the null hypothesis when it is actually true. The
claim needs to be tested in order to draw a conclusion that supports or rejects it.

Use a 0.05 significance level to test the claim that 20% of all Skittles candies are red.

The first hypothesis test uses a 0.05 significance level to test the claim that 20% of all Skittles
candies are red. The claim is the null hypothesis, and the alternative hypothesis is that the proportion
of red Skittles candies is not 20%. This is a two-tailed test. I found that the test statistic, 0.226, is
less than the critical value, 1.96, so we fail to reject the null hypothesis. There is not sufficient
evidence to warrant rejection of the claim that 20% of all Skittles candies are red.
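The test can be sketched as a one-proportion z test on the red total from the class table (184 of 905). This is a minimal sketch under that assumption, and the function name `one_proportion_z_test` is my own. Without rounding, the test statistic is about 0.249; rounding the sample proportion to 0.203 first gives the 0.226 reported above. Either way the statistic is far below 1.96, so the conclusion is the same.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0, alpha):
    """Two-tailed z test of H0: p = p0 against H1: p != p0."""
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)       # test statistic
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)      # two-tailed critical value
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, z_crit, p_value, abs(z) > z_crit


z, z_crit, p_value, reject = one_proportion_z_test(184, 905, 0.20, 0.05)
print(f"z = {z:.3f}, critical value = {z_crit:.2f}, P-value = {p_value:.3f}")
print("reject H0" if reject else "fail to reject H0")
```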

Use a 0.01 significance level to test the claim that the mean number of candies in a bag of
Skittles is 55.

The second hypothesis test uses a 0.01 significance level to test the claim that the mean number of
candies in a bag of Skittles is 55. The claim is the null hypothesis, and the alternative hypothesis is that
the mean number of candies in a bag is not 55. This is a two-tailed test. I found that the test statistic,
8.96, is greater than the critical value, 2.977, so we reject the null hypothesis. There is sufficient evidence
to warrant rejection of the claim that the mean number of candies in a bag of Skittles is 55.
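The test statistic above can be reproduced with a short sketch of the one-sample t test, t = (sample mean - claimed mean) / (s/sqrt(n)), using the summary statistics reported earlier. The critical value 2.977 is the one quoted in the text (14 degrees of freedom, 0.01 significance, two tails) and is hard-coded as an assumption, because Python's standard library has no t distribution.

```python
from math import sqrt

n, mean, s = 15, 60.3, 2.29   # summary statistics for candies per bag
mu0 = 55                      # claimed mean under the null hypothesis
t_crit = 2.977                # t table value: alpha = 0.01, two-tailed, df = 14

t_stat = (mean - mu0) / (s / sqrt(n))
reject = abs(t_stat) > t_crit

print(f"t = {t_stat:.2f}, critical value = {t_crit}")
print("reject H0" if reject else "fail to reject H0")
```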
Reflection
Conditions for a Confidence Interval for Estimating a Population Proportion p
1) The sample is a simple random sample.
2) The conditions for the binomial distribution are satisfied.
3) There are at least 5 successes and at least 5 failures.
Conditions for a Confidence Interval for Estimating a Population Mean with σ Not Known
1) The sample is a simple random sample.
2) Either or both of these conditions are satisfied: the population is normally distributed or n > 30.
Conditions for a Confidence Interval for Estimating a Population Standard Deviation or Variance
1) The sample is a simple random sample.
2) The population must have normally distributed values (even if the sample is large). The requirement
of a normal distribution is much stricter here than in earlier sections, so departures from normal
distributions can result in large errors.

I solved the statistical questions using the class data. The data met each of the requirements to
perform the calculations. For each confidence interval, I reported the sample statistic plus or minus the
margin of error. For each hypothesis test, I compared the test statistic to the critical values determined
by the significance level. The critical values separate the critical region from the values of the test
statistic that do not lead to rejection of the null hypothesis. On a bell curve, the position of the test
statistic relative to these values determines whether the claim should be rejected or supported.
Errors could have resulted from misusing the formulas or misinterpreting the data.
Students could have guessed at their counts instead of purchasing a bag. An outlying number of
Skittles could also have been reported by a student who purchased a bag of a different size than the
one required for the project.
After solving the problems using the class data, I concluded that not every
Skittles bag has an equal proportion of colors. Also, each bag may or may not have an equal number of
candies, but we know each bag should weigh the same.

Reflection

I remember skimming through this project at the beginning of the semester and feeling so
overwhelmed. I thought all the problems seemed almost impossible. Completing the project
required a lot of thinking. We had to collect a sample of data, organize that data, create
graphs and charts, and interpret what the information means. The mathematics has been a little challenging
to wrap my head around, but with repetition and practice I feel that I really improved my
mathematical skills. These skills have led to an improvement in my overall ability to think
more analytically.
I am positive that the skills I learned will help me in my future studies. Right now I am in the
middle of nursing school and completing the requirements for my nursing degree. The
statistics skills I acquired will allow me to gather data, analyze it, and form defensible
conclusions. The coursework required for nursing school involves pharmacology and
medical/surgical work. These courses work directly with statistics. Now I feel that I will be better
prepared and more involved in those classes.
As a nurse, I will be interacting with patients and will be required to interpret patient data. It will be
my responsibility to make decisions and be an advocate for my patients. I will need to present clear and
accurate information to patients and other hospital staff. This will require gathering information from
the patients. What we do with that information leads to a course of treatment, therapies, and
advice that involve statistics in some manner. The skills that I have learned in this class will help me
gauge the signs and symptoms, diagnosis, and course of action for my patients, and will make me a more
prepared nurse.
