
Jesse Tice

Term Project - ePortfolio

Introduction-
This semester the class was divided into groups to work on a term project: a statistical
analysis of the colors of candy found in a small bag of original Skittles. We each counted how
many candies of each color were in our bag. After totaling the colors, we compared how our
color distribution differed from or resembled those of our group and the class. As we continued
through the group and individual parts of the project, we worked out many different
statistical values. We also made multiple charts to visually represent the data.

Connections w/ brief description -


In part one of the project we each bought a 2.17 ounce bag of Skittles. We counted the
number of candies of each color and then recorded the totals in the chart below. This
was the simplest part of the team project.

The second and following portions of the project required a lot more statistics work. We
compared the proportion of each color in our own bag to the proportions in the class data. I found
that the colors in my bag matched the class proportions pretty closely; however, that wasn’t the
case for everyone. We also made different types of charts to visually compare the colors.

In our third part of the term project we did more statistical work. We found the 5-number
summary, which consists of the minimum, first quartile, median, third quartile, and maximum
values. This was another part of the project that required a chart. Here we made a frequency
histogram.
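
The five values named above can be computed in a few lines. This is a minimal sketch: the bag totals are hypothetical numbers made up for illustration (they are not the class data), and the function name `five_number_summary` is ours. Software packages use slightly different quartile rules, so StatCrunch may report slightly different Q1 and Q3 values than the "inclusive" method assumed here.

```python
from statistics import quantiles

# Hypothetical bag totals, invented for illustration only (NOT the class data).
bag_totals = [52, 57, 58, 59, 60, 60, 61, 61, 62, 65]


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    ordered = sorted(values)
    # quantiles(..., n=4) returns the three cut points Q1, median, Q3.
    # The "inclusive" rule is an assumption; other tools may use another rule.
    q1, med, q3 = quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, med, q3, ordered[-1]


summary = five_number_summary(bag_totals)
iqr = summary[3] - summary[1]  # interquartile range = Q3 - Q1
print("five-number summary:", summary)
print("interquartile range:", iqr)
```

The interquartile range printed at the end is the spread of the middle half of the bags, which is the same quantity a box plot draws as its box.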

Part 4 consisted of working out different statistical values. This is the part of the term
project that I had the most problems with. Among the values we found were confidence
interval estimates and their margins of error. To do this we first had to check that the data
come from a population that is normally distributed.

Summary -
The term project included several different parts in which we had to compare samples
and interpret a large data set. This project has helped confirm for me that statistics can be used
for almost anything and that they are used in our everyday lives. Even though statistics on Skittles
may not be very useful to me, the lessons learned will be extremely useful in my future. I’ve
learned there are many things to watch out for when looking at results from studies. We have to
be aware of the size of the study, the type of study, how the sample was collected, lurking
variables, etc. Another helpful skill that we learned is how to make different types of charts to
visually represent data. Sometimes data is better represented visually. Overall I’m happy that I
took this class and feel that I can use what I learned in it in more situations than most of the
other math classes that I’ve taken.

Group project, part 1 - Individual portion

Group Project- Part 2


1. What proportion (or percentage) of the Skittles do you expect to see of each color? Why?
Report the proportion of each color within the overall sample gathered by the class.

Our group’s predictions for how the colors would be distributed varied a little. Some thought
the proportions would follow each flavor’s popularity, while others predicted that the colors
would be evenly spread out. The report reflects the general consensus of an even distribution.

2. In StatCrunch, create a pie chart and a Pareto chart for the total number of candies of each
color in our class data set. Submit copies of your graphs in this report.

3. Does the class data represent a random sample? What would the population be? Collaborate
to discuss sampling and our data in a paragraph or two. Think carefully about the definition of
random sample when you work on your group response.

We do believe that the class data shows random sampling because the flavors seem to be evenly
distributed. The individual data does not always appear evenly distributed, but with 74
individuals submitting their data to the sample, we get a bigger picture: each color appears to
have an equal chance of being selected, since they all fall close to the 20% range.

Individual Portion - Project Part 2

The graphs match what I expected to see pretty closely. If the colors are evenly distributed,
you’d expect about 20% of each color when there are 5 colors. In my sample there was a
larger deviation from that 20%, but not by much, and that’s partially because we’re dealing with
a fairly small sample size. There weren’t any outliers in either my data set or the class’s. The
values were within a pretty close range of each other and of what you’d expect. As the total
count increases, as it does with the class total, you’d expect the sample proportions to settle
closer to the expected 20%, which they do.
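
The effect of sample size described above can be illustrated with a small simulation. This is only a sketch under a stated assumption: every candy is drawn independently with an equal 20% chance for each of the five colors, which is the group's hypothesis rather than anything the manufacturer confirms. The sample sizes 60 and 4394 stand in for one bag and the class total; the results are simulated, not the class data.

```python
import random
from collections import Counter

COLORS = ["red", "orange", "yellow", "green", "purple"]


def max_deviation(n_candies, rng):
    """Draw n_candies with equal color chances; return the largest gap from 20%."""
    counts = Counter(rng.choice(COLORS) for _ in range(n_candies))
    return max(abs(counts[color] / n_candies - 0.2) for color in COLORS)


rng = random.Random(0)  # fixed seed so the run is repeatable
one_bag = max_deviation(60, rng)         # roughly the size of one bag
whole_class = max_deviation(4394, rng)   # the class total used in Part 4
print(f"largest gap from 20% in one bag:        {one_bag:.3f}")
print(f"largest gap from 20% in a class sample: {whole_class:.3f}")
```

A single bag will usually show a noticeably larger gap from 20% than the class-sized sample, which is the pattern the paragraph above describes.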

Group Project: Part 3


1. Using the total number of candies in each bag in our class sample, compute the following
measures for the variable “Total candies in each bag”:
(a) mean number of candies per bag Mean: 59.4
(b) standard deviation of the number of candies per bag Standard Deviation: 2.8
(c) 5-number summary for the number of candies per bag
Minimum: 52 Q1: 58 Median: 60 Q3: 61 Maximum: 65

2. Create a frequency histogram for the variable “Total candies in each bag”.

Team project part 3 – Individual portion

Overall there wasn’t a wide range in the number of candies that each bag contained. The
distribution in the graph was roughly symmetrical. The interquartile range was only 3 candies
(Q1 = 58 to Q3 = 61). I expected most of the bags to contain a similar number of candies. I would
think most major candy manufacturers are quite accurate and don’t deviate from the mean very
much. The number of candies in my bag was 60, which matches the class median and is within
one candy of the mean of 59.4.

Categorical data differs from quantitative data in that it describes a quality or group
membership rather than a measured amount. It represents data that can be divided into groups
such as education level, sex, or age group. The types of graphs that suit categorical data are
pie charts and bar graphs. With these graphs the data is easy to visualize and compare group
to group.

Quantitative data is the type that can be counted or measured numerically. This would be
something like points scored in a game or the GDP of a country. For this type of data you want
to use box plots, stem plots, and histograms. With these graphs it is easy to see the frequency
of whatever is being measured. A pie chart wouldn’t make sense for quantitative data.

Team project part 4 - group portion

● Construct a 99% confidence interval estimate for the population proportion of yellow
candies. In your response, include the following:
○ Sample Proportion of Yellow Candies: $\hat{p} = 0.205$
○ Interval Type: Confidence Interval
○ Requirements:
■ $n\hat{p}(1-\hat{p}) \ge 10$ → $4394(0.205)(1-0.205) \ge 10$ → $716.11 \ge 10$
■ $n \le 0.05N$ → $4394 \le 0.05N$ (N = all Skittles in the world)
○ Confidence Interval Estimate: (0.189, 0.221)
○ Margin of Error: 0.016
○ Interpretation: We are 99% confident that the population proportion of yellow Skittles
is between 0.189 and 0.221.
● Construct a 90% confidence interval estimate for the population mean number of
candies per bag. In your response, include the following:
○ Sample mean number of candies per bag: $\bar{x} = 59.4$
○ Interval Type: Confidence Interval?
○ Requirements:
■ Simple random sample and/or randomized experiment ✔
■ Sample size is small relative to the population ✔
■ Data comes from a population that is normally distributed ✔
○ Confidence Interval Estimate: (58.834, 59.923)
○ Margin of Error: 0.5445
○ Interpretation: We are 90% confident that the population mean number of candies per bag
is between 58.834 and 59.923.
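
The proportion interval in the list above can be rechecked with a few lines of Python. This is a minimal sketch using the standard one-proportion z-interval and only the values reported in the list (n = 4394 and a sample proportion of 0.205). The function name `proportion_ci` is made up for illustration, and whether StatCrunch applies exactly this formula is an assumption.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(p_hat, n, confidence):
    """Return (lower, upper, margin) for a one-proportion z-interval."""
    # Critical value z for the chosen confidence level (about 2.576 for 99%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin, margin


n = 4394        # total candies in the class sample (from the report)
p_hat = 0.205   # sample proportion of yellow candies (from the report)

# Requirement check: n * p_hat * (1 - p_hat) must be at least 10.
requirement = n * p_hat * (1 - p_hat)

lower, upper, margin = proportion_ci(p_hat, n, 0.99)
print(f"requirement value: {requirement:.2f}")
print(f"99% interval: ({lower:.3f}, {upper:.3f}), margin of error {margin:.3f}")
```

Rounded to three decimals, this reproduces the interval (0.189, 0.221) and the margin of error 0.016 given above.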

Team project part 4 - group portion


For many reasons we are often unable to study the entire population that we’re
interested in. For that reason we have to use a sample, which is never a perfect
representation of the total population. Different samples from the same population will give
different results. This is called sampling error. For this reason we use confidence intervals. They
communicate how accurate our estimate is likely to be. The width of our confidence interval
depends on how varied the population is and how large the sample is.
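
The role of variability and sample size shows up directly in the standard margin of error formulas. The worked numbers reuse the values reported in Part 4. The critical values (z of about 2.576 for 99%, t of about 1.666 for 90% with 73 degrees of freedom) are table lookups rather than figures from the report, and n = 74 bags is an assumption based on the 74 individuals who submitted data.

```latex
\[
E_{\text{proportion}} = z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}
  \approx 2.576\sqrt{\frac{0.205\,(1-0.205)}{4394}} \approx 0.016
\]
\[
E_{\text{mean}} = t_{\alpha/2}\,\frac{s}{\sqrt{n}}
  \approx 1.666 \cdot \frac{2.8}{\sqrt{74}} \approx 0.54
\]
```

The second value is close to the reported 0.5445; the small gap is consistent with the standard deviation having been rounded to 2.8. In both formulas a larger n shrinks the margin of error, while more variation in the data widens it.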

Term Project Part 5 reflection

The term project was to do a statistical analysis of the colors of candy found in a 2.17
ounce bag of original Skittles. We each were to buy our own bag and then count how many
candies we had of each color. Once we finished we compared our distribution to the class’s. It
was interesting to see how varied some of the results were. Some bags varied wildly in how
many they had of each color, probably because each bag is a small sample, where you’d expect
to see more variance. In my bag of candy, however, the colors were pretty evenly distributed. All
results were within 1% of what you’d expect if the colors were selected completely randomly.
The results were similar to the average of everyone’s bags.

This project helped me realize how much proper methodology matters when
studying groups. The size of the sample, the way you select it, etc. can have a large
influence on the results of your study. In the Skittles project it would have been easy to
cherry-pick a bag with an unusual color distribution and then say Skittles is
cheating us out of color X or that they only give us color Y.

In conclusion I was able to more thoroughly understand the importance of statistics in
our daily lives and how being oblivious to how probabilities work can have harsh consequences
for your finances, your health, and many other areas. If you’re going to take a large risk on
something, make sure you understand the probabilities as best as possible first.
