
Stephen Bate ME626 TMA 01

Q1 tasks (1a, 2a)
1a was adapted from task 2.2.1. I wanted to look into the variance in samples to draw conclusions as
to what that can tell me about the variance in the population (inferential statistics). Varying the
samples using Excel enabled me to gain a better perspective on the variance of the population.
Appendix 1, p2, p3
2a was adapted from task 3.1.2. I wanted to look at which variable could be used as the
independent/dependent variable, again using Excel as it was an efficient tool to show any relationship
visually and produce the statistics for me.
A systematic area sample was taken and used in both tasks, gathering 30 fallen leaves (see pic), four at a
time, from the area surrounding 5 dispersed trees of the same species. No damaged or clustered leaves were
taken, to prevent bias. (Damaged leaves may not have achieved their full growth; clustered leaves
may have fallen only from the stem or crown of the tree.) The data collected was entered into a
spreadsheet, recording the length and width of each leaf, creating usable information with real context that I
could ask questions of to obtain a better statistical understanding (see psss). Quantitative summaries
produced in Excel allowed me to see how the measures differed across my observations. I was able to
make alterations to the data to observe the effects, and to make handwritten calculations as well as using
the computer software to enhance my learning.

Q1b
While collecting my primary data I observed that most of the leaves appeared reasonably uniform, noting
just an occasional smaller one, but the descriptive statistics produced using ICT gave a clearer
indication of the differences. The range ( ) immediately showed that the spread in length was greater than
that in width, but because it used only the two extreme observations in the samples (the longest and
shortest measures), it could not tell me if this was indicative of the population. For this I needed
the variance, as it uses all of the observations. I also needed to look at what effect the sample size had
on estimating the population variance. One sample of 4 length measures gave a sample variance
of 15.72, compared with 12.67 for one larger sample of size 32 ( ). In general a closer estimate was
obtained, as expected, by using a larger sample, as the means of the samples get closer to the mean of
the population.
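
The difference between the range and the variance can be sketched in Python. The two lists below are made-up lengths in cm, not the measured leaves, and the helper name `value_range` is hypothetical. Both lists share the same longest and shortest values, so the range cannot tell them apart, whereas the sample variance (which uses every observation) can.

```python
import statistics

# Two hypothetical sets of lengths in cm (illustrative only, NOT the measured data).
# Both have the same smallest (6.0) and largest (18.0) observation.
clustered = [6.0, 11.0, 11.5, 12.0, 12.5, 18.0]
spread_out = [6.0, 7.0, 9.0, 15.0, 17.0, 18.0]

def value_range(data):
    """Range: uses only the two extreme observations."""
    return max(data) - min(data)

# The ranges are identical even though the middle values differ a lot.
print("range:", value_range(clustered), value_range(spread_out))

# The sample variance (n - 1 divisor) uses every observation.
print("variance:", statistics.variance(clustered), statistics.variance(spread_out))
```

In this made-up case the range is the same for both lists, while the variance of the second list is larger because its middle values sit further from the mean.
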
Imagining the total population to be only the 32 picked leaves would make the population variance
13.32. Examining the effect of taking numerous random samples of size 4 ( ) showed that
increasing the number of samples also gave a better approximation to the population.
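
A minimal sketch of this repeated-sampling idea in Python, assuming invented leaf lengths rather than the measured ones. The list `lengths` and the function `mean_sample_variance` are hypothetical names for illustration. It treats 32 values as the whole population, draws random samples of size 4 without replacement, and averages their sample variances.

```python
import random
import statistics

# Hypothetical leaf lengths in cm (illustrative only, NOT the measured data).
lengths = [
    12.1, 9.8, 15.3, 11.0, 13.7, 8.5, 14.2, 10.6,
    16.8, 12.9, 7.9, 13.1, 11.7, 18.0, 9.2, 14.9,
    10.1, 12.4, 15.8, 8.8, 13.5, 11.3, 17.2, 9.9,
    14.0, 12.7, 6.9, 16.1, 10.9, 13.3, 11.9, 15.0,
]

# Treating the 32 leaves as the whole population: divisor n.
variance_n = statistics.pvariance(lengths)
# The same 32 leaves treated as a sample: divisor n - 1.
variance_n_minus_1 = statistics.variance(lengths)

def mean_sample_variance(data, sample_size, n_samples, seed=0):
    """Average the (n - 1) sample variance over repeated random samples."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    variances = [
        statistics.variance(rng.sample(data, sample_size))
        for _ in range(n_samples)
    ]
    return statistics.mean(variances)

print(f"divisor n: {variance_n:.2f}, divisor n - 1: {variance_n_minus_1:.2f}")
for n_samples in (1, 10, 100, 1000):
    estimate = mean_sample_variance(lengths, 4, n_samples)
    print(f"{n_samples:5d} samples of size 4: mean variance {estimate:.2f}")
```

A single sample of 4 can land well away from the figure for all 32 values, while the average over many samples tends to settle down. Because the samples are drawn without replacement, the average settles near the n - 1 figure, which is slightly larger than the divisor-n population variance.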

I would usually find the mathematics of finding the variance of a sample straightforward. However,
having collected my own data and having the descriptive statistics shown on a spreadsheet presented
more problems for me than I had anticipated. In hindsight it may have been clearer for me just to print
out the length data, as that is what I decided to focus on. If I had a better knowledge of Excel I could
also have used computer simulation to generate more data and perhaps gained from a visual
interpretation, but in so doing I would be losing my context. The leaves I had taken might just as well have
been imaginary, but then the memory of the task would soon be lost. At one stage I confused myself
about where I was heading with sample size, as I was comparing taking samples containing varying
numbers of leaves with taking numerous samples of the same number of leaves. Using the software
presented its own problems, such as inputting data and using correct formulas, because that was new to
me, as was interpreting the data as opposed to finding a statistic or parameter.

Starting with the length and width as the dependent and independent variables, a scatter graph ( )
showed a linear relationship (positive and increasing), as it did when the variables were reversed. From the raw
data and the graph I was not able to tell whether the relationship was causal, so I had to build on
existing knowledge to make a judgement. Considering myself as the subject, I thought about how my
legs grew longer, as did my arms, as I aged, and that there would be a linear relationship between the
growth in the body parts, but only a linear and causal one if age was an explanatory variable. If I
were to introduce age as the explanatory variable to my data then that would be the cause of the
growth in both lengths. So by using existing real world knowledge I would conclude that there was no
causal effect between the length and width variables, although I could not say so with absolute certainty. If
there was no causation then the variables could be interchanged, so Y/X could become X/Y as shown
( ); otherwise we should not interpret X/Y. From looking at the regression graph I could see that it
would be dangerous to extrapolate beyond the first and last data points, as a length of 0cm would
give a nonsensical width of 2cm.
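
The regression in both directions, and the danger of extrapolating to a length of 0cm, can be sketched as follows. The (length, width) pairs are invented for illustration and are not the measured leaves, and `least_squares` is a hypothetical helper that fits an ordinary least-squares line by hand.

```python
import statistics

# Hypothetical (length, width) pairs in cm (illustrative only, NOT the measured data).
lengths = [8.0, 10.0, 12.0, 14.0, 16.0]
widths = [5.1, 5.9, 7.2, 7.8, 9.0]

def least_squares(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    mean_x = statistics.mean(x)
    mean_y = statistics.mean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Width regressed on length, then length regressed on width.
slope_wl, intercept_wl = least_squares(lengths, widths)
slope_lw, intercept_lw = least_squares(widths, lengths)

print(f"width  = {slope_wl:.3f} * length + {intercept_wl:.3f}")
print(f"length = {slope_lw:.3f} * width + {intercept_lw:.3f}")

# Extrapolating to a length of 0 cm just returns the intercept,
# which is a positive width for a leaf with no length.
print(f"predicted width at length 0: {intercept_wl:.3f} cm")
```

With these made-up values the fitted line predicts a positive width for a leaf of zero length, which shows why the line should only be read within the range of the observed data. Both directions give a positive slope, but the fit alone says nothing about which variable, if either, causes the other.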

In statistics there is difficulty distinguishing between dependent/independent and response/explanatory
variables. For me the use of the term causation helps to explain this much more clearly.
Why use a scatter graph and not a regression graph? Is one just a simpler version of the other? Should I
be thinking of the categories of data being used? Would that make a difference to the graph to use?
