
In Class Activity STA 6166 Fall 2007 4 October 2007 RAMIN SHAMSHIRI- UFID#:9021-3353

1.a These values cannot be considered a random sample from the population, since they represent only the last 20 fills. They are therefore biased and cannot be assumed to represent the population of all fills.

1.b The sampling distribution of $\bar{y}$ for a random sample of size 20 drawn from a population with mean $\mu$ and variance $\sigma^2$ has mean $\mu$ and variance $\sigma^2/20$. The underlying assumption is the central limit theorem: if a random sample of size $n$ is taken from any distribution with mean $\mu$ and variance $\sigma^2$, the sample mean is approximately normally distributed with mean $\mu$ and variance $\sigma^2/n$. The approximation improves as $n$ increases.
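The statement above can be checked with a small simulation. This is only a sketch: the Exponential(1) population (mean 1, variance 1), the replicate count, and the function name are assumptions chosen for illustration, not part of the original exercise.

```python
import random
import statistics


def simulate_sample_means(n=20, replicates=20000, seed=0):
    """Return the means of `replicates` samples of size n.

    The population is Exponential(1) (mean 1, variance 1), an assumed
    example picked because it is clearly skewed, unlike a normal curve.
    """
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(replicates)
    ]


means = simulate_sample_means()
mean_of_means = statistics.fmean(means)
var_of_means = statistics.variance(means)

# Theory: mean = mu = 1 and variance = sigma^2 / n = 1 / 20.
print(f"mean of sample means:     {mean_of_means:.3f} (theory 1.000)")
print(f"variance of sample means: {var_of_means:.4f} (theory {1 / 20:.4f})")
```

Even though the population is skewed, the simulated sample means should centre on the population mean with a variance close to $\sigma^2/n$.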

1.c This distribution is used for the sampling distribution of the sample variance. When a random sample is taken from a population with mean $\mu$ and variance $\sigma^2$, the sample variance is:

$$s^2 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n-1}$$

The sampling distribution of

$$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$$

is a chi-square distribution with $(n-1)$ degrees of freedom. The mean and variance of $s^2$ are:

Mean: $\mu_{s^2} = \sigma^2$
Variance: $\sigma^2_{s^2} = \frac{2\sigma^4}{n-1}$

A chi-square variable cannot be negative, and the distributions are positively skewed. At about 100 degrees of freedom, the chi-square distribution becomes somewhat symmetrical. The area under each chi-square distribution is equal to 1.00 or 100%.

Assumptions: The chi-square distribution is obtained from the values of $\chi^2 = (n-1)s^2/\sigma^2$ when random samples are selected from a normally distributed population whose variance is $\sigma^2$.
The sample must be randomly selected.
The population must be normally distributed for the variable under study.
The observations must be independent of each other.
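The mean and variance of $s^2$ can be illustrated by simulation. The sketch below assumes a standard normal population ($\sigma = 1$) and $n = 20$; the function names, seed, and replicate count are hypothetical choices made for the example.

```python
import random
import statistics


def sample_variance(values):
    """Unbiased sample variance: sum of squared deviations over (n - 1)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def simulate_sample_variances(n=20, sigma=1.0, replicates=20000, seed=1):
    """Sample variances of repeated normal samples (assumed population)."""
    rng = random.Random(seed)
    return [
        sample_variance([rng.gauss(0.0, sigma) for _ in range(n)])
        for _ in range(replicates)
    ]


n, sigma = 20, 1.0
variances = simulate_sample_variances(n, sigma)
mean_s2 = statistics.fmean(variances)
var_s2 = statistics.variance(variances)

# Theory for a normal population: E(s^2) = sigma^2, Var(s^2) = 2 sigma^4 / (n - 1).
print(f"mean of s^2:     {mean_s2:.3f} (theory {sigma ** 2:.3f})")
print(f"variance of s^2: {var_s2:.4f} (theory {2 * sigma ** 4 / (n - 1):.4f})")
```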

Ramin Shamshiri

STA6166, HW#3, Oct.04.2007

Page 1

1.d We know that in a normal distribution the three parameters mean, mode, and median are equal or approximately equal. In addition, from the empirical rule, we know that about 68% of the values fall within one standard deviation of the mean. From the exploratory analysis, we have:

Mean = 22.83, Median = 22.6, Mode = 22, SD = 1.33
Mean + SD = 24.16, Mean - SD = 21.5

From the box plot, we see that 50% of the data lie between 21.8 and 23.6. If the distribution is normal, about 68% of the data would fall in the interval from 21.5 to 24.16. The middle 50% of the data lying inside this wider interval is consistent with an approximately normal distribution. The shape of the distribution also shows that it is not exactly normal, but slightly skewed to the right.
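One way to make this comparison more concrete is to ask where the quartiles would sit if the data were exactly normal with the reported mean and SD, and compare them with the box-plot limits of 21.8 and 23.6. The sketch below assumes only the summary values quoted above; the raw data are not available here.

```python
from statistics import NormalDist

# Summary values reported in the exploratory analysis.
mean, sd = 22.83, 1.33
model = NormalDist(mu=mean, sigma=sd)

# Quartiles a perfectly normal population with these parameters would have.
q1 = model.inv_cdf(0.25)
q3 = model.inv_cdf(0.75)

# Share of a normal population within one SD of the mean (the "68%" rule).
within_one_sd = model.cdf(mean + sd) - model.cdf(mean - sd)

print(f"normal-model quartiles: {q1:.2f} to {q3:.2f}")
print(f"share within mean +/- SD: {within_one_sd:.4f}")
```

If the normal-model quartiles land close to the observed box-plot limits, that supports the "approximately normal" reading.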

1.e Testing the hypothesis that the true mean mpg for the car is greater than 26:

H0: $\mu \le 26$
HA: $\mu > 26$ (Claim)

Confidence level = 95% => level of significance ($\alpha$) = 5% or 0.05

From the exploratory analysis we have:
n = 20 => degrees of freedom = 19
SD = 1.33 (sample standard deviation)
Mean = 22.83

From the t-table, with $\alpha = 0.05$ and d.f. = 19, we have t = 1.7291.

Since the population standard deviation is unknown and the sample size is less than 30, the z-test is inappropriate for testing a hypothesis about the mean, so we use the t-test. The t-test is a statistical test for the mean of a population and is used when the population is normally or approximately normally distributed, $\sigma$ is unknown, and n < 30. We want to check whether the claim that the true mean is greater than 26 is valid. The t-test standardizes the sample mean into a t statistic:

$$t = \frac{\bar{y} - \mu_0}{s/\sqrt{n}} = \frac{22.83 - 26}{1.33/\sqrt{20}} \approx -10.65$$

The value of t is smaller than the critical value 1.7291, so we conclude that there is not enough evidence to show that the true mean is greater than 26. The null hypothesis is therefore not rejected, and the claim is not supported.
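The test can be sketched in a few lines. The critical value 1.7291 is taken from the t-table as quoted in the text rather than computed, and the function name is a hypothetical helper.

```python
import math


def t_statistic(sample_mean, mu0, sample_sd, n):
    """One-sample t statistic: (ybar - mu0) / (s / sqrt(n))."""
    return (sample_mean - mu0) / (sample_sd / math.sqrt(n))


# Table value for alpha = 0.05 (one-tailed) with 19 degrees of freedom,
# copied from the text; it is not derived in this sketch.
T_CRIT = 1.7291

t = t_statistic(sample_mean=22.83, mu0=26, sample_sd=1.33, n=20)

# Right-tailed test: reject H0 only when t exceeds the critical value.
reject_h0 = t > T_CRIT

print(f"t = {t:.2f}, critical value = {T_CRIT}, reject H0: {reject_h0}")
```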


The figure below shows that only a sample mean greater than 26.51 would provide enough evidence to declare that the true mean is greater than 26 at $\alpha = 0.05$:

$$t = \frac{\bar{y} - \mu_0}{s/\sqrt{n}} \Rightarrow 1.7291 = \frac{\bar{y} - 26}{1.33/\sqrt{20}} \Rightarrow \bar{y} = 26.51$$

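[Figure: t distribution with the acceptable area to the left of the critical value t = 1.72 (sample mean 26.51) and the rejecting area to its right. The observed sample mean 22.83 (t = -10.65) and the hypothesized mean 26 fall in the acceptable area.]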
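The critical sample mean quoted above comes from inverting the t statistic. A minimal sketch, assuming the table value 1.7291 and a hypothetical helper name:

```python
import math


def critical_sample_mean(mu0, sample_sd, n, t_crit):
    """Smallest sample mean that reaches the right-tail critical value."""
    return mu0 + t_crit * sample_sd / math.sqrt(n)


# 1.7291 is the t-table value for alpha = 0.05 and 19 degrees of freedom.
ybar_crit = critical_sample_mean(mu0=26, sample_sd=1.33, n=20, t_crit=1.7291)

print(f"sample mean needed to reject H0: {ybar_crit:.2f}")
```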

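1.f Calculating a 90% confidence interval for the true mean mpg of this car:

From the equation $t = \frac{\bar{y} - \mu}{s/\sqrt{n}}$, we conclude that the confidence interval in which the true mean can fall is:

$$\bar{y} - t_{\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{y} + t_{\alpha/2}\frac{s}{\sqrt{n}}$$

$$22.83 - t_{0.1/2}\frac{1.33}{\sqrt{20}} < \mu < 22.83 + t_{0.1/2}\frac{1.33}{\sqrt{20}}$$

$$22.83 - 0.5142 < \mu < 22.83 + 0.5142$$

$$22.32 < \mu < 23.34$$

[Figure: t distribution with the central $(1-\alpha)100\%$ confidence interval bounded by $-t_{\alpha/2}$ and $t_{\alpha/2}$.]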
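The interval arithmetic can be reproduced as follows. This sketch assumes the t-table value $t_{0.05,19} = 1.7291$ (the same value used for the one-tailed test, since a 90% interval leaves 5% in each tail); the function name is hypothetical.

```python
import math


def t_confidence_interval(sample_mean, sample_sd, n, t_crit):
    """Return (lower, upper) for ybar +/- t_crit * s / sqrt(n)."""
    margin = t_crit * sample_sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin


# t_crit is read from a t-table (alpha/2 = 0.05, df = 19), not computed here.
lower, upper = t_confidence_interval(
    sample_mean=22.83, sample_sd=1.33, n=20, t_crit=1.7291
)

print(f"90% CI for the mean: {lower:.2f} < mu < {upper:.2f}")
```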


1.g Testing the hypothesis that the population variance is greater than 2.1:

The chi-square test is used to test a claim about a single variance or standard deviation.

H0: $\sigma^2 \le 2.1$
HA: $\sigma^2 > 2.1$ (Claim)

Confidence level = 95% => level of significance ($\alpha$) = 5% or 0.05

From the sampling results we have:
n = 20 => degrees of freedom = 19
SD = 1.33 (sample standard deviation) => Var = $1.33^2 \approx 1.77$

From the chi-square table, with $\alpha = 0.05$ and d.f. = 19, we have $\chi^2 = 30.144$.

$$\chi^2 = \frac{(n-1)s^2}{\sigma^2} = \frac{(19)(1.77)}{2.1} \approx 16.0$$

Since the value of $\chi^2$ from the test is smaller than the value of $\chi^2$ from the table, there is not enough evidence that the population variance is greater than 2.1, so we do not reject the null hypothesis and the claim is not supported. If the sample variance were greater than 3.33, the chi-square test statistic would exceed 30.144, and we could declare that there is enough evidence that the population variance is greater than 2.1.
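A short sketch of this variance test follows. It assumes the sample variance is obtained by squaring the reported SD of 1.33 and that the critical value 30.144 is read from the chi-square table rather than computed; the function name is hypothetical.

```python
def chi_square_statistic(sample_var, sigma0_sq, n):
    """Chi-square statistic for a single variance: (n - 1) s^2 / sigma0^2."""
    return (n - 1) * sample_var / sigma0_sq


# Table value for alpha = 0.05 with 19 degrees of freedom, copied from the text.
CHI2_CRIT = 30.144

n, sample_sd, sigma0_sq = 20, 1.33, 2.1
sample_var = sample_sd ** 2  # assumption: variance taken as SD squared

chi2 = chi_square_statistic(sample_var, sigma0_sq, n)
reject_h0 = chi2 > CHI2_CRIT  # right-tailed test

# Sample variance that would be needed to reach the critical value.
var_threshold = CHI2_CRIT * sigma0_sq / (n - 1)

print(f"chi-square = {chi2:.2f}, reject H0: {reject_h0}")
print(f"sample variance needed to reject H0: {var_threshold:.2f}")
```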


1.h Testing that the population median differs from 25 mpg:

Since the median is defined as the middle value of the population, 50% of the population values are below it and 50% are above it. If we define the values above the median as successes, we have a sample from a binomial distribution with p = 0.5. A hypothesis test involving a population proportion can be considered a binomial experiment when there are only two outcomes and the probability of a success does not change from trial to trial. For the binomial distribution, we know that $\mu = np$ and $\sigma = \sqrt{np(1-p)}$. Since the normal distribution can be used to approximate the binomial distribution when $np \ge 5$ and $n(1-p) \ge 5$, the standard normal distribution can be used to test hypotheses for proportions:

$$z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}}$$

where $\hat{p} = X/n$ is the sample proportion. Here we have 20 × 0.5 = 10 > 5, so we can use this test.

H0: $P = 0.5$
HA: $P \ne 0.5$

Confidence level = 95%, so $\alpha = 0.05$; this is a two-tailed test, so $\alpha/2 = 0.025$. From the Z table, P(Z > 1.96) = 0.025 and P(Z < -1.96) = 0.025. We reject the null hypothesis if the z statistic is greater than 1.96 or smaller than -1.96, in other words, if |Z| > 1.96.

From the exploratory analysis, we see that we only have two counts of mpg with the value of 25, which means $\hat{p}$ = 2/20 = 10%. So we have:

$$z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} = \frac{0.1 - 0.5}{\sqrt{0.5 \times 0.5/20}} \approx -3.58$$

Here |Z| = 3.58, which is greater than 1.96, so we reject the null hypothesis and conclude that the median differs from 25 mpg.
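The z-test for a proportion used here can be sketched as below. The function name and return layout are hypothetical; the two-sided critical value is obtained from the standard library's `NormalDist` instead of a printed table.

```python
import math
from statistics import NormalDist


def proportion_z_test(p_hat, p0, n, alpha=0.05):
    """Two-sided z-test for a proportion.

    Returns (z, z_crit, reject_h0), where z_crit is the upper alpha/2
    point of the standard normal distribution.
    """
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return z, z_crit, abs(z) > z_crit


# Inputs as used in the text: 2 successes out of 20, tested against p = 0.5.
z, z_crit, reject_h0 = proportion_z_test(p_hat=2 / 20, p0=0.5, n=20)

print(f"z = {z:.2f}, critical value = {z_crit:.2f}, reject H0: {reject_h0}")
```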


2.a The claimed proportion is P = 10%, the proportion in the random sample is $\hat{p}$ = 17%, and n = 100. Checking whether the sample size is sufficiently large to use the normal distribution:

$np \ge 5$ => 100 × 0.17 = 17, which is greater than 5
$n(1-p) \ge 5$ => 100 × 0.83 = 83, which is greater than 5

Yes, the sample size is sufficiently large.

2.b This can be considered a binomial distribution: the incidence of paratuberculosis in Florida's beef cattle is equivalent to the proportion of successes. For the binomial distribution, $\mu = np$ and $\sigma = \sqrt{np(1-p)}$. So the shape of this distribution is approximately normal, the center is $\mu = np = 100 \times 0.17 = 17$, and the standard deviation is $\sigma = \sqrt{100 \times 0.17 \times 0.83} \approx 3.75$. Figure below:

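[Figure: normal curve centered at 17 with axis ticks at one, two, and three standard deviations on either side: 5.75, 9.5, 13.25, 17, 20.75, 24.5, 28.25.]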
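The sample-size check and the parameters of the approximating normal curve can be computed directly. This is a sketch using the values from the text; the variable names are illustrative, and the tick positions are simply the mean plus or minus one, two, and three standard deviations computed without intermediate rounding.

```python
import math

n, p = 100, 0.17  # sample size and sample proportion from the text

# Rule of thumb for the normal approximation to the binomial.
normal_ok = n * p >= 5 and n * (1 - p) >= 5

mu = n * p                           # centre of the distribution
sigma = math.sqrt(n * p * (1 - p))   # binomial standard deviation

# Axis positions at mu + k * sigma for k = -3, ..., 3.
ticks = [round(mu + k * sigma, 2) for k in range(-3, 4)]

print(f"normal approximation acceptable: {normal_ok}")
print(f"mu = {mu:.2f}, sigma = {sigma:.3f}")
print(f"ticks: {ticks}")
```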

2.c
H0: $P = 10\%$
HA: $P \ne 10\%$

Confidence level = 95%, so $\alpha = 0.05$; this is a two-tailed test, so $\alpha/2 = 0.025$. From the Z table, P(Z > 1.96) = 0.025 and P(Z < -1.96) = 0.025. We reject the null hypothesis if the z statistic is greater than 1.96 or smaller than -1.96, in other words, if |Z| > 1.96.

$$z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}} = \frac{0.17 - 0.1}{\sqrt{0.1 \times 0.9/100}} = 2.33$$

Since |Z| = 2.33 is greater than 1.96, the null hypothesis is rejected, which means that the true population proportion is different from 10%.
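An equivalent way to reach the same decision is through a two-sided p-value. The original solution uses the critical-value approach only, so the p-value below is an added illustration, computed with the standard library's `NormalDist`.

```python
import math
from statistics import NormalDist

p_hat, p0, n = 0.17, 0.10, 100  # sample proportion, claimed proportion, sample size

# Test statistic under H0, using p0 in the standard error.
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# Two-sided p-value: probability of a |Z| at least this large under H0.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

reject_h0 = p_value < 0.05

print(f"z = {z:.2f}, two-sided p-value = {p_value:.4f}, reject H0: {reject_h0}")
```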


2.d Confidence intervals and sample size for proportions:

$p$ = symbol for the population proportion
$\hat{p}$ = symbol for the sample proportion
X = number of sample units that possess the characteristic of interest
n = sample size
$\hat{p} = X/n$ and $\hat{q} = 1 - \hat{p}$

A confidence interval for a proportion must meet the criteria that $np \ge 5$ and $nq \ge 5$. To construct a confidence interval for the proportion, the maximum error of estimate must be used:

$$E = z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$

Confidence interval for the proportion:

$$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$

$$0.17 - 1.96\sqrt{\frac{0.17 \times 0.83}{100}} < p < 0.17 + 1.96\sqrt{\frac{0.17 \times 0.83}{100}}$$

$$0.17 - 0.074 < p < 0.17 + 0.074$$

$$0.096 < p < 0.244$$

This means that, at the 95% confidence level, the incidence of paratuberculosis in Florida's beef cattle lies in the range of 0.096 to 0.244.
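The interval can be reproduced with a short helper. This is a sketch of the normal-approximation interval described above; the function name is hypothetical, and the z value comes from `NormalDist` rather than a table.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(p_hat, n, confidence=0.95):
    """Normal-approximation interval: p_hat +/- z * sqrt(p_hat * q_hat / n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


lower, upper = proportion_confidence_interval(p_hat=0.17, n=100)

print(f"95% CI for the proportion: {lower:.3f} < p < {upper:.3f}")
```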

