Q.1
1. State the null hypothesis and the alternate hypothesis. Null Hypothesis – statement about the
value of a population parameter. Alternate Hypothesis – statement that is accepted if evidence
proves null hypothesis to be false.
2. Select the appropriate test statistic and level of significance. When testing a hypothesis about a
proportion, we use the z-statistic (z-test) with the formula $z = \dfrac{\hat{p} - p}{\sqrt{pq/n}}$, where
$\hat{p}$ is the sample proportion, $p$ is the hypothesized population proportion, $q = 1 - p$, and $n$ is
the sample size. When testing a hypothesis about a mean, we use either the z-statistic or the t-statistic
according to the following conditions. If the population standard deviation, σ, is known and either the
data are normally distributed or the sample size n > 30, we use the normal distribution (z-statistic).
When the population standard deviation, σ, is unknown and either the data are normally distributed or
the sample size is greater than 30 (n > 30), we use the t-distribution (t-statistic). A traditional
guideline for choosing the level of significance is as follows: (a) the 0.10 level for political polling,
(b) the 0.05 level for consumer research projects, and (c) the 0.01 level for quality assurance work.
3. State the decision rules. The decision rules state the conditions under which the null hypothesis
will be accepted or rejected. The critical value for the test-statistic is determined by the level of
significance. The critical value is the value that divides the non-reject region from the reject region.
4. Compute the appropriate test statistic and make the decision. When we use the z-statistic, we
use the formula $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$. When we use the t-statistic, we use the
formula $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$. Compare the computed test statistic with the critical
value. If the computed value is within the rejection region(s), we reject the null hypothesis;
otherwise, we do not reject the null hypothesis.
5. Interpret the decision. Based on the decision in Step 4, we state a conclusion in the context of
the original problem.
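
The five steps above can be sketched in code for the simplest case: a two-tailed z-test for a mean with known σ. This is a minimal illustration, not a general-purpose routine; the function name `one_sample_z_test` and the sample figures (mean 52, hypothesized mean 50, σ = 6, n = 36) are hypothetical values chosen only for the example.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed z-test for a mean, assuming the population sigma is known."""
    # Step 4: z = (x_bar - mu) / (sigma / sqrt(n))
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Step 3: the critical value separates the non-reject region from the reject region
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Decision: reject H0 when |z| falls in the rejection region
    reject = abs(z) > z_crit
    return z, z_crit, reject

# Hypothetical data: H0: mu = 50 versus H1: mu != 50
z, z_crit, reject = one_sample_z_test(sample_mean=52.0, mu0=50.0, sigma=6.0, n=36)
print(f"z = {z:.2f}, critical value = ±{z_crit:.2f}, reject H0: {reject}")
```

With these figures z = 2.00 exceeds the 0.05-level critical value of about 1.96, so the null hypothesis would be rejected.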

Q.2 Normal Curve

A normal curve is a bell-shaped curve which shows the probability distribution of a continuous
random variable. Moreover, the normal curve represents a normal distribution. The total area under
the normal curve logically represents the sum of all probabilities for a random variable. Hence, the
area under the normal curve is one. Also, the standard normal curve represents a normal curve
with mean 0 and standard deviation 1. Thus, the parameters involved in a normal distribution are
the mean (μ) and the standard deviation (σ).

Characteristics of a normal curve:

• The values of the mean, median and mode are the same.

• It represents a unimodal distribution as it has only one peak.

• It shows a symmetric distribution as 50% of the data set lies on the left side of the mean and 50%
of the data set lies on the right side of the mean.

• Empirical rule: about 68% of the data fall within μ ± σ, about 95% of the data fall within μ ± 2σ
and about 99.7% of the data fall within μ ± 3σ.
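
The empirical-rule percentages can be checked numerically as areas under the standard normal curve. The short sketch below uses Python's standard-library `NormalDist`; the helper name `area_within` is introduced here only for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

def area_within(k):
    """Area under the standard normal curve between -k and +k standard deviations."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within ±{k}σ: {area_within(k):.4f}")
```

The three areas come out at roughly 0.6827, 0.9545 and 0.9973, which are the values rounded to 68%, 95% and 99.7% in the rule.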

The pictorial representation of the normal curve is shown below:


Some of the examples for normal distribution are given below:

• Heights/weights of the subjects under study


• IQ scores of the students
• Test scores of the students

1. The normal curve is symmetrical:


The Normal Probability Curve (N.P.C.) is symmetrical about the ordinate of the central point of
the curve. It implies that the size, shape and slope of the curve on one side of the curve is identical
to that of the other.
2. The normal curve is unimodal:
Since there is only one point in the curve which has maximum frequency, the normal probability
curve is unimodal, i.e. it has only one mode.
3. Mean, median and mode coincide:
The mean, median and mode of the normal distribution are the same and they lie at the centre.
They are represented by 0 (zero) along the base line. [Mean = Median = Mode]
6. The height of the curve declines symmetrically:
In the normal probability curve the height declines symmetrically in either direction from the
maximum point. Hence the ordinates for values of X = µ ± K, where K is a real number, are equal.
7. The points of inflection occur at ± 1 standard deviation (± 1σ):
The normal curve changes its direction from convex to concave at a point known as a point of
inflection. If we draw perpendiculars from the two points of inflection of the curve to the horizontal
axis, they will meet the axis at a distance of one standard deviation unit above and below the
mean (± 1σ).
9. Normal curve is a smooth curve:
The normal curve is a smooth curve, not a histogram. It is moderately peaked. The percentile
coefficient of kurtosis of the normal curve is 0.263.
10. The normal curve is bilateral:
The 50% area of the curve lies to the left side of the maximum central ordinate and 50% lies to the
right side. Hence the curve is bilateral.
11. The normal curve is a mathematical model in behavioural sciences:
The curve is used as a measurement scale. The measurement unit of this scale is ± σ (the unit
standard deviation).
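
The location of the points of inflection stated in item 7 can be verified from the normal density. The derivation below is a short sketch that uses the standard form of the density with mean μ and standard deviation σ.

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)

f'(x) = -\frac{x-\mu}{\sigma^2}\, f(x)

f''(x) = \left(\frac{(x-\mu)^2}{\sigma^4} - \frac{1}{\sigma^2}\right) f(x)

f''(x) = 0 \iff (x-\mu)^2 = \sigma^2 \iff x = \mu \pm \sigma
```

Since f(x) is never zero, the second derivative changes sign only at x = μ ± σ, which is one standard deviation on either side of the mean.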

Q.4
Certain assumptions are associated with most non-parametric statistical tests, namely:
1. That the observations are independent;
2. The variable under study has underlying continuity;
3. Non-parametric procedures test different hypotheses about the population than parametric
procedures do;
Advantages of Non-Parametric Tests:
1. If the sample size is very small, there may be no alternative to using a non-parametric statistical
test unless the nature of the population distribution is known exactly.
2. Non-parametric tests typically make fewer assumptions about the data and may be more relevant
to a particular situation. In addition, the hypothesis tested by the non-parametric test may be more
appropriate for the research investigation.
3. Non-parametric statistical tests are available to analyze data which are inherently in ranks as
well as data whose seemingly numerical scores have the strength of ranks.
Disadvantages of Non-Parametric Tests:
1. If all of the assumptions of a parametric statistical method are, in fact, met in the data and the
research hypothesis could be tested with a parametric test, then non-parametric statistical tests are
wasteful.
2. The degree of wastefulness is expressed by the power-efficiency of the non-parametric test.
3. Another objection to non-parametric statistical tests is that they are not systematic, whereas
parametric statistical tests have been systematized, and different tests are simply variations on a
central theme.
Q.7
In two-way analysis of variance, two independent variables (factors) are studied simultaneously.
There are two main effects and one interaction (joint) effect on the dependent variable. In this
situation the analysis of variance is carried out in two ways, i.e. vertically as well as horizontally,
or in other words column-wise and row-wise.
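
The two main effects and the interaction effect can be written out as a model. The notation below is the usual textbook form for a balanced design with replication; the symbols (α for the row factor, β for the column factor, αβ for the interaction) are assumptions introduced here, since the notes do not fix a notation.

```latex
y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}

SS_{\text{total}} = SS_{A} + SS_{B} + SS_{AB} + SS_{\text{error}}

F_A = \frac{MS_A}{MS_{\text{error}}}, \qquad
F_B = \frac{MS_B}{MS_{\text{error}}}, \qquad
F_{AB} = \frac{MS_{AB}}{MS_{\text{error}}}
```

Each F ratio tests one effect: rows, columns, or their interaction.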
Advantages of Two-Way ANOVA
1. It is more efficient to study two factors simultaneously rather than separately.
2. We can reduce the residual variation in a model by including a second factor thought to influence
the response.
3. We can investigate interactions between factors.
Demerits or Limitations of Two Way ANOVA
The following limitations are found in this technique:
• When there are more than two classifications of a factor or factors under study, the F ratio
provides only a global picture of the differences among the main treatment effects. The inference
can be made specific by using the ‘t’ test when the F ratio is found significant for a treatment.

• This technique also follows the assumptions on which one-way analysis of variance is based. If
these assumptions are not fulfilled, the use of this technique may give spurious results.

• This technique is difficult and time consuming.
Q.9
In statistics, Mood's median test is a special case of Pearson's chi-squared test. It is
a nonparametric test that tests the null hypothesis that the medians of the populations from which
two or more samples are drawn are identical. The data in each sample are assigned to two groups,
one consisting of data whose values are higher than the median value in the two groups combined,
and the other consisting of data whose values are at the median or below.
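
The grouping described above leads to a 2 × k table of counts, to which the Pearson chi-squared statistic is applied with k − 1 degrees of freedom. The sketch below builds that table and computes the statistic; the function name `moods_median_statistic` and the two small samples are hypothetical, and the code assumes that at least one value lies above the combined median and at least one lies at or below it.

```python
from statistics import median

def moods_median_statistic(*samples):
    """Return the combined median, the 2 x k counts, and the chi-squared statistic."""
    grand_median = median([x for s in samples for x in s])
    # Row 1: values above the combined median; row 2: values at or below it
    above = [sum(1 for x in s if x > grand_median) for s in samples]
    at_or_below = [len(s) - a for s, a in zip(samples, above)]
    n_total = sum(len(s) for s in samples)
    row_totals = (sum(above), sum(at_or_below))
    chi2 = 0.0
    for j, s in enumerate(samples):
        for row_total, observed in zip(row_totals, (above[j], at_or_below[j])):
            # Expected count = row total * column total / grand total
            expected = row_total * len(s) / n_total
            chi2 += (observed - expected) ** 2 / expected
    return grand_median, above, at_or_below, chi2

# Hypothetical samples
group_a = [1, 2, 3, 4, 5]
group_b = [6, 7, 8, 9, 10]
grand_median, above, at_or_below, chi2 = moods_median_statistic(group_a, group_b)
print(grand_median, above, at_or_below, chi2)
```

Here the combined median is 5.5, every value of the first sample falls at or below it and every value of the second falls above it, so the statistic is large (10.0 on 1 degree of freedom).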
Q.10
An ogive is a graph showing the curve of a cumulative distribution function. The points plotted
are the upper class limits and the corresponding cumulative frequencies. The ogive for the normal
distribution resembles one side of an Arabesque or ogival arch.
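
The points of an ogive can be obtained by accumulating the class frequencies. The sketch below uses a made-up frequency table (the class limits and frequencies are hypothetical) and produces the (upper class limit, cumulative frequency) pairs that would be plotted.

```python
from itertools import accumulate

# Hypothetical frequency table: classes 0-10, 10-20, 20-30, 30-40, 40-50
upper_limits = [10, 20, 30, 40, 50]
frequencies = [3, 7, 12, 6, 2]

# Running total of the frequencies gives the cumulative frequency of each class
cumulative = list(accumulate(frequencies))
ogive_points = list(zip(upper_limits, cumulative))
print(ogive_points)
```

The last cumulative frequency equals the total number of observations, here 30.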
Q.11
The level of significance is defined as the probability of rejecting a null hypothesis by the test
when it is really true, which is denoted as α. That is, P (Type I error) = α.
Confidence level:
Confidence level refers to the probability that a parameter lies within a specified range of values,
which is denoted as c. Moreover, the confidence level is connected with the level of significance.
The relationship between level of significance and the confidence level is c=1−α.
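
The link c = 1 − α can be illustrated with a confidence interval for a mean. The sketch below assumes a known population σ and a normal sampling distribution; the function name `z_confidence_interval` and the figures used are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, alpha=0.05):
    """Confidence level c = 1 - alpha and the matching interval for the mean."""
    c = 1 - alpha
    # Two-sided critical value leaves alpha/2 in each tail
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z_crit * sigma / sqrt(n)
    return c, (sample_mean - margin, sample_mean + margin)

# Hypothetical data: sample mean 52, sigma 6, n 36, alpha 0.05
c, (low, high) = z_confidence_interval(sample_mean=52.0, sigma=6.0, n=36, alpha=0.05)
print(f"c = {c:.2f}, interval = ({low:.2f}, {high:.2f})")
```

A smaller α gives a higher confidence level and therefore a wider interval.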
Q.12
In a positive relationship, high values on one variable are associated with high values on the other
and low values on one are associated with low values on the other. ... On the other hand, a negative
relationship implies that high values on one variable are associated with low values on the other.
Q.13
The phi coefficient is a measure of association for two binary variables. Introduced by Karl
Pearson, this measure is similar to the Pearson correlation coefficient in its interpretation. In fact,
a Pearson correlation coefficient estimated for two binary variables will return the phi coefficient.
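
The equivalence between the phi coefficient and the Pearson correlation on binary data can be checked directly. The sketch below computes phi from the four cell counts of a 2 × 2 table and compares it with a hand-written Pearson correlation; the function names and the eight data pairs are hypothetical.

```python
from math import sqrt

def phi_from_table(a, b, c, d):
    """Phi for a 2 x 2 table with cells a (1,1), b (1,0), c (0,1), d (0,0)."""
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from deviations about the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical binary observations
x = [1, 1, 1, 0, 0, 0, 1, 0]
y = [1, 1, 0, 0, 0, 1, 1, 0]
pairs = list(zip(x, y))
a, b = pairs.count((1, 1)), pairs.count((1, 0))
c, d = pairs.count((0, 1)), pairs.count((0, 0))

phi = phi_from_table(a, b, c, d)
r = pearson_r(x, y)
print(phi, r)
```

Both calculations give 0.5 for this data set.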
Q.14
Multiple regression is an extension of simple linear regression. It is used when we want to predict
the value of a variable based on the value of two or more other variables. The variable we want to
predict is called the dependent variable (or sometimes, the outcome, target or criterion variable).
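
A minimal sketch of multiple regression with two predictors is given below. It fits the coefficients by solving the least-squares normal equations with plain Gauss-Jordan elimination, which is one simple way to do it and not necessarily how statistical software does; the function name `fit_multiple_regression` and the data are hypothetical, and the code assumes the predictors are not perfectly collinear.

```python
def fit_multiple_regression(X, y):
    """Return [intercept, b1, b2, ...] from the normal equations (X'X) b = X'y."""
    rows = [[1.0] + [float(v) for v in r] for r in X]  # prepend intercept column
    p = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        pivot_value = A[col][col]
        A[col] = [v / pivot_value for v in A[col]]
        for r in range(p):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][p] for i in range(p)]

# Hypothetical data generated exactly from y = 1 + 2*x1 + 3*x2
X = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
y = [9, 8, 19, 18, 26]
coefficients = fit_multiple_regression(X, y)
print([round(b, 6) for b in coefficients])
```

Because the data follow the line exactly, the fitted intercept and slopes recover 1, 2 and 3 up to rounding error.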
Q.15
An outlier is a data point that differs significantly from other observations. An outlier may be due
to variability in the measurement or it may indicate experimental error; the latter are sometimes
excluded from the data set. An outlier can cause serious problems in statistical analyses.
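
One common way to flag outliers, not mentioned in the notes and used here only as an illustration, is the 1.5 × IQR rule: values more than 1.5 interquartile ranges below the first quartile or above the third quartile are flagged. The function name `iqr_outliers` and the data are hypothetical.

```python
from statistics import quantiles

def iqr_outliers(data, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed rule, k = 1.5 by convention)."""
    q1, _, q3 = quantiles(data, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < low or x > high]

# Hypothetical measurements with one extreme value
data = [10, 12, 11, 13, 12, 11, 14, 12, 40]
print(iqr_outliers(data))
```

Only the value 40 is flagged in this data set.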
Q.16
Kurtosis is a measure of the “tailedness” of the probability distribution of a real-valued
random variable. The standard measure of kurtosis is based on a scaled version of the fourth
moment of the data or population.
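
The "scaled fourth moment" can be made concrete as the fourth central moment divided by the squared second central moment. The sketch below uses the population (divide-by-n) form; the function name `kurtosis` and the data are hypothetical.

```python
def kurtosis(data):
    """Fourth central moment divided by the squared variance (population form)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    return m4 / m2 ** 2

# Hypothetical data
data = [1, 2, 3, 4, 5]
print(kurtosis(data))
```

On this scale a normal distribution has kurtosis 3, so the value of about 1.7 obtained here indicates lighter tails than the normal curve.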
Q.17
The Wilcoxon signed rank test (also called the Wilcoxon signed rank sum test) is a non-parametric
test. ... The Wilcoxon matched-pairs signed rank test computes the difference between
each set of matched pairs, then follows the same procedure as the signed rank test to compare the
sample against some median.
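
The matched-pairs procedure can be sketched as follows: take the paired differences, rank their absolute values, and sum the ranks separately for positive and negative differences. The code assumes two common conventions that the notes do not state: zero differences are dropped and tied absolute differences share the average rank. The function name and the data are hypothetical.

```python
def wilcoxon_signed_rank(before, after):
    """Return (W+, W-), the rank sums of positive and negative paired differences."""
    # Differences for each matched pair; zero differences are dropped (assumed convention)
    diffs = [a - b for b, a in zip(before, after) if a != b]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over a run of tied absolute differences
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus

# Hypothetical matched pairs
before = [10, 12, 9, 14, 11, 13]
after = [12, 11, 13, 14, 16, 10]
w_plus, w_minus = wilcoxon_signed_rank(before, after)
print(w_plus, w_minus)
```

For these pairs the rank sums are 11 and 4; the smaller of the two is usually the one compared with the tabulated critical value.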
Q.18
The standard error is the standard deviation of the sampling distribution of a statistic, most
commonly the sample mean. It is a statistical term that measures the accuracy with which a sample
represents a population. In statistics, a sample mean deviates from the actual mean of a population;
the standard error measures the typical size of this deviation.
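
For the sample mean, the standard error is usually estimated as the sample standard deviation divided by the square root of the sample size. The sketch below applies that formula; the function name `standard_error` and the data are hypothetical.

```python
from math import sqrt
from statistics import stdev

def standard_error(sample):
    """Estimated standard error of the mean: s / sqrt(n)."""
    return stdev(sample) / sqrt(len(sample))

# Hypothetical sample
sample = [2, 4, 4, 4, 5, 5, 7, 9]
se = standard_error(sample)
print(f"{se:.4f}")
```

For this sample the estimate is about 0.756, and it shrinks as the sample size grows.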
