
Minitab material on test for Normality

NORMTEST

NORMTEST replaces %NORMPLOT.

Stat > Basic Statistics > Normality Test or Graph > Probability Plot

Command Syntax

NORMTEST C           Generates a normal probability plot
PTILES C or K...K    Specifies a set of reference percents
DVALUE K...K         Shows the percents at the reference x-scale positions
PERCENT              Specifies a percent y-scale
PROBABILITY          Specifies a probability y-scale
SCORES               Specifies a percentile y-scale
RJTEST               Specifies the Ryan-Joiner test (similar to Shapiro-Wilk test)
KSTEST               Specifies the Kolmogorov-Smirnov goodness-of-fit test
SPVALUE C            Stores the p-value of test
TITLE "text"         Specifies a graph title
GSAVE "file"         Saves the graph in a Minitab Graphics Format (MGF) file

Generates a normal probability plot. Normal plots use the values in the input column as x-values. The grid
on the graph resembles the grids found on normal probability paper. The horizontal axis is a linear scale.
The line forms an estimate of the cumulative distribution function for the population from which data are
drawn.
By default, an Anderson-Darling test for normality is performed and the numerical results are displayed
with the graph. You can also use a Ryan-Joiner test (similar to a Shapiro-Wilk test) or a
Kolmogorov-Smirnov test.
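
As a rough illustration of how the plotted points can be obtained, the sketch below sorts the data and
pairs each value with an estimated cumulative percent and the matching standard normal score. The
median-rank formula (i - 0.3)/(n + 0.4), the function name, and the sample values are assumptions made
for illustration; the text above does not state which plotting position Minitab uses.

```python
from statistics import NormalDist


def probability_plot_points(data):
    """Return (x, percent, normal_score) triples for a normal probability plot.

    Assumption: median-rank plotting positions (i - 0.3) / (n + 0.4).
    The source text does not say which plotting position Minitab applies.
    """
    xs = sorted(data)
    n = len(xs)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    points = []
    for i, x in enumerate(xs, start=1):
        p = (i - 0.3) / (n + 0.4)  # estimated cumulative probability of x
        points.append((x, 100.0 * p, std_normal.inv_cdf(p)))
    return points


# Hypothetical sample values, used only to show the shape of the output
for x, pct, z in probability_plot_points([2.1, -0.4, 0.9, 3.3, -1.7]):
    print(f"{x:6.2f}  {pct:6.2f}%  {z:7.3f}")
```

Plotting the data values against the normal scores on a linear scale gives the same picture as plotting
them against the percents on normal probability paper.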
Subcommands

PTILES C or K...K

Specifies a set of reference percents. The values must be between 0 and 100
when percents are used as the y-scale type, or between 0 and 1 when probability is the
y-scale type. Minitab marks each percent in the column with a horizontal
reference line on the plot, and marks each line with the percent value. Minitab
draws a vertical reference line where the horizontal reference line intersects
the line fit to the data, and marks this line with the estimated data value.

DVALUE K...K

Use DVALUE to show the percents at the reference x-scale positions.

PERCENT

Specifies a percent y-scale.

PROBABILITY

Specifies a probability y-scale.

SCORES

Specifies a percentile y-scale.

RJTEST
KSTEST

There are three types of goodness-of-fit test: a chi-square based test, an ECDF (empirical
cumulative distribution function) based test, and a correlation based test. By default, Minitab
uses the Anderson-Darling test, which is an ECDF based test. Use RJTEST to perform a
Ryan-Joiner test, which is a correlation based test; use KSTEST to perform a
Kolmogorov-Smirnov test, which is also an ECDF based test.

When your α-value is larger than the p-value displayed with the graph, you
should reject the hypothesis of normality. The α-value (also known as the
significance level) is the probability that you will reject the hypothesis of
normality when the hypothesis is true.

For example, if you are using an α-value of 0.10 and the p-value displayed in
the Graph window is 0.07, then you would reject the hypothesis of normality
at the 0.10 level.

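
The decision rule just described can be written as a one-line comparison. This is only a sketch; the
function name is hypothetical and is not part of Minitab.

```python
def reject_normality(p_value, alpha):
    """Return True when the hypothesis of normality is rejected at level alpha."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # Reject when the significance level exceeds the reported p-value
    return p_value < alpha


# The example from the text: significance level 0.10, displayed p-value 0.07
print(reject_normality(0.07, 0.10))
# The same p-value judged at a stricter level of 0.05
print(reject_normality(0.07, 0.05))
```

The same p-value can lead to different conclusions at different significance levels, which is why the
level should be chosen before looking at the output.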
SPVALUE C

Stores the p-value of the test.

TITLE "text"

Use TITLE to specify a title for the graph. When you omit this subcommand,
Minitab displays a default title.

GSAVE "filename"

Use GSAVE to save the graph in a Minitab Graphics Format (MGF) file.
Unless you specify a file extension or use a graphics format subcommand,
Minitab automatically adds the extension MGF to the file name. If you save
the plot, you can view it later with GVIEW and edit the plot with graph
editing tools. See GSAVE for more information on this subcommand.


Example of Normality Test



In an operating engine, parts of the crankshaft move up and down. AtoBDist is the distance (in mm) from
the actual (A) position of a point on the crankshaft to a baseline (B) position. To ensure production quality,
a manager took five measurements each working day in a car assembly plant, from September 28 through
October 15, and then ten per day from the 18th through the 25th.
You wish to see if these data follow a normal distribution, so you use Normality test.
1  Open the worksheet CRANKSH.MTW.

2  Choose Stat > Basic Statistics > Normality Test.

3  In Variable, enter AtoBDist. Click OK.

Graph window output: see below. This is a Minitab run on the dataset mentioned above.

Retrieving worksheet from file: '\\purple2\resource\wminitab14\Data\Cranksh.MTW'
Worksheet was saved on Fri Sep 12 2003

Results for: Cranksh.MTW

MTB > describe c1

Descriptive Statistics: AtoBDist

Variable    N  N*   Mean  SE Mean  StDev  Minimum      Q1  Median     Q3
AtoBDist  125   0  0.442    0.312  3.491   -7.303  -2.243   0.130  3.607

Variable  Maximum
AtoBDist    8.023

MTB > NormTest c1;
SUBC>   KSTest.

Probability Plot of AtoBDist

[Figure: normal probability plot of AtoBDist. Vertical axis: Percent; horizontal axis: AtoBDist.
Legend: Mean 0.4417, StDev 3.491, N 125, KS 0.094, P-Value <0.010]

MTB > normtest c1;
SUBC>   rjtest.

Probability Plot of AtoBDist

[Figure: normal probability plot of AtoBDist. Vertical axis: Percent; horizontal axis: AtoBDist.
Legend: Mean 0.4417, StDev 3.491, N 125, RJ 0.990, P-Value 0.066]

MTB > normtest c1

Probability Plot of AtoBDist

[Figure: normal probability plot of AtoBDist. Vertical axis: Percent; horizontal axis: AtoBDist.
Legend: Mean 0.4417, StDev 3.491, N 125, AD 0.891, P-Value 0.022]

If your data are perfectly normal, then the data points on the probability plot will form a straight line. The
red line forms an estimate of the cumulative distribution function for the population from which the data
are drawn.
Your text says that if the data are skewed to the left, the plot will rise rapidly at first and then level off,
while if they are skewed to the right the plot will rise slowly at first and fast later.

Interpreting the results


The graphical output is a plot of normal probabilities versus the data. The data depart from the fitted line
most evidently in the extremes, or distribution tails. The Anderson-Darling test's p-value indicates that, at
levels greater than 0.022, there is evidence that the data do not follow a normal distribution. There is a
slight tendency for these data to be lighter in the tails than a normal distribution because the smallest
points are below the line and the largest point is just above the line. A distribution with heavy tails would
show the opposite pattern at the extremes.
Many statistical procedures assume the data follow a normal distribution. In order to verify this
assumption, you can perform a normality test on your data.
MINITAB provides three normality tests that you can choose from:

Anderson-Darling - This test has good power and is especially effective at detecting departure
from normality in the high and low values of a distribution.

Ryan-Joiner (similar to Shapiro-Wilk) - This test also has good power. It is based on the
correlation between the sample data and the data one would expect from a normal distribution.

Kolmogorov-Smirnov - This is a popular test of normality, but tends to be less powerful than the
other two.
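
To make the two ECDF based tests more concrete, the sketch below computes the Anderson-Darling and
Kolmogorov-Smirnov statistics from their usual textbook definitions, using a normal distribution
fitted with the sample mean and standard deviation. The function names and sample values are
hypothetical, and the code computes the statistics only; the p-values Minitab reports rely on tables
or approximations that are not reproduced here, and Minitab's internal computation may differ in detail.

```python
import math
from statistics import NormalDist, mean, stdev


def fitted_cdf_values(data):
    """Sort the data and evaluate the normal CDF fitted by sample mean and StDev."""
    xs = sorted(data)
    fitted = NormalDist(mean(xs), stdev(xs))
    return [fitted.cdf(x) for x in xs]


def anderson_darling(data):
    """A^2 = -n - (1/n) * sum((2i - 1) * (ln F(x_i) + ln(1 - F(x_(n+1-i)))))."""
    f = fitted_cdf_values(data)
    n = len(f)
    # Note: math.log fails if a fitted CDF value rounds to exactly 0 or 1,
    # which can happen only for very extreme outliers.
    total = sum(
        (2 * i - 1) * (math.log(f[i - 1]) + math.log(1.0 - f[n - i]))
        for i in range(1, n + 1)
    )
    return -n - total / n


def kolmogorov_smirnov(data):
    """Largest vertical gap between the ECDF steps and the fitted normal CDF."""
    f = fitted_cdf_values(data)
    n = len(f)
    return max(
        max(i / n - f[i - 1], f[i - 1] - (i - 1) / n)
        for i in range(1, n + 1)
    )


# Hypothetical samples: one roughly symmetric, one strongly right-skewed
symmetric = [-1.9, -1.2, -0.8, -0.5, -0.2, 0.0, 0.2, 0.5, 0.8, 1.2, 1.9]
skewed = [0.1, 0.2, 0.2, 0.3, 0.4, 0.5, 0.7, 1.0, 1.6, 3.0, 9.0]
for name, sample in (("symmetric", symmetric), ("skewed", skewed)):
    print(name, round(anderson_darling(sample), 3), round(kolmogorov_smirnov(sample), 3))
```

The Anderson-Darling sum weights the logarithms of the smallest and largest fitted probabilities, which
is one way to see why it reacts strongly to departures in the tails, while the Kolmogorov-Smirnov
statistic looks only at the single largest gap.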

Choosing a normality test


You have a choice of hypothesis tests for testing normality:
Anderson-Darling test (the default), which is an ECDF (empirical cumulative distribution function)
based test

Ryan-Joiner test [4], [9] (similar to the Shapiro-Wilk test [10], [11]) which is a correlation based test

Kolmogorov-Smirnov test [8], an ECDF based test

The Anderson-Darling and Ryan-Joiner tests have similar power for detecting non-normality. The
Kolmogorov-Smirnov test has less power; see [3], [8], and [9] for discussions of these tests for normality.
The common null hypothesis for these three tests is H0: data follow a normal distribution. If the p-value of
the test is less than your α level, reject H0.
The results of each test are accompanied by a normal probability plot that can also help you determine
whether your data follow a normal distribution.
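
A correlation based statistic of the Ryan-Joiner kind can be sketched as the correlation between the
ordered data and approximate normal scores. The score formula (i - 3/8)/(n + 1/4), the function name,
and the sample values are assumptions for illustration; critical values and p-values are not
reproduced here.

```python
import math
from statistics import NormalDist


def ryan_joiner(data):
    """Correlation between the ordered data and approximate normal scores.

    Assumption: normal scores are taken at probabilities (i - 3/8) / (n + 1/4).
    """
    xs = sorted(data)
    n = len(xs)
    std_normal = NormalDist()
    scores = [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    x_bar = sum(xs) / n
    s_bar = sum(scores) / n
    sxy = sum((x - x_bar) * (s - s_bar) for x, s in zip(xs, scores))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sss = sum((s - s_bar) ** 2 for s in scores)
    if sxx == 0.0:
        raise ValueError("all data values are identical; correlation is undefined")
    return sxy / math.sqrt(sxx * sss)


# Hypothetical samples: one roughly symmetric, one strongly right-skewed
symmetric = [-1.9, -1.2, -0.8, -0.5, -0.2, 0.0, 0.2, 0.5, 0.8, 1.2, 1.9]
skewed = [0.1, 0.2, 0.2, 0.3, 0.4, 0.5, 0.7, 1.0, 1.6, 3.0, 9.0]
print(round(ryan_joiner(symmetric), 3))
print(round(ryan_joiner(skewed), 3))
```

Values close to 1 correspond to points that lie close to a straight line on the probability plot, so
for this statistic small values, not large ones, count as evidence against normality.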
The graphs below show data points generated randomly from a chi-squared distribution.
Copyright 2000-2005 Minitab Inc. All rights reserved.

Probability Plot of C1

[Figures: three normal probability plots of C1. Vertical axis: Percent; horizontal axis: C1.
All three legends report Mean 10.20, StDev 5.540, N 20, with these test results:
KS 0.122, P-Value >0.150
RJ 0.970, P-Value >0.100
AD 0.473, P-Value 0.217]

References: Basic Statistics

[1]  S.F. Arnold (1990). Mathematical Statistics. Prentice-Hall.
[2]  M.B. Brown and A.B. Forsythe (1974). "Robust Tests for the Equality of Variances," Journal of the
     American Statistical Association, 69, 364-367.
[3]  R.B. D'Agostino and M.A. Stephens, Eds. (1986). Goodness-of-Fit Techniques, Marcel Dekker.
[4]  J.J. Filliben (1975). "The Probability Plot Correlation Coefficient Test for Normality," Technometrics,
     17, 111.
[5]  T.P. Hettmansperger and S.J. Sheather (1986). "Confidence Intervals Based on Interpolated Order
     Statistics," Statistics and Probability Letters, 4, 75-79.
[6]  N.L. Johnson and S. Kotz (1969). Discrete Distributions, John Wiley & Sons.
[7]  H. Levene (1960). Contributions to Probability and Statistics, Stanford University Press.
[8]  H.W. Lilliefors (1967). "On the Kolmogorov-Smirnov Test for Normality with Mean and Variance
     Unknown," Journal of the American Statistical Association, 62, 399-402.
[9]  T.A. Ryan, Jr. and B.L. Joiner (1976). "Normal Probability Plots and Tests for Normality," Technical
     Report, Statistics Department, The Pennsylvania State University. (Available from Minitab Inc.)
[10] S.S. Shapiro and R.S. Francia (1972). "An Approximate Analysis of Variance Test for Normality,"
     Journal of the American Statistical Association, 67, 215-216.
[11] S.S. Shapiro and M.B. Wilk (1965). "An Analysis of Variance Test for Normality (Complete
     Samples)," Biometrika, 52, 591.
