
Statistics Wiki

Contents
1 Statistics
   1.1 Scope
      1.1.1 Mathematical statistics
   1.2 Overview
   1.3 Data collection
      1.3.1 Sampling
      1.3.2 Experimental and observational studies
   1.4 Types of data
   1.5 Terminology and theory of inferential statistics
      1.5.1 Statistics, estimators and pivotal quantities
      1.5.2 Null hypothesis and alternative hypothesis
      1.5.3 Error
      1.5.4 Interval estimation
      1.5.5 Significance
      1.5.6 Examples
   1.6 Misuse
      1.6.1 Misinterpretation: correlation
   1.7 History of statistical science
   1.8 Applications
      1.8.1 Applied statistics, theoretical statistics and mathematical statistics
      1.8.2 Machine learning and data mining
      1.8.3 Statistics in society
      1.8.4 Statistical computing
      1.8.5 Statistics applied to mathematics or the arts
   1.9 Specialized disciplines
   1.10 See also
   1.11 References
   1.12 Further reading
   1.13 External links

2 Portal:Statistics

3 List of fields of application of statistics
   3.1 See also

4 Business analytics
   4.1 Examples of application
   4.2 Types of analytics
   4.3 Basic domains within analytics
   4.4 History
   4.5 Challenges
   4.6 Competing on analytics
   4.7 See also
   4.8 References
   4.9 Further reading

5 Descriptive statistics
   5.1 Use in statistical analysis
      5.1.1 Univariate analysis
      5.1.2 Bivariate analysis
   5.2 References
   5.3 External links

6 Quality control
   6.1 Notable approaches to quality control
   6.2 Quality control in project management
   6.3 See also
   6.4 References
   6.5 Further reading
   6.6 External links

7 Operations research
   7.1 Overview
   7.2 History
      7.2.1 Historical origins
      7.2.2 Second World War
      7.2.3 After World War II
   7.3 Problems addressed
   7.4 Management science
      7.4.1 Related fields
      7.4.2 Applications
   7.5 Societies and journals
   7.6 See also
   7.7 References
   7.8 Further reading
      7.8.1 Classic books and articles
      7.8.2 Classic textbooks
      7.8.3 History
   7.9 External links

8 Machine learning
   8.1 Overview
      8.1.1 Types of problems and tasks
   8.2 History and relationships to other fields
      8.2.1 Relation to statistics
   8.3 Theory
   8.4 Approaches
      8.4.1 Decision tree learning
      8.4.2 Association rule learning
      8.4.3 Artificial neural networks
      8.4.4 Deep learning
      8.4.5 Inductive logic programming
      8.4.6 Support vector machines
      8.4.7 Clustering
      8.4.8 Bayesian networks
      8.4.9 Reinforcement learning
      8.4.10 Representation learning
      8.4.11 Similarity and metric learning
      8.4.12 Sparse dictionary learning
      8.4.13 Genetic algorithms
   8.5 Applications
   8.6 Ethics
   8.7 Software
      8.7.1 Free and open-source software
      8.7.2 Proprietary software with free and open-source editions
      8.7.3 Proprietary software
   8.8 Journals
   8.9 Conferences
   8.10 See also
   8.11 References
   8.12 Further reading
   8.13 External links

9 Statistical inference
   9.1 Introduction
   9.2 Models and assumptions
      9.2.1 Degree of models/assumptions
      9.2.2 Importance of valid models/assumptions
      9.2.3 Randomization-based models
   9.3 Paradigms for inference
      9.3.1 Frequentist inference
      9.3.2 Bayesian inference
      9.3.3 AIC-based inference
      9.3.4 Other paradigms for inference
   9.4 Inference topics
   9.5 See also
   9.6 Notes
   9.7 References
   9.8 Further reading
   9.9 External links

10 Correlation and dependence
   10.1 Pearson's product-moment coefficient
   10.2 Rank correlation coefficients
   10.3 Other measures of dependence among random variables
   10.4 Sensitivity to the data distribution
   10.5 Correlation matrices
   10.6 Common misconceptions
      10.6.1 Correlation and causality
      10.6.2 Correlation and linearity
   10.7 Bivariate normal distribution
   10.8 Partial correlation
   10.9 See also
   10.10 References
   10.11 Further reading
   10.12 External links

11 Regression analysis
   11.1 History
   11.2 Regression models
      11.2.1 Necessary number of independent measurements
      11.2.2 Statistical assumptions
   11.3 Underlying assumptions
   11.4 Linear regression
      11.4.1 General linear model
      11.4.2 Diagnostics
      11.4.3 Limited dependent variables
   11.5 Interpolation and extrapolation
   11.6 Nonlinear regression
   11.7 Power and sample size calculations
   11.8 Other methods
   11.9 Software
   11.10 See also
   11.11 References
   11.12 Further reading
   11.13 External links

12 Multivariate statistics
   12.1 Types of analysis
   12.2 Important probability distributions
   12.3 History
   12.4 Software and tools
   12.5 See also
   12.6 References
   12.7 Further reading
   12.8 External links

13 Data collection
   13.1 Importance
   13.2 Types
   13.3 Impact of faulty data
   13.4 References
   13.5 See also
   13.6 External links

14 Time series
   14.1 Methods for time series analyses
   14.2 Time series and panel data
   14.3 Analysis
      14.3.1 Motivation
      14.3.2 Exploratory analysis
      14.3.3 Curve fitting
      14.3.4 Function approximation
      14.3.5 Prediction and forecasting
      14.3.6 Classification
      14.3.7 Regression analysis
      14.3.8 Signal estimation
      14.3.9 Segmentation
   14.4 Models
      14.4.1 Notation
      14.4.2 Conditions
      14.4.3 Models
      14.4.4 Measures
   14.5 Visualization
      14.5.1 Overlapping charts
      14.5.2 Separated charts
   14.6 Applications
   14.7 Software
   14.8 See also
   14.9 References
   14.10 Further reading
   14.11 External links
   14.12 Text and image sources, contributors, and licenses
      14.12.1 Text
      14.12.2 Images
      14.12.3 Content license

Chapter 1

Statistics

More probability density is found as one gets closer to the expected (mean) value in a normal distribution. Statistics used in
standardized testing assessment are shown. The scales include standard deviations, cumulative percentages, percentile equivalents,
Z-scores, T-scores, standard nines, and percentages in standard nines.

Statistics is the study of the collection, analysis, interpretation, presentation, and organization of data.[1] In applying statistics to, e.g., a scientific, industrial, or social problem, it is conventional to begin with a statistical population or a statistical model of the process to be studied. Populations can be diverse topics such as "all people living in a country" or "every atom composing a crystal". Statistics deals with all aspects of data, including the planning of data collection in terms of the design of surveys and experiments.[1]
Some popular definitions are:
Merriam-Webster dictionary defines statistics as "classified facts representing the conditions of a people in a state – especially the facts that can be stated in numbers or in any other tabular or classified arrangement".[2]

Scatter plots are used in descriptive statistics to show the observed relationships between different variables.

Statistician Sir Arthur Lyon Bowley defines statistics as "Numerical statements of facts in any department of inquiry placed in relation to each other".[3]
When census data cannot be collected, statisticians collect data by developing specific experiment designs and survey samples. Representative sampling assures that inferences and conclusions can safely extend from the sample to the population as a whole. An experimental study involves taking measurements of the system under study, manipulating the system, and then taking additional measurements using the same procedure to determine if the manipulation has modified the values of the measurements. In contrast, an observational study does not involve experimental manipulation.
Two main statistical methodologies are used in data analysis: descriptive statistics, which summarizes data from a sample using indexes such as the mean or standard deviation, and inferential statistics, which draws conclusions from data that are subject to random variation (e.g., observational errors, sampling variation).[4] Descriptive statistics are most often concerned with two sets of properties of a distribution (sample or population): central tendency (or location) seeks to characterize the distribution's central or typical value, while dispersion (or variability) characterizes the extent to which members of the distribution depart from its center and each other. Inferences on mathematical statistics are made under the framework of probability theory, which deals with the analysis of random phenomena.
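To make the distinction concrete, here is a minimal sketch in Python (NumPy, with made-up sample values) contrasting a descriptive summary of a sample with a simple inferential statement about the population mean:

```python
import numpy as np

# Hypothetical sample of 10 measurements (made-up values)
sample = np.array([4.9, 5.1, 5.0, 4.8, 5.3, 5.2, 4.7, 5.0, 5.1, 4.9])

# Descriptive statistics: summarize the sample itself
mean = sample.mean()
sd = sample.std(ddof=1)          # sample standard deviation
print(f"sample mean = {mean:.3f}, sample sd = {sd:.3f}")

# Inferential statistics: use the sample to say something about the
# population, acknowledging sampling variation. A rough 95% confidence
# interval for the population mean (normal approximation):
se = sd / np.sqrt(len(sample))   # standard error of the mean
print(f"approx. 95% CI: ({mean - 1.96*se:.3f}, {mean + 1.96*se:.3f})")
```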
A standard statistical procedure involves the test of the relationship between two statistical data sets, or a data set and synthetic data drawn from an idealized model. A hypothesis is proposed for the statistical relationship between the two data sets, and this is compared as an alternative to an idealized null hypothesis of no relationship between two data sets. Rejecting or disproving the null hypothesis is done using statistical tests that quantify the sense in which the null can be proven false, given the data that are used in the test. Working from a null hypothesis, two basic forms of error are recognized: Type I errors (null hypothesis is falsely rejected, giving a "false positive") and Type II errors (null hypothesis fails to be rejected and an actual difference between populations is missed, giving a "false negative").[5] Multiple problems have come to be associated with this framework, ranging from obtaining a sufficient sample size to specifying an adequate null hypothesis.
Measurement processes that generate statistical data are also subject to error. Many of these errors are classified as random (noise) or systematic (bias), but other types of errors (e.g., blunder, such as when an analyst reports incorrect units) can also be important. The presence of missing data and/or censoring may result in biased estimates, and specific techniques have been developed to address these problems.
Statistics can be said to have begun in ancient civilization, going back at least to the 5th century BC, but it was not until the 18th century that it started to draw more heavily from calculus and probability theory. Statistics continues to be an area of active research, for example on the problem of how to analyze Big data.

1.1 Scope
Statistics is a mathematical body of science that pertains to the collection, analysis, interpretation or explanation, and presentation of data,[6] or as a branch of mathematics.[7] Some consider statistics to be a distinct mathematical science rather than a branch of mathematics. While many scientific investigations make use of data, statistics is concerned with the use of data in the context of uncertainty and decision making in the face of uncertainty.[8][9]

1.1.1 Mathematical statistics

Main article: Mathematical statistics

Mathematical statistics is the application of mathematics to statistics, which was originally conceived as the science of the state – the collection and analysis of facts about a country: its economy, land, military, population, and so forth. Mathematical techniques used for this include mathematical analysis, linear algebra, stochastic analysis, differential equations, and measure-theoretic probability theory.[10][11]

1.2 Overview
In applying statistics to a problem, it is common practice to start with a population or process to be studied. Populations
can be diverse topics such as all persons living in a country or every atom composing a crystal.
Ideally, statisticians compile data about the entire population (an operation called census). This may be organized
by governmental statistical institutes. Descriptive statistics can be used to summarize the population data. Numerical descriptors include mean and standard deviation for continuous data types (like income), while frequency and
percentage are more useful in terms of describing categorical data (like race).
When a census is not feasible, a chosen subset of the population called a sample is studied. Once a sample that
is representative of the population is determined, data is collected for the sample members in an observational or
experimental setting. Again, descriptive statistics can be used to summarize the sample data. However, the drawing of
the sample has been subject to an element of randomness, hence the established numerical descriptors from the sample
are also prone to uncertainty. To still draw meaningful conclusions about the entire population, inferential statistics
is needed. It uses patterns in the sample data to draw inferences about the population represented, accounting for
randomness. These inferences may take the form of: answering yes/no questions about the data (hypothesis testing),
estimating numerical characteristics of the data (estimation), describing associations within the data (correlation) and
modeling relationships within the data (for example, using regression analysis). Inference can extend to forecasting,
prediction and estimation of unobserved values either in or associated with the population being studied; it can include
extrapolation and interpolation of time series or spatial data, and can also include data mining.


1.3 Data collection


1.3.1 Sampling

When full census data cannot be collected, statisticians collect sample data by developing specific experiment designs and survey samples. Statistics itself also provides tools for prediction and forecasting through the use of data and statistical models. To use a sample as a guide to an entire population, it is important that it truly represents the overall population. Representative sampling assures that inferences and conclusions can safely extend from the sample to the population as a whole. A major problem lies in determining the extent that the sample chosen is actually representative. Statistics offers methods to estimate and correct for any bias within the sample and data collection procedures. There are also methods of experimental design for experiments that can lessen these issues at the outset of a study, strengthening its capability to discern truths about the population.
Sampling theory is part of the mathematical discipline of probability theory. Probability is used in mathematical statistics to study the sampling distributions of sample statistics and, more generally, the properties of statistical procedures. The use of any statistical method is valid when the system or population under consideration satisfies the assumptions of the method. The difference in point of view between classic probability theory and sampling theory is, roughly, that probability theory starts from the given parameters of a total population to deduce probabilities that pertain to samples. Statistical inference, however, moves in the opposite direction – inductively inferring from samples to the parameters of a larger or total population.

1.3.2 Experimental and observational studies

A common goal for a statistical research project is to investigate causality, and in particular to draw a conclusion on the effect of changes in the values of predictors or independent variables on dependent variables. There are two major types of causal statistical studies: experimental studies and observational studies. In both types of studies, the effect of differences of an independent variable (or variables) on the behavior of the dependent variable are observed. The difference between the two types lies in how the study is actually conducted. Each can be very effective. An experimental study involves taking measurements of the system under study, manipulating the system, and then taking additional measurements using the same procedure to determine if the manipulation has modified the values of the measurements. In contrast, an observational study does not involve experimental manipulation. Instead, data are gathered and correlations between predictors and response are investigated. While the tools of data analysis work best on data from randomized studies, they are also applied to other kinds of data – like natural experiments and observational studies[12] – for which a statistician would use a modified, more structured estimation method (e.g., difference in differences estimation and instrumental variables, among many others) that produces consistent estimators.
Experiments
The basic steps of a statistical experiment are:
1. Planning the research, including finding the number of replicates of the study, using the following information: preliminary estimates regarding the size of treatment effects, alternative hypotheses, and the estimated experimental variability. Consideration of the selection of experimental subjects and the ethics of research is necessary. Statisticians recommend that experiments compare (at least) one new treatment with a standard treatment or control, to allow an unbiased estimate of the difference in treatment effects.
2. Design of experiments, using blocking to reduce the influence of confounding variables, and randomized assignment of treatments to subjects to allow unbiased estimates of treatment effects and experimental error. At this stage, the experimenters and statisticians write the experimental protocol that will guide the performance of the experiment and which specifies the primary analysis of the experimental data.
3. Performing the experiment following the experimental protocol and analyzing the data following the experimental protocol.
4. Further examining the data set in secondary analyses, to suggest new hypotheses for future study.
5. Documenting and presenting the results of the study.


Experiments on human behavior have special concerns. The famous Hawthorne study examined changes to the working environment at the Hawthorne plant of the Western Electric Company. The researchers were interested in determining whether increased illumination would increase the productivity of the assembly line workers. The researchers first measured the productivity in the plant, then modified the illumination in an area of the plant and checked if the changes in illumination affected productivity. It turned out that productivity indeed improved (under the experimental conditions). However, the study is heavily criticized today for errors in experimental procedures, specifically for the lack of a control group and blindness. The Hawthorne effect refers to the finding that an outcome (in this case, worker productivity) changed due to observation itself. Those in the Hawthorne study became more productive not because the lighting was changed but because they were being observed.[13]
Observational study
An example of an observational study is one that explores the association between smoking and lung cancer. This
type of study typically uses a survey to collect observations about the area of interest and then performs statistical
analysis. In this case, the researchers would collect observations of both smokers and non-smokers, perhaps through
a case-control study, and then look for the number of cases of lung cancer in each group.

1.4 Types of data


Main articles: Statistical data type and Levels of measurement
Various attempts have been made to produce a taxonomy of levels of measurement. The psychophysicist Stanley Smith Stevens defined nominal, ordinal, interval, and ratio scales. Nominal measurements do not have meaningful rank order among values, and permit any one-to-one transformation. Ordinal measurements have imprecise differences between consecutive values, but have a meaningful order to those values, and permit any order-preserving transformation. Interval measurements have meaningful distances between measurements defined, but the zero value is arbitrary (as in the case with longitude and temperature measurements in Celsius or Fahrenheit), and permit any linear transformation. Ratio measurements have both a meaningful zero value and the distances between different measurements defined, and permit any rescaling transformation.
Because variables conforming only to nominal or ordinal measurements cannot be reasonably measured numerically, sometimes they are grouped together as categorical variables, whereas ratio and interval measurements are grouped together as quantitative variables, which can be either discrete or continuous, due to their numerical nature. Such distinctions can often be loosely correlated with data type in computer science, in that dichotomous categorical variables may be represented with the Boolean data type, polytomous categorical variables with arbitrarily assigned integers in the integral data type, and continuous variables with the real data type involving floating point computation. But the mapping of computer science data types to statistical data types depends on which categorization of the latter is being implemented.
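As a rough illustration of that mapping (a sketch only; the variable names are hypothetical), a record with one variable of each of Stevens' four levels might be typed in Python as follows:

```python
from dataclasses import dataclass
from enum import Enum

class Smoker(Enum):          # nominal: categories with no rank order
    NO = 0
    YES = 1

@dataclass
class Subject:
    smoker: Smoker           # nominal  -> Boolean-like / enum type
    pain_grade: int          # ordinal  -> integers 1..5; order matters,
                             #             but differences do not
    temp_celsius: float      # interval -> zero point is arbitrary
    income: float            # ratio    -> true zero; "twice as much"
                             #             is meaningful

s = Subject(Smoker.YES, pain_grade=3, temp_celsius=36.8, income=42000.0)
print(s)
```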
Other categorizations have been proposed. For example, Mosteller and Tukey (1977)[14] distinguished grades, ranks, counted fractions, counts, amounts, and balances. Nelder (1990)[15] described continuous counts, continuous ratios, count ratios, and categorical modes of data. See also Chrisman (1998),[16] van den Berg (1991).[17]
The issue of whether or not it is appropriate to apply different kinds of statistical methods to data obtained from different kinds of measurement procedures is complicated by issues concerning the transformation of variables and the precise interpretation of research questions. "The relationship between the data and what they describe merely reflects the fact that certain kinds of statistical statements may have truth values which are not invariant under some transformations. Whether or not a transformation is sensible to contemplate depends on the question one is trying to answer" (Hand, 2004, p. 82).[18]

1.5 Terminology and theory of inferential statistics


1.5.1 Statistics, estimators and pivotal quantities

Consider independent identically distributed (IID) random variables with a given probability distribution: standard statistical inference and estimation theory defines a random sample as the random vector given by the column vector of these IID variables.[19] The population being examined is described by a probability distribution that may have unknown parameters.
A statistic is a random variable that is a function of the random sample, but not a function of unknown parameters. The probability distribution of the statistic, though, may have unknown parameters.
Consider now a function of the unknown parameter: an estimator is a statistic used to estimate such a function. Commonly used estimators include sample mean, unbiased sample variance and sample covariance.
A random variable that is a function of the random sample and of the unknown parameter, but whose probability distribution does not depend on the unknown parameter, is called a pivotal quantity or pivot. Widely used pivots include the z-score, the chi square statistic and Student's t-value.
Between two estimators of a given parameter, the one with lower mean squared error is said to be more efficient. Furthermore, an estimator is said to be unbiased if its expected value is equal to the true value of the unknown parameter being estimated, and asymptotically unbiased if its expected value converges in the limit to the true value of such a parameter.
Other desirable properties for estimators include: UMVUE estimators that have the lowest variance for all possible values of the parameter to be estimated (this is usually an easier property to verify than efficiency) and consistent estimators which converge in probability to the true value of such a parameter.
This still leaves the question of how to obtain estimators in a given situation and carry out the computation; several methods have been proposed: the method of moments, the maximum likelihood method, the least squares method and the more recent method of estimating equations.
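A brief numerical sketch of the distinction (simulated data with an assumed true mean of 10): the sample mean and unbiased sample variance below are statistics serving as estimators, while the t-statistic is, under normality, a pivotal quantity, since its distribution does not depend on the unknown mean and variance.

```python
import numpy as np

rng = np.random.default_rng(0)
sample = rng.normal(loc=10.0, scale=2.0, size=25)  # IID draws; in practice
                                                   # loc and scale are unknown

xbar = sample.mean()                # estimator of the population mean
s2 = sample.var(ddof=1)             # unbiased estimator of the variance

# Pivotal quantity: t = (xbar - mu) / (s / sqrt(n)) follows a Student's t
# distribution with n-1 degrees of freedom, whatever mu and sigma are.
mu_hypothesized = 10.0
t = (xbar - mu_hypothesized) / np.sqrt(s2 / len(sample))
print(f"xbar = {xbar:.3f}, s2 = {s2:.3f}, t = {t:.3f}")
```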

1.5.2 Null hypothesis and alternative hypothesis

Interpretation of statistical information can often involve the development of a null hypothesis, which is usually (but not necessarily) that no relationship exists among variables or that no change occurred over time.[20][21]
The best illustration for a novice is the predicament encountered by a criminal trial. The null hypothesis, H0, asserts that the defendant is innocent, whereas the alternative hypothesis, H1, asserts that the defendant is guilty. The indictment comes because of suspicion of the guilt. The H0 (status quo) stands in opposition to H1 and is maintained unless H1 is supported by evidence "beyond a reasonable doubt". However, "failure to reject H0" in this case does not imply innocence, but merely that the evidence was insufficient to convict. So the jury does not necessarily accept H0 but fails to reject H0. While one can not prove a null hypothesis, one can test how close it is to being true with a power test, which tests for type II errors.
What statisticians call an alternative hypothesis is simply a hypothesis that contradicts the null hypothesis.

1.5.3 Error

Working from a null hypothesis, two basic forms of error are recognized:
Type I errors where the null hypothesis is falsely rejected, giving a "false positive".
Type II errors where the null hypothesis fails to be rejected and an actual difference between populations is missed, giving a "false negative".
Standard deviation refers to the extent to which individual observations in a sample differ from a central value, such as the sample or population mean, while standard error refers to an estimate of the difference between the sample mean and the population mean.
A statistical error is the amount by which an observation differs from its expected value; a residual is the amount an observation differs from the value the estimator of the expected value assumes on a given sample (also called prediction).
Mean squared error is used for obtaining efficient estimators, a widely used class of estimators. Root mean square error is simply the square root of mean squared error.

A least squares fit: in red the points to be fitted, in blue the fitted line.

Many statistical methods seek to minimize the residual sum of squares, and these are called "methods of least squares" in contrast to least absolute deviations. The latter gives equal weight to small and big errors, while the former gives more weight to large errors. Residual sum of squares is also differentiable, which provides a handy property for doing regression. Least squares applied to linear regression is called the ordinary least squares method, and least squares applied to nonlinear regression is called non-linear least squares. Also, in a linear regression model the non-deterministic part of the model is called the error term, disturbance, or more simply noise. Both linear regression and non-linear regression are addressed in polynomial least squares, which also describes the variance in a prediction of the dependent variable (y axis) as a function of the independent variable (x axis) and the deviations (errors, noise, disturbances) from the estimated (fitted) curve.
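For instance, an ordinary least squares line can be fitted in a few lines of Python (a minimal sketch with made-up points, using NumPy's least-squares solver):

```python
import numpy as np

# Made-up data points to be fitted
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 1.9, 3.2, 3.8, 5.1])

# Design matrix for the line y = a + b*x
A = np.column_stack([np.ones_like(x), x])

# Ordinary least squares: minimize the residual sum of squares ||y - A c||^2
coeffs, rss, rank, sv = np.linalg.lstsq(A, y, rcond=None)
a, b = coeffs
print(f"fitted line: y = {a:.3f} + {b:.3f} x")

residuals = y - (a + b * x)   # residuals: observed minus fitted values
print("residual sum of squares:", (residuals ** 2).sum())
```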
Measurement processes that generate statistical data are also subject to error. Many of these errors are classified as random (noise) or systematic (bias), but other types of errors (e.g., blunder, such as when an analyst reports incorrect units) can also be important. The presence of missing data and/or censoring may result in biased estimates, and specific techniques have been developed to address these problems.[22]

1.5.4 Interval estimation

Main article: Interval estimation

Confidence intervals: the red line is the true value for the mean in this example; the blue lines are random confidence intervals for 100 realizations.

Most studies only sample part of a population, so results do not fully represent the whole population. Any estimates obtained from the sample only approximate the population value. Confidence intervals allow statisticians to express how closely the sample estimate matches the true value in the whole population. Often they are expressed as 95% confidence intervals. Formally, a 95% confidence interval for a value is a range where, if the sampling and analysis were repeated under the same conditions (yielding a different dataset), the interval would include the true (population) value in 95% of all possible cases. This does not imply that the probability that the true value is in the confidence interval is 95%. From the frequentist perspective, such a claim does not even make sense, as the true value is not a random variable. Either the true value is or is not within the given interval. However, it is true that, before any data are sampled and given a plan for how to construct the confidence interval, the probability is 95% that the yet-to-be-calculated interval will cover the true value: at this point, the limits of the interval are yet-to-be-observed random variables. One approach that does yield an interval that can be interpreted as having a given probability of containing the true value is to use a credible interval from Bayesian statistics: this approach depends on a different way of interpreting what is meant by "probability", that is, as a Bayesian probability.
In principle confidence intervals can be symmetrical or asymmetrical. An interval can be asymmetrical because it works as a lower or upper bound for a parameter (left-sided or right-sided interval), but it can also be asymmetrical because the two-sided interval is built violating symmetry around the estimate. Sometimes the bounds for a confidence interval are reached asymptotically, and these are used to approximate the true bounds.
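The frequentist reading of "95%" can be checked by simulation (a sketch assuming a true mean of 0 and a normal-approximation interval): over many repeated samples, roughly 95% of the computed intervals should cover the true value.

```python
import numpy as np

rng = np.random.default_rng(42)
true_mean, n, trials = 0.0, 30, 10_000

covered = 0
for _ in range(trials):
    sample = rng.normal(true_mean, 1.0, size=n)
    se = sample.std(ddof=1) / np.sqrt(n)          # standard error of the mean
    lo, hi = sample.mean() - 1.96 * se, sample.mean() + 1.96 * se
    covered += (lo <= true_mean <= hi)            # did this interval cover?

print(f"coverage over {trials} trials: {covered / trials:.3f}")  # about 0.95
```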

1.5.5 Significance

Main article: Statistical significance


Statistics rarely give a simple yes/no answer to the question under analysis. Interpretation often comes down to the level of statistical significance applied to the numbers, and often refers to the probability of a value accurately rejecting the null hypothesis (sometimes referred to as the p-value).
The standard approach[19] is to test a null hypothesis against an alternative hypothesis. A critical region is the set of values of the estimator that leads to refuting the null hypothesis. The probability of type I error is therefore the probability that the estimator belongs to the critical region given that the null hypothesis is true (statistical significance), and the probability of type II error is the probability that the estimator does not belong to the critical region given that the alternative hypothesis is true. The statistical power of a test is the probability that it correctly rejects the null hypothesis when the null hypothesis is false.
Referring to statistical significance does not necessarily mean that the overall result is significant in real-world terms. For example, in a large study of a drug it may be shown that the drug has a statistically significant but very small beneficial effect, such that the drug is unlikely to help the patient noticeably.
While in principle the acceptable level of statistical significance may be subject to debate, the p-value is the smallest significance level that allows the test to reject the null hypothesis. This is logically equivalent to saying that the p-value is the probability, assuming the null hypothesis is true, of observing a result at least as extreme as the test statistic. Therefore, the smaller the p-value, the lower the probability of committing type I error.
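As a concrete sketch (made-up data; SciPy's one-sample t-test), the p-value for a null hypothesis about a mean can be computed directly:

```python
import numpy as np
from scipy import stats

# Made-up measurements; null hypothesis H0: population mean = 5.0
sample = np.array([5.2, 5.4, 4.9, 5.6, 5.3, 5.5, 5.1, 5.7])

t_stat, p_value = stats.ttest_1samp(sample, popmean=5.0)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")

# Reject H0 at the 5% significance level if p < 0.05. Note this controls
# the Type I error rate; it says nothing about the size of the effect.
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```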
Some problems are usually associated with this framework (see criticism of hypothesis testing):

A p-value (shaded green area) is the probability of an observed (or more extreme) result assuming that the null hypothesis is true. In this graph the black line is the probability distribution for the test statistic, the critical region is the set of values to the right of the observed data point (the observed value of the test statistic) and the p-value is represented by the green area. Note that Pr(observation | hypothesis) ≠ Pr(hypothesis | observation): the probability of observing a result given that some hypothesis is true is not equivalent to the probability that a hypothesis is true given that some result has been observed, and using the p-value as such a score commits the fallacy of the transposed conditional.

A difference that is highly statistically significant can still be of no practical significance, but it is possible to properly formulate tests to account for this. One response involves going beyond reporting only the significance level to include the p-value when reporting whether a hypothesis is rejected or accepted. The p-value, however, does not indicate the size or importance of the observed effect and can also seem to exaggerate the importance of minor differences in large studies. A better and increasingly common approach is to report confidence intervals. Although these are produced from the same calculations as those of hypothesis tests or p-values, they describe both the size of the effect and the uncertainty surrounding it.
Fallacy of the transposed conditional, aka prosecutor's fallacy: criticisms arise because the hypothesis testing approach forces one hypothesis (the null hypothesis) to be favored, since what is being evaluated is the probability of the observed result given the null hypothesis and not the probability of the null hypothesis given the observed result. An alternative to this approach is offered by Bayesian inference, although it requires establishing a prior probability.[23]
Rejecting the null hypothesis does not automatically prove the alternative hypothesis.
As everything in inferential statistics, it relies on sample size, and therefore under fat tails p-values may be seriously mis-computed.

1.5.6 Examples

Some well-known statistical tests and procedures are:


Analysis of variance (ANOVA)
Chi-squared test
Correlation
Factor analysis
Mann–Whitney U
Mean square weighted deviation (MSWD)
Pearson product-moment correlation coefficient
Regression analysis
Spearman's rank correlation coefficient
Student's t-test
Time series analysis
Conjoint analysis
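For example, one of these procedures, the chi-squared test of independence, can be run on a small contingency table (made-up counts, using SciPy):

```python
import numpy as np
from scipy import stats

# Made-up 2x2 contingency table: rows = treatment/control,
# columns = improved / not improved
table = np.array([[30, 10],
                  [20, 25]])

chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.4f}")
```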

1.6 Misuse
Main article: Misuse of statistics
Misuse of statistics can produce subtle but serious errors in description and interpretation – subtle in the sense that even experienced professionals make such errors, and serious in the sense that they can lead to devastating decision errors. For instance, social policy, medical practice, and the reliability of structures like bridges all rely on the proper use of statistics.
Even when statistical techniques are correctly applied, the results can be difficult to interpret for those lacking expertise. The statistical significance of a trend in the data – which measures the extent to which a trend could be caused by random variation in the sample – may or may not agree with an intuitive sense of its significance. The set of basic statistical skills (and skepticism) that people need to deal with information in their everyday lives properly is referred to as statistical literacy.
There is a general perception that statistical knowledge is all-too-frequently intentionally misused by finding ways to interpret only the data that are favorable to the presenter.[24] A mistrust and misunderstanding of statistics is associated with the quotation, "There are three kinds of lies: lies, damned lies, and statistics". Misuse of statistics can be both inadvertent and intentional, and the book How to Lie with Statistics[24] outlines a range of considerations. In an attempt to shed light on the use and misuse of statistics, reviews of statistical techniques used in particular fields are conducted (e.g. Warne, Lazo, Ramos, and Ritter (2012)).[25]
Ways to avoid misuse of statistics include using proper diagrams and avoiding bias.[26] Misuse can occur when conclusions are overgeneralized and claimed to be representative of more than they really are, often by either deliberately or unconsciously overlooking sampling bias.[27] Bar graphs are arguably the easiest diagrams to use and understand, and they can be made either by hand or with simple computer programs.[26] Unfortunately, most people do not look for bias or errors, so they are not noticed. Thus, people may often believe that something is true even if it is not well represented.[27] To make data gathered from statistics believable and accurate, the sample taken must be representative of the whole.[28] According to Huff, "The dependability of a sample can be destroyed by [bias]... allow yourself some degree of skepticism."[29]
To assist in the understanding of statistics, Huff proposed a series of questions to be asked in each case:[30]
Who says so? (Does he/she have an axe to grind?)

How does he/she know? (Does he/she have the resources to know the facts?)
What's missing? (Does he/she give us a complete picture?)
Did someone change the subject? (Does he/she offer us the right answer to the wrong problem?)
Does it make sense? (Is his/her conclusion logical and consistent with what we already know?)

The confounding variable problem: X and Y may be correlated, not because there is a causal relationship between them, but because both depend on a third variable Z. Z is called a confounding factor.

1.6.1 Misinterpretation: correlation

The concept of correlation is particularly noteworthy for the potential confusion it can cause. Statistical analysis of a data set often reveals that two variables (properties) of the population under consideration tend to vary together, as if they were connected. For example, a study of annual income that also looks at age of death might find that poor people tend to have shorter lives than affluent people. The two variables are said to be correlated; however, they may or may not be the cause of one another. The correlation phenomena could be caused by a third, previously unconsidered phenomenon, called a lurking variable or confounding variable. For this reason, there is no way to immediately infer the existence of a causal relationship between the two variables. (See Correlation does not imply causation.)
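A minimal simulation of the confounding pattern in the figure above (an assumed setup in which Z drives both X and Y, with no direct link between them):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

z = rng.normal(size=n)                 # confounding factor Z
x = 2.0 * z + rng.normal(size=n)       # X depends on Z only
y = -1.5 * z + rng.normal(size=n)      # Y depends on Z only

# X and Y are strongly correlated even though neither causes the other.
print("corr(X, Y) =", np.corrcoef(x, y)[0, 1])

# Removing the Z component (possible here because we know the true
# coefficients) eliminates the correlation.
x_res = x - 2.0 * z
y_res = y + 1.5 * z
print("corr given Z ≈", np.corrcoef(x_res, y_res)[0, 1])
```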

1.7 History of statistical science


Main articles: History of statistics and Founders of statistics
Statistical methods date back at least to the 5th century BC.
Some scholars pinpoint the origin of statistics to 1663, with the publication of Natural and Political Observations
upon the Bills of Mortality by John Graunt.[31] Early applications of statistical thinking revolved around the needs of
states to base policy on demographic and economic data, hence its "stat-" etymology. The scope of the discipline of
statistics broadened in the early 19th century to include the collection and analysis of data in general. Today, statistics
is widely employed in government, business, and natural and social sciences.


Gerolamo Cardano, the earliest pioneer on the mathematics of probability.

Its mathematical foundations were laid in the 17th century with the development of probability theory by Gerolamo Cardano, Blaise Pascal and Pierre de Fermat. Mathematical probability theory arose from the study of games of chance, although the concept of probability was already examined in medieval law and by philosophers such as Juan Caramuel.[32] The method of least squares was first described by Adrien-Marie Legendre in 1805.
The modern field of statistics emerged in the late 19th and early 20th century in three stages.[33] The first wave, at the turn of the century, was led by the work of Francis Galton and Karl Pearson, who transformed statistics into a rigorous mathematical discipline used for analysis, not just in science, but in industry and politics as well.


Karl Pearson, a founder of mathematical statistics.

Galtons contributions included introducing the concepts of standard deviation, correlation, regression analysis and the
application of these methods to the study of the variety of human characteristics height, weight, eyelash length among
others.[34] Pearson developed the Pearson product-moment correlation coecient, dened as a product-moment,[35]
the method of moments for the tting of distributions to samples and the Pearson distribution, among many other
things.[36] Galton and Pearson founded Biometrika as the rst journal of mathematical statistics and biostatistics
(then called biometry), and the latter founded the worlds rst university statistics department at University College
London.[37]
Ronald Fisher coined the term null hypothesis during the Lady tasting tea experiment, which is never proved or
established, but is possibly disproved, in the course of experimentation.[38][39]
The second wave of the 1910s and 20s was initiated by William Gosset, and reached its culmination in the insights of Ronald Fisher, who wrote the textbooks that were to define the academic discipline in universities around the world. Fisher's most important publications were his 1918 seminal paper The Correlation between Relatives on the Supposition of Mendelian Inheritance, which was the first to use the statistical term variance, his classic 1925 work Statistical Methods for Research Workers and his 1935 The Design of Experiments,[40][41][42][43] where he developed rigorous design of experiments models. He originated the concepts of sufficiency, ancillary statistics, Fisher's linear discriminator and Fisher information.[44] In his 1930 book The Genetical Theory of Natural Selection he applied statistics to various biological concepts, such as Fisher's principle (about the sex ratio), of which A. W. F. Edwards has remarked that it is "probably the most celebrated argument in evolutionary biology",[45] and Fisherian runaway,[46][47][48][49][50][51] a concept in sexual selection about a positive-feedback runaway effect found in evolution.
The final wave, which mainly saw the refinement and expansion of earlier developments, emerged from the collaborative work between Egon Pearson and Jerzy Neyman in the 1930s. They introduced the concepts of "Type II" error, power of a test and confidence intervals. Jerzy Neyman in 1934 showed that stratified random sampling was in general a better method of estimation than purposive (quota) sampling.[52]
Today, statistical methods are applied in all fields that involve decision making, for making accurate inferences from a collated body of data and for making decisions in the face of uncertainty based on statistical methodology. The use of modern computers has expedited large-scale statistical computations, and has also made possible new methods that are impractical to perform manually. Statistics continues to be an area of active research, for example on the problem of how to analyze Big data.[53]

1.8 Applications
1.8.1 Applied statistics, theoretical statistics and mathematical statistics
Applied statistics comprises descriptive statistics and the application of inferential statistics.[54][55] Theoretical statistics concerns the logical arguments underlying the justification of approaches to statistical inference, as well as encompassing mathematical statistics. Mathematical statistics includes not only the manipulation of probability distributions necessary for deriving results related to methods of estimation and inference, but also various aspects of computational statistics and the design of experiments.

1.8.2 Machine learning and data mining
There are two applications for machine learning and data mining: data management and data analysis. Statistical tools are necessary for the data analysis.

1.8.3 Statistics in society
Statistics is applicable to a wide variety of academic disciplines, including natural and social sciences, government,
and business. Statistical consultants can help organizations and companies that don't have in-house expertise relevant
to their particular questions.

1.8.4 Statistical computing
Main article: Computational statistics


The rapid and sustained increases in computing power starting from the second half of the 20th century have had a
substantial impact on the practice of statistical science. Early statistical models were almost always from the class of
linear models, but powerful computers, coupled with suitable numerical algorithms, caused an increased interest in
nonlinear models (such as neural networks) as well as the creation of new types, such as generalized linear models
and multilevel models.
Increased computing power has also led to the growing popularity of computationally intensive methods based on
resampling, such as permutation tests and the bootstrap, while techniques such as Gibbs sampling have made use
of Bayesian models more feasible. The computer revolution has implications for the future of statistics, with a new emphasis on experimental and empirical statistics. A large number of both general- and special-purpose statistical software packages are now available.
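As a brief sketch of one such computationally intensive resampling method (the bootstrap, in Python with simulated data; the sample and the choice of statistic are hypothetical, not prescribed by the text), resampling with replacement yields an approximate confidence interval without distributional assumptions:

    import numpy as np

    rng = np.random.default_rng(1)
    sample = rng.exponential(scale=2.0, size=200)  # stand-in for observed data

    # Bootstrap: resample with replacement and recompute the statistic many times.
    boot_means = [rng.choice(sample, size=sample.size, replace=True).mean()
                  for _ in range(10_000)]

    # Percentile method: an approximate 95% confidence interval for the mean.
    lo, hi = np.percentile(boot_means, [2.5, 97.5])
    print(f"95% bootstrap CI for the mean: ({lo:.2f}, {hi:.2f})")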

1.8.5 Statistics applied to mathematics or the arts
Traditionally, statistics was concerned with drawing inferences using a semi-standardized methodology that was required learning in most sciences. This has changed with the use of statistics in non-inferential contexts. What was once considered a dry subject, taken in many fields as a degree requirement, is now viewed enthusiastically. Initially derided by some mathematical purists, it is now considered essential methodology in certain areas.
In number theory, scatter plots of data generated by a distribution function may be transformed with familiar
tools used in statistics to reveal underlying patterns, which may then lead to hypotheses.
Methods of statistics including predictive methods in forecasting are combined with chaos theory and fractal
geometry to create video works that are considered to have great beauty.
The process art of Jackson Pollock relied on artistic experiments whereby underlying distributions in nature
were artistically revealed. With the advent of computers, statistical methods were applied to formalize such
distribution-driven natural processes to make and analyze moving video art.


gretl, an example of an open source statistical package

Methods of statistics may be used predictively in performance art, as in a card trick based on a Markov process that only works some of the time, the occasion of which can be predicted using statistical methodology.
Statistics can be used to predictively create art, as in the statistical or stochastic music invented by Iannis Xenakis, where the music is performance-specific. Though this type of artistry does not always come out as expected, it does behave in ways that are predictable and tunable using statistics.

1.9 Specialized disciplines


Main article: List of fields of application of statistics
Statistical techniques are used in a wide range of types of scientific and social research, including: biostatistics, computational biology, computational sociology, network biology, social science, sociology and social research. Some fields of inquiry use applied statistics so extensively that they have specialized terminology. These disciplines include:
Actuarial science (assesses risk in the insurance and finance industries)
Applied information economics
Astrostatistics (statistical evaluation of astronomical data)
Biostatistics
Business statistics

Chemometrics (for analysis of data from chemistry)
Data mining (applying statistics and pattern recognition to discover knowledge from data)
Data science
Demography
Econometrics (statistical analysis of economic data)
Energy statistics
Engineering statistics
Epidemiology (statistical analysis of disease)
Geography and Geographic Information Systems, specifically in spatial analysis
Image processing
Medical Statistics
Psychological statistics
Reliability engineering
Social statistics
Statistical Mechanics

In addition, there are particular types of statistical analysis that have also developed their own specialised terminology
and methodology:
Bootstrap / Jackknife resampling
Multivariate statistics
Statistical classication
Structured data analysis (statistics)
Structural equation modelling
Survey methodology
Survival analysis
Statistics in various sports, particularly baseball (known as sabermetrics) and cricket
Statistics is a key tool in business and manufacturing as well. It is used to understand measurement-system variability, to control processes (as in statistical process control, or SPC), to summarize data, and to make data-driven decisions. In these roles it is a key tool, and perhaps the only reliable one.
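A minimal sketch of the SPC idea (Python; the data, centre line and three-sigma limits are illustrative assumptions, and a production individuals chart would typically estimate sigma from moving ranges rather than the overall standard deviation):

    import numpy as np

    rng = np.random.default_rng(2)
    measurements = rng.normal(loc=10.0, scale=0.2, size=100)  # simulated process data

    # Shewhart-style chart: flag points outside the mean plus/minus 3 sigma.
    center = measurements.mean()
    sigma = measurements.std(ddof=1)
    ucl, lcl = center + 3 * sigma, center - 3 * sigma

    out_of_control = (measurements > ucl) | (measurements < lcl)
    print(f"center={center:.3f}, limits=({lcl:.3f}, {ucl:.3f}), "
          f"{int(out_of_control.sum())} point(s) out of control")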

1.10 See also


Main article: Outline of statistics
Abundance estimation
Data science
Glossary of probability and statistics
List of academic statistical associations


List of important publications in statistics


List of national and international statistical services
List of statistical packages (software)
List of statistics articles
List of university statistical consulting centers
Notation in probability and statistics
Foundations and major areas of statistics
Foundations of statistics
List of statisticians
Official statistics
Multivariate analysis of variance

1.11 References
[1] Dodge, Y. (2006) The Oxford Dictionary of Statistical Terms, OUP. ISBN 0-19-920613-9
[2] "Definition of STATISTICS". www.merriam-webster.com. Retrieved 2016-05-28.
[3] "Essay on Statistics: Meaning and Definition of Statistics". Economics Discussion. 2014-12-02. Retrieved 2016-05-28.
[4] Lund Research Ltd. "Descriptive and Inferential Statistics". statistics.laerd.com. Retrieved 2014-03-23.
[5] "What Is the Difference Between Type I and Type II Hypothesis Testing Errors?". About.com Education. Retrieved 2015-11-27.
[6] Moses, Lincoln E. (1986) Think and Explain with Statistics, Addison-Wesley, ISBN 978-0-201-15619-5. pp. 1–3
[7] Hays, William Lee (1973) Statistics for the Social Sciences, Holt, Rinehart and Winston, p. xii, ISBN 978-0-03-077945-9
[8] Moore, David (1992). "Teaching Statistics as a Respectable Subject". In F. Gordon and S. Gordon. Statistics for the Twenty-First Century. Washington, DC: The Mathematical Association of America. pp. 14–25. ISBN 978-0-88385-078-7.
[9] Chance, Beth L.; Rossman, Allan J. (2005). "Preface". Investigating Statistical Concepts, Applications, and Methods (PDF). Duxbury Press. ISBN 978-0-495-05064-3.
[10] Kannan, D.; Lakshmikantham, V., eds. (2002). Handbook of stochastic analysis and applications. New York: M. Dekker. ISBN 0824706609.
[11] Schervish, Mark J. (1995). Theory of statistics (Corr. 2nd print. ed.). New York: Springer. ISBN 0387945466.
[12] Freedman, D.A. (2005) Statistical Models: Theory and Practice, Cambridge University Press. ISBN 978-0-521-67105-7
[13] McCarney R, Warner J, Iliffe S, van Haselen R, Griffin M, Fisher P (2007). "The Hawthorne Effect: a randomised, controlled trial". BMC Med Res Methodol. 7 (1): 30. doi:10.1186/1471-2288-7-30. PMC 1936999. PMID 17608932.
[14] Mosteller, F., & Tukey, J. W. (1977). Data analysis and regression. Boston: Addison-Wesley.
[15] Nelder, J. A. (1990). The knowledge needed to computerise the analysis and interpretation of statistical information. In Expert systems and artificial intelligence: the need for information about data. Library Association Report, London, March, 23–27.
[16] Chrisman, Nicholas R (1998). "Rethinking Levels of Measurement for Cartography". Cartography and Geographic Information Science. 25 (4): 231–242. doi:10.1559/152304098782383043.


[17] van den Berg, G. (1991). Choosing an analysis method. Leiden: DSWO Press
[18] Hand, D. J. (2004). Measurement theory and practice: The world through quantification. London, UK: Arnold.
[19] Piazza, Elio, Probabilità e Statistica, Esculapio 2007
[20] Everitt, Brian (1998). The Cambridge Dictionary of Statistics. Cambridge, UK; New York: Cambridge University Press. ISBN 0521593468.
[21] http://www.yourstatsguru.com/epar/rp-reviewed/cohen1994/
[22] Rubin, Donald B.; Little, Roderick J. A., Statistical analysis with missing data, New York: Wiley 2002
[23] Ioannidis, J. P. A. (2005). "Why Most Published Research Findings Are False". PLoS Medicine. 2 (8): e124. doi:10.1371/journal.pmed.0020124. PMC 1182327. PMID 16060722.
[24] Huff, Darrell (1954) How to Lie with Statistics, WW Norton & Company, Inc. New York, NY. ISBN 0-393-31072-8
[25] Warne, R.; Lazo, M.; Ramos, T.; Ritter, N. (2012). "Statistical Methods Used in Gifted Education Journals, 2006–2010". Gifted Child Quarterly. 56 (3): 134–149. doi:10.1177/0016986212444122.
[26] Drennan, Robert D. (2008). "Statistics in archaeology". In Pearsall, Deborah M. Encyclopedia of Archaeology. Elsevier Inc. pp. 2093–2100. ISBN 978-0-12-373962-9.
[27] Cohen, Jerome B. (December 1938). "Misuse of Statistics". Journal of the American Statistical Association. JSTOR. 33 (204): 657–674. doi:10.1080/01621459.1938.10502344.
[28] Freund, J. E. (1988). Modern Elementary Statistics. Credo Reference.
[29] Huff, Darrell; Irving Geis (1954). How to Lie with Statistics. New York: Norton. "The dependability of a sample can be destroyed by [bias]... allow yourself some degree of skepticism."
[30] Huff, Darrell; Irving Geis (1954). How to Lie with Statistics. New York: Norton.
[31] Willcox, Walter (1938) "The Founder of Statistics". Review of the International Statistical Institute 5 (4): 321–328. JSTOR 1400906
[32] J. Franklin, The Science of Conjecture: Evidence and Probability before Pascal, Johns Hopkins Univ Pr 2002
[33] Helen Mary Walker (1975). Studies in the history of statistical method. Arno Press.
[34] Galton, F (1877). "Typical laws of heredity". Nature. 15: 492–553. doi:10.1038/015492a0.
[35] Stigler, S. M. (1989). "Francis Galton's Account of the Invention of Correlation". Statistical Science. 4 (2): 73–79. doi:10.1214/ss/1177012580.
[36] Pearson, K. (1900). "On the Criterion that a given System of Deviations from the Probable in the Case of a Correlated System of Variables is such that it can be reasonably supposed to have arisen from Random Sampling". Philosophical Magazine Series 5. 50 (302): 157–175. doi:10.1080/14786440009463897.
[37] "Karl Pearson (1857–1936)". Department of Statistical Science, University College London.
[38] Fisher (1971), Chapter II. "The Principles of Experimentation, Illustrated by a Psycho-physical Experiment", Section 8. "The Null Hypothesis".
[39] OED quote: 1935 R. A. Fisher, The Design of Experiments ii. 19, "We may speak of this hypothesis as the 'null hypothesis', and it should be noted that the null hypothesis is never proved or established, but is possibly disproved, in the course of experimentation."
[40] Stanley, J. C. (1966). "The Influence of Fisher's 'The Design of Experiments' on Educational Research Thirty Years Later". American Educational Research Journal. 3 (3): 223. doi:10.3102/00028312003003223.
[41] Box, JF (February 1980). "R. A. Fisher and the Design of Experiments, 1922–1926". The American Statistician. 34 (1): 1–7. doi:10.2307/2682986. JSTOR 2682986.
[42] Yates, F (June 1964). "Sir Ronald Fisher and the Design of Experiments". Biometrics. 20 (2): 307–321. doi:10.2307/2528399. JSTOR 2528399.
[43] Stanley, Julian C. (1966). "The Influence of Fisher's 'The Design of Experiments' on Educational Research Thirty Years Later". American Educational Research Journal. 3 (3): 223–229. doi:10.3102/00028312003003223. JSTOR 1161806.


[44] Agresti, Alan; David B. Hitchcock (2005). "Bayesian Inference for Categorical Data Analysis" (PDF). Statistical Methods & Applications. 14 (14): 298. doi:10.1007/s10260-005-0121-y.
[45] Edwards, A.W.F. (1998). "Natural Selection and the Sex Ratio: Fisher's Sources". American Naturalist. 151 (6): 564–569. doi:10.1086/286141. PMID 18811377.
[46] Fisher, R.A. (1915) The evolution of sexual preference. Eugenics Review 7: 184–192
[47] Fisher, R.A. (1930) The Genetical Theory of Natural Selection. ISBN 0-19-850440-3
[48] Edwards, A.W.F. (2000) Perspectives: Anecdotal, Historical and Critical Commentaries on Genetics. The Genetics Society of America 154: 1419–1426
[49] Andersson, M. (1994) Sexual selection. ISBN 0-691-00057-3
[50] Andersson, M. and Simmons, L.W. (2006) Sexual selection and mate choice. Trends in Ecology and Evolution 21: 296–302
[51] Gayon, J. (2010) Sexual selection: Another Darwinian process. Comptes Rendus Biologies 333: 134–144
[52] Neyman, J (1934). "On the two different aspects of the representative method: The method of stratified sampling and the method of purposive selection". Journal of the Royal Statistical Society. 97 (4): 557–625. JSTOR 2342192.
[53] "Science in a Complex World - Big Data: Opportunity or Threat?". Santa Fe Institute.
[54] Nikoletseas, M. M. (2014) Statistics: Concepts and Examples. ISBN 978-1500815684
[55] Anderson, D.R.; Sweeney, D.J.; Williams, T.A. (1994) Introduction to Statistics: Concepts and Applications, pp. 5–9. West Group. ISBN 978-0-314-03309-3

1.12 Further reading


Statistics at Scholarpedia: multiple articles written by experts

1.13 External links

Chapter 2

Portal:Statistics


Chapter 3

List of fields of application of statistics


Statistics is the mathematical science involving the collection, analysis and interpretation of data. A number of specialties have evolved to apply statistical theory and methods to various disciplines. Certain topics have "statistical" in their name but relate to manipulations of probability distributions rather than to statistical analysis.
Actuarial science is the discipline that applies mathematical and statistical methods to assess risk in the insurance and finance industries.
Astrostatistics is the discipline that applies statistical analysis to the understanding of astronomical data.
Biostatistics is a branch of biology that studies biological phenomena and observations by means of statistical
analysis, and includes medical statistics.
Business analytics is a rapidly developing business process that applies statistical methods to data sets (often very large) to develop new insights and understanding of business performance and opportunities.
Chemometrics is the science of relating measurements made on a chemical system or process to the state of
the system via application of mathematical or statistical methods.
Demography is the statistical study of all populations. It can be a very general science that can be applied to
any kind of dynamic population, that is, one that changes over time or space.
Econometrics is a branch of economics that applies statistical methods to the empirical study of economic
theories and relationships.
Environmental statistics is the application of statistical methods to environmental science. Weather, climate,
air and water quality are included, as are studies of plant and animal populations.
Epidemiology is the study of factors affecting the health and illness of populations, and serves as the foundation and logic of interventions made in the interest of public health and preventive medicine.
Geostatistics is a branch of geography that deals with the analysis of data from disciplines such as petroleum
geology, hydrogeology, hydrology, meteorology, oceanography, geochemistry, geography.
Machine Learning
Operations research (or Operational Research) is an interdisciplinary branch of applied mathematics and
formal science that uses methods such as mathematical modeling, statistics, and algorithms to arrive at optimal
or near optimal solutions to complex problems.
Population ecology is a sub-field of ecology that deals with the dynamics of species populations and how these populations interact with the environment.
Psychometrics is the theory and technique of educational and psychological measurement of knowledge, abilities, attitudes, and personality traits.
Quality control reviews the factors involved in manufacturing and production; it can make use of statistical
sampling of product items to aid decisions in process control or in accepting deliveries.


Quantitative psychology is the science of statistically explaining and changing mental processes and behaviors
in humans.
Reliability engineering is the study of the ability of a system or component to perform its required functions under stated conditions for a specified period of time.
Statistical finance, an area of econophysics, is an empirical attempt to shift finance from its normative roots to a positivist framework using exemplars from statistical physics with an emphasis on emergent or collective properties of financial markets.
Statistical mechanics is the application of probability theory, which includes mathematical tools for dealing with large populations, to the field of mechanics, which is concerned with the motion of particles or objects when subjected to a force.
Statistical physics is one of the fundamental theories of physics, and uses methods of probability theory in
solving physical problems.
Statistical Signal Processing
Statistical thermodynamics is the study of the microscopic behaviors of thermodynamic systems using probability theory and provides a molecular level interpretation of thermodynamic quantities such as work, heat,
free energy, and entropy.

3.1 See also


List of statistics topics

Chapter 4

Business analytics
Not to be confused with Business analysis.
Business analytics (BA) refers to the skills, technologies and practices for continuous iterative exploration and investigation of past business performance to gain insight and drive business planning.[1] Business analytics focuses on developing new insights and understanding of business performance based on data and statistical methods. In contrast, business intelligence traditionally focuses on using a consistent set of metrics to both measure past performance and guide business planning, which is also based on data and statistical methods.
Business analytics makes extensive use of statistical analysis, including explanatory and predictive modeling,[2] and
fact-based management to drive decision making. It is therefore closely related to management science. Analytics
may be used as input for human decisions or may drive fully automated decisions. Business intelligence is querying,
reporting, online analytical processing (OLAP), and alerts.
In other words, querying, reporting, OLAP, and alert tools can answer questions such as what happened, how many, how often, where the problem is, and what actions are needed. Business analytics can answer questions like why is this happening, what if these trends continue, what will happen next (that is, predict), and what is the best that can happen (that is, optimize).[3]
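As a toy sketch of the "what will happen next" kind of question (Python; the sales figures are invented for illustration, and a real forecast would use a proper time-series model rather than a straight line):

    import numpy as np

    # Invented quarterly sales figures (units), oldest first.
    sales = np.array([120, 132, 129, 145, 151, 160, 158, 171], dtype=float)
    quarters = np.arange(sales.size)

    # Fit a simple linear trend and extrapolate one quarter ahead.
    slope, intercept = np.polyfit(quarters, sales, deg=1)
    forecast = slope * sales.size + intercept
    print(f"forecast for next quarter: {forecast:.0f} units")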

4.1 Examples of application


Banks, such as Capital One, use data analysis (or analytics, as it is also called in the business setting) to differentiate among customers based on credit risk, usage and other characteristics and then to match customer characteristics with appropriate product offerings. Harrah's, the gaming firm, uses analytics in its customer loyalty programs. E & J Gallo Winery quantitatively analyzes and predicts the appeal of its wines. Between 2002 and 2005, Deere & Company saved more than $1 billion by employing a new analytical tool to better optimize inventory.[3] A telecoms company that pursues efficient call centre usage over customer service may save money.

4.2 Types of analytics


Decision analytics: supports human decisions with visual analytics that the user models to reflect reasoning.[4]
Descriptive analytics: gains insight from historical data with reporting, scorecards, clustering, etc.
Predictive analytics: employs predictive modeling using statistical and machine learning techniques.
Prescriptive analytics: recommends decisions using optimization, simulation, etc.

4.3 Basic domains within analytics


Behavioral analytics


Cohort Analysis
Collections analytics
Contextual data modeling - supports the human reasoning that occurs after viewing executive dashboards or
any other visual analytics
Cyber analytics
Enterprise Optimization
Financial services analytics
Fraud analytics
Marketing analytics
Pricing analytics
Retail sales analytics
Risk & Credit analytics
Supply Chain analytics
Talent analytics
Telecommunications
Transportation analytics
Customer Analytics

4.4 History
Analytics have been used in business since the management exercises were put into place by Frederick Winslow
Taylor in the late 19th century. Henry Ford measured the time of each component in his newly established assembly
line. But analytics began to command more attention in the late 1960s when computers were used in decision support
systems. Since then, analytics have changed and formed with the development of enterprise resource planning (ERP)
systems, data warehouses, and a large number of other software tools and processes.[3]
In later years business analytics exploded with the introduction of computers. This change brought analytics to a whole new level and greatly expanded the possibilities. Given how far analytics has come in history, and what the field of analytics is today, many people would never think that analytics started in the early 1900s with Ford himself.

4.5 Challenges
Business analytics depends on sufficient volumes of high-quality data. The difficulty in ensuring data quality is integrating and reconciling data across different systems, and then deciding what subsets of data to make available.[3]
Previously, analytics was considered a type of after-the-fact method of forecasting consumer behavior by examining the number of units sold in the last quarter or the last year. This type of data warehousing required a lot more storage space than it did speed. Now business analytics is becoming a tool that can influence the outcome of customer interactions.[5] When a specific customer type is considering a purchase, an analytics-enabled enterprise can modify the sales pitch to appeal to that consumer. This means the storage space for all that data must react extremely fast to provide the necessary data in real time.


4.6 Competing on analytics


Thomas Davenport, professor of information technology and management at Babson College, argues that businesses can optimize a distinct business capability via analytics and thus better compete. He identifies these characteristics of an organization that are apt to compete on analytics:[3]
One or more senior executives who strongly advocate fact-based decision making and, specifically, analytics
Widespread use of not only descriptive statistics, but also predictive modeling and complex optimization techniques
Substantial use of analytics across multiple business functions or processes
Movement toward an enterprise level approach to managing analytical tools, data, and organizational skills and
capabilities

4.7 See also


Analytics
Business analysis
Business analyst
Business intelligence
Business process discovery
Customer dynamics
Data mining
OLAP
Statistics
Test and learn

4.8 References
[1] Beller, Michael J.; Alan Barnett (2009-06-18). Next Generation Business Analytics. Lightship Partners LLC. Retrieved
2009-06-20.
[2] Galit Shmueli and Otto Koppius. "Predictive vs. Explanatory Modeling in IS Research" (PDF).
[3] Davenport, Thomas H.; Harris, Jeanne G. (2007). Competing on analytics : the new science of winning. Boston, Mass.:
Harvard Business School Press. ISBN 978-1-4221-0332-6.
[4] Analytics List. Retrieved 3 April 2015.
[5] Choosing the Best Storage for Business Analytics. Dell.com. Retrieved 2012-06-25.

4.9 Further reading


Bartlett, Randy (February 2013). A Practitioner's Guide To Business Analytics: Using Data Analysis Tools to Improve Your Organization's Decision Making and Strategy. McGraw-Hill. ISBN 978-0071807593.
Saxena, Rahul; Anand Srinivasan (December 2012). Business Analytics: A Practitioner's Guide (International Series in Operations Research & Management Science). Springer. ISBN 978-1461460794.


Davenport, Thomas H.; Jeanne G. Harris (March 2007). Competing on Analytics: The New Science of Winning.
Harvard Business School Press.
McDonald, Mark; Tina Nunno (February 2007). Creating Enterprise Leverage: The 2007 CIO Agenda. Stamford, CT: Gartner, Inc.
Stubbs, Evan (July 2011). The Value of Business Analytics. John Wiley & Sons.
Ranadive, Vivek (2006-01-26). The Power to Predict: How Real Time Businesses Anticipate Customer Needs,
Create Opportunities, and Beat the Competition. McGraw-Hill.
Zabin, Jeffrey; Gresh Brebach (February 2004). Precision Marketing. John Wiley.
Baker, Stephen (January 23, 2006). "Math Will Rock Your World". BusinessWeek. Retrieved 2007-09-19.
Davenport, Thomas H. (January 1, 2006). "Competing on Analytics". Harvard Business Review.
Pfeffer, Jeffrey; Robert I. Sutton (January 2006). "Evidence-Based Management". Harvard Business Review.
Davenport, Thomas H.; Jeanne G. Harris (Summer 2005). "Automated Decision Making Comes of Age". MIT Sloan Management Review.
Lewis, Michael (April 2004). Moneyball: The Art of Winning an Unfair Game. W.W. Norton & Co.
Bonabeau, Eric (May 2003). "Don't Trust Your Gut". Harvard Business Review.
Davenport, Thomas H.; Jeanne G. Harris; David W. De Long; Alvin L. Jacobson. "Data to Knowledge to Results: Building an Analytic Capability". California Management Review 43 (2): 117–138. doi:10.2307/41166078.

Chapter 5

Descriptive statistics
Descriptive statistics are statistics that quantitatively describe or summarise features of a collection of information.[1] Descriptive statistics are distinguished from inferential statistics (or inductive statistics) in that descriptive statistics aim to summarize a sample, rather than use the data to learn about the population that the sample of data is thought to represent. This generally means that descriptive statistics, unlike inferential statistics, are not developed on the basis of probability theory.[2] Even when a data analysis draws its main conclusions using inferential statistics, descriptive statistics are generally also presented. For example, in papers reporting on human subjects, typically a table is included giving the overall sample size, sample sizes in important subgroups (e.g., for each treatment or exposure group), and demographic or clinical characteristics such as the average age, the proportion of subjects of each sex, and the proportion of subjects with related comorbidities.
Some measures that are commonly used to describe a data set are measures of central tendency and measures of
variability or dispersion. Measures of central tendency include the mean, median and mode, while measures of
variability include the standard deviation (or variance), the minimum and maximum values of the variables, kurtosis
and skewness.[3]
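A minimal sketch computing these common summary measures (Python; the ten values are arbitrary illustrative data, and scipy is assumed to be available for the mode, skewness and kurtosis):

    import numpy as np
    from scipy import stats

    data = np.array([2, 3, 3, 5, 7, 8, 9, 12, 13, 21])  # arbitrary sample

    print("mean:    ", np.mean(data))            # central tendency
    print("median:  ", np.median(data))
    print("mode:    ", stats.mode(data).mode)
    print("std dev: ", np.std(data, ddof=1))     # dispersion (sample sd)
    print("variance:", np.var(data, ddof=1))
    print("min/max: ", data.min(), data.max())
    print("skewness:", stats.skew(data))         # shape of the distribution
    print("kurtosis:", stats.kurtosis(data))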

5.1 Use in statistical analysis


Descriptive statistics provide simple summaries about the sample and about the observations that have been made. Such summaries may be either quantitative, i.e. summary statistics, or visual, i.e. simple-to-understand graphs. These summaries may either form the basis of the initial description of the data as part of a more extensive statistical analysis, or they may be sufficient in and of themselves for a particular investigation.
For example, the shooting percentage in basketball is a descriptive statistic that summarizes the performance of a
player or a team. This number is the number of shots made divided by the number of shots taken. For example, a
player who shoots 33% is making approximately one shot in every three. The percentage summarizes or describes
multiple discrete events. Consider also the grade point average. This single number describes the general performance
of a student across the range of their course experiences.[4]
The use of descriptive and summary statistics has an extensive history and, indeed, the simple tabulation of populations and of economic data was the first way the topic of statistics appeared. More recently, a collection of summarisation techniques has been formulated under the heading of exploratory data analysis: an example of such a technique is the box plot.
In the business world, descriptive statistics provides a useful summary of many types of data. For example, investors
and brokers may use a historical account of return behavior by performing empirical and analytical analyses on their
investments in order to make better investing decisions in the future.

5.1.1 Univariate analysis
Univariate analysis involves describing the distribution of a single variable, including its central tendency (including
the mean, median, and mode) and dispersion (including the range and quantiles of the data-set, and measures of
27

28

CHAPTER 5. DESCRIPTIVE STATISTICS

spread such as the variance and standard deviation). The shape of the distribution may also be described via indices such as skewness and kurtosis. Characteristics of a variable's distribution may also be depicted in graphical or tabular format, including histograms and stem-and-leaf displays.

5.1.2 Bivariate analysis
When a sample consists of more than one variable, descriptive statistics may be used to describe the relationship
between pairs of variables. In this case, descriptive statistics include:
Cross-tabulations and contingency tables
Graphical representation via scatterplots
Quantitative measures of dependence
Descriptions of conditional distributions
The main reason for differentiating univariate and bivariate analysis is that bivariate analysis is not only simple descriptive analysis; it also describes the relationship between two different variables.[5] Quantitative measures of dependence include correlation (such as Pearson's r when both variables are continuous, or Spearman's rho if one or both are not) and covariance (which reflects the scale variables are measured on). The slope, in regression analysis, also reflects the relationship between variables. The unstandardised slope indicates the unit change in the criterion variable for a one-unit change in the predictor. The standardised slope indicates this change in standardised (z-score) units. Highly skewed data are often transformed by taking logarithms. Use of logarithms makes graphs more symmetrical and look more similar to the normal distribution, making them easier to interpret intuitively.[6]:47
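A short sketch of these bivariate measures (Python with scipy; the paired observations are invented for illustration):

    import numpy as np
    from scipy import stats

    # Invented paired observations.
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.3])

    print("Pearson r:   ", stats.pearsonr(x, y)[0])   # linear association
    print("Spearman rho:", stats.spearmanr(x, y)[0])  # rank-based association
    print("covariance:  ", np.cov(x, y)[0, 1])        # scale-dependent

    # Unstandardised slope: unit change in y per one-unit change in x.
    result = stats.linregress(x, y)
    print("slope:       ", result.slope)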

5.2 References
[1] Mann, Prem S. (1995). Introductory Statistics (2nd ed.). Wiley. ISBN 0-471-31009-3.
[2] Dodge, Y. (2003). The Oxford Dictionary of Statistical Terms. OUP. ISBN 0-19-850994-4.
[3] Investopedia, Descriptive Statistics Terms
[4] Trochim, William M. K. (2006). Descriptive statistics. Research Methods Knowledge Base. Retrieved 14 March 2011.
[5] Babbie, Earl R. (2009). The Practice of Social Research (12th ed.). Wadsworth. pp. 436–440. ISBN 0-495-59841-0.
[6] Nick, Todd G. (2007). "Descriptive Statistics". Topics in Biostatistics. Methods in Molecular Biology. 404. New York: Springer. pp. 33–52. doi:10.1007/978-1-59745-530-5_3. ISBN 978-1-58829-531-6.

5.3 External links


Descriptive Statistics Lecture: University of Pittsburgh Supercourse: http://www.pitt.edu/~super1/lecture/lec0421/index.htm

Chapter 6

Quality control
This article is about the project management process. For other uses, see Quality control (disambiguation).
Quality control, or QC for short, is a process by which entities review the quality of all factors involved in production.

Quality inspector in a Volkseigener Betrieb sewing machine parts factory in Dresden, East Germany, 1977.


ISO 9000 defines quality control as "A part of quality management focused on fulfilling quality requirements".[1]
This approach places an emphasis on three aspects:
1. Elements such as controls, job management, defined and well-managed processes,[2][3] performance and integrity criteria, and identification of records
2. Competence, such as knowledge, skills, experience, and qualifications
3. Soft elements, such as personnel, integrity, confidence, organizational culture, motivation, team spirit, and quality relationships.
Controls include product inspection, where every product is examined visually, often using a stereo microscope for fine detail, before the product is sold into the external market. Inspectors will be provided with lists and descriptions of unacceptable product defects, such as cracks or surface blemishes, for example.
The quality of the outputs is at risk if any of these three aspects is deficient in any way.
Quality control emphasizes testing of products to uncover defects and reporting to management, who make the decision to allow or deny product release, whereas quality assurance attempts to improve and stabilize production (and associated processes) to avoid, or at least minimize, issues which led to the defect(s) in the first place. For contract work, particularly work awarded by government agencies, quality control issues are among the top reasons for not renewing a contract.[4]

6.1 Notable approaches to quality control


There is a tendency for individual consultants and organizations to name their own unique approaches to quality control; a few of these have ended up in widespread use:

6.2 Quality control in project management


In project management, quality control requires the project manager and/or the project team to inspect the accomplished work to ensure its alignment with the project scope.[10] In practice, projects typically have a dedicated quality
control team which focuses on this area.

6.3 See also


Analytical quality control
Corrective and preventative action (CAPA)
Eight dimensions of quality
First article inspection (FAI)
Good Automated Manufacturing Practice (GAMP)
Good manufacturing practice
Quality assurance
Quality management framework
Standard operating procedure (SOP)


6.4 References
This article incorporates public domain material from the General Services Administration document Federal
Standard 1037C (in support of MIL-STD-188).
[1] ISO 9000:2005, Clause 3.2.10
[2] Dennis Adsit (November 9, 2007). "What the Call Center Industry Can Learn from Manufacturing: Part I" (PDF). National Association of Call Centers. Retrieved 21 December 2012.
[3] Dennis Adsit (November 23, 2007). "What the Call Center Industry Can Learn from Manufacturing: Part II" (PDF). National Association of Call Centers. Retrieved 21 December 2012.
[4] "Position Classification Standard for Quality Assurance Series, GS-1910" (PDF). US Office of Personnel Management. March 1983. Retrieved 21 December 2012.
[5] Juran, Joseph M., ed. (1995), A History of Managing for Quality: The Evolution, Trends, and Future Directions of Managing for Quality, Milwaukee, Wisconsin: The American Society for Quality Control, ISBN 9780873893411, OCLC 32394752
[6] Feigenbaum, Armand V. (1956). "Total Quality Control". Harvard Business Review. Cambridge, Massachusetts: Harvard University Press. 34 (6): 93–101. ISSN 0017-8012. OCLC 1751795.
[7] Ishikawa, Kaoru (1985), What Is Total Quality Control? The Japanese Way (1 ed.), Englewood Cliffs, New Jersey: Prentice-Hall, pp. 90–91, ISBN 978-0-13-952433-2, OCLC 11467749
[8] Evans, James R.; Lindsay, William M. (1999), The Management and Control of Quality (4 ed.), Cincinnati, Ohio: South-Western College Publications, p. 118, ISBN 9780538882422, OCLC 38475486, "The term total quality management, or TQM, has been commonly used to denote the system of managing for total quality. (The term TQM was actually developed within the Department of Defense. It has since been renamed Total Quality Leadership, since leadership outranks management in military thought.)"
[9] "What Is Six Sigma?" (PDF). Schaumburg, Illinois: Motorola University. 2010-02-19. p. 2. Retrieved 2013-11-24. "When practiced as a management system, Six Sigma is a high performance system for executing business strategy."
[10] Phillips, Joseph (November 2008). "Quality Control in Project Management". The Project Management Hut. Retrieved 21 December 2012.

6.5 Further reading


Radford, George S. (1922), The Control of Quality in Manufacturing, New York: Ronald Press Co., OCLC
1701274, retrieved 2013-11-16
Shewhart, Walter A. (1931), Economic Control of Quality of Manufactured Product, New York: D. Van Nostrand Co., Inc., OCLC 1045408
Juran, Joseph M. (1951), Quality-Control Handbook, New York: McGraw-Hill, OCLC 1220529
Western Electric Company (1956), Statistical Quality Control Handbook (1 ed.), Indianapolis, Indiana: Western
Electric Co., OCLC 33858387
Feigenbaum, Armand V. (1961), Total Quality Control, New York: McGraw-Hill, OCLC 567344

6.6 External links


ASTM quality control standards

Chapter 7

Operations research
For the academic journal, see Operations Research.
Operations research, or operational research in British usage, is a discipline that deals with the application of advanced analytical methods to help make better decisions.[1] Further, the term "operational analysis" is used in the British (and some British Commonwealth) military as an intrinsic part of capability development, management and assurance. In particular, operational analysis forms part of the Combined Operational Effectiveness and Investment Appraisals (COEIA), which support British defence capability acquisition decision-making.
It is often considered to be a sub-field of mathematics.[2] The terms management science and decision science are sometimes used as synonyms.[3]
Employing techniques from other mathematical sciences, such as mathematical modeling, statistical analysis, and mathematical optimization, operations research arrives at optimal or near-optimal solutions to complex decision-making problems. Because of its emphasis on human-technology interaction and because of its focus on practical applications, operations research has overlap with other disciplines, notably industrial engineering and operations management, and draws on psychology and organization science. Operations research is often concerned with determining the maximum (of profit, performance, or yield) or minimum (of loss, risk, or cost) of some real-world objective. Originating in military efforts before World War II, its techniques have grown to concern problems in a variety of industries.[4]

7.1 Overview
Operational research (OR) encompasses a wide range of problem-solving techniques and methods applied in the pursuit of improved decision-making and efficiency, such as simulation, mathematical optimization, queueing theory and other stochastic-process models, Markov decision processes, econometric methods, data envelopment analysis, neural networks, expert systems, decision analysis, and the analytic hierarchy process.[5] Nearly all of these techniques involve the construction of mathematical models that attempt to describe the system. Because of the computational and statistical nature of most of these fields, OR also has strong ties to computer science and analytics. Operational researchers faced with a new problem must determine which of these techniques are most appropriate given the nature of the system, the goals for improvement, and constraints on time and computing power.
The major subdisciplines in modern operational research, as identied by the journal Operations Research,[6] are:
Computing and information technologies
Financial engineering
Manufacturing, service sciences, and supply chain management
Policy modeling and public sector work
Revenue management
Simulation

Stochastic models
Transportation

7.2 History
As a discipline, operational research originated in the efforts of military planners during World War I (convoy theory and Lanchester's laws). In the decades after the two world wars, the techniques were more widely applied to problems in business, industry and society. Since that time, operational research has expanded into a field widely used in industries ranging from petrochemicals to airlines, finance, logistics, and government, moving to a focus on the development of mathematical models that can be used to analyse and optimize complex systems, and has become an area of active academic and industrial research.[4]

7.2.1 Historical origins
Early work in operational research was carried out by individuals such as Charles Babbage. His research into the cost of transportation and sorting of mail led to England's universal Penny Post in 1840, and studies into the dynamical behaviour of railway vehicles in defence of the GWR's broad gauge.[7] Percy Bridgman brought operational research to bear on problems in physics in the 1920s and would later attempt to extend these to the social sciences.[8]
Modern operational research originated at the Bawdsey Research Station in the UK in 1937 and was the result of an initiative of the station's superintendent, A. P. Rowe. Rowe conceived the idea as a means to analyse and improve the working of the UK's early-warning radar system, Chain Home (CH). Initially, he analysed the operating of the radar equipment and its communication networks, expanding later to include the operating personnel's behaviour. This revealed unappreciated limitations of the CH network and allowed remedial action to be taken.[9]
Scientists in the United Kingdom, including Patrick Blackett (later Lord Blackett OM PRS), Cecil Gordon, Solly Zuckerman (later Baron Zuckerman OM, KCB, FRS), C. H. Waddington, Owen Wansbrough-Jones, Frank Yates, Jacob Bronowski and Freeman Dyson, and in the United States George Dantzig, looked for ways to make better decisions in such areas as logistics and training schedules.

7.2.2 Second World War
The modern field of operational research arose during World War II. In the World War II era, operational research was defined as "a scientific method of providing executive departments with a quantitative basis for decisions regarding the operations under their control".[10] Other names for it included operational analysis (UK Ministry of Defence from 1962)[11] and quantitative management.[12]
During the Second World War close to 1,000 men and women in Britain were engaged in operational research. About
200 operational research scientists worked for the British Army.[13]
Patrick Blackett worked for several different organizations during the war. Early in the war while working for the Royal Aircraft Establishment (RAE) he set up a team known as the "Circus" which helped to reduce the number of anti-aircraft artillery rounds needed to shoot down an enemy aircraft from an average of over 20,000 at the start of the Battle of Britain to 4,000 in 1941.[14]
In 1941 Blackett moved from the RAE to the Navy, after first working with RAF Coastal Command, in 1941 and then early in 1942 to the Admiralty.[15] Blackett's team at Coastal Command's Operational Research Section (CC-ORS) included two future Nobel prize winners and many other people who went on to be pre-eminent in their fields.[16]
They undertook a number of crucial analyses that aided the war effort. Britain introduced the convoy system to reduce shipping losses, but while the principle of using warships to accompany merchant ships was generally accepted, it was unclear whether it was better for convoys to be small or large. Convoys travel at the speed of the slowest member, so small convoys can travel faster. It was also argued that small convoys would be harder for German U-boats to detect. On the other hand, large convoys could deploy more warships against an attacker. Blackett's staff showed that the losses suffered by convoys depended largely on the number of escort vessels present, rather than the size of the convoy. Their conclusion was that a few large convoys are more defensible than many small ones.[17]
While performing an analysis of the methods used by RAF Coastal Command to hunt and destroy submarines, one
of the analysts asked what colour the aircraft were. As most of them were from Bomber Command they were painted

34

CHAPTER 7. OPERATIONS RESEARCH

A Liberator in standard RAF green/dark earth/black night bomber finish as originally used by Coastal Command

black for night-time operations. At the suggestion of CC-ORS a test was run to see if that was the best colour to camouflage the aircraft for daytime operations in the grey North Atlantic skies. Tests showed that aircraft painted white were on average not spotted until they were 20% closer than those painted black. This change indicated that 30% more submarines would be attacked and sunk for the same number of sightings.[18] As a result of these findings Coastal Command changed their aircraft to using white undersurfaces.

A Warwick in the revised RAF Coastal Command green/dark grey/white finish


Other work by the CC-ORS indicated that on average if the trigger depth of aerial-delivered depth charges (DCs) were changed from 100 feet to 25 feet, the kill ratios would go up. The reason was that if a U-boat saw an aircraft only shortly before it arrived over the target then at 100 feet the charges would do no damage (because the U-boat wouldn't have had time to descend as far as 100 feet), and if it saw the aircraft a long way from the target it had time to alter course under water so the chances of it being within the 20-foot kill zone of the charges was small. It was more efficient to attack those submarines close to the surface, when the targets' locations were better known, than to attempt their destruction at greater depths when their positions could only be guessed. Before the change of settings from 100 feet to 25 feet, 1% of submerged U-boats were sunk and 14% damaged. After the change, 7% were sunk and 11% damaged. (If submarines were caught on the surface, even if attacked shortly after submerging, the numbers rose to 11% sunk and 15% damaged.) Blackett observed "there can be few cases where such a great operational gain had been obtained by such a small and simple change of tactics".[19]
Bomber Command's Operational Research Section (BC-ORS) analysed a report of a survey carried out by RAF Bomber Command. For the survey, Bomber Command inspected all bombers returning from bombing raids over Germany over a particular period. All damage inflicted by German air defences was noted and the recommendation was given that armour be added in the most heavily damaged areas. This recommendation was not adopted because the fact that the aircraft returned with these areas damaged indicated these areas were not vital, and adding armour to non-vital areas where damage is acceptable negatively affects aircraft performance. Their suggestion to remove some of the crew so that an aircraft loss would result in fewer personnel losses was also rejected by RAF command. Blackett's team made the logical recommendation that the armour be placed in the areas which were completely untouched by damage in the bombers which returned. They reasoned that the survey was biased, since it only included aircraft that returned to Britain. The untouched areas of returning aircraft were probably vital areas, which, if hit, would result in the loss of the aircraft.[20]
When Germany organised its air defences into the Kammhuber Line, it was realised by the British that if the RAF bombers were to fly in a bomber stream they could overwhelm the night fighters who flew in individual cells directed to their targets by ground controllers. It was then a matter of calculating the statistical loss from collisions against the statistical loss from night fighters to calculate how close the bombers should fly to minimise RAF losses.[21]
The "exchange rate" ratio of output to input was a characteristic feature of operational research. By comparing the number of flying hours put in by Allied aircraft to the number of U-boat sightings in a given area, it was possible to redistribute aircraft to more productive patrol areas. Comparison of exchange rates established "effectiveness ratios" useful in planning. The ratio of 60 mines laid per ship sunk was common to several campaigns: German mines in British ports, British mines on German routes, and United States mines in Japanese routes.[22]
Operational research doubled the on-target bomb rate of B-29s bombing Japan from the Marianas Islands by increasing the training ratio from 4 to 10 percent of flying hours; revealed that wolf-packs of three United States submarines were the most effective number to enable all members of the pack to engage targets discovered on their individual patrol stations; and revealed that glossy enamel paint was more effective camouflage for night fighters than traditional dull camouflage paint finish, the smooth paint finish also increasing airspeed by reducing skin friction.[22]
On land, the operational research sections of the Army Operational Research Group (AORG) of the Ministry of Supply (MoS) were landed in Normandy in 1944, and they followed British forces in the advance across Europe. They analysed, among other topics, the effectiveness of artillery, aerial bombing and anti-tank shooting.

7.2.3 After World War II
With expanded techniques and growing awareness of the field at the close of the war, operational research was no longer limited to only operational matters, but was extended to encompass equipment procurement, training, logistics and infrastructure. Operations research also grew in many areas other than the military once scientists learned to apply its principles to the civilian sector. With the development of the simplex algorithm for linear programming in 1947[23] and the development of computers over the next three decades, operations research can now solve problems with hundreds of thousands of variables and constraints. Moreover, the large volumes of data required for such problems can be stored and manipulated very efficiently.[23]
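As a minimal sketch of the linear programming mentioned above (Python with scipy; the product-mix numbers are a standard textbook-style toy example, not drawn from this text, and scipy's solver is one modern descendant of the simplex approach):

    from scipy.optimize import linprog

    # Toy product-mix problem: maximise profit 3x + 5y subject to
    # resource limits. linprog minimises, so the objective is negated.
    c = [-3, -5]
    A_ub = [[1, 0],   # x <= 4
            [0, 2],   # 2y <= 12
            [3, 2]]   # 3x + 2y <= 18
    b_ub = [4, 12, 18]

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
    print(res.x, -res.fun)  # optimal plan (x=2, y=6) with profit 36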

7.3 Problems addressed


Critical path analysis or project planning: identifying those processes in a complex project which affect the overall duration of the project


Map of Kammhuber Line

Floorplanning: designing the layout of equipment in a factory or components on a computer chip to reduce
manufacturing time (therefore reducing cost)
Network optimization: for instance, setup of telecommunications networks to maintain quality of service during outages


Allocation problems
Facility location
Assignment Problems:
Assignment problem
Generalized assignment problem
Quadratic assignment problem
Weapon target assignment problem
Bayesian search theory: looking for a target
Optimal search
Routing, such as determining the routes of buses so that as few buses are needed as possible
Supply chain management: managing the flow of raw materials and products based on uncertain demand for the finished products
Efficient messaging and customer response tactics
Automation: automating or integrating robotic systems in human-driven operations processes
Globalization: globalizing operations processes in order to take advantage of cheaper materials, labor, land or
other productivity inputs
Transportation: managing freight transportation and delivery systems (Examples: LTL shipping, intermodal
freight transport, travelling salesman problem)
Scheduling:
Personnel staffing
Manufacturing steps
Project tasks
Network data traffic: these are known as queueing models or queueing systems.
Sports events and their television coverage
Blending of raw materials in oil refineries
Determining optimal prices, in many retail and B2B settings, within the disciplines of pricing science
Operational research is also used extensively in government where evidence-based policy is used.

7.4 Management science


Main article: Management science
In 1967 Stafford Beer characterized the field of management science as "the business use of operations research".[24] However, in modern times the term management science may also be used to refer to the separate fields of organizational studies or corporate strategy. Like operational research itself, management science (MS) is an interdisciplinary branch of applied mathematics devoted to optimal decision planning, with strong links with economics, business, engineering, and other sciences. It uses various scientific research-based principles, strategies, and analytical methods including mathematical modeling, statistics and numerical algorithms to improve an organization's ability to enact rational and meaningful management decisions by arriving at optimal or near-optimal solutions to complex decision problems. In short, management sciences help businesses to achieve their goals using the scientific methods of operational research.


The management scientist's mandate is to use rational, systematic, science-based techniques to inform and improve decisions of all kinds. Of course, the techniques of management science are not restricted to business applications but may be applied to military, medical, public administration, charitable groups, political groups or community groups.
Management science is concerned with developing and applying models and concepts that may prove useful in helping to illuminate management issues and solve managerial problems, as well as designing and developing new and better models of organizational excellence.[25]
The application of these models within the corporate sector became known as management science.[26]

7.4.1 Related fields
Some of the fields that have considerable overlap with operations research and management science include:

7.4.2 Applications
Applications of management science are abundant, in areas such as airlines, manufacturing companies, service organizations, military branches, and government. The range of problems and issues to which management science has contributed insights and solutions is vast. It includes:[25]
Scheduling (airlines, trains, buses, etc.)
Assignment (assigning crew to flights, trains or buses; employees to projects)
Facility location (deciding the most appropriate location for new facilities such as a warehouse, factory or fire station)
Network flows (managing the flow of water from reservoirs)
Health service (information and supply chain management for health services)
Game theory (identifying, understanding and developing the strategies adopted by companies)
Management science is also concerned with so-called soft operational analysis: methods for strategic
planning, strategic decision support, and problem structuring. In dealing with these sorts of challenges, mathematical modeling and simulation alone are not appropriate or will not suffice. Therefore, during the past 30 years a number
of non-quantified modeling methods have been developed. These include:
stakeholder based approaches including metagame analysis and drama theory
morphological analysis and various forms of influence diagrams
approaches using cognitive mapping
the strategic choice approach
robustness analysis

7.5 Societies and journals


Societies
The International Federation of Operational Research Societies (IFORS)[27] is an umbrella organization for operational research societies worldwide, representing approximately 50 national societies, including those in the US,[28]
UK,[29] France,[30] Germany, Canada,[31] Australia,[32] New Zealand,[33] the Philippines,[34] India,[35] Japan and South
Africa.[36] The constituent members of IFORS form regional groups, such as that in Europe.[37] Other important operational research organizations are the Simulation Interoperability Standards Organization (SISO)[38] and the Interservice/Industry
Training, Simulation and Education Conference (I/ITSEC).[39]


In 2004 the US-based organization INFORMS began an initiative to market the OR profession better, including a
website entitled "The Science of Better",[40] which provides an introduction to OR and examples of successful applications
of OR to industrial problems. This initiative has been adopted by the Operational Research Society in the UK,
including a website entitled "Learn about OR".[41]
Journals
The Institute for Operations Research and the Management Sciences (INFORMS) publishes thirteen scholarly journals about operations research, including the top two journals in their class, according to 2005 Journal Citation
Reports.[42] They are:
Decision Analysis[43]
Information Systems Research[44]
INFORMS Journal on Computing[45]
INFORMS Transactions on Education[46] (an open access journal)
Interfaces[47]
Management Science: A Journal of the Institute for Operations Research and the Management Sciences
Manufacturing & Service Operations Management
Marketing Science
Mathematics of Operations Research
Operations Research: A Journal of the Institute for Operations Research and the Management Sciences
Organization Science[48]
Service Science[49]
Transportation Science
Other journals
4OR: A Quarterly Journal of Operations Research: jointly published by the Belgian, French and Italian Operations
Research Societies (Springer);
Decision Sciences published by Wiley-Blackwell on behalf of the Decision Sciences Institute
European Journal of Operational Research (EJOR): founded in 1975, it is presently by far the largest operational research journal in the world, publishing around 9,000 pages of papers per year. In 2004 its total
number of citations was the second largest among Operational Research and Management Science journals;
INFOR Journal: published and sponsored by the Canadian Operational Research Society;
International Journal of Operations Research and Information Systems (IJORIS): an official publication of the
Information Resources Management Association, published quarterly by IGI Global;[50]
Journal of Defense Modeling and Simulation (JDMS): Applications, Methodology, Technology: a quarterly journal devoted to advancing the science of modeling and simulation as it relates to the military and defense.[51]
Journal of the Operational Research Society (JORS): an official journal of The OR Society; this is the oldest
continuously published journal of OR in the world, published by Palgrave;[52]
Journal of Simulation (JOS): an official journal of The OR Society, published by Palgrave;[52]
Mathematical Methods of Operations Research (MMOR): the journal of the German and Dutch OR Societies,
published by Springer;[53]
Military Operations Research (MOR): published by the Military Operations Research Society;



Operations Research Letters;
Opsearch: official journal of the Operational Research Society of India;
OR Insight: a quarterly journal of The OR Society, published by Palgrave;[52]
Production and Operations Management, the official journal of the Production and Operations Management
Society
TOP: the official journal of the Spanish Society of Statistics and Operations Research.[54]

7.6 See also


7.7 References
[1] About Operations Research. INFORMS.org. Retrieved 7 January 2012.
[2] Mathematics Subject Classification. American Mathematical Society. 23 May 2011. Retrieved 7 January 2012.
[3] Wetherbe, James C. (1979), Systems analysis for computer-based information systems, West series in data processing and
information systems, West Pub. Co., ISBN 9780829902280, A systems analyst who contributes in the area of DSS must
be skilled in such areas as management science (synonymous with decision science and operations research), modeling,
simulation, and advanced statistics.
[4] What is OR. HSOR.org. Retrieved 13 November 2011.
[5] Operations Research Analysts. Bls.gov. Retrieved 27 January 2012.
[6] OR / Pubs / IOL Home. INFORMS.org. 2 January 2009. Archived from the original on 27 May 2009. Retrieved 13
November 2011.
[7] M.S. Sodhi, "What about the 'O' in O.R.?", OR/MS Today, December 2007, p. 12, http://www.lionhrtpub.com/orms/orms-12-07/frqed.html
[8] P. W. Bridgman, The Logic of Modern Physics, The MacMillan Company, New York, 1927
[9] operations research (industrial engineering) :: History Britannica Online Encyclopedia. Britannica.com. Retrieved 13
November 2011.
[10] Operational Research in the British Army 1939–1945, October 1947, Report C67/3/4/48, UK National Archives file
WO291/1301
Quoted on the dust-jacket of: Morse, Philip M, and Kimball, George E, Methods of Operations Research, 1st Edition
Revised, pub MIT Press & J Wiley, 5th printing, 1954.
[11] The UK National Archives Catalogue for WO291 lists a War Office organisation called the Army Operational Research Group
(AORG) that existed from 1946 to 1962. In January 1962 the name was changed to Army Operational Research Establishment (AORE). Following the creation of a unified Ministry of Defence, a tri-service operational research organisation
was established: the Defence Operational Research Establishment (DOAE), formed in 1965, which absorbed the Army
Operational Research Establishment based at West Byfleet.
[12] http://brochure.unisa.ac.za/myunisa/data/subjects/Quantitative%20Management.pdf
[13] Kirby, p. 117 Archived 27 August 2013 at the Wayback Machine.
[14] Kirby, pp. 91–94 Archived 27 August 2013 at the Wayback Machine.
[15] Kirby, pp. 96, 109 Archived 2 October 2013 at the Wayback Machine.
[16] Kirby, p. 96 Archived 27 March 2014 at the Wayback Machine.
[17] "Numbers are Essential": Victory in the North Atlantic Reconsidered, March–May 1943. Familyheritage.ca. 24 May
1943. Retrieved 13 November 2011.
[18] Kirby, p. 101
[19] Kirby, pp. 102–103


[20] James F. Dunnigan (1999). Dirty Little Secrets of the Twentieth Century. Harper Paperbacks. pp. 215–217.
[21] RAF History Bomber Command 60th Anniversary. Raf.mod.uk. Retrieved 13 November 2011.
[22] Milkman, Raymond H. (May 1968). Operations Research in World War II. United States Naval Institute Proceedings.
[23] 1.2 A HISTORICAL PERSPECTIVE. PRINCIPLES AND APPLICATIONS OF OPERATIONS RESEARCH.
[24] Stafford Beer (1967) Management Science: The Business Use of Operations Research
[25] What is Management Science? Lancaster University, 2008. Retrieved 5 June 2008.
[26] What is Management Science? The University of Tennessee, 2006. Retrieved 5 June 2008.
[27] IFORS. IFORS. Retrieved 13 November 2011.
[28] Leszczynski, Mary (8 November 2011). Informs. Informs. Retrieved 13 November 2011.
[29] The OR Society. Orsoc.org.uk. Retrieved 13 November 2011.
[30] Société française de Recherche Opérationnelle et d'Aide à la Décision. ROADEF. Retrieved 13 November 2011.
[31] www.cors.ca. CORS. Cors.ca. Retrieved 13 November 2011.
[32] ASOR. ASOR. 1 January 1972. Retrieved 13 November 2011.
[33] ORSNZ. ORSNZ. Retrieved 13 November 2011.
[34] ORSP. ORSP. Retrieved 13 November 2011.
[35] ORSI. Orsi.in. Retrieved 13 November 2011.
[36] ORSSA. ORSSA. 23 September 2011. Retrieved 13 November 2011.
[37] EURO (EURO). Euro-online.org. Retrieved 13 November 2011.
[38] SISO. Sisostds.org. Retrieved 13 November 2011.
[39] I/Itsec. I/Itsec. Retrieved 13 November 2011.
[40] The Science of Better. The Science of Better. Retrieved 13 November 2011.
[41] Learn about OR. Learn about OR. Retrieved 13 November 2011.
[42] INFORMS Journals. Informs.org. Retrieved 13 November 2011.
[43] Decision Analysis. Informs.org. Retrieved 19 March 2015.
[44] Information Systems Research. Informs.org. Retrieved 19 March 2015.
[45] INFORMS Journal on Computing. Informs.org. Retrieved 19 March 2015.
[46] INFORMS Transactions on Education. Informs.org. Retrieved 19 March 2015.
[47] Interfaces. Informs.org. Retrieved 19 March 2015.
[48] Organization Science. Informs.org. Retrieved 19 March 2015.
[49] Service Science. Informs.org. Retrieved 19 March 2015.
[50] International Journal of Operations Research and Information Systems (IJORIS) (19479328)(19479336): John Wang:
Journals. IGI Global. Retrieved 13 November 2011.
[51] The Society for Modeling & Simulation International. JDMS. Scs.org. Retrieved 13 November 2011.
[52] The OR Society;
[53] Mathematical Methods of Operations Research website. Springer.com. Retrieved 13 November 2011.
[54] TOP. Springer.com. Retrieved 13 November 2011.


7.8 Further reading


7.8.1 Classic books and articles

R. E. Bellman, Dynamic Programming, Princeton University Press, Princeton, 1957


Abraham Charnes, William W. Cooper, Management Models and Industrial Applications of Linear Programming, Volumes I and II, New York, John Wiley & Sons, 1961
Abraham Charnes, William W. Cooper, A. Henderson, An Introduction to Linear Programming, New York,
John Wiley & Sons, 1953
C. West Churchman, Russell L. Ackoff & E. L. Arnoff, Introduction to Operations Research, New York: J.
Wiley and Sons, 1957
George B. Dantzig, Linear Programming and Extensions, Princeton, Princeton University Press, 1963
Lester K. Ford, Jr., D. Ray Fulkerson, Flows in Networks, Princeton, Princeton University Press, 1962
Jay W. Forrester, Industrial Dynamics, Cambridge, MIT Press, 1961
L. V. Kantorovich, Mathematical Methods of Organizing and Planning Production, Management Science, 4,
1960, 266–422
Ralph Keeney, Howard Raiffa, Decisions with Multiple Objectives: Preferences and Value Tradeoffs, New York,
John Wiley & Sons, 1976
H. W. Kuhn, The Hungarian Method for the Assignment Problem, Naval Research Logistics Quarterly, 1–2,
1955, 83–97
H. W. Kuhn, A. W. Tucker, Nonlinear Programming, pp. 481–492 in Proceedings of the Second Berkeley
Symposium on Mathematical Statistics and Probability
B. O. Koopman, Search and Screening: General Principles and Historical Applications, New York, Pergamon
Press, 1980
Tjalling C. Koopmans, editor, Activity Analysis of Production and Allocation, New York, John Wiley & Sons,
1951
Charles C. Holt, Franco Modigliani, John F. Muth, Herbert A. Simon, Planning Production, Inventories, and
Work Force, Englewood Cliffs, NJ, Prentice-Hall, 1960
Philip M. Morse, George E. Kimball, Methods of Operations Research, New York, MIT Press and John Wiley
& Sons, 1951
Robert O. Schlaifer, Howard Raiffa, Applied Statistical Decision Theory, Cambridge, Division of Research,
Harvard Business School, 1961

7.8.2 Classic textbooks

Frederick S. Hillier & Gerald J. Lieberman, Introduction to Operations Research, McGraw-Hill: Boston MA;
10th Edition, 2014
Harvey M. Wagner, Principles of Operations Research, Englewood Cliffs, Prentice-Hall, 1969

7.8.3 History

Saul I. Gass, Arjang A. Assad, An Annotated Timeline of Operations Research: An Informal History. New
York, Kluwer Academic Publishers, 2005.
Saul I. Gass (Editor), Arjang A. Assad (Editor), Profiles in Operations Research: Pioneers and Innovators.
Springer, 2011


Maurice W. Kirby (Operational Research Society (Great Britain)). Operational Research in War and Peace:
The British Experience from the 1930s to 1970, Imperial College Press, 2003. ISBN 1-86094-366-7, ISBN
978-1-86094-366-9
J. K. Lenstra, A. H. G. Rinnooy Kan, A. Schrijver (editors) History of Mathematical Programming: A Collection
of Personal Reminiscences, North-Holland, 1991
Charles W. McArthur, Operations Analysis in the U.S. Army Eighth Air Force in World War II, History of
Mathematics, Vol. 4, Providence, American Mathematical Society, 1990
C. H. Waddington, O. R. in World War 2: Operational Research Against the U-boat, London, Elek Science,
1973.

7.9 External links


What is Operations Research?
International Federation of Operational Research Societies
The Institute for Operations Research and the Management Sciences (INFORMS)
Occupational Outlook Handbook, U.S. Department of Labor Bureau of Labor Statistics

Chapter 8

Machine learning
For the journal, see Machine Learning (journal).
Machine learning is a subfield of computer science[1] that evolved from the study of pattern recognition and computational
learning theory in artificial intelligence.[1] In 1959, Arthur Samuel defined machine learning as a "field of study that
gives computers the ability to learn without being explicitly programmed".[2] Machine learning explores the study and
construction of algorithms that can learn from and make predictions on data.[3] Such algorithms operate by building
a model from example inputs in order to make data-driven predictions or decisions,[4]:2 rather than following strictly
static program instructions.
Machine learning is closely related to (and often overlaps with) computational statistics, a discipline that also focuses
on prediction-making through the use of computers. It has strong ties to mathematical optimization, which delivers
methods, theory and application domains to the field. Machine learning is employed in a range of computing tasks
where designing and programming explicit algorithms is infeasible. Example applications include spam filtering,
optical character recognition (OCR),[5] search engines and computer vision. Machine learning is sometimes conflated
with data mining,[6] although the latter subfield focuses more on exploratory data analysis and is known as unsupervised
learning.[4]:vii[7]
Within the field of data analytics, machine learning is a method used to devise complex models and algorithms that
lend themselves to prediction; in commercial use, this is known as predictive analytics. These analytical models
allow researchers, data scientists, engineers, and analysts to produce reliable, repeatable decisions and results and to
uncover hidden insights by learning from historical relationships and trends in the data.[8]

8.1 Overview
Tom M. Mitchell provided a widely quoted, more formal definition: "A computer program is said to learn from
experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T,
as measured by P, improves with experience E."[9] This definition is notable for defining machine learning in
fundamentally operational rather than cognitive terms, thus following Alan Turing's proposal in his paper "Computing
Machinery and Intelligence" that the question "Can machines think?" be replaced with the question "Can machines
do what we (as thinking entities) can do?"[10]

8.1.1 Types of problems and tasks

Machine learning tasks are typically classified into three broad categories, depending on the nature of the learning
signal or feedback available to a learning system. These are:[11]
Supervised learning: The computer is presented with example inputs and their desired outputs, given by a
teacher, and the goal is to learn a general rule that maps inputs to outputs.
Unsupervised learning: No labels are given to the learning algorithm, leaving it on its own to find structure in
its input. Unsupervised learning can be a goal in itself (discovering hidden patterns in data) or a means towards
an end (feature learning).


Reinforcement learning: A computer program interacts with a dynamic environment in which it must achieve
a certain goal (such as driving a vehicle), without a teacher explicitly telling it whether it has come close to its
goal. Another example is learning to play a game by playing against an opponent.[4]:3
Between supervised and unsupervised learning is semi-supervised learning, where the teacher gives an incomplete
training signal: a training set with some (often many) of the target outputs missing. Transduction is a special case of
this principle where the entire set of problem instances is known at learning time, except that some of the targets are
missing.

A support vector machine is a classifier that divides its input space into two regions, separated by a linear boundary. Here, it has
learned to distinguish black and white circles.

Among other categories of machine learning problems, learning to learn learns its own inductive bias based on previous experience. Developmental learning, elaborated for robot learning, generates its own sequences (also called a curriculum) of learning situations to cumulatively acquire repertoires of novel skills through autonomous self-exploration
and social interaction with human teachers, using guidance mechanisms such as active learning, maturation, motor
synergies, and imitation.


Another categorization of machine learning tasks arises when one considers the desired output of a machine-learned
system:[4]:3
In classification, inputs are divided into two or more classes, and the learner must produce a model that assigns
unseen inputs to one or more (multi-label classification) of these classes. This is typically tackled in a supervised
way. Spam filtering is an example of classification, where the inputs are email (or other) messages and the
classes are "spam" and "not spam".
In regression, also a supervised problem, the outputs are continuous rather than discrete.
In clustering, a set of inputs is to be divided into groups. Unlike in classification, the groups are not known
beforehand, making this typically an unsupervised task.
Density estimation finds the distribution of inputs in some space.
Dimensionality reduction simplifies inputs by mapping them into a lower-dimensional space. Topic modeling
is a related problem, where a program is given a list of human-language documents and is tasked with finding out
which documents cover similar topics.

8.2 History and relationships to other fields


See also: Timeline of machine learning
As a scientific endeavour, machine learning grew out of the quest for artificial intelligence. Already in the early days of
AI as an academic discipline, some researchers were interested in having machines learn from data. They attempted
to approach the problem with various symbolic methods, as well as what were then termed "neural networks"; these
were mostly perceptrons and other models that were later found to be reinventions of the generalized linear models
of statistics. Probabilistic reasoning was also employed, especially in automated medical diagnosis.[11]:488
However, an increasing emphasis on the logical, knowledge-based approach caused a rift between AI and machine learning. Probabilistic systems were plagued by theoretical and practical problems of data acquisition and
representation.[11]:488 By 1980, expert systems had come to dominate AI, and statistics was out of favor.[12] Work
on symbolic/knowledge-based learning did continue within AI, leading to inductive logic programming, but the
more statistical line of research was now outside the field of AI proper, in pattern recognition and information retrieval.[11]:708–710; 755 Neural networks research had been abandoned by AI and computer science around the same
time. This line, too, was continued outside the AI/CS field, as "connectionism", by researchers from other disciplines including Hopfield, Rumelhart and Hinton. Their main success came in the mid-1980s with the reinvention of
backpropagation.[11]:25
Machine learning, reorganized as a separate field, started to flourish in the 1990s. The field changed its goal from
achieving artificial intelligence to tackling solvable problems of a practical nature. It shifted focus away from the symbolic approaches it had inherited from AI, and toward methods and models borrowed from statistics and probability
theory.[12] It also benefited from the increasing availability of digitized information, and the possibility of distributing
it via the Internet.
Machine learning and data mining often employ the same methods and overlap significantly. They can be roughly
distinguished as follows:
Machine learning focuses on prediction, based on known properties learned from the training data.
Data mining focuses on the discovery of (previously) unknown properties in the data. This is the analysis step
of Knowledge Discovery in Databases.
The two areas overlap in many ways: data mining uses many machine learning methods, but often with a slightly
different goal in mind. On the other hand, machine learning also employs data mining methods as unsupervised
learning or as a preprocessing step to improve learner accuracy. Much of the confusion between these two research
communities (which do often have separate conferences and separate journals, ECML PKDD being a major exception) comes from the basic assumptions they work with: in machine learning, performance is usually evaluated with
respect to the ability to reproduce known knowledge, while in Knowledge Discovery and Data Mining (KDD) the key
task is the discovery of previously unknown knowledge. Evaluated with respect to known knowledge, an uninformed
(unsupervised) method will easily be outperformed by supervised methods, while in a typical KDD task, supervised
methods cannot be used due to the unavailability of training data.
Machine learning also has intimate ties to optimization: many learning problems are formulated as minimization of
some loss function on a training set of examples. Loss functions express the discrepancy between the predictions of
the model being trained and the actual problem instances (for example, in classification, one wants to assign a label
to instances, and models are trained to correctly predict the pre-assigned labels of a set of examples). The difference
between the two fields arises from the goal of generalization: while optimization algorithms can minimize the loss on
a training set, machine learning is concerned with minimizing the loss on unseen samples.[13]

8.2.1 Relation to statistics

Machine learning and statistics are closely related fields. According to Michael I. Jordan, the ideas of machine
learning, from methodological principles to theoretical tools, have had a long pre-history in statistics.[14] He also
suggested the term data science as a placeholder to call the overall field.[14]
Leo Breiman distinguished two statistical modelling paradigms: the data model and the algorithmic model,[15] where
'algorithmic model' means more or less machine learning algorithms like random forests.
Some statisticians have adopted methods from machine learning, leading to a combined field that they call statistical
learning.[16]

8.3 Theory
Main article: Computational learning theory
A core objective of a learner is to generalize from its experience.[17][18] Generalization in this context is the ability
of a learning machine to perform accurately on new, unseen examples/tasks after having experienced a learning data
set. The training examples come from some generally unknown probability distribution (considered representative
of the space of occurrences) and the learner has to build a general model about this space that enables it to produce
sufficiently accurate predictions in new cases.
The computational analysis of machine learning algorithms and their performance is a branch of theoretical computer
science known as computational learning theory. Because training sets are finite and the future is uncertain, learning
theory usually does not yield guarantees of the performance of algorithms. Instead, probabilistic bounds on the
performance are quite common. The bias–variance decomposition is one way to quantify generalization error.
For the best performance in the context of generalization, the complexity of the hypothesis should match the complexity of the function underlying the data. If the hypothesis is less complex than the function, then the model has
underfit the data. If the complexity of the model is increased in response, then the training error decreases. But if
the hypothesis is too complex, then the model is subject to overfitting and generalization will be poorer.[19]
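This trade-off is easy to see in a small simulation, assuming NumPy and treating polynomial degree as the hypothesis complexity (the data are hypothetical noisy samples of a cubic):

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.linspace(-1, 1, 20)
    y = x**3 + rng.normal(scale=0.05, size=x.size)

    for degree in (1, 3, 9):
        coeffs = np.polyfit(x, y, degree)
        train_err = np.mean((np.polyval(coeffs, x) - y) ** 2)
        print(degree, round(train_err, 5))
    # Training error keeps falling as the degree grows, but degree 1 underfits
    # the cubic and degree 9 would generalize more poorly than degree 3.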
In addition to performance bounds, computational learning theorists study the time complexity and feasibility of
learning. In computational learning theory, a computation is considered feasible if it can be done in polynomial time.
There are two kinds of time complexity results. Positive results show that a certain class of functions can be learned
in polynomial time. Negative results show that certain classes cannot be learned in polynomial time.

8.4 Approaches
Main article: List of machine learning algorithms

8.4.1 Decision tree learning

Main article: Decision tree learning


Decision tree learning uses a decision tree as a predictive model, which maps observations about an item to conclusions
about the item's target value.
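A minimal sketch, assuming scikit-learn and using hypothetical observations and target values:

    from sklearn.tree import DecisionTreeClassifier

    # Hypothetical observations [age_in_years, weight_kg] and target values
    X = [[1, 2.0], [3, 5.5], [2, 2.5], [6, 7.0]]
    y = ["accept", "reject", "accept", "reject"]

    tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
    print(tree.predict([[2, 3.0]]))   # conclusion about a new item's target value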

8.4.2 Association rule learning

Main article: Association rule learning


Association rule learning is a method for discovering interesting relations between variables in large databases.
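The two standard measures, support and confidence, can be computed directly; a sketch for a candidate rule {bread} -> {butter} over hypothetical market baskets:

    baskets = [{"bread", "butter"}, {"bread"},
               {"bread", "butter", "milk"}, {"milk"}]

    n = len(baskets)
    support = sum({"bread", "butter"} <= b for b in baskets) / n     # both items together
    confidence = support / (sum("bread" in b for b in baskets) / n)  # butter given bread

    print(support, round(confidence, 2))   # 0.5 and 0.67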

8.4.3 Artificial neural networks

Main article: Artificial neural network


An artificial neural network (ANN) learning algorithm, usually called a neural network (NN), is a learning algorithm
that is inspired by the structure and functional aspects of biological neural networks. Computations are structured
in terms of an interconnected group of artificial neurons, processing information using a connectionist approach to
computation. Modern neural networks are non-linear statistical data modeling tools. They are usually used to model
complex relationships between inputs and outputs, to find patterns in data, or to capture the statistical structure of an
unknown joint probability distribution between observed variables.
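The basic building block is easy to state in code. A sketch of a single artificial neuron, with hypothetical inputs and weights: a weighted sum passed through a non-linear (sigmoid) activation:

    import math

    def neuron(inputs, weights, bias):
        z = sum(w * x for w, x in zip(weights, inputs)) + bias
        return 1 / (1 + math.exp(-z))        # sigmoid activation

    print(neuron([0.5, -1.0], [2.0, 1.0], 0.1))   # activation for one input vector

A network connects many such units in layers and adjusts their weights during learning.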

8.4.4 Deep learning

Main article: Deep learning


Falling hardware prices and the development of GPUs for personal use in the last few years have contributed to the
development of deep learning, which consists of multiple hidden layers in an artificial neural network.
This approach tries to model the way the human brain processes light and sound into vision and hearing. Some
successful applications of deep learning are computer vision and speech recognition.[20]

8.4.5 Inductive logic programming

Main article: Inductive logic programming


Inductive logic programming (ILP) is an approach to rule learning using logic programming as a uniform representation for input examples, background knowledge, and hypotheses. Given an encoding of the known background
knowledge and a set of examples represented as a logical database of facts, an ILP system will derive a hypothesized
logic program that entails all positive and no negative examples. Inductive programming is a related field that considers any kind of programming language for representing hypotheses (not only logic programming), such as
functional programs.

8.4.6 Support vector machines

Main article: Support vector machines


Support vector machines (SVMs) are a set of related supervised learning methods used for classification and regression.
Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds
a model that predicts whether a new example falls into one category or the other.
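A minimal sketch, assuming scikit-learn and hypothetical two-category training examples:

    from sklearn.svm import SVC

    X = [[0, 0], [0, 1], [1, 0], [1, 1]]   # training examples
    y = [0, 0, 1, 1]                        # each marked with one of two categories

    model = SVC(kernel="linear").fit(X, y)
    print(model.predict([[0.9, 0.2]]))      # which category a new example falls into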

8.4.7 Clustering

Main article: Cluster analysis


Cluster analysis is the assignment of a set of observations into subsets (called clusters) so that observations within the
same cluster are similar according to some predesignated criterion or criteria, while observations drawn from different
clusters are dissimilar. Different clustering techniques make different assumptions about the structure of the data, often
defined by some similarity metric and evaluated, for example, by internal compactness (similarity between members of
the same cluster) and separation between different clusters. Other methods are based on estimated density and graph
connectivity. Clustering is a method of unsupervised learning, and a common technique for statistical data analysis.

8.4.8 Bayesian networks

Main article: Bayesian network


A Bayesian network, belief network or directed acyclic graphical model is a probabilistic graphical model that represents a set of random variables and their conditional independencies via a directed acyclic graph (DAG). For example,
a Bayesian network could represent the probabilistic relationships between diseases and symptoms. Given symptoms,
the network can be used to compute the probabilities of the presence of various diseases. Efficient algorithms exist
that perform inference and learning.
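For a single disease-symptom pair, the computation reduces to Bayes' rule. A sketch with hypothetical probabilities:

    # P(disease), P(symptom | disease), P(symptom | no disease) -- all hypothetical
    p_d, p_s_d, p_s_nd = 0.01, 0.9, 0.05

    p_s = p_s_d * p_d + p_s_nd * (1 - p_d)     # total probability of the symptom
    p_d_s = p_s_d * p_d / p_s                  # probability of disease given symptom

    print(round(p_d_s, 3))                     # about 0.154

General Bayesian networks extend this computation to many variables linked by a DAG.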

8.4.9 Reinforcement learning

Main article: Reinforcement learning


Reinforcement learning is concerned with how an agent ought to take actions in an environment so as to maximize
some notion of long-term reward. Reinforcement learning algorithms attempt to find a policy that maps states of
the world to the actions the agent ought to take in those states. Reinforcement learning differs from the supervised
learning problem in that correct input/output pairs are never presented, nor are sub-optimal actions explicitly corrected.
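A toy sketch of tabular Q-learning, one common reinforcement learning algorithm, on a hypothetical two-state, two-action environment in which taking action a moves the agent to state a and only action 1 taken in state 1 is rewarded:

    import random

    reward = [[0, 0], [0, 1]]                  # reward[state][action], hypothetical
    Q = [[0.0, 0.0], [0.0, 0.0]]
    alpha, gamma = 0.5, 0.9

    state = 0
    for _ in range(500):
        action = random.randrange(2)           # explore by acting randomly
        next_state = action                    # the action chooses the next state
        r = reward[state][action]
        Q[state][action] += alpha * (r + gamma * max(Q[next_state]) - Q[state][action])
        state = next_state

    print([q.index(max(q)) for q in Q])        # greedy policy: prefers action 1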

8.4.10 Representation learning

Main article: Representation learning


Several learning algorithms, mostly unsupervised learning algorithms, aim at discovering better representations of
the inputs provided during training. Classical examples include principal components analysis and cluster analysis.
Representation learning algorithms often attempt to preserve the information in their input but transform it in a
way that makes it useful, often as a pre-processing step before performing classification or prediction. This allows
reconstruction of the inputs coming from the unknown data-generating distribution, while not being necessarily faithful to
configurations that are implausible under that distribution.

Manifold learning algorithms attempt to do so under the constraint that the learned representation is low-dimensional.
Sparse coding algorithms attempt to do so under the constraint that the learned representation is sparse (has many
zeros). Multilinear subspace learning algorithms aim to learn low-dimensional representations directly from tensor
representations of multidimensional data, without reshaping them into (high-dimensional) vectors.[21] Deep learning
algorithms discover multiple levels of representation, or a hierarchy of features, with higher-level, more abstract
features defined in terms of (or generating) lower-level features. It has been argued that an intelligent machine is one
that learns a representation that disentangles the underlying factors of variation that explain the observed data.[22]
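A sketch of one of the classical examples named above, principal components analysis, which maps inputs to a lower-dimensional representation preserving most of their variance (scikit-learn assumed, synthetic data):

    import numpy as np
    from sklearn.decomposition import PCA

    X = np.random.default_rng(0).normal(size=(100, 5))   # synthetic 5-D inputs
    Z = PCA(n_components=2).fit_transform(X)             # learned 2-D representation

    print(Z.shape)   # (100, 2)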

8.4.11 Similarity and metric learning

Main article: Similarity learning


In this problem, the learning machine is given pairs of examples that are considered similar and pairs of less similar
objects. It then needs to learn a similarity function (or a distance metric function) that can predict whether new objects are
similar. It is sometimes used in recommendation systems.


8.4.12 Sparse dictionary learning

Main article: Sparse dictionary learning


In this method, a datum is represented as a linear combination of basis functions, and the coefficients are assumed to
be sparse. Let x be a d-dimensional datum and D a d × n matrix, where each column of D represents a basis function;
r is the coefficient vector representing x using D. Mathematically, sparse dictionary learning means solving x ≈ Dr where r
is sparse. Generally speaking, n is assumed to be larger than d to allow the freedom for a sparse representation.

Learning a dictionary along with sparse representations is strongly NP-hard and also difficult to solve approximately.[23]
A popular heuristic method for sparse dictionary learning is K-SVD.

Sparse dictionary learning has been applied in several contexts. In classification, the problem is to determine which
class a previously unseen datum belongs to. Suppose a dictionary for each class has already been built. Then a
new datum is associated with the class whose corresponding dictionary best sparsely represents it. Sparse
dictionary learning has also been applied in image de-noising. The key idea is that a clean image patch can be sparsely
represented by an image dictionary, but the noise cannot.[24]
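A sketch of learning such a dictionary and sparse coefficients, assuming scikit-learn (whose DictionaryLearning is a generic solver rather than K-SVD) and synthetic data, with n = 12 atoms for d = 8 dimensions:

    import numpy as np
    from sklearn.decomposition import DictionaryLearning

    X = np.random.default_rng(0).normal(size=(50, 8))    # synthetic data, d = 8
    dl = DictionaryLearning(n_components=12,             # n > d, as described above
                            transform_algorithm="lasso_lars", random_state=0)
    R = dl.fit(X).transform(X)                           # sparse coefficients r per datum

    print(R.shape, round((R == 0).mean(), 2))            # many coefficients are exactly zero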

8.4.13 Genetic algorithms

Main article: Genetic algorithm


A genetic algorithm (GA) is a search heuristic that mimics the process of natural selection, using methods such
as mutation and crossover to generate new genotypes in the hope of finding good solutions to a given problem. In
machine learning, genetic algorithms found some uses in the 1980s and 1990s.[25][26] Vice versa, machine learning
techniques have been used to improve the performance of genetic and evolutionary algorithms.[27]
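A toy sketch of the idea, mutation-only for brevity (no crossover): evolving bitstrings toward the all-ones optimum by repeated selection and mutation:

    import random

    def fitness(genotype):
        return sum(genotype)                      # count of 1-bits; 10 is optimal

    population = [[random.randint(0, 1) for _ in range(10)] for _ in range(20)]
    for _ in range(40):
        population.sort(key=fitness, reverse=True)
        parents = population[:10]                 # selection of the fittest half
        children = []
        for p in parents:
            child = p[:]
            child[random.randrange(10)] ^= 1      # point mutation
            children.append(child)
        population = parents + children

    print(fitness(max(population, key=fitness)))  # approaches 10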

8.5 Applications
Applications for machine learning include:
Adaptive websites
Affective computing
Bioinformatics
Brain-machine interfaces
Cheminformatics
Classifying DNA sequences
Computational anatomy
Computer vision, including object recognition
Detecting credit card fraud
Game playing
Information retrieval
Internet fraud detection
Marketing
Machine perception
Medical diagnosis
Natural language processing


Natural language understanding


Optimization and metaheuristic
Online advertising
Recommender systems
Robot locomotion
Search engines
Sentiment analysis (or opinion mining)
Sequence mining
Software engineering
Speech and handwriting recognition
Stock market analysis
Structural health monitoring
Syntactic pattern recognition
Economics
In 2006, the online movie company Netflix held the first "Netflix Prize" competition to find a program to better
predict user preferences and improve the accuracy of its existing Cinematch movie recommendation algorithm by
at least 10%. A joint team made up of researchers from AT&T Labs-Research in collaboration with the teams Big
Chaos and Pragmatic Theory built an ensemble model to win the Grand Prize in 2009 for $1 million.[28] Shortly after
the prize was awarded, Netflix realized that viewers' ratings were not the best indicators of their viewing patterns
("everything is a recommendation") and they changed their recommendation engine accordingly.[29]
In 2010 The Wall Street Journal wrote about the money management firm Rebellion Research's use of machine learning
to predict economic movements. The article describes Rebellion Research's prediction of the financial crisis and
economic recovery.[30]
In 2014 it was reported that a machine learning algorithm had been applied in art history to study fine art
paintings, and that it may have revealed previously unrecognized influences between artists.[31]

8.6 Ethics
Machine learning poses a host of ethical questions. Systems that are trained on datasets collected with biases may
exhibit these biases upon use, thus digitizing cultural prejudices such as institutional racism and classism.[32] Responsible collection of data is thus a critical part of machine learning. See Machine ethics for additional information.

8.7 Software
Software suites containing a variety of machine learning algorithms include the following:

8.7.1 Free and open-source software

dlib
ELKI
Encog
GNU Octave

52

CHAPTER 8. MACHINE LEARNING


H2O
Mahout
Mallet (software project)
mlpy
MLPACK
MOA (Massive Online Analysis)
ND4J with Deeplearning4j
NuPIC
OpenAI
OpenCV
OpenNN
Orange
R
scikit-learn
scikit-image
Shogun
TensorFlow
Torch (machine learning)
Spark
Yooreeka
Weka

8.7.2 Proprietary software with free and open-source editions

KNIME
RapidMiner

8.7.3 Proprietary software

Amazon Machine Learning


Angoss KnowledgeSTUDIO
Ayasdi
Databricks
Google Prediction API
IBM SPSS Modeler
KXEN Modeler
LIONsolver
Mathematica

MATLAB
Microsoft Azure Machine Learning
Neural Designer
NeuroSolutions
Oracle Data Mining
RCASE
SAS Enterprise Miner
Splunk
STATISTICA Data Miner

8.8 Journals
Journal of Machine Learning Research
Machine Learning
Neural Computation

8.9 Conferences
Conference on Neural Information Processing Systems
International Conference on Machine Learning

8.10 See also


Adaptive control
Adversarial machine learning
Automatic reasoning
Bayesian structural time series
Big data
Cache language model
Cognitive model
Cognitive science
Computational intelligence
Computational neuroscience
Data science
Ethics of artificial intelligence
Existential risk from advanced artificial intelligence
Explanation-based learning
Glossary of artificial intelligence



Important publications in machine learning
List of machine learning algorithms
List of datasets for machine learning research
Similarity learning
Soft computing
Spike-and-slab variable selection

8.11 References
[1] http://www.britannica.com/EBchecked/topic/1116194/machine-learning This tertiary source reuses information from other
sources but does not name them.
[2] Phil Simon (March 18, 2013). Too Big to Ignore: The Business Case for Big Data. Wiley. p. 89. ISBN 978-1-118-63817-0.
[3] Ron Kohavi; Foster Provost (1998). Glossary of terms. Machine Learning. 30: 271–274.
[4] Machine learning and pattern recognition can be viewed as two facets of the same eld.
[5] Wernick, Yang, Brankov, Yourganov and Strother, Machine Learning in Medical Imaging, IEEE Signal Processing Magazine, vol. 27, no. 4, July 2010, pp. 25–38
[6] Mannila, Heikki (1996). Data mining: machine learning, statistics, and databases. Int'l Conf. Scientific and Statistical
Database Management. IEEE Computer Society.
[7] Friedman, Jerome H. (1998). Data Mining and Statistics: What's the connection?. Computing Science and Statistics. 29
(1): 3–9.
[8] Machine Learning: What it is and why it matters. www.sas.com. Retrieved 2016-03-29.
[9] Mitchell, T. (1997). Machine Learning, McGraw Hill. ISBN 0-07-042807-7, p.2.
[10] Harnad, Stevan (2008), The Annotation Game: On Turing (1950) on Computing, Machinery, and Intelligence, in Epstein,
Robert; Peters, Grace, The Turing Test Sourcebook: Philosophical and Methodological Issues in the Quest for the Thinking
Computer, Kluwer
[11] Russell, Stuart; Norvig, Peter (2003) [1995]. Artificial Intelligence: A Modern Approach (2nd ed.). Prentice Hall. ISBN
978-0137903955.
[12] Langley, Pat (2011). The changing science of machine learning. Machine Learning. 82 (3): 275–279. doi:10.1007/s10994-011-5242-y.
[13] Le Roux, Nicolas; Bengio, Yoshua; Fitzgibbon, Andrew (2012). Improving First and Second-Order Methods by Modeling
Uncertainty. In Sra, Suvrit; Nowozin, Sebastian; Wright, Stephen J. Optimization for Machine Learning. MIT Press. p.
404.
[14] MI Jordan (2014-09-10). statistics and machine learning. reddit. Retrieved 2014-10-01.
[15] Cornell University Library. Breiman: Statistical Modeling: The Two Cultures (with comments and a rejoinder by the
author). Retrieved 8 August 2015.
[16] Gareth James; Daniela Witten; Trevor Hastie; Robert Tibshirani (2013). An Introduction to Statistical Learning. Springer.
p. vii.
[17] Bishop, C. M. (2006), Pattern Recognition and Machine Learning, Springer, ISBN 0-387-31073-8
[18] Mehryar Mohri, Afshin Rostamizadeh, Ameet Talwalkar (2012) Foundations of Machine Learning, MIT Press. ISBN 978-0-262-01825-8.
[19] Ethem Alpaydin. "Introduction to Machine Learning" The MIT Press, 2010.
[20] Honglak Lee, Roger Grosse, Rajesh Ranganath, Andrew Y. Ng. "Convolutional Deep Belief Networks for Scalable Unsupervised Learning of Hierarchical Representations" Proceedings of the 26th Annual International Conference on Machine
Learning, 2009.


[21] Lu, Haiping; Plataniotis, K.N.; Venetsanopoulos, A.N. (2011). A Survey of Multilinear Subspace Learning for Tensor
Data (PDF). Pattern Recognition. 44 (7): 1540–1551. doi:10.1016/j.patcog.2011.01.004.
[22] Yoshua Bengio (2009). Learning Deep Architectures for AI. Now Publishers Inc. pp. 1–3. ISBN 978-1-60198-294-0.
[23] A. M. Tillmann, "On the Computational Intractability of Exact and Approximate Dictionary Learning", IEEE Signal Processing Letters 22(1), 2015: 45–49.
[24] Aharon, M, M Elad, and A Bruckstein. 2006. K-SVD: An Algorithm for Designing Overcomplete Dictionaries for Sparse
Representation. Signal Processing, IEEE Transactions on 54 (11): 4311–4322
[25] Goldberg, David E.; Holland, John H. (1988). Genetic algorithms and machine learning. Machine Learning. 3 (2):
95–99. doi:10.1007/bf00113892.
[26] Michie, D.; Spiegelhalter, D. J.; Taylor, C. C. (1994). Machine Learning, Neural and Statistical Classication. Ellis Horwood.
[27] Zhang, Jun; Zhan, Zhi-hui; Lin, Ying; Chen, Ni; Gong, Yue-jiao; Zhong, Jing-hui; Chung, Henry S.H.; Li, Yun; Shi, Yu-hui (2011). Evolutionary Computation Meets Machine Learning: A Survey (PDF). Computational Intelligence Magazine.
IEEE. 6 (4): 68–75. doi:10.1109/mci.2011.942584.
[28] BelKor Home Page research.att.com
[29] The Netflix Tech Blog: Netflix Recommendations: Beyond the 5 stars (Part 1). Retrieved 8 August 2015.
[30] Scott Patterson (13 July 2010). 'Artificial Intelligence' Gains Fans Among Investors. WSJ. Retrieved 8 August
2015.
[31] When A Machine Learning Algorithm Studied Fine Art Paintings, It Saw Things Art Historians Had Never Noticed, the
Physics arXiv Blog
[32] Bostrom, Nick (2011). The Ethics of Artificial Intelligence (PDF). Retrieved 11 April 2016.

8.12 Further reading


Trevor Hastie, Robert Tibshirani and Jerome H. Friedman (2001). The Elements of Statistical Learning,
Springer. ISBN 0-387-95284-5.
Pedro Domingos (September 2015), The Master Algorithm, Basic Books, ISBN 978-0-465-06570-7
Mehryar Mohri, Afshin Rostamizadeh, Ameet Talwalkar (2012). Foundations of Machine Learning, The MIT
Press. ISBN 978-0-262-01825-8.
Ian H. Witten and Eibe Frank (2011). Data Mining: Practical machine learning tools and techniques. Morgan
Kaufmann, 664pp., ISBN 978-0-12-374856-0.
David J. C. MacKay. Information Theory, Inference, and Learning Algorithms. Cambridge: Cambridge University Press, 2003. ISBN 0-521-64298-1
Richard O. Duda, Peter E. Hart, David G. Stork (2001) Pattern Classification (2nd edition), Wiley, New York,
ISBN 0-471-05669-3.
Christopher Bishop (1995). Neural Networks for Pattern Recognition, Oxford University Press. ISBN 0-19-853864-2.
Vladimir Vapnik (1998). Statistical Learning Theory. Wiley-Interscience, ISBN 0-471-03003-1.
Ray Solomonoff, An Inductive Inference Machine, IRE Convention Record, Section on Information Theory,
Part 2, pp. 56–62, 1957.
Ray Solomonoff, "An Inductive Inference Machine", a privately circulated report from the 1956 Dartmouth
Summer Research Conference on AI.


8.13 External links


International Machine Learning Society
Popular online course by Andrew Ng, at Coursera. It uses GNU Octave. The course is a free version of
Stanford University's actual course taught by Ng, whose lectures are also available for free.
mloss is an academic database of open-source machine learning software.

Chapter 9

Statistical inference
Not to be confused with Statistical interference.
Statistical inference is the process of deducing properties of an underlying distribution by analysis of data.[1] Inferential statistical analysis infers properties about a population: this includes testing hypotheses and deriving estimates.
The population is assumed to be larger than the observed data set; in other words, the observed data is assumed to be
sampled from a larger population.
Inferential statistics can be contrasted with descriptive statistics. Descriptive statistics is solely concerned with properties of the observed data, and does not assume that the data came from a larger population.

9.1 Introduction
Statistical inference makes propositions about a population, using data drawn from the population with some form of
sampling. Given a hypothesis about a population, for which we wish to draw inferences, statistical inference consists
of (firstly) selecting a statistical model of the process that generates the data and (secondly) deducing propositions
from the model.
Konishi & Kitagawa state, "The majority of the problems in statistical inference can be considered to be problems
related to statistical modeling."[2] Relatedly, Sir David Cox has said, "How [the] translation from subject-matter
problem to statistical model is done is often the most critical part of an analysis."[3]
The conclusion of a statistical inference is a statistical proposition. Some common forms of statistical proposition
are the following:
a point estimate, i.e. a particular value that best approximates some parameter of interest;
an interval estimate, e.g. a confidence interval (or set estimate), i.e. an interval constructed using a dataset
drawn from a population so that, under repeated sampling of such datasets, such intervals would contain the
true parameter value with the probability at the stated confidence level;
a credible interval, i.e. a set of values containing, for example, 95% of posterior belief;
rejection of a hypothesis;[4]
clustering or classification of data points into groups.
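The first two forms can be illustrated in a few lines, assuming SciPy and hypothetical measurements: a sample mean as the point estimate and a 95% confidence interval around it:

    import numpy as np
    from scipy import stats

    data = np.array([4.8, 5.1, 5.0, 4.7, 5.3, 4.9, 5.2])

    point_estimate = data.mean()                       # point estimate of the mean
    ci = stats.t.interval(0.95, df=len(data) - 1,      # 95% confidence interval
                          loc=point_estimate, scale=stats.sem(data))
    print(point_estimate, ci)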

9.2 Models and assumptions


Main articles: Statistical model and Statistical assumptions
Any statistical inference requires some assumptions. A statistical model is a set of assumptions concerning the
generation of the observed data and similar data. Descriptions of statistical models usually emphasize the role of
population quantities of interest, about which we wish to draw inference.[5] Descriptive statistics are typically used as
a preliminary step before more formal inferences are drawn.[6]

9.2.1 Degree of models/assumptions

Statisticians distinguish between three levels of modeling assumptions:


Fully parametric: The probability distributions describing the data-generation process are assumed to be fully
described by a family of probability distributions involving only a finite number of unknown parameters.[5] For
example, one may assume that the distribution of population values is truly Normal, with unknown mean and
variance, and that datasets are generated by 'simple' random sampling. The family of generalized linear models
is a widely used and flexible class of parametric models.
Non-parametric: The assumptions made about the process generating the data are far fewer than in parametric statistics and may be minimal.[7] For example, every continuous probability distribution has a median,
which may be estimated using the sample median or the Hodges–Lehmann–Sen estimator, which has good
properties when the data arise from simple random sampling.
Semi-parametric: This term typically implies assumptions 'in between' fully and non-parametric approaches.
For example, one may assume that a population distribution has a finite mean. Furthermore, one may assume
that the mean response level in the population depends in a truly linear manner on some covariate (a parametric
assumption) but not make any parametric assumption describing the variance around that mean (i.e. about the
presence or possible form of any heteroscedasticity). More generally, semi-parametric models can often be
separated into 'structural' and 'random variation' components. One component is treated parametrically and
the other non-parametrically. The well-known Cox model is a set of semi-parametric assumptions.

9.2.2 Importance of valid models/assumptions

Whatever level of assumption is made, correctly calibrated inference in general requires these assumptions to be
correct; i.e. that the data-generating mechanisms really have been correctly specified.
Incorrect assumptions of 'simple' random sampling can invalidate statistical inference.[8] More complex semi- and
fully parametric assumptions are also cause for concern. For example, incorrectly assuming the Cox model can in
some cases lead to faulty conclusions.[9] Incorrect assumptions of Normality in the population also invalidate some
forms of regression-based inference.[10] The use of any parametric model is viewed skeptically by most experts
in sampling human populations: "most sampling statisticians, when they deal with confidence intervals at all, limit
themselves to statements about [estimators] based on very large samples, where the central limit theorem ensures that
these [estimators] will have distributions that are nearly normal."[11] In particular, a normal distribution "would be
a totally unrealistic and catastrophically unwise assumption to make if we were dealing with any kind of economic
population."[11] Here, the central limit theorem states that the distribution of the sample mean for very large samples
is approximately normally distributed, if the distribution is not heavy-tailed.
Approximate distributions
Main articles: Statistical distance, Asymptotic theory (statistics), and Approximation theory
Given the difficulty in specifying exact distributions of sample statistics, many methods have been developed for
approximating these.
With finite samples, approximation results measure how closely a limiting distribution approaches the statistic's sample
distribution: for example, with 10,000 independent samples the normal distribution approximates (to two digits of
accuracy) the distribution of the sample mean for many population distributions, by the Berry–Esseen theorem.[12] Yet
for many practical purposes, the normal approximation provides a good approximation to the sample mean's distribution when there are 10 (or more) independent samples, according to simulation studies and statisticians' experience.[12]
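Such simulation studies are easy to reproduce. A sketch, assuming NumPy: means of size-30 samples drawn from a skewed exponential population are already close to normal, with standard deviation near 1/sqrt(30):

    import numpy as np

    rng = np.random.default_rng(0)
    means = rng.exponential(scale=1.0, size=(10_000, 30)).mean(axis=1)
    print(round(means.mean(), 3), round(means.std(), 3))   # near 1 and 0.183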
Following Kolmogorov's work in the 1950s, advanced statistics uses approximation theory and functional analysis to
quantify the error of approximation. In this approach, the metric geometry of probability distributions is studied; this
approach quantifies approximation error with, for example, the Kullback–Leibler divergence, Bregman divergence,
and the Hellinger distance.[13][14][15]


With indefinitely large samples, limiting results like the central limit theorem describe the sample statistic's limiting
distribution, if one exists. Limiting results are not statements about finite samples, and indeed are irrelevant to finite
samples.[16][17][18] However, the asymptotic theory of limiting distributions is often invoked for work with finite
samples. For example, limiting results are often invoked to justify the generalized method of moments and the
use of generalized estimating equations, which are popular in econometrics and biostatistics. The magnitude of the
difference between the limiting distribution and the true distribution (formally, the 'error' of the approximation) can
be assessed using simulation.[19] The heuristic application of limiting results to finite samples is common practice in
many applications, especially with low-dimensional models with log-concave likelihoods (such as with one-parameter
exponential families).

9.2.3 Randomization-based models

Main article: Randomization


See also: Random sample and Random assignment
For a given dataset that was produced by a randomization design, the randomization distribution of a statistic (under the null hypothesis) is defined by evaluating the test statistic for all of the plans that could have been generated
by the randomization design. In frequentist inference, randomization allows inferences to be based on the randomization distribution rather than a subjective model, and this is important especially in survey sampling and design
of experiments.[20][21] Statistical inference from randomized studies is also more straightforward than in many other
situations.[22][23][24] In Bayesian inference, randomization is also of importance: in survey sampling, use of sampling
without replacement ensures the exchangeability of the sample with the population; in randomized experiments,
randomization warrants a missing-at-random assumption for covariate information.[25]
Objective randomization allows properly inductive procedures.[26][27][28][29] Many statisticians prefer randomization-based analysis of data that was generated by well-defined randomization procedures.[30] (However, it is true that
in fields of science with developed theoretical knowledge and experimental control, randomized experiments may
increase the costs of experimentation without improving the quality of inferences.[31][32]) Similarly, results from
randomized experiments are recommended by leading statistical authorities as allowing inferences with greater reliability than do observational studies of the same phenomena.[33] However, a good observational study may be better
than a bad randomized experiment.
The statistical analysis of a randomized experiment may be based on the randomization scheme stated in the experimental protocol and does not need a subjective model.[34][35]
However, at any time, some hypotheses cannot be tested using objective statistical models, which accurately describe
randomized experiments or random samples. In some cases, such randomized studies are uneconomical or unethical.

Model-based analysis of randomized experiments


It is standard practice to refer to a statistical model, often a linear model, when analyzing data from randomized
experiments. However, the randomization scheme guides the choice of a statistical model. It is not possible to choose
an appropriate model without knowing the randomization scheme.[21] Seriously misleading results can be obtained by
analyzing data from randomized experiments while ignoring the experimental protocol; common mistakes include
forgetting the blocking used in an experiment and confusing repeated measurements on the same experimental unit
with independent replicates of the treatment applied to different experimental units.[36]

9.3 Paradigms for inference


Different schools of statistical inference have become established. These schools, or paradigms, are not mutually
exclusive, and methods that work well under one paradigm often have attractive interpretations under other paradigms.
Bandyopadhyay & Forster[37] describe four paradigms: (i) classical statistics or error statistics, (ii) Bayesian statistics, (iii) likelihood-based statistics, and (iv) Akaikean-Information-Criterion-based statistics. The classical (or
frequentist) paradigm, the Bayesian paradigm, and the AIC-based paradigm are summarized below. The likelihood-based paradigm is essentially a sub-paradigm of the AIC-based paradigm.


9.3.1 Frequentist inference

See also: Frequentist inference


This paradigm calibrates the plausibility of propositions by considering (notional) repeated sampling of a population distribution to produce datasets similar to the one at hand. By considering the dataset's characteristics under
repeated sampling, the frequentist properties of a statistical proposition can be quantified, although in practice this
quantification may be challenging.
Examples of frequentist inference
p-value
Confidence interval
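A sketch of the first of these, assuming SciPy and hypothetical data: the one-sample t-test p-value for the null hypothesis that the population mean equals 5:

    from scipy import stats

    data = [4.8, 5.1, 5.0, 4.7, 5.3, 4.9, 5.2]
    t_stat, p_value = stats.ttest_1samp(data, popmean=5.0)
    print(t_stat, p_value)    # a large p-value gives no evidence against the null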
Frequentist inference, objectivity, and decision theory
One interpretation of frequentist inference (or classical inference) is that it is applicable only in terms of frequency
probability; that is, in terms of repeated sampling from a population. However, the approach of Neyman[38] develops
these procedures in terms of pre-experiment probabilities. That is, before undertaking an experiment, one decides
on a rule for coming to a conclusion such that the probability of being correct is controlled in a suitable way: such a
probability need not have a frequentist or repeated sampling interpretation. In contrast, Bayesian inference works in
terms of conditional probabilities (i.e. probabilities conditional on the observed data), compared to the marginal (but
conditioned on unknown parameters) probabilities used in the frequentist approach.
The frequentist procedures of significance testing and confidence intervals can be constructed without regard to utility functions. However, some elements of frequentist statistics, such as statistical decision theory, do incorporate utility functions. In particular, frequentist developments of optimal inference (such as minimum-variance unbiased estimators, or uniformly most powerful testing) make use of loss functions, which play the role of (negative) utility functions. Loss functions need not be explicitly stated for statistical theorists to prove that a statistical procedure has an optimality property.[39] However, loss functions are often useful for stating optimality properties: for example, median-unbiased estimators are optimal under absolute-value loss functions, in that they minimize expected loss, and least squares estimators are optimal under squared-error loss functions, in that they minimize expected loss.
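To make the last point concrete, the following sketch (hypothetical data, assuming NumPy) checks numerically that the sample median minimizes average absolute-value loss while the sample mean minimizes average squared-error loss:

    import numpy as np

    rng = np.random.default_rng(2)
    data = rng.exponential(scale=3.0, size=1001)   # a deliberately skewed sample

    grid = np.linspace(data.min(), data.max(), 2001)
    abs_loss = np.abs(data[:, None] - grid[None, :]).mean(axis=0)
    sq_loss = ((data[:, None] - grid[None, :]) ** 2).mean(axis=0)

    print(grid[abs_loss.argmin()], np.median(data))   # nearly equal
    print(grid[sq_loss.argmin()], data.mean())        # nearly equal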
While statisticians using frequentist inference must choose for themselves the parameters of interest and the estimators/test statistics to be used, the absence of obviously explicit utilities and prior distributions has helped frequentist procedures to become widely viewed as 'objective'.

9.3.2

Bayesian inference

See also: Bayesian Inference


The Bayesian calculus describes degrees of belief using the 'language' of probability; beliefs are positive, integrate to one, and obey the probability axioms. Bayesian inference uses the available posterior beliefs as the basis for making statistical propositions. There are several different justifications for using the Bayesian approach.
Examples of Bayesian inference
Credible interval for interval estimation (see the sketch below)
Bayes factors for model comparison
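The sketch below (hypothetical counts, assuming SciPy) computes an equal-tailed 95% credible interval for a binomial proportion, using a conjugate Beta(1, 1) (uniform) prior:

    from scipy import stats

    # Hypothetical data: 7 successes in 20 trials, with a uniform Beta(1, 1) prior.
    successes, trials = 7, 20
    posterior = stats.beta(1 + successes, 1 + trials - successes)  # conjugate update

    print(posterior.ppf([0.025, 0.975]))   # equal-tailed 95% credible interval
    print(posterior.mean())                # posterior mean, 8/22, about 0.36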
Bayesian inference, subjectivity and decision theory
Many informal Bayesian inferences are based on intuitively reasonable summaries of the posterior. For example,
the posterior mean, median and mode, highest posterior density intervals, and Bayes factors can all be motivated in this way. While a user's utility function need not be stated for this sort of inference, these summaries do all
depend (to some extent) on stated prior beliefs, and are generally viewed as subjective conclusions. (Methods of prior
construction which do not require external input have been proposed but not yet fully developed.)
Formally, Bayesian inference is calibrated with reference to an explicitly stated utility, or loss function; the 'Bayes
rule' is the one which maximizes expected utility, averaged over the posterior uncertainty. Formal Bayesian inference
therefore automatically provides optimal decisions in a decision theoretic sense. Given assumptions, data and utility,
Bayesian inference can be made for essentially any problem, although not every statistical inference need have a
Bayesian interpretation. Analyses which are not formally Bayesian can be (logically) incoherent; a feature of Bayesian
procedures which use proper priors (i.e. those integrable to one) is that they are guaranteed to be coherent. Some
advocates of Bayesian inference assert that inference must take place in this decision-theoretic framework, and that
Bayesian inference should not conclude with the evaluation and summarization of posterior beliefs.

9.3.3

AIC-based inference

Main article: Akaike information criterion

9.3.4

Other paradigms for inference

Minimum description length


Main article: Minimum description length
The minimum description length (MDL) principle has been developed from ideas in information theory[40] and the theory of Kolmogorov complexity.[41] The MDL principle selects statistical models that maximally compress the data; inference proceeds without assuming counterfactual or non-falsifiable data-generating mechanisms or probability models for the data, as might be done in frequentist or Bayesian approaches.
However, if a data generating mechanism does exist in reality, then according to Shannon's source coding theorem
it provides the MDL description of the data, on average and asymptotically.[42] In minimizing description length (or
descriptive complexity), MDL estimation is similar to maximum likelihood estimation and maximum a posteriori
estimation (using maximum-entropy Bayesian priors). However, MDL avoids assuming that the underlying probability model is known; the MDL principle can also be applied without assumptions that e.g. the data arose from
independent sampling.[42][43]
The MDL principle has been applied in communication-coding theory in information theory, in linear regression,[43]
and in data mining.[41]
The evaluation of MDL-based inferential procedures often uses techniques or criteria from computational complexity
theory.[44]
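As a crude, purely illustrative sketch of the compression idea (not the full MDL machinery), the code below (hypothetical data, assuming NumPy) selects a polynomial degree by minimizing an approximate two-part code length; the (n/2)·log(RSS/n) + (k/2)·log n formula is a common Gaussian-residual approximation:

    import numpy as np

    rng = np.random.default_rng(3)
    x = np.linspace(-1, 1, 60)
    y = 1.0 - 2.0 * x + 3.0 * x**2 + rng.normal(0.0, 0.3, x.size)  # true degree 2

    def code_length(degree):
        # Two-part code: cost of encoding the residuals plus cost of the parameters.
        coeffs = np.polyfit(x, y, degree)
        rss = ((y - np.polyval(coeffs, x)) ** 2).sum()
        n, k = x.size, degree + 1
        return 0.5 * n * np.log(rss / n) + 0.5 * k * np.log(n)

    lengths = {d: code_length(d) for d in range(6)}
    print(min(lengths, key=lengths.get))   # typically selects degree 2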
Fiducial inference
Main article: Fiducial inference
Fiducial inference was an approach to statistical inference based on fiducial probability, also known as a "fiducial distribution". In subsequent work, this approach has been called ill-defined, extremely limited in applicability, and even fallacious.[45][46] However, this argument is the same as that which shows[47] that a so-called confidence distribution is not a valid probability distribution and, since this has not invalidated the application of confidence intervals, it does not necessarily invalidate conclusions drawn from fiducial arguments. An attempt was made to reinterpret the early work of Fisher's fiducial argument as a special case of an inference theory using upper and lower probabilities.[48]
Structural inference
Developing ideas of Fisher and of Pitman from 1938 to 1939,[49] George A. Barnard developed "structural inference" or "pivotal inference",[50] an approach using invariant probabilities on group families. Barnard reformulated the arguments behind fiducial inference on a restricted class of models on which fiducial procedures would be well-defined and useful.

9.4 Inference topics


The topics below are usually included in the area of statistical inference.
1. Statistical assumptions
2. Statistical decision theory
3. Estimation theory
4. Statistical hypothesis testing
5. Revising opinions in statistics
6. Design of experiments, the analysis of variance, and regression
7. Survey sampling
8. Summarizing statistical data

9.5 See also


Algorithmic inference
Induction (philosophy)
Informal inferential reasoning
Population proportion
Philosophy of statistics
Predictive inference

9.6 Notes
[1] Upton, G., Cook, I. (2008) Oxford Dictionary of Statistics, OUP. ISBN 978-0-19-954145-4
[2] Konishi & Kitagawa (2008), p.75
[3] Cox (2006), p.197
[4] According to Peirce, acceptance means that inquiry on this question ceases for the time being. In science, all scientific theories are revisable.
[5] Cox (2006) page 2
[6] Evans, Michael; et al. (2004). Probability and Statistics: The Science of Uncertainty. Freeman and Company. p. 267.
[7] van der Vaart, A.W. (1998) Asymptotic Statistics Cambridge University Press. ISBN 0-521-78450-6 (page 341)
[8] Kruskal 1988
[9] Freedman, D.A. (2008) "Survival analysis: An epidemiological hazard?" The American Statistician, 62: 110–119. (Reprinted as Chapter 11 (pages 169–192) of Freedman (2010).)
[10] Berk, R. (2003) Regression Analysis: A Constructive Critique (Advanced Quantitative Techniques in the Social Sciences) (v.
11) Sage Publications. ISBN 0-7619-2904-5


[11] Brewer, Ken (2002). Combined Survey Sampling Inference: Weighing of Basu's Elephants. Hodder Arnold. p. 6. ISBN 978-0340692295.
[12] Jørgen Hoffmann-Jørgensen's Probability With a View Towards Statistics, Volume I. Page 399.
[13] Le Cam (1986)
[14] Erik Torgerson (1991) Comparison of Statistical Experiments, volume 36 of Encyclopedia of Mathematics. Cambridge
University Press.
[15] Liese, Friedrich & Miescke, Klaus-J. (2008). Statistical Decision Theory: Estimation, Testing, and Selection. Springer.
ISBN 0-387-73193-8.
[16] Kolmogorov (1963, p. 369): "The frequency concept, based on the notion of limiting frequency as the number of trials increases to infinity, does not contribute anything to substantiate the applicability of the results of probability theory to real practical problems where we have always to deal with a finite number of trials."
[17] "Indeed, limit theorems 'as n tends to infinity' are logically devoid of content about what happens at any particular n. All they can do is suggest certain approaches whose performance must then be checked on the case at hand." Le Cam (1986) (page xiv)
[18] Pfanzagl (1994): "The crucial drawback of asymptotic theory: What we expect from asymptotic theory are results which hold approximately. [...] What asymptotic theory has to offer are limit theorems." (page ix) "What counts for applications are approximations, not limits." (page 188)
[19] Pfanzagl (1994): "By taking a limit theorem as being approximately true for large sample sizes, we commit an error the size of which is unknown. [...] Realistic information about the remaining errors may be obtained by simulations." (page ix)
[20] Neyman, J. (1934) "On the two different aspects of the representative method: The method of stratified sampling and the method of purposive selection", Journal of the Royal Statistical Society, 97 (4), 557–625. JSTOR 2342192
[21] Hinkelmann and Kempthorne (2008)
[22] ASA Guidelines for a first course in statistics for non-statisticians. (available at the ASA website)
[23] David A. Freedman et al. Statistics.
[24] David S. Moore and George McCabe. Introduction to the Practice of Statistics.
[25] Gelman A. et al. (2013). Bayesian Data Analysis (Chapman & Hall).
[26] Peirce (1877-1878)
[27] Peirce (1883)
[28] David Freedman et al. Statistics and David A. Freedman Statistical Models.
[29] Rao, C.R. (1997) Statistics and Truth: Putting Chance to Work, World Scientific. ISBN 981-02-3111-3
[30] Peirce, Freedman, Moore and McCabe.
[31] Box, G.E.P. and Friends (2006) Improving Almost Anything: Ideas and Essays, Revised Edition, Wiley. ISBN 978-0-471-72755-2
[32] Cox (2006), page 196
[33] ASA Guidelines for a first course in statistics for non-statisticians. (available at the ASA website)
David A. Freedman et al. Statistics.
David S. Moore and George McCabe. Introduction to the Practice of Statistics.
[34] Neyman, Jerzy. 1923 [1990]. "On the Application of Probability Theory to Agricultural Experiments. Essay on Principles. Section 9." Statistical Science 5 (4): 465–472. Trans. Dorota M. Dabrowska and Terence P. Speed.
[35] Hinkelmann & Kempthorne (2008)
[36] Hinkelmann and Kempthorne (2008) Chapter 6.
[37] Bandyopadhyay & Forster (2011). The quote is taken from the book's Introduction (p. 3). See also "Section III: Four Paradigms of Statistics".


[38] Neyman, J. (1937) "Outline of a Theory of Statistical Estimation Based on the Classical Theory of Probability", Philosophical Transactions of the Royal Society of London A, 236, 333–380.
[39] Preface to Pfanzagl.
[40] Soofi (2000)
[41] Hansen & Yu (2001)
[42] Hansen and Yu (2001), page 747.
[43] Rissanen (1989), page 84
[44] Joseph F. Traub, G. W. Wasilkowski, and H. Wozniakowski. (1988)
[45] Neyman (1956)
[46] Zabell (1992)
[47] Cox (2006) page 66
[48] Hampel 2003.
[49] Davison, page 12.
[50] Barnard, G.A. (1995) "Pivotal Models and the Fiducial Argument", International Statistical Review, 63 (3), 309–323. JSTOR 1403482

9.7 References
Bandyopadhyay, P. S.; Forster, M. R., eds. (2011), Philosophy of Statistics, Elsevier.
Bickel, Peter J.; Doksum, Kjell A. (2001). Mathematical statistics: Basic and selected topics. 1 (Second (updated
printing 2007) ed.). Prentice Hall. ISBN 0-13-850363-X. MR 443141.
Cox, D. R. (2006). Principles of Statistical Inference, Cambridge University Press. ISBN 0-521-68567-2.
Fisher, R. A. (1955), "Statistical methods and scientific induction", Journal of the Royal Statistical Society, Series B, 17, 69–78. (criticism of statistical theories of Jerzy Neyman and Abraham Wald)
Freedman, D. A. (2009). Statistical models: Theory and practice (revised ed.). Cambridge University Press.
pp. xiv+442 pp. ISBN 978-0-521-74385-3. MR 2489600.
Freedman, D. A. (2010). Statistical Models and Causal Inferences: A Dialogue with the Social Sciences (Edited
by David Collier, Jasjeet S. Sekhon, and Philip B. Stark), Cambridge University Press.
Hansen, Mark H.; Yu, Bin (June 2001). "Model Selection and the Principle of Minimum Description Length: Review paper". Journal of the American Statistical Association. 96 (454): 746–774. doi:10.1198/016214501753168398. JSTOR 2670311. MR 1939352.
Hinkelmann, Klaus; Kempthorne, Oscar (2008). Introduction to Experimental Design (Second ed.). Wiley.
ISBN 978-0-471-72756-9.
Kolmogorov, Andrei N. (1963). "On tables of random numbers". Sankhyā Ser. A. 25: 369–375. MR 178484. Reprinted as Kolmogorov, Andrei N. (1998). "On tables of random numbers". Theoretical Computer Science. 207 (2): 387–395. doi:10.1016/S0304-3975(98)00075-9. MR 1643414.
Konishi S., Kitagawa G. (2008), Information Criteria and Statistical Modeling, Springer.
Kruskal, William (December 1988). "Miracles and Statistics: the casual assumption of independence" (ASA Presidential Address). Journal of the American Statistical Association. 83 (404): 929–940. doi:10.2307/2290117. JSTOR 2290117.
Le Cam, Lucien (1986). Asymptotic Methods of Statistical Decision Theory, Springer. ISBN 0-387-96307-3
Neyman, Jerzy (1956). "Note on an Article by Sir Ronald Fisher". Journal of the Royal Statistical Society, Series B. 18 (2): 288–294. JSTOR 2983716. (reply to Fisher 1955)


Peirce, C. S. (1877–1878), "Illustrations of the Logic of Science" (series), Popular Science Monthly, vols. 12–13. Relevant individual papers:
(1878 March), "The Doctrine of Chances", Popular Science Monthly, v. 12, March issue, pp. 604–615. Internet Archive Eprint.
(1878 April), "The Probability of Induction", Popular Science Monthly, v. 12, pp. 705–718. Internet Archive Eprint.
(1878 June), "The Order of Nature", Popular Science Monthly, v. 13, pp. 203–217. Internet Archive Eprint.
(1878 August), "Deduction, Induction, and Hypothesis", Popular Science Monthly, v. 13, pp. 470–482. Internet Archive Eprint.
Peirce, C. S. (1883), "A Theory of Probable Inference", Studies in Logic, pp. 126–181, Little, Brown, and Company. (Reprinted 1983, John Benjamins Publishing Company, ISBN 90-272-3271-7)
Pfanzagl, Johann; with the assistance of R. Hamböker (1994). Parametric Statistical Theory. Berlin: Walter de Gruyter. ISBN 3-11-013863-8. MR 1291393.
Rissanen, Jorma (1989). Stochastic Complexity in Statistical Inquiry. Series in computer science. 15. Singapore: World Scientific. ISBN 9971-5-0859-1. MR 1082556.
Soofi, Ehsan S. (December 2000). "Principal Information-Theoretic Approaches" (Vignettes for the Year 2000: Theory and Methods, ed. by George Casella). Journal of the American Statistical Association. 95 (452): 1349–1353. doi:10.1080/01621459.2000.10474346. JSTOR 2669786. MR 1825292.
Traub, Joseph F.; Wasilkowski, G. W.; Wozniakowski, H. (1988). Information-Based Complexity. Academic
Press. ISBN 0-12-697545-0.
Zabell, S. L. (Aug 1992). "R. A. Fisher and the Fiducial Argument". Statistical Science. 7 (3): 369–387. doi:10.1214/ss/1177011233. JSTOR 2246073.
Hampel, Frank (Feb 2003). "The proper fiducial argument" (PDF) (Research Report No. 114). Retrieved 29 March 2016.

9.8 Further reading


Casella, G., Berger, R.L. (2001). Statistical Inference. Duxbury Press. ISBN 0-534-24312-6
Freedman, D.A. (1991). "Statistical models and shoe leather", Sociological Methodology, 21: 291–313.
Held, L., Bové, D.S. (2014). Applied Statistical Inference: Likelihood and Bayes (Springer).
Lenhard, Johannes (2006). "Models and Statistical Inference: the controversy between Fisher and Neyman–Pearson", British Journal for the Philosophy of Science, 57: 69–91.
Lindley, D. (1958). "Fiducial distribution and Bayes' theorem", Journal of the Royal Statistical Society, Series B, 20: 102–107.
Rahlf, Thomas (2014). "Statistical Inference", in Claude Diebolt and Michael Haupert (eds.), Handbook of Cliometrics (Springer Reference Series), Berlin/Heidelberg: Springer. http://www.springerreference.com/docs/html/chapterdbid/372458.html
Reid, N.; Cox, D. R. (2014). "On Some Principles of Statistical Inference". International Statistical Review. doi:10.1111/insr.12067.
Young, G.A., Smith, R.L. (2005). Essentials of Statistical Inference, CUP. ISBN 0-521-83971-8

9.9 External links


MIT OpenCourseWare: Statistical Inference
Statistical induction and prediction

Chapter 10

Correlation and dependence


This article is about correlation and dependence in statistical data. For other uses, see correlation (disambiguation).
In statistics, dependence or association is any statistical relationship, whether causal or not, between two random variables or two sets of data. Correlation is any of a broad class of statistical relationships involving dependence, though in common usage it most often refers to the extent to which two variables have a linear relationship with each other. Familiar examples of dependent phenomena include the correlation between the physical statures of parents and their offspring, and the correlation between the demand for a product and its price.
Correlations are useful because they can indicate a predictive relationship that can be exploited in practice. For example, an electrical utility may produce less power on a mild day based on the correlation between electricity demand and weather. In this example there is a causal relationship, because extreme weather causes people to use more electricity for heating or cooling; however, correlation is not sufficient to demonstrate the presence of such a causal relationship (i.e., correlation does not imply causation).
Formally, dependence refers to any situation in which random variables do not satisfy a mathematical condition of probabilistic independence. In loose usage, correlation can refer to any departure of two or more random variables from independence, but technically it refers to any of several more specialized types of relationship between mean values. There are several correlation coefficients, often denoted ρ or r, measuring the degree of correlation. The most common of these is the Pearson correlation coefficient, which is sensitive only to a linear relationship between two variables (which may exist even if one is a nonlinear function of the other). Other correlation coefficients have been developed to be more robust than the Pearson correlation, that is, more sensitive to nonlinear relationships.[1][2][3] Mutual information can also be applied to measure dependence between two variables.

10.1 Pearson's product-moment coefficient


Main article: Pearson product-moment correlation coefficient
The most familiar measure of dependence between two quantities is the Pearson product-moment correlation coefficient, or "Pearson's correlation coefficient", commonly called simply "the correlation coefficient". It is obtained by dividing the covariance of the two variables by the product of their standard deviations. Karl Pearson developed the coefficient from a similar but slightly different idea by Francis Galton.[4]
The population correlation coefficient ρ_{X,Y} between two random variables X and Y with expected values μ_X and μ_Y and standard deviations σ_X and σ_Y is defined as:

\rho_{X,Y} = \operatorname{corr}(X, Y) = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{E[(X - \mu_X)(Y - \mu_Y)]}{\sigma_X \sigma_Y},

where E is the expected value operator, cov means covariance, and corr is a widely used alternative notation for the correlation coefficient.
[Figure: Several sets of (x, y) points, with the Pearson correlation coefficient of x and y for each set. Note that the correlation reflects the noisiness and direction of a linear relationship (top row), but not the slope of that relationship (middle), nor many aspects of nonlinear relationships (bottom). N.B.: the figure in the center has a slope of 0, but in that case the correlation coefficient is undefined because the variance of Y is zero.]

The Pearson correlation is defined only if both of the standard deviations are finite and nonzero. It is a corollary of the Cauchy–Schwarz inequality that the correlation cannot exceed 1 in absolute value. The correlation coefficient is symmetric: corr(X, Y) = corr(Y, X).
The Pearson correlation is +1 in the case of a perfect direct (increasing) linear relationship (correlation), −1 in the case of a perfect decreasing (inverse) linear relationship (anticorrelation),[5] and some value in the open interval (−1, 1) in all other cases, indicating the degree of linear dependence between the variables. As it approaches zero there is less of a relationship (closer to uncorrelated). The closer the coefficient is to either −1 or 1, the stronger the correlation between the variables.
If the variables are independent, Pearson's correlation coefficient is 0, but the converse is not true because the correlation coefficient detects only linear dependencies between two variables. For example, suppose the random variable X is symmetrically distributed about zero, and Y = X². Then Y is completely determined by X, so that X and Y are perfectly dependent, but their correlation is zero; they are uncorrelated. However, in the special case when X and Y are jointly normal, uncorrelatedness is equivalent to independence.
If we have a series of n measurements of X and Y written as x_i and y_i for i = 1, 2, ..., n, then the sample correlation coefficient can be used to estimate the population Pearson correlation ρ between X and Y. The sample correlation coefficient is written:

r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n s_x s_y} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2} \, \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}},

where \bar{x} and \bar{y} are the sample means of X and Y, and s_x and s_y are the sample standard deviations of X and Y.
This can also be written as:

r_{xy} = \frac{\sum x_i y_i - n \bar{x} \bar{y}}{n s_x s_y} = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{\sqrt{n \sum x_i^2 - (\sum x_i)^2} \, \sqrt{n \sum y_i^2 - (\sum y_i)^2}}.
If x and y are results of measurements that contain measurement error, the realistic limits on the correlation coefficient are not −1 to +1 but a smaller range.[6]
For the case of a linear model with a single independent variable, the coefficient of determination (R squared) is the square of r_{xy}, Pearson's product-moment coefficient.
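As a small check of these formulas, the sketch below (hypothetical data, assuming NumPy) computes the sample correlation coefficient directly from the definition and compares it with the built-in np.corrcoef:

    import numpy as np

    rng = np.random.default_rng(4)
    x = rng.normal(size=100)
    y = 2.0 * x + rng.normal(size=100)    # linearly related, with noise

    xm, ym = x.mean(), y.mean()
    r = ((x - xm) * (y - ym)).sum() / np.sqrt(((x - xm) ** 2).sum() * ((y - ym) ** 2).sum())

    print(r)                         # sample correlation from the definition
    print(np.corrcoef(x, y)[0, 1])   # the same value from NumPy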


10.2 Rank correlation coefficients


Main articles: Spearman's rank correlation coefficient and Kendall tau rank correlation coefficient
Rank correlation coefficients, such as Spearman's rank correlation coefficient and Kendall's rank correlation coefficient (τ), measure the extent to which, as one variable increases, the other variable tends to increase, without requiring that increase to be represented by a linear relationship. If, as the one variable increases, the other decreases, the rank correlation coefficients will be negative. It is common to regard these rank correlation coefficients as alternatives to Pearson's coefficient, used either to reduce the amount of calculation or to make the coefficient less sensitive to non-normality in distributions. However, this view has little mathematical basis, as rank correlation coefficients measure a different type of relationship than the Pearson product-moment correlation coefficient, and are best seen as measures of a different type of association, rather than as an alternative measure of the population correlation coefficient.[7][8]
To illustrate the nature of rank correlation, and its difference from linear correlation, consider the following four pairs of numbers (x, y):

(0, 1), (10, 100), (101, 500), (102, 2000).

As we go from each pair to the next pair, x increases, and so does y. This relationship is perfect, in the sense that an increase in x is always accompanied by an increase in y. This means that we have a perfect rank correlation, and both Spearman's and Kendall's correlation coefficients are 1, whereas in this example the Pearson product-moment correlation coefficient is 0.7544, indicating that the points are far from lying on a straight line. In the same way, if y always decreases when x increases, the rank correlation coefficients will be −1, while the Pearson product-moment correlation coefficient may or may not be close to −1, depending on how close the points are to a straight line. Although in the extreme cases of perfect rank correlation the two coefficients are both equal (being both +1 or both −1), this is not generally the case, and so values of the two coefficients cannot meaningfully be compared.[7] For example, for the three pairs (1, 1), (2, 3), (3, 2) Spearman's coefficient is 1/2, while Kendall's coefficient is 1/3.
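The sketch below (assuming SciPy) reproduces this example numerically for the four pairs above:

    from scipy import stats

    x = [0, 10, 101, 102]
    y = [1, 100, 500, 2000]

    print(stats.pearsonr(x, y)[0])     # about 0.7544: far from a straight line
    print(stats.spearmanr(x, y)[0])    # 1.0: the ranks agree perfectly
    print(stats.kendalltau(x, y)[0])   # 1.0: every pair is concordant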

10.3 Other measures of dependence among random variables


The information given by a correlation coefficient is not enough to define the dependence structure between random variables.[9] The correlation coefficient completely defines the dependence structure only in very particular cases, for example when the distribution is a multivariate normal distribution. (See diagram above.) In the case of elliptical distributions it characterizes the (hyper-)ellipses of equal density; however, it does not completely characterize the dependence structure (for example, a multivariate t-distribution's degrees of freedom determine the level of tail dependence).
Distance correlation[10][11] was introduced to address the deficiency of Pearson's correlation that it can be zero for dependent random variables; zero distance correlation implies independence.
The Randomized Dependence Coefficient[12] is a computationally efficient, copula-based measure of dependence between multivariate random variables. RDC is invariant with respect to non-linear scalings of random variables, is capable of discovering a wide range of functional association patterns and takes value zero at independence.
The correlation ratio is able to detect almost any functional dependency, and the entropy-based mutual information, total correlation and dual total correlation are capable of detecting even more general dependencies. These are sometimes referred to as multi-moment correlation measures, in comparison to those that consider only second-moment (pairwise or quadratic) dependence.
The polychoric correlation is another correlation applied to ordinal data that aims to estimate the correlation between theorised latent variables.
One way to capture a more complete view of the dependence structure is to consider a copula between them.
The coefficient of determination generalizes the correlation coefficient for relationships beyond simple linear regression.


10.4 Sensitivity to the data distribution


The degree of dependence between variables X and Y does not depend on the scale on which the variables are expressed. That is, if we are analyzing the relationship between X and Y, most correlation measures are unaffected by transforming X to a + bX and Y to c + dY, where a, b, c, and d are constants (b and d being positive). This is true of some correlation statistics as well as their population analogues. Some correlation statistics, such as the rank correlation coefficient, are also invariant to monotone transformations of the marginal distributions of X and/or Y.

[Figure: Pearson/Spearman correlation coefficients between X and Y are shown when the two variables' ranges are unrestricted, and when the range of X is restricted to the interval (0, 1).]

Most correlation measures are sensitive to the manner in which X and Y are sampled. Dependencies tend to be stronger if viewed over a wider range of values. Thus, if we consider the correlation coefficient between the heights of fathers and their sons over all adult males, and compare it to the same correlation coefficient calculated when the fathers are selected to be between 165 cm and 170 cm in height, the correlation will be weaker in the latter case. Several techniques have been developed that attempt to correct for range restriction in one or both variables, and are commonly used in meta-analysis; the most common are Thorndike's case II and case III equations.[13]
Various correlation measures in use may be undefined for certain joint distributions of X and Y. For example, the Pearson correlation coefficient is defined in terms of moments, and hence will be undefined if the moments are undefined. Measures of dependence based on quantiles are always defined. Sample-based statistics intended to estimate population measures of dependence may or may not have desirable statistical properties such as being unbiased, or asymptotically consistent, based on the spatial structure of the population from which the data were sampled.
Sensitivity to the data distribution can be used to an advantage. For example, scaled correlation is designed to use the sensitivity to the range in order to pick out correlations between fast components of time series.[14] By reducing the range of values in a controlled manner, the correlations on long time scales are filtered out and only the correlations on short time scales are revealed.
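A small simulation (hypothetical numbers, assuming NumPy) illustrates the range-restriction effect described above: the father-son height correlation computed over all pairs is noticeably stronger than the one computed only for fathers between 165 cm and 170 cm:

    import numpy as np

    rng = np.random.default_rng(5)
    fathers = rng.normal(175.0, 7.0, size=20000)
    sons = 175.0 + 0.5 * (fathers - 175.0) + rng.normal(0.0, 6.0, size=20000)

    full = np.corrcoef(fathers, sons)[0, 1]

    mask = (fathers >= 165.0) & (fathers <= 170.0)   # restrict the range of X
    restricted = np.corrcoef(fathers[mask], sons[mask])[0, 1]

    print(full, restricted)   # the restricted correlation is markedly weaker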


10.5 Correlation matrices


The correlation matrix of n random variables X₁, ..., Xₙ is the n × n matrix whose (i, j) entry is corr(Xᵢ, Xⱼ). If the measures of correlation used are product-moment coefficients, the correlation matrix is the same as the covariance matrix of the standardized random variables Xᵢ / σ(Xᵢ) for i = 1, ..., n. This applies both to the matrix of population correlations (in which case σ is the population standard deviation) and to the matrix of sample correlations (in which case σ denotes the sample standard deviation). Consequently, each is necessarily a positive-semidefinite matrix.
The correlation matrix is symmetric because the correlation between Xᵢ and Xⱼ is the same as the correlation between Xⱼ and Xᵢ.
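A minimal sketch (hypothetical data, assuming NumPy) confirming that the correlation matrix equals the covariance matrix of the standardized variables:

    import numpy as np

    rng = np.random.default_rng(6)
    X = rng.normal(size=(500, 3))
    X[:, 1] += 0.8 * X[:, 0]           # introduce some dependence

    corr = np.corrcoef(X, rowvar=False)

    Z = (X - X.mean(axis=0)) / X.std(axis=0)     # standardize each column
    cov_of_z = np.cov(Z, rowvar=False, ddof=0)   # covariance of standardized data

    print(np.allclose(corr, cov_of_z))   # True: the two matrices coincide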

10.6 Common misconceptions


10.6.1

Correlation and causality

Main article: Correlation does not imply causation


See also: Normally distributed and uncorrelated does not imply independent
The conventional dictum that "correlation does not imply causation" means that correlation cannot be used to infer a causal relationship between the variables.[15] This dictum should not be taken to mean that correlations cannot indicate the potential existence of causal relations. However, the causes underlying the correlation, if any, may be indirect and unknown, and high correlations also overlap with identity relations (tautologies), where no causal process exists. Consequently, establishing a correlation between two variables is not a sufficient condition to establish a causal relationship (in either direction).
A correlation between age and height in children is fairly causally transparent, but a correlation between mood and
health in people is less so. Does improved mood lead to improved health, or does good health lead to good mood, or
both? Or does some other factor underlie both? In other words, a correlation can be taken as evidence for a possible
causal relationship, but cannot indicate what the causal relationship, if any, might be.

10.6.2

Correlation and linearity

The Pearson correlation coefficient indicates the strength of a linear relationship between two variables, but its value generally does not completely characterize their relationship.[16] In particular, if the conditional mean of Y given X, denoted E(Y | X), is not linear in X, the correlation coefficient will not fully determine the form of E(Y | X).
The accompanying image shows scatter plots of Anscombe's quartet, a set of four different pairs of variables created by Francis Anscombe.[17] The four y variables have the same mean (7.5), variance (4.12), correlation (0.816) and regression line (y = 3 + 0.5x). However, as can be seen on the plots, the distribution of the variables is very different. The first one (top left) seems to be distributed normally, and corresponds to what one would expect when considering two variables correlated and following the assumption of normality. The second one (top right) is not distributed normally; while an obvious relationship between the two variables can be observed, it is not linear. In this case the Pearson correlation coefficient does not indicate that there is an exact functional relationship: only the extent to which that relationship can be approximated by a linear relationship. In the third case (bottom left), the linear relationship is perfect, except for one outlier which exerts enough influence to lower the correlation coefficient from 1 to 0.816. Finally, the fourth example (bottom right) shows another example when one outlier is enough to produce a high correlation coefficient, even though the relationship between the two variables is not linear.
These examples indicate that the correlation coefficient, as a summary statistic, cannot replace visual examination of the data. Note that the examples are sometimes said to demonstrate that the Pearson correlation assumes that the data follow a normal distribution, but this is not correct.[4]


[Figure: Four sets of data (Anscombe's quartet) with the same correlation of 0.816.]

10.7 Bivariate normal distribution


If a pair (X, Y) of random variables follows a bivariate normal distribution, the conditional mean E(X | Y) is a linear function of Y, and the conditional mean E(Y | X) is a linear function of X. The correlation coefficient r between X and Y, along with the marginal means and variances of X and Y, determines this linear relationship:

E(Y \mid X) = E(Y) + r \sigma_y \frac{X - E(X)}{\sigma_x},

where E(X) and E(Y) are the expected values of X and Y, respectively, and σ_x and σ_y are the standard deviations of X and Y, respectively.
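A quick simulation (assuming NumPy) can check this: sample from a bivariate normal distribution and compare the empirical mean of Y among points with X near a chosen value against the formula above.

    import numpy as np

    rng = np.random.default_rng(7)
    mu = [1.0, 2.0]
    r, sx, sy = 0.6, 2.0, 3.0
    cov = [[sx**2, r * sx * sy], [r * sx * sy, sy**2]]
    X, Y = rng.multivariate_normal(mu, cov, size=200000).T

    x0 = 3.0
    near = np.abs(X - x0) < 0.05                  # points with X close to x0
    print(Y[near].mean())                         # empirical conditional mean
    print(mu[1] + r * sy * (x0 - mu[0]) / sx)     # the formula gives 3.8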

10.8 Partial correlation


Main article: Partial correlation
If a population or data-set is characterized by more than two variables, a partial correlation coefficient measures the strength of dependence between a pair of variables that is not accounted for by the way in which they both change in response to variations in a selected subset of the other variables.

10.9 See also


Further information: Correlation (disambiguation)



Autocorrelation
Canonical correlation
Coefficient of determination
Cointegration
Concordance correlation coefficient
Cophenetic correlation
Copula
Correlation function
Correlation gap
Covariance
Covariance and correlation
Cross-correlation
Ecological correlation
Fraction of variance unexplained
Genetic correlation
Goodman and Kruskal's lambda
Illusory correlation
Interclass correlation
Intraclass correlation
Lift (data mining)
Modifiable areal unit problem
Multiple correlation
Point-biserial correlation coefficient
Quadrant count ratio
Spurious correlation
Statistical arbitrage
Subindependence

10.10 References
[1] Croxton, Frederick Emory; Cowden, Dudley Johnstone; Klein, Sidney (1968) Applied General Statistics, Pitman. ISBN
9780273403159 (page 625)
[2] Dietrich, Cornelius Frank (1991) Uncertainty, Calibration and Probability: The Statistics of Scientific and Industrial Measurement 2nd Edition, A. Hilger. ISBN 9780750300605 (Page 331)
[3] Aitken, Alexander Craig (1957) Statistical Mathematics 8th Edition. Oliver & Boyd. ISBN 9780050013007 (Page 95)
[4] Rodgers, J. L.; Nicewander, W. A. (1988). "Thirteen ways to look at the correlation coefficient". The American Statistician. 42 (1): 59–66. doi:10.1080/00031305.1988.10475524. JSTOR 2685263.
[5] Dowdy, S. and Wearden, S. (1983). Statistics for Research, Wiley. ISBN 0-471-08602-9 pp 230


[6] Francis, DP; Coats AJ; Gibson D (1999). "How high can a correlation coefficient be?". Int J Cardiol. 69 (2): 185–199. doi:10.1016/S0167-5273(99)00028-5.
[7] Yule, G.U. and Kendall, M.G. (1950), An Introduction to the Theory of Statistics, 14th Edition (5th Impression 1968). Charles Griffin & Co. pp. 258–270
[8] Kendall, M. G. (1955) Rank Correlation Methods, Charles Griffin & Co.
[9] Mahdavi Damghani B. (2013). "The Non-Misleading Value of Inferred Correlation: An Introduction to the Cointelation Model". Wilmott Magazine. doi:10.1002/wilm.10252.
[10] Székely, G. J.; Rizzo, M. L.; Bakirov, N. K. (2007). "Measuring and testing independence by correlation of distances". Annals of Statistics. 35 (6): 2769–2794. doi:10.1214/009053607000000505.
[11] Székely, G. J.; Rizzo, M. L. (2009). "Brownian distance covariance". Annals of Applied Statistics. 3 (4): 1233–1303. doi:10.1214/09-AOAS312.
[12] Lopez-Paz D. and Hennig P. and Schölkopf B. (2013). "The Randomized Dependence Coefficient", Conference on Neural Information Processing Systems. Reprint
[13] Thorndike, Robert Ladd (1947). Research problems and techniques (Report No. 3). Washington DC: US Govt. print. off.
[14] Nikolić, D; Muresan, RC; Feng, W; Singer, W (2012). "Scaled correlation analysis: a better way to compute a cross-correlogram". European Journal of Neuroscience: 1–21. doi:10.1111/j.1460-9568.2011.07987.x.
[15] Aldrich, John (1995). "Correlations Genuine and Spurious in Pearson and Yule". Statistical Science. 10 (4): 364–376. doi:10.1214/ss/1177009870. JSTOR 2246135.
[16] Mahdavi Damghani, Babak (2012). "The Misleading Value of Measured Correlation". Wilmott. 2012 (1): 64–73. doi:10.1002/wilm.10167.
[17] Anscombe, Francis J. (1973). "Graphs in statistical analysis". The American Statistician. 27: 17–21. doi:10.2307/2682899. JSTOR 2682899.

10.11 Further reading


Cohen, J., Cohen P., West, S.G., & Aiken, L.S. (2002). Applied multiple regression/correlation analysis for the
behavioral sciences (3rd ed.). Psychology Press. ISBN 0-8058-2223-2.
Hazewinkel, Michiel, ed. (2001), "Correlation (in statistics)", Encyclopedia of Mathematics, Springer, ISBN 978-1-55608-010-4
Oestreicher, J. & D. R. (February 26, 2015). Plague of Equals: A science thriller of international disease,
politics and drug discovery. California: Omega Cat Press. p. 408. ISBN 978-0963175540.

10.12 External links


MathWorld page on the (cross-)correlation coefficient(s) of a sample
Compute significance between two correlations, for the comparison of two correlation values.
A MATLAB Toolbox for computing Weighted Correlation Coefficients
Proof-that-the-Sample-Bivariate-Correlation-has-limits-plus-or-minus-1
Interactive Flash simulation on the correlation of two normally distributed variables by Juha Puranen.
Correlation analysis. Biomedical Statistics
R-Psychologist Correlation visualization of correlation between two numeric variables

Chapter 11

Regression analysis
In statistical modeling, regression analysis is a statistical process for estimating the relationships among variables. It
includes many techniques for modeling and analyzing several variables, when the focus is on the relationship between
a dependent variable and one or more independent variables (or 'predictors'). More specifically, regression analysis helps one understand how the typical value of the dependent variable (or 'criterion variable') changes when any one of the independent variables is varied, while the other independent variables are held fixed. Most commonly, regression analysis estimates the conditional expectation of the dependent variable given the independent variables, that is, the average value of the dependent variable when the independent variables are fixed. Less commonly, the focus is on a quantile, or other location parameter of the conditional distribution of the dependent variable given the independent variables. In all cases, the estimation target is a function of the independent variables called the regression function. In regression analysis, it is also of interest to characterize the variation of the dependent variable around the regression function, which can be described by a probability distribution. A related but distinct approach is Necessary Condition Analysis (NCA), which estimates the maximum (rather than average) value of the dependent variable for a given value of the independent variable (ceiling line rather than central line) in order to identify what value of the independent variable is necessary but not sufficient for a given value of the dependent variable.
Regression analysis is widely used for prediction and forecasting, where its use has substantial overlap with the field of
machine learning. Regression analysis is also used to understand which among the independent variables are related
to the dependent variable, and to explore the forms of these relationships. In restricted circumstances, regression
analysis can be used to infer causal relationships between the independent and dependent variables. However this can
lead to illusions or false relationships, so caution is advisable;[1] for example, correlation does not imply causation.
Many techniques for carrying out regression analysis have been developed. Familiar methods such as linear regression
and ordinary least squares regression are parametric, in that the regression function is defined in terms of a finite number of unknown parameters that are estimated from the data. Nonparametric regression refers to techniques that allow the regression function to lie in a specified set of functions, which may be infinite-dimensional.
The performance of regression analysis methods in practice depends on the form of the data generating process, and
how it relates to the regression approach being used. Since the true form of the data-generating process is generally
not known, regression analysis often depends to some extent on making assumptions about this process. These
assumptions are sometimes testable if a sufficient quantity of data is available. Regression models for prediction are
often useful even when the assumptions are moderately violated, although they may not perform optimally. However,
in many applications, especially with small eects or questions of causality based on observational data, regression
methods can give misleading results.[2][3]
In a narrower sense, regression may refer specifically to the estimation of continuous response variables, as opposed to the discrete response variables used in classification.[4] The case of a continuous output variable may be more specifically referred to as metric regression to distinguish it from related problems.[5]

11.1 History
The earliest form of regression was the method of least squares, which was published by Legendre in 1805,[6] and by
Gauss in 1809.[7] Legendre and Gauss both applied the method to the problem of determining, from astronomical
observations, the orbits of bodies about the Sun (mostly comets, but also later the then newly discovered minor
planets). Gauss published a further development of the theory of least squares in 1821,[8] including a version of the Gauss–Markov theorem.
The term "regression" was coined by Francis Galton in the nineteenth century to describe a biological phenomenon. The phenomenon was that the heights of descendants of tall ancestors tend to regress down towards a normal average (a phenomenon also known as regression toward the mean).[9][10] For Galton, regression had only this biological meaning,[11][12] but his work was later extended by Udny Yule and Karl Pearson to a more general statistical context.[13][14] In the work of Yule and Pearson, the joint distribution of the response and explanatory variables is assumed to be Gaussian. This assumption was weakened by R.A. Fisher in his works of 1922 and 1925.[15][16][17] Fisher assumed that the conditional distribution of the response variable is Gaussian, but the joint distribution need not be. In this respect, Fisher's assumption is closer to Gauss's formulation of 1821.
In the 1950s and 1960s, economists used electromechanical desk calculators to calculate regressions. Before 1970,
it sometimes took up to 24 hours to receive the result from one regression.[18]
Regression methods continue to be an area of active research. In recent decades, new methods have been developed
for robust regression, regression involving correlated responses such as time series and growth curves, regression
in which the predictor (independent variable) or response variables are curves, images, graphs, or other complex
data objects, regression methods accommodating various types of missing data, nonparametric regression, Bayesian
methods for regression, regression in which the predictor variables are measured with error, regression with more
predictor variables than observations, and causal inference with regression.

11.2 Regression models


Regression models involve the following variables:
The unknown parameters, denoted as β, which may represent a scalar or a vector.
The independent variables, X.
The dependent variable, Y.
In various fields of application, different terminologies are used in place of dependent and independent variables.
A regression model relates Y to a function of X and β:

Y ≈ f(X, β)

The approximation is usually formalized as E(Y | X) = f(X, β). To carry out regression analysis, the form of the function f must be specified. Sometimes the form of this function is based on knowledge about the relationship between Y and X that does not rely on the data. If no such knowledge is available, a flexible or convenient form for f is chosen.
Assume now that the vector of unknown parameters β is of length k. In order to perform a regression analysis the user must provide information about the dependent variable Y:
If N data points of the form (Y, X) are observed, where N < k, most classical approaches to regression analysis cannot be performed: since the system of equations defining the regression model is underdetermined, there are not enough data to recover β.
If exactly N = k data points are observed, and the function f is linear, the equations Y = f(X, β) can be solved exactly rather than approximately. This reduces to solving a set of N equations with N unknowns (the elements of β), which has a unique solution as long as the X are linearly independent. If f is nonlinear, a solution may not exist, or many solutions may exist.
The most common situation is where N > k data points are observed. In this case, there is enough information in the data to estimate a unique value for β that best fits the data in some sense, and the regression model when applied to the data can be viewed as an overdetermined system in β.
In the last case, the regression analysis provides the tools for:



1. Finding a solution for unknown parameters β that will, for example, minimize the distance between the measured and predicted values of the dependent variable Y (also known as the method of least squares).
2. Under certain statistical assumptions, the regression analysis uses the surplus of information to provide statistical information about the unknown parameters β and predicted values of the dependent variable Y.

11.2.1

Necessary number of independent measurements

Consider a regression model which has three unknown parameters, β₀, β₁, and β₂. Suppose an experimenter performs 10 measurements all at exactly the same value of the independent variable vector X (which contains the independent variables X₁, X₂, and X₃). In this case, regression analysis fails to give a unique set of estimated values for the three unknown parameters; the experimenter did not provide enough information. The best one can do is to estimate the average value and the standard deviation of the dependent variable Y. Similarly, measuring at two different values of X would give enough data for a regression with two unknowns, but not for three or more unknowns.
If the experimenter had performed measurements at three different values of the independent variable vector X, then regression analysis would provide a unique set of estimates for the three unknown parameters in β.
In the case of general linear regression, the above statement is equivalent to the requirement that the matrix XᵀX is invertible.
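A small sketch (hypothetical design, assuming NumPy) makes this concrete: when all measurements repeat the same setting of the independent variables, XᵀX is singular and the parameters cannot be identified, while three distinct settings restore a full-rank, uniquely solvable system:

    import numpy as np

    # Design with an intercept and two predictors, all measured at one setting:
    row = np.array([1.0, 2.0, 5.0])
    X_bad = np.tile(row, (10, 1))                     # 10 identical measurements
    print(np.linalg.matrix_rank(X_bad.T @ X_bad))     # 1 < 3: singular, no unique fit

    # Three distinct settings of the predictors make X^T X invertible:
    X_good = np.array([[1.0, 2.0, 5.0],
                       [1.0, 3.0, 1.0],
                       [1.0, 4.0, 2.0]])
    print(np.linalg.matrix_rank(X_good.T @ X_good))   # 3: unique least squares fit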

11.2.2

Statistical assumptions

When the number of measurements, N, is larger than the number of unknown parameters, k, and the measurement errors are normally distributed, then the excess of information contained in (N − k) measurements is used to make statistical predictions about the unknown parameters. This excess of information is referred to as the degrees of freedom of the regression.

11.3 Underlying assumptions


Classical assumptions for regression analysis include:
The sample is representative of the population for the inference prediction.
The error is a random variable with a mean of zero conditional on the explanatory variables.
The independent variables are measured with no error. (Note: If this is not so, modeling may be done instead
using errors-in-variables model techniques).
The independent variables (predictors) are linearly independent, i.e. it is not possible to express any predictor
as a linear combination of the others.
The errors are uncorrelated, that is, the variance-covariance matrix of the errors is diagonal and each non-zero element is the variance of the error.
The variance of the error is constant across observations (homoscedasticity). If not, weighted least squares or
other methods might instead be used.
These are sufficient conditions for the least-squares estimator to possess desirable properties; in particular, these assumptions imply that the parameter estimates will be unbiased, consistent, and efficient in the class of linear unbiased estimators. It is important to note that actual data rarely satisfy the assumptions. That is, the method is used even though the assumptions are not true. Variation from the assumptions can sometimes be used as a measure of how far the model is from being useful. Many of these assumptions may be relaxed in more advanced treatments. Reports of statistical analyses usually include analyses of tests on the sample data and methodology for the fit and usefulness of the model.
Assumptions include the geometrical support of the variables.[19] Independent and dependent variables often refer to
values measured at point locations. There may be spatial trends and spatial autocorrelation in the variables that violate
statistical assumptions of regression. Geographically weighted regression is one technique to deal with such data.[20] Also, variables may include values aggregated by areas. With aggregated data the modifiable areal unit problem can cause extreme variation in regression parameters.[21] When analyzing data aggregated by political boundaries, postal codes or census areas, results may be very distinct with a different choice of units.

11.4 Linear regression


Main article: Linear regression
See simple linear regression for a derivation of these formulas and a numerical example
In linear regression, the model specification is that the dependent variable, yᵢ, is a linear combination of the parameters (but need not be linear in the independent variables). For example, in simple linear regression for modeling n data points there is one independent variable, xᵢ, and two parameters, β₀ and β₁:

y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \quad i = 1, \dots, n.

In multiple linear regression, there are several independent variables or functions of independent variables.
Adding a term in xᵢ² to the preceding regression gives:

y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \varepsilon_i, \quad i = 1, \dots, n.

This is still linear regression; although the expression on the right hand side is quadratic in the independent variable xᵢ, it is linear in the parameters β₀, β₁ and β₂.
In both cases, εᵢ is an error term and the subscript i indexes a particular observation.
Returning our attention to the straight line case: given a random sample from the population, we estimate the population parameters and obtain the sample linear regression model:

\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i.

The residual, e_i = y_i - \hat{y}_i, is the difference between the value of the dependent variable predicted by the model, \hat{y}_i, and the true value of the dependent variable, yᵢ. One method of estimation is ordinary least squares. This method obtains parameter estimates that minimize the sum of squared residuals, SSE,[22][23] also sometimes denoted RSS:

SSE = \sum_{i=1}^{n} e_i^2.

Minimization of this function results in a set of normal equations, a set of simultaneous linear equations in the parameters, which are solved to yield the parameter estimators, \hat{\beta}_0 and \hat{\beta}_1.
In the case of simple regression, the formulas for the least squares estimates are

\hat{\beta}_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} \quad \text{and} \quad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x},

where \bar{x} is the mean (average) of the x values and \bar{y} is the mean of the y values.
Under the assumption that the population error term has a constant variance, the estimate of that variance is given by:

\hat{\sigma}^2_\varepsilon = \frac{SSE}{n - 2}.

This is called the mean square error (MSE) of the regression. The denominator is the sample size reduced by the number of model parameters estimated from the same data: (n − p) for p regressors or (n − p − 1) if an intercept is used.[24] In this case, p = 1, so the denominator is n − 2.


[Figure: Illustration of linear regression on a data set.]

The standard errors of the parameter estimates are given by

\hat{\sigma}_{\beta_0} = \hat{\sigma}_\varepsilon \sqrt{\frac{1}{n} + \frac{\bar{x}^2}{\sum (x_i - \bar{x})^2}},
\qquad
\hat{\sigma}_{\beta_1} = \hat{\sigma}_\varepsilon \sqrt{\frac{1}{\sum (x_i - \bar{x})^2}}.

Under the further assumption that the population error term is normally distributed, the researcher can use these estimated standard errors to create confidence intervals and conduct hypothesis tests about the population parameters.
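The following sketch (hypothetical data, assuming NumPy) implements these closed-form estimates and their standard errors directly:

    import numpy as np

    rng = np.random.default_rng(8)
    x = rng.uniform(0.0, 10.0, size=50)
    y = 3.0 + 0.5 * x + rng.normal(0.0, 1.0, size=50)   # true intercept 3, slope 0.5

    xbar, ybar = x.mean(), y.mean()
    sxx = ((x - xbar) ** 2).sum()

    b1 = ((x - xbar) * (y - ybar)).sum() / sxx   # slope estimate
    b0 = ybar - b1 * xbar                        # intercept estimate

    resid = y - (b0 + b1 * x)
    sigma = np.sqrt((resid ** 2).sum() / (x.size - 2))  # sqrt of SSE/(n - 2)

    se_b0 = sigma * np.sqrt(1.0 / x.size + xbar**2 / sxx)
    se_b1 = sigma * np.sqrt(1.0 / sxx)

    print(b0, se_b0)   # intercept and its standard error
    print(b1, se_b1)   # slope and its standard error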

11.4.1

General linear model

For a derivation, see linear least squares


For a numerical example, see linear regression
In the more general multiple regression model, there are p independent variables:

y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \varepsilon_i,

where x_{ij} is the i-th observation on the j-th independent variable. If the first independent variable takes the value 1 for all i, x_{i1} = 1, then β₁ is called the regression intercept.
The least squares parameter estimates are obtained from p normal equations. The residual can be written as

e_i = y_i - \hat{\beta}_1 x_{i1} - \cdots - \hat{\beta}_p x_{ip}.

The normal equations are

\sum_{i=1}^{n} \sum_{k=1}^{p} X_{ij} X_{ik} \hat{\beta}_k = \sum_{i=1}^{n} X_{ij} y_i, \quad j = 1, \dots, p.

In matrix notation, the normal equations are written as

(X^\top X) \hat{\beta} = X^\top Y,

where the ij element of X is x_{ij}, the i element of the column vector Y is y_i, and the j element of \hat{\beta} is \hat{\beta}_j. Thus X is n×p, Y is n×1, and \hat{\beta} is p×1. The solution is

\hat{\beta} = (X^\top X)^{-1} X^\top Y.
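A compact sketch (hypothetical data, assuming NumPy) of the matrix solution; in practice a QR- or SVD-based solver such as np.linalg.lstsq is preferred over forming (XᵀX)⁻¹ explicitly, for numerical stability:

    import numpy as np

    rng = np.random.default_rng(9)
    n, p = 200, 3
    X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])  # x_i1 = 1
    beta_true = np.array([1.0, -2.0, 0.5])
    y = X @ beta_true + rng.normal(0.0, 0.3, size=n)

    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)        # solve the normal equations
    beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)  # numerically stabler route

    print(beta_hat)                           # estimates near (1, -2, 0.5)
    print(np.allclose(beta_hat, beta_lstsq))  # True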

11.4.2

Diagnostics

Main article: Regression diagnostics


See also: Category:Regression diagnostics.
Once a regression model has been constructed, it may be important to confirm the goodness of fit of the model and the statistical significance of the estimated parameters. Commonly used checks of goodness of fit include the R-squared, analyses of the pattern of residuals and hypothesis testing. Statistical significance can be checked by an F-test of the overall fit, followed by t-tests of individual parameters.
Interpretations of these diagnostic tests rest heavily on the model assumptions. Although examination of the residuals can be used to invalidate a model, the results of a t-test or F-test are sometimes more difficult to interpret if the model's assumptions are violated. For example, if the error term does not have a normal distribution, in small samples the estimated parameters will not follow normal distributions, which complicates inference. With relatively large samples, however, a central limit theorem can be invoked such that hypothesis testing may proceed using asymptotic approximations.

11.4.3

Limited dependent variables

The phrase "limited dependent" is used in econometric statistics for categorical and constrained variables.
The response variable may be non-continuous (limited to lie on some subset of the real line). For binary (zero or one) variables, if analysis proceeds with least-squares linear regression, the model is called the linear probability model. Nonlinear models for binary dependent variables include the probit and logit model. The multivariate probit model is a standard method of estimating a joint relationship between several binary dependent variables and some independent variables. For categorical variables with more than two values there is the multinomial logit. For ordinal variables with more than two values, there are the ordered logit and ordered probit models. Censored regression models may be used when the dependent variable is only sometimes observed, and Heckman correction type models may be used when the sample is not randomly selected from the population of interest. An alternative to such procedures is linear regression based on polychoric correlation (or polyserial correlations) between the categorical variables. Such procedures differ in the assumptions made about the distribution of the variables in the population. If the variable is positive with low values and represents the repetition of the occurrence of an event, then count models like the Poisson regression or the negative binomial model may be used instead.
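As a brief illustration of the logit model mentioned above, the sketch below (hypothetical data, assuming the statsmodels package is available) fits a logistic regression to a binary response by maximum likelihood:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(10)
    x = rng.normal(size=300)
    p = 1.0 / (1.0 + np.exp(-(-0.5 + 1.5 * x)))   # true logistic probabilities
    y = rng.binomial(1, p)                        # binary (zero or one) response

    X = sm.add_constant(x)                 # intercept column plus the predictor
    result = sm.Logit(y, X).fit(disp=0)
    print(result.params)                   # estimates roughly near (-0.5, 1.5)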

11.5 Interpolation and extrapolation


Regression models predict a value of the Y variable given known values of the X variables. Prediction within the range of values in the dataset used for model-fitting is known informally as interpolation. Prediction outside this range of the data is known as extrapolation. Performing extrapolation relies strongly on the regression assumptions.


The further the extrapolation goes outside the data, the more room there is for the model to fail due to differences between the assumptions and the sample data or the true values.
It is generally advised that when performing extrapolation, one should accompany the estimated value of the dependent
variable with a prediction interval that represents the uncertainty. Such intervals tend to expand rapidly as the values
of the independent variable(s) moved outside the range covered by the observed data.
For such reasons and others, some tend to say that it might be unwise to undertake extrapolation.[25]
However, this does not cover the full set of modelling errors that may be being made: in particular, the assumption
of a particular form for the relation between Y and X. A properly conducted regression analysis will include an
assessment of how well the assumed form is matched by the observed data, but it can only do so within the range of
values of the independent variables actually available. This means that any extrapolation is particularly reliant on the
assumptions being made about the structural form of the regression relationship. Best-practice advice here is that a
linear-in-variables and linear-in-parameters relationship should not be chosen simply for computational convenience,
but that all available knowledge should be deployed in constructing a regression model. If this knowledge includes
the fact that the dependent variable cannot go outside a certain range of values, this can be made use of in selecting
the model even if the observed dataset has no values particularly near such bounds. The implications of this step
of choosing an appropriate functional form for the regression can be great when extrapolation is considered. At a minimum, it can ensure that any extrapolation arising from a fitted model is realistic (or in accord with what is known).
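The widening of such prediction intervals outside the observed range can be seen in a small sketch; statsmodels is assumed here, and the data and evaluation points are invented for illustration:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
x = rng.uniform(0, 10, size=50)              # observed X-range: [0, 10]
y = 1.0 + 0.8 * x + rng.normal(0, 1, size=50)
res = sm.OLS(y, sm.add_constant(x)).fit()

# One interpolated point (x = 5) and one extrapolated point (x = 25).
x_new = np.array([5.0, 25.0])
pred = res.get_prediction(sm.add_constant(x_new))
print(pred.summary_frame(alpha=0.05)[['mean', 'obs_ci_lower', 'obs_ci_upper']])

The 95% prediction interval at x = 25 is noticeably wider than at x = 5, and that width reflects only sampling uncertainty, not a possibly wrong functional form.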

11.6 Nonlinear regression


Main article: Nonlinear regression
When the model function is not linear in the parameters, the sum of squares must be minimized by an iterative procedure. This introduces many complications, which are summarized in Differences between linear and non-linear least squares.
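A minimal sketch of such an iterative fit, assuming SciPy's curve_fit and a made-up exponential-decay model:

import numpy as np
from scipy.optimize import curve_fit

def model(x, a, b):
    # Exponential decay: nonlinear in the parameter b.
    return a * np.exp(-b * x)

rng = np.random.default_rng(3)
x = np.linspace(0, 4, 40)
y = model(x, 2.5, 1.3) + rng.normal(0, 0.1, size=x.size)

# The sum of squares is minimized iteratively, starting from an
# initial guess p0 for the parameters.
params, cov = curve_fit(model, x, y, p0=[1.0, 1.0])
print(params)   # should be close to the true values (2.5, 1.3)

Unlike linear least squares, the result can depend on the starting values p0, and the iteration may converge to a local rather than global minimum.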

11.7 Power and sample size calculations


There are no generally agreed methods for relating the number of observations to the number of independent variables in the model. One rule of thumb suggested by Good and Hardin is N = m^n, where N is the sample size, n is the number of independent variables and m is the number of observations needed to reach the desired precision if the model had only one independent variable.[26] For example, a researcher is building a linear regression model using a dataset that contains 1000 patients (N). If the researcher decides that five observations are needed to precisely define a straight line (m), then the maximum number of independent variables the model can support is 4, because

\frac{\log 1000}{\log 5} = 4.29.
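The arithmetic of this rule of thumb is easy to check; the following lines reuse the same illustrative numbers (N = 1000, m = 5):

import math

N, m = 1000, 5
n_max = math.log(N) / math.log(m)   # solve N = m**n for n
print(n_max)                        # about 4.29, so at most 4 variables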

11.8 Other methods


Although the parameters of a regression model are usually estimated using the method of least squares, other methods
which have been used include:
Bayesian methods, e.g. Bayesian linear regression
Percentage regression, for situations where reducing percentage errors is deemed more appropriate.[27]
Least absolute deviations, which is more robust in the presence of outliers, leading to quantile regression
Nonparametric regression, which requires a large number of observations and is computationally intensive
Distance metric learning, which searches for a meaningful distance metric in a given input space.[28]


11.9 Software
Main article: List of statistical packages
All major statistical software packages perform least squares regression analysis and inference. Simple linear regression and multiple regression using least squares can be done in some spreadsheet applications and on some calculators.
While many statistical software packages can perform various types of nonparametric and robust regression, these methods are less standardized; different software packages implement different methods, and a method with a given name may be implemented differently in different packages. Specialized regression software has been developed for use in fields such as survey analysis and neuroimaging.

11.10 See also


Curve fitting
Estimation theory
Forecasting
Fraction of variance unexplained
Function approximation
Generalized linear models
Kriging (a linear least squares estimation algorithm)
Local regression
Modifiable areal unit problem
Multivariate adaptive regression splines
Multivariate normal distribution
Pearson product-moment correlation coefficient
Prediction interval
Regression validation
Robust regression
Segmented regression
Signal processing
Stepwise regression
Trend estimation

11.11 References
[1] Armstrong, J. Scott (2012). "Illusions in Regression Analysis". International Journal of Forecasting (forthcoming). 28 (3): 689. doi:10.1016/j.ijforecast.2012.02.001.
[2] David A. Freedman, Statistical Models: Theory and Practice, Cambridge University Press (2005)
[3] R. Dennis Cook; Sanford Weisberg, "Criticism and Influence Analysis in Regression", Sociological Methodology, Vol. 13. (1982), pp. 313–361


[4] Christopher M. Bishop (2006). Pattern Recognition and Machine Learning. Springer. p. 3. "Cases [...] in which the aim is to assign each input vector to one of a finite number of discrete categories, are called classification problems. If the desired output consists of one or more continuous variables, then the task is called regression."
[5] Waegeman, Willem; De Baets, Bernard; Boullart, Luc (2008). "ROC analysis in ordinal regression learning". Pattern Recognition Letters. 29: 1–9. doi:10.1016/j.patrec.2007.07.019.
[6] A.M. Legendre. Nouvelles méthodes pour la détermination des orbites des comètes, Firmin Didot, Paris, 1805. "Sur la Méthode des moindres quarrés" appears as an appendix.
[7] C.F. Gauss. Theoria Motus Corporum Coelestium in Sectionibus Conicis Solem Ambientum. (1809)
[8] C.F. Gauss. Theoria combinationis observationum erroribus minimis obnoxiae. (1821/1823)
[9] Mogull, Robert G. (2004). Second-Semester Applied Statistics. Kendall/Hunt Publishing Company. p. 59. ISBN 0-7575-1181-3.
[10] Galton, Francis (1989). "Kinship and Correlation (reprinted 1989)". Statistical Science. Institute of Mathematical Statistics. 4 (2): 80–86. doi:10.1214/ss/1177012581. JSTOR 2245330.
[11] Francis Galton. "Typical laws of heredity", Nature 15 (1877), 492–495, 512–514, 532–533. (Galton uses the term "reversion" in this paper, which discusses the size of peas.)
[12] Francis Galton. Presidential address, Section H, Anthropology. (1885) (Galton uses the term "regression" in this paper, which discusses the height of humans.)
[13] Yule, G. Udny (1897). "On the Theory of Correlation". Journal of the Royal Statistical Society. Blackwell Publishing. 60 (4): 812–54. doi:10.2307/2979746. JSTOR 2979746.
[14] Pearson, Karl; Yule, G.U.; Blanchard, Norman; Lee, Alice (1903). "The Law of Ancestral Heredity". Biometrika. Biometrika Trust. 2 (2): 211–236. doi:10.1093/biomet/2.2.211. JSTOR 2331683.
[15] Fisher, R.A. (1922). "The goodness of fit of regression formulae, and the distribution of regression coefficients". Journal of the Royal Statistical Society. Blackwell Publishing. 85 (4): 597–612. doi:10.2307/2341124. JSTOR 2341124.
[16] Ronald A. Fisher (1954). Statistical Methods for Research Workers (Twelfth ed.). Edinburgh: Oliver and Boyd. ISBN 0-05-002170-2.
[17] Aldrich, John (2005). "Fisher and Regression". Statistical Science. 20 (4): 401–417. doi:10.1214/088342305000000331. JSTOR 20061201.
[18] Rodney Ramcharan. "Regressions: Why Are Economists Obsessed with Them?" March 2006. Accessed 2011-12-03.
[19] N. Cressie (1996) Change of Support and the Modifiable Areal Unit Problem. Geographical Systems 3:159–180.
[20] Fotheringham, A. Stewart; Brunsdon, Chris; Charlton, Martin (2002). Geographically weighted regression: the analysis of spatially varying relationships (Reprint ed.). Chichester, England: John Wiley. ISBN 978-0-471-49616-8.
[21] Fotheringham, AS; Wong, DWS (1 January 1991). "The modifiable areal unit problem in multivariate statistical analysis". Environment and Planning A. 23 (7): 1025–1044. doi:10.1068/a231025.
[22] M. H. Kutner, C. J. Nachtsheim, and J. Neter (2004), Applied Linear Regression Models, 4th ed., McGraw-Hill/Irwin, Boston (p. 25)
[23] N. Ravishankar and D. K. Dey (2002), A First Course in Linear Model Theory, Chapman and Hall/CRC, Boca Raton (p. 101)
[24] Steel, R.G.D, and Torrie, J. H., Principles and Procedures of Statistics with Special Reference to the Biological Sciences, McGraw Hill, 1960, page 288.
[25] Chiang, C.L. (2003) Statistical methods of analysis, World Scientific. ISBN 981-238-310-7 – page 274, section 9.7.4 "interpolation vs extrapolation"
[26] Good, P. I.; Hardin, J. W. (2009). Common Errors in Statistics (And How to Avoid Them) (3rd ed.). Hoboken, New Jersey: Wiley. p. 211. ISBN 978-0-470-45798-6.
[27] Tofallis, C. (2009). "Least Squares Percentage Regression". Journal of Modern Applied Statistical Methods. 7: 526–534. doi:10.2139/ssrn.1406472.
[28] YangJing Long (2009). "Human age estimation by metric learning for regression problems" (PDF). Proc. International Conference on Computer Analysis of Images and Patterns: 74–82.


11.12 Further reading


William H. Kruskal and Judith M. Tanur, ed. (1978), "Linear Hypotheses", International Encyclopedia of Statistics. Free Press, v. 1,
Evan J. Williams, "I. Regression", pp. 523–41.
Julian C. Stanley, "II. Analysis of Variance", pp. 541–554.
Lindley, D.V. (1987). "Regression and correlation analysis", New Palgrave: A Dictionary of Economics, v. 4, pp. 120–23.
Birkes, David and Dodge, Y., Alternative Methods of Regression. ISBN 0-471-56881-3
Chatfield, C. (1993) "Calculating Interval Forecasts", Journal of Business and Economic Statistics, 11. pp. 121–135.
Draper, N.R.; Smith, H. (1998). Applied Regression Analysis (3rd ed.). John Wiley. ISBN 0-471-17082-8.
Fox, J. (1997). Applied Regression Analysis, Linear Models and Related Methods. Sage
Härdle, W., Applied Nonparametric Regression (1990), ISBN 0-521-42950-1
Meade, N. and T. Islam (1995) "Prediction Intervals for Growth Curve Forecasts", Journal of Forecasting, 14, pp. 413–430.
A. Sen, M. Srivastava, Regression Analysis – Theory, Methods, and Applications, Springer-Verlag, Berlin, 2011 (4th printing).
T. Strutz: Data Fitting and Uncertainty (A practical introduction to weighted least squares and beyond). Vieweg+Teubner, ISBN 978-3-8348-1022-9.
Malakooti, B. (2013). Operations and Production Systems with Multiple Objectives. John Wiley & Sons.

11.13 External links


Hazewinkel, Michiel, ed. (2001), "Regression analysis", Encyclopedia of Mathematics, Springer, ISBN 978-1-55608-010-4
Earliest Uses: Regression – basic history and references
Regression of Weakly Correlated Data – how linear regression mistakes can appear when the Y-range is much smaller than the X-range

Chapter 12

Multivariate statistics
Multivariate statistics is a subdivision of statistics encompassing the simultaneous observation and analysis of more than one outcome variable. The application of multivariate statistics is multivariate analysis.
Multivariate statistics concerns understanding the different aims and background of each of the different forms of multivariate analysis, and how they relate to each other. The practical implementation of multivariate statistics to a particular problem may involve several types of univariate and multivariate analyses in order to understand the relationships between variables and their relevance to the actual problem being studied.
In addition, multivariate statistics is concerned with multivariate probability distributions, in terms of both
how these can be used to represent the distributions of observed data;
how they can be used as part of statistical inference, particularly where several different quantities are of interest to the same analysis.
Certain types of problem involving multivariate data, for example simple linear regression and multiple regression, are not usually considered as special cases of multivariate statistics because the analysis is dealt with by considering the (univariate) conditional distribution of a single outcome variable given the other variables.

12.1 Types of analysis


There are many different models, each with its own type of analysis:
1. Multivariate analysis of variance (MANOVA) extends the analysis of variance to cover cases where there is more than one dependent variable to be analyzed simultaneously; see also MANCOVA.
2. Multivariate regression attempts to determine a formula that can describe how elements in a vector of variables respond simultaneously to changes in others. For linear relations, regression analyses here are based on forms of the general linear model. Note that multivariate regression is distinct from multivariable regression, which has only one dependent variable.[1]
3. Principal components analysis (PCA) creates a new set of orthogonal variables that contain the same information as the original set. It rotates the axes of variation to give a new set of orthogonal axes, ordered so that they summarize decreasing proportions of the variation (a minimal sketch follows this list).
4. Factor analysis is similar to PCA but allows the user to extract a specified number of synthetic variables, fewer than the original set, leaving the remaining unexplained variation as error. The extracted variables are known as latent variables or factors; each one may be supposed to account for covariation in a group of observed variables.
5. Canonical correlation analysis finds linear relationships among two sets of variables; it is the generalised (i.e. canonical) version of bivariate[2] correlation.

6. Redundancy analysis (RDA) is similar to canonical correlation analysis but allows the user to derive a specified number of synthetic variables from one set of (independent) variables that explain as much variance as possible in another (independent) set. It is a multivariate analogue of regression.
7. Correspondence analysis (CA), or reciprocal averaging, finds (like PCA) a set of synthetic variables that summarise the original set. The underlying model assumes chi-squared dissimilarities among records (cases).
8. Canonical (or constrained) correspondence analysis (CCA) for summarising the joint variation in two sets of variables (like redundancy analysis); a combination of correspondence analysis and multivariate regression analysis. The underlying model assumes chi-squared dissimilarities among records (cases).
9. Multidimensional scaling comprises various algorithms to determine a set of synthetic variables that best represent the pairwise distances between records. The original method is principal coordinates analysis (PCoA; based on PCA).
10. Discriminant analysis, or canonical variate analysis, attempts to establish whether a set of variables can be used to distinguish between two or more groups of cases.
11. Linear discriminant analysis (LDA) computes a linear predictor from two sets of normally distributed data to allow for classification of new observations.
12. Clustering systems assign objects into groups (called clusters) so that objects (cases) from the same cluster are more similar to each other than objects from different clusters.
13. Recursive partitioning creates a decision tree that attempts to correctly classify members of the population based on a dichotomous dependent variable.
14. Artificial neural networks extend regression and clustering methods to non-linear multivariate models.
15. Statistical graphics such as tours, parallel coordinate plots, and scatterplot matrices can be used to explore multivariate data.
16. Simultaneous equations models involve more than one regression equation, with different dependent variables, estimated together.
17. Vector autoregression involves simultaneous regressions of various time series variables on their own and each other's lagged values.
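As a minimal illustration of one of these methods (PCA, item 3 above), the following sketch uses scikit-learn on made-up correlated data; the covariance values are arbitrary:

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(4)
# 200 cases of 3 correlated variables (invented covariance matrix).
data = rng.multivariate_normal(
    mean=[0.0, 0.0, 0.0],
    cov=[[3.0, 1.5, 0.5],
         [1.5, 2.0, 0.3],
         [0.5, 0.3, 1.0]],
    size=200)

pca = PCA()
scores = pca.fit_transform(data)      # the new orthogonal variables
print(pca.explained_variance_ratio_)  # decreasing proportions of variation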

12.2 Important probability distributions


There is a set of probability distributions used in multivariate analyses that play a similar role to the corresponding set of distributions that are used in univariate analysis when the normal distribution is appropriate to a dataset. These multivariate distributions are:
Multivariate normal distribution
Wishart distribution
Multivariate Student-t distribution.
The Inverse-Wishart distribution is important in Bayesian inference, for example in Bayesian multivariate linear regression. Additionally, Hotelling's T-squared distribution is a multivariate distribution, generalising Student's t-distribution, that is used in multivariate hypothesis testing.

12.3 History
Anderson's 1958 textbook, An Introduction to Multivariate Analysis,[3] educated a generation of theorists and applied statisticians; Anderson's book emphasizes hypothesis testing via likelihood ratio tests and the properties of power functions: admissibility, unbiasedness and monotonicity.[4][5]


12.4 Software and tools


There are an enormous number of software packages and other tools for multivariate analysis, including:
High-D
JMP (statistical software)
MiniTab
Calc
PLS_Toolbox / Solo (Eigenvector Research)
PSPP
R: https://cran.r-project.org/web/views/Multivariate.html has details on the packages available for multivariate
data analysis
SAS (software)
SciPy for Python
SPSS
Stata
STATISTICA
TMVA - Toolkit for Multivariate Data Analysis in ROOT
The Unscrambler
SmartPLS - Next Generation Path Modeling
MATLAB
Eviews
Prosensus ProMV
Umetrics SIMCA

12.5 See also


Estimation of covariance matrices
Covariance mapping
Important publications in multivariate analysis
Multivariate testing
Structured data analysis (statistics)
RV coefficient


12.6 References

[1] Hidalgo, B; Goodman, M (2013). "Multivariate or multivariable regression?". Am J Public Health. 103: 39–40. doi:10.2105/AJPH.2012.300897. PMC 3518362. PMID 23153131.
[2] Unsophisticated analysts of bivariate Gaussian problems may find useful a crude but accurate method of accurately gauging probability by simply taking the sum S of the N residuals' squares, subtracting the sum Sm at minimum, dividing this difference by Sm, multiplying the result by (N - 2) and taking the inverse anti-ln of half that product.
[3] T.W. Anderson (1958) An Introduction to Multivariate Analysis, New York: Wiley ISBN 0471026409; 2e (1984) ISBN 0471889873; 3e (2003) ISBN 0471360910
[4] Sen, Pranab Kumar; Anderson, T. W.; Arnold, S. F.; Eaton, M. L.; Giri, N. C.; Gnanadesikan, R.; Kendall, M. G.; Kshirsagar, A. M.; et al. (June 1986). "Review: Contemporary Textbooks on Multivariate Statistical Analysis: A Panoramic Appraisal and Critique". Journal of the American Statistical Association. 81 (394): 560–564. doi:10.2307/2289251. ISSN 0162-1459. JSTOR 2289251. (Pages 560–561)
[5] Schervish, Mark J. (November 1987). "A Review of Multivariate Analysis". Statistical Science. 2 (4): 396–413. doi:10.1214/ss/1177013111. ISSN 0883-4237. JSTOR 2245530.

12.7 Further reading


Johnson, Richard A.; Wichern, Dean W. (2007). Applied Multivariate Statistical Analysis (Sixth ed.). Prentice
Hall. ISBN 978-0-13-187715-3.
KV Mardia; JT Kent; JM Bibby (1979). Multivariate Analysis. Academic Press. ISBN 0-12-471252-5.
A. Sen, M. Srivastava, Regression Analysis – Theory, Methods, and Applications, Springer-Verlag, Berlin, 2011 (4th printing).
Cook, Swayne (2007). Interactive Graphics for Data Analysis.
Malakooti, B. (2013). Operations and Production Systems with Multiple Objectives. John Wiley & Sons.

12.8 External links


Statnotes: Topics in Multivariate Analysis, by G. David Garson
Mike Palmer: The Ordination Web Page
InsightsNow: Makers of ReportsNow, ProfilesNow, and KnowledgeNow

Chapter 13

Data collection

Adélie penguins are identified and weighed each time they cross the automated weighbridge on their way to or from the sea.[1]

Data collection is the process of gathering and measuring information on targeted variables in an established systematic fashion, which then enables one to answer relevant questions and evaluate outcomes. The data collection component of research is common to all fields of study, including the physical and social sciences, humanities and business. While methods vary by discipline, the emphasis on ensuring accurate and honest collection remains the same. The goal for all data collection is to capture quality evidence that then translates to rich data analysis and allows the building of a convincing and credible answer to questions that have been posed.

13.1 Importance
Regardless of the field of study or preference for defining data (quantitative or qualitative), accurate data collection is essential to maintaining the integrity of research. Both the selection of appropriate data collection instruments (existing, modified, or newly developed) and clearly delineated instructions for their correct use reduce the likelihood
of errors occurring.
A formal data collection process is necessary as it ensures that data gathered are both defined and accurate and that subsequent decisions based on arguments embodied in the findings are valid.[2] The process provides both a baseline from which to measure and in certain cases a target on what to improve.

13.2 Types
Generally, there are four types of data collection:
1. Surveys: Standardized paper-and-pencil or phone questionnaires that ask predetermined questions.
2. Interviews: Structured or unstructured one-on-one directed conversations with key individuals or leaders in a
community.
3. Focus groups: Structured interviews with small groups of like individuals using standardized questions, follow-up
questions, and exploration of other topics that arise to better understand participants.
4. Action Research: An intervention that is practicable (the researcher does something to implant a modification or intervention in a situation that is researchable).
Consequences from improperly collected data include:
Inability to answer research questions accurately;
Inability to repeat and validate the study.

13.3 Impact of faulty data


Distorted findings result in wasted resources and can mislead other researchers to pursue fruitless avenues of investigation.
This compromises decisions for public policy.
While the degree of impact from faulty data collection may vary by discipline and the nature of investigation,
there is the potential to cause disproportionate harm when these research results are used to support public policy
recommendations.[3]

13.4 References
[1] Lescroël, A. L.; Ballard, G.; Grémillet, D.; Authier, M.; Ainley, D. G. (2014). Descamps, Sébastien, ed. "Antarctic Climate Change: Extreme Events Disrupt Plastic Phenotypic Response in Adélie Penguins". PLoS ONE. 9 (1): e85291. doi:10.1371/journal.pone.0085291. PMC 3906005. PMID 24489657.
[2] Data Collection and Analysis, by Dr. Roger Sapsford, Victor Jupp. ISBN 0-7619-5046-X
[3] Weimer, J. (ed.) (1995). Research Techniques in Human Engineering. Englewood Cliffs, NJ: Prentice Hall ISBN 0-13-097072-7

13.5 See also


Scientific data archiving
Data management
Experiment
Observational study
Sampling (statistics)
Statistical survey



Survey data collection
Qualitative method
Quantitative method
Quantitative methods in criminology

13.6 External links


Bureau of Statistics, Guyana by Arun Sooknarine

Chapter 14

Time series

Time series: random data plus trend, with best-fit line and different applied filters

A time series is a series of data points listed (or graphed) in time order. Most commonly, a time series is a sequence
taken at successive equally spaced points in time. Thus it is a sequence of discrete-time data. Examples of time series
are heights of ocean tides, counts of sunspots, and the daily closing value of the Dow Jones Industrial Average.
Time series are very frequently plotted via line charts. Time series are used in statistics, signal processing, pattern recognition, econometrics, mathematical finance, weather forecasting, intelligent transport and trajectory forecasting,[1] earthquake prediction, electroencephalography, control engineering, astronomy, communications engineering, and largely in any domain of applied science and engineering which involves temporal measurements.
Time series analysis comprises methods for analyzing time series data in order to extract meaningful statistics and other characteristics of the data. Time series forecasting is the use of a model to predict future values based on previously observed values. While regression analysis is often employed in such a way as to test theories that the current values of one or more independent time series affect the current value of another time series, this type of analysis of time series is not called "time series analysis", which focuses on comparing values of a single time series or multiple dependent time series at different points in time.[2]
Time series data have a natural temporal ordering. This makes time series analysis distinct from cross-sectional studies, in which there is no natural ordering of the observations (e.g. explaining people's wages by reference to their respective education levels, where the individuals' data could be entered in any order). Time series analysis is also distinct from spatial data analysis where the observations typically relate to geographical locations (e.g. accounting for house prices by the location as well as the intrinsic characteristics of the houses). A stochastic model for a time series will generally reflect the fact that observations close together in time will be more closely related than observations further apart. In addition, time series models will often make use of the natural one-way ordering of time so that values for a given period will be expressed as deriving in some way from past values, rather than from future values (see time reversibility).
Time series analysis can be applied to real-valued, continuous data, discrete numeric data, or discrete symbolic data (i.e. sequences of characters, such as letters and words in the English language[3]).

14.1 Methods for time series analyses


Methods for time series analyses may be divided into two classes: frequency-domain methods and time-domain methods. The former include spectral analysis and wavelet analysis; the latter include auto-correlation and cross-correlation analysis. In the time domain, correlation analyses can be made in a filter-like manner using scaled correlation, thereby mitigating the need to operate in the frequency domain.
Additionally, time series analysis techniques may be divided into parametric and non-parametric methods. The
parametric approaches assume that the underlying stationary stochastic process has a certain structure which can be
described using a small number of parameters (for example, using an autoregressive or moving average model). In
these approaches, the task is to estimate the parameters of the model that describes the stochastic process. By contrast,
non-parametric approaches explicitly estimate the covariance or the spectrum of the process without assuming that
the process has any particular structure.
Methods of time series analysis may also be divided into linear and non-linear, and univariate and multivariate.

14.2 Time Series and Panel Data


A time series is one type of panel data. Panel data is the general class, a multidimensional data set, whereas a time series data set is a one-dimensional panel (as is a cross-sectional dataset). A data set may exhibit characteristics of both panel data and time series data. One way to tell is to ask what makes one data record unique from the other records. If the answer is the time data field, then this is a time series data set candidate. If determining a unique record requires a time data field and an additional identifier which is unrelated to time (student ID, stock symbol, country code), then it is a panel data candidate. If the differentiation lies on the non-time identifier, then the data set is a cross-sectional data set candidate.
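A small sketch of this uniqueness test, assuming pandas and an invented two-stock price table:

import pandas as pd

df = pd.DataFrame({
    'date':  ['2020-01-01', '2020-01-01', '2020-01-02', '2020-01-02'],
    'stock': ['AAA', 'BBB', 'AAA', 'BBB'],
    'price': [10.0, 20.0, 10.5, 19.5],
})

# The time field alone does not identify a record, but (date, stock) does,
# so this table is a panel data candidate rather than a single time series.
print(df.duplicated(subset=['date']).any())           # True
print(df.duplicated(subset=['date', 'stock']).any())  # False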

14.3 Analysis
There are several types of motivation and data analysis available for time series which are appropriate for different purposes.

14.3.1 Motivation

In the context of statistics, econometrics, quantitative finance, seismology, meteorology, and geophysics the primary goal of time series analysis is forecasting. In the context of signal processing, control engineering and communication engineering it is used for signal detection and estimation, while in the context of data mining, pattern recognition and machine learning time series analysis can be used for clustering, classification, query by content, anomaly detection as well as forecasting.

14.3.2 Exploratory analysis

Tuberculosis incidence US 1953-2009

Further information: Exploratory analysis


The clearest way to examine a regular time series manually is with a line chart such as the one shown for tuberculosis in the United States, made with a spreadsheet program. The number of cases was standardized to a rate per 100,000 and the percent change per year in this rate was calculated. The nearly steadily dropping line shows that the TB incidence was decreasing in most years, but the percent change in this rate varied by as much as +/- 10%, with surges in 1975 and around the early 1990s. The use of both vertical axes allows the comparison of two time series in one graphic.
Other techniques include:
Autocorrelation analysis to examine serial dependence (a minimal sketch follows this list)
Spectral analysis to examine cyclic behavior which need not be related to seasonality. For example, sun spot
activity varies over 11 year cycles.[4][5] Other common examples include celestial phenomena, weather patterns,
neural activity, commodity prices, and economic activity.
Separation into components representing trend, seasonality, slow and fast variation, and cyclical irregularity:
see trend estimation and decomposition of time series
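A minimal sketch of the autocorrelation computation mentioned in the first item above, using only NumPy on a simulated serially dependent series (all values invented):

import numpy as np

def autocorr(x, max_lag):
    # Sample autocorrelation of a 1-D series for lags 0..max_lag.
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.dot(x, x)
    return np.array([np.dot(x[:len(x) - k], x[k:]) / denom
                     for k in range(max_lag + 1)])

rng = np.random.default_rng(5)
series = np.zeros(500)
for t in range(1, 500):                      # AR(1)-like dependence
    series[t] = 0.8 * series[t - 1] + rng.normal()

print(autocorr(series, 5))   # decays roughly like 0.8**k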

14.3.3 Curve fitting

Main article: Curve fitting


Curve fitting[6][7] is the process of constructing a curve, or mathematical function, that has the best fit to a series of data points,[8] possibly subject to constraints.[9][10] Curve fitting can involve either interpolation,[11][12] where an exact fit to the data is required, or smoothing,[13][14] in which a "smooth" function is constructed that approximately fits the data. A related topic is regression analysis,[15][16] which focuses more on questions of statistical inference such as how much uncertainty is present in a curve that is fit to data observed with random errors. Fitted curves can be used as an aid for data visualization,[17][18] to infer values of a function where no data are available,[19] and to summarize the relationships among two or more variables.[20] Extrapolation refers to the use of a fitted curve beyond the range of the observed data,[21] and is subject to a degree of uncertainty[22] since it may reflect the method used to construct the curve as much as it reflects the observed data.
The construction of economic time series involves the estimation of some components for some dates by interpolation between values ("benchmarks") for earlier and later dates. Interpolation is estimation of an unknown quantity between two known quantities (historical data), or drawing conclusions about missing information from the available information ("reading between the lines").[23] Interpolation is useful where the data surrounding the missing data is available and its trend, seasonality, and longer-term cycles are known. This is often done by using a related series known for all relevant dates.[24] Alternatively, polynomial interpolation or spline interpolation is used where piecewise polynomial functions are fit into time intervals such that they fit smoothly together. A different problem which is closely related to interpolation is the approximation of a complicated function by a simple function (also called regression). The main difference between regression and interpolation is that polynomial regression gives a single polynomial that models the entire data set. Spline interpolation, however, yields a piecewise continuous function composed of many polynomials to model the data set.
Extrapolation is the process of estimating, beyond the original observation range, the value of a variable on the basis of its relationship with another variable. It is similar to interpolation, which produces estimates between known observations, but extrapolation is subject to greater uncertainty and a higher risk of producing meaningless results.
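The contrast between a single global polynomial and piecewise spline interpolation can be sketched as follows, assuming NumPy and SciPy; the data points are made up:

import numpy as np
from scipy.interpolate import CubicSpline

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.0, 0.8, 0.9, 0.1, -0.8, -1.0])

# Polynomial regression: one cubic fitted to the whole data set,
# which generally does not pass through every point.
poly = np.poly1d(np.polyfit(x, y, deg=3))

# Spline interpolation: piecewise cubics that pass through every
# point and join smoothly at the knots.
spline = CubicSpline(x, y)

x_fine = np.linspace(0.0, 5.0, 11)
print(poly(x_fine))
print(spline(x_fine))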

14.3.4 Function Approximation

Main article: Function approximation


In general, a function approximation problem asks us to select a function among a well-defined class that closely matches ("approximates") a target function in a task-specific way. One can distinguish two major classes of function approximation problems: First, for known target functions, approximation theory is the branch of numerical analysis that investigates how certain known functions (for example, special functions) can be approximated by a specific class of functions (for example, polynomials or rational functions) that often have desirable properties (inexpensive computation, continuity, integral and limit values, etc.).
Second, the target function, call it g, may be unknown; instead of an explicit formula, only a set of points (a time series) of the form (x, g(x)) is provided. Depending on the structure of the domain and codomain of g, several techniques for approximating g may be applicable. For example, if g is an operation on the real numbers, techniques of interpolation, extrapolation, regression analysis, and curve fitting can be used. If the codomain (range or target set) of g is a finite set, one is dealing with a classification problem instead. A related problem of online time series approximation[25] is to summarize the data in one pass and construct an approximate representation that can support a variety of time series queries with bounds on worst-case error.
To some extent the different problems (regression, classification, fitness approximation) have received a unified treatment in statistical learning theory, where they are viewed as supervised learning problems.

14.3.5 Prediction and forecasting

In statistics, prediction is a part of statistical inference. One particular approach to such inference is known as
predictive inference, but the prediction can be undertaken within any of the several approaches to statistical inference. Indeed, one description of statistics is that it provides a means of transferring knowledge about a sample of a
population to the whole population, and to other related populations, which is not necessarily the same as prediction
over time. When information is transferred across time, often to specic points in time, the process is known as
forecasting.
Fully formed statistical models for stochastic simulation purposes, so as to generate alternative versions of the
time series, representing what might happen over non-specific time-periods in the future
Simple or fully formed statistical models to describe the likely outcome of the time series in the immediate
future, given knowledge of the most recent outcomes (forecasting).
Forecasting on time series is usually done using automated statistical software packages and programming
languages, such as R, S, SAS, SPSS, Minitab, Pandas (Python) and many others.

14.3.6 Classification

Main article: Statistical classication


Assigning a time series pattern to a specific category, for example identifying a word based on a series of hand movements in sign language.

14.3.7 Regression analysis

Main article: Regression analysis


Estimating future value of a signal based on its previous behavior, e.g. predict the price of AAPL stock based on its
previous price movements for that hour, day or month, or predict position of Apollo 11 spacecraft at a certain future
moment based on its current trajectory (i.e. time series of its previous locations).[26] Regression analysis is usually based on statistical interpretation of time series properties in the time domain, pioneered by statisticians George Box and Gwilym Jenkins in the 1950s: see Box–Jenkins.

14.3.8 Signal estimation

See also: Signal processing and Estimation theory


This approach is based on harmonic analysis and filtering of signals in the frequency domain using the Fourier transform, and spectral density estimation, the development of which was significantly accelerated during World War II by mathematician Norbert Wiener, electrical engineers Rudolf E. Kálmán, Dennis Gabor and others for filtering signals from noise and predicting signal values at a certain point in time. See Kalman filter, Estimation theory, and Digital signal processing.

14.3.9 Segmentation

Main article: Time-series segmentation


Splitting a time-series into a sequence of segments. It is often the case that a time-series can be represented as a
sequence of individual segments, each with its own characteristic properties. For example, the audio signal from a
conference call can be partitioned into pieces corresponding to the times during which each person was speaking. In
time-series segmentation, the goal is to identify the segment boundary points in the time-series, and to characterize
the dynamical properties associated with each segment. One can approach this problem using change-point detection,
or by modeling the time-series as a more sophisticated system, such as a Markov jump linear system.

14.4 Models
Models for time series data can have many forms and represent different stochastic processes. When modeling variations in the level of a process, three broad classes of practical importance are the autoregressive (AR) models, the integrated (I) models, and the moving average (MA) models. These three classes depend linearly on previous data points.[27] Combinations of these ideas produce autoregressive moving average (ARMA) and autoregressive integrated moving average (ARIMA) models. The autoregressive fractionally integrated moving average (ARFIMA) model generalizes the former three. Extensions of these classes to deal with vector-valued data are available under the heading of multivariate time-series models and sometimes the preceding acronyms are extended by including an initial "V" for "vector", as in VAR for vector autoregression. An additional set of extensions of these models is available for use where the observed time-series is driven by some "forcing" time-series (which may not have a causal effect on the observed series): the distinction from the multivariate case is that the forcing series may be deterministic or under the experimenter's control. For these models, the acronyms are extended with a final "X" for "exogenous".


Non-linear dependence of the level of a series on previous data points is of interest, partly because of the possibility
of producing a chaotic time series. However, more importantly, empirical investigations can indicate the advantage
of using predictions derived from non-linear models, over those from linear models, as for example in nonlinear
autoregressive exogenous models. Further references on nonlinear time series analysis: (Kantz and Schreiber),[28]
and (Abarbanel) [29]
Among other types of non-linear time series models, there are models to represent the changes of variance over
time (heteroskedasticity). These models represent autoregressive conditional heteroskedasticity (ARCH) and the
collection comprises a wide variety of representation (GARCH, TARCH, EGARCH, FIGARCH, CGARCH, etc.).
Here changes in variability are related to, or predicted by, recent past values of the observed series. This is in contrast
to other possible representations of locally varying variability, where the variability might be modelled as being driven
by a separate time-varying process, as in a doubly stochastic model.
In recent work on model-free analyses, wavelet transform based methods (for example locally stationary wavelets and
wavelet decomposed neural networks) have gained favor. Multiscale (often referred to as multiresolution) techniques
decompose a given time series, attempting to illustrate time dependence at multiple scales. See also Markov switching
multifractal (MSMF) techniques for modeling volatility evolution.
A Hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is assumed to be
a Markov process with unobserved (hidden) states. An HMM can be considered as the simplest dynamic Bayesian
network. HMM models are widely used in speech recognition, for translating a time series of spoken words into text.

14.4.1 Notation

A number of different notations are in use for time-series analysis. A common notation specifying a time series X that is indexed by the natural numbers is written

X = \{X_1, X_2, \dots\}.

Another common notation is

Y = \{Y_t : t \in T\},

where T is the index set.

14.4.2 Conditions

There are two sets of conditions under which much of the theory is built:
Stationary process
Ergodic process
However, ideas of stationarity must be expanded to consider two important ideas: strict stationarity and second-order stationarity. Both models and applications can be developed under each of these conditions, although the models in the latter case might be considered as only partly specified.
In addition, time-series analysis can be applied where the series are seasonally stationary or non-stationary. Situations where the amplitudes of frequency components change with time can be dealt with in time-frequency analysis which makes use of a time–frequency representation of a time-series or signal.[30]

14.4.3 Models

Main article: Autoregressive model


The general representation of an autoregressive model, well known as AR(p), is


Y_t = \varphi_0 + \varphi_1 Y_{t-1} + \varphi_2 Y_{t-2} + \cdots + \varphi_p Y_{t-p} + \varepsilon_t

where the term \varepsilon_t is the source of randomness and is called white noise. It is assumed to have the following characteristics:

E[\varepsilon_t] = 0, \qquad E[\varepsilon_t^2] = \sigma^2, \qquad E[\varepsilon_t \varepsilon_s] = 0 \text{ for } t \neq s.

With these assumptions, the process is specified up to second-order moments and, subject to conditions on the coefficients, may be second-order stationary.
If the noise also has a normal distribution, it is called normal or Gaussian white noise. In this case, the AR process may be strictly stationary, again subject to conditions on the coefficients.
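A minimal simulation of such an AR process with Gaussian white noise, assuming NumPy; the coefficient values are arbitrary illustrative choices:

import numpy as np

rng = np.random.default_rng(6)
phi = [0.0, 0.6, 0.3]   # phi_0, phi_1, phi_2 for an AR(2) example
n = 1000

eps = rng.normal(0.0, 1.0, size=n)   # Gaussian white noise
y = np.zeros(n)
for t in range(2, n):
    y[t] = phi[0] + phi[1] * y[t - 1] + phi[2] * y[t - 2] + eps[t]
# These coefficients satisfy the usual AR(2) stationarity conditions,
# so the simulated series settles into stationary behavior.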
Tools for investigating time-series data include:
Consideration of the autocorrelation function and the spectral density function (also cross-correlation functions
and cross-spectral density functions)
Scaled cross- and auto-correlation functions to remove contributions of slow components[31]
Performing a Fourier transform to investigate the series in the frequency domain
Use of a filter to remove unwanted noise
Principal component analysis (or empirical orthogonal function analysis)
Singular spectrum analysis
Structural models:
General State Space Models
Unobserved Components Models
Machine Learning
Articial neural networks
Support Vector Machine
Fuzzy Logic
Gaussian Processes
Hidden Markov model
Queueing Theory Analysis
Control chart
Shewhart individuals control chart
CUSUM chart
EWMA chart
Detrended uctuation analysis
Dynamic time warping[32]
Cross-correlation[33]
Dynamic Bayesian network



Time-frequency analysis techniques:
Fast Fourier Transform
Continuous wavelet transform
Short-time Fourier transform
Chirplet transform
Fractional Fourier transform
Chaotic analysis
Correlation dimension
Recurrence plots
Recurrence quantification analysis
Lyapunov exponents
Entropy encoding

14.4.4 Measures

Time series metrics or features that can be used for time series classification or regression analysis:[34]
Univariate linear measures
Moment (mathematics)
Spectral band power
Spectral edge frequency
Accumulated Energy (signal processing)
Characteristics of the autocorrelation function
Hjorth parameters
FFT parameters
Autoregressive model parameters
Mann–Kendall test
Univariate non-linear measures
Measures based on the correlation sum
Correlation dimension
Correlation integral
Correlation density
Correlation entropy
Approximate entropy[35]
Sample entropy
Fourier entropy
Wavelet entropy
Rényi entropy
Higher-order methods
Marginal predictability
Dynamical similarity index
State space dissimilarity measures
Lyapunov exponent
Permutation methods


Local flow
Other univariate measures
Algorithmic complexity
Kolmogorov complexity estimates
Hidden Markov Model states
Surrogate time series and surrogate correction
Loss of recurrence (degree of non-stationarity)
Bivariate linear measures
Maximum linear cross-correlation
Linear Coherence (signal processing)
Bivariate non-linear measures
Non-linear interdependence
Dynamical Entrainment (physics)
Measures for Phase synchronization
Measures for Phase locking
Similarity measures:[36]
Cross-correlation
Dynamic Time Warping[32]
Hidden Markov Models
Edit distance
Total correlation
Newey–West estimator
Prais–Winsten transformation
Data as Vectors in a Metrizable Space
Minkowski distance
Mahalanobis distance
Data as Time Series with Envelopes
Global Standard Deviation
Local Standard Deviation
Windowed Standard Deviation
Data Interpreted as Stochastic Series
Pearson product-moment correlation coefficient
Spearman's rank correlation coefficient
Data Interpreted as a Probability Distribution Function
Kolmogorov–Smirnov test
Cramér–von Mises criterion

14.5 Visualization
Time series can be visualized with two categories of chart: overlapping charts and separated charts. Overlapping charts display all time series on the same layout, while separated charts present them on different layouts (but aligned for comparison purposes).[37]

14.5.1 Overlapping Charts

Braided Graphs
Line Charts
Slope Graphs
GapChart

14.5.2 Separated Charts

Horizon Graphs
Reduced Line Charts (small multiples)
Silhouette Graph
Circular Silhouette Graph

14.6 Applications
Fractal geometry, using a deterministic Cantor structure, is used to model the surface topography, where recent advancements in thermoviscoelastic creep contact of rough surfaces are introduced. Various viscoelastic idealizations are used to model the surface materials, for example, Maxwell, Kelvin–Voigt, Standard Linear Solid and Jeffrey media. Asymptotic power laws, through hypergeometric series, were used to express the surface creep as a function of remote forces, body temperatures and time.[38]

14.7 Software
Working with time series data is a relatively common use for statistical analysis software. As a result, there are many offerings, both commercial and open source. Some examples include:
CRAN supplementary statistics package for R[39]
Analysis and Forecasting with Weka[40]
Predictive modeling with GMDH Shell[41]
Functions and Modeling in the Wolfram Language[42]
Time Series Objects in MATLAB[43]
SAS/ETS in SAS software[44]
Expert Modeler in IBM SPSS Statistics and IBM SPSS Modeler

14.8 See also


Anomaly time series
Chirp
Decomposition of time series
Detrended fluctuation analysis
Digital signal processing


Distributed lag
Estimation theory
Forecasting
Hurst exponent
Monte Carlo method
Random walk
Scaled correlation
Seasonal adjustment
Sequence analysis
Signal processing
Trend estimation
Unevenly spaced time series
Time series database

14.9 References
[1] Zissis, Dimitrios; Xidias, Elias; Lekkas, Dimitrios (2015). "Real-time vessel behavior prediction". Evolving Systems. 7: 1–12. doi:10.1007/s12530-015-9133-5.
[2] Imdadullah. Time Series Analysis. Basic Statistics and Data Analysis. itfeature.com. Retrieved 2 January 2014.
[3] Lin, Jessica; Keogh, Eamonn; Lonardi, Stefano; Chiu, Bill (2003). A symbolic representation of time series, with implications for streaming algorithms. Proceedings of the 8th ACM SIGMOD workshop on Research issues in data mining and
knowledge discovery. New York: ACM Press. doi:10.1145/882082.882086.
[4] Bloomfield, P. (1976). Fourier analysis of time series: An introduction. New York: Wiley. ISBN 0471082562.
[5] Shumway, R. H. (1988). Applied statistical time series analysis. Englewood Cliffs, NJ: Prentice Hall. ISBN 0130415006.
[6] Sandra Lach Arlinghaus, PHB Practical Handbook of Curve Fitting. CRC Press, 1994.
[7] William M. Kolb. Curve Fitting for Programmable Calculators. Syntec, Incorporated, 1984.
[8] S.S. Halli, K.V. Rao. 1992. Advanced Techniques of Population Analysis. ISBN 0306439972 Page 165 (cf. "... functions are fulfilled if we have a good to moderate fit for the observed data.")
[9] The Signal and the Noise: Why So Many Predictions Fail-but Some Don't. By Nate Silver
[10] Data Preparation for Data Mining: Text. By Dorian Pyle.
[11] Numerical Methods in Engineering with MATLAB. By Jaan Kiusalaas. Page 24.
[12] Numerical Methods in Engineering with Python 3. By Jaan Kiusalaas. Page 21.
[13] Numerical Methods of Curve Fitting. By P. G. Guest, Philip George Guest. Page 349.
[14] See also: Mollier
[15] Fitting Models to Biological Data Using Linear and Nonlinear Regression. By Harvey Motulsky, Arthur Christopoulos.
[16] Regression Analysis By Rudolf J. Freund, William J. Wilson, Ping Sa. Page 269.
[17] Visual Informatics. Edited by Halimah Badioze Zaman, Peter Robinson, Maria Petrou, Patrick Olivier, Heiko Schröder. Page 689.
[18] Numerical Methods for Nonlinear Engineering Models. By John R. Hauser. Page 227.
[19] Methods of Experimental Physics: Spectroscopy, Volume 13, Part 1. By Claire Marton. Page 150.


[20] Encyclopedia of Research Design, Volume 1. Edited by Neil J. Salkind. Page 266.
[21] Community Analysis and Planning Techniques. By Richard E. Klosterman. Page 1.
[22] An Introduction to Risk and Uncertainty in the Evaluation of Environmental Investments. DIANE Publishing. Pg 69
[23] Hamming, Richard. Numerical methods for scientists and engineers. Courier Corporation, 2012.
[24] Friedman, Milton. The interpolation of time series by related series. Journal of the American Statistical Association
57.300 (1962): 729-757.
[25] Gandhi, Sorabh, Luca Foschini, and Subhash Suri. "Space-efficient online approximation of time series data: Streams, amnesia, and out-of-order". Data Engineering (ICDE), 2010 IEEE 26th International Conference on. IEEE, 2010.
[26] Lawson, Charles L.; Hanson, Richard J. (1995). Solving Least Squares Problems. Philadelphia: Society for Industrial and
Applied Mathematics. ISBN 0898713560.
[27] Gershenfeld, N. (1999). The Nature of Mathematical Modeling. New York: Cambridge University Press. pp. 205–208. ISBN 0521570956.
[28] Kantz, Holger; Thomas, Schreiber (2004). Nonlinear Time Series Analysis. London: Cambridge University Press. ISBN
978-0521529020.
[29] Abarbanel, Henry (Nov 25, 1997). Analysis of Observed Chaotic Data. New York: Springer. ISBN 978-0387983721.
[30] Boashash, B. (ed.), (2003) Time-Frequency Signal Analysis and Processing: A Comprehensive Reference, Elsevier Science, Oxford, 2003. ISBN 0-08-044335-4
[31] Nikolić, D.; Muresan, R. C.; Feng, W.; Singer, W. (2012). "Scaled correlation analysis: a better way to compute a cross-correlogram". European Journal of Neuroscience. 35 (5): 742–762. doi:10.1111/j.1460-9568.2011.07987.x.
[32] Sakoe, Hiroaki; Chiba, Seibi (1978). Dynamic programming algorithm optimization for spoken word recognition. IEEE
Transactions on Acoustics, Speech and Signal Processing. doi:10.1109/TASSP.1978.1163055.
[33] Goutte, Cyril; Toft, Peter; Rostrup, Egill; Nielsen, Finn Å.; Hansen, Lars Kai (1999). "On Clustering fMRI Time Series". NeuroImage. doi:10.1006/nimg.1998.0391.
[34] Mormann, Florian; Andrzejak, Ralph G.; Elger, Christian E.; Lehnertz, Klaus (2007). "Seizure prediction: the long and winding road". Brain. 130 (2): 314–333. doi:10.1093/brain/awl241. PMID 17008335.
[35] Land, Bruce; Elias, Damian. Measuring the 'Complexity' of a time series.
[36] Ropella, G. E. P.; Nag, D. A.; Hunt, C. A. (2003). "Similarity measures for automated comparison of in silico and in vitro experimental results". Engineering in Medicine and Biology Society. 3: 2933–2936. doi:10.1109/IEMBS.2003.1280532.
[37] Tominski, Christian; Aigner, Wolfgang. "The TimeViz Browser: A Visual Survey of Visualization Techniques for Time-Oriented Data". Retrieved 1 June 2014.
[38] Osama Abuzeid, Anas Al-Rabadi, Hashem Alkhaldi. "Recent advancements in fractal geometric-based nonlinear time series solutions to the micro-quasistatic thermoviscoelastic creep for rough surfaces in contact", Mathematical Problems in Engineering, Volume 2011, Article ID 691270.
[39] Hyndman, Rob J (2016-01-22). CRAN Task View: Time Series Analysis.
[40] Time Series Analysis and Forecasting with Weka - Pentaho Data Mining - Pentaho Wiki. wiki.pentaho.com. Retrieved
2016-07-07.
[41] Time Series Analysis & Forecasting Software 2016 [Free Download]". Retrieved 2016-07-07.
"Time Series – Wolfram Language Documentation". reference.wolfram.com. Retrieved 2016-07-07.
[43] Time Series Objects - MATLAB & Simulink. www.mathworks.com. Retrieved 2016-07-07.
[44] Econometrics and Time Series Analysis, SAS/ETS Software. Retrieved 2016-07-07.


14.10 Further reading


Box, George; Jenkins, Gwilym (1976), Time Series Analysis: forecasting and control, rev. ed., Oakland, California: Holden-Day
Cowpertwait P.S.P., Metcalfe A.V. (2009), Introductory Time Series with R, Springer.
Durbin J., Koopman S.J. (2001), Time Series Analysis by State Space Methods, Oxford University Press.
Gershenfeld, Neil (2000), The Nature of Mathematical Modeling, Cambridge University Press, ISBN 978-0-521-57095-4, OCLC 174825352
Hamilton, James (1994), Time Series Analysis, Princeton University Press, ISBN 0-691-04289-6
Priestley, M. B. (1981), Spectral Analysis and Time Series, Academic Press. ISBN 978-0-12-564901-8
Shasha, D. (2004), High Performance Discovery in Time Series, Springer, ISBN 0-387-00857-8
Shumway R. H., Stoffer (2011), Time Series Analysis and its Applications, Springer.
Weigend A. S., Gershenfeld N. A. (Eds.) (1994), Time Series Prediction: Forecasting the Future and Understanding the Past. Proceedings of the NATO Advanced Research Workshop on Comparative Time Series
Analysis (Santa Fe, May 1992), Addison-Wesley.
Wiener, N. (1949), Extrapolation, Interpolation, and Smoothing of Stationary Time Series, MIT Press.
Woodward, W. A., Gray, H. L. & Elliott, A. C. (2012), Applied Time Series Analysis, CRC Press.

14.11 External links


Time series at Encyclopaedia of Mathematics.
A First Course on Time Series Analysis – an open source book on time series analysis with SAS.
Introduction to Time Series Analysis (Engineering Statistics Handbook) – a practical guide to time series analysis.
MATLAB Toolkit for Computation of Multiple Measures on Time Series Data Bases.
A Matlab tutorial on power spectra, wavelet analysis, and coherence on website with many other tutorials.
TimeViz survey
Gaussian Processes for Machine Learning: Book webpage
CRAN Time Series Task View - Time Series in R
TimeSeries Analysis with Pandas


14.12 Text and image sources, contributors, and licenses


14.12.1

Text

Statistics Source: https://en.wikipedia.org/wiki/Statistics?oldid=736871217 Contributors: Brion VIBBER, Mav, The Anome, Tarquin,
Stephen Gilbert, Ap, Larry Sanger, Eclecticology, Saikat, Youssefsan, Christian List, Enchanter, Miguel~enwiki, SimonP, Peterlin~enwiki,
Ben-Zin~enwiki, Hefaistos, Waveguy, Heron, Rsabbatini, Camembert, Marekan, Olivier, Stevertigo, Edward, Boud, Michael Hardy,
GABaker, Fred Bauder, Lexor, Nixdorf, Shyamal, Kku, Tannin, Dcljr, Tomi, CesarB, Looxix~enwiki, Ahoerstemeier, DavidWBrooks,
Ronz, BevRowe, Snoyes, Salsa Shark, Netsnipe, Big iron, Jtzg, Cherkash, Samuel~enwiki, Mxn, Schneelocke, Hike395, Guaka, Vanished user 5zariu3jisj0j4irj, Wikiborg, Dysprosia, Jitse Niesen, Quux, Jake Nelson, Maximus Rex, Wakka, Wernher, Optim, Rbellin,
Secretlondon, Noeckel, Phil Boswell, Robbot, Jakohn, Benwing, ZimZalaBim, Gandalf61, Tim Ivorson, RossA, Henrygb, Hemanshu,
Gidonb, Borislav, Ianml, Roozbeh, Dhodges, SoLando, Wile E. Heresiarch, Cutler, Dave6, Aomarks, Ancheta Wis, Matthew Stannard,
Tophcito, Giftlite, Sj, Wikilibrarian, Netoholic, Lethe, Tom harrison, Meursault2004, Everyking, Maha ts, Curps, Dmb000006, Muzzle,
Jfdwol, BrendanH, Maarten van Vliet, Guanaco, Skagedal, Eequor, Mdb~enwiki, SWAdair, Brazuca, Hereticam, Andycjp, Mats Kindahl, Antandrus, MarkSweep, Piotrus, Ampre, L353a1, Sean Heron, CSTAR, APH, Oneiros, Gsociology, PFHLai, Bodnotbod, Mysidia,
Icairns, Simoneau, Sam Hocevar, Jeremykemp, Howardjp, Divadrax, Zondor, Bluemask, Drchris, Richardelainechambers, Moverton,
Discospinster, Rich Farmbrough, Guanabot, Michal Jurosz, IlyaHaykinson, Paul August, Bender235, Kbh3rd, Brian0918, El C, Lycurgus, Zenohockey, Art LaPella, RoyBoy, 2005, Bobo192, Janna Isabot, O18, Gianlu~enwiki, Smalljim, Maurreen, 3mta3, Minghong, Mdd,
Passw0rd, Drf5n, Schissel, Jigen III, Msh210, Alansohn, Gary, Anthony Appleyard, Mduvekot, Kanie, Rgclegg, Avenue, Evil Monkey,
Oleg Alexandrov, AustinZ, Waabu, Linas, Karnesky, LOL, Before My Ken, WadeSimMiser, Acerperi, Wikiklrsc, Sengkang, BlaiseFEgan, Gimboid13, Mr Anthem, Marudubshinki, RichardWeiss, Graham87, Ilya, Galwhaa, Chun-hian, FreplySpang, Dragoneye776, Dpr,
Tlroche, Jorunn, Koolkao, Rjwilmsi, Mayumashu, Pleiotrop3, Amire80, Carbonite, Salix alba, Jb-adder, Willetjo, Crazynas, Jemcneill, Zero0w, FlaBot, Chocolatier, RobertG, Windchaser, Dibowen5, Latka, Mathbot, Airumel, Nivix, Celestianpower, RexNL, Gurch,
AndriuZ, Pete.Hurd, Mathieumcguire, Shaile, Malhonen, BradBeattie, CiaPan, Chobot, Nagytibi, DVdm, Bgwhite, Simesa, Adoniscik,
Gwernol, Wavelength, Phantomsteve, Loom91, Cswrye, Epolk, Donwarnersaklad, Hydrargyrum, Stephenb, Manop, Chaos, NawlinWiki, Wiki alf, Grafen, Tailpig, ONEder Boy, TCrossland, Johndarrington, Isolani, D. Wu, Alex43223, BOT-Superzerocool, Mgnbar, Tigershrike, Saric, Closedmouth, Terfgiu, Modify, Beaker342, GraemeL, AGToth, Whouk, NeilN, DVD R W, Sardanaphalus,
Veinor, JJL, SmackBot, YellowMonkey, Twerges, Unschool, Honza Zruba, Stux, Hydrogen Iodide, McGeddon, Mscuthbert, CommodiCast, Timotheus Canens, Dhochron, Gilliam, Brotherbobby, Skizzik, ERcheck, Chris the speller, Bychan~enwiki, Bluebot, Keegan,
Jjalexand, DocKrin, Wikisamh, Silly rabbit, Ekalin, RayAYang, Deli nk, Klnorman, Dlohcierekims sock, Robth, Zven, John Reaves,
Scwlong, Chendy, SLC1, Iwaterpolo, PierreAnoid, Can't sleep, clown will eat me, DRahier, Asarko, Hve, Terry Oldberg, Addshore,
Kcordina, Amazins490, Mosca, SundarBot, UU, Jmlk17, Aldaron, ConMan, Valenciano, Krexer, Chadmbol, Richard001, Nrcprm2026,
Mini-Geek, G716, Photoleif, GumbyProf, Fschoonj, Wybot, Zeamays, SashatoBot, Lambiam, Arodb, Derek farn, Harryboyles, Chocolateluvr88, Sina2, Archimerged, Kuru, MagnaMopus, Lapaz, Soumyasch, Tim bates, SpyMagician, Deviathan~enwiki, Ckatz, RandomCritic, 16@r, Beetstra, Santa Sangre, Daphne A, Mets501, Spiel496, Ctacmo, RichardF, Roderickmunro, Hu12, Levineps, BranStark,
Joseph Solis in Australia, Wjejskenewr, Mangesh.dashpute, Chris53516, Igoldste, Tawkerbot2, Daniel5127, Filelakeshoe, Kevin Murray, Kendroche, JForget, Robertdamron, CRGreathouse, CmdrObot, Dycedarg, Philiprbrenan, Dexter inside, Requestion, MarsRover,
Neelix, Hingenivrutti, Penbat, Nnp, Art10, MrFish, Myasuda, Mct mht, Slack---line, Mjhoy~enwiki, Arauzo, Ramitmahajan, Gogo
Dodo, Jkokavec, Anonymi, Bornsommer, Odie5533, Christian75, DumbBOT, Richard416282, Englishnerd, Optimist on the run, Lindsay658, Finn krogstad, FrancoGG, Mattisse, Talgalili, Sarvesh85@gmail.com, Epbr123, Jrl306, LeeG, Jsejcksn, Willworkforicecream,
N5iln, Marek69, John254, Escarbot, Dainis, Mentisto, Wikiwilly~enwiki, AntiVandalBot, Luna Santin, Seaphoto, Memset, Zappernapper, Mack2, Sbarnard, Gkhan, Golgofrinchian, MikeLynch, JAnDbot, Ldc, Markbold, The Transhumanist, Db099221, BenB4,
PhilKnight, IamHope, SiobhanHansa, Magioladitis, Bongwarrior, VoABot II, Jeff Dahl, JamesBWatson, Hubbardaie, Ranger2006, Trugster, Skew-t, Recurring dreams, Ddr~enwiki, Caesarjbsquitti, Avicennasis, Nevvers, KConWiki, Catgut, Animum, Depressedrobot,
Johnbibby, Robotman1974, Boob, Bobby H. Heey, Xerxes minor, JoergenB, DerHexer, JaGa, Khalid Mahmood, AllenDowney,
Apdevries, Pax:Vobiscum, Gjd001, Rustyfence, Cli smith, MartinBot, Vigyani, BetBot~enwiki, Jim.henderson, R'n'B, Lilac Soul,
Mausy5043, J.delanoy, Trusilver, Rlsheehan, Numbo3, Mthibault, Ulyssesmsu, Yannick56, TheSeven, Cpiral, Gzkn, M C Y 1008,
Luntertun, It Is Me Here, Noschool3, Ronny Gunnarsson, Macrolizard, Bmilicevic, HiLo48, The Transhumanist (AWB), KylieTastic,
Kenneth M Burke, DavidCBryant, Tiggerjay, Afv2006, HyDeckar, WinterSpw, Ron shelf, Tanyawade, Idioma-bot, Funandtrvl, Wikieditor06, Lights, VolkovBot, DrMicro, ABF, JohnBlackburne, Paxcoder, Jimmaths, Barneca, Philip Trueman, DoorsAjar, TXiKiBoT,
Ranajeet, Jacob Lundberg, Wikipediatoperfection, Tomsega, ElinorD, Qxz, Arpabr, The Tetrast, Seanstock, Jackfork, Christopher Connor, Onore Baka Sama, Manik762007, Careercornerstone, Wikidan829, Richard redfern, Skarz, Dmcq, Symane, EmxBot, Kolmorogo,
Demmy, Thefellswooper, SieBot, BotMultichill, Katonal, Triwbe, Toddst1, Flyer22 Reborn, Tiptoety, JD554, Ireas, Jt512, Free Software Knight, Strife911, Oxymoron83, Faradayplank, Boromir123, Hinaaa, BenoniBot~enwiki, Emesee, OKBot, Msrasnw, Melcombe,
Yhkhoo, Nn123645, Superbeecat, Digisus, Richard David Ramsey, Escape Orbit, Maniac2910, Tautologist, XDanielx, WikipedianMarlith, ClueBot, Rumping, Fyyer, John ellenberger, DesertAngel, Gaia Octavia Agrippa, Giusippe, Turbojet, Uncle Milty, Niceguyedc,
LizardJr8, Morten Mnchow, Chickenman78, Lbertolotti, DragonBot, Pumpmeup, Jusdafax, Three-quarter-ten, Rwilli13, Adamjslund,
Livius3, Stathope17, Notteln, Precanalytics, Diaa abdelmoneim, Dekisugi, Gundersen53, BOTarate, Aitias, ShawnAGaddy, Dbenzvi, JDPhD, FinnMan, Qwfp, Antonwg, Ano-User, GKantaris, Editorofthewiki, Helixweb, XLinkBot, Avoided, WikHead, Alexius08, Tayste,
Addbot, Proofreader77, Hgberman, DOI bot, Captain-tucker, Atethnekos, Fgnievinski, Johnjohn83, Kwanesum, Br1z, Bte99, CanadianLinuxUser, MrOllie, Chamal N, Glane23, Delaszk, Glass Sword, Debresser, Favonian, Quercus solaris, Aitambong, Ssschhh, Tide
rolls, Lightbot, Kiril Simeonovski, Teles, MuZemike, TeH nOmInAtOr, LuK3, Megaman en m, Nbeltz, Jim, Luckas-bot, Yobot, Notizy1251, OrgasGirl, Fraggle81, Vimalp, DisillusionedBitterAndKnackered, Mathinik, Gobbleswoggler, THEN WHO WAS PHONE?,
ECEstats, Brougham96, Mhmolitor, AnomieBOT, DemocraticLuntz, VX, Jim1138, Cavarrone, Galoubet, Dwayne, Piano non troppo,
Youkbam, Templatehater, Walter Grassroot, Htim, Materialscientist, The High Fin Sperm Whale, Citation bot, Jtamad, OllieFury, Markmagdy, Sweeraha, GB fan, Apollo, Neurolysis, ArthurBot, Herreradavid33, LilHelpa, Xqbot, TinucherianBot II, Class ruiner, Kenz0402,
Drilnoth, Fishiface, Locos epraix, Spetzznaz, AbigailAbernathy, Clear range, Coretheapple, GrouchoBot, Ute in DC, SassoBot, Loizbec,
78.26, Rstatx, Stynyr, Doulos Christos, Chen-Pan Liao, N.j.hansen, Shadowjams, Joaquin008, Brennan41292, FrescoBot, Tobby72,
Hallway916, Shadowpsi, HJ Mitchell, Winterswift, Citation bot 1, PrBeacon, Boxplot, Yuanfangdelang, Pinethicket, Kiefer.Wolfowitz,
Stpasha, Brian Everlasting, le ottante, Bwana2009, Dee539, Florendobe, White Shadows, Gamewizard71, FoxBot, Mjs1991, Ruzihm,
TobeBot, LAUD, Arfgab, Decstop, MrX, Spegali, Keepitup.sid, Sourishdas, Tbhotch, Drivi86, Sandman888, DARTH SIDIOUS 2, Chrisrayner, Whisky drinker, Mean as custard, Updatehelper, TjBot, Kastchei, Karlheinz037, Becritical, Elitropia, Jordan.brayanov, EmausBot, Orphan Wiki, Gfoley4, Racerx11, Hiamy, Tommy2010, Kellylautt, Dcirovic, Tuxedo junction, Bae88, Daonguyen95, F, Josve05a,
Bollyje, Tastewrong1234, WeijiBaikeBianji, Cbratsas, JA(000)Davidson, Access Denied, Dylthaavatar, Kgwet, SporkBot, Jorjulio,
GrindtXX, Makecat, Sak11sl, Future ahead, Anglais1, Sunur7, Mr. Kenan Bek, Noodleki, Donner60, Agatecat2700, NTox, DemonicPartyHat, 28bot, Petrb, ClueBot NG, MelbourneStar, This lousy T-shirt, Chrisminter, Dvsbmx, BarrelProof, Bped1985, Andreas.Persson,
Shawnluft, Cntras, Braincricket, ScottSteiner, Widr, Hikenstu, Ryan Vesey, Amircrypto, Helpful Pixie Bot, Xandrox, Mishnadar,
Ldownss00, Calabe1992, KLBot2, Lowercase sigmabot, BG19bot, Scyllagist, WikiTryHardDieHard, Juro2351, Northamerica1000, Absconded Northerner, Muhehej1000, MusikAnimal, Marcocapelle, Stalve, EmadIV, Rm1271, Htrkaya, Omiswiki, Manoguru, Kittipatv,
Meclee, Brad7777, Glacialfox, Roleren, Anbu121, Aks23121990, Europeancentralbank, Bsutradhar, Ca3tki, Kodiologist, Codeh, Gr
khan veroana kharal, Markk waugh, Illia Connell, SelmanRepiti, Dexbot, Ubertook, Mogism, Wikignome1213, CuriousMind01, Princessandthepi, Lugia2453, Brownstat, Norazoey, Speakel, 069952497a, PeterLFlomPhD, Faizan, RG57, FallingGravity, AmericanLemming, Tentinator, Beasarah, DavidLeighEllis, Butter7938, Ugog Nizdast, Seppi333, SpuriousTwist, Ginsuloft, Sean4424, Sarwan khan,
Adirlanz, AddWittyNameHere, Narasandraprabhakara, Science.philosophy.arts, Akuaku123, Mendisar Esarimar Desktrwaimar, Mconnolly17, Zib2542, Therealthings, MelaniePS, Monkbot, Horseless Headman, Soon Son Simps, Vieque, Majormuesli, Waggie, Trackteur,
Andri Kuawko, Romelthomas, Umkan, Ybergner, Amortias, NQ, Morgantaschuk, VanishedUser sdu9aya9fs654654, Schmuck420, Crystallizedcarbon, GautamC129, Sumonratin, Zppix, Charlotte Aryanne, Thebearedguy, Mj3322, Rainamagdalena, Lucky457, JohnDae123,
Kreplach123, BabyChastie, SolidPhase, Amira Swedan, Isambard Kingdom, All-wikipro, Asyraf Afthanorhan, KasparBot, Hilopmip, Replypartyeuclides, Chonzom, CLCStudent, Badineleynes, Johnyau89, ArguMentor, Marianna251, XUSB, NRXTR and Anonymous:
1269
Portal:Statistics Source: https://en.wikipedia.org/wiki/Portal%3AStatistics?oldid=659788057 Contributors: Topbanana, Tompw, Btyner,
G716, Magioladitis, VolkovBot, Udufruduhu, Melcombe, Cenarium, Qwfp, Pa36opob, Addbot, Tcharvin, Jbenno, Ciphers, Donner60,
Northamerica1000, Illia Connell, John of Reading Bot and Anonymous: 3
List of fields of application of statistics Source: https://en.wikipedia.org/wiki/List_of_fields_of_application_of_statistics?oldid=712822720
Contributors: Alan Liefting, Btyner, Itub, G716, Cydebot, Rlsheehan, Fratrep, Melcombe, Qwfp, Koumz, Addbot, Luckas-bot, Materialscientist, Xqbot, Duoduoduo, ClueBot NG, WIKIWIZWORKER, BG19bot, Smasongarrison, Sheri khan khan, Soon Son Simps and
Anonymous: 13
Business analytics Source: https://en.wikipedia.org/wiki/Business_analytics?oldid=725185185 Contributors: Michael Hardy, Kku, Michael
Devore, Alvestrand, Gscshoyru, S.K., Mdd, Oleg Alexandrov, Mindmatrix, RHaworth, Rjwilmsi, Hans Genten, Random user 39849958,
Rick lightburn, JLaTondre, XpXiXpY, SmackBot, RolandR, Kuru, Cnbrb, Simonjohnpalmer, Earthlyreason, B, Alaibot, MelanieN,
Vlado1, Vanished user ty12kl89jq10, Sarnalios~enwiki, Wcrosbie, Philip Trueman, Billinghurst, Kerenb, Emilygracedell, Fratrep, Melcombe, Founder DIPM Institute, Ukpremier, Tomas e, Jinij, Niceguyedc, Apparition11, Writerguy71, DeepOpinion, MrOllie, Crmguru2008, Citation bot, Emcien, FrescoBot, Rlistou, Boxplot, Pinethicket, I dream of horses, AmyDenise, Dnedzel, Full-date unlinking bot, Ethansdad, Trappist the monk, Crysb, Helwr, Timtempleton, Dries Debbaut, Chire, Idea Farm, Smithandteam, ClueBot NG,
WhartonCAI, HMSSolent, Wbm1058, BG19bot, Singularit, Jamesx12345, Me, Myself, and I are Here, Faizan, Picturepro, Huang cynthia, Photo.iep, Drchriswilliams, Monkbot, HMSLavender, Loraof, Yashwantsnaik, Rasaxen, Olletove, Rahulfsm, Kavithagrg, Alpha T
Knowledge, Andrewbielat and Anonymous: 70
Descriptive statistics Source: https://en.wikipedia.org/wiki/Descriptive_statistics?oldid=736764422 Contributors: AxelBoldt, Mav, Larry
Sanger, ChangChienFu, Michael Hardy, Dcljr, Tomi, Ronz, Mickey~enwiki, Palfrey, Jitse Niesen, Henrygb, Wikibot, TPK, Giftlite,
Wmahan, Rich Farmbrough, HCA, Bender235, Arcadian, RainbowOfLight, Graham87, Vegaswikian, John Baez, Latka, Chobot, YurikBot, NTBot~enwiki, Wimt, TCrossland, DeadEyeArrow, Zarboki, Blueyoshi321, Closedmouth, Modify, Allens, Bo Jacoby, SmackBot,
Gilliam, Irbobo, Kurykh, TonySt, G716, Vina-iwbot~enwiki, Lambiam, Tim bates, Dan1679, Forsakendaemon, CmdrObot, Irwangatot, Mattisse, Barticus88, Eddyspeeder, Lfstevens, MER-C, Magioladitis, Yllhyseni, David Eppstein, Tgeairn, Syalowitz, VolkovBot,
TXiKiBoT, Samantha kellett, BotKung, Graymornings, Dan Polansky, SieBot, Bentogoa, Quest for Truth, S2000magician, Melcombe,
Escape Orbit, ClueBot, Unbuttered Parsnip, Niceguyedc, Skbkekas, L.tak, Livius3, SchreiberBike, Qwfp, Alexius08, Addbot, RPHv,
Friginator, NjardarBot, MrOllie, AndersBot, Aviados, Legobot, Luckas-bot, Yobot, KamikazeBot, Sz-iwbot, Materialscientist, Citation
bot, MauritsBot, Bakerccm, Pelicans in the lake, Pinethicket, Dashed, Duoduoduo, Cowlibob, Jlj1173, Elium2, GoingBatty, RenamedUser01302013, ZroBot, Donner60, EdoBot, Drea23839, ClueBot NG, Ch88, Hmansourian, BenJChadwick, Benzband, Morning Sunshine, ChrisGualtieri, Yukyuk11, Auss00, S2Jackie, ItsClaudiaC, Evanlemke, Ashleyleia, Clahoonya, Jsw6408, Ualtin, MelaniePS, Soon
Son Simps, Mediavalia, KasparBot and Anonymous: 111
Quality control Source: https://en.wikipedia.org/wiki/Quality_control?oldid=730756539 Contributors: Ed Poor, Deb, Heron, Olivier,
Michael Hardy, JakeVortex, Kku, Danhicks, Jiang, Kaihsu, Smack, Beck, Mydogategodshat, David Thrale, Greenrd, Vaceituno, Frazzydee,
Chuunen Baka, Nufy8, Fredrik, Texture, Robinh, Alan Liefting, Giftlite, DocWatson42, Tom harrison, Robodoc.at, Beland, Quarl,
Adamrice, DMG413, Canterbury Tail, Mike Rosoft, CALR, DanielCD, Discospinster, Rich Farmbrough, Memobug, Mani1, Bender235,
Khalid, Clooistopherm, S.K., Jensbn, El C, Art LaPella, Femto, Bobo192, Smalljim, Prainog, Brim, Maurreen, DaveGorman, John
Fader, Hooperbloob, Mdd, Gary, Conan, Romary, Velella, Computerjoe, Versageek, Ceyockey, Marcelo1229, David Haslam, Uris, ArrowmanCoder, BD2412, FreplySpang, Dpr, Koavf, GlenPeterson, The wub, Sango123, FlaBot, SchuminWeb, DennisArter, RexNL,
AndriuZ, Yorrose, Tomrosenfeld, Adoniscik, YurikBot, Wavelength, RussBot, Jenks1987, Shell Kinney, NawlinWiki, Wiki alf, Multichill, Dhollm, Jpbowen, Kyle Barbour, FF2010, Sandstein, Chase me ladies, I'm the Cavalry, Juanscott, E Wing, Retropunk, Red
Jay, JLaTondre, Dbarefoot, Kungfuadam, SmackBot, DCDuring, The Photon, DanielPeneld, Lds, Abbeyvet, David.c.h, Folajimi,
Bluebot, Rkitko, SchftyThree, KaiserbBot, Jwy, Nakon, Richard001, Weregerbil, Hmoul, Kuru, John, Peterlewis, Arhon, Rwong48,
Ripe, Waggers, Anonymous anonymous, RichardF, Novangelis, Roderickmunro, Dl2000, DabMachine, Iridescent, Shoeofdeath, IvanLanin, Dan1679, Eastlaw, Glanthor Reviol, Teixant, Mak Thorpe, Phatom87, Enoch the red, Biblbroks, Thijs!bot, Eggsyntax, James086,
JustAGal, Sniper Elite, Dfrg.msc, I already forgot, AntiVandalBot, WinBot, Hughch, Ron Richard, Rforsyth, Rabqa1, Danger, Myanw,
PhilKnight, VoABot II, JamesBWatson, Mbc362, Twsx, Ivec, Morlich, E-pen, Gwern, MartinBot, Doodledoo, Jayantaism, J.delanoy, Rlsheehan, Bogey97, VAcharon, Plasticup, SJP, Cobi, KylieTastic, DorganBot, AlnoktaBOT, Philip Trueman, TXiKiBoT, Rotor DB, Ann
Stouter, Martin451, Hanwufu, Jpeeling, Enviroboy, Jmuenzing, SieBot, BotMultichill, Su huynh, Thisisjonathanchan, Itemuk, JSpung, Pm
master, Oxymoron83, KatieDOM, La Parka Your Car, S2000magician, Melcombe, WikiLaurent, Jfbravoc~enwiki, ClueBot, The Thing
That Should Not Be, Professorial, Drmies, SecretDisc, Sabri76, Shustov, Brewcrewer, Gaslan2, StormyJerry, Sam907, DeltaQuad, Versus22, Qwfp, Raploichkin, Richdavi, Addbot, Download, LaaknorBot, SpBot, Numbo3-bot, Lightbot, Anxietycello, Krano, PlankBot,
Yobot, Ptbotgourou, Fraggle81, Sanyi4, Northenpal, KamikazeBot, Emdee, Materialscientist, Frankenpuppy, Erik9bot, Shabadsingh,
Pshent, Kiefer.Wolfowitz, Jonesey95, Impala2009, Sixsigmais, Reconsider the static, Jonkerz, Fastilysock, Suusion of Yellow, DARTH
SIDIOUS 2, Mean as custard, Sapientij, WikitanvirBot, Frostee94, Wikipelli, Dcirovic, F, JeremyBradley, L Kensington, Millsj88,
ChuispastonBot, Ileshko, Qualitytier, ClueBot NG, Bml013, Widr, Theopolisme, Helpful Pixie Bot, BG19bot, Shantanu1989, WikiHannibal, RudeBoyRudeBoy, David.moreno72, Illia Connell, OrganizedGuy, Pittello87, Dimitra Karelou, FallingGravity, Lemnaminor, JohannesFB, Ana346894, Mat657894, Monkbot, Phil Jacques, Moose1911, KiarashKevin, KasparBot and Anonymous: 320
JohannesFB, Ana346894, Mat657894, Monkbot, Phil Jacques, Moose1911, KiarashKevin, KasparBot and Anonymous: 320
Operations research Source: https://en.wikipedia.org/wiki/Operations_research?oldid=736687400 Contributors: The Anome, Tbackstr, Iwnbap, Khendon, Vignaux, Maury Markowitz, Jdpipe, Michael Hardy, Wshun, Fred Bauder, Dominus, Ixfd64, Zeno Gantner, Ronz,
EdH, Hike395, Mydogategodshat, Dysprosia, Jitse Niesen, Penfold, Finlay McWalter, Robbot, Xa4~enwiki, PBS, Gwrede, Jredmond, Altenmann, Henrygb, Nilmerg, Aetheling, Jpo, Giftlite, Muness, Oberiko, Mintleaf~enwiki, BenFrantzDale, Fastssion, Leonard G., Gracefool, Just Another Dan, Andycjp, Piotrus, Togo~enwiki, Joyous!, Fintor, Robin klein, Klemen Kocjancic, Clemwang, Canterbury Tail,
Lucidish, Monkeyman, Brianhe, Bender235, Petrus~enwiki,
, RoyBoy, Dungodung, Maurreen, Slambo, Notreadbyhumans, Haham
hanuka, Mdd, Msh210, Arthena, Andrewpmk, Bdwilliamscraig, Hu, Czyl, Saxifrage, Novacatz, Eleusis, Myleslong, Pol098, Tabletop,
AnmaFinotera, Btyner, GraemeLeggett, BDE, Ian Dunster, FlaBot, Mathbot, RobyWayne, Dnadan, YurikBot, Wavelength, Borgx,
Angus Lepper, RobotE, Encyclops, Arzel, Amckern, Manop, Rsrikanth05, Eddie.willers, Welsh, Gareth Jones, Panscient, Amcfreely,
Abrio, Cheese Sandwich, Tribaal, Open2universe, Caliprincess, Nelson50, Bluezy, Zvika, That Guy, From That Show!, Sardanaphalus,
JJL, SmackBot, DXBari, Elgrandragon, Benjaminevans82, Ohnoitsjamie, Skizzik, Anwar saadat, Sadads, Maxsonbd, Baa, Gracenotes,
D nath1, Wyckyd Sceptre, Trekphiler, Tsca.bot, Snowmanradio, Yqwen, Stevenmitchell, RJN, Jon Awbrey, Acdx, Brainfood, Ohconfucius, Vgy7ujm, Aleenf1, Wxm29, Slakr, Beetstra, SandyGeorgia, Ace Frahm, Keith-264, Will Thomas, Iridescent, Vocaro, Philip
ea, CRGreathouse, Thomasmeeks, Requestion, Sanspeur, Penbat, Cydebot, Krauss, Chrislk02, Imajoebob, Jay.Here, Thijs!bot, Surendra
mohnot, David from Downunder, Dawnseeker2000, Seaphoto, Matforddavid, Wayiran, Knotwork, Erxnmedia, .anacondabot, Xn4, Swpb,
David Eppstein, DGG, Jim.henderson, R'n'B, LittleOldMe old, Erkan Yilmaz, NerdyNSK, Auegel, DadaNeem, Axr15, Jose Gaspar, Alterrabe, Deor, JohnBlackburne, Homarjun, Jimmaths, Toddy1, Oshwah, SueHay, Ask123, BarryList, Rich Janis, PhDinMS, SieBot, BotMultichill, Jyoti buet, Bentogoa, Dralbertomarquez, Flyer22 Reborn, Samansouri, Mtrick, Oxymoron83, Wuhwuzdat, S2000magician,
Melcombe, Masoudsa, ClueBot, Koczy, Deanlaw, The Thing That Should Not Be, Isaac.holeman, Grantbow, Boing! said Zebedee, EnigmaMcmxc, Three-quarter-ten, PixelBot, Abkeshvari, JamieS93, TheRedPenOfDoom, Thewellman, Qwfp, Wally Tharg, Graham Sharp,
FTGHSmith, Addbot, Fgnievinski, KenKendall, Fieldday-sunday, Mnh, Leszek Jaczuk, Download, Protonk, LaaknorBot, Lightbot,
Zorrobot, Legobot, Luckas-bot, Yobot, VictorK1965, Pcap, AnomieBOT, VanishedUser sdu9aya9fasdsopa, DemocraticLuntz, Bsimmons666, Galoubet, Formol, HanPritcher, TheTechieGeek63, WebsterRiver, Knowledge Incarnate, Lonniev, Isheden, Williamsrus, KosMal, Omnipaedista, Nnhsky, FrescoBot, Krj373, Sidna, D'ohBot, Cargoking, Michael.Forman, Citation bot 1, Shuroo, Kiefer.Wolfowitz,
RedBot, Rkhwaja, Ibizzavic, NorthernCounties, G Qian, Duoduoduo, Robinqiu, Earthandmoon, Mean as custard, EmausBot, Vader07d,
Dramaturgid, JaeDyWolf, Netha Hussain, Erianna, ChuispastonBot, Nrlsouza, 28bot, Snumath89, Will Beback Auto, StopThat, Gareth
Grith-Jones, Mdgarvey, Widr, BradfordF, Helpful Pixie Bot, Merveunuvar, Wbm1058, BG19bot, Tcody84, Pine, Qx2020, Rjpbi, Marcocapelle, Compfreak7, Brad7777, Gibbja, MahdiBot, Cyberbot II, BFL2015, Flower of Mystery, Jbeyerl, Me, Myself, and I are Here,
Razibot, Randykitty, Kuldeepsheoran1, Biogeographist, Pedarkwa, Melody Lavender, Ginsuloft, Lizia7, Pablodim91, Longobardiano,
WholeWheatBagel, U2fanboi, Behroozkamali, Bfortz, Sarahfores, Cyberbikerva, Prasoon068, Directorofpubs, KasparBot, JessicaGibbs,
Prgks, GreenC bot, RainFall, Varsei.mohsen and Anonymous: 282
Machine learning Source: https://en.wikipedia.org/wiki/Machine_learning?oldid=736737631 Contributors: Arvindn, ChangChienFu,
Michael Hardy, Kku, Delirium, Ahoerstemeier, Ronz, BenKovitz, Mxn, Hike395, Silvonen, Furrykef, Buridan, Jmartinezot, Phoebe,
Shizhao, Topbanana, Robbot, Plehn, KellyCoinGuy, Ancheta Wis, Fabiform, Centrx, Giftlite, Seabhcan, Levin, Dratman, Jason Quinn,
Khalid hassani, Utcursch, APH, Gene s, Paulscrawl, Clemwang, Nowozin, Silence, Bender235, ZeroOne, Superbacana, Aaronbrick,
Jojit fb, Nk, Rajah, Tritium6, Haham hanuka, Mdd, HasharBot~enwiki, Vilapi, Arcenciel, Denoir, Diego Moya, Wjbean, Stephen
Turner, LearnMore, Rrenaud, Leondz, Soultaco, Ruud Koot, BlaiseFEgan, JimmyShelter~enwiki, Essjay, Joerg Kurt Wegner, Adiel,
BD2412, Qwertyus, Rjwilmsi, Emrysk, VKokielov, Eubot, Celendin, Intgr, Predictor, Kri, BMF81, Irregulargalaxies, Chobot, Bobdc,
Bgwhite, Adoniscik, YurikBot, Misterwindupbird, Trondtr, Nesbit, Grafen, Gareth Jones, Srinivasasha, Raikkonen, Crasshopper, DaveWF, Masatran, CWenger, Fram, KnightRider~enwiki, SmackBot, Mneser, InverseHypercube, CommodiCast, Jyoshimi, Mcld, KYN,
Ohnoitsjamie, Chris the speller, FidesLT, Nbarth, Cfallin, Moorejh, JonHarder, Baguasquirrel, Krexer, Shadow1, Philpraxis~enwiki,
Daniel.Cardenas, Sina2, ChaoticLogic, NongBot~enwiki, RexSurvey, Beetstra, WMod-NS, Julthep, Dsilver~enwiki, Dicklyon, Vsweiner,
Optakeover, Ctacmo, MTSbot~enwiki, Ralf Klinkenberg, Dave Runger, Doceddi, Scigrex14, Pgr94, Innohead, Bumbulski, Peterdjones,
Dancter, Msnicki, Quintopia, Thijs!bot, Mereda, Perrygogas, Djbwiki, GordonRoss, Kinimod~enwiki, Damienfrancois, Natalie Erin,
Seaphoto, AnAj, Ninjakannon, Kimptoc, Penguinbroker, The Transhumanist, Jrennie, Hut 8.5, Kyhui, Magioladitis, Ryszard Michalski, Jwojt, Transcendence, Tedickey, Pebkac, Robotman1974, Jroudh, Businessman332211, Pmbhagat, Calltech, STBot, Keith D, Glrx,
Nickvence, Gem-fanat, Salih, AntiSpamBot, Gombang, Chriblo, Mxwsn, Dana2020, DavidCBryant, Bonadea, WinterSpw, RJASE1,
Funandtrvl, James Kidd, LokiClock, Redgecko, Markcsg, Jrljrl, Like.liberation, A4bot, Daniel347x, Joel181, Wikidemon, Lordvolton,
Defza, Chrisoneall, Wingedsubmariner, Spiral5800, Kesshaka, Cvdwalt, Why Not A Duck, Sebastjanmm, LittleBenW, Gal chechik,
Biochaos, Cmbishop, Jbmurray, IradBG, Smsarmad, Scorpion451, Kumioko (renamed), CharlesGillingham, StaticGull, CultureDrone,
Anchor Link Bot, ImageRemovalBot, ClueBot, GorillaWarfare, Ahyeek, Sonu mangla, Ggia, Debejyo, D.scain.farenzena, He7d3r, Magdon~enwiki, WilliamSewell, Jim15936, Vanished user uih38riiw4hjlsd, Evansad, Roxy the dog, PseudoOne, Andr P Ricardo, Agamemnonc, Darnelr, MystBot, Dsimic, YrPolishUncle, MTJM, Addbot, Mortense, Fyrael, Aceituno, MrOllie, LaaknorBot, Jarble, Movado73,
Luckas-bot, QuickUkie, Yobot, NotARusski, Genius002, Examtester, AnomieBOT, Piano non troppo, Materialscientist, Clickey, Devantheryv, Vivohobson, ArthurBot, Quebec99, Xqbot, Happyrabbit, Gtfjbl, Kithira, J04n, Addingrefs, Webidiap, Shirik, Joehms22,
Aaron Kauppi, Velblod, Prari, FrescoBot, Jdizzle123, Olexa Riznyk, Featherard, WhatWasDone, Siculars, Proviktor, Boxplot, Swordsmankirby, I dream of horses, Wikinacious, Skyerise, Mostafa mahdieh, Lars Washington, TobeBot, OnceAlpha, AXRL, ,
BertSeghers, Edouard.darchimbaud, Winnerdy, Zosoin, Helwr, EmausBot, Johncasey, Dzkd, Primefac, MartinThoma, Jasonanaggie,
MarsTrombone, Wht43, Chire, GZ-Bot, Jcautilli, Jorjulio, AManWithNoPlan, Pintaio, L Kensington, Ataulf, Zfeinst, Yoshua.Bengio,
Casia wyq, Ego White Tray, Blaz.zupan, Shinosin, Marius.andreiana, Lovok Sovok, Graytay, Liuyipei, ClueBot NG, Tillander, Keefaas,
Lawrence87, Aiwing, Pranjic973, Candace Gillhoolley, Robiminer, Leonardo61, Wrdieter, Arrandale, O.Koslowski, WikiMSL, Helpful
Pixie Bot, RobertPollak, BG19bot, Smorsy, Mohamed CJ, Lisasolomonsalford, Anubhab91, Chafe66, Solomon7968, Ishq2011, Autologin, Brooksrichardbrown, DasAllFolks, Billhodak, Debora.riu, Ohandyya, Davidmetcalfe, David.moreno72, Mdann52, JoshuSasori,
Ulugen, IjonTichyIjonTichy, Keshav.dhandhania, Dexbot, Mogism, Djfrost711, Bkuhlman80, Frosty, Jamesx12345, Shubhi choudhary,
Jochen Burghardt, Joeinwiki, Brettrmurphy, Phamnhatkhanh, Ppilotte, Delaf, InnocuousPilcrow, Kittensareawesome, Statpumpkin, Neo
Poz, Dustin V. S., TJLaher123, Ankit.u, Francisbach, Aleks-ger, MarinMersenne, Weiping.thu, LokeshRavindranathan, Tonyszedlak,
Proneat123, GrowthRate, Sami Abu-El-Haija, Mpgoldhirsh, Work Shop Corpse, Superploro, Riceissa, Dawolakamp, Waggie, Justincahoon, Jorge Guerra Pires, Hm1235, Velvel2, Vidhul sikka, Erik Itter, Annaelison, Tgrin9, Chazdywaters, Rmashrmash, Komselvam, Robbybluedogs, HelpUsStopSpam, EricVSiegel, KenTancwell, Justinqnabel, Rusky.ai, Datapablo, Aetilley, JenniferTheEmpress0,
Dsysko, Haodong123, Lr0^^k, BNoack, NightOwl15, Latosh Boris, Thejavis86, Muratovst, Pinsi281, ArguMentor, Datakeeper, Doctasarge, Espyromi, Kailey 2001, WunderStahl, Natenatenatenate, Fmadd, Vladiatorr, Ayasdi, ChillyBlue, Hyksandra, Famousceleb, Hexacta BA, AllenAkhaumere and Anonymous: 429


Statistical inference Source: https://en.wikipedia.org/wiki/Statistical_inference?oldid=734145647 Contributors: Larry Sanger, Jkominek,
Christian List, Michael Hardy, Tomi, Den fjttrade ankan~enwiki, Cherkash, Hyacinth, Benwing, Henrygb, Lfkrebs, Modeha, Ancheta Wis, Giftlite, MarkSweep, Piotrus, Rich Farmbrough, Bender235, Arcadian, Jonsafari, Eric Kvaalen, Hoary, Oleg Alexandrov,
MartinSpacek, Graham87, Rjwilmsi, Koavf, Chobot, YurikBot, Wavelength, Ksyrie, RicReis, Bo Jacoby, SmackBot, Jtneill, Gilliam,
Chris the speller, Nbarth, Scwlong, Shalom Yechiel, G716, KlaudiuMihaila, CRGreathouse, Mattisse, Talgalili, Thijs!bot, Headbomb,
Odoncaoa, Jvstone, Alphachimpbot, Ph.eyes, Rongou, Douglas Whitaker, SHCarter, David Eppstein, TheSeven, Kenneth M Burke,
Aagtbdfoua, The Tetrast,
, Strife911, S2000magician, Melcombe, L.tak, Qwfp, Ps07swt, Tayste, GDibyendu, Addbot, Fgnievinski,
NjardarBot, Quercus solaris, Tassedethe, Yobot, Reindra, AnomieBOT, Citation bot, Xqbot, DSisyphBot, , RibotBOT,
FrescoBot, Olexa Riznyk, Citation bot 1, Maher27777, Kiefer.Wolfowitz, Jonesey95, Codwiki, Serols, Pollinosisss, RockfanRecords,
PleaseStand, RjwilmsiBot, John of Reading, Mo ainm, Wikipelli, Dcirovic, McPastry, JA(000)Davidson, Vahid232323, Erianna, Snehalshekatkar, ClueBot NG, Run54, Bayes Puppy, Helpful Pixie Bot, JMWJMWJMW, HankW512, Illia Connell, Dexbot, FoCuSandLeArN,
Bashirra1, Greenleafjacob, Yamaha5, MalinowskiJ, YiFeiBot, Quenhitran, Monkbot, Soon Son Simps, Robsedropse, SolidPhase, LadyLeodia, Isambard Kingdom, I am grungy, KasparBot, Marco db1984, LawrenceSeminarioRomero and Anonymous: 54
Correlation and dependence Source: https://en.wikipedia.org/wiki/Correlation_and_dependence?oldid=736842065 Contributors: AxelBoldt, Fnielsen, SimonP, ChangChienFu, Heron, Patrick, Michael Hardy, Kku, Tomi, Den fjttrade ankan~enwiki, Julesd, Jtzg, Kaihsu,
Trontonian, Furrykef, Schutz, Jmabel, Seglea, Robinh, Amead, Giftlite, Tom harrison, Skatehorn, Alterego, Muzzle, Frencheigh, Edcolins, Wmahan, Dfrankow, Pgan002, Karol Langner, John Foley, RainerBlome, Annis, R, DanielCD, Rich Farmbrough, Paul August,
Bender235, Nwallins, El C, Bobo192, Arcadian, NickSchweitzer, 3mta3, 99of9, Myudelson, Landroni, Jumbuck, Rgclegg, Ossiemanners, PAR, Cburnett, Msauve, Shoey, Drummond, Oleg Alexandrov, Tariqabjotu, Tappancsa, Waabu, FrancisTyers, LOL, Btyner, Eslip17, Rjwilmsi, Nneonneo, JanSuchy, FlaBot, RobertG, Mathbot, Jrtayloriv, Goudzovski, DVdm, Volunteer Marek, Nehalem, Debivort,
YurikBot, Wavelength, RobotE, Hede2000, Dkostic, Yyy, Wimt, ENeville, Grafen, Holon, Slarson, Cruise, Moe Epsilon, Alex43223,
Scottsher, Kkmurray, Closedmouth, Cedar101, H@r@ld, Killerandy, Mebden, Bo Jacoby, Janek Kozicki, Sardanaphalus, SmackBot,
KnowledgeOfSelf, Jtneill, IstvanWolf, ToddDeLuca, Mcld, Afa86, Rashid8928, SchftyThree, Gruzd, Zven, Lightspeedchick, The
Placebo Eect, Phli, Bilgrau, Morqueozwald, Wen D House, Cybercobra, Decltype, G716, Eggstone, Brianboonstra, Mwtoews, Stefano85, Ugur Basak Bot~enwiki, TriTertButoxy, Dankonikolic, SashatoBot, Loodog, Dfass, Tasc, Beetstra, Julthep, Kuifware, Treyp,
MTSbot~enwiki, Hu12, Mrpoppy, Frazmi, Chris53516, A. Pichler, Dan1679, EXY Wilson, Kaloskagatos~enwiki, Thermochap, Tanthalas39, Jackzhp, Makeemlighter, Rsommerford, Trilateral chairman, Jppellet, Humble2000, Cydebot, Danrok, Jayen466, DavidRF, Bpadinha, Niubrad, Thijs!bot, AndrewDressel, Al Lemos, Headbomb, EdJohnston, Mentisto, JEBrown87544, Gioto, Crabula, Akshayaj,
Zweifel, Mycatharsis, Jack Ransom, JAnDbot, Narssarssuaq, Ph.eyes, Douglas Whitaker, .anacondabot, JamesBWatson, Avicennasis,
Animum, Cathalwoods, Excesses, NAHID, Ccbrucer, Custos0, J.delanoy, Jorgenumata, Debonair dave, SJS1971, Metallter, Mikael
Hggstrm, Comp25, Policron, Atama, Alterrabe, VolkovBot, Jjoe, VivekVish, Matthias Buchmeier, Ann Stouter, Keldorn~enwiki, Sultec, Dfarrar, Smates, Rjakew, Kenmccue, Admdikramr, Tiddly Tom, Malcolmxl5, Truecobb, Zzzzort, Quest for Truth, Flyer22 Reborn,
Tiptoety, Paolo.dL, Yerpo, Grzesiub, Sean.hoyland, Melcombe, Escape Orbit, Martarius, ClueBot, Hongguanglishibahao, Parkjunwung,
Erudecorp, Eric.brasseur, Tomeasy, Gtstricky, Skbkekas, Thingg, Imagecreator, Juancitomiguelito, Qwfp, XLinkBot, Prax54, Colinc719,
Quincypang, Addbot, Foggynight, White sheik, DOI bot, Fgnievinski, EjsBot, Ronhjones, CanadianLinuxUser, Francesco Pozzi, MrOllie, MrVanBot, Lov090, !xo Derek, Zorrobot, Luckas-bot, Yobot, Amirobot, Andresswift, BigAndi~enwiki, KamikazeBot, AnomieBOT,
Erel Segal, Aerosmithfan2012, Ciphers, Joule36e5, Piano non troppo, ATKX, Flewis, Citation bot, Dynablaster, S h i v a (Visnu), Nightbit, Unscented, C+C, Ute in DC, SciberDoc, RibotBOT, Doulos Christos, Bartonpoulson, Davek44, Pinethicket, LittleWink, Jonesey95,
Tom.Reding, Mystykmoo, Tim1357, , Lotje, TheBFG, Decstop, Vrenator, Defender of torch, Duoduoduo, WalkerKr, Diannaa, Unbitwise, H.ehsaan, Orphan Wiki, Razor2988, Wijobs, Tcox88, Wikihah, SporkBot, Amir9a, Netha Hussain, Tolly4bolly, L
Kensington, Gjshisha, ClueBot NG, Mathstat, Jack Greenmaven, Michelsanssoleil, , Jj1236, Frietjes, Rezabot, Amircrypto, DenisBoigelot, Helpful Pixie Bot, BG19bot, George Ponderevo, Tybrad11, Dlituiev, WTFHaxorz117, 180498Js, Willross33, Illia Connell,
7804j, Gulakov, Ultimate cosmic evil, Paweng, Monkbot, Trackteur, ChinnarajaC Nithisa, Jamalmunshi, Renlar, Mario Casteln Castro,
Moorshed, , Loraof, I am grungy, Sharboleth, DatGuy and Anonymous: 395
Regression analysis Source: https://en.wikipedia.org/wiki/Regression_analysis?oldid=728120953 Contributors: Berek, Taw, ChangChienFu,
Michael Hardy, Kku, Meekohi, Jeremymiles, Ronz, Den fjttrade ankan~enwiki, Hike395, Quickbeam, Jitse Niesen, Taxman, Samsara,
Bevo, Mazin07, Benwing, Robinh, Giftlite, Bnn, TomViza, BrendanH, Jason Quinn, Noe, Piotrus, APH, Israel Steinmetz, Urhixidur,
Rich Farmbrough, Pak21, Paul August, Bender235, Bobo192, Cretog8, Arcadian, NickSchweitzer, Photonique, Mdd, Jrme, Denoir,
Arthena, Riana, Avenue, Emvee~enwiki, Nvrmnd, Gene Nygaard, Krubo, Oleg Alexandrov, Abanima, Lkinkade, Woohookitty, LOL,
Marc K, Kosher Fan, BlaiseFEgan, Wayward, Btyner, Lacurus~enwiki, Qwertyus, Gmelli, Salix alba, MZMcBride, Pruneau, Mathbot, Valermos, Goudzovski, King of Hearts, Chobot, Jdannan, Krishnavedala, Wavelength, Wimt, Afelton, Brian Crawford, DavidHouse~enwiki, DeadEyeArrow, Avraham, Jmchen, NorsemanII, Tribaal, Closedmouth, Arthur Rubin, Josh3580, Wikiant, Shawnc,
robot, Veinor, Doubleplusje, SmackBot, NickyMcLean, Deimos 28, Antro5, Cazort, Gilliam, Feinstein, Oli Filth, Nbarth, Ctbolt, DHN-bot~enwiki, Gruzd, Hve, Berland, EvelinaB, Radagast83, Cybercobra, Krexer, CarlManaster, Nrcprm2026, G716, Mwtoews,
Cosmix, Tedjn, Friend of facts, Danilcha, John, FrozenMan, Tim bates, JorisvS, IronGargoyle, Beetstra, Dicklyon, AdultSwim, Kvng,
Joseph Solis in Australia, Chris53516, Dan1679, Ioannes Pragensis, Markjoseph125, CBM, Thomasmeeks, GargoyleMT, Ravensfan5252,
JohnInDC, Talgalili, Wikid77, Qwyrxian, Sagaciousuk, Tolstoy the Cat, N5iln, Carpentc, AntiVandalBot, Woollymammoth, Lcalc,
JAnDbot, Goskan, Giler, QuantumEngineer, Ph.eyes, SiobhanHansa, DickStartz, JamesBWatson, Username550, Fleagle11, Marcelobbribeiro, David Eppstein, DerHexer, Apdevries, Thenightowl~enwiki, Mbhiii, Discott, Trippingpixie, Cpiral, Gzkn, Rod57, TomyDuby,
Coppertwig, RenniePet, Policron, Bobianite, Blueharmony, Peepeedia, EconProf86, Qtea, BernardZ, TinJack, CardinalDan, HughD,
DarkArcher, Gpeilon, Franck Dernoncourt, TXiKiBoT, Oshwah, SueHay, Qxz, Gnomepirate, Sintaku, Antaltamas, JhsBot, Broadbot,
Beusson, Cremepu222, Zain Ebrahim111, Billinghurst, Kusyadi, Traderlion, Asjoseph, Petergans, Rlendog, BotMultichill, Statlearn,
Gerakibot, Matthew Yeager, Timhowardriley, Strife911, Indianarhodes, Amitabha sinha, OKBot, Water and Land, AlanUS, Savedthat,
Mangledorf, Randallbsmith, Amadas, Tesi1700, Melcombe, Denisarona, JL-Bot, Mrfebruary, Kotsiantis, Tdhaene, The Thing That
Should Not Be, Sabri76, Auntof6, DragonBot, Sterdeus, Skbkekas, Stephen Milborrow, Cfn011, Crash D 0T0, SBemper, Qwfp, Antonwg,
Sunsetsky, XLinkBot, Gerhardvalentin, Nomoskedasticity, Veryhuman, Piratejosh85, WikHead, SilvonenBot, Hess88, Addbot, Diegoful, Wootbag, Fgnievinski, Geced, MrOllie, LaaknorBot, Lightbot, Luckas-bot, Yobot, Themfromspace, TaBOT-zerem, Andresswift,
KamikazeBot, Eaihua, Tempodivalse, AnomieBOT, Andypost, Jim1138, RandomAct, HanPritcher, Citation bot, Jyngyr, LilHelpa, Obersachsebot, Xqbot, Statisticsblog, TinucherianBot II, Ilikeed, J04n, GrouchoBot, BYZANTIVM, Fstonedahl, Bartonpoulson, D0kkaebi,
Citation bot 1, Dmitronik~enwiki, Boxplot, Yuanfangdelang, Pinethicket, Kiefer.Wolfowitz, LittleWink, Tom.Reding, Stpasha, Di1000,
Jonkerz, Duoduoduo, Diannaa, Tbhotch, Mean as custard, RjwilmsiBot, EmausBot, Ajraddatz, RA0808, Klbrain, KHamsun, F, Julienbarlan, Hypocritical~enwiki, Kgwet, Zfeinst, Bomazi, ChuispastonBot, 28bot, Rocketrod1960, ClueBot NG, Mathstat, MelbourneStar,
Joel B. Lewis, CH-stat, Helpful Pixie Bot, BG19bot, Giogm2000, CitationCleanerBot, Hakimo99, Gprobins, Prof. Squirrel, Attleboro,
Illia Connell, JYBot, Sinxvin, Sminthopsis84, Francescapelusi, Lugia2453, SimonPerera, Me, Myself, and I are Here, Lemnaminor, Inniti4, EJM86, Francisbach, Eli the King, Monkbot, Bob nau, Moorshed k, Moorshed, HelpUsStopSpam, KasparBot, Statperson123,
Ballatown, NatalieSunshine and Anonymous: 396
Multivariate statistics Source: https://en.wikipedia.org/wiki/Multivariate_statistics?oldid=730570477 Contributors: Ap, Fnielsen, Michael
Hardy, Tomi, Den fjttrade ankan~enwiki, Cherkash, Sboehringer, Henrygb, Diberri, APH, Sam Hocevar, Bender235, Kronoss, Landroni,
Denoir, Jheald, Mindmatrix, Lgallindo, Graham87, Rjwilmsi, Adoniscik, Holon, SmackBot, Unyoyega, Chlewbot, Jmlk17, Richard001,
Nutcracker, Kaarebrandt, Gnome (Bot), Tawkerbot2, Dan1679, Mikiemike, CBM, Rphirsch, Thijs!bot, Mycatharsis, Johnbibby, STBot,
Luc.girardin, TheSeven, It Is Me Here, Policron, DonAndre, TXiKiBoT, Seraphim, Lorloci, AlasdairBailey, Yerpo, Melcombe, Digisus,
Dlrohrer2003, Niceguyedc, Qwfp, DumZiBoT, Ewger, Proevy, Addbot, Fgnievinski, MrOllie, Delaszk, PAvdK, Luckas-bot, AnomieBOT,
Pankaj.tux, Mookiedockee, Materialscientist, Dpoduval, Citation bot, Xqbot, GrouchoBot, Omnipaedista, FrescoBot, Boxplot, Kiefer.Wolfowitz,
Ericbeg, FoxBot, Trappist the monk, Kastchei, RMGunton, Fjsalguero, LaineVitola, ZroBot, Johnbates, Helpful Pixie Bot, Therhaag,
Verplay, Cretchen, Illia Connell, Jack McArdle, Epicgenius, Lemnaminor, FizykLJF, Icely88, Monkbot, Soon Son Simps, Moorshed,
Loraof, KasparBot, Kkdabt and Anonymous: 50
Data collection Source: https://en.wikipedia.org/wiki/Data_collection?oldid=735567931 Contributors: Kku, Ronz, Cherkash, Bobo192,
Mdd, Alansohn, RainbowOfLight, Waldir, Bgwhite, Hydrargyrum, Rsrikanth05, Aeusoes1, Daniel Mietchen, Epipelagic, SmackBot,
Yamaguchi , Ohnoitsjamie, Gyrobo, Scwlong, Eliyak, Cydebot, Dmbrown00, RichardVeryard, KylieTastic, Geekdiva, Khairul hazim,
DoorsAjar, Shanzu, Yintan, Khvalamde, Aboluay, Melcombe, Arakunem, Jdrowlands, SchreiberBike, Qwfp, Addbot, MrOllie, Peter
Flass, Tempodivalse, Jim1138, Materialscientist, Qweedsa, Tim Keighley,
, Pinethicket, I dream of horses, Kiefer.Wolfowitz,
Jonkerz, Lotje, Mean as custard, EmausBot, WikitanvirBot, Tuxedo junction, Bemanna, Tolly4bolly, 28bot, ClueBot NG, Helpful Pixie
Bot, BG19bot, Altar, ChrisGualtieri, Illia Connell, Dexbot, Allnamesarefkintaken, JaconaFrere, Woenel, Soon Son Simps, Beth.Alex123,
My Chemistry romantic, Rmullings, WikiBaes, Junosoon and Anonymous: 78
Time series Source: https://en.wikipedia.org/wiki/Time_series?oldid=732545965 Contributors: Michael Hardy, Kku, Dcljr, Cherkash,
Charles Matthews, Taxman, Topbanana, Gandalf61, Babbage, Wile E. Heresiarch, Giftlite, Pucicu, Andycjp, Piotrus, Wyllium, Discospinster, Rich Farmbrough, Pak21, Modargo, Bender235, Calair, Tobacman, Jrme, Gary, Arthena, Rgclegg, PAR, Spangineer, Aegis
Maelstrom, Mindmatrix, Camw, Btyner, Edison, Rjwilmsi, MZMcBride, ElKevbo, Rbonvall, Intgr, Chobot, Gap, YurikBot, Wavelength,
CambridgeBayWeather, Jugander, Joel7687, Tarawneh, Georey.landis, Zvika, SmackBot, Hopf, Mm100100, Unyoyega, CommodiCast,
Commander Keane bot, Esoterum, Chris the speller, Oli Filth, Nbarth, G716, Mwtoews, Bdushaw, Dankonikolic, Lambiam, Kuru, John
Cumbers, Nutcracker, Nialsh, Susko, JohnCD, Requestion, Krauss, Scientio, Lovibond, VictorAnyakin, JAnDbot, Instinct, Magioladitis,
VoABot II, Albmont, SHCarter, Ldecola, JaGa, Apdevries, MartinBot, Nono64, Abeliavsky, Eliz81, TheSeven, SShearman, Policron,
Cottrellnc, STBotD, The enemies of god, Funandtrvl, VolkovBot, DrMicro, Jimmaths, Frederic.vernier, Kv75, Cpdo, Zheric~enwiki,
SieBot, Mathaddins, Charmi99, Flyer22 Reborn, Strife911, Melcombe, Rinconsoleao, ClueBot, Drrho, Zipircik, SchreiberBike, Aleksd,
1ForTheMoney, Qwfp, Keithljelp, Dkondras, Dekart, Hellopeopleofdetroit, Tayste, Addbot, Truswalu, Cwdegier, Fgnievinski, MrOllie, LaaknorBot, Favonian, Legobot, Luckas-bot, Yobot, AnomieBOT, Mihal Orela, Materialscientist, Xqbot, Armandology, FrescoBot,
Luyima, Citation bot 1, Boxplot, Pinethicket, Kiefer.Wolfowitz, Rushbugled13, Hoo man, Merlion444, Twilight Nightmare, Duoduoduo, Badtoothfairy, Diannaa, Simonkramer, FBmotion, Sandman888, DARTH SIDIOUS 2, Helwr, EmausBot, Johncasey, Dewritech,
BAICAN XXX, Dondervogel 2, Chire, A930913, Burhem, Donner60, ChuispastonBot, Visu dreamz, Mjbmrbot, ClueBot NG, Mesoderm, Statoman71, Masssly, Helpful Pixie Bot, Scwarebang, Dr ahmed1010, BG19bot, QualitycontrolUS, Andreas4965, Ricardo Oliveros
Ramos, Adrianafraj, Cretchen, EdwardH, Thulka, BattyBot, Sick Rantorum, ChrisGualtieri, Imdadasad, SFK2, Jochen Burghardt, GabeIglesia, Coginsys, Llinfeng, OhGodItsSoAmazing, LCS check, EJM86, Citrusbowler, Cubism44, Soon Son Simps, Srp54, Moorshed k,
Oversound2, Moorshed, DoctorTerrella, Loraof, Vtor, HelpUsStopSpam, Rodionos, KasparBot, Tariqfaruqi, Jahrmann, Colinwikipedia
and Anonymous: 156

14.12.2 Images

File:Animation2.gif Source: https://upload.wikimedia.org/wikipedia/commons/c/c0/Animation2.gif License: CC-BY-SA-3.0 Contributors: Own work Original artist: MG (talk contribs)
File:Anscombe's_quartet_3.svg Source: https://upload.wikimedia.org/wikipedia/commons/e/ec/Anscombe%27s_quartet_3.svg License: CC BY-SA 3.0 Contributors: Anscombe.svg Original artist: Anscombe.svg: Schutz
File:Astrolabe-Persian-18C.jpg Source: https://upload.wikimedia.org/wikipedia/commons/1/18/Astrolabe-Persian-18C.jpg License:
CC BY-SA 2.0 Contributors: Whipple Museum of the History of Science Original artist: Andrew Dunn
File:Automated_weighbridge_for_Adélie_penguins_-_journal.pone.0085291.g002.png Source: https://upload.wikimedia.org/wikipedia/commons/5/54/Automated_weighbridge_for_Ad%C3%A9lie_penguins_-_journal.pone.0085291.g002.png License: CC BY 4.0 Contributors: Lescroël, A. L.; Ballard, G.; Grémillet, D.; Authier, M.; Ainley, D. G. (2014). Antarctic Climate Change: Extreme Events Disrupt Plastic Phenotypic Response in Adélie Penguins. PLoS ONE 9: e85291. DOI:10.1371/journal.pone.0085291. Original artist: Lescroël, A. L.; Ballard, G.; Grémillet, D.; Authier, M.; Ainley, D. G. (2014)
File:B_24_in_raf_service_23_03_05.jpg Source: https://upload.wikimedia.org/wikipedia/commons/a/a1/B_24_in_raf_service_23_03_
05.jpg License: Public domain Contributors: Transferred from en.wikipedia to Commons. Original artist: The original uploader was Bzuk
at English Wikipedia
File:Commons-logo.svg Source: https://upload.wikimedia.org/wikipedia/en/4/4a/Commons-logo.svg License: CC-BY-SA-3.0 Contributors: ? Original artist: ?
File:Complex-adaptive-system.jpg Source: https://upload.wikimedia.org/wikipedia/commons/0/00/Complex-adaptive-system.jpg License: Public domain Contributors: Own work by Acadac : Taken from en.wikipedia.org, where Acadac was inspired to create this graphic
after reading: Original artist: Acadac
File:Correlation_examples.png Source: https://upload.wikimedia.org/wikipedia/commons/0/02/Correlation_examples.png License: Public domain Contributors: Transferred from en.wikipedia to Commons by jtneill. Original artist: Imagecreator at English Wikipedia
File:Correlation_examples2.svg Source: https://upload.wikimedia.org/wikipedia/commons/d/d4/Correlation_examples2.svg License: CC0 Contributors: Own work, original uploader was Imagecreator Original artist: DenisBoigelot, original uploader was Imagecreator
File:Correlation_range_dependence.svg Source: https://upload.wikimedia.org/wikipedia/commons/f/f0/Correlation_range_dependence.
svg License: CC BY 3.0 Contributors: Own work Original artist: Skbkekas
File:Edit-clear.svg Source: https://upload.wikimedia.org/wikipedia/en/f/f2/Edit-clear.svg License: Public domain Contributors: The Tango! Desktop Project. Original artist: The people from the Tango! project. And according to the meta-data in the file, specifically: Andreas Nilsson, and Jakub Steiner (although minimally).
File:Emblem-money.svg Source: https://upload.wikimedia.org/wikipedia/commons/f/f3/Emblem-money.svg License: GPL Contributors: http://www.gnome-look.org/content/show.php/GNOME-colors?content=82562 Original artist: perfectska04
File:Featured_article_star.svg Source: https://upload.wikimedia.org/wikipedia/commons/b/bc/Featured_article_star.svg License: CCBY-SA-3.0 Contributors: Own work Original artist: Booyabazooka
File:Fisher_iris_versicolor_sepalwidth.svg Source: https://upload.wikimedia.org/wikipedia/commons/4/40/Fisher_iris_versicolor_sepalwidth.
svg License: CC BY-SA 3.0 Contributors: en:Image:Fisher iris versicolor sepalwidth.png Original artist: en:User:Qwfp (original); Pbroks13
(talk) (redraw)
File:Folder_Hexagonal_Icon.svg Source: https://upload.wikimedia.org/wikipedia/en/4/48/Folder_Hexagonal_Icon.svg License: Ccby-sa-3.0 Contributors: ? Original artist: ?
File:Fotothek_df_n-04_0000019.jpg Source: https://upload.wikimedia.org/wikipedia/commons/c/cc/Fotothek_df_n-04_0000019.jpg
License: CC BY-SA 3.0 de Contributors: Deutsche Fotothek Original artist: Eugen Nosko

File:Free-to-read_lock_75.svg Source: https://upload.wikimedia.org/wikipedia/commons/8/80/Free-to-read_lock_75.svg License: CC0 Contributors: Adapted from Open_Access_logo_PLoS_white_green.svg Original artist: This version: Trappist_the_monk (talk) (Uploads)
File:Gretl_screenshot.png Source: https://upload.wikimedia.org/wikipedia/commons/b/b9/Gretl_screenshot.png License: GPL Contributors: ? Original artist: ?
File:Internet_map_1024.jpg Source: https://upload.wikimedia.org/wikipedia/commons/d/d2/Internet_map_1024.jpg License: CC BY
2.5 Contributors: Originally from the English Wikipedia; description page is/was here. Original artist: The Opte Project
File:Jérôme_Cardan.jpg Source: https://upload.wikimedia.org/wikipedia/commons/9/97/Jer%C3%B4me_Cardan.jpg License: Public domain Contributors: this website Original artist: Unknown
File:Kammhuber_Line_Map_-_Agent_Tegal.png Source: https://upload.wikimedia.org/wikipedia/commons/3/33/Kammhuber_Line_Map_-_Agent_Tegal.png License: Public domain Contributors: Imperial War Museum? - picture scanned by me Ian Dunster 13:56, 12 March 2006 (UTC) from: Most Secret War by R. V. Jones - Coronet - 1981 - ISBN 0-340-24169-1 and uncredited. Original artist: Unknown
File:Linear_least_squares(2).svg Source: https://upload.wikimedia.org/wikipedia/commons/7/75/Linear_least_squares%282%29.svg
License: CC BY-SA 3.0 Contributors: Own work Original artist: Sega sai
File:Linear_regression.svg Source: https://upload.wikimedia.org/wikipedia/commons/3/3a/Linear_regression.svg License: Public domain Contributors: Own work Original artist: Sewaqu
File:Mergefrom.svg Source: https://upload.wikimedia.org/wikipedia/commons/0/0f/Mergefrom.svg License: Public domain Contributors: ? Original artist: ?
File:Mw160883.jpg Source: https://upload.wikimedia.org/wikipedia/commons/1/15/Mw160883.jpg License: Public domain Contributors: N.P.G.: http://www.npg.org.uk/collections/search/portrait/mw160883/Karl-Pearson?LinkID=mp100188&role=sit&rNo=4 Original artist: Unknown
File:NYW-confidence-interval.svg Source: https://upload.wikimedia.org/wikipedia/commons/8/8f/NYW-confidence-interval.svg License: Public domain Contributors: Own work Original artist: Tsyplakov
File:Normal_Distribution_PDF.svg Source: https://upload.wikimedia.org/wikipedia/commons/7/74/Normal_Distribution_PDF.svg License: Public domain Contributors: self-made, Mathematica, Inkscape Original artist: Inductiveload
File:Nuvola_apps_edu_mathematics-p.svg Source: https://upload.wikimedia.org/wikipedia/commons/c/c2/Nuvola_apps_edu_mathematics-p.
svg License: GPL Contributors: Derivative of Image:Nuvola apps edu mathematics.png created by self Original artist: David Vignoni (original icon); Flamurai (SVG convertion)
File:Nuvola_apps_edu_mathematics_blue-p.svg Source: https://upload.wikimedia.org/wikipedia/commons/3/3e/Nuvola_apps_edu_mathematics_blue-p.svg License: GPL Contributors: Derivative work from Image:Nuvola apps edu mathematics.png and Image:Nuvola
apps edu mathematics-p.svg Original artist: David Vignoni (original icon); Flamurai (SVG convertion); bayo (color)
File:Nuvola_apps_filetypes.svg Source: https://upload.wikimedia.org/wikipedia/commons/4/44/Nuvola_apps_filetypes.svg License: LGPL Contributors: Image:Nuvola apps filetypes.png Original artist: Richtom80
File:Nuvola_apps_kalzium.svg Source: https://upload.wikimedia.org/wikipedia/commons/8/8b/Nuvola_apps_kalzium.svg License: LGPL
Contributors: Own work Original artist: David Vignoni, SVG version by Bobarino
File:P-value_in_statistical_significance_testing.svg Source: https://upload.wikimedia.org/wikipedia/commons/3/3a/P-value_in_statistical_
significance_testing.svg License: CC BY-SA 3.0 Contributors: File:P value.png Original artist: User:Repapetilto @ Wikipedia & User:
Chen-Pan Liao @ Wikipedia
File:P_philosophy.png Source: https://upload.wikimedia.org/wikipedia/commons/b/bb/P_philosophy.png License: CC-BY-SA-3.0 Contributors: ? Original artist: ?
File:People_icon.svg Source: https://upload.wikimedia.org/wikipedia/commons/3/37/People_icon.svg License: CC0 Contributors: OpenClipart Original artist: OpenClipart
File:Portal-puzzle.svg Source: https://upload.wikimedia.org/wikipedia/en/f/fd/Portal-puzzle.svg License: Public domain Contributors:
? Original artist: ?
File:Question_book-new.svg Source: https://upload.wikimedia.org/wikipedia/en/9/99/Question_book-new.svg License: Cc-by-sa-3.0
Contributors:
Created from scratch in Adobe Illustrator. Based on Image:Question book.png created by User:Equazcion Original artist:
Tkgd2007
File:R._A._Fischer.jpg Source: https://upload.wikimedia.org/wikipedia/commons/4/46/R._A._Fischer.jpg License: Public domain Contributors: http://www.swlearning.com/quant/kohler/stat/biographical_sketches/Fisher_3.jpeg Original artist: ?
File:Random-data-plus-trend-r2.png Source: https://upload.wikimedia.org/wikipedia/commons/7/77/Random-data-plus-trend-r2.png
License: CC-BY-SA-3.0 Contributors: ? Original artist: ?
File:Scatterplot.jpg Source: https://upload.wikimedia.org/wikipedia/commons/d/d8/Scatterplot.jpg License: CC-BY-SA-3.0 Contributors: No machine-readable source provided. Own work assumed (based on copyright claims). Original artist: No machine-readable author
provided. Marius xplore assumed (based on copyright claims).
File:Simple_Confounding_Case.svg Source: https://upload.wikimedia.org/wikipedia/commons/b/b8/Simple_Confounding_Case.svg
License: CC BY-SA 3.0 Contributors: Own work Original artist:
File:Svm_max_sep_hyperplane_with_margin.png Source: https://upload.wikimedia.org/wikipedia/commons/2/2a/Svm_max_sep_hyperplane_
with_margin.png License: Public domain Contributors: Own work Original artist: Cyc
File:Symbol_support_vote.svg Source: https://upload.wikimedia.org/wikipedia/en/9/94/Symbol_support_vote.svg License: Public domain Contributors: ? Original artist: ?
File:The_Normal_Distribution.svg Source: https://upload.wikimedia.org/wikipedia/commons/2/25/The_Normal_Distribution.svg License: Public domain Contributors: Transferred from en.wikipedia to Commons by Abdull. Original artist: Heds 1 at English Wikipedia
File:Tuberculosis_incidence_US_1953-2009.png Source: https://upload.wikimedia.org/wikipedia/commons/0/05/Tuberculosis_incidence_
US_1953-2009.png License: CC BY-SA 3.0 Contributors: Own work Original artist: Ldecola
File:Vickers_Warwick_B_ASR_Mk1_-_BV285.jpg Source: https://upload.wikimedia.org/wikipedia/en/f/f2/Vickers_Warwick_B_
ASR_Mk1_-_BV285.jpg License: Fair use Contributors: ? Original artist: ?
File:Wiki_letter_w_cropped.svg Source: https://upload.wikimedia.org/wikipedia/commons/1/1c/Wiki_letter_w_cropped.svg License:
CC-BY-SA-3.0 Contributors: This le was derived from Wiki letter w.svg: <a href='//commons.wikimedia.org/wiki/File:Wiki_letter_w.
svg' class='image'><img alt='Wiki letter w.svg' src='https://upload.wikimedia.org/wikipedia/commons/thumb/6/6c/Wiki_letter_w.svg/
50px-Wiki_letter_w.svg.png' width='50' height='50' srcset='https://upload.wikimedia.org/wikipedia/commons/thumb/6/6c/Wiki_letter_
w.svg/75px-Wiki_letter_w.svg.png 1.5x, https://upload.wikimedia.org/wikipedia/commons/thumb/6/6c/Wiki_letter_w.svg/100px-Wiki_
letter_w.svg.png 2x' data-le-width='44' data-le-height='44' /></a>
Original artist: Derivative work by Thumperward
File:Wikibooks-logo.svg Source: https://upload.wikimedia.org/wikipedia/commons/f/fa/Wikibooks-logo.svg License: CC BY-SA 3.0
Contributors: Own work Original artist: User:Bastique, User:Ramac et al.
File:Wikinews-logo.svg Source: https://upload.wikimedia.org/wikipedia/commons/2/24/Wikinews-logo.svg License: CC BY-SA 3.0
Contributors: This is a cropped version of Image:Wikinews-logo-en.png. Original artist: Vectorized by Simon 01:05, 2 August 2006
(UTC) Updated by Time3000 17 April 2007 to use official Wikinews colours and appear correctly on dark backgrounds. Originally
uploaded by Simon.
File:Wikipedia's_W.svg Source: https://upload.wikimedia.org/wikipedia/commons/5/5a/Wikipedia%27s_W.svg License: Public domain Contributors: Own work Original artist: Jonathan Hoefler
File:Wikiquote-logo.svg Source: https://upload.wikimedia.org/wikipedia/commons/f/fa/Wikiquote-logo.svg License: Public domain
Contributors: Own work Original artist: Rei-artur
File:Wikisource-logo.svg Source: https://upload.wikimedia.org/wikipedia/commons/4/4c/Wikisource-logo.svg License: CC BY-SA
3.0 Contributors: Rei-artur Original artist: Nicholas Moreau
File:Wikiversity-logo-Snorky.svg Source: https://upload.wikimedia.org/wikipedia/commons/1/1b/Wikiversity-logo-en.svg License:
CC BY-SA 3.0 Contributors: Own work Original artist: Snorky
File:Wikiversity-logo.svg Source: https://upload.wikimedia.org/wikipedia/commons/9/91/Wikiversity-logo.svg License: CC BY-SA
3.0 Contributors: Snorky (optimized and cleaned up by verdy_p) Original artist: Snorky (optimized and cleaned up by verdy_p)
File:Wiktionary-logo-en.svg Source: https://upload.wikimedia.org/wikipedia/commons/f/f8/Wiktionary-logo-en.svg License: Public
domain Contributors: Vector version of Image:Wiktionary-logo-en.png. Original artist: Vectorized by Fvasconcellos (talk contribs),
based on original logo tossed together by Brion Vibber
File:Wiktionary-logo-v2.svg Source: https://upload.wikimedia.org/wikipedia/commons/0/06/Wiktionary-logo-v2.svg License: CC BYSA 4.0 Contributors: Own work Original artist: Dan Polansky based on work currently attributed to Wikimedia Foundation but originally
created by Smurrayinchester

14.12.3 Content license

Creative Commons Attribution-Share Alike 3.0
