
STATISTICS THEORY

STATISTICAL ANALYSIS OF MANUFACTURING DATA


Statistics Theory
High volume manufacturing production, unlike prototype design work, typically involves
repeating the same machining operations and processes hundreds, thousands, or millions of
times during a given product's or product family's production run. Understanding the failure
mechanisms in a product's tooling and improving the efficiency of these operations by adjusting
manufacturing parameters can reduce tool wear of indexable inserts, milling cutters, reamers,
and twist drills; improve speed, feed, and power consumption profiles; reduce machine tool
accuracy drift; and reduce lubrication and other maintenance-related failures. Improving these
and other related processes, by even a small amount, can yield large cost savings in
high-volume production environments.
The first step is to take measurements and test the values of production processes so
that patterns can be found. Most testing procedures include the collection and tabulation of
experimental data. Without statistical analysis and interpretation, it would be
impossible to know whether the testing was comprehensive enough to support valid
experimental conclusions that can then be used to make manufacturing process changes.
Statistical Distribution Curves. Statistical analysis depends on the type of statistical
distribution that applies to the properties of the data being examined.
Six statistical distributions are commonly used: 1) Normal; 2) Log Normal; 3) Exponential;
4) Binomial; 5) Weibull; and 6) Poisson.
Normal Distribution Curve. The normal distribution is the most widely used and
best-understood statistical distribution. It is used to model mechanical, physical, electrical, and
chemical properties which scatter randomly about a well-defined mean value without either
positive or negative bias. This curve is frequently called a bell curve. The following describes
the characteristics of the normal distribution curve.
Statistical Analysis. Statistical analysis of data is an important scientific and engineering
tool that defines the characteristics of samples (a limited number of observations, trials, data
points, etc.). If a sample of data is randomly selected from the population, its statistical
characteristics converge towards those of the population as the sample
size increases. Because economic constraints, such as testing time and cost, prevent a large
number of repeat tests, it is important to understand how a sample of data
approximates the real population of data. The following parameters must be calculated to
evaluate the sample of data with respect to the population of data:
X̄ = Sample mean
S = Sample standard deviation
V = Coefficient of variation
Ax = Absolute error of the sample mean
Rx = Relative error of the sample mean
t = Critical value of t-distribution (or Student's distribution)
μ = Population mean
X̄ ± t·Ax = Confidence interval for the population mean

Sample Mean, (X̄): The sample mean, sometimes called the measure of average, is a
value about which the data are centered. There are several types of such average
measures, the most common of which is the arithmetic mean, or the sample mean. The sample
mean X̄ is calculated as:

\[ \bar{X} = \frac{1}{n} \sum_{i=1}^{n} x_i \tag{1} \]

where x_i = individual data point
n = number of data points
Sample Standard Deviation, (S) is a measure of the dispersion of data about the sample mean
X̄. The sample standard deviation is calculated by the formula:

\[ S = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n - 1}} \tag{2} \]

where n − 1 = the number of degrees of freedom (d.f.)


Degrees of freedom, (d.f.) can be defined as the number of observations made in
excess of the minimum needed to estimate a statistical parameter or quantity. For example, only
one measurement is required to identify the width of an indexable insert's flank wear that
occurred while machining a workpiece. If the measurement is repeated seven times, then the
sample variance of the flank wear measurements has six degrees of freedom.
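As an illustration of Equations (1) and (2) and of the degrees-of-freedom count, the following is a minimal Python sketch. The seven flank wear values and the names `flank_wear_mm`, `sample_mean`, and `sample_std` are hypothetical and are not taken from the text.

```python
import math

# Hypothetical flank wear widths (mm) from seven repeated measurements.
flank_wear_mm = [0.21, 0.24, 0.19, 0.23, 0.22, 0.25, 0.20]


def sample_mean(data):
    """Arithmetic mean of the data points, Equation (1)."""
    return sum(data) / len(data)


def sample_std(data):
    """Sample standard deviation with n - 1 degrees of freedom, Equation (2)."""
    n = len(data)
    mean = sample_mean(data)
    squared_deviations = sum((x - mean) ** 2 for x in data)
    return math.sqrt(squared_deviations / (n - 1))


degrees_of_freedom = len(flank_wear_mm) - 1  # seven measurements give six d.f.
mean_wear = sample_mean(flank_wear_mm)
std_wear = sample_std(flank_wear_mm)

print(f"d.f. = {degrees_of_freedom}")
print(f"sample mean = {mean_wear:.4f} mm")
print(f"sample standard deviation = {std_wear:.4f} mm")
```

Dividing by n − 1 rather than n is what distinguishes the sample standard deviation from the population formula.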
Coefficient of Variation, (V) is used to evaluate or control the variability in data points.
The coefficient of variation is calculated by dividing the sample standard deviation S by the
sample mean X̄ and expressing the result in per cent:

\[ V = \frac{S}{\bar{X}} \times 100\% \tag{3} \]

Absolute Error of the Sample Mean, (Ax) is calculated by dividing the sample standard
deviation by the square root of the number of data points. The result is expressed in the same
unit of measure as the sample standard deviation and the sample mean:

\[ A_x = \frac{S}{\sqrt{n}} \tag{4} \]

Relative Error of the Sample Mean, (Rx) is calculated by dividing the absolute error of
the sample mean by the sample mean and expressing the result in per cent:

\[ R_x = \frac{A_x}{\bar{X}} \times 100\% \tag{5} \]
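The three quantities defined above (coefficient of variation, absolute error, and relative error of the sample mean) can be sketched in Python as follows. The tool life figures (mean 40 min, standard deviation 12 min, 16 tests) and all function names are assumptions chosen only to make the arithmetic easy to follow.

```python
import math


def coefficient_of_variation(std, mean):
    """V: sample standard deviation over sample mean, in per cent."""
    return std / mean * 100.0


def absolute_error(std, n):
    """Ax: sample standard deviation over the square root of n."""
    return std / math.sqrt(n)


def relative_error(abs_err, mean):
    """Rx: absolute error over sample mean, in per cent."""
    return abs_err / mean * 100.0


# Hypothetical tool life results: mean 40 min, standard deviation 12 min, 16 tests.
mean_life = 40.0
std_life = 12.0
n_tests = 16

v = coefficient_of_variation(std_life, mean_life)
ax = absolute_error(std_life, n_tests)
rx = relative_error(ax, mean_life)

print(f"V = {v:.1f}%, Ax = {ax:.2f} min, Rx = {rx:.2f}%")
```

Note that Ax carries the unit of the measurements (minutes here), while V and Rx are dimensionless percentages.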

Critical Value of t-Distribution (Student's distribution): The t-distribution was
discovered in 1908 by W. S. Gosset, who wrote under the name Student. The critical value of t
depends on the number of degrees of freedom and the probability of error. If a 95% two-sided
confidence level is used for statistical analysis, then the probability of error is 5%, or 2.5%
per side. A 5% probability of error provides practical accuracy, which is commonly acceptable
in various engineering calculations.

For a 5% probability of error, the critical value of t-Distribution can be determined from
Table 1, page 151, at the intersection of the column under the heading t0.025 and the row
corresponding to the number of degrees of freedom shown in the column heading d.f.
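Table 1 is not reproduced in this excerpt. As a rough stand-in, the lookup described above can be sketched in Python with a small dictionary of standard two-sided 95% critical values (upper-tail area 0.025). The dictionary `T_0025` and the function `critical_t` are hypothetical names, and the values are the widely published ones rather than a copy of Table 1.

```python
# Standard critical values of t for an upper-tail area of 0.025
# (two-sided 95% confidence), keyed by degrees of freedom.
# Only a few rows are listed; a full table covers every d.f.
T_0025 = {
    1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
    6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228,
    15: 2.131, 20: 2.086, 30: 2.042,
}


def critical_t(degrees_of_freedom):
    """Return t0.025 for the given d.f., or raise if the row is not listed."""
    if degrees_of_freedom not in T_0025:
        raise KeyError(f"no tabulated t0.025 for d.f. = {degrees_of_freedom}")
    return T_0025[degrees_of_freedom]


# Seven data points give 7 - 1 = 6 degrees of freedom.
t_value = critical_t(7 - 1)
print(f"t0.025 at 6 d.f. = {t_value}")
```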
Population Mean (μ): The normal distribution has two parameters: the population mean μ
and the population standard deviation σ. The sample mean X̄ is an estimate of the population
mean (μ ≈ X̄), and the sample standard deviation is an estimate of the population standard
deviation (σ ≈ S). A graph of the normal distribution is symmetric about its mean μ. Virtually
all of the area (99.74%) under the graph is contained within the interval:

\[ [\mu - 3\sigma,\ \mu + 3\sigma] \]

Thus, almost all of the probability associated with a normal distribution falls within three
standard deviations of the population mean μ. Also, 95.44% of the area falls within two
standard deviations of μ, and 68.26% within one standard deviation.
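The areas quoted above can be checked numerically. The sketch below uses the identity that, for a normal distribution, the probability of falling within k standard deviations of the mean is erf(k/√2); the function name `coverage_within` is an assumption. The computed values agree with the quoted figures to within about 0.01 percentage point, a difference that is consistent with rounding in printed tables.

```python
import math


def coverage_within(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2.0))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {coverage_within(k) * 100:.2f}%")
```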
Confidence Interval for the Population Mean: The degree of confidence associated with
a confidence interval or limit is known as its confidence level. Confidence levels of 90%, 95%,
and 99% are commonly used. For example, a 95% lower confidence limit for the unknown population
mean, estimated by use of the sample mean and sample standard deviation, provides a value
above which the unknown population mean is expected to lie with 95% confidence.
Equations (1) through (5) describe a sample mean that is only an estimate of the true
(population) mean. Therefore, it is important to define a confidence interval that determines a
range within which the population mean lies. Such an interval depends on the sample mean, X̄,
the absolute error of the sample mean, Ax, and the t-distribution (Student's) value. A confidence
interval for the population mean satisfies the inequality:

\[ \bar{X} - t A_x < \mu < \bar{X} + t A_x \tag{6} \]
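A minimal Python sketch of the interval X̄ − t·Ax < μ < X̄ + t·Ax follows. The flank wear data and the function name `confidence_interval` are hypothetical; the value 2.447 is the standard t0.025 critical value for six degrees of freedom.

```python
import math


def confidence_interval(data, t_critical):
    """Return (lower, upper) limits: sample mean minus/plus t times Ax."""
    n = len(data)
    mean = sum(data) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    abs_err = std / math.sqrt(n)  # absolute error of the sample mean, Ax
    return mean - t_critical * abs_err, mean + t_critical * abs_err


# Hypothetical flank wear widths (mm); seven points give six d.f.
flank_wear_mm = [0.21, 0.24, 0.19, 0.23, 0.22, 0.25, 0.20]
T_0025_DF6 = 2.447  # standard two-sided 95% critical value at 6 d.f.

lower, upper = confidence_interval(flank_wear_mm, T_0025_DF6)
print(f"95% confidence interval: {lower:.3f} mm to {upper:.3f} mm")
```

With more data points both Ax and the critical value of t shrink, so the interval narrows around the sample mean.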
Applying Statistics
Minimum Number of Tests, or Data Points. The minimum number of data points, which
represents the sample size, can be determined through the formulas for the coefficient of
variation V, Equation (3), the absolute error of the sample mean Ax, Equation (4), and the relative
error of the sample mean Rx, Equation (5).
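According to Equation (4), the absolute error of the sample mean is:

\[ A_x = \frac{S}{\sqrt{n}} \]

The other expression for the absolute error of the sample mean, from Equation (5), is:

\[ A_x = \frac{R_x \bar{X}}{100\%} \tag{7} \]

Because the left-hand sides of Equations (4) and (7) are equal, the right-hand sides
are also equal and, therefore:

\[ \frac{S}{\sqrt{n}} = \frac{R_x \bar{X}}{100\%} \tag{8} \]

Solving for √n in Equation (8) produces:

\[ \sqrt{n} = \frac{S}{\bar{X}} \cdot \frac{100\%}{R_x} \tag{9} \]

Because S/X̄, expressed in per cent, is the coefficient of variation V, see Equation (3), then:

\[ \sqrt{n} = \frac{V}{R_x}, \qquad n = \frac{V^2}{R_x^2} \tag{10} \]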

The coefficient of variation must be known or selected according to
previously collected data of a similar kind, or, if necessary, preliminary tests should be
conducted to estimate its value. Based on numerous studies of cutting tool performance and
publications on mechanical properties of cutting tool materials, values of the coefficient of
variation within 25 to 45% are considered typical. A relative error of the sample mean
between 6 and 12% is also considered typical. The coefficient of variation and the relative error
are used to estimate how many tests are required. For example, if V = 30% and Rx = 8%, then
the number of tests required is n = 30²/8² ≈ 14.
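The estimate n = V²/Rx² is easy to script. The sketch below reproduces the example with V = 30% and Rx = 8%; the function name `minimum_tests` is hypothetical.

```python
def minimum_tests(v_percent, rx_percent):
    """Estimated number of tests, n = V^2 / Rx^2, with both inputs in per cent."""
    return (v_percent / rx_percent) ** 2


n_exact = minimum_tests(30.0, 8.0)
n_tests = round(n_exact)  # rounded to the nearest whole test, as in the example

print(f"n = {n_exact:.4f}, rounded to {n_tests} tests")
```

The exact value is slightly above 14, so rounding up with `math.ceil` would give 15 tests. Whether to round to the nearest integer or always upward is a judgment call that depends on how strictly the target relative error must be met.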
Comparing Products with Respect to Average Performance. Lab and field tests are usually
conducted to compare the average performance of two or more products. The term average
performance refers to a quantitative value, which can be any mechanical, physical, or chemical
characteristic of a product, for example, the average tool life of drills and indexable cutting
inserts or the average hardness of cemented carbide grades. The products may differ in
manufacturing procedure (CVD or PVD coatings), in chemical composition (alloying elements
and their amounts), and in other parameters. Data collected during the experiments must be
statistically treated to determine whether or not the products have the same performance
characteristics, for example, whether there is a significant difference in the sample means.
Statistical treatment of data obtained from experiments with two products includes the following
steps:
a) Calculation of the sample means X̄1 and X̄2 using Equation (1)
b) Calculation of the sample standard deviations S1 and S2 using Equation (2)
c) Calculation of a weighted, or pooled, standard deviation using the following formula:

\[ S_p = \sqrt{\frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}} \tag{11} \]

where n1 and n2 are the numbers of data points for products 1 and 2, respectively.
d) Selection of a confidence level. A 95% two-sided confidence level is recommended.
At this confidence level, the probability of error is 2.5% per side. The values of the t-distribution
versus degrees of freedom (d.f.) are provided in Table 1, and for a 95% confidence level are
located in the column under the heading t0.025 with respect to the given degrees of freedom
(d.f. = n1 + n2 − 2).
e) Calculation of the Decision Criterion (d.c.) using the following formula:

\[ \text{d.c.} = t_{0.025} \, S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \tag{12} \]

f) Comparison of the value of the Decision Criterion with the difference of the sample
means: take X̄1 − X̄2 if X̄1 > X̄2, or X̄2 − X̄1 if X̄2 > X̄1.
The difference in the products' average performance is statistically significant if the
difference in the two sample means is greater than the Decision Criterion, i.e.

\[ |\bar{X}_1 - \bar{X}_2| > \text{d.c.} \tag{13} \]
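Steps a) through f) can be sketched end to end in Python. This is an illustration under stated assumptions, not the handbook's own procedure: the tool life data for the two coatings are hypothetical, the decision criterion is taken as t0.025 times the pooled standard deviation times √(1/n1 + 1/n2) (the usual two-sample form), and 2.228 is the standard t0.025 value for 10 degrees of freedom.

```python
import math


def sample_mean(data):
    return sum(data) / len(data)


def sample_std(data):
    mean = sample_mean(data)
    return math.sqrt(sum((x - mean) ** 2 for x in data) / (len(data) - 1))


def pooled_std(s1, n1, s2, n2):
    """Pooled standard deviation of two samples, weighted by degrees of freedom."""
    return math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))


def decision_criterion(t_critical, s_pooled, n1, n2):
    """d.c. = t * Sp * sqrt(1/n1 + 1/n2) (assumed two-sample form)."""
    return t_critical * s_pooled * math.sqrt(1.0 / n1 + 1.0 / n2)


# Hypothetical tool life (min) for inserts with two different coatings.
coating_a = [42.0, 38.0, 45.0, 40.0, 44.0, 41.0]
coating_b = [35.0, 33.0, 39.0, 36.0, 34.0, 37.0]
T_0025_DF10 = 2.228  # standard t0.025 for d.f. = 6 + 6 - 2 = 10

n1, n2 = len(coating_a), len(coating_b)
s_p = pooled_std(sample_std(coating_a), n1, sample_std(coating_b), n2)
dc = decision_criterion(T_0025_DF10, s_p, n1, n2)
difference = abs(sample_mean(coating_a) - sample_mean(coating_b))
significant = difference > dc

print(f"difference = {difference:.2f} min, d.c. = {dc:.2f} min")
print("statistically significant" if significant else "not statistically significant")
```

Taking the absolute value of the difference covers both orderings in step f), so the comparison does not depend on which product is labeled 1 or 2.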
