Quality With Statistics-2

Senior Management Programme
Quality With Statistics

Ahmedabad Electricity Company Ltd.
26-27 June 2002
Conducted by: Indian Statistical Institute
98 Sampatrao Colony, BARODA 390 007
Defining the Ideal Quality Value

1) List the six factors which you believe are the major determinants of quality
2) For each factor, place a rating on the following statements
M: Performance of the listed factor should be measured
R: The performance measure should be reported
R: The management should review the performance reports
I: Improvement actions should stem from the reviews
Rating scale: 5 = Always, 4 = Often, 3 = Occasionally, 2 = Rarely, 1 = Never
Scoring table columns: Factor | Measure | Report | Review | Improve | Total (with a Total row at the bottom)

Guidelines for scoring
1. Use same time scale: Frequency of improvement actions <= frequency of review <= frequency of reporting <= frequency of measurement.
2. A factor is measured always if it is measured as frequently as is practically possible.
3. Ideal value for each factor need not be 5. For example, scrap% may be measured every hour (always) but the ideal frequency may be every shift (often).
4. A factor which is not measurable (e.g. integrity) gets a score of 1 for all the four actions (M-R-R-I).
Defining the Real Quality Value

1) List the 6 factors you believe are the major determinants of quality
2) For each factor place a rating on the following statements
M: Performance of the listed factor is measured
R: The performance measure is reported
R: The management reviews the performance reports
I: Improvement actions stem from the reviews
Rating scale: 5 = Always, 4 = Often, 3 = Occasionally, 2 = Rarely, 1 = Never
Scoring table columns: Factor | Measure | Report | Review | Improve | Total (with a Total row at the bottom)

Guidelines for scoring
1. Use the same 6 factors and the time scale as was used while defining ideal quality value.
2. Real score can be equal to or more or less than the ideal score.
The Quality Value Grid
[Figure: grid of Behavior Score (vertical axis, 0 to 120) against Belief Score (horizontal axis, 0 to 120)]
Cutting to the Core
Behavior is a function of values: B = f(V)
Behavior: The way in which a person or group of people responds
Values: The complex of beliefs, ideals or standards which characterizes a person or group of people
The Cost of Remaining Average
Waste as a proportion of total sales volume
Typical Company: 30%
Your Area?
The Classical View of Performance
Practical Meaning of 99% Good
20,000 lost articles of mail per hour
Unsafe drinking water almost 15 minutes each day

5000 incorrect surgical operations per week
2 short or long landings at major airports each day
2,00,000 wrong drug prescriptions every year
No electricity for about 7 hours each month
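The water and electricity figures above follow from simple arithmetic: a process that is 99% good is bad 1% of the time. A minimal sketch of that calculation, assuming a 30-day month (an assumption made here, not stated on the slide) and a hypothetical helper name `bad_time`:

```python
def bad_time(total_time, good_fraction=0.99):
    """Time spent in the 'not good' state for a given good fraction."""
    return total_time * (1.0 - good_fraction)

# Unsafe drinking water: 1% of the minutes in a day.
unsafe_minutes_per_day = bad_time(24 * 60)

# No electricity: 1% of the hours in a 30-day month (assumed month length).
dark_hours_per_month = bad_time(30 * 24)

print(round(unsafe_minutes_per_day, 1), "minutes per day")   # 14.4
print(round(dark_hours_per_month, 1), "hours per month")     # 7.2
```

The other items on the slide depend on volumes (mail handled, operations performed) that the slide does not give, so they are not reproduced here.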
The Need for Knowledge
Knowledge
We don't know what we don't know
If we can't express what we know in terms of numbers, we really don't know much about it
The Need
If we don't know, we cannot act
If we cannot act, the risk of loss is high
If we know and act, the risk is managed
If we do know and do not act, we deserve the loss
"In God we believe, all else must have data" (Hewlett Packard)
The Role of Questions
Questions lead and answers follow. The same question most often leads to the same answers, which invariably produce the same result. To change the result means to change the question.
New measures lead to new questions. [Management needs to focus on new measures like .. rather than outputs and budgets].
As questions arise, vision emerges, direction becomes apparent and ambiguity diminishes. In turn, people become organized and mobilized to common action.
When people take common action, the organization's ability to survive and prosper will increase, owing to the discovery of answers to problems heretofore not known.
"Insanity is doing the same thing over and over again but expecting different results" (Rita Mae Brown, author)
The Value of Measurement
[Diagram: Measurement, Question, Search, Knowledge, Improved Measurement]
We don't know what we don't know
We can't act on what we don't know
We won't know until we search
We won't search for what we don't question
We don't question what we don't measure
Hence, we just don't know
(Mikel J. Harry)
The Role of Training
Undoubtedly the most important aspect of Quality is people and their knowledge. Without this golden asset all is for nothing. At the risk of redundancy, you don't know what you don't know, and if you don't know something nothing will happen. Obviously the key is knowledge. Successful change cannot occur without it.
Today, the best-in-class companies provide a tremendous amount of training and education to their employees. Many such companies have made significant investments in training, and are discovering the rewards. For example, Motorola Inc. has discovered a 10:1 return on their budget. In fact, they require every employee to receive 40 hours or more of training annually, of which 40% must be in the area of Quality.
What is Quality
Quality means different things to different people. There is
no universally accepted definition.
However, there is a broad agreement on the following
Very difficult to define
Determined by customer
Multi dimensional
Dynamic
Needs to be TOTAL
Usually, TOTAL QUALITY refers to the fact that all departments have roles in quality.
ISO 9000:2000 Definition of Quality
Degree to which a set of inherent characteristics fulfills requirements
Requirements are needs or expectations that are stated or implied
Requirements can be generated by different interested parties
Inherent characteristics are the distinguishing features that exist in the product/process/system, especially as a permanent characteristic
Inherent characteristics are called quality characteristics
Assigned characteristics (e.g. product price) are not quality characteristics
Note: This definition is an improvement over its 1994 version. However, it can still be argued that not all inherent characteristics are quality characteristics.
How to Measure Quality
Product Quality = Marketing Quality + Design Quality + Mfg. Quality + ... + Service Quality
Customer Satisfaction = f(Appropriateness of requirements, Degree of conformance to requirements, Cost of identifying and meeting the requirements)
How to Measure Quality (Contd.)
Customer satisfaction can be measured but it is not very useful as a stand-alone measure.
Establishing the function f is a highly challenging task.
Presently, all quality measures (e.g. Defect Rate, Process Capability, Quality Cost, Cycle Time) address only a part of the whole.
Points to remember
Quality is customer satisfaction but customer satisfaction is not quality
Reducing internal rejection and rework reduces the producer's cost but not that of the customer
Components of Quality

Main component: Quality of Design
Decides the level of customer attraction
Related to market segmentation based on product grade
Improving design quality may lead to higher cost but the same need not be the case always.
Subcomponents and examples of features:
- Product Design: Power rating of an engine, Robustness, Operating cost, Ease of use
- Process Design: Rated efficiency, Process capability, Cycle time, Downtime for regulatory inspection

Main component: Quality of Conformance
Refers to the deficiencies resulting from lack of control
Decides the level of customer dissatisfaction
Improving quality of conformance always leads to reduction of costs. It is in this sense that Crosby says "quality is free".
Subcomponents and examples of features:
- Process Conformance: Process instability, Process failures, Late deliveries, Loss of efficiency/yield
- Product Conformance: Field failures, Factory scrap and rework, Deviation from target, Incorrect invoices
Tasks
[Diagram: quality tasks supported by statistical tools]
- Quality of Design: System Design, Parameter Design, Tolerance Design
- Quality of Conformance: Process Monitoring and Adjustment, Problem Solving, Product Disposal
This Programme

Quality by | Tasks | Scope in AECL | Statistical Tools | This Programme
Product Design | System Design | Limited* | QFD, FMEA, Reliability Engineering | Nil
Product Design | Parameter Design | -do- | Statistical Designs, S/N Ratio, ANOVA | The concept of robustness only
Product Design | Tolerance Design | -do- | Statistical Designs, Loss Function, Simulation, Regression | Nil
Process Design | System Design | Limited** | Same as those mentioned against product design PLUS optimization tools for inventory management, transportation, scheduling etc. | Nil
Process Design | Parameter Design | Very High | -do- | The concept of robustness only
Process Design | Tolerance Design | High | -do- | Nil

* Applicable only for intermediate products and services
** Applicable mostly for management and service delivery processes
This Programme (Contd.)

Quality by | Tasks | Scope in AECL | Statistical Tools | This Programme
Process Conformance | Process Monitoring and Adjustment | Very High | Probability Distributions, Control Charts, GR&R Studies, PCA, Process adjustment methods | Principles and tools of process monitoring only
Process Conformance | Problem Solving | Very High | Simple tools like Histogram and C&E diagram, (Z, t, χ², F) tests, Advanced tools PLUS all the tools mentioned above | Concepts, disciplines and simple tools of problem solving
Product Conformance | Product Disposal | High | Bulk Sampling, Acceptance Sampling, Loss Function | Issues in bulk sampling only
Product Conformance | Field | High | Nil | Nil
Chapter 2: Data and Data Collection
Data
Data are facts or figures related to any characteristic of an individual
Characteristic: also called a variable
Individual: a m/c, a year, a casting, a dimension, a person
Power station outages (up to 31/03/01 since commissioning)

VARIABLES
Station
Date of
commissioning
Availability
(%)
No. of
outages
C:15
12/11/98
92.5
9
30
C:16
10/05/97
93.0
4
12/10/78
88.3
2
INDIVIDUALS
Average
duration of
non-stop
operation
(days)
Average loss per

outage (hours)
Main
cause of
outage
Capacity
utilization
Forced
Planned
27
64
52
Leakag
e
High
47
28
52
52
Leakag
e
Mod.
124
58
261
164
Gen*
V. Low
* Generator stator / rotor problem
21
Types of Data/Variable
Data/Variable
- Numerical/Quantitative: Continuous, Discrete
- Categorical/Qualitative: Ordinal, Nominal
Continuous: An infinite number of values (positive or negative) are possible, e.g. measurements of weight, length, chemical composition.
Discrete: The variable can take values 0, 1, 2, 3, ... e.g. count of frequency (# of defects, breakdowns etc.)
Ordinal: Data classified in ordered categories, e.g. quality of service provided is classified as poor, moderate, good or yearly rainfall classified as very low, low, moderate, good and very good.
Nominal: Data classified in categories having no inherent or explicit order, e.g. location classified as east, west, north, south or names of departments.
Types of Data - Outage Data

Example
Variable Name
Variable Type
1. Date of commissioning
2. Availability (%)
3. Number of outages since
commissioning
4. Average duration of non-stop
operation (days)
5. Average loss per outage
(hours)
6. Main cause of outage
7. Capacity utilization
23
Types of Data - Further Considerations
Continuous data may appear as discrete either due to rounding (see the outage data example) or due to measurement limitations. We should treat such data as continuous unless the number of levels in the data set is very small (say 2-4).
However, hourly records of steam pressure at turbine inlet (station F) show that the values are either 126 or 127 or 128. Great care must be exercised while analyzing such data.
Discrete data having seven or more levels may be treated as continuous data.
Dichotomous data (O.K/Not O.K, Pass/Fail etc.) may be treated as discrete data after coding the two categories as 1 (O.K) and 0 (Not O.K).
In the field of Quality Control, various types of data are classified as
- VARIABLE DATA: Continuous data
- ATTRIBUTE DATA: Others - Discrete, Dichotomous, Ordinal and Nominal
Henceforth we shall use this latter classification.
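The rules of thumb above can be written down as a small sketch. The function names and the string labels for data kinds are hypothetical, chosen only for illustration; the thresholds (2-4 levels, seven or more levels) are the ones stated in the text.

```python
def qc_data_class(kind):
    """Quality Control classification: continuous -> VARIABLE, all others -> ATTRIBUTE."""
    return "VARIABLE" if kind == "continuous" else "ATTRIBUTE"

def code_dichotomous(values, ok_label="O.K"):
    """Code O.K as 1 and anything else (Not O.K) as 0."""
    return [1 if v == ok_label else 0 for v in values]

def may_treat_as_continuous(kind, n_levels):
    """Apply the thresholds quoted in the text; other cases need judgment."""
    if kind == "continuous":
        return n_levels > 4       # "very small" number of levels means 2-4
    if kind == "discrete":
        return n_levels >= 7      # seven or more levels
    return False

# Steam pressure at turbine inlet takes only the values 126, 127, 128.
print(may_treat_as_continuous("continuous", 3))
print(code_dichotomous(["O.K", "Not O.K", "O.K"]))
print(qc_data_class("ordinal"))
```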
Data Gateway
[Diagram: Problem/Hypothesis -> DATA COLLECTION -> Data -> DATA ANALYSIS -> Solution/Fact]
Quality problems cannot be solved merely based on experience. Any claim not backed by data is only a hypothesis.
Data Gates: Quality of the data gates and their placement at appropriate locations of a process are extremely important for process control.
Data Quality: Data collection step is vital: garbage in, garbage out
Data Quality Scale
Most Data are of Poor Quality
Whenever you see data, doubt it

Quality category | Impact | Example | Rank*
Wrong data | Misleading information | Cooked data |
Noisy data | Potentially misleading information | High gauge R&R |
Irrelevant data | Useless information | Old data |
Inadequate data | Partial information | Small sample |
Hard data | Difficult to process | Censored data |
Redundant | Useful but adds to | Multiple | 6

* Higher the better
Information Content in Data for Process Control

Source of Data | Attribute Data | Variable Data
General literature | Very low | Low
Past data: In-house routine Q.C records | Low | Moderate
Past data: Statistically designed experiments | Moderate | High
Live data: Passive observation of the process | Moderate | High
Live data: Statistically designed experiments | High | Very High

Do not transform variable data to attribute data. That will be like burning diamond for heat.
Data Collection Process
[Diagram: A sample of individuals (Ind. 1, Ind. 2, Ind. 3, ..., Ind. n) is drawn from the population. Variables (Var. 1, Var. 2, Var. 3, ..., Var. p) are observed on each individual, giving an n x p table of data. The data pass through measurement, recording, and editing, storage and retrieval.]
Linking Data Quality to Data Collection Process
[Matrix: elements of the data collection process against data quality categories]
Process elements:
- Population: Individual, Variables
- Sample: Procedure, Size
- Measurement: Gauge, Appraiser, Others
- Recording: Format, Recorder
- Editing, Storage, Retrieval: Issues related to data base mgmt.
Data quality categories: Wrong, Inadequate, Hard, Irrelevant, Noisy, Redundant
Poor Data Quality - Cause and Effect Diagram
Effect: Poor Data Quality
Main causes and subcauses:
- Population: Individual, Variable
- Sample: Sampling Method, Size
- Measurement: Measurand, Gauge, Method, Appraiser
- Recording: Format, Recorder
- Editing, storage, retrieval: Operator, Software, Hardware, Data base Mgmt. policy
Note: Due to limitations of space, only the main subcauses are shown in the CE diagram.
Measurement Related Causes for Poor Data Quality
[Cause and effect diagram; effect: Poor Data Quality]
- Gauges
  Calibration: Not done, Done long back, Results not used
  Operation status: Breakdown, Malfunctioning
  Bias: Unstable, Not traceable
  Number: Many, Different makes, Variable least count
  Operating range: Beyond limit
  Capability: Low repeatability, Low least count (precision)
- Appraisers: Number, Reproducibility, Inadvertent error
- Measurand: Inhomogeneous, Type of data (unwanted)
- Method: Standard procedure (not available, not followed), Communication
Data Collection Planning - Principle of Inverse Loading
The Planning Questions (plan from question 1 down to 5; execute from 5 back up to 1)
1) What do you want to know?
2) How do you want to see what it is that you need to know?
3) What type of tool will generate what it is that you need to see?
4) What type of data is required of the selected tool?
5) Where can you get the required type of data?
Illustration
1) Has X any effect on Y?
2), 3) Histogram of Y for each of X1, X2, X3; scatter diagram of Y against X
4) For the histogram: observations Y11 ... Y1n, Y21 ... Y2p, Y31 ... Y3q under X1, X2, X3. For the scatter diagram: pairs (X1, Y1), ..., (Xn, Yn)
5) For the histogram: final inspection and production log book. For the scatter diagram: nowhere, to be collected
Data Collection Tools
The foregoing discussion indicates that collection of the right data is by no means a trivial task. One can go wrong in various ways at different stages of the data collection process.
The two basic requirements for data collection are
- Clarity of purpose
- Use of a structured approach
Commonly used data collection tools that satisfy the two requirements are
- Check Sheet
- Data Sheet
Check Sheet: Checks (/, ✓, x etc.) are made against a category of a variable or combination of categories of several variables. Used primarily for collecting attribute data.
Data Sheet: Measurement results are recorded against an individual and its characteristics. Used for collecting both attribute and variable data.
Many consider all check sheets as data sheets and vice versa. However, we shall distinguish between the two as above.
Process Distribution Check

Sheet
Power Generation Process (Moving
Target) Characteristic: Y1= Total generation (MW), Y2= System
Month:
September
Sampling
interval: Every 3.5 demand
hours
Target: Min(420,
Data: Target - Y1
Y1) Class
bar
Check
Process average (Y1 bar): 420

MW Total No. of observations:
206
Frq
Interval
<-54.99
-54.99 to
44.99
-44.99 to
34.99
Wasteful export
due to lack of
control
7
5
Export limit =
-10
-34.99 to
24.99
-24.99 to
14.99
Import limit =
+20
-14.99 to
04.99
-04.99 to
05.01
05.01 to
15.01
15.01 t0
Wasteful import
due to lack of
control
Defect rate = 27
%
8
12
6
16
342
Causes for Wasteful Import of Power
[Figure: Run chart of half-hourly readings of generation at station C15 in September 2001, with episodes marked A: Process failure, B: Process deficiency, C: Early slow down, D: Late pick up]
Defect Cause Check Sheet
Month: September, 2001
Data: # of hours of generation affected
Stations: C15, C16, D, E, F

Defect | Total
Process failure | 52
Process deficiency | 81
Early slowdown | 15
Late pick up | 34
Total | 182

Station totals (C15, C16, D, E, F): 54, 22, 65, 21, 20
Note: Criticality of the defects is not same over all stations
Identifying Critical Causes

for Wasteful Import
Hours of low
generation
C1 C1 D
5
6
PF
30
15
PD
11
36 14 11
ES
LP Process
11 11
PF=
PD=
Process
failure
ES=
Early
deficiency slow
LP=
downLate pick
up
Average generation loss at each

instant
C15 C16
D
E
F
PF
29.
0
29.
5
PD
10
ES
10
30
15
7
LP 10
4
Total generation loss
(MWH)
C15 C16
D
E
30
15
Tota
l
PF
870
160
5
725
320
0
PD
55
18
360
70
55
558
ES
20
270
30
328
LP
110
44
150
105
409
with
238 Statistics
795 190
449
5
Tota
l
105
Quality
70
5
107.0 103.5
110.
0
37
Other Types of Check Sheets

Defective item check sheet
Checks are made against various causes of rejection/rework of an item.
Defect location check sheet
Instead of a table a diagram is made of the defect space.

Checks are made at the location where defect occurs.
Locational segregation of defects, if any, provides valuable clue.
Leakage in a cooling system

Cracks in castings
Wear out of moving parts
Check-up confirmation check sheet
Used to make a comprehensive check-up of product/process quality (usually

at the final stage).
Preprinted items of checks avoid duplication and missing of tests to be performed.
It is a variation of check list, which is used for checking if all the tasks have been performed or not.
C-E diagram check sheet
Checks are made against the cause of a problem in the C-E diagram.
Data Sheet General Format

Title
Individu
al
Var. 1
Common relevant
information
Var.
2
Var. p
Remar
k
Ind. 1
Ind. 2
Ind. n
Notes:
Important summary of
data
39
Data Sheet - Example

Rak
e
N0.
Up-load detention report for the month of July,

Dat2001
Arrival Qu # of
For
For
Depar Depar Deten Demur Rea
e
time
a
lity
wag
ons
m
date
m
time
t.
date
t.
time
.
hours
.
hours
son
Actual
unloadin
g time Hr.
01
01
19.45
Envi
ro
58
02
05.3
5
02
15.30
09.55
09.00
20
14
07.50
Du.
hill
58
15
16.4
5
16
00.20
07.35
23
S(19)
+I(4)
14.30
42
31
20.20
14.45
Purpose
? Estimation of demurrage
hours of demurrage
Control
hours
Important reasons cited are receipt in quick succession, successive
detentions and wet coal. These are beyond the control of the coal
handling section.
Inadequate
Quality
Data!with Statistics
40
Chapter 3: Summarization of Data
Data Analysis: Getting Started
Half-hourly record of generation (MW) by station E during 19/9/01 (10 hrs.) to 21/9/01 (1.30 hrs.) under normal operating condition (80 readings)
102.8 105.2 103.2 104.0 105.0 105.0 104.0 104.0
103.2 104.2 102.0 103.6 105.2 106.0 105.0 103.0
104.2 105.8 105.4 104.8 106.0 104.0 104.2 103.8
103.4 104.4 104.4 104.2 104.8 102.8 103.6 104.8
104.0 104.0 104.0 104.0 103.0 104.8 102.8 104.0
104.0 103.4 106.0 104.4 105.2 105.0 105.2 105.2
105.2 103.8 103.2 104.8 104.4 104.8 104.4 104.4
103.4 104.4 104.8 106.0 105.0 103.0 105.2 104.0
106.2 104.8 104.0 103.6 102.4 105.6 106.4 105.2
103.0 105.2 102.2 106.4 104.0 102.6 104.0 102.8
What are your conclusions?
Frequency Distribution
- Analyzing a large data set on the same variable
Generation data set (previous slide)
The eighty observations are grouped in eight classes of equal length

Class Interval    Tally                         Frequency
101.7 to 102.3    ||                            02
102.3 to 102.9    ||||| |                       06
102.9 to 103.5    ||||| |||||                   10
103.5 to 104.1    ||||| ||||| ||||| ||||        19
104.1 to 104.7    ||||| ||||| |                 11
104.7 to 105.3    ||||| ||||| ||||| ||||| ||    22
105.3 to 105.9    |||                           03
105.9 to 106.5    ||||| ||                      07
Total                                           80

Does the frequency distribution provide better insight into the process?
Data are not information
DATA + ANALYSIS = INFORMATION
Constructing Frequency Distributions
- Variable Data
Data set
Number of observations (N): About 100 on the same variable.
Formation of the classes (first column)
Number of classes (k)
Too many classes obscure the pattern of the distribution due to sampling fluctuations. Details are lost with too few classes.
Optimum number of classes is given by k = 1 + 3.3 log10(N)
The simpler formula k = √N also works well in practice.
For better visual impact, it is preferable to have 5 ≤ k ≤ 12.
For the generation data set we have N = 80. Therefore, k = 1 + 3.3*log10(80) = 7.3. This means the number of classes should be either 7 or 8. We have chosen 7 classes.
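The two rules for the number of classes can be checked numerically. A minimal sketch; the function names are hypothetical, and the second rule is read as the square root of N.

```python
import math

def classes_log_rule(n):
    """k = 1 + 3.3 * log10(N)."""
    return 1 + 3.3 * math.log10(n)

def classes_sqrt_rule(n):
    """The simpler rule k = sqrt(N)."""
    return math.sqrt(n)

N = 80  # size of the generation data set
k_log = classes_log_rule(N)
k_sqrt = classes_sqrt_rule(N)

print(f"1 + 3.3*log10({N}) = {k_log:.2f}")   # 7.28
print(f"sqrt({N}) = {k_sqrt:.2f}")           # 8.94
```

Both values fall inside the preferred band of 5 to 12 classes.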
Constructing Frequency Distributions (..contd.)
Class width (h)
h = (R + w) / k
where R = Range of the observations = Maximum - Minimum
and w = Least count of measurement.
Next, h is rounded to the nearest integer multiple of w. This means, if the least unit of measurement (w) is 0.1, then h = 2.312 should be rounded to 2.3. However, if w = 0.2, then the same h should be rounded to 2.4.
In our generation data example, R = 106.4 - 102.0 = 4.4, and w = 0.2. Thus, h = (4.4 + 0.2) / 7 = 0.657, which is rounded to 0.6. We shall explain later why taking h = 0.7 will be erroneous.
Note that if h is rounded down then we shall need (k+1) classes to cover the whole range of the observations. How many classes shall we need if h is rounded up?
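The class width rule can be sketched as follows. The helper names are hypothetical; the rounding uses Python's built-in `round`, which rounds exact halves to the even neighbour, a detail the slides do not address.

```python
import math

def round_to_multiple(h, w):
    """Round h to the nearest integer multiple of the least count w."""
    return round(round(h / w) * w, 10)

def class_width(minimum, maximum, w, k):
    """h = (R + w) / k, rounded to a multiple of w, with R = maximum - minimum."""
    return round_to_multiple((maximum - minimum + w) / k, w)

def classes_needed(minimum, maximum, w, h):
    """Number of classes of width h needed to cover R + w."""
    return math.ceil(round((maximum - minimum + w) / h, 10))

# Rounding examples from the text.
print(round_to_multiple(2.312, 0.1))   # 2.3
print(round_to_multiple(2.312, 0.2))   # 2.4

# Generation data: minimum 102.0, maximum 106.4, w = 0.2, k = 7.
h = class_width(102.0, 106.4, w=0.2, k=7)
n_classes = classes_needed(102.0, 106.4, w=0.2, h=h)
print(h, n_classes)                    # 0.6 8
```

With h rounded down from 0.657 to 0.6, eight classes are needed, which is the k+1 noted in the text.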
Constructing Frequency Distributions (..Contd.)
Class limits
The minimum value of the generation data is 102.0 and the class width has been determined as 0.6. So we can form the classes as
102.0 to 102.6, 102.7 to 103.3, 103.4 to 104.0, . . .
The problem with the above classification is that there is a gap between two successive class intervals. This is not desirable since we are dealing with continuous data.
Discontinuity can be removed by forming the classes as
102.0 to 102.6, 102.6 to 103.2, 103.2 to 103.8, . . .
However, this classification has another problem. Suppose we have an observation 102.6. In which class shall we place it, first or second?
In order to avoid such confusion we take
Lower limit of the first class = Minimum - w/2
and then successively add the class width to this lower limit to obtain the other class limits.
Class limits (..Contd.)
Thus, for the generation data we have the classes as
101.9 to 102.5
102.5 to 103.1
103.1 to 103.7
103.7 to 104.3
104.3 to 104.9
104.9 to 105.5
105.5 to 106.1
106.1 to 106.7
Note that now we have
- 8 classes (since h has been rounded down from 0.657 to 0.6)
- no confusion in classification (since there are no observations which fall on the class limits) and
- an extended last class (ideally the upper limit of the last class should have been 106.5).
In the example, we have extended the first class instead of the last one since this has brought out the process abnormalities better. Thus the eight classes used are
101.7 to 102.3, 102.3 to 102.9, ..., 105.9 to 106.5
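The two sets of class limits above can be generated mechanically. A minimal sketch with a hypothetical helper `class_limits`; the values for minimum, w and h are those of the generation data example.

```python
def class_limits(start, h, n_classes):
    """Return (lower, upper) pairs for n_classes classes of width h."""
    edges = [round(start + i * h, 2) for i in range(n_classes + 1)]
    return list(zip(edges[:-1], edges[1:]))

minimum, w, h = 102.0, 0.2, 0.6

# Rule from the text: lower limit of the first class = minimum - w/2.
by_rule = class_limits(minimum - w / 2, h, 8)

# Classes used in the example: the first class is extended instead of the last.
as_used = class_limits(101.7, h, 8)

for (a, b), (c, d) in zip(by_rule, as_used):
    print(f"{a} to {b}    {c} to {d}")
```

Because the data are recorded in steps of 0.2 and every limit ends in an odd tenth, no observation can fall on a class limit in either scheme.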
Tally marking (second column)
Start with the first observation. Find the class to which the observation belongs. Put a tally against the class.
Classify all the remaining observations as above.
Tally marks are grouped in five, with the fifth tally crossed through the previous four tallies. This provides a better visual display and helps in counting the frequency of each class.
Note that all the observations get classified as we go through them only once. However, if we concentrate on a class and then try to find out the number of observations in the class then we have to go through the observations k times. This not only consumes more time but also increases the chance of committing an error.
Counting frequency (third column)
The frequency (f) of each class is obtained simply by counting the tallies.
Other columns
Columns giving cumulative frequency (f1, f1+f2, ..) and relative frequency (f1/N, f2/N, ..) may also be added, if required.
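The single-pass tallying described above can be sketched in a few lines, using the 80 generation readings and the eight classes from 101.7 to 106.5 with width 0.6. The variable names are illustrative only.

```python
from itertools import accumulate

generation = [
    102.8, 105.2, 103.2, 104.0, 105.0, 105.0, 104.0, 104.0,
    103.2, 104.2, 102.0, 103.6, 105.2, 106.0, 105.0, 103.0,
    104.2, 105.8, 105.4, 104.8, 106.0, 104.0, 104.2, 103.8,
    103.4, 104.4, 104.4, 104.2, 104.8, 102.8, 103.6, 104.8,
    104.0, 104.0, 104.0, 104.0, 103.0, 104.8, 102.8, 104.0,
    104.0, 103.4, 106.0, 104.4, 105.2, 105.0, 105.2, 105.2,
    105.2, 103.8, 103.2, 104.8, 104.4, 104.8, 104.4, 104.4,
    103.4, 104.4, 104.8, 106.0, 105.0, 103.0, 105.2, 104.0,
    106.2, 104.8, 104.0, 103.6, 102.4, 105.6, 106.4, 105.2,
    103.0, 105.2, 102.2, 106.4, 104.0, 102.6, 104.0, 102.8,
]

lower, h, k = 101.7, 0.6, 8

# Single pass: each observation adds one tally to the class it belongs to.
freq = [0] * k
for x in generation:
    freq[int((x - lower) / h)] += 1

cum_freq = list(accumulate(freq))                   # f1, f1+f2, ...
rel_freq = [f / len(generation) for f in freq]      # f1/N, f2/N, ...

for i, f in enumerate(freq):
    lo = round(lower + i * h, 1)
    hi = round(lo + h, 1)
    print(f"{lo} to {hi}  {'|' * f:<22}  {f:2d}  {cum_freq[i]:2d}  {rel_freq[i]:.3f}")
```

The loop visits each observation once, as the text recommends, and reproduces the frequencies 2, 6, 10, 19, 11, 22, 3, 7.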
Constructing Frequency Distributions
- Getting the class intervals right
Why class width (h) is rounded to nearest integer multiple of w
Consider the same generation data example. Here w = 0.2. Assume that h = 0.657 is rounded to 0.7 (which is not an integer multiple of 0.2) instead of 0.6. Thus the classes will be 101.9 to 102.6, 102.6 to 103.3, ..
Now in order to overcome the problem of classifying observations like 102.6, we are forced to consider w = 0.1 and have the classes as
101.95 to 102.65, 102.65 to 103.35, 103.35 to 104.05, 104.05 to 104.75, 104.75 to 105.45, 105.45 to 106.15, 106.15 to 106.85
Note that the number of observation units covered by each class is not the same. For example, the second class covers three units (102.8, 103.0 and 103.2) but the third class covers four units (103.4, 103.6, 103.8 and 104.0). As a result the frequency distribution is likely to show many peaks.
Balancing end points
Assuming w = 0.1, the seven classes shown above should be appropriate. However, note that the last class is extended by four units beyond the maximum observed value of 106.4. It is desirable to distribute this imbalance to the two end classes by starting the first class from 101.75 and ending at 106.65.
Frequency Distribution of The Generation Data: Further analysis
The frequency distribution shows an abnormal pattern (nearly alternating peaks). Does this mean the process mean is jumping randomly by about 1.2 units?
Following two frequency distributions constructed out of the same data provide some additional clues.

Fractional part | Frequency
.0 | 27
.2 | 18
.4 | 15
.6 | 5
.8 | 15
Total | 80
Noisy Data!!
0s occur more frequently at the cost of 6s. Does this indicate measurement bias?

Class interval | Frequency
101.7 to 102.7 | 04
102.7 to 103.7 | 17
103.7 to 104.7 | 26
104.7 to 105.7 | 25
105.7 to 106.7 | 08
Total | 80
Smooth pattern (left skewed). Smoothness has been achieved not only by reducing the number of classes but also by including the adjacent 0s and 6s in the same interval.
Histogram
Histogram is a graphical representation of a frequency distribution of variable data.
The histogram of the generation data having five classes is shown below.
[Figure: Histogram of generation in E station (MW); horizontal axis marked at 101.7, 103.7 and 105.7, vertical axis Frequency from 0 to 30]
Bars of equal width (= class width)
Heights of the bars are proportional to the frequencies of the classes
Bar width of about 1 cm. (7-10 classes)
Horizontal axis is about 1.6 times longer than the vertical axis
Central tendency: About 104.2.
Pattern of variation: Slightly left skewed
Specification limits: Should be shown wherever applicable.
Class mid-point: Marking the class mid-points may be helpful in certain cases.
Open ended classes: Avoid adding too many classes at the ends having zero or very low frequencies. Shown as open ended bars with arbitrarily reduced heights.
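A rough text rendering of the five-class histogram can help when no plotting tool is at hand. This is only a sketch: the class frequencies are the ones given in the five-class table of the generation data, and the helper name and bar symbol are arbitrary choices.

```python
# Five-class frequency distribution of the generation data (from the slides).
classes = [
    ("101.7 to 102.7", 4),
    ("102.7 to 103.7", 17),
    ("103.7 to 104.7", 26),
    ("104.7 to 105.7", 25),
    ("105.7 to 106.7", 8),
]

def text_histogram(classes, symbol="#"):
    """One line per class; bar length is proportional to the frequency."""
    return [f"{label}  {symbol * freq} {freq}" for label, freq in classes]

lines = text_histogram(classes)
total = sum(freq for _, freq in classes)

print("\n".join(lines))
print("Total", total)
```

The bars are drawn sideways, so the longer tail towards the low values appears at the top of the printout.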
Construction of Histogram
- An exercise
Half-hourly record of power (MW) generated by station E during 29.9.2001 (10.00 hours) to 30.9.2001 (24.00 hours) gives us the following data.
6.4 6.4 6.8 6.0 5.2 4.8 6.4 4.4 5.2 6.0
7.6 8.0 7.4 6.6 8.0 5.6 7.2 7.2 7.0 4.0
6.4 8.0 8.0 6.0 6.0 6.4 7.8 7.6 7.6 7.4
7.6 7.6 7.4 4.6 4.2 4.8 6.0 5.6 5.4 5.0
6.2 7.8 7.4 7.2 7.4 7.8 6.6 6.4 6.8 6.8
6.8 6.8 6.6 6.8 6.6 6.8 6.8 6.8 7.0 7.0
6.0 5.6 4.4 4.6 4.6 4.8 6.2 7.0 6.6 6.4
Construct a histogram of the above data set. Compare with the histogram for the period 19.9.01 to 21.9.01 (previous slide) and offer your comments.
Commonly Observed Histogram Patterns
- Single peak, symmetric, bell shaped: commonly observed pattern of a stable process (shown between LSL and USL)
- Single peak, thick tail. How?
- Single peak, positively skewed (long tail on the right)
- Single peak, negatively skewed (long tail on the left)
Many characteristics follow such patterns. We have already seen that generation data is negatively skewed while breakdown data is positively skewed. However, such shapes may also indicate process instability.
- Two peaks (bi-modal). How?
Frequency Distribution of Discrete Data
Number of plant outages in each year since commissioning

Station D, period 1978-79 to 2000-01
Forced: 2, 3, 1, 0, 3, 2, 1, 0, 2, 2, 0, 2, 3, 0, 2, 1, 2, 1, 1, 0, 1, 0, 2
Planned: 3, 5, 1, 4, 2, 5, 2, 1, 6, 3, 7, 7, 4, 7, 6, 5, 6, 4, 2, 2, 2, 6, 2
Station E, period 1985-86 to 2000-01
Forced: 2, 2, 5, 3, 0, 0, 1, 0, 1, 0, 2, 1, 1, 0, 1, 4
Planned: 15, 7, 8, 3, 7, 5, 2, 6, 3, 8, 7, 4, 5, 4, 3, 4
Station F, period 1988-89 to 2000-01
Forced: 4, 1, 1, 0, 0, 1, 1, 2, 0, 1, 0, 1, 6
Planned: 3, 11, 6, 12, 4, 0, 1, 2, 8, 2, 4, 4, 6

Ideally we should construct six frequency distributions (for each type of outage in each station). However, due to shortage of data we shall construct only two - one for forced outage and the other for planned outage.
What can you say about the occurrence of two types of outages from the above data set?
Line Graph and Bar Graph
Distribution of number of yearly outages of stations D, E and F since commissioning
[Figure: Line graph of forced outages and bar graph of planned outages; horizontal axis: number of outages in a year, vertical axis: frequency]
Line graph shows the frequencies of individual outcomes.
Bar graph is similar to the histogram, but there are gaps between the bars since we are dealing with discrete (attribute) data.
Planned outages occur more frequently than forced outages.
Number of planned outages is uniformly distributed between 2 and 7 with very few outages outside this band. Such a pattern is somewhat odd. Planned outages need to be defined properly. Do we undertake unnecessary planned outages?
Measures of Central Tendency: The Typical Value
Mean
Most effective measure for numerical data. Let {X1, X2, ..., XN-1, XN} be the data set. Then
Mean = X̄ = (X1 + X2 + ... + XN) / N = ΣXi / N
May be used for ordinal data but not for nominal data
Sensitive to extreme values
Median
Ordinal data: Category containing the (N+1)/2 th case
Numerical data: (N+1)/2 th ordered observation, when N is odd, and average of N/2 th and (N/2)+1 th ordered observations, when N is even.
Can be computed even for open ended classes at the extremes provided each of the end classes contains less than 50% of the observations.
Insensitive to outliers.
Mode
Category or the value occurring with greatest frequency
Only measure of center for nominal data
May not be unique and highly sensitive to how the classes or categories are formed.
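The three measures can be computed directly from their definitions. The sketch below uses the 23 yearly forced-outage counts of the first station in the outage table as sample data; the function names are illustrative, and Python's standard `statistics` module offers equivalent functions.

```python
from collections import Counter

def mean(xs):
    """Sum of the observations divided by N."""
    return sum(xs) / len(xs)

def median(xs):
    """(N+1)/2 th ordered value for odd N; average of the two middle values for even N."""
    s = sorted(xs)
    n = len(s)
    if n % 2 == 1:
        return s[(n + 1) // 2 - 1]
    return (s[n // 2 - 1] + s[n // 2]) / 2

def modes(xs):
    """All values occurring with the greatest frequency (the mode may not be unique)."""
    counts = Counter(xs)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)

forced = [2, 3, 1, 0, 3, 2, 1, 0, 2, 2, 0, 2, 3, 0, 2, 1, 2, 1, 1, 0, 1, 0, 2]

print(round(mean(forced), 2))   # 1.35
print(median(forced))           # 1
print(modes(forced))            # [2]
```

The three measures differ here (1.35, 1 and 2), which illustrates why the choice of measure matters for skewed count data.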
Interpretation of Mean
In a rising voltage test the alternating breakdown voltage (kV) of 24 samples of an insulation arrangement were found to be as follows:
210; 208; 208; 175; 182; 206; 190; 194; 198; 205; 212; 200; 205; 202; 207; 210; 202; 201; 188; 205; 209; 201; 216; 196
MEAN = [210 + 208 + ... + 216 + 196] / 24 = 201.25 kV
[Figure: Dot plot of the 24 values on a scale from 170 to 220, with the mean marked]
Mean is the balance point (or fulcrum) for the distribution of the values
Mean is analogous to centre of gravity
Sum of negative deviations from mean exactly equals the sum of positive deviations. Thus the total sum of the deviations from mean is always zero
In the above example, the mean should be interpreted as a measure of centre and not that of central tendency or typical value.
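The balance-point property can be verified on the 24 breakdown voltages. A minimal sketch; the variable names are illustrative.

```python
voltages = [210, 208, 208, 175, 182, 206, 190, 194, 198, 205, 212, 200,
            205, 202, 207, 210, 202, 201, 188, 205, 209, 201, 216, 196]

mean_kv = sum(voltages) / len(voltages)
deviations = [v - mean_kv for v in voltages]

# Magnitude of the deviations on either side of the mean.
below = -sum(d for d in deviations if d < 0)
above = sum(d for d in deviations if d > 0)

print(mean_kv)          # 201.25
print(below, above)     # 87.5 87.5
print(sum(deviations))  # 0.0
```

Ten values lie below the mean and fourteen above it, yet the two sums of deviations are equal, which is what makes the mean the fulcrum of the dot plot.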
Data Analysis: Getting Started
Reportable accidents (#) in AEC Ltd., Sabarmati during 1995-2000
Columns: Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec | Total
1995: 24 17 27 19 10 25 19 22 23 16 18 15 | 235
1996: 22 10 22 18 16 21 21 20 21 18 18 18 | 225
1997: 19 14 12 15 15 15 24 19 16 14 19 | 192
1998: 14 14 12 20 19 23 10 16 13 15 17 19 | 192
1999: 19 13 15 13 16 18 17 16 20 17 13 16 | 193
2000: 12 14 15 22 12 13 13 | 134
Total: 110 82 103 108 88 100 88 111 103 91 87 100 | 1171
What are your conclusions?
