Measure Module Introduction

Quality is either good or bad, and determining which requires measurement. Since measurement is the foundation of any science, some would say the quest for quality begins with measurement. A measurement is a numerical value assigned to an element to convey information about that element. In the Measure phase, expect to measure the existing system and establish a reliable method for monitoring progress toward a goal. Data is information used as the basis for reasoning, discussion, or calculation. The data should be bias-free, reliable, reproducible, repeatable, stable, and valid. To frame this concept, the ASQ Body of Knowledge provides the following topics:

Process analysis and documentation
- Develop and review process maps, written procedures, work instructions, flowcharts, etc.
- Identify process input variables and process output variables, and document their relationships through cause and effect diagrams, relational matrices, etc.

Probability and statistics
- Distinguish between enumerative (descriptive) and analytical (inferential) studies, and distinguish between a population parameter and a sample statistic.
- Define the central limit theorem and describe its significance in the application of inferential statistics for confidence intervals, control charts, etc.
- Describe and apply concepts such as independence, mutually exclusive, multiplication rules, etc.

Collecting and summarizing data
- Identify and classify continuous (variables) and discrete (attributes) data.
- Describe and define nominal, ordinal, interval, and ratio measurement scales.
- Define and apply methods for collecting data such as check sheets, coded data, etc.
- Define and apply techniques such as random sampling, stratified sampling, sample homogeneity, etc.
- Define, compute, and interpret measures of dispersion and central tendency, and construct and interpret frequency distributions and cumulative frequency distributions.
- Depict relationships by constructing, applying, and interpreting diagrams and charts such as stem-and-leaf plots, box-and-whisker plots, run charts, scatter diagrams, Pareto charts, etc.
- Depict distributions by constructing, applying, and interpreting diagrams such as histograms, normal probability plots, etc.

Probability distributions
- Describe and interpret normal, binomial, Poisson, chi square, Student's t, and F distributions.

Measurement system analysis
- Calculate, analyze, and interpret measurement system capability using repeatability and reproducibility (Gauge R&R), bias, linearity, percent agreement, and precision/tolerance (P/T).

Process capability and performance
- Identify, describe, and apply the elements of designing and conducting process capability studies, including identifying characteristics, identifying specifications and tolerances, developing sampling plans, and verifying stability and normality.
- Distinguish between natural process limits and specification limits, and calculate process performance metrics such as percent defective.
- Define, select, and calculate Cp and Cpk, and assess process capability.
- Define, select, and calculate Pp, Ppk, and Cpm, and assess process performance.
- Describe the assumptions and conventions that are appropriate when only short-term data are collected and when only attribute data are available.
- Describe the changes in relationships that occur when long-term data are used, and interpret the relationship between long- and short-term capability as it relates to a 1.5 sigma shift.
- Compute the sigma level for a process and describe its relationship to Ppk.

Process Analysis and Documentation: Learning Objectives


At the end of this Measure topic, all learners will be able to:
- develop and review process maps, written procedures, work instructions, flowcharts, etc.
- identify process input variables and process output variables (SIPOC), and document their relationships through cause and effect diagrams, relational matrices, etc.

Written Documentation
Unfortunately, many organizations either lack written documentation or have poorly written and poorly organized materials. Written procedures and work instructions benefit organizations in many ways, including:
- increased accuracy
- increased documentation use
- increased compliance
- increased process consistency
- decreased performance errors
- increased empowerment of users to solve problems
- improved communication within an organization
- improved customer communication
- decreased training time

Although generally lacking specific details, written procedures describe the process at a general/conceptual level. These procedures help to conceptualize the entire operation and the individual processes within the operation. Written instructions, on the other hand, provide step-by-step details sequencing particular tasks within a process.

Auditing Documentation
Examining existing written documentation helps clarify the current condition and may provide clues to quality issues. To start an audit of written documentation, obtain a copy of the procedures, study them, and construct a process map (flowchart) to display the actions. Following the initial study, research the questions in each of the areas below.

Document

- Is the procedure available to users?
- Is the procedure easily accessible to the users?
- Is the procedure in legible condition?
- Does the procedure show evidence of use?
- Does the procedure use a legible font type and font size?
- Is the procedure written in easy-to-understand language for the user?
- Is each step clear?

Use

- Does the documentation match what users actually do?
- Are the users doing what it says to do?
- Are the users ignoring any steps?
- Are the users adding any steps?
- Do users refer to the procedures? If so, when, how, and why?
- Do users document that they follow procedures?
- Do physical abilities such as eyesight, hearing, and strength need to be re-evaluated?

Users

- Have users read the written procedures?
- Do users have a common understanding of the specifics?
- Do users agree that all steps in the procedure are correct?
- Do users think it is important to management that they follow directions?
- Were users involved in creating the procedures?
- Did users review the procedures before implementation?
- Are users properly trained in using the documentation?

Operation

- Do users know how to revise a procedure?
- How often are procedures revised?
- Do users have ideas on improving the procedures?
- Has training kept pace with the job requirements?
- Has training kept pace with technology?

Process Maps and Flow Charts


Symbols are used to define certain types of steps in a flowchart: rectangles for most steps and diamonds for decisions. A sample process flowchart is shown below. Each symbol on the map can have additional information added to it, such as inputs and outputs.

Benefits of a Process Map


A well-developed process map yields several benefits:
- Visually represents how the process works
- Supports the identification of disconnects and non-value-added steps
- Helps the team better understand the process
- Enables the discovery of problems or miscommunications
- Helps define the boundaries of the process
- Identifies process inputs and outputs
- Assists in recognition of process bottlenecks and opportunities for improvement

Process maps also serve functions in other phases of DMAIC:
- Improve: define and communicate the proposed changes to the process
- Control: document the revised process

Creating Process Maps


Procedure
Materials needed: yellow sticky notes, notepads or flipchart paper; marking pens
1. Define (identify) the process.
2. Brainstorm the activities involved in the process.
3. Arrange the activities in proper sequence.
4. Determine inputs and outputs.
5. Identify time lags and non-value-added steps.
6. Once the sequence is agreed upon, draw arrows to show the flow.

Use When
- Developing an understanding of the steps in a process
- Studying a process for improvement
- Communicating how the process works
- Documenting a process

User Tips
- Focus on identifying the process before worrying about correctly drawing the process map.
- Focus on those areas that appear complex, with an excessive number of potential decision points or delays.
- Look for duplication, redundancy, complexity, or too many handoffs in the process.
- Ask the following types of questions:
  o Why are we performing the task in this manner?
  o Does the current process deviate from the designed process? Why?
  o What are the value-added activities?
  o What are the non-value-added activities?
  o How much time, money, or work hours are required for each task? These may be the outputs (Ys) of the steps in the process.

SIPOC
Purpose
SIPOC (suppliers, inputs, process, outputs, customers) is a tool for identifying all of the elements involved in a process improvement project. SIPOC is similar to process mapping.

Benefits
- Clarifies the important chain involved in the process
- Identifies who the process serves, the inputs required for a successful process, who provides the required inputs, the necessary steps for completing the process, and the results delivered by the process
- Defines a complex project
- Is easy to complete

Using SIPOC
Procedure
1. Begin with the process by identifying its steps.
2. Identify the outputs of the process.
3. Identify the customers who will receive the outputs from the process.
4. Identify the inputs needed by the process.
5. Identify the suppliers of the required inputs.

Fishbone Diagram
Description
Created by Kaoru Ishikawa, fishbone diagrams are a problem-analysis tool for identifying, sorting, and displaying as many possible causes as can be found for an effect or problem. The tool is also called a cause-and-effect diagram or Ishikawa diagram.

Benefits
- Sorts the ideas into useful categories
- Breaks down ideas into smaller chunks
- Shows the interaction between various causes
- Displays the information as a graphic
- Encourages group participation
- Helps identify areas to collect data for further study

Fishbone Diagram Procedure


1. Develop a problem statement by identifying the effect or symptom.
   o Write it at the center right (effect box/head of the fish).
   o Draw a box around it and draw a horizontal arrow running to it.
2. Identify the potential causes.
   o Manufacturing commonly uses the 6 Ms as categories:
     Methods (process, documentation, and procedures)
     Machines (equipment)
     Manpower (people)
     Materials
     Measurement
     Mother Nature (environment)
   o Non-manufacturing may use:
     People
     Policies
     Procedures (process)
     Equipment
   o The 4 Ps:
     Place
     Procedure
     People
     Policies
   o The Right Stuff:
     Right tools
     Right materials
     Right instructions
     Right supervision
     Right feedback
3. Write the categories of causes as branches from the main arrow.
4. Brainstorm the possible causes for each area.
5. Continue to ask "Why?" for each cause and record the answer as a sub-cause branching off the cause.
6. When the group runs out of ideas, focus attention on places on the chart where ideas are few.
7. Return to each cause to prioritize the list.

Relational Matrices Benefits


- The process helps team members identify and agree upon the outputs critical to the product and/or customer.
- Levels of importance are assigned to each output variable (using a numerical rating).
- The effect of each input (X) on each output (Y) is determined and assigned a numerical value.
- The relationship between inputs and outputs [Y = f(x)] is determined.
- For process maps, the relative importance of inputs is determined.

Creating Relational Matrices: Procedure


1. Review the process map. The group should consider involving the customer when defining and rating the Ys.

2. List the output variables (Ys) across the top, along the horizontal axis.
3. Rate each output below the output variables (green box) in terms of its overall importance to the customer. In this example, a scale of 1 (low importance) to 5 (high importance) is used. Other scales may also be used.
4. Identify potential inputs (Xs) that can impact the various outputs (Ys). List them on the left side, along the vertical axis. The Xs should come directly from the process map.
5. Rate the effect of each X on each Y (red association table). In the following example, a scale of 0 (no relationship), 1 (weak relationship), 3 (moderate relationship), or 9 (strong relationship) is used. The rating is based on how much effect that particular input has on the quality of its corresponding output. Other scales may also be used.
6. The customer importance rating for each Y serves as a weight that is multiplied by the association rating for each X-Y relationship.
7. Sum the weighted ratings to get an importance score for each input. Rank these scores from highest to lowest. Use the results to analyze and align future team activities, prioritizing where the team can begin its focus.
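The scoring arithmetic in steps 5 through 7 can be sketched in a few lines of Python. All input/output names and ratings below are hypothetical, purely for illustration:

```python
# Relational (cause-and-effect) matrix sketch: each input X gets an
# importance score by weighting its association rating against each
# output Y by that output's customer importance, then summing.

# Customer importance of each output Y (scale 1-5, assumed values)
importance = {"Y1": 5, "Y2": 3, "Y3": 4}

# Association rating (0/1/3/9) of each input X against each output Y
ratings = {
    "X1": {"Y1": 9, "Y2": 1, "Y3": 3},
    "X2": {"Y1": 3, "Y2": 9, "Y3": 0},
    "X3": {"Y1": 1, "Y2": 3, "Y3": 9},
}

# Importance score per input: sum of (rating x customer importance)
scores = {
    x: sum(r * importance[y] for y, r in row.items())
    for x, row in ratings.items()
}

# Rank inputs from highest to lowest score to prioritize team focus
for x, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(x, score)
```

With these assumed ratings, X1 ranks first (score 60), so the team would begin its focus there.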


Probability and Statistics: Learning Objectives


At the end of this Measure topic, all learners will be able to:
- distinguish between enumerative (descriptive) and analytical (inferential) studies, and distinguish between a population parameter and a sample statistic.
- define the central limit theorem and describe its significance in the application of inferential statistics for confidence intervals, control charts, etc.
- describe and apply concepts such as independence, mutually exclusive, multiplication rules, etc.

Types of Studies
The diagram below compares enumerative studies to analytical studies:

Frequently used symbols:
n = sample size
x̄ = sample mean
s = sample standard deviation
s² = sample variance
N = population size
μ = population mean
σ = population standard deviation
σ² = population variance

Key Points of the Central Limit Theorem and Six Sigma


Using 3 sigma control limits, the central limit theorem is the basis of the prediction that, if the process has not changed, a sample mean falls outside the control limits an average of only 0.27% of the time.
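The 0.27% figure can be checked directly from the standard normal distribution. A quick sketch using only the Python standard library:

```python
import math

# Two-sided tail probability beyond +/-3 sigma for a normal distribution:
# the chance that a sample mean plots outside 3-sigma control limits
# when the process has not changed.
def prob_outside_3sigma():
    phi_3 = 0.5 * (1 + math.erf(3 / math.sqrt(2)))  # P(Z <= 3)
    return 2 * (1 - phi_3)                           # both tails

print(f"{prob_outside_3sigma():.4%}")  # about 0.27%
```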

Probability Problem
Given one standard deck of playing cards, answer the questions below. If you are not familiar with a standard deck, read the background below. To check your work, the answers appear at the bottom of the page.

1. What is the probability of drawing one ace from a standard deck of cards?
2. What is the probability of drawing three aces in a row if the drawn card is returned to the deck after each draw, and the deck is then reshuffled?
3. What is the probability of selecting a spade?
4. Given two people, if Person 1 selected a spade, without replacing and reshuffling, what is the probability that Person 2 also selects a spade from the same deck?

A standard deck of playing cards:
- Composed of 52 cards
- Equally divided into four suits (hearts, spades, diamonds, and clubs)
- Each suit is composed of 13 cards: 1 (Ace) through 10, plus three face cards: Jack (J), Queen (Q), and King (K)

Answers

1) Drawing one ace: P = 4/52 = 1/13 ≈ 0.0769
2) P(3 aces, with replacement) = 4/52 × 4/52 × 4/52 = 64/140,608 ≈ 0.000455
3) Given a standard deck of 52 playing cards, the probability of Person 1 selecting a spade is 13/52 (or 0.25).
4) Given two people, if Person 1 selected a spade, without replacing and reshuffling, the probability that Person 2 also selects a spade from the same deck is 12/51 (or 0.235), because the second selection is dependent on the first.
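These answers can be verified with exact fraction arithmetic, for example:

```python
from fractions import Fraction

# Exact fractions for the four card-drawing answers above.
p_ace = Fraction(4, 52)        # 1) one ace
p_three_aces = p_ace ** 3      # 2) three aces, with replacement (independent draws)
p_spade_1 = Fraction(13, 52)   # 3) Person 1 draws a spade
p_spade_2 = Fraction(12, 51)   # 4) Person 2, without replacement (dependent draw)

print(p_ace, float(p_ace))     # 1/13, about 0.0769
print(p_three_aces)            # 1/2197 (= 64/140,608)
print(p_spade_1, p_spade_2)    # 1/4 and 4/17, about 0.235
```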

Probability Terminology
Key Definitions
- Probability - The chance of something happening
- Outcome - The result of a single trial
- Sample space - The set of all possible outcomes (e.g., heads, tails)
- Event - A collection of outcomes
  o Independent events - One event provides no information about whether the other event occurs
  o Dependent events - One event provides information about, or influences, the other event
  o Mutually exclusive events - Two events that cannot occur at the same time (one coin toss can't be both heads and tails)
- Frequency - The number of observations for each sample

Collecting and Summarizing Data: Learning Objectives


At the end of this Measure topic, all learners will be able to:
- identify and classify continuous (variables) and discrete (attributes) data.
- describe and define nominal, ordinal, interval, and ratio measurement scales.
- define and apply methods for collecting data such as check sheets, coded data, etc.
- define and apply techniques such as random sampling, stratified sampling, sample homogeneity, etc.
- define, compute, and interpret measures of dispersion and central tendency, and construct and interpret frequency distributions and cumulative frequency distributions.
- depict relationships by constructing, applying, and interpreting diagrams and charts such as stem-and-leaf plots, box-and-whisker plots, run charts, scatter diagrams, Pareto charts, etc.
- depict distributions by constructing, applying, and interpreting diagrams such as histograms, normal probability plots, etc.

Measurement guides the improvement process. Since the purpose of measurement is to guide, forewarn, and inform, properly implementing a data collection effort requires planning and know-how.

Important Questions to Ask for Implementing Measurement


- What should be known?
- What is the purpose of measuring?
- What will be measured?
- How will the measurement occur?
- When will the measurements take place?
- How accurate is the collection instrument?
- How accurate are the data?
- How reliable are the data?
- How sensitive are the data?

Discrete Data
Discrete data:
- Also called attribute data
- Countable data

Examples:
- Number of units unfit for sale
- Number of imperfections on an automobile
- Number of successes in n trials
- Number of surface flaws
- Choice-based classifications such as good/bad, yes/no, pass/fail, tall/short

Uses:
- Computing proportions (defects per unit, calls per associate, or transactions per day)
- Categorizing counts (types of defects, calls, or transactions)

Questions
1. Is the run chart below, showing part width in millimeters, an example of continuous or discrete data?
2. Is a check sheet that tallies counts of defect types an example of continuous or discrete data?

Answers
1. The run chart describes a continuous variable because continuous data generally involve a measuring device and answer questions such as how much, how far, and how long.
2. The check sheet is an example of discrete data because the data involve counts of classifications rather than measurements.

Thinking About Scales


When working with data, be aware of the different types of data, methods of collecting data, techniques for assuring accuracy and integrity, tools to display data, and the different measurement scales. This exercise asks you to identify each given measurement by its appropriate measurement scale; information about measurement scales follows.

Measurement Scales
Variables differ in how well they can be measured (i.e., in the amount of measurable information their measurement scale provides).

Two key factors determine the amount of obtainable information:
1. The amount of error (found in every measurement)
2. The type of measurement scale

Data Collection Methods


Data collection for the project is based on three important questions:
- What do we want to know?
- From whom do we want to know it?
- What will we do with the data?

To help ensure the data are relevant to the problem statement and project objective, consider these key factors when choosing a data collection method:
- Length of time (per hour, day, shift, batch, etc.)
- Type (cost, errors, ratings, etc.)
- Source (reports, observations, surveys, etc.)
- Cost (internal and external)
- Collector (team member, associate, subject matter expert, etc.)

Understanding how the data relate to the process parameters is the beginning of data-based decision-making. There are many types of data collection methods available to the quality process analyst. In this lesson, we concentrated on:
- Check sheets
- Coded data
- Automatic gaging and other gaging

Check sheets and coded data could also be seen as forms of gaging. The most common of all measurements are those taken with the various types of gages used for continuous and discrete data that are not automatic. For example, a person physically makes a measurement when he or she takes a temperature reading, takes a blood pressure reading, times an operation, or runs a chemical test.

Check Sheets
Check sheets are tools for collecting improvement data. The Quality Tools section of the Continuous Improvement lesson covers this tool. A check sheet is a structured, prepared form for collecting and analyzing data. It usually comprises a list or lists of items and some way to indicate how often each item occurs.

Check Sheet: Types, Strengths, Weaknesses


There are several types of check sheets:
- Confirmation check sheets focus on verifying whether all steps in a process have been completed.
- Process check sheets record the frequency of observations within a range of measurement.
- Defect check sheets record the observed frequency of flaws.
- Stratified check sheets record the observed frequency of defects by defect type and one other criterion.

Strengths
- Easy to use
- Provide a choice of observations
- Good for determining the frequency of observations
- Applicable for identifying many common problems

Weaknesses
- Must be carefully constructed to be useful
- May omit pertinent information such as the type of data collected, the part number, date, or operator(s)

Coded Data
Coding (any method of classifying or reducing data without significantly reducing accuracy) is a technique used either to reduce data or to assign a value to attribute data. Coding involves transforming numbers or using abbreviations for long strings. Assigning a number that is cross-referenced to a meaning is a common example of coding. Banks may code bounced checks to the branch location. Customer service may log complaints by codes.

Coded Data Description: Strengths, Weaknesses


Strengths
- Easy data analysis
- Easy to summarize
- Several pieces of information can be recorded for one unit or one individual:
  o Male - 1
  o Under 50 - 1, over 50 - 2
  o College education: yes - 1, no - 0
  o Married - 1, unmarried - 0
  o 1101 is a male, under 50, no college education, and married
- Reduced errors and wasted time

Weaknesses
- Loss of information: there is no need to code numerical data such as age, time, or temperature.
- Lack of correct identification: if the coding scheme is not explicitly stated, misclassification or data entry errors will occur.

Examples of Data Coding


- Truncation coding: Measurements such as 1.0003, 1.0002, and 1.0009, in which the digits 1.000 repeat in all observations, can be recorded as the last digit expressed as an integer (e.g., 3, 2, and 9, respectively).
- Substitution coding: Product length is measured in sixteenths of an inch (1/16 of one inch). All product lengths should be close to 24 7/16". A recorded observation might use an integer that expresses the number of sixteenth increments; therefore, 24 7/16" is recorded as "7" and 24 12/16" is recorded as "12".
- Category coding: Use a code, such as S for scratch, D for dent, or W for warped. This method is often used for coding discrete data on a form or for collection and analysis of categories of data.
- Adding/subtracting a constant or multiplying/dividing by a factor: Let X represent raw data, XC a coded statistic, C a constant, and f a factor. The chart below illustrates the mathematical model to use when coding and decoding the data. Note that when decoding the data, the arithmetic mean must be computed and the original mathematical operation reversed. Also, for addition and subtraction of a constant, the spread (standard deviation) of the coded data is the same as that of the raw data; the mean shifts by the constant and is recovered by reversing the operation.

Data Integrity and Accuracy


In the planned pursuit of quality, people strive for perfection. Using various techniques and statistical tools helps minimize mistakes and ensures data accuracy and integrity. Data integrity and accuracy both play an important role in understanding whether the data collection process is yielding usable data. Data integrity determines whether the information being measured truly represents the desired attribute. Data accuracy determines the degree to which individual or average measurements agree with an accepted standard or reference value.

Random Sampling
Besides acceptance sampling, additional sampling methods are available. The upcoming pages cover four other common methods: stratified, random, sequential, and fixed. Random sampling and stratified sampling are approaches to physically taking a sample. Sequential and fixed sampling refer to approaches defining how samples are taken and evaluated in order to make a decision about the lot.

Random sampling is the process of selecting sample units so that all units have the same probability of being selected. True random sampling requires giving every unit an equal chance of being selected for the sample. The sample cannot be effective unless it is truly representative of the group. All acceptance samples are randomly selected to remove systematic error from the sampling process.

Examples
- A random number generator is used in the decennial census conducted by the U.S. Census Bureau to ascertain information about the United States population.
- A fast-food chain uses a random number generator to print a survey number on receipts for customers to call and answer questions about their experience at the restaurant.
- A random number table is used to determine which house numbers in a neighborhood to include in a survey.
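A minimal sketch of simple random sampling, using a hypothetical population of house numbers on a street:

```python
import random

# Simple random sample: every unit has the same probability of being
# selected. The population here is hypothetical house numbers 100-199.
population = list(range(100, 200))

random.seed(7)                             # fixed seed for reproducibility
sample = random.sample(population, k=10)   # selection without replacement

print(sorted(sample))
```

`random.sample` draws without replacement, so no house can be selected twice; for sampling with replacement, `random.choices` would be used instead.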

Sequential Sampling
Sequential samples are taken from a lot and examined individually or in groups. After each examination, a decision is made among:
- Accepting the lot
- Rejecting the lot
- Examining another sample from the same lot

Sequential sampling has the advantage of requiring fewer observations, on average, than fixed sample size tests for a similar degree of accuracy. The following are appropriate situations for using sequential sampling:
- Inspection testing involving costly equipment with limited capacity
- A supplier with a history of producing a very good product

Stratified Sampling
There may often be factors that divide the population into subpopulations (groups or strata), and the measurement of interest may vary among the different subpopulations. This variation is accounted for by selecting a sample that is representative of the population, which is achieved by stratified sampling: taking samples from each stratum or subgroup of the population. When we sample a population with several strata, we generally require that the proportion of each stratum in the sample be the same as in the population. Stratified sampling techniques are generally used when the population is heterogeneous (dissimilar), or when certain homogeneous (similar) subpopulations can be isolated as strata. Some reasons for using stratified sampling over simple random sampling are:
- the cost per observation in the survey may be reduced
- estimates of the population parameters may be wanted for each subpopulation
- increased accuracy at a given cost

Strata must be defined before data collection begins to ensure each data point is tagged with the correct stratum.

Stratified Sampling Examples


- An operations manager wants to know the average number of checks processed per associate per shift in his department, which runs first, second, and third shifts. He could divide his team into the three subgroups and take samples from each.
- At a manufacturing company, large lots of small plastic components are purchased for assembly into product. In the receiving inspection department, the containers are segmented into eight areas, and random samples are taken from each area. Stratified sampling ensures samples are taken from all areas of the container.
- A national fast-food restaurant chain wants to survey its customers to determine how well its new products meet expectations. The company randomly selects stores in each geographic area for participation in the survey.

The second example assumes that each of the eight areas has different product. If this is not true, a random sample would be adequate.
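A proportional stratified sample along the lines of the shift example might be sketched as follows (the shift sizes and data values are hypothetical):

```python
import random

# Proportional stratified sampling sketch: sample each shift (stratum)
# in proportion to its share of the population, so the sample mirrors
# the population's composition.
random.seed(1)
strata = {
    "first":  [random.gauss(120, 10) for _ in range(60)],  # checks/associate
    "second": [random.gauss(100, 10) for _ in range(30)],
    "third":  [random.gauss(80, 10) for _ in range(10)],
}
total = sum(len(units) for units in strata.values())
sample_size = 20

sample = []
for shift, units in strata.items():
    k = round(sample_size * len(units) / total)   # proportional allocation
    sample.extend(random.sample(units, k))

print(len(sample))  # 20 = 12 + 6 + 2
```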

Central Tendencies
Central tendency is a measure that characterizes the central value of a collection of data, which tends to cluster somewhere between the high and low values in the data. Central tendency refers to several key measurements: the mean (the most common), the median, and the mode.

Mean
- Gives the distribution's arithmetic average (center)
- Provides a reference point for relating all other data points
- Typically used with normal data

Median
- The distribution's center point (middle value)
- An equal number of data points occur on either side of the median
- Useful when the data set has extreme high or low values
- Typically used with non-normal data

Mode
- Represents the value with the highest frequency of occurrence (the most-often repeated value)
- Typically used with non-normal data
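Python's standard library computes all three measures directly; for a small hypothetical data set:

```python
from statistics import mean, median, mode

# Central tendency of a small hypothetical data set.
data = [32, 29, 41, 36, 34, 39, 28, 37, 36, 36]

print(mean(data))    # arithmetic average: 34.8
print(median(data))  # middle value: 36.0
print(mode(data))    # most frequent value: 36
```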

Central Tendency: Normal vs. Skewed Distribution

Frequency Distribution Table


Another way to display frequency data is to use a frequency distribution table. This is a compact way of displaying a set of measurements compared to listing all the numbers.

Purpose
Gives direct information about how many data points are at each value

Cumulative Frequency Distribution


A cumulative frequency distribution is created from a frequency distribution by adding an additional column to the table called Cumulative Frequency. The cumulative frequency for a value is the sum of all frequencies up to and including that value.

Purpose
To show the number of data at or below a particular variable
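Building a frequency table and its cumulative column can be sketched as follows (the data values are hypothetical):

```python
from collections import Counter
from itertools import accumulate

# Frequency and cumulative-frequency table from a small data set.
data = [1, 2, 2, 3, 3, 3, 4]

rows = sorted(Counter(data).items())   # (value, frequency) pairs
freqs = [f for _, f in rows]
cum = list(accumulate(freqs))          # running total of frequencies

print("value  freq  cum_freq")
for (value, f), c in zip(rows, cum):
    print(f"{value:5}  {f:4}  {c:8}")
# The last cumulative value equals the total number of data points.
```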

Depicting Relationships
One of the most effective tools for the visual evaluation of data is a graph showing the relationship between variables. Quality professionals use graphical methods as a complement to numerical methods because visuals are sometimes better suited than numerical methods for identifying patterns in the data. Graphical tools can be divided into two general categories: those depicting relationships and those depicting distributions. This topic looks at two types of graphical displays showing relationships: stem-and-leaf plots and box-and-whisker plots. Quality Control Tools in the Continuous Improvement lesson already covered other applicable tools for depicting relationships, such as flowcharts, Pareto charts, cause and effect diagrams, check sheets, scatter diagrams, and run charts.

Stem-and-Leaf Plots
Designed by John Tukey (1977) as a variant of the histogram, a stem-and-leaf plot is a visual representation of data values that directly incorporates the data points. A stem-and-leaf plot separates each number into a stem (all digits but the last) and a leaf (the last digit). Example: the data 95, 99, 100, 110 yield stems 9, 9, 10, 11 and leaves 5, 9, 0, 0. Although stem-and-leaf plots are traditionally displayed as in the image below, some people use a table to display them. Recall the following observations of the speed of cars used in an earlier example: 32, 29, 41, 36, 34, 39, 28, 37, 36, 36, 30, 32, 31, 35, 36, 38, 40, 42, 33, 34. The stem-and-leaf plot would be the following:
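The plot can also be generated programmatically; a minimal sketch using the car-speed data above:

```python
from collections import defaultdict

# Text stem-and-leaf plot: stem = all digits but the last, leaf = last digit.
speeds = [32, 29, 41, 36, 34, 39, 28, 37, 36, 36,
          30, 32, 31, 35, 36, 38, 40, 42, 33, 34]

stems = defaultdict(list)
for x in sorted(speeds):
    stems[x // 10].append(x % 10)   # e.g., 32 -> stem 3, leaf 2

for stem in sorted(stems):
    print(stem, "|", " ".join(str(leaf) for leaf in stems[stem]))
```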

Creating Stem-and-Leaf Plots


The original stem-and-leaf plot of the cars' speed does not provide much insightful information about the distribution of the data. An alternative is to use subintervals, dividing the stems into several smaller intervals.

The stems here are divided into smaller subintervals (stems 3 and 4 are each split) so that the data display some type of shape. As in the earlier explanation of frequency distributions, the same rule of thumb about the number of classes applies to stem-and-leaf plots: the number of classes should be approximately the square root of the sample size. In this case, the square root of 20 is about 4.5, so about 5 classes are appropriate.

Using Stem-and-Leaf Plots


The following example summarizes a stem-and-leaf plot and the statistics gathered from a set of data. Consider an exercise on pull-off force for a connector. The data for 40 test specimens are as follows: 241 220 249 209 258 194 251 212 237 245 238 185 210 209 210 187 197 201 198 218 225 195 199 190 248 255 183 175 203 245 213 178 195 235 236 175 249 220 245 190. In this stem-and-leaf plot, the stems are the first two digits of each data value, and the leaves are the last digit. The leaves are arranged in increasing order from left to right. The median, range, and quartiles are obtainable from the stem-and-leaf plot. The plot and summary statistics together give valuable information about the sample and the population as a whole. As an exercise to check your understanding and apply previously covered concepts, determine the median, range, and quartiles for the pull-off force data on the stem-and-leaf plot.

Stem-and-Leaf Answers
1. Determine the median by averaging the 20th and 21st data values, 210 and 212, respectively; therefore, the median = 211.
2. Determine the range by subtracting the low value from the high value: R = 258 − 175 = 83.
3. Determine the quartiles: Q1 = the median of the lower half below Q2; Q2 = 211 (the median); Q3 = the median of the upper half above Q2.

Quartile Answers Q1 = 195 Q2 = 211 (median) Q3 = 239.5
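The answers above can be verified with a short script. This is a sketch assuming the quartile convention used in this example, where Q1 and Q3 are the medians of the lower and upper halves of the sorted data:

```python
from statistics import median

# Pull-off force for 40 connector test specimens (from the exercise above)
force = [241, 220, 249, 209, 258, 194, 251, 212, 237, 245,
         238, 185, 210, 209, 210, 187, 197, 201, 198, 218,
         225, 195, 199, 190, 248, 255, 183, 175, 203, 245,
         213, 178, 195, 235, 236, 175, 249, 220, 245, 190]

data = sorted(force)
n = len(data)                   # 40 specimens
q2 = median(data)               # median of all values
q1 = median(data[: n // 2])     # median of the lower half
q3 = median(data[n // 2 :])     # median of the upper half
r = data[-1] - data[0]          # range = high - low

print(q1, q2, q3, r)   # 195.0 211.0 239.5 83
```

Note that other quartile conventions (for example, the interpolation methods in statistical software) can give slightly different Q1 and Q3 values for the same data.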

Stem-and-Leaf Summary
Benefits
Easy and quick to construct
Shows shape and distribution
Visually compact
Convenient to use
Displays both variable and categorical data sets
Allows the data to be read directly from the diagram, whereas with a histogram the individual data values may be lost as frequencies within a category

Procedure 1. Some professionals find it helpful to first write (sort) the data in numerical (ranking) order. 2. Separate the numbers into stems and leaves. 3. Group the numbers with the same stems. 4. Prepare an appropriate title and legend for the plot.

Box-and-Whisker Plots
Credited to Tukey, box-and-whisker plots use five key data points to graphically compare data produced from different sources (different machines, operators, work centers, etc.).
1. The ends of the box are the first and third quartiles, Q1 and Q3.
2. The median forms the centerline (vertical line) within the box.
3. Whiskers extend from the ends of the box to the highest and lowest data points that are not outliers; each whisker (together with any outliers beyond it) covers roughly 25% of the data.
4. The box contains the middle half of the data, 50% of the distribution.
5. Asterisks or diamonds beyond the whiskers represent outliers.

Run Charts
A predecessor of control charts, a run chart displays how a process performs over time. With data points plotted in chronological order and connected as a line graph, run charts may detect special causes of variation. Since shifts have an assignable special cause, run charts provide a signal that leads to the cause. Run charts are also called trend charts (variations on a control chart, but without the control limits).

Benefits
Recognizes problem trends or patterns
Displays sequential data
Serves as a visual aid in spotting patterns and abnormalities
Monitors and communicates process performance
Presents information around a middle value (centerline)

Using Run Charts


Use when:
displaying performance/process data over time.
displaying tabulations or lists of numbers.

Procedure

1. List the collected data in the sequence in which it occurs.
2. Order the data (lowest to highest) and determine the range.
3. Calculate the median.
4. Construct the Y axis and make the scale 1.5 to two times the range.
5. Construct the X axis and make it two to three times as long as the Y axis.
6. Draw a dotted line to illustrate the median.
7. Plot the points and connect them to form a line graph.
8. Label each axis with units and title the chart to identify the investigation.

User Tips
If 25 or more points of data exist, then a run chart may be used to determine if a special cause exists that is causing variation in the process. (Pyzdek, Quality Engineering Handbook)
Do not use a run chart if more than 30% of the values are the same. (Pyzdek, Quality Engineering Handbook)
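The median centerline from the procedure above also supports a simple numerical check: counting runs of consecutive points on the same side of the median, one informal way to screen for special-cause patterns. A minimal sketch (not from the source; the treatment of points equal to the median follows one common convention):

```python
from statistics import median

def median_runs(data):
    """Return the median centerline and the number of runs about it.
    A run is a streak of consecutive points on the same side of the median;
    points exactly on the median are skipped, per common run-chart practice."""
    center = median(data)
    sides = [x > center for x in data if x != center]
    runs = 1
    for prev, cur in zip(sides, sides[1:]):
        if cur != prev:          # the plot crossed the centerline
            runs += 1
    return center, runs

# Car speeds in chronological order, from the earlier example
speeds = [32, 29, 41, 36, 34, 39, 28, 37, 36, 36,
          30, 32, 31, 35, 36, 38, 40, 42, 33, 34]
center, runs = median_runs(speeds)
print(center, runs)   # 35.5 9
```

Unusually few runs suggest a shift or trend; unusually many suggest alternation. Formal run tests (with critical values) are covered in the control chart literature.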

Scatter Diagram Description


Scatter diagrams graph pairs of continuous data, with one variable on each axis, to examine the relationship between them. Scatter diagrams may show what happens to one variable when the other variable changes. This is particularly true when one of the two variables is independent and one is dependent. The dependent variable is normally charted along the vertical (Y) axis and the independent variable along the horizontal (X) axis. If the relationship between the two variables is understood, then the dependent variable may be controlled. The relationship between the two variables may illustrate:
Correlation: A correlation suggests there is a relationship between the two variables. A correlation does not necessarily mean that a cause and effect relationship exists. A third characteristic (or more) might be the cause of both the variables behaving as they do. A correlation may be:
o Positive: as one variable moves in one direction, the second variable moves in the same direction.
o Negative: as one variable moves in one direction, the second variable moves in the opposite direction.
No correlation: no relationship between the two variables exists.

Scatter diagrams are also called scatter plots, X-Y graphs, or correlation charts.

Scatter Diagram Procedure


1. Collect pairs of data for both variables. 2. Draw a graph with the independent variable on the horizontal axis (x) and the dependent variable on the vertical axis (y). 3. For each pair of data, plot a dot (or symbol) where the x-axis value intersects the y-axis value. (If two dots fall together, put them side by side, touching, so that you can see both.) 4. If correlated, eyeball a line of best fit.
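The positive/negative correlation described above is often quantified with Pearson's correlation coefficient r, which ranges from −1 to +1. A minimal sketch (the data pairs below are hypothetical, invented for illustration):

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical pairs: oven temperature (independent, x) vs. part shrinkage (dependent, y)
temps = [150, 160, 170, 180, 190]
shrink = [1.1, 1.4, 1.6, 1.9, 2.1]
print(round(pearson_r(temps, shrink), 3))   # close to +1: strong positive correlation
```

A value near +1 or −1 only measures how tightly the dots cluster around a line; as the text notes, it does not by itself establish cause and effect.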

Example

Using Pareto Charts


In some instances, a double Pareto chart can be used to contrast two sets of data or to compare before-and-after data. The latter can help to measure the impact of quality improvement changes. Specifically, Pareto charts may be used to address several classic quality items such as:
Production issues
Scrap
Defects or nonconformities
Rework
Warranty claims
Maintenance time
Raw material usage
Machine downtime

Service issues
Wasted time
Number of jobs that have to be redone
Customer inquiries
Number of errors

Constructing a Pareto Chart


According to The Quality Toolbox, second edition, the following are procedures for constructing a Pareto chart:
1. Decide which categories you will use to group the data.
2. Decide which unit of measure is appropriate. Common units of measurement are frequency, quantity, cost, or time.
3. Decide the chart's time period. (In other words, decide for what period of time the data will be collected and displayed.)
4. Collect the data, recording the category each time, or assemble data that already exists.
5. Tally the total count for each category.
6. To construct the chart, determine the appropriate scale for the data you have collected. The maximum value will be the sum of all subtotals from step 5. Mark the scale on the left-most y-axis.
7. Construct and label bars for each category. Place the tallest at the far left, then the next tallest, and so on. If there are many categories with small measurements, they can be grouped as "other" and placed at the far right.
8. Calculate the percentage for each category: the subtotal for that category divided by the total for all categories. Draw a right vertical axis and label it with percentages. Be sure the left-most and right-most y-axis scales match in height; some authors suggest that the left measurement that corresponds to one-half should be exactly opposite 50 percent on the right scale.
9. Calculate and draw cumulative sums: Start by placing a dot above the first bar showing the percentage of the category it represents. (Use the scale on the right-most y-axis for this display.) Next, add the percentages for the first and second categories, and place a dot above the second bar indicating that sum. To that sum, add the percentage for the third category, and place a dot above the third bar for that new sum. Continue the process for all the bars.
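Steps 5 through 9 of the procedure reduce to a small computation. A sketch with made-up defect tallies (the category names and counts are hypothetical):

```python
# Step 5: hypothetical defect tallies by category
tallies = {"scratch": 50, "dent": 30, "crack": 15, "other": 5}

# Step 7: order categories from largest to smallest count
ordered = sorted(tallies.items(), key=lambda kv: kv[1], reverse=True)

# Steps 8-9: per-category percentages accumulated into the cumulative line
total = sum(tallies.values())
cumulative = []
running = 0.0
for category, count in ordered:
    running += 100.0 * count / total
    cumulative.append((category, round(running, 1)))

print(cumulative)
# [('scratch', 50.0), ('dent', 80.0), ('crack', 95.0), ('other', 100.0)]
```

The output shows the classic Pareto reading at a glance: the first two categories account for 80 percent of the defects, so improvement effort belongs there first.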

Histogram
A histogram is a graphical representation of the information provided in a frequency distribution. The horizontal axis is the measurement scale used. Adjacent bars are constructed so that the height of each bar represents the frequency (or relative frequency) for its class. Because the bar heights are proportional to the class frequencies, the histogram gives a good idea of the shape of the distribution. Return again to the example of 20 observations of the speed of cars passing a checkpoint. In the image below, notice the shape similarity of the histogram and the stem-and-leaf plot. Data set: 32, 29, 41, 36, 34, 39, 28, 37, 36, 36, 30, 32, 31, 35, 36, 38, 40, 42, 33, 34
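The frequency counts behind such a histogram can be tallied directly. A sketch, assuming five equal-width classes of width 3 for the speed data (this choice covers the range 28 to 42 and matches the square-root rule of thumb from earlier):

```python
speeds = [32, 29, 41, 36, 34, 39, 28, 37, 36, 36,
          30, 32, 31, 35, 36, 38, 40, 42, 33, 34]

lo, width, classes = min(speeds), 3, 5
counts = [0] * classes
for x in speeds:
    counts[(x - lo) // width] += 1   # integer division picks the class for x

# A text rendering of the histogram, one row of #'s per class
for i, c in enumerate(counts):
    start = lo + i * width
    print(f"{start}-{start + width - 1}: {'#' * c}")
# 28-30: ###
# 31-33: ####
# 34-36: #######
# 37-39: ###
# 40-42: ###
```

The peak in the 34-36 class mirrors the long leaf row for stem 3 in the stem-and-leaf plot, which is the shape similarity the text points out.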

Histograms History
Histograms were created in 1833 by French statistician André-Michel Guerry to present a pictorial analysis of crime data. Because Guerry presented data pictorially rather than in columns of numbers, his audience found it easier to see his conclusions about crime in France. Today, histograms are a commonly used tool for summarizing, analyzing, and displaying data. As a Quality Progress article, The Tools of Quality, Part IV: Histograms, notes: A picture can be worth more than a thousand numbers when the picture is a histogram.

Histogram Description
According to the ASQ Auditing Handbook, third edition, a histogram is a graphic summary of variation in a set of data. The Quality Progress article The Tools of Quality, Part IV: Histograms describes four simple concepts that help explain the power of histograms: Concept 1: Values in a set of data almost always show variation. Variation is an inevitable part of any process, whether manufacturing, service, or administrative. It is impossible to keep all factors in a constant state all of the time. Concept 2: Variation displays a pattern. Different phenomena have different variation, but there is always some pattern to the variation.

Patterns of variation in data are called distributions. Concept 3: Patterns of variation are difficult to see in simple tables of numbers. Patterns exist in a table of numbers, but they are difficult for our eyes and minds to discern. Concept 4: Patterns of variation are easier to see when the data is summarized pictorially in a histogram. Because a histogram provides a picture of the data, it enables us to easily see variation patterns.

For Example

The example below (taken from The Quality Toolbox, second edition) shows how a histogram is constructed from a table of data: The Bulldogs bowling team wants to improve its standing in the league. Team members decided to study their scores for the past month. The 55 bowling scores, in order, are:

Bulldogs Team Bowling Scores
103 107 111 115 115 118 119 121 122 124 124
125 126 127 127 129 134 135 137 138 139 141
142 144 145 146 147 148 148 149 150 151 152
153 153 154 155 155 155 156 157 159 160 161
163 163 165 165 167 170 172 176 177 183 198

The number of bars in a histogram can be determined by using a table adapted from The Quality Toolbox, second edition, which maps the number of data points (50, 100, 150, 200, and so on) to a recommended number of bars (7 through 14, increasing with sample size). Using the table, the team estimates B (the number of bars) to be 7.

The highest score was 198 and the lowest was 103, so the range of values is R = largest − smallest = 198 − 103 = 95.

Constructing a Histogram
According to The Quality Toolbox, second edition, the following are basic procedures for constructing a histogram:
1. Collect at least 50 consecutive data points from a process. If you don't have that much data, use the point graph variation (points arranged as with a line graph, with no line connecting the points).
2. Use the histogram worksheet to set up the histogram. After calculating W in step 2 of the worksheet, use your judgment to adjust it to a convenient number. For example, you might decide to round 0.9 to an even 1.0. The value for W must not have more decimal places than the numbers you will be graphing.
3. Draw x- and y-axes on graph paper. Mark and label the y-axis for counting data values. Mark and label the x-axis with the L values from the worksheet. The spaces between these numbers will be the bars of the histogram. Do not create spaces between bars.
4. For each data point, mark off one count above the appropriate bar with an X or by shading that portion of the bar. For numbers that fall directly on the edge of a bar, mark the bar to the right.
Histograms can be used for the following purposes:
To provide a clearer and more complete picture of data patterns, and to see if a change has occurred over time
To analyze and visually communicate information about variation in process behavior, and to communicate the distribution pattern
To make decisions about where to focus improvement efforts, and to see if the output of a process has a distribution that might need to be studied
Repeated use of histograms allows a person to see if a change has occurred over time.

Normal Probability Plots


The assumption of normality is necessary to adequately apply many statistical tests. A normal probability plot determines whether a set of data came from a population that is normally distributed. (Carl Friedrich Gauss first presented the theory behind the normal curve.) Before computers, statisticians designed normal probability paper to graph the actual data on the x-axis against percentiles based on the hypothesized distribution on the y-axis. The normal probability plot shows whether the data are distributed as the standard normal distribution. Normal distributions follow a linear pattern when plotted on normal probability paper; therefore, if the data plot along a straight line, the data are normally distributed. Think of normal probability plots as scatter diagrams of actual data vs. numbers representing a normal distribution.
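The straight-line check can be approximated numerically: pair the sorted data with theoretical normal quantiles and measure how linear the relationship is. A sketch using Python's statistics.NormalDist; the (i − 0.5)/n plotting positions are one common convention, not the only one:

```python
from statistics import NormalDist, mean
import math

def normal_plot_points(data):
    """Pair each sorted observation with its theoretical standard-normal quantile."""
    xs = sorted(data)
    n = len(xs)
    zs = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return xs, zs

def correlation(xs, ys):
    """Pearson correlation between the two coordinate lists of the plot."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

speeds = [32, 29, 41, 36, 34, 39, 28, 37, 36, 36,
          30, 32, 31, 35, 36, 38, 40, 42, 33, 34]
xs, zs = normal_plot_points(speeds)
r = correlation(xs, zs)
print(round(r, 3))   # r near 1 suggests the points lie close to a straight line
```

The closer r is to 1, the closer the plotted points fall to a straight line, which is the visual criterion the text describes. Formal normality tests based on this idea exist in the statistics literature.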
