
Failure Analysis of Engineering Systems

Instructor: Professor Steve Maher

Module 5:
* Scripture of the Module
* Some review of Module 3
* Chapter 8: Failure Mode Assessment and Assignment (FMA&A)
* Chapter 9: Pedigree Analysis
* Chapter 10: Change Analysis

Scripture of the Module


The plans of the diligent lead to profit as surely as haste leads to poverty.
- Proverbs 21:5

Failure Analysis of Engineering Systems ENGR 5323

Assignment
* Read Chapters 8, 9, and 10 of Systems Failure Analysis
* Do quizzes on Bb as they appear
* Quiz next week at beginning of class


Some Module 3 Leftovers

From Module 3 Quiz


We have 1000 parts that have run at an average of 800 hours each. 20 of them have failed. What is the failure rate?
MTBF = total service hours / number failed = (800 × 1000)/20 = 40,000 hr/failure
Failure rate: λ = 1/MTBF = 1/40,000 = 0.000025 = 2.5 × 10^-5 failures/hr
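The arithmetic above can be checked with a few lines of Python. This is a minimal sketch; the function names `mtbf` and `failure_rate` are illustrative and do not come from the textbook.

```python
def mtbf(total_service_hours: float, failures: int) -> float:
    """Mean time between failures = total service hours / number of failures."""
    if failures <= 0:
        raise ValueError("need at least one failure to estimate MTBF")
    return total_service_hours / failures


def failure_rate(total_service_hours: float, failures: int) -> float:
    """Failure rate (lambda) is the reciprocal of MTBF."""
    return failures / total_service_hours


# Quiz data: 1000 parts, 800 hours average each, 20 failures.
total_hours = 1000 * 800
print(mtbf(total_hours, 20))                   # 40000.0
print(f"{failure_rate(total_hours, 20):.2e}")  # 2.50e-05
```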


From Module 3 Quiz


For the parts in Question 17, what is the probability that a part will run to 1000 hours without failing?
P_S = e^(-λt) = e^(-(2.5 × 10^-5)(1000)) = 0.9753, or a 97.53% chance of running that long without failing.
Some more info: P_F = 1 - P_S = 0.0247, or a 2.47% chance of failing; i.e., ~25 of the 1000 parts will fail by 1000 hours of operation, or ~5 more between 800 and 1000 hours.
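A short sketch of the same calculation, assuming a constant failure rate (the exponential model used on this slide). The helper name `survival_probability` is hypothetical.

```python
import math


def survival_probability(rate_per_hour: float, hours: float) -> float:
    """P_S = exp(-lambda * t) for a constant failure rate."""
    return math.exp(-rate_per_hour * hours)


lam = 2.5e-5                          # failures/hr, from the previous question
p_s = survival_probability(lam, 1000)
p_f = 1.0 - p_s                       # probability of failing by 1000 hr

print(f"{p_s:.4f}")                   # 0.9753
print(f"{p_f:.4f}")                   # 0.0247
print(round(1000 * p_f))              # 25 expected failures out of 1000 parts
```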

From Module 3 Quiz


Fig 7.1: Event B has a failure rate of 10^-4. The part is operated for 100 hours. The probability of event C happening is 0.005. What is the probability that command event A will occur?
OR gate, so P_A = P_B + P_C - P_B·P_C.
P_B = 1 - e^(-λt) = 1 - exp[-(10^-4)(100)] = 0.00995
P_A = 0.00995 + 0.005 - (0.00995)(0.005) = 0.0149 ≈ 0.015, or a 1.5% chance of Event A happening.
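The OR-gate calculation can be sketched as follows, assuming the input events are independent (which the formula on this slide also assumes). For two inputs, 1 - (1 - P_B)(1 - P_C) expands to P_B + P_C - P_B·P_C. The function names are illustrative.

```python
import math


def failure_probability(rate_per_hour: float, hours: float) -> float:
    """P_F = 1 - exp(-lambda * t) for a constant failure rate."""
    return 1.0 - math.exp(-rate_per_hour * hours)


def or_gate(*probabilities: float) -> float:
    """Probability that at least one independent input event occurs."""
    return 1.0 - math.prod(1.0 - p for p in probabilities)


p_b = failure_probability(1e-4, 100)
p_a = or_gate(p_b, 0.005)
print(f"{p_b:.5f}")  # 0.00995
print(f"{p_a:.4f}")  # 0.0149
```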


From Module 3 Quiz


Fig 7.3: The system is operated for 100 hours. The failure rate for B is 2 × 10^-5. The failure rate for C is 5 × 10^-6. What is the probability that command event A will occur?
AND gate, so P_A = P_B·P_C.
P_B = 1 - e^(-λt) = 1 - exp[-(2 × 10^-5)(100)] ≈ 0.002
P_C = 1 - e^(-λt) = 1 - exp[-(5 × 10^-6)(100)] ≈ 0.0005
P_A = (0.002)(0.0005) ≈ 10 × 10^-7 = 1 × 10^-6, or about a 0.0001% chance of Event A happening.
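The AND-gate case is sketched below under the same independence assumption: the command event needs every input event to occur, so the probabilities multiply. The function names are illustrative.

```python
import math


def failure_probability(rate_per_hour: float, hours: float) -> float:
    """P_F = 1 - exp(-lambda * t) for a constant failure rate."""
    return 1.0 - math.exp(-rate_per_hour * hours)


def and_gate(*probabilities: float) -> float:
    """Probability that all independent input events occur."""
    return math.prod(probabilities)


p_b = failure_probability(2e-5, 100)   # about 0.002
p_c = failure_probability(5e-6, 100)   # about 0.0005
p_a = and_gate(p_b, p_c)
print(f"{p_a:.1e}")                    # 1.0e-06
```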

Failure Mode Assessment and Assignment (FMA&A)

Berk's Overall FA Process

1. Designate a team
2. Gather all related information
3. Review and define problem
4. Identify all potential failure causes
5. List causes in FMA&A
6. Converge on root cause
7. Determine Corrective Actions
8. Implement Corrective Actions
9. Assess Corrective Actions
10. Evaluate for Preventive Actions
11. Incorporate FA Findings


What is FMA&A?
* FMA&A = Failure Mode Assessment and Assignment
* It is a tool to help manage the evaluation of each of the hypothesized failure causes.
* It is generally a table; the textbook version has 4 columns:
  * Event number
  * Description of each hypothesized failure cause
  * Likelihood assessment of each cause (updated as data becomes available)
  * Actions necessary to evaluate the cause and status of the evaluation (sometimes separate columns)
* See Table 8.1: FMA&A for light bulb example
* A spreadsheet (e.g. MS Excel) is an excellent tool for this; a word-processing tool (e.g. MS Word) can also be used

Hypothesized Failure Causes


* Each row of the table is a hypothesized failure cause
* Each of the causes is briefly described
* Can develop hypothesized causes with any method, then map them to a row/column
* List causes or inducing events only
  * In FTA terms, do not list command events
  * Focus on basic failures, human errors, normal events, inhibiting conditions, and undeveloped events (using FTA terms)
* Typically a repeat from previous activity of identifying potential causes (described in Modules 2-3)
  * Easier to work with a table than with diagram(s)
  * Saves time, less confusion

Event Number Column


* Each of the hypothesized failure causes is numbered
* This is used for tracking and organizational purposes
* Can develop hypothesized causes with any method
  * Textbook uses FTA example
  * Each team or individual can choose a numbering system
* List and assign numbers to causes or inducing events only
  * In FTA terms, do not list/assign command events
  * Focus list/number on basic failures, human errors, normal events, inhibiting conditions, and undeveloped events (using FTA terms)


Assessment Column
* Describes assessment for each of the hypothesized failure causes
* Default for each cause is Unknown
* As analysis proceeds, each cause will be updated using terms such as
  * Unlikely = evaluation showed no problem or no lead
  * Likely = cause likely found but not conclusive
  * Confirmed = we found it! (At least the objective evidence indicates it)
* For FA (i.e. the failure has already occurred), probabilities do not really matter that much
  * Can be used as a guide to prioritize evaluation
  * Should NOT be used by themselves to update status

Assignment Column
* Defines action necessary to evaluate each of the hypothesized failure causes
* Review each hypothesized cause (i.e. row by row) and determine actions necessary to evaluate it
  * Evaluation needs to be objective (i.e. fact-based), not subjective (i.e. opinion-based)
  * Focus on ruling the cause in or out
  * Be careful of:
    * jumping to conclusions
    * ruling causes out too quickly and without evaluation
* Best if there is ONE owner and a DUE DATE for the action(s)
* Status is updated as the actions are completed (in this column or an additional one)
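A spreadsheet is the usual tool, but the same table can be sketched in code to make the columns concrete. The example below is a minimal sketch: the class and field names (`FmaaRow`, `owner`, `due`, `status`) and the two example rows are hypothetical and are not taken from Table 8.1.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Assessment(Enum):
    UNKNOWN = "Unknown"      # default before any evaluation
    UNLIKELY = "Unlikely"    # evaluation showed no problem or no lead
    LIKELY = "Likely"        # cause likely found but not conclusive
    CONFIRMED = "Confirmed"  # objective evidence indicates this cause


@dataclass
class FmaaRow:
    event_number: int
    cause: str               # hypothesized failure cause (never a command event)
    assignment: str          # action needed to rule the cause in or out
    owner: str               # ONE owner per action
    due: date                # due date for the action
    assessment: Assessment = Assessment.UNKNOWN
    status: str = "Open"


def overdue(rows, today):
    """Return rows whose action is still open past its due date."""
    return [r for r in rows if r.status == "Open" and r.due < today]


# Hypothetical rows for illustration only.
table = [
    FmaaRow(1, "Connector pin corroded", "Inspect pins under magnification",
            "A. Lee", date(2024, 3, 4)),
    FmaaRow(2, "Operator skipped torque step", "Review build records",
            "B. Cruz", date(2024, 3, 8)),
]

# Update a row only when objective evidence comes in.
table[0].assessment = Assessment.UNLIKELY
table[0].status = "Closed: no corrosion found"

for row in overdue(table, today=date(2024, 3, 10)):
    print(row.event_number, row.cause, row.owner)  # 2 Operator skipped torque step B. Cruz
```

Keeping the assessment as a fixed set of terms makes it harder to record an opinion where an evaluation result belongs.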

Point of Emphasis
Do not touch any hardware or software from the failed system until you have defined an organized, systematic, and objective manner in which to proceed


Follow-On Activities (Team)


* Meet regularly; frequency is determined by priority, severity, and urgency
  * High profile: at least daily
  * Low profile: at least weekly is recommended
  * Use FMA&A to guide the meeting
* Execute actions that are assigned and update status
  * Include findings in Assignment column
  * Assessment and Assignment change based on the data
  * Clearly indicate items completed and ruled out (e.g. shading the row)
* Distribute updates to team and stakeholders on a regular basis (e.g. after each team meeting)

Individual Approaches
* Suggest you use FMA&A or a similar format
  * Some organizations use an FA Log or similar tool
  * You want something to capture your thoughts, planned actions, and status updates
* Update someone regularly; frequency is determined by priority, severity, and urgency
  * Go no more than 2 weeks between updates; weekly is recommended
  * Use FMA&A or FA Log to guide the meeting
* Execute planned actions and update status
  * Document findings as you go
  * Adjust plans based on the data
  * Clearly indicate items completed and ruled out
* Have updates ready to distribute when needed



Performing the Evaluations


* Pedigree analysis (Ch 9)
* Change analysis (Ch 10)
* Analytical equipment (Ch 11)
* Mechanical and electronic component failures (Ch 12)
* Leaks (Ch 13)
* Contamination (Ch 14)
* Design Analysis (Ch 15)
* Statistical Considerations (Ch 16)
* Design of Experiments (Ch 17)


Group Activity

* Discuss Scenario (pg. 72-73)
* Document answers to questions in a file
* Email file to me (one per team)
* After emailing, take a 5-10 min break
* Re-convene about ____

Questions:
* Which approach does your organization (or do you) follow?
* Do you think your failure analysis approach needs to change?
* If so, what can you do to initiate a change?


Pedigree Analysis



Overall Process Flow for Diagnosing Root Cause of a Failure


1. Confirm the Failure
2. Characterize the Failure
3. Isolate the Failure
4. Isolate the Defect
5. Identify the Defect
6. Determine Root Cause



What is a Pedigree? And How Do You Analyze it?


* Product or System Pedigree = Essentially the history of the product
  * Describes design of product
  * How it was built
  * That it was built in accordance with specs, codes, etc.
* Documents in a pedigree:
  * Records of how it was built
  * Records of material used
  * Conformance to drawing and material requirements
* Will a suspect condition be revealed by an analysis of the pedigree?



Value of Reviewing the Pedigree


* If it addresses the suspect area, examine the pedigree to see if there is something suspicious
  * Anomalies in test results
  * Non-conformities found in inspections
  * Missing items or documents
* If it does not address the suspect area, maybe the pedigree should
  * Recommend for future builds
  * Can be part of corrective action to prevent future failures


Examining the Pedigree


* Purchase orders
* Nonconformance documentation
* Inspection records
* Test data
* Calibration data
* Drawings and specifications
* Drawing changes
* Work instructions
* Certificates of conformance


Surprisingly
* Shipped systems do not always meet all of their requirements
  * Many pedigree reviews reveal that the product had/has a problem
  * In some cases, the pedigree problem is directly related to the failure
* Not necessary to check the ENTIRE pedigree
  * That may be a massive undertaking
  * Only review areas that relate to the hypothesized causes
* Pedigree can be suspect
  * Errors, omissions, or even fraud can happen
  * Certificates of conformance are not a guarantee


Example: Tragedy in Hawaii


* Tour plane caught fire and crashed
* Oil leaking into engine caused the fire
* Oil filter gasket had melted
* Gasket was made of wrong material
* Maintenance, filter spec, and gasket spec were fine
* Gasket manufacturer noted different material
* Certificate of conformance was missing
* Gasket packing slip and certificate did not match
* Mismatch slipped through and nonconforming oil gasket was used, leading to the accident
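The paperwork checks that would have caught this kind of mismatch are simple to state. The sketch below is only an illustration of the idea: the record layout, the part number, and the material names are all hypothetical and do not describe the actual accident records.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LotPaperwork:
    part_number: str
    spec_material: str                    # material required by the spec
    packing_slip_material: Optional[str]  # None = not stated or missing
    certificate_material: Optional[str]   # None = certificate missing


def pedigree_flags(doc: LotPaperwork) -> List[str]:
    """List the anomalies a pedigree review should follow up on."""
    flags = []
    if doc.certificate_material is None:
        flags.append("certificate of conformance missing")
    sources = (("packing slip", doc.packing_slip_material),
               ("certificate", doc.certificate_material))
    for source, material in sources:
        if material is not None and material != doc.spec_material:
            flags.append(f"{source} material '{material}' differs from "
                         f"spec '{doc.spec_material}'")
    return flags


# Hypothetical lot: the slip names a different material and there is no certificate.
suspect = LotPaperwork("GSK-001", "material A", "material B", None)
for flag in pedigree_flags(suspect):
    print(flag)
```

A clean result from a check like this is not a guarantee either, because the paperwork itself can be wrong.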


Non-conformance Does Happen


* Anomalous Certificates of Conformance are not uncommon
  * Typically not outright fraud
  * Most are human error
* Sometimes nonconforming material or system ships anyway
* Sometimes everything looks fine but something is suspicious
  * Follow-up independent verification may be needed
  * Additional inspections, testing, etc. can be sought


Change Analysis

What is Change Analysis?


* If a system WAS working, what changed?
* Need to determine if a change occurred and if the change induced the failure.
* Options:
  * Nothing changed!
  * Failure was happening, but not observed
  * Failure occurs within normal statistical variation
  * Change occurred, but unrelated to failure
  * A change induced the failure
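One way to judge whether a failure count falls "within normal statistical variation" is to ask how often that count would occur by chance at the historical rate. The sketch below assumes failures follow a Poisson distribution, which is a modeling assumption and not something stated in the slides; the numbers are made up for illustration.

```python
from math import exp, factorial


def poisson_tail(k: int, expected: float) -> float:
    """P(X >= k) for a Poisson-distributed count with the given mean."""
    below = sum(exp(-expected) * expected**i / factorial(i) for i in range(k))
    return 1.0 - below


# Hypothetical history: about 2 failures per month on average.
print(f"{poisson_tail(3, 2.0):.3f}")  # 0.323: 3 failures in a month is unremarkable
print(poisson_tail(8, 2.0) < 0.01)    # True: 8 in a month suggests something changed
```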

Things That Can Change


* Design
* Manufacturing Process
* Test and Inspection
* Environment
* Lot Changes (Manufacturing variation)
* Aging
* Supplier Changes


Design Changes
* Controlled design change
* Redlined design change
* Rejected material: use as is or repair
* Outsourced components and subassemblies


Process Changes
* Work or Build Instructions
  * Many companies do not have instructions
  * Imprecise work instructions
  * Little rigor on changes
  * Changes to equipment, tooling, settings not documented
  * People not following the instructions
* Investigate the documentation for changes
* Investigate for non-documented changes
* SPC can help and provide a starting point
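As an illustration of how SPC data can point to an undocumented process change, the sketch below sets limits from a baseline period and flags later measurements outside them. It is a simplified check that uses the sample standard deviation; production control charts usually estimate sigma from subgroup or moving ranges. All measurements are invented.

```python
from statistics import mean, stdev


def control_limits(baseline, sigmas=3.0):
    """Lower and upper limits at mean +/- sigmas * sample standard deviation."""
    center = mean(baseline)
    spread = stdev(baseline)
    return center - sigmas * spread, center + sigmas * spread


def out_of_control(samples, limits):
    """Return (index, value) for every sample outside the limits."""
    low, high = limits
    return [(i, x) for i, x in enumerate(samples) if x < low or x > high]


# Hypothetical measurements from a period when the process was known to be good.
baseline = [10.0, 10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9, 10.0]
recent = [10.0, 10.1, 10.9, 9.9]

limits = control_limits(baseline)
print(out_of_control(recent, limits))  # [(2, 10.9)]
```

A flagged point does not prove a change caused the failure; it marks where to start looking.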

Test and Inspection Changes


* Investigate for issues in testing
  * Failures returned to manufacturing
  * Reworked systems or components
  * Results out of the ordinary
  * Changes in the test process
* Investigate inspection processes and results
  * Change of inspectors
  * Change of instructions
  * Items noted but system continued anyway
* Changes in test or inspection equipment



Environmental Changes
* Temperature and Humidity issues
  * Curing or drying of materials
  * Non-environmentally controlled processes
  * Investigate if failure correlates to temp/humidity
* Storage
  * Investigate changes in environment or procedure
  * Epoxies and raw material may be sensitive
  * Moving locations can induce changes
* Shipping
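Checking whether failures correlate with temperature or humidity can start with a plain correlation coefficient. The sketch below computes Pearson's r by hand; the weekly humidity and failure counts are invented for illustration, and a high r only suggests where to investigate, since correlation alone does not establish cause.

```python
from math import sqrt
from statistics import fmean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / sqrt(sxx * syy)


# Hypothetical weekly data: average relative humidity (%) and failures found.
humidity = [30, 35, 40, 55, 60, 70]
failures = [1, 0, 1, 3, 4, 6]

r = pearson_r(humidity, failures)
print(f"r = {r:.2f}")  # a value near +1 means failures rise with humidity
```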

Lot and Supplier Changes


* Manufacturing has normal variation
  * Sometimes failures correlate to supplier lots
  * May be related to material distributions
  * Investigate if failure correlates to supplier lots
  * Need to understand suppliers' processes
* Aging
* Suppliers can change materials or designs
  * Purchased supplies may still meet specs
  * Investigate for changes that affect system
  * Can be difficult and sensitive to get information
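A first look at whether failures correlate to supplier lots is a failure fraction per lot. The sketch below assumes each unit can be traced to a lot; the lot IDs and counts are hypothetical.

```python
from collections import Counter


def failure_fraction_by_lot(units):
    """units: iterable of (lot_id, failed) pairs. Returns {lot_id: fraction failed}."""
    built = Counter()
    failed = Counter()
    for lot, did_fail in units:
        built[lot] += 1
        if did_fail:
            failed[lot] += 1
    return {lot: failed[lot] / built[lot] for lot in built}


# Hypothetical traceability data: 50 units from each of two supplier lots.
units = (
    [("LOT-A", False)] * 48 + [("LOT-A", True)] * 2
    + [("LOT-B", False)] * 40 + [("LOT-B", True)] * 10
)

fractions = failure_fraction_by_lot(units)
for lot, frac in sorted(fractions.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{lot}: {frac:.0%}")
# LOT-B: 20%
# LOT-A: 4%
```

A lot that stands out like this is a lead to evaluate, for example by asking the supplier what differed for that lot.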

Example from Textbook: CBU-87/B Cluster Bomb
