Questions x-y below refer to the Meteorology Telemetry System Case Study below. Make sure
you refer to this case when answering these questions.
You are working for the New Products section in the Communications Division of Electrosystems Ltd,
which has an annual turnover of $30m for the local and export markets. A new request for 10
identical remote meteorology telemetry systems has been received. The order is expected to be
worth around $2 million, with additional sales in subsequent years being worth up to $6 million.
You are part of a team that develops a conceptual design as shown in Figure 1.
[Figure 1 block labels: CPU, ROM, A/D, I/O, etc.; Instrument Module; Modulator and transmitter]
Figure 1: Preliminary design
Five of the 14 weather inputs are classified as major. These inputs are barometric pressure, wet
and dry bulb temperature, wind direction and speed. Failure of a sensor will result in grossly
inaccurate (or loss of) readings, and the loss of the corresponding input parameter. The system fails
if any one of these five parameters fails, or if the signal processing board, communication board or
power supply unit fails. The customer requires a 10-year MTBF for the system.
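As a rough illustration of what the MTBF requirement implies, the sketch below converts a target MTBF into a total failure-rate budget for a series system with constant failure rates. The 8760 hours per year figure and the function names are assumptions made for this example; they are not part of the case study.

```python
HOURS_PER_YEAR = 8760  # assumption: 365-day year, continuous operation


def required_rate_per_million_hours(mtbf_years: float) -> float:
    """Total constant failure rate (per 10^6 h) implied by a target MTBF."""
    return 1e6 / (mtbf_years * HOURS_PER_YEAR)


def series_mtbf_years(rates_per_million_hours) -> float:
    """MTBF of a series system: constant failure rates simply add."""
    total_rate = sum(rates_per_million_hours)
    return 1e6 / total_rate / HOURS_PER_YEAR


budget = required_rate_per_million_hours(10)
print(round(budget, 2))  # 11.42 failures per million hours
```

Under these assumptions, the failure rates of the five major sensors, the signal processing board, the communication board and the power supply unit would together need to stay within this budget.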
First consider failure of the instrument modules. The required meteorological instrument modules
are available from specialist suppliers. One supplier showed evidence from three contracts, each for
50 instrument modules over a period of 5 years. In this data set there were 12 sensor failures that
resulted in the failure of a major parameter. Data on when the failures occurred (based on number
of months in service) is provided in Table 1. When a sensor failed it was not replaced and the
instrument module was considered failed. The combined operation time for these 12 sensors is 324
months (= 27.0 years). The remaining units that did not fail survived the 60-month period.
Table 1: Times to failure for the sensors on the instrument modules (months from start of service)
Failure number      1   2   3   4   5   6   7   8   9   10  11  12
Months in service   6   9   11  15  17  18  28  30  33  45  55  57
An industry database for failure data on instrumentation was used to check the calculated failure
rate.
A Failure Modes and Effects Analysis (FMEA) was done on the proposed design shown in Figure 1. In
the FMEA process the following failure rates were used:
1. What is the expected failure rate of the instrument module based on the total time on test?
A. 1.82 failures per million hours
B. 1.91 failures per million hours
C. 5.48 failures per million hours
D. 30.45 failures per million hours
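A minimal sketch of a total-time-on-test estimate using the figures quoted in the case (three contracts of 50 modules, a 60-month observation period, and the failure times in Table 1). It assumes that each failed module contributes its time to failure, that each surviving module contributes the full 60 months, and that a month is 730 hours; these conventions are assumptions of the example.

```python
failure_months = [6, 9, 11, 15, 17, 18, 28, 30, 33, 45, 55, 57]  # Table 1
n_units = 3 * 50             # three contracts of 50 instrument modules
observation_months = 60      # 5-year observation period
HOURS_PER_MONTH = 8760 / 12  # assumption: 730 hours per month

n_survivors = n_units - len(failure_months)
total_months = sum(failure_months) + n_survivors * observation_months
total_hours = total_months * HOURS_PER_MONTH

# Point estimate of a constant failure rate: failures / total time on test
rate_per_million_hours = len(failure_months) / total_hours * 1e6
print(total_months, round(rate_per_million_hours, 2))  # 8604 1.91
```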
2. The data presented in Table 1 is an example of which of the following (select one only)
A. Right censoring - Failure terminated
B. Right censoring - Time terminated data
C. Left censoring
D. Interval censoring
[Figure: Weibull probability plots (Weibull-2P, RRX SRM MED FM) of unreliability F(t) against time t, both on log scales with time running from 1 to 100. Annotations that survive: β=1.5864, η=30.2644, ρ=0.9862 with F=12/S=0; β=1.2318, η=346.9859, ρ=0.9599 with F=12/S=138.]
A. Top left
B. Top right
C. Bottom left
D. Bottom right
4. What is the mean time to failure (years) of the meteorological instrument modules based on
the Weibull analysis in the previous question?
Note: Gamma function values Γ(1.60)=0.892; Γ(1.63)=0.897; Γ(1.81)=0.934; Γ(2)=1
A. 2.2 years
B. 7.7 years
C. 27.0 years
D. 59.2 years
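For reference, the mean of a two-parameter Weibull distribution is η·Γ(1 + 1/β). The sketch below applies that formula to both parameter sets quoted with the probability plots, reading the first value of each set as the shape β and the second as the scale η in months; that reading of the annotations is an assumption. Which set is the appropriate one depends on the answer to the previous question.

```python
import math


def weibull_mean(beta: float, eta: float) -> float:
    """Mean of a two-parameter Weibull: eta * Gamma(1 + 1/beta)."""
    return eta * math.gamma(1 + 1 / beta)


# (shape beta, scale eta in months) as quoted with the plots
parameter_sets = {
    "first set": (1.5864, 30.2644),
    "second set": (1.2318, 346.9859),
}

mttf_years = {
    label: weibull_mean(beta, eta) / 12
    for label, (beta, eta) in parameter_sets.items()
}
for label, years in mttf_years.items():
    print(label, round(years, 2))
```

The arguments passed to the gamma function come out close to 1.63 and 1.81, which matches the values tabulated in the note to the question.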
5. The failure of any sensor or component leads to the loss of a major parameter. Using the
estimated failure rate for the instrument modules based on the total time on test and the
data provided on failure rate for the other components given in Table X, what is the
8. You and your team have conducted a risk identification and risk assessment. The major risks
you have identified are a) financial exposure and opportunity, b) the reliance on a limited
number of specialist suppliers for the instrument module, c) the maintenance costs involved
and lack of access to the remote sites where the meteorological instruments will be located,
d) uncertainties over how the instruments will perform in a range of operating
environments.
What is the most appropriate risk control measure for managing the exposure in supply of the
instrument module?
You are part of a team involved in doing a quantitative risk assessment for the tailings dam at the
Gold Bug mine near Meekatharra. Gold Bug is a relatively small and remote mine producing about
80k oz/year. The dam is about 2 km from the plant and has been running about 10 years since its
construction in 2003. It was built using conventional processes in which an outer wall ~50 m high is
constructed and then an inner drywall from consolidated slimes. Slurry is deposited in the dam from
spigots. As slurry runs down the inside of the dam it spreads into thin layers, allowing the solids to
settle and compact over a period of weeks. Excess (clear) water drains to a pool in the centre of the
dam from where it is pumped back to the plant.
Due to problems with topography and Aboriginal heritage issues during construction, a contractors'
camp was located downhill from the tailings dam. This is where the off-shift contractors are housed.
During construction there were 6 people in the camp at all times. The initial modelling work
assumed that when the dam was more than 20% full, a breach (failure) of the dam wall would
release 60,000 m³ of mud, which would likely result in the burial of the camp. The initial risk assessment
in 2003 used the following assumptions:
It was understood that once construction was completed in 2003 the construction camp would
be moved. However, this has not happened. The construction camp is still being used to house
visiting contractors and consultants.
Recently there have been problems with the tailings dam as follows:
- Process upsets in the gold plant have resulted in large volumes of lower density material
  being sent to the dam.
- The dam has a set of piezometers monitoring wall stability.
- There is only about 300 mm of vertical freeboard, which includes the height of the tailings
  dry wall.
- Instead of the pond sitting in the middle of the dam it is now up against the northern wall.
In 2013 you and your colleagues update the probability of failure of the dam wall to 1 × 10⁻³ per annum
and the probability of the dam contents reaching the camp to 0.75. All other assumptions remain
the same. You compare your calculated values with the AGS (2000) suggested tolerable risk criteria
of 10⁻⁵ per annum.
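The comparison with the tolerable risk criterion can be sketched as a chain of conditional probabilities. Only the two updated 2013 values appear in the text above; the remaining 2003 assumptions are not reproduced here, so the hypothetical parameter `p_remaining` stands in for them and defaults to 1.0, which makes the result an upper bound rather than the assessed risk.

```python
def annual_risk(p_wall_failure: float, p_reach_camp: float,
                p_remaining: float = 1.0) -> float:
    """Annual probability of harm at the camp as a product of conditionals.

    p_remaining lumps together any further conditional factors from the
    2003 assumptions (hypothetical placeholder; 1.0 gives an upper bound).
    """
    return p_wall_failure * p_reach_camp * p_remaining


TOLERABLE = 1e-5  # AGS (2000) suggested tolerable risk, per annum

upper_bound = annual_risk(1e-3, 0.75)
print(f"{upper_bound:.2e}")            # 7.50e-04
print(round(upper_bound / TOLERABLE))  # 75
```

With the placeholder left at 1.0, the annual probability of the dam contents reaching the camp is 75 times the criterion, so the remaining conditional factors would have to be small for the criterion to be met.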
Questions
13. In the original design, what is the probability of a camp occupant being killed given a failure of
the tailings dam?
A. 0.15
15. The regulation says that the storage capacity should be sufficient to ensure a freeboard of
at least 0.5 m above the expected maximum water level, which shall be based on the
average monthly rainfall figures less the gross mean evaporation in that area, plus the
maximum precipitation to be expected over a period of 24 hours with a frequency of once in
100 years. The phrase "with a frequency of once in 100 years" means which of the
following (select one)?
A. This will occur once every 100 years.
B. The time between events follows a distribution with a mean of 100 years
C. The time between events is 100 years
D. None of the above
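One common way to reason about a return period is to treat it as an annual exceedance probability of 1/T, with each year independent of the others. Under that assumption (which belongs to this sketch, not to the regulation), the chance of seeing at least one such event over a given horizon can be computed directly:

```python
def prob_at_least_one(return_period_years: float, horizon_years: int) -> float:
    """P(at least one event in the horizon), assuming independent years
    with a constant annual exceedance probability of 1 / return period."""
    annual_p = 1 / return_period_years
    return 1 - (1 - annual_p) ** horizon_years


print(round(prob_at_least_one(100, 100), 3))  # 0.634
print(round(prob_at_least_one(100, 10), 3))   # 0.096
```

Under this model a "once in 100 years" event is neither guaranteed within a century nor spaced at fixed 100-year intervals.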
16. There are a number of barriers engineers traditionally consider when selecting risk controls.
Which one of the following barriers will NOT assist in managing the risk of dam failure?
A. Engineering controls
B. Physical separation
C. Monitoring and Control
D. PPE
AC induction motor
Function: to convert electrical energy to mechanical energy to supply torque to the pump shaft
Pump
Function: to move water at a specified rate from source to destination
Control valve
Function: to control the flow of water through the system
Piping
Function: to convey the water from source to destination
Isolation valve
Function: to isolate required sections of the piping system
Figure 2 Functional diagram for the Maleny pump station
Investigation
The section below describes the data collected and analysis conducted in 2004 at the pump station.
Planned vs unplanned work: Asset managers seek to manage equipment to avoid unexpected
failures or unplanned work. Unplanned work is generally classified as work that is not part of the
scheduled maintenance plan (planning window). When failures occur within the planning window,
unplanned work is often initiated to address the failures.
Sources of failures: In order to investigate the source of these costs unplanned work orders were
broken down into mechanical and electrical failures. 90% of all failures were found to be associated
with electrical equipment. The remaining 10% were mechanical failures and were due to broken
flow switches, pipe work and valve leaks.
Reset failures are associated with electrical problems when the pump has either failed to start or
failed in operation. They are generally associated with poor troubleshooting skills. Contactor failures
are associated with wear on the contactors in the electrical starter system or problems with the PLC
control logic. Protection failures are associated with the motor protection and soft start system.
Citect failures are associated with the PLC due to logic errors, control system crashes and
communications issues. Flow failures are associated with the flow instrumentation; these are often
associated with the physical parts of the measurement system such as the paddle. Thermal failures
are associated with the thermistor in the motor when it detects high temperature.
Reliability analysis: Reliability analysis was conducted based on failure data from November 2000-
August 2004. Table 1 shows the results including analysis of failure events after the electrical system
upgrade in 2003. During this upgrade the starter and main switchboard were replaced and the Citect
control system improved. This change was motivated by problems of finding spare parts.
17. Prior to the electrical upgrade the failure data for the Citect system had a Beta value of 1.11
and an Eta value of 805 hours. What was the MTBF of the Citect system?
A. 509 hrs
B. 762 hrs
C. 774 hrs
D. 805 hrs
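For a Weibull distribution with shape Beta and scale Eta, the mean time between failures is Eta·Γ(1 + 1/Beta). A minimal sketch using the two values quoted in the question follows; the function name is chosen for this example.

```python
import math


def weibull_mtbf(beta: float, eta_hours: float) -> float:
    """Mean of a two-parameter Weibull: eta * Gamma(1 + 1/beta)."""
    return eta_hours * math.gamma(1 + 1 / beta)


mtbf = weibull_mtbf(1.11, 805)
print(round(mtbf))  # 774
```

Because Beta is close to 1, the gamma factor is close to 1 and the MTBF sits only slightly below Eta.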
18. Hazard rate plots of data from Table X for the Citect, Contactor, Flow failures and Pump
failures are shown in the Figure below. They are labelled A, B, C and D. Identify which plot is
for which failure.
[Figure: hazard rate (vertical axis, 0 to 0.0025) against time in hours (horizontal axis) for four curves labelled A, B, C and D]
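When reading hazard rate plots such as these, it helps to recall the Weibull hazard function h(t) = (β/η)(t/η)^(β−1): it falls over time for β < 1, is constant for β = 1, and rises for β > 1. The sketch below evaluates it for the Citect parameters quoted in question 17 (β = 1.11, η = 805 hours); the evaluation times are arbitrary values chosen for illustration.

```python
def weibull_hazard(t: float, beta: float, eta: float) -> float:
    """Weibull hazard rate h(t) = (beta / eta) * (t / eta) ** (beta - 1)."""
    return (beta / eta) * (t / eta) ** (beta - 1)


# Citect values from question 17; times in hours are illustrative only
for t in (100, 805, 2000):
    print(t, round(weibull_hazard(t, 1.11, 805), 5))
```

With β only slightly above 1, the curve rises gently, and its value at t = η is β/η.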
21. Based on your answer to the previous question and without doing any calculations, use your
judgement to estimate the reliability of the system for 2 out of 4 pumps.
A. 0 to 0.20
B. 0.20 to 0.40
C. 0.40 to 0.70
D. 0.70 to 1.00
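Although the question asks for a judgement rather than a calculation, the underlying k-out-of-n model can be sketched as a binomial sum. The sketch assumes that "2 out of 4" means at least two of four identical, independent pumps must work, and the single-pump reliability of 0.5 is a hypothetical value, since the figure from the previous question is not reproduced here.

```python
from math import comb


def k_out_of_n(k: int, n: int, r: float) -> float:
    """Reliability of a system needing at least k of n identical,
    independent units, each with reliability r."""
    return sum(comb(n, i) * r**i * (1 - r) ** (n - i) for i in range(k, n + 1))


# r = 0.5 is a hypothetical single-pump reliability for illustration
print(round(k_out_of_n(2, 4, 0.5), 4))  # 0.6875
```

The redundancy lifts system reliability above that of a single pump whenever the single-pump reliability is not very low.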
22. Examining the failure data collected on the pumps and the other information presented in
the case, is an age-based maintenance replacement strategy appropriate for the pumps?
A. Yes because you know the MTBF value of the pump and this is used to set the age of change
out
B. Yes because this is what has been done in the past
C. No because the failure times are exponentially distributed and therefore an age based
replacement strategy is not appropriate
D. No because the failure behaviour is indicative of wear in and therefore an age based
replacement strategy is not appropriate
23. Which of the following is not true in this case study?
A. Unplanned work is more costly than planned work
B. Planned work is work completed as per the weekly maintenance plan
C. The risks associated with planned and unplanned work are the same
D. Unplanned work often results in deferring planned work
24. Given the experience and knowledge available concerning the operation and maintenance of
the Clearwater Pumping Station, what would be the most economical and practical
approach for the reliability engineer to review and update the maintenance strategies/
tactics for assets at the station?
A. Hazard and Operability Study (HAZOP)
B. PMO (also known as Reverse RCM)
C. Reliability Centred Maintenance (RCM)
D. Risk Management Study
25. The Piper Alpha disaster in 1988 is now considered a classic example of the failure of what
system?
A. Communication
B. Design with respect to location of the gas compression system
C. Emergency management
D. Permit to work
26. After the initial gas explosion on Piper Alpha caused by the start of pump A and the failure of
the blind closing the line, there were a number of events that aggravated the situation
contributing to more casualties than might otherwise have occurred. Which one of the
following was NOT an aggravating event?
A. Gas continued to rise up through the Piper Alpha drilling system from the reservoir
B. The fire deluge system failed to start
C. Adjacent rigs Tartan and Claymore continued to pump oil
D. The helicopter deck was unusable due to smoke and high winds
27. The Cullen Enquiry identified a number of design decisions that had contributed to the scale
of the disaster. Which one of the following was NOT considered a contributing factor?
A. The rig was powered by gas-fired generators which were reliant on either gas pump A or B
operating to supply power to the drill
B. The design of the pressure relief valve on the discharge of pumps A and B
C. The weakness of the walls separating the gas compression area from the oil area
D. Co-location of the main oil and gas trunk lines
28. The hierarchy of controls is an important concept in the selection of reactive and proactive
controls. What is the correct order of this hierarchy from least to most effective?
A. Administrative controls - PPE - Engineering controls - Substitution - Elimination
B. PPE - Engineering controls - Elimination - Substitution - Administrative controls
C. PPE - Administrative controls - Engineering controls - Substitution - Elimination
D. Administrative controls - PPE - Engineering controls - Elimination - Substitution
29. The work of the Centre for Safety at UWA has been instrumental in trying to develop ways of
communicating about safety. One result of this has been to distil the core concept of safety
culture into two terms. These were core themes in the 3.5 min video made by Rio Tinto and
the Centre for Safety and in the 12th workshop these were presented as an equation to assist
you to remember it. What was the equation?
A. Safety culture = function (reliability, performance)
B. Safety culture = function (compliance, proactivity)
C. Safety culture = function (behaviour, shared values)
D. Safety culture = function (participation, mental models)
31. ISO 31000:2009 defines the external context as being "the external environment in which
the organization seeks to achieve its objectives". Which of the following is not considered
part of the external context?
A. Form and extent of contractual relationships
B. Cultural, social, political, legal, regulatory, financial, technological, economic, natural and
competitive environment
C. Key drivers and trends having impact on the objectives of the organization
D. Relationships with, and perceptions and values of external stakeholders
32. The establishment of external communication and reporting processes is described in the
ISO 31000:2009 risk management standard. In the context of applying this requirement to
the proposed Driverless Car Trial discussed in the class, which of the following actions would
NOT be appropriate?
A. Engaging appropriate external stakeholders such as local councils (Nedlands, Subiaco
and the City of Perth), residents of these areas and groups such as Main Roads.
B. External reporting to comply with legal, regulatory and government regulations.
C. Advertising to promote the clean green image of the Driverless Car project
D. Communication with stakeholders involved in crisis management such as Police, Fire and
Emergency Services.
33. How do we assess the reliability of a software system?
A. Determine system reliability using reliability block diagrams
B. Examining repair history
C. Examining the testing history
D. Calculating the reliability of the individual software modules
34. Which of the following is NOT a common assumption of software reliability models?
A. The failure rate is proportional to the number of remaining defects
B. New defects can be introduced during the repair process
C. Defects are fixed very soon after discovery (MTTR is small)
D. Defects are independent
35. The definition of risk in ISO 31000:2009 is:
A. Deviation from the expected, positive or negative
B. The effect of uncertainty on objectives
C. The probability of something happening multiplied by the resulting cost or benefit if it does
D. The probability of uncertain future events