Lecture 10 – System Safety

1
System Safety
• Safety Hierarchy
• Systems Hazard Analysis
• Fault Tree Analysis
• FMECA
• Probabilistic Risk Analysis (PRA)

2
Safety Hierarchy
• Clearly establishes priorities within an
organization
• Example - At NASA the safety hierarchy is
– Safety of the general public
– Safety of our flight crews
– Safety of all other NASA personnel
– Safety of mission-critical or high-value hardware
• So for example, per this hierarchy, we will
– risk a crew to prevent injury to the public (this is why
we fly a range safety destruct system on shuttle)
– expend hardware to save flight or ground
personnel (this is why we fly ejection seats on most
research aircraft)

3
Hazard Analysis
• A structured thinking process to
– Identify risks
– Categorize risks
– Determine and implement mitigations
• Attempt to identify potential causes of loss of crew and
vehicle by a “what-if” process
• Best done during the design process
– Usually most effective to design a hazard out of a system
– Evaluation must be performed early to implement mitigations in
design
• Must be re-evaluated as flight experience is gained
– At the time of the Columbia accident, shuttle hazard analyses had not
been reviewed since the mid-1980s, despite documented plans to review
them every 3 years
• A fault tree is an effective mechanism for communicating
and searching for potential hazards in a structured way
4
Essential Elements of Hazard Analysis

• Categorizing each potential negative outcome by Likelihood and Severity
• Standard processes at NASA use a 4 by 3 (likelihood vs consequence)
or a 5 by 5 array
• In a 4 by 3 system
– Likelihood is rated from
• Improbable – will never happen in the life of a program
• Remote – could happen in the life of the program
• Infrequent – could happen multiple times in the life of a program
• Probable – likely to happen
– Severity is rated from
• Marginal – limited damage or minor injury to personnel
• Critical – major damage to vehicle, facilities or major injury to
personnel
• Catastrophic – Loss of personnel or vehicle
5
Each Combination of Likelihood and
Consequence Is Rated
• Hazard causes that are Catastrophic and Probable are
rated RED
– Traditionally these are considered “showstoppers”, i.e. we cannot
proceed unless these risks are mitigated
• Hazard causes that are Probable or Infrequent but not
Catastrophic are rated YELLOW
– They are likely but of limited effect
– YELLOW risks require mitigation
• Hazard causes that are Infrequent or Remote but
Catastrophic are rated YELLOW
– They are not likely but are so serious that they require additional
mitigation
– YELLOW risks require mitigation
• Other hazard causes are rated GREEN – accepted risks
without further mitigation
– They are unlikely or of limited severity
6
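The rating rules on this slide can be written down as a small lookup. The sketch below is only an illustration of the rules as stated in the bullets; the names `rate`, `LIKELIHOODS` and `SEVERITIES` are made up for this example and are not part of any NASA tool, and the colors on the actual matrix slide may differ from what these rules produce.

```python
# Likelihood and severity levels of the 4 by 3 system described in the lecture.
LIKELIHOODS = ("Improbable", "Remote", "Infrequent", "Probable")
SEVERITIES = ("Marginal", "Critical", "Catastrophic")


def rate(likelihood: str, severity: str) -> str:
    """Return RED, YELLOW or GREEN for one likelihood/severity combination."""
    if likelihood not in LIKELIHOODS or severity not in SEVERITIES:
        raise ValueError(f"unknown rating: {likelihood!r}, {severity!r}")
    catastrophic = severity == "Catastrophic"
    # Catastrophic and Probable: showstopper.
    if catastrophic and likelihood == "Probable":
        return "RED"
    # Probable or Infrequent but not catastrophic: likely, limited effect.
    if not catastrophic and likelihood in ("Probable", "Infrequent"):
        return "YELLOW"
    # Infrequent or Remote but catastrophic: unlikely, very serious.
    if catastrophic and likelihood in ("Infrequent", "Remote"):
        return "YELLOW"
    # Everything else is unlikely or of limited severity.
    return "GREEN"


if __name__ == "__main__":
    # Print the matrix with the most likely row first, as on the slide.
    for likelihood in reversed(LIKELIHOODS):
        cells = [f"{rate(likelihood, s):<7}" for s in SEVERITIES]
        print(f"{likelihood:<11}", *cells)
```

Running the script prints one row per likelihood level, which makes it easy to compare the stated rules against a hand-drawn matrix.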
Hazard Analysis Matrix
[Figure: 4 by 3 matrix with Likelihood rows (Probable, Infrequent, Remote, Improbable) and Severity columns (Marginal, Critical, Catastrophic); cells are color-coded per the legend]
Red – unacceptable
Yellow – accepted risk
Green – controlled
7
Each hazard that requires mitigation
• For each hazard cause that requires mitigation we establish a
– CONTROL – an activity which prevents the
hazard from occurring
– VERIFICATION – the activity which shows
that the control is in place

8
Example – Hazard Analysis for Commuting from
Home to School and back

• Hazard causes on a commute to and from school include
– Tire blowout is a cause of loss of control
– Engine failure is a cause of loss of control or being
stranded on the road
– Collision with another vehicle is a cause of impact
damage
– Collision with road debris is a cause of impact
damage
– Road conditions are a cause of loss of control
– Flash flood is a cause of stranding and drowning

9
Hazard Analysis – Commute to School
Hazard Number | Hazard | Likelihood | Severity | Color | Control | Verification
1 | Tire blowout | Infrequent | Critical-Catastrophic | Yellow | Proper tire inflation and tread wear. Carry spare. | Inspect on a regular basis and check inflation. Ensure proper spare storage.
2 | Engine failure | Infrequent | Critical | Yellow | Regular maintenance checks | Maintain vehicle log and perform scheduled checks per log
3 | Collision with another vehicle | Remote | Catastrophic | Yellow | Operator awareness. Increase vehicle visibility by lights on. Wear seat belt. | Operator training and pre-launch procedure
4 | Collision with road debris | Infrequent | Critical-Catastrophic | Yellow | Operator awareness. Reduce speed in low visibility conditions. | Operator training
5 | Road conditions | Infrequent | Critical-Catastrophic | Yellow | Operator awareness. Listen to alerts on radio. | Operator training. Pre-launch procedure
6 | Flash flood | Remote | Catastrophic | Yellow | Operator awareness. Listen to alerts on radio. Route around low spots. | Operator training. Pre-launch procedure
10
Hazard Analysis Matrix – Commuting
Likelihood \ Severity | Marginal | Critical | Catastrophic
Probable | - | - | -
Infrequent | - | 2 | 1, 4, 5
Remote | - | - | 3, 6
Improbable | - | - | -
11
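As a rough illustration of how the commuting matrix is filled in, the sketch below bins the six hazards from the commute table into likelihood/severity cells. The helper names (`worst_case`, `build_matrix`) are invented for this example, and binning a range such as "Critical-Catastrophic" by its worst end is an assumption made here to match where hazards 1, 4 and 5 appear in the matrix.

```python
from collections import defaultdict

# (number, hazard, likelihood, severity) as listed in the commute table.
HAZARDS = [
    (1, "Tire blowout", "Infrequent", "Critical-Catastrophic"),
    (2, "Engine failure", "Infrequent", "Critical"),
    (3, "Collision with another vehicle", "Remote", "Catastrophic"),
    (4, "Collision with road debris", "Infrequent", "Critical-Catastrophic"),
    (5, "Road conditions", "Infrequent", "Critical-Catastrophic"),
    (6, "Flash flood", "Remote", "Catastrophic"),
]


def worst_case(severity: str) -> str:
    """Assumption: a severity range is binned by its most severe end."""
    return severity.split("-")[-1].strip().capitalize()


def build_matrix(hazards):
    """Group hazard numbers by (likelihood, worst-case severity) cell."""
    cells = defaultdict(list)
    for number, _name, likelihood, severity in hazards:
        cells[(likelihood, worst_case(severity))].append(number)
    return dict(cells)


MATRIX = build_matrix(HAZARDS)

if __name__ == "__main__":
    for (likelihood, severity), numbers in sorted(MATRIX.items()):
        print(f"{likelihood:<11}{severity:<13}{numbers}")
```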
DOD MIL-STD-882D
• MIL-STD-882D describes hazard
analysis for Department of Defense
projects.
• It is equivalent to the NASA
system but has additional levels of severity
and likelihood
• It also attaches a probability of occurrence
to each level of likelihood

12
13
14
Hazard Perspective
Event Chance This Year
Car stolen 1 in 100
House catch fire 1 in 200
Die from Heart Disease 1 in 280
Die of Cancer 1 in 500
Die in Car wreck 1 in 6,000
Die by Homicide 1 in 10,000
Die of AIDS 1 in 11,000
Die of Tuberculosis 1 in 200,000
Win a state lottery 1 in 1 million
Killed by lightning 1 in 1.4 million
Killed by flood or tornado 1 in 2 million
Killed in Hurricane 1 in 6 million
Die in commercial plane crash 1 in 1 million to 10 million (depends on airline)
http://www.cotf.edu/ete/modules/volcanoes/vrisk.html
15
Hazard Perspective

http://www.anesi.com/accdeath.htm
16
DOD 5 by 4 Risk Assessment

17
Irreducible Risk
• In every project, you get to a point where risks
cannot be further reduced
– In our commuting example, collision with road debris
is an example of irreducible risk
– You really can’t do much to eliminate the risk of road
debris and there is only so much you can do to avoid
it or shield yourself from it
• At that point, you have to evaluate the risk and
accept it if its severity and likelihood are
low enough
• The hazard analysis approach provides a
language and technique for communicating risks

18
Risk Reduction Hierarchy of
Effectiveness
• In general it is preferred to
– Eliminate hazard cause by design – e.g. eliminate
potential causes of fire aboard vehicle
– Eliminate hazard cause by operational procedure or
workaround – e.g. don’t do things that produce sparks
or open flames that could cause a fire
– Prevent operation into hazard with warning systems –
e.g. install a fire/smoke detection system so that fires
can be detected and extinguished
– Mitigate effect of hazard with emergency system –
e.g. a system to automatically release fire fighting
chemicals when a fire is detected

19
Fault Tree Analysis
• Fault Tree Analysis is a technique for
investigating hazards and showing how
hazard causes relate to each other
• Many hazards are caused by
combinations of events

20
Printer Fails To Print
– Paper Jam
– Out of Paper
– Out of Toner
– No Power
– Printer data connection lost
– Printer driver software not properly configured

Example Fault Tree For “Printer Fails To Print”

21
Gate Logic
OR Gate (inputs A and B, output C)
A | B | Output C
No Fault | No Fault | No Fault
No Fault | Fault | Fault
Fault | No Fault | Fault
Fault | Fault | Fault

AND Gate (inputs A and B, output C)
A | B | Output C
No Fault | No Fault | No Fault
No Fault | Fault | No Fault
Fault | No Fault | No Fault
Fault | Fault | Fault
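The gate logic can be tried out with a very small evaluator. This is a minimal sketch, not a fault tree tool: the `Event` and `Gate` classes and the `is_fault` function are hypothetical names for this example. Joining the printer causes with an OR gate is an assumption, since the gate symbol did not survive in the slide text, although any single listed cause is enough to stop printing.

```python
from dataclasses import dataclass, field
from typing import List, Set, Union


@dataclass
class Event:
    """A bottom-level event, identified by name."""
    name: str


@dataclass
class Gate:
    """An intermediate or top event whose state depends on its inputs."""
    name: str
    kind: str  # "AND" or "OR"
    inputs: List[Union["Gate", Event]] = field(default_factory=list)


def is_fault(node: Union[Gate, Event], faults: Set[str]) -> bool:
    """Return True if the node is in a fault state given the active faults."""
    if isinstance(node, Event):
        return node.name in faults
    results = [is_fault(child, faults) for child in node.inputs]
    if node.kind == "OR":
        return any(results)   # fault if at least one input is a fault
    if node.kind == "AND":
        return all(results)   # fault only if every input is a fault
    raise ValueError(f"unknown gate kind: {node.kind!r}")


# Printer example; the OR gate at the top is an assumption (see above).
PRINTER_TREE = Gate("Printer Fails To Print", "OR", [
    Event("Paper Jam"),
    Event("Out of Paper"),
    Event("Out of Toner"),
    Event("No Power"),
    Event("Printer data connection lost"),
    Event("Printer driver software not properly configured"),
])

# Generic two-input AND gate matching the A/B/C truth table.
AND_GATE = Gate("C", "AND", [Event("A"), Event("B")])

if __name__ == "__main__":
    print(is_fault(PRINTER_TREE, {"Out of Toner"}))
    print(is_fault(AND_GATE, {"A"}))
```

Because gates can hold other gates as inputs, the same function handles deeper trees in which a branch needs a combination of events.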

22
Fault Tree Symbology
• Transfer Gate (to other branch of tree)
• Sufficient data and rationale exist to defer further tree development at this time
• Bottom-level Event
• Closed Event – rationale exists to exonerate event as initiating event or contributor to event
• OR Gate
• AND Gate
23
Toyota Sudden Acceleration problems
November 4 2009
NHTSA swiftly issues a statement to correct Toyota’s statement that the investigation is over:
“Toyota has announced a safety recall involving 3.8 million vehicles in which the accelerator pedal may become stuck at
high vehicle speeds due to interference by the driver’s side floor mat, which is obviously a very dangerous situation.
Toyota has written to vehicle owners stating that it has decided that a safety defect exists in their vehicles and asking
owners to remove all floor mats while the company is developing a remedy. We believe consumers should follow Toyota’s
recommendation to address the most immediate safety risk. However, removal of the mats is simply an interim measure,
not a remedy of the underlying defect in the vehicles. NHTSA is discussing with Toyota what the appropriate vehicle
remedy or remedies will be. This matter is not closed until Toyota has effectively addressed the vehicle defect by providing
a suitable remedy.”

Then on January 22 2010

Toyota announces a new recall for sticky accelerator pedals, separate and apart from the floor mat recall. Toyota says:
“Due to the manner in which the friction lever interacts with the sliding surface of the accelerator pedal inside the pedal
sensor assembly, the sliding surface of the lever may become smooth during vehicle operation. In this condition, if
condensation occurs on the surface, as may occur from heater operation (without A/C) when the pedal assembly is cold,
the friction when the accelerator pedal is operated may increase, which may result in the accelerator pedal becoming
harder to depress, slower to return, or, in the worst case, mechanically stuck in a partially depressed position. In addition,
some of the affected vehicles’ pedals were manufactured with friction levers made of a different material (PA46), which
may be susceptible to humidity when parked for a long period in hot temperatures. In this condition, the friction when the
accelerator pedal is operated may increase, which may result in the accelerator pedal movement becoming rough or slow
to return.”
24
Fault Tree For Toyota Sudden Acceleration - 2010
– Pedal jammed under carpet
– Mechanical failure in pedal
– Operator Error
– Throttle Control System Failure
  – Sensor Failure
  – Actuator Failure
  – Control Logic Failure
    – Logic Circuit Failure
    – Software error

25
Fault tree from the Columbia accident – 13 Feb 2003

LOCV DURING ENTRY DUE TO AERODYNAMIC BREAKUP (LOCVAEROBKUP)
– AERODYNAMIC BREAKUP DUE TO STRUCTURAL FAILURE OF THE ORBITER (GATE-2)
  – LOSS OF AERODYNAMIC CHARACTERISTICS DUE TO LOSS OF OML (transfer 1, A)
  – STRUCTURAL FAILURE OF ORBITER DUE TO LOSS OF STRUCTURAL MEMBERS (transfer 4, C)
– AERODYNAMIC BREAKUP DUE TO IMPROPER ATTITUDE / TRAJECTORY CONTROL (GATE-3)
  – AERODYNAMIC BREAKUP DUE TO ATTITUDE / TRAJECTORY CONTROL COMMAND FAILURE (transfer 5, D)
  – AERODYNAMIC BREAKUP DUE TO ATTITUDE / TRAJECTORY CONTROL EQUIPMENT FAILURE (transfer 6, E)

26
Failure Modes, Effects, and Criticality Analysis (FMECA)
• Hazard Analysis is a top-down process
– Start with a top level event and try to determine the possible causes
• FMECA is a bottom-up process
– Start with every component and analyze how it can fail and the potential effects
of that failure on the system
• Hazard Analysis can be performed at the start of system design and updated as the
design progresses
• FMECA requires the design to be in sufficient detail to allow identification of credible
failure modes
• Each failure mode is identified, and then the criticality of that failure to total system operation is
identified
– Criticality Level: Effect of Failure
• Crit 1: Loss of life or vehicle
• Crit 1R: Failure of all redundant hardware items could cause loss of life or
vehicle
• Crit 2: Loss of mission
• Crit 2R: Failure of all redundant hardware items could cause loss of mission
• Crit 3: All others
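The criticality levels can be read as a simple decision rule. The sketch below is one possible reading of the list, not an official classification procedure: the function name `criticality`, the effect labels, and the `redundant` flag (meaning the worst effect only occurs if all redundant items fail) are assumptions for this example.

```python
def criticality(worst_effect: str, redundant: bool = False) -> str:
    """Map a failure's worst-case effect to a criticality level.

    worst_effect: "life_or_vehicle", "mission", or "other" (hypothetical labels).
    redundant: True if the effect requires failure of all redundant items.
    """
    levels = {"life_or_vehicle": "1", "mission": "2"}
    if worst_effect == "other":
        return "Crit 3"  # all other failures, with or without redundancy
    if worst_effect not in levels:
        raise ValueError(f"unknown effect: {worst_effect!r}")
    suffix = "R" if redundant else ""
    return f"Crit {levels[worst_effect]}{suffix}"


if __name__ == "__main__":
    # Example: a valve whose failure loses the vehicle only if its backup also fails.
    print(criticality("life_or_vehicle", redundant=True))
```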

27
Typical FMECA failure modes
• Valves
– Fails open
– Fails closed
– Leaks
• Switches
– Fails open
– Fails closed
• Electronics Boxes
– Fails short
– Fails on
– Fails off
– Partial loss of function
– Full loss of function
• Filters
– Fails clogged
– Fails to filter
28
Probabilistic Risk Analysis (PRA)

• PRA is a quantitative method for estimating risk of catastrophe for a
complex system by taking into account the system configuration and the
likelihood of failure of each component
• Essentially a giant reliability block diagram except
– That it includes distributions for failures and the distributions may
change with time or operating mode
• For example the likelihood of a blown tire on an airplane is
different during takeoff (medium), in cruise (low) and at landing
(highest)
– That it includes sequences of events (failure a precedes failure b)
• For example - tire failure leads to dragging gear which leads to
runway departure or gear failure
• Probability of runway departure is a combination of probabilities
including probability for blown tire as well as probability for other
events that can cause departure such as nosewheel steering
failure or brake failure or bad runway conditions

29
Runway Departure Example
• Potential causes
– Blown tire – 1/1000 flights
– Locked brake – 1 /1000 flights
– Failed nosewheel steering – 1/10,000 flights
– Combined probability (assuming independent events)
would be approximately 0.0021 per flight
• Locked brakes can also lead to blown tire
• Probability combinations can get quite complex
as you account for the interaction of the failure
modes
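The combined figure can be checked with a few lines of arithmetic. This sketch uses the three per-flight probabilities from the slide and the independence assumption stated there; the names `CAUSES` and `union_independent` are made up for the example. For independent events, the chance that at least one occurs is one minus the product of the chances that each does not occur, and for small probabilities this is very close to the plain sum.

```python
from math import prod

# Per-flight probabilities of each potential cause, from the slide.
CAUSES = {
    "blown tire": 1 / 1000,
    "locked brake": 1 / 1000,
    "failed nosewheel steering": 1 / 10_000,
}


def union_independent(probabilities) -> float:
    """P(at least one event occurs), assuming the events are independent."""
    return 1.0 - prod(1.0 - p for p in probabilities)


EXACT = union_independent(CAUSES.values())   # about 0.0020988
APPROX = sum(CAUSES.values())                # rare-event approximation, 0.0021

if __name__ == "__main__":
    print(f"independent union: {EXACT:.7f}")
    print(f"simple sum:        {APPROX:.4f}")
```

The two values agree to four decimal places, which is why the slide can quote the sum. The calculation stops being valid once the causes interact, for example when a locked brake itself causes a blown tire.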

30
Problems with PRA
• Capturing complex system configurations
– Can be hard to do but is generally achievable with
cost
• Generating good failure probabilities
– Usually very hard to get with programs with small
amounts of hardware and limited operating
experience
– Possible to get a garbage-in, garbage-out situation
• Anticipating failure modes that haven’t been
experienced yet
– Only as good as the imagination and experience of
the people building the models
31
Risk Perspective

• Commercial Airliner Risk – approx 1 in a million flights
(FAA)
– According to www.planecrashinfo.com
– 25 best airlines – 1 in 6.3 million
– 25 worst airlines – 1 in 543,000
• Military Aircraft Risk – approx 1 in 100,000 flights (DOD
1999 statistics)
• Military Aircraft in Combat – approx 1 in 5,000 to 1 in
10,000 flights (USAF Manned Aircraft Combat Losses
1990-2002, Dr. Daniel L. Haulman, Air Force Historical
Research Agency, 9 December 2002)
• Space Shuttle – between 1 in 50 flights (demonstrated) and 1
in 250 flights (calculated – JSC Safety and Mission
Assurance (S&MA) Probabilistic Risk Assessment)
• Expendable Launch Vehicle – best average record is 1 in
25 flights (JSC S&MA calculation), although Atlas (flying
since the 1960s) was able to record 80 flights without a failure
32
Ejection statistics
• Since the introduction of the ejection seat
in US Air Force aircraft in 1949 there have
been 5446 ejection attempts (through 30
Sep. 02). The survival rate for this period
was 82.3 percent
• Improvements in ejection seat technology
have improved the survival rate to over 91
percent and reduced injuries during the
ejection sequence.
• Source: AF Safety Center
33
Any Questions ?

34
