
Process safety analysis using operational data and Bayesian network - 2023

This article presents a methodology for process safety analysis using operational data and Bayesian networks to assess process system failure probabilities. The approach integrates fault detection and diagnosis (FDD) methods with failure probability analysis, utilizing principal component analysis (PCA) and Bayesian networks to enhance process safety management. The methodology is validated through case studies, including a level-controlled tank system and the Tesoro heat exchanger explosion, demonstrating its effectiveness in predicting failure probabilities based on operational conditions.


Received: 11 November 2022 Revised: 6 January 2023 Accepted: 8 January 2023

DOI: 10.1002/prs.12441

ORIGINAL ARTICLE

Process safety analysis using operational data and Bayesian network

James Daley | Faisal Khan | Md. Tanjin Amin

Mary Kay O'Connor Process Safety Center, Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, Texas, USA

Correspondence
Faisal Khan, Mary Kay O'Connor Process Safety Center, Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, TX 77843-3122, USA.
Email: [email protected]

Funding information
Mary Kay O'Connor Process Safety Center (MKOPSC), Texas A&M University

Abstract
Fault detection and diagnosis (FDD) methods have recently experienced significant advances. These methods provide valuable information from an abnormal situation management perspective. However, traditional FDD methods do not consider system failure analysis, which is required from the process safety perspective. This work seeks to overcome this barrier and presents a methodology to assess process system failure probability based on process operational data and system knowledge. The methodology is built using the principal component analysis (PCA) and a Bayesian network (BN). The PCA is used for FDD, while the BN determines the probability of system failure once a fault is detected. This portion of the network is based on the logical relationship of the data, operational thresholds, and system failure conditions. The proposed methodology is tested and verified on a level-controlled tank system and the real-life failure scenario of the 2010 Tesoro heat exchanger explosion. The results suggest that the proposed methodology helps to assess process system failure probability based on process operating conditions. The current work is expected to give a stimulus on digital process safety education.

KEYWORDS
Bayesian network, failure probability, fault detection and diagnosis, principal component
analysis, process safety education

Abbreviations: AIC, Akaike information criterion; ANN, artificial neural networks; BIC, Bayesian information criterion; BN, Bayesian network; CPTs, conditional probability tables; ESD, emergency shutdown system; ETA, event tree analysis; FDD, fault detection and diagnosis; FP, failure prognosis; GMM, Gaussian mixture models; HAZOP, hazard and operability study; HE, heat exchanger; HTHA, high temperature hydrogen attack; KICA, kernel independent component analysis; PCs, principal components; PCA, principal component analysis; PSM, process safety management; RA, risk assessment; SPE, square prediction error; T2, Hotelling's T-squared distribution.

1 | INTRODUCTION

A fault is defined as the deviation of the process variable(s) from an acceptable operating range, which can arise throughout any process operation and lead to a catastrophic accident.1 Therefore, one of the most important aspects of safety engineering is the early detection and diagnosis of these faults.2–5 This is where fault detection and diagnosis modules are used. The fault detection and diagnosis (FDD) systems detect abnormal operations from process operational data and predict the root cause of these abnormal operations.1,6 Without proper fault detection and diagnosis systems in place, a facility is far more likely to suffer an environmental event, economic loss, and catastrophic accidents. Therefore, a key aspect of process safety management (PSM) in production facilities is having strong FDD systems that aid in abnormal event management and return systems to normal operating conditions when faults occur.

FDD approaches can be divided up into three main categories, which are data-driven, model-based, and knowledge-based models.7 Data-driven models are suitable for large-scale digitalized process operations and find the hidden features in highly correlated process variables to develop a monitoring model that is used for FDD. Some widely used data-driven techniques currently being explored are principal component analysis (PCA), artificial neural networks, and

Process Saf Prog. 2023;42:269–280. wileyonlinelibrary.com/journal/prs © 2023 American Institute of Chemical Engineers. 269
15475913, 2023, 2, Downloaded from https://aiche.onlinelibrary.wiley.com/doi/10.1002/prs.12441 by Danish Technical Knowledge, Wiley Online Library on [12/06/2023]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License
270 DALEY ET AL.

Gaussian mixture models.8,9 Compared to the other models, PCA is the most widely used because it needs less data to build the monitoring model.10,11 It is a dimensionality reduction technique used to represent the variance of an entire dataset using only a few variables to reduce computational requirements and demonstrate the correlation among different individual variables.12

The progress of research into FDD has led to significant improvements in process safety; the reduced number of accidents in the past decade is a sign that these methods have become pivotal and are working well. However, despite the major developments in FDD, there is far less understanding of the development of faults into failures. Many of the proposed techniques for FDD observe a fault and determine its most likely cause but do not calculate the likelihood that this fault will lead to a process failure.13 For example, PCA cannot assess the likelihood of a failure due to a fault.

Logical failure prediction models (e.g., event tree analysis [ETA] and Bayesian network [BN]) are typically used in this context.14–16 BN has advantages over ETA in terms of uncertainty handling and non-sequential failure analysis.17,18 It is worth mentioning that BN's application is not limited to failure analysis; it has versatile applications (e.g., root cause diagnosis, dependability modeling, and availability analysis, just to name a few). Therefore, it has become the most popular tool in the process safety domain.19 However, these models, while working alone, find failure probability after the occurrence of an initiating event rather than helping prevent this event from happening. Generally speaking, these cannot proactively predict failure unless the initiating event happens. Efforts are ongoing on how to integrate data-driven FDD models with these logical failure prediction models for dynamic risk monitoring and early failure prognosis (FP).20,21

Although data-driven methods are widely used to describe and validate anecdotes of accident prevention, there is a lack of understanding among early Chemical Engineering students (many of whom will be future process safety torchbearers) about how these methods can be integrated with the commonly used software (e.g., Aspen HYSYS). Dynamic process simulation can improve safety and reliability.22 HYSYS is one of the most popular process design tools and is integrated with most Chemical Engineering undergraduate curricula. It can also be used to generate different dynamic process operational scenarios with realistic data.

HYSYS has been used by many scholars in the process safety domain. For instance, a group of researchers created a risk assessment tool integrated with HYSYS.23 It gave the team the ability to collect process data easily and evaluate the risks of a process at all stages of development, allowing for the implementation of safety improvements prior to the finalization of design parameters. Similarly, a team from Italy has developed a hazard and operability study tool which monitors variables from a HYSYS simulation to analyze changes to process parameters and the effects of these changes.24 An integrated risk assessment (RA) tool to predict the explosion risk in processes under development was proposed using HYSYS.25 The use of HYSYS to design various process safety elements also exists in the current literature. Yandrapu et al.26 used HYSYS to improve the safety valve design of a toxic methyl chloride process. HYSYS has also been used to design pressure relief systems and calculate the emergency venting requirements of hydrocarbon storage tanks.27

Despite several notable applications of HYSYS in the process safety domain, there is a lack of understanding of how it can be utilized to teach FDD and FP using operational data. To eliminate this gap and create a better understanding of data-driven process safety, this work aims to develop a systematic methodology for FDD-FP using two widely used models: PCA and BN. PCA uses HYSYS-generated data for FDD, while the BN is updated with PCA's diagnostic report to predict the likelihood of a failure due to a fault. Thus, the integrated framework overcomes the limitations of the individual methods and provides a robust mechanism for managing process safety. The methodology has been demonstrated and validated using two case studies; the results suggest its efficacy in early failure forecasting. This work contributes to the field by developing a systematic methodology for FDD-FP and demonstrating how process operational information gained through simulation can be used to teach process safety to Chemical Engineering students.

The remainder of this paper will be presented as follows: Section 2 will describe the methodology used for creating the process operational data-based failure probability calculator. Section 3 will demonstrate the applications of the proposed methodology. Section 4 will describe the significance of this research. Finally, Section 5 will present the key findings from this study, additional applications of this fault detection technique, and future research that can be done on this topic.

2 | THE PROPOSED METHODOLOGY

The proposed methodology for process operational data-based failure prediction model development is presented in Figure 1. The following section will describe the steps necessary to create the model.

FIGURE 1 The proposed methodology for fault detection and diagnosis-failure prognosis (FDD-FP)

Step 1: The process model is developed on the Aspen HYSYS platform to generate operational data. This is done using the dynamic mode of HYSYS, which allows the generation of time-discretized operating data based on specified parameters such as inlet stream pressures and valve operating modes. This can also be used to generate faulty operating data by entering an unexpected value into one of the specified parameters.

Step 2: The possible consequences relevant to the processing system are identified. A scenario-based analysis approach is adopted in this context.28 These techniques determine what could go wrong in a process and identify the possible consequences of these failures.

Step 3: The next step is to transform the scenario(s) into a causal model to capture the relationships among the monitored variables, safety measures, and possible consequences. Process knowledge and the HYSYS model are utilized to build the causal model. It should be noted that the BN has been used in this work as the causal model since it can handle multivariate and multistate dependencies and uncertainty.29 Process data is discretized around one standard deviation, and prior and conditional probabilities are estimated using frequency counting. Interested readers are referred to the original work30 for a numerical example.
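The discretization and frequency counting described in Step 3 can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes that "discretized around one standard deviation" means a three-state split (low, normal, high) at the mean plus or minus one standard deviation, and the function names (`discretize`, `prior`, `conditional_table`), the state labels, and the toy readings are all hypothetical.

```python
import numpy as np

STATES = ("low", "normal", "high")  # assumed three-state discretization

def discretize(x, mean, std):
    """Label each reading relative to a one-standard-deviation band around the mean."""
    x = np.asarray(x, dtype=float)
    return np.where(x < mean - std, "low",
                    np.where(x > mean + std, "high", "normal"))

def prior(states):
    """Prior probability of each state, estimated by frequency counting."""
    states = np.asarray(states)
    return {s: float(np.mean(states == s)) for s in STATES}

def conditional_table(parent, child):
    """P(child | parent) for a single parent node, estimated by frequency counting."""
    parent, child = np.asarray(parent), np.asarray(child)
    table = {}
    for p in STATES:
        rows = child[parent == p]
        if rows.size == 0:
            # Assumption: fall back to a uniform row when a parent state never occurs.
            table[p] = {c: 1.0 / len(STATES) for c in STATES}
        else:
            table[p] = {c: float(np.mean(rows == c)) for c in STATES}
    return table

# Toy data (hypothetical): valve opening in % and the resulting flow in kg/h.
valve = discretize([50, 50, 50, 50, 60, 40], mean=50.0, std=2.0)
flow = discretize([1000, 1000, 1000, 1000, 1100, 900], mean=1000.0, std=10.0)

valve_prior = prior(valve)                # prior of a root node
flow_cpt = conditional_table(valve, flow)  # CPT of its child node
print(valve_prior)
print(flow_cpt["high"])
```

A node with several parents would need one table row per combination of parent states; the single-parent case is shown only to keep the counting logic visible.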
Step 4: Operational data is generated, which contains both normal and faulty data. Normal data is used to train the PCA monitoring model. PCA is the most widely used process FDD tool; the details of the PCA algorithm can be found in the references.31–33 The aim of training the PCA model is to estimate the thresholds of PCA-T2 (Hotelling's T-squared distribution) and PCA-SPE (square prediction error) control charts. Obtaining an estimate of the PCA-T2 and PCA-SPE control chart thresholds allows for the accurate detection of faults, as any value of the T2 or SPE statistics beyond these limits indicates that faulty operation is occurring at that time. These four steps complete the HYSYS-aided accident prediction model development.

Step 5: The remaining datasets are used to see how the model performs. For each observation, the T2 and SPE values are computed and compared with the thresholds. If any T2 or SPE value exceeds the previously determined limits, an alarm is generated. Contribution plots are then generated to show how much each of the measured process variables contributes to the fault. The variable with the highest contribution is used to update the BN with hard evidence.

Step 6: The updated BN is used to predict the probability of each consequence. If the failure probability is higher than the acceptable limit, maintenance is performed to manage risk and reduce the failure probability back to acceptable levels. Otherwise, the operation is continued.

3 | APPLICATIONS OF THE PROPOSED METHODOLOGY

3.1 | Tank level control system

The level control in a tank is a classical problem that has been widely used to explain many control and safety-related applications.34 In the current work, the water level inside the tank is maintained using a control loop. Water flow rate to the tank is altered using inlet valve V-101, while the flow from the tank (outlet flow rate) is modified using outlet valve V-102. A total of five variables are monitored: inlet flow (V1), outlet flow (V2), inlet valve opening (V3), outlet valve opening (V4), and water level inside the tank (V5). Figure 2 shows the PFD of the described system.

FIGURE 2 PFD of the level control example

FIGURE 3 Bayesian network (BN) for the level control example

FIGURE 4 HYSYS model of level control example
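For a five-variable system like this one, the training, alarm, and contribution steps of Section 2 (Steps 4 and 5) can be sketched as below. This is an illustrative sketch under stated assumptions, not the authors' code: the data are synthetic rather than HYSYS-generated, the control limits are taken as empirical 99% quantiles of the training statistics (a simple stand-in for the analytical T2 and SPE limits, which the paper does not spell out), and the names `fit_pca_monitor`, `score`, and `simulate` are hypothetical.

```python
import numpy as np

def score(model, X):
    """Return T2, SPE and per-variable SPE contributions for each row of X."""
    Z = (X - model["mu"]) / model["sigma"]       # autoscale with training statistics
    T = Z @ model["P"]                           # scores on the retained PCs
    t2 = np.sum(T**2 / model["lam"], axis=1)     # Hotelling's T2
    residual = Z - T @ model["P"].T              # part not explained by the PCs
    spe_contrib = residual**2                    # SPE contribution of each variable
    return t2, spe_contrib.sum(axis=1), spe_contrib

def fit_pca_monitor(X_train, var_to_keep=0.90, quantile=0.99):
    """Train a PCA monitoring model on normal-operation data."""
    mu = X_train.mean(axis=0)
    sigma = X_train.std(axis=0, ddof=1)
    Z = (X_train - mu) / sigma
    eigvals, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))
    order = np.argsort(eigvals)[::-1]            # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = np.cumsum(eigvals) / eigvals.sum()
    k = min(int(np.searchsorted(explained, var_to_keep)) + 1, len(eigvals))
    model = {"mu": mu, "sigma": sigma, "P": eigvecs[:, :k], "lam": eigvals[:k]}
    t2, spe, _ = score(model, X_train)
    # Assumption: empirical quantiles replace the analytical control limits.
    model["t2_limit"] = float(np.quantile(t2, quantile))
    model["spe_limit"] = float(np.quantile(spe, quantile))
    return model

# Synthetic stand-in for simulator data: five correlated variables driven by two factors.
rng = np.random.default_rng(0)
loadings = np.array([[1.0, 0.8, 0.5, -0.6, 0.3],
                     [0.2, -0.5, 0.9, 0.4, 0.7]])

def simulate(n):
    return rng.normal(size=(n, 2)) @ loadings + 0.05 * rng.normal(size=(n, 5))

X_train = simulate(1000)          # normal operation, used for training
X_test = simulate(500)
X_test[200:, 1] += 1.0            # hypothetical fault: bias on variable index 1

model = fit_pca_monitor(X_train)
t2, spe, contrib = score(model, X_test)
alarm = (t2 > model["t2_limit"]) | (spe > model["spe_limit"])
suspect = int(np.argmax(contrib[200:].mean(axis=0)))  # largest mean SPE contribution
print("retained PCs:", model["P"].shape[1])
print("alarm rate before / after fault:", alarm[:200].mean(), alarm[200:].mean())
print("variable with the highest SPE contribution:", suspect)
```

In the methodology, the index stored in `suspect` is what would be entered into the BN as hard evidence. As the case studies show, the highest contribution does not always point at the true root cause, so it is better read as a diagnostic hint than as a verdict.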
Under normal operating conditions, the amount of water entering and
leaving the tank is almost equal to maintain the tank level. However,
an imbalance in these two flow rates can result in an underflow or an
overflow of liquid. Underflow can result in unsatisfactory operational
performance, while the latter can cause some serious safety concerns
since overflow from tanks is one of the major causes of many acci-
dents like the Buncefield Fire in the UK (2005) and the Caribbean
Petroleum Refining Tank Explosion and Fire in Puerto Rico (2009), just
to name a few.
A total of three possible consequences can be considered: normal, underflow, and overflow. Since the inlet and outlet valves control the inlet and outlet flow rates, the valves are set as the root nodes. The tank level is set as a child node of the two flow rates, which are the child nodes of the valves. Therefore, these two flow rate nodes can be considered intermediate nodes in the BN (shown in Figure 3). To prevent unwanted consequences, two safety barriers are considered: an alarm system and an emergency shutdown (ESD) system. It is worth noting that a process can have many safety layers. However, this work only considers these two barriers since the aim of this work is to show how safety analysis can be demonstrated using Aspen HYSYS. If both these safety barriers work perfectly, a potential overflow can be prevented irrespective of the higher water level in the tank. The consequence node is set as the child node of the tank level, ESD, and alarm system.

FIGURE 5 Tank level with time in the considered fault scenario

A HYSYS model for the simple tank system was then developed to collect data for normal and faulty operations (Figure 4). The first step in its development was creating a 2.5 m³ tank with a liquid inlet
stream and a liquid outlet stream. In the next development step, valve VLV-101 was placed on the inlet stream, and valve VLV-102 was placed on the liquid outlet stream to control the flow rates in these streams. Then, a flow rate controller was applied to the VLV-101 with a typical setpoint of 1000 kg/h, a liquid level controller was applied to VLV-102 with a setpoint of 50% of tank height, and VLV-101 was set to the closed position. Finally, pump P-101 was placed between VLV-101 and the tank to ensure that liquid flow into the tank would continue even if tank pressure increased. Once this model was completed, a dynamic operating sequence was created and applied to simulate normal and faulty operating conditions.

A total of 1500 samples have been generated in HYSYS with a sampling time of 20 s. A fault is introduced at the 1201st sample; it increases inlet valve opening and reduces outlet valve opening. As a result, a significant imbalance in inlet and outlet mass flow occurs, and the tank level gradually rises, leading to a potential overflow (Figure 5). The first 1000 samples are used to develop the PCA model and quantify the BN. The nominal values of V1, V2, V3, V4, and V5 are 1000, 999.73, 45.71%, 54.22%, and 50.01%, respectively. The prior and conditional probabilities are estimated from these samples. However, the priors and conditional probability tables related to safety barriers are collected from literature13 since these barriers are not continuously monitored variables.

PCA is applied to these 1000 samples. It is found that two principal components (PCs) can explain more than 90% of variations. Therefore, the first 2 PCs are selected to develop the PCA monitoring model. The thresholds of PCA-SPE and PCA-T2 control charts are computed as 0.0063 and 9.2715, respectively, at a 99% confidence level. The remaining 500 samples are used to see how the developed framework gives us important safety information. Both the PCA-SPE and PCA-T2 control charts can report a fault at the 1201st sample. The SPE and T2 values keep rising since the fault went uncorrected (Figure 6).

FIGURE 6 (A) Principal component analysis-square prediction error (PCA-SPE) and (B) principal component analysis-Hotelling's T-squared distribution (PCA-T2) control charts in the level control example

The contribution plots are generated (Figure 7). PCA-SPE suggests outlet water flow rate is the root cause, while PCA-T2 suggests inlet valve opening as the probable reason. From a diagnostic perspective, PCA-SPE's performance is poor here. However, accurate root cause diagnosis is beyond the scope of the current work.

The maximum and minimum outlet flow rates in normal operating conditions were 1000.06 and 998.47, respectively, whereas the value is found to be 520.39 at the 1201st sample; this suggests a lower outlet flow rate. The BN has been updated accordingly (Figure 8A); it suggests that there is an 88.75% chance that the tank level will be high. Nevertheless, the consequence node suggests no serious consequence, although a slight rise in the overflow probability is noticed. As the safety barriers have not failed, severe consequences can be avoided. The effect of safety barrier failure is shown in Figure 8B. It is assumed that ESD is in a failed state when the fault is detected. In this scenario, overflow becomes the most probable outcome with a 59.29% probability. Using a similar technique, an overflow is expected as the possible consequence of the PCA-T2's diagnostic information (Figure 8C). In both cases, the developed model can predict failure earlier. For instance, the tank level is 50.11% at the 1201st sample, and it exceeds 80% at the 1448th sample. However, this phenomenon can be predicted earlier using process dynamics-aided operational data.

3.2 | Testing of the model: heat exchanger

Failure probability calculations as an extension of FDD were then performed on a real-world failure event. The event in question is a heat-exchanger explosion at the Tesoro refinery in Anacortes, Washington, that fatally injured seven workers. The explosion occurred primarily because of high temperature hydrogen attack (HTHA) that induced corrosion.35 This corrosion weakened the carbon steel tubing of the
heat exchanger, eventually leading to a catastrophic rupture during a start-up procedure. The rupture released a highly flammable naphtha and hydrogen mixture that ignited in the atmosphere. Based on the knowledge of the actual cause of the incident, the developed failure model will need to determine whether HTHA has occurred in a monitored heat exchanger system using process operational data.

FIGURE 7 (A) Principal component analysis-square prediction error (PCA-SPE) and (B) principal component analysis-Hotelling's T-squared distribution (PCA-T2) contribution plots in the level control example

Figure 9 displays the modeled heat exchanger and control loops used to emulate the Tesoro system. The system uses hot steam as the hot side stream of the heat exchanger. In the model system, the flow rate of hot steam entering the exchanger is modified using valve V-100, a steam leak can be induced by opening valve V-101, and the cooling water flow through the exchanger is modified using V-102. The opening of valves V-100, V-101, and V-102 is managed by flow controllers. In the current work, a total of six monitored variables have been considered (Figure 9): steam outlet mass flow (V1), cooling water mass flow (V2), steam outlet temperature (V3), cooling water outlet temperature (V4), VLV-100 opening (V5), and VLV-102 opening (V6).

Under normal operating conditions, the cooling water reduced the steam temperature to roughly 203.2 °C. The flow rate of the steam stream should also correspond to the VLV-100 position. However, there is an event that can disrupt the relationship between valve position and steam mass flow rate and lead to dangerous operating conditions. This condition is a leak in the steam stream, which can cause vapors to enter the atmosphere. These vapors can ignite and cause an explosion if they are flammable, which occurred during the Tesoro incident.

The BN developed (Figure 10) for this heat exchanger system has the valve position of the steam and cooling water streams as the root nodes. Each of these root nodes has two child nodes for the mass flow and outlet temperature of their respective streams. Finally, the four child nodes serve as parents for the improper operating condition node. The presence of an improper operating condition can be detected if the mass flow rates and outlet temperatures of the two streams deviate from their expected values.

The failure probability is influenced by the presence of safety barriers, such as the heat treatment of welds within the heat exchanger, the performance of HTHA inspections, and the use of resilient materials. The failure probability is also influenced by the reactive measures taken in response to the leak, as failure probability is reduced if an appropriate response to mitigate the leak is taken. Such causal relationships are considered within the BN. Finally, a node representing the presence of an ignition source is added. This node must be true for there to be any possibility of a catastrophic failure event because, without an ignition source, the vapor released by a heat exchanger leak would not explode.

A HYSYS model of the exchanger was then developed. The dynamic mode of HYSYS was used to simulate normal and faulty operating conditions. A total of 1500 samples were generated with a sampling time of 30 s. A fault was induced at the 1401st sample by opening valve VLV-101 to simulate the occurrence of a leak within the system. This is the scenario that caused the accident at the Tesoro facility, so being able to quickly detect it and evaluate the likelihood it leads to catastrophic failure is critical for ensuring safe operation. The first 1000 samples are used to develop the PCA model and quantify the Bayesian network. The nominal values of V1 through V6 are 191.94, 132.54, 203.20, 68.81, 50.002, and 49.995, respectively. The BN is quantified using the technique mentioned in Section 3.1.

PCA is applied to the first 1000 samples. The first five PCs were found to explain over 80% of the variation within the system. Therefore, these five PCs were used to develop the PCA monitoring model. The thresholds of the PCA-SPE and PCA-T2 charts were calculated to be 6.5094 and 15.2545, respectively, at a 99% confidence level. The control charts are shown in Figure 11. It can be seen that PCA-T2 does not show any deviation between samples 1001 and 1400, while PCA-SPE shows a few deviations due to noise. This shows PCA-T2 was better in terms of fewer false alarms. However, both control charts can detect the fault in the 1401st sample.

The contribution plots are generated (Figure 12). PCA-SPE suggests that the root cause is steam outlet temperature, while PCA-T2
suggests that the steam mass flow rate is the root cause of the fault. From a diagnostic perspective, PCA-T2's performance is better here. This is because the stream outlet temperatures are the effects of the induced fault, not the cause. The maximum and minimum values of the steam mass flow rates under normal operating conditions were 192.949 and 190.949 kg/h, respectively. However, when the fault was induced at the 1401st sample, the steam mass flow rate decreased to 171.762 kg/h. This demonstrates that the steam mass flow rate was reduced as intended after introducing a leak to the system.

FIGURE 8 Updated Bayesian network (BN) using (A) the evidence from the square prediction error (SPE) contribution plot, (B) the evidence from the SPE contribution plot and emergency shutdown system (ESD) failure, and (C) the evidence from the Hotelling's T-squared distribution (T2) contribution plot and ESD failure

FIGURE 9 Heat exchanger and control loop diagram

FIGURE 10 Bayesian network (BN) of the heat exchanger

The BN is then updated (Figure 13A) to reflect a fault in the measured steam mass flow rate, and the likelihood of an improper operating condition increases to 67%. The effects of this increase propagate down the network, and eventually, the probability of catastrophic failure increases to 4% from its previous value of 1%. The effect of safety barrier failure is then shown by setting the node for heat treatment of HE welds to not performed, the node for resilient materials used to no, and the operation below the CS Nelson curve to false. Finally, the increase in failure probability due to mitigating factor failure is evaluated. Suppose the appropriate leak response node is set to not taken, and an ignition source is present. In that case, the probability of catastrophic failure increases to 65%, an almost 1600% rise from the earlier value (Figure 13B). This demonstrates the ability of the network to predict failure probability based on process operation data.
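The way hard evidence on barrier, response, and ignition nodes changes the predicted failure probability can be illustrated with a much smaller network than the one in Figure 10. The sketch below is only a toy example: the node set is reduced to four root nodes, a leak node, and a catastrophe node, every probability is a made-up placeholder rather than a value from the paper, and the function names are hypothetical. Inference is done by brute-force enumeration, with evidence restricted to root nodes so that conditioning reduces to fixing their states.

```python
from itertools import product

# Hypothetical priors P(node = True) for the root nodes of a toy network.
PRIORS = {
    "improper_op": 0.05,     # improper operating condition flagged by monitoring
    "barriers_ok": 0.95,     # preventive safety barriers in place
    "response_taken": 0.90,  # appropriate leak response
    "ignition": 0.30,        # ignition source present
}

# Hypothetical CPT: P(leak = True | improper_op, barriers_ok).
P_LEAK = {
    (True, True): 0.10, (True, False): 0.80,
    (False, True): 0.001, (False, False): 0.05,
}

def p_catastrophe(leak, response_taken, ignition):
    """Hypothetical CPT: no explosion without both a leak and an ignition source."""
    if not (leak and ignition):
        return 0.0
    return 0.20 if response_taken else 0.90

def failure_probability(evidence=None):
    """P(catastrophe | evidence on root nodes), computed by enumeration."""
    evidence = evidence or {}
    roots = list(PRIORS)
    numerator = 0.0    # accumulates P(catastrophe, evidence)
    denominator = 0.0  # accumulates P(evidence)
    for values in product([True, False], repeat=len(roots)):
        state = dict(zip(roots, values))
        if any(state[name] != value for name, value in evidence.items()):
            continue  # skip states that contradict the hard evidence
        weight = 1.0
        for name, value in state.items():
            weight *= PRIORS[name] if value else 1.0 - PRIORS[name]
        p_leak = P_LEAK[(state["improper_op"], state["barriers_ok"])]
        p_cat = p_leak * p_catastrophe(True, state["response_taken"], state["ignition"])
        numerator += weight * p_cat
        denominator += weight
    return numerator / denominator

baseline = failure_probability()
fault_detected = failure_probability({"improper_op": True})
worst_case = failure_probability({
    "improper_op": True, "barriers_ok": False,
    "response_taken": False, "ignition": True,
})
print(baseline, fault_detected, worst_case)
```

The ordering baseline < fault detected < worst case mirrors the qualitative pattern reported above; the absolute numbers carry no meaning because the CPTs are invented. A full implementation would also need evidence on non-root nodes, which plain enumeration over all nodes or a dedicated BN library can handle.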
15475913, 2023, 2, Downloaded from https://aiche.onlinelibrary.wiley.com/doi/10.1002/prs.12441 by Danish Technical Knowledge, Wiley Online Library on [12/06/2023]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License
DALEY ET AL. 277

F I G U R E 1 1 (A) Principal component analysis-square prediction error (PCA-SPE) and (B) principal component analysis-Hotelling's T-squared
distribution (PCA-T2) control charts of heat exchanger

FIGURE 12 (A) Principal component analysis-square prediction error (PCA-SPE) and (B) principal component analysis-Hotelling's T-squared distribution (PCA-T2) contribution plots for the heat exchanger model
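An SPE contribution plot attributes the residual of a sample to individual variables. The sketch below shows one common way to compute it, assuming a single retained component with a hypothetical loading vector and a hypothetical scaled sample; the variable labels are placeholders, not the heat exchanger measurements.

```python
import numpy as np

# HYPOTHETICAL unit-length loading vector for one retained component.
P = np.array([[1.0], [1.0], [-1.0]]) / np.sqrt(3.0)


def spe_contributions(z, P):
    """Per-variable contribution to SPE for one autoscaled sample z."""
    z = np.asarray(z, dtype=float)
    residual = z - P @ (P.T @ z)   # part of z not explained by the PCA model
    return residual ** 2


labels = ["var_0", "var_1", "var_2"]   # placeholder variable names
z_fault = np.array([3.0, 0.0, 0.0])    # deviation on var_0 only
contrib = spe_contributions(z_fault, P)
suspect = labels[int(np.argmax(contrib))]

if __name__ == "__main__":
    for name, c in zip(labels, contrib):
        print(f"{name}: {c:.2f}")      # contributions are 4, 1, 1 here
    print("largest contribution:", suspect)
```

The variable with the largest contribution is the natural candidate for the evidence entered into the BN, although part of the residual is smeared onto the other variables, which is why contribution plots are usually read together with process knowledge.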

4 | DISCUSSION

In this study, a failure prediction model influenced by process operational data was created for two different process systems. The first failure prediction model was created for a simple tank example to demonstrate the methodology. This model could detect when the tank's liquid volume was increasing and estimate the probability of tank overflow based on the flow rate data, tank liquid level, and the presence of safety barriers within the process.

The second failure prediction model was a test case based on the 2010 Tesoro heat exchanger explosion. This model was able to use process data to detect an improper heat exchanger operation and then estimate the likelihood of a hot-side leak using the knowledge of safety barriers within the system and the application of proactive maintenance techniques within the facility. The leak probability was then used to estimate the failure probability alongside contributing variables such as facility staff responding to the leak appropriately and the presence of an ignition source near the potential leak area.

One assumption made during the data collection portion of this model's development was a constant inlet pressure on both the hot and cold streams, as this was necessary to use HYSYS's dynamic mode. In real-world scenarios, this may not be valid. However, the proposed model is expected to provide the same performance level since the change in inlet assumption will not impact the model development phase.

This failure prediction model was able to predict a 65% chance of catastrophe when an improper hot-side flow rate was detected, safety barriers failed, an appropriate response was not taken, and an ignition source was present. Since this is a greater than 50% chance of failure, it is reasonable to assume that at those specific conditions, a catastrophic failure is likely to occur. The safety barrier, leak response, and ignition node values can be easily changed to understand how these variables affect the failure probability and reflect the current system properties. For example, if the heat treatment of the heat exchanger welds node was changed from "no" to "yes," the probability of catastrophic failure would reduce to 40%. The safety barrier and response nodes can be adjusted until the failure probability is within the acceptable range.

FIGURE 13 Updated heat exchanger Bayesian network (BN) (A) using the evidence from square prediction error (SPE) contribution plots and (B) using the evidence from SPE contribution plots and the presence of ignition source with failure of few safety measures

This demonstrates the ability of the failure prediction model to quickly and effectively determine the potential of catastrophic failure through the use of PCA to detect abnormal operating conditions. Similar failure prediction models can be applied to a wide range of process systems to improve PSM and RA and allow process safety specialists to evaluate the likelihood of a catastrophic event more quickly and effectively. Additionally, this test model demonstrates that FDD-integrated failure prediction models will allow process safety practitioners to more easily determine the effect that implementing additional safety barriers or performing maintenance to bring systems back to typical operation will have on the probability of catastrophic failure. This will allow them to choose the safest options for process operations.

This work shows the benefits of using hybrid methods for process safety analysis. PCA is found suitable for early fault detection. Nonetheless, it fails to show how a failure may occur, which has great significance from a process safety perspective. Although the BN provides a plausible solution for failure prediction, it cannot detect faults early. The current work has utilized the benefits of both these tools and provided a unified tool for early fault detection and failure prediction.

5 | CONCLUSIONS

This paper presents a methodology that uses process data and system knowledge to predict the likelihood of failure within a process system due to an abnormal operation. PCA has been used for FDD. Once a fault is detected, PCA's diagnostic report and the safety barriers' operational condition are used to update the BN to assess the potential consequences of a fault. The methodology has been tested and validated in two process systems; the results suggest the efficacy of the proposed method in the context of early failure prediction.

The developed framework's importance is twofold. First, it can help industrial practitioners assess the likelihood of an accident due to a fault, which is essential for accident prevention. Second, it can help to understand how process dynamics captured through the HYSYS model and data analytics can be successfully utilized for process safety assessment. The second aspect is of particular importance to chemical engineering students, who are currently well trained in HYSYS but less familiar with its use for process safety analysis through advanced data analytics methods, which are an integral part of risk management in the industry. Hence, the current work is expected to contribute significantly toward the development of safety-aware professionals.

The proposed methodology can be applied to various process systems, from a simple tank example mentioned earlier to an entire process unit. Nevertheless, when applying this work to more complex systems, the available data points will also increase. This will increase the importance of PCA as the dataset will need to be scaled down and the relationships between the variables better understood for the FDD portion of the model to continue working effectively.

The current work has considered PCA for FDD, which is optimal for linear and Gaussian process systems. Real industrial data may contain nonlinear and non-Gaussian properties. Hence, PCA's early fault detection capacity may be degraded. Kernel independent component analysis³⁶ can be used to capture this data behavior and support early fault detection. Prior to selecting the data-driven FDD model, the Akaike information criterion³⁷ and Bayesian information criterion³⁸ can be used to find the suitable data distribution. These can improve the current work.

AUTHOR CONTRIBUTIONS
James Daley: Conceptualization (lead); data curation (lead); formal analysis (lead); methodology (lead); validation (lead); writing – original draft (lead); writing – review and editing (lead). Tanjin Md. Amin: Conceptualization (equal); data curation (supporting); formal analysis (equal); methodology (supporting); supervision (supporting); validation (equal); writing – review and editing (lead). Faisal Khan: Conceptualization (lead); data curation (equal); formal analysis (equal); investigation (equal); methodology (supporting); supervision (lead); validation (supporting); writing – review and editing (supporting).

ACKNOWLEDGMENT
The authors thankfully acknowledge the funding provided by the Mary Kay O'Connor Process Safety Center (MKOPSC), Texas A&M University, Texas, USA.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request.

ORCID
Faisal Khan https://orcid.org/0000-0002-5638-4299

REFERENCES
1. Gabbar HA, Boafo EK. FSN-based cosimulation for fault propagation analysis in nuclear power plants. Process Saf Prog. 2016;35(1):53-60. doi:10.1002/prs.11725
2. Park Y-J, Fan S-KS, Hsu C-Y. A review on fault detection and process diagnostics in industrial processes. Processes. 2020;8(9):1123. doi:10.3390/pr8091123
3. Zadakbar O, Imtiaz S, Khan F. Dynamic risk assessment and fault detection using a multivariate technique. Process Saf Prog. 2013;32(4):365-375. doi:10.1002/prs.11609
4. Bao H, Khan F, Iqbal T, Chang Y. Risk-based fault diagnosis and safety management for process systems. Process Saf Prog. 2011;30(1):6-17. doi:10.1002/prs.10421
5. Chetouani Y, Mouhab N, Cosmao J, Estel L. Dynamic model-based technique for detecting faults in a chemical reactor. Process Saf Prog. 2003;22(3):183-190. doi:10.1002/prs.680220308
6. Gentile M, Summers AE. Random, systematic, and common cause failure: how do you manage them? Process Saf Prog. 2006;25(4):331-338. doi:10.1002/prs.10145
7. Venkatasubramanian V, Rengaswamy R, Yin K, Kavuri SN. A review of process fault detection and diagnosis, part I. Quantitative model-based methods. Comput Chem Eng. 2003;27(3):293-311. doi:10.1016/S0098-1354(02)00160-6
8. Alauddin M, Arunthavanathan R, Amin MT, Khan F. Statistical approaches and artificial neural networks for process monitoring. Methods in Chemical Process Safety. Vol 6. Elsevier; 2022:1-48. doi:10.1016/bs.mcps.2022.04.003
9. Ge Z, Song Z, Gao F. Review of recent research on data-based process monitoring. Ind Eng Chem Res. 2013;52(10):3543-3562. doi:10.1021/ie302069q
10. Amin MT, Khan F, Imtiaz SA, Ahmed S. Robust process monitoring methodology for detection and diagnosis of unobservable faults. Ind Eng Chem Res. 2019;58(41):19149-19165. doi:10.1021/acs.iecr.9b03406
11. Tahoon AI, Rusli R, Khan F, Zainal AM. Logic-based probabilistic network model to detect and track faults in a process system. Process Saf Prog. 2020;39:e12110. doi:10.1002/prs.12110
12. Jolliffe IT, Cadima J. Principal component analysis: a review and recent developments. Philos Trans R Soc A Math Phys Eng Sci. 2016;374(2065):20150202. doi:10.1098/rsta.2015.0202
13. Amin MT, Khan F, Ahmed S, Imtiaz S. A novel data-driven methodology for fault detection and dynamic risk assessment. Can J Chem Eng. 2020;98:2397-2416. doi:10.1002/cjce.23760

14. Bhandari J, Arzaghi E, Abbassi R, Garaniya V, Khan F. Dynamic risk-based maintenance for offshore processing facility. Process Saf Prog. 2016;35:399-406. doi:10.1002/prs.11829
15. Zarei E, Azadeh A, Aliabadi MM, Mohammadfam I. Dynamic safety risk modeling of process systems using Bayesian network. Process Saf Prog. 2017;36(4):399-407. doi:10.1002/prs.11889
16. Fang W, Wu J, Bai Y, Zhang L, Reniers G. Quantitative risk assessment of a natural gas pipeline in an underground utility tunnel. Process Saf Prog. 2019;38(4):e12051. doi:10.1002/prs.12051
17. Bilal Z, Mohammed K, Brahim H. Bayesian network and bow tie to analyze the risk of fire and explosion of pipelines. Process Saf Prog. 2017;36(2):202-212. doi:10.1002/prs.11860
18. Baksh AA, Abbassi R, Garaniya V, Khan F. A network based approach to envisage potential accidents in offshore process facilities. Process Saf Prog. 2017;36:178-191. doi:10.1002/prs.11854
19. Amin MT, Khan F, Amyotte P. A bibliometric review of process safety and risk analysis. Process Saf Environ Prot. 2019;126:126-381. doi:10.1016/j.psep.2019.04.015
20. Amin MT, Khan F. Dynamic process safety assessment using adaptive Bayesian network with loss function. Ind Eng Chem Res. 2022;61:16799-16814. doi:10.1021/acs.iecr.2c03080
21. Amin MT, Khan F, Ahmed S, Imtiaz S. Risk-based fault detection and diagnosis for nonlinear and non-Gaussian process systems using R-vine copula. Process Saf Environ Prot. 2021;150:123-136. doi:10.1016/j.psep.2021.04.010
22. Rasel MAK, Richmond PC. Improve safety and reliability with dynamic simulation. Process Saf Prog. 2014;33(4):333-338. doi:10.1002/prs.11667
23. Shariff AM, Leong CT. Inherent risk assessment: a new concept to evaluate risk in preliminary design stage. Process Saf Environ Prot. 2009;87(6):371-376. doi:10.1016/j.psep.2009.08.004
24. Janosovsky J, Danko M, Labovsky J, Jelemensky L. Development of a software tool for hazard identification based on process simulation. Chem Eng Trans. 2019;77:349-354. doi:10.3303/CET1977059
25. Shariff AM, Rusli R, Leong CT, Radhakrishnan VR, Buang A. Inherent safety tool for explosion consequences study. J Loss Prev Process Ind. 2006;19(5):409-418. doi:10.1016/j.jlp.2005.10.008
26. Yandrapu VP, Kanidarapu NR. Energy, economic, environment assessment and process safety of methylchloride plant using Aspen HYSYS simulation model. Digit Chem Eng. 2022;3:100019. doi:10.1016/j.dche.2022.100019
27. Rao N, Haydary J. Storage tank protection using Aspen HYSYS. Pet Coal. 2017;59(4):533-542.
28. ALNabhani K, Khan F, Yang M. Scenario-based risk assessment of TENORM waste disposal options in oil and gas industry. J Loss Prev Process Ind. 2016;40:55-66. doi:10.1016/j.jlp.2015.12.003
29. Amin MT, Imtiaz S, Khan F. Dynamic availability assessment of safety critical systems using a dynamic Bayesian network. Reliab Eng Syst Saf. 2018;178:108-117. doi:10.1016/j.ress.2018.05.017
30. Gharahbagheri H, Imtiaz SA, Khan F. Root cause diagnosis of process fault using KPCA and Bayesian network. Ind Eng Chem Res. 2017;56(8):2054-2070. doi:10.1021/acs.iecr.6b01916
31. Garcia-Alvarez D, Fuente MJ, Sainz GI. Fault detection and isolation in transient states using principal component analysis. J Process Control. 2012;22(3):551-563. doi:10.1016/j.jprocont.2012.01.007
32. Kresta JV, Macgregor JF, Marlin TE. Multivariate statistical monitoring of process operating performance. Can J Chem Eng. 1991;69(1):35-47. doi:10.1258/phleb.2011.010101
33. Amin MT, Imtiaz S, Khan F. Process system fault detection and diagnosis using a hybrid technique. Chem Eng Sci. 2018;189:191-211. doi:10.1016/j.ces.2018.05.045
34. Crowl DA, Louvar JF. Chemical Process Safety: Fundamentals with Applications. Pearson Education; 2001.
35. CSB. Catastrophic Rupture of Heat Exchanger: Tesoro Anacortes Refinery. 2014. Accessed November 9, 2022. https://www.csb.gov/file.aspx?DocumentId=5851
36. Lee J-M, Qin SJ, Lee I-B. Fault detection of non-Linear processes using kernel independent component analysis. Can J Chem Eng. 2007;85(4):526-536. doi:10.1002/cjce.5450850414
37. Akaike H. A new look at the statistical model identification. IEEE Trans Automat Contr. 1974;19(6):716-723. doi:10.1109/TAC.1974.1100705
38. Schwarz G. Estimating the dimension of a model. Ann Stat. 1978;6(2):461-464. doi:10.1214/aos/1176344136

How to cite this article: Daley J, Khan F, Amin MT. Process safety analysis using operational data and Bayesian network. Process Saf Prog. 2023;42(2):269-280. doi:10.1002/prs.12441
