FMEA Matrix

This paper proposes a data-driven approach to automatically construct a component-failure mode matrix for failure mode and effects analysis (FMEA) by mining unstructured quality problem texts. The method uses text mining techniques like frequent itemset mining and semantic analysis to extract failure modes and build the component-failure mode associations from a large number of quality problem records. The results show that the proposed approach can extract more failure mode combinations and build a richer component-failure mode matrix compared to other existing methods.


Journal of Intelligent Manufacturing

https://doi.org/10.1007/s10845-019-01466-z

A data‑driven approach for constructing the component‑failure mode matrix for FMEA

Zhaoguang Xu1 · Yanzhong Dang1 · Peter Munro2 · Yuhang Wang1

Received: 24 August 2018 / Accepted: 10 February 2019


© Springer Science+Business Media, LLC, part of Springer Nature 2019

Abstract
Failure mode and effects analysis (FMEA) is a structured, systematic and proactive approach to product or system failure analysis. A critical step in FMEA is identifying potential failure modes for product sub-systems, components, and processes, for which component-failure mode (CF) knowledge is an important source. However, this knowledge is usually acquired manually from historical documents such as bills of material and failure analysis reports, which is labor-intensive, time-consuming and error-prone. Few existing studies have developed an effective and intelligent approach to acquiring accurate CF knowledge automatically. To fill the gap, this paper proposes a method to construct the CF matrix automatically by mining short, unstructured quality problem texts and representing them as CF knowledge. The method first mines the frequent itemsets of failure modes with the Apriori algorithm and then uses the semantic dictionary WordNet to find synonyms in the set of failure modes, from which the standard set of failure modes is built. Subsequently, based on this standard set and the component set, we design the component-failure mode matrix mining (CFMM) algorithm and apply it to establish the CF matrix from unstructured quality problem texts. Lastly, we examine the quality data of the seat module of an automobile company as a case study to validate the proposed method. The results show that the failure mode extraction method with standardized features extracts failure modes more effectively than the FP-growth and K-means clustering methods. Meanwhile, the CFMM algorithm extracts more CF combinations than the FP-growth method and builds a richer CF matrix. Although different industries have distinct domain characteristics, the proposed method is applicable not only to manufacturing but also to other fields that need FMEA to enhance product and system reliability.

Keywords Failure mode and effects analysis · Component-failure mode matrix · Data mining · System reliability ·
Automotive industry

* Zhaoguang Xu
[email protected]
1 Institute of System Engineering, Dalian University of Technology, Dalian 116024, China
2 BMW Brilliance Automotive Ltd., Shenyang 110143, China

Introduction

Failure mode and effects analysis (FMEA) is a systematic activity for revealing potential faults when a firm plans the development of a product or new production methods, and for implementing appropriate actions to avoid those faults, which ultimately improves product quality and reliability (Stamatis 1995; Liu et al. 2016). By definition, a failure mode refers to the termination of the ability of a system to perform a required function or its inability to perform within previously specified limits (ISO/IEC-15026-1 2013). It includes both known and potential failures and problems that may cause customer dissatisfaction and poor evaluations, and thus endanger the reputation of the entire organization (Asan and Soyer 2016). In practice, FMEAs come in various forms, such as Design FMEA (DFMEA), Process FMEA (PFMEA) and System FMEA (Pfeifer 2002), according to different emphases and objectives. When creating a DFMEA, listing all potential failure modes for each component is a very important step (Brook 2006) that is critical for creating failure-free designs (Arunajadai et al. 2004). While anticipating every failure mode is impossible, the development team should formulate a list of potential failure modes that is as extensive as possible (Goel and Graves 2007). Most firms usually manually analyze, collate and summarize historical
documents such as the product function decomposition model, bill of materials (BOM) and failure analysis reports to obtain the failure modes of components and the corresponding relationships between the components and the failure modes (Wang et al. 2011).

DFMEA is a commonly used and significant tool in product design and development. It takes full account of the problems involved in the production, transportation, and use of products, brings all possible problems into the scope of prevention, and prepares preventive measures and solutions in advance. Creating a DFMEA first requires knowing which failure modes have occurred in the product components. The component-failure mode matrix is an important source of knowledge in this process.

However, manually acquiring failure modes and their associations with components under a DFMEA has several drawbacks. First, the sources of failure mode knowledge are very fragmented. When these documents are missing or difficult to find, the component-failure mode (CF) knowledge will be incomplete. Second, the large number of failure-mode types makes it difficult for an enterprise to build a firm-level failure-mode library. Instead, different departments in a company usually use their own vocabulary when describing the same failure mode, or the same description for two marginally different failures (Tumer et al. 2003). Also, manually building CF knowledge is a time-consuming and labor-intensive activity. On the one hand, the designers who create the DFMEA are far away from the production process and lack understanding of the product quality problems that may occur there; the data on product quality problems scattered across the production process form an isolated information island that is difficult for designers to use. On the other hand, the employees who record product quality problems often follow their own habits when describing the same problem. Different words are used to describe the same failure mode, which makes the designer’s perception of the failure mode ambiguous.

In response to these problems, many scholars have studied the extraction of failure modes (Collins et al. 1976; Arunajadai et al. 2002; Tumer et al. 2003; Wani and Jan 2006; Chen and Nayak 2007; Wijayasekara et al. 2014; Chang et al. 2015; Rajpathak and De 2016; Kai et al. 2015; James et al. 2017; Meng et al. 2017) by utilizing classification, clustering and other methods. Among these studies, few focus on the standardization of failure modes. Tumer et al. (2003) provide a standard failure-mode taxonomy with a three-level definition for each failure mode by analyzing operational failure reports from a problem and failure reporting database at the Jet Propulsion Laboratory. However, these failure modes were based on predetermined failures (Roberts et al. 2003), and the authors did not describe how to build up the standard failure mode taxonomy. Beksinska et al. (2007) developed a standardized list of terms and definitions of failure modes for female condoms. However, there were only eight kinds of failure modes, and all of them were compiled by members of the WHO Technical Review Committee, an approach that is not suitable for complex products and systems. Other scholars have focused on functional-failure mode (EF) matrix construction (Arunajadai et al. 2002; Tumer and Stone 2003; Chang et al. 2015) and fault dependency matrix (D-matrix) mining (Singh et al. 2010; Rajpathak and Singh 2014; Deore 2015; Thombare and Dole 2015; Jenifa and Balachander 2015; Mendhe and Hande 2017). These studies presupposed that the CF matrix is known, whereas few studies revolve around mining the CF matrix from a large number of texts.

However, in the production process, as well as in quality management activities, a large amount of data related to quality problems is generated and accumulated. In our view, the knowledge embedded in the unstructured quality problem data provides insight into component failure. Meanwhile, text mining is important because it can automatically discover knowledge assets hidden in unstructured text (Hearst 1999; Khilwani and Harding 2016). Therefore, using a text mining method to automatically mine the failure modes of product components and the relationships between components and failure modes from a large amount of quality problem data, instead of acquiring them manually, can make it possible to predict each failure mode. The output of this method will provide a data foundation for DFMEA, improve the efficiency of building a DFMEA knowledge base, and improve product and system reliability and customer satisfaction.

Stated thus, there is a paradox. On the one hand, we want employees’ descriptions to be uniform and accurate, for example by building a standard failure-mode set manual beforehand, but this comes at the expense of the authenticity of the information. After all, we cannot exhaust all failure modes and ensure that they fully match the actual situations staff encounter. On the other hand, employees’ personalized descriptions guarantee the authenticity and accuracy of the information records, but the descriptions are probably confusing, repetitive and logically poor, which brings great challenges to failure pattern recognition and CF matrix construction. Unfortunately, the existing methods have fairly limited ability to solve this paradox.

To address this challenge, this paper tackles the standardization of failure modes and the automatic construction of a CF matrix based on data with implicit failure modes. In this study, a WordNet-based text mining method is first built to create a standard failure mode library. Then, this paper proposes a component-failure mode matrix mining (CFMM) algorithm. Based on the standard failure modes constructed above and the existing product components, the algorithm extracts the CF matrix from the quality
problem text, which can be treated as the component-failure mode knowledge and the basic knowledge of FMEA. In particular, this paper makes the following contributions:

1. Different from the existing methods of failure-mode extraction, this paper considers the phenomenon that different departments use different vocabulary to describe the same problem or failure mode, and further studies the problem of failure mode standardization. Based on the semantic dictionary WordNet, this paper recognizes and unifies synonymous failure modes and then constructs a standard set of them, which provides a common vocabulary across business units and a more comprehensive and standard knowledge resource for DFMEA.
2. Based on historical data of components and failure modes, the CFMM method is employed to build a CF matrix automatically, which is more efficient and reliable than traditional experience-based and brainstorm-based approaches. It can also be combined with traditional approaches to build the CF matrix better. The CFMM algorithm in this paper covers both significant and insignificant failure modes (FM) in frequent itemsets more comprehensively and constructs the correlation matrix between standard failure modes and components more completely and with higher accuracy.
3. In the existing research on failure modes, the data source of most texts is after-sales data (Rajpathak and De 2016), maintenance text data (Chen and Nayak 2007), existing FMEA, FMECA and other data. To the best of our knowledge, this paper is the first to use the product quality problem-solving data in the manufacturing process as the data source for failure mode extraction. Note that our proposed method is also compatible with data-driven FMEA construction based on sales data.

The remainder of this paper is organized as follows. The next section reviews related studies. Then, the “Research framework” section provides the general research framework to help guide the reader through the steps involved. The “Standard failure mode set construction” section introduces the WordNet-based method to build up the standard failure mode set. The “Component-failure mode matrix mining” section presents the CFMM algorithm. The “Case study” section takes the seat module in an automotive company as an example to apply our methodology and conduct relevance analysis. Then, we conclude the paper by discussing the benefit of our methodology and summarizing the main findings in the “Conclusions and future work” section.

Literature review

Failure mode acquisition

Many text mining-related methods have been used by scholars to obtain failure modes. Some of them have studied the methods of failure mode classification. Based on the failure-experience matrix (Collins et al. 1976), Arunajadai et al. (2002) classified incremental bills of materials with recorded failure information into corresponding default failure modes. Wijayasekara et al. (2014) divided the software code failure data into hidden impact bugs and regular bugs. Wu et al. (2017) proposed a classification tree kernel-based support vector machine to identify bearing failures. However, the classification method presupposes the types of failure modes and usually sets only a few categories. This approach is not applicable to complex systems, such as automotive products, which may have many failure modes.

In addition, many scholars have studied how to use clustering methods to acquire failure modes. Based on manually established failure modes and their frequency, Wani and Jan (2006) adopted the “K clustering” method to determine the failure mode group in the conceptual design phase of the mechanical system. Chen and Nayak (2007) studied the method of automatically extracting failure modes from maintenance text datasets using Ward’s agglomerative method and the similar histogram clustering method. Arunajadai et al. (2004) constructed a similarity matrix between failure modes and then obtained failure mode groups through hierarchical clustering.

Moreover, some scholars clustered failure modes for better analysis based on historical FMEA data. Chang et al. (2015) clustered the failure modes in the FMEA and converted the failure modes in complex FMEA worksheets into a tree structure by constructing the ETree learning algorithm. Kai et al. (2015) studied the Euclidean distance-based similarity measure and fuzzy adaptive resonance theory neural network for the similarity analysis and clustering of failure modes in FMEA. Meng et al. (2017) performed K-means clustering on the preprocessed software failure text and selected representative failure texts from the clusters as cluster labels.

Apart from the methods of classification and clustering, some scholars have studied the ontology method to extract failure modes. Rajpathak and De (2016) provided an ontology-based approach for identifying failure modes from repair verbatim data. James et al. (2017) studied the construction of failure knowledge ontology based on historical data on maintenance and services.

Among the above methods, few studies have considered the case where the description of the failure mode is not
uniform. However, quality, maintenance or after-repair records are usually recorded by different people or departments in different ways. One of the most limiting aspects of FMEA is the lack of a standard vocabulary to describe functionality and failure modes accurately and without ambiguity (Schneider 2003). Although some studies have emphasized that standardization of failure mode vocabulary helps to effectively maintain and utilize the knowledge base and have provided a standard failure mode vocabulary (Arunajadai et al. 2002), none of these studies provided a systematic approach to standardizing failure modes.

Considering the limitations of existing literature, this paper proposes a text mining method based on WordNet to construct a standard failure-mode set. Compared with previous studies, this method applies to a large number of failure-mode sets and can extract and construct standard failure-mode sets without consuming considerable labor and time.

Component‑failure mode matrix

CF knowledge is a representation of the potential failure modes of product subsystems and components in the FMEA and can be represented by an n × m CF matrix (Tumer and Stone 2003; Arunajadai et al. 2004), where m is the total number of failure modes occurring across all n components. In this matrix, a ‘1’ is placed for a component in the cell corresponding to a failure mode the component has experienced and a ‘0’ is placed in the other cells (Arunajadai et al. 2002). In addition to the binary information of failure modes for a given component (Xu et al. 2018), the likelihood or frequency of occurrence can also be encoded in the CF matrix (Wang et al. 2011).

Many studies involving components and failure modes used the CF matrix as a known resource to study other issues, such as functional-failure mode (EF) matrix construction and fault dependency matrix (D-matrix) construction. The EF matrix relates the failure modes to the elemental functions. Each element in the matrix indicates whether any component solving a function has ever failed by a failure mode (Arunajadai et al. 2002; Tumer and Stone 2003; Chang et al. 2015). Most studies centered on the construction of the EF matrix by the formula EF = EC × CF, where EC represents a function-component matrix. Another area of study related to component-failure modes considers the D-matrix. Unlike the CF matrix, the D-matrix indicates the dependencies between observable symptoms and failure modes. It is a system diagnostic model for capturing hierarchical system-level fault diagnosis information (Rajpathak and Singh 2014). In the D-matrix, rows represent combinations of components and failure modes, and columns represent symptoms. Singh et al. (2010) introduced three types of D-matrices and their sources, such as historical field fault data, engineering schematics, and failure mode, effects and criticality analysis (FMECA) data. Based on the construction of a fault diagnosis ontology, Rajpathak and Singh (2014) applied ontology-based text mining algorithms to identify necessary artifacts, such as parts, symptoms, failure modes and their dependencies, from unstructured repair verbatim texts in the automotive field. Building on this research, many scholars have conducted similar studies. Deore (2015) and Thombare and Dole (2015) described an ontology-based text mining method for automatically building and updating the D-matrix by mining thousands of repair verbatim records collected during diagnostic events. Mendhe and Hande (2017) further studied the representation of the D-matrix as a graph. Jenifa and Balachander (2015) introduced a method to construct the D-matrix with the help of the FP-growth algorithm such that the best-practice repair actions can be discovered.

However, all of these studies assumed that the CF matrix or the relationship between components and failure modes is known (Liu et al. 2017), and in the case studies of these papers, the numbers of components and failure modes were relatively small. Thus, the CF matrix could be provided based on experience. Unfortunately, complex products or systems contain many components and have many failure modes, which makes it time-consuming and labor-intensive to acquire the CF matrix manually. Obtaining the relationship between components and failure modes automatically therefore poses significant challenges. Given such constraints, this paper provides a CFMM method based on historical text data to automatically construct a CF matrix, which fills the gap in the field.

Research framework

The goal of this study is to automatically obtain the CF matrix from a large amount of unstructured quality problem data. For clarity of exposition, we illustrate the research framework and the process of CF matrix extraction in Fig. 1. The raw data are first preprocessed; then, a standardized failure-mode set is constructed through failure mode frequent itemset mining and frequent itemset standardization. Next, the nonstandard failure mode text in the existing problem title set is replaced with the standard failure mode to form a new problem title set. Finally, the CF matrix mining algorithm is designed. Based on the standard failure mode set and the existing component set, the algorithm is used to extract the CF matrix from the processed quality problem text. The part covered by the red line in Fig. 1 is the focus of this paper. The “Standard failure mode set construction” section introduces the process and method of constructing the standard failure mode set in detail. The “Component-failure mode matrix mining” section describes the CFMM algorithm in detail.
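
The frequent itemset mining step of the framework can be illustrated with a small level-wise search in the spirit of the Apriori algorithm. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `frequent_itemsets`, the parameters `min_support` and `max_len`, and the toy problem titles are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_len=3):
    """Level-wise (Apriori-style) search for itemsets whose support count
    is at least min_support. Returns a dict {frozenset: support_count}."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({w for t in transactions for w in t})
    current = [frozenset([w]) for w in items]  # candidate 1-itemsets
    result = {}
    size = 1
    while current and size <= max_len:
        # Count how many transactions contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        frequent = {c: n for c, n in counts.items() if n >= min_support}
        result.update(frequent)
        # Join step: merge pairs of frequent k-itemsets into (k+1)-candidates.
        size += 1
        candidates = {a | b for a, b in combinations(frequent, 2)
                      if len(a | b) == size}
        # Prune step: every k-subset of a candidate must itself be frequent.
        current = [c for c in candidates
                   if all(frozenset(s) in frequent
                          for s in combinations(c, size - 1))]
    return result


# Hypothetical, already preprocessed problem titles (one transaction each).
titles = [
    ["noise", "abnormal"],
    ["noise", "abnormal", "rattle"],
    ["wrinkle"],
    ["noise", "rattle"],
]
found = frequent_itemsets(titles, min_support=2)
for itemset, count in sorted(found.items(),
                             key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

Each surviving itemset is a candidate failure mode; in this toy data the pair ("abnormal", "rattle") is dropped because it appears in only one title.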


[Figure 1: flowchart with three stages. Data preparation: data collection, data preprocessing, quality problem set, replacement of the failure mode text with the standard failure mode, processed quality problem set, component set. Standard failure mode set construction: failure mode frequent itemset mining, failure mode combination and standardization, standardized failure modes. Component-failure mode matrix mining: formalization of components, failure modes and problems; design and implementation of the component-failure mode matrix mining algorithm; output: component-failure mode matrix.]

Fig. 1 The research framework
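
As a small illustration of the output at the end of Fig. 1, the sketch below encodes a binary CF matrix in the n × m form described in the literature review and derives an EF matrix through EF = EC × CF. The component, function and failure mode names and every matrix entry are invented for this example, and the helper `matmul` is hypothetical.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


components = ["seat cover", "headrest", "seat rail"]    # n = 3 (made up)
failure_modes = ["wrinkle", "noise", "stuck"]            # m = 3 (made up)
functions = ["support occupant", "adjust position"]     # made up

# CF[i][j] = 1 if component i has experienced failure mode j, else 0.
CF = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 1],
]

# EC[k][i] = 1 if component i contributes to function k, else 0.
EC = [
    [1, 1, 0],
    [0, 0, 1],
]

# EF[k][j] counts the components of function k that showed failure mode j.
EF = matmul(EC, CF)
# Thresholding gives the binary "has this function ever failed this way" view.
EF_binary = [[1 if v > 0 else 0 for v in row] for row in EF]

for name, row in zip(functions, EF_binary):
    print(name, dict(zip(failure_modes, row)))
```

The raw product keeps counts, which is one way frequency information could be carried along; the thresholded version matches the binary reading of the matrix.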

Standard failure mode set construction

According to the research framework in Fig. 1, the method of constructing a standard failure mode set includes the following steps: text preprocessing, failure mode frequent itemset mining, and failure mode standardization. Text preprocessing operations include removing stopwords, converting abbreviations into complete words, and removing context words. Failure mode frequent itemset mining includes part-of-speech tagging, extracting frequent itemsets using the Apriori algorithm, and pruning. Failure mode standardization operations include WordNet-based failure mode synonym identification and failure mode combination and standardization.

This paper collects related problem corpora from the database of quality problem-solving. $T = \{t_1, t_2, \ldots, t_l\}$ is the original quality problem text set, where $t_s$ represents the $s$th problem, $s = 1, 2, \ldots, l$. First, the original text is preprocessed by deleting stopwords, expanding abbreviations into complete words, and deleting context words. In the stopword removal step, the stopwords in the text are deleted based on the current stopword list. At the same time, a dictionary of domain abbreviations is built to convert abbreviations in the original text into complete words. Many scholars have proposed different methods to address the abbreviation ambiguity problem (Wu et al. 2015; Kim and Yoon 2015). However, the abbreviations here have domain characteristics, and the quality problems are titled in short texts, so these abbreviations usually have a unique sense. Therefore, after an abbreviation is identified in the original text, it is replaced with the corresponding word in the abbreviation database, and no disambiguation is performed. In addition, the original text contains information such as components, which contain words that are different from the words contained in the failure mode. Therefore, to focus on the failure mode and keep this information from interfering with the failure mode extraction process, words contained in the components are deleted from the original text according to the known component set $C = \{c_1, c_2, \ldots, c_n\}$.

Then, in the failure mode frequent itemset mining step, all the problem texts are part-of-speech tagged, and the word set $U = \{u_1, u_2, \ldots, u_o\}$ of the preprocessed documents is obtained. These words are used as items, and each problem record is taken as a transaction to create the associated transaction data. According to the Apriori algorithm (Han et al. 2011), itemsets satisfying the minimum support threshold $\alpha$ are extracted from the transaction data as candidate itemsets. The set of candidate itemsets is represented as $F_1' = \{f_1', f_2', \ldots, f_j', \ldots, f_v'\}$, in which $f_j'$ is the $j$th frequent itemset. Each candidate itemset can be seen as a failure mode. According to the experience of domain experts, a failure mode contains no more than three words, so this paper does not consider frequent itemsets with more than three items. Moreover, some frequent itemsets are not failure modes, such as road, cobblestone and other words that indicate the situation of the problem, and words that indicate a location, such as right, left, rear, front, and middle. Therefore, drawing on expert experience, this paper prunes the candidate itemsets manually and obtains a new set of candidate itemsets $F_2' = \{f_1', f_2', \ldots, f_j', \ldots, f_\delta'\}$, where $\delta < v$.

The new set of pruned frequent itemsets contains a large number of failure modes. If the result were used directly as a failure mode set, several frequent itemsets might represent the same failure mode. Therefore, it is necessary to standardize the set of candidate itemsets $F_2'$ and merge the different frequent itemsets that are described by synonyms. In quality
13
Journal of Intelligent Manufacturing

management, developing a generic description vocabulary standard failure-mode set. Here we give the notations and
that can be understood by the various departments is a chal- definitions which will be used in the CFMM algorithm.
lenge for the organization. It would take many workforces Automotive components are an essential part of a car.
and material resources to identify synonyms or phrases arti- They usually consist of multiple parts and have specific
ficially from hundreds of words or phrases manually. There- functions. The component set can be formalized as
fore, some methods are often used to identify synonyms, { }
C = c1 , c2 , … , cn ,
such as identifying synonyms in dictionary annotations
(Blondel and Senellart 2002; Muller et al. 2006; Wang and where ci indicates the ith component, i = 1, 2, … , n.
Hirst 2012), vocabulary cooccurrence algorithms in large Failure mode refers to the termination of the ability of
corpora (Baroni and Bisi 2004), and search engine-based a system to perform a required function or its inability to
methods for identifying synonyms in web (Yates and Etzioni perform within previously specified limits. It is the result
2009; Cheng et al. 2012). However, it is difficult to identify of the failure mechanism (cause of the failure mode). For
the synonymous failure modes by these methods since the example; a fully fractured axle, a deformed axle or a fully
quality problem text is usually short and contains limited open or fully closed electrical contact are each a separate
information. Of course, there may be a phenomenon of poly- failure mode of a DFMEA. The failure mode set can be for-
semy for the corpora on the Internet, whereas the failure malized as
mode in this research is often a non-polysemy noun which { }
F = f1 , f2 , … , fm ,
has strong domain characteristics. Therefore, each frequent
itemset with a support count can be considered a failure where fj represents the jth failure mode, j = 1, 2, … , m.
mode with only a single meaning. In the step of standardiza- A problem title is a comprehensive refinement of the
tion of failure modes, this paper introduces a synonym problem, usually recorded in the form of short text, and
extraction method based on WordNet. WordNet is a cogni- contains information about components and failure modes.
tive linguistics-based English dictionary designed by psy- Problem title set can be represented by
{ }
chologists, linguists and computer engineers at Princeton T = t1 , t2 , … , tl ,
University. It organizes vocabulary information based on word meaning rather than word form, grouping terms according to their meaning. Each group of words with the same meaning is called a Synset (Fellbaum 2000). WordNet is used to find synonymous relationships between all words in the set of candidate itemsets and to build a synonym set. For each set of synonymous frequent itemsets, the frequent itemset with the highest support count is taken as the standard failure mode of the group. Finally, the results are revised based on the experience and opinions of domain experts, and a standard set of failure modes is constructed. For example, if $f_i'$ and $f_j'$ are synonymous according to WordNet and the support count of the frequent itemset $f_j'$ is higher than that of $f_i'$, then the two failure modes are unified into $f_j'$ as the standard failure mode, and the new support count of $f_j'$ is the sum of the original support counts of $f_j'$ and $f_i'$. According to this rule, this paper constructs a synonymous failure mode set and obtains a new set of frequent itemsets $F = \{f_1, f_2, \ldots, f_m\}$ as a standard failure mode set, in which $m < \delta$.

Component-failure mode matrix mining

Notation and formalization

We use the titles of the quality problems as a link to construct the CF matrix of the existing component set and the

where $t_s$ represents the $s$th problem title, $s = 1, 2, \ldots, l$.

The text of components, failure modes, and problem titles can only be computed after mathematical representation. This paper uses the bag of words (BOW) model to represent these texts. In information retrieval, the BOW model ignores the word order, grammar, syntax and other elements of a document and regards it only as a collection of words. The appearance of each word in the document is independent and does not depend on whether other words appear. Before using the BOW model to represent text, it is necessary to create a dictionary. A dictionary consisting of the words contained in the text of the components, failure modes, and problem titles can be represented as

\[
W = \left[ w_1, w_2, \ldots, w_e \right],
\]

where $w_\tau$ represents the $\tau$th word in the dictionary, and $\tau = 1, 2, \ldots, e$.

Based on the dictionary, this paper uses BOW to represent the component set as the document-term matrix. The document-term matrix of components can be formalized as

\[
CW_{ne} = \left( cw_{i\tau} \right)_{n \times e} =
\begin{pmatrix}
cw_{11} & cw_{12} & \cdots & cw_{1e} \\
cw_{21} & cw_{22} & \cdots & cw_{2e} \\
\vdots & \vdots & \ddots & \vdots \\
cw_{n1} & cw_{n2} & \cdots & cw_{ne}
\end{pmatrix},
\]

where $cw_{i\tau} = 1$ indicates that component $c_i$ contains the word $w_\tau$, and $cw_{i\tau} = 0$ indicates the opposite.
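As an illustration only (not the authors' implementation), the dictionary and the binary document-term matrices can be sketched in Python with NumPy. The helper names `build_dictionary` and `document_term_matrix` are hypothetical, and the whitespace tokenization is an assumption; real titles would first need stopword removal and acronym handling.

```python
import numpy as np

def build_dictionary(*text_groups):
    """Collect the distinct words of all texts, in first-seen order."""
    words = []
    for group in text_groups:
        for text in group:
            for word in text.lower().split():  # naive whitespace tokenization (assumption)
                if word not in words:
                    words.append(word)
    return words

def document_term_matrix(texts, dictionary):
    """Binary BOW matrix: entry (i, tau) is 1 if text i contains word tau, else 0."""
    index = {word: k for k, word in enumerate(dictionary)}
    matrix = np.zeros((len(texts), len(dictionary)), dtype=int)
    for row, text in enumerate(texts):
        for word in text.lower().split():
            matrix[row, index[word]] = 1
    return matrix

# Toy data (illustrative only)
titles = ["Seat belt noise"]
components = ["seat belt", "seat belt buckle"]
failure_modes = ["noise"]

W = build_dictionary(titles, components, failure_modes)
CW = document_term_matrix(components, W)     # n x e
FW = document_term_matrix(failure_modes, W)  # m x e
TW = document_term_matrix(titles, W)         # l x e
```

With this toy input the dictionary is `[seat, belt, noise, buckle]`, and each row of `CW`, `FW`, and `TW` marks which dictionary words the corresponding text contains.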


Similarly, the document-term matrix of failure modes can be formalized as

\[
FW_{me} = \left( fw_{j\tau} \right)_{m \times e} =
\begin{pmatrix}
fw_{11} & fw_{12} & \cdots & fw_{1e} \\
fw_{21} & fw_{22} & \cdots & fw_{2e} \\
\vdots & \vdots & \ddots & \vdots \\
fw_{m1} & fw_{m2} & \cdots & fw_{me}
\end{pmatrix},
\]

where $fw_{j\tau} = 1$ indicates that failure mode $f_j$ contains the word $w_\tau$, and $fw_{j\tau} = 0$ indicates the opposite.

The document-term matrix of problem titles can be formalized as

\[
TW_{le} = \left( tw_{s\tau} \right)_{l \times e} =
\begin{pmatrix}
tw_{11} & tw_{12} & \cdots & tw_{1e} \\
tw_{21} & tw_{22} & \cdots & tw_{2e} \\
\vdots & \vdots & \ddots & \vdots \\
tw_{l1} & tw_{l2} & \cdots & tw_{le}
\end{pmatrix},
\]

where $tw_{s\tau} = 1$ indicates that problem title $t_s$ contains the word $w_\tau$, and $tw_{s\tau} = 0$ indicates the opposite.

The CF matrix is an $n \times m$ matrix, which can be represented as

\[
CF_{nm} = \left( cf_{ij} \right)_{n \times m} =
\begin{pmatrix}
cf_{11} & cf_{12} & \cdots & cf_{1m} \\
cf_{21} & cf_{22} & \cdots & cf_{2m} \\
\vdots & \vdots & \ddots & \vdots \\
cf_{n1} & cf_{n2} & \cdots & cf_{nm}
\end{pmatrix},
\]

where $m$ is the total number of failure modes occurring across all $n$ components, and $cf_{ij}$ indicates the number of times that component $c_i$ has experienced failure mode $f_j$.

Assumptions and algorithm

In this section, we provide a novel text mining method for mining the relationships between components and failure modes. For ease of exposition, we make the following assumptions.

Assumption 1 Each title only contains one component.

Assumption 2 Each title only contains one failure mode.

In fact, a problem title in practice usually contains the following information: the problem situation, the component in which the problem occurred, and what the problem is. People who input problems into the system will not describe the problematic component and the failure mode multiple times in the title. These titles form a concise representation of the most important message of a document (Mangnoesing et al. 2012), and they are often concise and rarely have complex statement expressions (Miao et al. 2008). Moreover, the method can also serve as a reference for extracting related information from short-text titles in other scenarios. However, for non-title content, for example, an article or lengthy text, different methods may be needed for different research needs.

In this method, we first need to find out which components and failure modes each title contains. Through the formula below, we can obtain the association between the titles and the components.

\[
TC = \left( tc_{si} \right)_{l \times n} = TW \times CW^{T} =
\begin{pmatrix}
tc_{11} & tc_{12} & \cdots & tc_{1n} \\
tc_{21} & tc_{22} & \cdots & tc_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
tc_{l1} & tc_{l2} & \cdots & tc_{ln}
\end{pmatrix}
\tag{1}
\]

In formula (1), $CW^{T}$ represents the transpose of the document-term matrix of components. Each entry of the matrix $TC$ is the number of identical words in a problem title and a component. For example, $tc_{11}$ indicates the number of identical words in the first problem title and the first component. For each title, we need to find the component for which this value is the largest, which indicates that the number of identical words in this component and this problem title is the highest. This component is the component that the title contains. For example, if $\max\left( tc_{11}, tc_{12}, \ldots, tc_{1n} \right) = tc_{12}$, then title 1 contains component 2. If a problem title can correspond to multiple components, we take the component that contains the fewest words as the component corresponding to the problem title. For example, suppose the problem title is “Seat belt noise” and the component set contains the two components “seat belt” and “seat belt buckle.” Then, the dictionary will be {seat, belt, noise, buckle}. According to the above method, the results of multiplying the seat belt and the seat belt buckle by the problem title are both 2. However, the corresponding component in the title should be the “seat belt,” not the “seat belt buckle.”

After this step, we get the correspondence between all titles and components and store the subscripts of the components identified in all titles in an array, which then generates a collection of components corresponding to all the problem titles. Moreover, the occurrence frequency of each component is calculated as the statistical result of the component identified from all the titles.

As with the above method of obtaining the association between the titles and the components, we can get the association between the titles and the failure modes by the following formula.

\[
TF = \left( tf_{sj} \right)_{l \times m} = TW \times FW^{T} =
\begin{pmatrix}
tf_{11} & tf_{12} & \cdots & tf_{1m} \\
tf_{21} & tf_{22} & \cdots & tf_{2m} \\
\vdots & \vdots & \ddots & \vdots \\
tf_{l1} & tf_{l2} & \cdots & tf_{lm}
\end{pmatrix}
\tag{2}
\]
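The matching rule behind formulas (1) and (2) can be sketched as follows, using the “Seat belt noise” example. This is a minimal sketch under stated assumptions, not the authors' code: the function name `match_titles` is hypothetical, and the dictionary order (seat, belt, noise, buckle) is fixed by hand.

```python
import numpy as np

# Dictionary order assumed for this sketch: seat, belt, noise, buckle
CW = np.array([[1, 1, 0, 0],    # component 0: "seat belt"
               [1, 1, 0, 1]])   # component 1: "seat belt buckle"
TW = np.array([[1, 1, 1, 0]])   # title 0: "Seat belt noise"

def match_titles(TW, XW):
    """Match each title to the row of XW sharing the most words with it.

    Ties are broken in favor of the row containing the fewest words,
    as described for components in the text.
    """
    overlap = TW @ XW.T            # formula (1) when XW = CW, formula (2) when XW = FW
    lengths = XW.sum(axis=1)       # number of words in each component / failure mode
    matches = []
    for row in overlap:
        best = np.flatnonzero(row == row.max())            # all candidates with the maximum overlap
        matches.append(int(best[np.argmin(lengths[best])]))  # fewest-words tie-break
    return overlap, matches

TC, x = match_titles(TW, CW)
```

Here both components share two words with the title, so `TC` is `[[2, 2]]`, and the tie-break selects index 0 (“seat belt”). Note that a title sharing no word with any component would still be assigned one under this sketch; the text does not say how such titles are handled.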


Based on the work done, we store the subscripts of the failure modes identified in all titles in an array, then generate a collection of failure modes corresponding to all the problem titles. Again, the occurrence frequency of each failure mode is calculated.

According to the component subscript and failure mode subscript corresponding to each problem title, the component is associated with the failure mode, and the number of failure modes of each component is calculated accordingly. Furthermore, the CF matrix is established.

The CFMM algorithm is shown in Table 1. The application of this algorithm will be explained in the next section with an example.

As shown in the algorithm, the purpose of steps 1–11 is to determine which components are included in each title. In steps 12 and 13, the number of occurrences of each component is calculated as a statistical result. Similarly, the purpose of steps 14 to 23 is to identify the failure mode contained in each title. In steps 24 and 25, the number of occurrences of each failure mode is calculated as a statistical result.

According to steps 8, 9, 20, and 21, we obtain the component subscript and failure mode subscript corresponding to each problem title. Then, through step 27, the components are associated with the failure modes, and the number of failure modes of each component is calculated accordingly. Furthermore, the CF matrix is established.

Case study

Experimental data

In this section, we take problem data of a car seat system from Company A as an example to verify the proposed method. The seat is an important part of the automotive interior. In addition to providing smooth operation and comfortable driving for the passengers, it must also ensure the safety of the passengers. At the same time, some seats also provide heating, automatic adjustment, and other functions to meet the individual needs of customers. The failure modes of the car seat components may have various effects; some may affect the appearance, some may affect the function, and some even pose a safety hazard. Therefore, measures can be taken from the design stage to avoid problems by identifying the failure modes of each component of the seat and performing FMEA. The seat system includes the front seat assembly, rear seat assembly, seat belt system, and child restraint system, for a total of more than 300 components. Due to space limitations, this paper presents some common components in Table 2.

We obtained 11,677 problem records from the years 2010 to 2016 from the quality management information system of Company A, of which 568 are related to the seat. Based on the titles of these records, we delete the items that only contain simple content such as “problems,” “problem,” “seat problem.” Titles written in German are also deleted. After this process, the number of useful seat problems is 495. Each record includes the problem number, vehicle model, main module, title, description, creation date and other information. Before data mining, we construct the corresponding stopword list and obtain the acronym table commonly used in Company A.

The problem titles are refined short texts. The extraction of the failure modes and the construction of the CF matrix are based on the problem titles. In Company A, there is a standard for the input of the quality problem title, usually “problem finder _ model _ project stage _ problem concise description.” For example, in the problem title “FDP_F35_SE_Noise from right rear seat backrest unlocking as driving on the bumpy road”, “FDP” indicates that the problem was discovered by a road test, “F35” indicates that the problem occurred on the F35 model, and “SE” indicates that the problem is a mass production problem. Despite the input criteria, the problem finder will occasionally describe the problem in the way he is used to, resulting in reduced data quality. According to the statistics, 65% of the problem titles of the seat module meet the standard, and the remaining titles consist only of a concise description of the problem. However, this does not affect the subsequent analysis of this paper. In the process of extracting the failure mode, this paper filters out the information on the problem finder, model and project stage and only uses the concise description.

Failure mode extraction result

This section uses the WordNet-based approach described in “Standard failure mode set construction” section to build a standardized failure mode set. Table 3 presents a synonym set of standard failure modes. As shown in the table, the synonymous failure modes are combined, and a total of 17 groups are obtained. The number in parentheses after each failure mode indicates the support count for the failure mode. For most groups, the failure mode with the highest support count is considered the standard failure mode, which is the most commonly used expression for inputting the problem from different quality departments. For some groups, although one failure mode has the highest support count, the experts judge another failure mode in the group to be the standard failure mode. For example, in group 12, “aroma,” “smell,” and “odor” indicate that the seat emits an unusual smell. Although “smell” has the highest support count, “odor” is a more professional expression, so it is adopted as the standard failure mode of this group. The


Table 1  Component and failure mode association mining algorithm

Algorithm 1  CF matrix mining (CFMM)
Input: CW, FW, TW
Output: CF
1   for s ← 1 to l do    (for each problem title)
2     for i ← 1 to n do    (for each component)
3       p[i − 1] ← TW(s, :) × CW^T(i, :)    (the number of identical words in component and title)
4     end for
5     find i ← α s.t. p[α − 1] ← max p[n]
6     if the number of the maximum value in p[n] is greater than one do
7       find α ← α′ s.t. Σ_{τ=1}^{e} cw_{α′τ} ← min Σ_{τ=1}^{e} cw_{ατ} then    (the component that has the fewest words)
8       x[s − 1] ← α′
9     else x[s − 1] ← α
10    end if
11  return CX ← {c_{x[0]}, c_{x[1]}, …, c_{x[l−1]}}
12  for each c_{x[k]} ∈ CX do
13    c_{x[k]} ← c_{x[k]}.count++
14    for j ← 1 to m do    (for each failure mode)
15      q[j − 1] ← TW(s, :) × FW^T(j, :)    (the number of identical words in failure mode and title)
16    end for
17    find j ← β s.t. q[β − 1] ← max q[m]
18    if the number of the maximum value in q[m] is greater than one do
19      find β ← β′ s.t. Σ_{τ=1}^{e} fw_{β′τ} ← min Σ_{τ=1}^{e} fw_{βτ} then    (the failure mode that has the fewest words)
20      y[s − 1] ← β′
21    else y[s − 1] ← β
22    end if
23  return FY ← {f_{y[0]}, f_{y[1]}, …, f_{y[l−1]}}
24  for each f_{y[k]} ∈ FY do
25    f_{y[k]} ← f_{y[k]}.count++
26  end for
27  return D ← {cf_{x[0]y[0]}, cf_{x[1]y[1]}, …, cf_{x[l−1]y[l−1]}}
28  for each cf_{x[k]y[k]} ∈ D do
29    cf_{x[k]y[k]} ← cf_{x[k]y[k]}.count++
30  end for
31  for each cf_{ij} ∈ CF_{nm} do
32    if cf_{ij} ∈ D
33      cf_{ij} ← cf_{ij}.count++
34    else cf_{ij} ← 0
35    end if
36  end for
37  return CF_{nm}
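The overall flow of Algorithm 1 can be condensed into a short Python sketch. It is an interpretation under stated assumptions rather than the authors' implementation: the function names `best_match` and `cfmm` are hypothetical, the toy matrices are invented for illustration, and the per-title loop merges the component pass (steps 1–11) and the failure mode pass (steps 14–23) before counting each pair (steps 27–36).

```python
import numpy as np

def best_match(tw_row, XW):
    """Index of the row of XW with the most words in common with the title;
    ties go to the row with the fewest words."""
    overlap = XW @ tw_row
    candidates = np.flatnonzero(overlap == overlap.max())
    return int(candidates[np.argmin(XW[candidates].sum(axis=1))])

def cfmm(CW, FW, TW):
    """Build the n x m CF matrix from binary document-term matrices."""
    CF = np.zeros((CW.shape[0], FW.shape[0]), dtype=int)
    for tw_row in TW:
        x = best_match(tw_row, CW)   # component subscript of this title
        y = best_match(tw_row, FW)   # failure mode subscript of this title
        CF[x, y] += 1                # one more occurrence of this CF combination
    return CF

# Toy dictionary (assumed): seat, belt, buckle, backrest, noise, wavy, stuck
CW = np.array([[1, 1, 0, 0, 0, 0, 0],    # seat belt
               [1, 1, 1, 0, 0, 0, 0],    # seat belt buckle
               [1, 0, 0, 1, 0, 0, 0]])   # seat backrest
FW = np.array([[0, 0, 0, 0, 1, 0, 0],    # noise
               [0, 0, 0, 0, 0, 1, 0],    # wavy
               [0, 0, 0, 0, 0, 0, 1]])   # stuck
TW = np.array([[1, 1, 0, 0, 1, 0, 0],    # "seat belt noise"
               [1, 0, 0, 1, 0, 1, 0],    # "seat backrest wavy"
               [1, 1, 1, 0, 0, 0, 1],    # "seat belt buckle stuck"
               [1, 1, 0, 0, 1, 0, 0]])   # "seat belt noise"

CF = cfmm(CW, FW, TW)
component_counts = CF.sum(axis=1)      # occurrences per component (cf. steps 12-13)
failure_mode_counts = CF.sum(axis=0)   # occurrences per failure mode (cf. steps 24-25)
```

Under Assumptions 1 and 2 every title contributes exactly one count, so the entries of `CF` sum to the number of titles.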

support count of the standard failure mode will be the sum of the support counts of the synonymous failure modes in each group. The standard failure modes of the seat module presented in Table 3 are not only applicable in Company A but are also an important reference for all vehicle manufacturers and corresponding seat suppliers.

After the failure mode mining of 495 seat quality problem records, a total of 57 types of failure modes were recognized. Due to space constraints, this paper only displays in Fig. 2 the failure modes that occurred more than six times. As shown in Fig. 2, “noise”, “wavy”, “defect”, “gap”, and “function defect” are the top five failure modes. This chart is a kind of Pareto chart, and its statistics can guide the managers of quality management. According to Fig. 2, the quality managers can identify the main failure modes that occurred on the seat module and take priority measures to solve these failure modes to improve key performance indicators (KPI) such as defects per 100 units (DPU). In this way, we

Table 2  A portion of the seat assembly

System | Component
Front seat assembly | Seat rail/seat armrest/seat backrest/headrest/covers/finishers/seat heating/first aid box
Rear seat assembly | Supports/covers/headrest/center armrest/ski bag/finishers/seat heating
Seat belt system | Seat belt/belt height adjuster/belt tensioner/belt buckle/end fittings
Child restraint system | Child seat impact table/child seat height adjustment/child seat footrest/ISOFIX

Table 3  The standard failure mode set

No | Synonymous failure mode set with support count | Standard failure mode
1 | Squeaking (6), knocking (0), creaking (2), sound (0), rattle (11), noise (109) | Noise (128)
2 | Failure (3), malfunction (6), defect (55) | Defect (64)
3 | Wrinkle (12), wavy (55) | Wavy (67)
4 | Move (2), movement (2), loose (22) | Loose (26)
5 | Incorrect (0), fault (2), abnormal (2), wrong (13) | Wrong (17)
6 | Lose (0), omitted (2), disappear (0), missing (8) | Missing (10)
7 | Thermal (1), heating (7) | Heating (8)
8 | Broken (3), damage (7) | Damage (10)
9 | Thread (2), stitch (2), sewing (2), seam (6) | Seam (12)
10 | Friction (0), rubbing (4), detrition (0) | Rubbing (4)
11 | Shake (0), vibration (3) | Vibration (3)
12 | Aroma (0), smell (4), odor (2), scent (0) | Odor (6)
13 | Warning (0), alarm (2) | Alarm (2)
14 | Pollutant (1), contamination (2) | Contamination (3)
15 | Delamination (0), lamination (2) | Lamination (2)
16 | Not parallel (1), parallelism (0), misalignment (2), tapered (0), wedge (0) | Not parallel (3)
17 | Deformed (0), distortion (0), twist (2) | Deformed (2)

Fig. 2  The standard failure modes and their frequency


can organize and utilize resources such as personnel and equipment more effectively in quality management.

In addition, to compare the method of this paper with other methods, we adopt the clustering method and the FP-growth algorithm in RapidMiner Studio to extract the failure modes from the text. RapidMiner is a drag-and-drop graphical tool for machine learning, data mining, text mining, predictive analysis, and business analysis. It embeds algorithms such as K-means, a classical clustering algorithm, and FP-growth, a classical algorithm for extracting frequent itemsets. Other text mining tools or programming languages can also be used for these tasks.

After the pretreatment process, we use the K-means algorithm for clustering, with the squared Euclidean distance, i.e., the sum of squared differences over all attributes, as the measure of distance between samples.

K-means clustering requires a prior determination of the number of clusters $K_{max}$. There is no clear theoretical guidance on how to determine $K_{max}$. Most scholars use the empirical rule $K_{max} \le \sqrt{n}$, where $n$ is the number of data objects (Rezaee et al. 1998; Limwattanapibool and Arch-Int 2017). Therefore, according to this rule and considering a total of 495 problem records, the predetermined number of categories does not exceed 22, which means $K = 22$.

Table 4 shows the clustering results for all quality problem text records. The descriptive terminologies in the table are representative terms for a cluster selected by RapidMiner according to the order of TF-IDF of these terms; the absolute count is the number of documents in the cluster; the coverage is the number of documents in the cluster divided by the total number of documents in the collection. On review, some clusters represent the same type of failure mode even though they were divided into different clusters. For example, clusters 2, 8, 17, and 18 are all noise-type failure modes. Therefore, we combine these four clusters into one cluster and define it as the “noise” failure mode type. Finally, we merge the 22 clusters into 18 failure modes and sum the counts of the merged clusters to obtain the count of each failure mode.

Furthermore, this paper also uses the FP-growth algorithm in RapidMiner to extract the frequent failure mode set. To eliminate the infrequent itemsets, the minimum support was set to 0.02, where the support of an itemset = (number of occurrences of the itemset)/(size of the example set). After manual pruning, RapidMiner extracted a total of 495 failure modes in 71 categories.
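The numerical settings above follow from simple arithmetic, which the short sketch below reproduces. The variable and function names are hypothetical; only the figures (495 records, minimum support 0.02, and the cluster sizes of Table 4) come from the text.

```python
import math

n_records = 495

# Empirical rule K_max <= sqrt(n): the largest integer K with K * K <= n
k_max = math.isqrt(n_records)

# Minimum support of 0.02 expressed as a minimum number of occurrences
min_support = 0.02
min_count = math.ceil(min_support * n_records)

def coverage(cluster_size, total=n_records):
    """Coverage (%) of a cluster: documents in the cluster / all documents."""
    return round(100 * cluster_size / total, 1)

cov_cluster_2 = coverage(56)   # cluster 2 in Table 4
cov_cluster_8 = coverage(90)   # cluster 8 in Table 4
```

With 495 records the rule gives at most 22 clusters, and a minimum support of 0.02 means that an itemset must occur at least 10 times to be kept.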

Table 4  Failure mode extraction results with the K-means clustering method

Cluster | Absolute count | Coverage (%) | Descriptive terminologies | Failure mode type | Failure mode count | New cluster number
Cluster 0 | 8 | 1.6 | Malfunction, without, omit | Malfunction | 8 | C_0
Cluster 1 | 20 | 4.0 | Loose, screw, easy, trim | Loose | 20 | C_1
Cluster 2 | 56 | 11.3 | Noise | Noise | 160 | C_2
Cluster 8 | 90 | 18.2 | Noise, squeak, drive | | |
Cluster 17 | 11 | 2.2 | Rattle, noise, biw | | |
Cluster 18 | 3 | 0.6 | Guide, damage, noise | | |
Cluster 3 | 24 | 4.8 | Gap, misalign, taper | Gap | 39 | C_3
Cluster 10 | 15 | 3.0 | And, between, gap, seal | | |
Cluster 4 | 14 | 2.8 | Not, accept, smell | Smell | 14 | C_4
Cluster 5 | 13 | 2.6 | Wrong, decor, direct | Wrong | 13 | C_5
Cluster 6 | 43 | 8.7 | Scratch, damage, stuck, touch | Scratch | 43 | C_6
Cluster 7 | 12 | 2.4 | Cannot, open, adjust | Cannot open | 12 | C_7
Cluster 9 | 8 | 1.6 | Hole, outer, close | Hole | 8 | C_8
Cluster 11 | 51 | 10.3 | Wavy | Wavy | 51 | C_9
Cluster 12 | 15 | 3.0 | Material, defect | Material defect | 15 | C_10
Cluster 13 | 22 | 4.4 | Adjust, function, noise, defect | Adjust defect | 22 | C_11
Cluster 14 | 12 | 2.4 | Wrinkle, area, edge | Wrinkle | 12 | C_12
Cluster 15 | 13 | 2.6 | Fall, off, mechanism | Fall off | 13 | C_13
Cluster 16 | 10 | 2.0 | Miss, label, inform | Missing | 10 | C_14
Cluster 19 | 12 | 2.4 | Nok, fit, corner | Fit nok | 12 | C_15
Cluster 20 | 11 | 2.2 | Offset, between, and, sit | Offset | 11 | C_16
Cluster 21 | 32 | 6.5 | Function, defect, heat | Function defect | 32 | C_17


Table 5  Comparison of failure modes extracted by three methods

 | FP-growth | K-means clustering | Our method
Failure mode categories | 71 | 18 | 57

Table 5 shows the categories and number of failure modes extracted by these three methods. As we can see, our method extracts 57 failure mode categories, more than the clustering method but fewer than the FP-growth method. Among the frequent itemsets extracted by FP-growth, some describe the same failure mode with different words. For example, FP-growth extracts “noise” with a support count of 109 and “rattle” with a support count of 11, which should be the same failure mode according to the domain experts. Another example is “wavy” with a support count of 55 and “wrinkle” with a support count of 12, both of which refer to waviness of the seat surface. The 71 failure mode categories extracted by FP-growth thus contain synonyms that are not combined and standardized. With such a failure mode set, users will be confused when they select words to describe failure modes. In contrast, our method not only extracts frequent itemsets of failure modes but also builds up the synonym sets of failure modes based on WordNet. At the same time, because the problem titles are short texts, the matrix represented by the vector space model is sparse. Thus it is not easy for clustering algorithms to place similar texts in the same group, and sometimes different texts are clustered into one group. As shown in Table 4, “scratch,” “damage,” “stuck,” and “touch” are clustered in one group by the clustering algorithm, although they should be different failure modes. At the same time, “noise” is distributed across clusters 2, 8, 17, and 18. The clustering result is therefore not satisfactory. On the whole, our method can obtain a variety of high-quality failure mode sets.

The failure modes with support count greater than ten are selected and presented in Fig. 3 due to the limited space. The number of failure modes extracted by these three methods is compared. As shown in Fig. 3, for most failure modes the count obtained by our method is the highest. However, for some failure modes such as “noise,” the clustering method obtains the largest number, mainly because the clustering result is not particularly accurate.

Fig. 3  Comparison of the number of failure modes obtained by the three methods

Component-failure mode matrix construction result

In this section, we use the CFMM algorithm in “Component-failure mode matrix mining algorithm” section to mine the association matrix between the seat components and the failure modes from the set of quality problem titles. The algorithm effectively identifies 110 component categories for a total of 495 seat components from all the quality problems. The identified useful failure modes fall into 57 categories. Similarly, the seat components that appear more than twice are presented in Fig. 4.

As shown in Fig. 4, “seat backrest,” “seat,” “seat belt,” “seat headrest,” and “seat cover” are the five most frequently identified seat components. In the actual situation, the component described by “seat” refers to the complete seat in the module structure. However, it is possible that some employees did not specify specific components when entering data.

As mentioned earlier, the number of component categories extracted is 110, and the number of failure mode categories is 57. That is, the CF matrix is a 110 × 57 matrix. In this matrix, there are 266 nonzero items (CF combinations), and the total number of all failure modes that have occurred for all components is 495. Because the full matrix obtained by the CFMM algorithm is too large to be presented, we use Table 6 as an example to illustrate the CF matrix.

From the table, we can see that the matrix is a three-by-four matrix. In this matrix, there are five nonzero items, which means that five distinct CF combinations have occurred. They are “seat belt broken,” “seat belt noise,” “seat belt stuck,” “seat backrest wavy” and “cup holder noise.” As shown in the table, the total number of failure mode occurrences for all components is eleven.

This paper also uses the FP-growth algorithm in RapidMiner to extract frequent itemsets of components and failure modes. The minimum support is set to 0.02 to eliminate itemsets that do not occur frequently. After manual pruning, the FP-growth algorithm in RapidMiner extracts 50 categories of CF combinations. The sum of the support counts of all combinations is 188. Among these 50 categories of CF combinations, there are 14 component categories and 27 failure mode categories. According to this result, a CF matrix of size 14 × 27 can be constructed. Table 7

Fig. 4  Components identified by the CFMM algorithm
Table 6  An example of a CF matrix

Components: 3; Failure modes: 4; Problem sources: 11 | Wavy | Broken | Noise | Stuck
Seat belt | 0 | 1 | 2 | 1
Seat backrest | 3 | 0 | 0 | 0
Cup holder | 0 | 0 | 4 | 0

Table 7  Comparison of CF matrix identified by CFMM and FP-growth algorithm

 | CF matrix dimension | CF combination categories | CF combination quantity
CFMM | 110 × 57 | 266 | 495
FP-growth | 14 × 27 | 50 | 188

presents the differences between the CF combinations mined by the CFMM method and the FP-growth algorithm.

Furthermore, we compare the number of each CF combination extracted by the CFMM algorithm and the FP-growth algorithm. Because the CF matrix obtained by CFMM is too large, this paper cannot present all CF combinations. Therefore, the CF combinations that occur more than three times are presented in Fig. 5.

As shown in Fig. 5, the number of each CF combination obtained by the CFMM algorithm is higher than the corresponding number obtained by the FP-growth algorithm, which shows that the proposed CFMM algorithm can extract the CF matrix more effectively.

To visualize the relationship between the seat components and the failure modes, we use the data analysis platform Gephi to present the CF matrix in the form of a network diagram. For a clearer presentation, we delete the isolated components or failure modes in the matrix that are not associated with other points. As shown in Fig. 6, the components and the failure mode nodes are mixed to form a complex connection network. The larger the node and its number, the more times the component or failure mode occurs. In this network diagram, two directly connected nodes must be a component and a failure mode. Two failure modes or two components cannot be connected by an edge. The edge between two nodes indicates the number of times the component experienced the failure mode. The thicker the edge, the more times the component has experienced the failure mode.

Through the above network diagram, quality management personnel can gain a more macroscopic and direct understanding of the failure modes that appear in the components. This is very important for supplementing the DFMEA at the enterprise level. In addition, quality management personnel can further drill down according to Fig. 6 to analyze the composition of the failure modes of each component and can therefore allocate more quality management-related resources to the key components and failure modes. According to the network diagram, this paper takes the seat backrest, the component with the most occurrences, as an example for further analysis. The drill-down analysis result is shown in Fig. 7. The three most common failure modes on the seat backrest are “wavy,” “noise” and “gap.” The implication for this case is that quality management personnel should focus their attention on these three categories of failure modes on the seat backrest.
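To illustrate how a CF matrix maps onto such a network and onto a drill-down view, the following sketch converts the small example matrix of Table 6 into a weighted edge list and ranks the failure modes of one component. It is a sketch under assumptions, not the authors' workflow: the function names are hypothetical, and the `Source`, `Target`, `Weight` column headers are the ones commonly used for edge-list CSV files imported into Gephi.

```python
import csv
import io
import numpy as np

# Example CF matrix from Table 6
components = ["seat belt", "seat backrest", "cup holder"]
failure_modes = ["wavy", "broken", "noise", "stuck"]
CF = np.array([[0, 1, 2, 1],
               [3, 0, 0, 0],
               [0, 0, 4, 0]])

def edge_list(CF, components, failure_modes):
    """One weighted edge per nonzero CF entry (component -> failure mode)."""
    return [(components[i], failure_modes[j], int(CF[i, j]))
            for i, j in zip(*np.nonzero(CF))]

def drill_down(CF, components, failure_modes, component):
    """Failure modes of one component, most frequent first."""
    row = CF[components.index(component)]
    pairs = [(failure_modes[j], int(row[j])) for j in np.flatnonzero(row)]
    return sorted(pairs, key=lambda pair: -pair[1])

edges = edge_list(CF, components, failure_modes)

# Write the edge list as CSV text (an in-memory buffer is used here instead of a file)
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Source", "Target", "Weight"])
writer.writerows(edges)

seat_belt_modes = drill_down(CF, components, failure_modes, "seat belt")
```

Each edge connects a component to a failure mode, so the resulting graph is bipartite, matching the description that two components or two failure modes are never directly connected. The edge weights correspond to the edge thickness in the network diagram.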

Fig. 5  CF combination extraction results comparison

Fig. 6  Network map between seat components and failure modes

Fig. 7  Seat backrest failure mode drill-down analysis
Conclusions and future work

In this study, we developed a novel systematic approach for effectively establishing the design failure mode and effects analysis (DFMEA) based on the standardization of failure modes and CF matrix mining, to overcome some significant shortcomings of existing methods. Different departments use different vocabulary when describing the same failure mode, so building a standard failure mode vocabulary can improve communication between departments and improve people’s understanding of failure modes. However, manually building a standard failure mode vocabulary is a time-consuming and labor-intensive process. At the same time, the CF matrix is an essential source of knowledge for DFMEA, while manually extracting the CF matrix from large amounts of different documents is also not an easy task.

DFMEA is a fundamental tool for improving quality and enhancing the reliability of products. It is commonly used in product design and development to take full account of the problems involved in the production, delivery, and use of products, to bring all possible problems into the scope of prevention, and to prepare preventive means in advance. Creating a DFMEA first requires knowing which failure modes have occurred in the product components. The component-failure mode matrix is an important source of knowledge in this process. On the one hand, the designers who create the DFMEA are far away from the production process and lack understanding of the product quality problems that may occur in it, and the product quality problem data scattered across the production process form an isolated information island that is difficult for designers to use. On the other hand, the employees who record product quality problems often follow their own habits when describing the same problem. Different words are used to describe the same failure mode, which results in ambiguity in the designer’s perception of the failure mode. To solve these problems, this paper proposes a method to mine failure modes from quality data recorded in a large number of production processes and to construct a standard failure mode library as a common language for problem description between different departments. On this basis, we use a large amount of product quality data to construct the component-failure mode matrix automatically and use it as a knowledge reference for designers creating a DFMEA.

In response to these problems, this paper first extracted a list of frequent failure modes from problem-solving data with the Apriori algorithm, then found the synonymous failure modes in the list based on WordNet, and then built the standard failure mode vocabulary. Based upon the standard failure mode vocabulary and the existing component set, the quality problem title with implied failure mode information was used as a link, and the CFMM algorithm was applied to construct the CF matrix automatically. This paper adopted a car company’s seat module as an example to analyze the results of the standard failure modes and the effect of the CFMM algorithm. The results showed that the failure mode extraction method with standardized features could extract the failure modes better than the FP-growth and K-means clustering methods. At the same time, the CFMM algorithm could extract more CF combinations and build a richer CF matrix than the FP-growth method. Although each industry has different domain characteristics, the method in this paper is applicable not only to the manufacturing industry but also to other fields that need to use FMEA to ensure product and system reliability.

Our theoretical contribution is largely reflected in the innovative component-failure mode matrix mining (CFMM) algorithm used in the DFMEA construction process. The proposed method has several advantages over the existing

13
Journal of Intelligent Manufacturing

methods: (1) In the construction of standard failure modes, the FP-growth approach does not standardize failure modes, so many essentially identical FMs are counted as different FMs, which results in larger errors. The K-means clustering method, in turn, uses rough fields in FM recognition and extraction, so many of the FMs that should have been included are eventually omitted and accuracy is poor. (2) For developing the CFMM, the FP-growth algorithm only seeks the correlation between components and the FMs with significant frequency in frequent itemsets, so its coverage of FMs is relatively narrow, whereas the CFMM algorithm in this paper covers both significant and insignificant FMs in frequent itemsets and constructs the correlation matrix between standard failure modes and components more completely and with high accuracy.
This paper mined quality problem titles recorded in text form, which serve as a concise representation that provides important information about failure modes. However, other text records, such as quality problem descriptions, also contain partial failure mode information, and a single description may contain multiple failure modes. Identifying failure modes from longer texts and building a CF matrix from them will continue to be studied in the future. Due to space limitations, this paper only studied the relationship between components and failure modes. However, the text of quality problems may also imply the causal relationship between a failure mode and its cause, and this would be another important source of knowledge for FMEA. Therefore, the relationship between failure mode and cause may also be one of the problems studied in future research.

Acknowledgements This research has received funding from the National Natural Science Foundation of China (NSFC) (Grant No. 71871041). The authors would like to thank the Institute of Systems Engineering of the Dalian University of Technology and BMW Brilliance Automotive Ltd. for supporting the research. Special thanks to Dr. Longfei He from Tianjin University for his important participation and substantive contribution to this research by providing valuable ideas and constructive suggestions and revising the whole paper.

References

Arunajadai, S. G., Stone, R. B., Tumer, I. Y., & Clancy, D. (2002). A Framework for creating a function-based design tool for failure mode identification. Paper presented at the ASME 2002 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, Montreal, Canada.
Arunajadai, S. G., Uder, S. J., Stone, R. B., & Tumer, I. Y. (2004). Failure mode identification through clustering analysis. Quality and Reliability Engineering International, 20(5), 511–526. https://doi.org/10.1002/qre.663.
Asan, U., & Soyer, A. (2016). Failure mode and effects analysis under uncertainty: A literature review and tutorial. In C. Kahraman & S. Yanik (Eds.), Intelligent decision making in quality management: Theory and applications (pp. 265–326). Switzerland: Springer.
Baroni, M., & Bisi, S. Using cooccurrence statistics and the web to discover synonyms in a technical language. In Proceedings of the Fourth International Conference on Language Resources and Evaluation, Lisbon, Portugal, 2004 (pp. 1725–1728): European Language Resources Association.
Beksinska, M., Joanis, C., Manning, J., Smit, J., Callahan, M., Deperthes, B., et al. (2007). Standardized definitions of failure modes for female condoms. Contraception, 75(4), 251–255. https://doi.org/10.1016/j.contraception.2006.10.003.
Blondel, V. D., & Senellart, P. P. Automatic extraction of synonyms in a dictionary. In Proceedings of the Society for Industrial and Applied Mathematics Workshop on Text Mining, Arlington, Virginia, USA, 2002 (pp. 7–13).
Brook, Q. S. (2006). Six Sigma and Minitab: A complete toolbox guide for all Six Sigma practitioners (2nd ed.). London: QSB Consulting.
Chang, W., Kai, M. T., & Lim, C. P. (2015). Clustering and visualization of failure modes using an evolving tree. Expert Systems with Applications, 42(20), 7235–7244.
Chen, L., & Nayak, R. (2007). A case study of failure mode analysis with text mining methods. Paper presented at the International Workshop on Integrating Artificial Intelligence & Data Mining, Darlinghurst, Australia.
Cheng, T., Lauw, H. W., & Paparizos, S. (2012). Entity synonyms for structured web search. IEEE Transactions on Knowledge and Data Engineering, 24(10), 1862–1875.
Collins, J. A., Hagan, B. T., & Bratt, H. M. (1976). The failure-experience matrix: A useful design tool. Journal of Engineering for Industry, 98(3), 1074–1079.
Deore, R. Y. (2015). An ontology text mining to conversion of unstructured to structure text in D-Matrix. Indian Journal of Scientific Research, 6(1), 47–48.
Fellbaum, C. (2000). WordNet: An electronic lexical database. Language, 76(3), 706–708.
Goel, A., & Graves, R. J. (2007). Using failure mode effect analysis to increase electronic systems reliability. Paper presented at the 30th International Spring Seminar on Electronics Technology (ISSE), Cluj-Napoca, Romania.
Han, J., Pei, J., & Kamber, M. (2011). Data mining: Concepts and techniques. Amsterdam: Elsevier.
Hearst, M. A. (1999). Untangling text data mining. Proceedings of ACL-99, 3–10.
ISO/IEC-15026-1 (2013). Systems and software engineering-Systems and software assurance-Part 1: Concepts and vocabulary. Switzerland: ISO/IEC JTC 1/SC 7 Software and systems engineering subcommittee.
James, A. T., Gandhi, O. P., & Deshmukh, S. G. (2017). Knowledge management of automobile system failures through development of failure knowledge ontology from maintenance experience. Journal of Advances in Management Research, 14(1), 1–21.
Jenifa, J., & Balachander, T. (2015). Survey on fault dependency matrix construction using ontology based text mining. International Journal of Advanced and Innovative Research, 4(3), 347–350.
Kai, M. T., Jong, C. H., & Lim, C. P. (2015). A clustering-based failure mode and effect analysis model and its application to the edible bird nest industry. Neural Computing and Applications, 26(3), 551–560.
Khilwani, N., & Harding, J. A. (2016). Managing corporate memory on the semantic web. Journal of Intelligent Manufacturing, 27(1), 101–118.
Kim, S., & Yoon, J. (2015). Link-topic model for biomedical abbreviation disambiguation. Journal of Biomedical Informatics, 53, 367–380.
Limwattanapibool, O., & Arch-Int, S. (2017). Determination of the appropriate parameters for K-means clustering using selection of region clusters based on density DBSCAN (SRCD-DBSCAN). Expert Systems, 34(3), 1–11.
Liu, H., Chen, Y., You, J., & Li, H. (2016). Risk evaluation in failure mode and effects analysis using fuzzy digraph and matrix approach. Journal of Intelligent Manufacturing, 27(4), 805–816.
Liu, L., Fan, D., Wang, Z., Yang, D., Cui, J., Ma, X., et al. (2017). Enhanced GO methodology to support failure mode, effects and criticality analysis. Journal of Intelligent Manufacturing, 2017, 1–18. https://link.springer.com/article/10.1007%2Fs10845-017-1336-0.
Mangnoesing, G., Bunningen, A. V., Hogenboom, A., Hogenboom, F., & Frasincar, F. (2012). An Empirical Study for Determining Relevant Features for Sentiment Summarization of Online Conversational Documents. Paper presented at the International Conference on Web Information Systems Engineering, Berlin.
Mendhe, S. P., & Hande, K. N. (2017). Graphical analysis on text mining unstructured data using D-matrix. International Journal on Recent and Innovation Trends in Computing and Communication, 5(6), 1412–1416.
Meng, L., Wang, H., Xue, Y., & Ma, L. (2017). Research on automatic generation of software failure modes. Journal of Frontiers of Computer Science and Technology, 12(12), 1–11.
Miao, J., Zhang, Q., & Zhao, J. (2008). Chinese automatic text categorization based on article title information. Computer Engineering, 34(20), 13–15.
Muller, P., Hathout, N., & Gaume, B. Synonym extraction using a semantic distance on a dictionary. In Proceedings of the first workshop on graph based methods for natural language processing, 2006 (pp. 65–72). Association for Computational Linguistics.
Pfeifer, T. (2002). Quality management: Strategies, methods, techniques. München: Hanser Verlag.
Rajpathak, D. G., & De, S. (2016). A data- and ontology-driven text mining-based construction of reliability model to analyze and predict component failures. Knowledge and Information Systems, 46(1), 87–113.
Rajpathak, D. G., & Singh, S. (2014). An ontology-based text mining method to develop D-matrix from unstructured text. IEEE Transactions on Systems Man & Cybernetics Systems, 44(7), 966–977.
Rezaee, M. R., Lelieveldt, B. P. F., & Reiber, J. H. C. (1998). A new cluster validity index for the fuzzy c-mean. Pattern Recognition Letters, 19(3–4), 237–246.
Roberts, R. A., Tumer, I. Y., Stone, R. B., & Brown, A. F. (2003). A Function-Based Exploration of JPL's Problem/Failure Reporting Database. Paper presented at the ASME International Mechanical Engineering Congress and Exposition.
Schneider, H. (2003). Failure mode and effect analysis: FMEA from theory to execution. Technometrics, 38(1), 80.
Singh, S., Holland, S. W., & Bandyopadhyay, P. (2010). Trends in the development of system-level fault Dependency matrices. Paper presented at the Aerospace Conference, Big Sky, MT, USA.
Stamatis, D. H. (1995). Failure mode and effect analysis: FMEA from theory to execution. Milwaukee: ASQ Quality Press.
Thombare, T. R., & Dole, L. (2015). D-matrix: Fault diagnosis framework. International Journal of Innovative Research in Computer and Communication Engineering, 3(3), 1740–1745.
Tumer, I. Y., Stone, R. B., & Bell, D. G. Requirements for a failure mode taxonomy for use in conceptual design. In DS 31: Proceedings of ICED 03, the 14th International Conference on Engineering Design, Stockholm, 2003.
Tumer, I. Y., & Stone, R. B. (2003). Mapping function to failure mode during component development. Research in Engineering Design, 14(1), 25–33.
Wang, T., & Hirst, G. (2012). Exploring patterns in dictionary definitions for synonym extraction. Natural Language Engineering, 18(3), 313–342. https://doi.org/10.1017/S1351324911000210.
Wang, M., Wang, B., & Tang, X. (2011). Optimal component subset selection method based on component-failure knowledge. Computer Integrated Manufacturing Systems, 17(2), 267–272.
Wani, M. F., & Jan, M. (2006). Failure Mode Analysis of Mechanical Systems at Conceptual Design Stage. Paper presented at the ASME 8th Biennial Conference on Engineering Systems Design and Analysis, Torino, Italy.
Wijayasekara, D., Manic, M., & Mcqueen, M. Vulnerability identification and classification via text mining bug databases. In IECON 2014 - 40th Annual Conference of the IEEE Industrial Electronics Society, Dallas, TX, USA, 2014 (pp. 3612–3618). IEEE.
Wu, C., Chen, T., Jiang, R., Ning, L., & Jiang, Z. (2017). A novel approach to wavelet selection and tree kernel construction for diagnosis of rolling element bearing fault. Journal of Intelligent Manufacturing, 28(8), 1847–1858.
Wu, Y., Denny, J. C., Rosenbloom, S. T., Miller, R. A., Giuse, D. A., Song, M., et al. (2015). A preliminary study of clinical abbreviation disambiguation in real time. Applied Clinical Informatics, 6(2), 364–374. https://doi.org/10.4338/Aci-2014-10-Ra-0088.
Xu, Z., Dang, Y., & Munro, P. (2018). Knowledge-driven intelligent quality problem solving system in the automotive industry. Advanced Engineering Informatics, 38, 441–457.
Yates, A., & Etzioni, O. (2009). Unsupervised methods for determining object and relation synonyms on the web. Journal of Artificial Intelligence Research, 34, 255–296.

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
