Landslide Susceptibility Prediction Using Sparse Feature Extraction and Machine Learning Models Based On GIS and Remote Sensing
Abstract— Landslide susceptibility prediction (LSP) is a useful technology for landslide prevention. Due to the complex nonlinear correlations among environmental factors, traditional machine learning (ML) models have unsatisfactory LSP accuracies. In this letter, a sparse feature extraction network (SFE+) is proposed for LSP. First, the landslides and environmental factors are collected, and the frequency ratios of the environmental factors are calculated as the model inputs. Second, the input data are passed through the input layer with dropout, and the resulting features are then passed through the hidden layers, that is, the k% lifetime sparsity layers. The hidden layers further sparsify these factors to obtain prediction features that are as independent and nonredundant as possible. Finally, classifiers are used to carry out the LSP in the study area. SFE-support vector machine (SVM), SFE-logistic regression (LR), and SFE-stochastic gradient descent (SGD) models are built. For comparison, principal component analysis (PCA)-SVM, PCA-LR, PCA-SGD, SVM, LR, and SGD models are also built for LSP in Shicheng County, China. Results show that the SFE-based ML models, especially the SFE-SVM, can effectively extract the sparse nonlinear features of environmental factors to improve LSP accuracies and have promising prospects for LSP.

Index Terms— Geographic information system (GIS), landslide susceptibility prediction (LSP), neural network, remote sensing (RS), sparse feature extraction (SFE).

I. INTRODUCTION

LANDSLIDES occur widely around the world. Generally, accurately locating landslide sites is a difficult task [1], [2]. Landslide susceptibility prediction (LSP) plays an important role in accurately locating potential landslides. Therefore, it is necessary to carry out in-depth research on LSP [3].

Recently, based on remote sensing (RS) and geographic information systems (GIS), data-driven LSP models have been developed extensively. They can be divided into heuristic [4], mathematical–statistical [5], and machine learning (ML) models [6]. Among them, heuristic and mathematical–statistical models, such as the analytical hierarchy process and the statistical index, have been widely employed in LSP. These models can predict the level of landslide susceptibility to a certain extent. However, the accuracy of susceptibility results calculated by simple linear statistical methods is low, and such methods have difficulty reflecting the nonlinear coupling effect of environmental factors on landslide susceptibility [7].

With the rapid development of ML, various ML models have been successfully used in LSP, including logistic regression (LR), support vector machine (SVM), and so on [8]. Numerous studies show that, compared with heuristic and mathematical–statistical models, ML models have a higher LSP accuracy [9].

However, when constructing an LSP model, the problems of feature learning and optimization of input data by ML models urgently need to be solved. Specifically: 1) ML models require a lot of prior knowledge in the feature learning process and cannot automatically extract features from data [10] and 2) the abovementioned ML methods cannot extract more representative features from huge input data [11]. To address these problems, a new algorithm, namely, the sparse feature extraction network (SFE+), is proposed to extract features from the input data. Then, the SFE-based LSP model is built. Generally, SFE+ is a simple and efficient unsupervised feature extraction method, which improves the discriminative power of features by extracting superior sparse features and thereby improves the LSP performance. SFE has also been widely used in other fields, such as image classification, face recognition, and radar signal recognition [12]. However, the SFE+ has not been developed for LSP; hence, it is worthwhile to introduce this algorithm into LSP.

To sum up, this study constructs the SFE-SVM, SFE-LR, and SFE-stochastic gradient descent (SGD) models to implement LSP. Meanwhile, the single SVM, LR, and SGD models without feature extraction are used for comparison. Furthermore, the principal component analysis (PCA) algorithm is introduced to build the PCA-SVM, PCA-LR, and PCA-SGD models for comparison. Shicheng County, China, is used as the study area, and its landslide susceptibility

Manuscript received October 15, 2020; revised January 4, 2021; accepted January 18, 2021. Date of publication February 8, 2021; date of current version December 16, 2021. This work was supported in part by the National Natural Science Foundation of China under Grant 41807285, in part by the Natural Science Foundation for Outstanding Young Scholars of Jiangxi Province under Grant 2018ACB21038, in part by the Natural Science Foundation of Jiangxi Province of China under Grant 20192BAB216034, in part by the Postdoctoral Science Foundation of China under Grant 2019M652287, and in part by the Jiangxi Provincial Postdoctoral Science Foundation under Grant 2019KY08. (Corresponding author: Faming Huang.)
Li Zhu, Gongjian Wang, and Yan Li are with the School of Information Engineering, Nanchang University, Nanchang 330031, China (e-mail: [email protected]; [email protected]; [email protected]).
Faming Huang is with the School of Civil Engineering and Architecture, Nanchang University, Nanchang 330031, China (e-mail: [email protected]).
Wei Chen is with the College of Geology and Environment, Xi’an University of Science and Technology, Xi’an 710126, China (e-mail: [email protected]).
Haoyuan Hong is with the Department of Geography and Regional Research, University of Vienna, 211500 Vienna, Austria (e-mail: [email protected]).
Digital Object Identifier 10.1109/LGRS.2021.3054029
1558-0571 © 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.
Authorized licensed use limited to: University of Hyderabad IG Memorial Library. Downloaded on September 13,2022 at 04:36:42 UTC from IEEE Xplore. Restrictions apply.
3001505 IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, VOL. 19, 2022
ZHU et al.: LSP USING SFE AND ML MODELS BASED ON GIS AND RS 3001505
Fig. 4. Prediction rate curves of each model. (a) SFE-SVM, single SVM, and PCA-SVM. (b) SFE-LR, single LR, and PCA-LR. (c) SFE-SGD, single SGD, and PCA-SGD.
TABLE I
LANDSLIDE PREDICTION PERFORMANCE

the model output. First, the 369 landslides were converted into 2709 landslide grid units by the ArcGIS software, and then, the 2709 landslide grid units in the data set were randomly divided into a training set (70%) and a testing set (30%). Meanwhile, the same number of nonlandslide grid units were randomly selected from the landslide-free area and were also divided into training and testing sets according to the above ratio. The landslide and nonlandslide grid units were assigned the labels “1” and “0,” respectively.
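The sampling and splitting procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `build_balanced_split`, the use of integer grid-unit IDs, the fixed random seed, and the rounding of the 70% boundary are all hypothetical choices that the letter does not specify.

```python
import numpy as np


def build_balanced_split(landslide_ids, nonlandslide_pool, train_frac=0.7, seed=0):
    """Build balanced, labeled training/testing sets of grid-unit IDs.

    landslide_ids: IDs of landslide grid units (label 1).
    nonlandslide_pool: IDs of grid units in the landslide-free area.
    Hypothetical helper; names and rounding rule are assumptions.
    """
    rng = np.random.default_rng(seed)
    landslide_ids = np.asarray(landslide_ids)

    # Randomly draw as many nonlandslide units as there are landslide units.
    nonlandslide_ids = rng.choice(
        np.asarray(nonlandslide_pool), size=landslide_ids.size, replace=False
    )

    def split(ids):
        # Shuffle, then cut at the training fraction (rounded to whole units).
        shuffled = rng.permutation(ids)
        n_train = int(round(train_frac * shuffled.size))
        return shuffled[:n_train], shuffled[n_train:]

    pos_train, pos_test = split(landslide_ids)
    neg_train, neg_test = split(nonlandslide_ids)

    def labeled(pos, neg):
        # Landslide units get label 1, nonlandslide units get label 0.
        ids = np.concatenate([pos, neg])
        labels = np.concatenate(
            [np.ones(pos.size, dtype=int), np.zeros(neg.size, dtype=int)]
        )
        return ids, labels

    return labeled(pos_train, neg_train), labeled(pos_test, neg_test)


# Example with 2709 landslide units and a made-up pool of landslide-free units.
(train_ids, train_y), (test_ids, test_y) = build_balanced_split(
    np.arange(2709), np.arange(10_000, 60_000)
)
print(train_ids.size, test_ids.size)
```

Under this rounding rule, 2709 landslide grid units yield 1896 training and 813 testing units per class. Each class is split separately so that both the training and testing sets stay balanced between the labels 1 and 0.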