Energy Conversion and Management 165 (2018) 681–695


Wind speed forecasting using nonlinear-learning ensemble of deep learning time series prediction and extremal optimization

Jie Chen a, Guo-Qiang Zeng b, Wuneng Zhou a,⁎, Wei Du a, Kang-Di Lu a

a School of Information Sciences and Technology, Donghua University, Shanghai 200051, China
b National-Local Joint Engineering Laboratory of Digitalize Electrical Design Technology, Wenzhou University, Wenzhou 325035, China

ARTICLE INFO

Keywords:
Wind speed forecasting
Deep learning
Time series prediction
LSTMs (Long Short Term Memory neural networks)
Ensemble learning
Extremal optimization

ABSTRACT

As an essential issue in the wind energy industry, wind speed forecasting plays a vital role in optimal scheduling and control of wind energy generation and conversion. In this paper, a novel method called EnsemLSTM is proposed, using nonlinear-learning ensemble of deep learning time series prediction based on LSTMs (Long Short Term Memory neural networks), SVRM (support vector regression machine) and EO (extremal optimization algorithm). First, in order to avert the drawbacks of weak generalization capability and robustness of a single deep learning approach when facing diversiform data, a cluster of LSTMs with diverse hidden layers and neurons is employed to explore and exploit the implicit information of wind speed time series. Then the predictions of the LSTMs are aggregated into a nonlinear-learning regression top-layer composed of SVRM, and EO is introduced to optimize the parameters of the top-layer. Lastly, the final ensemble prediction for wind speed is given by the fine-tuned top-layer. The proposed EnsemLSTM is applied to two case studies of data collected from a wind farm in Inner Mongolia, China, to perform ten-minute ahead utmost short term wind speed forecasting and one-hour ahead short term wind speed forecasting. Statistical tests of experimental results compared with other popular prediction models demonstrate that the proposed EnsemLSTM can achieve a better forecasting performance.

1. Introduction

As a promising and practical solution to cut greenhouse gas emissions and build a renewable society, wind energy is becoming more and more popular in various countries. The Global Wind Report, released by the Global Wind Energy Council (GWEC) in 2017, stated that the 2016 world wind power market was more than 54.6 GW, bringing the total global installed capacity to nearly 487 GW, still led by China, the US, Germany and India [1]. The capacity of wind energy will continue to grow vastly in the coming years. However, it can be a difficult task to perform reliable and seasonable wind power management in electrical power systems due to the naturally irregular characteristic of wind speed. Unstable and uncontrollable wind speed heavily influences the generation of wind power, and this subsequently impacts wind turbine control, power system and micro-grid scheduling, power quality and the balance of supply and load demand [2,3]. So, dependable and accurate wind speed forecasting can not only provide a security basis for wind energy generation and conversion, but also reduce the costs of power system operation.

The existing wind speed forecasting approaches can be classified into three groups: physical models, statistical models and artificial intelligence models. Physical models are plain methods, which take advantage of physical information like atmospheric pressure, temperature, obstacles and roughness [4]. Thereinto, NWP (numerical weather prediction) models employ a set of mathematical equations based on physical information to forecast. Moreover, a range of statistical models have been researched to perform wind speed forecasting in recent decades. The widely used statistical models include autoregressive models (AR), moving average models (MA), autoregressive moving average models (ARMA), autoregressive integrated moving average models (ARIMA) and seasonal autoregressive integrated moving average models (SARIMA). Liu [5] proposed a novel method based on recursive ARIMA and EMD (empirical mode decomposition) to perform short term wind speed forecasting for a railway strong wind warning system. Kavasseri et al. [6] developed a fractional-ARIMA to predict one-day and two-day ahead wind speed in North Dakota. On the other hand, with the rapid development of soft-computing technologies, artificial intelligence models have been proposed successfully for time series prediction. Among them, ANNs (artificial neural networks) such as back propagation neural networks [7], multi-layer perceptron neural networks [8], radial basis function neural networks [9], Bayesian neural networks [10] and extreme learning machines [11] have been applied to


⁎ Corresponding author.
E-mail address: zhouwuneng@163.com (W. Zhou).

https://doi.org/10.1016/j.enconman.2018.03.098
Received 26 November 2017; Received in revised form 21 March 2018; Accepted 31 March 2018
0196-8904/ © 2018 Elsevier Ltd. All rights reserved.

wind speed forecasting. Chang et al. [12] provided an improved neural network based approach with error feedback to predict short term wind speed and power. Noorollahi et al. [13] used ANN models to perform temporal and spatial wind speed forecasting in Iran with success. In [14], Ma et al. proposed a generalized dynamic fuzzy neural network optimized by BSO (brain storm optimization) to forecast short-term wind speed. Another popular group is SVM (support vector machine), with high generalization ability. Jiang et al. [15] presented a hybrid short term wind speed forecasting model using v-SVM optimized by the cuckoo search algorithm. Chen et al. [16] developed a state-space based SVM with unscented Kalman filter for wind speed prediction. Additionally, to improve the forecasting performance of a single model, combination or hybrid models have been investigated recently [17,18]. In combined methods, different individual models are used to predict and their predicted results are combined, with corresponding weight coefficients, to give the final prediction. Xiao et al. [18] proposed a novel combined model based on no negative constraint theory and an artificial intelligence algorithm, in which the chaos particle swarm optimization algorithm was used to find the optimal weight coefficients. To simultaneously obtain high accuracy and strong stability, Wang et al. [19] developed a combined forecasting model using a multi-objective bat algorithm for wind speed forecasting. In [20], Wang et al. presented a robust combined model adopting ARIMA, SVM, ELM and LSSVM (least square support vector machine) for short term probabilistic wind speed prediction, in which GPR (Gaussian process regression) is utilized to combine the results of individual predictors. Recent research has demonstrated that a combined forecasting mechanism can achieve better prediction performance than single models. However, it should be noted that the commonly accepted combination strategy of weight coefficients is a linear approach, which cannot capture the nonlinear relationship of individual models. Besides, more advanced prediction approaches need to be introduced to enhance the forecasting performance, rather than conventional machine learning algorithms like ANN and SVM.

In recent years, the utilization of deep learning in time series modeling has aroused great research interest [21]. Lv et al. [22] performed traffic flow prediction with big data in a deep learning approach. Qiu et al. [23] proposed an ensemble deep learning method for electrical load forecasting. Moreover, advanced deep learning methods have also been successfully applied in the wind speed forecasting field. In [24], Hu et al. provided a deep auto-encoder based model using transfer learning for short term wind speed prediction. Khodayar [25] proposed a rough deep neural network architecture with auto-encoders to perform short-term wind speed forecasting. Wang [26] developed a new deterministic and probabilistic wind speed forecasting method using deep belief network models. Furthermore, it has been widely acknowledged in ensemble learning that learning performance can be promoted by combining parallel learning models intelligently [27]. Although the existing combined models for wind speed forecasting can be regarded as one type of ensemble prediction, most of their forecasting results are a linear combination of individual predictors. From the perspective of general ensemble learning, ensemble prediction based on nonlinear learning should be further explored and researched. Thus, in this study, a novel method using nonlinear-learning ensemble of deep learning time series prediction based on LSTMs (Long Short Term Memory neural networks), SVRM (support vector regression machine) and EO (extremal optimization algorithm), named EnsemLSTM, is proposed for wind speed forecasting. LSTMs, as a breakthrough variant of RNNs (recurrent neural networks), can learn temporal and long term dependencies from time series data deeply and solve the vanishing gradient problem effectively compared with traditional RNNs [31,32]. EO is a novel promising intelligent optimization algorithm from the statistical physics field and has been applied to a lot of combinatorial and continuous optimization problems, showing its superiority over commonly used algorithms like GA and PSO [37–41]. Inspired by ensemble learning, a cluster of LSTMs with diverse hidden layers and neurons is firstly introduced to explore and exploit the hidden information of wind speed time series. To overcome the shortcomings of linear representation of traditional combined models, the predictions of the LSTMs are aggregated into a nonlinear-learning regression top-layer to give the final ensemble prediction in this paper, rather than a linear combination. ANN is a classic artificial intelligence method, but it is unstable and its performance depends vastly on data, which makes it difficult to predefine the network construction. Additionally, due to limitations of training algorithms, an ANN may easily fall into local minima [28]. On the contrary, SVRM has superiority in solving complex nonlinear regression and prediction problems and has achieved extensive application and remarkable success in the forecasting field [29,30]. Accordingly, the nonlinear-learning top-layer used in this paper is composed of SVRM, and EO is introduced to search for the optimal parameters of this top-layer. Therefore, the main differences between the proposed EnsemLSTM and traditional combined models are summarized as: (a) LSTMs, a kind of deep learning method, are introduced as the forecasting engine in EnsemLSTM, while the predictors of traditional combined models are conventional machine learning algorithms like ANN and SVM; (b) to overcome the defects of linear representation of traditional combined models, a nonlinear-learning regression top-layer is adopted in EnsemLSTM to give the final ensemble prediction; (c) a novel promising intelligent optimization algorithm, i.e. EO, is applied to find the optimal parameters of the top-layer in EnsemLSTM.

The principal contributions of this paper are as follows: (1) A deep learning time series prediction based on LSTMs is introduced to explore and exploit the implicit information of wind speed time series for wind speed forecasting; (2) To improve the generalization capability and robustness of a single deep learning approach, a nonlinear-learning ensemble of deep learning time series prediction, consisting of a cluster of LSTMs with diverse hidden layers and neurons and one nonlinear-learning regression top-layer composed of SVRM optimized by EO, is developed; (3) The performance of the proposed EnsemLSTM is successfully validated on two case studies of data collected from a wind farm in Inner Mongolia, China, performing ten-minute ahead utmost short term wind speed forecasting and one-hour ahead short term wind speed forecasting. Statistical tests of the experimental results have demonstrated that the proposed EnsemLSTM can achieve a better forecasting performance when compared with other prediction models.

The remainder of this article is arranged as follows. In Section 2, the optimization problem formulation of nonlinear-learning ensemble of deep learning time series prediction for wind speed forecasting is proposed and the related basic learning and optimization algorithms are introduced. Section 3 presents the proposed EnsemLSTM. Section 4 describes the evaluation indices of model forecasting performance. In Section 5, two case studies are performed and the discussion and comparison of forecasting models are also given. Finally, the conclusion and future work of this paper are given in Section 6.

2. Problem formulation

2.1. Deep learning time series prediction

As one distinctive class of RNNs, LSTMs utilize special units named memory blocks to take the place of the traditional neurons in the hidden layers [31,32]. Moreover, there exist three gate units, called input gates, output gates and forget gates, in memory blocks, and hence LSTMs have the ability to update and control the information flow in the block through these gates. The schema of LSTMs is displayed in Fig. 1. The implementation of updating the state of the cell and calculating the output of LSTMs is as follows.


Fig. 1. The schema of LSTMs.

input gate: $i_t = \sigma(W_{ix} x_t + W_{ip} p_{t-1} + W_{ie} e_{t-1} + b_i)$
output gate: $o_t = \sigma(W_{ox} x_t + W_{op} p_{t-1} + W_{oe} e_t + b_o)$
forget gate: $f_t = \sigma(W_{fx} x_t + W_{fp} p_{t-1} + W_{fe} e_{t-1} + b_f)$
temporary cell state: $\tilde{e}_t = g(W_{ex} x_t + W_{ep} p_{t-1} + b_e)$
cell state: $e_t = i_t \odot \tilde{e}_t + f_t \odot e_{t-1}$
output: $p_t = o_t \odot h(e_t)$
output layer: $y_t = \varphi(W_{yp} \cdot p_t + b_y)$    (1)

where $x_t$ is the input vector and $y_t$ is the output vector; $i_t$, $o_t$ and $f_t$ are the outputs of the input gate, output gate and forget gate, respectively; $\tilde{e}_t$ and $e_t$ are the temporary and final states of the memory cell in the memory block; and $p_t$ is the output of the memory block. $\sigma$ denotes the gate activation function (generally the logistic sigmoid function), $g$ and $h$ are respectively the input and output activation functions (usually the tanh function), $\odot$ is the element-wise multiplication between two vectors (Hadamard product), and $\varphi$ is the output activation function of LSTMs, a linear function in this paper for time series prediction. $W_{ix}$, $W_{ip}$, $W_{ie}$, $W_{ox}$, $W_{op}$, $W_{oe}$, $W_{fx}$, $W_{fp}$, $W_{fe}$, $W_{ex}$, $W_{ep}$ and $W_{yp}$ represent the corresponding weight matrices; $b_i$, $b_o$, $b_f$, $b_e$ and $b_y$ are the related bias vectors.

2.2. The nonlinear-learning ensemble of deep learning time series prediction for wind speed forecasting

To achieve a better wind speed forecasting, a nonlinear-learning ensemble of deep learning time series prediction based on LSTMs, SVRM and EO is developed in this paper. In the structure of nonlinear-learning ensemble learning, the predictions of a cluster of LSTMs are input into a nonlinear-learning regression top-layer to produce the final forecasting. Considering its superiority in solving complex regression problems and its extensive application and remarkable success in the forecasting field [29,30], SVRM is introduced as the nonlinear-learning top-layer. However, the performance of ensemble learning depends vastly on the parameters of the top-layer SVRM, i.e. the punishment coefficient C and kernel parameter σ. To address this problem, parameter optimization of SVRM using real-coded EO is developed in this paper. Before giving the optimization problem formulation for wind speed forecasting, the basic concepts of SVRM and EO are introduced first.

2.2.1. Support vector regression machine

Given a set of samples {x_i, y_i}, i = 1, 2, 3, …, N, with input vector $x_i \in R^m$ and output $y_i \in R$, the task of regression problems is to find a function f(x) that reveals the relationship of inputs and outputs. The motivation of SVR is to achieve a linear regression in the high-dimensional feature space obtained by mapping the original input set through a predefined function $\phi(x)$, and to minimize the structural risk R[f] [28–30]. The above process can be expressed as follows:

$f(x) = W^T \phi(x) + b$

$R[f] = \frac{1}{2}\|W\|^2 + C \sum_{i=1}^{N} L(x_i, y_i, f(x_i))$    (2)

where W, b and C are respectively the regression coefficient vector, bias term and punishment coefficient, and $L(x_i, y_i, f(x_i))$ denotes the ε-insensitive loss function. The regression problem can be tackled by the following constrained optimization problem:

$\min \ \frac{1}{2}\|W\|^2 + C \sum_{i=1}^{N} (\zeta_i + \zeta_i^*)$
s.t. $y_i - (W^T \phi(x_i) + b) \leq \varepsilon + \zeta_i$
$(W^T \phi(x_i) + b) - y_i \leq \varepsilon + \zeta_i^*$
$\zeta_i, \zeta_i^* \geq 0, \quad i = 1, 2, 3, \ldots, N$    (3)

where $\zeta_i$ and $\zeta_i^*$ denote slack variables introduced to make the constraints feasible. By introducing Lagrange multipliers, the regression function can be given as follows:

$f(x) = \sum_{i=1}^{N} (\alpha_i - \alpha_i^*) K(x_i, x_j) + b$    (4)

where $\alpha_i$ and $\alpha_i^*$ are the Lagrange multipliers, which satisfy the conditions $\alpha_i \geq 0$, $\alpha_i^* \geq 0$ and $\sum_{i=1}^{N} (\alpha_i - \alpha_i^*) = 0$. $K(x_i, x_j)$ is the kernel function; the commonly used radial basis function (RBF) is chosen as the kernel function in this paper, defined as

$K(x_i, x_j) = \exp\left(-\frac{\|x_i - x_j\|^2}{2\sigma^2}\right)$    (5)

where σ represents the RBF kernel width.

2.2.2. Extremal optimization

EO [33,34] is a novel promising intelligent optimization algorithm, stimulated by self-organized criticality from the statistical physics field [35,36]. In the last decade, EO has been successfully applied to a variety of benchmark and real-world engineering optimization problems [37–41]. The relevant research has shown that EO, with simpler evolutionary operations, can outperform commonly used optimization algorithms like GA and PSO. The steps of the real-coded τ-EO adopting PLM are displayed below.

Input: The total number of variables N, the maximum number of iterations Imax, the control parameter τ of the probability distribution P(k).
Output: The best solution optimized by EO.
Step 1: Generate an initial solution S randomly, where S is a combination of variables with count L = N. Set Sbest = S and compute the fitness C(Sbest) = C(S) based on the predefined fitness function.
Step 2: For the current solution S,


Fig. 2. The structure of this proposed EnsemLSTM with K LSTMs.

(a) Generate the solution Si by mutating the component i (1 ≤ i ≤ N) (i.e. the corresponding variable is mutated by PLM) while keeping the others unchanged, then compute the fitness C(Si). PLM can be explained by the following equations:

$x' = x + \alpha \cdot \beta_{max}$

$\alpha = \begin{cases} (2r)^{1/(q+1)} - 1, & \text{if } r \leq 0.5 \\ 1 - [2(1-r)]^{1/(q+1)}, & \text{otherwise} \end{cases}$    (6)

$\beta_{max} = \max[x - l,\ u - x]$

where x denotes the current value of the variable, x′ is the mutated value, q is the PLM parameter, r is a random number belonging to [0, 1], and l and u are respectively the lower and upper bounds of the variable.
(b) Evaluate the local fitness λi = C(Si) − Cbest for each component i and rank all the components according to λi, i.e., find a permutation Π1 of the labels i such that λΠ1(1) ≤ λΠ1(2) ≤ ⋯ ≤ λΠ1(N);
(c) Select a rank Π1(k) according to a probability distribution $P(k) \propto k^{-\tau}$, 1 ≤ k ≤ N, where τ is a positive parameter, and denote the corresponding component as xj;
(d) Mutate the value of xj and set Snew = S in which only the xj value is mutated;
(e) If C(Snew) < C(Sbest), then set Sbest = Snew and C(Sbest) = C(Snew);
(f) Accept Snew unconditionally.
Step 3: Repeat Step 2 until some predefined stopping criterion (i.e., the maximum number of iterations Imax) is satisfied.
Step 4: Obtain the best solution Sbest representing the optimal variables and the corresponding best fitness Cbest.

2.2.3. Optimization of nonlinear-learning ensemble of deep learning time series prediction

From the perspective of optimization, how to choose the best parameters of the nonlinear-learning top-layer for the ensemble of deep learning time series prediction for wind speed forecasting can be seen as a typical optimization problem, which is described as follows:

$\min \ f = Fitness(C, \sigma)$
s.t. $l_C \leq C \leq u_C$, $\ l_\sigma \leq \sigma \leq u_\sigma$    (7)

where C and σ represent the punishment coefficient and kernel parameter of SVRM, respectively; $l_C$ and $u_C$ are the lower and upper bounds of C, and $l_\sigma$ and $u_\sigma$ are the lower and upper bounds of σ. In this paper, the predefined Fitness function is chosen as the MSE (mean square error) of 3-fold cross-validation on the train dataset. For the aim of simplification, the variable to be mutated is selected as the one with the worst fitness, instead of through the probability distribution P(k).

3. Proposed nonlinear-learning ensemble of deep learning time series prediction method

As an excellent deep learning time series prediction method, LSTMs can explore and exploit the hidden information of dynamic time series efficiently, but the forecasting ability of LSTMs can be influenced by the number of hidden layers and the neuron count in each hidden layer. Inspired by the great performance of ensemble learning, a nonlinear-learning ensemble of deep learning time series prediction method based on LSTMs, SVRM and EO, named EnsemLSTM, is proposed in this article.

In the proposed EnsemLSTM, the time series, i.e. the wind speed time series data, is first predicted separately by a cluster of LSTMs with diverse numbers of hidden layers and neurons in each hidden layer, to explore and exploit the implicit information of the wind speed time series. Then, to overcome the defects of linear representation of traditional combined models, one nonlinear-learning regression top-layer is applied for ensemble forecasting, which is fed and trained by the forecasting results of the LSTMs. Considering its extensive application and remarkable success in the forecasting field, the nonlinear-learning top-layer used is composed of SVRM, and the real-coded EO adopting PLM is introduced to optimize the parameters of the top-layer. Lastly, the final ensemble prediction for wind speed is output by the fine-tuned top-layer. The overall structure of this proposed EnsemLSTM with K LSTMs is shown in Fig. 2. It should be pointed out that there is no theoretical knowledge to predefine the network structure of LSTMs for specific data, and the practical solution is to select the hyper-parameters by trial-and-error experiments [42,43]. To make a trade-off between learning performance and model complexity, in the proposed EnsemLSTM, six diverse LSTMs are adopted based on trial and error: LSTM1 with 1 hidden layer of 50 neurons, LSTM2 with 1 hidden layer of 100 neurons, LSTM3 with 1 hidden layer of 150 neurons, LSTM4 with 2 hidden layers of 50 and 50 neurons, LSTM5 with 2 hidden layers of 50 and 100 neurons, and LSTM6 with 2 hidden layers of 50 and 150 neurons. Like traditional combined models, the six diverse LSTMs (LSTM1–LSTM6) are built as single prediction models to forecast the wind speed on the train and test datasets. The predictions of the six LSTM models on the train dataset are input as train features into the SVRM to learn. The task of the SVRM is to learn the nonlinear relationship of the six LSTM predictors, like solving a multivariate regression problem. Then, the predicted results of the six LSTM models on the test dataset are input as test features into the well-trained SVRM to produce the ensemble forecast. The output of the proposed EnsemLSTM, i.e. the ensemble forecast of the SVRM optimized with EO, is the final wind speed prediction. The search ranges of the punishment coefficient C and kernel parameter σ optimized by real-coded EO adopting PLM are [0, 1000] and [0, 1], respectively. The PLM parameter q in EO is set to 30 and the maximum number of iterations Imax is 1000.

The flowchart of the EnsemLSTM is shown in Fig. 3 and the detailed implementation of this proposed EnsemLSTM is listed in the Appendix.

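The real-coded EO procedure of Section 2.2.2, with the PLM mutation of Eq. (6) and the worst-fitness simplification of Section 2.2.3, can be sketched as follows. The two-variable quadratic fitness is a hypothetical stand-in for the cross-validated MSE of Eq. (7); the bounds and q follow the paper's settings, and the deterministic rank-1 selection is one reading of the stated simplification.

```python
import numpy as np

def plm(x, lo, hi, q, rng):
    """Polynomial-like mutation (PLM), Eq. (6): x' = x + alpha * beta_max."""
    r = rng.random()
    if r <= 0.5:
        alpha = (2.0 * r) ** (1.0 / (q + 1)) - 1.0
    else:
        alpha = 1.0 - (2.0 * (1.0 - r)) ** (1.0 / (q + 1))
    beta_max = max(x - lo, hi - x)                 # beta_max = max[x-l, u-x]
    return min(max(x + alpha * beta_max, lo), hi)  # clip back into [l, u]

def eo_minimize(fitness, bounds, iters=1000, q=30, seed=1):
    """Real-coded EO sketch: per iteration, each component gets one PLM
    trial; the worst-adapted component (the one whose trial has the lowest
    fitness, rank 1 in the lambda ordering) is mutated, and the new solution
    is accepted unconditionally while the best-so-far is kept."""
    rng = np.random.default_rng(seed)
    s = np.array([rng.uniform(lo, hi) for lo, hi in bounds])
    best, best_f = s.copy(), fitness(s)
    for _ in range(iters):
        trials = []
        for i, (lo, hi) in enumerate(bounds):
            si = s.copy()
            si[i] = plm(s[i], lo, hi, q, rng)  # mutate component i alone
            trials.append((fitness(si), si))
        f_new, s = min(trials, key=lambda t: t[0])
        if f_new < best_f:
            best, best_f = s.copy(), f_new
    return best, best_f

# Hypothetical stand-in for the CV-MSE of Eq. (7), minimum at (300.0, 0.5);
# bounds mirror the paper's search ranges for (C, sigma).
fit = lambda v: ((v[0] - 300.0) / 1000.0) ** 2 + (v[1] - 0.5) ** 2
best, best_f = eo_minimize(fit, bounds=[(0.0, 1000.0), (0.0, 1.0)])
```

In the actual method, `fitness` would train and cross-validate the SVRM top-layer at each candidate (C, σ), which is why the iteration budget matters.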

Fig. 3. The flowchart of this proposed EnsemLSTM.

Fig. 4. The wind speed collected in case study 1.

Table 1
The statistical information of wind speed in case study 1.
Case study 1: ten-minute wind speed data from November 23, 2012 to November 28, 2012 (m/s)

Dataset          Max      Median   Min     Mean     St.d
Entire dataset   18.500   12.100   5.500   12.166   2.460
Train dataset    18.500   12.300   5.500   12.438   2.269
Test dataset     18.300   11.600   6.300   11.531   2.761

St.d: standard deviation.

Table 2
Sets of parameters of different compared forecasting methods in case study 1.

Forecasting methods   Sets of parameters
EnsemLSTM             q = 30, Imax = 1000
ARIMA                 (p, d, q) = (2, 0, 1)
SVR                   C = 13.00, σ² = 0.25
ANN                   1 hidden layer with 15 neurons
KNN                   K = 5
GBRT                  Maximum tree depth 3; number of decision trees 300


Table 3
The forecasting results of prediction models in case study 1.

Forecasting methods   MAE      RMSE     MAPE(%)   R
EnsemLSTM             0.5746   0.7552   5.4167    0.9619
ARIMA                 0.6961   0.9257   6.5067    0.9420
SVR                   0.5834   0.7729   5.4912    0.9599
ANN                   0.6332   0.8397   6.1528    0.9545
KNN                   0.6391   0.8360   6.0332    0.9530
GBRT                  0.6296   0.8147   6.0733    0.9576

Best performance is highlighted in bold.

4. Evaluation of forecasting performance

Four commonly used statistical criteria are employed to evaluate the forecasting performance of wind speed prediction models. They are defined as below.

Mean absolute error (MAE):

$MAE = \frac{1}{N} \sum_{i=1}^{N} |f(i) - h(i)|$    (8)

Root mean square error (RMSE):

$RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (f(i) - h(i))^2}$    (9)

Mean absolute percentage error (MAPE):

$MAPE = \frac{1}{N} \sum_{i=1}^{N} \frac{|f(i) - h(i)|}{h(i)} \times 100\%$    (10)

Correlation coefficient (R):

$R = \frac{\sum_{i=1}^{N} (f(i) - \bar{f})(h(i) - \bar{h})}{\sqrt{\sum_{i=1}^{N} (f(i) - \bar{f})^2} \cdot \sqrt{\sum_{i=1}^{N} (h(i) - \bar{h})^2}}$    (11)

where f(i) and h(i) represent the predicted value and the actual value at time i, respectively; $\bar{f}$ and $\bar{h}$ denote the means of the predicted values and the actual values, respectively; and N is the total number of the data.

5. Experiments

5.1. Wind speed data description

Inner Mongolia, in China, is located in the monsoon region, and its average annual wind speed is about 3.7 m/s. Inner Mongolia's wind energy reserves are very large, about 270 million kW h, accounting for 1/5 of the national total and ranking first in China. In this paper, the proposed EnsemLSTM was applied to wind speed data collected by a wind farm in Inner Mongolia, China. Two case studies with different prediction time horizons, i.e. ten-minute ahead utmost short term wind speed forecasting and one-hour ahead short term wind speed forecasting, were implemented to validate the effectiveness of EnsemLSTM, and the results of the experiments were compared with conventional forecasting methods including ARIMA, SVR, KNN, ANN and GBRT (gradient boosting regression tree). The parameters for ARIMA were determined based on the values of AIC and BIC. For the SVR model, C was chosen as ymax − ymin according to [44] and the parameter σ was fixed by trial-and-error experiments based on [45]. The predetermined K (i.e. the number of neighbors) of the KNN model was set to 5. The ANN model was composed of one hidden layer, with the number of neurons decided by trials. For GBRT, the maximum tree depth was set to 3 and the number of decision trees was set to 300. Without loss of generality, the look-back time lag was set to one in the following experiments. In other words, the forecasting approaches predicted the next point S(t + 1) based on the last point S(t), i.e. the input was composed of the wind speed value of the previous time. The ARIMA and ANN models were available in the Econometrics toolbox and the Neural Network toolbox in MATLAB, respectively. The SVR, KNN and GBRT models were implemented using the scikit-learn machine learning package in Python 2.7 [46]. The proposed EnsemLSTM was implemented by mixed-language programming based on MATLAB and Python 2.7, while the LSTMs algorithm was run using the "Keras" deep learning package [47]. All models were run on a computer with Windows 10 operating system, Intel Core i5 CPU @ 2.30 GHz and 8.00 GB RAM. Considering the impact of randomness, each experiment was run 30 times and the statistical results were taken.

5.2. Case study 1: utmost short term wind speed forecasting

In this case study, the wind speed data sampled per ten minutes from November 23, 2012 to November 28, 2012 were utilized as the dataset to perform ten-minute ahead utmost short term wind speed

Fig. 5. The comparison of forecasting performance in case study 1.


Table 4
The statistical tests of forecasting performance comparisons in case study 1.

Metric   Ranking tests      EnsemLSTM   ARIMA     SVR      ANN      KNN      GBRT     Statistic   p-value
MAE      Friedman           1.000       5.867     2.333    3.467    4.500    3.833    130.815     0.0000
         Friedman Aligned   16.633      162.400   52.567   92.667   116.83   101.90   120.118     0.0000
         Quade              1.000       5.759     2.357    3.639    4.421    3.824    42.405      0.0000
RMSE     Friedman           1.000       5.867     2.400    3.600    4.567    3.567    125.412     0.0000
         Friedman Aligned   17.767      162.267   58.267   95.833   117.87   91.000   115.456     0.0000
         Quade              1.000       5.755     2.404    3.768    4.548    3.525    42.968      0.0000
MAPE     Friedman           1.000       5.800     2.333    3.833    3.733    4.300    104.787     0.0000
         Friedman Aligned   17.723      158.400   53.167   102.53   101.90   109.77   112.599     0.0000
         Quade              1.000       5.645     2.316    4.131    3.724    4.183    39.013      0.0000
R        Friedman           1.000       5.900     2.200    3.633    4.733    3.533    195.558     0.0000
         Friedman Aligned   20.233      163.400   48.800   97.467   127.00   86.100   125.848     0.0000
         Quade              1.000       5.817     2.230    3.841    4.663    3.448    51.681      0.0000

Best performance is highlighted in bold.
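A ranking test like the Friedman test behind Table 4 can be reproduced directly from per-run error scores. The sketch below uses a made-up error matrix (rows = 30 independent runs, columns = competing methods) and computes average ranks plus the Friedman chi-square statistic from its textbook definition; no tie correction is applied, and the data are purely illustrative.

```python
import numpy as np

def friedman(errors):
    """Return per-method average ranks and the Friedman chi-square statistic.
    Lower error receives the better (smaller) rank, as in Table 4."""
    n, k = errors.shape
    # argsort of argsort yields 0-based within-row ranks; +1 makes them 1-based.
    ranks = errors.argsort(axis=1).argsort(axis=1) + 1.0
    avg = ranks.mean(axis=0)
    # chi2_F = 12n / (k(k+1)) * sum_j (Rbar_j - (k+1)/2)^2
    chi2 = 12.0 * n / (k * (k + 1)) * np.sum((avg - (k + 1) / 2.0) ** 2)
    return avg, chi2

# Hypothetical per-run MAE of three methods; method 0 is consistently best.
rng = np.random.default_rng(0)
base = rng.normal(0.8, 0.05, size=(30, 1))
errors = np.hstack([base - 0.2, base, base + 0.1]) + rng.normal(0, 0.01, (30, 3))
avg_ranks, chi2 = friedman(errors)
```

A large statistic with a small p-value, as throughout Table 4, indicates that the methods' rank differences are unlikely to be due to chance.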

Fig. 6. The wind speed forecasting results in case study 1.

forecasting. The wind speed collected in case study 1 is displayed in Fig. 4. The total of 738 obtained samples was divided into two parts (70% as the train set and 30% as the test set). The forecasting models were trained on the train dataset and validated on the test dataset. The statistical information of the above dataset is shown in Table 1. Table 2 displays the sets of parameters of the different compared forecasting methods in case study 1. The forecasting results of the different prediction models are listed in Table 3. Fig. 5 visualizes the comparison of forecasting performance. For the purpose of a comprehensive comparison of the proposed method and the other prediction models, statistical tests including the Friedman, Friedman Aligned and Quade tests were used to rank the different methods with much more statistical reliability [48]. Table 4 shows the ranks, statistics and related p-values achieved by the statistical tests in case study 1. The forecast wind speeds are plotted in Fig. 6, and Figs. 7 and 8 provide the bar graph and line chart of prediction residual errors for the different forecast methods, respectively.

From Table 3 and Fig. 5, it can be seen that the proposed EnsemLSTM performs better than the compared widely used forecast approaches, with the minimum MAE of 0.5746, RMSE of 0.7552 and MAPE of 5.4167%, and the maximum R of 0.9619. The best of the compared prediction models is SVR, with MAE of 0.5834, RMSE of 0.7729, MAPE of 5.4912% and R of 0.9599, while

687
J. Chen et al. Energy Conversion and Management 165 (2018) 681–695

Fig. 7. The bar graphs of prediction residual errors for different forecast methods in case study 1.

Fig. 8. The line chart of prediction residual errors for different forecast methods in case study 1.

Table 5
The impact of different ensemble learning top-layer on prediction performance in case study 1.

Different models  MAE     RMSE    MAPE (%)  R
EnsemLSTM         0.5746  0.7552  5.4167    0.961938
ANNLSTM           0.5850  0.7667  5.5145    0.961186
MeanLSTM          0.5782  0.7585  5.4822    0.961890

Best performance is highlighted in bold.

Table 6
The comparisons of forecasting results between the EnsemLSTM and single LSTMs in case study 1.

Different models  MAE     RMSE    MAPE (%)  R
EnsemLSTM         0.5746  0.7552  5.4167    0.961938
LSTM1             0.5834  0.7618  5.5520    0.961913
LSTM2             0.5818  0.7613  5.5393    0.961861
LSTM3             0.5794  0.7587  5.5017    0.961919
LSTM4             0.5778  0.7605  5.4598    0.961652
LSTM5             0.5776  0.7596  5.4494    0.961737
LSTM6             0.5802  0.7625  5.4595    0.961764

Best performance is highlighted in bold.


Fig. 9. The wind speed collected in case study 2.

the worst one is ARIMA, with an MAE of 0.6961, RMSE of 0.9257, MAPE of 6.5067% and R of 0.9420. According to Table 4, the proposed EnsemLSTM also achieves the best ranks in the Friedman, Friedman Aligned and Quade tests for all of the forecasting performance indices at a level of significance α = 0.0001. Moreover, from the analysis of Figs. 6, 7 and 8, the EnsemLSTM shows an apparently better curve fit to the actual wind speed time series and smaller residual errors than the other forecasting methods.

In order to verify the effectiveness of the proposed EnsemLSTM in improving wind speed forecasting performance, the impact of the ensemble learning top-layer on prediction performance is analyzed and comparisons between the EnsemLSTM and single LSTMs are performed. Table 5 presents the impacts of the different top-layers, and the comparisons between the ensemble and single models are shown in Table 6. In Table 5, the models compared are the proposed EnsemLSTM, ANNLSTM (whose nonlinear-learning top-layer is composed of an ANN model) and MeanLSTM (whose ensemble output is the average of the individual forecasts). From Table 5, we can find that EnsemLSTM achieves the best forecasting performance, which indicates not only the superiority of the SVRM over the ANN but also the advantage of a nonlinear-learning top-layer. Moreover, from Table 6, it can be clearly seen that the EnsemLSTM improves on the forecasting performance of the single LSTMs, showing the great strength of ensemble learning over single models.

Remark 1. In case study 1, ten-minute ahead utmost short term wind speed forecasting is performed, and the proposed EnsemLSTM achieves better forecasting performance than ARIMA, SVR, ANN, KNN and GBRT, which indicates its powerful ability to learn dynamic sequences (wind speed time series data). The Friedman, Friedman Aligned and Quade tests on MAE, RMSE, MAPE and R also prove the superiority of EnsemLSTM from a statistical perspective. Moreover, the nonlinear-learning SVRM top-layer in EnsemLSTM shows better ensemble learning performance than ANNLSTM and MeanLSTM, and the comparisons between EnsemLSTM and the six single prediction models LSTM1–LSTM6 further manifest the strong learning performance of ensemble learning.

Table 7
The statistical information of wind speed in case study 2.
Case study 2: mean one-hour wind speed data collected from April 1, 2013 to April 30, 2013 (m/s)

Dataset         Max     Median  Min    Mean   St.d
Entire dataset  23.383  8.092   0.617  8.687  4.322
Train dataset   23.383  7.667   0.617  8.340  4.325
Test dataset    21.883  9.483   0.967  9.492  4.421

St.d: standard deviation.

Table 8
Sets of parameters of different compared forecasting methods in case study 2.

Forecasting methods  Sets of parameters
EnsemLSTM            q = 30, Imax = 1000
ARIMA                (p, d, q) = (2, 0, 1)
SVR                  C = 22.77, σ² = 1
ANN                  1 hidden layer with 10 neurons
KNN                  K = 5
GBRT                 maximum tree depth set as 3, number of decision trees set as 300

5.3. Case study 2: short term wind speed forecasting

One-hour ahead short term wind speed forecasting is investigated in this case study, using mean one-hour wind speed data collected from April 1, 2013 to April 30, 2013. The total 720 obtained data points were divided into two parts in the same way as in case study 1. The wind speed collected in case study 2 is shown in Fig. 9, and the statistical information of this dataset is given in Table 7. Table 8 displays the sets of parameters of the different compared forecasting methods in case study 2. The forecasting results and the statistical tests of forecasting performance comparisons between the different prediction models are displayed in Tables 9 and 10, respectively, and Fig. 10 visualizes the comparison of forecasting results. It can be observed from Table 9 that the four forecasting performance indices in case study 2 become worse than those calculated in case study 1. This is easily understood: short term wind speed forecasting is more complicated and difficult than utmost short term forecasting, because the wind speed non-determinacy increases as the prediction horizon grows from ten minutes to one hour.

Table 9
The forecasting results of prediction models in case study 2.

Forecasting methods  MAE     RMSE    MAPE (%)  R
EnsemLSTM            1.1410  1.5335  17.1076   0.9375
ARIMA                1.3753  1.8337  20.7303   0.9098
SVR                  1.1841  1.5766  17.7574   0.9338
ANN                  1.1918  1.5784  18.1864   0.9340
KNN                  1.2291  1.6223  17.9257   0.9297
GBRT                 1.2143  1.5806  18.6117   0.9341

Best performance is highlighted in bold.

Similar to the results of case study 1, Table 9 and Fig. 10 show that the proposed EnsemLSTM outperforms the compared prediction models for short term wind speed forecasting, with the minimum value

689
J. Chen et al. Energy Conversion and Management 165 (2018) 681–695

Table 10
The statistical tests of forecasting performance comparisons in case study 2.

Index  Ranking test       EnsemLSTM  ARIMA    SVR     ANN      KNN     GBRT    Statistic  p-value
MAE    Friedman           1.000      6.000    2.400   2.933    4.767   3.867   238.732    0.0000
       Friedman Aligned   16.033     165.500  57.067  72.100   130.27  102.03  133.168    0.0000
       Quade              1.065      6.000    2.439   2.895    4.757   3.845   58.790     0.0000
RMSE   Friedman           1.000      6.000    2.567   2.900    4.867   3.667   241.907    0.0000
       Friedman Aligned   15.500     162.500  69.433  71.900   133.80  86.867  130.042    0.0000
       Quade              1.000      6.000    2.510   3.086    4.796   3.609   58.435     0.0000
MAPE   Friedman           1.100      5.967    2.567   3.167    3.533   4.667   124.272    0.0000
       Friedman Aligned   17.733     164.500  65.833  84.700   84.067  126.17  118.344    0.0000
       Quade              1.161      5.935    2.594   3.125    3.542   4.643   42.701     0.0000
R      Friedman           1.033      6.000    3.633   2.867    4.900   2.567   235.936    0.0000
       Friedman Aligned   15.567     165.500  86.433  71.033   134.53  69.933  130.832    0.0000
       Quade              1.037      6.000    3.596   3.026    4.845   2.497   58.732     0.0000

Best performance is highlighted in bold.
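For reference, the four indices reported throughout (MAE, RMSE, MAPE and R) can be computed as sketched below. The numbers are made up for illustration, and the sketch assumes R denotes the Pearson correlation coefficient between the actual and forecast series, as is common; the paper's exact definition appears in an earlier section.

```python
import numpy as np

def metrics(actual, forecast):
    """MAE, RMSE, MAPE (%) and Pearson R between two series."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    err = forecast - actual
    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    mape = 100.0 * np.mean(np.abs(err / actual))  # undefined for zero wind speeds
    r = np.corrcoef(actual, forecast)[0, 1]       # Pearson correlation
    return mae, rmse, mape, r

# Made-up actual vs. forecast wind speeds (m/s), for illustration only.
mae, rmse, mape, r = metrics([8.0, 10.0, 5.0, 4.0], [7.5, 10.5, 4.5, 4.4])
print(f"MAE={mae:.4f}  RMSE={rmse:.4f}  MAPE={mape:.4f}%  R={r:.4f}")
```

MAE and RMSE are in m/s, MAPE is scale-free but blows up near calm (zero-speed) records, and R measures linear agreement between the two curves.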

Fig. 10. The comparison of forecasting performance in case study 2.

of MAE as 1.1410, RMSE as 1.5335 and MAPE as 17.1076%, and the maximum value of R as 0.9375, while the best forecasting performance indices of the compared prediction models are, respectively, an MAE of 1.1841 for SVR, RMSE of 1.5766 for SVR, MAPE of 17.7574% for SVR and R of 0.9341 for GBRT. In addition, the EnsemLSTM also ranks first in the statistical tests at a level of significance α = 0.0001 for short term wind speed forecasting, as shown in Table 10. Fig. 11 plots the forecasting results in case study 2, and Figs. 12 and 13 show the bar graph and line chart of prediction residual errors for the different forecast methods, respectively. Figs. 11, 12 and 13 demonstrate that the wind speed forecasted by the EnsemLSTM shows more similarity with the actual wind speed and produces smaller residual errors in case study 2.

Table 11 displays the impact of the different ensemble learning top-layers on forecasting performance in case study 2, and Table 12 shows the comparisons between EnsemLSTM and the single LSTMs. Tables 11 and 12 demonstrate that the proposed EnsemLSTM performs wind speed forecasting superior to ANNLSTM, MeanLSTM and the single LSTMs, a benefit of the nonlinear-learning ensemble top-layer of SVRM.

Remark 2. One-hour ahead short term wind speed forecasting is performed in case study 2, which is more complicated and difficult than the ten-minute ahead utmost short term wind speed forecasting in case study 1. Although the wind speed forecasting performance becomes worse in case study 2, the statistical tests of the experimental results demonstrate that the proposed EnsemLSTM is still superior to ARIMA, SVR, ANN, KNN and GBRT. Furthermore, the better forecasting performance of EnsemLSTM over ANNLSTM, MeanLSTM and the single LSTMs is also verified in case study 2, which implies the great learning ability of the nonlinear-learning top-layer and of ensemble learning. The above detailed comparisons in case study 1 and case study 2 prove that the proposed EnsemLSTM, using a nonlinear-learning ensemble of deep learning, can perform better wind speed forecasting than conventional prediction models.

5.4. Discussion and comparison

This section focuses on the discussion and comparison between the proposed EnsemLSTM and other conventional forecasting models. From the experimental results of utmost short term and short term wind speed forecasting, we can observe that EnsemLSTM performs the best in terms of the forecasting metrics (MAE, RMSE, MAPE and R) among the compared models, i.e., ARIMA, SVR, ANN, KNN and GBRT, and the related statistical tests in Tables 4 and 10 have proved the effectiveness of EnsemLSTM. Furthermore, from the structure of the proposed method, the forecasting ability of EnsemLSTM depends upon the LSTMs and the nonlinear-learning ensemble top-layer.


Fig. 11. The wind speed forecasting results in case study 2.

To improve the generalization capability and robustness of single LSTMs, ensemble learning over LSTMs with diverse hidden layers and neurons is introduced in this paper, and the analyses in Tables 5–6 and 11–12 also manifest that the ensemble learning top-layer of SVRM optimized by the EO is superior to ANNLSTM, MeanLSTM and the single LSTMs. On the other hand, due to the linear characteristic of ARIMA and the simple calculations of SVR, ANN, KNN and GBRT, the proposed deep-learning-based ensemble model is more complex than these methods. In addition, the forecasting ability of the proposed method can be influenced by the construction of the ensemble and could be enhanced by adopting a more proper structure for specific data. Therefore, we conclude that the EnsemLSTM proposed in this paper is effective and promising and can be seen as an alternative reliable technique for wind speed forecasting.

6. Conclusion and future work

Wind speed forecasting is an essential issue in wind energy generation, conversion and operation, and has been attracting a lot of attention. This paper has introduced a novel method using nonlinear-learning ensemble of deep learning time series prediction based on LSTMs, SVRM and EO for wind speed forecasting. In the proposed EnsemLSTM, a cluster of LSTMs with diverse hidden layers and neurons is first employed separately to learn the information of the wind speed time series. Then, the predictions of the LSTMs are aggregated into a nonlinear-learning regression top-layer composed of an SVRM, and the EO is introduced to optimize the parameters of the top-layer. Lastly, the final ensemble wind speed forecast is given by the fine-tuned top-layer. To verify the effectiveness of the proposed EnsemLSTM, two case study datasets collected from a wind farm in Inner Mongolia, China, are adopted to perform ten-minute ahead utmost short term wind speed forecasting and one-hour ahead short term wind speed forecasting. When compared with other popular prediction models including ARIMA, SVR, ANN, KNN and GBRT, the proposed EnsemLSTM achieves better forecasting performance, with the minimum values of MAE, RMSE and MAPE and the maximum value of R. Moreover, EnsemLSTM also realizes the best ranks in the statistical tests of the experimental results, including the Friedman, Friedman Aligned and Quade tests. Furthermore, the analysis of the impact of the ensemble learning top-layer on forecasting performance and the comparisons between EnsemLSTM and single LSTMs show that the nonlinear-learning top-layer of SVRM optimized by the EO is superior to ANNLSTM, MeanLSTM and the single LSTMs. Based on the nonlinear-learning ensemble of LSTMs, SVRM and EO, the proposed EnsemLSTM achieves satisfactory wind speed forecasting performance.

Univariate time series prediction for wind speed forecasting is investigated in this paper. In the near future, multivariate time series prediction based on deep learning algorithms using more interrelated


Fig. 12. The bar graphs of prediction residual errors for different forecast methods in case study 2.

Fig. 13. The line chart of prediction residual errors for different forecast methods in case study 2.

Table 11
The impact of different ensemble learning top-layer on prediction performance in case study 2.

Different models  MAE     RMSE    MAPE (%)  R
EnsemLSTM         1.1410  1.5335  17.1076   0.937498
ANNLSTM           1.1660  1.5610  17.8027   0.936749
MeanLSTM          1.1446  1.5451  17.5856   0.937349

Best performance is highlighted in bold.

Table 12
The comparisons of forecasting results between the EnsemLSTM and single LSTMs in case study 2.

Different models  MAE     RMSE    MAPE (%)  R
EnsemLSTM         1.1410  1.5335  17.1076   0.937498
LSTM1             1.1503  1.5443  17.8296   0.937486
LSTM2             1.1437  1.5463  17.4906   0.937257
LSTM3             1.1451  1.5483  17.4886   0.937313
LSTM4             1.1461  1.5466  17.6182   0.937308
LSTM5             1.1471  1.5483  17.5832   0.937272
LSTM6             1.1494  1.5502  17.6746   0.937261

Best performance is highlighted in bold.


features like weather conditions, human factors and power system statuses will be researched for more complex wind speed prediction. On the other hand, the authors will also attempt to study more efficient ensemble learning structures to promote the model forecasting ability.

Acknowledgements

This work was supported by the Natural Science Foundation of China (Grant No. 61573095) and Zhejiang Provincial Natural Science Foundation (Nos. LY16F030011 and LZ16E050002).

Appendix A

Algorithm: EnsemLSTM
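The algorithm listing of Appendix A is an image in the original PDF and is not reproduced here. As a hedged structural sketch of the EnsemLSTM idea — not the authors' implementation — the toy below replaces the cluster of LSTMs with simple linear autoregressions over diverse lag windows, the SVRM top-layer with an RBF kernel ridge regressor, and the EO with a plain random search over the top-layer's regularization λ and kernel width σ; the synthetic series and all parameter ranges are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "wind speed" series (m/s) standing in for the measured data.
t = np.arange(400)
speed = 8 + 3 * np.sin(2 * np.pi * t / 48) + rng.normal(0, 0.5, t.size)

# Lagged design matrix: the 12 most recent values predict the next one.
MAX_LAG = 12
X = np.array([speed[i:i + MAX_LAG] for i in range(speed.size - MAX_LAG)])
y = speed[MAX_LAG:]
split = 300
Xtr, ytr, Xte, yte = X[:split], y[:split], X[split:], y[split:]

# Step 1 -- a "cluster" of diverse base predictors (linear autoregressions
# over different lag windows stand in for LSTMs with diverse architectures).
def fit_predict(n_lags):
    A = np.c_[Xtr[:, -n_lags:], np.ones(len(Xtr))]
    w, *_ = np.linalg.lstsq(A, ytr, rcond=None)
    return A @ w, np.c_[Xte[:, -n_lags:], np.ones(len(Xte))] @ w

preds = [fit_predict(k) for k in (4, 8, 12)]
Ptr = np.column_stack([p[0] for p in preds])   # base forecasts, train set
Pte = np.column_stack([p[1] for p in preds])   # base forecasts, test set

# Step 2 -- nonlinear top layer: RBF kernel ridge regression stands in for
# the SVRM; (lambda, sigma) are tuned by random search standing in for EO.
def rbf(A, B, sigma):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def top_layer_rmse(lam, sigma):
    K = rbf(Ptr, Ptr, sigma)
    alpha = np.linalg.solve(K + lam * np.eye(len(K)), ytr)
    yhat = rbf(Pte, Ptr, sigma) @ alpha
    return np.sqrt(np.mean((yhat - yte) ** 2)), yhat

# Step 3 -- keep the best top-layer configuration as the ensemble forecast.
rmse, y_ensemble = min(
    (top_layer_rmse(10 ** rng.uniform(-4, 0), 10 ** rng.uniform(-1, 1))
     for _ in range(30)),
    key=lambda r: r[0])
print(f"ensemble RMSE on the held-out set: {rmse:.3f}")
```

The three stages mirror the paper's pipeline — diverse base learners, nonlinear aggregation, metaheuristic tuning of the top-layer — while every concrete modeling choice here is a placeholder.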


References

[1] http://www.gwec.net/.
[2] Khare V, Nema S, Baredar P. Solar-wind hybrid renewable energy system: a review. Renew Sustain Energy Rev 2016;58:23–33.
[3] Al-falahi M, Jayasinghe S, Enshaei H. A review on recent size optimization methodologies for standalone solar and wind hybrid renewable energy system. Energy Convers Manag 2017;143:252–74.
[4] Tascikaraoglu A, Uzunoglu M. A review of combined approaches for prediction of short-term wind speed and power. Renew Sustain Energy Rev 2014;34:243–54.
[5] Liu H, Tian H, Li Y. An EMD-recursive ARIMA method to predict wind speed for railway strong wind warning system. J Wind Eng Ind Aerodyn 2015;141:27–38.
[6] Kavasseri R, Seetharaman K. Day-ahead wind speed forecasting using f-ARIMA models. Renew Energy 2009;34:1388–93.
[7] Ren C, An N, Wang J, Li L, Hu B, Shang D. Optimal parameters selection for BP neural network based on particle swarm optimization: a case study of wind speed forecasting. Knowledge-Based Syst 2014;56:226–39.
[8] Aghajani A, Kazemzadeh R, Ebrahimi A. A novel hybrid approach for predicting wind farm power production based on wavelet transform, hybrid neural networks and imperialist competitive algorithm. Energy Convers Manag 2016;121:232–40.
[9] Zhang C, Wei H, Xie L, Shen Y, Zhang K. Direct interval forecasting of wind speed using radial basis function neural networks in a multi-objective optimization framework. Neurocomputing 2016;205:53–63.
[10] Li G, Shi J. Applications of Bayesian methods in wind energy conversion systems. Renew Energy 2012;43:1–8.
[11] Zhang C, Zhou J, Li C, Fu W, Peng T. A compound structure of ELM based on feature selection and parameter optimization using hybrid backtracking search algorithm for wind speed forecasting. Energy Convers Manag 2017;143:360–76.
[12] Chang G, Lu H, Chang Y, Lee Y. An improved neural network-based approach for short-term wind speed and power forecast. Renew Energy 2017;105:301–11.
[13] Noorollahi Y, Jokar MA, Kalhor A. Using artificial neural networks for temporal and spatial wind speed forecasting in Iran. Energy Convers Manag 2016;115:17–25.
[14] Ma X, Jin Y, Dong Q. A generalized dynamic fuzzy neural network based on singular spectrum analysis optimized by brain storm optimization for short-term wind speed forecasting. Appl Soft Comput J 2017;54:296–312.
[15] Jiang P, Wang Y, Wang J. Short-term wind speed forecasting using a hybrid model. Energy 2017;119:561–77.
[16] Chen K, Yu J. Short-term wind speed prediction using an unscented Kalman filter based state-space support vector regression approach. Appl Energy 2014;113:690–705.
[17] Xiao L, Shao W, Yu M, Ma J, Jin C. Research and application of a combined model based on multi-objective optimization for electrical load forecasting. Energy 2017;119:1057–74.
[18] Xiao L, Wang J, Dong Y, Wu J. Combined forecasting models for wind energy forecasting: a case study in China. Renew Sustain Energy Rev 2015;44:271–88.
[19] Wang J, Heng J, Xiao L, Wang C. Research and application of a combined model based on multi-objective optimization for multi-step ahead wind speed forecasting. Energy 2017;125:591–613.
[20] Wang J, Hu J. A robust combination approach for short-term wind speed forecasting and analysis – combination of the ARIMA (Autoregressive Integrated Moving Average), ELM (Extreme Learning Machine), SVM (Support Vector Machine) and LSSVM (Least Square SVM) forecasts using a GPR (Gaussian Process Regression) model. Energy 2015;93:41–56.
[21] Längkvist M, Karlsson L, Loutfi A. A review of unsupervised feature learning and deep learning for time-series modeling. Pattern Recognit Lett 2014;42:11–24.
[22] Lv Y, Duan Y, Kang W, Li Z, Wang F. Traffic flow prediction with big data: a deep learning approach. IEEE Trans Intell Transp Syst 2014;16:865–73.
[23] Qiu X, Ren Y, Suganthan PN, Amaratunga G. Empirical mode decomposition based ensemble deep learning for load demand time series forecasting. Appl Soft Comput 2017;54:246–55.
[24] Hu Q, Zhang R, Zhou Y. Transfer learning for short-term wind speed prediction with deep neural networks. Renew Energy 2016;85:83–95.
[25] Khodayar M, Kaynak O, Khodayar ME. Rough deep neural architecture for short-term wind speed forecasting. IEEE Trans Ind Inform 2017;13:2770–9.
[26] Wang H, Wang G, Li G, Peng J, Liu Y. Deep belief network based deterministic and probabilistic wind speed forecasting approach. Appl Energy 2016;182:80–93.
[27] Ren Y, Zhang L, Suganthan PN. Ensemble classification and regression: recent developments, applications and future directions. IEEE Comput Intell Mag 2016;11:41–53.
[28] Yang W, Wang J, Wang R. Research and application of a novel hybrid model based on data selection and artificial intelligence algorithm for short term load forecasting. Entropy 2017;19(2):52.
[29] Zhang X, Wang J, Zhang K. Short-term electric load forecasting based on singular spectrum analysis and support vector machine optimized by Cuckoo search algorithm. Electr Power Syst Res 2017;146:270–85.
[30] Xu Y, Yang W, Wang J. Air quality early-warning system for cities in China. Atmos Environ 2017;148:239–57.
[31] Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput 1997;9:1–32.
[32] Sainath TN, Vinyals O, Senior A, Sak H. Convolutional, long short-term memory, fully connected deep neural networks. IEEE Int Conf Acoust Speech Signal Process 2015;2015:4580–4.
[33] Boettcher S, Percus A. Nature's way of optimizing. Artif Intell 2000;119:275–86.
[34] Boettcher S, Percus AG. Optimization with extremal dynamics. Phys Rev Lett 2001;86:5211–4.
[35] Bak P, Sneppen K. Punctuated equilibrium and criticality in a simple model of evolution. Phys Rev Lett 1993;71:4083–6.
[36] Lu Y, Chen M, Chen Y. Studies on extremal optimization and its applications in solving real-world optimization problems. In: FOCI; 2007. p. 162–8.
[37] Lu Y, Chen Y, Chen M, Chen P, Zeng G. Extremal optimization: fundamentals, algorithms, and applications. CRC Press & Chemical Industry Press; 2016.
[38] Zeng G, Chen J, Li L, Chen M, Wu L, Dai Y, et al. An improved multi-objective population-based extremal optimization algorithm with polynomial mutation. Inf Sci 2016;330:49–73.
[39] Zeng G, Chen J, Dai Y, Li L, Zheng C, Chen M. Design of fractional order PID controller for automatic regulator voltage system based on multi-objective extremal optimization. Neurocomputing 2015;160:173–84.
[40] Zeng G, Chen J, Chen M, Dai Y, Li L, Lu K, et al. Design of multivariable PID controllers using real-coded population-based extremal optimization. Neurocomputing 2015;151:1343–53.
[41] Lu K, Zhou W, Zeng G, Du W. Design of PID controller based on a self-adaptive state-space predictive functional control using extremal optimization method. J Franklin Inst 2018;355:2197–220.
[42] Goodfellow I, Bengio Y, Courville A. Deep learning. MIT Press; 2016.
[43] Wu Y, Yuan M, Dong S, Lin L, Liu Y. Remaining useful life estimation of engineered systems using vanilla LSTM neural networks. Neurocomputing 2018;275:167–79.
[44] Shi J, Guo J, Zheng S. Evaluation of hybrid forecasting approaches for wind speed and power generation time series. Renew Sustain Energy Rev 2012;16:3471–80.
[45] Zhou J, Shi J, Li G. Fine tuning support vector machines for short-term wind speed forecasting. Energy Convers Manag 2011;52:1990–8.
[46] http://scikit-learn.org/.
[47] https://keras.io/.
[48] Derrac J, García S, Molina D, Herrera F. A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms. Swarm Evol Comput 2011;1:3–18.

