Solar Power Generation Forecasting
https://www.emerald.com/insight/2210-8327.htm
Abstract
Solar power forecasting will have a significant impact on the future of large-scale renewable energy plants.
Predicting photovoltaic power generation depends heavily on climate conditions, which fluctuate over time. In
this research, we propose a hybrid model that combines machine-learning methods with the Theta statistical
method for more accurate prediction of future solar power generation from renewable energy plants. The
machine learning models include long short-term memory (LSTM), gated recurrent unit (GRU), AutoEncoder
LSTM (Auto-LSTM) and a newly proposed Auto-GRU. To enhance the accuracy of the proposed Machine
learning and Statistical Hybrid Model (MLSHM), we employ two diversity techniques, i.e. structural diversity
and data diversity. To combine the prediction of the ensemble members in the proposed MLSHM, we exploit
four combining methods: simple averaging approach, weighted averaging using linear approach and using
non-linear approach, and combination through variance using inverse approach. The proposed MLSHM
scheme was validated on two real-time series datasets, namely Shagaya in Kuwait and Cocoa in the USA. The
experiments show that the proposed MLSHM, using all the combination methods, achieved higher accuracy
compared to the prediction of the traditional individual models. Results demonstrate that a hybrid model
combining machine-learning methods with statistical method outperformed a hybrid model that only combines
machine-learning models without statistical method.
Keywords Solar power forecasting, Machine learning, Statistical methods, Renewable energy, Photovoltaic
Paper type Original Article
© Mariam AlKandari and Imtiaz Ahmad. Published in Applied Computing and Informatics. Published
by Emerald Publishing Limited. This article is published under the Creative Commons Attribution (CC
BY 4.0) license. Anyone may reproduce, distribute, translate and create derivative works of this article
(for both commercial and non-commercial purposes), subject to full attribution to the original publication
and authors. The full terms of this license may be seen at http://creativecommons.org/licences/by/4.0/legalcode
Declaration of Competing Interest: The authors declare that they have no known competing financial
interests or personal relationships that could have appeared to influence the work reported in this paper.
Thanks are extended to the reviewers for their valuable comments which certainly improved the
quality of the manuscript.
Publisher's note: The publisher wishes to inform readers that the article “Solar power generation
forecasting using ensemble approach based on deep learning and statistical methods” was originally
published by the previous publisher of Applied Computing and Informatics and the pagination of this article
has been subsequently changed. There has been no change to the content of the article. This change was
necessary for the journal to transition from the previous publisher to the new one. The publisher sincerely
apologises for any inconvenience caused. To access and cite this article, please use AlKandari, M., Ahmad, I.
(2019), “Solar power generation forecasting using ensemble approach based on deep learning and statistical
methods”, Applied Computing and Informatics. Vol. ahead-of-print No. ahead-of-print. https://10.1016/j.aci.2019.11.002.
The original publication date for this paper was 06/11/2019.
Applied Computing and Informatics, Vol. 20 No. 3/4, 2024, pp. 231-250, Emerald Publishing Limited, e-ISSN: 2210-8327, p-ISSN: 2634-1964, DOI 10.1016/j.aci.2019.11.002
1. Introduction
Photovoltaic (PV) technology has been one of the most common types of renewable energy
technologies being pursued to fulfil the increasing electricity demand, and decreasing the
amount of CO2 emission at the same time conserving fossil fuels and natural resources [1]. A
PV panel converts the solar radiation into electrical energy directly by semiconducting
materials. Use of a PV panel highly depends on solar radiation, ambient temperature, weather
and the geographic location of the PV panel. Furthermore, the electrical operators need to
balance the electrical consumption and generation in order to conserve and avoid the waste of
electrical energy produced from the renewable energy plants. In addition, they need to ensure
that there will be a sufficient capacity of power to cover the requirements of the consumers [2].
Thus, the industry of renewable energy is moving toward connecting solar energy power
plants with the electrical grid. However, this causes instability to the grid, making it the
greatest challenge to the industry. Hence, predicting power generation from renewable
energy farms in general and PV plants specifically is a necessary step for future development.
Thus, it is essential to be able to predict the power that PV plants will produce over
the next few days.
Solar power prediction is not an easy process because it largely depends on climate
conditions, which fluctuate over time. To overcome the above-mentioned challenges, it is
important to employ new and intelligent methods to obtain valid and accurate results. To
date, machine learning (ML) methods have received significant attention from many
researchers and developers in the solar power generation forecasting field [3–9], in addition to
other fields such as solving partial differential equations [10,11]. More details of the
characteristics and performance of ML techniques employed in solar PV generation
forecasting can be found in recent surveys [12,13]. Besides, ensemble learning is a promising
approach that has attracted the attention of many researchers in the past few years [14]. Notably, the
authors in [12] mentioned that diversity between the ensemble members creates a significant
benefit and improvement over the traditional models. In addition to ML models, statistical
methods are traditional mathematical formulas and techniques that have been used
historically for forecasting purposes [15]. Yang et al. [16] combined the forecasting results of
different statistical methods into more accurate prediction.
As mentioned earlier, numerous research papers have been published discussing different
ML techniques, ensemble methods to combine these ML models, and ensemble methods for
integrating different statistical methods. However, we could not find research that combines
ML models and statistical method, especially in renewable energy forecasting applications.
Therefore, in this research, we propose a hybrid model (MLSHM) that combines ML models
and statistical method to predict solar power generation of a PV plant for more efficient and
accurate results. As diversity is the main characteristic towards the success of the ensemble
approach, we explore two types of diversity: structural diversity since we combine ML
models and statistical model that have different structures, and data diversity by dividing the
training set among various ML models. Moreover, four different ensemble approaches are
explored to combine and aggregate the solar PV power predicted from ML models and
statistical method: simple averaging approach, weighted averaging using linear approach,
weighted averaging using non-linear approach, and a combination through variance using
the inverse approach.
To test and evaluate the proposed hybrid model, we implement various ML predicting
models: long short-term memory (LSTM) [17], gated recurrent unit (GRU) [18], and AutoEncoder
long short-term memory model (Auto-LSTM) [6]. Moreover, we propose a new ML model, called
AutoEncoder gated recurrent unit (Auto-GRU), which is similar to Auto-LSTM except it uses
GRU cells in the ML network. In addition, we design and implement a statistical method to
forecast the power of a PV farm, which is the Theta model that was proposed in [19]. This
statistical model is considered the most accurate and the simplest model compared with the
other statistical models in M3 Competition [15]. During the experiments, the proposed hybrid
model achieved better accuracy and outperformed other individual ML and statistical models.
Moreover, the experimental results demonstrated that a hybrid model combining both
ML methods and a statistical method achieved better accuracy than hybrid models that only
combine machine learning models. The proposed MLSHM and Auto-GRU can be generalized
to solve other time series problems such as financial markets, industrial markets, control
engineering and astronomy.
The remainder of this paper is structured as follows. Section 2 presents the problem
statement and the related work. Section 3 illustrates the methodology of ML models, the
proposed Auto-GRU model, statistical model, and the proposed hybrid model. Section 4
shows the experimental results and performance validation of the proposed hybrid model.
Finally, this research concludes in Section 5.
Figure 1 shows a simple representation of the solar PV power prediction system with n = 6
weather parameters.
Numerous research studies have introduced ML algorithms as forecasting models in
different applications related to the field of renewable energy. Several ML methodologies such
as support vector machine (SVM) [20], long short-term memory (LSTM) [21], and K-nearest
neighbor (K-NN) [22], have been applied to predict solar irradiance, which can be considered
as the first step towards solar PV power forecasting. A gradient boosted regression tree
model (GBRT) was used by Persson et al. in [3] to predict multi-site solar power
generation on a forecast horizon of one to six hours ahead. The GBRT model was mainly
designed for classification; however, it has been extended to regression. Furthermore, GBRT
is an ML model that combines the output of many small regression trees of fixed size to
generate a better result. Unlike conventional time series methods, their model has
no updating procedure or recursive version as new observations arrive, which is a
considerable limitation. The authors in [4] proposed a least absolute
shrinkage and selection operator (LASSO) based forecasting model for solar power
generation. The LASSO-based model assists in variable selection by minimizing the weights of
less important variables and maximizing the sparsity of the overall coefficient vector. They
compared the predicted solar power from their proposed algorithm with two representative
schemes, SVM and a time-series based method known as the TLLE method. The results showed
that the LASSO-based algorithm achieved more accurate forecasts of solar power than the
representative schemes while using less training data. As a supplement to the algorithm
proposed by Tang et al. [4], the authors in [5] integrated LASSO with LSTM as a forecasting
model for solar intensity prediction. Their proposed model attained better performance in
short-term solar intensity prediction than in long-term prediction.
Figure 1. Solar PV power prediction model.
Furthermore, power forecasting of solar power plants using AutoEncoder (AE) and LSTM
neural network was developed by [6]. They used the encoding side of the AE to realize the
most effective features for learning and attached it to the LSTM network. Accordingly, the
LSTM uses the learned encoding data as an input to predict the solar power generation of a
PV plant. The model showed powerful results compared to multilayer perceptrons (MLP),
LSTM, deep belief networks, and AE. Different from [6], the authors in [7] combined the AE
with LSTM into augmented long short-term memory (A-LSTM) forecasting model. The
algorithm was tested on different datasets, and it performed well on time series datasets. De
et al. developed an LSTM-based model to predict PV power using a limited dataset [8]. Although
the LSTM model predicted accurate results, it was shown that increasing the amount of data
and features will improve the performance of the LSTM model. Wang et al. proposed a GRU
based short-term PV power forecasting algorithm [9]. GRU model is used to reduce the long
training time compared to LSTM model as well as improve the accuracy of the output. They
concluded that GRU outperformed the traditional ML models such as SVM, autoregressive
integrated moving average model (ARIMA), LSTM, and back propagation neural network.
The authors in [23] surveyed state-of-the-art ML models used in different renewable
energy systems. Moreover, they identified and classified many ML models applied in
different applications in energy applications and explored various research in different
energy systems. Rana et al. presented a comprehensive comparison of various methods
that are used for prediction purposes [24]. They explored six different methods: Neural
Network (NN), SVM, K-NN, Multiple Linear Regression (MLR) and two persistence methods.
The results showed that an ensemble of NNs is the most accurate method compared
to the other prediction methods. Statistical approaches have not gathered the attention of
researchers as much as ML models, especially in solar PV power forecasting. As mentioned
earlier, Makridakis et al. showed that the Theta method was the most accurate and simplest
statistical method, performing particularly well in the M3 Competition [15].
The authors in [25] classified the ensemble methods into two categories: competitive
ensemble forecasting and cooperative ensemble forecasting. Ensemble learning is a
promising approach which has attracted a lot of attention in recent years. Ahmed et al. [14]
proposed three different ensemble approaches to predict day-ahead solar power generation,
namely:
(i) linear,
(ii) normal distribution, and
(iii) normal distribution with additional features.
They concluded that all the ensemble methods when combined together showed better
performance than the individual ML models. Gigoni et al. compared several ML forecasting
methodologies, e.g., K-NN, support vector regression (SVR), and quantile random forest, and
evaluated their prediction accuracy in the solar PV power application [26]. The experimental
results showed that aggregating the output of single prediction models surpassed all the ML
models explored in their research under any weather condition. Feng et al. grouped
the weather data using an hourly similarity-based approach and used the grouped data to train
a two-layer ML hybrid model to predict one-hour-ahead solar irradiance [27]. Their
results showed that the hybrid model performed better than any single ML model used in the
hybrid model. Another study by Koprinska et al. illustrated static and dynamic approaches to
combine the solar power predictions of NNs in an ensemble [28]. Their experiment showed clearly better
results for the ensemble approaches compared to bagging, boosting, random forest, and four
single prediction models (NN, SVM, K-NN, and a persistence model). Limited research can be
found on ensemble statistical models for superior performance and accuracy. The authors in
[16] explored eight ensemble techniques to combine the results of six best models from each
family of statistical models: SARIMA (36 models), ETS (30 models), MLP (1 model), STL
decomposition (2 models), TBATS (72 models) and the Theta model (1 model). Although
ensemble learning showed highly efficient performance and accurate results in many papers,
the results in [16] showed only a marginal enhancement over the best model, because
the models tested resulted in highly correlated errors. This leads us to emphasize that diversity
is the key to major enhancement of ensemble methods. The authors in [29] presented a
cluster-based approach applied to global solar radiation. Their approach then predicts the
horizontal global solar radiation using a combination of two ML models: SVM and ANN. The
results showed higher prediction accuracy compared to the conventional ANN and SVM.
In summary, we explored many studies in solar power forecasting that combine ML
models to enhance prediction accuracy. Substantially, all the research we explored confirmed
that diversity is essential for a powerful ensemble model.
Moreover, it was found that most ensemble-based studies used the conventional
ensemble-ML methods such as bagging, AdaBoost, and stacked generalization to apply diversity, train
the same ML models and aggregate the results into one complete model. However, this study
incorporates the predictions produced by ML models and the predictions produced by the
statistical model, which cannot be achieved by the conventional ML-based ensemble models.
Furthermore, we found no research that aggregates ML models and statistical models in solar PV power
forecasting. Hence, this study could be considered the first of its kind in solar power
forecasting, which adds value to this field of research.
3. Design methodology
In this section, proposed solutions to enhance and reinforce the performance of solar PV
power forecasting algorithms are presented.
3.1.2 GRU. The gated recurrent unit (GRU) is a special case of LSTM introduced by Cho et al.
[30] to reduce the long training time of LSTM. Compared to LSTM, GRU has fewer controlling
gates as it lacks an output gate. As shown in Figure 3, GRU is much simpler than LSTM since
it includes only two gates, the reset gate and update gate, that control the information flow
inside the units. The transition functions between neurons of GRU are given as follows:
$$r(t) = \sigma\left(w_r x(t) + u_r h(t-1) + b_r\right) \tag{8}$$
$$z(t) = \sigma\left(w_z x(t) + u_z h(t-1) + b_z\right) \tag{9}$$
$$\hat{h}(t) = \sigma\left(w_h x(t) + u_h \left(r(t) * h(t-1)\right) + b_h\right) \tag{10}$$
where $r(t)$ is the reset gate, $z(t)$ is the update gate, and $w$ and $u$ represent the parameter
matrices in GRU. Furthermore, $h(t)$, $\hat{h}(t)$, and $b$ are the output, candidate output, and bias,
respectively. The activation function is represented as $\sigma$.
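The gate computations in Eqs. (8)-(10) can be illustrated with a minimal NumPy sketch. It assumes that σ is the logistic sigmoid and that `*` denotes an element-wise product; the function name `gru_gates`, the `params` dictionary layout and the layer sizes are hypothetical choices made for illustration only. The final output update h(t) is not shown in this section, so the sketch stops at the candidate output.

```python
import numpy as np

def sigmoid(a):
    """Logistic function, assumed here to be the sigma of Eqs. (8)-(10)."""
    return 1.0 / (1.0 + np.exp(-a))

def gru_gates(x_t, h_prev, params):
    """Evaluate Eqs. (8)-(10) for one time step.

    `params` is a hypothetical dict holding the matrices w_*, u_* and biases b_*.
    """
    # Reset gate, Eq. (8)
    r = sigmoid(params["w_r"] @ x_t + params["u_r"] @ h_prev + params["b_r"])
    # Update gate, Eq. (9)
    z = sigmoid(params["w_z"] @ x_t + params["u_z"] @ h_prev + params["b_z"])
    # Candidate output, Eq. (10): the reset gate scales h(t-1) element-wise
    h_hat = sigmoid(params["w_h"] @ x_t + params["u_h"] @ (r * h_prev) + params["b_h"])
    return r, z, h_hat

rng = np.random.default_rng(0)
n_in, n_hidden = 6, 4  # illustrative sizes, not taken from the paper
params = {}
for gate in "rzh":
    params[f"w_{gate}"] = rng.normal(size=(n_hidden, n_in))
    params[f"u_{gate}"] = rng.normal(size=(n_hidden, n_hidden))
    params[f"b_{gate}"] = np.zeros(n_hidden)

r, z, h_hat = gru_gates(rng.normal(size=n_in), np.zeros(n_hidden), params)
```

Because both gates pass through a sigmoid, their entries lie strictly between 0 and 1, which is what lets them act as soft switches on the information flow.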
Figure 2. The structure of a long short-term memory cell.
Figure 3. The structure of a gated recurrent unit cell.
3.1.3 Auto-LSTM. The Auto-LSTM model proposed by [6] consists of two ML algorithms:
AutoEncoder (AE) [31] and LSTM. An AE is an unsupervised neural network where the input
and the output layers have the same size. An AE tries to learn the identity function so that the
input $x$ is approximately similar to the output $\hat{x}$ with some constraints applied to the
network, e.g., a limited number of neurons in the hidden layer compared to the input layer.
Therefore, an AE acts as a compressor and a decompressor consisting of two parts separated
by a bottleneck at the center:
(i) encoding side, where the neurons are reduced from the input layer to the hidden layer,
and
(ii) decoding side, where the layers in the encoding side are reflected as shown in Figure 4.
Thus, an AE is able to learn and discover the correlations in the input features and the special
structure of the data. Gensler et al. [6] have utilized the encoding side of the AE to realize the
feature extraction and attached it to the LSTM model. Hence, the LSTM model is trained and
fitted using the historical encoded weather data produced by the encoding side of the AE as
well as the corresponding solar PV power. The end result of this Auto-LSTM model is an ML
algorithm that is able to predict the solar PV power generated from a PV farm given the
meteorological data.
3.1.4 Auto-GRU. The proposed Auto-GRU model has similar characteristics to the
Auto-LSTM model presented in [6]. As explained earlier, the AE is used to shrink the
meteorological data by discovering the structure of the data, and the encoded data is attached to the
GRU network. The encoded meteorological data as well as the corresponding historical solar
PV power are fed to the GRU network to fit and train the neurons to predict the
desired output. Specifically, the Auto-GRU model is trained on a set of historical encoded
meteorological data and the corresponding PV power (encoded meteorological parameters,
PV power). Figure 5 represents the block diagram of the proposed Auto-GRU model. The
forecasting process using Auto-GRU can be summarized as follows:
1. The historical weather data is encoded using the encoding side of the AE.
2. The encoded weather data is split into a training set and a testing set. A small
percentage of training data is preserved for validation.
3. All the training, validating and testing sets are reorganized into chunks or windows
that represent the number of historical previous days (window size).
4. The proposed model is trained and at the same time validated using windows of
historical encoded weather data and the corresponding PV power.
5. The proposed model is tested on the testing set that was rearranged into chunks of
previous samples.
6. Finally, the proposed Auto-GRU model is a powerful ML model that is ready to predict
the solar PV power given a chunk of previous meteorological data.
Figure 4. An example of AutoEncoder topology.
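The windowing in step 3 can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `make_windows` is hypothetical, and pairing each window with the PV power at its last time step is an assumption, since the exact alignment is not stated in the text.

```python
import numpy as np

def make_windows(features, power, window_size):
    """Stack `window_size` consecutive rows of encoded features into one sample.

    Each window is paired with the PV power at the window's last time step
    (an assumed alignment; the text does not specify it).
    """
    features = np.asarray(features, dtype=float)
    power = np.asarray(power, dtype=float)
    n_windows = len(features) - window_size + 1
    X = np.stack([features[i:i + window_size] for i in range(n_windows)])
    y = power[window_size - 1:]
    return X, y

# Toy data: 5 time steps with 2 encoded weather features each
encoded = np.arange(10, dtype=float).reshape(5, 2)
pv_power = np.array([0.1, 0.2, 0.3, 0.4, 0.5])
X, y = make_windows(encoded, pv_power, window_size=2)
```

The resulting array has shape (windows, window size, features), which is the layout recurrent layers in common deep learning libraries typically expect.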
3.2 Statistical forecasting algorithm
Based on research by Makridakis et al. [15], we decided to use Theta model to represent the
statistical part of the study. This statistical model is considered to be the most accurate and
the simplest model compared to the other statistical models examined in M3 Competition [15].
Assimakopoulos et al. [19] have proposed a Theta model as a decomposition approach for
forecasting applications. This model is based on modifying the local curvature of the time
series data using a coefficient called Theta ($\theta$) that is applied to the second derivative of the
data as shown in Eq. (12).
$$X''_{new}(\theta) = \theta \cdot X''_{data}, \quad \text{where } X''_{data} = X_t - 2X_{t-1} + X_{t-2} \text{ at time } t. \tag{12}$$
The new time series lines are called theta lines and maintain the mean and the slope of the
original time series. Moreover, the deflation of the new time series curvature depends on the
value of the Theta coefficient: to identify the long-term behavior of the time series dataset,
the Theta coefficient is set between 0 and 1 (0 < θ < 1), whereas when θ > 1 the
new theta line is more dilated and the short-term trends are affected. The theta lines are then
extrapolated separately and combined to generate the forecasted solar PV power. The
authors in [19] decomposed the original time series into two theta lines by setting the Theta
coefficient to θ = 0 and θ = 2. The first line (L (θ = 0)) represents the linear regression line of
the original time series, magnifying the long-term trends. The second line (L (θ = 2)) doubles
the original curvature, magnifying the short-term trends. Here, the forecasting process is
accomplished by linearly extrapolating the first theta-line while extrapolating the second line
using simple exponential smoothing (SES). Afterward, the forecasted time series of the two
theta-lines are simply combined via equal weights, resulting in the final forecast of a specific
time series dataset.
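The two-line procedure described above can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the implementation used in the study: the θ = 2 line is built with the identity L(θ = 2) = 2X − L(θ = 0), which doubles the second differences while preserving the mean and slope; the smoothing parameter `alpha` is fixed by hand rather than optimized; and the function name `theta_forecast` is hypothetical.

```python
import numpy as np

def theta_forecast(series, horizon, alpha=0.5):
    """Forecast `horizon` steps ahead with a two-line Theta decomposition.

    alpha is the SES smoothing parameter (a hand-picked assumption here).
    """
    y = np.asarray(series, dtype=float)
    t = np.arange(len(y))

    # Theta line for theta = 0: the linear regression line of the series
    slope, intercept = np.polyfit(t, y, 1)
    line0 = intercept + slope * t

    # Theta line for theta = 2: doubles the curvature around the regression line
    line2 = 2.0 * y - line0

    # Simple exponential smoothing of the theta = 2 line; its forecast is flat
    level = line2[0]
    for value in line2[1:]:
        level = alpha * value + (1.0 - alpha) * level

    future_t = np.arange(len(y), len(y) + horizon)
    forecast0 = intercept + slope * future_t   # linear extrapolation
    forecast2 = np.full(horizon, level)        # SES extrapolation

    # Combine the two extrapolated lines with equal weights
    return 0.5 * (forecast0 + forecast2)

demo = theta_forecast([0.20, 0.25, 0.22, 0.30, 0.28, 0.35], horizon=3)
```

Equal weighting means the regression line contributes the long-term drift while the smoothed θ = 2 line anchors the forecast to the most recent level.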
Figure 5. Auto-GRU block diagram.
ML models and statistical model. Moreover, several ensemble methods were employed to
combine the predictions of different models and generate the final solar PV power
prediction. To boost the benefits of aggregating ML models with a statistical method,
enforcing diversity between the combined models is an essential procedure. In this research,
the diversity in ensemble methods falls into two categories:
(i) data diversity, and
(ii) structural diversity.
Data diversity is achieved by generating multiple datasets from the original dataset to train
the ML models [25,32]. Structural diversity is attained by having different architectures of the
prediction models [32]. In our study, we introduced data diversity within the ML models and
structural diversity by combining two differently structured algorithms, i.e., ML models and
statistical model. As a result, data diversity was applied to the combined ML models as
follows:
1. After the dataset was split into a training set and a testing set, the training set was
further divided into n training sub-sets, where in our study n = 2.
2. Each ML model is trained on one of the training sub-sets; hence, we have n ML models
trained on different sets of data (two models in our study, since n = 2).
3. All the ML models are tested on the same testing set for comparison purposes to
ensure equality of results between the models.
Figure 6 represents the block diagram of the proposed MLSHM showing the data diversity
applied to the training set. There are n ML models and k statistical methods predicting the
solar PV power.
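The data-diversity steps above can be sketched as follows. The sketch assumes the training set is cut into contiguous, chronologically ordered sub-sets; the text does not state how the division is made, and the function name `split_training_set` is hypothetical.

```python
import numpy as np

def split_training_set(train_X, train_y, n_subsets=2):
    """Divide the training data into n contiguous sub-sets, one per ML model.

    Contiguous (chronological) chunks are an assumption of this sketch.
    """
    X_parts = np.array_split(np.asarray(train_X), n_subsets)
    y_parts = np.array_split(np.asarray(train_y), n_subsets)
    return list(zip(X_parts, y_parts))

# Toy training set: 10 samples with 2 features each
toy_X = np.arange(20).reshape(10, 2)
toy_y = np.arange(10)
subsets = split_training_set(toy_X, toy_y, n_subsets=2)
```

Each (features, target) pair in `subsets` would then be used to fit a different ML model, while all models are evaluated on the same held-out testing set.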
Figure 6. The block diagram of the proposed Machine Learning and Statistical Hybrid Model (MLSHM).
In the first stage, the ML models and the statistical models predicted the solar PV power
separately, and then we combined these results to get the final forecast. We explored four
different ensemble methods to test their effectiveness on the proposed MLSHM. The
ensemble methods are described as follows:
1. EN1: simple averaging approach, which is the simplest and the most natural method
that generates the final forecasted solar PV power by taking the mean value of the
forecasts produced by the ML models and statistical models. The final solar PV
power is generated as follows:
$$\hat{y} = \frac{1}{m} \sum_{j=1}^{m} \hat{y}_j \tag{13}$$
$$\hat{y} = \sum_{i=1}^{m} \hat{y}_i \cdot w_i \tag{15}$$
Here $w_i$ represents the weight given to model $i$, and $nMAE_i$ is the normalized Mean
Absolute Error of model $i$.
3. EN3: weighted averaging using non-linear approach, where it uses the same concept
of EN1 but the models’ weights are calculated as a softmax function of the negative of
its error (nMAE):
$$w_i = \frac{\exp(-nMAE_i)}{\sum_{j=1}^{m} \exp(-nMAE_j)} \tag{16}$$
Here exp denotes the exponential function. The final solar power forecast is calculated by
Eq. (15).
4. EN4: combination through variance using inverse approach, which is simply the
weighted averaging using the following weighting equation:
$$w_i = \frac{1 / nMAE_i}{\sum_{j=1}^{m} 1 / nMAE_j} \tag{17}$$
The final prediction of solar PV power is calculated by the weighted averaging as in Eq. (15).
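The combination rules whose equations appear in this section (EN1, EN3 and EN4) can be sketched with NumPy as below. EN2 is omitted because its weighting equation is not given here. Function names and the toy numbers are hypothetical; `preds` is assumed to hold one row of forecasts per ensemble member.

```python
import numpy as np

def simple_average(preds):
    """EN1, Eq. (13): mean of the member forecasts at each time step."""
    return np.mean(np.asarray(preds, dtype=float), axis=0)

def softmax_weights(nmae):
    """EN3, Eq. (16): softmax of the negative nMAE of each member."""
    e = np.exp(-np.asarray(nmae, dtype=float))
    return e / e.sum()

def inverse_weights(nmae):
    """EN4, Eq. (17): weights proportional to 1 / nMAE."""
    inv = 1.0 / np.asarray(nmae, dtype=float)
    return inv / inv.sum()

def weighted_average(preds, weights):
    """Eq. (15): weighted sum of the member forecasts."""
    return np.asarray(weights, dtype=float) @ np.asarray(preds, dtype=float)

# Two hypothetical members forecasting two time steps
preds = [[0.2, 0.4],
         [0.4, 0.8]]
nmae = [0.1, 0.2]  # illustrative errors, not results from the paper

en1 = simple_average(preds)
en3 = weighted_average(preds, softmax_weights(nmae))
en4 = weighted_average(preds, inverse_weights(nmae))
```

Both weighting schemes produce weights that sum to one and favor the member with the lower error. The inverse rule reacts more strongly to small error differences than the softmax rule when errors are small in absolute terms.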
4. Experimental results and discussions
In this section, we present a comprehensive experimental study to evaluate the proposed
methods in the solar PV power forecasting application. We compared a new ML method
(Auto-GRU) with other ML algorithms (GRU [30], LSTM [17], and Auto-LSTM [6]) and a statistical
method (Theta model [19]) in terms of accuracy. Four different ensemble methods were used
to build the MLSHM, and their accuracy is compared with that of the traditional methods.
The proposed algorithms were tested on different solar PV systems, ranging from a single PV
panel to large-scale PV farms.
data while the output layer represents the forecasted solar PV power. Hyperbolic Tangent
(tanh) was added as an activation function to introduce non-linearity in the network. The
models were fitted using efficient RMSProp optimization algorithm and the mean squared
error loss function. In addition, various hyper-parameters were used to fit ML models in order
to carry out several experimental scenarios. These experimental configurations were used to
fine tune the Auto-GRU model on the three datasets (Shagaya Poly-SI, Shagaya TFSC, and
Cocoa single Poly-SI). The experiments were designed with the following parameter
settings: a window size of 2, weight updates after 20 batches, and 50 epochs to
train the model.
In this research, we used two popular evaluation measures to compare the accuracy of the
predicted normalized power from the Auto-GRU model and MLSHM: the normalized
Mean Absolute Error (nMAE), shown in Eq. (18), and the normalized Mean Square Error (nMSE),
shown in Eq. (19). Here, $y$ is the normalized actual power, $\hat{y}$ is the normalized predicted
power, and $N$ is the number of time series samples.
$$nMAE = \frac{1}{N} \sum_{i=1}^{N} \left| y_i - \hat{y}_i \right| \tag{18}$$
$$nMSE = \frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2 \tag{19}$$
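Both measures are straightforward to compute. The sketch below assumes the actual and predicted power have already been normalized, as the definitions above state; the function names and the sample values are illustrative only.

```python
import numpy as np

def nmae(y, y_hat):
    """Eq. (18): mean absolute error between normalized actual and predicted power."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return np.mean(np.abs(y - y_hat))

def nmse(y, y_hat):
    """Eq. (19): mean squared error between normalized actual and predicted power."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return np.mean((y - y_hat) ** 2)

actual = [0.2, 0.4, 0.6]      # hypothetical normalized measurements
predicted = [0.1, 0.4, 0.8]   # hypothetical normalized forecasts
mae_value = nmae(actual, predicted)
mse_value = nmse(actual, predicted)
```

Since nMSE squares each deviation, it penalizes occasional large errors more heavily than nMAE, which is why the two measures are usually reported together.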
Figure 7. Hybrid model configuration.
Hence, Theta model consolidated the performance and the accuracy of the forecasted solar
PV power.
Table 2. Prediction performance on nMAE and nMSE of Shagaya Poly-SI dataset.
Approach            nMAE      nMSE
Theta Model         0.057     0.00695
LSTM                0.0536    0.0037
GRU                 0.0346    0.00243
Auto-LSTM           0.0806    0.00891
Auto-GRU            0.0526    0.00429
Auto-MLSHM EN1      0.044     0.00318
Auto-MLSHM EN2      0.0438    0.00316
Auto-MLSHM EN3      0.0438    0.00316
Auto-MLSHM EN4      0.0424    0.00305
T-MLSHM EN1         0.0341    0.00213
T-MLSHM EN2         0.0341    0.00214
T-MLSHM EN3         0.0341    0.00215
T-MLSHM EN4         0.0327    0.00197
Auto-MLHM EN1       0.045     0.00274
Auto-MLHM EN2       0.0447    0.00271
Auto-MLHM EN3       0.0447    0.00271
Auto-MLHM EN4       0.0423    0.00241
T-MLHM EN1          0.0514    0.00393
T-MLHM EN2          0.0513    0.00393
T-MLHM EN3          0.0513    0.00393
T-MLHM EN4          0.0499    0.00396

Table 3. Prediction performance on nMAE and nMSE of Shagaya TFSC dataset.
Approach            nMAE      nMSE
Theta Model         0.0574    0.00656
LSTM                0.0698    0.00577
GRU                 0.0358    0.0029
Auto-LSTM           0.0831    0.00961
Auto-GRU            0.0394    0.00251
Auto-MLSHM EN1      0.0411    0.00303
Auto-MLSHM EN2      0.0409    0.003
Auto-MLSHM EN3      0.0409    0.003
Auto-MLSHM EN4      0.0389    0.00265
T-MLSHM EN1         0.0354    0.00209
T-MLSHM EN2         0.0352    0.00207
T-MLSHM EN3         0.0352    0.00207
T-MLSHM EN4         0.0317    0.00185
Auto-MLHM EN1       0.0435    0.00332
Auto-MLHM EN2       0.0428    0.00324
Auto-MLHM EN3       0.0428    0.00325
Auto-MLHM EN4       0.0356    0.00241
T-MLHM EN1          0.0374    0.00194
T-MLHM EN2          0.0373    0.00194
T-MLHM EN3          0.0373    0.00194
T-MLHM EN4          0.0368    0.00197
Table 4. Prediction
nMAE 0.207 0.0739 0.0866 0.0778 0.0952 0.112 0.109 0.109 0.0948 0.166 0.103 0.103 0.0877 0.0839 0.0838 0.0838 0.0828 0.0794 0.0794 0.0794 0.0794
nMSE 0.0556 0.0176 0.0193 0.0184 0.019 0.0191 0.0185 0.0186 0.0169 0.0187 0.0181 0.0182 0.0168 0.0179 0.0179 0.0179 0.0177 0.0185 0.0185 0.0185 0.0185
Figure 8. Actual versus predicted solar PV power of the three datasets.
We can summarize the results obtained during this research as follows:
(i) Hybrid models outperform all tested traditional ML models and the Theta statistical
model.
(ii) Hybrid models with Theta model (MLSHM) obtained better accuracy than hybrid
models without the Theta statistical model (MLHM).
(iii) All ensemble methods (EN1, EN2, EN3, and EN4) achieved better accuracy than any
single ML algorithm and theta model.
(iv) Almost all ensemble methods achieved similar accuracies; however, EN4 was found to
be the most accurate ensemble method used in this research.
(v) GRU model is the most accurate ML model followed by Auto-GRU, LSTM, and
Auto-LSTM.
(vi) ML models performed better than theta statistical model in predicting solar power
generation.
5. Conclusions
Integrating large-scale PV plants into the power grid poses considerable
challenges to the electric operators, as it causes instability in the electric grid and requires the
operators to balance electrical consumption and power generation in order to
avoid waste of energy. Therefore, an accurate solar power forecast is a fundamental
requirement toward the future of renewable energy plants. In this research, we proposed a
hybrid model (MLSHM) that combines the prediction results of both ML models and
statistical method. For our study we developed a new ML model, Auto-GRU, that learns from
historical time series data to predict the desired solar PV power. In order to boost the hybrid
model, two diversity techniques were conducted in this study, i.e., structural diversity
between the ensemble members and data diversity between the training sets of the ML
models. Four different combination methods were used to combine the predictions of the ML
models and the statistical method. The proposed hybrid model and Auto-GRU model were tested on
two real-time series datasets of solar PV power and weather data collected from Shagaya
located in Kuwait [33] and Cocoa, Florida, USA [34]. The experiments allow us to conclude
that a hybrid model combining the predictions of ML models and a statistical method obtains
higher accuracy than a hybrid model combining the predictions of ML models without a
statistical method. Our future work is to test and validate the hybrid model on other ML
models and statistical method. Moreover, we will try other diversity techniques such as
dividing the training set by parameters as well as testing both data and parameters diversity.
In addition, we plan to develop other ensemble techniques to boost the accuracy of the
prediction.
References
[1] O. Publishing, Trends in photovoltaic applications 2018, Tech. rep., International Energy
Agency, 2018.
[2] K.S. Perera, Z. Aung, W.L. Woon, Machine learning techniques for supporting renewable energy
generation and integration: a survey, in: International Workshop on Data Analytics for
Renewable Energy Integration, 2014, pp. 81–96.
[3] C. Persson, P. Bacher, T. Shiga, H. Madsen, Multi-site solar power forecasting using gradient
boosted regression trees, Sol. Energy 150 (2017) 423–436.
[4] N. Tang, S. Mao, Y. Wang, R. Nelms, Solar power generation forecasting with a lasso-based
approach, IEEE Internet Things J. 5 (2018) 1090–1099.
[5] Y. Wang, Y. Shen, S. Mao, X. Chen, H. Zou, Lasso and lstm integrated temporal model for short-
term solar intensity forecasting, IEEE Internet Things J. 6 (2) (2018) 2933–2944.
[6] A. Gensler, J. Henze, B. Sick, N. Raabe, Deep learning for solar power forecasting—an approach
using autoencoder and lstm neural networks, in: Systems, Man, and Cybernetics (SMC), 2016
IEEE International Conference on, IEEE, 2016, pp. 002858–002865.
[7] D. Hsu, Time series forecasting based on augmented long short-term memory, 2017. arXiv
preprint arXiv:1707.00666.
[8] V. De, T. Teo, W. Woo, T. Logenthiran, Photovoltaic power forecasting using lstm on limited
dataset, 2018 IEEE Innovative Smart Grid Technologies-Asia (ISGT Asia), IEEE (2018) 710–715.
[9] Y. Wang, W. Liao, Y. Chang, Gated recurrent unit network-based short-term photovoltaic
forecasting, Energies 11 (8) (2018) 2163–2177.
[10] C. Anitescu, E. Atroshchenko, N. Alajlan, T. Rabczuk, Artificial neural network methods for the
solution of second order boundary value problems, Comput. Mater. Continua 59 (1) (2019)
345–359.
[11] H. Guo, X. Zhuang, T. Rabczuk, A deep collocation method for the bending analysis of kirchhoff
plate, Comput. Mater. Continua 59 (2) (2019) 433–456.
[12] S. Sobri, S. Koohi-Kamali, N.A. Rahim, Solar photovoltaic generation forecasting methods: a
review, Energy Convers. Manage. 156 (2018) 459–497.
[13] D. Yang, J. Kleissl, C.A. Gueymard, H.T. Pedro, C.F. Coimbra, History and trends in solar
irradiance and pv power forecasting: a preliminary assessment and review using text mining, Sol.
Energy 168 (2018) 60–101.
[14] A. Ahmed Mohammed, Z. Aung, Ensemble learning approach for probabilistic forecasting of
solar power generation, Energies 9 (12) (2016) 1017–1034.
[15] S. Makridakis, E. Spiliotis, V. Assimakopoulos, Statistical and machine learning forecasting
methods: Concerns and ways forward, PloS One 13 (3) (2018) e0194889.
[16] D. Yang, Z. Dong, Operational photovoltaics power forecasting using seasonal time series
ensemble, Sol. Energy 166 (2018) 529–541.
[17] S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural Comput. 9 (8) (1997) 1735–1780.
[18] K. Yao, T. Cohn, K. Vylomova, K. Duh, C. Dyer, Depth-gated recurrent neural networks, 2015.
arXiv preprint arXiv:1508.03790 9.
[19] V. Assimakopoulos, K. Nikolopoulos, The theta model: a decomposition approach to forecasting,
Int. J. Forecasting 16 (4) (2000) 521–530.
[20] H.S. Jang, K.Y. Bae, H.-S. Park, D.K. Sung, Solar power prediction based on satellite images and
support vector machine, IEEE Trans. Sustain. Energy 7 (3) (2016) 1255–1263.
[21] A. Alzahrani, P. Shamsi, M. Ferdowsi, C. Dagli, Solar irradiance forecasting using deep recurrent
neural networks, Renewable Energy Research and Applications (ICRERA), 2017 IEEE 6th
International Conference on, IEEE (2017) 988–994.
[22] F. Jawaid, K. NazirJunejo, Predicting daily mean solar power using machine learning regression
techniques, in: 2016 Sixth International Conference on Innovative Computing Technology
(INTECH), 2016, pp. 355–360.
[23] A. Mosavi, M. Salimi, S. Faizollahzadeh Ardabili, T. Rabczuk, S. Shamshirband, A.R. Varkonyi-
Koczy, State of the art of machine learning models in energy systems, a systematic review,
Energies 12 (7) (2019) 1301.
[24] M. Rana, A. Rahman, L. Liyanage, M.N. Uddin, Comparison and sensitivity analysis of methods
for solar pv power prediction, in: Pacific-Asia Conference on Knowledge Discovery and Data
Mining, Springer, 2018, pp. 333–344.
[25] Y. Ren, P. Suganthan, N. Srikanth, Ensemble methods for wind and solar power forecasting—a
state-of-the-art review, Renew. Sustain. Energy Rev. 50 (2015) 82–91.
[26] L. Gigoni, A. Betti, E. Crisostomi, A. Franco, M. Tucci, F. Bizzarri, D. Mucci, Day-ahead hourly
forecasting of power generation from photovoltaic plants, IEEE Trans. Sustain. Energy 9 (2)
(2018) 831–842.
[27] C. Feng, J. Zhang, Hourly-similarity based solar forecasting using multi-model machine learning
blending, 2018. arXiv preprint arXiv:1803.03623.
[28] Z.W.I. Koprinska, I. Koprinska, A. Troncoso, F. Martínez-Álvarez, Static and dynamic ensembles
of neural networks for solar power forecasting, in: 2018 International Joint Conference on Neural
Networks (IJCNN), IEEE, 2018, pp. 1–8.
[29] M. Torabi, A. Mosavi, P. Ozturk, A. Varkonyi-Koczy, V. Istvan, A hybrid machine learning
approach for daily prediction of solar radiation, in: International Conference on Global Research
and Education, Springer, 2018, pp. 266–274.
[30] K. Cho, B. Van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, Y. Bengio,
Learning phrase representations using rnn encoder-decoder for statistical machine translation,
2014. arXiv preprint arXiv:1406.1078.
[31] A. Ng, Sparse autoencoder, CS294A Lecture Notes 72 (2011) 1–19.
[32] Y. Ren, L. Zhang, P.N. Suganthan, Ensemble classification and regression-recent developments,
applications and future directions, IEEE Comp. Int. Mag. 11 (1) (2016) 41–53.
[33] K.I. for Scientific Research, http://www.kisr.edu.kw/en/, accessed: 2 October 2018 (2018).
[34] B. Marion, A. Anderberg, C. Deline, J. del Cueto, M. Muller, Perrin, et al., New data set for
validating pv module performance models, in: Photovoltaic Specialist Conference (PVSC), 2014
IEEE 40th, IEEE, 2014, pp. 1362–1366.
[35] F. Chollet, A. Yee, R. Prokofyev, Keras: Deep learning for humans, 2015.
https://github.com/keras-team/keras.
[36] M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, Dean, et al., Tensorflow: a system for large-scale
machine learning, 2016. Retrieved from http://tensorflow.org/.
Corresponding author
Mariam AlKandari can be contacted at: [email protected]