Machine Learning Model To Estimate The Photodegradation Performance of Stannates Photocatalysts.
ARTICLE INFO

Keywords: Photocatalytic degradation; Machine learning; Molecular fingerprint; Random Forest; KNN

ABSTRACT

In this work, a comprehensive machine learning (ML) methodology was used to predict the degradation efficiency of different stannate and hydroxystannate photocatalysts on a wide range of waterborne pollutants. Structural and atomic features, along with molecular fingerprints (MF), were used as descriptors of the crystalline phase of the photocatalysts and of the organic compounds, respectively. The encoded features of the photocatalysts and contaminants, along with the experimental variables of the degradation process, were input to two ML models, namely RF (random forest) and KNN (K nearest neighbor). The RF model achieved a very good prediction of the photocatalytic degradation efficiency (%) by different photocatalysts over a wide range of organic contaminants. The RF model performance was investigated by applying two different training strategies. The effects of different factors on photocatalytic degradation performance were further evaluated by feature importance analyses. Two illustrative applications of the ML model, for optimal photocatalyst selection and for assessing other types of photocatalysts for different environmental applications, were provided.
* Corresponding author.
E-mail address: [email protected] (F. Djani).
https://doi.org/10.1016/j.comptc.2024.115003
Received 3 April 2024; Received in revised form 8 November 2024; Accepted 24 November 2024
Available online 2 December 2024
2210-271X/© 2024 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
A. Soltani et al. Computational and Theoretical Chemistry 1244 (2025) 115003
effective operation.
All machine learning models were built using the Weka 3.8.6 machine learning software. The study was performed in four stages. First, a thorough literature review of 31 articles allowed us to build a comprehensive dataset of 597 data points containing photocatalyst type, contaminant type, initial contaminant concentration, photocatalyst concentration, specific surface area (SSA), light type and pH (Supporting information). Degradation efficiency data were extracted from C(t)/C0 = f(t) or degradation efficiency (D.E (%)) = f(t) plots using WebPlotDigitizer 4.6 software. During the second stage, contaminant canonical SMILES were gathered from PubChem, and the Python package RDKit was then used to convert the SMILES to a binary digit vector, where "0" or "1" represents, respectively, the absence or presence of a particular substructure. Hybrid features for the photocatalysts, including crystal and composition-based features, were generated. In the third stage, a preprocessing step comprising outlier detection and removal and data normalization was performed using Weka built-in filters. Finally, two ML algorithms, namely random forest (RF) and K nearest neighbor (KNN), were applied with default hyperparameters, and the prediction results were gathered and visualized. The algorithm with the highest R2 and lowest MAE and RMSE scores was selected for further performance analysis using another training methodology, i.e., training with respect to each photocatalyst subset only. The RF model was interpreted using Shapley Additive explanations (SHAP) for each of the features. Evaluation metrics of the different prediction tasks were expressed in terms of R2, MAE and RMSE as follows:

$$R^2 = \frac{\left[\sum_{i=1}^{n} (o_i - \bar{o})(p_i - \bar{p})\right]^2}{\sum_{i=1}^{n} (o_i - \bar{o})^2 \, \sum_{i=1}^{n} (p_i - \bar{p})^2} \tag{1}$$

$$\mathrm{MAE} = \frac{\sum_{i=1}^{n} |o_i - p_i|}{n} \tag{2}$$

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (o_i - p_i)^2}{n}} \tag{3}$$

where n is the total number of samples, $o_i$ and $p_i$ represent the observed and the predicted degradation efficiencies, respectively, and $\bar{p}$ and $\bar{o}$ are the averages of the predicted and the measured degradation efficiency values, respectively.

3. Results

3.1. Selection of molecular descriptors

Fig. 1. The RMSE versus generation of evolution related to subset of features.

dosage, Contaminant dosage, pH, SSA, Light type, irradiation time, lattice parameter c, average of ionic radii of the elements constituting the photocatalyst material, average of ionization energies of the elements constituting the photocatalyst material, and ionization energy of the A element, along with 6 molecular fingerprints of the contaminants.

To select the ML model for further performance analysis using the two different strategies throughout the manuscript, the whole dataset was first randomly divided into n = 2, 3 and 597 subsets, with (n − 1) subsets being used for model training and the remaining subset being used for testing, in rotation. After each subset validation procedure, the evaluation metrics (R2, MAE, RMSE) were calculated, and the final testing score was expressed as the average of each of the three evaluation metrics over the entire validation process. The three-fold cross-validation method is a widely recognized re-sampling technique and can mitigate the error of the model prediction. The prediction performance, in terms of coefficient of determination (R2), mean absolute error (MAE) and root mean square error (RMSE), of the two ML models on the three testing subgroups and in the leave-one-out cross-validation (LOOCV), with and without taking into account processing and morphological parameters, is listed in Table 2.

Fig. 2. Scatter plot describing predicted vs experimentally measured values of D.E (%). (a) RF. (b) KNN.

The scatter plots in Fig. 2 summarize the predicted vs experimentally measured values of D.E (%) for the two ML models. The more the points
that lie along the diagonal 1:1 line, the closer the model is to perfection. The highest correlation coefficient R2 in all cross-validation folds was achieved by the random forest algorithm, which achieved an absolute error of only 0.09 and 0.07 in the three-fold and the leave-one-out cross-validation, respectively, as visualized by the clustering of most of the points close to the diagonal 1:1 line. The error decreased as the number of folds rose. This is because, when the number of folds rises, the size of the training sample likewise increases, providing the model with more opportunities to capture new instances. The RF model demonstrated good prediction performance, considering the complex interactions at play and the relatively small amount of data used for training.

The K nearest neighbor (KNN) algorithm achieved less accurate predictions than random forest, exhibiting considerably larger errors and lower correlation coefficients, reflected in the spread of the points away from the 1:1 regression diagonal. Random forest (RF) typically outperforms KNN in situations with many features and a large number of training examples. KNN, in contrast, is better suited to low-dimensional data and can be sensitive to the choice of distance metric [50].

Additional investigations were performed to examine the accuracy of the RF model in predicting the extent of photocatalytic degradation across various photocatalysts and pollutants. The first investigation examined the RF model using the data acquired for several photocatalysts, with the three-fold cross-validation approach as specified. The second set of analyses compared the prior findings of RF training with those obtained by training on particular subgroups of photocatalysts only. To this end, the data were partitioned into subsets based on the photocatalyst, and each subset was then used to train and evaluate an RF model specific to that photocatalyst. The scatter plots in Fig. S1 (a, c, e, g, i, k, m) in the Supporting information show the predicted vs actual D.E values for each group of photocatalysts. These predictions resulted from training the RF model with all the data and testing with respect to each photocatalyst group separately. Fig. S1 (b, d, f, h, j, l, n) shows the prediction results for each group of photocatalysts acting on a variety of contaminants, trained and tested separately with their own data. Table 3 summarizes the performance of model prediction for the different photocatalysts with the most data points using these two training and testing procedures. The prediction performance metrics (R2, MAE, RMSE) of the ML model trained with the overall dataset (Fig. 2, Table 2) lay in between those obtained for the individual photocatalysts (Table 3).

Table 3
The performance of the RF model in the first (trained with all dataset and tested with respect to each photocatalyst) and the second (trained for individual photocatalyst) training strategy.

Photocatalyst   ZnSn(OH)6   ZnSnO3   SrSn(OH)6   SrSnO3   MgSn(OH)6   MgSnO3   CaSnO3

RF model trained with all dataset and tested with respect to each photocatalyst
R2              0.964       0.974    0.989       0.986    0.96        0.996    0.972
MAE             0.09        0.058    0.049       0.054    0.088       0.025    0.088
RMSE            0.11        0.073    0.065       0.072    0.104       0.03     0.109

RF model trained for individual photocatalyst
R2              0.932       0.941    0.928       0.879    0.933       0.962    0.932
MAE             0.079       0.057    0.089       0.098    0.084       0.055    0.089
RMSE            0.1         0.088    0.123       0.146    0.105       0.068    0.11

The Random Forest model trained using all available data exhibited superior predictive performance compared to training the model only with the data specific to a given photocatalyst. This might appear paradoxical at first glance, especially in this case where the different photocatalysts and contaminants share no direct connections. Still, machine learning algorithms exhibit enhanced efficiency in pattern recognition as the volume of data increases. Additionally, it is evident that the model trained without incorporating processing and morphological properties demonstrated superior performance compared to when these features were included. Furthermore, the significance of the data's quality and variety cannot be overstated; both are noticeably diminished in the second training approach. ML models are able to learn more patterns and associations when provided with a more extensive dataset. However, it is important to remember that larger datasets may also include irrelevant or misleading information, sometimes referred to as 'noise'. The impact of the data on the performance of the ML model is confirmed by the findings shown in Table 3. These findings emphasize the essentials of a data-driven approach, namely the significance of both the quantity and the variety of data.

Fig. 3a and b show the sensitivity of the model's performance, in terms of R2, MAE and RMSE, to the training data volume and to the diversity of the data, respectively. The RF model provides valuable insights into its learning process, demonstrating significant improvements in adapting to new patterns as the amount of data rises. Additionally, Fig. 3b outlines a specific feature of machine learning models: unlike physics-based models, they are able to learn from data with no direct relevance between them. The above discussion is further confirmed by several studies, especially in drug discovery [51,52].

Fig. 3. RF Model performance as a function of training data volume (a) and number of subgroups (b).

4. Performance of RF model for different types of contaminants

In addition, we conducted an analysis of the RF model performance
with respect to several categories of pollutants. The dataset used to train the ML model consisted of 8 distinct water pollutants, with the number of data points differing from one to another, as presented in Table 4. The majority of the results were relatively precise, with mean absolute error (MAE) values below 0.1. The previously discussed trend that an increasing amount of data reduces the prediction error was valid for six contaminant categories, namely methyl orange, dimethyl phthalate ester, diethyl phthalate ester, ciprofloxacin, toluene and Remazol golden yellow. However, two pollutants, namely methylene blue and rhodamine B, deviated from this trend. The complex structure of these two pollutants may be at the origin of this paradoxical outcome. This implies that it is essential to include a substantially larger quantity of data in order to effectively improve the ML model accuracy for these two contaminants.

Table 4
The performance of RF model in predicting D.E with respect to each contaminant subgroup.

Contaminant             No. of data points   R2      MAE     RMSE
MB                      97                   0.95    0.1     0.12
MO                      111                  0.88    0.12    0.15
RHB                     115                  0.97    0.098   0.11
DMP                     42                   0.994   0.024   0.03
DEP                     42                   0.997   0.024   0.029
CIP                     30                   0.968   0.07    0.085
Toluene                 33                   0.96    0.07    0.098
Remazol Golden Yellow   36                   0.99    0.048   0.058

As we can see, testing the RF model performance with respect to each contaminant and each photocatalyst (in the previous section) category has helped in identifying subsets that need additional enhancement. Consequently, it is essential to conduct a multi-scale performance test to analyze the strengths and weaknesses of the model in question.

4.1. Feature importance and model interpretability

The previous results indicated that the RF model trained with the whole dataset and tested with respect to each photocatalyst and contaminant achieved decent performance in predicting the photocatalytic removal efficiency of different photocatalysts over a wide range of contaminants. As the volume and diversity of the data being investigated increase, the ability of machine learning models to deduce patterns improves. However, in situations where it is important to understand how the model arrived at its decision, the challenge is to find a suitable interpretation of the model's outcome.

In the present study, the estimation of the Shapley Additive explanations (SHAP) value for each of the variables helps in assessing the contribution of a specific feature to the overall target value by making the prediction with and without the attribute.

The mean SHAP values of the six experimental variables are shown in Fig. 4a. The influence of the magnitude of each independent variable is demonstrated in Fig. 4b, in which each point is color coded, with the shift from blue to red indicating an increase in the magnitude of the variable.

Fig. 4. (a) The mean SHAP values of the six experimental variables (highlighting the relative impact of the experimental variable on the performance of the RF model). (b) The color-coded distribution of SHAP values (showing whether a certain variable has a positive or negative effect on ML model prediction).

Combining the information extracted from Fig. 4a and b, factors such as photocatalyst dosage, contaminant dosage and pH have almost the same influence, with photocatalyst concentration showing a slightly greater influence. This important ML-guided conclusion corroborates experimental findings [53,54], since dilute solutions are usually used in photocatalysis for several purposes, including improving mass transfer [55], reducing the recombination probability of photogenerated electron-hole pairs [56] and promoting light penetration [57].

Another important, though less straightforward, piece of information is that specific surface area (SSA) was not as significant as would be expected. In fact, other factors such as crystallinity also play an important role in photocatalysis, in some cases even more important than surface area [58–60]. This was well predicted by our model, further indicating its conformity with real experiments.

Ultimately, the developed RF model offers a remarkable understanding of how the unique properties of a photocatalyst might impact its activity. Based on the SHAP mean values shown in Fig. 4b, there is a positive correlation between a higher ionization energy of the metal in the A position and the catalytic activity of the material. This outcome can be understood from a chemical standpoint. Elements with elevated ionization energies tend to be more electronegative due to their strong retention of electrons. As a result, the electronic density around the metal in the B position is diminished, and the electrons become less tightly bound and are easily excited when exposed to light, which is crucial for better photocatalytic activity.

The clustering of the SHAP values for light type to the right for the blue data points (UV light) indicates that UV light positively affects the contaminant's photodegradation efficiency; this effect is lowered when shifting toward high feature values of light type (visible and simulated sunlight, labelled 2 and 3, respectively).

4.2. Application of the RF model in predicting the photocatalytic performance of a novel photocatalyst

One of the possible important applications of the developed RF model is to forecast the degradation efficiency of an unfamiliar photocatalyst, here cadmium stannate CdSnO3, which has not been included in our dataset. For this purpose, a testing set was prepared after retrieving the experimental variables adopted for the photocatalytic degradation of rhodamine B in the work of Liu et al. [61]. Table 5 summarizes the predicted values against the actual ones extracted from the ln (C(t)/C0) plot in the manuscript. Fig. 5 is a visualization of the RF predicted vs experimental values of the target D.E (%). The overall predicted D.E was only about 14 % less than the actual value (pred. 61 % vs actual 75 %), which is acceptable given the model's lack of experience of the novel photocatalyst (see Table 6).

Making predictions with the pre-trained RF model with the RhB
subgroup gave rise to an 18 % absolute error (pred. 57 % vs actual 75 %). This observation once again emphasizes the necessity of training the model with the largest and most diverse dataset possible. Regarding the RhB subgroup, it is noteworthy that the model predicted the degradation efficiency of the six photocatalysts while mostly preserving the original reactivity sequence: ZnSnO3 < CdSnO3 < MgSnO3 < CaSnO3 < SrSn(OH)6. Finally, the predicted D.E values may be used to derive the degradation rate constant (k) by applying kinetic models and selecting the one with the highest correlation coefficient R2, which can be regarded as an additional application of the model.

Fig. 5. (a) D.E (%) predictions vs. experimental values of the RhB degraded by CdSnO3; (b) reactivity order as predicted using the pre-trained RF using the first methodology.

4.3. Application of the RF model in predicting the D.E of novel contaminant

Crystal violet, also known as gentian violet, is another type of contaminant that has not been included in the dataset and for which we want to forecast the degradation efficiency. In this application, the procedure described in the methodology section was repeated for strontium hydroxystannate, and the experimental conditions were retrieved from the work of Xue et al. [62].

Surprisingly, the model was able to make a reasonably close guess for the novel contaminant, with a predicted degradation efficiency of 74 % against an actual value close to 100 %. Fig. 6(a and b) shows the ML predicted vs actual D.E for the CV contaminant by SrSn(OH)6 using the two strategies. Retraining the RF model with the dataset extended by the data from the crystal violet article gave an impressive prediction of 95 %.

Table 7
Summary of feature selection with the new set of 2D molecular descriptors.

No   Descriptor
01   Photocatalyst dosage
02   Contaminant dosage
03   pH
04   SSA
05   Light type
06   Irradiation time
07   Ia
08   EState_VSA5
09   EState_VSA9
10   fr_bicyclic
11   fr_methoxy
12   MaxAbsPartialCharge
13   MaxPartialCharge
14   MinAbsPartialCharge
15   MinPartialCharge
16   PEOE_VSA7
17   SlogP_VSA11
18   SMR_VSA5
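The rate-constant derivation mentioned at the end of Section 4.2 can be sketched as follows. This is a minimal illustration and not the authors' procedure: it assumes three common linearized kinetic models (zero-order, pseudo-first-order, pseudo-second-order), converts D.E (%) to C(t)/C0 = 1 − D.E/100, fits each model by ordinary least squares, and keeps the one with the highest R2. The function names and the synthetic data are hypothetical.

```python
import math

def linear_fit(x, y):
    """Least-squares line y = slope * x + intercept and squared Pearson R2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx, (sxy * sxy) / (sxx * syy)

def fit_kinetic_models(times, de_percent):
    """Fit linearized kinetic models to D.E (%) data; requires D.E < 100 %."""
    ratio = [1.0 - de / 100.0 for de in de_percent]  # C(t)/C0
    transformed = {
        # zero-order: 1 - C/C0 = (k / C0) * t
        "zero-order": [1.0 - r for r in ratio],
        # pseudo-first-order: ln(C0/C) = k * t
        "pseudo-first-order": [-math.log(r) for r in ratio],
        # pseudo-second-order: C0/C - 1 = (k * C0) * t
        "pseudo-second-order": [1.0 / r - 1.0 for r in ratio],
    }
    results = {name: linear_fit(times, y) for name, y in transformed.items()}
    best = max(results, key=lambda name: results[name][2])
    return best, results

# Hypothetical data generated from a first-order decay with k = 0.02 per minute.
times = [0, 10, 20, 30, 40, 50, 60]
de_percent = [100.0 * (1.0 - math.exp(-0.02 * t)) for t in times]
best, results = fit_kinetic_models(times, de_percent)
slope, intercept, r2 = results[best]
print(best, round(slope, 4), round(r2, 4))
```

For the zero-order and pseudo-second-order models the fitted slope is k/C0 and k·C0, respectively, so the initial concentration is needed to recover k in absolute units.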
Table 6
Comparative table of some similar predictive tasks from literature for the photocatalytic degradation of numerous organic pollutants.

Task performed   No. of data points   Best algorithm   Evaluation metrics (R2, MAE, RMSE)   Ref.

Fig. 6. (a) Predictions of the RF model trained with the whole dataset without the crystal violet (CV) data. (b) Predictions of the RF model trained with the whole dataset with the crystal violet data included.
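The three evaluation metrics used throughout (Eqs. (1)–(3)) can be computed with a few lines of code. The sketch below is illustrative only: it assumes that R2 is the squared Pearson correlation between observed and predicted values (the text refers to R2 as a correlation coefficient) and that D.E is expressed as a fraction between 0 and 1. The function name and the sample values are hypothetical and are not taken from the dataset.

```python
import math

def regression_metrics(observed, predicted):
    """Return (R2, MAE, RMSE) for paired observed/predicted values."""
    n = len(observed)
    o_mean = sum(observed) / n
    p_mean = sum(predicted) / n
    cov = sum((o - o_mean) * (p - p_mean) for o, p in zip(observed, predicted))
    ss_o = sum((o - o_mean) ** 2 for o in observed)
    ss_p = sum((p - p_mean) ** 2 for p in predicted)
    r2 = cov ** 2 / (ss_o * ss_p)  # squared Pearson correlation (assumption)
    mae = sum(abs(o - p) for o, p in zip(observed, predicted)) / n
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return r2, mae, rmse

# Hypothetical degradation efficiencies on a 0-1 scale.
observed = [0.75, 0.50, 0.90, 0.20]
predicted = [0.61, 0.55, 0.85, 0.30]
r2, mae, rmse = regression_metrics(observed, predicted)
print(round(r2, 3), round(mae, 3), round(rmse, 3))
```

Because R2 defined this way measures correlation only, a model with a constant offset can still score close to 1, which is why MAE and RMSE are reported alongside it.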
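The validation scheme described in the Results (random division of the 597 data points into n subsets, training on n − 1 of them and testing on the remaining one in rotation) can be sketched in plain Python. The study itself used Weka, so this is only an illustration of the splitting logic; the function name and the seed are assumptions. Setting the number of folds equal to the number of samples gives leave-one-out cross-validation.

```python
import random

def kfold_indices(n_samples, n_folds, seed=0):
    """Yield (train_indices, test_indices) for each fold of a shuffled split."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_folds nearly equal subsets.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test

# Three-fold split and leave-one-out split for a dataset of 597 points.
three_fold = list(kfold_indices(597, 3))
loocv = list(kfold_indices(597, 597))
print([len(test) for _, test in three_fold], len(loocv))
```

The final score for each scheme would then be the average of each metric over the folds, as described in the text.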
[4] E. Drakvik, et al., Statement on advancing the assessment of chemical mixtures and their risks for human health and the environment, Environ. Int. 134 (2020) 105267, https://doi.org/10.1016/j.envint.2019.105267.
[5] A.C. Bejarano, J.E. Adams, J. McDowell, T.F. Parkerton, M.L. Hanson, Recommendations for improving the reporting and communication of aquatic toxicity studies for oil spill planning, response, and environmental assessment, Aquat. Toxicol. 255 (2023) 106391, https://doi.org/10.1016/j.aquatox.2022.106391.
[6] H.A. Muhammed, A. Yahaya, S.S. Abdullahi, A.H. Jagaba, A.H. Birniwa, Mitigating water contamination by controlling anthropogenic activities of organochlorine pesticides (OCPs) for surface water quality assurance, Case Stud. Chem. Environ. Eng. 8 (2023) 100474, https://doi.org/10.1016/j.cscee.2023.100474.
[7] I.A. Saleh, N. Zouari, M.A. Al-Ghouti, Removal of pesticides from water and wastewater: chemical, physical and biological treatment approaches, Environ. Technol. Innov. 19 (2020) 101026, https://doi.org/10.1016/j.eti.2020.101026.
[8] S.F. Ahmed, et al., Recent developments in physical, biological, chemical, and hybrid treatment techniques for removing emerging contaminants from wastewater, J. Hazard. Mater. 416 (2021) 125912, https://doi.org/10.1016/j.jhazmat.2021.125912.
[9] R. Rashid, I. Shafiq, P. Akhter, M.J. Iqbal, M. Hussain, A state-of-the-art review on wastewater treatment techniques: the effectiveness of adsorption method, Environ. Sci. Pollut. Res. 28 (8) (2021) 9050–9066, https://doi.org/10.1007/s11356-021-12395-x.
[10] S. Wadhawan, A. Jain, J. Nayyar, S.K. Mehta, Role of nanomaterials as adsorbents in heavy metal ion removal from waste water: a review, J. Water Process Eng. 33 (2020) 101038, https://doi.org/10.1016/j.jwpe.2019.101038.
[11] A.K. Prajapati, S. Das, M.K. Mondal, Exhaustive studies on toxic Cr(VI) removal mechanism from aqueous solution using activated carbon of Aloe vera waste leaves, J. Mol. Liq. 307 (2020) 112956, https://doi.org/10.1016/j.molliq.2020.112956.
[12] A.K. Prajapati, M.K. Mondal, Hazardous As(III) removal using nanoporous activated carbon of waste garlic stem as adsorbent: kinetic and mass transfer mechanisms, Korean J. Chem. Eng. 36 (11) (2019) 1900–1914, https://doi.org/10.1007/s11814-019-0376-x.
[13] A. Chatterjee, S. Shamim, A.K. Jana, J.K. Basu, Insights into the competitive adsorption of pollutants on a mesoporous alumina–silica nano-sorbent synthesized from coal fly ash and a waste aluminium foil, RSC Adv. 10 (26) (2020) 15514–15522, https://doi.org/10.1039/D0RA01397H.
[14] U. Kumari, H. Siddiqi, M. Bal, B.C. Meikap, Calcium and zirconium modified acid activated alumina for adsorptive removal of fluoride: performance evaluation, kinetics, isotherm, characterization and industrial wastewater treatment, Adv. Powder Technol. 31 (5) (2020) 2045–2060, https://doi.org/10.1016/j.apt.2020.02.035.
[15] G.Y. Gor, P. Huber, N. Bernstein, Adsorption-induced deformation of nanoporous materials - a review, Appl. Phys. Rev. 4 (1) (2017), https://doi.org/10.1063/1.4975001.
[16] G. Saxena, R. Bharagava, Organic and inorganic pollutants in industrial wastes, in: Environmental Pollutants and Their Bioremediation Approaches, CRC Press, 2017, pp. 23–56, https://doi.org/10.1201/9781315173351-3.
[17] M. Ibrahim, A. Siddique, L. Verma, J. Singh, J.R. Koduru, Adsorptive removal of fluoride from aqueous solution by biogenic iron permeated activated carbon derived from sweet lime waste, Acta Chim. Slov. (2019) 123–136, https://doi.org/10.17344/acsi.2018.4717.
[18] S.O. Ganiyu, C.A. Martínez-Huitle, M.A. Oturan, Electrochemical advanced oxidation processes for wastewater treatment: advances in formation and detection of reactive species and mechanisms, Curr. Opin. Electrochem. 27 (2021) 100678, https://doi.org/10.1016/j.coelec.2020.100678.
[19] D. Ghernaout, N. Elboughdiri, S. Ghareba, Fenton technology for wastewater treatment: dares and trends, Oalib 07 (01) (2020) 1–26, https://doi.org/10.4236/oalib.1106045.
[20] Y. Guo, et al., Modelling of emerging contaminant removal during heterogeneous catalytic ozonation using chemical kinetic approaches, J. Hazard. Mater. 380 (2019) 120888, https://doi.org/10.1016/j.jhazmat.2019.120888.
[21] K.P. Gopinath, N.V. Madhav, A. Krishnan, R. Malolan, G. Rangarajan, Present applications of titanium dioxide for the photocatalytic removal of pollutants from water: a review, J. Environ. Manage. 270 (2020) 110906, https://doi.org/10.1016/j.jenvman.2020.110906.
[22] P. Shandilya, S. Sambyal, R. Sharma, P. Mandyal, B. Fang, Properties, optimized morphologies, and advanced strategies for photocatalytic applications of WO3 based photocatalysts, J. Hazard. Mater. 428 (2022) 128218, https://doi.org/10.1016/j.jhazmat.2022.128218.
[23] F. Güell, et al., ZnO-based nanomaterials approach for photocatalytic and sensing applications: recent progress and trends, Mater. Adv. 4 (17) (2023) 3685–3707, https://doi.org/10.1039/D3MA00227F.
[24] G. Ramanathan, K.R. Murali, Photocatalytic activity of SnO2 nanoparticles, J. Appl. Electrochem. 52 (5) (2022) 849–859, https://doi.org/10.1007/s10800-022-01676-z.
[25] O.V. Nkwachukwu, O.A. Arotiba, Perovskite oxide–based materials for photocatalytic and photoelectrocatalytic treatment of water, Front. Chem. 9 (2021), https://doi.org/10.3389/fchem.2021.634630.
[26] M. Paszkiewicz-Gawron, et al., Stannates, titanates and tantalates modified with carbon and graphene quantum dots for enhancement of visible-light photocatalytic activity, Appl. Surf. Sci. 541 (2021) 148425, https://doi.org/10.1016/j.
[27] J. Huang, et al., Size-controlled synthesis of porous ZnSnO3 cubes and their gas-sensing and photocatalysis properties, Sens. Actuators B Chem. 171–172 (2012) 572–579, https://doi.org/10.1016/j.snb.2012.05.036.
[28] W.-H. Pan, W.-J. Yang, C.-X. Wei, L.-Y. Hao, H.-D. Lu, W. Yang, Recent advances in zinc hydroxystannate-based flame retardant polymer blends, Polymers (Basel) 14 (11) (2022) 2175, https://doi.org/10.3390/polym14112175.
[29] N. Kumar, U. Jung, B. Jung, J. Park, M. Naushad, Zinc hydroxystannate/zinc-tin oxide heterojunctions for the UVC-assisted photocatalytic degradation of methyl orange and tetracycline, Environ. Pollut. 316 (2023) 120353, https://doi.org/10.1016/j.envpol.2022.120353.
[30] G. Gnanamoorthy, V.K. Yadav, D. Latha, V. Karthikeyan, V. Narayanan, Enhanced photocatalytic performance of ZnSnO3/rGO nanocomposite, Chem. Phys. Lett. 739 (2020), https://doi.org/10.1016/j.cplett.2019.137050.
[31] J. Joseph, S.B. Saseendran, S.R. Achary, A.A. Sukumaran, M.K. Jayaraj, Zinc stannate flakes for optoelectronic and antibacterial applications, 2019, p. 030026, doi: 10.1063/1.5112865.
[32] L.M.C. Honorio, M.V.B. Santos, E.C. da Silva Filho, J.A. Osajima, A.S. Maia, I.M.G. dos Santos, Alkaline earth stannates applied in photocatalysis: prospection and review of literature, Cerâmica 64 (372) (2018) 559–569, https://doi.org/10.1590/0366-69132018643722480.
[33] A.K. Ganguli, G.B. Kunde, W. Raza, S. Kumar, P. Yadav, Assessment of performance of photocatalytic nanostructured materials with varied morphology based on reaction conditions, Molecules 27 (22) (2022) 7778, https://doi.org/10.3390/molecules27227778.
[34] H. Tao, T. Wu, M. Aldeghi, T.C. Wu, A. Aspuru-Guzik, E. Kumacheva, Nanoparticle synthesis assisted by machine learning, Nat. Rev. Mater. 6 (8) (2021) 701–716, https://doi.org/10.1038/s41578-021-00337-5.
[35] G. Hautier, C. Fischer, V. Ehrlacher, A. Jain, G. Ceder, Data mined ionic substitutions for the discovery of new compounds, Inorg. Chem. 50 (2) (2011) 656–663, https://doi.org/10.1021/ic102031h.
[36] C.L. Phillips, G.A. Voth, Discovering crystals using shape matching and machine learning, Soft Matter 9 (35) (2013) 8552, https://doi.org/10.1039/c3sm51449h.
[37] P. Raccuglia, et al., Machine-learning-assisted materials discovery using failed experiments, Nature 533 (7601) (2016) 73–76, https://doi.org/10.1038/nature17439.
[38] B. Meredig, et al., Combinatorial screening for new materials in unconstrained composition space with machine learning, Phys. Rev. B 89 (9) (2014) 094104, https://doi.org/10.1103/PhysRevB.89.094104.
[39] G. Hautier, C.C. Fischer, A. Jain, T. Mueller, G. Ceder, Finding nature’s missing ternary oxide compounds using machine learning and density functional theory, Chem. Mater. 22 (12) (2010) 3762–3767, https://doi.org/10.1021/cm100795d.
[40] G.V.S.M. Carrera, L.C. Branco, J. Aires-de-Sousa, C.A.M. Afonso, Exploration of quantitative structure–property relationships (QSPR) for the design of new guanidinium ionic liquids, Tetrahedron 64 (9) (2008) 2216–2224, https://doi.org/10.1016/j.tet.2007.12.021.
[41] D. Farrusseng, F. Clerc, C. Mirodatos, R. Rakotomalala, Virtual screening of materials using neuro-genetic approach: concepts and implementation, Comput. Mater. Sci. 45 (1) (2009) 52–59, https://doi.org/10.1016/j.commatsci.2008.03.060.
[42] I. Salahshoori, M. Namayandeh Jorabchi, A. Baghban, H.A. Khonakdar, Integrative analysis of multi machine learning models for tetracycline photocatalytic degradation with MOFs in wastewater treatment, Chemosphere 350 (2024) 141010, https://doi.org/10.1016/j.chemosphere.2023.141010.
[43] Z.H. Jaffari, et al., Machine learning approaches to predict the photocatalytic performance of bismuth ferrite-based materials in the removal of malachite green, J. Hazard. Mater. 442 (2023) 130031, https://doi.org/10.1016/j.jhazmat.2022.130031.
[44] A. Esmaeili, et al., Pharmaceutical wastewater treatment using TiO2 nanosheets deposited by cobalt co-catalyst as hybrid photocatalysts: combined experimental study and artificial intelligence modeling, Chem. Prod. Process Model. 18 (4) (2023) 611–631, https://doi.org/10.1515/cppm-2022-0070.
[45] F.-S. Tabatabai-Yazdi, A. Ebrahimian Pirbazari, F. Esmaeili Khalil Saraei, N. Gilani, Construction of graphene based photocatalysts for photocatalytic degradation of organic pollutant and modeling using artificial intelligence techniques, Physica B Condens. Matter 608 (2021) 412869, https://doi.org/10.1016/j.physb.2021.412869.
[46] A. Esmaeili, et al., CdS nanocrystallites sensitized ZnO nanosheets for visible light induced sonophotocatalytic/photocatalytic degradation of tetracycline: from experimental results to a generalized model based on machine learning methods, Chemosphere 332 (2023) 138852, https://doi.org/10.1016/j.chemosphere.2023.138852.
[47] N. Esmaeili, et al., Estimation of 2,4-dichlorophenol photocatalytic removal using different artificial intelligence approaches, Chem. Prod. Process Model. 18 (2) (2023) 247–263, https://doi.org/10.1515/cppm-2021-0065.
[48] C.-M. Kim, Z.H. Jaffari, A. Abbas, M.F. Chowdhury, K.H. Cho, Machine learning analysis to interpret the effect of the photocatalytic reaction rate constant (k) of semiconductor-based photocatalysts on dye removal, J. Hazard. Mater. 465 (2024) 132995, https://doi.org/10.1016/j.jhazmat.2023.132995.
[49] Z. Jiang, J. Hu, M. Tong, A.C. Samia, H. (Judy) Zhang, X. (Bill) Yu, A novel machine learning model to predict the photo-degradation performance of different photocatalysts on a variety of water contaminants, Catalysts 11 (9) (2021) 1107, https://doi.org/10.3390/catal11091107.
[50] L. Chen, et al., Optimization and comparison of machine learning methods in
apsusc.2020.148425. estimation of carbon dioxide loading in chemical solvents for environmental
applications, J. Mol. Liq. 349 (2022) 118513, https://doi.org/10.1016/j.
molliq.2022.118513.
[51] T. Pereira, M. Abbasi, B. Ribeiro, J.P. Arrais, Diversity oriented Deep Reinforcement Learning for targeted molecule generation, J. Cheminform. 13 (1) (2021) 21, https://doi.org/10.1186/s13321-021-00498-z.
[52] T.B. Dunn, et al., Diversity and chemical library networks of large data sets, J. Chem. Inf. Model. 62 (9) (2022) 2186–2201, https://doi.org/10.1021/acs.jcim.1c01013.
[53] M. Pavel, C. Anastasescu, R.-N. State, A. Vasile, F. Papa, I. Balint, Photocatalytic degradation of organic and inorganic pollutants to harmless end products: assessment of practical application potential for water and air cleaning, Catalysts 13 (2) (2023) 380, https://doi.org/10.3390/catal13020380.
[54] N. Li, C. Wang, K. Zhang, H. Lv, M. Yuan, D.W. Bahnemann, Progress and prospects of photocatalytic conversion of low-concentration NO, Chin. J. Catal. 43 (9) (2022) 2363–2387, https://doi.org/10.1016/S1872-2067(22)64139-1.
[55] A.K. Prajapati, M.K. Mondal, Comprehensive kinetic and mass transfer modeling for methylene blue dye adsorption onto CuO nanoparticles loaded on nanoporous activated carbon prepared from waste coconut shell, J. Mol. Liq. 307 (2020) 112949, https://doi.org/10.1016/j.molliq.2020.112949.
[56] E. Dhanaraman, A. Verma, P. Chen, N. Chen, Y. Siddiqui, Y. Fu, Bi2WO6 incorporation of g-C3N4 to enhance the photocatalytic N2 reduction reaction and antibiotic pollutants removal, Sol. RRL (2024), https://doi.org/10.1002/solr.202300981.
[57] S. Wu, M. Li, L. Xin, H. Long, X. Gao, Simultaneously photocatalytic removal of Cr(VI) and metronidazole by asynchronous cross-linked modified sodium alginate, J. Ind. Eng. Chem. 116 (2022) 339–350, https://doi.org/10.1016/j.jiec.2022.09.024.
[58] H. Cheng, J. Wang, Y. Zhao, X. Han, Effect of phase composition, morphology, and specific surface area on the photocatalytic activity of TiO2 nanomaterials, RSC Adv. 4 (87) (2014) 47031–47038, https://doi.org/10.1039/C4RA05509H.
[59] K. Zhang, J. Wang, R. Ninakanti, S.W. Verbruggen, Solvothermal synthesis of mesoporous TiO2 with tunable surface area, crystal size and surface hydroxylation for efficient photocatalytic acetaldehyde degradation, Chem. Eng. J. 474 (2023) 145188, https://doi.org/10.1016/j.cej.2023.145188.
[60] Z. Liang, X. Zhuang, Z. Tang, Q. Deng, H. Li, W. Kang, High-crystalline polymeric carbon nitride flake composed porous nanotubes with significantly improved photocatalytic water splitting activity: the optimal balance between crystallinity and surface area, Chem. Eng. J. 432 (2022) 134388, https://doi.org/10.1016/j.cej.2021.134388.
[61] C. Liu, et al., Controlled synthesis and structure tunability of photocatalytically active mesoporous metal-based stannate nanostructures, Appl. Surf. Sci. 296 (2014) 53–60, https://doi.org/10.1016/j.apsusc.2014.01.030.
[62] Z. Xue, et al., Low temperature synthesis of SnSr(OH)6 nanoflowers and photocatalytic performance for organic pollutants, Int. J. Mater. Res. 113 (1) (2022) 80–90, https://doi.org/10.1515/ijmr-2021-8333.
[63] A.H. Navidpour, A. Hosseinzadeh, Z. Huang, D. Li, J.L. Zhou, Application of machine learning algorithms in predicting the photocatalytic degradation of perfluorooctanoic acid, Catal. Rev. (2022) 1–26, https://doi.org/10.1080/01614940.2022.2082650.