Ijee Hu Et Al 6 1 71 94
1. Introduction
2. Methods
Zixin Hu, Qiyang Ge, Shudi Li, & Momiao Xiong, International Journal of
Educational Excellence, (2020) Vol. 6, No. 1, 71-94. ISSN 2373-5929
DOI: 10.18562/IJEE.054
cases in Hubei Province before February 14, 2020 was adjusted by the formula:
the number of the confirmed cases = the number of the lab confirmed
cases .
The data included the total numbers of accumulated and new confirmed cases
in all of China, and the numbers of accumulated and new confirmed cases
across 31 provinces/cities in mainland China and three other regions
(Hong Kong, Macau and Taiwan) in China. The data were organized in a matrix
whose rows represented China as a whole and each province/city, and whose
columns represented the number of new confirmed cases on each day.
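As a rough illustration of this layout, the sketch below builds a small region-by-day matrix of new confirmed cases from cumulative counts. The region names and all numbers are made-up placeholders rather than data from the study, and `new_cases_matrix` is a hypothetical helper introduced only for this example.

```python
import numpy as np

def new_cases_matrix(cumulative):
    """Convert cumulative counts (regions x days) into daily new counts.

    The first column is kept unchanged, because the cumulative count on the
    first day equals the number of new cases recorded on that day.
    """
    cumulative = np.asarray(cumulative, dtype=float)
    first_day = cumulative[:, :1]
    daily = np.diff(cumulative, axis=1)
    return np.hstack([first_day, daily])

# Placeholder cumulative counts for two hypothetical regions over four days.
regions = ["Region A", "Region B"]
cumulative = np.array([
    [1, 3, 6, 10],
    [0, 2, 2, 5],
])

by_region = new_cases_matrix(cumulative)

# The first row holds the country-wide total, followed by one row per region.
matrix = np.vstack([by_region.sum(axis=0), by_region])
row_labels = ["All regions"] + regions
```

Each row of `matrix` is then one time series of daily new cases, and a cumulative series can be recovered with `np.cumsum(matrix, axis=1)`.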
The confirmed cases of each province/city formed a time series. The counts
were collected in a matrix in which each element was the number of new
confirmed cases of Covid-19 on a given day, starting with January 11, 2020,
in a given province/city.
, where . Let be
the normalized number of cases to forecast. If then set . The
loss function was defined as
,
where was the observed number of the cases in the forecasting day of
the segment time series and was its forecasted number of cases by the
MAE, and were weights. If was in the interval [1, 12], then . If
was in the interval [13, 24], then , etc. The backpropagation algorithm
was used to estimate the weights and biases of the MAE. The training process
was repeated five times, and the average of the five forecasts was taken as
the final forecasted number of accumulated confirmed cases for each
province/city.
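The weighting by blocks of 12 segments and the averaging over repeated training runs can be sketched as follows. The exact form of the loss and the weight values are not recoverable from the text above, so this sketch assumes a weighted squared error and uses placeholder weights; the function names and all numbers are hypothetical.

```python
import numpy as np

def block_weights(n_segments, weights_per_block, block_size=12):
    """Give each segment the weight of the block of 12 indices it falls in.

    Segments 1..12 get weights_per_block[0], segments 13..24 get
    weights_per_block[1], and so on.
    """
    idx = np.arange(n_segments)  # 0-based segment index
    return np.asarray(weights_per_block, dtype=float)[idx // block_size]

def weighted_squared_error(observed, forecast, weights):
    """Weighted sum of squared forecasting errors (an assumed loss form)."""
    observed = np.asarray(observed, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.sum(weights * (observed - forecast) ** 2))

def average_forecast(run_forecasts):
    """Average the forecasts produced by several independent training runs."""
    return np.mean(np.asarray(run_forecasts, dtype=float), axis=0)

# Placeholder setup: 24 segments in two blocks with hypothetical weights.
w = block_weights(24, [1.0, 2.0])
observed = np.ones(24)
forecast = np.full(24, 0.5)
loss = weighted_squared_error(observed, forecast, w)

# Five hypothetical training runs, each forecasting three days.
runs = [[10, 12, 14], [11, 12, 13], [9, 12, 15], [10, 13, 14], [10, 11, 14]]
final = average_forecast(runs)
```

Averaging over runs is a simple way to reduce the dependence of the forecast on the random initialization of a single training run.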
2.4. Clustering
The values of the latent variables in the second latent layer of the MAE
were extracted for each province/city. For each province/city, a latent
matrix was formed, and its largest singular value was obtained via singular
value decomposition. We trained the model five times and obtained five
largest singular values. For each province/city, we formed a feature vector
consisting of the five largest singular values, the starting day and the
forecasted end day of the Covid-19 outbreak, the day on which the number of
new confirmed cases reached its maximum, the largest forecasted number of
new confirmed cases, and the forecasted number of accumulated confirmed
cases of Covid-19 in the respective province/city. The k-means algorithm
was applied to the feature vectors to group the provinces/cities into
clusters.
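A minimal sketch of this feature construction and clustering step is given below, using only NumPy. The latent matrices are synthetic placeholders, and the column standardization, the farthest-point initialization and the choice of two clusters are assumptions made for the toy example, not details taken from the study.

```python
import numpy as np

def largest_singular_value(latent):
    """Largest singular value of a latent matrix."""
    s = np.linalg.svd(np.asarray(latent, dtype=float), compute_uv=False)
    return float(s[0])  # singular values are returned in descending order

def kmeans(x, k, n_iter=100):
    """Plain Lloyd's k-means with a deterministic farthest-point start."""
    x = np.asarray(x, dtype=float)
    centers = [x[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(x - c, axis=1) for c in centers], axis=0)
        centers.append(x[int(d.argmax())])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            x[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

rng = np.random.default_rng(42)
features = []
for region in range(6):
    scale = 1.0 if region < 3 else 50.0  # two artificial groups of regions
    # Five hypothetical training runs, each yielding one latent matrix.
    sv = [
        largest_singular_value(
            scale * (np.ones((8, 4)) + 0.1 * rng.normal(size=(8, 4)))
        )
        for _ in range(5)
    ]
    # Placeholder values: start day, end day, peak day, peak new cases, total.
    other = [0, 60, 20, 10 * scale, 100 * scale]
    features.append(sv + other)
features = np.array(features)

# Standardize columns so no single feature dominates (our choice).
std = features.std(axis=0)
scaled = (features - features.mean(axis=0)) / np.where(std > 0, std, 1.0)
labels, centers = kmeans(scaled, k=2)
```

In practice a library implementation such as scikit-learn's `KMeans` would normally be preferred; the hand-written loop is used here only to keep the example self-contained.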
5. Results
Figure 2 plots the reported and forecasted curves of the cumulative and new
confirmed cases of Covid-19 in China as a function of days. The reported
cases covered January 11, 2020 to February 27, 2020. A total of 47 days of
data were available. We began to forecast on February 20, 2020. Figure 2
shows that the forecasted curve was close to the reported curve. The number
of new confirmed cases peaked at 5,236 on February 5, 2020 and was
forecasted to decrease to zero on April 20, 2020. The cumulative number of
confirmed cases of Covid-19 in China was forecasted to reach a plateau of
83,401 on April 20, 2020. Figure S1 plots the national reported and fitted
curves of the
cumulative confirmed cases in China from January 20, 2020 to February 27,
2020. To further examine forecasting accuracy, Table 1 reports the errors
obtained when the data from January 20, 2020 to February 27, 2020 were used
to fit the MAE model. The table lists the 1-step to 10-step forecasting
errors, i.e., the errors of using the currently reported cases to forecast
the cases s days ahead. Table 1 shows that, owing to fluctuations in the
data, the average errors did not strictly increase with the number of
forecasting steps. The average absolute errors ranged from 0.73% to 2.27%
across the ten forecasting horizons.
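The error measure can be illustrated with the reported and 1-step forecasted cumulative counts listed in Table 1. The sketch assumes the signed relative error is computed as (forecast - reported) / reported, a convention that reproduces the 1-step errors shown in the table; the function names are hypothetical.

```python
def relative_error(actual, forecast):
    """Signed forecasting error as a percentage of the reported value."""
    return 100.0 * (forecast - actual) / actual

def average_absolute_error(errors):
    """Mean of the absolute values of a collection of errors."""
    errors = list(errors)
    return sum(abs(e) for e in errors) / len(errors)

# Reported and 1-step forecasted cumulative cases for dates listed in Table 1.
table = {
    "18/02/2020": (72528, 71757),
    "24/02/2020": (77262, 79837),
    "25/02/2020": (77780, 79155),
    "26/02/2020": (78191, 77957),
    "27/02/2020": (78630, 78646),
}

errors = {day: relative_error(a, f) for day, (a, f) in table.items()}
mean_abs = average_absolute_error(errors.values())

for day, err in errors.items():
    print(f"{day}: {err:+.2f}%")
```

The average computed over these five dates need not match the 1-step average in Table 1 if that average was taken over a different set of dates.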
Figure 2. The national reported and forecasted curves of the cumulative and new confirmed
cases of Covid-19 in China as a function of days from January 11, 2020 to April 20, 2020.
Figure S1. The national reported and fitted curves of the cumulative confirmed cases of
Covid-19 in China from January 11, 2020 to February 27, 2020, where the red curve is the
reported one and the green curve is the fitted one.
Figure 3. The forecasted curves of the cumulative confirmed cases of Covid-19 across 34
provinces/cities in China as a function of days from January 11, 2020 to April 20, 2020.
Figure 4. Clustering by features extracted from the MAE and the cumulative confirmed case
time series of Covid-19 grouped the 31 provinces/cities in mainland China and three other
regions in China into 9 clusters.
geographic area. They may have similar economic relationships with Wuhan and
similar healthcare resources, and may have taken similar interventions to
control the spread of Covid-19.
6. Discussion
Appendix: Table 1.
                    1-step      1-step   2-step   3-step   4-step   5-step   6-step   7-step   8-step   9-step   10-step
Date        Actual  prediction  error    error    error    error    error    error    error    error    error    error
18/02/2020  72,528  71,757      -1.06%
24/02/2020  77,262  79,837      3.33%    2.18%    0.83%    2.53%    2.46%    3.30%    1.08%
25/02/2020  77,780  79,155      1.77%    3.01%    2.14%    0.47%    2.28%    2.36%    3.15%    0.76%
26/02/2020  78,191  77,957      -0.30%   1.61%    2.70%    2.15%    0.28%    2.20%    2.49%    3.21%    0.80%
27/02/2020  78,630  78,646      0.02%    -0.53%   1.49%    2.47%    2.22%    0.21%    2.37%    2.46%    3.37%    0.73%
Average Absolute Error          1.07%    1.43%    1.46%    1.56%    1.62%    1.64%    2.27%    2.14%    2.08%    0.73%
© 2020 Hu, Ge, Li, & Xiong. International Journal of Educational Excellence, Universidad Ana G. Méndez
(UAGM). This is an Open Access article distributed under the terms of the Creative Commons Attribution
License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and
reproduction in any medium, provided the original work is properly credited.
International Journal of Educational Excellence
(2020) Vol. 6, No. 1, 71-94
ISSN 2373-5929
DOI: 10.18562/IJEE.054
a The School of Life Sciences, Fudan University, Shanghai (China), 0000-0002-2359-744X;
b Human Phenome Institute, Fudan University, Shanghai (China); c The School of
Mathematical Sciences, Fudan University, Shanghai (China); d The School of Public Health, The
University of Texas Health Science Center at Houston, Houston, TX 77030 (USA),
0000-0003-0635-5796. Correspondence: Dr. Momiao Xiong, Department of Biostatistics and
Data Science, School of Public Health, The University of Texas Health Science Center at
Houston, P.O. Box 20186, Houston, Texas 77225 (USA). [email protected]
1. Introduction
2. Methods
after February 14, 2020. The numbers of clinically confirmed cases and
laboratory-confirmed cases in Hubei Province (China) on February 14 were
15,384 and 36,602, respectively. Therefore, the number of confirmed cases in
Hubei Province before February 14, 2020 was adjusted by the formula: number
of confir-
2.4. Clustering
The values of the latent variables in the second latent layer of the MAE
were extracted for each province/city. For each province/city, a latent
matrix was formed, and its largest singular value was obtained via singular
value decomposition. We trained the model five times and obtained five
largest singular values. For each province/city, we formed a feature vector
consisting of the five largest singular values, the starting day and the
forecasted end day of the Covid-19 outbreak, the day on which the number of
new confirmed cases reached its maximum, the largest forecasted number of
new confirmed cases, and the forecasted number of accumulated confirmed
cases of Covid-19 in the respective province/city. The k-means algorithm
was applied to the feature vectors to group the provinces/cities into
clusters.
5. Results
Figure 2. The national reported and forecasted curves of the cumulative and new confirmed
cases of Covid-19 in China as a function of days from January 11, 2020 to April 20, 2020.
Figure S1. The national reported and fitted curves of the cumulative confirmed cases of
Covid-19 in China from January 11, 2020 to February 27, 2020, where the red curve is the
reported one and the green curve is the fitted one.
Figure 4. Clustering by features extracted from the MAE and the cumulative confirmed case
time series of Covid-19 grouped the 31 provinces/cities in mainland China and three other
regions in China into 9 clusters.
6. Discussion