networks to extract spatial and temporal features from the input jointly. Moreover, within narrow constraints or even the complete absence of spatial attributes, the representative ability of these networks would be seriously hindered.

To take full advantage of spatial features, some researchers use convolutional neural networks (CNN) to capture adjacent relations in the traffic network, along with employing recurrent neural networks (RNN) on the time axis. By combining a long short-term memory (LSTM) network [Hochreiter and Schmidhuber, 1997] and a 1-D CNN, Wu and Tan [2016] presented a feature-level fused architecture, CLTFP, for short-term traffic forecasting. Although it adopted a straightforward strategy, CLTFP still made the first attempt to align spatial and temporal regularities. Afterwards, Shi et al. [2015] proposed the convolutional LSTM, an extended fully-connected LSTM (FC-LSTM) with embedded convolutional layers. However, the normal convolutional operation applied restricts the model to processing grid structures (e.g. images, videos) rather than general domains. Meanwhile, recurrent networks for sequence learning require iterative training, which introduces error accumulation step by step. Additionally, RNN-based networks (including LSTM) are widely known to be difficult to train and computationally heavy.

To overcome these issues, we introduce several strategies to effectively model the temporal dynamics and spatial dependencies of traffic flow. To fully utilize spatial information, we model the traffic network as a general graph instead of treating it separately (e.g. as grids or segments). To handle the inherent deficiencies of recurrent networks, we employ a fully convolutional structure on the time axis. Above all, we propose a novel deep learning architecture, the spatio-temporal graph convolutional network, for traffic forecasting tasks. This architecture comprises several spatio-temporal convolutional blocks, which combine graph convolutional layers [Defferrard et al., 2016] and convolutional sequence learning layers, to model spatial and temporal dependencies. To the best of our knowledge, this is the first time that purely convolutional structures have been applied to extract spatio-temporal features simultaneously from graph-structured time series in a traffic study. We evaluate our proposed model on two real-world traffic datasets. Experiments show that our framework outperforms existing baselines in prediction tasks with multiple preset prediction lengths and network scales.

2 Preliminary

2.1 Traffic Prediction on Road Graphs
Traffic forecasting is a typical time-series prediction problem, i.e. predicting the most likely traffic measurements (e.g. speed or traffic flow) in the next $H$ time steps given the previous $M$ traffic observations, as
$$\hat{v}_{t+1}, \ldots, \hat{v}_{t+H} = \arg\max_{v_{t+1}, \ldots, v_{t+H}} \log P(v_{t+1}, \ldots, v_{t+H} \mid v_{t-M+1}, \ldots, v_t), \quad (1)$$
where $v_t \in \mathbb{R}^n$ is an observation vector of $n$ road segments at time step $t$, each element of which records the historical observation for a single road segment.

In this work, we define the traffic network on a graph and focus on structured traffic time series. The observation $v_t$ is not independent but linked by pairwise connections in the graph. Therefore, the data point $v_t$ can be regarded as a graph signal defined on an undirected graph (or a directed one) $\mathcal{G}$ with weights $w_{ij}$, as shown in Figure 1. At the $t$-th time step, in graph $\mathcal{G}_t = (\mathcal{V}_t, \mathcal{E}, W)$, $\mathcal{V}_t$ is a finite set of vertices, corresponding to the observations from $n$ monitoring stations in a traffic network; $\mathcal{E}$ is a set of edges, indicating the connectedness between stations; and $W \in \mathbb{R}^{n \times n}$ denotes the weighted adjacency matrix of $\mathcal{G}_t$.

Figure 1: Graph-structured traffic data. Each $v_t$ indicates a frame of the current traffic status at time step $t$, which is recorded in a graph-structured data matrix.
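In practice, the formulation of Eq. (1) amounts to sliding a window of $M$ past frames over the series and pairing it with the following $H$ frames. A minimal NumPy sketch of this windowing, assuming the records sit in a plain array of shape $(T, n)$; the array contents, window sizes, and function name below are illustrative choices, not taken from the paper:

```python
import numpy as np

def make_windows(series: np.ndarray, M: int, H: int):
    """Slice a (T, n) traffic series into (X, Y) pairs.

    X has shape (num_samples, M, n): the M past observations v_{t-M+1}, ..., v_t.
    Y has shape (num_samples, H, n): the H future observations v_{t+1}, ..., v_{t+H}.
    """
    T = series.shape[0]
    xs, ys = [], []
    for t in range(M - 1, T - H):
        xs.append(series[t - M + 1 : t + 1])   # v_{t-M+1}, ..., v_t
        ys.append(series[t + 1 : t + 1 + H])   # v_{t+1}, ..., v_{t+H}
    return np.stack(xs), np.stack(ys)

# Example: 12 past 5-minute frames (one hour) paired with the next 9 frames (45 min).
speeds = np.random.rand(288, 228)              # one day of records for 228 stations
X, Y = make_windows(speeds, M=12, H=9)
print(X.shape, Y.shape)                        # (268, 12, 228) (268, 9, 228)
```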
2.2 Convolutions on Graphs
A standard convolution for regular grids is clearly not applicable to general graphs. Two basic approaches currently explore how to generalize CNNs to structured data forms: one expands the spatial definition of a convolution [Niepert et al., 2016], and the other manipulates the data in the spectral domain with graph Fourier transforms [Bruna et al., 2013]. The former rearranges the vertices into certain grid forms that can be processed by normal convolutional operations. The latter introduces a spectral framework to apply convolutions in the spectral domain, often named the spectral graph convolution. Several follow-up studies make the graph convolution more promising by reducing its computational complexity from $O(n^2)$ to linear [Defferrard et al., 2016; Kipf and Welling, 2016].

We introduce the graph convolution operator "$*_\mathcal{G}$" based on the notion of spectral graph convolution, as the multiplication of a signal $x \in \mathbb{R}^n$ with a kernel $\Theta$,
$$\Theta *_\mathcal{G} x = \Theta(L)x = \Theta(U \Lambda U^{T})x = U \Theta(\Lambda) U^{T} x, \quad (2)$$
where the graph Fourier basis $U \in \mathbb{R}^{n \times n}$ is the matrix of eigenvectors of the normalized graph Laplacian $L = I_n - D^{-\frac{1}{2}} W D^{-\frac{1}{2}} = U \Lambda U^{T} \in \mathbb{R}^{n \times n}$ ($I_n$ is an identity matrix and $D \in \mathbb{R}^{n \times n}$ is the diagonal degree matrix with $D_{ii} = \sum_j W_{ij}$); $\Lambda \in \mathbb{R}^{n \times n}$ is the diagonal matrix of eigenvalues of $L$, and the filter $\Theta(\Lambda)$ is also a diagonal matrix. By this definition, a graph signal $x$ is filtered by a kernel $\Theta$ through multiplication between $\Theta$ and the graph Fourier transform $U^{T} x$ [Shuman et al., 2013].
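To make Eq. (2) concrete, the following NumPy sketch carries a signal through the full spectral pipeline: build the normalized Laplacian, take its eigendecomposition as the graph Fourier basis, scale each frequency component, and transform back. The diagonal filter values `theta` stand in for the learned kernel $\Theta(\Lambda)$; none of this is the authors' implementation.

```python
import numpy as np

def normalized_laplacian(W: np.ndarray) -> np.ndarray:
    """L = I_n - D^{-1/2} W D^{-1/2} for a weighted adjacency matrix W."""
    d = W.sum(axis=1)
    d_inv_sqrt = np.zeros_like(d)
    d_inv_sqrt[d > 0] = d[d > 0] ** -0.5
    return np.eye(W.shape[0]) - np.diag(d_inv_sqrt) @ W @ np.diag(d_inv_sqrt)

def spectral_graph_conv(x: np.ndarray, W: np.ndarray, theta: np.ndarray) -> np.ndarray:
    """Eq. (2): U diag(theta) U^T x, with U the eigenvectors of L."""
    L = normalized_laplacian(W)
    _, U = np.linalg.eigh(L)        # graph Fourier basis (L is symmetric)
    x_hat = U.T @ x                 # graph Fourier transform of the signal
    return U @ (theta * x_hat)      # filter each frequency, then transform back

n = 5
W = np.random.rand(n, n)
W = (W + W.T) / 2                   # symmetric weights for an undirected graph
np.fill_diagonal(W, 0.0)
y = spectral_graph_conv(np.random.rand(n), W, theta=np.random.rand(n))
```

The explicit eigendecomposition is what makes this naive operator expensive, which is exactly the cost that the polynomial approximations cited above avoid.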
3 Proposed Model

3.1 Network Architecture
In this section, we elaborate on the proposed architecture of spatio-temporal graph convolutional networks (STGCN). As shown in Figure 2, STGCN is composed of several spatio-temporal convolutional blocks, each of which is formed as a "sandwich" structure with two gated sequential convolution layers and one spatial graph convolution layer in between. The details of each module are described as follows.

[Figure 2: Architecture of spatio-temporal graph convolutional networks. Left: the input $(v_{t-M+1}, \ldots, v_t)$ passes through two ST-Conv blocks and an output layer to produce $\hat{v}$. Middle: an ST-Conv block stacks a temporal gated convolution ($C = 64$), a spatial graph convolution ($C = 16$), and a second temporal gated convolution ($C = 64$). Right: a temporal gated convolution, i.e. a 1-D convolution followed by a GLU.]

By the Chebyshev polynomial approximation, the cost of Eq. (2) can be reduced to $O(K|\mathcal{E}|)$, as Eq. (3) shows [Defferrard et al., 2016].

1st-order Approximation. A layer-wise linear formulation can be defined by stacking multiple localized graph convolutional layers with the first-order approximation of the graph Laplacian [Kipf and Welling, 2016]. Consequently, a deeper architecture can be constructed to recover spatial information in depth without being limited to the explicit parameterization given by the polynomials. Due to the scaling and normalization in neural networks, we can further assume that $\lambda_{\max} \approx 2$, so that Eq. (3) can be simplified accordingly.
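A minimal NumPy sketch of such a first-order graph convolution, written in the renormalized form popularized by Kipf and Welling [2016]; this is an illustrative stand-in for the simplified kernel rather than the paper's exact equation:

```python
import numpy as np

def first_order_graph_conv(X: np.ndarray, W: np.ndarray, Theta: np.ndarray) -> np.ndarray:
    """One first-order (Kipf-and-Welling-style) graph convolution layer.

    X:     (n, C_in)     node signal with C_in channels
    W:     (n, n)        weighted adjacency matrix
    Theta: (C_in, C_out) trainable channel-mixing weights
    """
    n = W.shape[0]
    W_tilde = W + np.eye(n)                     # add self-loops (renormalization trick)
    d_inv_sqrt = np.diag(W_tilde.sum(axis=1) ** -0.5)
    A_hat = d_inv_sqrt @ W_tilde @ d_inv_sqrt   # D~^{-1/2} (W + I_n) D~^{-1/2}
    # Each layer mixes only 1-hop neighbors; stacking K such layers recovers a
    # K-hop receptive field without the explicit polynomial parameterization.
    return A_hat @ X @ Theta

n, c_in, c_out = 228, 1, 16
W = np.random.rand(n, n)
W = (W + W.T) / 2
np.fill_diagonal(W, 0.0)
out = first_order_graph_conv(np.random.rand(n, c_in), W, np.random.rand(c_in, c_out))
print(out.shape)                                # (228, 16)
```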
3.3 Gated CNNs for Extracting Temporal Features
Although RNN-based models have become widespread in time-series analysis, recurrent networks for traffic prediction still suffer from time-consuming iterations, complex gate mechanisms, and slow response to dynamic changes. In contrast, CNNs have the advantages of fast training, simple structures, and no dependency constraints on previous steps. Inspired by [Gehring et al., 2017], we employ entirely convolutional structures on the time axis to capture the temporal dynamic behavior of traffic flows. This specific design allows parallel and controllable training procedures through multi-layer convolutional structures formed as hierarchical representations.

As Figure 2 (right) shows, the temporal convolutional layer contains a 1-D causal convolution with a width-$K_t$ kernel followed by gated linear units (GLU) as a non-linearity. For each node in graph $\mathcal{G}$, the temporal convolution explores $K_t$ neighbors of the input elements without padding, which shortens the length of the sequence by $K_t - 1$ each time. Thus, the input of the temporal convolution for each node can be regarded as a length-$M$ sequence with $C_i$ channels, $Y \in \mathbb{R}^{M \times C_i}$. The convolution kernel $\Gamma \in \mathbb{R}^{K_t \times C_i \times 2C_o}$ is designed to map the input $Y$ to a single output element $[P\ Q] \in \mathbb{R}^{(M - K_t + 1) \times (2C_o)}$ ($P$ and $Q$ split the channels in half with the same size). As a result, the temporal gated convolution can be defined as
$$\Gamma *_\mathcal{T} Y = P \odot \sigma(Q) \in \mathbb{R}^{(M - K_t + 1) \times C_o}, \quad (7)$$
where $P$, $Q$ are the inputs of the gates in the GLU, respectively, and $\odot$ denotes the element-wise Hadamard product. The sigmoid gate $\sigma(Q)$ controls which parts of the input $P$ of the current states are relevant for discovering compositional structure and dynamic variances in the time series. The non-linearity gates also contribute to exploiting the full input field through stacked temporal layers. Furthermore, residual connections are implemented among the stacked temporal convolutional layers. Similarly, the temporal convolution can be generalized to 3-D variables by applying the same convolution kernel $\Gamma$ to every node $Y_i \in \mathbb{R}^{M \times C_i}$ (e.g. sensor stations) in $\mathcal{G}$ equally, denoted as "$\Gamma *_\mathcal{T} \mathcal{Y}$" with $\mathcal{Y} \in \mathbb{R}^{M \times n \times C_i}$.
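A compact PyTorch-style sketch of the gated temporal convolution in Eq. (7). The module below is one reading of the description above (a shared 1-D kernel along time, no padding, channels split into $P$ and $Q$, then $P \odot \sigma(Q)$); the tensor layout (batch, channels, nodes, time) and all sizes are illustrative choices, not the authors' released code, and the residual connection mentioned in the text is omitted for brevity.

```python
import torch
import torch.nn as nn

class TemporalGatedConv(nn.Module):
    """1-D convolution along time followed by a GLU, applied per node (Eq. (7))."""

    def __init__(self, c_in: int, c_out: int, kt: int):
        super().__init__()
        # A (1, kt) kernel slides only along the time axis, shared across all n nodes.
        self.conv = nn.Conv2d(c_in, 2 * c_out, kernel_size=(1, kt))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, c_in, n, M) -> (batch, 2*c_out, n, M - kt + 1), no padding.
        pq = self.conv(x)
        p, q = pq.chunk(2, dim=1)          # split channels into P and Q
        return p * torch.sigmoid(q)        # GLU: P ⊙ σ(Q)

# Example: a batch of 8 graphs with 228 nodes, 12 time steps, 1 input channel.
x = torch.randn(8, 1, 228, 12)
out = TemporalGatedConv(c_in=1, c_out=64, kt=3)(x)
print(out.shape)                           # torch.Size([8, 64, 228, 10])
```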
3.4 Spatio-temporal Convolutional Block
In order to fuse features from both the spatial and temporal domains, the spatio-temporal convolutional block (ST-Conv block) is constructed to jointly process graph-structured time series. The block itself can be stacked or extended based on the scale and complexity of particular cases.

As illustrated in Figure 2 (mid), the spatial layer in the middle bridges the two temporal layers, which achieves fast spatial-state propagation from the graph convolution through the temporal convolutions. The "sandwich" structure also helps the network apply a bottleneck strategy to achieve scale compression and feature squeezing by downscaling and upscaling of the channels $C$ through the graph convolutional layer. Moreover, layer normalization is utilized within every ST-Conv block to prevent overfitting.

The input and output of ST-Conv blocks are all 3-D tensors. For the input $v^l \in \mathbb{R}^{M \times n \times C^l}$ of block $l$, the output $v^{l+1} \in \mathbb{R}^{(M - 2(K_t - 1)) \times n \times C^{l+1}}$ is computed by
$$v^{l+1} = \Gamma_1^l *_\mathcal{T} \operatorname{ReLU}\!\left(\Theta^l *_\mathcal{G} \left(\Gamma_0^l *_\mathcal{T} v^l\right)\right), \quad (8)$$
where $\Gamma_0^l$, $\Gamma_1^l$ are the upper and lower temporal kernels within block $l$, respectively; $\Theta^l$ is the spectral kernel of the graph convolution; and $\operatorname{ReLU}(\cdot)$ denotes the rectified linear unit.

After stacking two ST-Conv blocks, we attach an extra temporal convolution layer with a fully-connected layer as the output layer at the end (see the left of Figure 2). The temporal convolution layer maps the outputs of the last ST-Conv block to a single-step prediction. We then obtain a final output $Z \in \mathbb{R}^{n \times c}$ from the model and calculate the speed prediction for the $n$ nodes by applying a linear transformation across the $c$ channels as $\hat{v} = Zw + b$, where $w \in \mathbb{R}^c$ is a weight vector and $b$ is a bias. We use the L2 loss to measure the performance of our model. Thus, the loss function of STGCN for traffic prediction can be written as
$$\mathcal{L}(\hat{v}; W_\theta) = \sum_t \left\| \hat{v}(v_{t-M+1}, \ldots, v_t, W_\theta) - v_{t+1} \right\|^2, \quad (9)$$
where $W_\theta$ denotes all trainable parameters in the model; $v_{t+1}$ is the ground truth and $\hat{v}(\cdot)$ denotes the model's prediction.
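Read literally, Eq. (8) composes the operators defined above: a temporal gated convolution, a graph convolution with ReLU, a second temporal gated convolution, and layer normalization. The PyTorch-style sketch below follows that reading; the dense normalized adjacency `a_hat` stands in for the spectral kernel $\Theta^l$, and every size and name is an illustrative assumption rather than the reference implementation.

```python
import torch
import torch.nn as nn

class STConvBlock(nn.Module):
    """ST-Conv block of Eq. (8): temporal gated conv -> graph conv + ReLU ->
    temporal gated conv, followed by layer normalization.
    Tensor layout: (batch, channels, n_nodes, time)."""

    def __init__(self, n_nodes: int, c_in: int, c_spatial: int, c_out: int, kt: int):
        super().__init__()
        self.t_conv0 = nn.Conv2d(c_in, 2 * c_spatial, kernel_size=(1, kt))   # Γ_0^l
        self.theta = nn.Linear(c_spatial, c_spatial, bias=False)             # channel mixing for Θ^l
        self.t_conv1 = nn.Conv2d(c_spatial, 2 * c_out, kernel_size=(1, kt))  # Γ_1^l
        self.norm = nn.LayerNorm([n_nodes, c_out])

    @staticmethod
    def _glu(z: torch.Tensor) -> torch.Tensor:
        p, q = z.chunk(2, dim=1)
        return p * torch.sigmoid(q)                          # Eq. (7)

    def forward(self, x: torch.Tensor, a_hat: torch.Tensor) -> torch.Tensor:
        # a_hat: (n, n) normalized adjacency standing in for the spectral graph kernel.
        h = self._glu(self.t_conv0(x))                       # Γ_0^l *T v^l
        h = torch.einsum("ij,bcjt->bcit", a_hat, h)          # aggregate over neighboring nodes
        h = torch.relu(self.theta(h.permute(0, 3, 2, 1)))    # mix channels, ReLU; (b, t, n, c)
        h = h.permute(0, 3, 2, 1)                            # back to (b, c, n, t)
        h = self._glu(self.t_conv1(h))                       # Γ_1^l *T (...)
        return self.norm(h.permute(0, 3, 2, 1)).permute(0, 3, 2, 1)   # layer norm over (n, c)

# With kt = 3, each block trims the 12-step window by 2*(kt - 1) = 4 steps, as in Eq. (8).
n, a_hat = 228, torch.eye(228)                               # identity adjacency, shape demo only
block = STConvBlock(n_nodes=n, c_in=1, c_spatial=16, c_out=64, kt=3)
out = block(torch.randn(8, 1, n, 12), a_hat)
print(out.shape)                                             # torch.Size([8, 64, 228, 8])
```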
We now summarize the main characteristics of our model STGCN as follows:
• STGCN is a universal framework for processing structured time series. It is not only able to tackle traffic network modeling and prediction tasks but can also be applied to more general spatio-temporal sequence learning tasks.
• The spatio-temporal block combines graph convolutions and gated temporal convolutions, which can extract the most useful spatial features and capture the most essential temporal features coherently.
• The model is entirely composed of convolutional structures and therefore achieves parallelization over the input, with fewer parameters and faster training speed. More importantly, this economical architecture allows the model to handle large-scale networks with more efficiency.

4 Experiments

4.1 Dataset Description
We verify our model on two real-world traffic datasets, BJER4 and PeMSD7, collected by the Beijing Municipal Traffic Commission and the California Department of Transportation, respectively. Each dataset contains key attributes of traffic observations and geographic information with corresponding timestamps, as detailed below.

BJER4 was gathered from the major areas of the east Ring Road No. 4 routes in Beijing by double-loop detectors. There are 12 roads selected for our experiment. The traffic data are aggregated every 5 minutes. The time period used is from 1st July to 31st August, 2014, excluding weekends. We select the first month of historical speed records as the training set, and the rest serves as the validation and test sets, respectively.

PeMSD7 was collected from the Caltrans Performance Measurement System (PeMS) in real time by over 39,000 sensor stations deployed across the major metropolitan areas of the California state highway system [Chen et al., 2001]. The dataset is also aggregated into 5-minute intervals from 30-second data samples. We randomly select a medium and a large scale among District 7 of California, containing 228 and 1,026 stations, labeled PeMSD7(M) and PeMSD7(L), respectively.
Figure 4: Speed prediction in the morning peak and evening rush hours of the dataset PeMSD7.

Figure 5: Test RMSE versus the training time (left); test MAE versus the number of training epochs (right). (PeMSD7(M))

predictions during the morning peak and evening rush hours, as shown in Figure 4. It is easy to observe that our proposal STGCN captures the trend of the rush hours more accurately than the other methods, and it detects the end of the rush hours earlier than the others. Stemming from the efficient graph convolution and the stacked temporal convolution structures, our model is capable of responding quickly to dynamic changes in the traffic network, without the over-reliance on historical averages that most recurrent networks exhibit.

Training Efficiency and Generalization
To see the benefits of the convolution along the time axis in our proposal, we summarize the comparison of training time between STGCN and GCGRU in Table 3. For fairness, GCGRU consists of three layers with 64, 64, and 128 units respectively in the experiment on PeMSD7(M), and STGCN uses the default settings described in Section 4.3. Our model STGCN consumes only 272 seconds, while the RNN-type model GCGRU spends 3,824 seconds on PeMSD7(M). This 14-fold acceleration of training speed mainly benefits from applying the temporal convolution instead of recurrent structures, which achieves fully parallel training rather than relying exclusively on chain structures as RNNs do. For PeMSD7(L), GCGRU has to use half of the batch size since its GPU consumption exceeded the memory capacity of a single card (results marked with "*" in Table 2), while STGCN only needs to double the channels in the middle of the ST-Conv blocks. Even so, our model still consumes less than a tenth of the training time of GCGRU under this circumstance. Meanwhile, the advantages of the 1st-order approximation appear, since it is not restricted to the parameterization of polynomials: the model STGCN(1st) speeds up by around 20% on the larger dataset with satisfactory performance compared with STGCN(Cheb).

In order to further investigate the performance of the compared deep learning models, we plot the RMSE and MAE on the test set of PeMSD7(M) during the training process; see Figure 5. These figures also suggest that our model achieves a much faster training procedure and easier convergence. Thanks to the special design of the ST-Conv blocks, our model has superior performance in balancing time consumption and parameter settings. Specifically, the number of parameters in STGCN ($4.54 \times 10^5$) accounts for only around two thirds of that of GCGRU, and saves over 95% of the parameters compared to FC-LSTM.

5 Related Works
There are several recent deep learning studies that are also motivated by graph convolution in spatio-temporal tasks. Seo et al. [2016] introduced the graph convolutional recurrent network (GCRN) to jointly identify spatial structures and dynamic variation from structured sequences of data. The key challenge of that study is to determine the optimal combination of recurrent networks and graph convolution under specific settings. Based on the principles above, Li et al. [2018] successfully employed gated recurrent units (GRU) with graph convolution for long-term traffic forecasting. In contrast to these works, we build our model completely from convolutional structures; the ST-Conv block is specially designed to uniformly process structured data with residual connections and a bottleneck strategy inside; and more efficient graph convolution kernels are employed in our model as well.
6 Conclusion and Future Work
In this paper, we propose a novel deep learning framework, STGCN, for traffic prediction, integrating graph convolution and gated temporal convolution through spatio-temporal convolutional blocks. Experiments show that our model outperforms other state-of-the-art methods on two real-world datasets, indicating its great potential for exploring spatio-temporal structures from the input. It also achieves faster training, easier convergence, and fewer parameters, with flexibility and scalability. These features are quite promising and practical for scholarly development and large-scale industrial deployment. In the future, we will further optimize the network structure and parameter settings. Moreover, our proposed framework can be applied to more general spatio-temporal structured sequence forecasting scenarios, such as the evolution of social networks and preference prediction in recommendation systems.

References
[Ahmed and Cook, 1979] Mohammed S Ahmed and Allen R Cook. Analysis of freeway traffic time-series data by using Box-Jenkins techniques. 1979.
[Bruna et al., 2013] Joan Bruna, Wojciech Zaremba, Arthur Szlam, and Yann LeCun. Spectral networks and locally connected networks on graphs. arXiv preprint arXiv:1312.6203, 2013.
[Chen et al., 2001] Chao Chen, Karl Petty, Alexander Skabardonis, Pravin Varaiya, and Zhanfeng Jia. Freeway performance measurement system: mining loop detector data. Transportation Research Record: Journal of the Transportation Research Board, (1748):96–102, 2001.
[Chen et al., 2016] Quanjun Chen, Xuan Song, Harutoshi Yamada, and Ryosuke Shibasaki. Learning deep representation from big and heterogeneous data for traffic accident inference. In AAAI, pages 338–344, 2016.
[Defferrard et al., 2016] Michaël Defferrard, Xavier Bresson, and Pierre Vandergheynst. Convolutional neural networks on graphs with fast localized spectral filtering. In NIPS, pages 3844–3852, 2016.
[Gehring et al., 2017] Jonas Gehring, Michael Auli, David Grangier, Denis Yarats, and Yann N Dauphin. Convolutional sequence to sequence learning. arXiv preprint arXiv:1705.03122, 2017.
[Hammond et al., 2011] David K Hammond, Pierre Vandergheynst, and Rémi Gribonval. Wavelets on graphs via spectral graph theory. Applied and Computational Harmonic Analysis, 30(2):129–150, 2011.
[Hochreiter and Schmidhuber, 1997] Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735–1780, 1997.
[Huang et al., 2014] Wenhao Huang, Guojie Song, Haikun Hong, and Kunqing Xie. Deep architecture for traffic flow prediction: deep belief networks with multitask learning. IEEE Transactions on Intelligent Transportation Systems, 15(5):2191–2201, 2014.
[Jia et al., 2016] Yuhan Jia, Jianping Wu, and Yiman Du. Traffic speed prediction using deep learning method. In ITSC, pages 1217–1222. IEEE, 2016.
[Kipf and Welling, 2016] Thomas N Kipf and Max Welling. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907, 2016.
[Li et al., 2015] Yexin Li, Yu Zheng, Huichu Zhang, and Lei Chen. Traffic prediction in a bike-sharing system. In SIGSPATIAL, page 33. ACM, 2015.
[Li et al., 2018] Yaguang Li, Rose Yu, Cyrus Shahabi, and Yan Liu. Diffusion convolutional recurrent neural network: Data-driven traffic forecasting. In ICLR, 2018.
[Lv et al., 2015] Yisheng Lv, Yanjie Duan, Wenwen Kang, Zhengxi Li, and Fei-Yue Wang. Traffic flow prediction with big data: a deep learning approach. IEEE Transactions on Intelligent Transportation Systems, 16(2):865–873, 2015.
[Niepert et al., 2016] Mathias Niepert, Mohamed Ahmed, and Konstantin Kutzkov. Learning convolutional neural networks for graphs. In ICML, pages 2014–2023, 2016.
[Seo et al., 2016] Youngjoo Seo, Michaël Defferrard, Pierre Vandergheynst, and Xavier Bresson. Structured sequence modeling with graph convolutional recurrent networks. arXiv preprint arXiv:1612.07659, 2016.
[Shi et al., 2015] Xingjian Shi, Zhourong Chen, Hao Wang, Dit-Yan Yeung, Wai-Kin Wong, and Wang-chun Woo. Convolutional LSTM network: A machine learning approach for precipitation nowcasting. In NIPS, pages 802–810, 2015.
[Shuman et al., 2013] David I Shuman, Sunil K Narang, Pascal Frossard, Antonio Ortega, and Pierre Vandergheynst. The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains. IEEE Signal Processing Magazine, 30(3):83–98, 2013.
[Sutskever et al., 2014] Ilya Sutskever, Oriol Vinyals, and Quoc V Le. Sequence to sequence learning with neural networks. In NIPS, pages 3104–3112, 2014.
[Vlahogianni, 2015] Eleni I Vlahogianni. Computational intelligence and optimization for transportation big data: challenges and opportunities. In Engineering and Applied Sciences Optimization, pages 107–128. Springer, 2015.
[Williams and Hoel, 2003] Billy M Williams and Lester A Hoel. Modeling and forecasting vehicular traffic flow as a seasonal ARIMA process: Theoretical basis and empirical results. Journal of Transportation Engineering, 129(6):664–672, 2003.
[Wu and Tan, 2016] Yuankai Wu and Huachun Tan. Short-term traffic flow forecasting with spatial-temporal correlation in a hybrid deep learning framework. arXiv preprint arXiv:1612.01022, 2016.