Shi 2021
Shi 2021
Science
Yan Shi, Da Wang, Jianbo Tang, Min Deng, Huimin Liu & Baoju Liu
To cite this article: Yan Shi, Da Wang, Jianbo Tang, Min Deng, Huimin Liu & Baoju Liu
(2021): Detecting spatiotemporal extents of traffic congestion: a density-based moving
object clustering approach, International Journal of Geographical Information Science, DOI:
10.1080/13658816.2021.1905820
RESEARCH ARTICLE
1. Introduction
Rapid urbanization has caused traffic congestion, which has in turn blocked sustainable
urban development by reducing travel efficiency and aggravating air pollution. It is
important to identify traffic congestion accurately, adequately and in a timely manner
in space and time for the realization of traffic dispersion, the dynamic planning of driving
routes and the optimization of road network structures (Colak et al. 2016). As a result,
traffic congestion detection has become a hotspot in the field of spatiotemporal data
mining and in intelligent transportation research.
Many studies have been performed to detect traffic congestion using traffic flow
information monitored by induction coils or surveillance cameras during the past decades
(Li and Dai 2016, Inoue et al. 2016). Due to restrictions on installation locations and the high
cost of equipment, neither induction coils nor surveillance cameras have the ability to
monitor traffic status over an entire area without blind spots (Yuan et al. 2014). With the
development of positioning technologies, GPS systems mounted on taxies can sense the
spatial locations, driving directions and speeds of vehicles in near-real time with low costs
and high efficiencies. As an important transportation mode for residents’ daily trips in large
cities, such as Beijing, taxies compose approximately 5% of all vehicles on the road,
accounting for their high travel mileage (He et al. 2019). In addition, taxis can provide travel
services for 24 hours, so the massive amount of road-traffic-related information collected in
taxi trajectory data can, to a large degree, reflect road traffic conditions (Jenelius and
Koutsopoulos 2013). Along with the low cost of sampling, taxi trajectory data have been
widely utilized for traffic status estimations (Hu 2016, Zheng et al. 2020), anomalous
trajectory detections (Taniarza and Akbar 2017, Wang et al. 2018), taxi origin-destination
hotspot extractions (Pei et al. 2015, Zhao et al. 2016, Deng et al. 2019) and other intelligent
transportation systems (ITSs) (Bock et al. 2019, Liu et al. 2019). In addition, taxi trajectory
data have become a new focus in research on the detection of traffic congestion (Xu et al.
2018, He et al. 2019). However, existing studies have mostly treated the traffic flow speed as
the singular variation used for the identification of traffic congestion, making it difficult to
distinguish authentic traffic congestion from behaviours such as waiting at traffic signal
lights or parking (Gupta et al. 2013, Keler et al. 2017). In fact, traffic congestion can be
treated as a group of gathered vehicles that have the characteristics of spatial adjacency
and significantly slow speeds. Therefore, traffic congestion detection can be translated into
an extraction of dense clusters containing slow-moving vehicles, namely, moving clusters
(Kalnis et al. 2005, Huang et al. 2008). This paper proposes a density-based moving-object
clustering approach for accurately extracting the spatiotemporal extents of traffic conges
tion. The major contributions of this paper include:
The structure of this paper is organized as follows: Section 2 reviews the existing work
related to traffic congestion detection. In Section 3, the traffic congestion detection
strategy is presented, and the proposed method is elaborated upon. Section 4 performs
a comparative experimental analysis to demonstrate our approach. Section 5 summarizes
the interesting findings and highlights the future research directions.
2. Related work
Most existing studies originally detected traffic congestion on highways using vehicle
speed information acquired from induction coils. Compared with highways, urban roads
have various levels and vast numbers of intersections, as well as complicated spatial
topological relationships (Chow et al. 2014). Thus, with the help of spatial clustering
strategies, specific methods have been subsequently developed for the extraction of
congested urban areas, road segments and lanes. A detailed review of these methods is
given below.
INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE 3
of vehicle speeds in traffic congestion. However, previous studies have not typically accounted
for the diversity in the moving orientations or the distributions of the speeds of vehicles,
resulting in the misrecognition of traffic congestion. In essence, a traffic flow constitutes a series
of continuously moving vehicles on roads. In cases of traffic congestion, there is not enough
space for overtaking or lane changing. The affected vehicles will be aggregated as significantly
dense clusters that move with extremely slow speeds compared with those of clusters running
on smooth road segments (Liu and Ban 2013). As a result, it can be deduced that traffic
congestion can be effectively detected by extracting dense moving clusters constituted by
slow-moving vehicles. However, due to insufficient estimations of the spatial distribution
characteristics of vehicles on a road network, existing spatial clustering-based methods lack
adaptivity and availability when employed in traffic congestion detections.
To address the abovementioned issues, this study proposes a new strategy to unveil
the spatiotemporal propagation process of traffic congestion based on the objective
detection of moving clusters with high densities and slow speeds.
Figure 1. Formation of a moving cluster. (a) Consecutive time windows that contain trajectory points.
(b) Moving cluster determination in consecutive time windows.
INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE 5
3.1. A combination of map matching and time slicing for trajectory points
Unavoidable location errors impede the detection of traffic congestion from original
vehicle trajectory points in a road network space. Thus, these trajectory points must
be aligned with the road segments by a map-matching operation. To address the
uncertainty problem caused by a low sampling rate of vehicle trajectories (Chen et al.
2014), Lou et al. (2009) combined the geometric and topological information of
a road network with the speed constraints of vehicles to design an effective map-
matching algorithm called ST-Matching. The power of ST-Matching to improve the
accuracy and efficiency of low-sampling-rate trajectory matching has been demon
strated through many experiments. This study will employ this algorithm to map the
original trajectory points from a Euclidean space to a road network space.
In this way, the trajectory of any vehicle Vi can be represented as a set of points, i.e. Vi
= {(xi.1, yi.1, ri.1, ti.1), (xi.2, yi.2, ri.2, ti.2), . . ., (xi.k, yi.k, ri.k, ti.k), . . . }, where (xi.k, yi.k, ri.k, ti.k)
represents the spatial position of Vi after it is matched to road segment ri.k at timestamp
ti.k. To facilitate the spatiotemporal analysis of the vehicle trajectories, all trajectory points
are projected onto a series of continuous time windows {TW1, TW2, . . . TWm} with the same
widths based on their timestamps, as obtained by the snapshot model. Given any
trajectory point (xi.k, yi.k, ri.k, ti.k) of vehicle Vi and the time window TWk’, if ti.k lies within
the range [TWk’.start, TWk’.end], where TWk’.start and TWk’.end denote the start and end
times of TWk’, respectively, this trajectory point is considered to be located in TWk’ and is
re-expressed as (xi.k, yi.k, ri.k, TWk’). Regarding irregular sampling trajectory points, we can
set the time window width to the adopted time interval to sample the most trajectory
points. In this way, as many vehicles as possible that have exactly one sampling point in
each time window can be incorporated. Furthermore, for trajectory points with sampling
intervals larger than the time window width, the records of these vehicles that are missing
in some time windows are interpolated by gap-filling strategies with the help of the
recorded points in the adjacent time windows; these gap-filling strategies include linear
(Yuan et al. 2019) and kinematic (Long 2016, Hwang et al. 2018) interpolation methods.
The above operations can ensure that any vehicle will have trajectory points in each
selected time window. After the conversion operations are completed, trajectory points
located in the same time window are grouped into a snapshot. For example, the trajectory
of vehicle V3 in Figure 2(a) is denoted as {(x3.1, y3.1, t3.1), (x3.2, y3.2, t3.2), (x3.3, y3.3, t3.3)}. After
the time slicing operation, its trajectory can be expressed as {(x3.1, y3.1, TW1), (x3.2, y3.2, TW2)
and (x3.3, y3.3, TW3)}, as shown in Figure 2(b). Moreover, Figure 2(c-e) exhibits the spatial
distributions of vehicles in a road network space within the three time windows.
Figure 2. An example of time slicing for trajectory points. (a) Trajectory points attached to spatial
location information and time stamps. (b) Projection of trajectory points onto time windows. (c-e)
Projected trajectory points in TW1, TW2 and TW3.
with consistent running behaviours. In this case, this study extends the density-based
clustering method described in Deng et al. (2019) by imposing moving orientation
restrictions and implementing space-time connections to adaptively extract network-
constrained moving clusters with significantly high densities as candidates for traffic
congestion. Specifically, the moving orientations of vehicles are first adaptively parti
tioned into separate trajectory points that indicate significantly different driving direc
tions (e.g. opposite directions). Based on this constraint, the density-based clustering idea
outlined in Deng et al. (2019) is incorporated to aggregate trajectory points with both
spatial adjacency and orientation homogeneity. However, the method in Deng et al.
(2019) focuses on clustering spatial points that are observed in the same time window.
Sequentially, this study further discovers moving clusters by measuring the overlapping
relationships between spatial clusters detected in adjacent time windows.
INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE 7
where pj represents the other points in the road network; N_Dist(pi, pj) denotes the
network distance between pi and pj; and σ determines the spatial range of attraction
generated by any point in the road network space. The value of σ is consistent with the
radius of the spatial neighbourhood. Giachetta et al. (2009) demonstrated that the
attraction intensity is equivalent to the probabilistic density function with a normalized
constant. Thus, the network kernel density estimator proposed by Borruso (2005) is
employed to re-express the attraction intensity as follows:
Figure 3. An example of a network distance measurement. (a) Original and projected trajectory points
on a road network. (b) Network distance between p1 and p4.
8 Y. SHI ET AL.
( PN
nc� 1
j¼1 2σ expf ½ N Distðpi;pjÞ
pffiffi �2 g; when 0 < N Distðpi ;pj Þ � σ
Attr Iðpi Þ ¼ 2σ (2)
0; when N Distðpi ;pj Þ > σ
where nc is the normalized constant. The vehicle trajectory points are primarily recorded in
road segments rather than at intersections. For the convenience of calculation, the spatial
neighbourhood of any point pi is defined as a range of σ from pi in two directions, i.e. 2σ. On
this basis, Shannon entropy is further adopted to determine the optimal σ value by character
izing the uncertainty of the attraction intensity distribution (Shannon 1948, Zhao et al. 2016,
Deng et al. 2019), which is expressed as follows.
XN Attr Iðpi Þ �Attr Iðpi Þ� XN
H¼ log where Z ¼ Attr Iðpi Þ (3)
i¼1 Z Z i¼1
The minimum entropy value indicates the highest asymmetry in the attraction intensities,
and core points with significantly stronger attraction intensities can be effectively identi
fied using the corresponding σ. Thus, the σ value that minimizes entropy, denoted as σopt,
is selected to determine the optimal range of the spatial neighbourhoods.
where θm and θ � denote the moving orientation value of the mth item and the mean
moving orientation value of all items, respectively. Furthermore, each cluster Ci is divided
into two subclusters (denoted as SCi−1 and SCi−2) using the k-means algorithm. The inner-
cluster variance loss of Ci can be expressed as follows (Guo 2008).
C i VL ¼ jC i var SC i 1 var SC i 2 varj (5)
The cluster with the maximal variance loss value has the greatest heterogeneity, and
divisions are thus continuously imposed on it. The inner-cluster relative variance loss of Ci
can be expressed as follows.
C i RVL ¼ C i VL C i 1 VL s:t:Ci 1 VL > Ci 2 VL (6)
This partition operation is iteratively implemented until the relative loss values of the
inner-cluster variances are smaller than an extremely small threshold (e.g. 0.1 in this
study). In this way, all moving orientation values can be adaptively partitioned into
homogeneous clusters. For each trajectory point, the attached moving orientation value
can be replaced by the tag of the corresponding cluster.
INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE 9
λk rs l
Pðjrs lj¼ kÞ¼ e λ � ; where λ ¼ N� (7)
k! rn l
Focusing on a given time window, for any point pi, if there are ni observed points that
(i) fall in the spatial neighbourhood of pi (SN_pi for short) and (ii) belong to the same
cluster as pi in terms of their moving orientations, then the probability that more than ni
points follow these two conditions can be expressed as follows.
0
Xni 1 0 λk 0 2σopt
λ
PðjSN pi j � ni Þ ¼ 1 PðjSN pi j < ni Þ ¼ 1 e ; where λ ¼ (8)
k¼0 k! rn l
If this probability is smaller than a sufficiently low significance level, it is almost
impossible that pi has more than the observed number of points in its spatial neighbour
hood. In other words, pi can be regarded as a core point with a significantly high local
density, denoted as core_pi.
where |Ci∩Cj| represents the volumes of the intersecting set of Ci and Cj with respect
to the cluster members. If Con(Ci, Cj) is greater than a given threshold γ, the
combination of Ci and Cj will be defined as a moving cluster with a time span of
two, denoted as MC{Ci→Cj}. By this analogy, a cluster Ck in TWi+2 will be added to
MC{Ci→Cj} only if Con(Cj, Ck)>γ; then, the moving cluster MC{Ci→Cj} can be updated as
MC{Ci→Cj→Ck}. This process is iteratively implemented to detect all moving clusters
within any time span.
10 Y. SHI ET AL.
where Ci_k.centre and Ci_k+[Link] represent the centre point of clusters in the kth and
(k+ 1)th time windows of MCi, respectively, and TW_width denotes the width of the time
windows. In this way, the speed of all n moving clusters can be estimated to form the set
Speed_MC = {Speed_MC1, Speed_MC2 . . ., Speed_MCn}. The speed of a traffic flow can be
depicted by a log-normal distribution (Hollander and Liu 2008). The logarithmic transfor
mation of the set Speed_MC follows a normal distribution, i.e. log(Speed_MC)~N(μ, σ2),
Pn rffiP
ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
n
logðspeed MC i Þ logðspeed MC i μÞ2 Þ
where μ ¼ i¼1 n and σ ¼ i¼1
n 1 . The percentile value regarding
a given probability value p can be calculated as follows (Pu 2012):
Z ðpÞ¼ exp½μ þ σΦ 1 ðpÞ� (11)
−1
where Φ (.) is the percentile function of the standard normal distribution. The percentile
values can be utilized to extract moving clusters with significantly slow speeds.
Additionally, the time spans of slow-moving clusters should be further considered to
distinguish traffic congestion from other events that can also cause slow-moving traffic
flows, such as waiting at red signals at road intersections. Given the significance level α
(e.g. 0.1) and a time span threshold δts, the moving cluster MCi indicates traffic congestion
when Speed_MCi≤Z(1-α) and TS_MCi≥δts.
recent decades (Liu et al. 2015). The Yangtze and Han Rivers divide Wuhan into three
towns, namely, Wuchang, Hanyang and Hankou. Wuchang takes up approximately half of
the total area of Wuhan and has a collection of governmental agencies, universities,
scientific research institutions and technology industrial parks. A large population com
mutes daily between their workplaces and residences in Wuchang. This frequent popula
tion flow leads to daily variations in road traffic status with respect to time. As a result, the
road network of Wuchang was selected as the study area in these experiments. Figure 4
shows the spatial distribution of the road network of Wuchang with 916 segments derived
from the Open Street Map, which is a reliable platform for obtaining urban road network
datasets.
It is well known that the irregular trips of a large quantity of tourists generally increase
the complexity of traffic conditions during holidays, e.g. the May Day holiday in China,
especially in tourist cities. Thus, the recorded taxi trajectory points from 1 to 6 May 2014 in
Wuchang were utilized to sufficiently verify the performance of the proposed method.
A total of 5.6 million trajectory points were collected by vehicle-mounted positioning
devices on approximately 7000 taxies with a sampling time interval of 60 s; these data
points include IDs, spatial locations (latitude/longitude), time stamps, instantaneous
speeds (m/s), instantaneous driving orientations (from 0° to 360°) and service statuses
(occupied or unoccupied). Specifically, the trajectory points collected from 8:00 to9:00 on
5 May are shown in the right panel of Figure 4. In addition, traffic congestion is detected in
distinct time periods to further explore its spatiotemporal variation characteristics.
Kan et al. (2019) proposed an approach for extracting invalid slow-speed trajectories
relevant to individual demands (e.g. searching for passengers) with the combined ana
lyses of trajectory shapes, distances to road segments and moving velocities. Performing
this approach on the trajectory points obtained on 5 May 2014, these invalid trajectories
are found to account for approximately 20% of all trajectory segments. The sparse
distributions of these invalid trajectories make it difficult for these points to form dense
clusters in each time stamp. Taking the trajectory points collected on 5 May as an
example, Figure 5 shows that invalid trajectory segments steadily occupied a small
Figure 4. Spatial distribution of road segments in Wuchang and trajectory points obtained from
8:00 ~ 9:00 on 5 May 2014.
12 Y. SHI ET AL.
proportion of the total number of points in the daytime and had a negligible influence on
the number of congested road segments detected by the proposed method. To reduce
the computing complexity and uncertainty in the data elimination method, the trajec
tories of all taxis were utilized for the detection of traffic congestion by the proposed
method in the experiments.
Figure 5. Comparisons of the variations in the numbers of (a) trajectory segments and (b) detected
congested road segments regarding all trajectories and valid trajectories.
INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE 13
required to queue for at least three signal cycles before passing through the intersection
(Lu 2001). In this case, assuming the homogeneous flow of vehicle queues, the 1-γ value
and δts can be determined as 1-γ = TW_width/3τ (TW_width<3τ) and δts = 3τ/TW_width,
respectively, where τ denotes the average cycle of traffic signals. The γ value is set to 0 in
the case that TW_width≥3τ. Considering the time settings of road traffic signals in
Wuchang, we set the τ value to 1 minute for all intersections in the experiments based
on empirical induction. In summary, the γ and δts values of the proposed method are 2/3
and 3, respectively, with the condition that the width of the time window is 1 minute.
To verify the rationality of the assigned γ and TW_width values, a traffic status simula
tion tool, namely, VISSIM, was utilized to simulate the trajectory points of 4298 vehicles on
a simulated road network for 60 minutes with a sampling interval of 60 s. The simulated
road network, as shown in Figure 6, contains 136 road segments with a total length of
22 km. In the simulated dataset, traffic congestion was pre-set according to the definition
of traffic congestion in Lu (2001). Table 2 gives the quantitative, evaluative results of the
detected traffic congestion with different γ values. One can see that the precision value is
positively proportional to the γ values, but more traffic congestion was missed with the
use of larger γ values. The precision values are at least 0.75 in the range γ = [2/3, 5/6], while
the recall values exceed 0.80 in the range γ = [1/6, 2/3]. With the combination of the
precision and recall values, the detected result with γ = 2/3 has the largest value of F1,
where F1 = (2*precision*recall)/(precision+recall). To a large degree, this result can
indicate the rationality of the γ value setting strategy using traffic signal cycles. When
applying the proposed method to detect traffic congestion in a new city without informa
tion about the traffic signal cycles, the following two kinds of strategies can be used to
carry out our method. If there are official records of traffic congestion, the optimal γ value
can be directly determined by comparing the precision and recall of the detected results
obtained by different γ values. Otherwise, it is necessary to estimate the value of τ to
further calculate the γ value. In fact, a large number of models have been constructed to
effectively estimate traffic signal cycles based on vehicle trajectories (Fayazi et al. 2015, Yu
and Lu 2016, Yang et al. 2018, Yu et al. 2018). For example, Yang et al. (2018) first extracted
stop-and-go patterns at an intersection using speed information from vehicle trajectories
to determine the start times of the green lights. On this basis, a traffic signal cycle can be
further estimated from the sequence of green light start times obtained by employing
periodicity mining approaches.
Furthermore, we evaluated the performance of the proposed method by performing
experiments with different TW_width values and different trajectory point sampling
intervals, as shown in Table 3. One can see that, in the case that TW_width value is greater
than the sampling interval of the trajectory points, multiple-trajectory points of one
vehicle are recorded in the same time window, resulting in an increase in the running
time and a decrease in the detection precision. In addition, high TW_width values and
sampling intervals can reduce the running time. However, when these two values were
Figure 6. Spatial distributions of (a) a simulated road network and (b) the trajectory points simulated
by VISSIM.
both equal to or larger than 3 minutes, more traffic congestion could be exactly discri
minated, but under these conditions, the loss of the trajectory information of vehicles
seriously decreased the reliability of the proposed method.
Table 3. Results of the quantitative evaluation of traffic congestion detected with different TW_width
values and trajectory point sampling intervals.
Sampling interval (minute) TW_width (minute) Precision Recall F1 value Running time (second)
1 1 0.75 0.82 0.78 9.7
1 2 0.41 0.79 0.55 17.0
1 3 0.41 0.69 0.51 16.2
1 4 0.42 0.64 0.51 18.4
1 5 0.44 0.60 0.51 27.9
1 6 0.53 0.49 0.51 22.4
2 2 0.62 0.96 0.76 5.0
3 3 0.40 0.99 0.57 3.5
4 4 0.16 1 0.28 3.0
5 5 0.16 1 0.28 2.6
6 6 0.15 1 0.26 2.1
the precision and recall values. As shown in Table 4, the precision values of the results
obtained by the proposed method without the adaptive partitioning of moving orienta
tions (abbreviated as APMO) have significant variations among different vehicle penetra
tion rates. For the results obtained using Liu’s method, the precision values can achieve at
least 64%; however, the low recall values indicate that this method could only detect
approximately 30% traffic congestion when vehicle penetration rates were set as 6%–8%.
Kan’s method has stable precision values and significantly higher recall values than those
of other methods using trajectory data with different vehicle penetration rates. However,
relatively more unreliable traffic congestion was recognized by Kan’s method. Overall, the
highest precision values, relatively high recall values and the highest F1 values of the
proposed method demonstrate the superiority of this method over the other studied
methods in detecting traffic congestion using sparse vehicle trajectories. Moreover, by
comparison with the results of the detection performed using all trajectory points, one
can see that the low vehicle penetration rates have the least influence on the performance
of the proposed method. In other words, the proposed method is more suitable for the
detection of traffic congestion from sparse trajectory points than the other three
methods.
Figure 7 gives the variations in the numbers and total lengths of congested road
segments detected by different methods using trajectories of 10% of the probe vehicles,
wherein baselines were drawn based on the records of traffic congestion in the simulated
trajectories. Compared with the results obtained by the proposed method, the proposed
method conducted without APMO detected significantly more and longer congested road
segments due to the deficiency of the method in distinguishing opposite moving orienta
tions. The results detected by Liu’s method contain more congested roads with shorter
total lengths. The total lengths and numbers of congested road segments detected by
Kan’s method are greater than those of both the proposed and Liu’s methods.
Furthermore, Table 5 shows that the results detected by the proposed method have
minimum root mean square error (i.e. RMSE) values, which quantitatively proves the
effectiveness and advantages of the proposed method in detecting traffic congestion
using sparse vehicle trajectories.
16 Y. SHI ET AL.
Table 4. Quantitative evaluations of the detection results obtained with different vehicle penetration
rates.
Vehicle penetration rate Methods Precision Recall F1 value
6% The proposed method 0.73 0.56 0.63
The proposed method without APMO 0.48 0.58 0.52
Liu’s method 0.70 0.27 0.39
Kan’s method 0.45 0.70 0.55
7% The proposed method 0.72 0.61 0.66
The proposed method without APMO 0.71 0.65 0.68
Liu’s method 0.67 0.24 0.35
Kan’s method 0.49 0.86 0.62
8% The proposed method 0.84 0.59 0.69
The proposed method without APMO 0.56 0.57 0.57
Liu’s method 0.64 0.33 0.44
Kan’s method 0.50 0.93 0.65
9% The proposed method 0.78 0.70 0.74
The proposed method without APMO 0.58 0.66 0.61
Liu’s method 0.64 0.64 0.64
Kan’s method 0.50 0.88 0.64
10% The proposed method 0.80 0.68 0.74
The proposed method without APMO 0.40 0.72 0.51
Liu’s method 0.67 0.59 0.63
Kan’s method 0.48 0.99 0.64
100% The proposed method 0.75 0.82 0.78
The proposed method without APMO 0.80 0.57 0.66
Liu’s method 0.20 1 0.34
Kan’s method 0.57 0.78 0.66
Figure 7. Variations in the (a) numbers and (b) total lengths of the congested road segments detected
by different methods using trajectories of 10% of the probe vehicles.
Table 5. The RMSE values of the results obtained with different methods using trajectories of 10% of
the probe vehicles.
Variables Methods RMSE
The number ofcongested road segments The proposed method 0.87
The proposed method without APMO 5.96
Liu’s method 2.35
Kan’s method 4.25
The total length of congested road segments The proposed method 251.18
The proposed method without APMO 2055.28
Liu’s method 445.70
Kan’s method 763.70
and TN_2 represent the straight driving direction from the southeast to the northwest of
Wuluo Road and the left turning direction into Shipailing Road, respectively. As shown in
Figure 9(a), no congested moving clusters were detected from 2:00 to 6:00 in either TN_1
or TN_2 due to the high vehicle speeds. In the daytime, a significant decrease in vehicle
speed was observed on Wuluo Road, and long queues of vehicles caused traffic conges
tion from 9:00 to 12:00 and from 15:00 to 18:00. Note that vehicles drive more slowly in
the TN_2 direction, which may be caused by the frequent deceleration of vehicles at the
intersection when turning left. In addition, the narrower width of the Shipailing Road can
substantially increase the waiting time for red signal lights. However, traffic congestion
was detected only at approximately 10:00 and 18:00 in the TN_2 direction, which indicates
that the proposed method can effectively distinguish long vehicle queues induced by
traffic congestion from common deceleration behaviours.
TN_3 is the opposite direction of TN_1 on Wuluo Road. As shown in Figure 9(b), the
vehicle speed in the TN_3 direction obviously decreased from 6:00 to 7:00 since traffic
congestion was frequently identified until 12:00. As the two driving directions on Wuluo
Road, TN_1 and TN_3 exhibit significant differences in their traffic congestion durations
and congested segment lengths. Specifically, congestion in the TN_3 directions occurred
earlier than that in the TN_1 direction in morning peak hours, while TN_1 experienced
traffic congestion for a longer duration in the evening peak hours. This phenomenon
Figure 8. Different turning directions at the intersection between Wuluo road and Shipailing road.
18 Y. SHI ET AL.
Figure 9. Variations in average vehicle speed and length of the congested road segments detected in
the (a) TN_1 and TN_2, (b) TN_3 and TN_4 and (c) TN_5 and TN_6 directions.
reveals that the majority of residents commute to workplaces (e.g. Optical Valley south
east of Wuchang) via Wuluo Road in the TN_3 direction and return home after work in the
TN_1 direction.
TN_4 denotes the right-turning direction from Wuluo Road to Shipailing Road. Figure 9(b)
indicates that vehicles in the TN_3 and TN_4 directions moved at nearly the same speeds due
to no restrictions by the traffic signal lights. As nearby residents went off work for lunch,
vehicles began to queue at approximately 11:00 in the TN_4 direction, and traffic congestion
in this direction persisted until approximately 11:30.
TN_5 and TN_6 represent the left and right turning directions from Shipailing Road
onto Wuluo Road, respectively. As shown in Figure 10, there are only two lanes on
Shipailing Road for vehicles to drive into Wuluo Road: the left lane for turning left only
and the right lane for turning both left and right. Consequently, right-turning vehicles are
often affected by left-turning vehicles that wait at the red signal lights, and thus, the
vehicle speeds in the TN_5 and TN_6 directions have approximately the same values and
INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE 19
similar variation tendencies with time (as shown in Figure 9(c)). As it is connected with
Wuluo Road, which is the arterial road in Wuchang, the traffic flow on Shipailing Road can
be efficiently diverted by the multiple lanes of Wuluo Road. For this reason, only small-
scope traffic congestion was formed and persisted only for short-time periods during
peak hours.
Figure 10. The intersection between Shipailing road and Wuluo road in the real-life scenario.
20 Y. SHI ET AL.
Figure 11. Temporally varying numbers of congested moving clusters from 1 to 6 May.
more congested during the evening peak hours than during the morning rush hours, as
tourists may prefer to sightsee in the afternoon. On working days (i.e. 4 to 6 May), the
traffic congestion presented significant dispersed distributions in Wuchang. Caused by
the daily commuting of residents, road segments leading to working areas, such as Luoyu
Road, Zhongnan Road and Zhongbei Road, became regularly congested during peak
hours on workdays. Moreover, frequent traffic congestion also took place on several roads
around large residential communities, e.g. on Youyi Avenue, Wenxin Street and
Zhuodaoquan South Road, due to concentrated residential travel. Additionally, those
main roads (such as Wuluo Road, Zhongshan Road and Jiefang Road) carried large
volumes of traffic on both holidays and workdays. Comparatively, there was significantly
less traffic congestion on other side-road segments, which may to a large extent reflect
the spatial heterogeneity and multicentricity of urban traffic congestion.
In summary, the proposed method has the ability to precisely detect the spatiotemporal
extent of traffic congestion in a road network space. The temporally dependent patterns of
traffic congestion on urban road networks further reveal the complicated travel regularities
of both residents and tourists at different times, which will guide dynamic route planning in
intelligent navigation systems and the optimization of road networks in urban planning.
Figure 12. Spatiotemporal distributions of the detected traffic congestion from 1 to 6 May.
Future work will focus on three aspects. The first aspect is to trace the multiple sources
of traffic congestion by analysing the implicit spatiotemporal correlations among differ
ent congested road segments. Second, we will modify the proposed method using
parallel computing strategies to realize the real-time detection of traffic congestion on
road networks. Furthermore, the deep features of urban traffic congestion in space and
22 Y. SHI ET AL.
time will be extracted using modified deep learning frameworks to perform accurate
spatiotemporal predictions of traffic congestion.
Disclosure statement
No potential conflict of interest was reported by the author(s).
Funding
This work was supported by the National Natural Science Foundation of China (NSFC) [No.
42071452, 41730105 and 41771492]; the Natural Science Foundation of Hunan Province, China
[No.2020JJ4696]; the Hunan Key Research and Development Project [No, 2018SK2052] and the
Independent Exploration and Innovation Project Fund Designated for Graduate Students of Central
South University [No.2020zzts695].
Notes on contributors
Yan Shi is currently an associate professor in the school of Geosciences and Info-physics, Central
South University, Changsha, Hunan, China. He works in the area of spatio-temporal clustering,
anomaly detection and association rule mining.
Da Wang is currently a postgraduate in the school of Geosciences and Info-physics, Central South
University, Changsha, Hunan, China. He majors in trajectory data mining.
Jianbo Tang is currently a lecturer in the school of Geosciences and Info-physics, Central South
University, Changsha, Hunan, China. He works in the area of spatio-temporal clustering and map
generalization.
Min Deng is currently a professor in the school of Geosciences and Info-physics, Central South
University, Changsha, Hunan, China. He works in the area of geographical big data mining.
Huimin Liu is currently an associate professor in the school of Geosciences and Info-physics, Central
South University, Changsha, Hunan, China. She works in the area of spatio-temporal data fusion and
map generalization.
Baoju Liu received the Ph.D. degree from the School of Geosciences and Info-physics, Central South
University, Changsha, Hunan, China. His research interests include traffic flow analysis and urban
transportation programming.
ORCID
Yan Shi [Link]
References
An, S., Yang, H., and Wang, J., 2018. Revealing recurrent urban congestion evolution patterns with
taxi trajectories. ISPRS International Journal of Geo-Information, 7 (4), 18. doi:10.3390/
ijgi7040128.
Bock, F., Di Martino, S., and Origlia, A., 2019. Smart parking: using a crowd of taxis to sense on-street
parking space availability. IEEE Transactions on Intelligent Transportation Systems, 21 (2), 496–508.
doi:10.1109/TITS.2019.2899149.
Borruso, G., 2005. Network density estimation: analysis of point patterns over a network. In:
International Conference on Computational Science and Its Applications (ICCSA 2005), 9–12 May
Singapore. Singapore: Springer, 126–132. doi:10.1007/11424857_14.
Chaurasia, B.K., Manjoro, W.S., and Dhakar, M., 2020. Traffic congestion identification and reduction.
Wireless Personal Communications, 12, 1–20. doi:10.1007/s11277-020-07420-0
Chen, B.Y., et al., 2014. Map-matching algorithm for large-scale low-frequency Floating Car Data.
International Journal of Geographical Information Science, 28 (1), 22–38. doi:10.1080/
13658816.2013.816427.
Chow, A.H.F., et al., 2014. Empirical assessment of urban traffic congestion. Journal of Advanced
Transportation, 48 (8), 1000–1016. doi:10.1002/atr.1241.
Colak, S., Lima, A., and González, M.C., 2016. Understanding congested travel in urban areas. Nature
Communications, 7 (1), 10793. doi:10.1038/ncomms10793.
Deng, M., et al., 2019. A density-based approach for detecting network-constrained clusters in
spatial point events. International Journal of Geographical Information Science, 33 (3), 466–488.
doi:10.1080/13658816.2018.1541177.
Ester, M., et al., 1996. A density-based algorithm for discovering clusters in large spatial databases
with noise. International Conference on Knowledge Discovery and Data Mining, 96 (34), 226–231.
Fayazi, S.A., et al., 2015. Traffic signal phase and timing estimation from low-frequency transit bus
data. IEEE Transactions on Intelligent Transportation Systems, 16 (1), 19–28. doi:10.1109/
tits.2014.2323341.
Giachetta, G., Mangiarotti, L., and Sardanashvily, G., 2009. Advanced classical field theory. Singapore:
World Scientific.
Guo, D., 2008. Regionalization with dynamically constrained agglomerative clustering and partition
ing (REDCAP). International Journal of Geographical Information Science, 22 (7), 801–823.
doi:10.1080/13658810701674970.
Gupta, A., Choudhary, S., and Paul, S., 2013. DTC: a framework to detect traffic congestion by mining
versatile GPS data. In: Proceedings of the 1st International Conference on Emerging Trends and
Applications in Computer Science (ICETACS), 13-14 September Shillong, India: IEEE, 97–103.
doi:10.1109/ICETACS.2013.6691403.
He, Z., et al., 2019. Network-wide identification of turn-level intersection congestion using only
low-frequency probe vehicle data. Transportation Research Part C: Emerging Technologies, 108,
320–339. doi:10.1016/[Link].2019.10.001.
Hollander, Y. and Liu, R., 2008. Estimation of the distribution of travel times by repeated simulation.
Transportation Research Part C Emerging Technologies, 16 (2), 212–231. doi:10.1016/j.
trc.2007.07.005.
Hu, H.Q., 2016. Crowd sourcing-based real-time urban traffic speed estimation: from trends to
speed. In: Proceedings of the 32nd International Conference on Data Engineering (ICDE), 16-20
May Helsinki, Finland: IEEE, 883–894. doi:10.1109/ICDE.2016.7498298.
Huang, Y., Chen, C., and Dong, P., 2008. Modeling herds and their evolvements from trajectory data.
In: Proceedings of the 5th International Conference on Geographic Information Science, 23-26
September Berlin, Germany: Springer, 90–105. doi:10.1007/978-3-540-87473-7_6.
Hwang, S., et al., 2018. Segmenting human trajectory data by movement states while addressing
signal loss and signal noise. International Journal of Geographical Information Science, 32 (7),
1391–1412. doi:10.1080/13658816.2018.1423685.
Inoue, R., Miyashita, A., and Sugita, M., 2016. Mining spatio-temporal patterns of congested traffic in
urban areas from traffic sensor data. In: Proceedings of the 19th International Conference on
24 Y. SHI ET AL.
Intelligent Transportation Systems (ITSC), 1-4 November Rio de Janeiro, Brazil: IEEE, 731–736.
doi:10.1109/ITSC.2016.7795635
Jenelius, E. and Koutsopoulos, H.N., 2013. Travel time estimation for urban road networks using low
frequency probe vehicle data. Transportation Research Part B Methodological, 53, 64–81.
doi:10.1016/[Link].2013.03.008
Kalnis, P., Mamoulis, N., and Bakiras, S., 2005. On discovering moving clusters in spatio-temporal
data. In: Proceedings of the 9th international conference on Advances in Spatial and Temporal
Databases, 22-24 August Angra dos Reis, Brazil: Springer, 364–381. doi:10.1007/11535331_21.
Kan, Z., et al., 2019. Traffic congestion analysis at the turn level using taxis’ GPS trajectory data.
Computers, Environment and Urban Systems, 74, 229–243. doi:10.1016/[Link].2018.11.007.
Keler, A., 2017. Traffic pattern analysis framework with emphasis on Floating Car Data (FCD).
Augsburg, Germany: Universität Augsburg.
Keler, A., Krisp, J.M., and Ding, L., 2017. Detecting traffic congestion propagation in urban
environments-a case study with Floating Taxi Data (FTD) in Shanghai. Journal of Location Based
Services, 11 (2), 133–151. doi:10.1080/17489725.2017.1420256.
Li, W. and Dai, H.Y., 2016. Real-time road congestion detection based on image texture analysis.
Procedia Engineering, 137, 196–201. doi:10.1016/[Link].2016.01.250
Liu, C., Qin, K., and Kang, C., 2015. Exploring time-dependent traffic congestion patterns from taxi
trajectory data. In: Proceedings of the 2nd IEEE International Conference on Spatial Data Mining and
Geographical Knowledge Services (ICSDM), 8-10 July Fuzhou, China: IEEE, 39–44. doi:10.1109/
ICSDM.2015.7298022.
Liu, Q., et al., 2019. Data-driven intelligent location of public charging stations for electric vehicles.
Journal of Cleaner Production, 232, 531–541. doi:10.1016/[Link].2019.05.388.
Liu, X. and Ban, Y., 2013. Uncovering spatio-temporal cluster patterns using massive Floating Car
Data. International Journal of Geo-Information, 2 (2), 371–384. doi:10.3390/ijgi2020371.
Long, J.A., 2016. Kinematic interpolation of movement data. International Journal of Geographical
Information Science, 30 (5), 854–868. doi:10.1080/13658816.2015.1081909.
Lou, Y., et al., 2009. Map-matching for low-sampling-rate GPS trajectories. In: Proceedings of the 17th
ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems,
4-6 November Seattle, WA: ACM, 352–361. doi:10.1145/1653771.1653820.
Lu, H.P., 2001. Analysis of the urban traffic. Beijing: China water power press.
Macqueen, J., 1967. Some methods for classification and analysis of multivariate observations.
Proceedings of the 15th Berkeley Symposium on Mathematical Statistics and Probability, 1 (14),
281–297.
Nolle, T., Schweizer, I., and Janssen, F., 2014. Data-driven detection of congestion-affected roads.
Technical Report TUD-KE-2014-02.
Ong, R., et al., 2011. Traffic jams detection using flock mining. In: Proceedings of the European
Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases
(ECML PKDD), 5-9 September Athens, Greece, 6913, 650–653. doi:10.1007/978-3-642-23808-6_49.
Pei, T., et al., 2015. Density-based clustering for data containing two types of points. International
Journal of Geographical Information Science, 29 (2), 175–193. doi:10.1080/13658816.2014.955027.
Pu, W., 2012. Analytic relationships between travel time reliability measures. Transportation Research
Record: Journal of the Transportation Research Board, 2254 (1), 122–130. doi:10.3141/2254-13.
Rempe, F., Huber, G., and Bogenberger, K., 2016. Spatio-temporal congestion patterns in urban
traffic networks. Transportation Research Procedia, 15, 513–524. doi:10.1016/[Link].2016.06.043
Shannon, C.E., 1948. A mathematical theory of communication. The Bell System Technical Journal, 27
(379–423), 623–656. doi:10.1002/j.1538-7305.1948.tb00917.x.
Shiode, S., 2011. Street-level spatial scan statistic and STAC for analyzing street crime
concentrations. Transactions in GIS, 15 (3), 365–383. doi:10.1111/j.1467-9671.2011.01255.x.
Shiode, S. and Shiode, N., 2009. Detection of multi-scale clusters in network space. International
Journal of Geographical Information Science, 23 (1), 75–92. doi:10.1080/13658810801949843.
Taniarza, N. and Akbar, S., 2017. Anomalous trajectory detection from taxi GPS traces using
combination of iBAT and DTW. In: Proceedings of the 6th International Conference on Electrical
Engineering and Informatics (ICEEI), 5-9 November Langkawi, Malaysia: IEEE, 1–5.
INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE 25
Terroso-Saenz, F., et al., 2012. A cooperative approach to traffic congestion detection with complex
event processing and VANET. IEEE Transactions on Intelligent Transportation Systems, 13 (2),
914–929. doi:10.1109/TITS.2012.2186127.
Tobler, W., 1970. A computer movie simulating urban growth in the Detroit region. Economic
Geography, 46 (2), 234–240. doi:10.2307/143141.
Wang, H.D., Yang, Y., and Li, Q.Q., 2013. How many probe vehicles are enough for identifying traffic
congestion? -a study from a streaming data perspective. Frontiers of Earth Science, 7 (1), 34–42.
doi:10.1007/s11707-012-0343-x.
Wang, S., et al., 2011. Data field for hierarchical clustering. International Journal of Data Warehousing
and Mining, 7 (4), 43–63. doi:10.4018/jdwm.2011100103.
Wang, Y., et al., 2018. Detecting anomalous trajectories and behavior patterns using hierarchical
clustering from taxi GPS data. ISPRS International Journal of Geo-information, 7 (1), 25.
doi:10.3390/ijgi7010025.
Wu, N.N. and She, S.Y., 2016. Influence of online booking taxis on urban taxi system based on big data
analysis technology. Traffic and Transportation, 01, 238–241. doi:10.3969/j..1671-3400.2016.z1.052
Xu, L., Yue, Y., and Li, Q.Q., 2013. Identifying urban traffic congestion pattern from historical Floating
Car Data. In: Proceedings of the 13th COTA International Conference of Transportation Professionals
(CICTP), 13-16 August Shenzhen, China, 96, 2084–2095. doi:10.1016/[Link].2013.08.235.
Xu, W., Qin, K., and Wang, Y., 2018. Congestion detection and distribution pattern analysis based on
spatiotemporal density clustering. In: Proceedings of the 26th International Conference on
Geoinformatics, 28-30 June Kunming, China: Yunnan Normal Univ, 1–7.
Yamada, I. and Thill, J.C., 2007. Local indicators of network-constrained clusters in spatial point
patterns. Geographical Analysis, 39 (3), 268–292. doi:10.1111/j.1538-4632.2007.00704.x.
Yang, Q., et al., 2018. Traffic signals timing cycle length learning: using taxi GPS trajectories. In:
Proceedings of the International Conference on Machine Learning and Cybernetics (ICMLC), 15-
18 July Chengdu, China: IEEE, 13–18. doi:10.1109/ICMLC.2018.8526996.
Yu, J., et al., 2018. Detecting regularities of traffic signal timing using GPS trajectories. IEICE
Transactions on Information and System, 101 (4), 956–963. doi:10.1587/transinf.2016IIP0022.
Yu, J. and Lu, P., 2016. Learning traffic signal phase and timing information from low-sampling rate
taxi GPS trajectories. Knowledge-Based Systems, 110, 275–292. doi:10.1016/[Link].2016.07.036
Yu, Q., et al., 2019. Road congestion detection based on trajectory stay-place clustering. ISPRS
International Journal of Geo-Information, 8 (6), 264. doi:10.3390/ijgi8060264.
Yuan, G., et al., 2019. APDS: a framework for discovering movement pattern from trajectory database.
International Journal of Distributed Sensor Networks, 15 (11), 11. doi:10.1177/1550147719888164.
Yuan, Q., et al., 2014. A traffic congestion detection and information dissemination scheme for urban
expressways using vehicular networks. Transportation Research Part C: Emerging Technologies, 47,
114–127. doi:10.1016/[Link].2014.08.001.
Zhao, P., et al., 2016. A trajectory clustering approach based on decision graph and data field for
detecting hotspots. International Journal of Geographical Information Science, 31 (6), 1101–1127.
doi:10.1080/13658816.2016.1213845.
Zheng, J. and Liu, H.X., 2017. Estimating traffic volumes for signalized intersections using connected
vehicle data. Transportation Research Part C: Emerging Technologies, 79, 347–362. doi:10.1016/j.
trc.2017.03.007
Zheng, L., et al., 2020. A tensor-based K-nearest neighbors method for traffic speed prediction under data
missing. Transportmetrica B: Transport Dynamics, 8 (1), 182–199. doi:10.1080/21680566.2020.1732247.