0% found this document useful (0 votes)
10 views26 pages

Shi 2021

This research article presents a density-based moving object clustering approach for detecting spatiotemporal extents of traffic congestion. The method improves upon existing techniques by accurately extracting traffic congestion characteristics through a three-step process involving map-matching, statistical detection of moving clusters, and analysis of vehicle speed and time spans. Comparative experiments demonstrate that this approach outperforms traditional methods, offering insights for dynamic route planning and road network optimization.

Uploaded by

alyssadmaliksi
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
10 views26 pages

Shi 2021

This research article presents a density-based moving object clustering approach for detecting spatiotemporal extents of traffic congestion. The method improves upon existing techniques by accurately extracting traffic congestion characteristics through a three-step process involving map-matching, statistical detection of moving clusters, and analysis of vehicle speed and time spans. Comparative experiments demonstrate that this approach outperforms traditional methods, offering insights for dynamic route planning and road network optimization.

Uploaded by

alyssadmaliksi
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

International Journal of Geographical Information

Science

ISSN: (Print) (Online) Journal homepage: [Link]

Detecting spatiotemporal extents of traffic


congestion: a density-based moving object
clustering approach

Yan Shi, Da Wang, Jianbo Tang, Min Deng, Huimin Liu & Baoju Liu

To cite this article: Yan Shi, Da Wang, Jianbo Tang, Min Deng, Huimin Liu & Baoju Liu
(2021): Detecting spatiotemporal extents of traffic congestion: a density-based moving
object clustering approach, International Journal of Geographical Information Science, DOI:
10.1080/13658816.2021.1905820

To link to this article: [Link]

Published online: 01 Apr 2021.

Submit your article to this journal

Article views: 143

View related articles

View Crossmark data

Full Terms & Conditions of access and use can be found at


[Link]
INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE
[Link]

RESEARCH ARTICLE

Detecting spatiotemporal extents of traffic congestion: a


density-based moving object clustering approach
a
Yan Shi , Da Wanga, Jianbo Tanga,b, Min Denga, Huimin Liua and Baoju Liua
a
Department of Geo-informatics, Central South University, Changsha, Hunan, China; bState Key Laboratory of
Information Engineering in Surveying, Mapping & Remote Sensing, Wuhan University, Wuhan, China

ABSTRACT ARTICLE HISTORY


Traffic congestion detection poses challenges in spatiotemporal Received 28 November 2019
data mining and intelligent transportation research. Existing studies Accepted 16 March 2021
primarily detect traffic congestion based on the speed estimation of KEYWORDS
traffic flows. Such detection techniques may overlook the formation Traffic congestion; road
of traffic congestion in space and time. This research proposes network space; density-
a density-based approach to moving object clustering that extracts based clustering; moving
the spatiotemporal extents of traffic congestion in three steps. The clusters
first step applies a map-matching strategy to project original tra­
jectory points in a planar space onto a road network space and
segments the trajectories into consecutive time windows. In
the second step, we statistically detect moving clusters with sig­
nificantly high-density subject to network constrained clustering.
The final third step determines moving clusters indicative of traffic
congestion through the analysis of both vehicle speed and time
spans. Comparative experiments on both simulated trajectories and
the real-life taxi trajectories in Wuchang demonstrate that the
proposed method outperforms other methods through quantita­
tive evaluations using three indicators, i.e. the precision, recall and
F1 value. The proposed approach can illustrate the spatiotemporal
regularities of traffic congestion, which can inform dynamic route
planning and network design optimization.

1. Introduction
Rapid urbanization has caused traffic congestion, which has in turn blocked sustainable
urban development by reducing travel efficiency and aggravating air pollution. It is
important to identify traffic congestion accurately, adequately and in a timely manner
in space and time for the realization of traffic dispersion, the dynamic planning of driving
routes and the optimization of road network structures (Colak et al. 2016). As a result,
traffic congestion detection has become a hotspot in the field of spatiotemporal data
mining and in intelligent transportation research.
Many studies have been performed to detect traffic congestion using traffic flow
information monitored by induction coils or surveillance cameras during the past decades
(Li and Dai 2016, Inoue et al. 2016). Due to restrictions on installation locations and the high

CONTACT Baoju Liu baojuliu@[Link]


This article has been republished with minor changes. These changes do not impact the academic content of the article.
© 2021 Informa UK Limited, trading as Taylor & Francis Group
2 Y. SHI ET AL.

cost of equipment, neither induction coils nor surveillance cameras have the ability to
monitor traffic status over an entire area without blind spots (Yuan et al. 2014). With the
development of positioning technologies, GPS systems mounted on taxies can sense the
spatial locations, driving directions and speeds of vehicles in near-real time with low costs
and high efficiencies. As an important transportation mode for residents’ daily trips in large
cities, such as Beijing, taxies compose approximately 5% of all vehicles on the road,
accounting for their high travel mileage (He et al. 2019). In addition, taxis can provide travel
services for 24 hours, so the massive amount of road-traffic-related information collected in
taxi trajectory data can, to a large degree, reflect road traffic conditions (Jenelius and
Koutsopoulos 2013). Along with the low cost of sampling, taxi trajectory data have been
widely utilized for traffic status estimations (Hu 2016, Zheng et al. 2020), anomalous
trajectory detections (Taniarza and Akbar 2017, Wang et al. 2018), taxi origin-destination
hotspot extractions (Pei et al. 2015, Zhao et al. 2016, Deng et al. 2019) and other intelligent
transportation systems (ITSs) (Bock et al. 2019, Liu et al. 2019). In addition, taxi trajectory
data have become a new focus in research on the detection of traffic congestion (Xu et al.
2018, He et al. 2019). However, existing studies have mostly treated the traffic flow speed as
the singular variation used for the identification of traffic congestion, making it difficult to
distinguish authentic traffic congestion from behaviours such as waiting at traffic signal
lights or parking (Gupta et al. 2013, Keler et al. 2017). In fact, traffic congestion can be
treated as a group of gathered vehicles that have the characteristics of spatial adjacency
and significantly slow speeds. Therefore, traffic congestion detection can be translated into
an extraction of dense clusters containing slow-moving vehicles, namely, moving clusters
(Kalnis et al. 2005, Huang et al. 2008). This paper proposes a density-based moving-object
clustering approach for accurately extracting the spatiotemporal extents of traffic conges­
tion. The major contributions of this paper include:

● Improving the density-based moving-object clustering method for traffic congestion


detection;
● Extracting the delicate spatial extents and lifetimes of traffic congestion; and
● Performing a time-dependent analysis of traffic congestion on a road network.

The structure of this paper is organized as follows: Section 2 reviews the existing work
related to traffic congestion detection. In Section 3, the traffic congestion detection
strategy is presented, and the proposed method is elaborated upon. Section 4 performs
a comparative experimental analysis to demonstrate our approach. Section 5 summarizes
the interesting findings and highlights the future research directions.

2. Related work
Most existing studies originally detected traffic congestion on highways using vehicle
speed information acquired from induction coils. Compared with highways, urban roads
have various levels and vast numbers of intersections, as well as complicated spatial
topological relationships (Chow et al. 2014). Thus, with the help of spatial clustering
strategies, specific methods have been subsequently developed for the extraction of
congested urban areas, road segments and lanes. A detailed review of these methods is
given below.
INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE 3

2.1. Methods for the detection of congested regions


From a macroscopic perspective, some studies have been conducted to extract local regions
wherein traffic congestion occurs in a given city. For example, Liu et al. (2015) aggregated
low-speed GPS points using the DBSCAN algorithm to obtain congested urban regions and
further studied their interactions in the road network space. Rempe et al. (2016) clustered
connected roads using relative vehicle speeds (i.e. the ratio of the average speed to the
maximum permissible speed) less than 0.1 to extract local congested regions. Keler (2017)
employed the OPTICS clustering algorithm to identify congested regions based on low-
speed trajectory points obtained during peak hours. An et al. (2018) first utilized the
DBSCAN algorithm to cluster grids that recorded low-speed values and then discovered
the periodicity of urban traffic congestion. Yu et al. (2019) proposed a stay-place clustering
method for the detection of congestion regions using trajectory data.

2.2. Methods for the detection of congested road segments


The majority of current relevant studies have focused on the identification of road segments with
traffic congestion. Ong et al. (2011) combined the discovery of flock patterns with dynamic speed
constraints to detect congested road segments in Pisa. Liu and Ban (2013) detected congested road
segments by aggregating low-speed and trajectory stop points into spatiotemporal clusters,
essentially simulating moving clusters. Wang et al. (2013) performed a quantitative analysis of the
influences of probe vehicle numbers and sampling intervals on the detection accuracy of road
segments with traffic congestion. Xu et al. (2013) estimated traffic states on road segments using
travel speeds, and congested road subnetworks were further identified by spatiotemporally
clustering low-speed road segments. Chaurasia et al. (2020) first aggregated trajectory segments
and utilized the average speeds of trajectory clusters to detect traffic congestion.

2.3. Methods for the detection of congested lanes


To facilitate accurate traffic dispersion and route planning, it is necessary to identify the spatial
extent of traffic congestion at the lane level. For instance, traffic status information obtained by
induction coils or vehicle-to-vehicle communications has been adopted to identify congested lanes
(Terroso-Saenz et al. 2012, Nolle et al. 2014). In addition, Zheng and Liu (2017) built a model to
estimate the number of arriving vehicles in different lanes at road intersections using datasets
captured from connected vehicle technology. Using the trajectories of taxis, Kan et al. (2019) filtered
valid trajectory segments considering three features (i.e. net curvature, distance and speed) from
which congested segments with different intensities were detected. Furthermore, a spatial cluster­
ing operation was performed to identify traffic congestion at the turn level.
Through the review of previous studies, we find that there are still several unsolved issues in
current traffic congestion detection methods. On the one hand, it seems difficult for the current
methods to distinguish traffic congestion from normal queueing behaviours caused by vehicles
waiting for red lights. On the other hand, vehicles located in the same congested road segment
generally have both consistent moving orientations and significantly slow speeds. Thus, the
directions and magnitudes of vehicle velocities should be combined in detections of traffic
congestion. Considering the divergences of vehicle speeds among different cities and time
periods, it is difficult to determine a fixed threshold for the reasonable value of the upper bound
4 Y. SHI ET AL.

of vehicle speeds in traffic congestion. However, previous studies have not typically accounted
for the diversity in the moving orientations or the distributions of the speeds of vehicles,
resulting in the misrecognition of traffic congestion. In essence, a traffic flow constitutes a series
of continuously moving vehicles on roads. In cases of traffic congestion, there is not enough
space for overtaking or lane changing. The affected vehicles will be aggregated as significantly
dense clusters that move with extremely slow speeds compared with those of clusters running
on smooth road segments (Liu and Ban 2013). As a result, it can be deduced that traffic
congestion can be effectively detected by extracting dense moving clusters constituted by
slow-moving vehicles. However, due to insufficient estimations of the spatial distribution
characteristics of vehicles on a road network, existing spatial clustering-based methods lack
adaptivity and availability when employed in traffic congestion detections.
To address the abovementioned issues, this study proposes a new strategy to unveil
the spatiotemporal propagation process of traffic congestion based on the objective
detection of moving clusters with high densities and slow speeds.

3. A density-based moving object clustering method for traffic congestion


detection
Regardless of the various traffic conditions present in different cities or in distinct functional
areas of a city, traffic congestion on a road network can be detected by the extraction of
moving clusters. For example, Figure 1(a) displays the trajectory points of 13 vehicles using
different symbols at three consecutive timestamps. Some spatially adjacent vehicles, i.e. the
points surrounded by ellipses in Figure 1(b), can form a cluster in each timestamp. As at
least six of the thirteen vehicles are contained in each of the three clusters, these clusters
can be treated as a moving cluster that represents the continuous aggregation of vehicles.
These moving clusters with extremely slow speeds indicate the occurrence of traffic
congestion. Based on this qualitative description, this study proposes a strategy to quanti­
tatively extract the spatiotemporal extents of urban traffic congestion as follows.

Figure 1. Formation of a moving cluster. (a) Consecutive time windows that contain trajectory points.
(b) Moving cluster determination in consecutive time windows.
INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE 5

3.1. A combination of map matching and time slicing for trajectory points
Unavoidable location errors impede the detection of traffic congestion from original
vehicle trajectory points in a road network space. Thus, these trajectory points must
be aligned with the road segments by a map-matching operation. To address the
uncertainty problem caused by a low sampling rate of vehicle trajectories (Chen et al.
2014), Lou et al. (2009) combined the geometric and topological information of
a road network with the speed constraints of vehicles to design an effective map-
matching algorithm called ST-Matching. The power of ST-Matching to improve the
accuracy and efficiency of low-sampling-rate trajectory matching has been demon­
strated through many experiments. This study will employ this algorithm to map the
original trajectory points from a Euclidean space to a road network space.
In this way, the trajectory of any vehicle Vi can be represented as a set of points, i.e. Vi
= {(xi.1, yi.1, ri.1, ti.1), (xi.2, yi.2, ri.2, ti.2), . . ., (xi.k, yi.k, ri.k, ti.k), . . . }, where (xi.k, yi.k, ri.k, ti.k)
represents the spatial position of Vi after it is matched to road segment ri.k at timestamp
ti.k. To facilitate the spatiotemporal analysis of the vehicle trajectories, all trajectory points
are projected onto a series of continuous time windows {TW1, TW2, . . . TWm} with the same
widths based on their timestamps, as obtained by the snapshot model. Given any
trajectory point (xi.k, yi.k, ri.k, ti.k) of vehicle Vi and the time window TWk’, if ti.k lies within
the range [TWk’.start, TWk’.end], where TWk’.start and TWk’.end denote the start and end
times of TWk’, respectively, this trajectory point is considered to be located in TWk’ and is
re-expressed as (xi.k, yi.k, ri.k, TWk’). Regarding irregular sampling trajectory points, we can
set the time window width to the adopted time interval to sample the most trajectory
points. In this way, as many vehicles as possible that have exactly one sampling point in
each time window can be incorporated. Furthermore, for trajectory points with sampling
intervals larger than the time window width, the records of these vehicles that are missing
in some time windows are interpolated by gap-filling strategies with the help of the
recorded points in the adjacent time windows; these gap-filling strategies include linear
(Yuan et al. 2019) and kinematic (Long 2016, Hwang et al. 2018) interpolation methods.
The above operations can ensure that any vehicle will have trajectory points in each
selected time window. After the conversion operations are completed, trajectory points
located in the same time window are grouped into a snapshot. For example, the trajectory
of vehicle V3 in Figure 2(a) is denoted as {(x3.1, y3.1, t3.1), (x3.2, y3.2, t3.2), (x3.3, y3.3, t3.3)}. After
the time slicing operation, its trajectory can be expressed as {(x3.1, y3.1, TW1), (x3.2, y3.2, TW2)
and (x3.3, y3.3, TW3)}, as shown in Figure 2(b). Moreover, Figure 2(c-e) exhibits the spatial
distributions of vehicles in a road network space within the three time windows.

3.2. A hybrid method for detecting density-based moving clusters in road


networks
Spatial clusters are constituted by a set of points with both spatial proximity and non-
spatial attribute similarity. In real life, considering that the movements of vehicles are
generally restricted within road segments, the construction of spatial neighbourhoods
should be performed within the road network space instead of with the direct use of
Euclidean distances (Yamada and Thill 2007, Shiode and Shiode 2009, Shiode 2011).
Moreover, various moving orientations should also be considered to aggregate vehicles
6 Y. SHI ET AL.

Figure 2. An example of time slicing for trajectory points. (a) Trajectory points attached to spatial
location information and time stamps. (b) Projection of trajectory points onto time windows. (c-e)
Projected trajectory points in TW1, TW2 and TW3.

with consistent running behaviours. In this case, this study extends the density-based
clustering method described in Deng et al. (2019) by imposing moving orientation
restrictions and implementing space-time connections to adaptively extract network-
constrained moving clusters with significantly high densities as candidates for traffic
congestion. Specifically, the moving orientations of vehicles are first adaptively parti­
tioned into separate trajectory points that indicate significantly different driving direc­
tions (e.g. opposite directions). Based on this constraint, the density-based clustering idea
outlined in Deng et al. (2019) is incorporated to aggregate trajectory points with both
spatial adjacency and orientation homogeneity. However, the method in Deng et al.
(2019) focuses on clustering spatial points that are observed in the same time window.
Sequentially, this study further discovers moving clusters by measuring the overlapping
relationships between spatial clusters detected in adjacent time windows.
INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE 7

3.2.1. The measurement of network distances


With respect to the vehicle trajectory points, the previously mentioned ST-matching
algorithm has been employed to accurately aligned the trajectory points with the road
network by a spatiotemporal analysis. Thus, the trajectory points projected after the map
matching are utilized to measure the network distances using road network nodes. Given
a road network, there can be large numbers of paths that can be used to connect any two
projected trajectory points pi and pj. The path with the shortest distance is defined as the
network path between pi and pj, and this shortest distance is regarded as their network
distance, denoted as N_Dist(pi, pj). For example, the network distance between p1 and p4
in Figure 3 can be expressed as N_Dist(p1, p4) = E_Dist(p1, n2)+ E_Dist(n2, n1)+ E_Dist(n1, p4),
where E_Dist(.) denotes the Euclidean distances.

3.2.2. The construction of optimal spatial neighbourhoods


In classical density-based clustering algorithms, e.g. DBSCAN, prior knowledge or vast
amounts of experiments are required to determine the ranges of spatial neighbourhoods
(Ester et al. 1996). To improve the objectivity of the clustering results in this study, the data
field is combined with Shannon entropy to adaptively select the optimal ranges of the
spatial neighbourhoods (Zhao et al. 2016, Deng et al. 2019).
The first law of geography indicates a mutual attraction between two distinct objects in
space (Tobler 1970). The mutual attraction can be quantified using a data field model that
is derived from the concept of a physical field (Wang et al. 2011, Zhao et al. 2016, Deng
et al. 2019). Given any projected point pi with the mass mpi , its attraction intensity to
other points can be calculated as follows:
( � �2 )
XN N Distðpi; pjÞ
Attr Iðpi Þ ¼ mpi � exp (1)
j¼1 σ

where pj represents the other points in the road network; N_Dist(pi, pj) denotes the
network distance between pi and pj; and σ determines the spatial range of attraction
generated by any point in the road network space. The value of σ is consistent with the
radius of the spatial neighbourhood. Giachetta et al. (2009) demonstrated that the
attraction intensity is equivalent to the probabilistic density function with a normalized
constant. Thus, the network kernel density estimator proposed by Borruso (2005) is
employed to re-express the attraction intensity as follows:

Figure 3. An example of a network distance measurement. (a) Original and projected trajectory points
on a road network. (b) Network distance between p1 and p4.
8 Y. SHI ET AL.

( PN
nc� 1
j¼1 2σ expf ½ N Distðpi;pjÞ
pffiffi �2 g; when 0 < N Distðpi ;pj Þ � σ
Attr Iðpi Þ ¼ 2σ (2)
0; when N Distðpi ;pj Þ > σ

where nc is the normalized constant. The vehicle trajectory points are primarily recorded in
road segments rather than at intersections. For the convenience of calculation, the spatial
neighbourhood of any point pi is defined as a range of σ from pi in two directions, i.e. 2σ. On
this basis, Shannon entropy is further adopted to determine the optimal σ value by character­
izing the uncertainty of the attraction intensity distribution (Shannon 1948, Zhao et al. 2016,
Deng et al. 2019), which is expressed as follows.
XN Attr Iðpi Þ �Attr Iðpi Þ� XN
H¼ log where Z ¼ Attr Iðpi Þ (3)
i¼1 Z Z i¼1

The minimum entropy value indicates the highest asymmetry in the attraction intensities,
and core points with significantly stronger attraction intensities can be effectively identi­
fied using the corresponding σ. Thus, the σ value that minimizes entropy, denoted as σopt,
is selected to determine the optimal range of the spatial neighbourhoods.

3.2.3. Adaptive partitioning of moving orientations


The combination of temporally varying spatial locations and thematic attribute values com­
pletely constitutes the portrait of a trajectory. Spatially adjacent trajectory points that are
connected to different moving orientations may be produced by vehicles with heterogeneous
running behaviours, i.e. vehicles running in opposite lanes. Thus, the similarity of the moving
orientations of vehicles should first be measured in the clustering of trajectory points. This study
proposes a modified k-means clustering method to adaptively partition moving orientations.
Given the moving orientation values recorded in the trajectory points in each time window, the
k-means algorithm is primarily employed to partition these values into two clusters (Macqueen
1967). For any cluster Ci with n items, the inner-cluster variances can be calculated as follows:
Xn �� �
Ci var ¼ arccos cos θ m
� 2
θ (4)
m¼1

where θm and θ � denote the moving orientation value of the mth item and the mean
moving orientation value of all items, respectively. Furthermore, each cluster Ci is divided
into two subclusters (denoted as SCi−1 and SCi−2) using the k-means algorithm. The inner-
cluster variance loss of Ci can be expressed as follows (Guo 2008).
C i VL ¼ jC i var SC i 1 var SC i 2 varj (5)
The cluster with the maximal variance loss value has the greatest heterogeneity, and
divisions are thus continuously imposed on it. The inner-cluster relative variance loss of Ci
can be expressed as follows.
C i RVL ¼ C i VL C i 1 VL s:t:Ci 1 VL > Ci 2 VL (6)
This partition operation is iteratively implemented until the relative loss values of the
inner-cluster variances are smaller than an extremely small threshold (e.g. 0.1 in this
study). In this way, all moving orientation values can be adaptively partitioned into
homogeneous clusters. For each trajectory point, the attached moving orientation value
can be replaced by the tag of the corresponding cluster.
INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE 9

3.2.4. Statistical identification of core points


In the process of density-based clustering, clusters are formed by expansion from a series
of core points with a high local density. Instead of artificially assigning a lower bound
regarding the local density, statistical hypothesis tests are adopted in this study to identify
core points with statistical significance.
It has been proven that spatial points in a road network space approximately obey
a homogeneous Poisson distribution with a null hypothesis (Shiode 2011, Deng et al. 2019).
Given N points located in a road network with a length of rn_l, the probability that any road
segment with length rs_l contains k points is mathematically calculated as follows.

λk rs l
Pðjrs lj¼ kÞ¼ e λ � ; where λ ¼ N� (7)
k! rn l
Focusing on a given time window, for any point pi, if there are ni observed points that
(i) fall in the spatial neighbourhood of pi (SN_pi for short) and (ii) belong to the same
cluster as pi in terms of their moving orientations, then the probability that more than ni
points follow these two conditions can be expressed as follows.
0
Xni 1 0 λk 0 2σopt
λ
PðjSN pi j � ni Þ ¼ 1 PðjSN pi j < ni Þ ¼ 1 e ; where λ ¼ (8)
k¼0 k! rn l
If this probability is smaller than a sufficiently low significance level, it is almost
impossible that pi has more than the observed number of points in its spatial neighbour­
hood. In other words, pi can be regarded as a core point with a significantly high local
density, denoted as core_pi.

3.2.5. Extraction of moving clusters


Based on the constructed optimal spatial neighbourhoods, expansions from the deter­
mined core points can be performed to build distinct clusters in each time window. For any
core point core_pi, if any other point pj in the spatial neighbourhood of core_pi is attached to
the same cluster tag regarding its moving orientation with core_pi, pj is defined to be
density-reachable from core_pi. Selecting core_pi as a seed point, its density-reachable
points are aggregated into a cluster Ck with core_pi. For any other core point in Ck,
a similar operation is continuously carried out to aggregate its density-reachable points
to update the member of Ck. This process is iteratively repeated until all core points have
been visited. In this way, a set of spatial clusters, denoted as C= {Ci1, Ci2, Ci3 . . .}, can be
extracted in each time window.
Given any two clusters Ci∈TWi and Cj∈TWi+1, the consistency degree between Ci and
Cj can be quantified as follows:

Con C i ;C j ¼ ðC i \ C j Þ=C i (9)

where |Ci∩Cj| represents the volumes of the intersecting set of Ci and Cj with respect
to the cluster members. If Con(Ci, Cj) is greater than a given threshold γ, the
combination of Ci and Cj will be defined as a moving cluster with a time span of
two, denoted as MC{Ci→Cj}. By this analogy, a cluster Ck in TWi+2 will be added to
MC{Ci→Cj} only if Con(Cj, Ck)>γ; then, the moving cluster MC{Ci→Cj} can be updated as
MC{Ci→Cj→Ck}. This process is iteratively implemented to detect all moving clusters
within any time span.
10 Y. SHI ET AL.

3.3. Speed-based percentile function for determining traffic congestion


In a given road network, traffic congestion is involved in the moving clusters that are
detected from vehicle trajectory points. In this study, two factors, i.e. vehicle speeds and
the time spans of moving clusters, are analysed to determine traffic congestion using
a speed-based percentile function.
Given any moving cluster MCi with n time spans (i.e. TS_MCi = n), the average speed of
MCi can be approximately estimated as follows:
Xn
1 N DistðC i k :centre;C i kþ1 :centre; Þ
Speed MC i ¼ (10)
n 1 k¼1 TW width

where Ci_k.centre and Ci_k+[Link] represent the centre point of clusters in the kth and
(k+ 1)th time windows of MCi, respectively, and TW_width denotes the width of the time
windows. In this way, the speed of all n moving clusters can be estimated to form the set
Speed_MC = {Speed_MC1, Speed_MC2 . . ., Speed_MCn}. The speed of a traffic flow can be
depicted by a log-normal distribution (Hollander and Liu 2008). The logarithmic transfor­
mation of the set Speed_MC follows a normal distribution, i.e. log(Speed_MC)~N(μ, σ2),
Pn rffiP
ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
n
logðspeed MC i Þ logðspeed MC i μÞ2 Þ
where μ ¼ i¼1 n and σ ¼ i¼1
n 1 . The percentile value regarding
a given probability value p can be calculated as follows (Pu 2012):
Z ðpÞ¼ exp½μ þ σΦ 1 ðpÞ� (11)
−1
where Φ (.) is the percentile function of the standard normal distribution. The percentile
values can be utilized to extract moving clusters with significantly slow speeds.
Additionally, the time spans of slow-moving clusters should be further considered to
distinguish traffic congestion from other events that can also cause slow-moving traffic
flows, such as waiting at red signals at road intersections. Given the significance level α
(e.g. 0.1) and a time span threshold δts, the moving cluster MCi indicates traffic congestion
when Speed_MCi≤Z(1-α) and TS_MCi≥δts.

4. Experimental comparisons and analysis


In this section, experiments are performed on massive historical trajectories of urban
taxies to evaluate the effectiveness and practicality of the proposed method. Section 4.1
introduces the study area and the utilized real-life taxi trajectory dataset. The setting of
the experimental parameters is analysed in Section 4.2. Section 4.3 compares the perfor­
mances of the experiments with other methods. The detected traffic congestion at the
intersection is analysed in Section 4.4. The proposed method is further employed to unveil
the time-dependent patterns of urban traffic congestion in Section 4.5. We implemented
all methods and carried out experiments on a server with Ubuntu, one 3.6 GHz Intel Core
i9-9900k CPU, and 32 GB RAM.

4.1. Study area and dataset description


As the largest metropolis in central China, Wuhan has encountered significant traffic
congestion problems due to rapid urbanization and the massive influx of immigrants in
INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE 11

recent decades (Liu et al. 2015). The Yangtze and Han Rivers divide Wuhan into three
towns, namely, Wuchang, Hanyang and Hankou. Wuchang takes up approximately half of
the total area of Wuhan and has a collection of governmental agencies, universities,
scientific research institutions and technology industrial parks. A large population com­
mutes daily between their workplaces and residences in Wuchang. This frequent popula­
tion flow leads to daily variations in road traffic status with respect to time. As a result, the
road network of Wuchang was selected as the study area in these experiments. Figure 4
shows the spatial distribution of the road network of Wuchang with 916 segments derived
from the Open Street Map, which is a reliable platform for obtaining urban road network
datasets.
It is well known that the irregular trips of a large quantity of tourists generally increase
the complexity of traffic conditions during holidays, e.g. the May Day holiday in China,
especially in tourist cities. Thus, the recorded taxi trajectory points from 1 to 6 May 2014 in
Wuchang were utilized to sufficiently verify the performance of the proposed method.
A total of 5.6 million trajectory points were collected by vehicle-mounted positioning
devices on approximately 7000 taxies with a sampling time interval of 60 s; these data
points include IDs, spatial locations (latitude/longitude), time stamps, instantaneous
speeds (m/s), instantaneous driving orientations (from 0° to 360°) and service statuses
(occupied or unoccupied). Specifically, the trajectory points collected from 8:00 to9:00 on
5 May are shown in the right panel of Figure 4. In addition, traffic congestion is detected in
distinct time periods to further explore its spatiotemporal variation characteristics.
Kan et al. (2019) proposed an approach for extracting invalid slow-speed trajectories
relevant to individual demands (e.g. searching for passengers) with the combined ana­
lyses of trajectory shapes, distances to road segments and moving velocities. Performing
this approach on the trajectory points obtained on 5 May 2014, these invalid trajectories
are found to account for approximately 20% of all trajectory segments. The sparse
distributions of these invalid trajectories make it difficult for these points to form dense
clusters in each time stamp. Taking the trajectory points collected on 5 May as an
example, Figure 5 shows that invalid trajectory segments steadily occupied a small

Figure 4. Spatial distribution of road segments in Wuchang and trajectory points obtained from
8:00 ~ 9:00 on 5 May 2014.
12 Y. SHI ET AL.

proportion of the total number of points in the daytime and had a negligible influence on
the number of congested road segments detected by the proposed method. To reduce
the computing complexity and uncertainty in the data elimination method, the trajec­
tories of all taxis were utilized for the detection of traffic congestion by the proposed
method in the experiments.

4.2. Analysis of parameter settings


The proposed method includes three parameters, i.e. the time window width TW_width, the
similarity threshold γ and the time span threshold δts. For any vehicle, the TW_width value
determines the number of trajectory points that are located in each time window. As shown
in Table 1, the majority of trajectory points are sampled every 60 ~ 70 s. Following the
strategy mentioned in Section 3.1, the TW_width is set as 60 s in the experiments. In
addition, gap-filling operations were not implemented because of the extremely small
proportion of trajectory points with sampling intervals larger than the time window width.
The parameter γ was utilized to determine whether two consecutive clusters were
sufficiently consistent to be coupled to form a moving cluster. In other words, for any
cluster in a given time window, the value of 1-γ essentially indicates the upper bounded
rate of vehicles that leave this dense cluster to evolve into a moving cluster. Only
congested moving clusters with time spans of more than δts are determined as indicating
traffic congestion. According to the Ministry of Public Security of China, credible traffic
congestion at a given road intersection should satisfy the condition that vehicles are

Figure 5. Comparisons of the variations in the numbers of (a) trajectory segments and (b) detected
congested road segments regarding all trajectories and valid trajectories.
INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE 13

required to queue for at least three signal cycles before passing through the intersection
(Lu 2001). In this case, assuming the homogeneous flow of vehicle queues, the 1-γ value
and δts can be determined as 1-γ = TW_width/3τ (TW_width<3τ) and δts = 3τ/TW_width,
respectively, where τ denotes the average cycle of traffic signals. The γ value is set to 0 in
the case that TW_width≥3τ. Considering the time settings of road traffic signals in
Wuchang, we set the τ value to 1 minute for all intersections in the experiments based
on empirical induction. In summary, the γ and δts values of the proposed method are 2/3
and 3, respectively, with the condition that the width of the time window is 1 minute.
To verify the rationality of the assigned γ and TW_width values, a traffic status simula­
tion tool, namely, VISSIM, was utilized to simulate the trajectory points of 4298 vehicles on
a simulated road network for 60 minutes with a sampling interval of 60 s. The simulated
road network, as shown in Figure 6, contains 136 road segments with a total length of
22 km. In the simulated dataset, traffic congestion was pre-set according to the definition
of traffic congestion in Lu (2001). Table 2 gives the quantitative, evaluative results of the
detected traffic congestion with different γ values. One can see that the precision value is
positively proportional to the γ values, but more traffic congestion was missed with the
use of larger γ values. The precision values are at least 0.75 in the range γ = [2/3, 5/6], while
the recall values exceed 0.80 in the range γ = [1/6, 2/3]. With the combination of the
precision and recall values, the detected result with γ = 2/3 has the largest value of F1,
where F1 = (2*precision*recall)/(precision+recall). To a large degree, this result can
indicate the rationality of the γ value setting strategy using traffic signal cycles. When
applying the proposed method to detect traffic congestion in a new city without informa­
tion about the traffic signal cycles, the following two kinds of strategies can be used to
carry out our method. If there are official records of traffic congestion, the optimal γ value
can be directly determined by comparing the precision and recall of the detected results
obtained by different γ values. Otherwise, it is necessary to estimate the value of τ to
further calculate the γ value. In fact, a large number of models have been constructed to
effectively estimate traffic signal cycles based on vehicle trajectories (Fayazi et al. 2015, Yu
and Lu 2016, Yang et al. 2018, Yu et al. 2018). For example, Yang et al. (2018) first extracted
stop-and-go patterns at an intersection using speed information from vehicle trajectories
to determine the start times of the green lights. On this basis, a traffic signal cycle can be
further estimated from the sequence of green light start times obtained by employing
periodicity mining approaches.
Furthermore, we evaluated the performance of the proposed method by performing
experiments with different TW_width values and different trajectory point sampling
intervals, as shown in Table 3. One can see that, in the case that TW_width value is greater
than the sampling interval of the trajectory points, multiple-trajectory points of one
vehicle are recorded in the same time window, resulting in an increase in the running
time and a decrease in the detection precision. In addition, high TW_width values and
sampling intervals can reduce the running time. However, when these two values were

Table 1. The frequencies of the sampling intervals of taxi trajectory points.


Sampling interval 0–10 s 10–20 s 20–30 s 30–40 s 40–50 s 50–60 s 60–70 s 70–80 s
Percentage 2.6% 30.7% 2.4% 1.6% 1.4% 1.9% 59.0% 0.4%
14 Y. SHI ET AL.

Figure 6. Spatial distributions of (a) a simulated road network and (b) the trajectory points simulated
by VISSIM.

Table 2. Quantitative evaluations of the detected results


with different values of γ.
γ value Precision Recall F1 value
1/6 0.26 1 0.41
1/3 0.36 0.99 0.53
1/2 0.53 0.97 0.69
2/3 0.75 0.82 0.78
5/6 0.93 0.28 0.43

both equal to or larger than 3 minutes, more traffic congestion could be exactly discri­
minated, but under these conditions, the loss of the trajectory information of vehicles
seriously decreased the reliability of the proposed method.

4.3. Experimental comparisons with other methods


This section comprehensively evaluates the performance of the proposed method
through comparative experiments with existing methods. In recent years, effective meth­
ods have been designed for the detection of urban traffic congestion: Liu and Ban (2013)
extracted both slow-speed and stop points to identify congested road segments using the
original idea of moving-cluster-detection methods. Kan et al. (2019) designed a novel
method to detect traffic congestion at the turn level by combining trajectory segmenta­
tion with segment clustering. The parameters involved in these two methods were
initially selected by the strategies introduced in the corresponding literature and were
further determined by comparing the experimental results obtained by different para­
meter combinations.
According to the Wuhan Traffic Development Annual Report, approximately 6%~10%
intra-city travel is produced by taxies in Wuhan (Wu and She 2016). Due to the lack of
trajectory data for all types of vehicles, the use of only taxi trajectories cannot provide an
accurate representation of the traffic status of a road network. As a result, the simulated
data in Section 4.2 are utilized to quantify the influences of vehicle penetration rates on
the performance of traffic congestion detections with different methods. The detection
result of traffic congestion on each road segment every minute was utilized to calculate
INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE 15

Table 3. Results of the quantitative evaluation of traffic congestion detected with different TW_width
values and trajectory point sampling intervals.
Sampling interval (minute) TW_width (minute) Precision Recall F1 value Running time (second)
1 1 0.75 0.82 0.78 9.7
1 2 0.41 0.79 0.55 17.0
1 3 0.41 0.69 0.51 16.2
1 4 0.42 0.64 0.51 18.4
1 5 0.44 0.60 0.51 27.9
1 6 0.53 0.49 0.51 22.4
2 2 0.62 0.96 0.76 5.0
3 3 0.40 0.99 0.57 3.5
4 4 0.16 1 0.28 3.0
5 5 0.16 1 0.28 2.6
6 6 0.15 1 0.26 2.1

the precision and recall values. As shown in Table 4, the precision values of the results
obtained by the proposed method without the adaptive partitioning of moving orienta­
tions (abbreviated as APMO) have significant variations among different vehicle penetra­
tion rates. For the results obtained using Liu’s method, the precision values can achieve at
least 64%; however, the low recall values indicate that this method could only detect
approximately 30% traffic congestion when vehicle penetration rates were set as 6%–8%.
Kan’s method has stable precision values and significantly higher recall values than those
of other methods using trajectory data with different vehicle penetration rates. However,
relatively more unreliable traffic congestion was recognized by Kan’s method. Overall, the
highest precision values, relatively high recall values and the highest F1 values of the
proposed method demonstrate the superiority of this method over the other studied
methods in detecting traffic congestion using sparse vehicle trajectories. Moreover, by
comparison with the results of the detection performed using all trajectory points, one
can see that the low vehicle penetration rates have the least influence on the performance
of the proposed method. In other words, the proposed method is more suitable for the
detection of traffic congestion from sparse trajectory points than the other three
methods.
Figure 7 gives the variations in the numbers and total lengths of congested road
segments detected by different methods using trajectories of 10% of the probe vehicles,
wherein baselines were drawn based on the records of traffic congestion in the simulated
trajectories. Compared with the results obtained by the proposed method, the proposed
method conducted without APMO detected significantly more and longer congested road
segments due to the deficiency of the method in distinguishing opposite moving orienta­
tions. The results detected by Liu’s method contain more congested roads with shorter
total lengths. The total lengths and numbers of congested road segments detected by
Kan’s method are greater than those of both the proposed and Liu’s methods.
Furthermore, Table 5 shows that the results detected by the proposed method have
minimum root mean square error (i.e. RMSE) values, which quantitatively proves the
effectiveness and advantages of the proposed method in detecting traffic congestion
using sparse vehicle trajectories.
16 Y. SHI ET AL.

Table 4. Quantitative evaluations of the detection results obtained with different vehicle penetration
rates.
Vehicle penetration rate Methods Precision Recall F1 value
6% The proposed method 0.73 0.56 0.63
The proposed method without APMO 0.48 0.58 0.52
Liu’s method 0.70 0.27 0.39
Kan’s method 0.45 0.70 0.55
7% The proposed method 0.72 0.61 0.66
The proposed method without APMO 0.71 0.65 0.68
Liu’s method 0.67 0.24 0.35
Kan’s method 0.49 0.86 0.62
8% The proposed method 0.84 0.59 0.69
The proposed method without APMO 0.56 0.57 0.57
Liu’s method 0.64 0.33 0.44
Kan’s method 0.50 0.93 0.65
9% The proposed method 0.78 0.70 0.74
The proposed method without APMO 0.58 0.66 0.61
Liu’s method 0.64 0.64 0.64
Kan’s method 0.50 0.88 0.64
10% The proposed method 0.80 0.68 0.74
The proposed method without APMO 0.40 0.72 0.51
Liu’s method 0.67 0.59 0.63
Kan’s method 0.48 0.99 0.64
100% The proposed method 0.75 0.82 0.78
The proposed method without APMO 0.80 0.57 0.66
Liu’s method 0.20 1 0.34
Kan’s method 0.57 0.78 0.66

Figure 7. Variations in the (a) numbers and (b) total lengths of the congested road segments detected
by different methods using trajectories of 10% of the probe vehicles.

4.4. Lane-level analysis of the detected traffic congestion


This section will analyse the spatiotemporal characteristics of the detected traffic conges­
tion by the proposed method at the lane level using the intersection that connects Wuluo
Road and Shipailing Road as an example. As shown in Figure 8, the six turning directions
at this intersection, i.e. TN_1-TN_6, are denoted by arrows with different shapes.
Figure 9 exhibits the variations in average vehicle speed and the lengths of the
congested road segments in different turning directions over time. The vehicle speed
could not be estimated properly during the early morning hours (e.g. 2:00–6:00 in Figure
9) due to the lack of sufficient trajectories of running vehicles. In this intersection, TN_1
INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE 17

Table 5. The RMSE values of the results obtained with different methods using trajectories of 10% of
the probe vehicles.
Variables Methods RMSE
The number ofcongested road segments The proposed method 0.87
The proposed method without APMO 5.96
Liu’s method 2.35
Kan’s method 4.25
The total length of congested road segments The proposed method 251.18
The proposed method without APMO 2055.28
Liu’s method 445.70
Kan’s method 763.70

and TN_2 represent the straight driving direction from the southeast to the northwest of
Wuluo Road and the left turning direction into Shipailing Road, respectively. As shown in
Figure 9(a), no congested moving clusters were detected from 2:00 to 6:00 in either TN_1
or TN_2 due to the high vehicle speeds. In the daytime, a significant decrease in vehicle
speed was observed on Wuluo Road, and long queues of vehicles caused traffic conges­
tion from 9:00 to 12:00 and from 15:00 to 18:00. Note that vehicles drive more slowly in
the TN_2 direction, which may be caused by the frequent deceleration of vehicles at the
intersection when turning left. In addition, the narrower width of the Shipailing Road can
substantially increase the waiting time for red signal lights. However, traffic congestion
was detected only at approximately 10:00 and 18:00 in the TN_2 direction, which indicates
that the proposed method can effectively distinguish long vehicle queues induced by
traffic congestion from common deceleration behaviours.
TN_3 is the opposite direction of TN_1 on Wuluo Road. As shown in Figure 9(b), the
vehicle speed in the TN_3 direction obviously decreased from 6:00 to 7:00 since traffic
congestion was frequently identified until 12:00. As the two driving directions on Wuluo
Road, TN_1 and TN_3 exhibit significant differences in their traffic congestion durations
and congested segment lengths. Specifically, congestion in the TN_3 directions occurred
earlier than that in the TN_1 direction in morning peak hours, while TN_1 experienced
traffic congestion for a longer duration in the evening peak hours. This phenomenon

Figure 8. Different turning directions at the intersection between Wuluo road and Shipailing road.
18 Y. SHI ET AL.

Figure 9. Variations in average vehicle speed and length of the congested road segments detected in
the (a) TN_1 and TN_2, (b) TN_3 and TN_4 and (c) TN_5 and TN_6 directions.

reveals that the majority of residents commute to workplaces (e.g. Optical Valley south­
east of Wuchang) via Wuluo Road in the TN_3 direction and return home after work in the
TN_1 direction.
TN_4 denotes the right-turning direction from Wuluo Road to Shipailing Road. Figure 9(b)
indicates that vehicles in the TN_3 and TN_4 directions moved at nearly the same speeds due
to no restrictions by the traffic signal lights. As nearby residents went off work for lunch,
vehicles began to queue at approximately 11:00 in the TN_4 direction, and traffic congestion
in this direction persisted until approximately 11:30.
TN_5 and TN_6 represent the left and right turning directions from Shipailing Road
onto Wuluo Road, respectively. As shown in Figure 10, there are only two lanes on
Shipailing Road for vehicles to drive into Wuluo Road: the left lane for turning left only
and the right lane for turning both left and right. Consequently, right-turning vehicles are
often affected by left-turning vehicles that wait at the red signal lights, and thus, the
vehicle speeds in the TN_5 and TN_6 directions have approximately the same values and
INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE 19

similar variation tendencies with time (as shown in Figure 9(c)). As it is connected with
Wuluo Road, which is the arterial road in Wuchang, the traffic flow on Shipailing Road can
be efficiently diverted by the multiple lanes of Wuluo Road. For this reason, only small-
scope traffic congestion was formed and persisted only for short-time periods during
peak hours.

4.5. Time-dependent analysis of the detected traffic congestion


To further validate the applicability of the proposed method, a time-dependent analysis is
performed to unveil the potential spatiotemporal characteristics of the detected traffic
congestion. Figure 11 provides the temporally varying numbers of congested moving clusters
observed from 1 to 6 May. A substantially higher amount of traffic congestion was detected in
the mornings of the first two days of the May Day holiday (i.e. 1–2 May) due to large numbers
of travelling residents and tourists. The number of congested moving clusters significantly
decreased after 3 May as tourists departed from Wuhan. From 4 to 6 May, the urban traffic
status recovered with distinct regular morning and evening peak hours, and traffic congestion
rarely took place during the early morning hours (i.e. 1:00–5:00), which is consistent with
reality.
Furthermore, Figure 12 gives the spatial distributions of the detected traffic congestion
in both the morning peak hours (i.e. 7:00–10:00) and evening peak hours (i.e. 17:00–20:00)
from 1 to 6 May, in which the vehicle congestion speed values are indicated by different
colours. One can see that the traffic congestion in Wuchang had similar spatial distribu­
tion characteristics during the May Day holiday. Specifically, local residents and tourists
were prone to visit famous sightseeing spots, resulting in a large volume of vehicles on
the surrounding road segments, such as Linjiang Avenue, Zhonghua Road and Jiefang
Road. All of these road segments are adjacent to the Yellow Crane Tower and the Ministry
of Lane. In addition, the road segments near the Yellow Crane Tower scenic area were

Figure 10. The intersection between Shipailing road and Wuluo road in the real-life scenario.
20 Y. SHI ET AL.

Figure 11. Temporally varying numbers of congested moving clusters from 1 to 6 May.

more congested during the evening peak hours than during the morning rush hours, as
tourists may prefer to sightsee in the afternoon. On working days (i.e. 4 to 6 May), the
traffic congestion presented significant dispersed distributions in Wuchang. Caused by
the daily commuting of residents, road segments leading to working areas, such as Luoyu
Road, Zhongnan Road and Zhongbei Road, became regularly congested during peak
hours on workdays. Moreover, frequent traffic congestion also took place on several roads
around large residential communities, e.g. on Youyi Avenue, Wenxin Street and
Zhuodaoquan South Road, due to concentrated residential travel. Additionally, those
main roads (such as Wuluo Road, Zhongshan Road and Jiefang Road) carried large
volumes of traffic on both holidays and workdays. Comparatively, there was significantly
less traffic congestion on other side-road segments, which may to a large extent reflect
the spatial heterogeneity and multicentricity of urban traffic congestion.
In summary, the proposed method has the ability to precisely detect the spatiotemporal
extent of traffic congestion in a road network space. The temporally dependent patterns of
traffic congestion on urban road networks further reveal the complicated travel regularities
of both residents and tourists at different times, which will guide dynamic route planning in
intelligent navigation systems and the optimization of road networks in urban planning.

5. Conclusions and future work


This article proposes a density-based moving-object clustering approach for detecting the
spatiotemporal extents of traffic congestion based on GPS trajectories of vehicles. Map-
matching and time slicing were performed to project original trajectory points onto road
segments and continuous time windows. A density-based method was designed to
adaptively extract network-constrained moving clusters with significantly high densities
as candidates for traffic congestion. On this basis, both vehicle speeds and the time spans
of moving clusters were analysed to determine the spatiotemporal extents of traffic
congestion. Using both trajectories simulated by VISSIM and real-life taxi trajectory data
collected in Wuchang, the effectiveness and practicality of the proposed method have
been demonstrated by comparative experiments with other clustering-based traffic con­
gestion detection methods. Moreover, the regularities of the spatiotemporal distribution
of traffic congestion were revealed through a time-dependent analysis, which can facil­
itate dynamic route planning for drivers to efficiently avoid congested road segments.
INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE 21

Figure 12. Spatiotemporal distributions of the detected traffic congestion from 1 to 6 May.

Future work will focus on three aspects. The first aspect is to trace the multiple sources
of traffic congestion by analysing the implicit spatiotemporal correlations among differ­
ent congested road segments. Second, we will modify the proposed method using
parallel computing strategies to realize the real-time detection of traffic congestion on
road networks. Furthermore, the deep features of urban traffic congestion in space and
22 Y. SHI ET AL.

time will be extracted using modified deep learning frameworks to perform accurate
spatiotemporal predictions of traffic congestion.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Funding
This work was supported by the National Natural Science Foundation of China (NSFC) [No.
42071452, 41730105 and 41771492]; the Natural Science Foundation of Hunan Province, China
[No.2020JJ4696]; the Hunan Key Research and Development Project [No, 2018SK2052] and the
Independent Exploration and Innovation Project Fund Designated for Graduate Students of Central
South University [No.2020zzts695].

Notes on contributors
Yan Shi is currently an associate professor in the school of Geosciences and Info-physics, Central
South University, Changsha, Hunan, China. He works in the area of spatio-temporal clustering,
anomaly detection and association rule mining.
Da Wang is currently a postgraduate in the school of Geosciences and Info-physics, Central South
University, Changsha, Hunan, China. He majors in trajectory data mining.
Jianbo Tang is currently a lecturer in the school of Geosciences and Info-physics, Central South
University, Changsha, Hunan, China. He works in the area of spatio-temporal clustering and map
generalization.
Min Deng is currently a professor in the school of Geosciences and Info-physics, Central South
University, Changsha, Hunan, China. He works in the area of geographical big data mining.
Huimin Liu is currently an associate professor in the school of Geosciences and Info-physics, Central
South University, Changsha, Hunan, China. She works in the area of spatio-temporal data fusion and
map generalization.
Baoju Liu received the Ph.D. degree from the School of Geosciences and Info-physics, Central South
University, Changsha, Hunan, China. His research interests include traffic flow analysis and urban
transportation programming.

ORCID
Yan Shi [Link]

Data and codes availability statement


The data and codes that support the findings of this study are available in [Link] with the
identifiers at [Link]
INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE 23

References
An, S., Yang, H., and Wang, J., 2018. Revealing recurrent urban congestion evolution patterns with
taxi trajectories. ISPRS International Journal of Geo-Information, 7 (4), 18. doi:10.3390/
ijgi7040128.
Bock, F., Di Martino, S., and Origlia, A., 2019. Smart parking: using a crowd of taxis to sense on-street
parking space availability. IEEE Transactions on Intelligent Transportation Systems, 21 (2), 496–508.
doi:10.1109/TITS.2019.2899149.
Borruso, G., 2005. Network density estimation: analysis of point patterns over a network. In:
International Conference on Computational Science and Its Applications (ICCSA 2005), 9–12 May
Singapore. Singapore: Springer, 126–132. doi:10.1007/11424857_14.
Chaurasia, B.K., Manjoro, W.S., and Dhakar, M., 2020. Traffic congestion identification and reduction.
Wireless Personal Communications, 12, 1–20. doi:10.1007/s11277-020-07420-0
Chen, B.Y., et al., 2014. Map-matching algorithm for large-scale low-frequency Floating Car Data.
International Journal of Geographical Information Science, 28 (1), 22–38. doi:10.1080/
13658816.2013.816427.
Chow, A.H.F., et al., 2014. Empirical assessment of urban traffic congestion. Journal of Advanced
Transportation, 48 (8), 1000–1016. doi:10.1002/atr.1241.
Colak, S., Lima, A., and González, M.C., 2016. Understanding congested travel in urban areas. Nature
Communications, 7 (1), 10793. doi:10.1038/ncomms10793.
Deng, M., et al., 2019. A density-based approach for detecting network-constrained clusters in
spatial point events. International Journal of Geographical Information Science, 33 (3), 466–488.
doi:10.1080/13658816.2018.1541177.
Ester, M., et al., 1996. A density-based algorithm for discovering clusters in large spatial databases
with noise. International Conference on Knowledge Discovery and Data Mining, 96 (34), 226–231.
Fayazi, S.A., et al., 2015. Traffic signal phase and timing estimation from low-frequency transit bus
data. IEEE Transactions on Intelligent Transportation Systems, 16 (1), 19–28. doi:10.1109/
tits.2014.2323341.
Giachetta, G., Mangiarotti, L., and Sardanashvily, G., 2009. Advanced classical field theory. Singapore:
World Scientific.
Guo, D., 2008. Regionalization with dynamically constrained agglomerative clustering and partition­
ing (REDCAP). International Journal of Geographical Information Science, 22 (7), 801–823.
doi:10.1080/13658810701674970.
Gupta, A., Choudhary, S., and Paul, S., 2013. DTC: a framework to detect traffic congestion by mining
versatile GPS data. In: Proceedings of the 1st International Conference on Emerging Trends and
Applications in Computer Science (ICETACS), 13-14 September Shillong, India: IEEE, 97–103.
doi:10.1109/ICETACS.2013.6691403.
He, Z., et al., 2019. Network-wide identification of turn-level intersection congestion using only
low-frequency probe vehicle data. Transportation Research Part C: Emerging Technologies, 108,
320–339. doi:10.1016/[Link].2019.10.001.
Hollander, Y. and Liu, R., 2008. Estimation of the distribution of travel times by repeated simulation.
Transportation Research Part C Emerging Technologies, 16 (2), 212–231. doi:10.1016/j.
trc.2007.07.005.
Hu, H.Q., 2016. Crowd sourcing-based real-time urban traffic speed estimation: from trends to
speed. In: Proceedings of the 32nd International Conference on Data Engineering (ICDE), 16-20
May Helsinki, Finland: IEEE, 883–894. doi:10.1109/ICDE.2016.7498298.
Huang, Y., Chen, C., and Dong, P., 2008. Modeling herds and their evolvements from trajectory data.
In: Proceedings of the 5th International Conference on Geographic Information Science, 23-26
September Berlin, Germany: Springer, 90–105. doi:10.1007/978-3-540-87473-7_6.
Hwang, S., et al., 2018. Segmenting human trajectory data by movement states while addressing
signal loss and signal noise. International Journal of Geographical Information Science, 32 (7),
1391–1412. doi:10.1080/13658816.2018.1423685.
Inoue, R., Miyashita, A., and Sugita, M., 2016. Mining spatio-temporal patterns of congested traffic in
urban areas from traffic sensor data. In: Proceedings of the 19th International Conference on
24 Y. SHI ET AL.

Intelligent Transportation Systems (ITSC), 1-4 November Rio de Janeiro, Brazil: IEEE, 731–736.
doi:10.1109/ITSC.2016.7795635
Jenelius, E. and Koutsopoulos, H.N., 2013. Travel time estimation for urban road networks using low
frequency probe vehicle data. Transportation Research Part B Methodological, 53, 64–81.
doi:10.1016/[Link].2013.03.008
Kalnis, P., Mamoulis, N., and Bakiras, S., 2005. On discovering moving clusters in spatio-temporal
data. In: Proceedings of the 9th international conference on Advances in Spatial and Temporal
Databases, 22-24 August Angra dos Reis, Brazil: Springer, 364–381. doi:10.1007/11535331_21.
Kan, Z., et al., 2019. Traffic congestion analysis at the turn level using taxis’ GPS trajectory data.
Computers, Environment and Urban Systems, 74, 229–243. doi:10.1016/[Link].2018.11.007.
Keler, A., 2017. Traffic pattern analysis framework with emphasis on Floating Car Data (FCD).
Augsburg, Germany: Universität Augsburg.
Keler, A., Krisp, J.M., and Ding, L., 2017. Detecting traffic congestion propagation in urban
environments-a case study with Floating Taxi Data (FTD) in Shanghai. Journal of Location Based
Services, 11 (2), 133–151. doi:10.1080/17489725.2017.1420256.
Li, W. and Dai, H.Y., 2016. Real-time road congestion detection based on image texture analysis.
Procedia Engineering, 137, 196–201. doi:10.1016/[Link].2016.01.250
Liu, C., Qin, K., and Kang, C., 2015. Exploring time-dependent traffic congestion patterns from taxi
trajectory data. In: Proceedings of the 2nd IEEE International Conference on Spatial Data Mining and
Geographical Knowledge Services (ICSDM), 8-10 July Fuzhou, China: IEEE, 39–44. doi:10.1109/
ICSDM.2015.7298022.
Liu, Q., et al., 2019. Data-driven intelligent location of public charging stations for electric vehicles.
Journal of Cleaner Production, 232, 531–541. doi:10.1016/[Link].2019.05.388.
Liu, X. and Ban, Y., 2013. Uncovering spatio-temporal cluster patterns using massive Floating Car
Data. International Journal of Geo-Information, 2 (2), 371–384. doi:10.3390/ijgi2020371.
Long, J.A., 2016. Kinematic interpolation of movement data. International Journal of Geographical
Information Science, 30 (5), 854–868. doi:10.1080/13658816.2015.1081909.
Lou, Y., et al., 2009. Map-matching for low-sampling-rate GPS trajectories. In: Proceedings of the 17th
ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems,
4-6 November Seattle, WA: ACM, 352–361. doi:10.1145/1653771.1653820.
Lu, H.P., 2001. Analysis of the urban traffic. Beijing: China water power press.
Macqueen, J., 1967. Some methods for classification and analysis of multivariate observations.
Proceedings of the 15th Berkeley Symposium on Mathematical Statistics and Probability, 1 (14),
281–297.
Nolle, T., Schweizer, I., and Janssen, F., 2014. Data-driven detection of congestion-affected roads.
Technical Report TUD-KE-2014-02.
Ong, R., et al., 2011. Traffic jams detection using flock mining. In: Proceedings of the European
Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases
(ECML PKDD), 5-9 September Athens, Greece, 6913, 650–653. doi:10.1007/978-3-642-23808-6_49.
Pei, T., et al., 2015. Density-based clustering for data containing two types of points. International
Journal of Geographical Information Science, 29 (2), 175–193. doi:10.1080/13658816.2014.955027.
Pu, W., 2012. Analytic relationships between travel time reliability measures. Transportation Research
Record: Journal of the Transportation Research Board, 2254 (1), 122–130. doi:10.3141/2254-13.
Rempe, F., Huber, G., and Bogenberger, K., 2016. Spatio-temporal congestion patterns in urban
traffic networks. Transportation Research Procedia, 15, 513–524. doi:10.1016/[Link].2016.06.043
Shannon, C.E., 1948. A mathematical theory of communication. The Bell System Technical Journal, 27
(379–423), 623–656. doi:10.1002/j.1538-7305.1948.tb00917.x.
Shiode, S., 2011. Street-level spatial scan statistic and STAC for analyzing street crime
concentrations. Transactions in GIS, 15 (3), 365–383. doi:10.1111/j.1467-9671.2011.01255.x.
Shiode, S. and Shiode, N., 2009. Detection of multi-scale clusters in network space. International
Journal of Geographical Information Science, 23 (1), 75–92. doi:10.1080/13658810801949843.
Taniarza, N. and Akbar, S., 2017. Anomalous trajectory detection from taxi GPS traces using
combination of iBAT and DTW. In: Proceedings of the 6th International Conference on Electrical
Engineering and Informatics (ICEEI), 5-9 November Langkawi, Malaysia: IEEE, 1–5.
INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE 25

Terroso-Saenz, F., et al., 2012. A cooperative approach to traffic congestion detection with complex
event processing and VANET. IEEE Transactions on Intelligent Transportation Systems, 13 (2),
914–929. doi:10.1109/TITS.2012.2186127.
Tobler, W., 1970. A computer movie simulating urban growth in the Detroit region. Economic
Geography, 46 (2), 234–240. doi:10.2307/143141.
Wang, H.D., Yang, Y., and Li, Q.Q., 2013. How many probe vehicles are enough for identifying traffic
congestion? -a study from a streaming data perspective. Frontiers of Earth Science, 7 (1), 34–42.
doi:10.1007/s11707-012-0343-x.
Wang, S., et al., 2011. Data field for hierarchical clustering. International Journal of Data Warehousing
and Mining, 7 (4), 43–63. doi:10.4018/jdwm.2011100103.
Wang, Y., et al., 2018. Detecting anomalous trajectories and behavior patterns using hierarchical
clustering from taxi GPS data. ISPRS International Journal of Geo-information, 7 (1), 25.
doi:10.3390/ijgi7010025.
Wu, N.N. and She, S.Y., 2016. Influence of online booking taxis on urban taxi system based on big data
analysis technology. Traffic and Transportation, 01, 238–241. doi:10.3969/j..1671-3400.2016.z1.052
Xu, L., Yue, Y., and Li, Q.Q., 2013. Identifying urban traffic congestion pattern from historical Floating
Car Data. In: Proceedings of the 13th COTA International Conference of Transportation Professionals
(CICTP), 13-16 August Shenzhen, China, 96, 2084–2095. doi:10.1016/[Link].2013.08.235.
Xu, W., Qin, K., and Wang, Y., 2018. Congestion detection and distribution pattern analysis based on
spatiotemporal density clustering. In: Proceedings of the 26th International Conference on
Geoinformatics, 28-30 June Kunming, China: Yunnan Normal Univ, 1–7.
Yamada, I. and Thill, J.C., 2007. Local indicators of network-constrained clusters in spatial point
patterns. Geographical Analysis, 39 (3), 268–292. doi:10.1111/j.1538-4632.2007.00704.x.
Yang, Q., et al., 2018. Traffic signals timing cycle length learning: using taxi GPS trajectories. In:
Proceedings of the International Conference on Machine Learning and Cybernetics (ICMLC), 15-
18 July Chengdu, China: IEEE, 13–18. doi:10.1109/ICMLC.2018.8526996.
Yu, J., et al., 2018. Detecting regularities of traffic signal timing using GPS trajectories. IEICE
Transactions on Information and System, 101 (4), 956–963. doi:10.1587/transinf.2016IIP0022.
Yu, J. and Lu, P., 2016. Learning traffic signal phase and timing information from low-sampling rate
taxi GPS trajectories. Knowledge-Based Systems, 110, 275–292. doi:10.1016/[Link].2016.07.036
Yu, Q., et al., 2019. Road congestion detection based on trajectory stay-place clustering. ISPRS
International Journal of Geo-Information, 8 (6), 264. doi:10.3390/ijgi8060264.
Yuan, G., et al., 2019. APDS: a framework for discovering movement pattern from trajectory database.
International Journal of Distributed Sensor Networks, 15 (11), 11. doi:10.1177/1550147719888164.
Yuan, Q., et al., 2014. A traffic congestion detection and information dissemination scheme for urban
expressways using vehicular networks. Transportation Research Part C: Emerging Technologies, 47,
114–127. doi:10.1016/[Link].2014.08.001.
Zhao, P., et al., 2016. A trajectory clustering approach based on decision graph and data field for
detecting hotspots. International Journal of Geographical Information Science, 31 (6), 1101–1127.
doi:10.1080/13658816.2016.1213845.
Zheng, J. and Liu, H.X., 2017. Estimating traffic volumes for signalized intersections using connected
vehicle data. Transportation Research Part C: Emerging Technologies, 79, 347–362. doi:10.1016/j.
trc.2017.03.007
Zheng, L., et al., 2020. A tensor-based K-nearest neighbors method for traffic speed prediction under data
missing. Transportmetrica B: Transport Dynamics, 8 (1), 182–199. doi:10.1080/21680566.2020.1732247.

You might also like