Visualizing Football Strategies & Performance
MASTER
Bakker, L.F.B.C.
Award date:
2015
Disclaimer
This document contains a student thesis (bachelor's or master's), as authored by a student at Eindhoven University of Technology. Student
theses are made available in the TU/e repository upon obtaining the required degree. The grade received is not published on the document
as presented in the repository. The required complexity or quality of research of student theses may vary by program, and the required
minimum study period may vary in duration.
Eindhoven University of Technology
Department of Mathematics and Computer Science
Master’s Thesis
L.F.B.C. Bakker
Supervisor:
H.M.M. van de Wetering
29 June 2015
EINDHOVEN UNIVERSITY OF TECHNOLOGY
Abstract
by L.F.B.C. Bakker
In this thesis we present a software prototype for exploring spatiotemporal football data. In contrast to most current analyses, we show that football visualization is not restricted to statistical information: we also visualize playing styles and strategies. We define possession chains as the main element in our analysis of the attacking and defending playing styles of football teams. We visualize the structure, attacking tendencies, defending tendencies, set piece tendencies, and playing styles of football teams in an interactive match dashboard.
Acknowledgements
I would like to thank my supervisor, Huub van de Wetering, for his guidance during the project. I am extremely thankful to him for sharing his expertise and for his sincere and valuable guidance and encouragement.
I take this opportunity to express gratitude to all of my friends and fellow students for
their help and support. I also thank my parents for the encouragement and support. I
would like to thank my father for introducing me to the world of football.
Contents
Acknowledgements
Contents
List of Figures
List of Tables
1 Introduction
1.1 Thesis objectives
1.2 Thesis scope
1.3 Terminology
1.4 Thesis structure
2 Problem analysis
2.1 Match analyses
2.1.1 Team structure
2.1.2 Attacking tendencies
2.1.3 Defending tendencies
2.1.4 Set pieces
2.2 Playing styles
2.2.1 Attacking styles
2.2.2 Defending styles
2.3 Data
2.3.1 Events and qualifiers
2.3.2 Limitations
2.4 Related work
3 Data processing
3.1 Mathematical model
3.2 Possession chains
3.3 Chain properties
4 Visualizations
4.1 Match Dashboard
4.2 Single Chain visualization
5 Results
5.1 Team structure
5.2 Attacking tendencies
5.3 Defending tendencies
5.4 Set pieces
5.5 Objective questions
6 Conclusions
6.1 Future work
A XML structures
B Event types
C Qualifiers
D Field zones
E Goalmouth locations
Bibliography
List of Figures
2.1 An example showing the line of confrontation and the line of defense. The lines between the blue players connect the four defenders in the back and the three players applying pressure in the front. Pressure is shown by a black arrow. The line of confrontation is where the defending team starts applying pressure, shown by a dashed line. The line of defense is where the last line of defenders is located, also shown by a dashed line.
2.2 F24 data overview diagram with matches, events, qualifiers, and their attributes. A match has multiple events, which each belong to a single match. An event may have multiple qualifiers describing additional properties of that event.
2.3 Field coordinates by Opta [1].
2.4 Patterns of goal-scoring with respect to the different lengths of possessions in the 1990 and 1994 World Cups for soccer [2].
2.5 SoccerStories user interface: (1) complete game overview as a timeline, (2) temporal zoom on a game phase and layout on a soccer field, (3) player details on the side, (4) the thumbnails of phases, and (5) generated text-annotations [3].
2.6 Five different ways to visualize a pass cluster: (a) a full node-link diagram, (b) a compact node-link diagram, (c) an adjacency matrix, (d) a hive plot, and (e) a tagcloud [3].
2.7 Passing networks for the Netherlands and Spain drawn before the final game of the World Cup 2010, using the passing data and tactical formations of the semi-finals [4].
2.8 Heatmap of player movement showing which parts of the field the player spent more time in [5].
4.1 Match Dashboard for FC Augsburg against TSV München 1860. (1) Field area. (2) Glyph graph. (3) Timeline. (4) Overview header. (5) Rectangular Chain Graph.
4.2 Visualizing a possession chain.
4.3 Meaning of colors in chain visualization.
4.4 Visualizing a team’s formation. Example: 4–4–2 diamond formation.
4.5 Pass graphs of FC Augsburg (left) and TSV München 1860 (right). Result: FC Augsburg 2 - 6 TSV München 1860.
4.6 The colormap for coloring the nodes according to the success rate of passes, ranging from 0 (0%) to 1 (100%).
4.7 Pass graphs of FC Augsburg (left) and TSV München 1860 (right) using average positions.
5.1 Team structures in the Pass Graph visualization. (a) shows a 4–4–2 formation played by 1. FC Mainz 05 and (b) shows a 4–4–2 diamond formation played by Alemannia Aachen.
5.2 Average y position of player Timo Achenbach (32) for each 10 minutes in the match 1. FC Kaiserslautern - SpVgg Greuther Fürth.
5.3 Average x position of player Timo Achenbach (32) for each 10 minutes in the match 1. FC Kaiserslautern - SpVgg Greuther Fürth.
5.4 Pass Graphs of (a) 1. FC Köln and (b) Alemannia Aachen.
5.5 Glyph Graphs of (a) 1. FC Köln and (b) Alemannia Aachen filtered on possession chains with a length of at least 10.
5.6 Glyph Graphs of (a) 1. FC Köln and (b) Alemannia Aachen filtered on possession chains with a length of at least 10 containing a shot.
5.7 Possession chain of Alemannia Aachen leading to a goal against 1. FC Köln.
5.8 Possession Chain Heatmap of (a) 1. FC Köln playing from left to right and (b) Alemannia Aachen playing from right to left.
5.9 Rectangular Chain Graph of the match 1. FC Köln (red) - Alemannia Aachen (blue) zoomed in on each 15 minutes of the match. The y-axis is the time axis (top-down) and the x-axis is the x-coordinate on the field (recall Section 4.4). The possession percentages in the figures are denoted as 1. FC Köln/Alemannia Aachen and show the possession in that particular time period.
5.10 Pass Graphs of (a) 1. FC Köln and (b) Alemannia Aachen showing average player positions during their match. Note that the maximum value for the number of passes is relative to the team to get a better view of the mutual passes within a team.
5.11 The passes of (a)-(b) 1. FC Köln’s midfielder Thomas Broich (10), (c)-(d) 1. FC Köln’s midfielder Roda Antar (20), (e)-(f) Alemannia Aachen’s midfielder Pekka Lagerblom (5), and (g)-(h) Alemannia Aachen’s midfielder Matthias Lehmann (20) in the match 1. FC Köln - Alemannia Aachen. Note that in (a)-(d) the playing direction is from left to right, while in (e)-(h) the playing direction is from right to left.
5.12 Match Dashboard for FC Augsburg against TSV München 1860 filtered on possession chains containing a goal.
5.13 Defense plot for (a) 1. FC Köln and (b) Alemannia Aachen of their match.
5.14 Glyph Graph of 1. FC Kaiserslautern in their match against Borussia Mönchengladbach filtered on possession chains containing a corner.
5.15 Two corners for 1. FC Kaiserslautern taken by number 10 Patrice Bernier that reach player number 2 Moussa Ouattara in the match 1. FC Kaiserslautern - Borussia Mönchengladbach.
5.16 Glyph Graphs of (a) FC Augsburg and (b) TSV München 1860 filtered on possession chains containing a free kick.
5.17 Glyph Graphs of (a) FC Augsburg and (b) TSV München 1860 filtered on possession chains containing a free kick where in (a) player number 10 and in (b) player number 8 are highlighted by hovering.
5.18 Free kicks of FC Augsburg trying the long ball towards the opponent’s penalty area.
5.19 Free kicks of FC Augsburg playing the ball from one side to the other.
5.20 Glyph Graph of (a) 1. FC Köln and (b) Alemannia Aachen filtered on shots. 1. FC Köln has 20 shots of which only 1 shot is on target (shown by the yellow circle). Alemannia Aachen has 8 shots of which 5 are on target including a goal (the longest chain). Unfortunately, the green circle denoting the goal is outside the visible area.
Dedicated to my friends and family. . .
Chapter 1
Introduction
Sports are one of the most popular forms of entertainment, with millions of people following sporting events on television, computers, or other devices such as smartphones and tablets. Many people have an opinion on sports matches, and professionals and experts analyze sports for entertainment or performance enhancement. In recent years, computer visualizations have increasingly been used to improve the viewer’s experience or to provide analysts with more insight into a particular sporting event. Such performance analysis is common practice during matches and training in many sports [6].
Sports Visualization (not to be confused with mental imagery and rehearsal in sports psychology) is a relatively new and growing field in Information Visualization. In 2013 the IEEE Visualization conference had its first workshop and tutorial on Sports Data Visualization, showing that Sports Visualization is becoming more important. Furthermore, increasingly detailed sports data is being collected by companies like Opta [1] and Infostrada [7]. This data contains not only statistical data, but also spatiotemporal data, i.e., data related to time and space. However, many visualizations in sports still only show statistics, such as a team’s progress in the league table, main match events, or simple histories of results [8, 9], while there is a growing need for more detailed analyses. These needs come from different users of sports visualization, such as sports journalists and sports coaches. Their goal is to find patterns in the data that cannot be described by simple statistics. These patterns describe players’ performance, playing styles, and strategies.
Football, also called association football or soccer, is the world’s most popular ball game in numbers of participants and spectators [10, 11]. A survey in 2006 by the Fédération Internationale de Football Association (FIFA) counted 270 million people (4% of the world’s population), comprising 265 million male and female players and 5 million referees and officials, who are actively involved in the game of football [12]. Professional football revolves around performance and improving that performance to gain an advantage over the opponent. As subtle changes in performance might be the difference between winning and losing a match, data visualization plays an important role in the analysis.
1.1 Thesis objectives
In this thesis, we aim to give a football match analysis that can be used to explain the
outcome of the match and that provides insight into a team’s playing style. In particular
we aim to answer the following questions:
In order to answer these questions we need to find the answers to the following questions
first.
1.2 Thesis scope
This project is intended to provide insight into football teams’ playing styles and players’ performance by means of a software prototype. To do so, we have selected several questions that we consider important in football analysis, and we aim to answer those questions with our visualizations. A lot of information can be obtained from the data, including statistics about passes, goals scored, etc. We do not show such statistics unless they are of significant importance to answering our selected questions. The main target users of our prototype are football enthusiasts.
1.3 Terminology
In this section we describe the terms and definitions we use throughout this thesis.
Unless stated otherwise, these definitions are taken from the U.S. Soccer Curriculum
[13].
Combination play
Quick and effective movement of the ball by two or more players from the same
team.
Corner kick
A corner kick is a method of restarting play. A corner kick is awarded when the
whole of the ball passes over the goal line, either on the ground or in the air, having
last touched a player of the defending team, and a goal is not scored [14].
Crossing
Passing of the ball from wide areas of the field to a central area close to goal with
the intention of finding a teammate to score.
Dribbling
Close control of a ball in movement, with the feet and on the ground, continuously
changing its trajectory.
Formation
The shape of the team and distribution of the players on the field at the beginning
of the game. A formation is always numbered from the backfield forward. For
example, a 4–4–2 system denotes four defenders, four midfielders and two forwards.
Free kick
A free kick is a method for restarting the game after a foul is committed outside
the opponent’s penalty area. A free kick is awarded to the opposing team if a
player commits any offences in a manner considered by the referee to be careless,
reckless or using excessive force. A free kick can be direct or indirect. A direct
free kick may directly be shot at the goal, while an indirect free kick must first be
flicked on by another player. [14]
Heading
Striking the ball with any part of the head with the purpose of clearing, passing
or scoring.
Line of confrontation
The line of confrontation is an imaginary line on the field where the defending
team starts pressing the team in possession of the ball [15].
Line of defense
The line of defense is an imaginary line on the field where the defending back
players set up to hold the opposing team away from the goal.
Passing
Transferring the ball on the ground or in the air from one player to another.
Penalty
A penalty is a method for restarting the game after a foul is committed within the
penalty area. A free shot at goal from the 11-meter penalty spot is awarded to the
fouled player’s team, where only the penalty taker and the defending goalkeeper
are allowed inside the penalty area when the penalty is being taken.
Playing style
A team’s playing style is the manner in which the players of that team defend and
attack as a unit.
Shooting
Striking the ball toward the goal with the objective of scoring.
Strategy
A general concept or idea agreed upon by the team at the beginning of the game
with the intention to beat the opponents.
Switching play
The transferring of the ball from one part of the field to another, generally from one
wide area to another, in order to disorganize the defense and create an advantage
over the opponents.
System
A formation with specifications in the shape and/or roles for one or more players.
The system combines the formation and strategy.
Chapter 2
Problem analysis
When analysing a football match, many different aspects play an important role in its outcome. Some of these aspects cannot be influenced by a team, such as weather conditions or bad decisions by officials, but other aspects can be influenced, and we focus on these. We introduce several questions and divide them into different categories, described below. The questions are based on the US Soccer Federation (USSF) recommendations for analyzing a match [16, 17].
2.1.1 Team structure
A football team has a certain structure, also called a formation. A team’s formation is the arrangement of the ten field players on the field. Such formations give some information on how defensively or offensively a team aims to play. Each formation requires certain player skills to carry out its goal effectively. This goal can, for example, be to avoid conceding goals or to keep possession of the ball during large periods of the match. A team’s formation is not static, i.e., players can swap positions or formations can be transformed during a match. A 4–4–2 formation might, for example, transform temporarily into a 4–2–4 system during an attacking phase in the match, as the two outer midfielders move up to join the two forwards. Regarding a team’s structure, we aim to answer the following questions:
2.1.2 Attacking tendencies
Football is a game of attacking and defending. The way a team attacks or defends, called a playing style (Section 2.2), plays an important role in how a match unfolds. When we analyse the attacking tendencies of the teams in a match, we analyse how these teams act when they are in possession of the ball. One of the most important aspects to analyse when a team is in possession of the ball is passing. A player who controls the flow of the team’s offensive play, and is often involved in many passing moves, is called the playmaker. Regarding the attacking tendencies of a team, we are interested in answering the following questions:
2.1.3 Defending tendencies
Apart from the attacking tendencies, the defending playing style (Section 2.2.2) is also important. It determines when defensive actions take place and where on the field they take place. Two important aspects in defending are the line of confrontation and the line of defense. The line of confrontation is an imaginary line on the field where the defending team starts pressing the opposing team that is in possession of the ball. Teams that use a high-pressure defending playing style will have a confrontation line closer to the opposing goal than teams that use a low-pressure defending playing style. The line of defense is an imaginary line on the field where the back players set up to hold the opposing team away from the goal. This line plays an important role when using an offside trap. Figure 2.1 shows an example of the line of confrontation and the line of defense. The red team is the attacking team, the blue team is the defending team, and the black dot denotes the ball. The arrows denote the movement direction of the pressing players. Regarding the defending tendencies of a team, we aim to answer the following questions:
[Figure legend: attacking team, defending team, ball, pressure]
Figure 2.1: An example showing the line of confrontation and the line of defense. The lines between the blue players connect the four defenders in the back and the three players applying pressure in the front. Pressure is shown by a black arrow. The line of confrontation is where the defending team starts applying pressure, shown by a dashed line. The line of defense is where the last line of defenders is located, also shown by a dashed line.
2.1.4 Set pieces
Set pieces such as free kicks and corner kicks play a major role in football. An analysis of the World Cup in 1994 shows that around 40% of all the goals scored came from set pieces; the winners, Brazil, scored around 50% of their goals from set pieces [18]. Other studies show similar results, with set piece goal percentages between 30% and 40% [19, 20]. Since so many goals are scored from set pieces, it is important to analyse them. The main questions we aim to answer are:
2.2 Playing styles
According to the literature [21, 22], playing styles in football can be divided into two
main categories: attacking playing styles and defending playing styles. We distinguish
two main types in each category that are briefly discussed below.
2.2.1 Attacking styles
We distinguish two main types of attacking playing styles: direct attacking (also known
as counterattacking) and indirect attacking (also known as possession football).
Direct attack
A team using a direct attack playing style attempts to score a goal by beating the de-
fending team with long, penetrating passes. The main idea is to take the most direct
route to the opponent’s goal with as few passes or ball touches as possible.
Indirect attack
An indirect style of play is the opposite of the direct style of play. Rather than attacking
in a fast and direct manner, a team playing an indirect style builds its attack slowly.
The main idea is to keep possession of the ball by using combination play. This combi-
nation play is characterized by many short passes or dribbles to advance slowly to the
opponent’s goal.
2.2.2 Defending styles
We distinguish two main types of defending playing styles: low-pressure defending (also
known as a zonal defense) and high-pressure defending (also known as man marking).
Low-pressure defending
In a low-pressure defense style, the defending team focuses on covering zones on the field
rather than marking individual attackers. Once the opposing team gains possession of
the ball, the defense withdraws toward its own half of the field and keeps a compact
shape to keep the attacking team from passing the ball into space. The defensive team
begins to pressure the ball as soon as it crosses midfield or when proper support is es-
tablished.
High-pressure defending
In a high-pressure defense style, the defending team immediately puts pressure on the
opponent’s players after losing possession. Instead of marking zones on the field, the
defenders use man-to-man marking to keep the attacking team from passing the ball
and to recover possession of the ball. The goal is to force the attacking players to play
faster than their ability allows.
2.3 Data
Our data consists of 90 football matches of the Bundesliga 2 in the 2007/2008 season, gathered by Opta Sports [1]. Each match is stored in two XML files: one containing general match data (called F7) and one containing the events data (called F24). The events data consists of all events in a match, such as goals, shots, passes, and tackles, each linked to one or more players and/or other objects and including the outcome of the event. The events data is spatiotemporal, i.e., each event contains x and y coordinates on the field and a timestamp. An overview of the F24 data is shown in Figure 2.2. For the structure of the XML files, we refer to Appendix A.
Figure 2.2: F24 data overview diagram with matches, events, qualifiers, and their
attributes. A match has multiple events, which each belong to a single match. An
event may have multiple qualifiers describing additional properties of that event.
2.3.1 Events and qualifiers
A match in our data consists of roughly 6,000 to 8,000 events. Each player action event within the game contains a player, a team, an event type, a minute, and a second. Depending on the event type, an event may have multiple qualifiers describing particular properties of that specific event type. An example of an event with qualifiers is given below.
<Event id="783279345" event_id="170" type_id="1" period_id="1" min="20"
       sec="33" player_id="19645" team_id="52" outcome="1" x="98.0" y="65.9"
       keypass="1" timestamp="2011-08-13T15:21:16.403">
  <Q id="1360104517" qualifier_id="140" value="95.7" />
  <Q id="716687577" qualifier_id="154" />
  <Q id="1024698286" qualifier_id="141" value="69.8" />
  <Q id="548528358" qualifier_id="213" value="2.3" />
  <Q id="1464195139" qualifier_id="212" value="3.6" />
  <Q id="811964232" qualifier_id="56" value="Center" />
  <Q id="629946245" qualifier_id="210" />
</Event>
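To illustrate how such an event could be read programmatically, the following minimal Python sketch parses the example event with the standard library. The function name parse_event and the layout of the resulting dictionary are assumptions made for this illustration only; they are not part of the F24 format or of our prototype.

```python
import xml.etree.ElementTree as ET

# The example event from the text, with attribute spacing normalised.
EVENT_XML = """
<Event id="783279345" event_id="170" type_id="1" period_id="1" min="20"
       sec="33" player_id="19645" team_id="52" outcome="1" x="98.0" y="65.9"
       keypass="1" timestamp="2011-08-13T15:21:16.403">
  <Q id="1360104517" qualifier_id="140" value="95.7" />
  <Q id="716687577" qualifier_id="154" />
  <Q id="1024698286" qualifier_id="141" value="69.8" />
  <Q id="548528358" qualifier_id="213" value="2.3" />
  <Q id="1464195139" qualifier_id="212" value="3.6" />
  <Q id="811964232" qualifier_id="56" value="Center" />
  <Q id="629946245" qualifier_id="210" />
</Event>
"""


def parse_event(xml_text):
    """Turn one <Event> element into a plain dictionary (hypothetical layout)."""
    node = ET.fromstring(xml_text.strip())
    # Map qualifier_id -> value; qualifiers without a value attribute map to None.
    qualifiers = {
        int(q.get("qualifier_id")): q.get("value") for q in node.findall("Q")
    }
    return {
        "type_id": int(node.get("type_id")),
        "team_id": int(node.get("team_id")),
        "player_id": int(node.get("player_id")),
        "minute": int(node.get("min")),
        "second": int(node.get("sec")),
        "outcome": int(node.get("outcome")),
        "x": float(node.get("x")),
        "y": float(node.get("y")),
        "qualifiers": qualifiers,
    }


event = parse_event(EVENT_XML)
```

In this sketch, event["qualifiers"][140] holds the string "95.7", while a qualifier without a value, such as 154, is present in the dictionary with the value None.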
[Field diagram: corners at (0,0), (100,0), (0,100), and (100,100); marked points at (5.8,36.8), (5.8,63.2), (94.2,36.8), and (94.2,63.2); horizontal axis x]
Figure 2.3: Field coordinates by Opta [1].
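Because both axes run from 0 to 100 regardless of the real pitch dimensions, distances computed directly on these coordinates are distorted. The following sketch shows one possible conversion to meters. The pitch size of 105 m by 68 m is an assumption for illustration; it is not contained in the data, and the function names are hypothetical.

```python
def to_meters(x, y, length_m=105.0, width_m=68.0):
    """Scale percentage coordinates (0-100 on both axes) to meters.

    The default pitch size of 105 m x 68 m is an assumption, not a value
    taken from the data.
    """
    return x / 100.0 * length_m, y / 100.0 * width_m


def distance_m(x1, y1, x2, y2, length_m=105.0, width_m=68.0):
    """Euclidean distance in meters between two percentage coordinates."""
    ax, ay = to_meters(x1, y1, length_m, width_m)
    bx, by = to_meters(x2, y2, length_m, width_m)
    return ((bx - ax) ** 2 + (by - ay) ** 2) ** 0.5
```

Under this assumption, a difference of 10 units along the x-axis corresponds to 10.5 m, while the same difference along the y-axis corresponds to 6.8 m.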
2.3.2 Limitations
The movement of players on the field plays an important role in both defensive and offensive tactics [23, 24]. However, our data does not contain any information about where players are when they do not touch the ball. This means that an analysis of how players move without the ball is impossible.
Since our data contains only the first ten rounds (90 matches) of the season, we cannot perform a full analysis of the season. Although this is not our goal, it limits the ability to analyse top scorer results, final league standings, etc.
2.4 Related work
Much of the work done in the area of football visualization is related to statistics. A few examples are Hughes et al. [2], who analyzed the number of goals scored in relation to the length of the passing sequences (Figure 2.4), Cava et al. [25], who designed a "glyphs in matrix" visualization for results of football matches, and Rusu et al. [26], who designed Soccer Scoop, an application that allows football managers to compare statistical data of individual players.
Figure 2.4: Patterns of goal-scoring with respect to the different lengths of possessions
in the 1990 and 1994 World Cups for soccer. [2]
Perin et al. [3] showed that there is more to football visualization than just the visualization of statistics: they introduced the visualization of what they call phases in football (Figure 2.5(2)). Phases are sequences of actions from one team until it loses the ball. Apart from these phases, they show additional information such as an overview timeline (Figure 2.5(1)), single player details (Figure 2.5(3)), and generated text-annotations (Figure 2.5(5)).
Figure 2.5: SoccerStories user interface: (1) complete game overview as a timeline,
(2) temporal zoom on a game phase and layout on a soccer field, (3) player details on
the side, (4) the thumbnails of phases, and (5) generated text-annotations. [3]
Perin et al. also introduce five different ways to visualize a pass cluster (passes between a group of players) in football: a full node-link diagram, a compact node-link diagram, an adjacency matrix, a hive plot, and a tagcloud (Figure 2.6). These different techniques help in finding different kinds of passing information, such as the player with the most successful passes or the two players with the best mutual passing. However, Perin et al. focus on telling stories with the football data rather than providing insight into playing styles, as SoccerStories is developed to support analysts in exploring soccer data and communicating interesting insights.
Figure 2.6: Five different ways to visualize a pass cluster: (a) a full node-link diagram,
(b) a compact node-link diagram, (c) an adjacency matrix, (d) a hive plot, and (e) a
tagcloud. [3]
There are some general visualization techniques that are widely used in football visualization. One widely used technique to visualize a team’s passing performance is a pass graph or pass network. Peña et al. [4] used tools from network theory to describe the strategy of football teams. They define the passing network of a team as follows: the team’s players are nodes on the football field corresponding to their original playing position, i.e., their position as specified by the team formation. The connecting arrows between two player nodes are weighted by the number of successful passes completed between them. A bigger and darker arrow means more successful passes between the players. Figure 2.7 shows an example of such a network.
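The edge weights of such a passing network can be sketched in a few lines of Python. This is only an illustration of the weighting idea described above, not the implementation of Peña et al.; the input format of (passer, receiver, successful) tuples and the function name build_pass_network are assumptions.

```python
from collections import Counter


def build_pass_network(passes):
    """Count successful passes per ordered (passer, receiver) pair."""
    weights = Counter()
    for passer, receiver, successful in passes:
        if successful:
            weights[(passer, receiver)] += 1
    return weights


# Hypothetical passes: (passer shirt number, receiver shirt number, successful?)
passes = [
    (10, 20, True),
    (10, 20, True),
    (20, 10, True),
    (10, 5, False),
    (5, 20, True),
]
network = build_pass_network(passes)
```

Each key of the resulting counter is a directed edge, so passes from player 10 to player 20 are counted separately from passes in the opposite direction. The count can then be mapped to the width and darkness of the corresponding arrow.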
Figure 2.7: Passing networks for the Netherlands and Spain drawn before the final game of the World Cup 2010, using the passing data and tactical formations of the semi-finals. [4]
Another widely used visualization technique in sports in general is the heatmap. A heatmap is a two-dimensional graphical representation of data in which the data values are represented by colors. In sports it is often used to visualize where players move or where particular events happen on the playing field. An example of such a heatmap is shown in Figure 2.8. It shows in which parts of the field the player spent more time.
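The data behind a heatmap is essentially a grid of counts. The sketch below bins event locations into such a grid, assuming coordinates on a 0 to 100 scale on both axes; the grid resolution and the function name heatmap_counts are arbitrary choices for this illustration. Mapping the counts to colors is left to the plotting library.

```python
def heatmap_counts(points, cols=10, rows=5):
    """Count events per grid cell for coordinates given on a 0-100 scale."""
    grid = [[0] * cols for _ in range(rows)]
    for x, y in points:
        # Clamp so that a coordinate of exactly 100 falls in the last cell.
        col = min(int(x / 100.0 * cols), cols - 1)
        row = min(int(y / 100.0 * rows), rows - 1)
        grid[row][col] += 1
    return grid


# Hypothetical event locations.
points = [(5.0, 10.0), (7.5, 15.0), (98.0, 65.9), (100.0, 100.0)]
grid = heatmap_counts(points)
```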
Figure 2.8: Heatmap of player movement showing which parts of the field the player
spent more time in [5].
Chapter 3
Data processing
Our data consists of thousands of individual events. These individual events say something about a football match, but not everything. That is because events take place in a certain context, such as a foul that is committed just after losing the ball, or a header goal from a perfect corner cross. This means that events can be linked together in sequences. We call these sequences of linked events possession chains. A possession chain, or simply chain, starts when a team obtains possession of the ball and ends when this team loses the ball again. This is similar to the “phases” introduced by Perin et al. [3]. A formal definition of a possession chain is given in Section 3.2. Once we have enriched our data with these possession chains, we can calculate various properties of such chains, such as their speed or their average distance per touch. These properties are described in Section 3.3. But before we go into detail about the possession chains and their properties, we give a mathematical model of the data. This helps us to define the possession chains later on. The mathematical model is described in Section 3.1.
In this section we will give a mathematical model for the football data. As discussed
in section 2.3.1, a match consists of thousands of events. We denote the events of a
match m with n events as Em = e1 , e2 , . . . , en . Each event has certain attributes (recall
Figure 2.2) and, depending on the event type, certain qualifiers. We will define these
attributes and qualifiers as follows:
The qualifier values of an event ei are defined as qi,j , where i is the event number and j
is the qualifier id. So the value of a qualifier with a qualifier id = 56 of event e5 would
be written as q5,56 . If event e5 does not have a qualifier with qualifier id = 56, then we
write q5,56 = ∅. The position of an event ei on the field is defined as a coordinate (xi , yi ).
The distance of an event ei depends on its event type. A pass has qualifiers 140 and
141 which describe the end x-coordinate and end y-coordinate respectively. Similarly,
a shot that is blocked contains qualifiers 146 and 147 describing the end x-coordinate
and end y-coordinate. All other shots are either a goal, hit the post or bar, or end up
behind the goal. In these cases, qualifier 102 describes the end y-coordinate of the shot.
Since these shots always hit or pass the goal line, the end x-coordinate is equal to 100.
A dribble is a special case, because it has no qualifiers describing where it ends. In this
case, we look at where the next event starts and take the starting coordinate of that
event as end point of the dribble. All other events do not have a distance as the ball is
not moved. This leads to the following definition of the end coordinates (Xi , Yi ) of an
event ei :
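$$
(X_i, Y_i) =
\begin{cases}
(q_{i,140}, q_{i,141}) & \text{if } e_i \text{ is a pass or an offside pass} \\
(q_{i,146}, q_{i,147}) & \text{if } e_i \text{ is a blocked shot} \\
(100, q_{i,102}) & \text{if } e_i \text{ is a non-blocked shot} \\
(x_{i+1}, y_{i+1}) & \text{if } e_i \text{ is a dribble} \\
(x_i, y_i) & \text{otherwise}
\end{cases}
\tag{3.1}
$$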
Now that the end coordinates (Xi , Yi ) of an event ei are defined, we define the distance
di of an event ei as follows:
$$d_i = \sqrt{(X_i - x_i)^2 + (Y_i - y_i)^2} \tag{3.2}$$
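A minimal Python sketch of equations 3.1 and 3.2 may help to make the case distinction concrete. The `Event` class and the string labels for the event types are hypothetical stand-ins for illustration only (the data itself uses numeric type identifiers); the qualifier ids 140, 141, 146, 147 and 102 are the ones named above.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Hypothetical event type labels; the real data uses numeric type ids.
PASS, OFFSIDE_PASS, BLOCKED_SHOT, SHOT, DRIBBLE, OTHER = (
    "pass", "offside_pass", "blocked_shot", "shot", "dribble", "other")


@dataclass
class Event:
    kind: str                      # one of the labels above
    x: float                       # starting x-coordinate (0..100)
    y: float                       # starting y-coordinate (0..100)
    qualifiers: Dict[int, float] = field(default_factory=dict)


def end_position(event: Event, next_event: Optional[Event]) -> Tuple[float, float]:
    """End coordinates (X_i, Y_i) following the cases of equation 3.1."""
    q = event.qualifiers
    if event.kind in (PASS, OFFSIDE_PASS):
        return (q[140], q[141])
    if event.kind == BLOCKED_SHOT:
        return (q[146], q[147])
    if event.kind == SHOT:                      # non-blocked shot
        return (100.0, q[102])
    if event.kind == DRIBBLE and next_event is not None:
        return (next_event.x, next_event.y)     # start of the next event
    return (event.x, event.y)                   # the ball is not moved


def distance(event: Event, next_event: Optional[Event] = None) -> float:
    """Euclidean distance d_i of equation 3.2."""
    end_x, end_y = end_position(event, next_event)
    return math.hypot(end_x - event.x, end_y - event.y)
```

For example, a pass from (10, 10) with end qualifiers (13, 14) gets distance 5 in field coordinates. A dribble without a following event falls back to distance 0 in this sketch, which is an assumption rather than something the model above specifies.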
The starting time si of event ei is calculated from the starting minute min and starting
seconds sec of that event. We define starting time si of event ei as
$$s_i = 60 \cdot min + sec \tag{3.3}$$
This allows us to easily calculate with the starting times as they are all in seconds. Using
the starting times of the events, we can define the duration. The duration depends on
the event type. Generally, we look at the starting time of the next event to calculate
the duration. However, we found out that a shot leads to an inaccurate duration as
out-of-play events after a missed shot are inaccurate and restarts after a goal may take a
long time. To determine the duration of a shot event, we use the average ball velocity of
a shot by a professional football player and the distance the ball has traveled. Research
has shown that the average ball velocity of a shot by a professional football player is
$$
t_i =
\begin{cases}
\dfrac{d_i}{30} & \text{if } e_i \text{ is a shot} \\
s_{i+1} - s_i & \text{otherwise}
\end{cases}
\tag{3.4}
$$
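The timing rules can be sketched in a few lines of Python. The function names are hypothetical, and the sketch assumes that the starting time is simply 60 times the starting minute plus the starting seconds; the divisor 30 is taken from equation 3.4.

```python
SHOT_SPEED = 30.0  # divisor used for shots in equation 3.4


def start_time(minute: int, second: int) -> int:
    """Starting time s_i in seconds (assumed to be 60 * min + sec)."""
    return 60 * minute + second


def duration(is_shot: bool, dist: float, start: float, next_start: float) -> float:
    """Duration t_i: distance-based for shots, gap to the next event otherwise."""
    if is_shot:
        return dist / SHOT_SPEED
    return next_start - start
```

A shot with distance 15 therefore gets a duration of 0.5, independent of when the next event starts.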
An event also contains some boolean attributes, which are either 1 or 0. These attributes
are the outcome of the event (successful / failed), whether or not the event is an assist,
and whether or not the event is a keypass. We describe these boolean attributes for
event ei as oi for the outcome, ai for the assist, and ki for the keypass. Finally, we define
the type id, team id and player id of event ei as typei , teami and playeri respectively.
We summarize all the notations in Table 3.1.
Notation                Description
Em = e1, e2, ..., en    The list of n events of match m in chronological order.
qi,j                    The value of the qualifier with qualifier id = j of event ei (∅ if qi,j does not exist).
(xi, yi)                The starting position of event ei as coordinate on the field.
(Xi, Yi)                The end position of event ei as coordinate on the field (equation 3.1).
si                      The starting time of event ei in seconds (equation 3.3).
ti                      The duration of event ei (equation 3.4).
oi                      The outcome of event ei, either 0 for failed, or 1 for successful.
ai                      Describes if event ei is an assist (1) or not (0).
ki                      Describes if event ei is a keypass (1) or not (0).
di                      The distance the ball traveled during event ei (equation 3.2).
teami                   The team identifier of event ei.
playeri                 The player identifier of event ei.
typei                   The type identifier of event ei.
Table 3.1: Notations for the football data and their meaning.
A single match consists of several possession chains for the two teams playing against
each other. Each possession chain starts with an event where possession of the ball is
obtained by a team and ends with an event where that team loses possession of the ball
again. Before we can formally define a possession chain, we must clarify which events
lead to obtaining the ball and which events lead to losing the ball.
Definition 3.1 (Possession chain). Let event ei be an event where team teami obtains
the ball and let event ej with j > i be the first event after ei that results in team teami
losing possession of the ball. Then the sequence c = ei, ei+1, ..., ej is called a possession
chain of team teami. Each event in the chain must belong to the same team, and each
event except for the last one must have a positive outcome. Formally, for chain c it must
hold that:
$$(\forall k \in \mathbb{N} : i \le k < j : \mathit{team}_k = \mathit{team}_{k+1} \wedge o_k = 1) \tag{3.5}$$
End-of-chain event
To easily process the possession chains, we build an “end-of-chain” event indicating the
end of a possession chain. The end-of-chain event has the same attributes as any other
event in the data. The attributes of an end-of-chain event for a chain c = ei, ei+1, ..., ej
are defined in Table 3.2. The end-of-chain event is attached to the end of a possession
chain.
Attribute            Value
type identifier      0
starting time        sj + tj
starting position    (Xj, Yj)
end position         (Xj, Yj)
team identifier      teami
player identifier    0 (non-existing id)
distance             0
duration             0
outcome              1
keypass              0
assist               0
Table 3.2: The attributes of the end-of-chain event and their value.
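Condition 3.5 can be illustrated with a short Python sketch that cuts a chronological event list into maximal runs satisfying it. This is a simplification under stated assumptions: it only looks at team identifiers and outcomes, it does not model which event types obtain or lose the ball, and it keeps single-event runs although Definition 3.1 requires j > i. The function name is hypothetical.

```python
from typing import List, Sequence, Tuple


def split_into_chains(teams: Sequence[int],
                      outcomes: Sequence[int]) -> List[Tuple[int, int]]:
    """Return inclusive index pairs (i, j) such that for all i <= k < j
    team_k == team_{k+1} and o_k == 1 (condition 3.5)."""
    chains = []
    n = len(teams)
    i = 0
    while i < n:
        j = i
        # Extend the run while the current event succeeded and the next
        # event belongs to the same team.
        while j + 1 < n and outcomes[j] == 1 and teams[j + 1] == teams[j]:
            j += 1
        chains.append((i, j))
        i = j + 1
    return chains
```

With teams `[1, 1, 1, 2, 2, 1]` and outcomes `[1, 1, 0, 1, 1, 1]`, the failed third event ends the first run and the change of team ends the second, giving the index pairs (0, 2), (3, 4) and (5, 5).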
Possession chains have several properties such as speed or distance. In this section
we give formal definitions of the properties we use. For a better understanding of the
possession chain properties we created a schematic view of a possession chain (see Figure
3.1). This schematic view contains a possession chain with j − i + 1 events where ei
denotes event i at position (xi , yi ) and di denotes the distance of event ei .
Length
Let chain c = ei , ei+1 , . . . , ej be a possession chain of j − i + 1 events. Then the length
or number of touches cl of chain c is defined as
$$c_l = j - i + 1 \tag{3.6}$$
[Figure 3.1: schematic view of a possession chain with events ei = (xi, yi), ei+1, ..., ej and event distances di, di+1, ..., dj]
Duration
Let chain c = ei , ei+1 , . . . , ej be a possession chain. Then the total time or duration ct
of chain c is defined as follows:
$$c_t = \sum_{k=i}^{j} t_k \tag{3.7}$$
Distance
Let chain c = ei , ei+1 , . . . , ej be a possession chain. Then the distance cd of chain c is
defined as follows:
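$$c_d = \sum_{k=i}^{j} d_k \tag{3.8}$$
The average distance per touch cdavg of chain c is then defined as follows:
$$c_{d_{\mathrm{avg}}} = \frac{c_d}{c_l} \tag{3.9}$$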
Speed
Let chain c = ei , ei+1 , . . . , ej be a possession chain. Then the speed or velocity cv of
chain c is defined as follows:
$$c_v = \frac{c_d}{c_t} \tag{3.10}$$
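The properties defined so far (equations 3.6 to 3.10) follow directly from the per-event distances and durations. A minimal Python sketch, with a hypothetical function name and with zero guards that are an assumption of this sketch rather than part of the definitions:

```python
from typing import Dict, Sequence


def chain_properties(distances: Sequence[float],
                     durations: Sequence[float]) -> Dict[str, float]:
    """Length, duration, distance, average distance per touch and speed
    of one chain, given d_k and t_k for k = i..j."""
    length = len(distances)            # c_l = j - i + 1
    total_time = sum(durations)        # c_t
    total_distance = sum(distances)    # c_d
    return {
        "length": length,
        "duration": total_time,
        "distance": total_distance,
        # Guards against division by zero are an assumption of this sketch.
        "avg_distance": total_distance / length if length else 0.0,
        "speed": total_distance / total_time if total_time > 0 else 0.0,
    }
```

For a chain with distances 10, 20, 30 and durations 2, 3, 5 this yields length 3, duration 10, distance 60, average distance per touch 20 and speed 6.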
Number of passes
Let chain c = ei , ei+1 , . . . , ej be a possession chain. Then the number of passes cpasses
of chain c is defined as
Note that we exclude passing events which contain the following qualifiers as they do
not have a designated target player:
Chapter 4
Visualizations
In Chapter 2 we discussed the problem domain and we came up with several questions
we needed to answer in order to answer our main questions Q0 and Q1 on the match
outcome and playing style. In this chapter we will explain the visualizations we devel-
oped that answer those questions. Most of these visualizations are linked together into
our Match Dashboard described in Section 4.1.
Figure 4.1: Match Dashboard for FC Augsburg against TSV München 1860. (1)
Field area. (2) Glyph graph. (3) Timeline. (4) Overview header. (5) Rectangular
Chain Graph.
Most of the designed visualizations are captured in our Match Dashboard depicted in
Figure 4.1. The visualizations, numbered from 1 to 5, are all connected via a user
interface, i.e., the user can select, filter, and highlight particular aspects such as chains
or players. This input from the user affects all visualizations in the dashboard. We
discuss the interaction with the dashboard in Section 4.10.
The football fields (number 1 in Figure 4.1) are used to draw four different
visualizations for both the home team (left) and the away team (right). These visualizations
are the Single Chain visualization (Section 4.2), the Pass Graph (Section 4.3), the
Defense Plot (Section 4.5), and the Chain Heatmap (Section 4.6). The Glyph Graphs
(number 2 in Figure 4.1) visualize the possession chains as glyphs for both teams in the
match. The Glyph Graph is discussed in Section 4.8. The Timeline in the dashboard
(number 3 in Figure 4.1) allows the user to select particular time periods. The Time-
line is discussed in Section 4.7. The Overview Header (number 4 in Figure 4.1) shows
general match information such as the teams that play each other, the final result, who
scored, etcetera. The Overview Header is discussed in Section 4.9. The Rectangular
Chain Graph (number 5 in Figure 4.1) visualizes the possession chains of both teams in
an alternative way, leading to different insights. It is discussed in Section 4.4.
The first step in our visualization is to visualize the possession chains discussed in
Section 3.2. The way we visualize these chains is similar to the way Perin et al. [3]
visualized their soccer phases. Each ball event within a possession chain has either
a single position, for example a tackle, or a starting position and an end position, for
example a pass. Events with a single position are visualized as nodes on the field, whereas
events with a starting position and an end position are visualized as a combination of a
node and an arrow. An example of the visualization of a possession chain is shown in
Figure 4.2.
Each node in Figure 4.2 represents a player action. The color of a node visualizes the
original position of this player as shown in Figure 4.3(a). The color of an arrow denotes
the outcome of the event it represents, which can be successful (green) or unsuccessful
(red). Set pieces (including goal kicks) and keypasses are highlighted by different colors
to easily identify them as shown in Figure 4.3(b). A player dribbling with the ball is
shown as a dashed line.
[Figure 4.3: (a) Player positions: goalkeeper, defender, midfielder, striker, substitute. (b) Arrow meanings: successful pass / goal, failed pass / missed shot, keypass, set piece (incl. goal kick), shot against post / bar, dribble.]
In order to analyse the structure of a team, it is convenient and intuitive to draw each
player’s position on the field. Figure 4.4 shows an example of how a 4–4–2 diamond
formation can be visualized on the football field in a step by step approach. In Fig-
ure 4.4(a) all players are drawn as (black) nodes on their respective positions on the
field. To make a clear distinction between the different positions of the players, we color
code them, as shown in Figure 4.4(b). Finally we add the players’ shirt numbers and
names to identify each individual player as shown in Figure 4.4(c) and 4.4(d) respec-
tively. This is a very common way to visualize team formations in football, although the
nodes may be different as they are often visualized as shirts or player passport photos.
The advantage of circular nodes is that it is possible to add another dimension to the
visualization by varying the node sizes. This could also be done with shirt nodes, but
circular nodes allow easier size comparison.
As explained in Section 2.1.2, passing plays a very important role in football. Passing can
be looked at from two points of view: a player or a team. These different viewing points
lead to different visualizations, one capturing player performance, the other capturing
team play and possibly strategy. We first look at passing from a team’s point of view.
[Figure 4.4: Step-by-step visualization of a 4–4–2 diamond formation. (a) Players as black nodes. (b) Color-coded positions. (c) Shirt numbers added. (d) Player names added.]
In Section 2.4, we discussed the pass graph or pass network. Figure 4.5 shows our
idea of a pass graph. The nodes in the graph represent players and are placed on
the field according to their original position in the team formation. The lines between
players represent successful mutual passes. The width and color represent the number
of successful mutual passes; the wider and darker the line is, the more successful mutual
passes it represents. To determine the exact color of these lines we create a colormap.
Formula 4.1 shows how we calculate the color of a given value v, given a minimum color, a
maximum color, a minimum value, and a maximum value. For the number of passes, the
minimum value is 0 as this is the least possible number of passes. The maximum value is
calculated from the data; we take the maximum number of mutual passes between any two players.
The minimum and maximum color depend on the team. For the home team we create a
red colormap going from RGB=(254,224,210) to RGB=(222,45,38). For the away team
we create a blue colormap going from RGB=(222,235,247) to RGB=(49,130,189). Both
these colormaps are shown in Figure 4.5.
The size of a node shows how many passes that particular player has given and its color
shows the success rate of these passes. This means that a large bright green node denotes
a player who gave a lot of successful passes and that a large red node denotes a
player who gave a lot of failed passes. The colors of the values are calculated using linear
interpolation as shown in formula 4.1. In this formula cmin and cmax are the minimum
and maximum color values, vmin and vmax are the minimum and maximum data values,
and v is the value of which we want the color.
Figure 4.5: Pass graphs of FC Augsburg (left) and TSV München 1860 (right). Result:
FC Augsburg 2 - 6 TSV München 1860
Since the success rate of passes is a percentage, we let the data values range from 0 (0%)
to 1 (100%). The colors of the created colormap range from red (RGB=(255,0,0)) to
bright green (RGB=(0,255,0)) and are shown in Figure 4.6.
Figure 4.6: The colormap for coloring the nodes according to the success rate of
passes ranging from 0 (0%) to 1 (100%)
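A linear colormap of this kind can be sketched in Python as follows. The sketch assumes componentwise linear interpolation of the RGB channels, clamps values outside the data range, and rounds to integer channel values; these details, as well as the function and constant names, are assumptions for illustration. The RGB endpoints are the ones given in the text.

```python
from typing import Tuple

RGB = Tuple[int, int, int]

# Colormap endpoints as given in the text.
HOME_MIN, HOME_MAX = (254, 224, 210), (222, 45, 38)
AWAY_MIN, AWAY_MAX = (222, 235, 247), (49, 130, 189)
RATE_MIN, RATE_MAX = (255, 0, 0), (0, 255, 0)


def lerp_color(c_min: RGB, c_max: RGB, v_min: float, v_max: float, v: float) -> RGB:
    """Color for value v, interpolated linearly between c_min and c_max."""
    if v_max == v_min:
        return c_min                      # degenerate range (assumption)
    t = (v - v_min) / (v_max - v_min)
    t = max(0.0, min(1.0, t))             # clamp to the data range (assumption)
    return tuple(round(a + t * (b - a)) for a, b in zip(c_min, c_max))
```

With a maximum of 19 mutual passes, `lerp_color(AWAY_MIN, AWAY_MAX, 0, 19, 19)` returns the darkest blue of the away colormap, and `lerp_color(RATE_MIN, RATE_MAX, 0, 1, 1.0)` returns bright green for a 100% success rate.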
To determine the size of the nodes we use linear interpolation as well. We have a
minimum and maximum radius, rmin and rmax , that we use as bounds for the node sizes
and a minimum and maximum data value vmin and vmax respectively. Since the area of
a node represents its value, we interpolate using the minimum and maximum area and
not the minimum and maximum radius. A linear increase in the radius would result in
a quadratic increase in the area and that leads to a wrong representation. We use rmin
and rmax to determine the minimum area Amin and maximum area Amax :
$$A_{min} = \pi \cdot r_{min}^2 \tag{4.2}$$
$$A_{max} = \pi \cdot r_{max}^2 \tag{4.3}$$
Since we use the radius to render a node we calculate the radius r from the previously
calculated area A:
$$r = \sqrt{\frac{A}{\pi}} \tag{4.5}$$
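The area-based sizing can be sketched in Python as follows. The sketch assumes that the area A is obtained by linear interpolation between Amin and Amax over the data range, in the same way as the colors; the function name is hypothetical.

```python
import math


def node_radius(v: float, v_min: float, v_max: float,
                r_min: float, r_max: float) -> float:
    """Radius of a node whose area (not radius) scales linearly with v."""
    a_min = math.pi * r_min ** 2                         # equation 4.2
    a_max = math.pi * r_max ** 2                         # equation 4.3
    t = 0.0 if v_max == v_min else (v - v_min) / (v_max - v_min)
    area = a_min + t * (a_max - a_min)                   # assumed linear in v
    return math.sqrt(area / math.pi)                     # equation 4.5
```

With rmin = 1 and rmax = 3, a value halfway through the data range gets radius √5 ≈ 2.24 instead of 2, so that its area lies exactly halfway between the minimum and the maximum area.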
In Figure 4.5 we can see that TSV München 1860 is more successful in passing than FC
Augsburg. From the node sizes we see that the TSV München 1860 players played more
passes and by the colors of these nodes we see that they were more successful too. The
width of the lines between the nodes shows that TSV München 1860 had more successful
mutual passes than FC Augsburg. The number of mutual passes between player number 4 and
player number 5 of TSV München 1860 was the highest (19 passes), as can be seen from
the thick dark blue line.
The user can interact with the Pass Graph using the mouse. In a settings menu of the
Match Dashboard (Section 4.10) the user can select in which way the player nodes must
be drawn. The user can select between the options “Formation” and “Average”. The
option “Formation” draws the nodes according to the original position of the players
in the team formation as shown in Figure 4.5. The option “Average” draws the player
nodes according to their average position on the field during the whole match. This
average position is calculated from all events where that particular player touches the
ball. Because the average positions of two players can be close to each other, there is a
possibility of overlapping nodes. To prevent larger nodes from being drawn completely over
smaller nodes, we draw the nodes in order from large to small. This means that the
smallest node is always drawn on top. Figure 4.7 shows the pass graphs of FC Augsburg
and TSV München 1860 using the average positions of players.
Figure 4.7: Pass graphs of FC Augsburg (left) and TSV München 1860 (right) using
average positions.
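The “Average” option and the drawing order can be sketched in Python as follows. The input format (tuples of player id, x and y per ball touch) and the function names are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def average_positions(
        touches: Iterable[Tuple[int, float, float]]) -> Dict[int, Tuple[float, float]]:
    """Average (x, y) per player over all events where the player touches the ball."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for player, x, y in touches:
        entry = sums[player]
        entry[0] += x
        entry[1] += y
        entry[2] += 1
    return {p: (sx / n, sy / n) for p, (sx, sy, n) in sums.items()}


def draw_order(radii: Dict[int, float]) -> List[int]:
    """Player ids sorted from largest to smallest node, so that the
    smallest node is drawn last and therefore ends up on top."""
    return sorted(radii, key=radii.get, reverse=True)
```

For node radii `{4: 3.0, 5: 1.0, 8: 2.0}` the drawing order is player 4, then 8, then 5.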
Apart from the team passing analysis, the user can also analyze passing of individual
players in the pass graph. By hovering the mouse over a player node, the individual
passes of that player are shown as arrows on the field (Figure 4.8(a)). Green arrows show
the successful passes, red arrows show failed passes, and black arrows show keypasses.
Some successful passes are drawn as thicker green arrows; these passes are assists. While
hovering over a player node, the user can click and hold the left mouse button to show the
individual passing of that player in a radial graph (Figure 4.8(b)). The radial graph
is ideal for analyzing the direction of passes quickly. In both visualizations statistics
of the passes are shown above the field. In Figure 4.8(a) the total number of passes,
the number of successful versus failed passes, the number of keypasses, and the number
of assists are shown above the field. In the radial representation in Figure 4.8(b) the
numbers of forward and backward passes are shown above the field.
(a) Individual passes as arrows on the field (b) Individual passes as radial graph
Figure 4.8: The individual passes of player number 22 of TSV München 1860.
Figure 4.9 shows such a rectangular chain representation. The first and last event of the
possession chain, ei and ej respectively, are shown as well.
Figure 4.9: A rectangular representation of a possession chain (labeled with ∆x, ∆t, ei, and ej)
In our final rectangular chain visualization, called Rectangular Chain Graph, we color
the possession chains according to the team. Possession chains of the home team are
colored red and possession chains of the away team are colored blue. All possession
chains are visualized top-down in chronological order, i.e., in the order they happened in
the match. Figure 4.10 shows the final result of the match FC Augsburg (red) against
TSV München 1860 (blue). The arrows above the graph show the playing direction of
each team and the percentages show the ball possession of both teams. Figure 4.10(a)
shows all possession chains of the match. Figure 4.10(b) highlights all possession chains
containing a shot. And finally, Figure 4.10(c) highlights all possession chains containing
a goal. The Rectangular Chain Graph is interactive, i.e., the user can set filters to
highlight or grey out possession chains. Furthermore, the user can zoom in to have a
closer look at specific possession chains using a timeline (Section 4.7).
(a) All chains (b) Chains containing a shot (c) Chains containing a goal
Figure 4.10: Rectangular Chain Graph for the match FC Augsburg (red) against
TSV München 1860 (blue).
A problem with this visualization is that we lose the information of where this chain
took place on the field as in Figure 4.2 of Section 4.2. We can partly solve this problem
by drawing this information inside the rectangle, as shown in Figure 4.11.
[Figure 4.11: (a), (b) Rectangular representations of the possession chain e0, ..., e4 with the chain drawn inside the rectangle; labels ∆x and ∆t.]
Note that the rectangular chain representation of Figure 4.11 does not show any information
about the y-coordinate of the events. The vertical difference between events in
this visualization is a time difference. In many cases, the first event in a possession
chain is not the event with the smallest x-coordinate. Similarly, in many cases, the
last event in a possession chain is not the event with the highest x-coordinate. In both
cases the chain drawing inside the rectangle would be partly outside the rectangle. To
show such minimum and maximum x-coordinates in a possession chain, we make use of
dashed lines, as shown in Figure 4.12.
Figure 4.12: A rectangular representation of a possession chain showing min and max
x-coordinates as dashed area. The football field underneath the rectangle contains the
example chain visualized on the field.
Figure 4.13(a) shows an example of two zoomed-in consecutive possession chains. Note
that the Rectangular Chain Graph makes use of semantic zooming, i.e., we only draw
inside the rectangle when a certain zooming level has been reached. Figure 4.13(b)
shows the same two consecutive possession chains drawn on the field. We can see that
TSV München 1860 tries to keep possession of the ball by playing a lot of short passes
in their own half. Finally they play a long ball which ends behind the goal line. Then
the FC Augsburg goalkeeper passes the ball to the right defender from the goal kick.
This right defender plays a long ball to an advanced midfielder who loses the ball,
showing a direct attack playing style in this possession chain.
Figure 4.13: An example of two possession chains in the match FC Augsburg (red)
against TSV München 1860 (blue).
As discussed in Section 2.1.3, while looking at the defending tactics, we are interested
in the line of confrontation and the line of defense. Since we do not know where players
are if they are not touching the ball, we can only analyse the defensive events involving
the ball, such as interceptions or tackles. In order to analyse where exactly the line of
confrontation and the line of defense are, we need to visualize where all these events
happen on the field. A common way to visualize spatial data is by means of a heatmap
as discussed in Section 2.4. In football, it is common to divide the field into thirds. The
first third of the field is called the defending third, the second third of the field is called
the middle third, and the last third of the field is called the attacking third (see Figure
4.14(a)). Hence, it is intuitive to make use of this division in our heatmap. To get a
better view of where defensive events happen in each third of the field, we divide each
third into a 3 by 6 grid as shown in Figure 4.14(b).
[Figure 4.14: (a) Division of the field into thirds. (b) Division of each third into a 3 by 6 grid. The playing direction is indicated.]
We color each rectangle in the grid according to the number of defensive actions that
happen in that area on the field. This creates a heatmap where darker colors denote
more defensive actions than lighter colors. We use formula 4.1 to create a colormap
for each team. This colormap is the same colormap as the one used for the coloring of
mutual passes in the pass graph (Section 4.3). The only difference is the maximum value
that is used in the algorithm. We calculate the maximum value by analyzing the data;
the area with the most defensive actions provides the maximum value for the colormap.
This means that the heatmap is relative to the maximum number of defensive actions of
any single area. To analyse the defensive contribution of each individual player and to
analyse where individual players defend on the field, we show the average position of the
defending actions of each player as nodes in the defense plot (see Figure 4.15). The size
of these nodes shows how many defensive actions this player has performed; the bigger
a node is, the more defensive actions the corresponding player has performed.
Figure 4.15: Defense plot over 90 minutes for the home team (left) and the away
team (right)
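The counting step behind the defense plot can be sketched in Python as follows. The sketch assumes that the 3 by 6 grid per third means three columns along the playing direction and six rows across the field, giving a 9 by 6 grid for the whole field, and that positions are given in the 0 to 100 coordinate system. The orientation of the grid and the function names are assumptions.

```python
from typing import Iterable, List, Tuple

COLS, ROWS = 9, 6  # assumed: 3 columns per third along x, 6 rows along y


def defensive_grid(events: Iterable[Tuple[float, float]]) -> List[List[int]]:
    """Count defensive actions per grid cell; grid[row][col]."""
    grid = [[0] * COLS for _ in range(ROWS)]
    for x, y in events:
        # Coordinates equal to 100 fall into the last column / row.
        col = min(int(x / 100 * COLS), COLS - 1)
        row = min(int(y / 100 * ROWS), ROWS - 1)
        grid[row][col] += 1
    return grid


def grid_maximum(grid: List[List[int]]) -> int:
    """Largest cell count; used as the maximum value of the colormap."""
    return max(max(row) for row in grid)
```

Each cell count can then be mapped to a color with the team colormap, using 0 as minimum value and `grid_maximum(grid)` as maximum value.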
The Possession Chain Heatmap is one of the four visualizations that are drawn on the
fields of the Match Dashboard. Since the Opta coordinate system consists of a 100 by
100 grid, we divide the heatmap into a 100 by 100 grid mapped onto the football field.
Each possession chain is mapped onto the grid, covering certain tiles of the heatmap grid
as shown in Figure 4.16.
To determine which tiles a possession chain exactly covers, we use Algorithm 1. Given a
line, defined by the two endpoints (x0 , y0 ) and (x1 , y1 ), the algorithm determines which
tiles the line overlaps. The addTile(x, y) function in the algorithm adds the value 1
to each corner point of the tile at position (x, y). We call the algorithm for each pair of
consecutive events in the possession chain and we do this for each possession chain of
the team. This results in a grid where each tile has four values for each of its corner
points corresponding to the number of possession chains that go near those points. We
then map these values to a blue-green-red colormap and color each corner point of a
tile according to its value. When the tile is drawn using different colors for the corner
points, we linearly interpolate the colors of the tile resulting in a heatmap as shown in
Figure 4.17. Figure 4.17 shows the possession chain heatmaps of FC Augsburg and TSV
München 1860 in their match against each other. Note that FC Augsburg plays from
left to right, while TSV München 1860 plays from right to left.
(a) FC Augsburg (left to right) (b) TSV München 1860 (right to left)
Figure 4.17: Possession chain heatmaps of FC Augsburg, playing from left to right,
and TSV München 1860, playing from right to left.
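The accumulation of corner values can be sketched in Python as follows. Note that this sketch does not reproduce Algorithm 1: as a simple stand-in it finds the covered tiles by densely sampling points along each line segment. It also counts a tile at most once per chain, which is an assumption of this sketch. All function names are hypothetical, and coordinates are assumed to lie between 0 and 100.

```python
from typing import List, Sequence, Set, Tuple

SIZE = 100  # tiles per side of the heatmap grid

Point = Tuple[float, float]


def tiles_on_line(x0: float, y0: float, x1: float, y1: float,
                  samples_per_unit: int = 4) -> Set[Tuple[int, int]]:
    """Tiles touched by the segment, found by dense sampling
    (a stand-in technique, not Algorithm 1)."""
    steps = max(1, int(max(abs(x1 - x0), abs(y1 - y0)) * samples_per_unit))
    tiles = set()
    for s in range(steps + 1):
        t = s / steps
        x = x0 + t * (x1 - x0)
        y = y0 + t * (y1 - y0)
        tiles.add((min(int(x), SIZE - 1), min(int(y), SIZE - 1)))
    return tiles


def add_tile(corners: List[List[int]], x: int, y: int) -> None:
    """Add 1 to each of the four corner points of tile (x, y)."""
    for dy in (0, 1):
        for dx in (0, 1):
            corners[y + dy][x + dx] += 1


def chain_heatmap(chains: Sequence[Sequence[Point]]) -> List[List[int]]:
    """Corner values for all chains; corners[y][x] with 101 x 101 points."""
    corners = [[0] * (SIZE + 1) for _ in range(SIZE + 1)]
    for chain in chains:
        covered = set()
        for (x0, y0), (x1, y1) in zip(chain, chain[1:]):
            covered |= tiles_on_line(x0, y0, x1, y1)
        for tx, ty in covered:   # each tile counted once per chain (assumption)
            add_tile(corners, tx, ty)
    return corners
```

A corner point shared by two covered tiles receives a contribution from both, which is why values are higher along the path of a chain than at its border.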
To create the blue-green-red colormap we use formula 4.6, which uses formula 4.1 of
Section 4.3. In this formula cmin , cmid , and cmax are the minimum, middle and maximum
color values, vmin and vmax are the minimum and maximum data values, and v is the
value of which we want the color. For the blue-green-red colormap we pick the colors
blue RGB=(0, 0, 255), green RGB=(0, 255, 0), and red RGB=(255, 0, 0) for cmin , cmid ,
and cmax respectively. The colormap for the home team and the away team can have
a different maximum data value. The user can select to match the maximum values as
in Figure 4.17 or to set the maximum data values relative to the team to analyze where
that particular team moves the ball. The user can also make use of filters to remove or
add possession chains to the visualization. Figure 4.18 shows an example of possession
chain heatmaps of FC Augsburg and TSV München 1860, where only possession chains
with a length of at least 6 touches are taken into account. Note that the maximum data
values of the colormap are here set relative to the team. It shows that TSV München
1860 had a lot more possession chains with at least 6 touches and that most of those
possession chains went through their own half.
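$$
C^2(c_{min}, c_{mid}, c_{max}, v_{min}, v_{max}, v) =
\begin{cases}
C(c_{min}, c_{mid}, v_{min}, v_{mid}, v) & \text{if } v < v_{mid} \\
C(c_{mid}, c_{max}, v_{mid}, v_{max}, v) & \text{if } v \ge v_{mid}
\end{cases}
\tag{4.6}
$$
where $v_{mid} = \dfrac{v_{min} + v_{max}}{2}$.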
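The three-color colormap can be sketched in Python as two linear colormaps joined at the middle value. The helper `lerp_color` is an assumed componentwise linear interpolation with rounding to integer channel values; both function names are hypothetical.

```python
from typing import Tuple

RGB = Tuple[int, int, int]

BLUE, GREEN, RED = (0, 0, 255), (0, 255, 0), (255, 0, 0)


def lerp_color(c_min: RGB, c_max: RGB, v_min: float, v_max: float, v: float) -> RGB:
    """Assumed linear interpolation between two colors."""
    if v_max == v_min:
        return c_min
    t = max(0.0, min(1.0, (v - v_min) / (v_max - v_min)))
    return tuple(round(a + t * (b - a)) for a, b in zip(c_min, c_max))


def three_color(c_min: RGB, c_mid: RGB, c_max: RGB,
                v_min: float, v_max: float, v: float) -> RGB:
    """Piecewise linear colormap following the two cases of formula 4.6."""
    v_mid = (v_min + v_max) / 2
    if v < v_mid:
        return lerp_color(c_min, c_mid, v_min, v_mid, v)
    return lerp_color(c_mid, c_max, v_mid, v_max, v)
```

With a data range from 0 to 10, the values 0, 5 and 10 map to pure blue, pure green and pure red respectively.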
(a) FC Augsburg (left to right) (b) TSV München 1860 (right to left)
Figure 4.18: Possession chain heatmaps of FC Augsburg, playing from left to right,
and TSV München 1860, playing from right to left, filtered on chain length ≥ 6.
4.7 Timeline
The timeline in the Match Dashboard (see Figure 4.19) gives an overview of the important
events that happened during the match. These important events are fouls, awarded
cards, substitutions, and goals. Each such event has its own glyph representation as
shown in Table 4.1.
“In the context of data visualization, a glyph is the visual representation of a piece of
data where the attributes of a graphical entity are dictated by one or more attributes of
a data record.”
— Matthew O. Ward [28]
The timeline can be divided into six rows, where the top three rows represent actions
by the home team and the bottom three rows represent actions by the away team.
Figure 4.20(a) shows the division of these six rows. The rows numbered by 1 show the
main events in the form of glyphs for the corresponding teams. The rows numbered by
2 contain dots. These dots represent key possession chains. A key possession chain is a
chain which contains a shot. Finally, the rows numbered by 3 show the ball possession
moments for the corresponding team. On the left and the right side of the timeline are
sliders that can be moved by the user. The sliders enclose a time period in minutes that
is used in the Match Dashboard. Moving the mouse over the timeline shows a vertical
line at the position of the mouse pointer with the corresponding minute in the match.
When the user clicks on a point in the timeline, the sliders automatically clamp
around that moment in time as shown in Figure 4.20(b). Other visualizations, such as
the Rectangular Chain Graph (Section 4.4), are automatically adjusted to the set time
period, allowing the user to zoom in on particular moments in time.
Figure 4.20: Part of the timeline of the match FC Augsburg - TSV München 1860.
[Table 4.1: Glyph | Description. Rows: "A goal is scored."; "A foul is committed."]
In the Match Dashboard there are two Glyph Chain Graphs, one for each team in the
match. A Glyph Chain Graph contains the possession chains of a team, where each
event is represented by a glyph. Figure 4.21(a) shows an example of a possession chain
that is represented by glyphs. It is the same possession chain as shown in Figure 4.21(b).
A glyph in the possession chain has two properties: its shape and its color. The shape
denotes the type of event that the glyph represents. A triangular glyph represents a
passing event, a circular glyph represents a shooting event, a dotted line represents a
dribble event, a zigzag dotted line represents a take-on event and a rectangular glyph
is used to represent all remaining events (see Figure 4.22). The take-on event is an
attempted dribble past an opponent. The color of a glyph represents the outcome of the
event. A green colored glyph represents a successful event, while a red colored glyph
represents a failed event. There are some special events which are colored differently.
These events are keypasses (black), set pieces (blue), blocked shots (orange), and saved
shots (yellow) (see Figure 4.22).
[Figure 4.22: Glyph legend. Shapes: pass, shot, dribble, take on, other event. Colors: keypass, failed / missed, successful / goal, set piece (incl. goal kick), blocked shot, saved shot.]
The advantage of the glyph representation of possession chains is that it is easy to com-
pare their structure. The user can use filters in the Match Dashboard to filter the pos-
session chains and use the mouse to hover and highlight players. Figure 4.23 shows the
Glyph Chain Graphs of FC Augsburg and TSV München 1860 in their match. The pos-
session chains are filtered to show only chains of at least twelve touches. Figure 4.23(a)
shows that FC Augsburg has only one such possession chain, while Figure 4.23(b) shows
that TSV München 1860 has twelve such possession chains. Figure 4.24(a) and (b) show
that player number 4 and player number 8 were most involved in those relatively long
possession chains of TSV München 1860. A closer look at the Glyph Chain Graph of TSV
München 1860 shows that only two out of the twelve possession chains lead to a shot,
where one shot is missed and the other is saved.
The overview header, shown in Figure 4.25, is used to show general information of the
displayed match. It shows the final score, the goal scorers, the players that gave the
assists, and the players that gave the keypasses. Each goal (G), assist (A), and keypass
(K) is represented by a square with the jersey number of the corresponding player as
label.
Figure 4.23: The Glyph Chain Graphs of FC Augsburg and TSV München 1860
filtered on possession chains of at least 12 touches.
(a) Nr. 4 highlighted (TSV München 1860) (b) Nr. 8 highlighted (TSV München 1860)
Figure 4.24: The Glyph Chain Graph of TSV München 1860 with player nr. 4 and
player nr. 8 highlighted in the match FC Augsburg - TSV München 1860 filtered on
possession chains of at least 12 touches.
Figure 4.25: Overview header of the match FC Augsburg - TSV München 1860.
Besides the mouse interaction with the different visualizations, the user can set several
filters and settings. Since there are different visualizations that can be rendered onto
the two fields of the dashboard, the user can select which one to render: the Pass Graph,
the Defense Plot, the heatmap, or a single chain.
The user can filter possession chains and set one or more of the following filters:
When the user sets more than one of these filters, the application considers it as an
AND-condition. This means that the selection of “possession chains containing a cross”
combined with the selection of “possession chains containing a shot” results in possession
chains containing a cross as well as a shot.
Apart from filtering possession chains, the user can also filter on one or more players
and combine this with the possession chain filters. This allows the user to set filters like
“show the possession chains where player A is part of having at least 10 touches and
containing a shot”. Similar to the possession chain filters, selecting multiple players is
interpreted as an AND-condition by the application.
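The AND-combination of filters described above can be sketched as follows. This is a minimal illustration, not the prototype's implementation: the `Event` and `Chain` classes, the filter helpers, and the assumption that the number of touches equals the number of events in a chain are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    player: int  # jersey number (hypothetical field)
    kind: str    # e.g. "pass", "cross", "shot" (hypothetical labels)


@dataclass
class Chain:
    events: list = field(default_factory=list)

    def touches(self):
        # Assumption: one touch per event in the chain.
        return len(self.events)


def contains(kind):
    """Filter: the chain contains at least one event of the given kind."""
    return lambda chain: any(e.kind == kind for e in chain.events)


def min_touches(n):
    """Filter: the chain has at least n touches."""
    return lambda chain: chain.touches() >= n


def involves(player):
    """Filter: the given player takes part in the chain."""
    return lambda chain: any(e.player == player for e in chain.events)


def apply_filters(chains, filters):
    """Keep only the chains that satisfy every filter (AND-condition)."""
    return [c for c in chains if all(f(c) for f in filters)]


chains = [
    Chain([Event(4, "pass"), Event(8, "cross"), Event(9, "shot")]),
    Chain([Event(4, "pass"), Event(8, "cross")]),
    Chain([Event(7, "pass"), Event(9, "shot")]),
]
# "Chains containing a cross" AND "chains containing a shot" AND "player 4 involved".
selected = apply_filters(chains, [contains("cross"), contains("shot"), involves(4)])
```

Only the first chain passes all three filters. With an empty filter list every chain is kept, which matches a dashboard without active filters.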
When we analyze player positions on the field, we can look at the formation of the team
as in Section 4.3, but we can also look at where players are at certain points in time in
the match. Since our data does not contain any information of players when they do
not have the ball, we can only look at players when they do have the ball. We analyze
the position of a player by looking at each event of that player and calculating the
average position. We calculate the average position by adding up the positions of all
the player's events and dividing this sum by the number of events. We can show the average position of a
player over the whole match as in Figure 4.7 of Section 4.3, but we get more insight into
the position of a player if we show the average position in time intervals of the match.
Figure 4.26 shows such time intervals with average positions. Figure 4.26(a) shows the
average x-coordinates as a black dot during 10 minute intervals. It shows that the
depicted player tends to play close to the opposing goal as the black dot lies more to
the right, especially between the 50th and 60th minute. Similarly, Figure 4.26(b) shows
the average y-coordinates as a black dot during 10 minute intervals. It shows that the
player mainly plays on the left side of the field (note the playing direction). Combining
the two figures, we can conclude that the depicted player is either a left wing, or a left
midfielder with a lot of attacking tendencies.
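The averaging per time interval can be sketched as below. This is an illustrative sketch only: the function name `average_positions` and the input format, a list of `(minute, x, y)` tuples per player, are assumptions and not taken from the prototype.

```python
from collections import defaultdict


def average_positions(events, interval=10):
    """Average the event positions of one player per time interval.

    events: iterable of (minute, x, y) tuples (assumed input format).
    Returns {interval_start: (avg_x, avg_y, number_of_events)}.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for minute, x, y in events:
        start = int(minute // interval) * interval  # e.g. minute 52 -> 50
        bucket = sums[start]
        bucket[0] += x
        bucket[1] += y
        bucket[2] += 1
    # Sum of positions divided by the number of events in each interval.
    return {
        start: (sx / n, sy / n, n)
        for start, (sx, sy, n) in sorted(sums.items())
    }


example = average_positions([(3, 20.0, 10.0), (7, 40.0, 30.0), (52, 80.0, 20.0)])
```

Intervals without any event simply do not appear in the result, which reflects that the data contains no position for a player who does not touch the ball.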
(a) axes: x (horizontal), t in minutes (vertical). (b) axes: t in minutes (horizontal), y (vertical). Tick marks at t = 0, 10, ..., 90, 93.
Figure 4.26: Average positions of a player in 10 minute intervals for (a) the x-direction
and (b) the y-direction.
We extend this visualization of Figure 4.26 by adding a football field to draw the average
positions on. Furthermore, we label and color the player “dot” or node with the shirt
number and playing position respectively. We use the same color coding for the positions
as in Figure 4.3(a) of Section 4.2, i.e., goalkeepers are colored orange, defenders are
colored yellow, midfielders are colored green, strikers are colored blue, and substitutes
are colored red. The size of the node visualizes how many events are used to calculate
the average position in that particular time period, where the maximum used node
size is equal to half the size of the space between the lines. The maximum node size
corresponds to at least one event per minute, which means that a node smaller than half
of the space between the lines has less than one event per minute. Figure 4.27 shows an
example of the visualization. It shows the average positions of Alexander Bugere (17)
and Sven Müller (23), the left back and right back of FC Kaiserslautern respectively,
in the home match against Borussia Mönchengladbach in 10 minute intervals. The
highlighted time period in grey shows the time period currently hovered by the user.
The average positions for this selected time period are drawn on the field in the top-left
corner of the figure.
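The node size rule can be written down as a small function. The sketch below assumes that the size grows linearly with the event rate until it is capped at one event per minute; the text only fixes the maximum, so the linear scaling, the function name, and the parameters are assumptions.

```python
def node_size(num_events, interval_minutes, line_spacing):
    """Size of a player node for one time interval.

    The maximum size is half of the space between the lines and is reached
    at one event per minute or more. Linear scaling below that rate is an
    assumption made for this sketch.
    """
    max_size = line_spacing / 2.0
    events_per_minute = num_events / interval_minutes
    return max_size * min(1.0, events_per_minute)


full = node_size(10, 10, 40)   # one event per minute: maximum size
half = node_size(5, 10, 40)    # half an event per minute: half the maximum
capped = node_size(25, 10, 40) # more than one event per minute: still the maximum
```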
Figure 4.27: Average positions of Alexander Bugere (17) and Sven Müller (23), the
left back and right back of FC Kaiserslautern respectively, in the home match against
Borussia Mönchengladbach. The playing direction is from left to right.
Chapter 5
Results
In Section 2.1 we introduced several questions divided into multiple categories that we
aim to answer. In Chapter 4 we described the visualizations we developed that help
answer those questions. In this chapter we discuss the results of our visualizations.
We introduced two questions in the “team structure” category. The first question is
The answer to this question is visualized in the Pass Graph (Section 4.3). Figure 5.1
shows such a Pass Graph for two different teams. Figure 5.1(a) shows a Pass Graph
with a 4–4–2 formation played by 1. FC Mainz 05 and Figure 5.1(b) shows a Pass Graph
with a 4–4–2 diamond formation played by Alemannia Aachen. The second question
(a) 4–4–2 (1. FC Mainz 05) (b) 4–4–2 diamond (Alemannia Aachen)
Figure 5.1: Team structures in the Pass Graph visualization. (a) shows a 4–4–2
formation played by 1. FC Mainz 05 and (b) shows a 4–4–2 diamond formation played
by Alemannia Aachen.
TS1: Do the players always play in the same position on the field or is there a
rotation?
Since we do not know the position of players when they do not touch the ball, we can
only analyze player positions when the ball is touched. To see if there is position rotation
within a team, we use the small multiples visualization of average player positions.
Figures 5.2 and 5.3 show the average position of player Timo Achenbach (32) for each 10
minutes in the match 1. FC Kaiserslautern - SpVgg Greuther Fürth. Figure 5.2 shows
the average y position and Figure 5.3 shows the average x position of Achenbach. Timo
Achenbach played as a left back defender as can be seen in the figures. However, in the
second half Achenbach played more forward as can be seen by his average x position
from the 50th minute till the end of the match. It seems that he acted more like a left
midfielder during that period of the match, because Achenbach's average position was
also slightly to the center of the field as can be seen by his average y position (especially
between 60 and 80 minutes).
Figure 5.2: Average y position of player Timo Achenbach (32) for each 10 minutes
in the match 1. FC Kaiserslautern - SpVgg Greuther Fürth.
Figure 5.3: Average x position of player Timo Achenbach (32) for each 10 minutes
in the match 1. FC Kaiserslautern - SpVgg Greuther Fürth.
To answer these questions we use the Match Dashboard (Section 4.1) and analyze mul-
tiple visualizations. We use the match 1. FC Köln - Alemannia Aachen, which ended in
0 - 1, as a use case. When we look at the Pass Graphs of both teams (Figure 5.4), we
see that 1. FC Köln tends to keep possession of the ball as the mutual passing lines are
thicker compared to the passing lines of Alemannia Aachen. Furthermore, the bigger
node sizes of 1. FC Köln indicate that the players gave more passes and the brighter
green colors indicate that these players had a higher success rate too. When we filter
the possession chains of both teams in our dashboard and show only possession chains
of at least 10 touches, we see that 1. FC Köln has twice as many such possession chains
as Alemannia Aachen (see Figure 5.5), indicating that 1. FC Köln indeed aims to keep
possession of the ball. However, it does not mean that 1. FC Köln is more effective while
keeping possession of the ball. If we add another filter and show only the possession
chains with a length of at least 10 with a shot, we see that both teams have exactly
two such possession chains (see Figure 5.6). The only team that actually scores a goal
from one of these relatively long possession chains is Alemannia Aachen, shown in Figure 5.7.
When we analyze the heatmap of the possession chains (Figure 5.8), we clearly see that
1. FC Köln had more possession of the ball. We also see that 1. FC Köln’s possession
Figure 5.4: Pass Graphs of (a) 1. FC Köln and (b) Alemannia Aachen.
Figure 5.5: Glyph Graphs of (a) 1. FC Köln and (b) Alemannia Aachen filtered on
possession chains with a length of at least 10.
Figure 5.6: Glyph Graphs of (a) 1. FC Köln and (b) Alemannia Aachen filtered on
possession chains with a length of at least 10 containing a shot.
chains mainly go through the own half and that their attacks tend to go through the
wings, with a small preference for the left wing in this match. From Alemannia Aachen’s
heatmap we can conclude that they did not have much possession of the ball and that
their possession chains are all over the field with no preference for a particular area.
Figure 5.9(a)-(f) show the Rectangular Chain Graph of the match 1. FC Köln - Ale-
mannia Aachen zoomed in on each 15 minutes of the match. Note that Figure 5.9(f)
(a) (b)
Figure 5.8: Possession Chain Heatmap of (a) 1. FC Köln playing from left to right
and (b) Alemannia Aachen playing from right to left.
consists of 17 minutes as 2 minutes injury time were added. In each 15 minutes of the
match 1. FC Köln has more possession of the ball. It seems that 1. FC Köln is able to
control the match more at the start of a half as the first 15 minutes of the first half (Fig-
ure 5.9(a)) and the first 15 minutes of the second half (Figure 5.9(d)) yield the highest
possession percentages: 59,07% and 64,84% respectively. This can also be seen by the
relatively large number of red rectangles with a relatively large height. Recall that the
larger the height of a rectangle is, the longer the duration of the possession chain is.
The large blue rectangle in the center of Figure 5.9(b) represents the goal scored by Alemannia
Aachen that we discussed earlier (Figure 5.7). The presence of more blue rectangles
with a larger height in the period after Alemannia Aachen’s goal shows that 1. FC Köln
was unable to control the match as before. The small difference in ball possession in
Figure 5.9(c) confirms this. At the end of the match (Figure 5.9(f) and the last few
minutes in Figure 5.9(e)) we see that the rectangles remain wide, but the heights of the
rectangles are lower, indicating that these possession chains have a short duration. This
might indicate that 1. FC Köln played more opportunistically in this time period by trying
to get the ball as fast as possible towards Alemannia Aachen’s goal in order to score the
equalizing goal.
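A possession percentage per time window, as reported for each 15 minute period, could be derived from the durations of the possession chains. The sketch below is one possible way to do this and rests on assumptions: the chains are given as hypothetical `(team, start_second, end_second)` tuples, and possession is measured as the share of total chain duration inside the window, which is not necessarily how the prototype computes it.

```python
def possession_share(chains, window_start, window_end):
    """Possession percentage per team within [window_start, window_end).

    chains: iterable of (team, start_second, end_second) tuples (assumed format).
    Chains that cross a window border are clipped to the window.
    """
    totals = {}
    for team, start, end in chains:
        overlap = min(end, window_end) - max(start, window_start)
        if overlap > 0:
            totals[team] = totals.get(team, 0.0) + overlap
    total = sum(totals.values())
    if total == 0:
        return {}
    return {team: 100.0 * seconds / total for team, seconds in totals.items()}


chains = [("home", 0, 60), ("away", 60, 90), ("home", 890, 920)]
first_quarter = possession_share(chains, 0, 900)  # window [00:00 - 15:00)
```

In this toy example the last chain is clipped at 15:00, so the home team has 70 seconds of possession against 30 seconds for the away team.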
(a) [00:00 - 15:00) minutes. possession: 59,07%/40,93%
(b) [15:00 - 30:00) minutes. possession: 57,40%/42,60%
(c) [30:00 - 45:00) minutes. possession: 50,79%/49,21%
(d) [45:00 - 60:00) minutes. possession: 64,84%/35,16%
(e) [60:00 - 75:00) minutes. possession: 55,95%/44,05%
(f) [75:00 - 92:00) minutes. possession: 53,98%/46,02%
Figure 5.9: Rectangular Chain Graph of the match 1. FC Köln (red) - Alemannia
Aachen (blue) zoomed in on each 15 minutes of the match. The y-axis is the time axis
(top-down) and the x-axis is the x-coordinate on the field (recall Section 4.4). The
possession percentages in the figures are denoted as 1. FC Köln/Alemannia Aachen
and show the possession in that particular time period.
When we look at the pass graph of both teams containing the average positions (Fig-
ure 5.10), we see that for 1. FC Köln Broich (10) and Antar (20) are potential playmak-
ers, as they are midfielders who gave a lot of passes. For similar reasons, it seems that
Lehmann (20) and Lagerblom (5) are potential playmakers for Alemannia Aachen. We
Figure 5.10: Pass Graphs of (a) 1. FC Köln and (b) Alemannia Aachen showing
average player positions during their match. Note that the maximum value for the
number of passes is relative to the team to get a better view of the mutual passes
within a team.
can analyze the individual passing performance of the four aforementioned players by
hovering over their node in the Pass Graph. Figure 5.11 shows the individual passing
performance of these players during the match.
Figure 5.11(a) shows the passes of 1. FC Köln’s player Thomas Broich on the field.
Broich gave 63 passes with a success rate of 93,65% including 2 keypasses. Most of his
passes are in his own half or on the left or right wing. Figure 5.11(b) shows the direction
of Broich’s passes in a radial graph. He played 40 out of the 63 passes (63,49%) forward.
Figure 5.11(c) shows the passes of 1. FC Köln’s player Roda Antar. Antar gave 57
passes with a success rate of 91,23% including 3 keypasses. In contrast to Broich,
Antar’s passes are not on a particular part of the field. Antar gave more passes through
the center midfield. Figure 5.11(d) shows the direction of Antar’s passes in a radial
graph. He gave 30 forward passes and 27 backward passes yielding a 52,63% forward
pass rate.
Figure 5.11(e) shows the passes of Alemannia Aachen’s player Pekka Lagerblom. Lagerblom
played 29 passes of which only 16 (55,17%) were successful. He gave no keypasses and
no assists. In Figure 5.11(f) we see that Lagerblom played 24 passes forward (82,76%),
but all of his long forward passes were unsuccessful.
Figure 5.11: The passes of (a)-(b) 1. FC Köln’s midfielder Thomas Broich (10), (c)-
(d) 1. FC Köln’s midfielder Roda Antar (20), (e)-(f) Alemannia Aachen’s midfielder
Pekka Lagerblom (5), and (g)-(h) Alemannia Aachen’s midfielder Matthias Lehmann
(20) in the match 1. FC Köln - Alemannia Aachen. Note that in (a)-(d) the playing
direction is from left to right, while in (e)-(h) the playing direction is from right to left.
Figure 5.11(g)-(h) show the passes of Alemannia Aachen’s player Matthias Lehmann.
Lehmann performed 39 passes of which 28 (71,79%) were played forward. Out of these
39 passes 24 were successful, leading to a success rate of 61,54%. Lehmann did not give
any keypasses or assists during the match. In contrast to Lagerblom who tends to play
the long ball forward, Lehmann often uses a switch of play, playing the ball from the
left side to the right side or the other way around. As can be seen by the red arrows,
these switches of play were unsuccessful most of the time.
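The success rates and forward pass rates quoted for the four players are plain ratios. The sketch below shows how such a summary could be computed from a list of passes. The input format `(x_start, x_end, successful)` and the rule that a pass counts as forward when it moves strictly towards the opponent's goal are assumptions made for illustration.

```python
def pass_summary(passes, left_to_right=True):
    """Summarize the passes of one player.

    passes: list of (x_start, x_end, successful) tuples (assumed format).
    A pass is counted as forward when it moves strictly towards the
    opponent's goal; sideways passes count as backward in this sketch.
    """
    total = len(passes)
    if total == 0:
        return {"total": 0, "success_rate": 0.0, "forward_rate": 0.0}
    successful = sum(1 for _, _, ok in passes if ok)
    if left_to_right:
        forward = sum(1 for x0, x1, _ in passes if x1 > x0)
    else:
        forward = sum(1 for x0, x1, _ in passes if x1 < x0)
    return {
        "total": total,
        "success_rate": round(100.0 * successful / total, 2),
        "forward_rate": round(100.0 * forward / total, 2),
    }


passes = [(10, 30, True), (30, 20, True), (20, 50, False), (50, 70, True)]
summary = pass_summary(passes)

# The figures quoted for Broich follow from the same ratio: 40 of 63 passes forward.
broich_forward_rate = round(100.0 * 40 / 63, 2)
```

The `left_to_right` flag matters because the playing direction differs between the two teams, as noted in the caption of Figure 5.11.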
The next question is concerned with the most important aspect of attacking of all:
scoring. The question is
To analyze this question we use the match FC Augsburg - TSV München 1860 (result:
2-6) as a use case as it contains many goals. We set a filter for the possession chains to
show only the possession chains containing a shot. This yields the Match Dashboard
shown in Figure 5.12.
In the Rectangular Chain Graph, the filtered possession chains containing a goal are
highlighted. The blue rectangles show the 6 goals scored by TSV München 1860 and the
red rectangles show the 2 goals scored by FC Augsburg. Note that the color of rectangles
with a small height is not clearly visible as only the black border is visible. However,
all goals scored by the home team must end at the right side of the Rectangular Chain
Graph. Similarly, all goals scored by the away team must end at the left side of the
Rectangular Chain Graph. Selected goals are visualized on the field in the left and right
top corner of the dashboard. Goals can be selected in the Rectangular Chain Graph
and in the Glyph Chain Graph. Selected chains in the Rectangular Chain Graph are
shown with a thicker black border, while selected chains in the Glyph Chain Graph are
surrounded by a black border. The selected chains are also shown in the timeline by a
vertical colored bar.
At the top of the dashboard in Figure 5.12 is the overview header. The overview header
shows which players scored on the left and right side of the “G” (goals). It shows that
both goals of FC Augsburg are scored by number 18 Mourad Hdiouad. The selected
goal for FC Augsburg, shown in the top left corner, happens to be a penalty. For TSV
München 1860 we see that out of the six goals, three were scored by number 9 Antonio
Di Salvio. The other three goals were scored by number 10 Berkant Göktan, number
7 Daniel Bierofka, and number 22 Lars Bender. The fifth goal of TSV München 1860
scored by Lars Bender is shown in the top right corner.
Figure 5.12: Match Dashboard for FC Augsburg against TSV München 1860 filtered on possession chains containing a goal.
AT4: Are all players moving to create space or to support the ball, or does the team
rely on only a few players?
This question cannot be answered by any of our visualizations, because the data does
not have any information regarding the positions of players without the ball. Hence, it
is impossible to analyze player movement in space on the field.
In this section we discuss the questions regarding the defending tendencies of a team.
These questions are
In Section 2.2 we discussed two types of defending playing styles: high-pressure defending
and low-pressure defending. So to answer question DT0 we must know when, where,
and how the players apply pressure on the opponent. Since our data does not contain this
information, we cannot analyze this. For similar reasons, we cannot determine where
the line of confrontation is (DT1) or where the line of defense is (DT2). However, we
can visualize where defensive actions happen on the field and analyze if a team defends
far away from its own goal, or close to its own goal. Defensive actions that are far away
from the own goal might indicate a high-pressure defending playing style, while a lot
of defensive actions near the own goal might indicate a low-pressure defending playing
style. Similarly, defensive actions that are far away from the own goal might indicate
a high line of confrontation, while defensive actions near the own goal might indicate
a low line of confrontation. Figure 5.13 shows the defense plot for 1. FC Köln and
Alemannia Aachen in their match. Comparing the two shows that the average positions
of the defensive actions of the Alemannia Aachen players are much closer to their own
goal than those of the 1. FC Köln players. The larger size of the Alemannia Aachen nodes
shows that they had more defensive actions than 1. FC Köln. All players of Alemannia
Aachen, except for number 8, have their average defensive actions in their own half. 1.
FC Köln on the other hand defends farther away from its own goal.
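The comparison above rests on the average distance of each player's defensive actions from the own goal. A minimal sketch of that computation follows. The input format `(player, x)`, the field length of 100 units, and the function name are assumptions; the real data may use a different coordinate range.

```python
from collections import defaultdict


def defensive_profile(actions, left_to_right=True, field_length=100.0):
    """Average distance from the own goal of each player's defensive actions.

    actions: iterable of (player, x) tuples (assumed format), with x measured
    along the length of the field. field_length=100.0 is an assumption.
    Returns {player: (average_distance_from_own_goal, number_of_actions)}.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for player, x in actions:
        # The own goal is at x = 0 when playing left to right, else at field_length.
        distance = x if left_to_right else field_length - x
        sums[player][0] += distance
        sums[player][1] += 1
    return {player: (total / n, n) for player, (total, n) in sums.items()}


actions = [(8, 60.0), (8, 70.0), (5, 20.0)]
profile = defensive_profile(actions)
# Players whose average defensive action lies in the own half (assumed half-way line at 50).
own_half = [p for p, (dist, _) in profile.items() if dist < 50.0]
```

The number of actions per player corresponds to the node size in the Defense Plot, and the average distance to the node position along the length of the field.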
Figure 5.13: Defense plot for (a) 1. FC Köln and (b) Alemannia Aachen of their
match.
In this section we discuss the questions regarding set pieces such as free kicks and corners.
These questions are
To answer these questions we use the filters in the Match Dashboard. To analyze cor-
ners, we filter the possession chains to show only possession chains containing a corner.
Figure 5.14 shows the resulting Glyph Graph of 1. FC Kaiserslautern in their match
Axel Bellinghausen, indicating that Bernier is the main corner kicker. We see that two
corners reach player number 2 Moussa Ouattara, which could mean that he is a target
man for 1. FC Kaiserslautern when taking corners (see Figure 5.15).
(a) (b)
Figure 5.15: Two corners for 1. FC Kaiserslautern taken by number 10 Patrice Bernier
that reach player number 2 Moussa Ouattara in the match 1. FC Kaiserslautern -
Borussia Mönchengladbach.
To analyze free kicks, we filter the possession chains to show only possession chains
containing a free kick. The resulting Glyph Graphs of the match FC Augsburg - TSV
München 1860 are shown in Figure 5.16. We see that FC Augsburg had 15 free kicks
and TSV München 1860 had 13 free kicks. The greater length of the TSV München 1860
chains suggests that they (in some cases) want to keep possession of the ball by playing
a short free kick rather than playing the long ball forward or crossing it into the
penalty area. By hovering over the glyphs we can easily recognize the main kickers. For
FC Augsburg the main kicker is Elton da Costa, playing with number 10 on his jersey
(Figure 5.17(a)). For TSV München 1860 the main kicker is Danny Schwarz, playing
with jersey number 8. When we take a closer look at the free kicks of FC Augsburg by
analyzing the individual possession chains on the field we see two main patterns:
1. The ball is played high into the penalty area in order to score a goal (Figure 5.18).
2. The ball is played from the left side of the field to right winger (28) Robert Strauss
(Figure 5.19(a)). Or similarly the ball is played from the right side to the upcoming
left back (4) Benjamin Kern (Figure 5.19(b)).
Figure 5.16: Glyph Graphs of (a) FC Augsburg and (b) TSV München 1860 filtered
on possession chains containing a free kick.
(a) FC Augsburg: number 10 hovered (b) TSV München 1860: number 8 hovered
Figure 5.17: Glyph Graphs of (a) FC Augsburg and (b) TSV München 1860 filtered
on possession chains containing a free kick where in (a) player number 10 and in (b)
player number 8 are highlighted by hovering.
(a) (b)
Figure 5.18: Free kicks of FC Augsburg trying the long ball towards the opponent’s
penalty area.
(a) (b)
Figure 5.19: Free kicks of FC Augsburg playing the ball from one side to the other.
In this section we discuss the two main questions that we introduced as our thesis
objectives in Section 1.1. These questions are
We use the match 1. FC Köln - Alemannia Aachen as a use case. This is the same
match we used as a use case discussing a team’s behavior while in possession of the
ball in Section 5.2. We saw that 1. FC Köln used an indirect attacking playing style as
they kept possession of the ball. We showed that 1. FC Köln had more possession of
the ball during each 15 minutes of the match (recall Figure 5.9). 1. FC Köln also had
more shots compared to Alemannia Aachen; 20 shots for 1. FC Köln versus 8 shots for
Alemannia Aachen. Out of the 20 shots, only 1 shot was on target for 1. FC Köln, while
for Alemannia Aachen 5 out of the 8 shots were on target, including the only goal of
the match. Figure 5.20 shows these shots of both teams in a Glyph Graph. Alemannia
Aachen used a more direct attacking playing style. This can be seen when analyzing the
potential playmakers of Alemannia Aachen (recall Figure 5.11(e)-(h)). Lagerblom and
Lehmann played more long balls and switches of play instead of short passes.
Figure 5.20: Glyph Graph of (a) 1. FC Köln and (b) Alemannia Aachen filtered on
shots. 1. FC Köln has 20 shots of which only 1 shot is on target (shown by the yellow
circle). Alemannia Aachen has 8 shots of which 5 are on target including a goal (the
longest chain). Unfortunately, the green circle denoting the goal is outside the visible
area.
So although 1. FC Köln had more possession of the ball and more shots compared to
Alemannia Aachen, they were simply not effective enough in front of the goal. Too many
shots missed the target or were blocked by Alemannia Aachen defenders. The indirect
attacking playing style was not successful in this match as only 2 out of the 18 possession
chains with at least 10 touches led to a shot (recall Figures 5.5 and 5.6 of Section 5.2).
Defensively, 1. FC Köln defended farther away from the own goal compared to Alemannia
Aachen as we saw in Figure 5.13 of Section 5.3. This indicates that 1. FC Köln used
a high-pressure defending playing style, i.e., they tried to get the ball back as fast as
possible after losing it. Alemannia Aachen on the other hand seems to defend closer to
the goal and “sag” back more, i.e., they go back so they are in position to stop an attack
on goal. This low-pressure type of defending is often associated with a counter-attacking
or direct attacking playing style.
1. FC Köln had the upper hand in the match against Alemannia Aachen, but they lacked
precision in front of the goal. Alemannia Aachen was more effective and scored the only
goal in the match.
1. FC Köln used an indirect attacking playing style and a high-pressure defending playing
style. Alemannia Aachen used a direct attacking playing style combined with a low-
pressure defending playing style.
Chapter 6
Conclusions
Future work would include improving and extending the current visualizations. If the
data were extended with tracked player movement, the prototype could become more
powerful. It would make it possible to visualize player movement during defending and
attacking phases of a match, giving a more accurate analysis of attacking and defending
playing styles. Unanswered questions about the line of confrontation and the line of
defense could easily be answered for different periods in a match. Some of our
visualizations can be extended as follows:
Chain visualization
Long possession chains can get hard to analyze when visualized on the field as it tends
to get spaghetti-like. This can be improved by using passing clusters as shown by Perin
et al. [3]. Perin et al. show five different ways to visualize a passing cluster as discussed
in Section 2.4.
Pass Graph
The Pass Graph can be extended with directed passes instead of showing mutual passing
only. Furthermore, substitutes are not visualized in our pass graph and can be added to
the visualization. Besides adding substitutes, a node for a “shot” could also be added
to visualize which players fired a shot at goal. The lines between a player and a shot
node would represent the number of shots that particular player has taken.
Figure 6.1: Alternative rectangle visualizations. (a) the current rectangle visualiza-
tion, (b) added direction using color gradient, (c) added direction by showing the first
and last event.
The current rectangle visualization (Figure 6.2(a)) shows the progress of a chain, i.e.,
how much the team went forward or backward on the field from the first event until
the last event. One might want to see how far a possession chain came towards the
opponent’s goal. To show this, the coloring of the rectangle could be extended to meet
the maximum and minimum x-coordinate of an event inside that possession chain as
shown in Figure 6.2(b). Such a rectangle visualizes the coverage of a possession chain.
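The difference between the progress and the coverage of a possession chain can be made concrete with a few lines of code. This is a sketch under the assumption that a chain is available as the list of x-coordinates of its events in order; the function names are hypothetical.

```python
def chain_progress(xs):
    """Progress of a chain: displacement from the first to the last event."""
    return xs[-1] - xs[0]


def chain_coverage(xs):
    """Coverage of a chain: minimum and maximum x-coordinate of any event."""
    return min(xs), max(xs)


# A chain that reaches x = 80 but ends back at x = 40.
xs = [30, 55, 80, 40]
progress = chain_progress(xs)   # small net gain, hides the advance to x = 80
coverage = chain_coverage(xs)   # shows how far the chain came towards the goal
```

For this chain the progress is only 10 units, while the coverage runs from 30 to 80, which is exactly the information the extended coloring would add to the rectangle.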
(a) (b)
Figure 6.2: Alternative rectangle visualizations. (a) the current rectangle visualiza-
tion and (b) extended coloring to the minimum and maximum x-coordinate within a
chain.
shown in Figure 6.3(b) and (c). This not only allows the analyst to see changes of
direction within a chain, but also where the ball is during most of the chain. A lot of
passes in the own half going back and forth, finally followed by a long ball, would result in many
subrectangles at the own half followed by a single subrectangle going from the own half
to the opponent’s half.
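One way to obtain such subrectangles is to split the x-coordinates of a chain at every change of direction. The sketch below illustrates this idea. It assumes that a chain is given as the ordered list of x-coordinates of its events and that events without any movement in x do not start a new subrectangle; both are assumptions, as is the function name.

```python
def split_by_direction(xs):
    """Split a chain into runs that move in one direction along the x-axis.

    xs: ordered x-coordinates of the events in a chain (assumed format).
    Returns a list of (x_from, x_to) pairs, one per subrectangle.
    """
    if len(xs) < 2:
        return []
    segments = []
    start = xs[0]
    previous_sign = 0
    for a, b in zip(xs, xs[1:]):
        sign = (b > a) - (b < a)  # +1 forward, -1 backward, 0 no movement
        if sign != 0 and previous_sign != 0 and sign != previous_sign:
            segments.append((start, a))  # direction changed at a
            start = a
        if sign != 0:
            previous_sign = sign
    segments.append((start, xs[-1]))
    return segments


# Back and forth in the own half, followed by a long ball forward.
segments = split_by_direction([20, 30, 25, 35, 28, 90])
```

The example yields four short subrectangles in the own half and one long subrectangle from x = 28 to x = 90, matching the pattern described above.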
Figure 6.3: Alternative rectangle visualizations. (a) the current rectangle visual-
ization, (b) division into subrectangles corresponding to change in direction, and (c)
division into subrectangles corresponding to change in direction with bounding box.
Defense Plot
The Defense Plot currently does not show any actual values such as the number of
defensive actions a certain player has performed. Adding interaction to the visualization
to show additional information when hovering over a node or area could solve this problem.
This way the analyst can get actual values instead of just visual differences.
XML structures
Appendix B
Event types
ID Name Description
1 Pass Any pass attempted from one player to another – free kicks,
corners, throw ins, goal kicks and goal assists
2 Offside Pass Attempted pass made to a player who is in an offside position
3 Take On Attempted dribble past an opponent
4 Foul A foul is committed resulting in a free kick
5 Out Shown each time the ball goes out of play for a throw-in or
goal kick
6 Corner Awarded Ball goes out of play for a corner kick
7 Tackle Tackle: dispossesses an opponent of the ball - Outcome 1:
win and retain possession or out of play, 0: win tackle but not
possession
8 Interception When a player intercepts any pass event between opposition
players and prevents the ball reaching its target. Cannot be
a clearance.
9 Turnover Unforced error / loss of possession - i.e. bad control of ball –
NO LONGER USED (Replaced with Unsuccessful Touch +
Overrun)
10 Save Goalkeeper event; saving a shot on goal. Can also be an out-
field player event with qualifier 94 for blocked shot
11 Claim Goalkeeper event; catching a crossed ball
12 Clearance Player under pressure hits the ball clear of the defensive zone
or/and out of play
13 Miss Any shot on goal which goes wide or over the goal
14 Post Whenever the ball hits the frame of the goal
15 Attempt Saved Shot saved - this event is for the player who made the shot.
Qualifier 82 can be added for blocked shot.
16 Goal All goals
17 Card Bookings; will have red, yellow or 2nd yellow qualifier plus a
reason
18 Player off Player is substituted off
19 Player on Player comes on as a substitute
20 Player retired Player is forced to leave the pitch due to injury and the team
have no substitutions left
63
Appendix B. Event types 64
ID Name Description
21 Player returns Player comes back on the pitch
22 Player becomes When an outfield player has to replace the goalkeeper
goalkeeper
23 Goalkeeper be- If goalkeeper becomes an outfield player
comes player
24 Condition change Change in playing conditions
25 Official change Referee or linesman is replaced
27 Start delay Used when there is a stoppage in play such as a player injury
28 End delay Used when the stoppage ends and play resumes
30 End End of a match period
32 Start Start of a match period
34 Team set up Team line up; qualifiers 30, 44, 59, 130, 131 will show player
line up and formation
35 Player changed po- Player moved to a different position but the team formation
sition remained the same
36 Player changed Jer- Player is forced to change jersey number, qualifier will show
sey number the new number
37 Collection End Event 30 signals end of half. This signals end of the match
and thus data collection.
38 Temp Goal Goal has occurred but it is pending additional detail qualifiers
from Opta. Will change to event 16.
39 Temp Attempt Shot on goal has occurred but is pending additional detail
qualifiers from Opta. Will change to event 15.
40 Formation change Team alters its formation
41 Punch Goalkeeper event; ball is punched clear
42 Good Skill A player shows a good piece of skill on the ball – such as a
step over or turn on the ball – NO LONGER USED
43 Deleted event Event has been deleted – the event will remain as it was origi-
nally with the same ID but will be resent with the type altered
to 43.
44 Aerial Aerial duel – 50/50 when the ball is in the air – outcome will
represent whether the duel was won or lost
45 Challenge When a player fails to win the ball as an opponent successfully
dribbles past them
47 Rescinded card This can occur post match if the referee rescinds a card he
has awarded
49 Ball recovery Team wins the possession of the ball and successfully keeps
possession for at least two passes or an attacking play
50 Dispossessed Player is successfully tackled and loses possession of the ball
51 Error Mistake by player losing the ball. Leads to a shot or goals as
described with qualifier 169 or 170
52 Keeper pick-up Goalkeeper event; picks up the ball
53 Cross not claimed Goalkeeper event; cross not successfully caught
54 Smother Goalkeeper event; comes out and covers the ball in the box
winning possession
55 Offside provoked Awarded to last defender when an offside decision is given
against an attacker
56 Shield ball opp Defender uses his body to shield the ball from an opponent as
it rolls out of play
57 Foul throw-in A throw-in not taken correctly resulting in the throw being
awarded to the opposing team
58 Penalty faced Goalkeeper event; penalty by opposition team
59 Keeper Sweeper When keeper comes off his line and/or out of his box to clear
the ball
60 Chance missed Used when a player does not actually make a shot on goal but
was in a good position to score and only just missed receiving
a pass
61 Ball touch Used when a player makes a bad touch on the ball and loses
possession. Outcome 1 – ball simply hit the player unintentionally.
Outcome 0 – Player unsuccessfully controlled the ball.
63 Temp Save An event indicating a save has occurred but without full details.
Event 10 will follow shortly afterwards with full details.
64 Resume Match resumes on a new date after being abandoned mid game.
65 Contentious referee decision Any major talking point or error made by the referee –
decision will be assigned to the relevant team
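The table describes lifecycle rules that suggest a cleaning pass before analysis: a deleted event is resent with its original ID and type 43, and types 38, 39, and 63 are provisional placeholders for events 16, 15, and 10. The following is a minimal sketch of such a pass, not the thesis implementation. It assumes that events arrive in feed order as dictionaries with the hypothetical keys `id` and `type_id`, and that a resent event reuses the ID of the one it supersedes.

```python
# Event type IDs taken from the event table above.
DELETED = 43
PROVISIONAL = {38, 39, 63}  # Temp Goal, Temp Attempt, Temp Save


def clean_events(events):
    """Keep the latest version of each event, then drop deleted and provisional ones.

    Assumption (hypothetical data layout): each event is a dict with an "id"
    and a "type_id" key, and later entries with the same id supersede earlier ones.
    """
    latest = {}
    for event in events:
        latest[event["id"]] = event  # a resent event overwrites the older copy
    return [
        event
        for event in latest.values()
        if event["type_id"] != DELETED and event["type_id"] not in PROVISIONAL
    ]


feed = [
    {"id": 1, "type_id": 38},  # Temp Goal
    {"id": 2, "type_id": 44},  # Aerial
    {"id": 1, "type_id": 16},  # same event resent as a goal
    {"id": 3, "type_id": 61},  # Ball touch
    {"id": 3, "type_id": 43},  # ball touch later deleted
]
cleaned = clean_events(feed)
```

Here the provisional goal is replaced by its final version, and the deleted ball touch is dropped. A provisional event that never receives a final version is also dropped, which may or may not be desirable depending on the analysis.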
Qualifiers
Table C.3: Qualifiers that are related to shots (event types 13, 14, 15, and 16)
Table C.4: Qualifiers that are related to shots (event types 13, 14, 15, and 16)
Table C.5: Qualifiers that are related to shots (event types 13, 14, 15, and 16)
ID Goalkeeper event Description
190 From shot off target Used with Event 10. Indicates a shot was saved by the goalkeeper
but in fact the shot was going wide and not on target
88 High claim Event 11 Claim - Goalkeeper claims possession of a crossed
ball
89 1 on 1 Event 10 Save; when attacker was clear with no defenders
between him and the goalkeeper
90 Deflected save Event 10 Save; when goalkeeper saves a shot but does not
catch the ball
91 Dive and deflect Event 10 Save; when goalkeeper saves a shot while diving but
does not catch the ball
92 Catch Event 10 Save; when goalkeeper saves a shot and catches it
93 Dive and catch Event 10 Save; when goalkeeper saves a shot while diving and
catches it
123 Keeper Throw Pass event - goalkeeper throws the ball out
124 Goal Kick Pass event – goal kick
128 Punch Clearance by goalkeeper where he punches the ball clear
139 Own Player Shot saved by goalkeeper that was deflected by a defender
173 Parried safe Goalkeeper save where shot is parried to safety
174 Parried danger Goalkeeper save where shot is parried but only to another
opponent
175 Fingertip Goalkeeper save using his fingertips
176 Caught Goalkeeper catches the ball
177 Collected Goalkeeper save and collects possession of the ball
178 Standing Goalkeeper save while standing
179 Diving Goalkeeper save while diving
180 Stooping Goalkeeper saves while stooping
181 Reaching Goalkeeper save where goalkeeper reaches for the ball
182 Hands Goalkeeper saves with his hands
183 Feet Goalkeeper save using his feet
186 Scored Goalkeeper event - shots faced and not saved resulting in goal
187 Saved Goalkeeper event - shots faced and saved
188 Missed Goalkeeper event - shot faced which went wide or over. Did
not require a save.
198 GK hoof Goalkeeper drops the ball on the ground and kicks it long
towards a position rather than a specific player
199 GK kick from hands Goalkeeper kicks the ball forward straight out of his hands
Table C.7: Qualifiers that are related to goalkeeper events (event types 10, 11, and
12)
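As an illustration of how the qualifiers in Table C.7 might be used, the sketch below turns the qualifier IDs attached to a save into readable labels. It is a minimal example under stated assumptions: the qualifiers of an event are assumed to be available as a set of integer IDs, and the names `SAVE_QUALIFIERS` and `describe_save` are hypothetical. Only a subset of the table is included.

```python
# A subset of the goalkeeper qualifier IDs listed in Table C.7.
SAVE_QUALIFIERS = {
    173: "Parried safe",
    174: "Parried danger",
    175: "Fingertip",
    176: "Caught",
    177: "Collected",
    178: "Standing",
    179: "Diving",
    180: "Stooping",
    181: "Reaching",
    182: "Hands",
    183: "Feet",
}


def describe_save(qualifier_ids):
    """Return the known save labels for the given qualifier IDs, sorted by ID.

    IDs that are not in SAVE_QUALIFIERS are ignored.
    """
    return [SAVE_QUALIFIERS[q] for q in sorted(qualifier_ids) if q in SAVE_QUALIFIERS]


# Qualifier 56 is not a save qualifier and is therefore skipped.
labels = describe_save({179, 173, 182, 56})
```

Keeping the mapping in a plain dictionary makes it easy to extend with the remaining rows of the table.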
Table C.9: Qualifiers that are related to line ups, substitutions, and formations
ID Referee Description
50 Official position 1, 2, 3, or 4 for Referee, Linesman#1, Linesman#2, Fourth
official
51 Official ID Unique ID for the official
200 Referee stop Referee stops play
201 Referee delay Delay in play instructed by referee
208 Referee Injury Referee injured
ID Stoppage Description
53 Injured player id ID of the player who is injured and causing a delay in the
game
202 Weather problem Bad weather stops or interrupts play
203 Crowd trouble Trouble within the crowd stops or delays play
204 Fire Fire within the stadium stops or delays play
205 Object thrown on pitch Object thrown from the crowd lands on the pitch and delays
play
206 Spectator on pitch Spectator comes onto the pitch and forces a delay in play
207 Awaiting officials decision Given to an event/delay where the referee still has to make a
decision
208 Referee injury Referee sustained injury causing stoppage in play
226 Suspended Game has not finished but is suspended
227 Resume Game has resumed after being suspended mid-way through
on a previous date
Field zones
Appendix E
Goalmouth locations
Bibliography
[2] Mike Hughes and Ian Franks. Analysis of passing sequences, shots and goals in
soccer. Journal of Sports Sciences, 23(5):509–514, 2005.
[3] Charles Perin, Romain Vuillemot, and Jean-Daniel Fekete. Soccerstories: A kick-off
for visual soccer analysis. IEEE Transactions on Visualization and Computer
Graphics, 19(12):2506–2515, 2013. ISSN 1077-2626. doi: 10.1109/TVCG.2013.192.
[4] Javier López Peña and Hugo Touchette. A network theory analysis of football
strategies. arXiv preprint arXiv:1206.6904, 2012.
[5] Aislan Gomide Foina, Rosa M Badia, Ahmed El-Deeb, and Francisco Javier
Ramirez-Fernandez. Player tracker-a tool to analyze sport players using rfid. In Per-
vasive Computing and Communications Workshops (PERCOM Workshops), 2010
8th IEEE International Conference on, pages 772–775. IEEE, 2010.
[6] Mike D. Hughes and Roger M. Bartlett. The use of performance indicators in
performance analysis. Journal of Sports Sciences, pages 739–754, 2012.
doi: 10.1080/026404102320675602.
[8] Charles Perin and Frédéric Vernier. R2S2: a Hybrid Technique to Visualize Sport
Ranking Evolution. In What’s the score? The 1st Workshop on Sports Data Vi-
sualization, Atlanta, GA, United States, October 2013. URL [Link]
fr/hal-00869346.
[9] Andy Cox and John Stasko. Sportsvis: Discovering meaning in sports statistics
through information visualization. In Compendium of Symposium on Information
Visualization, pages 114–115. Citeseer, 2006.
[13] Javier Perez. Us soccer curriculum. United States Soccer Federation, 2011.
[15] Christian Lavers. The reality of high pressure defending. Soccer Coach-
ing International, (34):39, August / September 2009. URL [Link]
[Link]/pdf/38-43_SCI_09.34_Defending.pdf.
[16] Stan Baker. Our Competition is the World. Lulu Publishing, 2012.
ISBN 978-1-300-04165-8. URL [Link]
our-competition-is-the-world/paperback/[Link].
[18] Phil Wymer. Coaching Soccer Tactics. Phil Wymer, 2004. ISBN 0955007607. URL
[Link]
[19] A Yiannakos and V Armatas. Evaluation of the goal scoring patterns in european
championship in portugal 2004. International Journal of Performance Analysis in
Sport, 6(1):178–188, 2006.
[20] Craig Wright, Steve Atkins, Remco Polman, Bryan Jones, and Lee Sargeson. Fac-
tors associated with goals and goal scoring opportunities in professional soccer.
International Journal of Performance Analysis in Sport, 11(3):438–449, 2011.
[21] LA84 Foundation, Stacy Chapman, Ed Derse, and Jacqueline Hansen. LA84 Foun-
dation Soccer Coaching Manual. LA84 Foundation, 2007.
[22] Jens Bangsbo and Birger Peitersen. Soccer systems and strategies. Human Kinet-
ics, 2000. ISBN 0-7360-0300-2. URL [Link]
all-products/soccer-systems--strategies.
[23] Hossini Fatemeh, Rezae Shirazi Reza, Mesuodi Nezhad Monire, and Rezaei Rozita.
Effectiveness of types of individual defense performance in achieving success. In-
ternational Research Journal of Applied and Basic Science, 3(4):891–895, 2012.
[24] Wouter Frencken, Harjo de Poel, Chris Visscher, and Koen Lemmink. Variability of
inter-team distances associated with match events in elite-standard soccer. Journal
of sports sciences, 30(12):1207–1213, 2012.
[25] Ricardo Cava and Carla Dal Sasso Freitas. Glyphs in matrix representation of
graphs for displaying soccer games results. In SportVIS-Workshop on Sports Data
Visualization. Atlanta, Georgia, USA: IEEE VIS, 2013.
[26] Adrian Rusu, Doru Stoica, Edward Burns, Benjamin Hample, Kevin McGarry, and
Robert Russell. Dynamic visualizations for soccer statistical analysis. In Informa-
tion Visualisation (IV), 2010 14th International Conference, pages 207–212. IEEE,
2010.
[27] Eleftherios Kellis and Athanasios Katis. Biomechanical characteristics and deter-
minants of instep soccer kick. Journal of sports science & medicine, 6(2):154, 2007.
[28] Matthew O Ward. Multivariate data glyphs: Principles and practice. In Handbook
of Data Visualization, pages 179–198. Springer, 2008.