0% found this document useful (0 votes)
5 views4 pages

Genetic Algorithm for Graph Layouts

This paper presents a genetic algorithm for improving the layout of undirected graphs, focusing on performance and scalability through parallel computing. The algorithm employs evolutionary techniques such as selection, crossover, and mutation to optimize graph aesthetics, specifically minimizing edge crossings and ensuring even node distribution. Results indicate significant improvements in graph layouts compared to traditional methods, demonstrating the effectiveness of genetic algorithms in handling large datasets efficiently.

Uploaded by

agus
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
5 views4 pages

Genetic Algorithm for Graph Layouts

This paper presents a genetic algorithm for improving the layout of undirected graphs, focusing on performance and scalability through parallel computing. The algorithm employs evolutionary techniques such as selection, crossover, and mutation to optimize graph aesthetics, specifically minimizing edge crossings and ensuring even node distribution. Results indicate significant improvements in graph layouts compared to traditional methods, demonstrating the effectiveness of genetic algorithms in handling large datasets efficiently.

Uploaded by

agus
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

Genetic Algorithm for Drawing Undirected

Graph’s Layouts
Ekaterina Bochvaroska Aleksandar Tošić
Faculty of Mathematics, Natural Faculty of Mathematics, Natural
Sciences and Information Technologies Sciences and Information Technologies
University of Primorska University of Primorska
Glagoljška ulica 8, Glagoljška ulica 8,
SI-6000 Koper, Slovenia SI-6000 Koper, Slovenia
89221049@[Link] Innorenew CoE
Izola, Slovenia
[Link]@[Link]

Abstract—In this paper, we present an empirical ap- also ensures faster processing, providing more balanced
proach to graph layout improvement using a genetic and visually appealing graphs that can handle larger
algorithm, leveraging the power of parallel computing datasets efficiently, demonstrating the potential of this
to enhance performance and scalability. We develop
an iterative population-based evolutionary search for a algorithm in tackling such intricate problems.
random graph layout that combines selections, crossover, Early work on the topic was introduced by ”TimGA:
and mutation to obtain the best possible outlay of graphs A Genetic Algorithm for Drawing Undirected Graphs”
concerning multiple aesthetic criteria. We evaluate our ap- [3]. In complement, TimGA proposes a genetic al-
proach using a complete evaluation pipeline and compare gorithm that is tailored for the layout of undirected
the results with the ForceAtlas 2 algorithm. [1] Traditional
graph topology algorithms typically suffer from degraded graphs, which takes into consideration important aes-
performance when dealing with large graphs but gener- thetic criteria such as minimizing edge crossings while
ally produce better results. Thus, using genetic algorithms enforcing an even distribution of node positions and
might be preferable in large networks, where the need reducing-edge length deviation [4]. Using a genetic
for performance is prioritized above the requirement for algorithm, we seek to refine graph layouts until they
a detailed visual inspection of the resulting graph. The
evolutionary algorithm is guided by a fitness function are clear and visually appealing. We begin the process
based on edge-crossing minimization, node distribution, with a randomly generated graph, mimicking natural
and overall visual clarity. Finally, absent an objective evolution in generating the initial population of graphs,
method for measuring how appealing topologies are to which progressively evolved with selection, crossover,
the human eye, we provide a visual comparison between and mutation. A key component here is the quality of
our results and commonly used graph layout algorithms.
a graph layout, which can be evaluated by measuring
I. I NTRODUCTION different graph properties such as the number of edge
crossings or how evenly the nodes are positioned across
When analyzing networks and visualizing data, it’s the displaying surface. As a result, the output layouts of
crucial to create appealing graph layouts and discover the graphs are simplified, which again puts more em-
attractive topologies. Tutte [2], a pioneer in the field of phasis on the importance of understanding the complex
graph drawing, laid the groundwork for creating visu- data. In addition, aiming to provide a global measure
ally understandable representations of complex graphs. of the progress, we used Force Atlas 2 algorithm [1] to
A well-designed graph provides both a precise and estimate the progress of the initial graph in comparison
visually appealing representation of data. to the outcomes of (GA), serving as a visual reference
Relying on his work Force-directed layout algo- to determine if any improvement has been achieved.
rithms [1] aim to position the nodes dynamically and
apply forces, hence reflecting the structure and rela- II. I MPLEMENTATION
tionships making it easier to find patterns and symme- In this section, we explain the genetic algorithm’s
tries. However, these force-directed layouts often hit functionality, its implementation for this project, and
lower local minima quickly, as well as have a high the methodology used to calculate the fitness function.
running time. In contrast to this, genetic algorithms Our approach is inspired by the principles of natural se-
(GA) use more direct and adaptive methods to solve lection and genetics, we iteratively evolve a population
a wide range of optimization problems, making them of solutions to optimize graph layouts.
widely popular. They are flexible and can be applied
in many different scenarios without requiring specific A. Overview of the Genetic Algorithm
assumptions about the problem. This work addresses The genetic algorithm follows these steps:
the limitations of conventional methods by using a 1) Initialization: We begin with a randomly gen-
(GA) approach to improve graph representations. This erated graph based on user-defined sizes of the
approach not only aims for better optimization but edge set (|E|) and vertex set (|V |), represented

ERK'2024, Portorož, 436-439 436


by array lists. The initial population of graph 3) Edge Length Deviation Θ : This term measures
individuals is created within the specified panel the deviation of edge lengths from an optimal edge
size, which we fixed to 1000x1000. length, which is the smallest edge length found overall
2) Fitness Evaluation: Evaluate each graph’s fitness in the graph structure to ensure a balanced and symmet-
function score based on criteria such as minimiz- ric graph layout. [4]Further this value is increased by 5
ing edge crossings and evenly distributing nodes. units, which will again assist in spreading the edges
Computationally this is the most expensive part to more even lengths across the graph. This avoids
of the algorithm. very long or very short edges and gives a much more
3) Selection: Like nature employs Darwinism [5], uniform and aesthetically pleasing layout. By doing so,
we select the fittest layouts, without employing the lengths of the edges become more consistent, hence
any changes -elitism [6] to serve as parents’ reducing clutter and increasing clarity in the graph.
layouts to the next generation to ensure the best
solutions are preserved. 1 X 2
Θ= (d(e) − µ) (3)
4) Crossover: By combining pairs of parent layouts |E|
e∈E
to create new graphs, we introduce variations in
the graph structure. where d(e) is the length of edge e, µ is the Optimal
5) Mutation: To preserve genetic diversity, we in- Edge length and |E| is the total number of edges.
troduce random changes to certain layouts, simi-
4) Edge Crossings = X : Edge crossings refer to
lar to how natural variation is essential in evolu-
the number of intersections between edges in a graph
tionary processes.
layout. Minimizing edge crossings is essential for im-
6) Termination: Repeating the process until a stop-
proving the visual clarity of the graphs’ layouts. High
ping condition is met, such as reaching a maxi-
numbers of edge crossings can lead to visual clutter,
mum number of generations.
making it difficult to observe relationships and patterns
within the data. [4]
B. Fitness Function Calculation
|E| |E|
The fitness function serves as a compass, guiding
X X
X= δ(ei , ej ) (4)
through the optimization journey. It evaluates how well i=1 j=i+1
an individual graph layout meets the desired criteria,
specifically, it defines the quality of the potential so- where δ(ei , ej ) is 1 if edges ei and ej intersect, and 0
lution. Evaluating through the whole population each otherwise.
graph receives a unique fitness score based on certain 5) Combined Fitness Calculation: The total fitness
properties. Particularly as the algorithm evolves through score is a weighted combination of the above equations
multiple generations, the fitness function guides the (1), (2), (3) and (4). The weights are chosen to balance
selection of layouts for further improvement where it the importance of each criterion, ensuring a well-
influences which solutions survive, reproduce, and con- organized graph layout. The fitness function formula
tribute to the next iterations. Hence the fitness function is:
consists of several components each contributing to
finding the optimal solution: FitnessScore = + w1 × Λ
1) Minimum Distance Neighbour Sum = Λ: This + w2 × Θ
component evaluates the sum of the minimum distances
+ w3 × X
between each node and its nearest neighbor, aiming  
to achieve an even distribution of nodes and reduce Θ
− w4 ×
clustering within the graph layout. Its high values Λ+ϵ
indicate that nodes are well-separated. − w5 × N × (minD)2
N
X In the code, the weights are:
Λ= min d(i, j) (1)
j∈neighbors(i)
i=1 w1 = 2.0 (Weight for Minimum Distance Neighbour Sum)
where d(i, j) is the Euclidean distance between w2 = 2.0 (Weight for Edge Length Deviation)
nodes i and j, and N is the total number of nodes. w3 = 1.0 (Weight for Edge Crossings)
2) Node Spread = S: This term further distributes w4 = 2.5 (Weight for Edge Length Deviation Penalty)
nodes by taking into account the number of nodes and
w5 = 0.5 (Weight for Node Spread)
the square of the minimum distance between nodes,
promoting a wider spread to improve layout clarity. [7] where ϵ is a small constant added to avoid division by
1 zero. Additionally, we incorporate a check for symmet-
S= × N × (minD)2 (2) rical edge crossings, as some symmetrical patterns can
4
be beneficial. If such patterns are detected, the fitness
where minD is the Minimum Node Distance is multiplied by a weight of 1.5.

437
C. Selection that in the first few hundred generations, most changes
Individual graphs are probabilistically selected from to the layout are improving on even spacing of nodes
the population in the selection phase based on their fit- while the last few thousand generations significantly
ness values. This means that better layouts are normally improve on symmetry. One of the components of this
favored, ensuring a better creation of the next genera- optimization apart from the one-point crossover and
tion. Additionally, implementing elitism means that the elitism was also the mutation method used, which
graphs are firstly sorted based on their fitness scores introduces random perturbations to the graph layout.
and then only the top half of individuals are retained Comparing these results with the state-of-the-art lay-
in the population. This approach hopefully ensures that out algorithm Force Atlas 2 [9] developed by Gephi
the best layouts remain in the next generation leading [10], we offer visually comparable results. However, it
to continuous improvement and progress. is worth noting that Force Atlas 2 does not preserve
edge lengths whereas our implementation does.
D. Crossover Furthermore, we measured computation time across
Crossover combines genetic material from two par- various graph sizes—from 100 x 100 nodes and edges
ent graphs to create new offspring. This process keeps to scales reaching up to 6,400 x 6,400 over a generation
the genetic pool diverse, which is important for finding of 100 graphs. The most computationally expensive
better solutions and exploring different possibilities in operation is the fitness evaluation, for instance, cal-
the population. Usually, this operation requires two culating minimum distances between neighbors, edge
parent individuals to produce one offspring. Our imple- length deviations, and edge crossings involves iterating
mentation includes creating two offsprings to uniformly through all the edges and nodes which becomes compu-
retain genes that could possibly lead to future improve- tationally heavier when dealing with larger graph sizes
ment in the layout. [8] A crossover point is chosen or larger populations. Through parallel programming,
randomly in the node list, and as a result, the parents’ which takes advantage of multi-core processors, we
nodes are divided by the separation point. The child distribute computations across multiple threads. [11]
graphs are then created by replacing the first parent’s We implement this by using the ExecutorService class
1
node sublist with the second parent’s node sublist, , which manages a pool of threads for concurrent
effectively exchanging the genetic material. It should execution, we allow multiple fitness evaluations to run
be noted that in this process copies of the parents simultaneously. A Semaphore is used to synchronize
are normally utilized to avoid disrupting the original the completion of the tasks, ensuring all evaluations
parents. are completed before proceeding further. Our approach
not only delivers efficient computational performance
E. Mutation visible in Figure 4 but also ensures scalability for
There are two types of mutations that occur in every managing larger graphs.
individual with a probability of 0.1% per graph. The
IV. C ONCLUSION
mutation is implemented by randomly selecting a node
from the graph, and performing one of two mutation Our research explores a relatively untapped area, as
operations on it: there are few published works on this topic, which are
1) Rotate mutation is a random rotation of con- mostly quite outdated. Thus, we significantly improve
nected nodes or random movement of selected the quality of the generated graph layout by integrating
incident edges, preserving their length and con- more precise and complex fitness calculations that take
nectivity. This exploration of new configurations into account elements like minimum distances between
helps discover optimal layouts. neighbors, edge length, edge variations, symmetry, and
2) Flip mutation is divided into horizontal, vertical, edge crossings. Our strategic use of different mutations
and diagonal flips of a node’s position. to preserve genetic diversity emphasizes how crucial it
is to keep a variety of genes. Moreover, we show that
The flipping mutation introduces more diversity avoid-
even for large-scale graphs, genetic algorithms could
ing local minimum. Figure 2 illustrates a horizontal
be highly efficient by utilizing parallel computing archi-
flip in which node labeled 3 is flipped. One of the
tectures, especially in the fitness evaluation process. To
three options is chosen with equal probability. Rota-
conclude, our implementation of a more precise fitness
tional mutations would make it difficult to achieve a
function with the help of concurrent programming not
similar layout. By flipping the 3rd node we not only
only enhances performance but also lays a foundation
improve node distribution but reduce the number of
for more complex graphs with efficiency and scalabil-
edge crossings by one.
ity, improving the graph layout quality, and opening a
III. R ESULTS way for further research and development in this field.
Generally, the results from simulations show a signif- R EFERENCES
icant improvement in layout with relatively fast conver-
[1] T. Masui, “Evolutionary learning of graph layout constraints
gence. Figure 1 shows a chronological time-lapse of the from examples,” in Proceedings of the 7th annual ACM Sym-
evolutionary process for a random graph with 10 nodes,
15 edges, and a population size of 500. We observe 1 [Link]

438
(a) Gen=0 (b) Gen=10 (c) Gen=50 (d) Gen=100

(e) Gen=500 (f) Gen=1000 (g) Gen=5000 (h) Gen=10000


Fig. 1: Mutation by flipping a node (3) horizontally.

generation_log

35000

1 1 30000

2 2
Horizontal Flip 25000

3 3 3 20000
Time in ms

15000
5 5
4 4
10000

5000

0
0 1000 2000 3000 4000 5000 6000 7000
(a) Before mutation (b) After mutation Graph size for population 100

Fig. 2: Mutation by flipping a node (3) horizontally.


Fig. 4: Requirement of running time in ms for different graph
sizes

Department of Computer Science, 1996.


[4] A. d. M. S. Barreto and H. J. Barbosa, “Graph layout using
a genetic algorithm,” in Proceedings. Vol. 1. Sixth Brazilian
Symposium on Neural Networks. IEEE, 2000, pp. 179–184.
[5] S. J. Gould, “Darwinism and the expansion of evolutionary
theory,” Science, vol. 216, no. 4544, pp. 380–387, 1982.
[6] H. Du, Z. Wang, W. Zhan, and J. Guo, “Elitism and distance
strategy for selection of evolutionary algorithms,” IEEE Access,
vol. 6, pp. 44 531–44 541, 2018.
[7] G. Di Battista, P. Eades, R. Tamassia, and I. G. Tollis,
“Algorithms for drawing graphs: an annotated bibliography,”
Computational Geometry, vol. 4, no. 5, pp. 235–282, 1994.
[8] R. Poli and W. B. Langdon, “Schema theory for genetic
programming with one-point crossover and point mutation,”
Evolutionary Computation, vol. 6, no. 3, pp. 231–252, 1998.
[9] M. Jacomy, T. Venturini, S. Heymann, and M. Bastian,
Fig. 3: Force Atlas 2 layout from Gephi given the same input “Forceatlas2, a continuous graph layout algorithm for handy
graph with 10 nodes, 15 edges. network visualization designed for the gephi software,” PloS
one, vol. 9, no. 6, p. e98679, 2014.
[10] M. Bastian, S. Heymann, and M. Jacomy, “Gephi: an open
source software for exploring and manipulating networks,” in
posium on User Interface Software and Technology, 1994, pp. Proceedings of the international AAAI conference on web and
103–108. social media, vol. 3, no. 1, 2009, pp. 361–362.
[2] W. T. Tutte, “How to draw a graph,” Proceedings of the London [11] A. Zamuda, J. Brest, B. Bošković, and V. Žumer, “Differential
Mathematical Society, vol. 3, no. 1, pp. 743–767, 1963. evolution for parameterized procedural woody plant models
[3] T. Eloranta and E. Mäkinen, TimGA: A Genetic Algorithm reconstruction,” Applied Soft Computing, vol. 11, no. 8, pp.
for Drawing Undirected Graphs. University of Tampere, 4904–4912, 2011.

439

You might also like