Swarm Robotics Task Review
Swarm Robotics Task Review
net/publication/282623965
CITATIONS READS
435 3,522
1 author:
Levent Bayindir
Atatürk University
21 PUBLICATIONS 834 CITATIONS
SEE PROFILE
All content following this page was uploaded by Levent Bayindir on 04 February 2019.
Levent Bayındıra
a Departmentof Computer Engineering, Ataturk University, 25240, Erzurum, Turkey
E-mail: [Link]@[Link] Tel: +90 (442) 231 4905
Abstract
1. Introduction
2
The expression “swarm intelligence”, which is now widely used in the field of
swarm robotics, refers to the superior capabilities of a swarm of agents compared
to its single individuals. The local events triggered by swarm members during
execution of a task translate into a global behavior which often transcends the
individual capabilities, to the point that many collective tasks can be success-
fully done by robots that are not explicitly programmed to execute those tasks:
the global, macroscopic dynamics is said to emerge from interactions of swarm
members between each other and with the environment.
The possibility to achieve global objectives at the swarm level by means
of distributed algorithms acting at the individual level comes at a price: it is
often difficult to design the individual robot behavior so that the global perfor-
mance is maximized. This problem has been widely studied by swarm robotics
researchers, and has been addressed with simulation, modeling and learning ap-
proaches. Simulation, where a given multi-robot scenario is replicated in a vir-
tual environment in which robot capabilities (sensors and actuators) and inter-
actions are simulated by a computer program, allows assessing the performance
of a robot swarm with repeated runs of an experiment, eliminating or mitigating
the need for time-consuming experiments with real robots and facilitating algo-
rithm optimization with a trial-and-error approach. Modeling (more precisely,
macroscopic modeling) utilizes mathematical formulas to link individual-level
control parameters to swarm-level dynamics; with such formulas, the impact of
algorithm parameters can be evaluated directly, and often valuable insights on
the global dynamics of the swarm can be intuitively obtained. Learning refers
generically to adaptation of algorithm parameters based on previous experience;
learning methods can be categorized in offline approaches, where the parameter
optimization phase is part of the design of robot controllers, and online ap-
proaches, where robots dynamically update their control parameters based on
their perception of the environment.
A widely used offline learning method is artificial evolution, which, starting
from initial values of algorithm parameters, iteratively executes robot exper-
iments evaluating for each experiment a fitness function which estimates the
3
performance of the algorithm in executing a swarm-level task; the most per-
forming parameter values at a given iteration are identified and used as a basis
to program robots in subsequent iterations. Similar to what happens in nature
with the evolution of species, robots are able to evolve their behavior across
different “generations” and accomplish the given task. While neural networks
are a common type of robot controllers used with artificial evolution, recent
works explored the use of alternative methods such as rule-based grammatical
evolution [3].
Online learning methods have been shown to be able to increase the flexibil-
ity of a swarm, i.e. its capability to adapt to different environment conditions.
By definition, robots learning during task execution must have some form of
memory which allows them to remember past experiences in order to adapt
their future behavior; thus, inclusion of online learning methods in robot con-
trollers implies an additional level of complexity in robot implementation. But
generally the biggest difficulties encountered in this domain are due to different
aspects: first, robots often have a limited and noisy perception of the environ-
ment and of the progression of a global task; second, as already discussed, the
distributed nature of the problem makes it difficult to relate individual behavior
to global performance. Two types of online learning methods can be identified in
past works: reinforcement learning and parameter adaptation. Reinforcement
learning is based on a model where robots, which can be in a given set of states
and can execute a given set of actions, receive feedback on the results of their ac-
tions through a reward ; the objective of robots is to choose a mapping between
states and actions so as to maximize the reward. Using a local reinforcement
paradigm, the reward is assigned only to robots which directly accomplish an
objective, while with global reinforcement all the robots are rewarded for each
accomplishment; local reinforcement is more coherent with swarm intelligence
principles because it does not require sharing global information in the swarm.
Other online learning methods can be described as based on dynamic adaptation
of robot algorithm parameters triggered by observations of the environment.
In this paper, various tasks for which past works have proposed solutions us-
4
ing a swarm intelligence approach are surveyed, focusing on distributed control,
locality of interactions and simplicity of individual robot controllers. The next
section is dedicated to previous swarm robotics reviews; then, the subsequent
sections describe the different tasks and the corresponding solutions proposed
in past studies; finally, future research directions are outlined and concluding
remarks are made in the last sections.
2. Previous work
5
then further differentiated at the knowledge level, where systems with robots
aware of the existence of other robots are distinguished from systems where each
robot acts as if it was alone. The lower levels of the proposed taxonomy are
the coordination level, describing how the actions of each robot take into ac-
count the actions of other robots, and the organization level, which determines
whether decisions are taken in a centralized or distributed way; it is interesting
to note that a centralized system may be compatible with swarm intelligence
principles: more precisely, systems defined as weakly centralized, where one of
the robots temporarily assumes the role of leader, can exhibit the desired prop-
erty of fault tolerance provided that suitable mechanisms are in place to assign
the leadership role.
While early papers provided characterization of swarm robotics systems
mainly as an analysis tool to encourage further research and give guidance
for the design of new systems, with the increasing number of published studies
describing actual implementations of robot swarms, newer surveys have been
able to propose taxonomies where each category is represented by several ex-
amples of existing works. An extensive review of the state of the art in the mid
2000s is provided in [7], where existing works are categorized based on analytical
modeling approaches, design of individual robot behavior, type of interactions
between robots, and problems being addressed by a robotic swarm. Gazi and
Fidan [8] focused on the aspect of controlling robot movement, dividing existing
works based on how the robot dynamics is modeled (i.e. how control inputs map
to position variations) and how robot controllers are engineered; in addition, a
further classification is done on the problem dimension. Previous works have
been classified on the problem dimension also in other surveys [9, 10]; Mohan
and Ponnambalam [11] analyzed various research domains in swarm robotics,
however their review does not provide a clear categorization of the state of the
art, mixing a classification of some studies in the problem dimension with a
description of how other studies differ on aspects such as biological inspiration,
communication between robots, control approach and learning.
A comprehensive survey recently published is the work by Brambilla et al.
6
[12], who proposed two taxonomies for swarm robotics studies: methods and col-
lective behaviors. The first taxonomy includes design methods, differentiated in
behavior-based and automatic methods, and analysis methods, i.e. techniques
to study the performance of a swarm in executing a given task; analysis methods
are divided in microscopic models, macroscopic models and real-robot analysis.
The second taxonomy is based on the concept of collective behaviors, i.e. behav-
iors of a swarm of robots considered as a whole; the main categories identified by
the authors under this taxonomy are spatially organizing behaviors, navigation
behaviors and collective decision making.
Barca and Sekercioglu [13] analyzed past research by identifying a series of
challenges faced by swarm robotics systems and describing how each of these
challenges has been addressed by existing studies. Challenges are grouped un-
der five categories: selecting appropriate communication and control schemes,
incorporating self-organization, scalability and robustness properties, devising
mechanisms to support goal-oriented formations, control and connectivity, im-
plementing functions that enable robots to interact efficiently with the envi-
ronment, and addressing problems related to energy consumption. The authors
outlined a number of issues that need to be tackled to overcome these challenges,
and observed how existing works typically focus on only a subset of these issues,
suggesting that in order to implement successfully a swarm robotics system in
real-world applications a larger set of issues should be tackled simultaneously.
In this review, the problem dimension is used as main taxonomy axis, thus
grouping past works according to the collective task being addressed. For each
of the most studied tasks, first the main high-level methods employed in past
studies are described, then task-specific categories are identified and a more de-
tailed description of distributed algorithms is provided for a representative set
of existing works in each category, and finally mathematical models to analyze
and predict swarm behavior, and methods and metrics to evaluate swarm-level
performance are reviewed. Due to similarities and analogies between different
collective tasks, multiple equally valid categorizations can and have been pro-
posed in past reviews under this taxonomy, and many works can be put in more
7
than one category; in this study, partially different categories are identified com-
pared to existing reviews, further categorizations within each task are proposed,
and differences, similarities and relationships between tasks are explained.
3. Aggregation
3.1. Methods
8
Although artificial physics has been successfully used to formally characterize
the aggregation dynamics of swarms of autonomous agents [14, 15, 16, 17, 18], its
practical implementation in artificial systems with real robots imposes some re-
quirements on robot sensing capabilities which may not be cost-effective. Robots
with local sensing abilities are typically characterized by a limited range of vis-
ibility, which influences the capability to perceive other robots in the environ-
ment; determination of the relative orientation of neighbors may be affected by
a high error, such as when infrared technology is used; mechanical constraints
usually determine a saturation effect in robot actuators, effectively limiting the
amplitude of the control inputs which regulate robot motion. For these reasons,
only a limited number of studies considered applying artificial physics rules to
control robot movement. One of these studies is the work by Priolo [19], where
a swarm aggregation algorithm based on artificial physics is implemented and
tested with real robots. In this algorithm, distance and relative orientation of
neighboring robots are obtained via radio frequency and infrared technologies;
as usual, attraction forces are stronger at high distance values between robots,
while repulsion dominates at low distance values. In addition, an obstacle avoid-
ance mechanism is devised in which obstacles are modeled as virtual robots; in
order to avoid attractive force between robots and obstacles, virtual robots are
activated only when their distance to a real robot is such that repulsion is
stronger than attraction.
9
each state corresponds a specific robot behavior; the different states in use (and
the behavior implemented in each state) vary between algorithms, but many
similarities can be found in various aggregation studies. In some cases the walk
state is split in two different states: one where a robot tries to approach other
robots, and one where it distances itself from its neighbors [33, 34]. The decision
to switch between states can be taken in a purely random way, or based on local
cues, which can be as simple as the presence of nearby robots, or may involve
more complex algorithms and signaling mechanisms [35, 36]. Parameters of
the finite state machine, such as the probabilities of switching between states,
are typically chosen manually by the swarm designer, but recently alternative
methods based on automatic design have been proposed [37].
A common point in probabilistic aggregation methods is the presence of
unstable aggregates, with robots continuously joining and leaving them. The
aggregation dynamics arises from changes in randomness of robot behavior due
to the detection of neighboring robots: while disaggregated robots typically
move randomly in the environment, the dynamics of aggregated robots is more
deterministic. However, a random component in the behavior of aggregated
robots is often necessary to facilitate the formation of a small number of large
aggregates, avoiding situations where the presence of small aggregates prevents
robots from joining larger aggregates. In studies where a different control algo-
rithm than a finite state machine is used, typically there is not a clear distinction
between aggregated and non-aggregated robots, but the swarm dynamics can
be measured based on metrics such as the average distance between robots, and
the randomness of robot movement can be changed on a continuous scale [38].
10
environment [39], and actuator outputs can include devices enabling robots to
communicate with other robots [40]. Examples of algorithms used for artificial
evolution are genetic algorithms [39, 40] and q-tournament selection [41].
The standard paradigm used for evolving a population with artificial evo-
lution is based on a fitness value measuring the ability of a given generation
of individuals to accomplish a task. Gomes et al. [42] proposed a different al-
gorithm, based on the concept of novelty search. As opposed to fitness-based
evolution, novelty search-based evolution rewards robots whose behavior differs
from the behaviors observed in past generations. This method allows avoiding
a possible drawback of fitness-based evolution where local maxima of the fit-
ness function in the parameter space may prevent exploration of other parts of
the space and thus limit evolution. In [42], when novelty search is applied to
the aggregation task, the characterization of the behavior of a given generation
is done by measuring metrics such as the average distance of each robot from
the center of mass of all the robots, or the total number of aggregates; these
metrics are measured at multiple time instants during a simulation, and their
values (averaged over different simulations) are inserted in a behavior charac-
terization vector, which is used to calculate the similarity between behaviors.
The authors compared simulation results using novelty search and fitness-based
evolution, and showed that, while the latter is generally better in finding the
optimum parameters for the aggregation task after many generations, with nov-
elty search better results are obtained during the first phase of the evolution,
and are slightly refined in subsequent generations, yielding performance values
similar to those obtained with fitness-based evolution.
Novelty search-based methods rely on a measure of similarity between robot
behaviors, from which the novelty of a given robot can be assessed. Gomes and
Christensen [43] proposed two similarity measures that are independent from
a specific swarm-level task and as such can be used without having domain
knowledge of the task at hand. Both measures are based on evaluating the
state of the neural controller implemented in each robot; the state is defined as a
vector containing the values of the inputs and outputs of the controller at a given
11
time. The first measure, called combined state count, characterizes a behavior by
discretizing the possible states of the controller and counting the occurrences of
each state (sampled at defined intervals) during an experiment run. The second
measure is called sampled average state and is based on calculating a vector
containing the average state of the robotic swarm (i.e. the state obtained by
averaging the states of all the robots) sampled at fixed intervals. Both measures,
when evaluated for the aggregation task, showed good results, comparable with
the results obtained using a domain-specific similarity measure.
3.2. Algorithms
12
robot and then switches to the wait state, otherwise it switches directly to
the wait state; from this state, a robot switches back to random walk with a
predefined probability. Using this algorithm, the global aggregation dynamics
is regulated by the probability for robots to detect other robots after random
walking, with large aggregates being easier to detect than single robots.
Bayındır [34] proposed another algorithm based on a finite state machine,
where the aggregation behavior is obtained using four states called search, wait,
leave and change direction. The wait state, which is entered when a searching
robot detects the presence of other robots and aggregates with them, is designed
so that the robot tries to keep a fixed distance from each of its neighbors; this
allows the formation of compact aggregates with approximately circular shape.
As usual, to avoid situations where small aggregates prevent the formation of
larger aggregates, robots can leave their aggregate at any time with a defined
probability.
In [33], aggregation is obtained in a simulated robotic system where each
robot is equipped with an omnidirectional speaker and a set of microphones, and
uses the sound emitted by the other robots to determine their relative direction
and proximity. The basic states of the finite state machine implemented in the
robots are called approach, wait and repel. In the approach state, robots move
toward the direction of the loudest sound, while in the repel state they move in
the opposite direction; when a robot in the approach state senses another robot
in close proximity, it switches to the wait state, where it stays in its current
position; from this state, it switches to the repel state with a given probability,
and then returns to the approach state with another probability value.
In [35], Trianni et al. presented a system where robots are equipped with
a light source that can be used to signal their presence. From its sensory per-
ceptions (robots are able to detect the presence of other nearby robots and to
measure the intensity of received light), each robot creates a context, i.e. a
high-level abstraction of the surrounding environment; at each time instant, the
robot selects randomly its behavior among a set of predefined behaviors (which
include turning on and off its light, and moving towards or away from other
13
robots); the probabilities to select the various behaviors are defined based on
the perceived context. This generic algorithm can be adapted to different col-
lective tasks by specifying how the sensory inputs map to the perceived context
and defining the probabilities of activating the basic behaviors from each pos-
sible context; as shown by the authors, aggregation is one of the tasks that can
be executed.
In [30], the probability of an aggregated robot to leave the aggregate is de-
termined by its orientation relative to the rest of the robots in the aggregate;
a robot pointing towards the center of the aggregate has a lower probability of
leaving than a robot pointing at other directions. Following various examples
found in nature such as the assembly of molecules, the stability of an aggregate
is described in terms of the energy of the bonds between its robots, which is a
function of the relative orientation of the robots. For simplicity, the authors lim-
ited their discussion to two-robot aggregates, where the energy of the aggregate
equals the energy of the bond between its two robots.
Gauci et al. [41, 44] used minimalistic aggregation algorithms where sensor
input obtained by robots from the environment is limited to one binary vari-
able that indicates whether there is another robot in their line of sight. In [41],
aggregation is obtained through artificial evolution. In [44], the control algo-
rithm makes robots move backwards along a circular trajectory if no robots are
perceived in their line of sight, and rotate on the spot otherwise; this simple
mechanism is shown by the authors to provide emergent aggregation if robot
sensors have a sufficiently long range. However, due to the lack of a behavior
similar to random walk, aggregation cannot be guaranteed if robots are initially
placed at a distance from one another larger than their sensing range.
In [40], robots equipped with microphones, proximity sensors, wheels and a
speaker are controlled by a neural network whose parameters are evolved with
a genetic algorithm. The authors observed the emergence of two types of col-
lective behavior: static and dynamic aggregation. The first type determines the
formation of compact and static aggregates, which is shown not to be scalable
because many robots in the arena tend to form multiple disjoint clusters. With
14
dynamic aggregation, the formed aggregates are less compact, but keep moving
in the arena, and when many robots are present this leads different aggregates
to join and form a single aggregate, thus showing more scalability. In [45], a
similar robotic system is evaluated running multiple experiments with varying
setups, and the effect of different characteristics such as arena size and number
of robots is assessed in terms of performance and scalability of the evolved be-
havior. In [42, 43], artificial evolution is implemented exploiting the concept of
novelty search.
In [36], the aggregation dynamics is obtained using an active environment,
which is able to propagate potentially at a long distance signals emitted by
robots. The aggregation algorithm proposed by the authors is inspired by the
formation of multi-cellular organisms via diffusion of chemical agents. Each
robot emits a signal (the emission rate is regulated by a specific parameter,
called firing rate) that propagates in the surrounding environment; other robots
(which in absence of this signal can either stay at their current position, or move
in a random direction) are attracted by this signal and move toward its source.
With this mechanism, robots tend to move toward each other and form one or
multiple aggregates. In [46], this aggregation method is implemented with real
robots.
In other studies, the aggregation process is not considered as the formation
of a collection of mostly static individuals in close proximity of each other,
but is described in terms of density of robots in a given space. For example,
aggregation obtained with artificial physics methods [14, 15, 16, 17, 18, 19]
generally does not lend itself to a precise identification of discrete aggregates. In
[38], where robots are able to sense the population density in their neighborhood,
the aggregation dynamics is obtained with robots increasing and decreasing
the randomness of their movement based on the local density: while robots in
low-density areas move in a highly random fashion, in high-density areas the
stochastic component of robot movement is lower, leading to robots “settling
down” in regions with a high number of neighbors. In [47], the aggregation
algorithm is controlled by wireless connectivity between robots with a limited
15
communication range. Robots in the arena move with a constant speed and
transmit periodically a wireless message containing their unique identifier; this
message allows receiving robots to detect the presence and count the number
of robots within communication range; each robot tries to keep this number
above a defined threshold: if the number of neighboring robots falls below the
threshold value, the robot inverts the direction of its movement, while when the
number of neighbors increases above the threshold, the robot executes a random
turn. Aggregates are thus formed as dynamic structures where each robot is in
communication range of many other robots.
16
waiting time when stopped near a high number of neighbors. In [48, 49], the
aggregation area is signaled with a sound source, and robots are equipped with
microphones to measure the sound intensity; in order to increase the aggregation
efficiency, each robot has a set of microphones oriented at different directions,
and when resuming walking moves toward the estimated direction of the sound
source.
In other studies [21, 23, 50], two distinct light sources are put in the environ-
ment, and the aggregation dynamics varies with the relative size and intensity
of the light spots. A particular case of the scenario with two light sources in
the arena is when the two sources are identical. As explained in [24], where
experiments with this scenario are evaluated, robots tend to aggregate under
one of the two identical sources. This behavior is an example of symmetry
breaking, which is observed in many robotic systems, where robots converge to
a unanimous decision in front of two options with the same utility.
The aggregation behavior of cockroaches is similar to that of honeybees,
but cockroaches have in addition the ability to detect the number of neighbors,
and the probabilities of entering the wait and walk states are dependent on the
number of detected neighbors [51]. Since cockroaches tend to prefer dark places
as resting sites, the size of the available dark places, in terms of the number of
cockroaches they can host, influences the aggregation dynamics; in [26, 27, 28],
this type of scenarios is analyzed with experiments and theoretical studies. In
[39], a simulated swarm of robots controlled by a neural network is used to
replicate the behavior of cockroaches collectively selecting a single resting site
among two identical shelters. Robots are able to detect whether their current
location is inside a shelter, and are able to count the number of neighboring
robots within a limited range. In the genetic algorithm to select the optimal
parameters of the neural controller, the fitness function is chosen so as to reward
the behavior where the majority of robots aggregates under a single shelter. In
a subsequent study [37], the same behavior is obtained using probabilistic finite
state machines whose parameters are selected with an optimization algorithm.
Schmickl et al. [32] studied an aggregation scenario where robots moving
17
in an arena with two differently sized target areas must form in the target
areas aggregates of a number of robots proportional to the area sizes. Robots
are able to detect whether they are inside a target area, but cannot measure
the size of the area. To achieve the aggregation task, each robot keeps in its
memory a scalar value which corresponds to its perception of the environment.
A robot can receive from other robots within communication range the value
in their memory; at predefined time intervals, the robot updates the value in
its memory based on whether it is in the target area and on the values received
from other robots within communication range. Movement of a robot in the
arena depends on the current value in its memory: if the value is above a
defined threshold, the robot tries to approach the neighboring robot with the
highest value in its memory, otherwise the robot moves randomly. With this
mechanism, a “collective perception” of the environment is obtained via local
communication, despite the limited sensing capabilities of single robots.
3.3. Analysis
3.3.1. Metrics
Performance metrics for the aggregation task are mainly based on either
identifying discrete groups of robots forming aggregates, or measuring the spa-
tial distribution of all robots in the arena. In the first case, a formal definition
of an aggregate is needed. Often, an aggregate is defined as a group of robots
such that for any pair of robots in the group there is path connecting them
formed by robots within a maximum distance from each other; the value of the
maximum distance is usually chosen based on the range of local communication
and sensing of robots. In studies where robots are controlled by a finite state
machine, aggregates can be identified by robots whose controller is in the “wait”
state. Once a suitable definition of aggregates is adopted, performance metrics
can be calculated as the ratio of the number of robots forming the largest aggre-
gate to the total number of robots [45, 31, 44], or as the average aggregate size
[33, 52]; a more in-depth analysis of the aggregation dynamics can be conducted
by observing the distribution of robots belonging to aggregates of different size
18
[35]. In tasks where the objective is to aggregate robots in a given area, a
common metric is given by the number of robots located in the target area
[26, 53, 32, 28, 21, 48] or within a certain distance from a defined aggregation
spot [22, 50].
The second type of metrics involves locating all robots in the arena and
finding a measure of their spatial relationships. Soysal and Şahin [33] used the
sum of distances between each pair of robots; other studies used the average
distance of robots from the center of mass of the swarm [40, 42, 43, 41]; Gauci
et al. [44] used the “second moment of the robots”, calculated by summing the
squares of the distances of each robot from the center of mass; Fatès [36] used
the “bounding box ratio”, defined as the ratio of the surface of the smallest
rectangle containing all the robots over the total surface of the arena.
The temporal dimension in the aggregation task is taken into account in
performance metrics that measure the speed at which a swarm achieves an
aggregation target [20, 25, 48, 49]; usually, such metrics are calculated as the
time elapsed before a given percentage of robots forms an aggregate, starting
from initial conditions where robots are positioned randomly in the arena.
3.3.2. Models
The aggregation behavior has been studied extensively in both natural and
artificial swarms, and mathematical models have been proposed to predict the
performance of a swarm in achieving self-organized aggregation. These models
allow calculating macroscopic quantities describing the collective behavior of a
swarm from parameters governing the individual robot behavior.
A mathematical description of how the inter-individual relationships between
swarm members subject to artificial physics rules determine the swarm dynam-
ics is given in [14]. The spatial density of agents in the environment is modeled
using a conservation law expressed with the advection-diffusion equation; this
partial differential equation models the density variation over time as the sum
of a diffusion term (which models the tendency of randomly moving particles
to go toward less concentrated areas) and an advection term (which accounts
19
for attractive and repulsive forces, and determines a non-random velocity com-
ponent in the agent motion). The authors stated that with local advection (i.e.
a model where the velocity at a given location is only a function of the agent
density at that location) many swarming behaviors observed in nature cannot
be modeled; thus, a non-local advection term is proposed, where the velocity is
calculated as the convolution of the particle density with a kernel function. The
advection-diffusion equation becomes then an integro-differential equation. Ex-
pressing the kernel function as a combination of attraction and repulsion terms,
mathematical analysis and numerical simulations show how these two factors
influence the spatiotemporal characteristics of the swarm.
In [15], the spatial patterns formed by agents are studied via a conservation
law without a diffusion term: the advection equation. A non-local advection
type is considered, and steady state solutions for the equation are derived for
different kernel functions (called interaction potentials in [15]). The solutions
show the formation of various aggregation patterns starting from a uniform
agent distribution. In addition, the dynamics determining the time needed to
reach the steady state, which can offer useful insights on factors influencing the
aggregation performance, is analyzed.
In [16], a robotic swarm where the interaction between robots is characterized
by long-range attraction and short-range repulsion forces is studied with the
Lyapunov theory, which allows finding the equilibrium point of the system. The
position of each robot relative to the centroid of the swarm is described by
means of a set of ordinary differential equations, and a Lyapunov function with
attraction and repulsion components between robots is proposed in order to
find a stable equilibrium point. The Lyapunov function is defined in terms of
three sets of parameters, called coupling, cohesion and convergence parameters;
the equilibrium point, i.e. the set of positions of each robot relative to the
swarm centroid when the system converges to a stable state, is shown to change
depending on the values of these parameters. Through computer simulations,
the authors obtained various swarming behaviors similar to those observed in
nature with social animals.
20
In [17], robots move with a constant speed in a two-dimensional space, and
their reciprocal interactions are governed by three forces (repulsion, alignment
and attraction) acting at short, intermediate and long distance, respectively.
Specifically, the alignment force accounts for the tendency of different agents
to move in the same direction, as observed in many groups of animal species.
The density of agents is expressed as a function of spatial coordinates, head-
ing direction and time, and is modeled using an integro-differential equation
in which the strength of the three interaction forces is determined by specific
parameters. Solving the equation with different parameter values, various pat-
tern formations can be obtained; for example, by changing the strength of the
alignment force the solution of the equation varies from a swarm-like pattern
(with low alignment force) whose single agents move randomly but which does
not advance in any direction, to a flock formation (with high alignment force)
whose agents aggregate while moving along a common direction.
In [18], the aggregation dynamics of the above model is extended with two
additional considerations: the agents may have a non-uniform field of vision,
and there may be another group of agents which interacts with the first group.
The non-uniform field of vision is introduced to model more realistically the
behavior of animals, which typically cannot see behind themselves; thus the po-
sition and orientation of an agent is influenced by the presence of other agents
depending on the orientation of the other agents relative to the field of vision
of the first agent. The second factor introduced in [18] is the presence of a
second group of agents: with the combined system it is possible to model the
interactions between different animal species, as for example in predator-prey
relationships. In the mathematical model, these interactions are expressed by
additional terms (taking into account the attraction and repulsion forces be-
tween groups of agents) in the integro-differential equations which govern the
density of agents of each group. By solving these equations, patterns of preda-
tor and prey movement commonly found in hunting and escape strategies of
different animals can be reproduced.
In probabilistic aggregation methods, mathematical models characterizing
21
the swarm dynamics are usually based on identifying some types of meaningful
events that can occur in the swarm (for example, an encounter between two
moving robots) and estimating the probability of occurrence of such events. In
[29], the aggregation behavior of cockroaches is studied using a model based on
difference equations, which allows calculating the expected number of robots
in aggregates of a given size at a given time. This model, which incorporates
the events where the size of an aggregate changes, such as when a robot joins
or leaves an aggregate, is validated with simulation experiments and is shown
to be able to describe qualitatively and quantitatively the swarm aggregation
dynamics; however, results are only reported for a relatively low number of
robots in the arena. A similar modeling approach is used by Hu et al. [52].
In [30], where aggregates are characterized by an energy value indicating the
strength of the bond between aggregated robots, difference equations are used in
a model estimating the number of non-aggregated robots and of aggregates with
a given energy value. In [34], the aggregation dynamics is described using four
basic events: creation, growing, shrinking and dissipation; with a probabilistic
model utilizing geometric considerations, the probability of occurrence of each
event is calculated, and model predictions are shown to match experimental re-
sults obtained from robot simulations in various operating conditions, including
scenarios with a relatively large number of robots. In [47], where robots can
establish wireless connectivity within a limited range, the main events of inter-
est are when two robots acquire or lose connectivity between each other, and
the macroscopic model estimates the number of robots with a given number of
connections, i.e. other robots within communication range.
In algorithms where the aggregation process is influenced by local charac-
teristics of the environment, mathematical models must take into account such
characteristics, and the collective dynamics can be expressed either with the
number of robots inside the areas designated for aggregation [21], or with the
density of robots expressed as a function of spatial coordinates [53, 54]. In
[20] an experimental setup to simulate honeybee aggregation is described, and
the probability of forming aggregates of a certain size in a given area is put
22
in relation with the luminance of the area and other parameters such as the
density of robots in the arena, their speed and their sensing range. Hamann et
al. [22] proposed two models to estimate the time-varying pattern of moving
and aggregated robots simulating the aggregation behavior of honeybees. In
the first model, called compartmental model, the arena is split in concentric
rings centered below the light source, and the number of moving robots and of
aggregates in each ring is estimated with ordinary differential equations. In the
second model, the space in the arena is not divided in discrete compartments
but represented as a continuous variable, and the densities of moving and aggre-
gated robots are expressed as functions of space and time variables and solved
via partial differential equations. Schmickl et al. [23], who analyzed a scenario
with two light sources of different intensity, studied the aggregation dynamics
using a stock and flow model, with which the number of robots aggregated near
each of the two light sources and the number of moving robots are predicted, and
a second model that estimates the spatial distribution of moving and aggregated
robots using partial differential equations.
4. Flocking
23
usual assumption is that all robots have at least one neighbor which connects
them to the rest of the swarm.
4.1. Methods
4.2. Algorithms
Flocking animal species typically are able to orient themselves in the en-
vironment, and thus can move toward a common target location known by all
individuals. This situation is replicated in engineered systems by giving individ-
ual robots the capability to reach a global target location, or at least to move
along a given direction on a common reference frame. However, the swarm
robotics discipline usually favors the use of minimalistic robots, with little or
24
no global information shared by swarm members, and past works have demon-
strated that flocking can be obtained also without global information. In the
following, the algorithms proposed in some notable studies on the flocking be-
havior are reviewed, differentiating methods relying on a global target location
or direction from methods without such shared knowledge.
25
ence of obstacles and other robots), microphones (through which the relative
position of other robots can be determined) and light sensors (used to detect the
global target). Each robot continuously emits a fixed amplitude sound so that
it can signal its presence to other robots beyond the coverage area of the in-
frared sensors. The fitness function used for artificial evolution includes a group
compactness component, and a speed component measuring the progress of the
swarm in approaching the target. From various replications of the evolution
process, the authors obtained different flocking behaviors.
In [64], Null-Space-based Behavioral (NSB) control is proposed for the task
of flocking. NSB control is used to determine the behavior of a robot in presence
of multiple sub-tasks, when each sub-task is accomplished by a specific behavior.
With NSB control, different priorities are assigned to different behaviors, and
the final behavior of a robot is obtained by combining the single behaviors after
having removed from lower-priority behaviors the components which conflict
with the higher-priority behaviors. In the specific case of flocking, three behav-
iors are identified, namely obstacle avoidance, flock formation and navigation
to target, in decreasing priority order. Each behavior corresponds to a control
input for robot movement, and the control inputs for lower-priority behaviors
are adjusted in order not to conflict with those of higher-priority behaviors. The
sum of the adjusted control inputs determines how a robot must be controlled
to accomplish the task.
In cases where only a fraction of robots has knowledge of the global target,
the motion control input to which robots are subjected depends on whether they
are informed or not; a typical approach followed in various studies [57, 66, 61] is
to include the target direction vector as an additional component in the motion
control input of informed robots. Using local interactions, informed robots can
propagate the information on the target to the other robots and steer the swarm
toward the target. In [58], an information-aware communication strategy is im-
plemented where informed robots transmit to their neighbors the goal direction
vector, while non-informed robots transmit the average heading vector received
from their neighbors. In [59], the above strategy is extended with a mecha-
26
nism that allows dealing with multiple goal directions with different priorities,
so that when a higher-priority direction appears the robots informed of the new
target are able to steer the swarm and override a lower-priority direction; the
heading vector transmitted by each robot to its neighbors is a weighted sum
of the average heading in the neighborhood and the goal direction (if known),
where the weight values are chosen dynamically based on the degree of local
consensus among robots about the preferred direction: when there is a high
consensus, informed robots give a large weight to their goal direction, otherwise
their behavior is more similar to that of non-informed robots, which facilitates
propagating to the swarm information on higher-priority targets. In [61], no ex-
plicit communication between robots is used, and flocking is implemented with
each robot detecting its neighbors and defining a set of regions around itself:
a repulsion area, a direction matching area, an attraction area and a frontal
interaction area; the robot heading direction vector is then calculated from the
position of neighbors inside each area.
27
light and detects whether the light is reflected by a nearby object (either another
robot or an obstacle), or passively, i.e. it detects whether another source of
infrared light (i.e. another robot) is within its sensing range. Robots move
with constant speed in a two-dimensional arena and monitor continuously the
presence of nearby objects in all directions with their sensors, adjusting their
movement direction according to an algorithm based on collision avoidance,
flock separation and flock cohesion rules. The authors verified with experiments
that using appropriate values of the algorithm parameters a flocking behavior
emerges from local robot interactions.
Turgut et al. [56] implemented flocking with real robots, equipped with
an infrared sensing system to measure the distance and relative orientation
of nearby robots and obstacles, and with a digital compass and a short-range
wireless communication module to measure their heading direction and the di-
rection of their neighbors. Flocking is obtained by combining proximal control
and heading alignment. The proximal control behavior uses the infrared sens-
ing system, which is capable of distinguishing neighboring robots from obstacles,
and tries to keep a desired distance from neighboring robots and to avoid ob-
stacles; the heading alignment system measures the heading direction with the
digital compass, periodically transmits this information with wireless messages,
and tries to align a robot with the average direction of its neighbors, calcu-
lated from messages received from the neighbors. The combination of proximal
control and heading alignment produces a desired heading direction, which is
translated into actuator commands regulating robot movements.
Ferrante et al. [66] obtained a flocking behavior using uniquely proximal
control, without alignment control. In their system, the implementation of
proximal control is done analogously to that in [56]; the novelty of this approach
lies in the motion control formulas which translate a desired velocity vector into
forward and angular velocities. In these formulas, robots tend to move forwards
due to a bias term in their forward velocity, but can also move backwards; as
shown by experiments in simulation and with real robots, this motion control
method is able to generate an emergent global direction of movement.
28
In [60], flocking in a three-dimensional space is implemented with real flying
robots. The control algorithm of the robots incorporates a repulsion force to
avoid collisions and an alignment force to align the heading directions of neigh-
bors; relative position and heading of neighbors are obtained via local wireless
communication. The alignment force is defined as a viscous, friction-like force,
which according to the authors prevents possible instabilities of the swarm due
to noise and delays in sensing and communication.
4.3. Analysis
4.3.1. Metrics
Various metrics have been proposed to give a quantitative assessment of
flocking performance. A common and intuitive metric is given by the distance
covered by the center of mass of the robot swarm [56, 65], where the center of
mass at a given time is calculated as the average of robot positions. In [62],
where robots are located in a square arena and are tasked with moving from one
corner of the arena to the opposite corner, the system performance is defined as
a combination of the time required to complete the task, the average distance
traveled by the robots, and the average inter-robot distance; these factors are
combined to form a cost metric that is used to learn optimal values of robot
control parameters. Baldassarre et al. [63] used three statistical measures to
characterize individual and group behavior in a flocking swarm: a “group stabil-
ity index”, a “group role index” and a “rotational index”. In [17], the flocking
behavior is described by measuring the spatial density and the distribution of
heading direction values at specific time instants after the beginning of exper-
iments: by taking different “screenshots” during the course of an experiment,
insightful representations of the swarm behavior can be analyzed. In [60], the
coherence of flocking is calculated by comparing the heading direction of each
robot with the direction of any other robot in the swarm. In [56], Turgut et
al. used a number of metrics to evaluate the performance of their robot control
algorithm: the swarm order, given by the sum of the heading direction vectors
for each robot (normalized with respect to the swarm population), with values
29
approaching 1 indicating an ordered state, while a swarm in disordered state is
characterized by a value approaching 0; the variation with time of the entropy
of the swarm, where the entropy is calculated defining clusters via a maximum
value of inter-individual distance and counting the number of robots belonging
to each of these clusters (a good flocking behavior is one where the change of
entropy over time is approximately zero); the swarm velocity, i.e. the aver-
age velocity of the center of the swarm during an experiment; and finally, the
number of robots in the largest cluster that is not fragmented during flocking.
If the flocking task is defined so that robots should move toward a global goal
direction, performance metrics must incorporate this direction. For example, in
various studies [57, 58, 66, 59] the accuracy value measures how the average di-
rection of robots differs from the goal direction, and in [66] the effective traveled
distance projects the vector representing the displacement of the center of mass
of the swarm on the vector indicating the goal direction.
4.3.2. Models
Starting from the work of Reynolds [55], various mathematical models of
self-organized flocking have been proposed to study the dynamics of natural
flocks or to inspire the design of robot controllers able to reproduce this collec-
tive behavior in artificial systems. These models are all based on the common
principles of inter-robot attraction and collision avoidance, and can include ad-
ditional factors such as velocity matching and presence of a target direction
(which may be known by all robots or only by a fraction of them). Equations
governing swarm behavior can be expressed either via functions of macroscopic
quantities (such as in [17], where a swarm is characterized in terms of local den-
sity of particles), or more often, as control laws for individual robot movements.
However, calculating analytically a performance measure of the flocking behav-
ior achievable with a given robot control algorithm is generally not possible, and
to evaluate the performance of an algorithm existing works have made extensive
use of computer simulations, in addition to performing experiments with real
robots.
30
5. Foraging
5.1. Methods
31
so that the actions executed by a robot are influenced by the activity of the
rest of the swarm. Such communication can be achieved in different forms: via
a shared memory, via local modifications of the environment, and with direct
exchange of information between agents. The following subsections describe
in more details each form of communication giving examples of relevant past
studies.
32
5.1.2. Communication through the environment
Ants and other social animals are known to produce chemical substances
called pheromone and use them to mark the environment. With pheromone,
swarm members establish a mechanism of indirect communication, which uses
the environment as a medium for sharing information. Pheromone presence can
usually be sensed from a short distance, and pheromone-mediated communica-
tion allows overcoming to some extent the limited sensing and communication
capabilities of swarm members. In the foraging task, pheromone can be used,
analogously to shared memory mechanisms, to create trails which facilitate find-
ing places of interest (such as rich food sources) in the environment [70, 71, 72].
In artificial systems, if robots do not have navigation capabilities that al-
low them to find quickly the nest, in order to optimize the return path to the
nest (whose location is fixed) the environment can be marked statically with a
gradient field whose intensity increases near the nest [70, 71]. Static gradient
fields cannot be considered as a type of pheromone communication, because
pheromone intensity in the environment is typically dynamic and determined
by the robot activity. However, there are similarities between these two mech-
anisms, because both produce local modifications to the environment aimed at
guiding robots with limited sensing and navigation capabilities.
A mechanism conceptually analogous to pheromone communication is put
in place when some robots in a swarm assume the role of “markers” of the envi-
ronment, by staying at a fixed location and providing to nearby foraging robots
an indication of the proximity to the nest and to item sources. This approach
works with static robots storing and communicating to foraging robots a value
corresponding to the current pheromone level at their location. For example, in
[73] the pheromone level stored by “beacon robots” changes over time based on
short-range communications with foraging robots, and thus follows the swarm
dynamics similar to what happens with physical pheromone dropped by for-
aging ants; in [74], beacon emitters signal their distance to the nest allowing
robots that carry an item to easily return to the nest following a chain of beacon
33
emitters within communication range of each other.
Another mechanism of communication through the environment can be put
in place using the items being collected, such as when the process of transporting
an item from its source to the nest is done by more than one robot [75]; in
some studies [76, 77], robots are allocated to different regions of the arena
and cooperate with each other by picking up and dropping the items at the
boundaries of these regions.
5.2. Algorithms
In the following, algorithms implemented in past studies dedicated to the
foraging task are described in more details, categorized based on the main mech-
anism used to achieve cooperative behavior.
34
if the reason for which they were created no longer exists, such as when a source
of items is depleted.
In studies where robots can communicate through a shared memory or with
broadcast messages, if a robot shares information about the path it followed
when reaching a place of interest, such information can be used by the other
members of the swarm to optimize their movements. This principle is put in
practice in [67]: robots, which are assumed to be able to determine their spatial
location in the arena, although with limited accuracy, continually record the
path along which they move; when a robot, which initially moves randomly
searching for items, finds an item to bring to the nest, it puts in the shared
memory the path it followed to go from the nest to the item. Other searching
robots modify their movement following the path read from the shared memory,
and then in turn share information on their path when they find an item. With
this algorithm, a map of preferred directions of movement is gradually built
in the shared memory, and allows optimizing the foraging task. In [68], this
scenario is implemented and tested with real robots.
In [69], shared trail information is built gradually in a similar manner as
in [67]. To address interference problems arising when the path from rich item
sources to the nest tends to be overcrowded and robots traveling in one direction
interfere with robots in the opposite direction, the basic path following algorithm
is adjusted so that robots heading toward a target, in addition to being attracted
by the path followed by other robots for the same target, are repelled by paths
used for different targets. For instance, a robot carrying an item and going
toward the nest will tend to avoid the opposite path followed by robots heading
toward the item source.
Methods based on pheromone communication, taking inspiration from ant
behavior, are based on marking the environment with signs meant to guide
robots through the optimal path to their destination. In [70], a mechanism
analogous to pheromone communication is used: robots carrying an item drop
“crumbs” on their return path to the nest, while robots searching for items
are attracted by crumbs. Hamann and Wörn [71] studied a similar scenario,
35
where foraging robots are guided by pheromone that is dropped by robots when
they return to the nest to deposit a collected item. The amount of pheromone
dropped by robots is maximum where the item has been collected and decreases
going toward the nest; with this mechanism, a pheromone gradient is formed in
the arena, and robots follow this gradient when they are searching for items. In
[72], pheromone is dropped by robots on their return path only if they detect
other items near the location of the item they are carrying.
In various studies, the problem of finding a food source or the nest is ad-
dressed with robots forming chains of interconnected individuals which guide
other robots in their search activity, similarly to pheromone trails. In [74], the
nest emits a beacon signal which can be sensed by robots up to a certain dis-
tance; a robot that reaches the limits of the beacon coverage area stops searching
for items and emits a new beacon, thus extending the search area for the other
robots, which may in turn become beacon emitters if they reach the limits of
the area covered by the beacon signal of the first robot. With this mechanism,
chains of beacon-emitting robots are formed which allow searching robots to
explore new areas and efficiently return to the nest. In [85], a chain is formed,
starting from the nest, by robots in physical contact of each other, and basic
behaviors allow robots to execute actions such as forming and extending the
chain, following the chain (e.g. when returning to the nest), and performing
excursions in proximity of the chain (to search for items).
In [73], each robot can either engage directly in the foraging activity, or
become a beacon. Robots acting as beacons form networks starting from either
the nest or an item source; each beacon stores an integer number indicating its
position in the network, with higher numbers corresponding to higher distances
from the nest or an item source. These robot chains create a virtual gradient
field, which can be “sensed” by foraging robots via local communication with
the beacon emitters and can be used to efficiently reach a location of interest.
The above algorithm is integrated in a subsequent work [84] in an adaptive
system capable of utilizing multiple algorithms, dynamically switching at the
swarm level from one algorithm to the next, based on the characteristics of
36
the environment. The virtual gradient-based algorithm, which performs better
when item sources are near the nest, is executed first by robots when starting
the foraging task; if it is not successful (as indicated by all the robots becoming
beacons), then a different algorithm is activated, in which robots form a single
chain starting from the nest and sweep a circular area centered at the nest. If
this algorithm, which is capable of exploring a larger area than the gradient-
based algorithm, is not effective either, then random walk is used as a last
resort.
37
item.
A mechanism similar to bucket brigading is territorial division, where robots
are assigned different working areas. Goldberg and Matarić [76] described a
system where robots are divided in two groups (castes) operating in different
areas: the “search caste” operates outside the nest area and brings collected
items to the boundary of the nest area, while the “goal caste” takes collected
items from the boundary of the nest area and deposits them in the nest. In [77],
the arena is divided in a number of working areas corresponding to the number
of robots. Each robot is assigned a working area, where it searches for items;
collected items are deposited at the boundary of the next working area in the
direction toward the nest, except for the robot in the last working area, which
drops the items directly in the nest. The authors showed how “invasions” of
contiguous areas by robots, due to either localization errors, or the mechanism
of depositing collected items, are a major source of interference between robots,
and determine the existence of a critical number of robots, above which the
foraging performance decreases.
In the scenario described by Pini et al. [75], territorial division emerges
dynamically as a result of task partition between robots. Items to be collected
are grouped in a single location, and after transporting an item robots return to
the location where they found the item; however localization errors cause this
mechanism to be imperfect, thus robots may not be able to return quickly to the
item source. To tackle this problem, each robot transports a collected object for
a limited distance instead of traveling directly to the nest, and relies on other
robots to complete the transportation task; since the amount of localization
errors increases with the distance walked, using this mechanism robots improve
their success rate in finding the item source. The distance to cover is chosen
autonomously by each robot based on the estimation of a cost function which
maps the amount of work executed by the robot to the overall cost of the task
to deliver the item to the nest.
Arkin et al. [83] studied a cooperative transport mechanism where one
item can be carried by more than one robot at a time; when this happens, the
38
process of bringing the item to the nest is accelerated, thus cooperating robots
accomplish the task more efficiently. An explicit communication mechanism is
used by robots carrying an item can signal their status, so that other robots are
attracted by signaling robots and help them to bring the item to the nest.
5.3. Analysis
5.3.1. Metrics
Two widely used metrics to extract a measure of foraging performance from
experimental results are the number of items collected in a fixed time as a
39
function of swarm size [67, 69, 73, 75, 77, 80, 82] and the time needed to collect
a given number of items [70, 76, 77, 79, 81, 83, 72, 84]. In some studies, the
cost of the foraging activity (i.e. the total energy consumed by the swarm) is
quantified, for example by calculating the distance traveled by robots [76, 83];
Krieger and Billeter [78] assigned a different weight to the traveled distance
depending on whether robots are carrying an item, with the assumption that
more energy is consumed when carrying an item; Hoff et al. [73] included also
communication between robots in their cost metric.
Special attention is dedicated to the problem of inter-robot interference,
which arises when multiple robots share a constrained space. The incidence of
this phenomenon can be quantified as the number of collisions between robots
[76, 77], or the time spent in all instances when a robot tries to perform a task
but is hindered by another robot [78].
5.3.2. Models
In the majority of existing studies, analysis of the performance of a swarm
in the foraging task is done with either real robot experiments or computer
simulations, while mathematical modeling is usually not done due to its inherent
difficulty. However, some notable exceptions exist. For example, in [86] it is
shown how a foraging scenario of the type described in [70] can be modeled using
stochastic Petri nets and Markov chain analysis; with this model, performance
metrics such as the probability to bring to the nest all items placed in the
environment within a given time frame can be predicted. In [71], a foraging
scenario based on pheromone communication is modeled with a set of partial
differential equations, which allow estimating the spatial density of robots in
the search state and robots carrying an item, as well as quantifying the flow of
items to the nest based on the rate of transitions to the search state.
A mathematical model that studies the effect of inter-robot interference is
presented by Lerman and Galstyan [87], who showed that in a non-cooperative
system there exists an optimal number of robots which minimizes the time
needed to bring to the nest the items contained in the arena: above this optimal
40
number, the interference between robots outweighs the work executed by the
additional robots. Thus, many efforts have been devoted in various studies to
devising cooperative mechanisms for interference reduction.
6.1. Methods
Similar to the foraging task, in most object clustering scenarios robots are
assumed to be unable to sense the presence of distant objects, thus the typical
behavior when searching for objects is to move randomly; the randomness of
movement is either encoded directly in the robot controller, or the result of
trajectory changes executed by robots when obstacles are encountered. When
robots do not have localization capabilities, given that there are no predefined
locations where objects must be delivered, the walking behavior of robots car-
rying an object is usually similar to the behavior adopted when searching for
objects, thus clusters are formed at random locations, due to the fact that robots
carrying objects move randomly.
Robots used in the clustering task are usually characterized by short-range
sensing (by which only nearby objects can be detected) and short-term memory
(by which a robot can remember the presence of objects encountered in the near
past): at the extreme, a robot can only detect objects at zero distance (such as
when contact sensing is used) and cannot remember previous object encounters.
Given these limitations, the decision of each robot on whether to pick up or
41
drop an object at any given time or location (which determines the emergence
of clusters at the macroscopic level) can only be taken (either deterministically,
or with a stochastic component) based on the limited representation of the
environment allowed by robot capabilities; the algorithms implementing this
decision are often based on intuitive considerations, but in some cases [88, 89, 90]
artificial evolution methods have been used.
In the clustering task, interactions between robots working in the same arena
typically cannot be used to boost the performance of task execution, and in-
creasing the number of robots employed for a given task con only lead to a
sub-linear increase of global performance, due to the possibility of inter-robot
interference. Thus, mechanisms of communication between robots are usually
not considered in clustering studies. An exception is represented by scenarios
where robots have localization capabilities which allow them to build clusters in
a “smarter” way: in this case, coordination between robots becomes important,
and communication mechanisms (either direct [91] or indirect [92]) play a role
in the global clustering dynamics.
6.2. Algorithms
Most existing literature on clustering and sorting tasks makes use of simple
robots with no localization capabilities, where clustering of objects emerges as a
mere result of probabilistic interactions with the environment; in other studies,
robots have the ability to localize themselves, and clustering is done with a more
deterministic approach. Theses two types of scenarios are described in the rest
of this section with an overview of relevant past works.
42
fers inspiring behaviors which can be used as a basis for engineering artificial
systems: for example, some algorithms for cluster formation are inspired by the
behavior of ants during brood sorting.
An often cited paper in this field is the work by Deneubourg et al. [93], who
described a simple algorithm for implementing object clustering by a group of
robots. Each robot moves randomly in the arena, and has a short-term memory
of the objects encountered during its last steps; each time a new object of a
particular type is encountered, robots pick it up with a probability dependent
on the number of objects of the same type encountered previously, with larger
numbers corresponding to lower probability values. Conversely, a robot carrying
an object has a probability of dropping the object which increases for increasing
numbers of objects of the same type encountered during the last steps. The
authors showed with computer simulations how this simple algorithm leads to
the formation in the arena of groups of same-type objects.
In other simulation studies, robots are assumed to be able to sense their
immediate vicinity. For example, in [94] robots and objects are placed on a
two-dimensional grid of square cells, and a robot, which can move from a cell
to another in 8 possible directions, can sense the presence and type of objects
in 3 of the 8 cells surrounding its current position: the cell on its movement
direction, and the cells on its left and right side. When a robot not carrying
any object senses an object in front of it, it picks up the object if the object
type is different from objects on either its left or its right; conversely, an object
is dropped if its type is the same as the objects on either its left or its right.
In [88], Hartmann studied a scenario where robots controlled by a neural
network are placed on a two-dimensional grid of square cells and are able to
detect the presence and type of objects located in each of the 8 cells surrounding
their current position. The outputs of the neural controller allow robots to move
forwards and backwards, to turn left and right, and to pick up and drop objects.
Evolving the neural controller parameters with a genetic algorithm, robots can
achieve good results in clustering both single-type and multi-type objects.
In [95], the clustering task is executed by real robots equipped with a frontal
43
C-shaped shovel that they can use to push up to two objects at the same time.
Robots move in a straight line until either they sense an obstacle, or their shovel
indicates that more than two objects are being pushed: in both cases, robots
change randomly their direction of movement, but in the second case robots
first release the objects being pushed. With this simple mechanism, robots are
able to cluster all objects in a single group.
Martinoli et al. [96] developed an algorithm to position objects in a chain
formation: robots equipped with a gripper and a set of infrared sensors are able
to estimate the size of obstacles in front of them; small obstacles are considered
objects to be clustered, while robots and arena boundaries are perceived as large
obstacles; when an object is detected, the robot picks it up if it is not carrying
one, otherwise it drops the carried object: when a large obstacle is detected,
the robot avoids it. With this algorithm, groups of two or more objects are
detected as large obstacles, and thus avoided, unless a robot approaches them
from an angle such that the first object hides the others; as a consequence of
this behavior, clusters are built as chain formations, instead of compact groups,
as demonstrated by the authors with real-robot experiments and simulations.
Another set of experiments with physical robots is described in [97]. In
this scenario, a robot cannot distinguish between the objects to be clustered
and other obstacles such as arena walls and other robots: when the proximity
of any type of obstacle is sensed, an avoidance mechanism makes the robot
steer away from the obstacle. Moving of objects to be clustered is obtained
by deactivating the frontal infrared sensor, so that robots cannot sense the
presence of small objects located frontally: such objects are thus pushed until
another obstacle is sensed in a direction different from the frontal direction.
This approach allows obtaining clusters, but as noted by the authors, when
an object is pushed against a wall, it becomes “lost”, thus experiments always
terminate with a non-negligible amount of objects not clustered.
In the study by Holland and Melhuish [98], robots are able to push objects
and sense obstacles, and the clustering behavior is achieved using an algorithm
similar to that in [95]. In addition, robots can distinguish objects of different
44
types, and have the ability to pull one object with a gripper when moving
backwards. Using these capabilities, the authors demonstrated that robots are
able to sort objects of two types, forming an inner cluster of objects of the
first type, surrounded by objects of the second type. This spatial organization
emerges when robots pull backwards by a specified distance objects of the second
type before releasing them.
In [89], the object sorting method of [98] is extended to more than two
types of objects, by using a different pullback distance for objects of different
types. In order to achieve good results in terms of both separation between
types and cluster compactness, the pullback distance values for each type are
calculated dynamically by robots based on the rate of encounters of the differ-
ent object types during the clustering activity. Optimized parameters of the
distance adaptation algorithm are found by the authors using an evolutionary
approach.
Vardy [99] used simulated robots with vision capabilities to achieve clus-
tering of multi-type objects differentiated by color. Similar to the experiments
described in [95], object pickup and deposit is obtained with forward and back-
ward movements using a C-shaped shovel. A robot estimates the presence of
objects on its trajectory by counting the pixels of specific colors in its field of
view; if it is carrying an object of a given color and the number of pixels of the
same color exceeds a defined threshold, then it releases the object. Two varia-
tions of the algorithm are presented: in the basic version, in absence of obstacles
robots typically move in straight lines, thus their interactions with the objects
are consequence of random encounters; with an enhanced algorithm, robots ex-
ploit their vision system to accelerate the clustering activity, by identifying the
direction of a target object to pick up or a cluster where to deposit the object
being carried, and moving toward the target.
In [100], single-type object clustering is obtained using a simple neural net-
work with 2 inputs and 3 outputs. Robots push objects with a frontal shovel,
and are able to detect and count the number of objects in two detection ar-
eas, one in immediate proximity and one farther away; the numbers of objects
45
counted in each area are used as input to the neural controller, which selects one
of 3 basic behaviors: BackUpAndTurn, Turn and MoveStraightAhead. Weights
of the neural network are calculated with a supervised learning technique.
Gauci et al. [90] used robots with a sensor able to detect the presence of
another robot or an object in line of sight; based on their sensor input, robots
react by setting a given angular velocity to their left and right wheels, thus
determining their direction of movement and the curvature of their trajectory.
Objects are moved by pushing them. The optimal values of the angular velocity
of the wheels for each possible value of the sensor input (i.e. an object detected,
a robot detected and nothing detected) are selected via artificial evolution. The
authors showed that object clustering can be achieved with this method provided
that robot sensors have a sufficiently long range; interestingly, clustering can be
obtained (although less efficiently) also if the sensors are unable to detect other
robots or to distinguish robots from objects.
46
6.3. Analysis
6.3.1. Metrics
Quantitative measures of task execution performance often used to assess
the effectiveness of an algorithm include the number of clusters being created
[95, 97, 96, 91], the average cluster size [96, 100], the size of the largest cluster
[95, 96, 100, 91, 90], and the time for task completion [95]. Task completion
can be defined as the achievement of a condition where there is a single cluster
per object type (in scenarios where this can be obtained with a significant
probability), or the number and size of clusters is “satisfactory” according to
given criteria. In [94], the percentage of task completion in presence of multi-
type objects is indicated by a performance metric where each object type can
be given a different weight; the metric is calculated by measuring the size of all
clusters and putting it in relation with the total number of objects of a given
type. A similar metric is used in [92], where only the largest cluster of objects
of each type is considered in the percentage of task completion. To evaluate the
47
temporal pattern of task progress during an experiment, Vardy et al. [92] used
another metric, called time-weighted completion, which tracks the percentage
of completion during the entire experiment.
With multi-type object clustering, the performance of task execution is de-
termined not only by how objects of the same type are clustered together, but
also by how objects of different types are separated from each other. In [89, 88],
these considerations are expressed in metrics that include a compactness com-
ponent (measuring the quality of clusters of same-type objects) and a separation
component (measuring the degree to which objects of different types are sepa-
rated from each other).
As mentioned earlier, in the clustering task the presence of multiple robots
working simultaneously usually does not bring a super-linear increase of perfor-
mance: on the contrary, inter-robot interference is likely to cause a decrease in
the amount of useful work that can be done by each robot. The impact of this
phenomenon can be described quantitatively by counting the number of events
where the interaction between robots prevents them from doing useful work; as
can be intuitively understood, given a fixed arena size the number of such events
per robot increases with the number of robots [98], thus the total number of
interactions increases more than linearly with the number of robots [95].
6.3.2. Models
One of the few attempts to build a complete mathematical model of the
clustering dynamics is represented by the work of Martinoli et al. [96], who
developed a probabilistic model and applied it to two different clustering scenar-
ios. The model represents the clustering activity as a sequence of probabilistic
events, such as a robot entering the detection area of a cluster, or an object
being added to or removed from a cluster; the probability of occurrence of each
type of events is calculated from the robot control algorithm using geometric
considerations. As shown by the authors, the model is able to predict the clus-
tering dynamics (using metrics such as the average cluster size, the number of
clusters, the size of the largest cluster, or the time needed to obtain a single
48
cluster) in good agreement with experimental results.
A detailed formal analysis of the object clustering problem is presented by
Kazadi et al. [101]. In their study, the evolution of clusters is expressed as a
function of the probability to add or remove an object to/from a cluster; the
authors found that the ratio between these two probabilities, which is a function
of cluster size since generally both probabilities depend on the cluster size, and
is denoted by g in [101], is a fundamental property of object clustering scenarios
which determines the equilibrium state of the system (expressed as the number
of clusters and their size) starting from given initial conditions. This means
that it is possible to obtain the desired clustering dynamics in a given scenario
by appropriately choosing the g function in the robot implementation.
7. Navigation
7.1. Methods
Given that each robot in a swarm has a limited knowledge of the local
environment in which it operates, the key to a successful navigation strategy in
a multi-robot scenario is sharing knowledge between robots. Such knowledge
usually consists of an estimate of the distance to the target, which can be
expressed as Euclidean distance [102, 103, 104], as a “hop count” [105, 106,
107], or through some other means. Communication between robots is usually
direct short-range communication, but can be achieved also indirectly through
the environment [107]. Since a robot cannot always assume the presence of
nearby robots with which it can communicate or nearby “messages” left in the
49
environment by other robots, typically a random component is present in robot
controllers, by which a robot moves randomly until it receives some information
that helps it to optimize the navigation task. Once initial information has been
received, robot behavior is usually more deterministic, because this information
allows it to determine an optimal direction of movement.
7.2. Algorithms
50
from which the target location can be easily reached. A similar approach is
used in [106], where wireless messages exchanged by robots produce a sort of
pheromone gradient, using an analogy with chemical pheromone utilized by ant
colonies. Following this gradient, the target location can be easily reached from
any position where there is at least one robot in communication range.
Another navigation algorithm where robots position themselves in the en-
vironment in order to allow another robot to reach its destination has been
proposed by Mullins et al. [108]. In this scenario, the target location emits an
audio tone perceived by robots within a limited range; robots are able to mea-
sure the sound intensity, but cannot perceive its direction. To reach the target,
robots implement a search mechanism inspired by the behavior of bacteria which
move toward areas with higher presence of nutrient. A robot alternates between
two states: tumble and run; in the first state, the robot makes a random turn,
while in the second state the robot moves on a straight line for a time duration
which depends on the variation of sound intensity measured during the move: if
the sound increases, the robot moves for a longer duration compared to the case
of decreasing sound intensity. With this approach, under the assumption of a
bounded arena, the robot is always able to reach the target without help from
other robots, even if its initial location is outside the range of the audible tone,
but it may require a long time to accomplish the task. To improve the efficiency
in reaching the target, a distributed algorithm involving the other robots in the
arena is proposed: the searching robot emits a specific sound indicating that it
needs to reach the target; robots hearing this signal propagate the signal and
begin moving randomly, provided that they are not in range of the target; if a
robot is near the target (as indicated by the perception of the audio tone from
the target), it stops and retransmits the tone at a higher frequency; in turn,
other robots hearing the new tone stop and retransmit the tone at a higher
frequency; in this way, a tree-like structure of robots is formed starting from the
target. The searching robot can navigate through this tree with the algorithm
based on the tumble and run states described above, changing repeatedly its
target during the navigation according to the lowest frequency tone perceived.
51
The studies described so far employ algorithms in which robots in the swarm
are dedicated to the navigation task of another robot. While this behavior
minimizes the effort of the single agent to reach the target location, it also
potentially utilizes a large amount of resources in the swarm. For this reason, it
is desirable for robots to be able to execute their own tasks while still providing
help to other robots for navigation. Wurr and Anderson [107] approached the
navigation problem using numbered markers dropped by robots in the arena,
by which a robot can reach a target location following the sequence of markers
from the highest numbered to the lowest numbered. A robot which locates the
target drops a marker at its current location if it perceives no other markers in
sight; the number assigned to the marker indicates its closeness to the target.
Analogously, a robot which locates a marker and perceives no other higher-
numbered markers in sight drops another marker. With this mechanism, trails
of markers are formed in the environment, providing a means for a robot to
reach the target without requiring other robots to assume the role of static
landmarks.
52
munication capabilities limited to line-of-sight conditions, and use a navigation
mechanism based on the following algorithm. A robot which has sensed a target
location shares this information with other robots in line of sight, so that robots
seeking the target can use the location of the first robot as a temporary target
and head toward it in order to get closer to the real target. This mechanism
is extended to chains of robots in line of sight of each other, each of which
transmits information on the estimated distance to the target, calculated either
directly (if the target is in line of sight), or by summing the distance to the next
robot in the chain with the distance to the target advertised by that robot. The
algorithm does not require robots in a chain to be stationary, and deals with
moving robots introducing the concept of “ghost” robots: if a robot seeking a
target has sensed another robot which transmits information on the estimated
target distance, and if the transmitting robot moves in the environment such
that it either becomes invisible to the first robot, or increases its distance to the
target, the first robot moves toward the previous location of the second robot,
i.e. it moves as if there was a “ghost” robot still transmitting its target distance
from that location.
To deal with the constantly changing connectivity structure between commu-
nicating robots in a swarm (due to the fact that robots move and their commu-
nication range is limited), Ducatelle et al. [103] proposed solving the navigation
problem using a dynamic routing algorithm where the route between a robot
and a target location (expressed as a sequence of intermediate nodes) is contin-
uously updated and improved based on the current configuration of the swarm;
the algorithm requires the capability of robots to estimate the relative position
of neighboring robots, and is based on the transmission of short-range wireless
messages that carry routing information associated with path quality informa-
tion; path quality is measured from the relative distance between neighboring
robots. In subsequent studies [109, 110, 104], the authors used for the same sce-
nario a routing algorithm where navigation information is dynamically adjusted
by each robot based on odometry measurements; this algorithm is shown to be
effective also in situations with intermittent connectivity between robots, such
53
as cluttered environments with low robot densities.
In [111], the collective navigation strategy is based on a signal propagation
technique. Robots in the detection range of the target emit a light pulse for a
fixed time duration; neighboring robots are able to detect the light pulse, and
in turn emit the signal; after signal emission, each robot enters in a “refractory”
state, where it is insensitive to received signals. If there is at least one robot in
the detection range of the target and the robot density in the arena is sufficiently
high, this mechanism produces signal waves propagating from the target through
the swarm. Each robot is able to determine the direction of a detected light
pulse, and can thus navigate to the target by following this direction.
7.3. Analysis
54
tially significant use of resources in the swarm; in [108], this cost is quantified
as the total time committed to the navigation process of a single robot by all
swarm members, and with simulation experiments is shown to exceed the aver-
age time the single robot would need to reach the target using a simple random
walk technique. For these reasons, when implementing a distributed navigation
algorithm a tradeoff between navigation performance and total use of swarm
resources may need to be considered.
8. Path formation
Path formation in swarm robotics refers to a process where robots are able
to build collectively a path between two locations in the environment, so that
the time needed to reach one location from the other is minimized. This task
can also be referred to as chain formation, because often the path is marked
by a chain of robots, either stationary or moving. As described in the section
dedicated to the foraging task, path formation mechanisms can be observed in
many instances in the foraging context, because often robots share locations of
interest and thus can benefit from sharing information on how to reach those
locations. Studies reviewed in this section focus on scenarios with two target
locations, where robots must be able to move efficiently from one location to
the other.
8.1. Methods
In past studies, the path formation problem has been addressed using mainly
pheromone-based, probabilistic, and evolutionary methods. Pheromone com-
munication has been extensively studied for the foraging task, where it is used
to dynamically build optimized paths between the nest and rich food sources
[70, 71, 73]; a mechanism analogous to the use of pheromone is put in place when
robots exchange local messages whose content provides an indication of the cur-
rent position of communicating robots with respect to the path being formed
[112]. In probabilistic methods, a chain of robots is created as a sequence of
55
stochastic events determined by robots continuously joining and leaving the
chain; this dynamic process allows robots to explore new areas in the environ-
ment until both locations of interest are found and an optimized path between
them is formed [113]. In evolutionary methods, robot behavior is evolved ac-
cording to a fitness function that measures the quality of a path established
between two locations: for example, in [114] robots are controlled thorough a
simple neural network whose parameters are evolved with a genetic algorithm,
and the fitness function rewards robots based on how many times they travel
from one target location to the other.
8.2. Algorithms
The following brief survey describes some examples of path formation algo-
rithms differentiated based on whether the path between the two target locations
is marked by a chain of stationary robots, or is sustained by a continuous flow
of robots moving along the path.
56
information to other robots. As demonstrated by the authors, this algorithm
works also in challenging environments such a maze-like arena.
In [113], the path between two locations is formed by robots communicating
through colored LEDs and sensor cameras. The algorithm is based on first
finding one of the two locations (the “nest”) and then progressively forming
sequences of inter-communicating robots departing from the nest, until the other
location is found. Two variations of the algorithm are proposed: with the first
variant, one or more linear chains are formed starting from the nest until one of
them comes in contact with the other location; in the second variant, a robot
can join an existing chain at any position, so that tree-like structures can be
formed. In both cases, robots at the end of a chain can leave the structure at any
time, using probabilistic rules. The task is considered successfully completed
when a robot in one of the chains departing from the nest senses the other target
location.
57
direction of movement; inputs of the neural network are a sensor camera and
infrared proximity sensors. Selecting the parameters of the neural controller
with an evolutionary method, the authors obtained the emergence of a collective
behavior where robots form two parallel “traffic” flows between the two target
locations, repeatedly traveling from one location to the other. Such evolved
behavior is made possible by the interaction between robots (through their col-
ored LEDs), which allows the path between the target locations to be sustained
by the continuous flow of robots; in this respect, an interesting analogy can be
found with pheromone-based trail formation in ants.
8.3. Analysis
58
moving in either direction, two separate phases can be identified in the swarm
dynamics: first, randomly distributed robots “explore” the environment search-
ing for the two locations of interest and establishing a path between them (initial
coordination period), then the established path is sustained (and potentially op-
timized) by the robot flow. Sperati et al. [114] evaluated their path formation
method by calculating a measure of energy that takes into account the number
of times a robot travels from one location to another in a given time interval,
compared to the maximum attainable number given the distance between the
two locations and the robot speed; this evaluation, done after the initial coor-
dination period (whose duration is chosen arbitrarily), gives a measure of the
optimality of the chosen path and the ability of robots to follow it without neg-
atively interfering with each other. From results of multiple experiments, the
authors found a correlation between the path distance and the optimal swarm
size, indicating that a larger swarm performs better with a larger distance.
9. Deployment
9.1. Methods
59
latter case, robot movement can be determined following principles of artifi-
cial physics [120, 121], with the objective of preserving connectivity between
swarm members [106, 116, 117], or to obtain formations described by a specific
geometric relationship between neighboring robots [124, 126, 119].
9.1.2. Stigmergy
Stigmergic communication gives robots moving in an area of the environment
an indication of the actions done previously by other robots in the same area. In
various scenarios (e.g. in the foraging task), it is used with a positive feedback
mechanism, i.e. an action done by a robot increases the probability that the
same action is repeated by other robots. Conversely, for the deployment task
a negative feedback mechanism can be put in place, preventing different robots
from repeating the same action, and specifically preventing the same areas of
the environment from being explored multiple times, or (in tasks where the
environment must be covered repeatedly) maximizing the time between two
successive visits to the same place.
Stigmergic communication has been implemented in past works using simu-
lated pheromone traces [128, 129, 130, 131, 132]: the presence and intensity of
pheromone at a given location is used as an indication that the location has been
visited before. In some studies, pheromone is assumed to evaporate over time,
analogously to what happens in nature with chemical traces, and this property
is used to optimize repeated coverage of the same area [128], or to dynamically
assign non-overlapping patrolling areas to different robots [132].
9.2. Algorithms
A survey of past works in the deployment task is presented in the following,
where two main variations of the task (dispersion and pattern formation) are
identified and used to categorize the different studies.
9.2.1. Dispersion
In the dispersion task, swarm members must position themselves away from
one another, with the objective of maximizing the area covered globally by the
60
swarm and/or minimizing the time needed to cover the area. A robot dispersion
technique can be applied for example in scenarios where robots must find either
particular locations in the environment or objects located in unknown places
(as in the case of foraging robots), thus the dispersion task can be used as a
sub-task of more complex activities. In some cases, an additional constraint is
given by the requirement that the connectivity of the swarm must be preserved,
i.e. each robot must be able to sense or communicate with at least another robot
so that there are not isolated groups; the practical effect of this constraint is to
limit the maximum area that can be covered simultaneously by the swarm.
It is intuitively understood that programming robots so that they avoid each
other while moving in the environment increases the capability of the swarm to
cover a large area, compared to a simple random walk technique. Morlok and
Gini [123] evaluated the performance of four basic algorithms (random walk,
wall following, avoiding all obstacles, and avoiding other robots) in maximizing
the area covered by a swarm of robots in closed environments. From simulation
results, the algorithm where robots avoid each other performs better than the
other algorithms, indicating that knowledge of the location of nearby robots is
useful for dispersing the swarm. Also the study done by Batalin and Sukhatme
[115] gives experimental evidence that control algorithms where nearby robots
are distinguished from other obstacles perform better than schemes where robots
are simply considered as obstacles.
To overcome the problem where a robot with limited sensing abilities is
unable to determine whether a given location has been visited before, various
stigmergic techniques have been studied. Wagner et al. [128] devised some al-
gorithms where robots drop evaporating pheromone along their path, and when
choosing their walking path give precedence to areas with the lowest pheromone
level. The techniques described in [128] rely on partitioning the environment
in a discrete set of tiles, and modeling the partitioned arena as a graph where
vertices represent the tiles; in [130], an analogous algorithm is proposed, which
operates on continuous domains as opposed to graphs. In [129], experiments
with real robots are reported, where the pheromone is represented by ink trails
61
drawn by robots along their path with a pen. Kuyucu et al. [131] used a genetic
algorithm to evolve a set of parameters (e.g. pheromone production rate) which
influence the swarm performance in the deployment task: according to simula-
tion results, parameter values obtained with the evolutionary method lead to
better performance compared to manually tuned values. In [132], stigmergic
communication allows a group of robots to coordinate to dynamically partition
an area in contiguous territories, with each territory patrolled by one robot; an
adaptive variant proposed by the authors to the basic algorithm is shown to
allow swarm members to dynamically learn an optimal size of their respective
territories based on the arena size and the total number of robots.
In the context of navigation, Payton et al. [106] described a deployment tech-
nique based on short-range infrared communication, where robots are able to
approximately measure reciprocal distances while within communication range
of each other. In the “gas expansion” model, robots initially grouped in a small
space explore the environment trying to maintain a target distance with neigh-
boring robots in order not to lose connectivity. If the total number of robots is
not sufficiently high to cover the entire arena, this model can be complemented
by a “guided growth” model, such as when one robot starts to move away from
the swarm to explore uncovered areas of the arena and doing so “pulls” the
entire swarm, because the other robots follow the exploring robot in order to
keep connectivity with it and between each other. A similar approach for robot
dispersion and for exploration of new areas is presented in [116].
As opposed to tasks where robots must continuously move in order to cover a
given area (typically employed when the arena size is much larger than the union
of the sensing areas of all robots in the swarm), other studies focus on tasks
where the swarm reaches an equilibrium status with robots occupying static
positions. To regulate the mutual distance between robots where the objective
is to disperse the swarm, some studies utilized artificial physics methods based
on the concept of virtual potential fields and virtual forces. Reif and Wang
[133] introduced the concept of social potential fields, which reflect the “social
relations” between robots performing distributed tasks. Howard et al. [120]
62
proposed a control law for the velocity of robots based on a potential field
determined by the presence of other robots and obstacles. Each robot in close
proximity of other robots or obstacles is repelled by nearby entities, and moves
according to the virtual force determined by this repulsion. This mechanism
leads the swarm to optimize the occupation of the arena according to the total
number of robots. Podury and Sukhatme [121] used potential fields to maximize
the area covered by a swarm of robots with defined sensing and communication
ranges, with the constraint that each robot must stay within communication
range of a minimum number of other robots. In [127], the task of maximizing
the area covered by a swarm of connected robots is tackled with an automatic
design method using probabilistic finite state machines, where parameters of
robot controllers are selected with an optimization algorithm.
In many deployment tasks, and especially in those using potential field ap-
proaches, it is assumed that a robot can measure with a reasonable precision
the distance and relative orientation of nearby robots. When using real robots,
this capability is usually offered by infrared technology, which offers line of sight
communication with highly directive signal radiation patterns with known at-
tenuation characteristics. In [117], the swarm deployment task is performed
by robots using radio frequency communication, characterized by a much less
predictable mapping between signal strength and distance; in addition, relative
orientation cannot generally be inferred from the received signal. Despite these
difficulties, the algorithm proposed by the authors is successful in dispersing a
robot swarm in the environment.
63
neighbors allows a high degree of flexibility in determining the desired positions
of neighbors, from which multi-robot formations can emerge. Thus, by using
local rules, if each robot in a swarm positions itself with the purpose of obtain-
ing a desired distance and orientation with respect to neighboring robots (i.e.
forming a geometric shape with its neighbors), at a global level the swarm can
converge to a state where it is deployed optimally in the environment.
In [122], an extensive analysis is performed on the dynamics of formation of
different patterns with robots controlled by virtual forces. The authors described
how two- and three-dimensional hexagonal lattices of self-controlled particles
can be obtained using attraction and repulsion forces. In addition, particles are
subjected to a viscous friction force, which is proportional to the particle speed
and whose purpose is to avoid continuous oscillations around an equilibrium
state. The authors demonstrated with computer simulations that starting from
particles in random locations the system can converge to a state where particles
form a hexagonal lattice. Mikkelsen et al. [125] used a similar virtual force-
based approach to obtain hexagonal lattice formations; their model includes the
effects of communication errors between neighboring robots which occur due to
the limited range and non-uniform coverage of the infrared sensors. Another
study utilizing virtual forces is the work by Mathews [118], where formation
of triangular grids is obtained; this pattern is shown to optimize area coverage
by robots with given sensing and communication ranges, provided that the
communication radius is sufficiently large compared to the sensing radius.
In [126], each robot chooses two other robots among its neighbors, and then
positions itself so as to form a triangular shape with those neighbors. The
distance between the robot and its neighbors is chosen based on a measured
local characteristic of the environment. If all robots operate with the same
algorithm and if the environment characteristic which determines the distance
has the same value in the entire covered area, this technique leads the formed
triangles to be equilateral, and thus a regular mesh pattern is observed at the
swarm level. In [124], an analogous technique is used in a three-dimensional
space, where each robot selects three neighbors and tries to form a tetrahedron.
64
In [119] desired formations are obtained by defining weight functions which
vary with the spatial position relative to robots. The control law used by each
robot for achieving a specific formation is determined with a consensus-based
approach by calculating the corresponding weight function on the relative po-
sition of each neighbor. As opposed to methods based on potential fields, a
consensus-based approach is more robust in reaching an “agreement” between
neighboring robots on their relative positions, even in presence of communica-
tion delays.
9.3. Analysis
9.3.1. Metrics
Evaluation of the performance of a self-deployment algorithm is usually car-
ried out by executing experiments where robots are initially either clustered at
a specific place, or scattered randomly in the arena. In scenarios where the
arena size is much larger than the union of the sensing areas of all the robots in
the swarm, the deployment task is executed with robots visiting in succession
different areas of the environment until the entire arena is covered. Empirical
results to evaluate how a given algorithm works in practice can be expressed as
the percentage of arena space visited by robots in a given time [115, 123], or
as the time needed to cover the entire arena [128, 129, 130, 131]; often, these
metrics are expressed as a function of the number of robots, to evaluate how
inter-robot interactions affect task execution.
In scenarios where the goal of the deployment task is to reach an equilibrium
state where the swarm assumes a static configuration, a good indication of the
performance of an algorithm is obtained by analyzing the temporal evolution
of the area covered by the swarm [120, 121, 117, 132, 118]: a faster dynamics
where the swarm reaches the equilibrium in a small time usually indicates a
more performing algorithm. To evaluate the quality of the equilibrium state,
especially in pattern formation techniques, some useful measures are the distri-
bution of inter-robot distance [125] and the mean square error between desired
and actual distance [119].
65
9.3.2. Models
In various studies proposing self-deployment algorithms, researchers have
been able to demonstrate with formal analysis the validity of their proposed
approaches. For example, for the dispersion task Wagner et al. [128] and
Osherovich et al. [130] calculated analytically an upper limit to the time needed
for a swarm of robots to cover the arena with their self-deployment algorithms;
for the pattern formation task, Lyapunov theory has been used in [126, 124] to
prove the ability of the swarm to converge to an equilibrium state characterized
by the desired spatial relationships between robots.
In [129], where robots are programmed to cover an arena repeatedly, the
deployment task is modeled with a modified version of “node counting”, which
is an algorithm for executing searches on a connected graph; even though node
counting operates on a discrete space, it is shown to provide a theoretical foun-
dation to predict the performance of the proposed robot deployment algorithm,
which operates on a continuous domain. Moreover, the authors reported an
interesting analysis of the effect of pheromone marking: if pheromone dropped
at a given place does not evaporate, a saturation effect causes the performance
of area coverage to decrease with time, thus the average time needed to cover
the entire arena increases with the number of times the arena is being covered;
if pheromone is subject to evaporation, this drawback can be partially avoided.
The choice of the evaporation time is thus subject to a tradeoff between the
ability to detect previously visited areas and the ability to estimate when a
given area has been visited last.
Usually, swarm systems allow agents to execute collective tasks more effi-
ciently than each individual alone can do; in some instances, the task at hand
cannot be executed by any single individual, but requires cooperation between
multiple individuals. A typical example taken from insect societies is the re-
trieval of large food items by groups of ants: depending on the item size, this
66
task may require a large number of ants, which must work in coordination in
order to bring the task to successful completion (i.e. transport the food item
to the nest). Observation of ant behavior has shown that such coordination is
achieved without central control and with each ant using simple rules governed
by local interactions with other ants and with the environment. Even though
the task may not be completed with optimal efficiency (e.g. multiple unsuccess-
ful attempts may be made by small groups of ants before recruiting other ants),
the absence of global knowledge and centralized control allows the system to be
fault-tolerant and to work with simple individual agents.
Collaborative manipulation refers to swarm robotics tasks where groups of
robots work together to manipulate objects in the environment. As shown by
ant behavior, such tasks can be done with robots obeying simple rules without
central control, and thus are another example of global dynamics obtained with
swarm intelligence principles. In the rest of this section, some past works on
collaborative manipulation tasks are presented.
10.1. Methods
67
along this trail, going toward the item to be transported. A similar principle can
be applied in artificial systems, as demonstrated in [137] where pheromone com-
munication is used by robots to transport large items. An alternative means
to recruit workers is studied in [138], where explicit communication between
robots is implemented with LED signaling. In other studies [139, 140], collab-
orative object transport is obtained engineering robot controllers with artificial
evolution.
A common characteristic found in object manipulation problems is that
robots in proximity of an object that cannot be manipulated because of an
insufficient number of nearby robots tend to stay in proximity of the object,
waiting for other robots to arrive. In scenarios where the number of objects is
high compared to the number of robots needed to manipulate them, this behav-
ior can lead to a stall where all robots are waiting for other robots. To avoid
this potential deadlock, some object manipulation studies [141, 142] adopted
a mechanism by which robots can abandon the object they found and restart
searching for other objects with a potentially higher number of nearby robots.
10.2. Algorithms
Two main types of object manipulation problems can be found in the existing
literature: tasks where robots must collectively transport large objects, and
tasks where sticks must be pulled from the ground. The rest of this section
contains a survey of a representative sample of past works addressing these
problems.
68
decrease the probability of negative interference is by using an attractive force
between robots. In the scenario described in [134], a group of robots must find
a box located in the arena and push it at one edge of the arena. The robots
are programmed with two basic control mechanisms: first, a common goal with
non-interference, by which robots are attracted by the box and avoid collisions
with each other; second, a follow behavior, by which robots tend to follow other
robots located in front of them. The second mechanism is found to increase
the probability that the distribution of robots around the box is asymmetric,
meaning that the overall force exerted by all robots makes the box move. With
these two control mechanisms, the robots are shown to be able to collaboratively
push the box.
As described above, if in a box pushing task there is not a specific direc-
tion along which the box should be moved, different robots may push it from
different directions, thus negatively interfering with each other. If there is a
predefined goal location to which the box must be moved, and if robots are able
to determine the direction toward the goal, this negative interference can be
avoided. In [135], the goal location is identified by a light source, and robots are
able to move the box to its goal by pushing it only if they sense the light emitted
by the goal with their frontal sensors, and repositioning themselves otherwise.
In [136], robots are equipped with a vision system which can sense the goal from
different angles, and the height of the box to be pushed is such that the box
occludes the view of the goal location when robots are pushing along the correct
direction. Exploiting this fact, the robot control algorithm ensures that robots
push the box only when the goal location is not in their field of vision.
Fujisawa et al. [137] used pheromone communication to replicate ant be-
havior with real robots, utilizing ethanol as an artificial pheromone: robots are
equipped with a tank and a pump for laying down ethanol trails, and with a
sensor able to perceive the presence of ethanol. According to experimental re-
sults, communication between robots through artificial pheromone improves the
performance of the robot swarm in transporting large items.
In other studies, robots are equipped with grippers (which can be attached
69
to other robots or to the object to be transported) mounted on a rotating turret,
so that object transportation can be done on an arbitrary direction compared
to the relative orientation of robots with respect to the object. Using their
grippers, robots can physically connect to each other and to the object, and
thus exert force on the connected entity. In [138], large items are transported
by a group of robots surrounding the item. In order to recruit other robots for
the transport activity, robots use direct communication with light signals: a
robot which perceives an item to be transported turns on an LED; if another
robot searching for the item senses the LED, it stops searching, follows the robot
with the LED on and in turn activates its LED, thus potentially recruiting other
nearby robots. When the item is surrounded by robots, it can be transported
collaboratively. Groß and Dorigo [139, 140] used robots controlled by a neural
network whose sensory capabilities include an omnidirectional camera which
can sense objects to be transported, other robots and the target location for
the objects. By selecting neural network parameters with artificial evolution,
and assuming that the target can be sensed from any location in the arena, the
swarm is shown to be able to transport to the target location objects of different
sizes and shapes.
70
in advance, the execution performance of this task depends on the random
interactions between robots and the environment.
In [142], the stick pulling task is executed with reinforcement learning tech-
niques where the timeout parameter value is updated dynamically based on a
reward determined by successfully pulled sticks. Both local and global reinforce-
ment schemes are analyzed. Experimental results show that learning improves
the swarm performance with both schemes; local reinforcement, where each
robot updates its timeout independently, is found to be more effective because
it leads to specialization.
10.3. Analysis
10.3.1. Metrics
In simple tasks where the objective is to find an item in the arena and
transport it to a predefined location, the performance of an object manipula-
tion algorithm can be measured by the time needed to transport the object to
its destination [136]; if the nature of the task allows implementing a successful
collaboration mechanism, this metric, when evaluated as a function of the num-
ber of robots, shows decreasing values for increasing swarm size [137], while in
presence of inter-robot interference task completion time tends to increase with
the number of robots [135]. To measure the quality of swarm behavior, Chen
et al. [136] used the path efficiency, defined as the ratio between the length of
optimal path from the initial object location to its destination and the length
of the path actually followed by the object. If experiments are run for a fixed
time, the distance by which the transported object approaches the target loca-
tion during an experiment [139, 140] and the task success rate [138] provide a
good indication of swarm performance.
Robot implementations for the stick pulling task are usually evaluated by
measuring the collaboration rate, defined as the number of sticks successfully
pulled from the ground per unit time [141, 142, 144]. Plotting the collaboration
rate as a function of swarm size not only indicates that more robots are able to
pull more sticks, but shows a super-linear increase of performance, because the
71
collaboration rate per robot increases with the number of robots. Conversely,
the failed collaboration rate, i.e. the rate of occurrence of events where a robot
abandons a stick, is shown to decrease with increasing swarm size [141].
10.3.2. Models
In general, collaborative manipulation tasks require multiple robots to be
located in the same place at the same time, and can be modeled as stochastic
processes, whose dynamics is regulated by the characteristics of the environment
and the robot control parameters. Probabilistic models for the stick pulling
task are proposed in [141, 144], and are shown to provide good agreement with
experimental results in characterizing the swarm performance. Defining an ex-
periment as a series of stochastic events (e.g. when a robot encounters a stick
or another robot), with probabilities calculated based on geometrical considera-
tions, the evolution of the state of each robot can be modeled as a probabilistic
finite state machine, whose probabilities of state transitions define the swarm
dynamics and allow calculating the expected value of performance measures.
Lerman et al. [145] proposed a generic probabilistic model for predicting the
dynamics of a swarm robotics scenario where the execution of object manipula-
tion tasks requires the simultaneous presence of more than one robot. In their
model, when a robot encounters an object to be manipulated, it stops near the
object and waits for other robots to arrive; if the number of robots required
to execute the task arrive within a certain amount of time, then the task can
be completed successfully, otherwise the robots near the object time out and
restart moving in search of other objects to be manipulated. The rate of suc-
cessful task execution thus depends on various factors such as robot and object
density in the arena, number of required robots for a single manipulation, and
the time robots wait before deciding to abandon an object. The authors applied
this model to the stick pulling task and showed that it is able to predict the
dynamics encountered with simulations and real robot experiments.
72
11. Task allocation
11.1. Methods
73
In task allocation methods applied to a foraging scenario, observed quantities
can be related to the energy stored in the nest, so that robots begin searching
for items when the energy level falls below a given threshold [78], or can be cal-
culated based on the amount of time spent by each robot in previous searches
[146], or more generally based on previous experience gained during the foraging
activity [147, 148]. In similar scenarios where robots can measure the success of
the activity in which they are engaged, threshold values can be calculated from
the success of previous activities [149]. Besides observations of the environment,
in some studies [146, 150] direct communication between robots has been em-
ployed to allow each robot to have a better knowledge of the environment and
thus calculate more optimized threshold values.
11.2. Algorithms
74
items in the environment is the main factor influencing the swarm dynamics,
and robot control algorithms include mechanisms to allow each robot to estimate
from local observations a global characteristic of the environment. The foraging
task is an often used case study where task allocation algorithms have been
applied. This section contains a review of past works dealing with distributed
task allocation applied to foraging or other tasks.
11.2.1. Foraging
In the foraging scenario, task allocation has been used mainly as a mechanism
to determine when each robot should engage in the foraging activity and when
it should rest. Two main principles can be found in the existing literature
to regulate the amount of activity executed by robots: the expected success,
by which robots search for items when items are likely to be found, and the
stimulus, which determines execution of the search activity when it is most
needed.
In [146], motivational behavior is used to decide when to start and stop the
search activity. Two types of motivation, called impatience and acquiescence,
regulate the robot activity: impatience increases with time as long as a robot
stays idle, and triggers the activation of the search activity when reaching a
threshold value, while acquiescence increases with time during the search activ-
ity, and causes robots to abort their activity if not successful. Communication
between robots may affect the distribution of work among swarm members by
changing their motivational status: when other robots in the arena report that
they are busy in the search activity, impatience of idle robots becomes lower,
and acquiescence of searching robots becomes higher, so that the same activity
is not executed by many robots at the same time.
Liu et al. [147] utilized a larger set of cues to modulate the robot activity. In
their proposed approach, each robot keeps track of a searching time threshold
(i.e. the maximum time spent searching for items before returning to the nest)
and a resting time threshold (amount of time which must elapse before a robot
in the nest engages in the foraging activity). These two thresholds are dynam-
75
ically varied based on internal cues (item collection success rate during past
search activities), environmental cues (number of collisions with other robots
while searching) and social cues (results of the search activity of other robots).
As pointed out by the authors, threshold variation is an adaptive mechanism
by which robots change their behavior based on learned environment character-
istics. To preserve scalability, communication between robots, needed for the
social cues, is limited to local interactions inside the nest area, and makes use
of pheromone: robots entering the nest deposit pheromone indicating whether
their search activity was successful. In a subsequent work [148], the authors
used a genetic algorithm to select a near-optimal set of adaptation parameters
for the time thresholds of individual robots.
In [151], when a robot searching for items reaches a timeout without finding
any item, it abandons the search and goes to the nest. When a robot is in the
nest, previous successes and failures in item collection attempts influence the
probability to leave the nest again to search for items; specifically, every time a
search operation terminates (with or without success), this probability is either
increased by an amount proportional to the number of consecutive successes,
or decreased by an amount proportional to the number of consecutive failures.
Thus, each robot learns based on its previous experience the difficulty level of
finding items in the environment, and the global number of robots engaged in
the foraging activity increases with the availability of items to be collected.
This algorithm, tested by the authors in a scenario where items are randomly
added in the arena during an experiment, shows an improved swarm efficiency
compared to a simpler algorithm in which the probability of leaving the nest
is fixed. Updating the probability value based on the outcome of the foraging
activity also means that more successful robots are less likely to be in the resting
state compared to less successful robots. This effect is analyzed in [156], where
it is explained with small mechanical differences between robots, which impact
foraging performance.
Campo and Dorigo [152] derived an equation for calculating the instanta-
neous rate of energy gained by a swarm of foraging robots, expressed in terms
76
of robot control parameters and characteristics of the environment. This equa-
tion is embedded in the decision algorithm of each robot, which dynamically
evaluates a subset of possible behaviors in terms of the expected energy gain,
and adjusts its control parameters (which are essentially probabilities to execute
certain actions during the foraging activity) in order to maximize the energy in-
come. Parameter adjustment is a means of adaptation of robot behavior to the
environment: as shown by the authors, robots are able to dynamically learn
optimal values for their control parameters, even though in presence of sudden
changes in environment characteristics a decrease in performance is observed.
The authors conjectured that this performance decrease may be due to a “mem-
ory effect”, which hinders optimal adaptation to new environment conditions.
Many task allocation methods found in animal societies are based on sens-
ing a stimulus which quantifies the “need” for the execution of a task: larger
stimulus values increase the number of robots engaging in task execution. In a
foraging scenario, the stimulus is usually determined by the amount of food or
energy in the nest: since the amount of food decreases with time if no robots
bring new food to the nest, the corresponding increase of stimulus activates the
foraging activity in resting robots, which will likely result in a future increase of
food at the nest and a corresponding decrease of stimulus. Krieger and Billeter
[78] controlled robots via a fixed activation threshold: when the energy falls
below a given value, resting robots begin searching for items. In this case, to
prevent all resting robots from starting their search activity at the same time, a
heterogeneous system is used where each robot is assigned a different threshold.
In [154], the mapping between stimulus and response is obtained with a prob-
abilistic decision taken by each robot, based on a parameter, called response
threshold, which regulates the “sensitivity” of robots to the stimulus; the value
of this parameter is varied adaptively based on the dynamics of the amount of
food in the nest.
Brutschy et al. [155] considered a foraging task divided in two sub-tasks,
called harvest and store, which are sequentially interdependent, in that exe-
cution of one sub-task requires previous execution of the other. A harvesting
77
robot transports an item from a source area to a task interface area, where
it waits for an available robot engaged in the other sub-task, then passes the
item to that robot, which will transport it to the nest. Analogously, a storing
robot waits at the task interface area if there are no robots engaged in the other
sub-task from which it can pick an item. Dynamic task allocation is obtained
with robots measuring the waiting time and probabilistically switching from one
sub-task to the other when waiting at the task interface; probability values are
calculated from previous experience (i.e. the amount of time spent waiting in
the past when doing each of the two sub-tasks) and from the current waiting
time. In [157], task allocation is obtained in a similar scenario without encoding
specific sub-tasks in robot controllers: using an automatic design method based
on grammatical evolution, a swarm of robots is able to dynamically partition
itself in two “specialized” groups, exploiting a characteristic of the environment
that rewards division of labor.
78
exchange their estimation of the demand for work when they encounter each
other, so that the estimation of each robot can be updated based on the work
done by other robots. The first variant is shown to adapt more to different
sizes of the environment, while with the second variant robots are able to better
handle cases where the number of items in the environment changes during an
experiment.
In [153], robots are located in an arena with randomly scattered items of
two types, and must locate and consume items of one of these types; to keep
the number of items in the arena constant, a consumed item reappears instantly
in another random location. Robots must choose an item type and search for
items of the chosen type, and the objective of the swarm is to allocate robots
to each type proportionally to the number of items of that type. Robots, which
signal with a visual cue the task they are currently executing, are equipped with
a camera vision system capable of sensing nearby items and other robots. In the
distributed control algorithm devised by the authors for this scenario, robots pe-
riodically observe their neighborhood and change their current task (i.e. change
the type of items to consume) probabilistically, based on a short-term memory
of the number of items and robots detected for each type. Simulation experi-
ments show that the robot swarm is able to dynamically change the proportion
of robots dedicated to each task, adapting it to the proportion of items of each
type in the environment.
11.3. Analysis
11.3.1. Metrics
A general objective of task allocation methods is to regulate the activities
done by swarm members based on the expected utility gained from those ac-
tivities. The concept of energy has often been used to describe and measure
the performance of task allocation: robots are assumed to consume energy at a
rate that depends on their current activity [78, 147], energy income derives from
successful completion of a task, and the objective of the swarm is to maximize
the net energy income. Quantitative measures used in past studies to assess
79
swarm efficiency in foraging scenarios include the ratio between the number of
collected items and the total time spent searching [151], the ratio between the
net energy income and the energy available from the environment [147], and the
average time spent by robots to retrieve an item [147]. A measure of robustness
of a swarm is proposed in [78] as the lowest energy level recorded during an
experiment, while Castello et al. [154] considered the average deviation of the
energy level as a performance indicator.
In scenarios where a level of completion for the collective task can be de-
fined, the objective of the swarm is to complete the task with the minimum use
of resources. As a general rule, the number of robots actively engaged in an
activity at a given time can be considered as a measure of resource usage. For
example, in [149] the ratio between the average cluster size (which is a measure
of task completion) and the average number of active workers is used to assess
the efficiency of the swarm in allocating resources. In [150], Agassounon and
Martinoli defined a cost function calculated from the average cluster size, the
number of clusters and the number of active workers, and the performance of
the swarm during an experiment is measured by integrating the cost function
over the experiment duration.
11.3.2. Models
An analytical model of distributed task allocation where each robot decides
to switch from one task to another based on local observations of the environ-
ment is described in [158]. Observations, which can be limited to items in the
environment or can include other robots, are treated in the model as a stochastic
process, and based on their statistical properties and the robot control algorithm
the dynamic evolution of the global swarm behavior can be predicted analyti-
cally. When applied to the scenario described in [153], this model is shown to
predict well the swarm dynamics.
In the foraging context, extensive analytical studies on the influence of task
allocation on global swarm performance have been done by Liu et al [159, 160,
148]. Their model uses a probabilistic finite state machine to represent the
80
number of robots in a given state (e.g. resting, searching for items or carrying
an item toward the nest), and predicts the temporal evolution of the swarm
via difference equations; the probabilities of robots to switch from one state to
another are determined by the robot control parameters and the environment
characteristics. Indicators of global swarm performance such as the overall
energy level can then be predicted with this model by solving the difference
equations.
This section contains a brief description of other swarm robotics tasks with
some examples of relevant past works.
The odor source localization problem consists of finding the source of an odor
in the environment. Propagation of chemical substances in the air is subject
to unpredictable dynamics due to turbulence and other wind characteristics,
which cause the appearance and disappearance of the odor in various places in
the environment. In [161], robots are equipped with a binary odor sensor, from
which the presence of an odor can be sensed, and with an anemometer to infer
the wind direction. In the basic algorithm for odor localization, a robot first
moves on a spiral-like path in order to explore a given region; when it detects the
odor, it moves against the wind direction until it either reaches the odor source,
or does not detect the odor anymore; in the latter case, it resumes its search
with the spiral-like path. Collaboration between different robots is achieved by
introducing a communication mechanism by which a robot detecting the odor
signals this event, and other robots in its proximity in the downwind direction
move toward the first robot. Experiments show that with this mechanism a
group of robots is able to reach the odor source in less time than a single robot.
Values of algorithm parameters are selected off-line with a machine learning
technique. In [162], robots are able to measure also the intensity of an odor,
81
and calculate a gradient formula which includes both the odor intensity and the
wind velocity. Robots are deployed in the environment forming a lattice pat-
tern, regulated by local interaction forces, which is used as a distributed sensor
network, with each robot communicating its sensor readings to its neighbors.
Based on this combined sensing mechanism, each robot calculates a virtual
force heading toward the estimated odor source; this force is combined with the
lattice-preserving force, so that the swarm remains compact and navigates in a
direction determined by the average of the forces calculated by each robot.
82
12.3. Self-assembly and morphogenesis
Self-assembly in swarm robotics refers to the ability of autonomous robots
to physically connect to each other utilizing only local interactions. The swarm-
bots project [167] is a well known research activity that led to the creation of
physical robots, called s-bots, which can attach to each other using a gripper and
a connection ring. In [167], the self-assembly process is guided by the activation
of colored LEDs that s-bots use to indicate whether other s-bots can attach
to them. In [168], an evolutionary method is used to program robot behav-
ior so that two identically programmed s-bots are able to autonomously decide
which of them will activate its gripper to connect to the other robot. The self-
assembly capability of s-bots has been used to accomplish various tasks, such as
collaborative object transportation [138, 139, 140], hole avoidance [169] and nav-
igation in harsh environments [170]. In the hole avoidance experiment, robots
must move in the arena while avoiding holes. Robots assemble in a compact
formation using their grippers; holes are detected by means of ground-facing
proximity sensors, while a traction sensor detects the traction force exerted by
other robots in the formation. These sensors are connected to inputs of a neu-
ral network, whose parameters are selected using an evolutionary method where
the fitness function rewards robot formations able to move far from their ini-
tial positions without falling in the holes. Simulation experiments showed that
with the evolved behavior robots can avoid holes by collectively changing their
direction of movement, using the proximity and traction sensors to evaluate the
presence of nearby holes. As described in [171], the same neural controller can
also allow a robot formation to pass over small holes: by connecting with each
other, robots are able to collectively estimate the size of holes and to traverse
holes which cannot be traversed by a robot alone. In [170], s-bots collectively
decide to self-assemble in a group when a task cannot be accomplished by a
single robot. The task consists in traversing an area containing a hill; each
robot estimates the difficulty of climbing the hill with its accelerometer: if the
hill steepness is below a certain value, robots climb it autonomously, otherwise
they self-assemble in order to be able to climb the hill.
83
Morphogenesis is an extension of the self-assembly concept that allows mod-
ular robotic structures to assume specific shapes: each robot joining the struc-
ture attaches to it at a position and orientation such that the desired shape is
gradually built. O’Grady et al. [172] used s-bots with an LED signaling mecha-
nism by which robots attached to a structure indicate how other robots should
attach; by specifying a set of pattern extension rules, the assembled structure
can be formed according to various predefined shapes. In [173], the growth of
a structure is guided by radio messages sent by robots assembled in the struc-
ture, which are able to measure the relative position of neighboring robots and
recruit them to dock at a desired location. In [174], robots are equipped with a
rotating docking surface that can be attached to a passive surface on the front,
left and right sides, and can self-assemble into linear and multiped structures.
In [175], robots have a cubic shape with four docking units, one at each vertical
side; the self-assembly process begins when a seed robot changes its state from
swarm mode to organism mode and sends wireless messages with its infrared
transmitters to recruit other robots; once a recruited robot has attached to the
seed, it switches to organism mode and in turn starts recruiting other robots,
until the desired structure is built; the shape of the structure is chosen by the
seed robot and communicated via the infrared transmitters to recruited robots.
84
direction, during the evolution robots “learn” how to use the input from their
traction sensors to rotate their chassis.
85
12.7. Collective decision-making
In many of the collective tasks described in this review, the swarm dynamics
leads a group of robots to converge to a unanimous decision; this happens when
the different options available to each robot are associated to different rewards
(even if robots cannot measure directly the reward associated to each option),
but also in the case of options with the same utility. Examples of collective
decision making can be seen in the choice of an aggregation site, a flocking
direction, a path between two target locations, an object clustering site, a di-
rection of movement in robot assemblies, etc. In [179], Montes de Oca et al.
proposed a mechanism for decision making based on the majority rule opinion
formation model, where robots agree on a common decision by applying a local
majority rule to small groups; if a latency period is introduced after each robot
takes a decision, during which the robot cannot be influenced by other robots,
and if the duration of this period depends on the decision being taken (differen-
tial latency), then a global consensus is achieved by the swarm on the decision
associated with the lowest latency. In [180], global consensus on the fastest ac-
tion is achieved with robots entering an observation state, where they exchange
their current preferred action, after each action execution. In [181], robots that
must agree on a site selection transmit their preference to neighboring robots
for a time duration proportional to the perceived quality of the site associated
to their current preference. Yu et al. [182] used an implicit leadership algorithm
to make a swarm converge to a collective decision influenced by a few informed
agents; the behavior of each robot is a combination of the tendency to reach
a specific goal (if the robot is informed about the goal) and the tendency to
follow the average behavior of its neighbors; if the relative weight given to these
two components is changed dynamically based on the local consensus observed
from neighboring robots, the swarm can converge to a collective decision even
in presence of informed agents with conflicting goals.
86
12.8. Human-swarm interaction
87
An interesting topic for future studies is represented by transferring the
swarm robotics discipline to the micro- and nano-scale [186]: nanorobotic sys-
tems could take different forms, for example protein- and DNA-based agents,
which will respond to physical and chemical stimuli, magnetically guided sys-
tems, with robots built from ferromagnetic material that can be guided by ap-
plying magnetic fields, or bacterial systems, characterized by a powerful propul-
sion mechanism that can be leveraged controlling natural bacteria with external
stimuli (e.g. magnetic fields) or building artificial systems replicating bacterial
behavior. Possible applications of nanorobotics research include environmental
sensing with nano-sensor networks, space exploration (where having small and
fault-tolerant systems is of utmost importance), underground exploration of oil
reservoirs, and most importantly medical applications: swarms of nanorobots
can be used for precise drug delivery in the human body, maximizing therapy
efficiency and minimizing negative side effects, for cell repairing, for early diag-
nosis of diseases or to fight tumor cells. The main challenges that researchers
face for bringing nanorobotics to real implementations are related to the re-
quirement of equipping robots with adequate power, propulsion, computation
and communication capabilities.
Robots deployed in the real world will likely need to be able to handle dif-
ferent types of tasks in order to accomplish a given objective: depending on the
scenario, the same robots will have to engage in various tasks such as aggrega-
tion, dispersion, pattern formation and collaborative object manipulation. Even
though each single task may be handled relatively easily by specialized robots,
combining all of them in the same robot controller would inevitably pose com-
plexity issues that have not been fully studied by current research. Operating
in diverse and changing environments, robots may need to reconfigure them-
selves automatically (e.g. switching from a control algorithm to another) based
on the current operating conditions. This flexibility will come at the price of
higher complexity in swarm design, and will also increase the risk of unpre-
dictability of robot behavior under unexpected conditions. In light of this, the
topic of human-swarm interaction could assume more importance, so that ad-
88
equate mechanisms of controlling and influencing swarm operation by human
supervisors are put in place.
Recent progress in technologies such as low-power wireless communication
and optical sensors, especially at small scale, can be exploited in robotic systems
by endowing small robots with relatively “advanced” capabilities: even tough
such capabilities might be seen as deviating from the minimalistic approach
usually followed in swarm robotics, the definition of minimalism is subject to
debate and must be put in relation with the state of the art in technology.
Moreover, the presence of “advanced” capabilities such as vision may allow
obtaining a desired behavior without the use of other resources, as demonstrated
by recent studies where computational resources are minimized to the extreme
of processing only one bit of information.
Many studies cite the lack or inadequacy of a formal mathematical analysis
as a topic for future research: providing a solid theoretical foundation to results
obtained empirically could help in understanding the potential capabilities of a
system, as well as its inherent limitations; this, in turn, would allow deploying
swarms in the real world with more confidence on what can be expected from
them. Given the main inspirational source behind many swarm robotics tasks,
and given the relatively simple and well understood behavior of many species
of insects in nature, some researchers are envisioning the use of swarms of real
insects to perform useful tasks, possibly with the help of robots that could play
the role of “guides” to steer insect behavior toward the desired goal (mixed
insect-robot societies). Small insects have locomotion and object manipulation
properties that are unparalleled by even the most sophisticated robots with
comparable size.
Despite the numerous possibilities ahead, building and deploying large swarms
is still an open issue, because hardware limitations and costs need to be ad-
dressed. With the continuous technological improvements we are witnessing,
envisaging real-world swarm deployments is not utopic, and in the future we
will likely see applications where large numbers of robots are deployed in the
environment, for example to cover large areas or perform activities that are un-
89
feasible or dangerous for humans: de-mining, monitoring of large production or
distribution plants to detect leakages or other potential hazards, environmen-
tal monitoring of underwater surfaces, cleanup of areas affected by an oil spill,
etc. But currently the required technology for enabling these applications in a
cost-effective way is not available yet.
14. Conclusions
90
References
91
[11] Y. Mohan, S. Ponnambalam, An extensive review of research in swarm
robotics, in: Nature & Biologically Inspired Computing, 2009. NaBIC
2009. World Congress on, IEEE, 2009, pp. 140–145.
92
[21] S. Kernbach, D. Häbe, O. Kernbach, R. Thenius, G. Radspieler,
T. Kimura, T. Schmickl, Adaptive collective decision-making in lim-
ited robot swarms without communication, The International Journal of
Robotics Research 32 (1) (2013) 35–55.
93
[29] N. Correll, A. Martinoli, Modeling Self-Organized Aggregation in a Swarm
of Miniature Robots, in: IEEE 2007 International Conference on Robotics
and Automation Workshop on Collective Behaviors inspired by Biological
and Biochemical Systems, 2007.
94
[38] M. Burger, J. Haškovec, M. T. Wolfram, Individual based and mean-
field modeling of direct aggregation, Physica D: Nonlinear Phenomena
260 (2013) 145–158.
95
[47] A. F. Winfield, W. Liu, J. Nembrini, A. Martinoli, Modelling a wireless
connected swarm of mobile robots, Swarm Intelligence 2 (2-4) (2008) 241–
266.
96
[56] A. E. Turgut, H. Çelikkanat, F. Gökçe, E. Şahin, Self-organized flocking
in mobile robot swarms, Swarm Intelligence 2 (2-4) (2008) 97–120.
97
[64] G. Antonelli, F. Arrichiello, S. Chiaverini, Flocking for multi-robot sys-
tems via the null-space-based behavioral control, Swarm Intelligence 4 (1)
(2010) 37–56.
98
[73] N. R. Hoff, A. Sagoff, R. J. Wood, R. Nagpal, Two foraging algorithms
for robot swarms using only local communication, in: Robotics and
Biomimetics (ROBIO), 2010 IEEE International Conference on, IEEE,
2010, pp. 123–130.
[81] A. Drogoul, J. Ferber, From Tom Thumb to the dockers: Some experi-
ments with foraging robots, From Animals to Animats II (1993) 451–459.
99
[82] E. H. Ostergaard, G. S. Sukhatme, M. J. Matarić, Emergent bucket
brigading: a simple mechanism for improving performance in multi-robot
constrained-space foraging tasks, in: Proceedings of the fifth international
conference on Autonomous agents, ACM, 2001, pp. 29–30.
[88] V. Hartmann, Evolving agent swarms for clustering and sorting, in: Pro-
ceedings of the 2005 conference on Genetic and evolutionary computation,
ACM, 2005, pp. 217–224.
100
conference on Autonomous agents and multi-agent systems, International
Foundation for Autonomous Agents and Multiagent Systems, 2014, pp.
421–428.
[94] T. Wang, H. Zhang, Multi-robot collective sorting with local sensing, in:
IEEE intelligent automation conference (IAC), Citeseer, 2003.
101
[99] A. Vardy, Accelerated patch sorting by a robotic swarm, in: Computer
and Robot Vision (CRV), 2012 Ninth Conference on, IEEE, 2012, pp.
314–321.
102
[109] F. Ducatelle, A. Förster, G. Di Caro, L. M. Gambardella, Supporting nav-
igation in multi-robot systems through delay tolerant network communi-
cation, in: Proc. of the IFAC workshop on networked robotics (NetRob),
2009, pp. 25–30.
103
[118] E. Mathews, Self-organizing ad-hoc mobile robotic networks, Ph.D. thesis,
Paderborn, Universitat Paderborn, Diss., 2012 (2012).
104
[127] G. Francesca, M. Brambilla, A. Brutschy, L. Garattoni, R. Miletitch,
G. Podevijn, A. Reina, T. Soleymani, M. Salvaro, C. Pinciroli, et al., An
experiment in automatic design of robot swarms, in: Swarm Intelligence,
Springer, 2014, pp. 25–37.
105
[136] J. Chen, M. Gauci, R. Groß, A strategy for transporting tall objects with a
swarm of miniature mobile robots, in: Robotics and Automation (ICRA),
2013 IEEE International Conference on, IEEE, 2013, pp. 863–869.
106
[145] K. Lerman, A. Galstyan, A. Martinoli, A. J. Ijspeert, A macroscopic an-
alytical model of collaboration in distributed robotic systems, Artificial
Life 7 (4) (2001) 375–393.
107
[154] E. Castello, T. Yamamoto, Y. Nakamura, H. Ishiguro, Task allocation
for a robotic swarm based on an adaptive response threshold model, in:
Control, Automation and Systems (ICCAS), 2013 13th International Con-
ference on, IEEE, 2013, pp. 259–266.
108
[163] J. Wawerla, G. S. Sukhatme, M. J. Matarić, Collective construction with
multiple robots, in: Intelligent Robots and Systems, 2002. IEEE/RSJ
International Conference on, Vol. 3, IEEE, 2002, pp. 2696–2701.
109
[172] R. O’Grady, A. L. Christensen, M. Dorigo, SWARMORPH: multirobot
morphogenesis using directional self-assembly, Robotics, IEEE Transac-
tions on 25 (3) (2009) 738–743.
110
[180] A. Scheidler, A. Brutschy, E. Ferrante, M. Dorigo, The k-unanimity rule
for self-organized decision making in swarms of robots, International Jour-
nal of Robotics Research, page in revision.
111
Levent Bayındır received his [Link]. degree
in Computer Science from Ege University,
Turkey in 2002 and his Ph.D. degree in Swarm
Robotics at the KOVAN research lab from
Middle East Technical University, Turkey in
2012. He is currently an assistant professor
of Computer Engineering at Atatürk Univer-
sity in Erzurum, Turkey. His research in-
terests include swarm robotics and pervasive
computing.
112
Bucket brigading in cooperative robot transport involves passing items from one robot to another until they reach the nest, thereby minimizing interference and enhancing foraging performance . This method allows smooth transport by creating a chain of robots, each responsible for a segment of the distance. By decreasing the burden on individual robots and minimizing the effects of localization errors over long distances, bucket brigading reduces the overall task duration and interference at the nest, resulting in increased efficiency in item transport .
Virtual gradient fields, created by chains of beacon-emitting robots, significantly improve foraging efficiency by guiding searching robots through a physical space. When a beacon-emitting robot reaches the boundary of the nest's signal coverage, it stops searching and emits its own beacon, extending the search region for other robots . These beacons store information about their position relative to the nest, creating a virtual gradient that foraging robots can sense through local communication. As a result, these robots can efficiently locate items and return them to the nest, following pathways defined by the gradient field .
The "firing rate" is a parameter that regulates the emission of signals in the aggregation algorithm inspired by the formation of multi-cellular organisms via chemical agent diffusion . Robots emit signals that propagate throughout the environment, attracting other robots toward the signal's source. In this process, robots that would otherwise remain static or move randomly are drawn to the signal, facilitating the formation of one or multiple aggregates as robots congregate around the signal source. This mechanism effectively uses emission and signal propagation to influence robot movement and aggregation dynamics .
The adaptive system for foraging tasks incorporates multiple algorithms, dynamically switching based on environmental characteristics and task requirements . Initially, the system employs a virtual gradient-based algorithm, effective when item sources are proximal to the nest. If unsuccessful, indicated by all robots becoming beacons, it transitions to a chain formation algorithm that sweeps larger areas. If this too fails, a random walk approach is used as a last resort . This strategic adaptability allows the swarm to efficiently address varying environmental challenges by employing the most suitable method to optimize task success .
Clustering and sorting tasks in robotics with limited localization capabilities face the challenges of sub-optimal clustering due to the robots' inability to determine exact locations and reliance on local sensing . Strategies employed involve probabilistic interactions, where clusters emerge as robots make decisions based on a limited representation of the environment. Inspired by ants' brood sorting behaviors, robots randomly move and form clusters based on short-term memory of encountered objects. Despite these limitations, emergent clustering can be achieved without optimal object placement, leveraging the naturally inspired probabilistic algorithms .
Artificial physics impacts the aggregation dynamics of autonomous swarms by using attraction forces at high distances and repulsive forces at close distances, intended to mimic natural swarm behaviors . However, practical implementation in robots faces challenges such as the limitations of local sensing abilities, high error in orientation determination when using technologies like infrared, and mechanical constraints that limit the efficacy of robot actuators . Thus, while artificial physics can theoretically guide robot motion, implementing these rules in real robots may be cost-intensive due to the need for advanced sensing and actuation capabilities.
Probabilistic methods in swarm aggregation play a crucial role by partially relying on random behaviors and interactions with the environment, as inspired by the behaviors of social insects like honeybees and cockroaches . These methods often utilize finite state machines with basic states such as "walk" and "wait," where robots can switch states based on random decisions or environmental cues, mirroring the decision-making processes observed in natural swarms .
Pheromone communication in robotic swarms facilitates navigation and task completion by emulating the trail-following behavior of ants. As robots move toward their destinations, they leave behind markers (analogous to pheromones) on the pathway, creating a traceable path for other robots to follow . This communication method effectively guides robots through optimal paths toward the nest or task completion points, utilizing signal gradients formed by varying pheromone concentrations. This allows for efficient path formation and resource transportation, ensuring enhanced coordination and task execution within the swarm .
In environment-mediated aggregation, robots leverage the local environment to enhance aggregation by taking advantage of preferential regions that promote aggregation. Similar to how certain species, such as honeybees, select aggregation areas based on optimal environmental conditions (e.g., temperature), robots in these algorithms tend to gravitate towards predefined favorable regions, enhancing the probability of successful aggregation . This method demonstrates how environmental cues are crucial in guiding robot behavior towards aggregation in a decentralized manner.
Task partitioning based on cost estimation enhances the efficiency of swarm robotics in foraging by allowing robots to make informed decisions on how far to transport items before passing them to another robot. By dynamically estimating the cost function, which maps the effort exerted by the robot against the cost of the overall task, robots can limit their transport distance to reduce localization errors . This method adapts to the environmental context and robot capabilities, optimizing task partitioning, and thereby improving success rates in efficiently completing foraging tasks .