Memory Efﬁcient Anonymous Graph Exploration
Leszek G ˛asieniec
1
Tomasz Radzik
2
1
Department of Computer Science, University of Liverpool, Liverpool L69 3BX,United Kingdom. Email:
L.A.Gasieniec@liverpool.ac.uk
2
Department of Computer Science, King’s College London, Strand, London WC2R 2LS,United Kingdom. Email:
Tomasz.Radzik@kcl.ac.uk
.
Abstract.
Efﬁcient exploration of unknown or unmapped environments has become one of the fundamental problem domains in algorithm design. Its applications range from robot navigation in hazardous environments to rigorous searching, indexing and analysing digital data available on the Internet. A large numberof exploration algorithms has been proposed under various assumptions about thecapability of mobile (exploring) entities and various characteristics of the environment which are to be explored. This paper considers the
graph model
, wherethe environment is represented by agraph of connections in which discrete movesare permitted only along its edges. Designing efﬁcient exploration algorithms inthis model has been extensively studied under a diverse set of assumptions, e.g.,directed vs undirected graphs, anonymous nodes vs nodes with distinct identities,deterministic vs probabilistic solutions, single vs multiple agent exploration, aswell as in the context of different complexity measures including the time complexity, the memory consumption, and the use of other computational resourcessuch as tokens and messages. In this work the emphasis is on memory efﬁcientexploration of anonymous graphs. We discuss in more detail three approaches:
random walk
,
Propp machine
and
basic walk
, reviewing major relevant results,presenting recent developments, and commenting on directions for further research.
1 Introduction
A
graph
is a crucial combinatorial notion used for modeling complex systems in various application domains including communication, transportation and computer networks, manufacturing,scheduling, molecular biology and peertopeer networks. Models based on graphs often involve mobile entities which can move throughout the graphfrom node to node along the edges. We call such entities
agents
. An agent can be arobot servicing a hazardous environment, or a software process navigating the Internetin searchforsomeinformation.
Graphexploration
referstoproblemsofdesigningalgorithms (protocols) for an agent, or a group of agents, to traverse a graph in a systematicand efﬁcient way.In recent years the research on efﬁcient graph exploration gathered a new momentum, generated to large extent by the theory and the applications coming ever closertogether. The demand for efﬁcient practical solutions has increased as systems of softwareagentsmovingthroughalargenetworkofcomputershavebecomereality.Anotherexample is the relevance of efﬁcient graph exploration algorithms for efﬁciency of the
Internet search engines. The current applications indicate that the relative importanceof various aspects of graph exploration has been changing, and those changes becomereﬂected in the theoretical research.In the broad context of algorithmic agent design we distinguish two main models:the
geometricmodel
where the search environmentis representedby two or higher dimensional space (see, e.g., [11,28,59]) and the
graph model
, considered in this paper,where the environment is represented by a ﬁnite or inﬁnite graph supported by discretemoves permitted only along its edges. The design of efﬁcient exploration algorithms inthe graph model has been extensively studied under many different assumptions, e.g.,
directed
vs
undirected graphs
,
anonymous
nodes vs nodes with
distinct identities
, or
deterministic
vs
probabilistic
solutions, as well as with different performance objectives in mind including optimal time complexity, memory consumption, or use of otherresources, see [1,6,9,29–31,34,41,56]. Different studies may also consider differentaims of exploration.The aim of exploration can be to visit each node in the network, oreach edge, and terminate. Alternatively, one may drop the termination requirement andask only for
perpetual
exploration and a guarantee that each node is visited inﬁnitelymanytimes,orperhapsa strongerguaranteethatthenodesarebeingvisitedwithsimilarfrequencies.If nodes have distinct identities (given as
O
(log
n
)
bit words), the graph stays unchanged (a static graph), and the size of the memory available to the agent (counted inwords) is linear in the number of nodes of the graph, then the depthﬁrst search (DFS)procedure gives linear time exploration. However, in many applications one or moreof these assumptions might not hold. The nodes may not have unique identities; forexample, if the nodes represent very simple devices (
anonymous
graphs). The graphmay keep changing;for example,if it models a growingpeertopeernetwork(
dynamic
graphs). Finally, the agents may have very limited memory, which may be sufﬁcientfor storing only few node identities. For example, software agents moving through acomputer network from host to host might not be allowed to carry too much data withthem.This surveyis mainlyconcernedwith explorationofanonymousgraphsby agentsequipped with bounded memory.In graph exploration algorithms, the memory utilisation refers actually not only tothe memory of the agents (“carried” by them when they move from node to node), butalso to any extra memory required in the graph environment.The latter may store some(precomputed)additional informationabout the graph to guide the exploration,or mayallowtheagenttoleavemarksatnodesortomovetokensasit traverses.Thedemandforsimple and cost effective agents as well as the desire to design exploration algorithmsthat are suitable for rigorous mathematical analysis imply the importance of limitingthe local memory of agents and their ability to manipulate the explored environment.One of the most challenging problems in the theory of computation is to look at thefar ends,
border cases
, of the considered models. In case of algorithmic agent designsuch a border case may refer to the size of the agent’s memory, where one can limitthe memory of an agent to a constant number of bits. This case is very often modeledas graph exploration by a ﬁnite state automaton and it has been extensively studiedalready in the 1970’s [15,54,58,61]. Probably the strongest result in this setting is
due to Cook and Rackoff [19]. They proved that a ﬁxed group of ﬁnite automata, that
can permanently cooperate and that can use "teleportation" to move from their currentlocation to the location of any other automaton, cannot explore all graphs. See [42]for some recent results about limits of graph exploration with ﬁnite automata. Theseresults imply that we either have to allow the agents to use larger memory, or to divertto randomization,or to providethe agentswith extrastructuralinformationthat restrictsthe set of graphs they have to traverse.Ithasbeenknownforsometimethatifwedonotplacestrictrestrictionsonthelocalmemory, then a single pebble is sufﬁcient to explore an anonymous undirected graph.This result was extended to directed graphs by Bender
et al.
[8]. Note however that agood upper boundon the number of nodes must be known to avoid an exponentialtimesolution. Moreover, even if such a bound is known, the time complexity, while polynomial, remainsimpracticallyhigh.Anotherpossibilityis to dropdeterminismandto look for randomized solutions. It is known, e.g., that a random walk of length
O
(
n
3
log
n
)
visits all nodes of an arbitrary
n
node graph with high probability [3]. Attempts to regain determinism included research on derandomizationof random walks and the mainapproach was
universal traversal sequences
[3] that provide guidance in deterministictraversal of all graphs in a given class. Several important results have been achieved[4,7,46,60] including Reingold’s [60] recent asymptotically optimal
O
(log
n
)
spacedeterministic algorithm for the undirected
stconnectivity
problem based on a novel
O
(log
n
)
−
bit navigation mechanism. However, note that the exploration time given bythis algorithm is a polynomial of a rather high degree.Research closely related to the setting adopted in this paper assumes some structural information about the explored environment. Such additional information allowsimprovements in the time or memory complexity of graph traversal. The ﬁrst results,concerning exploration of a labyrinth using a compass, are due to Blum and Kozen[12]. Later, Flocchini
et al.
[39] introduced a more general notion of
sense of direction
and proved that traversal can be performed using
O
(
n
)
messages/agent moves in thismodel [38]. Fraigniaud
et al.
[40] have shown that interval routing scheme can be usedto achievethe same goal. In fact, givena spanningtree, the graphcan be traversedusing
O
(
n
)
moves. Pelc and Panaite [56] studied the impact of having a map of the graph onthe efﬁciency of graph exploration. Finally Cohen
et al.
studied efﬁcient navigation ingraphs with nodes marked with a constant number of colors, see [18].Inthispaperwefocusonthreegraphtraversalmethods.Westartwiththeprobabilistic
random walk
method and then discuss its recently proposed deterministic counterpart known as the
Propp machine
, which requires some (small) memory at the nodes of the graph. We conclude this survey with presentation of an alternative traversal methodbased on the
basic walk
, in which a small amount of memory is provided to an agentand a certain type of graph preprocessing is permitted. These three methods have several interesting combinatorial properties and one might say that they have already setcertain standards in agent based anonymous graph exploration.
2 The graph model
In this surveywe consider environmentsrepresentedby undirected(symmetric)graphs.We denote by
G
= (
V,E
)
the graph which is to be explored, and assume that it is con
nected, unless stated otherwise. It will sometimes be convenient to view
G
as a digraph
←→
G
obtained by replacing each undirected edge with two arcs pointing in opposite directions. We consider graphs (environments) that are anonymous, i.e., the nodes in thegraphs are neither labeled nor marked in any other way. However, the ends of edgesincident to each node
v
are ordered and often labeled by consecutive integers
1
,...,d
v
called
port numbers
, where
d
v
is the degree of
v
.If an agent is in the current step at a node
v
, then the standard assumptions are thatit knows
d
v
, the label of the port through which it has entered
v
and any informationabout the graph that it might have gathered in previous steps and has been able to (hasbeen allowed to) store in its internal limited memory. The agent decides on the basisof this knowledge which port it should take to move in the next step to a neighbourof
v
. Agents do not have prior knowledge about the topology of the network. The exactdetails about the resources and abilities of the agents as well as about the objectives of exploration may vary from problem to problem. The number of nodes and the numberofedgesin
G
are denotedby
n
and
m
,respectively,butnotethat theseparametersmightnot be known to the exploring agents.
3 The random walk
A
random walk
on an (undirected, connected) graph
G
starting at a node
v
0
is an (inﬁnite) random sequence
(
v
0
,v
1
,v
2
,...
)
of nodes in
G
such that for each
i
≥
1
, node
v
i
is selected randomly and uniformly from all neighbours of node
v
i
−
1
. Using the graphexploration terminology, we say that an agent moves in step
i
from node
v
i
−
1
to itsrandom neighbour
v
i
. To implement such random exploration of an arbitrary
n
nodegraph, the agent has to be able to select a random neighbour of the current node, so itneeds
O
(log
n
)
bit memory and access to
log
n
random bits per step.The
(node) cover time of graph
G
from a node
v
is the expected number of steps
C
v
(
G
)
the random walk starting from node
v
takes to visit all nodes of the graph.The
cover time
C
(
G
)
of graph
G
is deﬁned as the maximum
C
v
(
C
)
over all nodes
v
∈
V
. Thus the cover time is the worstcase (over all starting nodes) expected timeof exploring the whole graph. For graphs of some special types, the cover time canbe easily estimated, or even calculated exactly. For example, it is easy to show that thecovertimeofthegraph
P
n
whichisan
n
nodesimplepathisequalto
C
(
P
n
) = (
n
−
1)
2
,by solving a simple recurrence relation for the number expected number of steps
H
i
requiredto reach the end of the path starting from its
i
th node. Calculation of the covertime of the
n
nodeclique
K
n
is the
couponcollector
problem:at step
i
select randomlyand uniformly one of the
n
coupons/nodes and wait until all coupons have been seen,with a slightmodiﬁcationthat thenextcoupon/nodehas to bedifferentfromthe onejustselected. If we have already seen exactly
i
distinct nodes,then the probabilitythat in thenext step we will see a new node is equal to
(
n
−
i
)
/
(
n
−
1)
,so the expected number of steps before a new node is encountered is equal to
(
n
−
1)
/
(
n
−
i
)
and the cover time
C
(
K
n
)
is equal to
n
−
1
i
=1
(
n
−
1)
/
(
n
−
i
) = (
n
−
1)
n
−
1
i
=1
1
/i
=
n
(ln
n
+
O
(1))
.Random walks became an important tool in algorithm design and complexity theory when Aleliunas
et al.
[3] showed in 1979 that the cover time of
every
graph ispolynomial, or more speciﬁcally, at most
2
m
(
n
−
1)
. Since then general techniques for
bounding the cover times have been developed and bounds for the cover times for various families of graphs have been derived. An agent using a random walk on an
n
nodegraph with the cover time bounded by
T
(
n
) =
poly
(
n
)
should have an
O
(log
n
)
bitcounter to count
T
(
n
)
steps to terminate with the knowledge that the whole graph hasbeen exploredwith constant probability,or to count
T
(
n
)
p
log
n
steps to terminate withthe knowledge the whole graph has been explored with the high probability of at least
1
−
1
/n
p
. Thus a good bound
T
(
n
)
on the cover time is required not to expand unnecessarily the exploration time, and we review below the main known bounds. Observethat an agent implementing a random walk needs memory for two purposes. It needs
O
(log
∆
)
bits to implement individual moves from the current node to a random neighbour, where
∆
≤
n
is a bound on the degree of a node, and
O
(log
n
)
bits to count themoves. If we do not require that the agent terminates, than the total of
O
(log
∆
)
bitssufﬁces. In particular, constantsize memory is sufﬁcient for the randomized
perpetual
exploration of any constant degree graph.Feige [36,35] showed the following tight bounds on the range of the cover times of
n
node graphs:
(1
−
o
(1))
n
ln
n
≤
C
(
G
)
≤
(1 +
o
(1)) 427
n
3
.
Thus
K
n
is an example of a graph with the cover time achieving the lower bound,while it can be shown that the
n
node
lollipop
graph given in Figure 1 has the covertime achieving the upper bound. A quadratic
O
(
n
2
)
upper bound on the cover timeof regular graphs was ﬁrst shown by Kahn, Linial, Nisan and Saks in 1989, and thebest know bound of
2
n
2
is due to Feige [37]. This worstcase upper bound for regulargraphs should be contrasted with Rubinfeld’s [62]
O
(
n
log
n
)
bound on the cover timeof regular
expander
graphs, and with Cooper and Frieze’s [20] recent result showingthat the cover time of a
random
d
regular graph is
(1 +
o
(1))
d
−
1
d
−
2
n
ln
n
with highprobability. The cover time of the 2dimensional
√
n
×√
n
grid is
Θ
(
n
log
2
n
)
, withthe upper bound due to Chandra
et al.
[17] and the lower bounddue to Zuckerman[65].
Aldous [2] showed that the cover time of the
n
node
k
dimensional grid is
Θ
(
n
log
n
)
,for
k
≥
3
.Aleliunas’
et al.
[3] polynomialupper boundon the cover time of an arbitrary graphwas an important result for the computational complexity theory as it showed that theundirected
s

t
connectivity problem can be solved by a randomised logspace algorithm. This started a vast body of research with the goal to settle the conjecture thatthis problem can be solved by a
deterministic
logspace algorithm, and, ultimately, themore general conjecture that
any
problem solvable by a randomised logspace algorithm can be solved by a deterministic logspace algorithm. A natural approachto settlethe ﬁrst conjecture was to derandomise random walks. In this framework the objective was to produce an explicit universal traversal sequence (UTS), i.e., a sequence
p
1
,...,p
k
of port labels, such that the path guided by this sequence visits all edgesof any graph of a given size. It is known that, with high probability, a sequence of length
O
(
n
3
d
2
log
n
)
, chosenuniformlyatrandom,guidesa walkinany
d
regular(connected)
n
node graph [3]. Unfortunately,explicit short UTS are known only for specialfamilies of graphs, including 2regular graphs [7,13,16,50,53], 3regular graphs [49],
cliques [51], and expanders [46]. Some of these sequences can be constructed in log