1 Introduction

A team of mobile autonomous robots wants to search an area with the goal of finding a mobile intruder (or lost entity). The intruder has several properties that dictate how a search should be conducted. First, the intruder is invisible and therefore the robots may conclude its potential locations only from the history of their own moves. Second, it is assumed that the speed of the intruder is unknown and therefore the robots build their search strategy assuming that the intruder is very fast: it may traverse an arbitrarily long distance between any two actions of a robot. Third, the intruder is very clever, i.e., it will avoid being captured as long as possible; in other words we may imagine that it knows the locations of the robots and their future movements at any point. This assumption forces the robots to consider the worst-case scenario, since they want a search strategy that guarantees interception. The above problem is usually restated in discrete terms, naturally expressing the search game using graph-theoretic notation. Following the widely used terminology, the mobile entities performing the search are called searchers.

In this work we focus on the graph-theoretic problem statement, where the searchers operate in a given graph in which they move along edges. Moreover, what greatly influences algorithmic approaches is the assumption of whether the searchers know the graph in advance (off-line version of the problem) or whether the graph is unknown and the searchers learn its structure while conducting the search (on-line or distributed setting). We briefly review both approaches and later give a formal statement of the problem we study in this work. In all cases we are interested in minimizing the number of searchers needed to clear the given network (Footnote 1). We briefly discuss later how our algorithm can be adapted to operate in a distributed asynchronous setting, with searchers having local communication and polynomial memory. From the point of view of this work, the terms on-line and distributed are used interchangeably because we do not impose any communication, memory or synchronization restrictions on the agents (a more detailed discussion on this topic is provided in Section 2).

Off-line searching

Off-line graph searching models are extensively studied and numerous deep results have been obtained, providing insight into not only the problem itself but also enriching the more widely understood graph theory through the connections between graph searching games and many graph parameters, e.g., pathwidth, treewidth, branchwidth, bandwidth, profile, interval thickness, vertex separation number; see, e.g., [24] for a survey and further references. The historically first studied graph searching model is called edge search [37, 38]. In this problem, the goal is to construct a search strategy that guarantees capturing a fast and invisible fugitive (thus, the strategy must ensure success regardless of the moves performed by the fugitive) in a graph that is given as an input to an algorithm computing a search strategy. A search strategy itself is a sequence of moves, where each move is one of the following: (i) placing a searcher on any graph node; (ii) removing a searcher from the node it occupies; (iii) sliding a searcher along an edge in order to clear it. At each point of the strategy one can distinguish a subgraph that is guaranteed to be free of the fugitive. In a valid strategy we require that this subgraph is the input graph at the end of the strategy (thus the capture of the fugitive necessarily occurs at some point) and, additionally, in a connected search strategy we require that after each move this subgraph is connected. In a monotone search strategy we require that once an edge has been cleared, it must remain clear until the end of the search; in other words, the subgraph composed of edges that may contain the fugitive may only shrink as the search progresses. Since we adopt the monotone connected searching problem in our on-line model, we point to a few recent works on the problem [2, 3, 14, 15, 19, 20].

On-line searching

In the distributed, or on-line, version of the problem it is assumed that the network is unknown in advance to the searchers. In this setting, some assumptions need to be made. First, only monotone search strategies are considered. This assumption is dictated by an observation that otherwise the searchers may first learn the structure of the network by exploring it (and thus ignoring the possibility of capturing the intruder at this stage) and once the network is known, they can compute a search strategy by using an off-line algorithm and finally execute the strategy. The problem then reduces to exploration and map construction, well studied problems in distributed computing. Another natural assumption is to forbid placing a searcher on a node that has not been visited before. Recall that we are interested in minimizing the number of searchers. On-line algorithms are formulated usually in such a way that the algorithm is adding new searchers whenever necessary and in the analysis one counts how many searchers will be added in the worst case — we will follow this route.

We consider connected search strategies in this work, i.e., strategies that guarantee that at any given time point the subgraph that is clear is connected. Note that this allows us to assume that all searchers start at some node called the homebase and only moves of type (iii) are then made (see the definition of edge search above). Indeed, removing a searcher from a node u and placing it on another one v (i.e., jumping) may be replaced (Footnote 2) by a sequence of sliding moves along a path from u to v consisting of clear edges only (such a path must exist due to connectedness and monotonicity).

1.1 Related Work

Off-line problems

One of the central questions raised in the context of various graph searching problems is whether the search problem is monotone, i.e., whether there exists a monotone search strategy solving it. Note that proving monotonicity is a tool that allows one to conclude membership in NP for a given problem. It is known that the edge search problem is monotone [31]. On the other hand, the connected search is not monotone [44]. A question related to the latter searching model is how many extra searchers one needs to ensure connectivity. It turns out that each monotone edge search strategy can be converted (in polynomial time) into a monotone connected one by approximately ‘doubling’ the number of searchers [16]. Thus, for asymptotic results, like the one in this work, this gives another reason that justifies restricting attention to monotone connected search strategies.

On-line searching

In most cases, when designing distributed searching algorithms, the monotonicity requirement is adopted. (See [6] for an example of how an optimal connected search strategy can be constructed in a distributed fashion when recontamination is allowed.) During construction of a monotone strategy in an on-line way, there is naturally some ‘cost’ involved in terms of an increased number of searchers required for guarding. This cost, measured as the ratio of the number of searchers that each on-line algorithm needs to use for some n-node graph and its monotone connected (off-line) search number, is known to be \({\varOmega }(n/\log n)\) [28]. In the realm of distributed algorithms, natural questions arise with respect to the amount (and type) of additional information regarding the underlying network given a priori to an algorithm. In [36] it was proved that \(O(n\log n)\) bits of advice are sufficient to construct an optimal connected monotone search strategy (the concept of such a quantitative approach to advice analysis was introduced in [25]). For examples of algorithmic approaches in a very weak computational model see, e.g., [5, 13].

Grid networks were studied in searching models, where the concepts of temporal and threshold immunities were used. In the first one, a node after clearing remains protected (even if unguarded) against recontamination for a certain amount of time t. A tight upper bound for the grid of size m × n, m ≤ n, is equal to \(\min \left \{\left \lceil m/\left \lceil t/2 \right \rceil \right \rceil ,\left \lceil (2m - 1)/t \right \rceil \right \}\) [11]. In [21, 22] d-dimensional meshes were investigated in the threshold immunity model, where a node becomes recontaminated when a specified number m (or greater) of its neighbors is contaminated. In particular, it has been proved that for d = 2 (i.e., grids) and m ≥ 2 one searcher is enough. For other searching works involving threshold immunity see, e.g., [12, 23, 32].

For other distributed searching models and algorithms for specific network topologies see [8, 20, 26, 35].

Applications in robotics

We note that our results may be of interest not only for the theoretical insight they provide into searching dynamics in distributed agent computations, but also for possible applications in the field of robotics. Most investigations oriented towards algorithms that can be applied on physical devices need to deal with the problem of modeling the real world. This can be done either by discretizing it (usually through graphs) or by building algorithms that work in a continuous search space and need to address the geometric issues that emerge. In Section 7.1, we add a brief discussion on this subject from the point of view of our results. Given the vast literature on the subject, we point the interested reader to a few references to recent works in this field [10, 17, 27, 30, 39, 40, 41, 42, 43].

1.2 Outline of This Work

The next section provides the notation used in this work and the problem statement. It is subdivided so that Section 2.1 defines the graph searching problem we study while Section 2.2 introduces the terminology related to the partial grid networks we consider in this paper. Section 3 gives a construction of a class of n-node networks such that each on-line algorithm, which produces a monotone connected strategy, uses \({\varOmega }(\sqrt {n})\) searchers for some network in the class which turns out to be \({\varOmega }(\sqrt {n}/\log n)\) times more than an optimal off-line algorithm would use (recall that by off-line algorithm we refer to the case when the entire network is given as an input and hence is known in advance to the algorithm).

Section 4 describes an on-line (i.e., agents do not have any knowledge about the graph a priori) algorithm that performs a monotone connected search in partial grids where it is assumed that the algorithm is given an upper bound n on the size of the network. We assume a ‘sense of direction’ in our model, that is, the grid is embedded into a two-dimensional space by assigning integer coordinates to the nodes. Then, an agent knows the coordinates of each neighbor of the currently occupied node. More details are given in Section 2.2. We point out that this algorithm uses an on-line procedure from [7] as a subroutine that is called many times to clear selected parts of a grid and it can be seen as a generalization from a ‘linear’ graph structure studied in [7] to a 2-dimensional structure discussed in this work. Also, although both algorithms are conducted via some greedy rules which dictate how a search should ‘expand’ to unknown parts of the graph, the analysis of our algorithm is different from the one in [7].

Then, in Section 5 we prove the correctness of the algorithm and provide an upper bound on its performance: it is using \(O(\sqrt {n})\) searchers for any partial grid network. In Section 6 we consider a modified version of the algorithm, which receives no information on the underlying graph in advance, and we prove that the algorithm also uses \(O(\sqrt {n})\) searchers. This result, stated in Theorem 4, is our main contribution. We finish with conclusions in Section 7, giving a few remarks on how our work relates to searching two-dimensional environments, like polygons with holes. As there are many open problems and research directions related to the subject, we list some of them also in Section 7.

We briefly remark on a potential practical motivation of our setting. Partial grids, which can be seen as grids with obstacles (formally defined later), are a way of modeling two-dimensional shapes, e.g., polygons. Every search strategy for a polygon can be used to obtain a search strategy for its underlying partial grid and vice versa. The numbers of searchers in both cases are within a constant factor of each other. Thus in particular, searching strategies for continuous scenarios like polygons can be obtained by first getting the underlying partial grid and then computing a (discrete) search strategy for the grid by the algorithm we propose in this work. For more details see Section 7.1.

2 Definitions and Terminology

In this section we state our problem formally and present the notation we use.

2.1 Problem Statement

Let G be a simple, undirected, connected graph. A monotone connected k-search strategy \({\mathcal S}\) for a network G is defined as follows. Initially, k searchers are placed on a node h of G, called the homebase. (We also say that \({\mathcal S}\) starts at h.) Then, \({\mathcal S}\) is a sequence of moves, where each move consists of selecting one searcher present at some node u and sliding the searcher along an edge {u, v}. (Thus, the searcher moves from its current location to one of the neighbors.)

Initially, all edges are contaminated. After each move of sliding a searcher along an edge {u, v} it is declared to be clear. It becomes contaminated again (recontaminated) if at any time during execution of the strategy \({\mathcal S}\) at least one of its endpoints is not occupied by a searcher and is incident to a contaminated edge. We consider only strategies in which recontamination does not occur and we call such strategies monotone. Note that this in particular implies that the homebase h remains clear throughout the entire strategy. Moreover, we require that the clear subgraph, that is, the subgraph consisting of all clear edges, is connected after each move of the search strategy. Finally, we require that after the last move of \({\mathcal S}\) all edges are clear. Throughout, we say that a node is clear if it is incident to a clear edge.
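The rules above can be made concrete with a small checker. The following is only a sketch under stated assumptions: the function name, the edge-list input format and the `(searcher, u, v)` encoding of a move are illustrative choices, not notation from this work. It replays a sequence of sliding moves and reports whether recontamination ever occurs, whether the clear subgraph stays connected, and whether all edges are clear at the end.

```python
from collections import deque


def is_monotone_connected_strategy(edges, homebase, k, moves):
    """Check a sequence of sliding moves against the rules of Section 2.1.

    edges: iterable of 2-tuples of hashable nodes (simple undirected graph).
    moves: list of (searcher_index, u, v), meaning "slide searcher from u to v".
    All names here are illustrative and not taken from the paper.
    """
    all_edges = {frozenset(e) for e in edges}
    incident = {}
    for e in all_edges:
        for node in e:
            incident.setdefault(node, set()).add(e)

    position = [homebase] * k          # every searcher starts at the homebase
    clear = set()                      # edges currently declared clear

    for s, u, v in moves:
        edge = frozenset((u, v))
        if position[s] != u or edge not in all_edges:
            return False               # illegal slide
        position[s] = v
        clear.add(edge)

        occupied = set(position)
        # Monotonicity: an unoccupied endpoint of a clear edge must not touch
        # any contaminated edge, otherwise recontamination would occur.
        for e in clear:
            for node in e:
                if node not in occupied and incident[node] - clear:
                    return False

        if not _clear_part_connected(clear, incident):
            return False

    return clear == all_edges          # everything is clear at the end


def _clear_part_connected(clear, incident):
    """Breadth-first search that walks over clear edges only."""
    if not clear:
        return True
    start = next(iter(next(iter(clear))))
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for e in incident[node] & clear:
            for other in e:
                if other not in seen:
                    seen.add(other)
                    queue.append(other)
    return seen == {node for e in clear for node in e}
```

For instance, a single searcher clears a three-node path, whereas on a star with three leaves a single searcher that starts at a leaf causes recontamination as soon as it leaves the center; two searchers suffice there.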

The minimum k such that there exists a node h and a monotone connected k-search strategy that starts at h is called the monotone connected search number of G and denoted by mcs(G).

Having defined a search strategy, we now state the on-line model we use. All searchers start at the homebase and the network itself is not known in advance to the searchers (except for the fact that the searchers may expect that the network is a partial grid). In fact, the searchers have no information about the network. We assume that nodes are anonymous and searchers have identifiers. The edges incident to each node are marked with unique labels (port numbers) and because only partial grids are considered in this work (for a definition see Section 2.2) we assume that labels naturally reflect all possible directions for each edge (i.e., left, right, up and down).

For the searchers, we assume that they communicate locally by exchanging information when present at the same node. Our algorithm is stated as if there existed global communication, but it can be easily turned into the required one with local communication as follows: we can designate one extra searcher called the leader who will be performing the following actions at the beginning of each move of the search strategy to be executed. First, the leader visits all nodes of the subgraph searched to date and gathers complete information about its structure and the positions of all other searchers, then the leader computes the next move and finally visits all searchers to pass the information about the next move. Then, the move is performed by the agents (Footnote 3).

Our algorithm is described for the synchronous model in which time is divided into steps, each step having the same unit length duration allowing each searcher to perform its local computations and slide along an edge if the searcher decides to move. We note that this assumption can be lifted and the algorithm can be easily restated to be asynchronous. Indeed, having one agent that is the leader one can simulate synchronous behavior of the agents in such a way that the leader waits for the completion of the current move of another searcher and then informs the searcher that is supposed to perform the next move, dictated by the search strategy, to start the move. As to the memory model, our algorithm requires that the memory size of the searchers is polynomial in the size of the network, and we do not attempt to optimize this parameter.

For any on-line algorithm A, let A(G, h) be the number of searchers that it uses to clear a network G in a monotone connected way starting from the homebase h. We say that an algorithm A is f(n)-competitive, for some function f, if

$$\max_{h} \frac{A(G,h)}{\texttt{mcs}(G)}\leq f(n)$$

for each n-node network G.

2.2 Partial Grid Notation

We define a partial grid G = (V, E) with a set of n nodes V and edges E as a connected subgraph of an n × n grid. We consider each partial grid to be embedded into a two-dimensional Cartesian coordinate system with a horizontal x-axis and vertical y-axis, where each node of G is located at a point with integer coordinates and two nodes are adjacent if and only if the distance between them equals one (in the Euclidean metric). This embedding is considered for two reasons. The first one is technical, as referring to coordinates simplifies some statements about nodes of G. The second is that our on-line algorithm relies on the underlying geometric structure. For convenience, the homebase is located at the point (0,0). In order to refer to a node that corresponds to a point with coordinates (x, y) we write \(v\left (x,y\right )\). In this work n denotes an upper bound on the number of nodes of a partial grid, such that \(\sqrt {n}\) is an integer.

Informally speaking, our algorithm will conduct a search by expanding the clear part of the graph from one ‘checkpoint’ to another. These checkpoints (defined formally later) will be subsets of nodes and their potential placements on the partial grid are dictated by the concept of a frontier. Take any \(x = i\sqrt {n}\) for some integer i, \(y = j\sqrt {n}\) for some integer j and take \(i^{\prime }, j^{\prime } \in \{0,1\}\), \(i^{\prime } \neq j^{\prime }\). Then, the line segment with endpoints (x, y) and \((x + \sqrt {n} i^{\prime },y + \sqrt {n} j^{\prime })\) is called a frontier and denoted by \(F\left (\left (x,y\right ),\left (x + \sqrt {n} i^{\prime },y + \sqrt {n} j^{\prime }\right )\right )\). Whenever the endpoints of a frontier are clear from the context or not important we will omit them. The frontier \(F\left (\left (0,0\right ),\left (\sqrt {n},0\right )\right )\) that contains the origin is called the homebase frontier and the set of all frontiers is denoted by \({\mathcal F}\). We will also divide frontiers into vertical and horizontal ones, where the coordinates of the two extreme nodes do not differ in the first and second coordinate, respectively.

Given any graph H = (V, E) and X ⊆ V, H[X] is the subgraph of H induced by X: its node set is X and its edge set consists of all edges {u, v} of H having both endpoints in X. The subgraph induced by all nodes that belong to a frontier F of a partial grid G is denoted by G[F].

For \(i \in \{1,\dots ,\sqrt {n}\}\) and some frontier \(F=F\left (\left (x,y\right ),\left (x^{\prime },y^{\prime }\right )\right )\), where x ≤ x′ and y ≤ y′, we define the i-th rectangle of F, denoted by \(\mathcal {R}(F,i)\), as the rectangle with corner vertices (x − i, y − i), (x − i, y + i), (x′ + i, y − i), (x′ + i, y + i) if F is horizontal and as the rectangle with corner vertices (x − i, y − i), (x + i, y − i), (x − i, y′ + i), (x + i, y′ + i) if F is vertical. See Fig. 1 for an example.

Fig. 1

An illustration of the concept of rectangles (here \(\sqrt {n}=4\)). In (a), crosses denote nodes that lie on the homebase frontier \(F = F\left (\left (0,0\right ),\left (4,0\right )\right )\), empty circles denote nodes that lie on \(\mathcal {R}(F,1)\), empty squares the ones on \(\mathcal {R}(F,2)\), dark squares the ones on \(\mathcal {R}(F,3)\) and dark dots denote nodes that lie on \(\mathcal {R}(F,4)\). Gray arrows stand for the 10 frontiers that lie on \(\mathcal {R}(F,4)\) (six horizontal and four vertical ones). We denote one of the vertical frontiers that lie on \(\mathcal {R}(F,4)\) as \(F_{1} = F\left (\left (8,0\right ),\left (8,4\right )\right )\). In (b), dark dots denote nodes that lie on \(F_{1}\), dark squares the ones on \(\mathcal {R}(F_{1},1)\) and empty squares the ones on \(\mathcal {R}(F_{1},2)\)
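The numbers in the caption of Fig. 1 can be reproduced with a few lines of code. The sketch below assumes that the corners of \(\mathcal {R}(F,i)\) are obtained by moving the two endpoints of the frontier outward by i in both coordinates, and that the frontiers lying on \(\mathcal {R}(F,\sqrt {n})\) are the segments of length \(\sqrt {n}\) into which its boundary splits. The function names and the parameter `side` (standing for \(\sqrt {n}\)) are illustrative and not part of the notation of this work.

```python
def rectangle_corners(frontier, i):
    """Return the (lower-left, upper-right) corners of the i-th rectangle.

    frontier: ((x, y), (x2, y2)) with x <= x2 and y <= y2.
    """
    (x, y), (x2, y2) = frontier
    return (x - i, y - i), (x2 + i, y2 + i)


def frontiers_on_outer_rectangle(frontier, side):
    """List the frontiers on the boundary of the rectangle with i = side.

    side plays the role of sqrt(n); every returned segment has length side.
    Returns a pair (horizontal_frontiers, vertical_frontiers).
    """
    (lx, ly), (ux, uy) = rectangle_corners(frontier, side)
    horizontal = [((x, y), (x + side, y))
                  for y in (ly, uy) for x in range(lx, ux, side)]
    vertical = [((x, y), (x, y + side))
                for x in (lx, ux) for y in range(ly, uy, side)]
    return horizontal, vertical
```

For the homebase frontier with \(\sqrt {n}=4\), calling `frontiers_on_outer_rectangle(((0, 0), (4, 0)), 4)` yields six horizontal and four vertical segments, one of which is the frontier from (8,0) to (8,4) named in the caption.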

Informally speaking, the two above concepts, namely frontiers and rectangles, provide a template on how the search may progress. However, due to the structure of a partial grid it may be possible that only certain nodes, but not all, that lie on a frontier have been reached at some point of a search strategy. For this reason, our notation needs to be extended to subsets of nodes that lie on frontiers and the corresponding rectangles. Any subset C of nodes of G that belong to some frontier F is called a checkpoint. The 0-th expansion of a checkpoint C is C itself and is denoted by C〈0〉. For \(i \in \{1,\dots ,\sqrt {n}\}\) we define the i-th expansion of C, denoted by C〈i〉, recursively as follows: the set C〈i〉 consists of all nodes v ∉ C〈0〉 ∪ C〈1〉 ∪ ⋯ ∪ C〈i − 1〉 for which there exists a node u ∈ C〈i − 1〉, such that there exists a path between v and u in the subgraph of G induced by nodes that lie on the rectangles \(\mathcal {R}(F,0),\mathcal {R}(F,1), \ldots , \mathcal {R}(F,i)\). Define

$$C^{+}\langle i\rangle= C\langle 0\rangle\cup\ldots\cup C\langle i\rangle, \quad i\in\{0,\ldots,\sqrt{n}\}.$$

Informally, C〈i〉 consists of only those nodes that belong to the rectangle \(\mathcal {R}(F,i)\) that are connected to nodes of C by paths that lie ‘inside’ of \(\mathcal {R}(F,i)\). This definition captures the behavior of searchers (in our algorithm) that guard the nodes of C and ‘expand’ from C in all directions: any nodes that belong to one of the rectangles \(\mathcal {R}(F,0),\mathcal {R}(F,1), \ldots , \mathcal {R}(F,i)\) but do not belong to \(C^{+}\langle i\rangle\) will not be reached by the searchers. See Fig. 2 for an example of a checkpoint with its expansions.

Fig. 2

Some expansions of a checkpoint C (here \(\sqrt {n}=9\)); crosses denote C = C〈0〉, the gray area covers nodes that belong to \(C^{+}\langle 3\rangle\), empty squares denote nodes in C〈4〉 and dark squares denote the ones that need to be guarded provided that the gray area consists of the clear nodes. The horizontal dotted line that contains h is the considered frontier
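The recursive definition of expansions translates into a layered breadth-first search. The sketch below is one possible reading of that definition, under two assumptions: the nodes lying on the rectangles \(\mathcal {R}(F,0),\ldots ,\mathcal {R}(F,i)\) are exactly the nodes inside the area bounded by \(\mathcal {R}(F,i)\), and the i-th expansion contains only nodes not already present in earlier expansions. The function name and the adjacency-dictionary input format are illustrative.

```python
from collections import deque


def expansions(adj, frontier, checkpoint, max_i):
    """Return [C<0>, C<1>, ..., C<max_i>] as a list of sets of nodes.

    adj: dict mapping each node (x, y) of the partial grid to its neighbors.
    frontier: ((x, y), (x2, y2)) with x <= x2 and y <= y2; it contains
    every node of `checkpoint`.
    """
    (x, y), (x2, y2) = frontier
    levels = [set(checkpoint)]
    reached = set(checkpoint)                  # union of all earlier expansions
    for i in range(1, max_i + 1):
        def inside(p):
            # Area covered by the rectangles with indices 0, 1, ..., i.
            return x - i <= p[0] <= x2 + i and y - i <= p[1] <= y2 + i

        new = set()
        queue = deque(levels[-1])              # paths must start in C<i-1>
        visited = set(levels[-1])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in visited and inside(w):
                    visited.add(w)
                    queue.append(w)
                    if w not in reached:
                        new.add(w)
        levels.append(new)
        reached |= new
    return levels
```

The union of the first i + 1 returned sets corresponds to \(C^{+}\langle i\rangle\). Nodes that can only be reached by leaving the area of the i-th rectangle are deliberately left out of the i-th set, matching the informal description above.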

3 Lower Bound

First note that a regular \(\sqrt {n}\times \sqrt {n}\) grid requires \({\varOmega }(\sqrt {n})\) searchers even in the off-line setting [18], that is, when the network is known in advance and the searchers may decide on the location of the homebase. Therefore, our on-line algorithm is asymptotically optimal with respect to this worst case measure.

We aim at proving that for each on-line algorithm A there exists an n-node partial grid network G with homebase h such that \(\max _{h}A(G,h)/\texttt {mcs}(G)={\varOmega }(\sqrt {n}/\log n)\) (Footnote 4).

Define a class of partial grids

$$ \mathscr{L}=\bigcup_{l\geq 0}\mathscr{L}_{l}, $$

where \(\mathscr{L}_{l}\) for l ≥ 0 is defined recursively as follows. We take \(\mathscr{L}_{0}\) to contain one network that is a single node located at (0, 0). Then, in order to describe how \(\mathscr{L}_{l+1}\) is obtained from \(\mathscr{L}_{l}\), l ≥ 0, we introduce an operation of extending \(G\in \mathscr{L}_{l}\) at i, for i ∈ {0,…,l}. In this operation, first take G and add l + 2 new nodes located at coordinates:

$$ (0,l+1),(1,l),\ldots,(j,l+1-j),\ldots,(l+1,0). $$

Call these coordinates the (l + 1)-th diagonal. For each j ∈ {0,…,i} add an edge connecting the nodes \(v\left (j,l-j\right )\) and \(v\left (j,l-j+1\right )\), and for each j ∈ {i,…,l} add an edge connecting the nodes \(v\left (j,l-j\right )\) and \(v\left (j+1,l-j\right )\). Then, obtain \(\mathscr{L}_{l+1}\) as follows: initially take \(\mathscr{L}_{l+1}\) to be empty and then for each \(G\in \mathscr{L}_{l}\) and for each i ∈ {0,…,l}, obtain a network G′ by extending G at i and add G′ to \(\mathscr{L}_{l+1}\). Notice here that a graph constructed this way is not only a partial grid, but also a tree.

Figure 3 shows a network that was obtained from the corresponding network in \(\mathscr{L}_{7}\) by extending it at 6.

Fig. 3

A network from \(\mathscr{L}_{8}\) obtained from the corresponding network in \(\mathscr{L}_{7}\) by extending it at 6

For a network \(G\in \mathscr{L}_{l}\), l ≥ 0, we define a characteristic sequence of G, σ(G), as follows. If l = 0, then the characteristic sequence of G is empty. If l > 0, then take the network G′ such that G has been obtained by extending G′ at i. The characteristic sequence σ(G) is constructed by appending to σ(G′) a new element \(v\left (i,l-i-1\right )\). Note that the characteristic sequence uniquely defines the corresponding network. In other words, G is a binary tree rooted at v(0,0) with l + 1 leaves, where only the vertices from σ(G) have two children. The network introduced in Fig. 3 has characteristic sequence \((v\left (0,0\right )\), \(v\left (1,0\right )\), \(v\left (1,1\right )\), \(v\left (0,3\right )\), \(v\left (3,1\right )\), \(v\left (2,3\right )\), \(v\left (1,5\right )\), \(v\left (6,1\right ))\).
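The construction of the class can be replayed programmatically. The following sketch builds a network by successive extensions and records its characteristic sequence. The function name and the representation (a list holding, for each level l, the index i at which the level-l network is extended) are illustrative assumptions. The list `[0, 1, 1, 0, 3, 2, 1, 6]` used in the usage note is derived here from the characteristic sequence quoted above for Fig. 3; it is not stated explicitly in the text.

```python
def build_network(extension_indices):
    """Build a network by successive extensions, starting from a single node.

    extension_indices[l] is the index i used when going from level l to l + 1,
    so it must satisfy 0 <= i <= l.  Returns (nodes, edges, sigma), where sigma
    is the characteristic sequence as a list of (x, y) pairs.
    """
    nodes = {(0, 0)}
    edges = set()
    sigma = []
    for l, i in enumerate(extension_indices):
        if not 0 <= i <= l:
            raise ValueError(f"cannot extend a level-{l} network at {i}")
        # New nodes on the (l + 1)-th diagonal.
        nodes.update((j, l + 1 - j) for j in range(l + 2))
        # Vertical edges for j = 0..i, horizontal edges for j = i..l.
        for j in range(0, i + 1):
            edges.add(((j, l - j), (j, l - j + 1)))
        for j in range(i, l + 1):
            edges.add(((j, l - j), (j + 1, l - j)))
        # v(i, l - i) is the only node of diagonal l that gets two children.
        sigma.append((i, l - i))
    return nodes, edges, sigma
```

Calling `build_network([0, 1, 1, 0, 3, 2, 1, 6])` reproduces the characteristic sequence listed above and gives a tree with 45 nodes, 44 edges and 9 leaves, in line with the claim that a network in \(\mathscr{L}_{l}\) has l + 1 leaves.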

Lemma 1

For any integer l and for each on-line algorithm A computing a connected monotone search strategy there exists \(G\in \mathscr{L}_{l}\) such that for homebase \(v\left (0,0\right )\) we have \(A(G,v\left (0,0\right ))\geq (l+1)/2\).

Proof

Consider any algorithm A producing a connected monotone search strategy. Run A for each network in \(\mathscr{L}_{l}\) with the homebase \(v\left (0,0\right )\). Note that for each network in \(\mathscr{L}_{l}\), there exist distinct moves \(m_{1},\ldots ,m_{l}\) such that until the beginning of move \(m_{j}\), j ∈ {1,…,l}, no node on the j-th diagonal has been occupied by a searcher and at the end of \(m_{j}\) some node \(v\left (x_{j},y_{j}\right )\) of the j-th diagonal is occupied by a searcher. Consider \(G\in \mathscr{L}_{l}\) such that \(\sigma (G)=(v\left (0,0\right ),v\left (x_{1},y_{1}\right ),\ldots ,v\left (x_{l-1},y_{l-1}\right ))\). Informally speaking, whenever the algorithm reaches for the first time a node \(v\left (i,j-i\right )\) in the j-th diagonal, an adversary decides to extend at i the network explored so far, thus always forcing the situation that the first node reached on a diagonal is of degree three.

Note that at the beginning of move \(m_{j}\), j ∈ {1,…,l}, no node of the j-th diagonal has been reached by a searcher and the first j nodes of the characteristic sequence have been reached by searchers. Recall that G is a binary tree.

We analyze the explored part of any graph \(G \in \mathscr{L}_{l}\) at the beginning of the move \(m_{l}\). All edges incident to the leaves in G are contaminated at this point. On the other hand, all nodes of the characteristic sequence have been visited by searchers by the end of the move \(m_{l-1}\). Therefore, the contaminated subgraph of G at this point is a collection of paths leading from nodes that are guarded to the leaves. Since there are l + 1 leaves in G, there are l + 1 such paths. Each such path needs to have a searcher placed at one of its endpoints (the one that is not a leaf in G) and, by construction of G, any searcher can be present on at most two such endpoints. Thus, at least (l + 1)/2 nodes need to be occupied by searchers, as required by the lemma. □

Theorem 1

For each on-line algorithm A computing a connected monotone search strategy there exists an n-node network G with homebase h such that

$$ \frac{A(G,h)}{\texttt{mcs}(G)}={\varOmega}(\sqrt{n}/\log n). $$

Proof

Observe that each network G in \(\mathscr{L}\) is a tree and therefore \(\texttt {mcs}(G)=O(\log (n))\), n = |V (G)| [2, 34]. The theorem follows hence from Lemma 1 and the fact that the length of the characteristic sequence of each network in \(\mathscr{L}_{l}\) is \({\varOmega }(\sqrt {n})\). □
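The counting behind the last step can be spelled out as follows. This is a reconstruction from the definition of \(\mathscr{L}_{l}\) (one node on diagonal 0, two on diagonal 1, and so on up to l + 1 nodes on diagonal l), not a formula quoted from the proof, and it applies to values of n that arise as sizes of networks in \(\mathscr{L}_{l}\):

$$ n=\sum_{j=0}^{l}(j+1)=\frac{(l+1)(l+2)}{2} \quad\Longrightarrow\quad (l+2)^{2}>2n \quad\Longrightarrow\quad \frac{l+1}{2}>\frac{\sqrt{2n}-1}{2}={\varOmega}(\sqrt{n}). $$

Combining this with Lemma 1 and \(\texttt {mcs}(G)=O(\log n)\) gives

$$ \frac{A(G,v(0,0))}{\texttt{mcs}(G)}\geq \frac{(l+1)/2}{O(\log n)}={\varOmega}(\sqrt{n}/\log n). $$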

4 The Algorithm

In this section we describe our algorithm that takes an upper bound on the size of the network as an input. Section 4.1 deals with the initialization performed at the beginning of the algorithm. Then, Section 4.2 introduces two procedures used by the algorithm and finally Section 4.3 states the main algorithm.

We point out that the strategy to be computed is monotone. This means that whenever a new node has been reached by some searcher, the node will be guarded as long as it has some incident contaminated edges. After each move performed by searchers, each searcher that occupies a node that does not need to be guarded is said to be free. Each node that needs to be guarded is occupied by at least one searcher; if more searchers occupy such a node then all of them except for one are also free. Once all incident edges of a guarded node v become clear, the searcher that has been guarding v becomes immediately free. We do not express this fact explicitly in the algorithm, as the above rule is sufficient to partition the searchers into the free and guarding ones at any point of the strategy computed by the algorithm. Before we start the description of the algorithm, we stress how we ‘reuse’ searchers that are free. Whenever the algorithm decides that a searcher needs to perform some action the following decision takes place. If there exists a searcher that is free, then the action is made by an arbitrary such searcher. If there is no free searcher, then a new one is introduced by the algorithm to perform the action. Thus, in our analysis we will count the number of searchers introduced throughout the execution of the algorithm.
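The reuse rule for free searchers amounts to simple bookkeeping. The class below is a minimal sketch of that bookkeeping only; the class and method names are hypothetical, and the decision of when a searcher becomes free (all edges incident to its node are clear, or another searcher already guards the node) is left to the caller.

```python
class SearcherPool:
    """Tracks how many searchers have been introduced so far."""

    def __init__(self):
        self.free = []          # identifiers of searchers that are currently free
        self.introduced = 0     # total number of searchers ever introduced

    def acquire(self):
        """Return a free searcher if one exists, otherwise introduce a new one."""
        if self.free:
            return self.free.pop()
        self.introduced += 1
        return self.introduced - 1

    def release(self, searcher):
        """Mark a searcher as free, e.g. once its node no longer needs guarding."""
        self.free.append(searcher)
```

The quantity bounded in the analysis corresponds to the final value of `introduced`, since a new searcher is created only when no free one is available.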

If, at some point, no node of the last expansion of some checkpoint needs to be guarded, then we say that the expansion is empty.

4.1 Initialization

We start presenting our algorithm by describing initial conditions. Recall that the origin \(v\left (0,0\right )\) of the two-dimensional xy coordinate system is situated in the homebase. The initial checkpoint C0 is the set of nodes of the connected component of G[F] that contains h, where F is the homebase frontier. Thus, initially |C0| searchers place themselves on all nodes of C0 (note that the nodes of C0 induce a path in G). See Fig. 4 for an example.

Fig. 4

Example initialization for \(\sqrt {n} = 9\); crosses denote nodes belonging to the initial checkpoint C0 and empty circles denote nodes that belong to the homebase frontier, but do not fall into C0.
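The initial checkpoint can be computed by a short walk along the homebase frontier. The sketch below assumes the homebase is v(0,0) and that the homebase frontier runs from (0,0) to \((\sqrt {n},0)\), so the component of G[F] containing h can only extend to the right. The function name, the adjacency-dictionary input and the parameter `side` (standing for \(\sqrt {n}\)) are illustrative.

```python
def initial_checkpoint(adj, side):
    """Connected component of G[F] that contains the homebase (0, 0).

    F is the homebase frontier from (0, 0) to (side, 0), and adj maps each
    node (x, y) of the partial grid to the set of its neighbors.
    """
    component = [(0, 0)]
    x = 0
    # Walk right along the frontier while the next node and the edge exist.
    while x < side and (x + 1, 0) in adj.get((x, 0), ()):
        x += 1
        component.append((x, 0))
    return component
```

The returned list is ordered along the frontier, which reflects the remark that the nodes of C0 induce a path; its length is the number of searchers placed initially.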

4.2 Procedures

4.2.1 Procedure ClearExpansion

We start with an informal description of the procedure. When a new checkpoint C has been reached, our search strategy ‘expands’ from C by successively clearing subgraphs \(G[C^{+}\langle i\rangle]\) for \(i\in \{1,\ldots ,\sqrt {n}\}\). Once all nodes in \(C^{+}\langle i-1\rangle\) are clear for some \(0<i\leq \sqrt {n}\), the transition to the state in which all nodes in \(C^{+}\langle i\rangle\) are clear requires clearing all nodes of the i-th expansion of C. This is done by calling for every guarded node u from \(C^{+}\langle i-1\rangle\) a special procedure (ModConnectedSearching, described below), which clears nodes that belong to C〈i〉 and ‘can be accessed’ from u. Procedure ClearExpansion makes the above-mentioned calls to ModConnectedSearching and uses \(O(\sqrt {n})\) searchers in the process.

For clearing all nodes of the i-th expansion of C, provided that \(G[C^{+}\langle i-1\rangle]\) is clear, we will use a procedure from [7]. That procedure is more general and it is stated in [7] as Procedure ConnectedSearching with its performance stated in Theorem 1 in [7]. Here we give the following reformulation that uses our notation.

Theorem 2

[7] Let F be any frontier and let G be any connected partial grid whose nodes lie entirely on the rectangles \(\mathcal {R}(F,0),\mathcal {R}(F,1), \ldots ,\mathcal {R}(F,i)\), i ≥ 0. There exists an on-line procedure ConnectedSearching that, starting at an arbitrarily chosen homebase in G, clears G in a connected and monotone way using 6i + 4 searchers.

We stress that the above theorem assumes that the partial grid is entirely contained in the area covered by the rectangles. In other words, the subgraph G in Theorem 2 has no vertices ‘outside’ of the specified area. However, when using procedure ConnectedSearching, we will be clearing a subgraph of G[C+〈i〉] that is embedded into the entire partial grid, and thus some nodes v of G[C+〈i〉] have edges leading to neighbors that lie outside of G[C+〈i〉]. If such an edge is already clear, then no recontamination happens for the node v and, moreover, no searcher used by ConnectedSearching for the subgraph of G[C+〈i〉] needs to stay at v. On the other hand, if such an edge is contaminated (and thus not yet reached by our search strategy), then v needs to be guarded, and to that end we place an extra searcher on it that guards v during the remaining execution of ConnectedSearching. Note that in the latter case the node v belongs to \(\mathcal {R}(F,i)\), where F is the frontier that contains the nodes of C, and therefore there exist \(O(\sqrt {n})\) such nodes v. In other words, ConnectedSearching is called to clear a certain subgraph contained within \(\mathcal {R}(F,i)\), and whenever a node v on the rectangle \(\mathcal {R}(F,i)\) has a contaminated edge leading outside of \(\mathcal {R}(F,i)\), an extra searcher, not accounted for by ConnectedSearching in Theorem 2, is introduced and left behind to guard v. The modification of ConnectedSearching that leaves behind a searcher on each such newly reached node of \(\mathcal {R}(F,i)\) will be denoted by ModConnectedSearching. Note that this procedure is invoked for every guarded node from C+〈i − 1〉 in order to clear C+〈i〉; see Fig. 5 for an example.

Fig. 5. Example of an execution of procedure ClearExpansion; crosses denote C = C〈0〉, empty circles denote nodes that belong to C+〈1〉, dark squares denote the ones that belong to C+〈1〉 and for which procedure ModConnectedSearching is invoked, gray areas show nodes that will be cleared in four calls of ModConnectedSearching in order to clear C〈2〉. Note that the empty circles that lie on a gray area are guarded at first, but after one of the calls of ModConnectedSearching there is no need to guard them any more, so the procedure is not invoked for them.

It follows that it is enough to provide as an input to ModConnectedSearching: a node v in C+〈i − 1〉 that plays the role of the homebase for ModConnectedSearching, the frontier F and i. We stress that there are possibly many such nodes v and, once one of them is selected, some other such nodes in C+〈i − 1〉 may no longer have an incident contaminated edge, since the call to ModConnectedSearching cleared it. However, we assume that ModConnectedSearching clears only the maximal connected subgraph that contains v and is induced by contaminated edges only. Thus, once its execution is completed, there may exist another such vertex for which a new call to ModConnectedSearching will be made to clear another maximal connected subgraph induced by contaminated edges. See Fig. 5, which illustrates this process: the shaded areas indicate which subgraphs are actually cleared by subsequent calls to ModConnectedSearching. We point out that, alternatively, a single call to ModConnectedSearching would suffice if the procedure ‘processed’ the entire subgraph contained in the expansion C+〈i〉, but this approach would ignore that some subgraph of C+〈i〉 is already clear; hence we present the procedure as having multiple calls to ModConnectedSearching that work on contaminated edges only. We note that each checkpoint used in our final algorithm is obtained as follows: some frontier F is selected and then a checkpoint C is created as some set of nodes that belong to F; thus we assume that each checkpoint C has such a unique frontier F associated with it.

Thus, this approach guarantees that at most 6i + 4 searchers are used to clear G[C〈i〉] and, in addition to those, \(2\sqrt {n} + 8i\) searchers for guarding nodes lying on \(\mathcal {R}(F,i)\), which will be analyzed in more detail in Section 5.

To summarize, we give a formal statement of our procedure.

figure a

The following observation summarizes the outcome of an execution of procedure ClearExpansion.

Lemma 2

Suppose that C〈i − 1〉, an expansion contained in a frontier F, where i ≥ 1, is an input to procedure ClearExpansion. Suppose that G′ is the maximal subgraph contained in G[C〈i〉] and induced by all nodes v such that there exists a path contained in G[C〈i〉] connecting v with a vertex of C. Then, a call to ClearExpansion with the above input provides the following:

  • exactly the edges of G′ that are contaminated prior to the call are cleared during this call to ClearExpansion,

  • after the call, each vertex of G′ with an incident contaminated edge is guarded by a searcher,

  • all of the nodes from C〈i − 1〉 are cleaned and do not have to be guarded.

We point out that there may be an indirect interaction between different checkpoints. Consider an execution of procedure ClearExpansion with an input C〈i − 1〉. At the point of performing this call, there may exist a different checkpoint C′ and a corresponding expansion C′〈i′〉 such that some searcher is guarding a node v of C′〈i′〉 because v has (assuming this for simplicity) a single contaminated edge e incident to it. It may happen that during the execution of ClearExpansion the edge e becomes clear, as it belongs to C+〈i〉. As a result, v is no longer guarded (since it has no incident contaminated edges) and the corresponding searcher becomes free.

4.2.2 Procedure UpdateCheckpoints

By definition, if F is some frontier, then \(\mathcal {R}(F,\sqrt {n})\) contains 10 frontiers (see Fig. 1). Thus, reaching the \(\sqrt {n}\)-th expansion \( C\langle \sqrt {n}\rangle \) of a checkpoint of F provides a possibility of creating one new checkpoint for each of the above frontiers. Procedure UpdateCheckpoints, which takes as an input \( C\langle \sqrt {n}\rangle \) and a collection \(\mathcal {C}\) of currently present checkpoints, generates these new checkpoints, adds them to \(\mathcal {C}\) and removes C from \(\mathcal {C}\). Also, if some newly constructed checkpoint belongs to the same frontier as some existing checkpoint in \(\mathcal {C}\) and no expansion of the existing one has been performed yet, then both checkpoints are merged into one. Finally, any checkpoint in \(\mathcal {C}\) whose most recently performed expansion is empty is removed from \(\mathcal {C}\). We remark that procedure UpdateCheckpoints only modifies the collection of checkpoints \(\mathcal {C}\) and performs no clearing moves.

figure b

Thus, to summarize, the ‘lifetime’ of a checkpoint is as follows. Once the first expansion of C is performed, the checkpoint will remain in the collection \(\mathcal {C}\) and possibly more expansions of C are made (in total at most \(\sqrt {n}\) expansions are possible for each checkpoint). A checkpoint C may disappear from \(\mathcal {C}\) in three ways:

  • when C is in its 0-th expansion and another checkpoint C′ appears in the same frontier (thus, C′ is in its 0-th expansion) and then the nodes of C are added to C′, or

  • some expansion of C becomes empty (then C is not removed from \(\mathcal {C}\) right away but during the subsequent call to UpdateCheckpoints), or

  • C reaches its \(\sqrt {n}\)-th expansion and procedure UpdateCheckpoints is called for C (in which case C possibly ‘gives birth’ to new checkpoints during the execution of UpdateCheckpoints).

Our algorithm maintains a collection \(\mathcal {C}\) of currently used checkpoints.

4.3 Procedure GridSearching

GridSearching is the main algorithm, whose aim is to clear the entire partial grid G in a connected and monotone way. We start with an informal introduction of the algorithm. The search strategy it produces is divided into phases, which will be formally defined in the next section. In each step of the algorithm, a checkpoint with the highest number of nodes that need to be guarded is chosen and the next expansion is made on it. When one of the checkpoints reaches its \(\sqrt {n}\)-th expansion, the current phase ends and procedure UpdateCheckpoints is invoked. Thus, the division of the search strategy into phases is dictated by consecutive calls to procedure UpdateCheckpoints. For a checkpoint C, in the pseudocode below we write δ(C) to refer to the set of nodes that belong to the last expansion of C and need to be guarded at a given point.

figure c
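The informal description above can be sketched in code as follows. This is only an illustration under stated assumptions and not the pseudocode of the procedure: the function `grid_searching`, its callback parameters `delta`, `clear_expansion` and `update_checkpoints`, and the dictionary `level` are hypothetical names, and the two nested loops are flattened into one loop for brevity.

```python
def grid_searching(initial, sqrt_n, delta, clear_expansion, update_checkpoints):
    """Sketch of the main loop; returns the number of completed phases.

    delta(c)                  -> set of nodes of the last expansion of c to guard
    clear_expansion(c, i)     -> performs the i-th expansion of c (one step)
    update_checkpoints(c, cs) -> new collection after c reached level sqrt_n
    """
    checkpoints = [initial]
    level = {initial: 0}          # number of expansions performed so far
    phases = 0
    while checkpoints:
        # Choose a checkpoint with the most nodes that need to be guarded.
        c_max = max(checkpoints, key=lambda c: len(delta(c)))
        if not delta(c_max):
            break                 # nothing is guarded any more
        level[c_max] += 1
        clear_expansion(c_max, level[c_max])
        if level[c_max] == sqrt_n:
            # The phase ends: only the collection of checkpoints changes here.
            checkpoints = update_checkpoints(c_max, checkpoints)
            for c in checkpoints:
                level.setdefault(c, 0)
            phases += 1
    return phases


# Toy run: one checkpoint, no new checkpoints are created.
guarded = {"C0": {"a", "b"}}
steps = []
phases = grid_searching(
    "C0", 3,
    delta=lambda c: guarded.get(c, set()),
    clear_expansion=lambda c, i: steps.append((c, i)),
    update_checkpoints=lambda c, cs: [],
)
```

In the toy run the single checkpoint is expanded three times, after which the only phase ends and the collection becomes empty.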

We now introduce a classification of searchers used in our algorithm. This classification will be used in the proof of Theorem 3 but we place it here as it provides another way of describing several actions that take place in the algorithm. We can divide searchers into three groups: explorers, cleaners and guards. Suppose that procedure ClearExpansion performs the i-th expansion of a checkpoint \(C_{\max \nolimits }\). Denote by \(F_{\max \nolimits }\) the frontier that contains the nodes in \(C_{\max \nolimits }\). All searchers located at nodes on the (i − 1)-th rectangle of \(F_{\max \nolimits }\) that need to be occupied in order to avoid recontamination at the beginning of the call to procedure ClearExpansion are named to be guards. The explorers and cleaners are used by algorithm ModConnectedSearching called during the execution of procedure ClearExpansion. Each time ModConnectedSearching reaches a node v on the i-th rectangle of \(F_{\max \nolimits }\) such that v needs to be guarded, the searcher used for guarding v is called an explorer. The searchers used in ModConnectedSearching that mimic the movements of searchers in algorithm ConnectedSearching are the cleaners. We point out that we do not alter here the behavior of ClearExpansion and ModConnectedSearching but just assign one of the three categories to each searcher they use. Informally speaking, when explorers protect nodes lying on the i-th rectangle and the guards protect the ones lying on the (i − 1)-th rectangle of \(F_{\max \nolimits }\), cleaners clear nodes inside the i-th rectangle of \(F_{\max \nolimits }\) (i.e., the remaining nodes of the i-th expansion of \(C_{\max \nolimits }\)).

We close this section by giving examples of the first three expansions of some checkpoint C, see Fig. 6, and by showing how our algorithm clears an exemplary partial grid network, see Fig. 7 (for a formal definition of a phase see the first paragraph of Section 5).

Fig. 6. First three expansions for some checkpoint C (here \(\sqrt {n}=9\)); crosses denote C = C〈0〉, empty circles denote nodes cleared in previous expansions; squares denote nodes explored in the current expansion; dark circles are nodes not reached yet by the searchers; and dark squares denote nodes that need to be guarded at the end of the current expansion. Gray areas show the clear part of the graph, i.e., C+〈i〉 for i ∈{1,2,3}.

Fig. 7. Clearing an exemplary partial grid by procedure GridSearching; gray areas denote the clear part, arrows denote frontiers on which the marked checkpoints lie, dotted rectangles around checkpoints denote their current expansions and solid rectangles denote the \(\sqrt {n}\)-th expansions, which end phases.

5 Analysis of the Algorithm

By a step of the algorithm, or simply a step, we mean all searching moves performed during a single iteration of the internal ‘while’ loop of procedure GridSearching. Thus, one step of the algorithm includes all moves produced by one call to procedure ClearExpansion. A phase of an algorithm consists of all its steps between two consecutive calls to procedure UpdateCheckpoints. Note that phases may differ with respect to the number of steps they are made of.

We say that a checkpoint is present in a given phase if its last expansion is not empty at the beginning of this phase, i.e., if this checkpoint belongs to \(\mathcal {C}\) at the beginning of the phase. Similarly, a checkpoint is present in a given step if it is present in the phase to which the step belongs. Thus, in particular, a checkpoint is present in none or in all steps of a given phase. Note that some checkpoints may have empty expansions during a part of the phase, but they still remain present until the end of the phase; this assumption is made to simplify the analysis of the algorithm.

Let t be a step and v be a node that needs to be guarded at the beginning of step t. We say that the checkpoint C owns v in step t if:

  • either C owns v in step t − 1 or

  • no checkpoint owns v in step t − 1 and v belongs to the last expansion of C performed till the end of step t − 1.

(Intuitively, if a node v is reached by searchers in a step in which an expansion of C occurred, then C owns v as long as v is guarded.) We note that any vertex v can be owned by only one checkpoint. This follows from the fact that our strategy is monotone. More precisely, once v is owned by some checkpoint C in some step, then in the following steps it either continues to be owned by C or v does not need to be guarded. In the latter case v will not be owned by any checkpoint till the end of the strategy. Given a checkpoint C present in a step t, we write \(\mathcal {E}(C,t)\) to denote the set of nodes that C owns in step t. The weight of a checkpoint C present in a step t is \(\omega _{t}(C) = |\mathcal {E}(C,t)|\), and if a checkpoint C is not present in a step t, then we take \(\omega _{t}(C) = 0\). Note that each guarded node is owned by exactly one checkpoint and hence, for a step t, the sum of weights of all checkpoints present in step t equals the number of nodes that need to be guarded.
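The two ownership rules and the definition of weight can be sketched as follows. The function names `update_owners` and `weight` and the dictionary-based representation are assumptions made for illustration; in particular, if a newly guarded node lay in the last expansions of several checkpoints, this sketch would simply pick the first one found.

```python
def update_owners(owners, guarded, last_expansion):
    """Compute the ownership in step t from the ownership in step t - 1.

    owners:         node -> checkpoint owning it in step t - 1
    guarded:        nodes that need to be guarded at the beginning of step t
    last_expansion: checkpoint -> nodes of its last expansion up to step t - 1
    """
    new_owners = {}
    for v in guarded:
        if v in owners:
            new_owners[v] = owners[v]          # first rule: ownership persists
        else:
            for c, nodes in last_expansion.items():
                if v in nodes:                 # second rule: v was just reached
                    new_owners[v] = c
                    break
    return new_owners


def weight(owners, checkpoint):
    """Number of nodes owned by the checkpoint."""
    return sum(1 for c in owners.values() if c == checkpoint)


owners_prev = {"u": "A", "v": "A"}
guarded_now = {"v", "w"}                       # u no longer needs a guard
expansions = {"A": {"u", "v"}, "B": {"w"}}
owners_now = update_owners(owners_prev, guarded_now, expansions)
```

Here the weights of A and B are both 1, and they sum to the number of guarded nodes.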

The checkpoint \(C_{\max \nolimits }\) selected in a step t (see the pseudocode of Procedure GridSearching) is called active in step t, or simply active if the step is clear from the context or not important. All other checkpoints present in this step are called inactive. We define an active interval of a checkpoint C to be a maximal interval [t′, t″] such that C is active in all steps t ∈{t′,…,t″}.
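As a small illustration, active intervals can be read off from the sequence of active checkpoints. The helper `active_intervals` and the list-based encoding of steps are hypothetical.

```python
from itertools import groupby


def active_intervals(active):
    """active[t] is the checkpoint active in step t.

    Returns a list of triples (checkpoint, first_step, last_step), one per
    maximal run of consecutive steps with the same active checkpoint.
    """
    intervals = []
    t = 0
    for checkpoint, run in groupby(active):
        length = len(list(run))
        intervals.append((checkpoint, t, t + length - 1))
        t += length
    return intervals


intervals = active_intervals(["A", "A", "B", "A"])
```

In this example checkpoint A has two active intervals, [0, 1] and [3, 3].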

5.1 Single Phase Analysis — How Weights of Checkpoints Evolve

We now prove lemmas that characterize how the weight of a checkpoint changes over time — see Fig. 8 for an exemplary life cycle of a checkpoint. Informally, the weight of a checkpoint C does not grow in intervals in which C is inactive (Lemma 3). Also, the weight of C at the end of an active interval is not greater than at the beginning of it (Remark 1); however, no upper bounds except for the trivial one of \(O(\sqrt {n})\) can be concluded for the weight of C inside its active interval.

Lemma 3

If a checkpoint C is present and inactive in a step t, then \(\omega _{t+1}(C) \leq \omega _{t}(C)\).

Proof

It follows directly from the definitions and procedure ClearExpansion that the only checkpoint on which an expansion is performed during execution of ClearExpansion is the active one. The weight of an inactive checkpoint C can change only in the situation where the active checkpoint in a step t expands on some nodes owned by C. In other words, the weight of C may decrease if C contains in step t nodes that are added to the active checkpoint in step t + 1. Thus, if t is not the last step of a phase, then the proof is completed.

If t is the last step of some phase, then apart from procedure ClearExpansion, procedure UpdateCheckpoints is invoked, which affects C in two situations:

  • there exists a step t′ in the phase that ends such that \(\omega _{t^{\prime }}(C) = 0\). Then, because C cannot be expanded during steps t′,…,t of the phase, we get directly that \(\omega _{t+1}(C) = \omega _{t}(C) = 0\).

  • C is in its 0-th expansion and a new checkpoint is placed on the same frontier, which implies that C is not present in step t + 1 and thus \(\omega _{t+1}(C) = 0\).

Thus, in all cases we obtain that \(\omega _{t+1}(C) \leq \omega _{t}(C)\). □

We next observe that, informally speaking, once a checkpoint becomes active, it remains active until either the phase ends or its weight decreases. Note that a checkpoint that is active in the last step of the phase is not present in the first step of the next phase, i.e., its weight is then zero, which allows us to state the lemma as follows:

Lemma 4

Let C be a checkpoint and let [t′, t″] be an active interval of C. For every step t ∈{t′,…,t″} it holds \(\omega _{t}(C) \geq \omega _{t^{\prime \prime }+1}(C)\).

Proof

Obviously, t′ and t″ must belong to the same phase, because at the end of each phase the active checkpoint is removed from \(\mathcal {C}\), i.e., it is no longer present in the next phase.

If t″ is the last step of the phase, then the lemma follows, because \(\omega _{t^{\prime \prime }+1}(C) = 0 \leq \omega _{t}(C)\).

We will now prove that the lemma holds when t″ is not the last step of the phase. Let us suppose for a contradiction that \(\omega _{t^{\prime \prime }+1}(C) > \omega _{t}(C)\). From the assumptions of the lemma and the definition of an active interval we get that C is not the active checkpoint in step t″ + 1. Because we are still in the same phase, there must exist a checkpoint C* such that \(\omega _{t^{\prime \prime }+1}(C^{*}) \geq \omega _{t^{\prime \prime }+1}(C)\). Moreover, from Lemma 3 we know that, because C* was inactive from step t to t″, it holds \(\omega _{t}(C^{*}) \geq \omega _{t^{\prime \prime }}(C^{*}) \geq \omega _{t^{\prime \prime }+1}(C^{*})\). This gives us

$$ \omega_{t}(C^{*}) \geq \omega_{t^{\prime\prime}+1}(C^{*}) \geq \omega_{t^{\prime\prime}+1}(C) > \omega_{t}(C), $$

which contradicts the assumption that C is the active checkpoint in step t. □

Remark 1

Let C be a checkpoint and let [t′, t″] be an active interval of C. Then, \(\omega _{t^{\prime \prime }+1}(C) \leq \omega _{t^{\prime }}(C)\).

From the two previous lemmas we now draw a conclusion about the weights of inactive checkpoints at the ends of consecutive phases.

Lemma 5

Suppose that a phase ends in a step t′ and the next one ends in a step t″. If a checkpoint C is inactive (but present) in steps t′ and t″, then \(\omega _{t^{\prime \prime }}(C) \leq \omega _{t^{\prime }}(C)\).

Proof

Each checkpoint C can be active or inactive in different steps during the whole phase. If in some step t ∈{t′,…,t″} a checkpoint C is inactive, then from Lemma 3 we have that its weight will not increase, i.e., \(\omega _{t}(C) \geq \omega _{t+1}(C)\). On the other hand, Lemma 4 guarantees that the weight of an active checkpoint cannot be greater after its active interval than at the beginning of it. □

5.2 How Many Nodes are Explored by a Checkpoint?

Define a bottleneck of a checkpoint C, denoted by b(C), to be its minimum weight taken over all steps in which C was present. (Note that a checkpoint may be present in many consecutive phases, see Fig. 8.)

Fig. 8. Exemplary life cycle of a checkpoint C.

Suppose that a node v has been reached by a searcher for the first time in a step t. Let C be the active checkpoint in step t. We say that v has been explored by C.

If an expansion of an active checkpoint C reaches in a step t a node u already explored by some checkpoint C′, then in most situations u does not need to be guarded. However, a “corner situation” might occur in which u still needs to be guarded in order to avoid recontamination. In such a case, the algorithm clearly needs one searcher on u to guard it, and so it is counted in our analysis due to the ‘ownership’ relation used in the definition of the weight of a checkpoint.

The next lemma states a lower bound on the number of nodes explored by a checkpoint reaching its last expansion.

Lemma 6

Suppose that a phase ends in a step t. Let C be the active checkpoint in step t. The number of nodes explored by C in all steps is at least \(b(C)\sqrt {n}\).

Proof

First let us remark that nodes can be explored by C only during an execution of procedure ClearExpansion that took C as an input, i.e., when C is active. Let us denote by S the set of all nodes explored by C.

Because C is active in the last step of the phase, it had to be active in exactly \(\sqrt {n}\) steps in total, which can be contained in several past phases. Let \(t_{1}, t_{2},\ldots , t_{\sqrt {n}} = t\) be all steps in which C is active. Note that

$$ \bigcup\limits_{i=1}^{\sqrt{n}}\mathcal{E}(C,t_{i}) \subseteq S $$

and \(\mathcal {E}(C,t_{i})\cap \mathcal {E}(C,t_{j})=\emptyset \) for i ≠ j. The latter follows directly from the fact that nodes in \(\mathcal {E}(C,t_{i})\) and \(\mathcal {E}(C,t_{j})\) belong to different rectangles of the frontier containing C for i ≠ j. (Recall that \(|\mathcal {E}(C,t)|=\omega _{t}(C)\) for each step t.) Also, from the definition of the bottleneck, we get that \(b(C)\leq \omega _{t_{i}}(C)\) for each \(i\in \{1,\ldots ,\sqrt {n}\}\), and hence we conclude that:

$$ |S| \geq \sum\limits_{i=1}^{\sqrt{n}}\omega_{t_{i}}(C) \geq b(C)\sqrt{n}. $$
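A small numeric illustration of the bottleneck and of the bound in Lemma 6 follows. The weights below are hypothetical values chosen to be consistent with Lemma 3 and Remark 1; they are not taken from any execution of the algorithm.

```python
def bottleneck(weights):
    """Minimum weight over all steps in which the checkpoint was present."""
    return min(weights)


sqrt_n = 3
weights = [4, 6, 4, 3, 3]      # hypothetical weights of C in steps 0..4
active_steps = [0, 1, 4]       # C is active in exactly sqrt_n steps
b = bottleneck(weights)

# The sets owned in distinct active steps are disjoint, so the number of
# explored nodes is at least the sum of the weights in the active steps.
explored_lower_bound = sum(weights[t] for t in active_steps)
```

With these values b = 3 and the sum over the active steps is 13, which is at least \(b(C)\sqrt {n} = 9\).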

We now give an upper bound on the weight of each inactive checkpoint at the end of a phase.

Lemma 7

Suppose that a phase ends in a step t. Let C1,…,Cl be all checkpoints present in this phase, where C1 is the active checkpoint in step t. Then, b(C1) ≥ \(\omega _{t}(C_{j})\) for each j ∈{2,…,l}.

Proof

Let us denote by t′ the last step in which \(\omega _{t^{\prime }}(C_{1}) = b(C_{1})\). If t′ = t, then the lemma follows directly from the definition of an active checkpoint. We will now prove that the lemma also holds when t′ < t.

Suppose that t′ and t do not belong to the same active interval of C1. From Lemma 4 we know that \(\omega _{t^{\prime }}(C_{1}) = b(C_{1})\) occurs for some t′ that does not belong to an active interval. Moreover, from Remark 1 we get that every subsequent active interval will need to start and finish with the same weight as the bottleneck, which contradicts the choice of t′ as the last step in which the weight equals b(C1).

Hence t′ and t are part of the same active interval of C1. Then, we get from Lemma 3 and the fact that C1 is active in step t′:

$$ \omega_{t}(C_{j}) \leq \omega_{t^{\prime}}(C_{j}) \leq \omega_{t^{\prime}}(C_{1}) = b(C_{1}), \quad j \in \{2,\ldots,l\}, $$

which finishes our proof. □

Let us introduce a relation ≺ on a set of checkpoints. Whenever C ≺ C′, we say that C is a predecessor of C′ and C′ is a successor of C. We stress that the construction depends on the execution of the algorithm, namely only checkpoints that appear in some step are considered, and the division of the steps into phases shapes the relation. More precisely, the relation is defined only for checkpoints added to the set \(\mathcal {C}\) during all executions of procedure UpdateCheckpoints. To construct the relation we iterate over the consecutive phases of the algorithm. Initially the relation is empty, and once the construction is done for each phase with index smaller than i, we perform the following for phase i. Let C be the active checkpoint in the last step of phase i. Let C1,…,Cl be all checkpoints, different from C, that have no successors so far and were added to \(\mathcal {C}\) till the end of phase i − 1 (including the last step). Then, let Cj ≺ C for each j ∈{1,…,l}.
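The phase-by-phase construction of the relation can be sketched as below. The sketch follows the wording of the definition literally; the function `build_relation` and its two input lists are hypothetical encodings of an execution.

```python
def build_relation(last_active, created):
    """Return a dict mapping each checkpoint to its successor.

    last_active[i]: checkpoint active in the last step of phase i
    created[i]:     checkpoints added by UpdateCheckpoints at the end of phase i
    """
    successor = {}
    known = []                         # added till the end of phase i - 1
    for i, c in enumerate(last_active):
        if i > 0:
            known.extend(created[i - 1])
        for cj in known:
            # cj differs from c and has no successor so far: cj precedes c.
            if cj != c and cj not in successor:
                successor[cj] = c
    return successor


relation = build_relation(
    last_active=["C0", "A", "D"],
    created=[["A", "B"], ["D", "E"], []],
)
```

In this toy execution B precedes A, while A and E precede D, so no checkpoint has more than two predecessors.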

An important property of our algorithm is that each checkpoint may have only a constant number of predecessors:

Lemma 8

Each checkpoint has at most 10 predecessors.

Proof

A checkpoint C can only once be active in the last step of some phase i, because after that it will not be present in any later phases. At the end of phase i the only checkpoints that do not have any successors are the ones that were constructed by the procedure UpdateCheckpoints at the end of phase i − 1. There are at most 10 such checkpoints. □

5.3 The Algorithm Uses \(O(\sqrt {n})\) Searchers in Total

We now bound the total weight of all checkpoints at the end of each phase — note that this bounds the total number of searchers used for guarding at the end of a phase. A high level intuition behind the proof of Lemma 9 is as follows. Due to Lemma 6, each checkpoint C that is active in the last step of a phase explores at least \(b(C)\sqrt {n}\) nodes in total. Therefore, the sum of bottlenecks of all such checkpoints C cannot exceed \(\sqrt {n}\). Moreover, C can have at most 10 predecessors and hence the sum of weights of those predecessors is bounded by 10b(C) according to Lemma 7. Since each checkpoint (except the one that is active in the last step of a given phase) is a predecessor of some checkpoint that is active in the last step of some phase, we bound the sum of all weights of all such checkpoints present in a given phase by \(10\sqrt {n}\).

Lemma 9

Suppose that C1,…,Cl are all checkpoints present in a phase that ends in step t, where C1 is active in step t. Then,

$$ \sum\limits_{i=1}^{l} \omega_{t}(C_{i}) \leq \omega_{t}(C_{1}) + 10\sqrt{n}. $$

Proof

Suppose that phase j ends in step t. Let ti be the last step of phase i and let \({C_{i}^{0}}\) be the active checkpoint in step ti for each i ∈{0,…,j}. We denote by s the number of nodes visited by searchers till the end of step t = tj. From Lemma 6 and the fact that the number of all nodes n is at least s we have:

$$ n \geq s \geq \sum\limits_{i=0}^{j} b({C_{i}^{0}}) \sqrt{n} \quad \Rightarrow \quad 10\sqrt{n} \geq 10 \sum\limits_{i=0}^{j} b({C_{i}^{0}}). $$
(1)

From Lemma 8 we have that the checkpoints \({C_{0}^{0}},\ldots ,{C_{j}^{0}}\) can have at most 10 predecessors. From the definition, they are constructed (i.e., added to collection \(\mathcal {C}\) during the execution of procedure UpdateCheckpoints) at the beginning of the first step of a phase at the end of which their successor is active. Let us denote by \({C_{i}^{1}},\ldots ,C_{i}^{l_{i}}\), 0 ≤ li ≤ 10, the predecessors of \({C_{i}^{0}}\) for each i ∈{0,…,j} (by li = 0 we denote that \({C_{i}^{0}}\) has no predecessors). From Lemma 7 we have:

$$ \sum\limits_{k=1}^{l_{i}} \omega_{t_{i}}({C_{i}^{k}}) \leq 10 b({C_{i}^{0}}), \quad i\in\{0,\ldots,j\}. $$
(2)

Lemma 5 assures us that weights of inactive checkpoints will not be greater at the end of the next phase than they are in the last step of current phase:

$$ \omega_{t}({C_{i}^{k}}) = \omega_{t_{j}}({C_{i}^{k}}) \leq \omega_{t_{j - 1}}({C_{i}^{k}}) \leq {\cdots} \leq \omega_{t_{i}}({C_{i}^{k}}),\ i\in\{0,\ldots,j\};\ k\in\{1,\ldots,l_{i}\}. $$
(3)

Because

$$ \{C_{1},\ldots,C_{l}\}\subseteq\{{C_{j}^{0}}\} \cup \left\{{C_{i}^{k}}\bigl|\bigr. k\in\{1,\ldots, l_{i}\}, i\in\{0,\ldots,j\} \right\}, $$

we can conclude from (3), (2) and (1) (in this order) that:

$$ \begin{array}{@{}rcl@{}} \sum\limits_{i=1}^{l} \omega_{t}(C_{i}) &\leq& \omega_{t}({C_{j}^{0}}) + \sum\limits_{i=0}^{j} \sum\limits_{k=1}^{l_{i}} \omega_{t}({C_{i}^{k}})\\ &\leq& \omega_{t}({C_{j}^{0}}) + \sum\limits_{i=0}^{j} \sum\limits_{k=1}^{l_{i}} \omega_{t_{i}}({C_{i}^{k}})\\ & \leq & \omega_{t}({C_{j}^{0}}) + \sum\limits_{i=0}^{j} 10 b({C_{i}^{0}}) \\ & \leq & \omega_{t}({C_{j}^{0}}) + 10\sqrt{n}. \end{array} $$

Theorem 3

Given an upper bound n of the size of the network as an input, the algorithm GridSearching clears in a connected and monotone way any unknown underlying partial grid network using \(O(\sqrt {n})\) searchers.

Proof

First, let us notice that the algorithm GridSearching ends with the whole network cleared. Indeed, as long as there are contaminated nodes, it will continue clearing the next expansions of the checkpoints. Because no recontamination takes place, it eventually terminates. We will bound the number of searchers s used by a single call to procedure ClearExpansion and the total number of searchers s′ used for guarding at the end of any step of the algorithm. Note that s + s′ bounds the total number of searchers used by GridSearching. In the proof we refer to the classification of searchers into explorers, cleaners and guards introduced in Section 4.

We first analyze procedure ClearExpansion to give an upper bound on s. The fact that each rectangle of a frontier contains at most \(10\sqrt {n}\) nodes and Theorem 2 give that:

$$ \begin{array}{@{}rcl@{}} \text{number of explorers} &\leq& 10 \sqrt{n}, \\ \text{number of cleaners} &\leq& 6\sqrt{n} + 4. \end{array} $$

Thus,

$$ s\leq 16\sqrt{n} +4. $$

The guards used to protect nodes lying on the (i − 1)-th rectangle are accounted for during the estimation of s′ below.

We now bound the maximal number of searchers used for guarding at the end of each step t of our search strategy, which we denote by gt. It is easy to see that \(g_{t} \leq 10\sqrt {n}\) if t belongs to phase 0.

Let us now take any step t that belongs to the i-th phase, where i > 0, and denote by t′ the last step of phase i − 1 and by C the active checkpoint in step t′. From Lemma 9 we know that \(g_{t^{\prime }} \leq \omega _{t^{\prime }}(C) + 10\sqrt {n} \leq 20\sqrt {n}\). The latter inequality follows from the fact that all nodes in \(\mathcal {E}(C,t^{\prime })\) belong to the j-th rectangle of the frontier that contains C, \(j\leq \sqrt {n}\), and the number of nodes in this rectangle is at most \(10\sqrt {n}\).

We now know that every phase starts with at most \(20\sqrt {n}\) guards. If t is the first step of an active interval of some checkpoint, then by Lemma 3 and Remark 1 we have that \(g_{t} \leq g_{t^{\prime }} \leq 20\sqrt {n}\). But if t is a step inside some active interval, then an active checkpoint can reach at most \(10\sqrt {n}\) new nodes that need to be guarded. Note that by Lemma 2, the nodes of subsequent expansions of a checkpoint that need to be guarded do not accumulate, that is, we only guard the ones of the last expansion. Because only one checkpoint can be active in one step, we conclude that for every step t we have \(g_{t} \leq 30\sqrt {n}\). Therefore, we obtain that \(s^{\prime }\leq 30\sqrt {n}\).

Thus, we obtain \(s+s^{\prime }\leq 46\sqrt {n} + 4=O(\sqrt {n})\) as required. □
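The arithmetic behind the final bound can be checked with a short sketch. The function `searcher_bounds` is a hypothetical helper that only evaluates the constants from the proof above for a given n.

```python
import math


def searcher_bounds(n):
    """Evaluate the bounds on s, s' and s + s' from the proof of Theorem 3."""
    r = math.sqrt(n)
    explorers = 10 * r            # nodes on one rectangle of a frontier
    cleaners = 6 * r + 4          # Theorem 2 with i at most sqrt(n)
    s = explorers + cleaners      # searchers used by one ClearExpansion call
    s_prime = 30 * r              # searchers used for guarding
    return s, s_prime, s + s_prime


s, s_prime, total = searcher_bounds(81)
```

For n = 81 this gives s ≤ 148, s′ ≤ 270 and s + s′ ≤ 418, which equals \(46\sqrt {n} + 4\).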

6 Unknown Size of the Graph

The algorithm we have described needs to know an upper bound on the size of the underlying partial grid network G. In this section we design a procedure called ModGridSearching that performs the search using \(O(\sqrt {n})\) searchers and having no prior information on the network. The procedure is based on a standard technique: guessing an upper bound on n by doubling potential estimate each time. More about applications of the doubling technique in designing on-line and off-line approximation algorithms can be found in [9].

The procedure ModGridSearching is composed of a certain number of rounds. In round i, procedure ModGridSearching first introduces \(c\sqrt {2^{i}}\) new searchers, called the i-th team, where c is the constant from the asymptotic notation in Theorem 3. Then, a call to GridSearching is made, in which procedure GridSearching uses only the searchers of the i-th team. The outcome can be twofold. The procedure may succeed in searching the entire graph, in which case the i-th round is the last one and ModGridSearching is completed, or the procedure may encounter a situation in which it would be forced to use more than \(c\sqrt {2^{i}}\) searchers to continue. In the latter case GridSearching stops, the i-th round ends and the (i + 1)-th round follows. Once the i-th round is completed, the searchers of the i-th team stay idle indefinitely. We point out that during the execution of the i-th round, i > 1, procedure GridSearching using the searchers of the i-th team ignores the fact that the network may be partially clear as a result of the work done in previous rounds. Moreover, the searchers of the j-th team for each j < i are not used and are thus also ignored during the i-th round.
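The round structure can be sketched as follows. This is only an illustration of the doubling technique: the callback `try_search`, which stands in for a call to GridSearching restricted to one team, and the rounding of team sizes up to integers are assumptions of the sketch.

```python
import math


def mod_grid_searching(try_search, c):
    """Run rounds 1, 2, ... until one team clears the graph.

    try_search(team_size) -> True if the search succeeds with that many
    searchers, False if it would need more.
    Returns (last_round, total_searchers_introduced).
    """
    total = 0
    i = 1
    while True:
        team = math.ceil(c * math.sqrt(2 ** i))   # size of the i-th team
        total += team                             # earlier teams stay idle
        if try_search(team):
            return i, total
        i += 1


# Toy run: the search succeeds as soon as a team has at least 5 searchers.
rounds, total = mod_grid_searching(lambda team: team >= 5, c=1)
```

In the toy run the teams have sizes 2, 2, 3, 4 and 6, so five rounds are needed and 17 searchers are introduced in total.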

We close this section by giving an upper bound on the number of searchers that need to be used in the presented modified version of our algorithm.

Theorem 4

The on-line algorithm ModGridSearching clears (starting at an arbitrary homebase) in a connected and monotone way any unknown underlying partial grid network using \(O(\sqrt {n})\) searchers. The algorithm receives no prior information on the network.

Proof

Let n be the number of nodes of the partial grid network, which is unknown to our procedure. The number of rounds m fulfills \(2^{m-1} < n \leq 2^{m}\), i.e., \(m = \left \lceil \log _{2}{n} \right \rceil \). At the end of the i-th round, \(c \sqrt {2^{i}}\) searchers need to stay in their last positions till the end of our procedure and are not used in subsequent rounds. This means that the total number of searchers s is upper bounded by the sum of the numbers of searchers used in every round:

$$ \begin{array}{@{}rcl@{}} s & \leq & c\sqrt{2} + c\sqrt{2^{2}} + {\dots} + c\sqrt{2^{\left\lceil \log_{2}{n} \right\rceil}} = c\sum\limits_{j=1}^{\left\lceil \log_{2}{n} \right\rceil}\left( \sqrt{2} \right)^{j} \\ & = & \sqrt{2}c \frac{ 1- \sqrt{2}^{\left\lceil \log_{2}{n} \right\rceil } }{1 - \sqrt{2} } = \frac{\sqrt{2}c}{\sqrt{2} - 1} \left( \sqrt{2^{\left\lceil \log_{2}{n} \right\rceil }} - 1 \right). \end{array} $$

Because \(\sqrt {n} \leq \sqrt {2^{\left \lceil \log _{2}{n} \right \rceil }} < \sqrt {2n}\), we conclude

$$s < \frac{\sqrt{2}c}{\sqrt{2} - 1} \left( \sqrt{2n} - 1 \right)\quad \Rightarrow \quad s= O(\sqrt{n}) .$$
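As a numeric sanity check of the derivation above, the following Python sketch compares the geometric sum over all rounds with the closed-form bound. The function names and the choice c = 1 are assumptions made for illustration only; the check does not replace the proof.

```python
import math


def total_searchers(n: int, c: float = 1.0) -> float:
    """Sum of the team sizes c * sqrt(2)^j over rounds j = 1..ceil(log2 n)."""
    m = math.ceil(math.log2(n))
    return sum(c * math.sqrt(2) ** j for j in range(1, m + 1))


def closed_form_bound(n: int, c: float = 1.0) -> float:
    """Right-hand side of the final inequality in the proof."""
    return math.sqrt(2) * c / (math.sqrt(2) - 1) * (math.sqrt(2 * n) - 1)


# The sum stays strictly below the bound for every tested n.
bound_holds = all(total_searchers(n) < closed_form_bound(n)
                  for n in range(1, 5000))
```

Since the bound grows like \(\sqrt {n}\) up to a constant factor depending only on c, the check is consistent with the claimed \(O(\sqrt {n})\) total.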

7 Conclusions

7.1 Motivation

There exist a number of studies of graph searching problems in the graph-theoretic context. Much less is known for geometric scenarios. It turns out that the geometric (or continuous) analogue of graph searching is challenging to analyze. More precisely, in the recently introduced continuous version [29, 33] the input geometric shape is searched by using line segments or curves (that form a barrier separating the contaminated and clear areas) instead of searchers. The corresponding optimization criterion is then the total length of this barrier. Computing optimal strategies turns out to be quite non-trivial even for some simple shapes [33].

The class of graphs we have selected to study in this work is motivated by the following arguments. First, on-line (monotone) searching turns out to be difficult in terms of the achievable upper bound on the number of searchers even in simple topologies like trees. This suggests that some additional information is needed to perform on-line search efficiently, and our work shows that, informally speaking, a two-dimensional sense of direction is enough to search a graph in an asymptotically almost optimal way. Our second motivation comes from approaching the problem of geometric search by considering its discrete analogues, i.e., by modeling via graph theory. We give only a short sketch, as the problem of modeling is out of the scope of this work, and we refer to some recent works on the subject [1, 4, 29, 33]. Consider a continuous search problem in which k searchers initially placed at the same location need to capture the fugitive hiding in an arbitrary polygon that possibly has holes. The polygon is not known a priori to the searchers. The fugitive is considered captured at time t when it is located at distance at most r from some searcher at time t. (The distance r can be related to the physical dimensions of the searchers and/or their visibility range, etc.)

Consider the following transition from the above continuous searching problem of a polygon to a discrete one. Overlay the coordinate system on the polygon in such a way that the origin coincides with the original placement of the searchers. Then, place nodes at all points whose coordinates are multiples of r and that lie in the polygon. Connect two nodes with an edge if the edge is contained in the polygon. In this way we obtain a partial grid network. In this brief sketch we omit potential problems that may arise in such modeling, like obtaining disconnected networks or having ‘blind spots’, i.e., points in the polygon that cannot be cleared by using the above nodes and edges only. We say that a partial grid network G covers the polygon if G is connected and for each point p in the polygon there exists a node of G at distance at most r from p.

See Fig. 9 for an example.

Fig. 9. An example of the construction of a partial grid network
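The discretization of a polygon into a partial grid network can be sketched in Python as follows. Everything here is an illustrative assumption rather than part of the original construction: the polygon is given by a hypothetical membership predicate `inside(x, y)`, candidate nodes are restricted to a bounded index window `extent`, and the test that an edge is contained in the polygon is only approximated by sampling points along the segment.

```python
from typing import Callable, Set, Tuple

Node = Tuple[int, int]
Edge = Tuple[Node, Node]


def partial_grid(inside: Callable[[float, float], bool], r: float,
                 extent: int, samples: int = 8) -> Tuple[Set[Node], Set[Edge]]:
    """Build nodes and edges of a partial grid network for a polygon.

    A node (i, j) stands for the point (i * r, j * r). `inside` is a
    hypothetical predicate telling whether a point lies in the polygon.
    Edge containment is approximated by checking `samples - 1` interior
    points of each unit segment, so very thin obstacles may be missed.
    """
    idx = range(-extent, extent + 1)
    nodes = {(i, j) for i in idx for j in idx if inside(i * r, j * r)}
    edges: Set[Edge] = set()
    for (i, j) in nodes:
        for (di, dj) in ((1, 0), (0, 1)):  # right and upper neighbour
            nb = (i + di, j + dj)
            if nb not in nodes:
                continue
            if all(inside((i + di * t / samples) * r,
                          (j + dj * t / samples) * r)
                   for t in range(1, samples)):
                edges.add(((i, j), nb))
    return nodes, edges


# Toy usage: a 2 x 2 square with r = 1 yields a full 3 x 3 grid.
square = lambda x, y: 0 <= x <= 2 and 0 <= y <= 2
grid_nodes, grid_edges = partial_grid(square, r=1.0, extent=3)
```

The sketch does not check connectivity or the covering condition; those are exactly the modeling issues that the text above leaves aside.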

Note that any search strategy \({\mathcal S}^{\prime }\) for a polygon P can be used to obtain a search strategy \({\mathcal S}\) for the underlying partial grid network G as follows. For each searcher s used in \({\mathcal S}^{\prime }\) introduce four searchers \(s_{1},\ldots ,s_{4}\) that will ‘mimic’ its movements by going along edges of G. More precisely, the searchers \(s_{1},\ldots ,s_{4}\) will ensure that at any point, if s is located at a point (x, y), then \(s_{1},\ldots ,s_{4}\) will reside on nodes with coordinates (⌊x/r⌋,⌊y/r⌋), (⌊x/r⌋,⌈y/r⌉), (⌈x/r⌉,⌊y/r⌋), (⌈x/r⌉,⌈y/r⌉). In this way, the area protected by s in \({\mathcal S}^{\prime }\) is always protected by four searchers in \({\mathcal S}\). This allows us to state the following.

Observation 1

Let P be a polygon and let G be an underlying partial grid network that covers P. Then, there exists a search strategy for G using k searchers such that its execution in G results in clearing P and k = O(p), where p is the minimum number of searchers required for clearing P (in a continuous way).
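The mimicking rule behind Observation 1 can be sketched in Python as follows. The function name `mimic_nodes` is hypothetical, and the returned pairs are grid indices in units of r, following the floor and ceiling expressions given before the observation.

```python
import math
from typing import List, Tuple


def mimic_nodes(x: float, y: float, r: float) -> List[Tuple[int, int]]:
    """Grid indices occupied by the four searchers mimicking one searcher.

    The continuous searcher is at (x, y); the four discrete searchers sit at
    the corners of the grid cell containing it. When x / r or y / r is an
    integer, some of the four positions coincide.
    """
    fx, cx = math.floor(x / r), math.ceil(x / r)
    fy, cy = math.floor(y / r), math.ceil(y / r)
    return [(fx, fy), (fx, cy), (cx, fy), (cx, cy)]


# Toy usage: a searcher strictly inside a cell versus exactly on a node.
corners = mimic_nodes(2.5, 0.5, 1.0)
on_node = mimic_nodes(2.0, 3.0, 1.0)
```

Because each continuous searcher is replaced by four discrete ones, the construction uses at most four times as many searchers as the continuous strategy, which is consistent with k = O(p).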

7.2 Open problems

In view of the lower bound shown in [28], which states that even in networks as simple as trees each distributed or on-line algorithm may be forced to use \({\varOmega }(n/\log n)\) times more searchers than the connected search number of the underlying network, one possible line of research is to restrict attention to specific topologies that allow one to obtain algorithms with good provable upper bounds. This work gives one such example. An interesting research direction is to find other non-trivial settings in which distributed or on-line search can be conducted efficiently. Also, we leave a logarithmic gap in our approximation ratio. Since there exist grids that require \({\varOmega }(\sqrt {n})\) searchers, the gap can possibly be closed by analyzing the grids that require few (e.g., \(O(\log n)\)) searchers.

The above questions related to network topologies can be stated more generally: what properties of the on-line model are crucial for such a search for a fast and invisible fugitive to be efficient? This work and also a recent one [7] suggest that a ‘sense of direction’ may be one such factor. Another possibly interesting direction is to analyze the influence of visibility on search scenarios.

We finally note that the only optimization criterion that was of interest in this work is the number of searchers. This coincides with the research done in off-line search problems where this was the most important criterion giving nice ties between graph searching theory and structural graph theory. However, one may consider adding different optimization criteria like time (defined as the maximum number of synchronized steps) or the total distance (the total number of moves performed by all searchers).