Abstract
Given an undirected graph G on n nodes and m edges in the form of a data stream we study the problem of finding an Euler tour in G. Our main result is the first one-pass streaming algorithm computing an Euler tour of G in the form of an edge successor function with only \(\mathcal O(n\log (n))\) RAM, which is optimal for this setting (e.g. Sun and Woodruff (2015)). Since the output size can be much larger, we use a write-only tape to gradually output the solution. The previously best-known result for finding Euler tours in data streams is implicitly given by the W-stream algorithm of Demetrescu et al. (2010) using \(\mathcal O(m/n)\) passes under the same RAM limitation. Our approach is to partition the edges into edge-disjoint cycles and to merge the cycles until a single Euler tour is achieved. In the streaming environment such a merging is far from being obvious as the limited RAM allows the processing of only a constant number of cycles at once. This enforces merging of cycles that partially are no longer present in RAM. We solve this problem with a new edge swapping technique, for which storing two certain edges per node is sufficient to merge tours without having all tour edges in RAM. The mathematical key is to model tours and their merging in an algebraic way, where certain equivalence classes represent subtours. This quite general approach might be of interest also in other routing problems.
1 Introduction
1.1 Euler Tours in the Graph-Streaming Model
In the Euler tour problem, we are looking for a closed trail in an undirected graph G = (V, E), where n = |V| and m = |E|, such that each edge is visited exactly once. It is a well-studied graph-theoretic problem with applications in the field of big data. For example, solving the traveling salesman problem with the well-known Christofides algorithm requires an Euler tour [1]. Further, for de novo genome assembly, large de Bruijn graphs can be examined with Eulerian paths [2]. For the processing of large graphs, the graph streaming or semi-streaming model introduced by Feigenbaum et al. [3] has been studied extensively over the last decade. In this model, a graph with n nodes and m edges is given as a stream of its edges. Random-access memory (RAM, also called internal memory) is restricted to \(\mathcal O(n \text { polylog}(n))\) edges at a time; see, e.g., the survey [4] for a detailed introduction. In consequence, the model cannot be applied to problems where the size of the solution exceeds this amount of memory. Since the size of an Euler tour is m, which might even be \({\Theta }(n^{2})\), we need a relaxation of the model that allows us to store the output separately from the RAM. An obvious solution for this problem is the addition of a write-only output tape with the sole purpose of storing the Euler tour. In this setting it is common to output the Euler tour in the form of an edge-successor function, i.e. a bijective function \(E \rightarrow E\) mapping each edge to the respective subsequent edge of the Euler tour (e.g. [5], [6]).
1.2 Previous Work
Graph streaming with the usage of an additional output tape resembles the model used by Grohe et al. [7]. They consider Turing machines with a read/write input tape, multiple read/write tapes of significantly smaller size, called ‘internal memory tapes’, and an additional write-only output tape. They count the number of times the head of the input tape changes its direction. This is closely related to the streaming model by Feigenbaum et al. since a streaming pass can be remodeled as a run on the input tape with two direction changes of the head. In a more recent work by François et al. [8], a read-only input tape and a write-only output tape are considered for the problems of stream reverting and sorting.
Another related model is the W-streaming model introduced by Demetrescu et al. [9], which is a relaxation of the classical streaming model. It originated as a more restrictive alternative to the StrSort model introduced by Aggarwal et al. [10, 11]. At each pass, an output stream is written, which becomes the input stream of the next pass. Finding an Euler tour in trees in W-streaming has been studied in multiple papers (e.g., [6]), but to the best of our knowledge the general Euler tour problem has hardly been considered in a streaming model so far. However, there are some general results for transferring PRAM algorithms to the W-streaming model. Atallah and Vishkin [5] presented a PRAM algorithm for finding Euler tours, using \(\mathcal O(\log (n))\) time and n + m processors. Transferred to the W-streaming model with the methods from [6], this algorithm computes an Euler tour in the form of a bijective successor function within \(p=\mathcal O(m\text { polylog}(n)/s)\) passes, where s is the RAM-capacity.
Sun and Woodruff [12] showed that a one-pass streaming algorithm for verifying whether a graph is Eulerian needs \({\varOmega }(n \log (n))\) RAM. This implies that the minimum RAM requirement of a one-pass streaming algorithm with an additional output tape for finding an Euler tour is also \({\varOmega }(n \log (n))\).
1.3 Our Contribution
We present the streaming algorithm Euler-Tour, which either finds an Euler tour of a graph in the form of a bijective successor function or states that the graph is not Eulerian, using only one pass, \(\mathcal O(n \log (n))\) bits of RAM and an additional write-only output tape. This is not only a significant improvement over previous results; in view of the lower bound of Sun and Woodruff [12], it is also the first optimal algorithm in this setting. Atallah and Vishkin [5] find edge-disjoint tours (in our case cycles) and connect them by pairwise swapping the successor edges of suitable edges. This idea is easy to implement without memory restrictions, but the implementation gets distinctly more complicated with limited memory space: we cannot store all cycles in RAM. Therefore, we have to output edges and their successors before all cycles have been found and processed. Our idea is to keep specific edges of some cycles in RAM along with additional information, so that we are able to merge subsequent cycles, regardless of when they appear, with already processed tours that are likely no longer present in RAM.
We develop a new mathematical foundation by partitioning the edges into equivalence classes induced by a given bijective successor function and prove structural properties that allow us to iteratively change this function on a designated set of edges so that the modified function is still bijective. Translated to graphs, this is a tour merging process. This mathematical approach is quite general and might be useful in other routing scenarios in streaming models.
1.4 Organization of the Article
In Section 2 we give some basic definitions. Our algorithm consists of several subroutines and technical data structures. Therefore, for the reader’s convenience Section 3 is dedicated to an informal and example-based description. Section 4 contains the pseudo code of the algorithm. In Section 5, we show the connection of the concepts of Euler tours and successor functions and then show that the required RAM of the algorithm does not exceed \(\mathcal O(n \log (n))\) and that the output actually depicts an Euler tour (Theorem 1).
2 Preliminaries
Let \(\mathbb {N}:=\{1,2,\ldots \}\) denote the set of natural numbers. For \(n\in \mathbb {N}\) let [n] := {1,…,n}. In the following, we consider a simple graph G = (V, E) without loops where V denotes the set of nodes and E the set of (undirected) edges. A walk in G is a finite sequence T = (v1,…,vℓ) of nodes of G with {vi, vi+1} ∈ E for all i ∈ {1,…,ℓ − 1}. If additionally every edge is visited at most once, we call it a trail. The length of T is ℓ − 1. The (directed) edge set of T is E(T) := {(vi, vi+1) | i ∈ [ℓ − 1]}. We also write e ∈ T instead of e ∈ E(T). For a directed edge e we denote by e(1) its first and by e(2) its second component. A trail T = (v1,…,vℓ, v1) of length ℓ, which starts and ends at v1, is called a tour. In tours, we usually do not care about starting point and end point, so we slightly abuse the notation and write vi+ℓ or vi−ℓ for a node vi, identifying vℓ+1 := v1 and vℓ+2 := v2 and so on. If additionally vi ≠ vj holds for all i, j ∈ [ℓ − 1], i ≠ j (and ℓ ≥ 3), we call T a cycle. An Euler tour of G is a tour T with E(T) = E. Since in the streaming model the graph is represented as a set of edges, we often use the edges for the depiction of tours. With ei := {vi, vi+1} for all i ∈ [ℓ], T can be written as T = (e1,…,eℓ). Here, we also use the slightly abusive index notation. Note that for the tour T the edges are distinct. For i ∈ [ℓ], we call ei+1 the successor edge of ei in tour T. Our algorithm outputs an Euler tour T = (v1,…,v|E|, v1) in the form of a successor function, i.e., for every i ∈ [|E|], we output the triple (vi, vi+1, vi+2), where {vi+1, vi+2} is the successor edge of {vi, vi+1} in T.
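As a small illustration of this output format, the following Python sketch lists the triples for a tour that is given as a cyclic node sequence. The helper name `successor_triples` is ours and does not appear in the article; the tour is assumed to be already known, which is of course not the situation in the streaming setting.

```python
def successor_triples(tour):
    """Return the triples (v_i, v_{i+1}, v_{i+2}) of a closed tour.

    `tour` lists the visited nodes v_1, ..., v_m in order, without repeating
    v_1 at the end; indices wrap around as in the cyclic notation above.
    """
    m = len(tour)
    return [(tour[i], tour[(i + 1) % m], tour[(i + 2) % m]) for i in range(m)]


# Triangle 1 -> 2 -> 3 -> 1: the triple (1, 2, 3) says that the successor
# edge of (1, 2) is (2, 3), and so on around the tour.
triangle = successor_triples([1, 2, 3])
```

Each edge of the tour appears as the first two entries of exactly one triple, so the list has as many triples as the tour has edges.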
3 Idea of the Algorithm
In this section we explain the new algorithmic idea in a more informal way. First we describe how merging of subtours can be accomplished without RAM limitation clarifying why this does not work in the streaming environment. Thereafter we explain our merging technique, its locality and RAM efficiency.
3.1 Subtour merging in Unrestricted RAM
Recall that the Euler tour will be presented by a successor function, so for every edge we will compute the corresponding successor edge in the tour. Let G = (V, E) be an Eulerian graph and \(T, T^{\prime }\) be edge-disjoint tours in G. The tour induces an orientation of the edges in a canonical way. If T and \(T^{\prime }\) have a common node v, it is easy to merge them to a single tour: T has at least one in-going edge (u, v) with a successor edge (v, w), and \(T^{\prime }\) has at least one in-going edge \((u^{\prime },v)\) with a successor edge \((v,w^{\prime })\). By changing the successor edge of (u, v) from (v, w) to \((v,w^{\prime })\) and the successor edge of \((u^{\prime },v)\) to (v, w), we get a tour containing all edges of \(T \cup T^{\prime }\) (see Fig. 1).
The same principle can be applied when merging more than two tours at once. When we have a tour T and tours T1,…,Tk, \(k \in \mathbb N\), such that T, T1,…,Tk are pairwise edge-disjoint and for every j ∈ [k] there is a common node vj of T and Tj, switching the successor edges of two in-going edges per node vj as described above creates a tour containing the edges of T ∪ T1 ∪⋯ ∪ Tk.
We can use this method as a simple algorithm for finding an Euler tour:
a) Find a partition of E into edge disjoint cycles.
b) Iteratively pick a cycle C and merge it with all tours encountered so far which have at least one common node with C.
Such a merging process certainly converges to a tour covering all nodes, if a subtour obtained by merging some subtours does not decompose into some subtours again. If we use a local swapping technique to merge tours, this can very well happen, if swapping is applied again to some other node of the merged tour (see Fig. 2). In the RAM model this problem does not appear, since we can keep all tours in RAM and avoid such fatal nodes.
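The merging step and the decomposition hazard can both be reproduced on a toy instance. The Python sketch below is only an illustration under our own assumptions: the successor function is stored as a dictionary on directed edges, and the helper names `cycle_successors`, `swap_at` and `orbit` are hypothetical.

```python
def cycle_successors(nodes):
    """Successor function of the directed cycle nodes[0] -> nodes[1] -> ... -> nodes[0]."""
    m = len(nodes)
    return {
        (nodes[i], nodes[(i + 1) % m]): (nodes[(i + 1) % m], nodes[(i + 2) % m])
        for i in range(m)
    }


def swap_at(delta, e1, e2):
    """Swap the successors of two edges that enter the same node."""
    assert e1[1] == e2[1], "both edges must be in-going edges of one node"
    delta[e1], delta[e2] = delta[e2], delta[e1]


def orbit(delta, e):
    """All edges reached from e by following delta until e comes up again."""
    edges, f = [e], delta[e]
    while f != e:
        edges.append(f)
        f = delta[f]
    return edges


# Two edge-disjoint cycles that share the nodes 1 and 3.
delta = {**cycle_successors([1, 2, 3, 4]), **cycle_successors([1, 5, 3, 6])}

swap_at(delta, (4, 1), (6, 1))   # swap at node 1: the two cycles become one tour
merged = len(orbit(delta, (1, 2)))

swap_at(delta, (2, 3), (5, 3))   # swap again at node 3: the tour falls apart
split = len(orbit(delta, (1, 2)))
```

After the first swap the tour through (1, 2) contains all 8 edges; after the second swap at the other common node it contains only 4 edges again, which is the kind of decomposition a merging strategy has to avoid.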
But in the streaming model with \(O(n\log n)\) RAM it is far from being obvious how to implement an efficient tour merging for the following reasons.
1. We cannot keep every intermediate tour in RAM, so we have to regularly remove some edges together with their successors from RAM, even if we do not know the edges yet to come. On the other hand, we have to keep edges in RAM which are essential in later merging steps.
2. Sometimes we have to merge cycles with tours that had already left RAM. Therefore, we must keep track of common nodes and the related edges.
3.2 Subtour Merging in Limited RAM
We start with an example of four cycles C1,…,C4, all sharing one node v (see Fig. 3). Let’s assume that the cycles are in the same order as found by the algorithm. Let (u1, v),…,(u4, v) be the respective in-going edges and (v, w1),…,(v, w4) be the respective out-going edges. By swapping the successor edges of (u1, v) and (u2, v) as explained before, we construct a tour T containing all edges from C1 and C2. We then merge this tour with C3 swapping the successor edges of (u1, v) and (u3, v). Finally, C4 is merged with T by swapping the successors of (u1, v) and (u4, v). The successor edges are now as follows:
For i > 1 and cycle Ci, the successor of the edge (ui, v) is edge (v, wi− 1), the out-going edge of Ci− 1. The edge (u1, v) of the cycle C1 has the out-going edge of the last cycle as its successor edge. The edge (u1, v) is the first in-going edge of v and called the first-in edge of v.
Let us briefly show how this merging can be implemented in the streaming model with an additional output tape. When C1 is kept in RAM, we store the edge (u1, v), since we don’t know its final successor edge yet. We also keep the edge (v, w1) in RAM, because it will be the successor edge of the in-going edge (u2, v) of C2. We call such an edge the potential successor edge of v.
The crux is that, no matter how many cycles are merged at the node v, we get along with only two additional edges in RAM at a time: the first-in edge, which will never change after it is initialized and the recent potential successor, which will be replaced every time we merge a new cycle at v. In our example, when merging a cycle Ci for i > 1, we assign the edge (v, wi− 1) as successor edge of (ui, v) and replace (v, wi− 1) by (v, wi) in RAM as potential successor edge of v. All other edges are written on the output tape with their corresponding successors and don’t have to be stored in RAM. Finally, when no more cycles with node v occur, we can write (u1, v) together with the recent successor edge (in our case this is (v, w4)) on the output tape.
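The bookkeeping at a single node v can be sketched in a few lines of Python. This is a simplified illustration under our own assumptions, not the article's procedure: each cycle is reduced to the pair (u_i, w_i) of its neighbours of v, and the function name `merge_cycles_at_node` is hypothetical.

```python
def merge_cycles_at_node(v, cycles):
    """Merge cycles C_1, C_2, ... that all pass through the node v.

    `cycles` is a list of pairs (u_i, w_i): cycle C_i enters v via (u_i, v)
    and leaves it via (v, w_i). Returns the triples (u, v, s) written for the
    in-going edges of v, where (v, s) is the successor of (u, v).
    """
    first_in = None    # tail of the first-in edge of v, fixed once set
    potential = None   # head of the current potential successor edge of v
    tape = []
    for u, w in cycles:
        if first_in is None:
            first_in = u                     # C_1: successor not known yet
        else:
            tape.append((u, v, potential))   # (u_i, v) gets (v, w_{i-1})
        potential = w                        # (v, w_i) is the new potential successor
    if first_in is not None:
        tape.append((first_in, v, potential))  # no more cycles at v: flush first-in edge
    return tape


triples = merge_cycles_at_node("v", [("u1", "w1"), ("u2", "w2"), ("u3", "w3"), ("u4", "w4")])
```

For the four cycles of the example, (u2, v), (u3, v) and (u4, v) receive (v, w1), (v, w2) and (v, w3) as successors, and the first-in edge (u1, v) is written last with successor (v, w4). Only the two variables `first_in` and `potential` are kept per node, independent of the number of cycles.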
Now, let us consider the more complicated case, where we wish to merge a cycle C with multiple tours at several nodes. Consider a cycle C and tours T1,…,Tj. Let v1,…,vj, be nodes such that vi belongs to Ti and C for all i. We distinguish between merging at three types of nodes:
1. For the nodes v1,…,vj we swap the successor edges.
2. Nodes in C and in T1 ∪⋯ ∪ Tj∖{v1,…,vj}: as only one successor edge swapping per tour is needed, these additional common nodes are not used, hence for every v ∈ T1 ∪⋯ ∪ Tj∖{v1,…,vj} the in-going edge (u, v) of C keeps its successor edge, so nothing happens here.
3. Nodes in C∖(T1 ∪⋯ ∪ Tj). These nodes are visited by the algorithm for the first time. Since we might want to merge C with future cycles at these nodes, we store for every v ∈ C∖(T1 ∪⋯ ∪ Tj) the in-going edge (u, v) of C as first-in edge and the out-going edge (v, w) of C as potential successor edge.
At any point of time we store at most one cycle C and two edges per node (a first-in edge and a potential successor) in RAM, so \(\mathcal {O}(n\log n)\) RAM suffices. Note that the very first cycle found by the algorithm consists only of type 3 nodes, so every edge will become a first-in edge.
3.3 High Level Description
For the reader’s convenience we give a high level description of our algorithm. A detailed description in pseudo code together with an outline of the analysis and the proof of the main theorem will follow in the next sections. We denote the set of first-in edges by F.
1. Iteratively:
  1.1 Read edges from the input stream until the edges in RAM contain a cycle C.
  1.2 If a node v of C is visited for the first time, i.e. C is the first cycle with v ∈ C,
    a) store the in-going edge (u, v) of C in F (we will process these ≤ n edges in step 2),
    b) remember the out-going edge (v, w) as potential successor edge of v.
  1.3 Every node v that has already been visited, has thereby been assigned to a unique tour T with v ∈ C ∩ T. For each tour that shares a node with C, choose exactly one common node.
  1.4 For each node v chosen in step 1.3 ‘swap the successors’. That means, we write the in-going edge e on the output tape and take the recent potential successor edge of v as successor edge for e. Then, save the out-going edge as new potential successor edge of v.
  1.5 For each edge that has not been stored in F (step 1.2) or written on the output tape (step 1.4) so far, write this edge on the output tape and take as successor the following edge in C.
  1.6 Assign all tours with common nodes together with all newly visited nodes to a single tour.
2. After the end of the input stream is reached, all edges have either been written on the output tape or stored in F. For every edge (u, v) ∈ F, write it on the output tape and take as its successor the potential successor edge of v.
An example of how the algorithm works can be found in the Appendix.
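The steps of the high level description can be sketched in Python as follows. This is a minimal sketch under several assumptions of ours and not the article's pseudo code: the input is a simple graph without loops or parallel edges, the output tape is modelled by a list, the cycle search uses a plain depth-first search on the buffered forest, and the names `find_path` and `euler_tour_stream` are hypothetical. Connectivity is assumed and not checked.

```python
from collections import defaultdict


def find_path(adj, a, b):
    """Return the node path a..b in the forest adj, or None if there is none."""
    stack, parent = [a], {a: None}
    while stack:
        x = stack.pop()
        if x == b:
            path = []
            while x is not None:
                path.append(x)
                x = parent[x]
            return path[::-1]
        for y in adj[x]:
            if y not in parent:
                parent[y] = x
                stack.append(y)
    return None


def euler_tour_stream(edges):
    """One pass over `edges`; returns triples (u, v, s), i.e. (v, s) follows (u, v)."""
    adj = defaultdict(set)        # buffered edges, always cycle-free
    s, t, first_in = {}, {}, {}   # potential successor, tour label, tail of first-in edge
    c = 0                         # number of tour labels handed out so far
    tape = []                     # stands in for the write-only output tape

    for u, v in edges:
        path = find_path(adj, u, v)
        if path is None:                   # step 1.1: no cycle yet, buffer the edge
            adj[u].add(v)
            adj[v].add(u)
            continue
        for x, y in zip(path, path[1:]):   # cycle found: remove its edges from the buffer
            adj[x].discard(y)
            adj[y].discard(x)
        cyc, k = path, len(path)           # oriented u -> ... -> v -> u
        chosen = set()                     # labels of tours already used for a swap
        for i, x in enumerate(cyc):
            prev, nxt = cyc[i - 1], cyc[(i + 1) % k]
            if x not in t:                 # step 1.2: x is visited for the first time
                first_in[x], s[x] = prev, nxt
            elif t[x] not in chosen:       # steps 1.3 and 1.4: one swap per touched tour
                chosen.add(t[x])
                tape.append((prev, x, s[x]))
                s[x] = nxt
            else:                          # step 1.5: successor stays inside the cycle
                tape.append((prev, x, nxt))
        if chosen:                         # step 1.6: everything becomes one tour
            a = min(chosen)
        else:
            c += 1
            a = c
        for x in t:
            if t[x] in chosen:
                t[x] = a
        for x in cyc:
            t[x] = a

    if any(adj.values()):
        raise ValueError("edges left in the buffer: some node has odd degree")
    for x, prev in first_in.items():       # step 2: flush the first-in edges
        tape.append((prev, x, s[x]))
    return tape
```

Apart from the output list, the sketch keeps only the buffered forest, the current cycle and three values per node, which mirrors the memory argument of Section 3.2. A real implementation would write each triple to the tape immediately instead of collecting the triples in RAM.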
4 The Algorithm and Main Results
4.1 The Algorithm in Pseudo Code
To enable a clear and structured analysis, in this section we present the pseudo-code for our algorithm. For a better understanding it is split up into several procedures that correspond to the steps from our high level description in Section 3.3. Note that these procedures are not independent algorithms, since they access variables from the main algorithm. The output is an Euler tour on G, given in the form of a successor function δ∗. To be more precise, the output is a sequence of triples (v1, v2, s) written on the output tape with v1, v2, s ∈ V and {v1, v2}∈ E. Each of these triples represents the information δ∗((v1, v2)) = (v2, s). If a triple (v1, v2, s) is written on the output tape, we say that the edge (v2, s) is marked as successor of the edge (v1, v2). For every node v ∈ V we store the two values s(v) and t(v). If v is considered for the first time, s(v) = t(v) = 0. Otherwise, s(v) ∈ V indicates that (v, s(v)) is the potential successor edge of v and t(v) ∈ [n] represents the tour that v is assigned to at the moment.
The algorithm searches the stream for cycles (Step 1.1 in our high level description in Section 3.3) and whenever a cycle is found, we will run the procedure Merge-Cycle on this cycle. Next we state the procedure Merge-Cycle, which implements the steps 1.2 to 1.6 in Section 3.3.
Let us now explain all the procedures that are used in Merge-Cycle. The procedure New-Nodes implements step 1.2 from Section 3.3. If a node v ∈ C is processed for the very first time by the algorithm, this is indicated by t(v) = 0. If this is the case, we store the C-edge in-going to v in the set F and define s(v) as the next node on C. So the edge (v, s(v)) becomes the potential successor of v.
The procedure Construct-J-M is a realization of step 1.3 in Section 3.3. Recall that the t-values of the nodes represent the tours that we have constructed so far. For every possible t-value j ∈ [n], we pick exactly one node v with t(v) = j if there is one. These nodes are stored in J, their t-values are stored in M. The nodes in J are the nodes we want to use for merging tours. If two nodes have the same t-value, this means they are already part of the same tour (see Lemma 8), so we have to avoid using both of them for merging.
In the procedure Merge, we use the nodes from J to merge all tours that share a node with the cycle C by edge-swapping (step 1.4 in Section 3.3).
In the procedure Write, we take care of all the edges that have not been stored in F and have not been written on the output tape in the procedure Merge (Step 1.5 in Section 3.3).
In the procedure Update we update the t-values to implement step 1.6. This way we ensure that the t-values of the nodes still represent the tours they belong to. If M = ∅, the cycle C has no intersecting nodes with already constructed tours, so it is declared as a new tour. Otherwise, all nodes from the cycle C and all intersecting tours now belong to one single tour, represented by the t-value a.
Finally, in the procedure Write-F (step 2 in Section 3.3), the first-in edges that have been stored in F during the algorithm are written on the output tape with proper successors. Note that for each (u, v) ∈ F it holds s(v)≠ 0.
4.2 Main Results and Proof Idea
For the reader’s convenience we first sketch the proof idea and new mathematical techniques in the analysis. The goal is to show that the algorithm Euler-Tour works as claimed, that is, the memory requirement does not exceed the \(\mathcal {O}(n \log n)\) bound and the output successor function δ∗ determines an Euler tour for the input graph G.
We state the two main results of this paper:
Lemma 1
Algorithm Euler-Tour needs at most \(\mathcal {O}(n\log n)\) bits of RAM.
Theorem 1
If G is Eulerian, δ∗ determines an Euler tour on G.
The memory estimation is done in Section 5.2 and is a careful analysis of the algorithm Euler-Tour and its subroutines. The main arguments are that at any point of time we store at most two edges per node (a first-in edge and a potential successor), the edges of the cycle that is currently being processed and a label for every node (its t-value). The correctness proof turns out to be much more complicated. Every bijective successor function δ partitions the edges of G into edge-disjoint tours. In this case ‘being in the same tour’ forms an equivalence relation on the set of edges, which is denoted by ≡δ. We prove this fact in Theorem 2. This allows us to use equivalence classes when analyzing the dynamic change of subtours induced by successor functions in an algebraic framework. This analysis takes place in Section 5.3 and is concluded by Theorem 1, which states that the output successor function δ∗ indeed induces an Euler tour on G. We start with a successor function \(\delta ^{C}\). In Lemma 5 we show that \(\delta ^{C}\) is bijective and that the equivalence classes of the corresponding equivalence relation \(\equiv _{\delta ^{C}}\) are simply the directed cycles found during the algorithm Euler-Tour (line 4). Next we construct a recursive sequence of bijective successor functions \(\delta ^{*}_{0},\ldots ,\delta ^{*}_{N}\) and their corresponding equivalence classes with \(\delta ^{*}_{0}=\delta ^{C}\) and \(\delta ^{*}_{N}=\delta ^{*}\), where N denotes the total number of cycles found by the algorithm. The differences between \(\delta ^{*}_{k}\) and \(\delta ^{*}_{k+1}\) basically correspond to the successor swapping process in the procedure Merge-Cycle running on the (k + 1)-th cycle found by the algorithm. At this point we need Lemmas 7 and 8, the crux of the analysis. Lemma 7 ensures that the successor swapping leads to a union of all involved equivalence classes, and finally Lemma 8 tells us that all successor functions from the sequence \(\delta ^{*}_{0},\ldots ,\delta ^{*}_{N}\) are bijective, that the procedure Update works correctly and that after an edge (u, v) has been processed by Euler-Tour, all processed edges incident to u or v belong to the same equivalence class. Using that Eulerian graphs are connected, the proof of Theorem 1 then follows with a few simple arguments, because by applying Theorem 2 it suffices to show that all edges of G are in the same equivalence class of the relation \(\equiv _{\delta ^{*}}\).
5 Detailed Analysis
5.1 Subtour Representation by Equivalence Classes
In this subsection we present some basic definitions and results that allow us to transfer the problem of tour merging in a graph to the notion of equivalence relations on E. This will facilitate an elegant and clear analysis of our algorithm.
Definition 1
-
(i) Let G = (V, E) be an undirected graph. An orientation of the edges of G is a function \(R:E\rightarrow V^{2}\) such that for every edge {u, v}∈ E either R({u, v}) = (u, v) or R({u, v}) = (v, u). So R(G) := (V, R(E)) is a directed graph. For an oriented edge e = (u, v) we write e(1) := u and e(2) := v.
-
(ii) Let G = (V, E) be a directed graph. A successor function on G is a function \(\delta :\mathbf {E}\rightarrow \mathbf {E}\) with δ(e)(1) = e(2) for all e ∈E.
-
(iii) Let G = (V, E) be a directed graph with successor function δ. We define the relation ≡δ on E as follows: For \(e,e^{\prime }\in E\), \(e\equiv _{\delta } e^{\prime }:\Leftrightarrow \exists k\in \mathbb {N}:\delta ^{k}(e)=e^{\prime }\), where \(\delta ^{k}\) denotes the k-fold composition of δ.
So \(e\equiv _{\delta } e^{\prime }\) means that \(e^{\prime }\) can be reached from e by iteratively applying δ.
Lemma 2
Let δ be a bijective successor function on a directed graph G = (V, E). Then ≡δ is an equivalence relation on E.
Proof
Reflexivity: Let e ∈ E. Since E is finite, there exists \(k\in \mathbb {N}\) with the following property: There exists \(k^{\prime }\in \mathbb {N}\) with \(k^{\prime }<k\) and \(\delta ^{k}(e)=\delta ^{k^{\prime }}(e)\). Otherwise, all elements of the sequence \((\delta ^{\ell }(e))_{\ell \in \mathbb {N}}\) would be pairwise distinct, in contradiction to the fact that there exist only |E| edges. Let k be minimal with this property. Since δ is injective, it follows that \(\delta ^{k-1}(e)=\delta ^{k^{\prime }-1}(e)\) and the minimality of k enforces that \(k^{\prime }-1\notin \mathbb {N}\). So \(k^{\prime }=1\), therefore \(\delta ^{k}(e)=\delta (e)\) and by the injectivity of δ we have \(\delta ^{k-1}(e)=e\). Since \(k>k^{\prime }=1\), we have \(k-1\in \mathbb {N}\), hence \(e\equiv _{\delta } e\).
Symmetry: Let \(e,e^{\prime }\in \mathbf {E}\) with \(e\neq e^{\prime }\) and \(e\equiv _{\delta } e^{\prime }\). Then there exists a minimal \(k\in \mathbb {N}\) with \(\delta ^{k}(e)=e^{\prime }\). As shown above, there also exists a \(k^{\prime }\in \mathbb {N}\) with \(\delta ^{k^{\prime }}(e)=e\). Then \(k<k^{\prime }\), because otherwise \(\delta ^{k-k^{\prime }}(e)=\delta ^{k-k^{\prime }}(\delta ^{k^{\prime }}(e))=\delta ^{k}(e)=e^{\prime }\), in contradiction to the minimality of k. It follows that \(\delta ^{k^{\prime }-k}(e^{\prime })=\delta ^{k^{\prime }-k}(\delta ^{k}(e))=\delta ^{k^{\prime }}(e)=e\).
Transitivity: Let \(e,e^{\prime },e^{\prime \prime }\in \mathbf {E}\) with \(e\equiv _{\delta } e^{\prime }\) and \(e^{\prime }\equiv _{\delta } e^{\prime \prime }\). Then there exist \(k_{1},k_{2}\in \mathbb {N}\) with \(\delta ^{k_{1}}(e)=e^{\prime }\) and \(\delta ^{k_{2}}(e^{\prime })=e^{\prime \prime }\). So we have \(\delta ^{k_{1}+k_{2}}(e)=e^{\prime \prime }\). □
In the following we denote the equivalence class of an edge e ∈E w.r.t. ≡δ by [e]δ.
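For a concrete successor function, the classes [e]δ can be computed by following δ from each edge that has not been seen yet. The Python sketch below illustrates this; the dictionary representation of δ and the function name `equivalence_classes` are our assumptions.

```python
def equivalence_classes(delta):
    """Partition the edges into the classes of the relation induced by delta.

    `delta` maps every directed edge to its successor and is assumed to be
    bijective, so following delta from an edge eventually returns to it.
    """
    seen, classes = set(), []
    for e in delta:
        if e in seen:
            continue
        cls, f = [], e
        while f not in seen:
            seen.add(f)
            cls.append(f)
            f = delta[f]
        classes.append(cls)
    return classes


# Two disjoint directed triangles give two classes with three edges each.
delta = {(1, 2): (2, 3), (2, 3): (3, 1), (3, 1): (1, 2),
         (4, 5): (5, 6), (5, 6): (6, 4), (6, 4): (4, 5)}
classes = equivalence_classes(delta)
```

Each class lists its edges in the order in which δ visits them, so every class is already a directed tour of the graph.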
The following technical lemma is necessary to show in Theorem 2 that the equivalence classes of δ always form tours on G.
Lemma 3
Let G = (V, E) be a directed graph with bijective successor function δ and the corresponding equivalence relation ≡δ. Then we have:
-
(i) Let e ∈E and \(k_{1},k_{2}\in \mathbb {N}_{0}\) with k1≠k2 and \(\delta ^{k_{1}}(e)=\delta ^{k_{2}}(e)\). Then |k1 − k2|≥|[e]δ|.
-
(ii) For all e ∈E we have \(\delta ^{|{[e]}_{\delta }|}(e)=e\).
Proof
(i): Assume for a moment that there exist e ∈ E and \(k_{1},k_{2}\in \mathbb {N}_{0}\) with \(\delta ^{k_{1}}(e)=\delta ^{k_{2}}(e)\) and 0 < |k1 − k2| < |[e]δ|. Without loss of generality let k1 > k2. We have \(\delta ^{k_{1}-k_{2}}(\delta ^{k_{2}}(e))=\delta ^{k_{1}}(e)=\delta ^{k_{2}}(e)\) and via induction for every \(s\in \mathbb {N}\), we get \(\delta ^{s(k_{1}-k_{2})}(\delta ^{k_{2}}(e))=\delta ^{k_{2}}(e)\). For the set \(M:=\{\delta ^{k}(e)\,|\,k_{2}\leq k<k_{1}\}\), we have |M| ≤ k1 − k2 < |[e]δ|. But on the other hand, we also have \({[e]}_{\delta }\subseteq M\): let \(e^{\prime }\in {[e]}_{\delta }={[\delta ^{k_{2}}(e)]}_{\delta }\) and let \(\ell \in \mathbb {N}\) with \(e^{\prime }=\delta ^{\ell }(\delta ^{k_{2}}(e))\). Then there exist unique \(s,r\in \mathbb {N}_{0}\) with 0 ≤ r < k1 − k2 and ℓ = s(k1 − k2) + r. So
\[e^{\prime }=\delta ^{\ell }(\delta ^{k_{2}}(e))=\delta ^{r}\big (\delta ^{s(k_{1}-k_{2})}(\delta ^{k_{2}}(e))\big )=\delta ^{r}(\delta ^{k_{2}}(e))=\delta ^{k_{2}+r}(e)\in M.\]
Now, |M| ≤ k1 − k2 < |[e]δ| ≤ |M|, a contradiction.
(ii): Assume that there exists e ∈E with \(\delta ^{|{[e]}_{\delta }|}(e)=e^{\prime }\neq e\). Define \(M:=\{\delta ^{k}(e)|1\leq k\leq |{[e]}_{\delta }|\}\). Clearly \(M\subseteq {[e]}_{\delta }\).
Case 1: e ∈ M. Then \(\delta ^{0}(e)=e=\delta ^{k}(e)\) for some k with 1 ≤ k < |[e]δ|. By (i) we get k = |k − 0| ≥ |[e]δ|, a contradiction.
Case 2: e ∉ M. Then |M| < |[e]δ|. By the pigeonhole principle, there exist 1 ≤ k1, k2 ≤ |[e]δ| with k1 ≠ k2 and \(\delta ^{k_{1}}(e)=\delta ^{k_{2}}(e)\), in contradiction to (i). □
Theorem 2 (Structure Theorem)
Let G = (V, E) be a directed graph with bijective successor function δ such that \(e\equiv _{\delta } e^{\prime }\) for all \(e,e^{\prime }\in \mathbf {E}\). Then δ determines an Euler tour on G in the following sense: For every e ∈E the sequence \((e_{(1)},{\delta (e)}_{(1)},\ldots ,\delta ^{|\mathbf {E}|}{(e)}_{(1)})\) is an Euler tour on G.
Proof
Let e ∈ E. Note that [e]δ = E. The sequence \((e_{(1)},{\delta (e)}_{(1)},\ldots ,\delta ^{|\mathbf {E}|}{(e)}_{(1)})\) traverses |E| edges, namely \(e,\delta (e),\ldots ,\delta ^{|\mathbf {E}|-1}(e)\). These edges are pairwise distinct, because otherwise we would have \(\delta ^{k_{1}}(e)=\delta ^{k_{2}}(e)\) for some distinct k1, k2 ∈ {0,…,|E| − 1}. Hence 0 < |k1 − k2| < |E| = |[e]δ|, in contradiction to Lemma 3 (i). So the sequence is a trail. By applying Lemma 3 (ii), we get \(e=\delta ^{|{[e]}_{\delta }|}(e)=\delta ^{|\mathbf {E}|}(e)\), thus the trail is a tour on G and since it has length |E|, it is an Euler tour on G. □
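The node sequence from Theorem 2 can be read off a successor function directly. The following Python sketch does this for a dictionary-based δ; the function name `tour_from_successor` and the choice to return None when δ has more than one equivalence class are our assumptions.

```python
def tour_from_successor(delta, e):
    """Node sequence (e_(1), delta(e)_(1), ..., delta^{|E|}(e)_(1)) of Theorem 2.

    `delta` maps directed edges to their successors and is assumed bijective.
    Returns None if the walk comes back to e before all edges were used,
    i.e. if delta has more than one equivalence class.
    """
    nodes, f = [e[0]], delta[e]
    while f != e:
        nodes.append(f[0])
        f = delta[f]
    nodes.append(e[0])   # the tour closes at its starting node
    return nodes if len(nodes) == len(delta) + 1 else None


# Successor function on two triangles sharing node 3, merged into one tour.
delta = {(4, 3): (3, 2), (1, 3): (3, 5), (3, 2): (2, 1),
         (2, 1): (1, 3), (3, 5): (5, 4), (5, 4): (4, 3)}
tour = tour_from_successor(delta, (1, 3))
```

Here `tour` is the closed node sequence 1, 3, 5, 4, 3, 2, 1, which uses each of the six edges exactly once.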
Before we start with a detailed memory- and correctness analysis, we show that at the end of the algorithm, every edge {u, v}∈ E has been written on the output tape exactly once, either in the form (u, v) or in the form (v, u).
Lemma 4
-
(i) After each processing of an edge in the algorithm Euler-Tour (lines 2 to 5), the graph Gint = (V, Eint) is cycle-free, so |Eint|≤ n. If all nodes have even degree in G, after completion of Euler-Tour, Eint = ∅.
-
(ii) If all nodes have even degree in G, after completion of Euler-Tour every edge {u, v}∈ E has been written on the output tape either in the form (u, v, s) or in the form (v, u, s) for some s ∈ V.
Proof
We start by proving the first part of (i) via induction over the number of already processed edges. If there are no edges processed so far, then Eint = ∅, so Gint is cycle-free. Now let k ∈ [|E|] ∪{0}, let Gk, Gk+ 1 denote Gint after k resp. k + 1 edges have been processed and let Gk be cycle-free. Let e denote the (k + 1)-th processed edge. When e is added to Gint, it may produce a cycle C. If e does not produce a cycle, then Gk+ 1 = Gk ∪{e} is cycle-free and we are done. If e produces a cycle C, then according to lines 6,7 in Merge-Cycle all C-edges are deleted from Eint. Because e ∈ C, we get \(G_{k+1}=(G_{k}\cup \{e\})\setminus C\subseteq G_{k}\) and we are done by the induction hypothesis.
Now assume for a moment that Eint≠∅ at the end of Euler-Tour. We know that Gint is cycle-free at this time, so Gint contains a node with odd degree in Gint. Because we always delete whole cycles, the degree of this node in G has to be odd as well, but then G is not an Eulerian graph. In this case we might output a message that G does not contain an Euler tour.
(ii). During the execution of Euler-Tour every edge from E is added to Eint at some point of time and there is only one way for an edge to be deleted from Eint again, namely in line 7 of the procedure Merge-Cycle. At that time the edge has either been written on the output tape by the procedure Merge or Write (in which case we are done), or it has been added to F in the procedure New-Nodes. In that case it is written on the output tape in Write-F. Because, according to (i), Eint = ∅ at the end of Euler-Tour, by then every edge must have been written on the output tape in exactly one of these ways. □
5.2 Memory Requirement
Proof of Lemma 1
We consider the different variables and sets.
Variable c: c is initialized with 0 and changed in the procedure Update if and only if M = ∅ at that time. This only happens if in the procedure Construct-J-M for every node v of the considered cycle, we have t(v) = 0, which means that none of the cycle nodes was considered before. This case can occur at most n/3 times during the algorithm, because there are at most n/3 node disjoint cycles in G. So c ≤ n/3 and \(\log n\) bits suffice to store c.
Variable s(v): With this variable we store the label of a node, so for fixed v ∈ V, \(\log n\) bits suffice and altogether \(n\log n\) bits suffice.
Variable t(v): We prove that for any v ∈ V we have t(v) ≤ n at any time: Assume for a moment that this is not the case. Let T be the first point of time at which t(v) is set to a value > n for some v ∈ V. t(v) is only changed in the procedure Update, line 9 or 11. In both cases t(v) is set to a, which is either c (line 4) or \(\min (M)\) (line 6). We already showed c ≤ n/3 < n. Hence, by our assumption, \(\min (M)>n\) at that time. But this implies that at the time of construction of M, there already existed a node u ∈ V with t(u) > n (procedure Construct-J-M, line 3) in contradiction to the choice of T.
Sets Eint, F, J, M: Because a single element of each of these sets can be stored in \(\log n\) bits, it suffices to show that the cardinalities of these sets do not exceed n. For Eint, this is shown in Lemma 4. For J and M, this follows directly from the construction (procedure Construct-J-M). In the set F, for every node only the first edge entering this node is stored (procedure New-Nodes, lines 2 and 4), so clearly |F|≤ n. □
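Adding up the individual estimates of this proof gives the total memory requirement. The display below is only a summary of the bounds derived above, with constants suppressed:

\[\underbrace {\log n}_{c}+\underbrace {n\log n}_{s(\cdot )}+\underbrace {\mathcal O(n\log n)}_{t(\cdot )}+\underbrace {\mathcal O(n\log n)}_{E_{int},\,F,\,J,\,M}=\mathcal O(n\log n).\]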
5.3 Correctness
In this subsection, we prove by a series of lemmas that the output successor function δ∗ determines an Euler tour on G, provided that G is Eulerian (Theorem 1). This is done with the help of our structure theorem (Theorem 2), which requires that δ∗ is bijective and that δ∗ induces only one equivalence class. In the following, we show that these assumptions hold for δ∗ by generating a sequence of bijective successor functions \(\delta ^{*}_{0},\ldots ,\delta ^{*}_{N}\) such that \(\delta ^{*}_{N}=\delta ^{*}\) and \(\delta _{i+1}^{*}\) emerges from \(\delta _{i}^{*}\) by swapping edge successors.
Lemma 4 (ii) induces an orientation on E, which we call R∗: For all {u, v} ∈ E, we define R∗({u, v}) :=
Let C1,…,CN denote the cycles found by the algorithm Euler-Tour (lines 2-5) in chronological order. We use this ordering only for the sake of analysis. Note that the cycles C1,…,CN form a partition of E. For k ∈ [N] and a variable x ∈{s(v),t(v),…|v ∈ V }, we denote by xk the value of x after the k-th call of Merge-Cycle. With x0 we denote the initial value of x.
Definition 2
For each i ∈ [N], let \(C_{i}=(v_{1}^{(i)},\ldots ,v_{\ell _{i}}^{(i)})\) be the i-th cycle supplied to Merge-Cycle. Define \({\delta ^{C}_{i}}:E(C_{i})\rightarrow E(C_{i})\) by \({\delta ^{C}_{i}}(v_{j}^{(i)},v_{j+1}^{(i)}):=(v_{j+1}^{(i)},v_{j+2}^{(i)})\) for every j ∈ [ℓi], where the indices are read cyclically, and define with (1) the successor function \(\delta ^{C}:R^{*}(E)\rightarrow R^{*}(E)\) by \(\delta ^{C}|_{E(C_{i})}:={\delta ^{C}_{i}}\) for all i ∈ [N].
So δC is the canonical successor function induced by the cycles C1,…,CN.
Lemma 5
(i) The successor function δC is bijective.
(ii) For any two edges \(e,e^{\prime }\) we have \(e\equiv _{\delta ^{C}}e^{\prime }\Leftrightarrow \exists i\in [N]: e,e^{\prime }\in C_{i}\).
Proof
(i). We first show that δC is surjective: Let e ∈ R∗(E). Then e belongs to some cycle \(C_{k}=(v_{1}^{(k)},\ldots , v_{\ell _{k}}^{(k)})\) for some k ∈ [N]. Hence \(e=(v_{i}^{(k)},v_{i+1}^{(k)})\) for some i ∈ [ℓk]. Then \(\delta ^{C}(v_{i-1}^{(k)},v_{i}^{(k)})=(v_{i}^{(k)},v_{i+1}^{(k)})=e\). Because R∗(E) is finite, the surjective map δC is also injective and hence bijective.
(ii). For the first direction, let \(e,e^{\prime }\in R^{*}(E)\) with \(e\equiv _{\delta ^{C}} e^{\prime }\) and k ∈ [N] such that e ∈ Ck. Since δC(E(Ck)) = E(Ck), every iterate of e under δC stays in Ck, so it follows that \(e^{\prime }\in C_{k}\). For the converse, let \(e,e^{\prime }\in C_{k}\) for some k ∈ [N]. Then there exist i, j ∈ [ℓk] with \(e=(v_{i}^{(k)},v_{i+1}^{(k)})\) and \(e^{\prime }=(v_{j}^{(k)},v_{j+1}^{(k)})\). W.l.o.g. let i ≤ j and set r := j − i. Then \({(\delta ^{C})}^{r}(e)=e^{\prime }\), so \(e\equiv _{\delta ^{C}}e^{\prime }\). □
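To make Definition 2 and Lemma 5 concrete, the following sketch builds the canonical successor function from a list of cycles and computes an equivalence class as an orbit. The names `canonical_successor` and `orbit` are hypothetical, and we assume a simple graph with edge-disjoint simple cycles, so that every directed edge (u, v) occurs at most once.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]  # directed edge (u, v)


def canonical_successor(cycles: List[List[int]]) -> Dict[Edge, Edge]:
    """Map every directed cycle edge to the next edge on the same cycle."""
    delta: Dict[Edge, Edge] = {}
    for cyc in cycles:
        length = len(cyc)
        for j in range(length):
            # Indices are taken cyclically, as in Definition 2.
            a, b, c = cyc[j], cyc[(j + 1) % length], cyc[(j + 2) % length]
            delta[(a, b)] = (b, c)
    return delta


def orbit(delta: Dict[Edge, Edge], e: Edge) -> List[Edge]:
    """Return the equivalence class of e as its orbit (delta must be bijective)."""
    result = [e]
    cur = delta[e]
    while cur != e:
        result.append(cur)
        cur = delta[cur]
    return result
```

For the two triangles (1, 2, 3) and (3, 4, 5), the orbit of the edge (1, 2) consists exactly of the three edges of the first triangle, in line with Lemma 5 (ii).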
Let k ∈{0,…,N}. We consider the time right after the k-th iteration of Merge-Cycle. For k = 0 this means the very beginning of the algorithm. We call edges from \(\bigcup \limits _{i=1}^{k} E(C_{i})\) processed edges, since those edges have already been loaded into Eint and then have been deleted from there. All processed edges can be divided into two types:
- Type A: The edge has been written on the output tape with a dedicated successor.
- Type B: The edge has been added to F.
These are the only possible cases for processed edges, because an edge which is deleted from Eint is either written on the output tape or added to F (procedure Write). This leads to the following definition.
Definition 3
For every k ∈{0,…,N} define the function \(\delta _{k}:\bigcup \limits _{i=1}^{k} E(C_{i})\rightarrow \bigcup \limits _{i=1}^{k} E(C_{i})\) by
Note that \(\delta ^{*}_{0}=\delta ^{C}\) and \(\delta ^{*}_{N}=\delta ^{*}\).
Lemma 6
Let k, ℓ ∈{0,…,N} with k < ℓ. Then for any \(v,v^{\prime }\in V\), e ∈ R∗(E), we have
(i) If \(t_{k}(v)=t_{k}(v^{\prime })\neq 0\), then \(t_{\ell }(v)=t_{\ell }(v^{\prime })\).
(ii) If e ∈ Cℓ, then \({[e]}_{\delta ^{*}_{k}}={[e]}_{\delta ^{C}}\).
Proof
(i). Let \(v,v^{\prime }\in V\) with \(t_{k}(v)=t_{k}(v^{\prime })\neq 0\). Assume for a moment that \(t_{\ell }(v)\neq t_{\ell }(v^{\prime })\). Then there exists \(k\leq k^{\prime }< \ell \) such that \(t_{k^{\prime }}(v)=t_{k^{\prime }}(v^{\prime })\) and \(t_{k^{\prime }+1}(v)\neq t_{k^{\prime }+1}(v^{\prime })\). Because tk(v)≠ 0 and the t-value of v is never set to 0 after its initialization, \(t_{k^{\prime }}(v)\neq 0\). We take a closer look at the \((k^{\prime }+1)\)-th call of Merge-Cycle. If for a node its t-value is changed in this call, it is set to \(a_{k^{\prime }+1}\) (line 9 or 11 in Update), so we may assume that \(t_{k^{\prime }+1}(v)=a_{k^{\prime }+1}\neq t_{k^{\prime }+1}(v^{\prime })\). But this implies that \(t_{k^{\prime }}(v)\in M\) or \(v\in C_{k^{\prime }+1}\). In the latter case, since \(t_{k^{\prime }}(v)\neq 0\), we also have \(t_{k^{\prime }}(v)\in M\) (procedure Construct-J-M). But then, \(t_{k^{\prime }}(v^{\prime })=t_{k^{\prime }}(v)\in M\) and therefore \(t_{k^{\prime }+1}(v^{\prime })=a_{k^{\prime }+1}=t_{k^{\prime }+1}(v)\), in contradiction to our assumption.
(ii). Let e ∈ Cℓ. With Lemma 5 (ii), we get \({[e]}_{\delta ^{C}}=E(C_{\ell })\). Since ℓ > k, we have \(\delta ^{*}_{k} (e^{\prime })=\delta ^{C} (e^{\prime })\) for any \(e^{\prime }\in C_{\ell }\). Hence, \(\delta ^{*}_{k} (e)=\delta ^{C} (e)\) and by induction we get \({(\delta ^{*}_{k})}^{j}(e)={(\delta ^{C})}^{j}(e)\in C_{\ell }\) for any j ≥ 1, which proves the claim. □
The next lemma describes the cycle merging in terms of equivalence classes.
Lemma 7
Let G = (V, E) be a directed graph with bijective successor function δ and the related equivalence relation ≡δ. Let \(r\in \mathbb {N}\) and e1,…,er ∈E be distinct with ei ≡δej for every i, j ∈ [r]. Let \(e_{1}^{\prime },\ldots ,e_{r}^{\prime }\in \mathbf {E}\) with \(e_{i}^{\prime }\not \equiv _{\delta } e_{j}^{\prime }\) and \(e_{i}\not \equiv _{\delta } e_{i}^{\prime }\) for every i≠j ∈ [r]. Let \(\delta ^{\prime }\) be a successor function on G with \(\delta ^{\prime }(e)=\delta (e)\) for every \(e\in \mathbf {E}\setminus \{e_{1},\ldots ,e_{r},e_{1}^{\prime },\ldots ,e_{r}^{\prime }\}\) and \(\delta ^{\prime }(e_{i})=\delta (e_{i}^{\prime })\) and \(\delta ^{\prime }(e_{i}^{\prime })=\delta (e_{i})\) for any i ∈ [r]. Then, \(\delta ^{\prime }\) is bijective and
\[{[e_{1}]}_{\delta ^{\prime }}={[e_{1}]}_{\delta }\cup \bigcup \limits _{i=1}^{r}{[e_{i}^{\prime }]}_{\delta }, \tag{P1}\]
\[{[e]}_{\delta ^{\prime }}={[e]}_{\delta }\quad \text {for all } e\in \mathbf {E}\setminus {[e_{1}]}_{\delta ^{\prime }}. \tag{P2}\]
Let us briefly explain the meaning of the quite technical notation in this lemma. The reader may think of e1,…,er as edges of the same cycle C. The edges \(e_{1}^{\prime },\ldots ,e_{r}^{\prime }\) are edges that do not belong to C, but to several tours that share at least one node with C, such that \((e_{i})_{(2)}=(e_{i}^{\prime })_{(2)}\) for all i ∈ [r]. The condition \(e_{i}^{\prime }\not \equiv _{\delta } e_{j}^{\prime }\) reflects the fact that we have to choose exactly one common node per tour for merging, as already explained in Section 3, see Fig. 2. We obtain the new successor function \(\delta ^{\prime }\) by performing the successor swapping as described in step 1.4 in Section 3.3. (P1) tells us that via this swapping all associated equivalence classes are merged which means that all affected tours become one big tour. (P2) makes sure that tours that don’t share nodes with C are not changed at all.
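The effect of the successor swapping can be checked on a toy instance. The sketch below is only an illustration of the statement of the lemma, not of the streaming procedure; the helper names are hypothetical. A cycle C = (3, 4, 5) shares node 3 with the tour (1, 2, 3) and node 5 with the tour (5, 6, 7), while the tour (8, 9, 10) is disjoint from C.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]


def successor_from_cycles(cycles: List[List[int]]) -> Dict[Edge, Edge]:
    """Successor function whose equivalence classes are the given cycles."""
    delta: Dict[Edge, Edge] = {}
    for cyc in cycles:
        n = len(cyc)
        for j in range(n):
            delta[(cyc[j], cyc[(j + 1) % n])] = (cyc[(j + 1) % n], cyc[(j + 2) % n])
    return delta


def orbit(delta: Dict[Edge, Edge], e: Edge) -> List[Edge]:
    """Equivalence class of e under a bijective successor function."""
    result, cur = [e], delta[e]
    while cur != e:
        result.append(cur)
        cur = delta[cur]
    return result


def swap_successors(delta: Dict[Edge, Edge],
                    pairs: List[Tuple[Edge, Edge]]) -> Dict[Edge, Edge]:
    """Swap the successors of e_i and e_i' for every pair (e_i, e_i')."""
    new = dict(delta)
    for e, e_prime in pairs:
        new[e], new[e_prime] = delta[e_prime], delta[e]
    return new


delta = successor_from_cycles([[1, 2, 3], [3, 4, 5], [5, 6, 7], [8, 9, 10]])
# e_1 = (5, 3) and e_1' = (2, 3) both enter node 3;
# e_2 = (4, 5) and e_2' = (7, 5) both enter node 5.
merged = swap_successors(delta, [((5, 3), (2, 3)), ((4, 5), (7, 5))])
```

After the swap, the orbit of (5, 3) under `merged` contains all nine edges of the three touching tours, as stated in (P1), whereas the orbit of (8, 9) is unchanged, as stated in (P2).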
Proof
First we note that \(\delta ^{\prime }\) is bijective, because δ is bijective. We prove the lemma via induction over r. Let r = 1. To shorten notation, we write e and \(e^{\prime }\) instead of e1 and \(e_{1}^{\prime }\). We start with proving
\[{[e]}_{\delta ^{\prime }}\subseteq {[e]}_{\delta }\cup {[e^{\prime }]}_{\delta }. \tag{3}\]
We show that for any \(e^{\prime \prime }\in {[e]}_{\delta }\cup {[e^{\prime }]}_{\delta }\), we have \(\delta ^{\prime }(e^{\prime \prime })\in {[e]}_{\delta }\cup {[e^{\prime }]}_{\delta }\): Let \(e^{\prime \prime }\in {[e]}_{\delta }\cup {[e^{\prime }]}_{\delta }\). Then there exists \(k\in \mathbb {N}\) such that \(e^{\prime \prime }=\delta ^{k}(e)\) or \(e^{\prime \prime }=\delta ^{k}(e^{\prime })\). If \(e^{\prime \prime }\in \{e,e^{\prime }\}\), then \(\delta ^{\prime }(e^{\prime \prime })\in \{\delta (e),\delta (e^{\prime })\}\). Otherwise \(\delta ^{\prime }(e^{\prime \prime })=\delta (e^{\prime \prime })=\delta ^{k+1}(e)\) or \(\delta ^{\prime }(e^{\prime \prime })=\delta ^{k+1}(e^{\prime })\), respectively. In each case we have \(\delta ^{\prime }(e^{\prime \prime })\in {[e]}_{\delta }\cup {[e^{\prime }]}_{\delta }\). Since \(e\in {[e]}_{\delta }\cup {[e^{\prime }]}_{\delta }\), it follows by induction on n that \({(\delta ^{\prime })}^{n}(e)\in {[e]}_{\delta }\cup {[e^{\prime }]}_{\delta }\) for any \(n\in \mathbb {N}\), so \({[e]}_{\delta ^{\prime }} \subseteq {[e]}_{\delta } \cup {[e^{\prime }]}_{\delta }\).
Next, we show
\[{[e]}_{\delta }\cup {[e^{\prime }]}_{\delta }\subseteq {[e]}_{\delta ^{\prime }}. \tag{4}\]
Let \(e^{\prime \prime }\in {[e^{\prime }]}_{\delta }\). Then there exists \(k\in \{1,\ldots ,|{[e^{\prime }]}_{\delta }|\}\) with \(e^{\prime \prime }=\delta ^{k}(e^{\prime })\). Since by assumption \(e\notin {[e^{\prime }]}_{\delta }\) and by Lemma 3 (i) \(\delta ^{\ell }(e^{\prime })\neq e^{\prime }\) for all ℓ ∈{1,…,k − 1}, we have
\[\delta ^{\prime }(\delta ^{\ell }(e^{\prime }))=\delta (\delta ^{\ell }(e^{\prime }))=\delta ^{\ell +1}(e^{\prime })\quad \text {for all } \ell \in \{1,\ldots ,k-1\}.\]
Hence, \(e^{\prime \prime } = \delta ^{k}(e^{\prime }) = {(\delta ^{\prime })}^{k-1}(\delta (e^{\prime })) = {(\delta ^{\prime })}^{k-1}(\delta ^{\prime }(e)) = { (\delta ^{\prime })}^{k}(e)\in {[e]}_{\delta ^{\prime }}\). So we have
\[{[e^{\prime }]}_{\delta }\subseteq {[e]}_{\delta ^{\prime }} \tag{5}\]
and analogously we get
\[{[e]}_{\delta }\subseteq {[e^{\prime }]}_{\delta ^{\prime }}. \tag{6}\]
By (5) \(\delta (e^{\prime })\in {[e^{\prime }]}_{\delta }\subseteq {[e]}_{\delta ^{\prime }}\), so \({[\delta (e^{\prime })]}_{\delta ^{\prime }}={[e]}_{\delta ^{\prime }}\) and thus using the assumption \(\delta (e^{\prime })=\delta ^{\prime }(e)\),
Combining (5), (6), and (7), we proved (4). With (3), (4), and (7), we have
\[{[e]}_{\delta ^{\prime }}={[e]}_{\delta }\cup {[e^{\prime }]}_{\delta },\]
so property (P1) is proven. For the proof of (P2), let \(e^{\prime \prime }\in \mathbf {E}\) with \(e^{\prime \prime }\not \equiv _{\delta } e, e^{\prime \prime }\not \equiv _{\delta } e^{\prime }\). Then, \(\delta ^{k}(e^{\prime \prime })\notin \{e,e^{\prime }\}\) for all \(k\in \mathbb {N}\), so we get
\[{(\delta ^{\prime })}^{k}(e^{\prime \prime })=\delta ^{k}(e^{\prime \prime })\quad \text {for all } k\in \mathbb {N}.\]
But this implies \([e^{\prime \prime }]_{\delta }=[e^{\prime \prime }]_{\delta ^{\prime }}\).
Induction step: Now let \(r\in \mathbb {N}\) and let the claim be true for all \(k\leq r\in \mathbb {N}\). Let e1,…,er+ 1 ∈E with ei ≡δej for every i, j ∈ [r + 1]. Let \(e_{1}^{\prime },\ldots , e_{r+1}^{\prime }\in \mathbf {E}\) with \(e_{i}^{\prime }\not \equiv _{\delta } e_{j}^{\prime }\) and \(e_{i}^{\prime }\not \equiv _{\delta } e_{i}\) for every i≠j ∈ [r + 1]. Let \(\delta ^{\prime }\) be a successor function on G with \(\delta ^{\prime }(e)=\delta (e)\) for every \(e\in \mathbf {E}\setminus \{e_{1},\ldots ,e_{r+1},e_{1}^{\prime },{\ldots } e_{r+1}^{\prime }\}\) and \(\delta ^{\prime }(e_{i})=\delta (e_{i}^{\prime })\) and \(\delta ^{\prime }(e_{i}^{\prime })=\delta (e_{i})\) for every i ∈ [r + 1]. We define a successor function δr for G by
\[\delta _{r}(e):=\begin{cases}\delta ^{\prime }(e) & \text {if } e\in \{e_{1},\ldots ,e_{r},e_{1}^{\prime },\ldots ,e_{r}^{\prime }\},\\ \delta (e) & \text {otherwise.}\end{cases}\]
With the induction hypothesis applied to δ and δr, we get by (P1)
\[{[e_{1}]}_{\delta _{r}}={[e_{1}]}_{\delta }\cup \bigcup \limits _{i=1}^{r}{[e_{i}^{\prime }]}_{\delta } \tag{8}\]
and by (P2)
\[{[e]}_{\delta _{r}}={[e]}_{\delta }\quad \text {for all } e\in \mathbf {E}\setminus {[e_{1}]}_{\delta _{r}}. \tag{9}\]
Now we apply the induction hypothesis to δr and \(\delta ^{\prime }\) as follows: We take δr instead of δ, \(\delta ^{\prime }\) remains, r = 1 and e1 resp. \(e_{1}^{\prime }\) are replaced by er+ 1 resp. \(e_{r+1}^{\prime }\). Then P1 gives
\[{[e_{r+1}]}_{\delta ^{\prime }}={[e_{r+1}]}_{\delta _{r}}\cup {[e_{r+1}^{\prime }]}_{\delta _{r}}. \tag{10}\]
Since e1 ≡δer+ 1, we get with (8)
which implies
Summarizing, we have
\[{[e_{r+1}]}_{\delta ^{\prime }}={[e_{1}]}_{\delta }\cup \bigcup \limits _{i=1}^{r+1}{[e_{i}^{\prime }]}_{\delta }. \tag{13}\]
So (P1) is proved, if \({[e_{r+1}]}_{\delta ^{\prime }}={[e_{1}]}_{\delta ^{\prime }}\). By (13) \({[e_{1}]}_{\delta }\subseteq {[e_{r+1}]}_{\delta ^{\prime }}\), so \(e_{1}\in {[e_{r+1}]}_{\delta ^{\prime }}\) and hence
\[{[e_{1}]}_{\delta ^{\prime }}={[e_{r+1}]}_{\delta ^{\prime }}. \tag{14}\]
For the proof of (P2), let \(e\in \mathbf {E}\setminus {[e_{1}]}_{\delta ^{\prime }}\). Since \(e\notin {[e_{1}]}_{\delta ^{\prime }}\), by (12) and (14) \(e\notin {[e_{1}]}_{\delta _{r}}\). Applying the induction hypothesis to δ and δr, (P2) gives us \({[e]}_{\delta _{r}}={[e]}_{\delta }\). We know that \({[e_{r+1}]}_{\delta ^{\prime }}={[e_{1}]}_{\delta ^{\prime }}\), so \(e\notin {[e_{r+1}]}_{\delta ^{\prime }}\). As above, we apply the induction hypothesis to δr and \(\delta ^{\prime }\) and get \({[e]}_{\delta ^{\prime }}={[e]}_{\delta _{r}}\). Altogether, \({[e]}_{\delta ^{\prime }}={[e]}_{\delta _{r}}={[e]}_{\delta }\). □
Lemma 8
Let k ∈{0,…,N}. Then, \(\delta ^{*}_{k}\) is bijective and for any \((u,v),(u^{\prime },v^{\prime })\in R^{*}(E)\), we have
(i) If \((u,v),(u^{\prime },v^{\prime })\) are processed edges, then \((u,v)\equiv _{\delta ^{*}_{k}}(u^{\prime },v^{\prime })\Leftrightarrow t_{k}(u)=t_{k}(u^{\prime })\).
(ii) If (u, v) is a processed edge, then tk(u) = tk(v).
(iii) If tk(u) = 0, then \((u,v)\equiv _{\delta ^{*}_{k}}(u^{\prime },v^{\prime })\Leftrightarrow (u,v)\equiv _{\delta ^{C}}(u^{\prime },v^{\prime })\).
Claim (i) says that the procedure Update works correctly, i.e., that the t-value of a node (if it is not 0) always represents the tour with which the node is currently associated. Claim (ii) says that after an edge has been processed, both of its nodes are associated with the same tour. So, after the algorithm has terminated, every node of G is in the same tour as its neighbors.
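The label bookkeeping behind claims (i) and (ii) can be sketched as follows. This is a simplified reconstruction of what Construct-J-M and Update do according to the proofs in this section, not the procedures themselves; the function name `update_labels` is hypothetical, and we assume that a fresh label is obtained by incrementing c.

```python
from typing import Dict, List, Tuple


def update_labels(t: Dict[int, int], cycle: List[int], c: int) -> Tuple[List[int], int, int]:
    """Relabel tours after one cycle has been merged. Returns (J, a, c)."""
    M = set()          # labels of the tours touched by the cycle
    J: List[int] = []  # one representative cycle node per touched tour
    for v in cycle:
        label = t[v]
        if label != 0 and label not in M:
            M.add(label)
            J.append(v)
    if M:
        a = min(M)
    else:
        c += 1         # assumption: c is incremented to create a fresh label
        a = c
    for v in t:        # every node of a touched tour receives the label a
        if t[v] in M:
            t[v] = a
    for v in cycle:    # every node of the cycle receives the label a
        t[v] = a
    return J, a, c
```

For example, starting from t ≡ 0 on the nodes 1, …, 7, the cycles (1, 2, 3) and (5, 6, 7) receive the labels 1 and 2. A third cycle (3, 4, 5) touches both tours, so J = [3, 5], the new label is a = min{1, 2} = 1, and afterwards all seven nodes carry the label 1.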
Proof
We prove all claims via one induction over k. For k = 0 we have \(\delta ^{*}_{0}=\delta ^{C}\) which is bijective (Lemma 5). Moreover, no edge has been processed so far, so (i) and (ii) are trivially fulfilled and (iii) follows directly from \(\delta ^{*}_{0}=\delta ^{C}\).
Now let all of the claims be true for a fixed k ∈{0,…,N − 1}. We start with proving the bijectivity and (i) for k + 1.
To do so we take a closer look at the (k + 1)-th call of Merge-Cycle. Suppose that \(\delta ^{*}_{k}\neq \delta ^{*}_{k+1}\). This change must happen in one of the procedures New-Nodes, Merge or Write, since these are the only procedures in which edges are written on the output tape or added to F. First, note that for every edge e written on the output tape during Write or added to F in New-Nodes it holds \(\delta ^{*}_{k}(e)=\delta ^{*}_{k+1}(e)\):
If \(e=(v_{i}^{(k+1)},v_{i+1}^{(k+1)})\) is written on the output tape during Write, it is written in the form \((v_{i}^{(k+1)},v_{i+1}^{(k+1)},v_{i+2}^{(k+1)})\), so \(\delta ^{*}_{k+1}(e)=(v_{i+1}^{(k+1)},v_{i+2}^{(k+1)})=\delta ^{C}(e)=\delta ^{*}_{k}(e)\).
If \(e=(v_{i-1}^{(k+1)},v_{i}^{(k+1)})\) is added to F during New-Nodes, it becomes a type-B-edge at this point, so \(\delta ^{*}_{k+1}(e)=(v_{i}^{(k+1)},s_{k}(v_{i}^{(k+1)}))\). Furthermore, \(s_{k+1}(v_{i}^{(k+1)})\) is set to \(v_{i+1}^{(k+1)}\) in line 3, so \(\delta ^{*}_{k+1}(e)=(v_{i}^{(k+1)},v_{i+1}^{(k+1)})=\delta ^{C}(e)=\delta ^{*}_{k}(e)\).
So it suffices to consider the procedure Merge: Here we process every node from the set Jk+ 1. Let r := |Jk+ 1| and write Jk+ 1 = {w1,…,wr}. Each of these nodes wi has been processed before; hence, there is a unique edge in Ck+ 1 that points to wi, which we denote by ei. Moreover, there is a unique edge in Fk pointing to wi, which we denote by \(e_{i}^{\prime }\). Now let i ∈ [r]. We process wi in two steps:
Step 1: (wi, s(wi)) is marked as successor of ei. So directly after this step, ei and \(e_{i}^{\prime }\) share the same successor, while the out-going edge of wi in Ck+ 1 has lost its predecessor.
Step 2: s(wi) is set to the next node on Ck+ 1, so that, with respect to \(\delta ^{*}_{k+1}\), the out-going edge of wi in Ck+ 1 becomes the successor of \(e_{i}^{\prime }\).
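The two steps, together with the cases for Write and New-Nodes discussed above, can be simulated in a few lines. The sketch below is a simplified reconstruction from the descriptions in this proof and ignores the memory management of the actual algorithm; the names `merge_cycle`, `write_first_in` and `first_in` (which plays the role of F by storing, for each node, the tail of its first-in edge) are hypothetical. The set J is passed in explicitly.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[int, int, int]  # (u, v, w): edge (u, v) has successor (v, w)


def merge_cycle(cycle: List[int], J: Set[int], s: Dict[int, int],
                first_in: Dict[int, int], tape: List[Triple]) -> None:
    """Process one simple cycle, writing triples to the output tape."""
    n = len(cycle)
    for i in range(n):
        u, w, z = cycle[i - 1], cycle[i], cycle[(i + 1) % n]
        # The edge (u, w) enters w; its canonical successor is (w, z).
        if w not in first_in:
            # New-Nodes: keep (u, w) as first-in edge with pending successor (w, z).
            first_in[w] = u
            s[w] = z
        elif w in J:
            # Step 1: (u, w) takes over the pending successor (w, s(w)).
            tape.append((u, w, s[w]))
            # Step 2: the first-in edge of w now continues along the new cycle.
            s[w] = z
        else:
            # Write: the edge keeps its canonical successor.
            tape.append((u, w, z))


def write_first_in(s: Dict[int, int], first_in: Dict[int, int],
                   tape: List[Triple]) -> None:
    """Write-F: output every first-in edge with its final successor."""
    for w, u in first_in.items():
        tape.append((u, w, s[w]))
```

Running `merge_cycle` on the cycles (1, 2, 3) and (5, 6, 7) with J = ∅, then on (3, 4, 5) with J = {3, 5}, followed by `write_first_in`, yields nine triples whose successor function forms a single closed tour through all nine edges.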
In these two steps we swapped the successors of ei and \(e_{i}^{\prime }\) and did not change anything else, so we get
\[\delta ^{*}_{k+1}(e)=\delta ^{*}_{k}(e)\quad \text {for all } e\in R^{*}(E)\setminus \{e_{1},\ldots ,e_{r},e_{1}^{\prime },\ldots ,e_{r}^{\prime }\}\]
and for any i ∈ [r]
\[\delta ^{*}_{k+1}(e_{i})=\delta ^{*}_{k}(e_{i}^{\prime })\quad \text {and}\quad \delta ^{*}_{k+1}(e_{i}^{\prime })=\delta ^{*}_{k}(e_{i}).\]
Let i, j ∈ [r] with i≠j. We have \(e_{i}\equiv _{\delta ^{*}_{k}} e_{j}\), since \(e_{j}\in C_{k+1}={[e_{i}]}_{\delta ^{C}}={[e_{i}]}_{\delta ^{*}_{k}}\). We also have \(e_{i}^{\prime }\not \equiv _{\delta _{k}^{*}} e_{j}^{\prime }\), which follows from tk(wi)≠tk(wj) (Construct-J-M, line 4) together with the induction hypothesis. Finally we have \(e_{i}\not \equiv _{\delta _{k}^{*}} e_{i}^{\prime }\), because \(e_{i}^{\prime }\notin E(C_{k+1})={[e_{i}]}_{\delta ^{C}}={[e_{i}]}_{\delta ^{*}_{k}}\).
So we can apply Lemma 7 with \(\delta =\delta ^{*}_{k}\) and \(\delta ^{\prime }=\delta ^{*}_{k+1}\). This ensures the bijectivity of \(\delta ^{*}_{k+1}\) and for every processed edge e by property P1 we get
where the second equivalence follows with the induction hypothesis: We have
Analogously we get
Now we are able to complete the proof of (i): Let \((u,v),(u^{\prime },v^{\prime })\) be processed edges.
Case 1: \((u,v),(u^{\prime },v^{\prime })\in {[e_{1}]}_{\delta ^{*}_{k+1}}\). Then \((u,v)\equiv _{\delta ^{*}_{k+1}} (u^{\prime },v^{\prime })\) and \(t(u)=a_{k+1}=t(u^{\prime })\).
Case 2: \((u,v) \in {[e_{1}]}_{\delta ^{*}_{k+1}},(u^{\prime },v^{\prime })\notin {[e_{1}]}_{\delta ^{*}_{k+1}}\). Then \((u,v)\not \equiv _{\delta ^{*}_{k+1}} (u^{\prime },v^{\prime })\) and \(t(u)=a_{k+1}\neq t(u^{\prime })\).
Case 3: \((u,v) \notin {[e_{1}]}_{\delta ^{*}_{k+1}},(u^{\prime },v^{\prime })\in {[e_{1}]}_{\delta ^{*}_{k+1}}\). Analogous to Case 2.
Case 4: \((u,v),(u^{\prime },v^{\prime })\notin {[e_{1}]}_{\delta ^{*}_{k+1}}\). Then \(t_{k+1}(u) = t_{k}(u),t_{k+1}(u^{\prime }) = t_{k}(u^{\prime })\) and with P2 of Lemma 7 we get \({[(u,v)]}_{\delta ^{*}_{k+1}}={[(u,v)]}_{\delta ^{*}_{k}}\) and \({[(u^{\prime },v^{\prime })]}_{\delta ^{*}_{k+1}}={[(u^{\prime },v^{\prime })]}_{\delta ^{*}_{k}}\). So
(ii). Let (u, v) be a processed edge. If (u, v) ∈ Ck+ 1, then at the end of Merge-Cycle both t(u) and t(v) are set to the same value a. If (u, v)∉Ck+ 1, then (u, v) already was a processed edge before. By induction hypothesis (ii) we have tk(u) = tk(v) and applying Lemma 6(i) we get tk+ 1(u) = tk+ 1(v).
(iii). Let u ∈ V with tk+ 1(u) = 0. That means that u is not processed in the first k + 1 calls of Merge-Cycle. Hence, (u, v) ∈ E(Cℓ) for some ℓ > k + 1. Lemma 6(ii) gives \([(u,v)]_{\delta ^{*}_{k+1}}=[(u,v)]_{\delta ^{C}}\), therefore we get \((u^{\prime },v^{\prime })\in [(u,v)]_{\delta ^{*}_{k+1}}\Leftrightarrow (u^{\prime },v^{\prime })\in [(u,v)]_{\delta ^{C}}\). □
Finally, we prove Theorem 1.
Proof of Theorem 1
According to Theorem 2, it suffices to show that δ∗ is bijective and that \(e\equiv _{\delta ^{*}}e^{\prime }\) for any \(e,e^{\prime }\in R^{*}(E)\). Remember that \(\delta ^{*}=\delta ^{*}_{N}\), so by Lemma 8 δ∗ is bijective. For the second property, let \(e,e^{\prime }\in R^{*}(E)\) with e = (u, v) and \(e^{\prime }=(u^{\prime },v^{\prime })\). If G is Eulerian, it is connected, so there exists a u-\(u^{\prime }\)-path P in G. For every edge on P, either the edge itself or the corresponding reversed edge has been processed during the algorithm Euler-Tour. By Lemma 8 (ii), tN(x) = tN(y) for all nodes x, y of P, hence, \(t_{N}(u)=t_{N}(u^{\prime })\) and by Lemma 8 (i), we get \(e\equiv _{\delta ^{*}_{N}}e^{\prime }\). Since \(\delta ^{*}_{N}=\delta ^{*}\), we are done. □
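The two conditions used in this proof, bijectivity and a single equivalence class, are easy to test on small instances. The following sketch is a checker for experiments only: it holds all edges in memory and is therefore not a streaming procedure. The function name `is_euler_tour` and the triple format (u, v, w), meaning that the edge (u, v) has the successor (v, w), are assumptions made for this illustration, and a simple graph is assumed.

```python
from typing import List, Tuple


def is_euler_tour(triples: List[Tuple[int, int, int]],
                  edges: List[Tuple[int, int]]) -> bool:
    """Check that the triples encode one closed trail using every edge once."""
    delta = {(u, v): (v, w) for u, v, w in triples}
    if not delta or len(delta) != len(triples):
        return False  # empty output, or some edge was written twice
    # Every undirected edge must occur in exactly one orientation.
    oriented = sorted(tuple(sorted(e)) for e in delta)
    if oriented != sorted(tuple(sorted(e)) for e in edges):
        return False
    # Surjectivity onto the finite edge set implies bijectivity.
    if set(delta.values()) != set(delta.keys()):
        return False
    # A single equivalence class: the orbit of any edge covers all edges.
    start = next(iter(delta))
    cur, steps = delta[start], 1
    while cur != start:
        cur = delta[cur]
        steps += 1
    return steps == len(delta)
```

Two disjoint triangles with their canonical successors pass the bijectivity test but fail the orbit test, which mirrors the role of connectivity in the proof.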
6 Conclusion
We have presented a one-pass algorithm with \(\mathcal O(n\log (n))\) RAM for finding Euler tours in undirected graphs in the graph streaming model with an additional write-only output tape. This suggests two possible directions for future work.
- Are there other well suited graph-theoretical problems, where the size of a solution exceeds \(\mathcal O(n \text { polylog}(n))\) RAM but which can be solved in the graph streaming model with an additional output tape?
- Can our technique of linking tours to equivalence classes be useful for other routing problems?
Notes
A previous version of this paper was published on arXiv and can be accessed at https://arxiv.org/abs/1710.04091
References
Christofides, N: Worst-case analysis of a new heuristic for the travelling salesman problem. Technical Report, Graduate School of Industrial Administration, Carnegie Mellon University (1976)
Pevzner, PA, Tang, H, Waterman, MS: An Eulerian path approach to DNA fragment assembly. Proc. Natl. Acad. Sci. 98(17), 9748–9753 (2001)
Feigenbaum, J, Kannan, S, McGregor, A, Suri, S, Zhang, J: On graph problems in a semi-streaming model. Theor. Comput. Sci. 348(2), 207–216 (2005)
McGregor, A: Graph stream algorithms: A survey. SIGMOD Rec. 43(1), 9–20 (2014)
Atallah, M, Vishkin, U: Finding Euler tours in parallel. J. Comput. Syst. Sci. 29(3), 330–337 (1984)
Demetrescu, C, Escoffier, B, Moruz, G, Ribichini, A: Adapting parallel algorithms to the W-stream model, with applications to graph problems. Theor. Comput. Sci. 411(44-46), 3994–4004 (2010)
Grohe, M, Koch, C, Schweikardt, N: Tight lower bounds for query processing on streaming and external memory data. In: Caires, L, Italiano, GF, Monteiro, L, Palamidessi, C, Yung, M (eds.) Automata, Languages and Programming, pp 1076–1088. Springer Berlin Heidelberg, Berlin, Heidelberg (2005)
François, N, Jain, R, Magniez, F: Unidirectional input/output streaming complexity of reversal and sorting. In: Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, APPROX/RANDOM 2014, September 4-6, 2014, Barcelona, Spain, pp 654–668 (2014)
Demetrescu, C, Finocchi, I, Ribichini, A: Trading off space for passes in graph streaming problems. ACM Trans. Algorithms 6(1), 6:1–6:17 (2009)
Ruhl, JM: Efficient algorithms for new computational models. Ph.D. Thesis, Massachusetts Institute of Technology. AAI0805714 (2003)
Aggarwal, G, Datar, M, Rajagopalan, S, Ruhl, M: On the streaming model augmented with a sorting primitive. In: Proceedings of the 45th Annual IEEE Symposium on Foundations of Computer Science, FOCS ’04, pp 540–549. IEEE Computer Society, Washington, DC, USA (2004)
Sun, X, Woodruff, DP: Tight Bounds for Graph Problems in Insertion Streams. In: Garg, N, Jansen, K, Rao, A, Rolim, JDP (eds.) Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2015), Leibniz International Proceedings in Informatics (LIPIcs), vol. 40, pp 435–448. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany (2015)
Funding
Open Access funding enabled and organized by Projekt DEAL.
Appendix A
On the following two pages we present a working example for the method Merge-Cycle that corresponds to the steps 1.2 to 1.6 in our high level description. Note that every node has at most one in-going first-in edge and one out-going potential successor edge at a time.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Glazik, C., Schiemann, J. & Srivastav, A. A One Pass Streaming Algorithm for Finding Euler Tours. Theory Comput Syst 67, 671–693 (2023). https://doi.org/10.1007/s00224-022-10077-w
DOI: https://doi.org/10.1007/s00224-022-10077-w