1 Introduction

On-demand mobile distribution hubs are emerging as a popular solution for businesses looking to meet growing demand and provide efficient, long-term access to warehouse space for their customers. These mobile facilities can be relocated within a region based on market needs and can be used for various purposes, such as delivering goods, providing services, and improving network coverage. To reduce operating costs, applications may need to move facilities closer to end users, which means that facilities should always stay within a reasonable distance of customers with ongoing service demands. However, relocating facilities also incurs a cost. An efficient solution dynamically adjusts the locations of these mobile facilities in response to new customers, ensuring they are always within a reasonable distance of where they are needed. This results in a trade-off between two types of costs: the assignment cost, which is the sum of distances between customers and their nearest facilities at the current time, and the movement cost, which is the total cost of relocating facilities between locations so far. The goal is to minimize the total cost of serving customers that arrive over time and request long-term service, which is the sum of these two costs.

In this paper, we introduce the online mobile facility location (OMFL) problem. We present two applications from different domains. The first concerns the distribution of goods from mobile warehouses to retail stores. Suppose a company has a fixed number of mobile warehouses, initially located at various points within the region that the company serves, where the distances between points represent the cost of shipping goods or relocating warehouses. As demand for the company’s products grows, new retail stores are built, each resulting in a new request to be served by the company. In this scenario, we do not consider the individual demands of each retail store; instead, we assume that each store has an ongoing demand for goods that need to be shipped from the mobile warehouses.

In the second application, we consider a telecommunications company that provides mobile broadband services to households within a region using movable internet providers. The company has a fixed number of these providers, which are initially positioned at various locations throughout the region. The distances between these points represent the signal strength or coverage between them. As the population in the region gradually grows and new households are established or connected, there is an increase in demand for the company’s services, resulting in new requests for continuous and long-term mobile broadband coverage.

Paper Outline

In the rest of this section, we introduce the OMFL problem and its two variants, G-OMFL and M-OMFL. Each subsection defines the corresponding problem, provides results, and reviews related work. In Section 2, we analyze G-OMFL and provide an upper bound. Section 3 presents our lower bound analysis for OMFL, showing that our analysis for G-OMFL is almost tight since OMFL is a special case of G-OMFL. Section 4 offers our lower bound analysis for M-OMFL. Finally, Section 5 concludes the paper and poses two open questions. Appendix A defines HSTs and discusses their use as a tool for solving online problems.

1.1 Online Mobile Facility Location (OMFL)

The mobile facility location (MFL) problem was first introduced by Demaine et al. [2] and later studied by Friggstad and Salavatipour [3]. Many MFL scenarios, such as the two applications presented above, are online: requests arrive one at a time and require long-term service, and the end of the request sequence is unknown. In MFL, each facility and request has a starting location in a metric space, and the goal is to find a destination point for each such that every request is assigned to a point that is the destination of some facility. The objective is to minimize either the total or the maximum distance between facilities and requests and their destinations. In OMFL, facilities can move more than once as new requests arrive online, while in MFL, facilities move only once offline. However, the assignment cost in both problems is based on the final configuration of facilities. In the online setting, future requests are unknown, so any facility configuration at a given time could be the final one. The intermediate movements of facilities can be interpreted as the “learning cost” that an online algorithm must pay, compared to an offline algorithm that solves MFL, to ensure that facilities are well positioned at all times. Therefore, OMFL can be seen as an online adaptation of MFL, where the online algorithm must balance the trade-off between moving facilities closer to requests and minimizing relocation costs.

1.1.1 Problem Definition

We define an instance of the OMFL problem as follows. We are given an arbitrary finite metric space \(\mathcal {M}=(V,d)\) with \(|V |= n\) and a non-negative, symmetric distance function \(d:V \times V \rightarrow \mathbb {R}_{+}\) that satisfies the triangle inequality. Requests are issued at points, one at a time. There is a set of k mobile facilities. We define a configuration (at time \(t=1,2,\ldots \)) to be a function \(\mathcal {S}(t): [k]\rightarrow V\) that specifies which of the facilities \(\{1,\ldots ,k\}\) is located at which point at time t, i.e., \(\mathcal {S}_{j}(t)\) denotes the point that hosts facility j at time t. At each time \(t=1,2,3,\dots \), exactly one request is placed, and \(\mathcal {J}(t)\) denotes the point at which the request at time t is placed.

Formally, “serving all requests” at time t is a function \(\mathcal {I}: [t] \rightarrow [k]\), which assigns each request \(i \in [t]\) to one of the k facilities. We assume there is no bound on the number of requests that can be assigned to the same facility. If request i is assigned to facility j at time t, this induces the cost \(d\left( \mathcal {J}(i),\mathcal {S}_j(t)\right) \), called the assignment cost of request i at time t. The overall assignment cost of a configuration \(F \subset V\) at time t, denoted by \(S_t(F)\), is the sum of the assignment costs of all requests at this time, i.e., \(\sum _{i=1}^t d\left( \mathcal {J}(i),\mathcal {S}_{\mathcal {I}_i(t)}(t)\right) \). In order to keep the overall assignment cost small, an algorithm can move the facilities between the points, which implies some additional cost, called the movement cost. The movement cost equals the distance between the points, i.e., if facility j is at point v at time t and at point \(v'\) at time \(t+1\), the cost of moving facility j at time \(t+1\) equals the distance \(d(v,v')\).

Feasible Solution

At any given time t, any facility configuration at time t is considered a feasible solution. This means that even an algorithm that does not perform any actions can output the initial facility configuration as a feasible solution. However, this solution may not necessarily be efficient, as it may result in a high assignment cost.

The total cost of any algorithm \({\mathcal {A}}\) at time t is the sum of the total movement cost by time t and the overall assignment cost at time t, and is defined as follows

$$\begin{aligned} cost_{t}^\mathcal {A}:= S^\mathcal {A}_t+M^\mathcal {A}_t \end{aligned}$$
(1)

where

$$\begin{aligned} S^\mathcal {A}_t:=\sum \limits _{i=1}^t d\left( \mathcal {J}(i),\mathcal {S}^{\mathcal {A}}_{\mathcal {I}^{\mathcal {A}}_i(t)}(t)\right) \end{aligned}$$

and

$$\begin{aligned} M^{\mathcal {A}}_t:=\sum \limits _{i=1}^t\sum \limits _{j=1}^{k} d\left( \mathcal {S}^{\mathcal {A}}_j(i-1),\mathcal {S}^{\mathcal {A}}_j(i)\right) . \end{aligned}$$
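For illustration, the total cost in (1) can be computed directly from these definitions. The following is a minimal sketch on a hypothetical toy instance (the function names are ours, and we assume each request is assigned to its nearest facility in the current configuration, which minimizes the assignment cost):

```python
# A minimal sketch of cost_t = S_t + M_t for OMFL on a toy instance.

def total_cost(d, configs, requests):
    """configs[t] lists the k facility positions after serving request t
    (configs[0] is the initial configuration); requests[i] is the point
    of request i+1."""
    t = len(requests)
    # Overall assignment cost S_t: each request pays its distance to the
    # nearest facility in the current configuration.
    S = sum(min(d(r, f) for f in configs[t]) for r in requests)
    # Total movement cost M_t: distance traveled by every facility so far.
    M = sum(d(configs[i - 1][j], configs[i][j])
            for i in range(1, t + 1)
            for j in range(len(configs[0])))
    return S + M

# Toy example: points on a line, d = absolute difference, k = 2.
d = lambda u, v: abs(u - v)
configs = [(0, 10), (0, 10), (4, 10)]   # one facility moves 0 -> 4 at t = 2
requests = [4, 5]                        # requests at times t = 1, 2
# S_2 = |4-4| + |5-4| = 1 and M_2 = 4, so the total cost is 5.
print(total_cost(d, configs, requests))  # 5
```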

Remark 1

The competitive ratio of any online algorithm at any given time \(t \ge 1\) is defined as the total cost of the online algorithm at time t divided by the cost of the optimal MFL solution at time t. It is important to note that an MFL algorithm can wait until time t and perform all necessary facility movements at that time, whereas the online algorithm does not know the value of t in advance. As a result, if the online algorithm aims to guarantee a competitive ratio \(\gamma \) against the optimal MFL solution at any given time t, it must guarantee \(\gamma \) at every time \(t' \in [t]\) and may relocate facilities at any time \(t'\).

Definition 1

(Optimal Assignment Cost) The optimal assignment cost at any time t, denoted by \(S_t^*\), is the lowest total assignment cost that can be achieved by an “optimal configuration” of facilities at time t. Mathematically,

$$\begin{aligned} S_t^* := \min _{F} S_t(F), \end{aligned}$$

where \(S_t(F)\) is the sum of the assignment costs of all requests at time t given the configuration F of facilities at time t. Further, \(F^* \in \arg \min \limits _{F}S_t(F)\) is an optimal configuration.
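To make the definition concrete, \(S_t^*\) can be computed by brute force on small instances. The sketch below is our own illustration (not part of the model); it enumerates facility placements, using the fact that with nearest-facility assignment, only the set of occupied points matters:

```python
from itertools import combinations

def optimal_assignment_cost(V, d, k, requests):
    """Brute-force S_t^* = min_F S_t(F) over all placements of k facilities
    on points of V (co-locating facilities never helps the assignment cost,
    so subsets of size at most k suffice)."""
    best = float("inf")
    for m in range(1, k + 1):
        for F in combinations(V, m):
            best = min(best, sum(min(d(r, f) for f in F) for r in requests))
    return best

V = [0, 3, 7, 10]
d = lambda u, v: abs(u - v)
# k = 2 facilities, requests at 0, 3, 10: an optimal configuration is
# {0, 10} (or {3, 10}) with S* = 0 + 3 + 0 = 3.
print(optimal_assignment_cost(V, d, 2, [0, 3, 10]))  # 3
```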

Remark 2

Due to the trade-off in the cost function (1), an optimal offline algorithm \(\mathcal {O}\) may not move facilities to the optimal configuration. This means that the optimal offline algorithm may lose more in movement cost than it gains in assignment cost if it moves its facilities to the configuration that minimizes the total assignment cost. Therefore, the total assignment cost of an optimal offline algorithm at any time t, i.e., \(S_t^{{\mathcal {O}}}\) may not equal the optimal assignment cost at t, i.e., \(S_t^*\).

1.1.2 Our Results

We provide a lower bound on the achievable competitive ratio. The lower bound even holds for the OMFL problem on uniform metric spaces, in which all pairwise distances are equal. The lower bound theorem essentially says that if an online algorithm \({\mathcal{O}\mathcal{N}}\) wants to guarantee a small multiplicative competitive ratio, \({\mathcal{O}\mathcal{N}}\) has to tolerate a relatively large additive term: if \({\mathcal{O}\mathcal{N}}\) wants to keep the additive term within a bound of \(\beta \), the multiplicative competitive ratio becomes at least \(1+\Omega (k/\beta )\).

Theorem 1

Let \({\mathcal {O}}\) denote an optimal offline algorithm. Consider any deterministic online algorithm \({\mathcal{O}\mathcal{N}}\). Further, assume that \({\mathcal{O}\mathcal{N}}\) guarantees that at all times \(t>0\), the additive difference between the assignment cost of \({\mathcal{O}\mathcal{N}}\) and the optimal assignment cost at time t is less than \(\beta \). Then, there exists an execution and a time \(t_0 > 0\) such that the total cost of \({\mathcal{O}\mathcal{N}}\) can be lower bounded as follows.

  • If \(\beta =O(k/\varepsilon )\) for any \(\varepsilon >0\), it holds that

    $$\begin{aligned} cost_{t_0}^\mathcal{O}\mathcal{N} \ge (1+\varepsilon ) \cdot {cost}_{t_0}^\mathcal {O}+ \Omega (k \log k - \varepsilon k). \end{aligned}$$
  • If \(\beta = O\left( \frac{k \cdot \log k}{\log \log k}\right) \), then for every \(\varepsilon \ge \frac{\log \log k}{\log ^{1-\delta }k}\) and any constant \(0 < \delta \le 1\), we obtain

    $$\begin{aligned} cost_{t_0}^\mathcal{O}\mathcal{N} \ge (1+\varepsilon ) \cdot cost_{t_0}^\mathcal {O} + \Omega \left( \frac{k \cdot \log k}{\log \log k}-\varepsilon k\right) . \end{aligned}$$

1.1.3 Related Work

MFL is a generalization of the k-facility location problem [4], which itself generalizes the facility location and k-median problems [5,6,7]. Demaine et al. [2] provided a 2-approximation for the maximum objective, which is optimal unless \(P=NP\), and Friggstad and Salavatipour [3] provided an 8-approximation for the total objective.

Consider the following abstract online problem. The problem is defined on a graph or metric space with resources that are either fixed (potentially any point of the metric can be opened as a resource) or mobile (typically k resource points are initially given). Requests are revealed one by one and must be served by a resource. A request may need to be served by moving a mobile resource to the requested point; alternatively, a model may relax this requirement and allow a request to be served remotely by either a fixed or mobile resource. The offline setting for resource allocation problems does not clearly capture the difference between real-world applications in which requests need long-term service and those in which requests need one-time service upon arrival; the online setting, however, can express this difference more clearly. Requests may need either long-term service or to be served only at the time they arrive. The assignment cost of a request \(r_i\) that arrives at time i and needs one-time service is the distance from \(r_i\) to \(f_i(i)\), where \(f_i(t)\) denotes the nearest facility to \(r_i\) at any time \(t \ge i\). For long-term service, we distinguish two cases: 1) In the “fixed long-term” case, where reassignment is not an option, the assignment cost of \(r_i\) is the distance from \(r_i\) to the point that hosts \(f_i(i)\) at any time \(t \ge i\). 2) In the “dynamic long-term” case, where reassignment is possible, the assignment cost of \(r_i\) equals the distance from \(r_i\) to \(f_i(t)\) at any time \(t \ge i\). The movement cost is the total distance traveled by resources. In some models, moving a resource over some distance is more expensive than serving a request over the same distance; we call this the “expensive move” setting. By varying these criteria, several classic online problems are obtained. Some aim to minimize only the total movement cost, while others involve a trade-off between the assignment cost and either the total opening cost or the total movement cost. The goal is to find an optimal balance between these costs.

OMFL is an online problem that falls under the umbrella of this abstract problem, along with other classic online problems such as page migration introduced by Black and Sleator [8], online facility location (OFL) introduced by Meyerson [9], and k-server introduced by Manasse et al. [10]. For uniform metric spaces, the k-server problem is equivalent to the paging problem [11]. Table 1 summarizes the similarities and differences between OMFL and these classic online problems, as well as some other related work.

In the online version of the k-facility reallocation problem, multiple customer points request service at each time step, instead of one request at a time as in the abstract online problem. Fotakis et al. [13] present a constant-competitive algorithm for this setting when \(k = 2\) on the line.

Almost all prior work has investigated the page migration problem for only a single page (i.e., \(k=1\)). A recent paper introduces a variant of page migration in the Euclidean space (of arbitrary dimension) that limits the distance a facility is allowed to move in a time step [15], and [16] extends [15] to k mobile facilities. The best deterministic and randomized algorithms for the page migration problem have competitive ratios of 4 [17] and 3 [18], respectively. There are methods to transform any k-server algorithm into deterministic and randomized k-page migration algorithms with competitive ratios of \(O(c^2)\) and O(c), respectively, where c is the competitive ratio of the k-server algorithm [19]. Results on page migration are elaborated in detail in [20].

Fotakis [21] shows that the competitive ratio for OFL is \(\Theta (\log m / \log \log m)\), where m represents the total number of requests. A variant of OFL is the incremental facility location problem [22, 23], where it is possible to merge clusters. Relaxed variants of incremental facility location, in which an online algorithm may correct the positions of its open facilities, have been investigated in different models [14, 24]. While the OMFL problem is an online variant of MFL, the models proposed by [14, 24] are mobile variants of OFL; as such, there are key differences between OMFL and these models. In contrast to the OMFL problem, [24] does not charge any cost for moving a facility, which is a major limitation of their model. They present an algorithm that is 2-competitive. Afterward, Feldkord et al. [14] proposed a model in which an online algorithm may move its open facilities, but instead of being free, each movement incurs an “expensive movement” cost. Further, they consider two models in which the movement of open facilities in each time step is either arbitrary or limited to some constant. They achieve the same competitiveness on the line for both one-time and fixed long-term services. When movement is unrestricted, the expected competitive ratio is \(O(\log D / \log \log D)\), where \(D > 1\) is the multiplicative factor by which moving over a distance is more expensive than the distance itself. When the movement of an open facility is limited to some constant \(\delta \) per time step, they achieve an expected competitive ratio that depends on \(\delta \) and the cost of opening a new facility. They showed that their results are asymptotically tight on the line. For one-time service, they extend their result to the Euclidean space of arbitrary dimension, where they achieve a competitive ratio with an additional additive term \(O(\sqrt{k})\), where k is the number of facilities in the optimal solution.

Sections 1.2.3 and 1.3.3 review the similarities and differences between OMFL variants, paging, and the k-server problems. M-OMFL is more closely related to paging when it is studied on uniform metrics, while some works on the randomized k-server conjecture are more closely related to G-OMFL.

Table 1 A comparison of OMFL and related work in terms of resource type, service location, service duration, and cost function

1.2 G-OMFL

According to Theorem 1, if a company aims to keep its facilities close to an optimal configuration in response to requests at any given time, a multiplicative competitive ratio of \(\Omega (k)\) is inevitable. This motivates us to explore the role of randomization in the OMFL problem. Following the approach of [25, 26], who used randomization to achieve a poly-logarithmic competitive ratio for the k-server problem, we also consider solving OMFL on HSTs. In [26], it is shown that by exploiting the randomized low-stretch hierarchical tree decomposition of [27], it is possible to obtain a poly-logarithmic competitive ratio for the k-server problem. Utilizing HSTs as a tool for solving online problems is a common approach [25, 26, 28, 29] (see Appendix A for more details about HSTs and the approach): their simple structure allows reducing a problem on an HST to a more general problem on a uniform metric [25, 26]. This leads us to define a generalized variant of OMFL on uniform metrics as the first step toward solving OMFL on HSTs and general metrics.

1.2.1 Problem Definition

We adapt the OMFL problem statement to accommodate G-OMFL on uniform metrics and introduce new notation accordingly. We are given a set V of n nodes and a set of k facilities. Further, requests arrive one at a time. We assume that at time \(t\ge 1\), request t arrives at node \(v(t)\in V\). For a node \(v\in V\), let \(r_{v,t}\) be the number of requests at node v after t requests have arrived, i.e.,

$$\begin{aligned} r_{v,t}:=|\{i\le t: v(i)=v\} |. \end{aligned}$$

In order to keep the total service cost small, an online algorithm can move the facilities between the nodes (if necessary, for answering one new request, we allow an algorithm to also move more than one facility). We define a configuration of facilities by integers \(f_v\in \mathbb {N}_0\) for each \(v\in V\) such that \(\sum _{v \in V} f_v=k\). We describe such a configuration by a set of pairs as

$$\begin{aligned} F := \{(v,f_{v}) : v \in V\}. \end{aligned}$$

The initial configuration is denoted by \(F_0\).

Feasible Solution

The feasible solution for G-OMFL is the same as for OMFL, as defined in Section 1.1.1. This means that any facility configuration at any time step is a feasible solution.

For any online algorithm \({\mathcal{O}\mathcal{N}}\), we denote the sequence of facility configurations until time t by \({\mathcal {F}}^{\mathcal{O}\mathcal{N}}_t:=\{F^{\mathcal{O}\mathcal{N}}(i): i \in [0,t]\}\), where \(F^{\mathcal{O}\mathcal{N}}(t)\) is the configuration after reacting to the arrival of request t and where \(F^{\mathcal{O}\mathcal{N}}(0)=F_0\).

Service Cost

We implicitly assume that if a node v has some facilities, all requests at v are served by these facilities. Depending on the number of facilities and the number of requests at a node \(v\in V\), an algorithm has to pay some service cost to serve the requests located at v. This service cost of node v is defined by a service cost function \(\sigma _v\) such that \(\sigma _v(x,y)\ge 0\) is the cost for serving y requests if there are x facilities at node v. For convenience, for \(t\ge 1\), we also define \(\sigma _{v,t}(x):=\sigma _v(x,r_{v,t})\) to be the service cost with x facilities at node v at time t. For some configuration F, we denote the total service cost at time t by

$$\begin{aligned} S_t(F) := \sum _{v\in V} \sigma _{v,t}(f_v) = \sum _{v\in V} \sigma _v(f_v, r_{v,t}). \end{aligned}$$

The service cost of any algorithm \(\mathcal {A}\) at time t is denoted by \(S_t^{\mathcal {A}} := S_t(F^{\mathcal {A}}(t))\). For any optimal configuration \(F^*\) at time t, we have \(S_t(F^*)=S_t^*\) (see Definition 1 and Remark 2 for \(S_t^*\)).

Movement Cost

We define the movement cost \(M_t^{\mathcal {A}}\) of a given algorithm \({\mathcal {A}}\) to be the total number of facility movements by time t. Generally, for two configurations, \(F=\{(v,f_v): v\in V\}\) and \(F'=\{(v,f_v'): v\in V\}\), we define the distance \(d(F,F')\) between the two configurations as follows:

$$\begin{aligned} d(F,F'):=\sum _{v \in V} \max \{0,f_v-f_v'\} = \frac{1}{2}\cdot \sum _{v\in V} |f_v - f_v' |. \end{aligned}$$
(2)

The distance \(d(F,F')\) is equal to the number of movements that are needed to get from configuration F to configuration \(F'\) (or vice versa). Based on the definition of d given by (2), we can express the movement cost of an online algorithm \({\mathcal{O}\mathcal{N}}\) with the sequence of facility configurations \(\mathcal {F}_t^{\mathcal{O}\mathcal{N}}=\{F^{\mathcal{O}\mathcal{N}}(i): i \in [0,t]\}\) as

$$\begin{aligned} M_t^{\mathcal{O}\mathcal{N}} = \sum _{i=1}^{t} d\left( F^{\mathcal{O}\mathcal{N}}(i-1),F^{\mathcal{O}\mathcal{N}}(i)\right) . \end{aligned}$$
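As a sanity check of (2), the following sketch (with hypothetical helper names of our own) computes both forms of the configuration distance and the resulting movement cost of a short run:

```python
def config_distance(F, Fp):
    """Number of facility movements needed to get from F to F', Eq. (2);
    configurations are dicts v -> f_v with the same total k."""
    nodes = set(F) | set(Fp)
    forward = sum(max(0, F.get(v, 0) - Fp.get(v, 0)) for v in nodes)
    half_abs = sum(abs(F.get(v, 0) - Fp.get(v, 0)) for v in nodes) / 2
    assert forward == half_abs  # the two forms in (2) agree since totals are k
    return forward

def movement_cost(configs):
    """M_t as the sum of distances between consecutive configurations."""
    return sum(config_distance(configs[i - 1], configs[i])
               for i in range(1, len(configs)))

# Toy run with k = 3 facilities on nodes a, b, c:
F0 = {"a": 2, "b": 1, "c": 0}
F1 = {"a": 1, "b": 1, "c": 1}   # one facility moves a -> c
F2 = {"a": 0, "b": 2, "c": 1}   # one facility moves a -> b
print(movement_cost([F0, F1, F2]))  # 2
```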

The movement cost of an offline algorithm for G-OMFL at any given time t is analogous to that of an MFL solution: since an offline algorithm knows the request sequence up to time t in advance, it only performs movements once. Therefore, at any given time t, the movement cost of any optimal offline algorithm \({\mathcal {O}}\) is \(M_t^{\mathcal {O}} = d\left( F_0,F^{\mathcal {O}}(t)\right) \).

The total cost of any algorithm \({\mathcal {A}}\) is

$$\begin{aligned} {cost}_t^\mathcal {A}:=S_t^\mathcal {A} + M_t^\mathcal {A}. \end{aligned}$$

Service Cost Function Properties

The service cost function \(\sigma \) has to satisfy a number of natural properties. First of all, for every \(v\in V\), \(\sigma _v(x,y)\) has to be monotonically decreasing in the number of facilities x that are placed at node v and monotonically increasing in the number of requests y at v.

$$\begin{aligned} \forall v\in V\, \forall x,y\in \mathbb {N}_0:\quad \sigma _v(x,y) \ge \sigma _v(x+1,y)\end{aligned}$$
(3)
$$\begin{aligned} \forall v\in V\, \forall x,y\in \mathbb {N}_0:\quad \sigma _v(x,y) \le \sigma _v(x,y+1) \end{aligned}$$
(4)

These two properties imply that adding facilities at a node cannot increase the total service cost, while adding requests at a node cannot decrease it. Naturally, this is because more facilities can serve a fixed number of requests at lower cost, while serving more requests with the same number of facilities costs more.

Further, the effect of adding additional facilities to a node v should become smaller with the number of facilities (convex property in x) and it should not decrease if the number of requests gets larger. Therefore, for all \(v\in V\) and all \(x,y\in \mathbb {N}_0\), we have

$$\begin{aligned} \sigma _v(x,y)-\sigma _v(x+1,y) \ge \sigma _v(x+1,y)-\sigma _v(x+2,y)\end{aligned}$$
(5)
$$\begin{aligned} \sigma _v(x,y)-\sigma _v(x+1,y) \le \sigma _v(x,y+1)-\sigma _v(x+1,y+1) \end{aligned}$$
(6)

The third property means that adding facilities at node v decreases the total service cost at that node, but with diminishing returns: the first few facilities have a large impact on reducing the service cost, but beyond a certain point, the impact becomes smaller and smaller. The fourth property means that the marginal benefit of adding one facility does not decrease as the number of requests grows: the reduction in service cost gained from one additional facility with \(y+1\) requests is at least the reduction with y requests. These properties ensure that the service cost function \(\sigma \) is well-behaved.
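To illustrate properties (3)-(6), one can verify them numerically for a concrete service cost function. The sketch below uses \(\sigma (x,y)=\max \{0,y-x\}\), an example of our own choosing in which each facility serves one co-located request for free and every remaining request costs 1:

```python
# Sanity-check that an example service cost function satisfies the
# monotonicity properties (3), (4) and the convexity properties (5), (6).

def sigma(x, y):
    # Illustrative choice (not from the paper): unserved requests cost 1 each.
    return max(0, y - x)

ok = True
for x in range(10):
    for y in range(10):
        ok &= sigma(x, y) >= sigma(x + 1, y)                      # (3)
        ok &= sigma(x, y) <= sigma(x, y + 1)                      # (4)
        ok &= (sigma(x, y) - sigma(x + 1, y)
               >= sigma(x + 1, y) - sigma(x + 2, y))              # (5)
        ok &= (sigma(x, y) - sigma(x + 1, y)
               <= sigma(x, y + 1) - sigma(x + 1, y + 1))          # (6)
print(ok)  # True
```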

Remark 3

We note that the OMFL problem on uniform metric spaces is a special case of G-OMFL in which the service cost of a node is 0 if there is a facility at the node and otherwise equals the number of requests at the node.
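Concretely, the service cost function of this special case and the resulting total service cost can be sketched as follows (illustrative code with our own naming; the request counts \(r_{v,t}\) are given as a dictionary):

```python
# Remark 3 as code: on uniform metrics, the total service cost counts the
# requests located at facility-free nodes.

def sigma_uniform(x, y):
    return 0 if x >= 1 else y

def service_cost(config, req_counts):
    """S_t(F) = sum over nodes v of sigma(f_v, r_{v,t})."""
    return sum(sigma_uniform(config.get(v, 0), r)
               for v, r in req_counts.items())

req_counts = {"a": 3, "b": 1, "c": 2}   # r_{v,t} after t = 6 requests
config = {"a": 1, "c": 1}               # k = 2 facilities at nodes a and c
print(service_cost(config, req_counts))  # 1, since only node b pays
```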

1.2.2 Our Results

We devise a simple, deterministic online algorithm, called dynamic greedy allocation (DGA) and denoted by \({\mathcal {G}}\), with the following properties. For two parameters \(\alpha \ge 1\) and \(\beta \ge 0\), DGA guarantees that at all times \(t\ge 0\), \(S_t^{\mathcal {G}} < \alpha S_t^*+\beta \). DGA achieves this while keeping the total movement cost small. In particular, our algorithm a) only moves when the current configuration is no longer feasible and b) always moves a facility that improves the service cost as much as possible. We show that the total number of movements up to a time t of our online greedy algorithm can be upper bounded as a function of the optimal service cost \(S_t^*\) at time t. Most significantly, we show that at the cost of an additive term that is roughly linear in k, it is possible to achieve a competitive ratio of \((1+\varepsilon )\) for every constant \(\varepsilon >0\). This result almost matches the lower bound of Theorem 1.
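The greedy rule can be pictured as follows. This is only a schematic sketch of the behavior described above, not the formal definition of DGA from Section 2; `greedy_step` and its arguments are our own naming. Whenever the invariant \(S_t < \alpha S_t^* + \beta \) is violated, the single-facility move that decreases the service cost the most is performed:

```python
def greedy_step(config, req_counts, sigma, S_opt, alpha, beta):
    """Restore feasibility by greedy single-facility moves: while the
    service cost violates S_t < alpha * S_t^* + beta, perform the move
    that decreases the service cost the most."""
    def cost(c):
        return sum(sigma(c.get(v, 0), r) for v, r in req_counts.items())
    while cost(config) >= alpha * S_opt + beta:
        best = None
        for src in [v for v, f in config.items() if f > 0]:
            for dst in req_counts:
                if dst == src:
                    continue
                c = dict(config)           # try moving one facility src -> dst
                c[src] -= 1
                c[dst] = c.get(dst, 0) + 1
                if best is None or cost(c) < cost(best):
                    best = c
        if best is None or cost(best) >= cost(config):
            break  # no single move helps; left as a safeguard in this sketch
        config = best
    return config

# Toy run on a uniform metric: one facility, all requests at node a.
sigma_u = lambda x, y: 0 if x >= 1 else y
print(greedy_step({"b": 1}, {"a": 5, "b": 0}, sigma_u, 0, 1.0, 1.0))
```

In the toy run, the configuration \(\{b:1\}\) has service cost 5, which violates \(S_t < 0 + 1\), so the facility moves from b to a and feasibility is restored.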

Remark 4

Note that the lower bound of Theorem 1 even holds for OMFL on uniform metrics and therefore, by Remark 3, also holds for the G-OMFL problem.

More precisely, we prove the following main theorem.

Theorem 2

Let \(\mathcal {O}\) denote an optimal offline algorithm. There is a deterministic online algorithm \({\mathcal {G}}\) such that for all times \(t \ge 0\), the total cost of \({\mathcal {G}}\) can be upper bounded as follows.

  • If \(\alpha =1\) and \(\beta = \Omega \left( k+\frac{k}{\varepsilon }\right) \) for any \(\varepsilon >0\), it holds that

    $$\begin{aligned} cost_t^\mathcal {G} \le (1+\varepsilon )\cdot {{cost}}_t^\mathcal {O} + O(k \log k + \beta ). \end{aligned}$$
  • If \(\alpha =1\) and \(\beta = \Omega \left( \frac{k \cdot \log k}{\log \log k}\right) \), then for every \(\varepsilon \ge \frac{\log \log k}{\log ^{1-\delta }k}\) and any constant \(0 < \delta \le 1\), we obtain

    $$\begin{aligned} cost_t^\mathcal {G} \le (1+\varepsilon )\cdot {{cost}}_t^\mathcal {O} + O\left( \beta \right) . \end{aligned}$$

Choosing \(\varvec{\alpha }>1\)

The results of the above theorem all hold for \(\alpha =1\), i.e., our algorithm is always forced to move to a configuration that is optimal up to the additive term \(\beta \). Even if \(\alpha \) is chosen to be larger than 1, as long as we want to guarantee a reasonably small multiplicative competitive ratio (of order o(k)), an additive term of order \(\Omega (k)\) is unavoidable. In fact, in order to reduce the additive term to O(k), \(\alpha \) has to be chosen to be of order \(k^\delta \) for some constant \(\delta >0\). Note that in this case, the multiplicative competitive ratio grows to at least \(\alpha \gg 1\). However, it might still be desirable to choose \(\alpha >1\). In that case, it can be shown that the movement cost \(M_t^{\mathcal {G}}\) only grows logarithmically with the optimal service cost \(S_t^*\) (where the basis of the logarithm is \(\alpha \)). As an application, this, for example, allows being \((1+\varepsilon )\)-competitive for any constant \(\varepsilon >0\) against an objective function of the form \(\gamma \cdot S_t^{\mathcal {G}} + M_t^{\mathcal {G}}\) even if \(\gamma \) is chosen of order \(k^{-O(1)}\).

1.2.3 Related Work

The OMFL problem on uniform metrics is the specific instance of G-OMFL with the simplest service cost function. Recall that in Section 1.1.1, we defined the assignment cost for the OMFL problem as the distance between each request and its nearest facility. In a uniform metric, this means that the assignment cost is equal to the number of requests that are not co-located with any facility.

We have introduced a general service cost model for G-OMFL in Section 1.2.1. This model is similar to the one used by Hajiaghayi et al. [30] for facility location, where the opening cost of a facility depends on the number of requests it serves. Another related generalization was made for the k-server problem by [25, 26], who defined a cost function for each subtree of an HST. They used an online algorithm for this variant, called the “allocation problem” and defined on uniform metrics, as a building block to design an online algorithm for k-server on the HST. The guarantees obtained by [25] for the allocation problem on a two-point metric space allowed them to obtain a polylogarithmic-competitive algorithm for the k-server problem on a binary HST with sufficient separation. However, this algorithm is limited by the binary and well-separated structure of the HST. [26] extended [25] to general HSTs and presented the first polylogarithmic-competitive algorithm for the k-server problem on arbitrary finite metrics, with a competitive ratio of \(O(\log ^3 n \log ^2 k \log \log n)\). This result is remarkable considering that the best deterministic upper bound for the k-server problem is \(2k-1\) [31], while no lower bound better than \(\Omega (\log k)\) [32] is known. The randomized k-server conjecture states that the expected competitive ratio for the k-server problem is \(O(\log k)\).

1.3 M-OMFL

A naive algorithm for OMFL may not relocate any facilities and simply output the initial facility configuration as its solution. This incurs no movement cost but possibly a high assignment cost, which may be undesirable for companies that have to keep the total assignment cost below some threshold. There are several reasons why companies may have such a constraint on the assignment cost. One reason is budget limitation: companies may have a fixed amount of money to spend on assigning requests to mobile facilities. Another reason is market competition: if a company serves end users from mobile facilities that are far away, the resulting high assignment cost may make its services less attractive than those of competitors that assign requests from closer facilities. To address this issue, we define a variant of OMFL in which any solution has to keep the total assignment cost below a threshold, defined as a function of the optimal assignment cost, while the goal is to minimize the movement cost.

1.3.1 Problem Definition

The problem statement and the model definition for M-OMFL are the same as for OMFL, as stated in Section 1.1.1. The only difference is that the goal is to minimize the movement cost only, subject to a constraint that the total assignment cost does not exceed a given threshold for any algorithm that solves the problem, including an offline algorithm. The movement cost and the assignment cost are defined as in Section 1.1.1. The threshold is defined as a function of the optimal assignment cost. The problem is specified by two parameters \(\alpha \) and \(\beta \) such that

$$\begin{aligned} \alpha \ge 1 \quad \text {and}\quad \max \{\alpha -1,\beta \} \ge 1. \end{aligned}$$
(7)

Feasible Configuration

We define a configuration F to be feasible at time t iff

$$\begin{aligned} S_t(F) < \alpha \cdot S_t^* + \beta . \end{aligned}$$
(8)
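Condition (8) is a strict-inequality predicate on a configuration's assignment cost. As a minimal sketch (the function and parameter names are ours, not part of the model):

```python
def is_feasible(assignment_cost: float, opt_assignment_cost: float,
                alpha: float, beta: float) -> bool:
    """Condition (8): a configuration F is feasible at time t iff
    S_t(F) < alpha * S_t^* + beta (note the strict inequality)."""
    return assignment_cost < alpha * opt_assignment_cost + beta
```

For instance, with \(\alpha =1\) and \(\beta =2\) (which satisfy (7)), a configuration with assignment cost 5 against an optimum of 4 is feasible, while one with cost 6 or more is not.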

Feasible Solution

For a given algorithm \(\mathcal {A}\), we denote the solution at time t by \({\mathcal {F}}^{\mathcal {A}}_t:=\{F^{\mathcal {A}}(i): i \in [0,t]\}\), the sequence of facility configurations of algorithm \(\mathcal {A}\), where \(F^{\mathcal {A}}(t)\) is the feasible configuration after reacting to the arrival of request t and \(F^{\mathcal {A}}(0)=F_0\). The assignment cost of an algorithm \({\mathcal {A}}\) at time t is denoted by \(S_t^{\mathcal {A}} := S_t(F^{\mathcal {A}}(t))\).

The total cost of an algorithm, including an optimal offline algorithm, is the sum of the movement costs incurred by the algorithm. However, an optimal offline algorithm for M-OMFL cannot defer all facility movements until the end, as it can for OMFL and G-OMFL. It has to adjust the facility configuration whenever the condition in (8) is violated.

1.3.2 Our Results

We show that any deterministic online algorithm for the M-OMFL problem on uniform metrics necessarily has a competitive ratio of at least \(\Omega (n)\).

Theorem 3

Assume that we are given parameters \(\alpha \) and \(\beta \) which satisfy (7). Then, for any online algorithm \({\mathcal{O}\mathcal{N}}\) and for every \(1\le k<n\), there exists an execution and a time \(t > 0\) such that the ratio between the number of movements by \({\mathcal{O}\mathcal{N}}\) and the number of movements of an optimal offline algorithm \(\mathcal {O}\) is at least n/2. More precisely, for all \(M_t^\mathcal {O}>0\) there is an execution such that \(M_t^{\mathcal{O}\mathcal{N}} \ge \frac{n}{2} \cdot M_t^\mathcal {O}\).

1.3.3 Related Work

Theorems 1 and 2 demonstrate that if we aim to maintain the facilities in a configuration whose assignment cost is optimal up to an additive \(\beta \) for small \(\beta \), we must pay a multiplicative competitive ratio of \(\Omega (k)\). This is similar to the deterministic competitive ratio of \(\Theta (k)\) for the k-server problem [11, 31]. M-OMFL is analogous to the k-server problem in that only the movement cost is minimized. When the distances between every pair of points are uniform, this variant becomes more similar to the paging problem. However, while the paging problem has a deterministic competitive ratio of k [33], we prove that deterministic M-OMFL on uniform metrics (and hence on general metrics) has a lower bound of n/2, where n is the number of metric points. Our lower bound analysis for M-OMFL is similar to the classic deterministic lower bound analysis for paging, particularly in how we compare the movements of the optimal offline algorithm and any online algorithm in each phase or interval. The randomized competitive ratio for the paging problem is \(\Theta (\log k)\) [32].

2 G-OMFL: An Upper Bound Analysis

In this section, we provide a proof of Theorem 2. The proof of Theorem 1 is postponed to Section 3, as it also holds for G-OMFL, considering Remark 4. In Section 2.1, we introduce our online algorithm, the dynamic greedy allocation (DGA) denoted by \({\mathcal {G}}\), for solving the G-OMFL problem. An overview of our analysis of DGA is provided in Section 2.2. The complete analysis of DGA is presented in Section 2.3. In the following, whenever clear from the context, we omit the superscript \({\mathcal {G}}\) in the algorithm-dependent quantities defined above.

2.1 Algorithm Description

The goal of our algorithm is two-fold. On the one hand, we want to guarantee that the service cost of DGA is always within some fixed bounds of the optimal service cost. On the other hand, we want to achieve this while keeping the overall movement cost low. Specifically, for two parameters \(\alpha \) and \(\beta \), where

$$\begin{aligned} \alpha \ge 1 \quad \text {and}\quad \max \{\alpha -1,\beta \} \ge 1. \end{aligned}$$
(9)

we guarantee that at all times

$$\begin{aligned} S_t < \alpha \cdot S_t^* + \beta , \end{aligned}$$
(10)

where \(S_t\) denotes the total service cost of DGA at time t. Condition (10) is maintained in the most straightforward greedy manner: whenever Condition (10) is violated after a new request arrives, DGA greedily moves facilities until Condition (10) holds again. Hence, as long as Condition (10) does not hold, DGA moves a facility that reduces the total service cost as much as possible. Our algorithm stops moving facilities as soon as the validity of Condition (10) is restored.

Whenever DGA moves a facility, it performs a best possible move, i.e., a move that achieves the best possible service cost improvement. Thus, DGA always moves a facility from a node where removing a facility is as cheap as possible to a node where adding a facility reduces the cost as much as possible. Therefore, for each movement m, we have

$$\begin{aligned} v^{{src}}_m&\in \arg \min _{v\in V} \{\sigma _{v,\tau _m}(f_{v,m-1}-1)-\sigma _{v,\tau _m}(f_{v,m-1})\} \quad \text {and}\end{aligned}$$
(11)
$$\begin{aligned} v^{{dst}}_m&\in \arg \max _{v\in V} \{\sigma _{v,\tau _m}(f_{v,m-1})-\sigma _{v,\tau _m}(f_{v,m-1}+1)\}. \end{aligned}$$
(12)

Let \(\tau _m\) be the time of the m-th movement and \(f_{v,m-1}\) be the number of facilities at node v after the \((m-1)\)-th movement. As defined in Section 1.2.1, \(\sigma _{v,t}(x)\) is the service cost at node v of having x facilities at time t. (11) identifies the set of nodes where removing a facility causes the least increase in service cost, and (12) identifies the set of nodes where adding a facility causes the largest decrease in service cost. DGA then moves a facility from a node in the first set to a node in the second set.
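A single movement of DGA according to (11) and (12) can be sketched as follows; here `sigma(v, x)` stands in for \(\sigma _{v,\tau _m}(x)\) at the current time, and all identifiers are illustrative rather than part of the formal model:

```python
def best_move(nodes, f, sigma):
    """Pick one DGA movement: a source node per (11), where removing a
    facility increases the service cost the least, and a destination
    node per (12), where adding a facility decreases it the most.
    f[v] is the current number of facilities at node v; sigma(v, x) is
    the service cost at node v when it holds x facilities."""
    src = min((v for v in nodes if f[v] > 0),
              key=lambda v: sigma(v, f[v] - 1) - sigma(v, f[v]))
    dst = max(nodes, key=lambda v: sigma(v, f[v]) - sigma(v, f[v] + 1))
    return src, dst
```

In the classic uniform-metric model of Section 2.2, for instance, `sigma(v, x)` would be 0 whenever `x >= 1` and equal to the number of requests at v otherwise.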

2.2 Analysis Overview

While the algorithm DGA itself is quite simple, its analysis turns out to be rather technical. We thus first describe the key steps of the analysis by discussing a simple case. We assume the classic cost model on uniform metrics, where the service cost at a node is 0 if there is at least one facility at the node, and equal to the number of requests at the node otherwise. Further, we assume that we run DGA with parameters \(\alpha =1\) and \(\beta =0\), i.e., after each request arrives, DGA moves to a configuration with optimal service cost. Note that these parameter settings violate Condition (9) and we therefore get a weaker bound than the one promised by Theorem 2.

First, note that in the described simple scenario, DGA clearly never puts more than one facility on the same node. Further, whenever DGA moves a facility from a node u to a node v, the overall service cost has to strictly decrease and thus, the number of requests at node v is larger than the number of requests at node u. Consider some point in time t and let

$$\begin{aligned} r_{\min }(t):=\min \limits _{v\in V:f_{v,t}=1}r_{v,t} \end{aligned}$$

be the minimum number of requests among the nodes v with a facility at time t. Hence, whenever at a time t, DGA moves a facility from a node u to a node v, node u has at least \(r_{\min }(t)\) requests and consequently, node v has at least \(r_{\min } (t) +1\) requests. Further, if at some later time \(t'>t\), the facility at node v is moved to some other node w, because DGA always removes a facility from a node with as few requests as possible, we have \(r_{\min }(t')\ge r_{\min }(t)+1\). Consequently, if in some time interval \([t_1,t_2]\), there is some facility that is moved more than once, we know that \(r_{\min }(t_1)<r_{\min }(t_2)\). We partition time into phases, starting from time 0. Each phase is a maximal time interval where no facility is moved more than once. (cf. Definition 2 in the formal analysis of DGA).

The above argument implies that after each phase \(r_{\min }\) increases by at least one and therefore at any time t in Phase p, we have \(r_{\min }(t)\ge p-1\) and at the end of Phase p, we have \(r_{\min }(t)\ge p\). In Section 2.3, the more general form of this statement appears in Lemma 4. There, \(\gamma _p\) is defined to be the smallest service cost improvement of any movement in Phase p (\(\gamma _p=1\) in the simple case considered here), and Lemma 4 shows that \(r_{\min }\) grows by at least \(\gamma _p\) in Phase p. Assume that at some time t in Phase p, a facility is moved from a node u to a node v. Because node u already had its facility at the end of Phase \(p-1\), we have \(r_{u,t}=r_{\min }(t)\ge p-1\). Consequently, at the end of Phase p, there is at least one node (the source of the last movement) that has no facility and at least \(p-1\) requests. The corresponding (more technical) statement in our general analysis appears in Lemma 6.

We bound the total cost of DGA and of an optimal offline algorithm from above and below, respectively, as a function of the optimal service cost. Hence, the ratio between these two total costs provides the desired competitive factor. Our algorithm guarantees that at all times, the service cost is within fixed bounds of the optimal service cost (in the simple case here, the service cost is always equal to the optimal service cost). Knowing that there are nodes with many requests and no facilities therefore allows us to lower bound the optimal service cost. In the general case, this is done by Lemma 9 and Lemma 10. In the simple case considered here, as at the end of Phase p, there are k nodes with at least p requests (the nodes that have facilities) and at least one additional node with at least \(p-1\) requests, we know that at the end of Phase p, the optimal service cost is at least \(p-1\). Consequently, DGA (in the simple case) pays exactly the optimal service cost (as mentioned before, in the general case, the service cost is within fixed bounds of the optimal service cost) and at most \((p-1)k\) as movement cost. Hence, the total cost paid by DGA is at most a factor \(k+1\) times the optimal service cost, since the optimal service cost is at least \(p-1\). By choosing \(\alpha \) slightly larger than 1 and a larger \(\beta \) (\(\beta \ge k\)), DGA becomes lazier and one can show that the difference between the number of movements of DGA and the optimal service cost becomes significantly smaller. Also note that by construction, the service cost of DGA is always at most \(\alpha S_t^* + \beta \).

When analyzing DGA, we mostly ignore the movement cost of an optimal offline algorithm. We only exploit the fact that by the time DGA decides to move a facility for the first time, any other algorithm must also move at least one facility and therefore the optimal offline cost becomes at least 1.

2.3 Upper Bound Analysis

In the following, we show how to upper bound the total cost of our online algorithm \({\mathcal {G}}\) by a function of the total cost of an optimal offline algorithm \(\mathcal {O}\). Clearly, DGA at all times \(t\ge 0\) guarantees that the service cost can be bounded as

$$\begin{aligned} S_t^\mathcal {G} < \alpha \cdot S_t^* + \beta \le \alpha \cdot {cost}_t^\mathcal {O} + \beta . \end{aligned}$$
(13)

In order to upper bound the total cost, it therefore suffices to study how the movement cost \(M_t^{\mathcal {G}}\) of DGA grows as a function of the optimal offline algorithm cost. Let \({\mathcal {O}}\) be an optimal offline algorithm and let \(F^{\mathcal {O}}(t)\) be the configuration of \(\mathcal {O}\) at time t. Recall that \(d(F_0,F^\mathcal {O}(t))\) denotes the total number of movements required to move from the initial configuration to configuration \(F^\mathcal {O}(t)\). We therefore have \({cost}_t^\mathcal {O} = S_t^\mathcal {O} + M_t^\mathcal {O} \ge S_t^* + d(F_0,F^\mathcal {O}(t))\). In order to upper bound \(M_t^\mathcal {G}\) as a function of \({cost}_t^\mathcal {O}\), we upper bound it as a function of \(S_t^* + d(F_0,F^\mathcal {O}(t))\).

Instead of directly dealing with \(d(F_0,F^\mathcal {O}(t))\), we make use of the fact that our analysis works for a general cost function \(\sigma \) satisfying the conditions given in (3), (4), (5), and (6). Given a service cost function \(\sigma \), consider a function \(\sigma '\) defined as follows:

$$\begin{aligned} \forall v\in V, \forall x\in \{0,\dots ,k\}, \forall y\in \mathbb {N}_0: \sigma _v'(x,y) := \sigma _v(x,y) + \max \{0,f_{v}(0)-x\} \end{aligned}$$

where \(f_{v}(t)\) is the number of facilities at time t on node v. Clearly, \(\sigma '\) also satisfies the conditions given in (3), (4), (5), and (6). In addition, for any time t and any configuration \(F=\{(v,f_v) : v\in V\}\), we have

$$\begin{aligned} S_t'(F)&= \sum _{v\in V} \sigma _v'(f_v,r_{v,t}) \nonumber \\&= \sum _{v\in V} \left( \sigma _v(f_v,r_{v,t}) + \max \{0,f_{v}(0)-f_v\}\right) \nonumber \\&\overset{(2)}{=} S_t(F) + d(F_0, F) \end{aligned}$$
(14)

where \(S_t'(F)\) refers to the total service cost w.r.t. the new cost function \(\sigma '\). Hence, \(S'_t(F)\) exactly measures the sum of the service cost and the movement cost of a configuration F. Note that now, in all our results, \(S_t^*\) corresponds to the combined service and movement cost of an optimal configuration \(F^*\).
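Equation (14) can be checked directly on a toy instance. The following sketch, in our own notation and assuming the classic cost model, computes both sides of the identity:

```python
def shifted_total_cost(nodes, f, f0, requests, sigma):
    """Total service cost of configuration F under sigma', i.e.
    sum_v [sigma_v(f_v, r_v) + max(0, f0_v - f_v)], which by (14)
    equals S_t(F) + d(F0, F)."""
    return sum(sigma(v, f[v], requests[v]) + max(0, f0[v] - f[v])
               for v in nodes)

def plain_cost_plus_distance(nodes, f, f0, requests, sigma):
    """S_t(F) + d(F0, F), computed separately for comparison."""
    s = sum(sigma(v, f[v], requests[v]) for v in nodes)
    d = sum(max(0, f0[v] - f[v]) for v in nodes)  # d(F0, F), using (2)
    return s + d
```

Since both configurations place the same total number k of facilities, summing the per-node deficits \(\max \{0,f_{v}(0)-f_v\}\) indeed yields the movement distance \(d(F_0,F)\).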

We are now going to analyze DGA. In our analysis, we bound the total costs of the optimal offline algorithm \(\mathcal {O}\) and of the online algorithm \(\mathcal {G}\) from below and above, respectively, as functions of the optimal service cost, and thus obtain the upper bound (competitive factor) promised in Theorem 2. Hence, we first derive a lower bound on the optimal service cost.

For the analysis of DGA, we partition the movements into phases \(p=1,2,\dots \), where roughly speaking, a phase is a maximal consecutive sequence of movements in which no facility is moved twice. We use \(m_p\) to denote the first movement of Phase p (for \(p\in \mathbb {N}\)). In addition, we define \(v^{{src},\mathcal {G}}_m\) and \(v^{{dst},\mathcal {G}}_m\) to be the nodes involved in the m-th facility move, where we assume that \(\mathcal {G}\) moves a facility from node \(v^{{src}}_m\) to \(v^{{dst}}_m\). Formally, the phases are defined as follows.

Definition 2

(Phases) The movements are divided into phases \(p=1,2,\dots \), where Phase p starts with movement \(m_p\) and ends with movement \(m_{p+1}-1\). We have \(m_1=1\), i.e., the first phase starts with the first movement. Further for every \(p>1\), we define

$$\begin{aligned} m_p := \min \{m>m_{p-1}\, :\, \exists \,m'\in [m_{p-1},m-1]\text { s.t.\ } v^{{src}}_m=v^{{dst}}_{m'}\}. \end{aligned}$$
(15)

For a Phase \(p\ge 1\), let \(\lambda _p:=m_{p+1}-m_p\) be the number of movements of Phase p.
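Definition 2 translates into a simple scan over the movement sequence. The following sketch (names ours) splits movements into phases according to (15):

```python
def partition_into_phases(moves):
    """Partition a movement sequence into phases per Definition 2:
    Phase p ends just before the first movement m whose source node
    was the destination of an earlier movement of the same phase,
    which is exactly condition (15).  Each movement is a (src, dst)
    pair of node identifiers."""
    phases, current, dsts = [], [], set()
    for src, dst in moves:
        if src in dsts:            # (15): this movement opens a new phase
            phases.append(current)
            current, dsts = [], set()
        current.append((src, dst))
        dsts.add(dst)
    if current:
        phases.append(current)
    return phases
```

For example, the sequence a→b, c→d, b→e, e→a splits into three phases, since the third movement reuses b (a destination of the first) as a source, and the fourth reuses e.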

2.3.1 Optimal Service Cost Analysis

The algorithm DGA moves facilities in order to improve the service cost. Throughout the rest of our analysis, we use \(\tau ^{\mathcal {G}}_m\) to denote the time of the m-th movement. For a given movement m, we use \(\gamma (m)>0\) to denote the service cost improvement of m. Further, we use \(F_0\) to denote the initial configuration of the k facilities and, for a given (deterministic) algorithm \(\mathcal {A}\) and any \(m\ge 1\), we let \(F_m^\mathcal {A}=\{(v,f_{v,m}^\mathcal {A}):v\in V\}\) be the configuration of the k facilities for \(\mathcal {A}\) after m facility movements (i.e., after m facility movements of \(\mathcal {A}\), node v has \(f_{v,m}^\mathcal {A}\) facilities).

$$\begin{aligned} \gamma (m)&:= S_{\tau _m}(F_{m-1}) - S_{\tau _m}(F_m)\nonumber \\&= \big (\sigma _{v^{{dst}}_m,\tau _m}(f_{v^{{dst}}_m,m-1})-\sigma _{v^{{dst}}_m,\tau _m}(f_{v^{{dst}}_m,m})\big )\nonumber \\&\quad -\big (\sigma _{v^{{src}}_m,\tau _m}(f_{v^{{src}}_m,m})-\sigma _{v^{{src}}_m,\tau _m}(f_{v^{{src}}_m,m-1})\big ). \end{aligned}$$
(16)

For each Phase p, we define the improvement \(\gamma _p\) of Phase p and the cumulative improvement \(\Gamma _p\) up to Phase p as follows

$$\begin{aligned} \gamma _p := \min _{m\in [m_p,m_{p+1}-1]} \gamma (m)\quad \text {and}\quad \Gamma _p := \sum _{i=1}^p \gamma _i,\quad \Gamma _0:=0,\ \gamma _0 := 0. \end{aligned}$$
(17)

We are now ready to prove our first technical lemma, which lower bounds the cost of removing a facility from any node that has one (i.e., any \(v\in V\) with \(f_v \ge 1\)) at any point in the execution. The following lemma implies that removing any facility of an optimal configuration during some Phase p increases the optimal service cost by at least \(\Gamma _{p-1}\) (and by \(\Gamma _p\) at the end of Phase p), since the facilities of an optimal configuration are located at the nodes with the largest numbers of requests.

Lemma 4

Let m be a movement, let \(F=\{(v,f_v):v\in V\}\) be the configuration of DGA at any point in the execution after movement m, and let \(t\ge \tau _m\) be the time at which the configuration F occurs. Then, for all times \(t'\ge t\) and for all nodes \(v\in V\), if \(f_v>0\) it holds that

$$\begin{aligned} \sigma _{v,t'}(f_v-1) - \sigma _{v,t'}(f_v)\ge \Gamma _{p-1}, \end{aligned}$$

where p is the phase in which movement m occurs.

Proof

We show that for each facility movement \(m\in \mathbb {N}\) of DGA, it holds that

$$\begin{aligned} \forall v\in V\,:\, f_{v,m}>0\ \Longrightarrow \ \sigma _{v,\tau _m}(f_{v,m}-1) - \sigma _{v,\tau _m}(f_{v,m})\ge \Gamma _{p-1}, \end{aligned}$$
(18)

where p is the phase in which movement m occurs (i.e., the claim of the lemma holds immediately after movement m). The lemma then follows because (i) any configuration \(\{(v,f_v):v\in V\}\) occurring after movement m is the configuration \(F_{m'}\) for some movement \(m'\ge m\), (ii) the values \(\Gamma _{p-1}\) are monotonically increasing with p, and (iii) by (6), for all \(v\in V\), the value \(\sigma _{v,t}(f-1)-\sigma _{v,t}(f)\) is monotonically non-decreasing with t.

It therefore remains to prove (18) for every m, where p is the phase of movement m. We prove a slightly stronger statement. Generally, for a movement \(m'\) and a Phase \(p'\), let \(V^{{dst}}_{p',m'}\) be the set of nodes that have received a new facility by some movement \(m''\le m'\) of Phase \(p'\). Hence,

$$\begin{aligned} V^{{dst}}_{p',m'}:=\{v\in V: \exists \text { movement } m''\le m' \text { of Phase } p' \text { s.t. } v^{{dst}}_{m''}=v\}. \end{aligned}$$

We show that in addition to (18), it also holds that

$$\begin{aligned} \forall v\in V^{{dst}}_{p,m}\,:\, f_{v,m}>0\ \Longrightarrow \ \sigma _{v,\tau _m}(f_{v,m}-1) - \sigma _{v,\tau _m}(f_{v,m})\ge \Gamma _{p}. \end{aligned}$$
(19)

We prove (18) and (19) together by using induction on m.

Induction Base (\(\varvec{m=1}\)) The first movement occurs in Phase 1. By (17), \(\Gamma _0=0\) and by (3), we also have \(\sigma _{v,t}(f-1)-\sigma _{v,t}(f)\ge 0\) for all times \(t\ge 0\), all nodes \(v\in V\), and all \(f\ge 1\). Inequality (18) therefore clearly holds for \(m=1\). It remains to show that also (19) holds for \(m=1\). We have \(V^{{dst}}_{1,1}=\{v^{{dst}}_1\}\) and showing (19) for \(m=1\) therefore reduces to showing that \(\sigma _{v^{{dst}}_1,\tau _1}(f_{v^{{dst}}_1,1}-1)-\sigma _{v^{{dst}}_1,\tau _1}(f_{v^{{dst}}_1,1})\ge \Gamma _1=\gamma _1\), which follows directly from (16) and (17).

Induction Step (\(\varvec{m>1}\)) We first show that Inequalities (18) and (19) hold immediately before movement m and thus,

$$\begin{aligned} \forall v\in V&: f_{v,m-1}>0\ \Rightarrow \ \sigma _{v,\tau _m}(f_{v,m-1}-1) - \sigma _{v,\tau _m}(f_{v,m-1})\ge \Gamma _{p-1},\end{aligned}$$
(20)
$$\begin{aligned} \forall v\in V^{{dst}}_{p,m-1}&: f_{v,m-1}>0\ \Rightarrow \ \sigma _{v,\tau _m}(f_{v,m-1}-1) - \sigma _{v,\tau _m}(f_{v,m-1})\ge \Gamma _{p}. \end{aligned}$$
(21)

If m is not the first movement of Phase p, Inequalities (20) and (21) follow directly from the induction hypothesis (for \(m-1\)) and from (6). Let us therefore assume that m is the first movement of Phase p. Note that in this case \(V^{{dst}}_{p,m-1}=\emptyset \) and (21) therefore trivially holds. Because \(m>1\), we know that in this case \(p\ge 2\). From the induction hypothesis and from (6), we can therefore conclude that for every node \(v\in V^{{dst}}_{p-1,m-1}\) (every node v that is the destination of some facility movement in Phase \(p-1\)), we have \(\sigma _{v,\tau _m}(f_{v,m-1}-1) - \sigma _{v,\tau _m}(f_{v,m-1})\ge \Gamma _{p-1}\). Note that for all these nodes, we have \(f_{v,m-1}>0\). Because m is the first movement of Phase p, Definition 2 implies that \(v^{{src}}_m\in V^{{dst}}_{p-1,m-1}\). Applying (11), we get that for all \(v\in V\), \(\sigma _{v,\tau _m}(f_{v,m-1}-1) - \sigma _{v,\tau _m}(f_{v,m-1})\ge \sigma _{v^{{src}}_m,\tau _m}(f_{v^{{src}}_m,m-1}-1) - \sigma _{v^{{src}}_m,\tau _m}(f_{v^{{src}}_m,m-1})\ge \Gamma _{p-1}\) and therefore (20) also holds if \(m\ge 2\) is the first movement of some phase.

We can now prove (18) and (19). For all nodes \(v\notin \{v^{{src}}_m,v^{{dst}}_m\}\), we have \(f_{v,m}=f_{v,m-1}\) and we further have \(V^{{dst}}_{p,m}=V^{{dst}}_{p,m-1}\cup \{v^{{dst}}_m\}\). For \(v\notin \{v^{{src}}_m,v^{{dst}}_m\}\), (18) and (19) therefore directly follow from (20) and (21), respectively. For the two nodes involved in movement m, first note that \(v^{{src}}_m\notin V^{{dst}}_{p,m-1}\). It therefore suffices to show that

$$\begin{aligned} f_{v^{{src}}_m,m}=0\quad \text {or}\quad \sigma _{v^{{src}}_m,\tau _m}(f_{v^{{src}}_m,m}-1) - \sigma _{v^{{src}}_m,\tau _m}(f_{v^{{src}}_m,m})&\ge \Gamma _{p-1},\end{aligned}$$
(22)
$$\begin{aligned} \text {as well as}\quad \sigma _{v^{{dst}}_m,\tau _m}(f_{v^{{dst}}_m,m}-1) - \sigma _{v^{{dst}}_m,\tau _m}(f_{v^{{dst}}_m,m})&\ge \Gamma _{p}. \end{aligned}$$
(23)

We have \(f_{v^{src}_m,m} = f_{v^{src}_m,m-1}-1\) and \(f_{v^{dst}_m,m}=f_{v^{dst}_m,m-1}+1\). Inequality (22) therefore directly follows from (20) and (5). For (23), we have

$$\begin{aligned}&\sigma _{v^{dst}_m,\tau _m}(f_{v^{dst}_m,m}-1) - \sigma _{v^{dst}_m,\tau _m}(f_{v^{dst}_m,m})\\&\quad \overset{(16)}{=} \sigma _{v^{src}_m,\tau _m}(f_{v^{src}_m,m}) - \sigma _{v^{src}_m,\tau _m}(f_{v^{src}_m,m}-1) + \gamma (m)\\&\quad \overset{(20)}{\ge } \Gamma _{p-1}+ \gamma (m) \ \overset{(17)}{\ge }\ \Gamma _p. \end{aligned}$$

This completes the proof of (18) and (19) and thus the proof of the lemma.\(\square \)

For each phase number p, let \(\theta _p:=\tau _{m_p}\) be the time of the first movement \(m_p\) of Phase p. Before continuing, we give lower and upper bounds on \(\gamma _p\), the improvement of Phase p. For all \(p\ge 1\), we define

$$\begin{aligned} \eta _p := (\alpha -1)\cdot S_{\theta _p}^* + \beta . \end{aligned}$$
(24)

Lemma 5

Let m be a movement of Phase p and let \(F^* \in \arg \min \limits _{F}S_{\tau _m}(F)\) be an optimal configuration at time \(\tau _m\). We then have

$$\begin{aligned} \frac{\eta _p}{d(F_{m-1},F^*)} \le \gamma (m) \le \eta _{p+1}. \end{aligned}$$

Proof

For the upper bound, observe that we have

$$\begin{aligned} \gamma (m)\le S_{\tau _m}(F_{m-1})-S_{\tau _m}^* \end{aligned}$$

as clearly the service cost cannot be improved by a larger amount. Because at all times t, DGA keeps the service cost below \(\alpha S_t^*+\beta \), we have \(S_{\tau _m-1}(F_{m-1}) < \alpha S_{\tau _m-1}^*+\beta \le \alpha S_{\tau _m}^*+\beta \). The upper bound on \(\gamma (m)\) follows from (24) and because \(S_{\tau _m}^*\le S_{\theta _{p+1}}^*\).

For the lower bound on \(\gamma (m)\), we need to prove that \(d(F_{m-1},F^*)\ge \eta _p/\gamma (m)\). Because DGA moves a facility at time \(\tau _m\), we know that \(S_{\tau _m}(F_{m-1})\ge \alpha S_{\tau _m}(F^*) + \beta \) and applying the definition (24) of \(\eta _p\), we thus have \(S_{\tau _m}(F_{m-1})-S_{\tau _m}(F^*) \ge \eta _p\). Intuitively, we have \(d(F_{m-1},F^*)\ge \eta _p/\gamma (m)\) because DGA always chooses the best possible movement and thus every possible movement improves the overall service cost by at most \(\gamma (m)\). Thus, the number of movements needed to get from \(F_{m-1}\) to an optimal configuration \(F^*\) has to be at least \(\eta _p/\gamma (m)\). For a formal argument, assume that we are given a sequence of \(\ell :=d(F_{m-1},F^*)\) movements that transform configuration \(F_{m-1}\) into configuration \(F^*\). For \(i\in [\ell ]\), assume that the i-th of these movements moves a facility from node \(u_i\) to node \(v_i\). Further, for any \(i\in [\ell ]\) let \(f_i\) be the number of facilities at node \(u_i\) and let \(f_i'\) be the number of facilities at node \(v_i\) before the i-th of these movements. Because the sequence of movements is minimal to get from \(F_{m-1}\) to \(F^*\), we certainly have \(f_i\le f_{u_i,m-1}\) and \(f_i'\ge f_{v_i,m-1}\). For the service cost improvement \(\gamma \) of the i-th of these movements, we therefore obtain

$$\begin{aligned} \gamma&= \big (\sigma _{v_i,\tau _m}(f_i') - \sigma _{v_i,\tau _m}(f_i'+1)\big ) - \big (\sigma _{u_i,\tau _m}(f_i-1)-\sigma _{u_i,\tau _m}(f_i)\big )\\&\overset{(5)}{\le } \big (\sigma _{v_i,\tau _m}(f_{v_i,m-1}) - \sigma _{v_i,\tau _m}(f_{v_i,m-1}+1)\big )\\&\qquad -\big (\sigma _{u_i,\tau _m}(f_{u_i,m-1}-1)-\sigma _{u_i,\tau _m}(f_{u_i,m-1})\big )\\&\le \gamma (m). \end{aligned}$$

The last inequality follows from (11), (12), and (16). As the sum of the \(\ell \) service cost improvements has to be at least \(\eta _p\), we obtain \(\ell =d(F_{m-1},F^*)\ge \eta _p/\gamma (m)\) as claimed.\(\square \)

We can now lower bound the distribution of requests at the time of each movement.

Lemma 6

Let m be a movement of Phase p (for \(p\ge 1\)). Then, there are integers \(\psi _v\ge 0\) for all nodes \(v\in V\) such that

$$\begin{aligned}&\sum _{v\in V} \psi _v \ge k+\frac{\eta _p}{\gamma (m)} \quad \text {and}\\&\forall t\ge \tau _m \,\forall v\in V : \ \psi _v>0 \Longrightarrow \sigma _{v,t}(\psi _v-1) - \sigma _{v,t}(\psi _v) \ge \Gamma _{p-1}. \end{aligned}$$

Proof

It suffices to prove the statement for \(t=\tau _m\). For larger t, the claim then follows from (6). Consider an optimal configuration

$$\begin{aligned} F^*=\{(v,f_v^*):v\in V\} \end{aligned}$$

at the time \(\tau _m\) of movement m. Let us further consider the configuration \(F_{m-1}\) of DGA immediately before movement m. Consider a pair of nodes u and v such that \(f_u^*>f_{u,m-1}\) and \(f_{v,m-1}>f_v^*\). By the optimality of \(F^*\), we have

$$\begin{aligned} \sigma _{u,\tau _m}(f_u^*-1)-\sigma _{u,\tau _m}(f_u^*) \ge \sigma _{v,\tau _m}(f_{v,m-1}-1)-\sigma _{v,\tau _m}(f_{v,m-1}). \end{aligned}$$
(25)

Otherwise, moving a facility from u to v would (strictly) improve the configuration \(F^*\). By Lemma 4, we have \(\sigma _{v,\tau _m}(f_{v,m-1}-1)-\sigma _{v,\tau _m}(f_{v,m-1})\ge \Gamma _{p-1}\) for all nodes v for which \(f_{v,m-1}>0\). Together with (25), for all \(v\in V\) for which \(\max \{f_{v,m-1},f_v^*\}>1\), we obtain

$$\begin{aligned} \sigma _{v,\tau _m}(\max \{f_{v,m-1},f_v^*\}-1)-\sigma _{v,\tau _m}(\max \{f_{v,m-1},f_v^*\})\ge \Gamma _{p-1}. \end{aligned}$$
(26)

To prove the lemma, it therefore suffices to show that \(\sum _{v\in V} \max \{f_{v,m-1},f_v^*\} \ge k+\eta _p/\gamma (m)\), as we can then set \(\psi _v:=\max \{f_{v,m-1},f_v^*\}\) and (26) implies the claim of the lemma. By (2), we have

$$\begin{aligned} \sum _{v\in V}\max \{f_{v,m-1},f_v^*\} = k + \sum _{v\in V}\max \{0,f_v^*-f_{v,m-1}\} = k+d(F_{m-1},F^*). \end{aligned}$$

We therefore need that \(d(F_{m-1},F^*)\ge \eta _p/\gamma (m)\), which follows from Lemma 5.\(\square \)

In the next lemma, we derive a lower bound on \(S_{\theta _p}^*\), the service cost of an optimal configuration at the start of Phase p. For each Phase \(p\ge 1\), we first define \(\overline{S}_{p}\) as follows.

$$\begin{aligned} \text {For } p \ge 3: \overline{S}_{p} := \left( 1+ (\alpha -1)\frac{\gamma _{p-2}}{\gamma _{p-1}}\right) \cdot \overline{S}_{p-1}+ \frac{\gamma _{p-2}}{\gamma _{p-1}}\beta ,\text { and } \overline{S}_{1}:=\overline{S}_{2} := 1. \end{aligned}$$
(27)
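The recursion (27) is straightforward to evaluate numerically; a small sketch (names ours):

```python
def s_bar(p, gamma, alpha, beta):
    """Lower bound (27) on the optimal service cost at the start of
    Phase p: S_1 = S_2 = 1 and, for p >= 3,
    S_p = (1 + (alpha-1) * g[p-2]/g[p-1]) * S_{p-1} + (g[p-2]/g[p-1]) * beta,
    where gamma[i] > 0 is the improvement of Phase i (1-indexed)."""
    s = 1.0
    for q in range(3, p + 1):
        ratio = gamma[q - 2] / gamma[q - 1]
        s = (1 + (alpha - 1) * ratio) * s + ratio * beta
    return s
```

For instance, with \(\alpha =1\), \(\beta =1\), and all phase improvements equal, the bound grows by exactly 1 per phase, matching the simple case discussed in Section 2.2.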

Lemma 7

For all \(p\ge 1\), we have \(S_{\theta _p}^* \ge \overline{S}_{p}\).

Proof

We prove the lemma by induction on p.

Induction Base \(\varvec{(p=1,2)}\) Using (14) we have \(S_{\theta _1}^* \ge 1\) and since \(\overline{S}_{1}=\overline{S}_{2}=1\), we get \(S_{\theta _2}^*\ge S_{\theta _1}^* \ge \overline{S}_{2}=\overline{S}_{1}\).

Induction Step \(\varvec{(p>2)}\) We assume that the claim of the lemma holds up to Phase p and prove that it also holds for Phase \(p+1\). By the induction hypothesis, for all \(i \in [p]\),

$$\begin{aligned} S_{\theta _i}^* \ge \overline{S}_{i}. \end{aligned}$$
(28)

For all \(i\in [p]\), we define \(\overline{\eta }_{i}:= (\alpha -1)\overline{S}_{i}+\beta \) and \(\delta _i:=\max \big \{\frac{\overline{\eta }_{i+1}}{\gamma _{i+1}},\cdots ,\frac{\overline{\eta }_{p}}{\gamma _{p}}\big \}\). As a consequence of (24) and (28), we get that \(\eta _i\ge \overline{\eta }_{i}\) for all \(i\in [p]\). In the following, let \(p'\in [2,p]\) be some phase. Lemma 6 implies that after the last movement m of Phase \(p'\), there are non-negative integers \(\psi _v\) (for \(v\in V\)) such that \(\sum _{v\in V}\psi _v\ge k+\eta _{p'}/\gamma _{p'}\ge k+\overline{\eta }_{p'}/\gamma _{p'}\) and, for all times \(t\ge \tau _m\) and all \(v\in V\) for which \(\psi _v>0\), \(\sigma _{v,t}(\psi _v-1)-\sigma _{v,t}(\psi _v)\ge \Gamma _{p'-1}\). As there are only k facilities, for any feasible configuration \(F=\{(v,f_v)\}\), we have \(\sum _{v\in V} f_v=k\) and therefore \(\sum _{v\in V}(\psi _v-f_v)\ge \overline{\eta }_{p'}/\gamma _{p'}\). For any \(v\in V\) for which \(\psi _v>f_v\), by using (5), we get \(\sigma _{v,t}(f_v)\ge (\psi _v-f_v)\Gamma _{p'-1}\). Hence, after the last movement of Phase \(p'\), for any feasible configuration F, we have \(S_t(F)\ge S_t^* \ge \frac{\overline{\eta }_{p'}}{\gamma _{p'}}\Gamma _{p'-1}\). At the beginning of Phase \(p+1\) (for \(p\ge 2\)), the total optimal service cost therefore is

$$\begin{aligned} S_{\theta _{p+1}}^* \ge \max _{p'\in [2,p]} \frac{\overline{\eta }_{p'}}{\gamma _{p'}} \Gamma _{p'-1} \ge \delta _{p-1}\Gamma _{p-1} +\sum _{i=1}^{p-2}(\delta _i-\delta _{i+1})\cdot \Gamma _i = \sum _{i=1}^{p-1}\gamma _i\cdot \delta _i. \end{aligned}$$
(29)

We define \(\zeta _i\) for all \(i \in [3,p]\) as follows:

$$\begin{aligned} \zeta _i := \sum _{j=1}^{i-2} \gamma _j \cdot \delta _j. \end{aligned}$$
(30)

Using the definition of \(\delta _i\), we thus have

$$\begin{aligned} \zeta _{p+1} = \zeta _{p} + \gamma _{p-1} \delta _{p-1}=\zeta _{p} + \overline{\eta }_{p}\frac{\gamma _{p-1}}{\gamma _{p}}. \end{aligned}$$

Considering the definition of \(\overline{\eta }_{i}\) we get

$$\begin{aligned} \zeta _{p+1} = \zeta _{p} \cdot \left( 1 + (\alpha -1)\frac{\gamma _{p-1}}{\gamma _{p}}\right) + \beta \cdot \frac{\gamma _{p-1}}{\gamma _{p}}. \end{aligned}$$

We therefore have \(\zeta _{p+1}=\overline{S}_{p+1}\) directly from (27) and thus the claim of the lemma follows.\(\square \)

In order to explicitly lower bound the optimal service cost after p phases, we need the following technical statement.

Lemma 8

Let \(\ell \ge 2\) be an integer and consider a sequence \(c_1,c_2,\dots ,c_\ell >0\) of \(\ell \) positive real numbers and let \(c_{\max }=\max \limits _{i\in [\ell ]} c_i\) and \(c_{\min }=\min \limits _{i\in [\ell ]}c_i\). Further, let \(\lambda \ge 0\) be an arbitrary non-negative real number. We have

$$\begin{aligned}{} & {} \mathrm {(I)}\quad \sum _{i=2}^\ell \frac{c_{i-1}}{c_i} \ge (\ell -1)\cdot \left( \frac{c_{\min }}{c_{\max }}\right) ^{\frac{1}{\ell -1}},\\{} & {} \mathrm {(II)}\quad \prod _{i=2}^{\ell } \left( 1+\lambda \frac{c_{i-1}}{c_i}\right) \ge \left( 1+\lambda \left( \frac{c_{\min }}{c_{\max }}\right) ^{\frac{1}{\ell -1}}\right) ^{\ell -1}. \end{aligned}$$

Proof

The first part of the claim follows from the means inequality (the fact that the arithmetic mean is larger than or equal to the geometric mean). In the following, we nevertheless directly prove both parts together. We let \({\varvec{x}}=(x_1,\dots ,x_\ell )\in \mathbb {R}^\ell \) be a vector of \(\ell \) real variables and we define multivariate functions \(f:\mathbb {R}^{\ell }\rightarrow \mathbb {R}\) and \(g:\mathbb {R}^{\ell }\rightarrow \mathbb {R}\) as follows:

$$\begin{aligned} f({\varvec{x}}) := \sum _{i=2}^{\ell }\frac{x_{i-1}}{x_i}\quad \text {and}\quad g({\varvec{x}}) := \prod _{i=2}^{\ell }\left( 1+\lambda \frac{x_{i-1}}{x_i}\right) . \end{aligned}$$

We further define \(X\subset \mathbb {R}^\ell \) as \(X:=\{(z_1,\dots ,z_\ell )\in \mathbb {R}^\ell \, |\ \forall i\in [\ell ]:c_{\min }\le z_i\le c_{\max }\}\). We need to show that for \({\varvec{x}}\in X\), \(f({\varvec{x}})\) and \(g({\varvec{x}})\) are lower bounded by the right-hand sides of Inequalities (I) and (II) above, respectively. Note that X is a closed and bounded (and thus compact) subset of \(\mathbb {R}^\ell \) and because \(c_{\min }>0\), both functions \(f({\varvec{x}})\) and \(g({\varvec{x}})\) are continuous when defined on X. The minimum for \({\varvec{x}} \in X\) is therefore well-defined for both \(f({\varvec{x}})\) and \(g({\varvec{x}})\). We show that both \(f({\varvec{x}})\) and \(g({\varvec{x}})\) attain their minimum for

$$\begin{aligned} {\varvec{x^*}} := (x_1^*,\dots ,x_\ell ^*),\quad \text {where } \forall i\in [\ell ] : x_i^* = c_{\min }\cdot \left( \frac{c_{\max }}{c_{\min }}\right) ^{\frac{i-1}{\ell -1}}. \end{aligned}$$

Note that \({\varvec{x^*}}\) is the unique solution \({\varvec{x}}\in X\) of the following system of equations

$$\begin{aligned} x_1=c_{\min },\quad x_\ell =c_{\max },\quad \forall i\in \{2,\dots ,\ell -1\}: \frac{x_{i-1}}{x_i} = \frac{x_i}{x_{i+1}}. \end{aligned}$$
(31)

Because the minima \(\min \limits _{{\varvec{x}}\in X}f({\varvec{x}})\) and \(\min \limits _{{\varvec{x}}\in X} g({\varvec{x}})\) exist, in order to show that they are attained at \({\varvec{x^*}}\), it is sufficient to show that for any \({\varvec{y}}\in X\) that does not satisfy (31), \(f({\varvec{y}})\) and \(g({\varvec{y}})\) are not minimal. Let us therefore consider a vector \({\varvec{y}}=(y_1,\dots ,y_\ell )\in X\) that does not satisfy (31). First note that both \(f({\varvec{x}})\) and \(g({\varvec{x}})\) are strictly monotonically increasing in \(x_1\) and strictly monotonically decreasing in \(x_\ell \). If either \(y_1>c_{\min }\) or \(y_\ell <c_{\max }\), it is therefore clear that \(f({\varvec{y}})\) and \(g({\varvec{y}})\) are both not minimal (over X). Let us therefore assume that \(y_1=c_{\min }\) and \(y_\ell =c_{\max }\). From the assumption that \({\varvec{y}}\) does not satisfy (31), there is an \(i_0\in \{2,\dots ,\ell -1\}\) for which \(\frac{y_{i_0-1}}{y_{i_0}}\ne \frac{y_{i_0}}{y_{i_0+1}}\) and thus \(y_{i_0} \ne \sqrt{y_{i_0-1}y_{i_0+1}}\). We define a new vector \({\varvec{y'}}=(y_1',\dots ,y_\ell ')\in X\) as follows. We set \(y_{i_0}'=\sqrt{y_{i_0-1}y_{i_0+1}}\) and \(y_i'=y_i\) for all \(i\ne i_0\) and we show that \(f({\varvec{y'}})<f({\varvec{y}})\) and \(g({\varvec{y'}})<g({\varvec{y}})\). Define

$$\begin{aligned} C:=\prod _{i\in [2,\ell ]\setminus \{i_0,i_0+1\}}\left( 1+\lambda \frac{y_{i-1}}{y_i}\right) . \end{aligned}$$

We then have

$$\begin{aligned} f({\varvec{y}})-f({\varvec{y'}})= & {} \left( \frac{y_{i_0-1}}{y_{i_0}} + \frac{y_{i_0}}{y_{i_0+1}}\right) - \left( \frac{y_{i_0-1}}{y_{i_0}'} + \frac{y_{i_0}'}{y_{i_0+1}}\right) \\ g({\varvec{y}})-g({\varvec{y'}})= & {} \left[ \left( 1\!+\!\lambda \frac{y_{i_0-1}}{y_{i_0}}\right) \cdot \! \left( 1\!+\!\lambda \frac{y_{i_0}}{y_{i_0+1}}\right) \!-\! \left( \!1+\lambda \frac{y_{i_0-1}}{y_{i_0}'}\!\right) \!\cdot \left( \!1\!+\!\lambda \frac{y_{i_0}'}{y_{i_0+1}}\!\right) \!\right] \cdot \! C\\= & {} \left[ \left( \frac{y_{i_0-1}}{y_{i_0}} + \frac{y_{i_0}}{y_{i_0+1}}\right) - \left( \frac{y_{i_0-1}}{y_{i_0}'} + \frac{y_{i_0}'}{y_{i_0+1}}\right) \right] \cdot \lambda C. \end{aligned}$$

Note that \(\lambda \ge 0\) and \(C>0\). In both cases, we therefore need to show that

$$\begin{aligned} \forall y_{i_0}\in [c_{\min },c_{\max }]\setminus \{\sqrt{y_{i_0-1}y_{i_0+1}}\} : \left( \frac{y_{i_0-1}}{y_{i_0}} + \frac{y_{i_0}}{y_{i_0+1}}\right) > \left( \frac{y_{i_0-1}}{y_{i_0}'} + \frac{y_{i_0}'}{y_{i_0+1}}\right) . \end{aligned}$$
(32)

This follows because the function \(h:[c_{\min },c_{\max }]\rightarrow \mathbb {R}\), \(h(z):=\frac{y_{i_0-1}}{z}+\frac{z}{y_{i_0+1}}\) is strictly convex for \(z\in [c_{\min },c_{\max }]\) and it has a stationary point at \(z=\sqrt{y_{i_0-1}y_{i_0+1}}\in [c_{\min },c_{\max }]\).\(\square \)
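Both inequalities of Lemma 8 can also be sanity-checked numerically. The following Python sketch (illustrative only and not part of the formal argument; the sampling ranges are our own choices and `lam` stands for \(\lambda \)) evaluates both sides on random positive sequences:

```python
import random

def lhs_I(c):
    # left-hand side of (I): sum of consecutive ratios c_{i-1}/c_i
    return sum(c[i - 1] / c[i] for i in range(1, len(c)))

def rhs_I(c):
    ell = len(c)
    return (ell - 1) * (min(c) / max(c)) ** (1.0 / (ell - 1))

def lhs_II(c, lam):
    # left-hand side of (II): product of (1 + lam * c_{i-1}/c_i)
    prod = 1.0
    for i in range(1, len(c)):
        prod *= 1.0 + lam * c[i - 1] / c[i]
    return prod

def rhs_II(c, lam):
    ell = len(c)
    return (1.0 + lam * (min(c) / max(c)) ** (1.0 / (ell - 1))) ** (ell - 1)

random.seed(1)
for _ in range(1000):
    ell = random.randint(2, 8)
    c = [random.uniform(0.1, 10.0) for _ in range(ell)]
    lam = random.uniform(0.0, 5.0)
    assert lhs_I(c) >= rhs_I(c) - 1e-9
    assert lhs_II(c, lam) >= rhs_II(c, lam) - 1e-9
```

Equality in (I) and (II) is attained for geometric sequences, matching the minimizer \({\varvec{x^*}}\) from the proof.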

As long as \((\alpha -1)S_{\theta _p}^*<\beta \), the effect of the \((\alpha -1)S_{\theta _p}^*\)-term on \(\eta _p\) (and thus of the \(\alpha S_t^*\) term in (10)) is relatively small. Let us therefore first analyze how the service cost grows by only considering terms that depend on \(\beta \) (and not on \(\alpha \)).

Lemma 9

For all \(p\ge 3\), we have

$$\begin{aligned} S_{\theta _p}^* \ge \min \{\frac{\beta }{\alpha -1},\ \beta \cdot (p-2)\cdot (2k)^{-\frac{1}{p-2}}\}. \end{aligned}$$

Proof

Assume that \(S_{\theta _p}^*<\beta /(\alpha -1)\) as otherwise the claim of the lemma is trivially true. By Lemma 7, using \(\alpha \ge 1\), for all \(p\ge 3\), we get \(\overline{S}_{p} \ge \overline{S}_{p-1} + \frac{\gamma _{p-2}}{\gamma _{p-1}}\beta \). Plugging in \( \overline{S}_{2} \ge 0\), induction on p therefore gives

$$\begin{aligned} S_{\theta _p}^* \ge \overline{S}_{p} \ge \beta \cdot \sum _{i=2}^{p-1}\frac{\gamma _{i-1}}{\gamma _i} \end{aligned}$$
(33)

for all \(p\ge 3\). We define \(\gamma _{\min }=\min \{\gamma _1,\dots ,\gamma _{p-1}\}\) and \(\gamma _{\max }=\max \{\gamma _1,\dots ,\gamma _{p-1}\}\). By Lemma 5 and because \(\eta _1\le \dots \le \eta _{p-1}\), we have \(\gamma _{\min }\ge \eta _1/k\) and \(\gamma _{\max }\le \eta _{p}\). From \(\alpha \ge 1\) and (24), we have \(\eta _1\ge (\alpha -1)+\beta \), since \(S_{\theta _p}^* \ge 1\) for \(p\ge 1\) by (14). Further, by the assumption that \(S_{\theta _p}^*<\beta /(\alpha -1)\), we have \(\eta _{p}=(\alpha -1)S_{\theta _p}^*+\beta <2\beta \). We therefore have \(\gamma _{\min }\ge [(\alpha -1)+\beta ]/k\) and \(\gamma _{\max } < 2\beta \) and thus

$$\begin{aligned} \frac{\gamma _{\min }}{\gamma _{\max }}\ \ge \ \frac{(\alpha -1)+\beta }{2k\beta }\ \overset{(9)}{\ge }\ \frac{\max \{\beta ,1\}}{2k\beta }\ \ge \ \frac{1}{2k}. \end{aligned}$$

The lemma now follows from (33) and from Inequality (I) of Lemma 8.\(\square \)

On the other hand, as soon as \(S_{\theta _p}^* > \max \{1,\frac{\beta }{\alpha -1}\}\), the effect of the \(\beta \)-term in (10) becomes relatively small. As a second case, therefore, we analyze how the service cost grows by just considering terms that depends on \(\alpha \) (and not on \(\beta \)).

Lemma 10

Let \(p_0\ge 2\) be a phase for which \(\overline{S}_{p_0}\ge \overline{S}_{p_0-1}\ge S_0:=\max \big \{1,\frac{\beta }{\alpha -1}\big \}\). For any phase \(p>p_0\), we have

$$\begin{aligned} S_{\theta _p}^* \ge S_0\cdot \left( 1 + \frac{\sqrt{\alpha }-1}{(2k)^{\frac{1}{p-p_0}}} \right) ^{p-p_0} \ \ge \ \frac{S_0}{2k}\cdot \alpha ^{\frac{p-p_0}{2}}. \end{aligned}$$

Proof

By Lemma 7, using \(\beta \ge 0\), for all \(p> p_0\), we get \(\overline{S}_{p}\ge \left( 1 + (\alpha -1)\frac{\gamma _{p-2}}{\gamma _{p-1}}\right) \cdot \overline{S}_{p-1}\). Induction on p therefore gives

$$\begin{aligned} S_{\theta _p}^* \ge \overline{S}_{p} \ge \overline{S}_{p_0}\cdot \prod _{i=p_0}^{p-1} \left( 1+(\alpha -1)\frac{\gamma _{i-1}}{\gamma _{i}}\right) \end{aligned}$$
(34)

for all \(p\ge p_0\). Similarly to before, we define \(\gamma _{\min }=\min \{\gamma _{p_0-1},\dots ,\gamma _{p-1}\}\) and \(\gamma _{\max }=\max \{\gamma _{p_0-1},\dots ,\gamma _{p-1}\}\). By Lemma 5, the assumptions regarding \(p_0\), and because the values \(\eta _i\) are non-decreasing in i, we have

$$\begin{aligned} \gamma _{\min }\ge & {} \frac{\eta _{p_0-1}}{k} \ge \frac{\max \{(\alpha -1)+\beta ,2\beta \}}{k} \quad \text {and}\\ \gamma _{\max }\le & {} \eta _{p} \le (\alpha -1)S_{\theta _p}^* + \beta \le 2(\alpha -1)S_{\theta _p}^*. \end{aligned}$$

The last inequality follows because \(S_{\theta _p}^*\ge \overline{S}_{p} \ge \overline{S}_{p_0}\ge \max \big \{1, \frac{\beta }{\alpha -1}\big \}\) and by applying (9). We can now apply Inequality (II) from Lemma 8 to obtain

$$\begin{aligned} S_{\theta _p}^* \ge \overline{S}_{p}\ge & {} \overline{S}_{p_0} \cdot \left( 1 + (\alpha -1)\left( \frac{\gamma _{\min }}{\gamma _{\max }}\right) ^{\frac{1}{p-p_0}}\right) ^{p-p_0}\nonumber \\\ge & {} \overline{S}_{p_0} \cdot \left( 1 + (\alpha -1)\left( \frac{\max \{(\alpha -1)+\beta ,2\beta \}}{2k(\alpha -1)S_{\theta _p}^*}\right) ^{\frac{1}{p-p_0}}\right) ^{p-p_0}. \end{aligned}$$
(35)

In the following, assume that

$$\begin{aligned} S_{\theta _p}^* \le \max \{1,\frac{\beta }{\alpha -1}\} \alpha ^{\frac{p-p_0}{2}}. \end{aligned}$$
(36)

Note that if (36) does not hold, the claim of the lemma is trivially true. By replacing \(S_{\theta _p}^*\) on the right-hand side of (35) with the upper bound of (36), we obtain

$$\begin{aligned} S_{\theta _p}^* \ge \overline{S}_{p}\ge & {} \overline{S}_{p_0} \cdot \left( 1 + (\alpha -1)\cdot \left( \frac{(\alpha -1) + \beta }{2k(\alpha -1)\max \{1,\frac{\beta }{\alpha -1}\}\alpha ^{\frac{p-p_0}{2}}} \right) ^{\frac{1}{p-p_0}}\right) ^{p-p_0}\\\ge & {} \overline{S}_{p_0} \cdot \left( 1 + \frac{\alpha -1}{(2k)^{\frac{1}{p-p_0}}\sqrt{\alpha }}\right) ^{p-p_0}\\\ge & {} \overline{S}_{p_0} \cdot \left( 1 + \frac{\sqrt{\alpha }-1}{(2k)^{\frac{1}{p-p_0}}} \right) ^{p-p_0}\ \ge \ \frac{\overline{S}_{p_0}}{2k}\cdot \alpha ^{\frac{p-p_0}{2}}. \end{aligned}$$

The lemma then follows because we assumed that \(\overline{S}_{p_0}\ge \max \big \{1,\frac{\beta }{\alpha -1}\big \}\).\(\square \)
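The closing chain of inequalities in the proof of Lemma 10 is purely algebraic. The following Python sketch (an illustrative numerical check over arbitrarily chosen parameter grids, not a substitute for the proof) verifies it with \(q=p-p_0\):

```python
def step1(alpha, k, q):
    # (1 + (alpha-1) / ((2k)^{1/q} * sqrt(alpha)))^q
    return (1.0 + (alpha - 1.0) / ((2.0 * k) ** (1.0 / q) * alpha ** 0.5)) ** q

def step2(alpha, k, q):
    # (1 + (sqrt(alpha)-1) / (2k)^{1/q})^q
    return (1.0 + (alpha ** 0.5 - 1.0) / (2.0 * k) ** (1.0 / q)) ** q

def step3(alpha, k, q):
    # alpha^{q/2} / (2k)
    return alpha ** (q / 2.0) / (2.0 * k)

# the chain step1 >= step2 >= step3 should hold for alpha >= 1, k >= 1, q >= 1
for alpha in (1.0, 1.5, 2.0, 10.0, 100.0):
    for k in (1, 5, 100):
        for q in (1, 2, 7, 20):
            assert step1(alpha, k, q) >= step2(alpha, k, q) - 1e-9
            assert step2(alpha, k, q) >= step3(alpha, k, q) - 1e-9
```

The middle step uses \((\alpha -1)/\sqrt{\alpha }\ge \sqrt{\alpha }-1\), and the last step uses \(1+(\sqrt{\alpha }-1)/x\ge \sqrt{\alpha }/x\) for \(x=(2k)^{1/q}\ge 1\).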

2.3.2 Optimal Offline Algorithm Total Cost

Service Cost

The service cost of \(\mathcal {O}\) can simply be lower bounded by the optimal service cost as follows:

$$\begin{aligned} S_{\theta _p}^\mathcal {O} \ge S_{\theta _p}^*. \end{aligned}$$

Movement Cost

To simplify our analysis, we disregard the movement cost of the optimal offline algorithm; this has no substantial effect on the competitive factor we provide, since \(\mathcal {O}\) has to pay at least the optimal service cost, which we show is sufficiently large. The total cost of the optimal offline algorithm is therefore bounded as follows:

$$\begin{aligned} cost_{\theta _p}^\mathcal {O}= M_{\theta _p}^\mathcal {O}+S_{\theta _p}^\mathcal {O} \ge S_{\theta _p}^*. \end{aligned}$$
(37)

2.3.3 DGA Total Cost

Service Cost

By (10), the online algorithm DGA keeps its service cost below a linear function of the optimal service cost. Thus

$$\begin{aligned} S_{\theta _p}^\mathcal {G} < \alpha S_{\theta _p}^* + \beta . \end{aligned}$$
(38)

Movement Cost

First, using Definition 2, we bound the number of movements in each phase.

Observation 11

For each Phase \(p\ge 1\), we have \(\lambda _p\le k\).

Proof

Let \(m>m_p\) and consider the movements \([m_p,m]\). We prove that if \(m<m_{p+1}\), no two movements in \([m_p,m]\) move the same facility. The claim then follows because there are only k facilities. For the sake of contradiction, assume that there is some facility i that is moved more than once and let \(m'\) and \(m''\) (\(m',m''\in [m_p,m]\), \(m'<m''\)) be the first two movements in \([m_p,m]\) where facility i is moved. We clearly have \(v^{dst}_{m'}=v^{src}_{m''}\) and Definition 2 thus leads to a contradiction to the assumption that \(m<m_{p+1}\).\(\square \)

As a consequence of the above observation together with Lemmas 9 and 10, we can prove the following lemma, which bounds the number of DGA movements in terms of the optimal service cost.

Lemma 12

For any \(\alpha \ge 1\) and \(\beta \) satisfying (9), there is a deterministic online algorithm \(\mathcal {G}\), such that for all times \(t\ge 0\), the total movement cost \(M_t^{\mathcal {G}}\) is bounded as follows.

  • If \(\alpha =1\), for any \(\ell \ge 1\), \(\varepsilon >0\), and \(\beta \ge k(2k)^{1/\ell }/\varepsilon \), we have

    $$\begin{aligned} M_t^{\mathcal {G}} \le \varepsilon \cdot S_t^* + O(\ell k). \end{aligned}$$
  • For \(\alpha \ge 1+\varepsilon \) where \(\varepsilon >0\) is some constant and any \(\beta \) satisfying (9), we have

    $$\begin{aligned} M_t^{\mathcal {G}} \le k \cdot O\left( 1+ \log _{\alpha } S_t^* + \min \{\frac{\log k}{\log \log k},\log _{\alpha } k\} + \log _{\alpha }\frac{k}{1+\beta }\right) . \end{aligned}$$

Proof

First note that by Observation 11, the movement cost of our algorithm by time \(\theta _p\) is at most

$$\begin{aligned} M_{\theta _p}\le (p-1)k+1\le pk. \end{aligned}$$
(39)

Together with the lower bounds on \(S_{\theta _p}^*\) of Lemmas 9 and 10, this allows us to derive an upper bound on the movement cost of our algorithm as a function of \(S_{\theta _p}^*\). Note that as all upper bounds claimed in the lemma have an additive term of O(k) (with no specific constant), it is sufficient to prove that the lemma holds for all times \(t=\theta _p\), where \(p\ge 2\) is a phase number.

Let us first consider the case where \(\alpha =1\). Because in that case \(\beta /(\alpha -1)\) is unbounded, we can only apply Lemma 9 to upper bound the movement cost as a function of \(S_t^*\). We choose \(\ell \ge 1\) and assume that \(\beta \ge k(2k)^{1/\ell }/\varepsilon \) for \(\varepsilon >0\). Together with (39), for \(p\ge \ell +2\), Lemma 9 then gives

$$\begin{aligned} S_{\theta _p}^* \ge \frac{k(2k)^{\frac{1}{\ell }}}{\varepsilon }\cdot (p-2)\cdot (2k)^{-\frac{1}{\ell }} = \frac{k}{\varepsilon }(p-2) \ge \frac{1}{\varepsilon }(M_{\theta _p} - 2k). \end{aligned}$$
(40)

The first part of Lemma 12 then follows because the total movement cost for the first \(\ell +2\) phases is at most \(O(\ell k)\). The special cases are obtained as follows. For \(\beta =\Omega \left( k+k/\varepsilon \right) \), we set \(\ell =\Theta (\log k)\) for every \(\varepsilon >0\), whereas for \(\beta =\Omega (k\log k/\log \log k)\), we set \(\varepsilon =\Theta (\log \log k/\log ^{1-\delta }k)\) and \(\ell =\Theta \big (\frac{1}{\delta } \cdot \frac{\log k}{\log \log k}\big )\) for constant \(0 < \delta \le 1\).
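The arithmetic leading to (40) can be illustrated with a short sketch (the concrete values of k, \(\ell \), and \(\varepsilon \) are arbitrary choices for illustration; this is a numerical check, not part of the proof):

```python
def service_lb(beta, p, k):
    # second branch of Lemma 9's lower bound on the optimal service cost
    return beta * (p - 2) * (2.0 * k) ** (-1.0 / (p - 2))

k, ell, eps = 50, 4, 0.5
beta = k * (2.0 * k) ** (1.0 / ell) / eps   # smallest beta allowed in this case
for p in range(ell + 2, ell + 30):
    movement_ub = p * k                      # movement cost bound (39)
    # (40): eps * S* >= M - 2k for all p >= ell + 2
    assert eps * service_lb(beta, p, k) >= movement_ub - 2 * k - 1e-6
```

The key step is that \((2k)^{-1/(p-2)}\ge (2k)^{-1/\ell }\) once \(p\ge \ell +2\), so the factor \((2k)^{1/\ell }\) in \(\beta \) cancels.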

Let us therefore move to the case where \(\alpha >1\). Let \(p_0\) be the first Phase \(p_0\ge 2\) for which \(S_{\theta _{p_0}}^*\ge S_0\), where \(S_0=\max \big \{1,\frac{\beta }{\alpha -1}\big \}\) as in Lemma 10. Further, we set \(p_1=p_0 + \lceil 2\log _\alpha (2k)\rceil \). Using Lemma 10, for \(p\ge p_1\), we have

$$\begin{aligned} S_{\theta _p}^* \ge \frac{S_0}{2k}\cdot \alpha ^{\frac{p-p_1}{2}}\alpha ^{\frac{p_1-p_0}{2}} \ge S_0\cdot \alpha ^{\frac{p-p_1}{2}}. \end{aligned}$$

We therefore get

$$\begin{aligned} M_{\theta _p} \le k\cdot p \le k \left( p_1 + 2\log _{\alpha }\frac{S_{\theta _p}^*}{S_0}\right) \le k \left( p_0 + 1 + 2\log _{\alpha }S_{\theta _p}^* + \log _{\alpha }\frac{2k}{S_0}\right) . \end{aligned}$$

The second claim of Lemma 12 then follows by showing that

$$\begin{aligned} p_0=O\big (\min \big \{\frac{\log k}{\log \log k}, \log _{\alpha } k\big \}\big ). \end{aligned}$$

If \(S_0=1\), we have \(p_0=2\). Otherwise, we can apply Lemma 9 to upper bound \(p_0\) as the smallest value for which \(\frac{\beta }{\alpha -1}=\beta (p_0-2)(2k)^{-1/(p_0-2)}\). For \(\alpha =O\big (\frac{\log k}{\log \log k}\big )\), the assumption that \(\alpha \) is at least \(1+\varepsilon \) for some constant \(\varepsilon >0\) gives \(p_0=\Theta \big (\frac{\log k}{\log \log k}\big )\). Otherwise (i.e., for large \(\alpha \)), we obtain \(p_0 =\Theta (\log _{\alpha -1}k)=\Theta (\log _{\alpha } k)\).\(\square \)

Note that by choosing \(\alpha >1\), the dependency of the movement cost \(M_t^\mathcal {G}\) on the optimal service cost \(S_t^*\) is only logarithmic, because the terms \(\min \{\frac{\log k}{\log \log k},\log _{\alpha } k\}\) and \(\log _{\alpha }\frac{k}{1+\beta }\) are dominated by \(\log k\).

Proof of Theorem 2

Combining (37), (38), and Lemma 12 concludes the claim of the theorem.\(\square \)

3 OMFL: A Lower Bound Analysis

In this section, we provide a proof of Theorem 1. The lower bound presented here, combined with the upper bound of Section 2, provides a tight analysis for G-OMFL on uniform metrics in light of Remark 4. Before delving into the details, we give an overview of the lower bound analysis.

Outline of the Analysis

We consider a metric space with uniform distances, i.e., by scaling appropriately, we can assume that the distance between each pair of points is equal to 1. Assume that we are given an online algorithm \(\mathcal{O}\mathcal{N}\) which guarantees that

$$\begin{aligned} S^{\mathcal{O}\mathcal{N}}_t < \alpha S_t^* + \beta \end{aligned}$$
(41)

at all times t for some parameters \(\alpha \ge 1\) and \(\beta \). In the following, let \(\mathcal {O}\) be any optimal offline algorithm. We essentially compute the total cost of \(\mathcal{O}\mathcal{N}\) and \(\mathcal {O}\) at any time t as functions of the optimal assignment cost at time t. Given \(\mathcal{O}\mathcal{N}\), we construct an execution in which \(\mathcal{O}\mathcal{N}\) has to perform a large number of movements while the optimal assignment cost does not grow too much. We divide time into phases such that in each phase, \(\mathcal{O}\mathcal{N}\) has to move \(\Omega (k)\) facilities and the optimal assignment cost grows as slowly as possible. For p phases, we define a sequence of integers \(n_1\ge n_2\ge \dots \ge n_p\), where \(n_1 \le k/3\) and \(n_p \ge 1\), and values \(\Gamma _1<\Gamma _2<\dots < \Gamma _p\). In the following, we call a node v free if v does not have a facility. Roughly, at the beginning of a Phase i, we choose a set \(N_i\) of \(n_i\) (preferably) free nodes and make sure that all these nodes have \(\Gamma _i\) requests. Note that constructing an execution means determining where to add requests in each step. The value \(\Gamma _i\) is chosen large enough such that throughout Phase i, an assignment cost of \(n_i\Gamma _i\) is sufficiently large to force the algorithm to move. Hence, whenever there are \(n_i\) free nodes with \(\Gamma _i\) requests, \(\mathcal{O}\mathcal{N}\) has to move at least one facility to one of these nodes. For each such movement, we pick another free node that currently has less than \(\Gamma _i\) requests and make sure it has \(\Gamma _i\) requests. We proceed until there are k nodes with \(\Gamma _i\) requests, at which point the main part of the phase ends. Except for the nodes in \(N_i\), each of the k nodes with \(\Gamma _i\) requests leads to a movement of \(\mathcal{O}\mathcal{N}\) and therefore, \(\mathcal{O}\mathcal{N}\) has to move at least \(k-n_i=\Omega (k)\) facilities in Phase i. 
At the end of Phase i, we can guarantee that there are exactly k nodes with \(\Gamma _i\) requests, \(n_i\) nodes with \(\Gamma _{i-1}\) requests, \(n_{i-1}-n_i\) nodes with \(\Gamma _{i-2}\) requests, etc. The optimal assignment cost after Phase p, therefore, is \(n_p\Gamma _{p-1}+\sum _{i=3}^{p} (n_{i-1}-n_i)\Gamma _{i-2}\). The assignment cost paid by \(\mathcal{O}\mathcal{N}\) at time t cannot be smaller than \(S^*_t\).

By contrast, the optimal offline algorithm can wait until all requests have arrived and just perform all the necessary facility movements at the very end to have an optimal configuration. Therefore by the end of Phase p, \(\mathcal {O}\) has to pay at most k as the total movement cost, while \(\mathcal{O}\mathcal{N}\) has to pay \(\Theta (pk)\) for the movement cost in total by this time. The assignment cost of \(\mathcal {O}\) equals the optimal assignment cost at the end of Phase p. By choosing the values \(n_i\) appropriately, we obtain the claimed bounds.

3.1 Lower Bound Analysis

The formal proof consists of three parts. In Section 3.1.1, given some online algorithm \(\mathcal{O}\mathcal{N}\), we construct an explicit bad execution. In Section 3.1.2, we analyze the cost of the online algorithm \(\mathcal{O}\mathcal{N}\) in the constructed execution and in Section 3.1.3, we bound the cost of an optimal offline algorithm \(\mathcal {O}\), and we combine everything to complete the proof of Theorem 1.

3.1.1 Lower Bound Execution

We assume that \(\mathcal{O}\mathcal{N}\) is the given online algorithm and \(\mathcal {O}\) is an optimal offline algorithm. Further recall that we assume that \(\mathcal{O}\mathcal{N}\) guarantees that the difference between the assignment cost of \(\mathcal{O}\mathcal{N}\) and the optimal assignment cost at all times is less than \(\beta \) for some given \(\beta >0\).

We need n to be sufficiently large and for simplicity, we assume that \(n\ge 3k\). We denote a feasible configuration by a set \(F\subset V\) of size \(|F|=k\). Further, without loss of generality, we assume that all facilities of \(\mathcal{O}\mathcal{N}\) and \(\mathcal {O}\) are at the same locations at the beginning (i.e., at time \(t=0\)). At each point t in the execution, a configuration \(F_t^*\) with optimal assignment cost places facilities at the k nodes with the most requests (breaking ties arbitrarily if there are several nodes with the same number of requests). Also, at any time t, the optimal assignment cost is equal to the total number of requests at nodes in \(V\setminus F_t^*\) for an arbitrary optimal configuration \(F_t^*\).

Time is divided into phases. We construct the execution such that it lasts for at least k phases. As described in the outline, we define integers \(\Gamma _1<\Gamma _2<\dots \) such that at the end of Phase i, there are exactly k nodes with \(\Gamma _i\) requests (and all other nodes have fewer requests). For each phase i, we define \(V_i\) to be this set of k nodes with \(\Gamma _i\) requests. We also fix integers \(n_1\ge n_2\ge \dots \ge 1\) with \(n_1 \le k/3\), and at the beginning of each Phase i, we pick a set \(N_i\) of \(n_i\) nodes to which we directly add requests so that all of them have exactly \(\Gamma _i\) requests. For \(i=1\), we pick \(N_1\) as an arbitrary subset of \(V\setminus F_0\). We define \(V_0:=F_0\). For \(i\ge 2\), we choose \(N_i\) as an arbitrary subset of \(V_{i-2}\setminus V_{i-1}\). Clearly, at the end of Phase i, we have \(N_i\subseteq V_i\) as otherwise there would be more than k nodes with exactly \(\Gamma _i\) requests. Note that because \(N_{i-1}\subseteq V_{i-1}\) and because \(N_{i-1}\cap V_{i-2}=\emptyset \), \(V_{i-2}\setminus V_{i-1}\) contains \(n_{i-1}\ge n_i\) nodes and it is therefore possible to choose \(N_i\) as described. Note also that because \(N_i\subseteq V_{i-2}\setminus V_{i-1}\), at the beginning of Phase i all nodes in \(N_i\) have exactly \(\Gamma _{i-2}\) requests. The remaining \(k-n_i\) of the nodes that end up in \(V_i\) (and thus have \(\Gamma _i\) requests at the end of Phase i) are chosen among the nodes in \(V_{i-1}\). Consequently, at the end of Phase \(i-1\) and thus at the beginning of Phase i, there are exactly k nodes \(V_{i-1}\) with \(\Gamma _{i-1}\) requests, \(n_{i-1}\) nodes \(V_{i-2}\setminus V_{i-1}\) with \(\Gamma _{i-2}\) requests, \(n_{i-2}-n_{i-1}\) nodes \(V_{i-3}\setminus (V_{i-2}\cup N_{i-1})\) with \(\Gamma _{i-3}\) requests, \(n_{i-3}-n_{i-2}\) nodes with \(\Gamma _{i-4}\) requests, and so on. 
Now, \(n_i\) of the nodes in \(V_{i-2}\setminus V_{i-1}\) are chosen as set \(N_i\) and we increase their number of requests to \(\Gamma _i\). From now on, throughout phase i, there are \(k+n_i\) nodes with at least \(\Gamma _{i-1}\) requests such that at most k of these nodes have \(\Gamma _i\) requests. The number of nodes with less than \(\Gamma _{i-1}\) requests is the same as at the end of Phase \(i-1\). In fact nodes that are not in \(V_{i-1}\cup N_i\) do not change their number of requests after phase \(i-1\). As a consequence of the execution, after increasing the number of requests in \(N_i\) to \(\Gamma _i\), the optimal assignment cost remains constant throughout Phase \(i\ge 1\) and it can be evaluated to

$$\begin{aligned} \Sigma _i^* := n_i \cdot \Gamma _{i-1} + \sum _{j=2}^{i-1} (n_j - n_{j+1})\Gamma _{j-1}. \end{aligned}$$

For convenience, we also define \(\Sigma _0^*:=0\) and moreover \(\Sigma _1^*=0\) since there are at most k nodes with \(\Gamma _1\) requests at the end of Phase 1.

Recall that a node v is free at some point in the execution if the algorithm currently has no facility at node v. We now fix a Phase \(p\ge 1\) and assume that we are at a time t when we have already picked the set \(N_p\) and increased the number of requests of nodes in \(N_p\) to \(\Gamma _p\). By the above observation, we have \(S_t^*=\Sigma _p^*\) and therefore \(\mathcal{O}\mathcal{N}\) is forced to move if there are \(n_p\) free nodes with \(\Gamma _p\) requests and if we choose \(\Gamma _p\) such that

$$\begin{aligned} \gamma _p := \Gamma _p-\Gamma _{p-1} = \frac{(\alpha -1)\Sigma _p^*+\beta }{n_p}. \end{aligned}$$
(42)

We can now describe how and when the remaining \(k-n_p\) nodes of \(V_p\) are chosen after picking the nodes in \(N_p\). As described above, the nodes are chosen from \(V_{p-1}\). We choose the nodes sequentially. Whenever we choose a new node from \(V_p\), we pick some free node \(v\in V_{p-1}\) with less than \(\Gamma _p\) requests and increase the number of requests of v to \(\Gamma _p\). As described above, \(\Gamma _p\) is chosen large enough (as given in (42)) such that throughout Phase p there are never more than \(n_p-1\) free nodes with \(\Gamma _p\) requests. Because \(|N_p\cup V_{p-1}|=k+n_p\), as long as there are at most k nodes with \(\Gamma _p\) requests there always needs to be a free node \(v\in V_{p-1}\) that we can pick and we actually manage to add k nodes to \(V_p\).

3.1.2 Online Algorithm Total Cost

The assignment cost paid by \(\mathcal{O}\mathcal{N}\) at any time t can simply be lower bounded by \(S^*_t\). Hence, it remains to compute a lower bound for \(M_t^{\mathcal{O}\mathcal{N}}\) as a function of the optimal assignment cost. The following lemma provides such a lower bound.

Lemma 13

For any \(\alpha \ge 1\) and \(\beta \), let \(\mathcal{O}\mathcal{N}\) be any deterministic online algorithm for the problem. There exists a time \(t > 0\) such that in the execution of Section 3.1.1, the total movement cost \(M_t^{\mathcal{O}\mathcal{N}}\) is bounded as follows.

  • If \(\alpha =1\), for any \(\ell \ge 1\), \(\varepsilon >0\), and \(\beta \le k(2k)^{1/\ell }/\varepsilon \), we have

    $$\begin{aligned} M_t^{\mathcal{O}\mathcal{N}} \ge \varepsilon \cdot S_t^* + \Omega (\ell k). \end{aligned}$$

    Specifically, for \(\beta =O(k/\varepsilon )\) we get \(M_t^{\mathcal{O}\mathcal{N}} \ge \varepsilon \cdot S_t^* + \Omega (k\log k)\) and for \(\beta =O\big (\frac{k\log k}{\log \log k}\big )\) we have \(M_t^{\mathcal{O}\mathcal{N}} \ge \varepsilon \cdot S_t^* + \Omega \big (\frac{k\log k}{\log \log k}\big )\).

  • For \(\alpha \ge 1+\varepsilon \) where \(\varepsilon >0\) is some constant and any \(\beta \), we have

    $$\begin{aligned} M_t^{\mathcal{O}\mathcal{N}} \ge k \cdot \Omega \left( 1+ \log _{\alpha } S_t^*\right) . \end{aligned}$$

Proof

Let us count the number of movements of \(\mathcal{O}\mathcal{N}\) in a given Phase p. At each point in time t during the phase, let \(\Phi _t\) be the number of free nodes with \(\Gamma _p\) requests (possibly including a node v that we already chose to be added to \(V_p\)). We know that for all t, \(\Phi _t<n_p\). Whenever we decide to add a new node v to \(V_p\), \(\Phi _t\) increases by 1 (as v is a free node). The value of \(\Phi _t\) can only decrease when \(\mathcal{O}\mathcal{N}\) moves a facility and each facility movement reduces the value of \(\Phi _t\) by at most 1. As after fixing \(N_p\), we add \(k-n_p\) nodes to \(V_p\), we need at least \(k-2n_p\ge k/3\) movements to keep \(\Phi _t\) below \(n_p\) throughout the phase. Consequently, every online algorithm \(\mathcal{O}\mathcal{N}\) has to do at least k/3 movements in each phase.
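The counting argument via \(\Phi _t\) can be illustrated by a small bookkeeping simulation (a toy model in which each forced movement of \(\mathcal{O}\mathcal{N}\) decreases \(\Phi _t\) by exactly 1, i.e., by its maximum possible effect; the concrete value of k is an arbitrary choice):

```python
k = 30
n_p = k // 3          # n_p <= n_1 <= k/3
phi = 0               # Phi_t: number of free nodes with Gamma_p requests
moves = 0
# after fixing N_p, the adversary adds k - n_p further nodes to V_p
for _ in range(k - n_p):
    phi += 1          # a new free node reaches Gamma_p requests
    if phi >= n_p:    # ON is forced to move a facility onto such a node
        phi -= 1      # a movement decreases Phi_t by at most 1
        moves += 1
assert moves >= k - 2 * n_p   # at least k/3 movements in the phase
```

Since each addition raises \(\Phi _t\) by 1 and each movement lowers it by at most 1, keeping \(\Phi _t<n_p\) over \(k-n_p\) additions requires at least \(k-2n_p\ge k/3\) movements.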

Now we compute the optimal assignment cost \(\Sigma _p^*\) as a function of \(\alpha \), \(\beta \), and p. Using (42), for all \(p\ge 1\), we have

$$\begin{aligned} \Sigma _p^* = \sum _{i=2}^p n_i\cdot \gamma _{i-1}. \end{aligned}$$

For \(p\ge 2\), we then get

$$\begin{aligned} \begin{aligned} \Sigma _p^*&= \frac{n_p}{n_{p-1}}\big ((\alpha -1)\Sigma _{p-1}^*+\beta \big ) +\Sigma _{p-1}^*\\&= \left( 1 + (\alpha -1)\frac{n_p}{n_{p-1}}\right) \cdot \Sigma _{p-1}^* + \beta \cdot \frac{n_p}{n_{p-1}}. \end{aligned} \end{aligned}$$
(43)

In the following, for simplicity, we assume that the values \(n_i\) (for \(i=1,2,\dots ,p\)) do not have to be integers. For integer \(n_i\), the proof works in the same way, but becomes more technical and harder to read. We fix the values of \(n_i\) as

$$\begin{aligned} n_i:=(k/3)^{\frac{p-i}{p-1}} \end{aligned}$$

such that \(n_1=k/3\) and \(n_p=1\). For all \(i\ge 2\), we then have \(\frac{n_i}{n_{i-1}}=(\frac{k}{3})^{-\frac{1}{p-1}}\). (43) can now be simplified as

$$\begin{aligned} \Sigma _p^* = \left( 1 + \frac{\alpha -1}{(k/3)^{1/(p-1)}}\right) \cdot \Sigma _{p-1}^*+ \beta \cdot \frac{1}{(k/3)^{1/(p-1)}}. \end{aligned}$$
(44)

We have already seen that \(S_t^*=\Sigma _p^*\). Using (43) and (44), the claim of the first part of the lemma follows analogously to Lemmas 7 and 9, and the claim of the second part follows analogously to Lemmas 7 and 10 in the upper bound analysis.\(\square \)
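The chosen schedule of the \(n_i\) and the simplification of (43) to (44) can be checked numerically; the following sketch (with arbitrarily chosen k, p, \(\alpha \), and \(\beta \), as an illustration only) verifies that all consecutive ratios are equal and that the two recurrences coincide:

```python
def n(i, k, p):
    # n_i = (k/3)^{(p-i)/(p-1)}, interpolating from n_1 = k/3 down to n_p = 1
    return (k / 3.0) ** ((p - i) / (p - 1.0))

k, p, alpha, beta = 300, 7, 2.0, 5.0
assert abs(n(1, k, p) - k / 3.0) < 1e-9
assert abs(n(p, k, p) - 1.0) < 1e-9

# all consecutive ratios n_i / n_{i-1} equal (k/3)^{-1/(p-1)}
r = (k / 3.0) ** (-1.0 / (p - 1.0))
for i in range(2, p + 1):
    assert abs(n(i, k, p) / n(i - 1, k, p) - r) < 1e-12

# recurrence (43) with these n_i coincides with its simplified form (44)
sigma43 = sigma44 = 0.0   # Sigma_1^* = 0
for q in range(2, p + 1):
    ratio = n(q, k, p) / n(q - 1, k, p)
    sigma43 = (1.0 + (alpha - 1.0) * ratio) * sigma43 + beta * ratio
    sigma44 = (1.0 + (alpha - 1.0) * r) * sigma44 + beta * r
assert abs(sigma43 - sigma44) < 1e-6
```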

3.1.3 Optimal Offline Algorithm Total Cost

An optimal offline algorithm, say \(\mathcal {O}\), knows the request sequence in advance. In other words, it can wait until all requests have arrived and just perform all the necessary facility movements at the very end. Therefore, an upper bound for the total cost of \(\mathcal {O}\) at any time t is

$$\begin{aligned} cost^{\mathcal {O}}_t \le k+S^*_t. \end{aligned}$$
(45)

We now have everything we need to prove Theorem 1.

Proof of Theorem 1

The proof of Theorem 1 now directly follows from Lemma 13 and from (45).\(\square \)

4 M-OMFL: A Lower Bound Analysis

We provide our lower bound execution in the following. We consider a uniform metric where the distance between every pair of points is 1. In this setting, the goal of any feasible solution for M-OMFL is to minimize the total number of movements. As we can assume that each node has either 0 or 1 facilities, we slightly overload our notation and simply denote a feasible configuration by a set \(F\subset V\) of size \(|F|=k\). We first fix \(\mathcal{O}\mathcal{N}\) to be any given deterministic online algorithm and \(\mathcal {O}\) to be an optimal offline algorithm. To prove the statement of Theorem 3, we distinguish two cases, depending on the number of facilities k. In both cases, we define iterations to be subsequences of requests such that \(\mathcal{O}\mathcal{N}\) needs to move at least once per iteration. The number of movements by \(\mathcal{O}\mathcal{N}\) is therefore at least the number of iterations of a given execution.

Case \(\varvec{k} \le \left\lfloor n/2 \right\rfloor \)

At the beginning, we place a large number of requests on any \(k-1\) nodes that initially have facilities. We choose this number of requests sufficiently large such that no algorithm can ever afford to move any of these \(k-1\) facilities. This essentially reduces the problem to \(k=1\) and \(n-k+1\) nodes.

To bound the number of movements by \(\mathcal {O}\), we then consider intervals of \(n-k\) iterations such that \(\mathcal{O}\mathcal{N}\) is forced to move in each iteration. During each interval, the requests are distributed in such a way that at the beginning of the i-th iteration of the interval there are at least \(n-k-i+1\) nodes such that if any offline algorithm places a facility on one of these nodes, (8) remains satisfied throughout the whole interval. Hence, there exists an offline algorithm that moves at most once in each interval and therefore the number of movements by \(\mathcal {O}\) is upper bounded by the number of intervals.

Case \(\varvec{k} > \left\lfloor n/2 \right\rfloor \)

In this case, the constructed execution bears some resemblance to the lower bound constructions for the paging problem. For simplicity, assume that there are \(n=k+1\) nodes (we let requests arrive at only \(k+1\) nodes). At the beginning of each iteration, we place a sufficiently large number of requests on the node without any facility of \(\mathcal{O}\mathcal{N}\) such that (8) is violated. Thus, \(\mathcal{O}\mathcal{N}\) has to move at least one facility to keep (8) satisfied. By contrast, \(\mathcal {O}\) does not need to move in each iteration: there is always a node that will not receive new requests for the next k iterations, and therefore \(\mathcal {O}\) only needs to move at most once every k iterations to keep (8) satisfied.

Proof of Theorem 3

Consider any request sequence. First, we partition the request sequence into iterations as follows. Iteration 0 is the empty sequence, and for every \(i \ge 1\), iteration i consists of a request sequence whose length depends on \(\alpha \), \(\beta \), and the iteration number i. The request sequence of iteration i is chosen, dependent on a given online algorithm \(\mathcal{O}\mathcal{N}\), such that \(\mathcal{O}\mathcal{N}\) must move at least once in iteration i. We will see that while \(\mathcal{O}\mathcal{N}\) needs to move at least once per iteration, there is an offline algorithm that moves at most once every \(\max \{k,n-k\} \ge n/2\) iterations.

In the proof, we reduce all cases to two extreme cases. In the first case, we reduce the original metric on a set of n nodes with \(k \le \left\lfloor n/2 \right\rfloor \) facilities to the case where there is only 1 facility. To do this, we first place sufficiently many requests on \(k-1\) of the nodes that have facilities at the beginning of the execution (for simplicity, assume that we place an unbounded number of requests on these nodes). This prevents any algorithm from moving its facilities away from these \(k-1\) nodes during the execution, and hence we can ignore these \(k-1\) nodes and facilities in our analysis. In contrast, for the second case where \(k > \left\lfloor n/2 \right\rfloor \), we assume w.l.o.g. that \(k=n-1\) by simply placing requests only on the k nodes that have facilities at the beginning and on one additional node.

In the following, we let \(t_i\) denote the end of an iteration i. Moreover suppose \(\mathcal {I}\) is the total number of iterations, where we assume that \(\mathcal {I} \equiv 0\pmod {\max \{k,n-k\}}\).

Case \(\varvec{k} \le \left\lfloor n/2 \right\rfloor \)

The idea behind the execution is to uniformly increase the number of requests on the \(n-k\) nodes that do not have the facility at the beginning of an iteration i (i.e., at time \(t_{i-1}\)) in such a way that \(\mathcal{O}\mathcal{N}\) has to move at least once to satisfy (8) at the end of iteration i. Moreover the distribution of requests guarantees that any node without the facility at time \(t_{i-1}\) is a candidate to have the (free) facility of \(\mathcal{O}\mathcal{N}\) at time \(t_i\). Let v be any node in the set V of nodes. We use \(r_{v,t}\) to denote the number of requests at node v after the arrival of t requests. When the context is clear, we omit the second subscript (i.e., t) and simply write \(r_v\). Further, let \(v^\mathcal{O}\mathcal{N}_t\) denote the node on which \(\mathcal{O}\mathcal{N}\) locates its facility at time t and let U(t) be the set of all nodes without facility at time t. Moreover, let \(v^*_t\) be a node which has the largest number of requests among all nodes at time t. The node with the largest number of requests at the end of an iteration i, i.e. \(v^*_{t_i}\), is chosen such that \(v^*_{t_i} \ne v^\mathcal{O}\mathcal{N}_{t_{i-1}}\). At time 0, we have \(r_u=0\) for all nodes u. The distribution of requests at the end of iteration i is as follows:

$$\begin{aligned} \forall u \in U(t_{i-1}) \setminus \{v^*_{t_i}\} : r_u&= r_{v^*_{t_{i-1}}}+\max \{\beta ,1\},\end{aligned}$$
(46)
$$\begin{aligned} r_{v^\mathcal{O}\mathcal{N}_{t_{i-1}}}&= r_{v^*_{t_{i-1}}},\end{aligned}$$
(47)
$$\begin{aligned} r_{v^*_{t_i}}&= (\alpha -1) \cdot S^*_{t_i} + r_{v^*_{t_{i-1}}} + \beta . \end{aligned}$$
(48)

Claim 14

The above execution guarantees that \(\mathcal{O}\mathcal{N}\) has to move at least once per iteration. Further, there exists an offline algorithm \(\mathcal{O}\mathcal{F}\) that moves its facilities at most \(\mathcal {I}/(n-k)\) times.

Proof

Consider any interval of \(n-k\) iterations such that the first iteration of this interval has ending time \(\tau _1\) and the finishing time of the last iteration (i.e., the finishing time of the interval) is \(\tau _{n-k}\). Further, suppose the previous interval has finished at \(\hat{t}\). Obviously, if this is the first interval, \(\hat{t}=0\). Let \(U:=U(\hat{t}) \setminus \bigcup _{t=\tau _1}^{\tau _{n-k}}\{v^\mathcal{O}\mathcal{N}_t\}\) denote the set of nodes that have not held the facility of \(\mathcal{O}\mathcal{N}\) during this interval. For all iterations of this interval, the offline algorithm locates its facility on node \(v^\mathcal{O}\mathcal{N}_{\tau _{n-k}}\) if the set U is empty, and on some node in U otherwise. The case in which U is empty indicates that every node in \(U(\hat{t})\) has held the facility of \(\mathcal{O}\mathcal{N}\) exactly once within the interval. Whenever the offline algorithm needs to move, it locates its facility at a node in \(U \cup \{v^\mathcal{O}\mathcal{N}_{\tau _{n-k}}\}\). On the one hand, according to (46), node \(v^\mathcal{O}\mathcal{N}_{\tau _{n-k}}\) or any node in U (in case this set is not empty) has at least \(r_{v^*_{t_{i-1}}}+\max \{\beta ,1\}\) requests at the end of each iteration i that lies in this interval. Therefore, the offline assignment cost at \(t_i\) is

$$\begin{aligned} S^\mathcal{O}\mathcal{F}_{t_i} \le (\alpha -1)\cdot S^*_{t_i}+2r_{v^*_{t_{i-1}}}+\beta +(n-k-2) \cdot \left( \max \{\beta ,1\}+r_{v^*_{t_{i-1}}}\right) \end{aligned}$$
(49)

On the other hand, the optimal assignment cost is

$$\begin{aligned} S^*_{t_i} = (n-k-1) \cdot \left( \max \{\beta ,1\}+r_{v^*_{t_{i-1}}}\right) + r_{v^*_{t_{i-1}}} \end{aligned}$$
(50)

using (46), (47), and (48). Hence (49) and (50) imply that

$$\begin{aligned} S^\mathcal{O}\mathcal{F}_{t_i} < \alpha S^*_{t_i} + \beta . \end{aligned}$$
(51)
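
The implication can be verified directly: writing \(r:=r_{v^*_{t_{i-1}}}\) and \(m:=\max \{\beta ,1\}\), subtracting the right-hand side of (49) from \(\alpha S^*_{t_i}+\beta \) and substituting (50) yields

$$\begin{aligned} \alpha S^*_{t_i}+\beta - S^\mathcal{O}\mathcal{F}_{t_i}&\ge S^*_{t_i} + (\alpha -1)S^*_{t_i} + \beta - \left( (\alpha -1)S^*_{t_i}+2r+\beta +(n-k-2)(m+r)\right) \\&= (n-k-1)(m+r)+r-2r-(n-k-2)(m+r)\\&= m \ge 1, \end{aligned}$$

so (51) in fact holds with slack \(\max \{\beta ,1\}\).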

This guarantees that the offline algorithm does not need to move more than once during any interval of \(n-k\) iterations. In other words, at the beginning of the interval, the offline algorithm decides to locate its facility at a node in \(U \cup \left\{ v^\mathcal{O}\mathcal{N}_{\tau _{n-k}}\right\} \) if needed, since it knows both the behavior of the online algorithm and the request sequence in advance. According to (51), this one movement by the offline algorithm is sufficient to keep (8) satisfied within the interval. Therefore, the offline algorithm moves at most \(\mathcal {I}/(n-k)\) times.

At the end of each iteration i, if the online algorithm has not yet moved within iteration i, then we have \(v^\mathcal{O}\mathcal{N}_{t_{i-1}} = v^\mathcal{O}\mathcal{N}_{t_i}\). Thus,

$$\begin{aligned} S^\mathcal{O}\mathcal{N}_{t_i} = (\alpha -1)\cdot S^*_{t_i}+r_{v^*_{t_{i-1}}}+\beta +(n-k-1) \cdot \left( \max \{\beta ,1\}+r_{v^*_{t_{i-1}}}\right) \end{aligned}$$
(52)

with respect to (46), (47), and (48). Therefore, due to (50) and (52), we have \(S^\mathcal{O}\mathcal{N}_{t_i} = \alpha S^*_{t_i} +\beta \), i.e., (8) is violated. This implies that the online algorithm must have moved at least once to guarantee

$$\begin{aligned} \forall i : v^\mathcal{O}\mathcal{N}_{t_{i-1}} \ne v^\mathcal{O}\mathcal{N}_{t_i}. \end{aligned}$$

Thus, \(\mathcal{O}\mathcal{N}\) has to move at least once per iteration, and the claim holds.\(\square \)
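
As an illustration (not part of the proof), the following Python sketch simulates the execution of (46)–(48) in the reduced setting with a single free facility on \(N=n-k+1\) nodes. The concrete online algorithm (a hypothetical greedy one that always keeps its facility on the heaviest node), the parameter values \(\alpha =2\), \(\beta =10\), and the cyclic choice of \(v^*\) are illustrative assumptions; the proof handles any deterministic online algorithm.

```python
def simulate_free_facility(N, num_intervals, alpha=2, beta=10):
    """Reduced case k <= n/2: one free facility on N = n-k+1 nodes of a
    uniform metric; each interval consists of N-1 iterations."""
    m = max(beta, 1)
    r = [0] * N                       # request counts per node
    p = 0                             # greedy online's facility position
    off = 0                           # offline facility (same initial node)
    online_moves = offline_moves = 0
    for _ in range(num_intervals):
        # Offline (knowing the future) parks on the node the online algorithm
        # occupies last in this interval, i.e. (p - 1) mod N under the cyclic
        # choice of v* below: at most one move per interval.
        target = (p - 1) % N
        if off != target:
            off = target
            offline_moves += 1
        for _ in range(N - 1):
            prev_max = max(r)                        # r_{v*} of previous iteration
            v_star = (p + 1) % N                     # adversary's new heaviest node (!= p)
            for u in range(N):
                if u != p and u != v_star:
                    r[u] = prev_max + m              # eq (46)
            r[p] = prev_max                          # eq (47)
            s_star = (N - 2) * (prev_max + m) + prev_max          # eq (50)
            r[v_star] = (alpha - 1) * s_star + prev_max + beta    # eq (48)
            total = sum(r)
            # staying put would give the online algorithm assignment cost
            # alpha*S* + beta, cf. (52), violating (8); the offline facility
            # keeps (8) satisfied with room to spare, cf. (51)
            assert total - r[p] == alpha * s_star + beta
            assert total - r[off] < alpha * s_star + beta
            p = max(range(N), key=r.__getitem__)     # greedy online moves to v*
            online_moves += 1
    return online_moves, offline_moves
```

With \(N=4\) and two intervals, the online algorithm moves \(2(N-1)=6\) times while the offline schedule moves twice, matching the factor-\((n-k)\) gap of Claim 14.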

Corollary 15

Claim 14 implies that

$$\begin{aligned} M_t^\mathcal{O}\mathcal{F} \le \frac{M_t^\mathcal{O}\mathcal{N}}{n-k}, \end{aligned}$$

where t is the ending time of the \((c \cdot (n-k))\)-th iteration for any integer \(c \ge 1\).

Proof

This follows from the fact that \(M_t^\mathcal {O} \le M_t^\mathcal{O}\mathcal{F}\).\(\square \)

Case \(\varvec{k} > \left\lfloor n/2 \right\rfloor \)

In this case, where more than half of the nodes have facilities, we assume w.l.o.g. that \(n=k+1\). This is achieved by letting the requests arrive at a fixed set of \(k+1\) nodes that includes the k nodes with facilities at the beginning. Therefore, at each time there is exactly one node without a facility, and this holds for any algorithm. Let \(\bar{v}^\mathcal{O}\mathcal{N}_t\) denote the node without any facility of \(\mathcal{O}\mathcal{N}\) at time t. We force \(\mathcal{O}\mathcal{N}\) to move in each iteration i by placing a sufficiently large number of requests on \(\bar{v}^\mathcal{O}\mathcal{N}_{t_{i-1}}\), while any optimal offline algorithm moves one of its facilities only once every at least k iterations. Consider an interval of k iterations, where the first iteration of the interval ends at time \(\tau _1\) and the last iteration ends at time \(\tau _{k}\). For any iteration i of this interval, the distribution of the requests at the end of the iteration is as follows.

$$\begin{aligned} r_{\bar{v}^\mathcal{O}\mathcal{N}_{t_{i-1}}}=\alpha S^*_{t_i} + \max \{\beta ,1\}. \end{aligned}$$
(53)

According to (53), the optimal assignment cost does not change during the interval, i.e., \(S^*_{\tau _i}=S^*_{\tau _{i+1}}\) for all \(i \in [k-1]\) of the current interval.

Claim 16

The above execution guarantees that \(\mathcal{O}\mathcal{N}\) has to move at least once per iteration while the number of movements by any optimal offline algorithm is at most \(\mathcal {I}/k\).

Proof

At the end of iteration i, if \(\bar{v}^\mathcal{O}\mathcal{N}_{t_{i-1}} = \bar{v}^\mathcal{O}\mathcal{N}_{t_i}\), then we have

$$\begin{aligned} S^\mathcal{O}\mathcal{N}_{t_i}=\alpha S^*_{t_i} + \max \{\beta ,1\} \ge \alpha S^*_{t_i} + \beta \end{aligned}$$
(54)

using (53). This implies that the online algorithm must have moved at least once to guarantee

$$\begin{aligned} \forall i : \bar{v}^\mathcal{O}\mathcal{N}_{t_{i-1}} \ne \bar{v}^\mathcal{O}\mathcal{N}_{t_i}. \end{aligned}$$

By contrast, the optimal offline algorithm only needs to move a facility from \(\bar{v}^\mathcal{O}\mathcal{N}_{\tau _k}\) to \(\bar{v}^\mathcal{O}\mathcal{N}_{\tau _1}\) during the interval, with respect to the request distribution in (53). Due to (53), the node \(\bar{v}^\mathcal{O}\mathcal{N}_{\tau _k}\) is the node that has \(\alpha S^*_{\hat{t}} + \max \{\beta ,1\}\) requests within the interval, where \(\hat{t}\) is the ending time of an iteration of the previous interval. Hence, at the end of any iteration i in the interval, the optimal offline assignment cost equals the optimal assignment cost, and thus (8) remains satisfied. Consequently, at most one movement by the optimal offline algorithm is sufficient during the interval. This implies that the number of movements by any optimal offline algorithm is at most \(\mathcal {I}/k\) in this case.\(\square \)
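
This case, too, can be illustrated by a small simulation (not part of the proof). The sketch below pits the adversary of (53) against a hypothetical greedy online algorithm that always uncovers the lightest node; the parameter values \(\alpha =1\), \(\beta =10\) are illustrative assumptions.

```python
def simulate_paging_case(k, num_intervals, alpha=1, beta=10):
    """Case k > n/2 with n = k + 1 nodes on a uniform metric: requests per
    eq (53) at the online algorithm's uncovered node force one online move
    per iteration, while an offline schedule moves once per k iterations."""
    n = k + 1
    r = [0] * n                      # request counts per node
    empty_on = n - 1                 # online's uncovered node (facilities on 0..k-1)
    hits = []
    online_moves = 0
    for _ in range(num_intervals * k):
        v = empty_on
        hits.append(v)
        # eq (53): the optimal solution leaves the lightest node uncovered
        s_star = min(r[u] for u in range(n) if u != v)
        r[v] = alpha * s_star + max(beta, 1)     # violates (8) at node v
        # online must move; greedy choice: uncover the lightest node instead
        empty_on = min(range(n), key=r.__getitem__)
        online_moves += 1
    # offline: within each interval of k iterations the k hit nodes are
    # distinct, so parking the uncovered slot on the unhit node of the
    # interval requires at most one move per interval
    offline_moves = 0
    off_empty = n - 1
    for j in range(num_intervals):
        window = set(hits[j * k:(j + 1) * k])
        assert len(window) == k                  # k distinct hits per interval
        unhit = (set(range(n)) - window).pop()
        if unhit != off_empty:
            off_empty = unhit
            offline_moves += 1
    return online_moves, offline_moves
```

For \(k=3\) and four intervals this yields 12 online moves against 4 offline moves, matching the factor-k gap of Claim 16.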

Let t be the ending time of the \((c \cdot \max \{k,n-k\})\)-th iteration for any integer \(c \ge 1\). Using Corollary 15 and Claim 16, we obtain

$$\begin{aligned} M_t^\mathcal{O}\mathcal{N} \ge \max \{n-k,k\} \cdot M_t^\mathcal {O} \ge \frac{n}{2} \cdot M_t^\mathcal {O}. \end{aligned}$$

Thus the claim of the theorem holds.

5 Conclusion

In light of the limited research on the online variants of the mobile facility location problem (MFL), we introduce and examine the OMFL problem and its two subtypes: G-OMFL and M-OMFL. We establish tight bounds for G-OMFL on uniform metrics, where our lower bound for OMFL also applies to G-OMFL on uniform metrics since OMFL on uniform metrics is a special case of G-OMFL on uniform metrics. Additionally, we demonstrate a linear lower bound on the competitiveness for M-OMFL, even on uniform metrics.

Motivated by the approach used by [25, 26] for the k-server problem, we define and study G-OMFL on uniform metrics, similar to the allocation problem defined by [25, 26] on uniform metrics. This is the first step towards solving OMFL on HSTs and general metrics. The second step, which remains an open question, is to adapt a similar approach for OMFL using the DGA algorithm presented in this paper. The idea is for each internal node of the HST to run an instance of G-OMFL to decide how to allocate its facilities among its children nodes. Starting from the root, which has k facilities, this recursive process determines the number of facilities at each leaf of the HST, providing a feasible solution for OMFL.

The second open question is whether the result provided by Theorem 3 is tight. Additionally, due to the similarities between the M-OMFL problem in uniform metrics and the paging problem, it would be interesting to explore the use of randomized online algorithms against oblivious adversaries for M-OMFL. The goal would be to achieve a sublinear competitive ratio for this problem.