1 Introduction

How items should be assigned to groups is an intensively investigated combinatorial optimization problem from both a theoretical and a practical point of view. In the literature on combinatorial optimization, partition, multi-way number partitioning, and the maximally diverse grouping problem (MDGP) are well-investigated problems. These problems also arise in many practical settings. The MDGP, for example, is used to assign students to groups, courses or teams, but further applications, for instance in parallel machine scheduling or surgery scheduling, are possible (we elaborate on applications in Sect. 2).

This paper is in line with the transfer of combinatorial optimization problems into practical applications. We consider the variant of the MDGP with attribute values proposed by Schulz (2021). The author described the set of optimal solutions for the MDGP with attribute values by a set of equalities. These equalities are used as constraints to find the best balanced solution amongst all optimal solutions of the MDGP with attribute values. Furthermore, the paper presents a proof that the problem is NP-hard.

Attribute values are often integral in practical applications like student assignments (performance scores), surgery scheduling (surgery durations) or machine scheduling (processing times). Therefore, we restrict the problem setting of the balanced MDGP to integer attribute values in the present paper. Our contribution to the theoretical investigation of the problem is to prove that the problem can be solved in pseudo-polynomial time and to outline relations to partition and multi-way number partitioning. Moreover, we introduce a lower bound for the optimal objective value. This enables us to solve instances with several thousand items to optimality within seconds. Thus, we close the gap mentioned in Schulz (2021) (considering discrete attribute values) and present an efficient solution approach, which can easily be used in applications.

The paper is structured as follows: First, we present a literature review especially from the point of view of applications (Sect. 2). Then, we define our problem, and prove that it is solvable in pseudo-polynomial time by a relation to the well-known PARTITION problem in Sect. 3. Our solution approach, a mixed-integer program (MIP), is introduced in Sect. 4. Afterwards, it is evaluated in a comprehensive computational study (Sect. 5). The paper closes with a conclusion in Sect. 6.

2 Literature review

In this section, we review two streams of the literature: first, the MDGP, which is the underlying problem of our approach, and second, practical applications that assign items with integer attribute values to groups with the objective of making the groups as balanced as possible. Ideally, this means that there is a bijective mapping between each pair of groups such that each item has a partner with the same attribute value in each other group.

The MDGP considers a set of items \(i,j \in I\) with differences \(d_{ij} \ge 0\) as well as a set of groups \(g \in G\) as given. The aim is to assign the items to the groups such that the diversity, which is measured by the sum of differences \(d_{ij}\) of items assigned to the same group, is maximized. The problem can be formulated as a short integer program with the binary variable \(x_{ig}\), which is one if item i is assigned to group g and zero otherwise.

$$\begin{aligned} \max \sum _{g \in G} \sum _{i \in I} \sum _{j \in I: j>i}&\quad d_{ij}x_{ig}x_{jg} \end{aligned}$$
(1)
$$\begin{aligned} \sum _{g \in G} x_{ig} = 1&\quad \forall i \in I \end{aligned}$$
(2)
$$\begin{aligned} \sum _{i \in I} x_{ig} = |I|/|G|&\quad \forall g \in G \end{aligned}$$
(3)
$$\begin{aligned} x_{ig} \in \{ 0,1 \}&\quad \forall i \in I, g \in G \end{aligned}$$
(4)
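
To make formulation (1)-(4) concrete, the following minimal Python sketch evaluates objective (1) for a given assignment and enumerates all equal-sized partitions of a tiny instance. It is only an illustration: the function names and the toy data (differences \(d_{ij}\) derived from hypothetical attribute values) are assumptions, and full enumeration is practical only for a handful of items.

```python
from itertools import combinations


def mdgp_objective(groups, d):
    """Objective (1): sum of d[i][j] over all item pairs placed in the same group."""
    return sum(d[i][j] for g in groups for i, j in combinations(sorted(g), 2))


def brute_force_mdgp(d, num_groups):
    """Enumerate all partitions into equal-sized groups, i.e. constraints (2)-(4)."""
    n = len(d)
    size = n // num_groups
    assert size * num_groups == n, "constraint (3) needs |I| divisible by |G|"
    best_value, best_groups = -1, None

    def extend(remaining, groups):
        nonlocal best_value, best_groups
        if not remaining:
            value = mdgp_objective(groups, d)
            if value > best_value:
                best_value, best_groups = value, [list(g) for g in groups]
            return
        first = remaining[0]  # fix the smallest free item to skip symmetric duplicates
        for rest in combinations(remaining[1:], size - 1):
            group = (first,) + rest
            left = [i for i in remaining if i not in group]
            extend(left, groups + [group])

    extend(list(range(n)), [])
    return best_value, best_groups


# Hypothetical toy instance: differences computed from attribute values.
av = [1, 1, 3, 3, 3, 3, 5, 5]
d = [[abs(a - b) for b in av] for a in av]
value, groups = brute_force_mdgp(d, 2)
```

For this toy instance the enumeration visits 35 partitions; the exact approaches cited below replace such enumeration by mixed-integer programming.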

Beside the version with equal-sized groups (compare (3)), the number of items per group can also be required to be within a given range (Gallego et al. 2013).

The MDGP is NP-hard (Feo and Khellaf 1990). However, a few papers present exact solution approaches based on mixed-integer programming for the MDGP. Gallego et al. (2013) conducted a computational study with both versions (equal-sized and not equal-sized). They could solve instances with 12 items to optimality within 1800 s. Papenberg and Klau (2021) presented a different model formulation [based on a work by Grötschel and Wakabayashi (1989)] for the version with equal-sized groups. With this version they were able to solve instances with 28 items within 950 s and with 30 items in almost 10000 s to optimality. For the MDGP with attribute values Schulz (2022c) introduced a mixed-integer programming formulation which is able to solve instances with 70 items and three attributes in 1800 s to (near) optimality.

As only small instances can be solved to optimality, most of the approaches presented in the literature are based on metaheuristics. Beside Gallego et al. (2013), who investigated a tabu search approach with strategic oscillation, examples are an artificial bee colony algorithm (Rodriguez et al. 2013), an iterated maxima search approach (Lai and Hao 2016) or a hybrid genetic algorithm (Singh and Sundar 2019). In the following, we focus our review on the practical applications.

In surgery scheduling, patients are assigned to operating rooms according to, among other criteria, their surgery duration. A well-studied objective is the minimization of waiting times for emergency patients. As their lives are in danger, immediate treatment is necessary. In the approach by Zhang et al. (2009), an entire operating room is reserved for emergency patients such that they do not need to be included in further planning. van Veen-Berkx et al. (2016) showed that this would lead to a higher utilization, less overtime, and fewer cancellations. The paper by Wullink et al. (2007) showed that reserving capacity for emergency patients in all operating rooms, and hence allowing emergency patients to enter all operating rooms, leads to a huge reduction of their waiting times. As long waiting times increase the risk of postoperative complications and morbidity for emergency patients (Wullink et al. 2007), several authors (such as van der Lans et al. 2006, van Essen et al. 2012, and Schulz and Fliedner 2023) proposed approaches in which elective and emergency patients share the same set of operating rooms. Transferring the requirements of emergency patients to the weekly scheduling level, it is important that they have appropriate access to operating rooms on each day. It is therefore natural to expand the aforementioned approaches to the weekly level by assigning surgeries to days such that the distribution of their durations is as similar as possible for each pair of days. Thus, a perfect assignment has a bijective mapping between the first day and any other day such that each surgery of the first day is mapped to a surgery of the other day which has exactly the same duration. If such a perfect assignment is possible, the same general schedule can be repeated on each day and randomly arriving emergency patients have the same expected waiting time on each day. As reserving the same number of resources at each point in time minimizes emergency patients’ waiting times (Schulz and Fliedner 2023), the approach leads to an indirect minimization of emergency patients’ waiting times by allowing the scheduler to schedule all surgery days as similarly as possible.

Further well-studied objectives in surgery scheduling are the maximization of utilization and the minimization of overtime in operating rooms. As surgery durations are stochastic in reality (see e.g. Samudra et al. 2017), both goals are in conflict with each other such that a perfect utilization without a significant risk of overtime is unlikely to be found (Van Houdenhoven et al. 2007). In the review by Cardoen et al. (2010), 47 out of 124 cited papers considered either underutilization/undertime (26 papers) or overutilization/overtime (45 papers) or both. Among many others, Fei et al. (2010), Gul et al. (2011), Mannino et al. (2012), Astaraky and Patrick (2015), and Lin and Chou (2019) minimized overtime in more recent articles not yet considered in that review. Balancing the distribution of surgery durations over days and ideally having a partner surgery of the same length on each day, as we aim for, leads to the same expected over-/underutilization on all days, which hence has to be optimal if rejection is not allowed.

Our approach might, moreover, be useful in parallel machine scheduling, where jobs are assigned to identical machines (see Han et al. 2013, see also Korf and Schreiber 2013, who point out the relation to multi-way number partitioning) with the objective to minimize makespan (see Cheng and Sin 1990). As machine breakdowns occur randomly (Kaabi and Harrath 2014), assigning jobs to machines such that all machines have the same distribution of processing times might be an appropriate strategy; especially in the case where all machines are unavailable after a disruption occurs (Lee and Yu 2008).

A similar argumentation holds if the sum of completion times is minimized. As on a single machine shortest processing time first (SPT) is optimal (Smith 1956 and Biskup 1999), an assignment has to be optimal as well in which each identical parallel machine is scheduled according to SPT and each job on each machine has on each of the other machines a unique partner with the same duration (Conway et al. 1967).

Another field of application is the assignment of pupils to tutor groups (Baker and Benn 2001) or students to groups (Cutshall et al. 2007; Krass and Ovchinnikov 2006), projects (Beheshtian-Ardekani and Mahmood 1986) or teams (Galvão Dias and Borges 2017).

The assignment of students to some kind of groups is also the main application used in articles on the MDGP. However, in practice differences \(d_{ij}\) between students are computed according to attribute values \(av_i\), i.e. \(d_{ij} = |av_i - av_j|\), like marks, 0, 1, and 2 for male, female, and diverse or 0 and 1 for non-international and international. Schulz (2021) used this to introduce the MDGP with attribute values. In the case with attribute values, the set of optimal solutions can be described by (2), (4), and the block constraints introduced in Schulz (2021) (if the system of Eqs. (2), (4), and block constraints has a feasible solution). As the number of optimal solutions for the MDGP with attribute values can be very large, the paper further suggests to find the best balanced among them such that the diversity of each group is as similar as possible. Schulz (2021) used a mixed-integer program to solve the balanced MDGP.

In subsequent studies, the author generalized the approach to the case where (2), (4), and the block constraints have no feasible solution (Schulz 2023) or varying group sizes are allowed (Schulz 2022b). Schulz (2022c) used the idea of the block constraints to develop the mixed-integer programming formulation for the MDGP with attribute values as mentioned above.

However, as in the example with the students, attribute values are usually integer in practical applications. Thus, we see a need for approaches that distribute items to groups according to integer attribute values such that the groups are as balanced as possible. Ideally, we have a bijective matching between each pair of groups where each item has the same attribute value as its partner in the other group. In the present paper, we assign items according to a single integer attribute to groups such that the groups are as balanced as possible. Before we introduce our solution approaches in Sect. 4, we define the problem formally and investigate it theoretically in the following section.

3 Problem statement and theoretical analysis

We assume integer attribute values \(av_i \in \mathbb {N}_0\) for all items \(i \in I\). All items have to be assigned to groups \(g \in G\) such that each group gets the same number of items assigned. Otherwise, there cannot be a bijective mapping. Thus, the number of items per group |K| has to fulfil \(|I| = |G| \cdot |K|\). This is no restriction since items with attribute value zero can be used to fill up. We say the groups are perfectly balanced if the sum of pairwise absolute differences of all items assigned to a group \(g \in G\), i.e.

$$\begin{aligned} \sum _{i,j \text { assigned to } g: i<j} d_{ij} = \sum _{i,j \text { assigned to } g: i<j} |av_i - av_j|, \end{aligned}$$

is equal for all groups \(g \in G\).

Moreover, we divide all items into blocks according to the following pattern: The |G| items with the largest attribute values are assigned to the first block \(k = 1\), the |G| items with the next largest attribute values are assigned to the second block \(k = 2\), and so on until the |G| items with the smallest attribute values are assigned to the last block \(k = |K|\). Then, exactly one item of each block has to be assigned to each group. Schulz (2021) calls this block constraints (Assumption 1 in Schulz 2021).
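
The block pattern described above amounts to sorting the items by decreasing attribute value and cutting the sorted list into chunks of |G| items. A minimal Python sketch follows, in which the function name `build_blocks` and the sample data are hypothetical:

```python
def build_blocks(av, num_groups):
    """Return the blocks as lists of item indices, largest attribute values first."""
    if len(av) % num_groups != 0:
        raise ValueError("|I| must equal |G| * |K|; pad with zero-valued items first")
    # Sort item indices by decreasing attribute value (ties keep their input order).
    order = sorted(range(len(av)), key=lambda i: av[i], reverse=True)
    # Block k (1-based) holds the items at sorted positions (k-1)|G|, ..., k|G|-1.
    return [order[s:s + num_groups] for s in range(0, len(order), num_groups)]


av = [1, 5, 3, 3, 1, 3, 5, 3]   # hypothetical attribute values
blocks = build_blocks(av, 2)    # two groups, hence four blocks of two items
block_values = [[av[i] for i in block] for block in blocks]
```

Under the block constraints, each group then receives exactly one index from every inner list of `blocks`.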

We define the decision version of BALANCED MAXIMALLY DIVERSE GROUPING PROBLEM WITH INTEGER ATTRIBUTE VALUES (BMDGPIAV) (see Schulz 2021 for the general problem definition not requiring integer values) as follows:

$$\begin{aligned}&\hbox {INSTANCE: Set }I\hbox { of items with attribute value }av_i \in \mathbb {N}_{0}\hbox { for each }i \in I,\nonumber \\&\quad \hbox { a number }|G|\hbox { of groups, and a set }K\hbox { of blocks with }|I| = |G| \cdot |K|.\nonumber \\&\qquad \hbox { Each item is assigned to exactly one block }k \in K\hbox { as described before.} \nonumber \\&\hbox {QUESTION: Can all items }i \in I\hbox { be partitioned into } |G|\hbox { disjoint subsets }I_1, ..., I_{|G|}\nonumber \\&\quad \hbox { such that each subset gets exactly one item of each block assigned and }\nonumber \\&\quad \sum _{i,j \in I_g: i<j} |av_i - av_j| \nonumber \\&\qquad = \frac{\sum _{g = 1}^{|G|} \sum _{i,j \in I_g: i<j} |av_i - av_j|}{|G|} \text { for all } g \in \{ 1, ..., |G| \}? \end{aligned}$$
(5)

3.1 Complexity analysis

Although the more general version of BMDGPIAV (with \(av_i \in \mathbb {Q}_{\ge 0}\)) is NP-hard (Schulz 2021), the following theorem shows that our problem with integer attribute values is solvable in pseudo-polynomial time if the number of groups is restricted.

Theorem 1

BMDGPIAV is solvable in pseudo-polynomial time if |G| is fixed.

The contribution of an item i, i.e. \(\sum _{j \in I_g} |av_i - av_j|/2\), in

$$\begin{aligned} div_g = \sum _{i,j \in I_g: i<j} |av_i - av_j| = \frac{\sum _{g = 1}^{|G|} \sum _{i,j \in I_g: i<j} |av_i - av_j|}{|G|} \text { for all } g \in G \end{aligned}$$
(6)

depends on all other items assigned to the same group. Therefore, we introduce the following lemma, which removes this dependency and yields a more convenient notation, before we prove the theorem. For a proof of the lemma and an example see Schulz (2021).

Lemma 2

(Schulz 2021) If \(|K| > 1\), an item of block k with attribute value av has a share on the within-group diversity \(div_g\) of

$$\begin{aligned} \max (av - med, med - av) \cdot c_k \end{aligned}$$

with

$$\begin{aligned} c_k = 2 \cdot \max \left( \frac{|K|}{2} - k, k - \frac{|K|}{2} - 1 \right) + 1 \end{aligned}$$
(7)

for even |K| and

$$\begin{aligned} c_k = 2 \cdot \left| k - \bigg \lceil \frac{|K|}{2} \bigg \rceil \right| \end{aligned}$$
(8)

for odd |K|, where med is the median of all attribute values of the group. For even |K| sort all attribute values in decreasing order and set med to any value between the two attribute values in the middle.
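
Lemma 2 can be checked numerically on small groups. The Python sketch below computes \(c_k\) from (7) and (8) and compares the weighted sum with the direct sum of pairwise differences. The function names are hypothetical, and for even |K| the midpoint of the two middle values is used as one admissible choice of med.

```python
from itertools import combinations
from statistics import median


def block_coefficient(k, num_blocks):
    """c_k from (7) for even |K| and from (8) for odd |K|; k is 1-based."""
    if num_blocks % 2 == 0:
        half = num_blocks // 2
        return 2 * max(half - k, k - half - 1) + 1
    middle = (num_blocks + 1) // 2  # equals ceil(|K| / 2)
    return 2 * abs(k - middle)


def pairwise_diversity(values):
    """Within-group diversity as the sum of pairwise absolute differences."""
    return sum(abs(a - b) for a, b in combinations(values, 2))


def lemma_diversity(values):
    """Within-group diversity as the sum of the shares c_k * |av - med|."""
    ordered = sorted(values, reverse=True)  # the k-th largest value stems from block k
    med = median(ordered)                   # even |K|: midpoint of the two middle values
    n = len(ordered)
    return sum(block_coefficient(k, n) * abs(v - med)
               for k, v in enumerate(ordered, start=1))


coefficients_even = [block_coefficient(k, 4) for k in range(1, 5)]
coefficients_odd = [block_coefficient(k, 3) for k in range(1, 4)]
```

Both functions return 12 for the group (1, 3, 3, 5) and 10 for the group (2, 4, 7), so each item's share can be evaluated without enumerating the other items of its group.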

Let \(b_{ki}\) be a binary parameter which is one if item i is assigned to block k according to the aforementioned pattern, i.e. the |G| items with the largest attribute values are assigned to the first block, the items with the next |G| largest attribute values to the second, and so on. With the help of Lemma 2 we are able to determine the contribution of an item i in the objective value as \(\sum _{k=1}^{|K|} b_{ki} \cdot c_k \cdot |av_i - med|\) and express \(div_g\) as \(\sum _{i \in I_g} \sum _{k=1}^{|K|} b_{ki} \cdot c_k \cdot |av_i - med|\). By this, the evaluation of the assignment of an item to a group becomes independent of the other items assigned to the same group. We use this fact in the proof of Theorem 1.

Proof of Theorem 1

The following dynamic programming algorithm, based on the same idea as the PARTITION algorithm in Garey and Johnson (1979, pp. 90–91), solves the problem. Set \(\bar{av}_i = |av_i - med|\), \(i \in I\). Then,

$$\begin{aligned} B = \frac{\sum _{i=1}^{|I|} \sum _{k=1}^{|K|} b_{ki} \cdot c_k \cdot \bar{av}_i}{|G|} = \frac{\sum _{g = 1}^{|G|} \sum _{i,j \in I_g: i<j} |av_i - av_j|}{|G|} \end{aligned}$$

is bounded by \((\max _k c_k) \cdot \sum _{i \in I} |\bar{av}_i| \le 2|K|\sum _{i \in I} |\bar{av}_i| \le 2|K||I|\max _i av_i\). Define \(\tilde{av}_i = \sum _{k = 1}^{|K|} b_{ki} \cdot c_k \cdot \bar{av}_i\) and for all integers \(1 \le i \le |I|\) and \(0 \le j_{kl} \le B\), \(l = 1,..., |G|\), \(k = 1,..., |K|\), \(s(k,j_{k1},..., j_{k|G|})\) which specifies the truth value of the statement: “there are subsets \(h_1,..., h_{|G|}\) of the first k blocks’ items \(\{ i_1,..., i_{k|G|} \}\) with \(h_l \cap h_m = \emptyset \) for all \(l,m = 1,..., |G|\), \(l \ne m\), and \(\bigcup _{l=1}^{|G|} h_l = \{ i_1,..., i_{k|G|} \}\) for which \(\sum _{i \in h_l} \tilde{av}_i = j_{kl}\) for all \(l = 1,..., |G|\) such that each \(h_l\), \(l = 1,..., |G|\), contains exactly one item of each block \(k' = 1,..., k\).” Fill the resulting \(|G| + 1\) dimensional table row by row after k with

$$\begin{aligned} s(k, j_{k1},..., j_{k|G|}) = {\left\{ \begin{array}{ll} \text {true}, &{} \text {if} \; j_{(k-1)l} + \tilde{av}_{k_l} = j_{kl} \; \forall l = 1,..., |G| \; \text {and any} \\ {} &{} \text {permutation of the set of attribute values} \\ &{} \{ \tilde{av}_{k_1},..., \tilde{av}_{k_{|G|}} \} \; \text {of items} \; i_{(k-1)|G|+1},..., i_{k|G|} \; \text {with} \\ &{} s(k-1, j_{(k-1)1},..., j_{(k-1)|G|}) = \text {true}, \\ \text {false}, &{} \text {otherwise}. \\ \end{array}\right. }\nonumber \\ \end{aligned}$$
(9)

Then, the answer to (5) is “yes” iff \(s(|K|, B,..., B) = true\).

Note that there are |G|! permutations on the right side of (9).
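
The recursion (9) can be sketched in Python by storing only the reachable tuples \((j_{k1},..., j_{k|G|})\) instead of the full truth table. This is a simplified illustration rather than the implementation evaluated later. The function names are hypothetical, states are kept in sorted order to merge symmetric tuples, partial sums above B are pruned, and med is assumed to be the smallest value of block \(\lceil |K|/2 \rceil\), which keeps all weights \(\tilde{av}_i\) integral.

```python
from itertools import permutations


def block_coefficient(k, num_blocks):
    """c_k from (7) for even |K| and from (8) for odd |K|; k is 1-based."""
    if num_blocks % 2 == 0:
        half = num_blocks // 2
        return 2 * max(half - k, k - half - 1) + 1
    return 2 * abs(k - (num_blocks + 1) // 2)


def weighted_blocks(av, num_groups):
    """Blocks of weights c_k * |av_i - med|, ordered by decreasing attribute value."""
    ordered = sorted(av, reverse=True)
    num_blocks = len(ordered) // num_groups
    assert num_blocks * num_groups == len(ordered), "|I| must equal |G| * |K|"
    # Assumption: med is the smallest value of block ceil(|K|/2).
    med = ordered[((num_blocks + 1) // 2) * num_groups - 1]
    return [[block_coefficient(k, num_blocks) * abs(v - med)
             for v in ordered[(k - 1) * num_groups:k * num_groups]]
            for k in range(1, num_blocks + 1)]


def perfectly_balanced(av, num_groups):
    """Decide question (5): is (B, ..., B) reachable after the last block?"""
    blocks = weighted_blocks(av, num_groups)
    total = sum(sum(block) for block in blocks)
    if total % num_groups != 0:
        return False
    target = total // num_groups                 # B
    states = {tuple([0] * num_groups)}           # nothing assigned yet
    for block in blocks:                         # one row of the table per block
        next_states = set()
        for perm in set(permutations(block)):    # at most |G|! distinct permutations
            for state in states:
                new = tuple(sorted(s + p for s, p in zip(state, perm)))
                if new[-1] <= target:            # weights are non-negative
                    next_states.add(new)
        states = next_states
    return tuple([target] * num_groups) in states
```

The number of stored states is bounded by the number of tuples with entries between 0 and B, which reflects the pseudo-polynomial bound for fixed |G|.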

3.2 Optimization version of the problem

Although Theorem 1 states that we are able to decide in pseudo-polynomial time whether a given instance has a perfectly balanced solution or not, we need an optimization version to find a best balanced solution if a perfect balance is not possible. Therefore, we minimize the following objective

$$\begin{aligned} F = \sum _{g,g' \in G:g'>g} |div_g - div_{g'}|. \end{aligned}$$
(10)

There is a detailed motivation for the objective and the choice of blocks in Schulz (2021). We only mention briefly that the kind of objective in (10) weights larger differences between attribute values assigned to the same group with a larger value. Consider a solution with \(div_1 = 1\), \(div_2 = 2\), \(div_3 = 3\), and \(div_4 = 4\). Then,

$$\begin{aligned} F&= \sum _{g,g' \in G:g'>g} |div_g - div_{g'}| = |4-3| + |4-2| + |4-1| + |3-2| \nonumber \\&\qquad + |3-1| + |2-1| \nonumber \\&= |4-1| + \underbrace{|4-3| + |3-1|}_{= |4-1|} + \underbrace{|4-2| + |2-1|}_{= |4-1|} + |3-2| \nonumber \\&= 3 \cdot |4-1| + 1 \cdot |3-2|. \end{aligned}$$
(11)

The factors 3 and 1 correspond to the \(c_k\) values in Lemma 2. Because of this, the objective aims at reducing each summand \(|div_g - div_{g'}|\) of (10) as far as possible but has a higher weight on the larger differences.
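
The regrouping in (11) can be reproduced with a few lines of Python. The sketch below, with hypothetical function names, computes F once as the sum over all pairs as in (10) and once by pairing the largest with the smallest diversity, the second largest with the second smallest, and so on, with weights |G| - 1, |G| - 3, ...

```python
from itertools import combinations


def imbalance(divs):
    """Objective (10): sum of absolute differences over all pairs of groups."""
    return sum(abs(a - b) for a, b in combinations(divs, 2))


def imbalance_weighted(divs):
    """The same value, regrouped as in (11)."""
    ordered = sorted(divs, reverse=True)
    n = len(ordered)
    # The t-th largest and the t-th smallest value form a pair with weight n - 1 - 2t.
    return sum((n - 1 - 2 * t) * (ordered[t] - ordered[n - 1 - t])
               for t in range(n // 2))


example = [1, 2, 3, 4]
value_pairs = imbalance(example)
value_weighted = imbalance_weighted(example)
```

Both evaluations give 10 for the example, i.e. \(3 \cdot |4-1| + 1 \cdot |3-2|\).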

On the other hand, the block constraints take care of our requirement that each group gets the same number of items assigned and the total diversity

$$\begin{aligned} \sum _{g = 1}^{|G|} \sum _{i,j \in I_g: i<j} |av_i - av_j| \end{aligned}$$
(12)

is maximized as it is shown in Schulz (2021).

Consider the following example with eight items with attribute values 1, 1, 3, 3, 3, 3, 5, 5 (items are represented by their attribute value). It has the solution (1,1,3,3) and (3,3,5,5) where both groups have a value of 8 for \(\sum _{i,j} |av_i - av_j|\). Hence, it is optimal with respect to the second part of (5) and (10), but it is not maximal for (12) and not perfect in our sense that each item of each group should have a unique partner with the same attribute value in each other group. With block constraints the optimal solution with respect to (10) and (12) would be (1,3,3,5) and (1,3,3,5) with a diversity of 12 for both groups, which is exactly what we want.
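
The example can be verified by enumerating all assignments that respect the block constraints. The following Python sketch is a brute-force illustration for tiny instances only; the function names are hypothetical and the first block is fixed to exploit the symmetry between groups.

```python
from itertools import combinations, permutations, product


def diversity(values):
    """Sum of pairwise absolute differences within one group."""
    return sum(abs(a - b) for a, b in combinations(values, 2))


def imbalance(divs):
    """Objective (10) for a list of within-group diversities."""
    return sum(abs(a - b) for a, b in combinations(divs, 2))


def best_block_assignment(av, num_groups):
    """Return (F, groups) of a best assignment under the block constraints."""
    ordered = sorted(av, reverse=True)
    assert len(ordered) % num_groups == 0, "|I| must equal |G| * |K|"
    blocks = [ordered[s:s + num_groups] for s in range(0, len(ordered), num_groups)]
    # First block in fixed order; all other blocks in every distinct order.
    choices = [[tuple(blocks[0])]] + [sorted(set(permutations(b))) for b in blocks[1:]]
    best = None
    for combo in product(*choices):
        # Group g receives the g-th entry of every (permuted) block.
        groups = [tuple(block[g] for block in combo) for g in range(num_groups)]
        value = imbalance([diversity(g) for g in groups])
        if best is None or value < best[0]:
            best = (value, groups)
    return best


best_value, best_groups = best_block_assignment([1, 1, 3, 3, 3, 3, 5, 5], 2)
```

For the eight items above, the enumeration returns F = 0 with both groups equal to (5, 3, 3, 1), whereas the groups (1,1,3,3) and (3,3,5,5) violate the block constraints and reach a diversity of only 8 each.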

3.3 Relation to PARTITION

The proof of Theorem 1 indicates a strong relation to PARTITION in the case of two groups or to the multi-way number partitioning problem in general. Whereas there is even an equivalence between our problem for two groups and PARTITION (Schulz 2021), our objective differs from the one of the multi-way number partitioning problem, where the sum of attribute values assigned to the same group and the range between the largest and lowest one are considered as objective (Korf 2009).

Korf (2009) asserts that in any optimal three-way partitioning solution each pair of groups must be an optimal solution for the appropriate PARTITION(O) (optimization version of PARTITION) instance with all numbers assigned to these two groups, but this is not the case for any optimal four-way partitioning solution. In other words, take two groups a and b of an optimal solution of a multi-way number partitioning instance and solve the corresponding PARTITION(O) instance with all items in a and b. For three-way partitioning, the original assignment into the groups a and b is also an optimal solution for the PARTITION(O) instance. For four-way partitioning, the original assignment of two groups a and b is not necessarily an optimal solution for the corresponding PARTITION(O) instance. For the optimization version of BMDGPIAV and for any number of groups (ways), each pair of groups a and b is an optimal solution for the PARTITION(O) instance with all items in a and b, as the following theorem shows.

Theorem 3

In any optimal solution of the optimization version of BMDGPIAV, the partition of any two groups corresponds to an optimal solution of the PARTITION(O) instance with all items assigned to these two groups.

Proof

Consider an optimal solution for any BMDGPIAV instance with |G| groups. Let w.l.o.g. \(div_1 \ge div_2 \ge ... \ge div_{|G|}\). Lemma 2 says that the objective value F equals

$$\begin{aligned} \sum _{g=1}^{|G|} c_g \cdot |div_g - med|. \end{aligned}$$

Note that \(c_1 = c_{|G|}> c_2 = c_{|G| - 1} >... \ge 0\) for an odd number of groups and \(c_1 = c_{|G|}> c_2 = c_{|G| - 1}>...> c_{\lfloor |G|/2 \rfloor } = c_{\lceil |G|/2 \rceil } = 1 > 0\) in the even case. If the solution in the corresponding instance of PARTITION(O) is not optimal for a pair of two groups \(G_1\) and \(G_2\), we can solve it to optimality (item sizes s(i) in the PARTITION(O) instance are \(\sum _{k=1}^{|K|} b_{ki} \cdot c_k \cdot \bar{av}_i\)), transform it back into our setting, and get

$$\begin{aligned} |div_{G_1} - div_{G_2}| > |div'_{G_1} - div'_{G_2}|, \end{aligned}$$
(13)

where \(div'_{G_j}\), \(j = 1, 2\), is the within-group diversity of the transformed optimal PARTITION(O) instance. Let further w.l.o.g. \(div_{G_1} > div_{G_2}\) and \(div'_{G_1} > div'_{G_2}\). Then, (13) implies that \(div_{G_1} > div'_{G_1}\) and \(div_{G_2} < div'_{G_2}\) hold. Note that one item of each block is assigned to each group, which means that the value of (12) is independent of the item assignment to the groups. Therefore, \(div_{G_1} + div_{G_2} = div'_{G_1} + div'_{G_2}\). A case-by-case analysis shows that this leads to a reduction of F, which is a contradiction because the solution was assumed to be optimal. The case-by-case analysis is given in Appendix 1. Thus, the solution of the corresponding PARTITION(O) instance has to be optimal as well.

The following example shows that it is not sufficient for global optimality to prove that the partition of any two groups corresponds to an optimal solution of the corresponding PARTITION(O) instance. Nevertheless, it is sufficient for a local optimum with respect to the neighbourhood in which two solutions are neighbours if they differ in the assignment of only two groups.

Example 4

Consider the instance given in Table 1 with 12 items which have to be assigned to three groups with four items each.

Table 1 Example that Theorem 3 is not sufficient for a global optimum

The first block contains items a, b, and c, the second items d, e, and f, the third g, h, and i, and the fourth j, k, and l, i.e. each group gets exactly one item out of a, b, and c assigned.

Table 2 A local and a global optimal solution

Table 2 gives a local optimal solution on the left and a global optimal solution on the right side. The numbers in brackets are the values for \(\sum _{k = 1}^{|K|} b_{ki} \cdot c_k \cdot \bar{av}_i\) such that div is the sum of them. For item a, which is in block 1, \(\sum _{k = 1}^{|K|} b_{ka} \cdot c_k = b_{1a} \cdot c_1 = 1 \cdot c_1 = 3\) (compare (7)) and \(\bar{av}_a = av_a - med = 12 - 8 = 4\), where med can be set to any value between 7 (attribute value of i) and 8 (attribute value of d); here \(med = 8\).

While the first two blocks are equal in both solutions, the last blocks’ difference is within our neighbourhood (j is assigned to group one in both cases such that there is only a swap of items k and l). However, in the third block no swap of two items is possible which leads from the assignment (ghi) on the left side to (hig) on the right side, i.e. both solutions are not neighbouring. Furthermore, the lower part of Table 2 presents the within-group diversities (div) and the objective values F, which shows that the right solution is better than the left one. Nevertheless, the left one is a local optimum such that the result of Theorem 3 is necessary but not sufficient for global optima.

Theorem 3 and Example 4 show that our objective leads, in contrast to the one in the multi-way number partitioning problem, to a necessary condition for optimal solutions, although it is not sufficient.

Korf (2010) mentions three objectives for multi-way number partitioning: minimize the largest sum of items assigned to the same group, maximize the smallest sum of items assigned to the same group, and minimize the difference between them. Cong and Lim (1998) point out that it is reasonable to solve the multi-way number partitioning problem by optimizing the PARTITION(O) instances with all items currently assigned to only two groups and assign the items of each partition to one of the two groups afterwards. Our objective could be an alternative objective for the multi-way number partitioning problem, which leads even to a stronger reference to PARTITION.

4 Solution approaches

This section presents our two solution approaches: a dynamic program (DP, Sect. 4.1) and a mixed-integer program (MIP, Sect. 4.2).

4.1 Dynamic program

Our first solution approach is based on the dynamic program (DP) proposed in the proof of Theorem 1. In the proof, the algorithm adds one block after the other. In the first row, \(s(1, j_1,..., j_{|G|}) = true\) for all permutations \((j_1,..., j_{|G|})\) of the |G| values \(\tilde{av}\) of the items of the first block. In the second row, \(s(2, j_1 + \pi _1,..., j_{|G|} + \pi _{|G|}) = true\) if \(s(1, j_1,..., j_{|G|}) = true\) and \((\pi _1,..., \pi _{|G|})\) is a permutation of the values \(\tilde{av}\) of the items of the second block. The same procedure is repeated for the remaining blocks.

Table 3 Worst case analysis for different implementations for eight blocks (sufficiently large \(\max av_i\))

Table 3 summarizes the worst case effort (dependent on the number of groups |G|) for eight blocks for the implementation in Theorem 1 (column 2), a more efficient version of it (column 3), and for our implementation (column 4). In the more efficient implementation of the algorithm in Theorem 1, the first block is set in any fixed sequence due to symmetry. Afterwards, each solution of the previous step, \(k-1\), is combined with each permutation of block k (|G|! permutations) except the last step. In the last step, group diversities of the previous step are sequenced in decreasing order and combined with the last blocks’ attribute values in increasing order. This leads to the best possible assignment.

Fig. 1 Procedure of our DP for eight blocks (worst case, sufficiently large \(\max av_i\))

Figure 1 describes the procedure of our DP, again for eight blocks. In the first step, the blocks are combined in pairs, i.e. within each pair one block is fixed (symmetry) and the other is added in all |G|! permutations (upper level in the figure). In the second step, all combinations of one pair (up to |G|! entries) of the previous step are combined with all permutations of another pair (up to |G|! entries times |G|! permutations) and so on (middle level in the figure). In the last step, direct assignments between the considered combinations (one in increasing and one in decreasing order) lead to the best solution (lower level in the figure); thus, the permutation factor |G|! can be omitted.

In total, Table 3 shows that our implementation is more efficient in the worst case. Nevertheless, in all transitions between \(s(k,*)\) and \(s(k+1,*)\) in (9) all new solutions, consisting of all previous solutions and all permutations of the current block, have to be evaluated. Therefore, this approach is very time intensive. Hence, we develop a different solution approach in the following subsection.

However, an interesting aspect of our DP lies in its asymptotic behaviour. For a block with |G| items with identical attribute values, all permutations yield the same result. These items can be assigned to the groups in any sequence, so it is unnecessary to include them in the solution approach. This means that at most \(\max _i av_i\) blocks have to be considered (if the attribute values in all these blocks vary only by one). Hence, the computational effort of our DP is bounded even if |K| is unbounded. However, the effort for finding these \(\max _i av_i\) blocks is unbounded if |K| is.

4.2 Mixed-integer program

We present a mixed-integer program to solve the optimization version of BMDGPIAV in this section. The following model is an adapted version of the one by Schulz (2021):

Sets

\(i \in I\): items (\(|I| = |G| \cdot |K|\))
\(g,g' \in G\): groups
\(k \in K\): blocks (|K| equals the number of items per group)

Parameters

\(\tilde{av}_i\): \(= \sum _{k=1}^{|K|} b_{ki} \cdot c_k \cdot |av_i - med|\), where \(av_i\) is the attribute value of item i, med is the median attribute value, and \(c_k\) is the corresponding block value from Lemma 2
\(b_{ki}\): is 1 if item i is in block k and 0 otherwise

Positive variables

\(div_{g}\): diversity within group g

Binary variables

\(x_{ig}\): is 1 if item i is assigned to group g and 0 otherwise

$$\begin{aligned}&F = \min \sum _{g,g' \in G:g'>g} \left( div_g - div_{g'} \right) \end{aligned}$$
(14)
$$\begin{aligned}&\text {with the constraints} \nonumber \\&(2), (4) \nonumber \\&\sum _{i \in I:b_{ki} = 1} x_{ig} = 1 \quad \forall g \in G, k \in K \end{aligned}$$
(15)
$$\begin{aligned}&div_g =\sum _{i \in I} \tilde{av}_i \cdot x_{ig}\quad \forall g \in G \end{aligned}$$
(16)
$$\begin{aligned}&div_g \ge div_{g'} \quad \forall g,g' \in G: g'>g \end{aligned}$$
(17)

Objective function (14) minimizes the differences of within-group diversities. Constraints (2) ensure that each item is assigned to exactly one group. Constraints (15) make sure that exactly one item of each block is assigned to each group (block constraints; compare Schulz 2021). Constraints (16) compute the within-group diversities while Constraints (17) eliminate symmetric solutions by ordering the groups in decreasing order of their within-group diversities (which is also the reason why we do not need the absolute values in (14)). Finally, Constraints (4) are the binary restrictions.

Often techniques based on a branch-and-bound procedure are used to solve models with binary variables. However, the LP-relaxation of our model is poor because \(x_{ig} = 1/|G|\) is a valid solution with \(div_g = div_{g'}\) for all \(g, g' \in G\) and, therefore, an objective value of zero. To reduce this drawback and improve the search, we introduce a technique to reduce the number of binary variables and a valid lower bound for the objective value in the following.

The model can be strengthened by eliminating all blocks in which all items have the same attribute value because the objective value is independent of their assignment. They can be assigned blockwise in a post-processing step in any order to the groups. Moreover, not all items of a block have to be modelled since only differences are relevant. Then, (15) has to be relaxed to

$$\begin{aligned} \sum _{i \in I:b_{ki} = 1} x_{ig} \le 1 \hspace{2cm} \forall g \in G, k \in K. \end{aligned}$$
(18)

For example, in a block with five items with attribute values 60, 60, 60, 65, and 70, it is sufficient to model the last two items with attribute values of \(5 \cdot c_k\) and \(10 \cdot c_k\) (\(c_k\) from Lemma 2). In a post-processing step, all within-group diversities are increased by \(60 \cdot c_k\), and the first three items are assigned (in any order) to the three groups to which neither of the last two items is assigned. By this, the number of modelled items can be reduced by a factor of at least \(\frac{1}{|K|}\).
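The reduction in this example can be sketched in a few lines of Python. The helper `reduce_block` and the value `c_k = 2` are hypothetical. The sketch follows the example above by shifting all values of a block by the block minimum and dropping the items that sit at the minimum.

```python
def reduce_block(attribute_values, c_k):
    """Drop items at the block minimum and shift the remaining ones.

    Returns (modelled, offset): the weighted values of the items that still
    have to be modelled, and the constant that is added to every within-group
    diversity in the post-processing step.
    """
    base = min(attribute_values)
    modelled = [c_k * (v - base) for v in attribute_values if v > base]
    return modelled, c_k * base


# Block from the example with an assumed block value c_k = 2.
print(reduce_block([60, 60, 60, 65, 70], 2))  # ([10, 20], 120)
```

A block whose items all share one attribute value yields an empty list of modelled items, which corresponds to eliminating the block entirely.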

Furthermore, the restriction to integer values allows us to compute a lower bound for F with the remainder of \(\frac{\sum _{i \in I} \tilde{av}_i}{|G|}\) (average group diversity). Let \(A = \left\lfloor \frac{\sum _{i \in I} \tilde{av}_i}{|G|} \right\rfloor \) and r be the remainder, i.e. \(r = \sum _{i \in I} \tilde{av}_i - A \cdot |G|\).

Theorem 5

A valid lower bound is given by

$$\begin{aligned} LB = \sum _{k=1}^{\min \left( r,\left\lfloor \frac{|G|}{2} \right\rfloor \right) } c_k - \sum _{k=\left\lfloor \frac{|G|}{2} \right\rfloor +1}^{\min (r,|G|)} c_k \end{aligned}$$

with \(c_k\) from Lemma 2.

Proof

We use Lemma 2 for this proof, i.e. the groups are the items and their diversities (div) the attribute values. Each block contains only a single group. This means \(|K| = |G|\) and each group forms its own block in decreasing order of their diversities \(div_g\) (in line with (17)). Therefore, we use k and g synonymously.

With the help of Lemma 2

$$\begin{aligned} F&= \sum _{g=1}^{|G|} c_g \cdot |div_g - med| = \sum _{g=1}^{\left\lfloor \frac{|G|}{2} \right\rfloor } c_g \cdot (div_g - med) + \sum _{g=\left\lfloor \frac{|G|}{2} \right\rfloor +1}^{|G|} c_g \cdot (med - div_g) \nonumber \\&= \sum _{g=1}^{\left\lfloor \frac{|G|}{2} \right\rfloor } c_g \cdot (div_g - div_{|G|+1-g}) \end{aligned}$$
(19)
$$\begin{aligned}&= \sum _{g=1}^{\left\lfloor \frac{|G|}{2} \right\rfloor } c_g \cdot div_g - \sum _{g=\left\lfloor \frac{|G|}{2} \right\rfloor +1}^{|G|} c_g \cdot div_g \end{aligned}$$
(20)

with \(med = div_{\left\lfloor \frac{|G|}{2} \right\rfloor }\) and because \(c_g = c_{|G|+1-g}\). Because of (17) and since \(c_1> c_2> \dots > c_{\lfloor \frac{|G|}{2}\rfloor } = c_{\lceil \frac{|G|}{2} \rceil }< \dots < c_{|G|}\), F is minimized if \(div_g = A + 1\) for \(g = 1, \dots , r\) and \(div_g = A\) for \(g = r+1, \dots , |G|\). Due to the definition of A it holds that \(div_g \ge A\) for all \(g \in G\). Thus, we can subtract A from \(div_g\) and \(div_{|G|+1-g}\) in (19) and accordingly from both \(div_g\) in (20). With \(div_g = 1\) for \(g = 1, \dots , r\) and \(div_g = 0\) for \(g = r+1, \dots , |G|\), (20) equals

$$\begin{aligned} \sum _{g=1}^{\min \left( r,\left\lfloor \frac{|G|}{2} \right\rfloor \right) } c_g - \sum _{g=\left\lfloor \frac{|G|}{2} \right\rfloor +1}^{\min (r,|G|)} c_g \end{aligned}$$

and the theorem follows.

Table 4 gives an example with five groups.

Table 4 Lower bounds (LB) for five groups

With the help of the \(c_k\) values in Lemma 2, LB can be computed as the sum of four (\(= c_1 = c_5\) if \(|K| = 5\)) times the difference between the values in columns 3 and 7 and twice (\(= c_2 = c_4\) if \(|K| = 5\)) the difference between the values in columns 4 and 6.
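The bound of Theorem 5 is cheap to evaluate. The Python sketch below takes the sum of all \(\tilde{av}_i\), the number of groups, and the \(c_k\) values as input. Lemma 2 is not restated here, so the closed form in `pairwise_coefficients` is an assumption: it reproduces the values quoted above for five groups (\(c_1 = c_5 = 4\), \(c_2 = c_4 = 2\)) and sets \(c_3 = 0\). Both function names are illustrative.

```python
def pairwise_coefficients(n_groups):
    """Assumed closed form c_k = | |G| + 1 - 2k | for k = 1..|G|."""
    return [abs(n_groups + 1 - 2 * k) for k in range(1, n_groups + 1)]


def lower_bound(total, n_groups, c):
    """Bound of Theorem 5; c[k - 1] holds c_k for k = 1..|G|."""
    r = total - (total // n_groups) * n_groups  # remainder of the average
    half = n_groups // 2
    upper = sum(c[k - 1] for k in range(1, min(r, half) + 1))
    lower = sum(c[k - 1] for k in range(half + 1, min(r, n_groups) + 1))
    return upper - lower


c = pairwise_coefficients(5)  # [4, 2, 0, 2, 4]
# Total diversity 10 + r with five groups gives A = 2 and remainder r.
print([lower_bound(10 + r, 5, c) for r in range(5)])  # [0, 4, 6, 6, 4]
```

Under this assumption the bound equals \(r \cdot (|G| - r)\), the pairwise imbalance of r groups with diversity \(A + 1\) and \(|G| - r\) groups with diversity A.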

5 Computational study

In this section, our computational study regarding the MIP (2), (4), (14)–(17) is presented. We add the factor \(\frac{2}{|G| \cdot (|G|-1)} = \frac{1}{\frac{|G| \cdot (|G|-1)}{2}}\) to the objective value and present \(\frac{2}{|G| \cdot (|G|-1)} \cdot F\) in the results. So, the objective value is the average absolute pairwise difference of group diversities. This makes the objective values comparable for different instance sizes. The dynamic program was implemented in C++. The MIP was implemented in GAMS (version 39) and solved with CPLEX (version 20.1). The first subsection explains the composition, the second presents the results.
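The reported measure can be illustrated with a short Python sketch. The function name is hypothetical. It computes the sum of absolute pairwise differences of the group diversities and scales it by \(\frac{2}{|G| \cdot (|G|-1)}\), i.e. it divides by the number of group pairs.

```python
from itertools import combinations


def average_pairwise_difference(div):
    """Average absolute difference over all pairs of group diversities."""
    n = len(div)
    total = sum(abs(a - b) for a, b in combinations(div, 2))
    return 2 * total / (n * (n - 1))


# Five groups, one of which exceeds the others by one unit: F = 4, 10 pairs.
print(average_pairwise_difference([1, 0, 0, 0, 0]))  # 0.4
```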

5.1 Composition

Attribute values are uniformly distributed between one and 500, as this covers most practical applications like surgery durations (in minutes), measures for academic achievements, and processing times (in minutes). Our approach uses the block constraints [see Schulz (2021) and (15)], i.e. the more items have the same attribute value, the easier our model can deal with the instance. A uniform distribution reduces such cases; so, we choose instances which are more complicated for our model.

Table 5 Tested parameter settings

The tested instances are presented in Table 5. The number of groups is varied from five to 15 and additionally set to 20. For each of them, four, ten, 50, 100, 200, and 300 blocks are considered. Note that instances with three blocks can be solved optimally by a direct assignment (Schulz 2021). For each parameter setting, we generated 60 instances independently of each other. The time limit was set to ten minutes, i.e. the search was interrupted after ten minutes.
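An instance of this kind could be generated as in the following Python sketch. It is not the generator used for the study. The function name, the seed handling, and the way blocks are formed (sorting all items by attribute value in decreasing order and cutting the sequence into consecutive chunks of |G| items) are assumptions made for illustration.

```python
import random


def generate_instance(n_groups, n_blocks, low=1, high=500, seed=None):
    """Draw |G| * |K| integer attribute values uniformly from [low, high].

    Returns a list of blocks; each block holds |G| attribute values.
    Assumption: blocks are consecutive chunks of the sorted item list.
    """
    rng = random.Random(seed)
    values = [rng.randint(low, high) for _ in range(n_groups * n_blocks)]
    values.sort(reverse=True)
    return [values[k * n_groups:(k + 1) * n_groups] for k in range(n_blocks)]


blocks = generate_instance(n_groups=5, n_blocks=4, seed=1)
print(len(blocks), [len(b) for b in blocks])  # 4 [5, 5, 5, 5]
```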

5.2 Results

Table 6 Results for the dynamic program

Table 6 presents the results regarding the DP in Sect. 4.1. We only present results for the first four settings with four blocks, as larger instances required more than ten minutes of computation time. The table shows the number of items |I|, the number of blocks |K|, and the number of groups |G| in the first three columns. Afterwards, the average objective value over all 60 instances \(\varnothing \)F and the average computation time are presented. The DP solves all instances to optimality. For the first three settings, the DP requires only up to a few seconds to solve the instances. For the next larger setting, which has only four more items, more than 120 s are already needed. This effect is explained by the factorial dependence on the number of groups shown in Table 3. Although the DP is only helpful for small instances, it is superior to the MIP for these instances, as we will see in the following.

Table 7 Results of the computational study—best-bound search

The results for best-bound search are shown in Table 7. The first three columns show the parameter settings containing the numbers of items |I|, blocks |K|, and groups |G|. The fourth column gives the average objective value of all instances where a feasible solution was found. The following column presents the average objective value of all optimally solved instances. The sixth column contains the average time required to solve them, and the last three columns give the number of optimally solved instances, the number of optimally solved instances in which the optimal solution was the lower bound, and the number of instances in which a solution was found within the time limit of ten minutes.

For five to seven groups, all instances were solved to optimality within seconds. Apart from the smallest setting with four blocks, all instances with at most nine groups were solved to optimality within ten minutes. From ten groups on, an increasing number of instances of the biggest setting with 300 blocks could no longer be solved to optimality. From 12 groups on, the same is true for the remaining settings. The same \(\cap \)-structure can be observed for the number of solutions found for at least 14 groups. The contrary \(\cup \)-structure can be observed for the average computation times.

The reason for this interesting relation might be a trade-off between more difficult instances due to instance size, and therefore higher computation times, and the high average objective values for small numbers of blocks. A high objective value leads to a larger gap to a perfectly balanced assignment (objective value: zero) and to our lower bound. So the solver has to increase the lower bound to close this gap (compare column #(LB = opt.)). But in a relaxed (fractional) problem it is easy to find a well-balanced solution. Thus, it is hard to increase the lower bound. In fact, the lower bound improved in only 23.2% of all 426 instances in which a solution was found but optimality was not proven within ten minutes (all with four blocks, compare Table 8). Another indicator is the average objective values, which, for small instances with up to 13 groups (four blocks), are smaller than the average objective values of all optimally solved instances. So, CPLEX was able to find good solutions but could not close the gap. In summary, the model is fast because it often finds a solution which meets the lower bound (compare columns #opt. and #(LB = opt.)). If there is no solution meeting the lower bound (especially if \(|K| = 4\)) or instances are large enough, the model needs significantly more computation time.

Table 8 Bound improvement—best-bound search

Korf (2009) mentioned that for multi-way number partitioning exactly the opposite \(\cap \)-structure is observed for computation times. Small instances are easy to solve due to the small number of possible solutions. Large instances are again easy to solve because there is only a small number of possible subset sum differences in comparison with the number of possible partitions. Schreiber (2014) investigated exact algorithms for multi-way number partitioning in his dissertation. In one part, a cached iterative weakening algorithm is proposed with which instances with up to 60 items and 12 groups were solved optimally for large numbers (Schreiber and Korf 2014). For the same instance sizes, an overview of state-of-the-art algorithms subject to the number of items and groups is given by Schreiber et al. (2018). Frias Faria et al. (2019) implemented a variable neighbourhood search approach, which performs well for instances with three or four groups.

A reason for the difference in the size of solvable instances between our problem and multi-way number partitioning lies in the data structure, since we draw attribute values uniformly between one and 500 (due to realistic sizes in applications), while Schreiber et al. (2018) considered uniformly distributed samples in the range [\(1,2^{48}-1\)]. We did another test with the parameter setting 600 items, six groups, 100 blocks, and attribute values between one and 1,000,000. Only 15 out of 60 instances were solved optimally within ten minutes (in 14 instances the optimal objective value equals the lower bound in Theorem 5). CPLEX needed 292.8 s on average for them, compared with an average time of 0.81 s to proven optimal solutions for instances with attribute values between one and 500. Thus, there is a strong dependency on the range of possible attribute values. A reason is that the number of blocks with identical attribute values will be much smaller for a wider range. Another reason might lie in the more restrictive structure of our problem. In contrast to multi-way number partitioning, we only allow one item of each block to be assigned to a group, and hence restrict the number of possible assignments, and require that each group gets the same number of items assigned. Furthermore, the \(c_k\) values of Lemma 2 restrict the set of possible products with attribute values. Nevertheless, algorithms from the multi-way number partitioning literature might be adaptable for BMDGPIAV, especially to solve instances with a small number of blocks more efficiently.

In most of the instances which could be solved to optimality, the optimal solution equals the lower bound in Table 7. We already discussed that it is hard to improve the lower bound, which explains why only a few of the other instances could be solved to optimality. A reason why it is not that hard to find a good solution is the limited range of possible attribute values and therefore \(\tilde{av}_i\) values. In the last paragraph, we discussed that a larger range for attribute values leads to worse results. Given the restricted range between 1 and 500, we have blocks in which each item has the same attribute value (up to 6000 items). These blocks can be ignored in the solution procedure and added afterwards in any feasible way. Thus, instance sizes reduce. Moreover, the small range makes it likely that attribute values within a block vary only by a small number. Hence, it is likely that a difference in the assigned diversity of one block can be balanced by another block. The chance to balance the groups’ diversities increases with an increased number of blocks. An easy strategy would be to always assign the next block’s items in decreasing order of their \(\tilde{av}_i\) values to the groups ordered in increasing order of their diversity so far.
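The easy strategy described at the end of this paragraph can be sketched in Python as follows. This heuristic was not part of the computational study. The function name and the input format (one list of weighted values \(\tilde{av}_i\) per block, with one item per group) are assumptions.

```python
def greedy_assignment(blocks):
    """Assign each block's items to the groups one block at a time.

    blocks[k] holds the weighted values of the items in block k. The largest
    item of a block goes to the group with the smallest diversity so far.
    Returns the group diversities and, per group, its (block, item) pairs.
    """
    n_groups = len(blocks[0])
    div = [0] * n_groups
    groups = [[] for _ in range(n_groups)]
    for k, block in enumerate(blocks):
        # Items in decreasing order of value, groups in increasing order of
        # their diversity so far; both sorts are stable.
        items = sorted(range(len(block)), key=lambda i: block[i], reverse=True)
        order = sorted(range(n_groups), key=lambda g: div[g])
        for i, g in zip(items, order):
            groups[g].append((k, i))
            div[g] += block[i]
    return div, groups


div, groups = greedy_assignment([[3, 1, 0], [2, 2, 0]])
print(div)  # [3, 3, 2]
```

The strategy gives no optimality guarantee in general. In this toy instance the resulting diversities differ by at most one, which is the best possible here because the total of 8 is not divisible by three.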

Table 9 Results of the computational study—depth-first search

The results in Table 7 were computed with best-bound search, the standard GAMS-CPLEX option. Because of the decreasing number of instances in which solutions were found and the bad objective values found for a larger number of groups, we repeated our computational experiments with the depth-first search adjustment in GAMS-CPLEX. The results are presented in Table 9. In comparison, best-bound search leads in general to more optimal solutions. Furthermore, as expected, objective values of large instances in which no proven optimal solution was found decreased for depth-first search.

Moreover, as expected, computation times generally increased with the number of groups for both search strategies when the number of blocks was fixed. In conclusion, the model works well for up to 15 groups, with limitations for small and large numbers of blocks.

6 Conclusion

The paper introduces an approach for the assignment of items to groups according to their integer attribute values. As the block constraints lead to a maximal total diversity (Schulz 2021), we search for the best balanced solution of all optimal solutions of the MDGP. The paper introduces a new lower bound for the objective value and a variable reduction technique for the problem. Our computational study (Sect. 5) showed that the model works well for up to 15 groups in general (except for very small and very large instances). The main reason for the good performance is the lower bound, which is often met by optimal solutions.

One of the possible applications mentioned in the literature review (Sect. 2) is surgery scheduling. Several authors mentioned the importance of a balanced surgery schedule: van Oostrum et al. (2008) stated that an unbalanced surgery schedule often leads to demand fluctuation in other departments like surgical wards or intensive care units. Beliën et al. (2006) developed a master surgical schedule which levels bed occupancy. Marcon et al. (2003) balanced the workload over all operating rooms referring to the number of beds in the Postanesthesia Care Unit. Ogulata and Erol (2003) provided an approach to balance the distribution of surgeries between surgeon groups with respect to surgery days and length of surgery times. Adan et al. (2011) stated that master surgical schedules can lead to balanced use of resources such as beds, operating rooms, and nursing staff and Vanberkel et al. (2011) pointed out that it should lead to a balanced workload.

This indicates that there are several different resources that should be balanced in a surgery schedule to minimize resource requirements or, relating to staff, overtime. We think that there is further research demand in integrated approaches which allow balanced schedules with respect to many different resources. One possible approach is given in this paper. Further research demand lies in evaluating whether other resource requirements can be integrated in this approach in such a way that surgery durations as well as these resource requirements are balanced over days, and how this affects other stakeholders like surgeons, nurses, and elective patients. Besides, further research demand lies in the application to other practical problems and in the relation to multi-way number partitioning (compare Sects. 3 and 5). Future research might also focus on effective heuristic solution approaches, especially for instances with a small number of blocks and large instance sizes.