Introduction

New forms of transport services have been emerging over the years with the support of digitization (Cruz and Sarmento 2020). Focusing on trip planners, these services provide integrated solutions whose main focus is assisting transportation-related choices (Mazzoncini et al. 2020). Beyond the basic approach of finding the shortest path between the origin and the destination, trip planners are mobility assistants that involve decisions on the modes of transport, trip duration, or trip costs utilizing different strategies (Rocha et al. 2021). However, it is important to keep in mind that route optimization is the main goal of trip planners, which can be realized using different criteria. Sourlas and Nathanail (2019) report in their route optimization analysis that 88% optimize routes in terms of time, 36% in terms of distance, 16% in terms of costs, 5% in terms of environmental pollution, and 1% in terms of calorie consumption. Route optimization may also be realized in terms of combined criteria. For instance, Dib et al. (2017) optimize the routes of travelers in a multimodal transport network by considering travel time, number of changes, and walking time criteria. Torres et al. (2018) develop a pedestrian-oriented trip planner using a multi-objective function to minimize the distance traveled, limit upward and downward slopes, and maximize the use of green zones. Another example is the one proposed by Georgakis et al. (2020), where the routes are optimized based on the minimization of distance considering the availability of transport alternatives and the preferences of the travelers to use those alternatives. Esztergár-Kiss (2020) proposes a novel approach known as the activity chain optimization (ACO), a model that aims to optimize the daily activity scheduling of travelers by minimizing the total time of their tours, considering the duration and flexibility of the activities, and a time window to perform such activities.

This ACO approach is formulated as an extension of the Traveling Salesman Problem (TSP) with Time Windows (TSP-TW). The TSP-TW involves the design of a minimum cost path for a traveler who visits a set of nodes. Each node is visited exactly once and the service at a node must begin within the time window defined by the earliest and the latest time when the start of service is permitted at that node, as proposed by Dash et al. (2011). The added value of the ACO includes considering the flexibility of the activities performed by a traveler in a node or location, where the traveler can access some facilities. The term activity chain seems to be closely related to the definition of the trip chain, but the difference is that the former has an activity-oriented approach since the characteristics (e.g., flexibility) of the activities are considered as the basic element, whereas in the latter case, a trip-based approach looks at the trip between the activities (Bowman and Ben-Akiva 2001). In the activity chain context, the flexibility of an activity refers to the flexibility of a traveler towards schedule deviation in time and space for an activity. An activity is assumed to be fixed or flexible, either temporally or spatially (Arentze and Timmermans 2000), where the optimization is based on the premise that the traveler looks for an activity chain that maximizes gains and minimizes costs (Liu et al. 2021).

The combination of the TSP-TW backbone and the activity-oriented approach positions the ACO as a suitable framework for two distinct applications: the development of a scheduling optimizer tool and the construction of an activity-based model grounded in optimization principles. In the context of a schedule optimization tool, it enables the generation of optimal daily travel plans, which can be interactively presented to end users to help them with their travel decisions (Vaughn et al. 1999). This process involves users logging their activities and their associated attributes in advance to obtain an optimal plan. Conversely, in the realm of an activity-based model, the ACO framework is adept at generating ideal activity schedules that serve as benchmarks for assessing real-world behaviors. The outcomes obtained in the latter application provide essential information regarding trip purpose, destination, scheduling, and mode selection, which can subsequently be applied for network assignment (Miller 2021).

The literature on solving the trip scheduling problem is broad. Several articles have been published on new mathematical formulations, problem solutions using heuristic algorithms, new criteria for the optimization process, and new applications. For example, Papalitsas et al. (2019) express TSP-TW as a quadratic unconstrained binary optimization model to overcome the form of the inequality constraints of the time windows. Fachini and Armentano (2020) propose an exact and heuristic dynamic programming algorithm to solve TSP considering the flexibility of the time windows, while Bretin et al. (2021) model the functioning of postal services using the TSP-TW approach. However, the use of the TSP-TW in activity-based approaches is barely present in the literature, which leaves little basis for applying the ACO in the realm of an activity-based model. It is important to keep exploring the promising connection between optimization approaches and activity-based models, as they seem to capture the complex trade-offs between scheduling decisions for multiple activities, such as the impact of dedicating more time to one activity on the availability for others, or how the sequencing of activities influences travel durations (Pougala et al. 2022). Consequently, these complex trade-offs increase the complexity of those models and their runtimes in both applications. Fortunately, the arrival of artificial intelligence-based methods, such as metaheuristic algorithms, makes it possible to reduce computational costs and to find solutions to optimization problems in less time.

Related to the activity chain approach, simulations are carried out using either small examples of activity chains or small datasets, where studies focus more on examining the structural functioning of the optimization problem (Esztergár-Kiss and Rózsa 2015). Similarly, Cuchý et al. (2018) test their electric vehicle-based tour planner on a small set of activity chains, while Esztergár-Kiss et al. (2018) generate a small set of activity chains to represent a quasi-typical case of real-world travel patterns. If the evaluation has a large-scale approach, synthetic data is used in the activity chain approach. van Heerden and Joubert (2014) create a large synthetic dataset of vehicle activity chains. Felbermair et al. (2020) generate a synthetic population with activity chains. Hence, to the best of the authors' knowledge, no real activity chains have been tested with the ACO framework, and in particular no large household survey data have been assessed. The evaluation of big datasets is important to optimize the processing of information (Gómez-Marín et al. 2023), which leads to time-saving in the process of routing optimization. Thus, the purpose of the present paper is to address how the ACO method works by using real-world mass data as input, as well as how the method solving the optimization problem performs in the case of large datasets and how much travel time reduction can be achieved. The underlying motivation for this study lies in the rationale that robust input data can reveal the weaknesses of models, facilitating their mitigation to enhance both the computation speed and quality of optimization outcomes (Bast et al. 2013; Botea et al. 2019). Consequently, conducting the evaluation in this study is significant because it reveals the performance of solving this optimization problem, whether used for end-user tour optimization or within the context of activity-based modeling.

The remainder of this paper is organized as follows. In Related literature, concerns about trip scheduling, the optimization of activity chains, and similar activity-based models are reviewed. The Methods section describes the methods utilized to obtain the data, the mathematical formulation of the ACO method, and the optimization steps of the algorithm. In the Computational experiment and results, the computational experiment is described, the results of the optimization problem are analyzed, and the performance of the solving algorithm is evaluated. In the Discussion, a discussion about the findings of the results, the limitations of the study, and future research directions are presented. Finally, a summary of the study is presented in the Conclusion.

Related literature

The essential characteristics of activity-based scheduling problems are identified and gathered in Table 1, which reflects the data input utilized, the transport network considered, the features of the activities, the objective function of the problem, its constraints, and the solving methods. Column “Article” shows the name of the authors and the topic addressed, while column “Goal” refers to the aim of the article, whether it focuses on optimization, proposes a model, or conducts an experiment. Column “Data input” deals with the type of data utilized in the research, which can be an artificial benchmark or based on real-world Points of Interest (POI). Column “Instances size” refers to the number of test cases utilized in the research, which is considered small when less than 50 instances were employed, and large for more than 50. Column “Transport network” is related to the type of transport modes covered, which can be unimodal when only one mode is considered, or multimodal when more modes are included. Column “Flexibility” describes if spatial or temporal flexibility of activity is considered in the elaborated approach. The “Objective function” comprises the distinction of a utility or a cost function considering a single criterion or a multi-criteria optimization. The term “Constraint” is related to the inclusion of time windows and demand times in the optimization. Finally, the “Solving method” refers to the applied algorithm to solve the optimization, whether exact solution-based algorithms or heuristic-based algorithms are used.

Table 1 Summary and comparison of the related literature

The first group of the reviewed articles embraced TSP either including time windows or demand times, but most of them do not really cover the activity-based approach. More specifically, Hou et al. (2022) model a multi-depot vehicle routing problem as part of the integration of the production process of a set of jobs with the distribution problem of these completed jobs. The input data utilized to assess the proposed model is based on three types of benchmark problems with 30 instances. The transport mode includes only private cars. The flexibility of the jobs scheduled is considered only in the temporal dimension, whereas the optimization aims to minimize a single criterion (i.e., time) objective function with a time window constraint. The selected optimization algorithm is a metaheuristic known as brainstorm optimization. Another example, Kirschstein and Bierwirth (2018) model a multi-criteria optimization problem based on TSP, where the aim is to minimize the GHG emission allocated to a trip considering unimodal transport mode. The model includes time windows, and temporal flexibility. The instances utilized are 360 artificially generated benchmarks to be assessed using a heuristic algorithm called large neighborhood search. Crişan et al. (2021) analyze the approach of secure transport systems and model a variant named secure-TSP, which is an optimization problem with additional security constraints. These constraints are criteria related to risks present in road transport, such as natural risk, willful risk, and environmental risk. However, no experimental optimization is performed, and therefore no benchmark instances or solving algorithms are used. Furthermore, Sabry et al. (2018) model and optimize a TSP, where the objective function is to reduce the total emission of different transport modes, calculated by a multi-criteria approach. 
Özcan and Kaya (2018) optimize the simplest TSP, which is integrated into a real-world touristic application in Ankara and Rome, where real POIs are considered. However, only unimodal transport is included, and neither time window constraints nor the flexibility of activities is comprised. The solving algorithm utilized is an exact search algorithm and a heuristic hill climbing algorithm to solve a single criterion path distance minimization TSP.

The second group of papers focuses on optimizing travel itineraries and mobility services although not necessarily a TSP-based model is approached. For example, da Silva et al. (2018) model and optimize the problem of elaborating travel itineraries using a multi-criteria approach, such as visitor profiles, distances, and costs. However, neither the time windows constraint nor the spatial flexibility of the locations is considered. The transport mode is unimodal, and the instances are a few real solutions from Paraná city. The algorithms to solve are an exact-based CPLEX framework and a metaheuristic tabu search. This scheduling problem is solved using heuristics. Furthermore, Kurata and Hara (2014) optimize a computer-aided tour planner including real POIs of Tokyo together with their time windows and the temporal flexibility of the plan optimization. The objective function contains multi-criteria maximization, but only walking mode is available. The optimization is performed using a genetic algorithm (GA) and is tested with the support of 56 users. Torres et al. (2018) model an intelligent multi-criteria personalized route assistant focusing on pedestrians, which optimizes the distance, the slopes, and the use of green zones. However, neither flexibility is considered, nor the time windows constraint. The model is evaluated using real POIs of Granada, Spain using an exact algorithm.

The third group of the reviewed literature deals with activity-based trip chaining but is not necessarily applying the TSP method. For example, Gao et al. (2019) analyze and model an activity-based trip chaining method including a multi-criteria approach focusing on parking fees of private cars. However, neither the constraints related to time windows, nor the flexibility of activities are comprised. The model is solved using a simulated annealing metaheuristic in two created instances. Balać and Axhausen (2017) model and optimize a methodology for activity scheduling that allows changes in the daily schedules based on temporal and spatial dimensions using a multi-criteria utility function that includes opening times, waiting times, and demand times of the activities in a multimodal transport network. The model is solved using heuristic rules and is tested on a large set of created instances based on the population of Zurich. However, the framework includes the synthetic creation of activity schedules, which is not realistic since user preferences are not considered in the scheduling process, where activities may be freely swapped without restrictions. Krishnamurthi et al. (2021) analyze a case of a mobile application for task scheduling in smart city infrastructure. The application includes spatial and temporal flexibility of the schedules using heuristic rules. The scheduling of user preferences is based on the demand time of the tasks, but not the time windows of the locations. Real POIs are obtained by geocoding, but only car mode is available as a transport mode. Further, Kwasnik et al. (2017) model and optimize an online mobility activity-based recommendation system that uses real-world POIs and takes into consideration the demand times of the users to perform activities in a multi-criteria optimization approach. The model comprises a multimodal transport network and is tested on 256 real instances using an exact solving algorithm. 
Finally, Esztergár-Kiss et al. (2020) randomly generate 30 test cases within the area of Budapest to solve an undirected weighted graph model of the activity chain optimization. However, the preferences of the travelers are considered in a second stage of the optimization.

From the reviewed literature, it can be summarized that if a problem is covered with an activity-based approach, time windows constraints are only partially considered. In addition, most articles covered in this research do not cover spatial flexibility of activities when solving their models, especially those involving the TSP, since by definition the activity locations are predefined. Additionally, if the problem is covered as a TSP, usually artificial instances are created. Generally, the number of created instances is small (less than 50). Only in one paper do the authors use real POIs and evaluate their optimization with a large number of instances simultaneously. In addition, most solutions use real-world POI data, but only a few articles utilize survey-based data. The contributions of the current study are: (1) the development of a mixed integer linear programming (MILP) model that integrates the preferences of the travelers, which is based on the flexibility of their activities, (2) the creation of real-world activity chain instances and their conversion to coordinate points for testing the ACO method, and (3) the evaluation of the ACO method on real-world mass data using a metaheuristic and an exact-based approach. The adaptable architecture of the ACO for developing travel personalization tools and activity-based modeling has been previously addressed by Rizopoulos and Esztergár-Kiss (2020) and Dingil and Esztergár-Kiss (2023), respectively. Hence, discussions on these aspects fall outside the scope of the current investigation. Instead, our study takes a step forward by focusing on the performance evaluation of the optimization process for both purposes indiscriminately.

Methods

The complete procedure of the activity chain optimization comprises three main stages: the creation of the benchmark data, the benchmark data preparation, and the optimization, as shown in Fig. 1.

Fig. 1 Framework schema of the optimization process

Benchmark data

In order to create the benchmark data, three primary datasets were utilized: (1) The Regional Household Travel Survey Data (SACOGHTS) obtained from the Sacramento Region Transportation Study (Gao 2018), which contains daily mobility schedules within Traffic Analysis Zones (TAZs); (2) The Points of Interest (POIs) dataset of businesses, shops, and locations offering various facilities within the city of Sacramento (SACOGPOI) created from OpenStreetMap (OSM), which includes information such as names, coordinates, and operating hours (Raifer 2019); and (3) The geospatial dataset of TAZs for the Sacramento region (SACOGTAZ) obtained from (SACOG 2016).

Benchmark data preparation

SACOGHTS is a survey dataset of the Sacramento Region Transportation Study conducted by the Sacramento Area Council of Governments from 10 April 2018 to 21 May 2018. The dataset was collected from 8321 individuals across six counties, resulting in a total of 146,000 linked trips and 34,000 complete person-days. The dataset includes detailed information on the places visited by each individual on each day, including the trip purpose, origin and destination locations (as represented by TAZ codes), departure and arrival times, trip duration, and transport mode. In order to optimize the activity chains, the dataset was transformed from a place-per-row format to an activity chain-per-row format, as illustrated in Fig. 2. Additionally, activity chains with lengths of 15 activities or more were filtered out.
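The rearrangement described above can be illustrated with a minimal sketch. The field names (`person_id`, `day`, `arrival_min`, `departure_min`, `purpose`, `taz`, `mode`) are hypothetical and are not the actual SACOGHTS column names. The sketch also assumes that the processing time of an activity can be approximated as departure minus arrival at a place. It groups place records by person-day, orders them chronologically, and drops chains with 15 or more activities.

```python
from collections import defaultdict

MAX_CHAIN_LENGTH = 15  # chains with 15 or more activities are filtered out


def build_activity_chains(place_rows):
    """Turn place-per-row records into one ordered chain per person-day."""
    grouped = defaultdict(list)
    for row in place_rows:
        grouped[(row["person_id"], row["day"])].append(row)

    chains = {}
    for key, rows in grouped.items():
        if len(rows) >= MAX_CHAIN_LENGTH:
            continue  # too long, excluded from the benchmark
        ordered = sorted(rows, key=lambda r: r["arrival_min"])
        chains[key] = [
            {
                "purpose": r["purpose"],
                "taz": r["taz"],
                "mode": r["mode"],
                # assumed: processing time = departure - arrival at the place
                "duration_min": r["departure_min"] - r["arrival_min"],
            }
            for r in ordered
        ]
    return chains


# Hypothetical records for one person-day (times in minutes from midnight).
example_rows = [
    {"person_id": 1, "day": 1, "purpose": "work", "taz": 101, "mode": "car",
     "arrival_min": 480, "departure_min": 1020},
    {"person_id": 1, "day": 1, "purpose": "home", "taz": 100, "mode": "car",
     "arrival_min": 0, "departure_min": 450},
    {"person_id": 1, "day": 1, "purpose": "shopping", "taz": 102, "mode": "car",
     "arrival_min": 1040, "departure_min": 1080},
]
example_chains = build_activity_chains(example_rows)
```

Each key of `example_chains` is one person-day, and each value is the ordered list of activities that would occupy a single row in the chain-per-row format.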

Fig. 2 Dataset rearrangement schema

The POIs in the SACOGPOI dataset were extracted from OSM, which utilizes tags to represent physical features on the ground. Based on these tags, the relevant data, such as coordinates, names, and descriptions, were extracted. The relevant POIs were selected based on their OSM tags (shown in Appendix 1), which represent locations where an individual can engage in a variety of activities (OpenStreetMap Wiki 2022), such as eating and shopping. Additionally, the opening times for each POI were obtained through the Foursquare Places API (Foursquare 2022). This resulted in the creation of a dataset containing POIs with amenity locations, opening times, and coordinates.

The locations in the SACOGHTS dataset were initially retrieved at the TAZ level and subsequently disaggregated to coordinate points by combining the TAZ coding from the SACOGTAZ dataset and the POIs from the SACOGPOI dataset. The disaggregation method is detailed by Toaza and Esztergár-Kiss (2023). First, the trip purposes in the SACOGHTS dataset are converted into the corresponding OSM POI tag values considering the potential locations where each activity can be performed. For instance, the activity purpose “college/university” can be performed at OSM locations with tag values “college”, “university”, or “research_institute”. Then, a list of POIs within the corresponding TAZ and corresponding converted trip purpose is obtained. From this list, a POI is selected using a random normal distribution. Additionally, to facilitate the analysis of transport mode choices, the individual modes of transportation utilized in each trip within the activity chains of SACOGHTS were clustered into broader categories. These categories, based on similarities in vehicle types, include private car (car, taxi), public transport (transit, school bus), and walk (walk, bike). A final benchmark dataset containing 5274 activity chains was obtained after data cleansing (e.g., erasing activity chains where the respondents did not report the activity purpose), which includes information on the start location and trip purpose, coordinate points of the location where the activity took place, transport mode choice, and processing time of the activity performed. This input dataset of activity chains can then be used in the optimization process.
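The disaggregation and mode clustering steps can be sketched as follows. This is an illustration under stated assumptions, not the procedure of Toaza and Esztergár-Kiss (2023). Only the “college/university” mapping and the three mode clusters come from the text. The second purpose mapping, the POI record layout (`taz`, `tag`, `name`), and the way a normally distributed draw is turned into a list index (centered on the middle of the candidate list and clamped to its bounds) are all hypothetical.

```python
import random

# Purpose-to-tag mapping; only the first entry is taken from the text.
PURPOSE_TO_OSM_TAGS = {
    "college/university": {"college", "university", "research_institute"},
    "eat meal": {"restaurant", "cafe", "fast_food"},  # assumed for illustration
}

# Mode clusters as listed in the text.
MODE_CLUSTERS = {
    "car": "private car", "taxi": "private car",
    "transit": "public transport", "school bus": "public transport",
    "walk": "walk", "bike": "walk",
}


def candidate_pois(pois, taz, purpose):
    """POIs inside the given TAZ whose tag matches the converted purpose."""
    tags = PURPOSE_TO_OSM_TAGS.get(purpose, set())
    return [p for p in pois if p["taz"] == taz and p["tag"] in tags]


def pick_poi(candidates, rng):
    """Select one candidate using a normally distributed index.

    Assumed interpretation: the index is drawn around the middle of the
    list and clamped to the valid range.
    """
    if not candidates:
        return None
    mid = (len(candidates) - 1) / 2
    spread = max(len(candidates) / 4, 1e-9)
    idx = round(rng.gauss(mid, spread))
    idx = min(max(idx, 0), len(candidates) - 1)
    return candidates[idx]


# Hypothetical POI records.
example_pois = [
    {"name": "Campus A", "taz": 7, "tag": "university"},
    {"name": "Institute B", "taz": 7, "tag": "research_institute"},
    {"name": "Diner C", "taz": 7, "tag": "restaurant"},
    {"name": "College D", "taz": 8, "tag": "college"},
]
example_candidates = candidate_pois(example_pois, 7, "college/university")
example_choice = pick_poi(example_candidates, random.Random(42))
```

Passing a seeded `random.Random` instance keeps the selection reproducible, which helps when the same benchmark has to be regenerated.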

The optimization method

In the following subsections, the model for the ACO is formulated, and the description of the extension from TSP-TW towards an activity-based approach is treated. This is followed by the description of the algorithm to solve the optimization problem, and the assumptions established to run the optimization framework.

The ACO formulation

The ACO originally proposed by Esztergár-Kiss et al. (2020) is an extension of the TSP-TW aiming to find the most efficient sequence of activities at a set of locations for a traveler while considering the time window for when the location can be visited, the duration of the activity, as well as the desired time window for the traveler when they want to visit each location. The traveler must visit the set of locations and return to the starting location, visiting each location only once, and doing so within the given time windows. Furthermore, the ACO captures the preferences of the traveler to perform an activity based on their flexibility to deviate from their original schedule in terms of time and space. These preferences are represented through a label system (Table 2). Certain activities that are performed regularly are fixed in time and space for a traveler (e.g., commutes to school or work). Conversely, the activities that are performed less frequently can be flexible in time (e.g., picking up a package from the post office at any time of the day), space (e.g., having lunch in any nearby restaurant), or both flexible in time and space (e.g., buying a transport ticket in any of the stations at any time).

Table 2 Flexibility of activities

This model was represented using an undirected weighted graph \(G=\left(A,E\right)\) where \(A=\left\{{a}_{0}, {a}_{1},{a}_{2},\dots {a}_{n},{a}_{n+1}\right\}\) is the set of activities and \({a}_{i}\) denotes the \(i\)th activity. By convention, the activities \({a}_{0}\) and \({a}_{n+1}\) are called home. \(E=\left\{\left(i,j\right)\hspace{0.2em}|\hspace{0.2em} i,j\in A\hspace{0.5em} {\text{and}}\hspace{0.5em}i\ne j\right\}\) is the set of the edges representing the travel times between two activities. In this study, we present a new MILP model formulation. Unlike the original model, the proposed formulation incorporates the flexibility labels of the activities directly in the MILP model.

The additional elements required to define an activity chain are the durations of each of the \(n\) activities, \({d}_{{a}_{i}}\) where \({d}_{{a}_{i}}>0\); a variable set of possible locations \(L=\left\{{\ell}_{0},{\ell}_{1},{\ell}_{2},\dots ,{\ell}_{m}\right\}\) to perform each \(i\)th activity \({a}_{{i}^{\left\{L\right\}}}\); the travel times between all pairs of activity locations \({a}_{{i}^{\left\{L\right\}}}\) and \({a}_{{j}^{\left\{L\right\}}}\), \({t}_{{a}_{{i}^{\left\{L\right\} }}{a}_{{j}^{\left\{L\right\}}}}\) where \({t}_{{a}_{{i}^{\left\{L\right\} }}{a}_{{j}^{\left\{L\right\}}}}\ge 0\); the time windows associated with each location in \(L\), \([TW{s}_{{a}_{i^{\left\{L\right\}}}},TW{e}_{{a}_{i^{\left\{L\right\}}}}]\); and the time windows associated with the desire of the traveler to visit an activity location \(i\), \([TD{s}_{{a}_{{i}^{\left\{L\right\}}}},TD{e}_{{a}_{{i}^{\left\{L\right\}}}}]\). Furthermore, the variables required to estimate the activity chain’s solution are:

$${A}_{{a}_{{j}^{\left\{L\right\}}}}={D}_{{a}_{{i}^{\left\{L\right\}}}}+{t}_{{a}_{{i}^{\left\{L\right\}}} {a}_{{j}^{\left\{L\right\}}}}$$
(1)
$$wai{t}_{{a}_{{i}^{\left\{L\right\}}}{ a}_{{j}^{\left\{L\right\}}}}=\left\{\begin{array}{ll} TW{s}_{{a}_{{j}^{\left\{L\right\}}}}-{A}_{{a}_{{j}^{\left\{L\right\}}}}, &\quad TW{s}_{{a}_{{j}^{\left\{L\right\}}}}-{A}_{{a}_{{j}^{\left\{L\right\}}}}>0\\ 0, &\quad otherwise\end{array}\right.$$
(2)
$${D}_{{a}_{{j}^{\left\{L\right\}}}}={A}_{{a}_{{j}^{\left\{L\right\}}}}+wai{t}_{{a}_{{i}^{\left\{L\right\} }}{a}_{{j}^{\left\{L\right\}}}}+{d}_{{a}_{{j}^{\left\{L\right\}}}}$$
(3)

where \({A}_{{a}_{{j}^{\left\{L\right\}}}}\) is the arrival time to an activity location \({a}_{{j}^{\left\{L\right\}}}\) from a location \({a}_{{i}^{\left\{L\right\}}}\), \({D}_{{a}_{{j}^{\left\{L\right\}}}}\) is the departure time, and \(wai{t}_{{a}_{{i}^{\left\{L\right\}}}{ a}_{{j}^{\left\{L\right\}}}}\) is a waiting time in case the traveler arrives earlier than the start of the time window. Whenever \(wai{t}_{{a}_{{i}^{\left\{L\right\}}}{ a}_{{j}^{\left\{L\right\}}}}>0,\) the traveler can wait a constant maximum waiting time \({wait}_{max}\). The resulting MILP model for the activity chain optimization is:

$$\mathrm{min}\hspace{0.3cm}\,T=\sum_{i=1}^{n}\sum_{j=1}^{n}\left({t}_{{a}_{{i}^{\left\{L\right\} }}{a}_{{j}^{\left\{L\right\}}}}+wai{t}_{{a}_{{i}^{\left\{L\right\}} }{a}_{{j}^{\left\{L\right\}}}}\right){x}_{{a}_{{i}^{\left\{L\right\}}}{a}_{{j}^{\left\{L\right\}}}}+\sum_{i=1}^{n}{d}_{{a}_{i}}$$
(4)
$$\text{s.t. }\sum_{j=1}^{n}{x}_{{a}_{{i}^{\left\{L\right\} }}{a}_{{j}^{\left\{L\right\}}}}=1,\quad \forall {a}_{{i}^{\left\{L\right\} }}\in A \quad {\text{and}}\quad {a}_{{i}^{\left\{L\right\}}}\ne {a}_{{j}^{\left\{L\right\}}}$$
(5)
$$\hspace{0.9cm}\sum_{i=1}^{n}{x}_{{a}_{{i}^{\left\{L\right\} }}{a}_{{j}^{\left\{L\right\}}}}=1, \quad \forall {a}_{{j}^{\left\{L\right\}}}\in A \quad{\text{and}}\quad {a}_{{i}^{\left\{L\right\}}}\ne {a}_{{j}^{\left\{L\right\}}}$$
(6)
$$\hspace{0.5cm}\sum _{{i \in S}} \;\sum _{{j \notin S}} \;x_{{a_{{i^{{\left\{ L \right\}}} }} a_{{j^{{\left\{ L \right\}}} }} }} \ge 1,\quad \forall S \subseteq A,\left| S \right| \ge 2$$
(7)
$$\hspace{0.5cm}{t}_{{a}_{{i}^{\left\{L\right\} }}{a}_{{j}^{\left\{L\right\}}}}\le {t}_{{a}_{{i}^{\left\{L\right\} }}{a}_{{k}^{\left\{L\right\}}}}+{t}_{{a}_{{k}^{\left\{L\right\} }}{a}_{{j}^{\left\{L\right\}}}},\quad \forall {a}_{{i}^{\left\{L\right\}}}, {a}_{{j}^{\left\{L\right\}}}, {a}_{{k}^{\left\{L\right\}}}$$
(8)
$${D}_{ {a}_{{i}^{\left\{L\right\}}}}\le TW{e}_{ {a}_{{i}^{\left\{L\right\}}}},\quad \forall {a}_{{i}^{\left\{L\right\}}}\in A$$
(9)
$${A}_{ {a}_{{i}^{\left\{L\right\}}}}+wai{t}_{max}\ge TW{s}_{ {a}_{{i}^{\left\{L\right\}}}},\quad \forall \hspace{0.17em} {a}_{{i}^{\left\{L\right\}}}\in A$$
(10)
$${x}_{{a_{i^{\{ L\}}} a_{j^{\{ L\}}}}} \in {{\{ 0,1\} }}, \quad \forall {a}_{i^{\{L\}}},a_{j^{\{L\}}} \in A.$$
(11)

where:

\(T=\) total tour time of the activity chain.

\({x}_{{a}_{{i}^{\left\{L\right\} }}{a}_{{j}^{\left\{L\right\}}}}\)= binary variable indicating if the edge \(\left(i,j\right)\) is part of the activity chain \(\left({x}_{{a}_{{i}^{\left\{L\right\} }}{a}_{{j}^{\left\{L\right\}}}}=1\right)\) or not \(\left( {x_{{a_{{i^{{\{ L \}}} }} a_{{j^{{\{ L \}}} }} }} = 0} \right).\)

The objective function (4) aims to minimize the total tour time for the entire activity chain. The flow conservation constraints (5) and (6) ensure that each activity location is visited only once. Constraint (5) demands that from each location there is a departure to exactly one other location, and (6) demands that each location is arrived at from exactly one other location. The sub-tour elimination constraint (7) imposes that disjoint partial activity chains are not allowed, and therefore the traveler must always visit an activity that was not visited before. Constraint (8) requires that the travel times between locations satisfy the triangle inequality. The triangle inequality guarantees that going directly from \({a}_{{i}^{\left\{L\right\}}}\) to a new activity \({a}_{{j}^{\left\{L\right\}}}\) is never slower than going there via an intermediate \({a}_{{k}^{\left\{L\right\}}}\). The time windows constraints (9) and (10) ensure that the activities are performed within their respective time windows. Constraint (9) ensures that the traveler departs from location \({a}_{{i}^{\left\{L\right\}}}\) before its closing time. Constraint (10) ensures that when the traveler arrives early at location \({a}_{{i}^{\left\{L\right\}}}\), they wait no longer than the maximum waiting time. The binary constraint (11) states that the \({x}_{{a}_{{i}^{\left\{L\right\} }}{a}_{{j}^{\left\{L\right\}}}}\) variables can be only 0 or 1, indicating whether a trip is part of the activity chain or not.
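To make the timing recursion of Eqs. (1) to (3), the objective (4), and the time window constraints (9) and (10) concrete, the following sketch evaluates a given visiting order and, for a tiny instance, enumerates all orders. It is an illustration only, not the solver used in this study. It assumes one fixed location per activity, home at index 0 as both start and end, times in minutes from midnight, and an invented four-node travel time matrix that satisfies the triangle inequality (8). All names and numbers are hypothetical.

```python
from itertools import permutations


def evaluate_chain(order, travel, duration, tw_start, tw_end, start_time, wait_max):
    """Return (total tour time, feasibility) for home -> order -> home."""
    total = 0.0
    depart = start_time
    prev = 0  # home
    feasible = True
    for j in order:
        arrive = depart + travel[prev][j]        # Eq. (1)
        wait = max(tw_start[j] - arrive, 0.0)    # Eq. (2)
        depart = arrive + wait + duration[j]     # Eq. (3)
        if depart > tw_end[j]:                   # violates constraint (9)
            feasible = False
        if arrive + wait_max < tw_start[j]:      # violates constraint (10)
            feasible = False
        total += travel[prev][j] + wait + duration[j]  # terms of objective (4)
        prev = j
    total += travel[prev][0]  # return trip to home
    return total, feasible


def best_chain_bruteforce(n, travel, duration, tw_start, tw_end, start_time, wait_max):
    """Enumerate all orders of activities 1..n; practical only for very small n."""
    best = None
    for order in permutations(range(1, n + 1)):
        total, ok = evaluate_chain(order, travel, duration, tw_start, tw_end,
                                   start_time, wait_max)
        if ok and (best is None or total < best[0]):
            best = (total, order)
    return best


# Hypothetical instance: home (0) plus three activities.
travel = [
    [0, 10, 20, 15],
    [10, 0, 12, 22],
    [20, 12, 0, 10],
    [15, 22, 10, 0],
]
duration = [0, 30, 20, 45]
tw_start = [0, 540, 480, 480]
tw_end = [1440, 1020, 1200, 1200]

original = evaluate_chain((1, 2, 3), travel, duration, tw_start, tw_end, 480, 60)
optimum = best_chain_bruteforce(3, travel, duration, tw_start, tw_end, 480, 60)
```

In this instance the order (1, 2, 3) costs 192 minutes because the traveler waits 50 minutes for the first location to open, whereas the reversed order (3, 2, 1) uses the same edges without waiting and costs 142 minutes. The example shows how waiting time, and not only travel time, drives the objective.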

Furthermore, the temporal and spatial flexibility of the activities based on the preferences of the traveler reshape the constraints of the optimization problem in the following fashion:

Label 1 both temporally and spatially fixed, meaning that the activity must be performed in a predefined location and desired time window. The traveler defines their preferred location, their desired start time (or end time), and the duration of the activity. The traveler must arrive at location \({a}_{{i}^{\left\{L\right\}}}\) at \(TD{s}_{{a}_{{i}^{\left\{L\right\}}}}\) or earlier. Therefore, in terms of spatial flexibility, the set of possible locations \(L\) is reduced to a single location \(\ell_{0}\) for all the constraints as in (12); and in terms of temporal flexibility, constraints (9) and (10) become (13) and (14) respectively.

$$a_{{i^{{\left\{ L \right\}}} }} \to a_{{i^{{\left( {\ell _{0} } \right)}} }} \quad \forall a_{i} \in A$$
(12)
$$D_{{a_{{i^{{(\ell _{0} )}} }} }} \le TDe_{{a_{{i^{{(\ell _{0} )}} }} }} \quad \forall a_{i} \in A$$
(13)
$${A}_{{a}_{{i}^{(\ell _{0})}}}+wait_{max}\ge TD{s}_{{a}_{{i}^{(\ell _{0})}}}\quad \forall {a}_{i}\in A$$
(14)

Label 2 spatially flexible activities maintain the constraint of fixed temporality as in (13) and (14), meaning that the traveler has fixed preference to perform the activity \({a}_{i}\). However, the traveler is not aware of the location to perform such activity. Therefore, the set of alternative locations \(L=\left\{{\ell}_{0},{\ell}_{1},{\ell}_{2},\dots ,{\ell}_{m}\right\}\) is considered in addition to the original location \(\ell_{0}\) for each activity \({a}_{i}\), forming m additional activity chains to include in the optimization, shown in (15).

$$a_{{i^{{\left\{ L \right\}}} }} \to a_{{i^{{\left( {\ell_{0} ,\ell_{1} ,\ell_{2} , \ldots ,\ell_{m} } \right)}} }}$$
(15)

Label 3 temporally flexible activities maintain the predefined location criteria as in (12), meaning that only a single alternative location is considered for the activity \({a}_{i}\). However, the traveler is not interested in a desired time window to perform such activity as long as it is within the location’s time window, therefore constraints (9) and (10) remain.

Label 4 both spatially and temporally flexible maintains the temporal flexibility criteria as in label 3 [constraint (9) and (10)], and the spatial flexibility criteria as in label 2 [Eq. (15)].
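
The four labels can be read as a lookup that decides which time window and which candidate locations enter the constraints. The sketch below illustrates that reading; the function name, the argument names, and the sample values are hypothetical and only mirror the description above.

```python
def effective_constraints(label, desired_window, location_window, alternatives):
    """Map a flexibility label (1-4) to the time window and candidate locations.

    desired_window  : (TDs, TDe) requested by the traveler, in minutes
    location_window : (TWs, TWe) opening hours of the location, in minutes
    alternatives    : candidate locations, with the original location first
    """
    if label not in (1, 2, 3, 4):
        raise ValueError("flexibility label must be 1, 2, 3 or 4")
    temporally_fixed = label in (1, 2)  # constraints (13) and (14) apply
    spatially_fixed = label in (1, 3)   # reduction (12) applies
    window = desired_window if temporally_fixed else location_window
    locations = alternatives[:1] if spatially_fixed else list(alternatives)
    return window, locations

# Hypothetical activity: desired 10:00-11:00, location open 08:00-20:00.
print(effective_constraints(2, (600, 660), (480, 1200), ["A0", "A1"]))
```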

Solving the ACO

The present study employed a GA for solving the ACO. The GA is an approximate algorithm that seeks a near-optimal solution in a reasonable amount of time, as opposed to an exact method, which finds the best solution but may require a prolonged time (Mitchell 1998). The choice of the GA was motivated by its capability to tackle combinatorial problems with high complexity, which is often a limiting factor for exact algorithms (Wang et al. 2020), since the flexibility of activities may increase the computation time to find a solution. Additionally, the population-based nature of the GA enables a more comprehensive exploration of the search space than single-solution based algorithms such as simulated annealing, reducing the risk of being trapped in a local optimum (Osaba et al. 2020). The GA process starts with the creation of an initial population of feasible solutions, followed by a series of iterations in which heuristic rules are applied to simulate the evolutionary process of reproduction, mutation, and creation of new generations in order to generate improved solutions (Abidi et al. 2018). The process is repeated until a stopping criterion is met, such as the maximum number of generations, finding a sufficient number of high-quality solutions, or a lack of improvement in the solutions (Danloup et al. 2018). The details of the GA developed in this study are presented in the following paragraphs.

Encoding The encoding of the GA to the MILP model of the ACO can be done in several ways. One of the most common encodings is the permutation representation, where a chromosome encodes a solution to the problem (i.e., an activity chain) as a permutation of its genes (i.e., activity locations). Each gene g within the chromosome corresponds to an activity location with all its information. The set of chromosomes constitutes the population P of solutions. The encoding of the GA is explained on a problem of five activity locations O, A, B, C, and D, where O represents home (Table 3).

Table 3 Activity locations with their genetic information

Initialization (lines 1–3). A random population of chromosomes was generated. The permutation of the chromosomes followed the Fisher-Yates shuffle algorithm (Durstenfeld 1964). For example, the first chromosome in the population might be represented as O → A → B → C → D → O. This method of generating the initial population not only preserves population diversity but also accelerates the convergence to the final solution by furnishing promising initial solutions.
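
The initialization step can be sketched as follows. This is an illustrative Python version of the Durstenfeld variant of the Fisher-Yates shuffle, not the study's Matlab code; the function names, the population size, and the seed are assumptions, and feasibility with respect to time windows is not checked here.

```python
import random

def fisher_yates_shuffle(items, rng):
    """Return a shuffled copy of items (Durstenfeld variant of Fisher-Yates)."""
    shuffled = list(items)
    for i in range(len(shuffled) - 1, 0, -1):
        j = rng.randint(0, i)  # both bounds are inclusive
        shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
    return shuffled

def initial_population(activities, home, size, seed=None):
    """Build `size` chromosomes that start and end at the home location."""
    rng = random.Random(seed)
    return [[home] + fisher_yates_shuffle(activities, rng) + [home] for _ in range(size)]

population = initial_population(["A", "B", "C", "D"], "O", size=6, seed=1)
for chromosome in population:
    print(" -> ".join(chromosome))
```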

Evaluation (lines 4–10). The objective function (4) was evaluated to determine the fitness of each chromosome. The fitness value is the total time \(T\): the smaller the \(T\), the fitter the chromosome. In this scenario, locations A and D, labeled as 4 and 2 respectively, possess spatial flexibility and thus require consideration in the optimization process (A{L}, D{L}). This was achieved by introducing alternative locations with the same OSM tag within the same TAZ. For example, the original location A0 has one alternative location A1, and the original location D0 has two alternative locations D1 and D2 (Fig. 3). Consequently, the additional chromosomes are included in the evaluation of the fitness.

Fig. 3 The creation of chromosomes for activity locations with spatial flexibility. Notes Each letter indicates an activity location while the subscript numbers indicate the additional locations
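
The creation of additional chromosomes for spatially flexible locations can be sketched as below. The sketch assumes that every combination of alternative locations is enumerated, which the text does not spell out, and it evaluates travel time only (activity durations and waiting are omitted). All names and values are hypothetical.

```python
from itertools import product

def expand_chromosome(chromosome, alternatives):
    """Return one chromosome per combination of alternative locations.

    alternatives maps an activity to its candidate locations; activities
    missing from the mapping are treated as spatially fixed.
    """
    options = [alternatives.get(gene, [gene]) for gene in chromosome]
    return [list(combo) for combo in product(*options)]

def travel_time_of(chromosome, travel_time):
    """Sum the travel times between consecutive genes."""
    return sum(travel_time[a][b] for a, b in zip(chromosome, chromosome[1:]))

# Hypothetical alternatives: A has one extra location, D has two.
alternatives = {"A": ["A0", "A1"], "D": ["D0", "D1", "D2"]}
variants = expand_chromosome(["O", "A", "B", "C", "D", "O"], alternatives)
print(len(variants))  # 2 * 3 combinations
```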

Selection (lines 12–14). The chromosomes with the highest fitness were selected for crossover to serve as parents of the next generation. Chromosomes with medium or low fitness levels were also used to introduce variability in future generations. A small portion of the population, referred to as the elite chromosomes, was guaranteed to persist in the next generation. The remaining portion of the population was subjected to crossover, and this proportion of the population is referred to as the crossover fraction (\(CF\)).

Crossover (lines 15–18). Two parent chromosomes were chosen, and a crossover operator was applied to them to generate two offspring chromosomes. The local search crossover operator developed by Lin et al. (2016) was used. This operator was chosen because it improves the effectiveness and efficiency of the GA by exploring the more favorable areas of the solution space of the TSP. The mechanism of the selected crossover operator is the following: (1) two parent chromosomes are randomly selected. (2) one gene within the first parent chromosome (excluding the origin O) is selected randomly to be the first gene of the first offspring chromosome and is set as the current node. (3) all the one-step neighborhood genes of the current gene in the two parent chromosomes are determined. (4) the travel times between the current gene and its one-step neighborhood genes are compared. (5) then, the one-step neighborhood gene with the shortest travel time to the current gene is inserted in the sequence of the offspring chromosome and is set as the new current node. (6) the inserting process is repeated, excluding the genes that have already been inserted, until all the genes have been added to the first offspring chromosome. The process is repeated for the second offspring chromosome. Thereby, two complete valid offspring chromosomes are produced.
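
Steps (2) to (6) can be sketched for a single offspring as below. This is only an illustration of the mechanism as described here, not the implementation of Lin et al. (2016). Two points are assumptions: the parents are treated as closed loops when looking up neighbors, and when all neighbors of the current gene have already been inserted, the nearest remaining gene is taken instead. The function names are hypothetical.

```python
import random

def neighbours(parent, gene):
    """Genes adjacent to `gene` in `parent` (assumption: the route is a closed loop)."""
    i = parent.index(gene)
    return {parent[i - 1], parent[(i + 1) % len(parent)]}

def local_search_crossover(parent1, parent2, travel_time, rng):
    """Build one offspring by repeatedly moving to the nearest parental neighbour.

    Parents are permutations of the activity locations without the home O.
    """
    current = rng.choice(parent1)
    child = [current]
    remaining = set(parent1) - {current}
    while remaining:
        candidates = (neighbours(parent1, current) | neighbours(parent2, current)) & remaining
        if not candidates:
            # Assumption: fall back to all unvisited genes when every neighbour is used.
            candidates = remaining
        # Nearest candidate; ties are broken by gene name for reproducibility.
        current = min(candidates, key=lambda g: (travel_time[child[-1]][g], g))
        child.append(current)
        remaining.remove(current)
    return child

# Hypothetical symmetric travel times in minutes.
tt = {a: {b: abs(ord(a) - ord(b)) for b in "ABCD"} for a in "ABCD"}
print(local_search_crossover(list("ABCD"), list("CADB"), tt, random.Random(0)))
```

The second offspring would be produced by calling the function again, for example with the roles of the parents exchanged.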

Mutation (lines 19–22). The mutation of the offspring helps prevent the GA from getting stuck in a local optimum, since it maintains the diversity of the population. This operator alters a few genes of the chromosomes with a mutation probability pm. The mutation operator used was the random swap mutation, which randomly selects two activity locations in the chromosome and swaps their positions.
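
A minimal sketch of the random swap mutation follows. It assumes that pm is applied once per chromosome and that the home location O at both ends is never moved; neither detail is stated explicitly in the text, and the function name is hypothetical.

```python
import random

def swap_mutation(chromosome, pm, rng):
    """With probability pm, swap two distinct activity locations (home stays fixed)."""
    mutated = list(chromosome)
    inner = range(1, len(mutated) - 1)  # positions between the two home genes
    if len(inner) >= 2 and rng.random() < pm:
        i, j = rng.sample(inner, 2)
        mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated

base = ["O", "A", "B", "C", "D", "O"]
print(swap_mutation(base, pm=1.0, rng=random.Random(0)))
```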

New population (lines 24–27). Mutated offspring and elite individuals form a new population and become parents in a new generation, where the process is repeated till reaching the maximum number of generations Ngen or when there is no improvement after an established maximum stall generation.
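
The generational loop with elitism and the two stopping criteria can be sketched as follows. The function `run_ga` and its parameters are hypothetical; selection, crossover, and mutation are bundled into a caller-supplied `breed` function so that the sketch stays independent of any particular operator.

```python
def run_ga(population, cost, breed, n_gen, max_stall, n_elite):
    """Keep the elites, breed the rest, and stop at n_gen or after max_stall stalls.

    cost  : maps a chromosome to its total time T (lower is fitter)
    breed : maps the ranked population to one new chromosome
    """
    population = sorted(population, key=cost)
    best, stall, generations_run = population[0], 0, 0
    while generations_run < n_gen and stall < max_stall:
        elites = population[:n_elite]
        offspring = [breed(population) for _ in range(len(population) - n_elite)]
        population = sorted(elites + offspring, key=cost)
        generations_run += 1
        if cost(population[0]) < cost(best):
            best, stall = population[0], 0  # improvement resets the stall counter
        else:
            stall += 1
    return best, generations_run

# Toy example: locations on a line, cost is the length of the open path.
position = {"A": 0, "B": 1, "C": 2, "D": 3}

def path_cost(chromosome):
    return sum(abs(position[a] - position[b]) for a, b in zip(chromosome, chromosome[1:]))

start = [["C", "A", "D", "B"], ["B", "D", "A", "C"]]
best, generations = run_ga(start, path_cost, lambda pop: ["A", "B", "C", "D"],
                           n_gen=50, max_stall=3, n_elite=1)
```

Because the elites are carried over unchanged, the best cost in the population never increases from one generation to the next.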

When the stop conditions are met, the chromosome with the highest fitness is identified as the final solution, representing the optimized activity chain. In instances where very strict constraints lead to no feasible solution, the optimization process concludes without producing any result and \(T\) is set to \(\infty\). However, given the use of realistic and existing activity chains in this study, if no further optimized solution is found, the original activity chain is retained as the solution. The steps of the GA developed in this study are shown in Algorithm 1.

Algorithm 1 Genetic Algorithm applied to solve ACO

Computational experiment and results

The ACO was implemented using Matlab R2021a on an HP Omen 15 with an Intel Core i7 2.6 GHz and 16.0 GB RAM, running under Windows 10. The model and algorithms were tested on the benchmark data obtained from the SACOGPOI dataset, which consists of 5274 activity chains with fewer than 15 activity locations. The experimental parameters for the ACO and the GA are detailed in Table 4. The algorithm was run under two different scenarios. In the first scenario, the ACO was solved using an exact algorithm, namely branch-and-bound, as a base for comparison. The second scenario involved the optimization of activity chains using the GA. Quantiles were calculated for each scenario.

Table 4 Experimental parameter settings

Data preprocessing

It was assumed that a single transport mode was associated with the entire activity chain. In cases where a traveler reported the use of a private car during any trip, the car was assumed to be used throughout the entire day, regardless of whether public transport use was also reported. In instances where no private car but public transport was used, only public transport and walking were considered. If neither mode was reported, walking was assumed for the entire activity chain. Additionally, the time required for a driver to park a private car can be considered, although it was set to zero in this study. The travel times were calculated using OpenTripPlanner, an open-source multimodal trip planning service that uses the OSM road network data and the GTFS public transport network (OpentripPlanner 2020).
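
The mode assignment rule can be written as a short function. This is a sketch of the rule as described above; the mode names "car", "pt", and "walk" and the function name are hypothetical labels, not fields of the survey data.

```python
def chain_mode(reported_modes):
    """Assign one transport mode to a whole activity chain from its reported trip modes."""
    modes = set(reported_modes)
    if "car" in modes:
        return "car"        # a private car on any trip fixes the mode for the day
    if "pt" in modes:
        return "pt+walk"    # public transport combined with walking
    return "walk"           # neither car nor public transport was reported

print(chain_mode(["walk", "pt", "car"]))
print(chain_mode(["pt", "walk"]))
print(chain_mode(["walk"]))
```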

The time windows (\(TWs\) and \(TWe\)) of the locations were the respective opening times of the POIs in SACOGPOI for a specific day. In cases when the POIs did not have opening times for a day, these were obtained from an adjacent day. The durations \({d}_{{a}_{i}}\) of the activities and the desired time windows of the travelers (\(TDs\) and \(TDe\)) were extracted from SACOGHTS. We ensured that the desired time windows of the travelers were within the time windows of the locations \((TDs\ge TWs, TDe\le TWe)\). Furthermore, the maximum waiting time \({wait}_{max}\) was set to 30 min. Due to the absence of flexibility labels in the traveler data, these labels were assumed. Regularly performed activities, such as work or school, were assigned label 1, while infrequently performed activities, such as filling up on gas or grocery shopping, were randomly assigned a label using a discrete uniform distribution \(X\sim U\left(l,u\right)\), where \(X\) represents the flexibility label of an activity, uniformly distributed from \(l=1\) to \(u=4\).
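
The label assignment can be sketched as follows. The set of regular activity types contains only the two examples named above, and the activity type strings, the function name, and the seed are assumptions made for illustration.

```python
import random

REGULAR_ACTIVITIES = {"work", "school"}  # assumption: only the examples named in the text

def assign_label(activity_type, rng):
    """Label 1 for regular activities; otherwise a label drawn uniformly from 1 to 4."""
    if activity_type in REGULAR_ACTIVITIES:
        return 1
    return rng.randint(1, 4)  # both bounds are inclusive

rng = random.Random(42)
labels = [assign_label(kind, rng) for kind in ["work", "grocery", "gas", "school"]]
```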

Transport modes

The distribution of transport mode choice among the activity chains from the experiment is presented in Table 5. The activity chains are grouped based on their size, which is indicated in the first column, and the number of activity chains that used each transport mode is presented in the subsequent columns. Out of all the activity chains, 1.479% used PT, 89.420% used private cars, and the remaining 9.101% were performed by walking. Likewise, 56.636% of the activity chains comprised two activities (48.597% by private car, 7.224% by walking, and 0.815% by PT). In the same context, 22.620% of the activity chains comprised three activities (0.398% by PT, 21.085% by car, and 1.138% by walking). On the opposite side, there were only two activity chains with 14 activities (0.038%), both using a private car. To better understand the results, a graphical representation is depicted in Fig. 4, which shows the number of activity chains based on size and mode choice. Overall, it can be observed that the number of activity chains decreases as the size increases, with a significant drop between the number of activity chains with two activities and those with three.

Table 5 Number of activity chains by transport mode choice
Fig. 4 Distribution of the activity chains size by mode choice

Total and travel times

The simulation results regarding the total time T and total travel time \({t}_{{a}_{i}{a}_{j}}\) of the activity chains were analyzed by obtaining their quantiles. To minimize interference errors, the quartiles (0.25, 0.50, 0.75 quantiles) were used (full results in Appendix 2), as suggested by Yang et al. (2020). The column Best refers to the best-found solution in the total number of runs, and the same applies to columns Mean and Worst. In the case of the branch-and-bound scenario, only the optimal solution is displayed for comparison against the GA scenario.

Regarding the total time \(T\), the results indicate that 50% of the activity chains with 2 activities lasted between 36 and 240.25 min for the base scenario, while in the GA scenario with flexible activities, they lasted between 36 and 236.75 min. For activity chains with size 3, the base scenario showed that 50% of the activity chains were in the range of 63 to 383 min, whereas for the GA scenario, they lasted between 70.75 and 406.25 min. Notably, for size 9, 75% of the activity chains in the base scenario lasted up to 720 min, while in the GA scenario, they lasted up to 685 min. From this size onwards, the optimization performance was better and tended to reduce the total times at the 75th percentile. This trend is illustrated in Fig. 5a, which presents a box plot of the quartiles. In general, it can be observed that the total times were shorter for the base scenario than for the GA scenario in smaller sizes. However, the strength of the GA was revealed when the size of the activity chains increased, shortening both the total travel times and the dispersion around the median. Apparently, the GA scenario performed better for larger sizes than for smaller ones. To further understand this behavior, an analysis of the travel times between activity locations was conducted.

Fig. 5 Duration times of the activity chains by size. Notes The box of each size indicates the quartiles, the circles indicate the outliers, while the lines indicate the mean values by size

Regarding the total travel times \({t}_{{a}_{i}{a}_{j}}\), the results indicate that in the case of the smallest activity chains, 50% of the travel times are between 14 and 38 min for both the base and the GA scenario. On the other extreme, for size 14, 50% of the activity chains are between 105 and 228 min for the base scenario, and between 121 and 228 min for the GA scenario. As the size of the activity chain increases, the interquartile range (IQR) of the GA tends to narrow and shows a slight upward offset compared to the respective sizes in the base scenario (see Fig. 5b). Furthermore, the presence of outliers with high values is noticeable in the box plots for the smaller sizes 2 and 3. This can be attributed to the observation that activity chains of size 2 and 3 are the most frequent. Additionally, in some activity chains with two activities, the distances between the locations can be longer (e.g., going to the airport), resulting in longer travel times. On the other hand, for other activity chains with two activities, the distance between the locations could be much shorter (e.g., going to a grocery shop around the corner). This difference in travel times can create a large gap between activity chains of the same size, thus producing outliers in the box plot.

Solution of an instance

We present the solution to the dataset’s instance problem that was described in Table 3. The solved daily schedule is shown in Table 6, while its graphical representation is depicted in Fig. 6. The same solution was retrieved by the GA and the branch-and-bound algorithms. The optimal schedule for the traveler is to depart from their home at 16:19 to the activity A, then visit activities B, C, D and return home at 21:00. The total time of the tour is 281 min, while the total time spent on traveling is 29 min, and the time that the traveler waited in total when arriving early at a location is 9 min. The GA solved this instance in 0.0303 s, while the branch-and-bound solved the problem in 0.9410 s. Due to the flexibility labels of activities A and D, additional locations were added to the problem, precisely 3 locations for the first one (A{4}), and 13 locations for the second one (D{14}). Therefore, the number of locations increased to 21.
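
As a consistency check on these figures, the tour duration follows from the departure and return times, and the count of locations follows from the 5 initial locations plus the added alternatives. The last expression, the time spent at the activities, is derived here under the assumption that the total time consists only of activity, travel, and waiting time; it is not a value reported in the study.

$$T = 21{:}00 - 16{:}19 = 4\,\text{h}\;41\,\text{min} = 281\,\text{min},\qquad 5 + 3 + 13 = 21\ \text{locations},\qquad 281 - 29 - 9 = 243\,\text{min}$$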

Table 6 Solution to the daily activity schedule example
Fig. 6 Graphic solution to the activity schedule optimization problem

For this example, the decision variables \({x}_{{a}_{{i}^{\left\{L\right\} }}{a}_{{j}^{\left\{L\right\}}}}\) are the arcs between the activities shown as a matrix in (16), where the activity chain \({x}_{{a}_{O }{a}_{{A}^{\left(0\right)}}}, {x}_{{a}_{{A}^{\left(0\right)}}{a}_{B}}, {x}_{{a}_{B}{a}_{C}}, {x}_{{a}_{C}{a}_{{D}^{(0)}}}, {x}_{{a}_{{D}^{(0) }}{a}_{O}}=1\) is a feasible solution that satisfies the constraints (5–8). Furthermore, activity A has a flexibility label 4, meaning that, if necessary, the traveler could visit this location later (after another activity) or could visit one of the other shops nearby instead (grey dots around the red one A, in Fig. 6). Therefore, both algorithms evaluated the additional tours such as O → B → A0 → … → O, O → A1 → B → … → O, and so on. The additional locations create new matrices with the alternative locations, such as \({x}_{{a}_{O }{a}_{{A}^{\left(1\right)}}}, {x}_{{a}_{O }{a}_{{A}^{\left(2\right)}}}\). It is the same case for activity D and its 13 alternative locations.

$$X_{{a_{{i^{{\left\{ L \right\}}} }} a_{{j^{{\left\{ L \right\}}} }} }} = \left[ {\begin{array}{*{20}c} - & {x_{{a_{O} a_{{A^{{\left( 0 \right)}} }} }} } & {x_{{a_{O} a_{B} }} } & {x_{{a_{O} a_{C} }} } & {x_{{a_{O} a_{{D^{{\left( 0 \right)}} }} }} } \\ {x_{{a_{{A^{{\left( 0 \right)}} }} a_{O} }} } & - & {x_{{a_{{A^{{\left( 0 \right)}} }} a_{B} }} } & {x_{{a_{{A^{{\left( 0 \right)}} }} a_{C} }} } & {x_{{a_{{A^{{\left( 0 \right)}} }} a_{{D^{{\left( 0 \right)}} }} }} } \\ {x_{{a_{B} a_{O} }} } & {x_{{a_{B} a_{{A^{{\left( 0 \right)}} }} }} } & - & {x_{{a_{B} a_{C} }} } & {x_{{a_{B} a_{{D^{{\left( 0 \right)}} }} }} } \\ {\quad x_{{a_{C} a_{O} }} } & {x_{{a_{C} a_{{A^{{\left( 0 \right)}} }} }} } & {x_{{a_{C} a_{B} }} } & - & {x_{{a_{C} a_{{D^{{\left( 0 \right)}} }} }} } \\ {x_{{a_{{D^{{\left( 0 \right)}} }} a_{O} }} } & {x_{{a_{{D^{{\left( 0 \right)}} }} a_{{A^{{\left( 0 \right)}} }} }} } & {x_{{a_{{D^{{\left( 0 \right)}} }} a_{B} }} } & {x_{{a_{{D^{{\left( 0 \right)}} }} a_{C} }} } & - \\ \end{array} } \right]\quad X_{{a_{{i^{{\left\{ L \right\}}} }} a_{{j^{{\left\{ L \right\}}} }} }} = \left[ {\begin{array}{*{20}c} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 0 \\ \end{array} } \right]$$
(16)
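
The binary matrix in (16) can be decoded back into the activity chain by following the arcs from home. The sketch below does this and checks the flow conservation and sub-tour conditions along the way; the function name and the location labels are illustrative.

```python
def decode_tour(x, labels, start=0):
    """Follow the arcs with x[i][j] == 1 from `start` until the tour closes."""
    n = len(x)
    departures_ok = all(sum(row) == 1 for row in x)
    arrivals_ok = all(sum(row[j] for row in x) == 1 for j in range(n))
    if not (departures_ok and arrivals_ok):
        raise ValueError("flow conservation, constraints (5) and (6), is violated")
    tour, current = [labels[start]], start
    for _ in range(n):
        current = x[current].index(1)
        tour.append(labels[current])
    if len(set(tour)) != n:
        raise ValueError("sub-tours present, constraint (7) is violated")
    return tour

labels = ["O", "A0", "B", "C", "D0"]
x = [
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0],
]
tour = decode_tour(x, labels)
print(" -> ".join(tour))  # O -> A0 -> B -> C -> D0 -> O
```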

Optimization quality

The performance of the proposed algorithm was evaluated using three metrics: computation time, relative error, and size increase. For the GA, the computation time was calculated as the time elapsed from the creation of the first population P to the occurrence of one of the stopping conditions, such as reaching the maximum number of generations or observing no improvement in the fitness value. For the branch-and-bound, it was the time elapsed to evaluate all the tour permutations. To summarize the computation time, quantile indicators were used. The relative error was calculated as the difference between the best-found solution obtained from the 10 runs of the GA scenario and the optimal solution obtained in the base scenario, relative to the optimal solution, as outlined in Eq. (17). The size increase refers to the difference between the number of locations after adding the alternative locations for activities with flexibility labels 2 and 4 during the optimization process and the number of locations in the initial activity chain, as defined in Eq. (18).

$$\text{Rel. error}=\frac{\text{GA best found}-\text{optimal}}{\text{optimal}}$$
(17)
$$\text{Size increase}=\text{locations ACO}-\text{initial locations}$$
(18)
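
Both measures are simple to compute. The sketch below applies them to the instance discussed earlier, in which both algorithms found the same 281-minute tour and the 5 initial locations grew to 21; the function names are illustrative.

```python
def relative_error(ga_best_found, optimal):
    """Eq. (17): gap between the best GA solution and the optimum, relative to the optimum."""
    return (ga_best_found - optimal) / optimal

def size_increase(locations_aco, initial_locations):
    """Eq. (18): number of alternative locations added to the initial chain."""
    return locations_aco - initial_locations

print(relative_error(281, 281))  # 0.0
print(size_increase(21, 5))      # 16
```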

Table 7 shows the results corresponding to the performance measures. Regarding the computation times, the results for the small sizes are in the range of thousandths of a second. For instance, for activity chains with size 2, the IQR spans \(0.153\cdot {10}^{-3}\) to \(0.251\cdot {10}^{-3}\) s in the base scenario and \(77.159\cdot {10}^{-3}\) to \(85.559\cdot {10}^{-3}\) s in the GA scenario. On the other extreme, the IQR for activity chains with size 14 spans \(9.980\cdot {10}^{-3}\) to \(2.647\cdot {10}^{3}\) s in the base scenario and \(360.357\cdot {10}^{-3}\) to \(1.679\cdot {10}^{0}\) s in the GA scenario. Seemingly, the computation times slightly increase as the size increases. Branch-and-bound is suitable for smaller sizes (1–8), but from size 9 onwards, the GA seems to perform better.

Table 7 Quantiles of the computation time and averages of the relative error, size increase, and success rate

Furthermore, the results of the assessment are graphically displayed in a box plot in Fig. 7. The mean values by size are represented by lines. It can be observed that both scenarios exhibit several outliers for smaller activity chains. This is attributed to the inclusion of the flexibility labels 2 and 4 during the optimization process, which leads to an increase in the size and computational time of the activity chain. The correlation between the size of the activity chain and the computational time is expected, as larger activity chains will inherently lead to longer computation times. However, it is noteworthy that for larger sizes, particularly sizes 9 to 14, the third quartiles of the computation time in the GA scenario are lower than in the base scenario, which suggests that in 75% of activity chains with these sizes, the GA scenario has a shorter computation time. Especially for size 11, the third quartile for the ACO scenario is lower, meaning that 75% of the computations for activity chains in the ACO scenario are shorter compared to the base scenario. This result highlights the advantage of using metaheuristic algorithms, such as the GA, in solving this optimization problem. Furthermore, the exponential growth for the base scenario is visible, while the GA scenario keeps linear growth.

Fig. 7 Computation times of the base and the ACO scenarios. Notes The box of each size indicates the quartiles, the circles indicate the outliers, while the lines indicate the mean values by size

The results obtained in terms of relative error demonstrate an increase as the size of the activity chains increases. This means that the error observed in activity chains with size 2 is smaller than the errors observed in larger sizes. This phenomenon may be due to the larger number of activity chains with size 2 compared to other sizes (size 14 has only 2 activity chains). Another potential explanation for the increased error in activity chains of larger size is the use of real time window data from the POIs. During the optimization process, there may be a substantial gap between the time of leaving an activity and the start time of the time window for the next activity location, resulting in unnecessary waiting times.

The increase in the activity chain size during the optimization process can be attributed to the presence of activities with flexibility labels 2 and 4. This increase is linked to the number of POIs with the same OSM tag within their respective TAZ. It can be observed that activity chains with sizes between 2 and 7 demonstrate an increasing average of additional alternative activity locations. However, from this size, the average values of the size increase of activities do not exhibit any specific behavior.

Discussion

The results obtained in this study are subject to certain limitations and shortcomings, related to the assumptions made during the data collection, benchmark preparation, and optimization processes. Firstly, the availability of data is a key limitation of this study. For instance, the POIs used in the generated SACOGPOI dataset were extracted from the OSM, which is an open data source. This implies that not all the POIs may be up-to-date or accurately uploaded, unlike other map services such as Google Maps, particularly for residential buildings and houses.

Furthermore, the ACO model used in this study may have its shortcomings. By definition, the TSP states that a traveler begins at a home location, visits a set of locations only once, and returns home only after visiting all locations. However, in the real world and in the SACOGHTS dataset, this is not always the case. For instance, a person may leave home in the morning for work, return home for lunch, go back to work after lunch, and finally return home in the evening. This implies that the home location was visited as an intermediate point of the activity chain, which the current ACO model does not consider, since it follows the TSP definition, meaning that one location cannot be visited twice. As a result, the activity chain is split into two sub-activity chains with separate optimization for each sub-chain. The former corresponds to the chain until the traveler returns home at lunchtime, and the latter starts when the traveler departs from home after lunch. The shortcoming is that the optimization would be carried out as two completely different activity chains, possibly even with different characteristics, such as the transport mode or the flexibility. This could also explain why activity chains were mostly grouped in sizes 2 and 3, as larger sizes could have been split.

Several potential future research directions can be addressed to overcome the identified limitations. Firstly, the limitation in the available data for POIs and opening times could be overcome by using map services with more accurate data, although it may entail higher costs to access these services. Additionally, the ACO model formulation could be improved by considering the activity chain optimization as a tour with multiple home visits instead of splitting the activity chains into different sub-activity chains. Another avenue for potential future research is to explore alternative metaheuristic algorithms that could potentially improve the performance and reduce computation times during the optimization process of the ACO. Finally, the objective function, which only considered travel times in this study, could be improved by incorporating other factors that play a part in the travel process, such as traffic conditions and environmental impact.

Conclusion

In this study, a comprehensive experimental evaluation of the ACO model was carried out using large real-world data. The activity chain benchmark data were generated from a household survey dataset, while the flexibility of the activities was established through a stochastic approach to simulate the preferences of the travelers. Furthermore, the POIs and their corresponding real opening times were extracted from open-source mapping services to aid in the calculation process. The algorithms used to solve the ACO were the exact branch-and-bound approach and the GA. The findings of the evaluation overall demonstrated that the GA performs better on larger activity chains than on smaller ones, with shorter computation times than branch-and-bound as the size of the activity chain increases. The total tour times and travel times are similar for both algorithms when finding the optimal real activity chains. Additionally, the inclusion of activity flexibility adds complexity to the problem, increasing the computational time overall. The computational time grows exponentially as the size increases when solving the ACO with branch-and-bound, while the GA keeps a linear growth. The results of this study also demonstrate that, despite not yielding the optimal values for smaller activity chain sizes, the travel times among the locations are optimized to a satisfactory level. This study constitutes a valuable contribution to the research on time reduction, trip optimization, and activity scheduling optimization.