The stochastic share-a-ride problem with electric vehicles and customer priorities
Computers & Operations Research (IF 4.6), Pub Date: 2024-01-20, DOI: 10.1016/j.cor.2024.106550
Yutong Gao, Shu Zhang, Zhiwei Zhang, Quanwu Zhao

We introduce a stochastic share-a-ride problem in which a fleet of electric vehicles (EVs) in a ride-hailing system is dynamically dispatched to serve passenger and parcel orders in a shared manner. We assume uncertain demand for both passenger and parcel orders and give passenger orders priority over parcel orders. Passengers must be transported directly from their origins to their destinations, while parcels can share a vehicle with other orders. The operator of the ride-hailing platform needs to decide whether to accept a newly arrived service request, how to assign orders to vehicles, and how to route and charge the EVs. To develop dynamic policies for the problem, we formulate it as a Markov decision process (MDP) and propose a reinforcement learning (RL) approach to solve it. We develop action-space restriction and state-space aggregation schemes to facilitate the implementation of the RL algorithm. We also present two rolling horizon heuristic methods to develop dynamic policies for our problem. We conduct computational experiments based on real-world taxi data from New York City. The computational results show that our RL policies outperform the three benchmark policies in terms of serving more orders and collecting more rewards. Our RL policies also make high-quality decisions more efficiently than the rolling horizon policies.
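The abstract describes the core decision structure: each newly arrived order triggers an accept/reject, assignment, routing, and charging decision, handled by an RL policy over an aggregated state space with a restricted action space. The Python sketch below is a minimal illustration of that structure under assumed rules, not the authors' implementation; the names (Order, EVState, feasible_actions, aggregate_state, epsilon_greedy_dispatch) and the specific thresholds are hypothetical.

```python
# Illustrative sketch of an RL dispatch decision for a stochastic
# share-a-ride problem with EVs; all names and rules are assumptions.
import random
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass(frozen=True)
class Order:
    kind: str         # "passenger" or "parcel"; passengers take priority
    origin: int       # pickup node in the road network
    destination: int  # drop-off node

@dataclass
class EVState:
    location: int               # current node
    battery: float              # remaining charge (kWh)
    onboard: Tuple[Order, ...]  # orders currently in the vehicle

def feasible_actions(ev: EVState, order: Order) -> List[str]:
    """Restrict the action space: reject, accept, or recharge first.

    Assumed rules: a passenger cannot share with another passenger already
    on board, and a low-battery EV must recharge before accepting.
    """
    actions = ["reject"]
    passenger_on_board = any(o.kind == "passenger" for o in ev.onboard)
    if not (order.kind == "passenger" and passenger_on_board):
        if ev.battery > 5.0:
            actions.append("accept")
        else:
            actions.append("recharge_then_accept")
    return actions

def aggregate_state(ev: EVState, order: Order) -> Tuple:
    """Aggregate the state space: coarse battery level and vehicle load
    instead of the full continuous state, keeping the value table small."""
    battery_bucket = int(ev.battery // 10)  # e.g. 10 kWh buckets
    load = len(ev.onboard)
    return (ev.location, battery_bucket, load, order.kind)

def epsilon_greedy_dispatch(Q: Dict[Tuple, Dict[str, float]],
                            ev: EVState, order: Order,
                            epsilon: float = 0.1) -> str:
    """Pick an action for a newly arrived order from a learned value table."""
    s = aggregate_state(ev, order)
    actions = feasible_actions(ev, order)
    if random.random() < epsilon or s not in Q:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[s].get(a, 0.0))
```

A tabular epsilon-greedy lookup is used here only for brevity; the paper's actual RL algorithm, aggregation scheme, and action restrictions are more elaborate.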



Updated: 2024-01-25