| Sign up

PDF (504.3 KB)

Cite

EndNote(RIS) BibTeX

Collect

Collect

Submit Manuscript

Open Access

A Matching Algorithm with Reinforcement Learning and Decoupling Strategy for Order Dispatching in On-Demand Food Delivery

Jingfang Chen^¹, Ling Wang^¹(), Zixiao Pan^¹, Yuting Wu^¹, Jie Zheng^¹, Xuetao Ding^²

1Department of Automation, Tsinghua University, Beijing 100084, China

2Department of Delivery Technology, Meituan, Beijing 100102, China

Show Author Information

Abstract

The on-demand food delivery (OFD) service has gained rapid development in the past decades but meanwhile encounters challenges for further improving operation quality. The order dispatching problem is one of the most concerning issues for the OFD platforms, which refer to dynamically dispatching a large number of orders to riders reasonably in very limited decision time. To solve such a challenging combinatorial optimization problem, an effective matching algorithm is proposed by fusing the reinforcement learning technique and the optimization method. First, to deal with the large-scale complexity, a decoupling method is designed by reducing the matching space between new orders and riders. Second, to overcome the high dynamism and satisfy the stringent requirements on decision time, a reinforcement learning based dispatching heuristic is presented. To be specific, a sequence-to-sequence neural network is constructed based on the problem characteristic to generate an order priority sequence. Besides, a training approach is specially designed to improve learning performance. Furthermore, a greedy heuristic is employed to effectively dispatch new orders according to the order priority sequence. On real-world datasets, numerical experiments are conducted to validate the effectiveness of the proposed algorithm. Statistical results show that the proposed algorithm can effectively solve the problem by improving delivery efficiency and maintaining customer satisfaction.

Keywords

order dispatching on-demand delivery reinforcement learning decoupling strategy sequence-to-sequence neural network

References

[1]

Y.

Huang

, Y.

Chai

, Y.

Liu

, and J.

Shen

, Architecture of next-generation e-commerce platform, Tsinghua Science and Technology, vol. 24, no. 1, pp. 18–29, 2019.

Crossref Google Scholar

[2]

J.

Tao

, H.

Dai

, W.

Chen

, and H.

Jiang

, The value of personalized dispatch in O2O on-demand delivery services, Eur. J. Oper. Res., vol. 304, no. 3, pp. 1022–1035, 2023.

Crossref Google Scholar

[3]

Statista, Online food delivery — worldwide, https://www.statista.com/outlook/dmo/eservices/online-food-delivery/worldwide#revenue, 2023.

[4]

D.

Reyes

, A.

Erera

, M.

Savelsbergh

, S.

Sahasrabudhe

, and R.

O’Neil

, The meal delivery routing problem, http://www.optimization-online.org/DB_FILE/2018/04/6571.pdf, 2018.

[5]

B.

Yildiz

and M.

Savelsbergh

, Provably high-quality solutions for the meal delivery routing problem, Transp. Sci., vol. 53, no. 5, pp. 1372–1388, 2019.

Crossref Google Scholar

[6]

M.

Cosmi

, G.

Oriolo

, V.

Piccialli

, and P.

Ventura

, Single courier single restaurant meal delivery (without routing), Oper. Res. Lett., vol. 47, no. 6, pp. 537–541, 2019.

Crossref Google Scholar

[7]

M.

Cosmi

, G.

Nicosia

, and A.

Pacifici

, Scheduling for last-mile meal-delivery processes, IFAC-PapersOnLine, vol. 52, no. 13, pp. 511–516, 2019.

Crossref Google Scholar

[8]

H.

Yu

, X.

Luo

, and T.

Wu

, Online pickup and delivery problem with constrained capacity to minimize latency, J. Comb. Optim., vol. 43, pp. 974–993, 2022.

Crossref Google Scholar

[9]

Z.

Steever

, M.

Karwan

, and C.

Murray

, Dynamic courier routing for a food delivery service, Comput. Oper. Res., vol. 107, pp. 173–188, 2019.

Crossref Google Scholar

[10]

K.

Wang

, Y.

Zhou

, and L.

Zhang

, A workload-balancing order dispatch scheme for O2O food delivery with order splitting choice, J. Theor. Appl. Electron. Commer. Res., vol. 17, no. 1, pp. 295–312, 2022.

Crossref Google Scholar

[11]

M. W.

Ulmer

, B. W.

Thomas

, A. M.

Campbell

, and N.

Woyak

, The restaurant meal delivery problem: Dynamic pickup and delivery with deadlines and random ready times, Transp. Sci., vol. 55, no. 1, pp. 75–100, 2021.

Crossref Google Scholar

[12]

S.

Liu

, L.

He

, and Z. J.

Max Shen

, On-time last-mile delivery: Order assignment with travel-time predictors, Manag. Sci., vol. 67, no. 7, pp. 4095–4119, 2020.

Crossref Google Scholar

[13]

A.

Kohar

and S. K.

Jakhar

, A capacitated multi pickup online food delivery problem with time windows: A branch-and-cut algorithm, Ann. Oper. Res., .

Crossref Google Scholar

[14]

Y.

Liu

, An optimization-driven dynamic vehicle routing algorithm for on-demand meal delivery using drones, Comput. Oper. Res., vol. 111, pp. 1–20, 2019.

Crossref Google Scholar

[15]

H.

Huang

, C.

Hu

, J.

Zhu

, M.

Wu

, and R.

Malekian

, Stochastic task scheduling in UAV-based intelligent on-demand meal delivery system, IEEE Trans. Intell. Transp. Syst., vol. 23, no. 8, pp. 13040–13054, 2022.

Crossref Google Scholar

[16]

J. F.

Chen

, L.

Wang

, S.

Wang

, X.

Wang

, and H.

Ren

, An effective matching algorithm with adaptive tie-breaking strategy for online food delivery problem, Complex Intell. Syst., vol. 8, pp. 107–128, 2022.

Crossref Google Scholar

[17]

J. F.

Chen

, L.

Wang

, H.

Ren

, J.

Pan

, S.

Wang

, J.

Zheng

, and X.

Wang

, An imitation learning-enhanced iterated matching algorithm for on-demand food delivery, IEEE Trans. Intell. Transp. Syst., vol. 23, no. 10, pp. 18603–18619, 2022.

Crossref Google Scholar

[18]

J.

Zheng

, L.

Wang

, L.

Wang

, S.

Wang

, J. F.

Chen

, and X.

Wang

, Solving stochastic online food delivery problem via iterated greedy algorithm with decomposition-based strategy, IEEE Trans. Syst. Man Cybern.: Syst., vol. 53, no. 2, pp. 957–969, 2023.

Crossref Google Scholar

[19]

J. F.

Chen

, S.

Wang

, L.

Wang

, J.

Zheng

, Y.

Cha

, J.

Hao

, R.

He

, and Z.

Sun

, A hybrid differential evolution algorithm for the online meal delivery problem, in Proc. 2020 IEEE Congress on Evolutionary Computation (CEC), Glasgow, UK, 2020, pp. 1–8.

Crossref Google Scholar

[20]

X.

Wang

, S.

Wang

, L.

Wang

, H.

Zheng

, J.

Hao

, R.

He

, and Z.

Sun

, An effective iterated greedy algorithm for online route planning problem, in Proc. 2020 IEEE Congr. Evol. Comput. (CEC), Glasgow, UK, 2020, pp. 1–8.

Crossref Google Scholar

[21]

X.

Wang

, L.

Wang

, S.

Wang

, J. F.

Chen

, and C.

Wu

, An XGBoost-enhanced fast constructive algorithm for food delivery route planning problem, Comput. Ind. Eng., vol. 152, p. 107029, 2021.

Crossref Google Scholar

[22]

K. K.

Kottakki

, S.

Rathee

, K.

Mitra Adusumilli

, J.

Mathew

, B.

Nayak

, and S.

Ahuja

, Customer experience driven assignment logic for online food delivery, in Proc. 2020 IEEE Int. Conf. Ind. Eng. Eng. Manag. (IEEM), Singapore, 2020, pp. 827–831.

Crossref Google Scholar

[23]

S.

Paul

, S.

Rathee

, J.

Matthew

, and K. M.

Adusumilli

, An optimization framework for on-demand meal delivery system, in Proc. 2020 IEEE Int. Conf. Ind. Eng. Eng. Manag. (IEEM), Singapore, 2020, pp. 822–826.

Crossref Google Scholar

[24]

M.

Joshi

, A.

Singh

, S.

Ranu

, A.

Bagchi

, P.

Karia

, and P.

Kala

, Batching and matching for food delivery in dynamic road networks, http://arxiv.org/abs/2008.12905, 2020.

[25]

H.

Jahanshahi

, A.

Bozanta

, M.

Cevik

, E. M.

Kavuk

, A.

Tosun

, S. B.

Sonuc

, B.

Kosucu

, and A.

Başar

, A deep reinforcement learning approach for the meal delivery problem, Knowl. Based Syst., vol. 243, p. 108489, 2022.

Crossref Google Scholar

[26]

A.

Bozanta

, M.

Cevik

, C.

Kavaklioglu

, E. M.

Kavuk

, A.

Tosun

, S. B.

Sonuc

, A.

Duranel

, and A.

Basar

, Courier routing and assignment for food delivery service using reinforcement learning, Comput. Ind. Eng., vol. 164, p. 107871, 2022.

Crossref Google Scholar

[27]

J.

Hu

, H.

Wu

, B.

Zhong

, and R.

Xiao

, Swarm intelligence-based optimisation algorithms: An overview and future research issues, Int. J. Autom. Control, vol. 14, nos. 5&6, pp. 656–693, 2020.

Crossref Google Scholar

[28]

W.

Fang

, R.

Min

, and Q.

Wang

, Large-scale global optimisation using cooperative co-evolution with self-adaptive differential grouping, Int. J. Autom. Control, vol. 15, no. 1, pp. 58–77, 2021.

Crossref Google Scholar

[29]

W.

Fan

, P.

Chen

, D.

Shi

, X.

Guo

, and L.

Kou

, Multi-agent modeling and simulation in the AI age, Tsinghua Science and Technology, vol. 26, no. 5, pp. 608–624, 2021.

Crossref Google Scholar

[30]

F.

Zhao

, S.

Di

, J.

Cao

, J.

Tang

, and

Jonrinaldi

, A novel cooperative multi-stage hyper-heuristic for combination optimization problems, Complex System Modeling and Simulation, vol. 1, no. 2, pp. 91–108, 2021.

Crossref Google Scholar

[31]

L.

Wang

, Z.

Pan

, and J.

Wang

, A review of reinforcement learning based intelligent optimization for manufacturing scheduling, Complex System Modeling and Simulation, vol. 1, no. 4, pp. 257–270, 2021.

Crossref Google Scholar

Tsinghua Science and Technology

Volume 29 Issue 2,
April 2024

Pages 386-399

DOI: 10.26599/TST.2023.9010069

Cite this article:

Chen J, Wang L, Pan Z, et al. A Matching Algorithm with Reinforcement Learning and Decoupling Strategy for Order Dispatching in On-Demand Food Delivery. Tsinghua Science and Technology, 2024, 29(2): 386-399. https://doi.org/10.26599/TST.2023.9010069

About Us

Learn about Open Access

Tsinghua University Press

Publish with Us

Peer Review Policy

Copyright and Licensing

Article Processing Charge

Contact Us

Journal Collaboration: Yao Meng (Ms.)✉️ +86-10-83470574

Technical Support: Kuo Zhao (Mr.)✉️ +86-10-83470507

Media Contact: Hao Jin (Mr.)✉️ +86-10-83470559

Address: Floor 6, Tower B, Xueyan Building, Shuangqing Road, Haidian District, Beijing 100084, China.

SciOpen——中国科技期刊卓越行动计划支持项目

Copyright © 2025 Tsinghua University Press Ltd.

京ICP备 10035462号-42 京公网安备11010802044758号