Open Access

Monte Carlo Simulation-Based Robust Workflow Scheduling for Spot Instances in Cloud Environments

Quanwang Wu¹, Jianzhao Fang¹, Jie Zeng², Junhao Wen³, Fengji Luo⁴

¹ College of Computer Science, Chongqing University, Chongqing 400044, China

² National Experimental Teaching Demonstration Center, Chongqing University, Chongqing 400044, China

³ College of Big Data and Software Engineering, Chongqing University, Chongqing 400044, China

⁴ School of Civil Engineering, The University of Sydney, Sydney 2006, Australia

Abstract

When deploying workflows in cloud environments, Spot Instances (SIs) are attractive because they are much cheaper than on-demand instances. However, SIs are volatile and may be revoked at any time. This makes the scheduling problem more challenging, since execution can be interrupted, and conventional cloud workflow scheduling techniques cannot handle it successfully. Although some scheduling methods for SIs have been proposed, most of them are no longer applicable to the latest SIs, which have evolved by eliminating bidding and simplifying the pricing model. This study focuses on minimizing the execution cost under a deadline constraint when deploying a workflow on volatile SIs in cloud environments. Based on Monte Carlo simulation and list scheduling, a stochastic scheduling method called MCLS is devised to optimize a utility function introduced for this problem. Within the Monte Carlo simulation framework, MCLS uses sampled task execution times to build candidate solutions via deadline distribution and list scheduling, and then returns the most robust solution among the candidates according to a specific evaluation mechanism and selection criteria. Experimental results show that MCLS is more competitive than traditional algorithms.
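The general idea of evaluating candidate plans by Monte Carlo sampling and keeping the most robust one can be illustrated with a small Python sketch. This is not the MCLS algorithm from the paper: the `Candidate` class, the noise model, the revocation penalty, and the selection rule (highest deadline success rate, ties broken by lower mean cost) are all hypothetical assumptions chosen only to keep the example self-contained.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical candidate plan: run all tasks in sequence on one instance type."""
    name: str
    speedup: float          # base task time is divided by this factor
    price_per_hour: float   # assumed flat hourly price
    revoke_prob: float      # assumed per-task chance of a revocation


def sample_makespan(base_times, cand, rng, restart_penalty=0.5):
    """Draw one sample of the total execution time (hours) for a candidate."""
    total = 0.0
    for t in base_times:
        # Assumed noise model: actual time varies between 90% and 130% of nominal.
        run = t / cand.speedup * rng.uniform(0.9, 1.3)
        if rng.random() < cand.revoke_prob:
            # Assumed penalty: a revocation costs extra time for redone work.
            run += restart_penalty * run
        total += run
    return total


def evaluate(base_times, cand, deadline, n_samples, seed=0):
    """Estimate the deadline success rate and mean cost by Monte Carlo sampling."""
    rng = random.Random(seed)
    hits, cost_sum = 0, 0.0
    for _ in range(n_samples):
        makespan = sample_makespan(base_times, cand, rng)
        hits += makespan <= deadline
        cost_sum += makespan * cand.price_per_hour
    return hits / n_samples, cost_sum / n_samples


def most_robust(base_times, candidates, deadline, n_samples=2000):
    """Return (candidate, success_rate, mean_cost) with the highest success rate,
    breaking ties by the lower mean cost. This rule is an illustrative assumption."""
    scored = [(c, *evaluate(base_times, c, deadline, n_samples)) for c in candidates]
    return max(scored, key=lambda s: (s[1], -s[2]))
```

Calling `most_robust([1.0, 1.0, 1.0], candidates, deadline=4.0)` with a list of `Candidate` objects returns the chosen candidate together with its estimated success rate and mean cost. A real scheduler would sample over a workflow graph rather than a simple task chain.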

Keywords

constrained optimization; Monte Carlo simulation; robustness; Spot Instances (SIs); workflow scheduling

Tsinghua Science and Technology

Volume 29, Issue 1, February 2024

Pages 112-126

DOI: 10.26599/TST.2022.9010065

Cite this article:

Wu Q, Fang J, Zeng J, et al. Monte Carlo Simulation-Based Robust Workflow Scheduling for Spot Instances in Cloud Environments. Tsinghua Science and Technology, 2024, 29(1): 112-126. https://doi.org/10.26599/TST.2022.9010065
