Regular Paper

A Case for Adaptive Resource Management in Alibaba Datacenter Using Neural Networks

State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
University of Chinese Academy of Sciences, Beijing 100049, China
Peng Cheng Laboratory, Shenzhen 518055, China
Alibaba Inc., Hangzhou 311121, China
Department of Computer Science, Wayne State University, Michigan, MI 48202, U.S.A.

Abstract

Resource efficiency and application QoS have both long been major concerns of datacenter operators, yet the two remain at odds. High resource utilization increases the risk of resource contention between co-located workloads, which causes latency-critical (LC) applications to suffer unpredictable, and even unacceptable, performance. Much prior work has been devoted to effective mechanisms that protect the QoS of LC applications while improving resource efficiency. In this paper, we propose MAGI, a resource management runtime that leverages neural networks to monitor performance interference and pinpoint its root cause, and then adjusts the resource shares of the corresponding applications to ensure the QoS of LC applications. MAGI represents a practical effort in the Alibaba datacenter to provide on-demand resource adjustment for applications using neural networks. The experimental results show that MAGI can reduce the performance degradation of an LC application by up to 87.3% when it is co-located with antagonist applications.

Electronic Supplementary Material

jcst-35-1-209-Highlights.pdf (1.1 MB)
Journal of Computer Science and Technology
Pages 209-220
Cite this article:
Wang S, Zhu Y-H, Chen S-P, et al. A Case for Adaptive Resource Management in Alibaba Datacenter Using Neural Networks. Journal of Computer Science and Technology, 2020, 35(1): 209-220. https://doi.org/10.1007/s11390-020-9732-x


Received: 22 May 2019
Revised: 14 October 2019
Published: 10 January 2020
© Institute of Computing Technology, Chinese Academy of Sciences 2020