Scholar - SciOpen

With the increasing demand for computility driven by emerging applications such as artificial intelligence, compilation technology, which serves as a crucial bridge between software and hardware, is facing unprecedented challenges and opportunities. This article focuses on the development trends of domain-specific compilers and discusses in depth the compilation techniques tailored for emerging domains. By examining aspects including whole-program operator fusion, dynamic-shape tensor compilation, software-hardware co-design, and computational security, this article summarizes and evaluates representative domain-specific compilation technologies for new application paradigms and architectures. The key role of domain-specific compilation technologies in adapting to diverse computing platforms, improving program execution efficiency, ensuring software security, and supporting hardware design is analyzed. Their application prospects and future work are also discussed.

Regular Paper Issue

Automatic Target Description File Generation

Hong-Na Geng, Fang Lyu, Ming Zhong, Hui-Min Cui, Jingling Xue, Xiao-Bing Feng

Journal of Computer Science and Technology 2023, 38(6): 1339-1355

Published: 15 November 2023

Agile hardware design is gaining increasing momentum and bringing new chips to the market faster and in larger quantities. However, it also poses new challenges for compiler developers, who must retarget existing compilers to these new chips in less time than ever before. Currently, retargeting a compiler backend, e.g., an LLVM backend, to a new target requires compiler developers to manually write a set of target description files (totalling 10300+ lines of code (LOC) for RISC-V in LLVM), which is error-prone and time-consuming. In this paper, we introduce a new approach, Automatic Target Description File Generation (ATG), which accelerates the generation of a compiler backend for a new target by generating its target description files automatically. Given a new target, ATG proceeds in two stages. First, ATG synthesizes a small list of target-specific properties and a list of code-layout templates from the target description files of a set of existing targets with similar instruction set architectures (ISAs). Second, ATG asks compiler developers to fill in the information for each instruction in the new target in tabular form according to the synthesized list of target-specific properties, and then generates the target description files automatically according to the synthesized list of code-layout templates. The first stage can often be reused by different new targets sharing similar ISAs. We evaluate ATG using nine RISC-V instruction sets drawn from a total of 1029 instructions in LLVM 12.0. ATG enables compiler developers to generate compiler backends for these ISAs that emit the same assembly code as the existing compiler backends for RISC-V but with significantly less development effort (by specifying each instruction in terms of up to 61 target-specific properties only).
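The second stage described above (a per-instruction table filled in by developers, then expanded through code-layout templates) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the template text, the property names (`name`, `fmt`, `opcode`, `mnemonic`), and the function `generate_target_description` are hypothetical and are not taken from ATG or from LLVM's actual RISC-V description files. The first stage, which synthesizes the properties and templates from existing targets, is not shown.

```python
from string import Template

# Hypothetical code-layout template, loosely shaped like a TableGen record.
# The layout and field names are illustrative assumptions only.
INSTR_TEMPLATE = Template('def $name : $fmt<$opcode, "$mnemonic">;')

# Hypothetical table: one row per instruction, one column per
# target-specific property (the abstract reports up to 61 such properties;
# only four made-up ones are used here).
INSTRUCTION_TABLE = [
    {"name": "ADD", "fmt": "RTypeInst", "opcode": "0b0110011", "mnemonic": "add"},
    {"name": "SUB", "fmt": "RTypeInst", "opcode": "0b0110011", "mnemonic": "sub"},
]


def generate_target_description(table, template=INSTR_TEMPLATE):
    """Expand every table row through the template, one record per line."""
    return "\n".join(template.substitute(row) for row in table)


if __name__ == "__main__":
    print(generate_target_description(INSTRUCTION_TABLE))
```

The point of the sketch is the division of labor: the template captures layout once, so the per-instruction effort reduces to filling in a table row.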

Regular Paper Issue

VTensor: Using Virtual Tensors to Build a Layout-Oblivious AI Programming Framework

Feng Yu, Jia-Cheng Zhao, Hui-Min Cui, Xiao-Bing Feng, Jingling Xue

Journal of Computer Science and Technology 2023, 38(5): 1074-1097

Published: 30 September 2023

Tensors are a popular programming interface for developing artificial intelligence (AI) algorithms. Layout refers to the order in which tensor data is placed in memory, and it affects performance through data locality; deep neural network libraries therefore impose conventions on the layout. Since AI applications can use arbitrary layouts, and existing AI systems do not provide programming abstractions to shield the layout conventions of libraries, operator developers need to write a lot of layout-related code, which reduces the efficiency of integrating new libraries or developing new operators. Furthermore, developers place layout conversion operations inside the operator to deal with the uncertainty of the input layout, thus losing the opportunity for layout optimization. Based on the idea of polymorphism, we propose a layout-agnostic virtual tensor programming interface, namely the VTensor framework, which enables developers to write new operators without caring about the underlying physical layout of tensors. In addition, the VTensor framework performs global layout inference at runtime to transparently resolve the required layout of virtual tensors, and applies runtime layout-oriented optimizations to globally minimize the number of layout transformation operations. Experimental results demonstrate that with VTensor, developers can avoid writing layout-dependent code. Compared with TensorFlow, for the 16 operations used in 12 popular networks, VTensor can reduce the lines of code (LOC) of writing a new operation by 47.82% on average, and improve the overall performance by 18.65% on average.
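The idea of writing an operator against logical indices while the physical layout varies can be shown with a small sketch. It is not the VTensor API: the class `VirtualTensor`, the helper `fill_logical`, and the operator `channel_sum` are hypothetical names introduced here, and the sketch covers only the index indirection, not the global layout inference or the minimization of layout transformations that the abstract describes.

```python
from itertools import product


class VirtualTensor:
    """Hypothetical wrapper that hides the physical layout of a 4-D tensor."""

    LOGICAL = "NCHW"  # axis order that operator code always uses

    def __init__(self, shape, layout="NCHW"):
        self.shape = tuple(shape)  # given in logical (N, C, H, W) order
        self.layout = layout       # physical order, e.g. "NCHW" or "NHWC"
        # perm[i] = logical axis stored at physical position i
        self._perm = [self.LOGICAL.index(axis) for axis in layout]
        phys_shape = [self.shape[p] for p in self._perm]
        # Row-major strides over the physical shape
        self._strides = [1] * 4
        for i in range(2, -1, -1):
            self._strides[i] = self._strides[i + 1] * phys_shape[i + 1]
        self.data = [0.0] * (self._strides[0] * phys_shape[0])

    def _offset(self, idx):
        return sum(idx[p] * s for p, s in zip(self._perm, self._strides))

    def __getitem__(self, idx):
        return self.data[self._offset(idx)]

    def __setitem__(self, idx, value):
        self.data[self._offset(idx)] = value


def fill_logical(t):
    """Store each element's logical row-major index as its value."""
    n, c, h, w = t.shape
    for i, j, y, x in product(range(n), range(c), range(h), range(w)):
        t[i, j, y, x] = float(((i * c + j) * h + y) * w + x)
    return t


def channel_sum(t):
    """Layout-oblivious operator: per-channel sum using logical indices only."""
    n, c, h, w = t.shape
    return [
        sum(t[i, j, y, x] for i in range(n) for y in range(h) for x in range(w))
        for j in range(c)
    ]


if __name__ == "__main__":
    a = fill_logical(VirtualTensor((1, 2, 2, 2), layout="NCHW"))
    b = fill_logical(VirtualTensor((1, 2, 2, 2), layout="NHWC"))
    print(channel_sum(a), channel_sum(b))  # same result for both layouts
    print(a.data == b.data)                # physical storage differs
```

In this sketch `channel_sum` contains no layout-specific code, so the same operator body serves both tensors even though their underlying buffers are ordered differently.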

Survey Issue

Reinvent Cloud Software Stacks for Resource Disaggregation

Chen-Xi Wang, Yi-Zhou Shan, Peng-Fei Zuo, Hui-Min Cui

Journal of Computer Science and Technology 2023, 38(5): 949-969

Published: 30 September 2023

Due to the unprecedented development of low-latency interconnect technology, building large-scale disaggregated architectures is drawing increasing attention from both industry and academia. Resource disaggregation is a new way to organize the hardware resources of datacenters, and has the potential to overcome the limitations of conventional datacenters, e.g., low resource utilization and low reliability. However, the emerging disaggregated architecture brings severe performance and latency problems to existing cloud systems. In this paper, we take memory disaggregation as an example to demonstrate the unique challenges that the disaggregated datacenter poses to existing cloud software stacks, e.g., the programming interface, language runtime, and operating system, and further discuss possible ways to reinvent cloud systems.
