Sort:
Open Access Issue
More Bang for Your Buck: Boosting Performance with Capped Power Consumption
Tsinghua Science and Technology 2021, 26(3): 370-383
Published: 12 October 2020
Abstract PDF (5.3 MB) Collect
Downloads:18

Achieving faster performance without increasing power and energy consumption for computing systems is an outstanding challenge. This paper develops a novel resource allocation scheme for memory-bound applications running on High-Performance Computing (HPC) clusters, aiming to improve application performance without breaching peak power constraints and total energy consumption. Our scheme estimates how the number of processor cores and CPU frequency setting affects the application performance. It then uses the estimate to provide additional compute nodes to memory-bound applications if it is profitable to do so. We implement and apply our algorithm to 12 representative benchmarks from the NAS parallel benchmark and HPC Challenge (HPCC) benchmark suites and evaluate it on a representative HPC cluster. Experimental results show that our approach can effectively mitigate memory contention to improve application performance, and it achieves this without significantly increasing the peak power and overall energy consumption. Our approach obtains on average 12.69% performance improvement over the default resource allocation strategy, but uses 7.06% less total power, which translates into 17.77% energy savings.

Open Access Issue
Brief Introduction of TianHe Exascale Prototype System
Tsinghua Science and Technology 2021, 26(3): 361-369
Published: 12 October 2020
Abstract PDF (40.7 MB) Collect
Downloads:82

Facing the challenges of the next generation exascale computing, National University of Defense Technology has developed a prototype system to explore opportunities, solutions, and limits toward the next generation Tianhe system. This paper briefly introduces the prototype system, which is deployed at the National Supercomputer Center in Tianjin and has a theoretical peak performance of 3.15 Pflops. A total of 512 compute nodes are found where each node has three proprietary CPUs called Matrix-2000+. The system memory is 98.3 TB, and the storage is 1.4 PB in total.

Open Access Issue
A Holistic Energy-Efficient Approach for a Processor-Memory System
Tsinghua Science and Technology 2019, 24(4): 468-483
Published: 07 March 2019
Abstract PDF (2.8 MB) Collect
Downloads:29

Component overclocking is an effective approach to speed up the components of a system to realize a higher program performance; it includes processor overclocking or memory overclocking. However, overclocking will unavoidably result in increase in power consumption. Our goal is to optimally improve the performance of scientific computing applications without increasing the total power consumption for a processor-memory system. We built a processor-memory energy efficiency model for multicore-based systems, which coordinates the performance and power of processor and memory. Our model exploits performance boost opportunities for a processor-memory system by adopting processor overclocking, processor Dynamic Voltage and Frequency Scaling (DVFS), memory active ratio adjustment, and memory overclocking, according to different scientific applications. This model also provides a total power control method by considering the same four factors mentioned above. We propose a processor and memory Coordination-based holistic Energy-Efficient (CEE) algorithm, which achieves performance improvement without increasing the total power consumption. The experimental results show that an average of 9.3% performance improvement was obtained for all 14 benchmarks. Meanwhile the total power consumption does not increase. The maximal performance improvement was up to 13.1% from dedup benchmark. Our experiments validate the effectiveness of our holistic energy-efficient model and technology.

Total 3