Open Access Issue
Adaptive Model Compression for Steel Plate Surface Defect Detection: An Expert Knowledge and Working Condition-Based Approach
Tsinghua Science and Technology 2024, 29(6): 1851-1871
Published: 20 June 2024

The steel plate is one of the main products of the steel industry, and its surface quality directly affects final product performance. Detecting surface defects of steel plates in real time during production is a challenging problem. A single or fixed model compression method cannot be applied directly to steel surface defect detection, because it is difficult to account for the diversity of production tasks, the uncertainty caused by environmental factors such as communication networks, and the influence of process and working conditions in steel plate production. In this paper, we propose an adaptive model compression method for online steel surface defect detection based on expert knowledge and working conditions. First, we establish an expert system that provides lightweight model parameters based on the correlation between defect types and manufacturing processes. Then, the lightweight model parameters are adaptively adjusted according to working conditions, which improves detection accuracy while ensuring real-time performance. The experimental results show that, compared with a detection method that uses a model with constant lightweight parameters, the proposed method cuts the total detection time by 23.1% and increases the deadline satisfaction ratio by 36.5%, while improving accuracy by 4.2% and reducing the false detection rate by 4.3%.

Open Access Issue
Thermal-Aware on-Device Inference Using Single-Layer Parallelization with Heterogeneous Processors
Tsinghua Science and Technology 2023, 28(1): 82-92
Published: 21 July 2022

Numerous neural network (NN) applications are now being deployed to mobile devices. These applications usually involve large amounts of computation and data while requiring low inference latency, which challenges the computing ability of mobile devices. Moreover, devices’ lifespan and performance depend on temperature. Hence, in many scenarios where environmental temperatures are usually high, such as industrial production and automotive systems, it is important to control devices’ temperatures to maintain steady operation. In this paper, we propose a thermal-aware channel-wise heterogeneous NN inference algorithm. It contains two parts: the thermal-aware dynamic frequency (TADF) algorithm and the heterogeneous-processor single-layer workload distribution (HSWD) algorithm. Depending on a mobile device’s architecture characteristics and environmental temperature, TADF selects an appropriate running speed for the central processing unit and graphics processing unit. The workload of each layer in the NN model is then distributed by HSWD according to each processor’s running speed and the characteristics of both the layers and the heterogeneous processors. The experimental results, obtained with representative NNs and mobile devices, show that the proposed method improves the speed of on-device inference by 21%–43% over the traditional inference method.
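
As a rough illustration of distributing a single layer's workload across heterogeneous processors, the sketch below splits a layer's output channels between processors in proportion to their relative running speeds. It is not the HSWD algorithm from the paper: the function name `split_channels`, the processor labels, the speed values, and the proportional rule are all assumptions made for illustration.

```python
def split_channels(num_channels, speeds):
    """Split a layer's output channels across processors by relative speed.

    Hypothetical helper, not the paper's HSWD algorithm.
    `speeds` maps a processor name to a positive relative running speed.
    """
    if num_channels < 0:
        raise ValueError("num_channels must be non-negative")
    if not speeds or any(s <= 0 for s in speeds.values()):
        raise ValueError("speeds must be a non-empty mapping of positive values")

    total = sum(speeds.values())
    # Ideal (fractional) share of channels for each processor.
    ideal = {name: num_channels * s / total for name, s in speeds.items()}
    # Start from the floor of each ideal share.
    shares = {name: int(value) for name, value in ideal.items()}
    # Give leftover channels to the processors with the largest remainders.
    leftover = num_channels - sum(shares.values())
    by_remainder = sorted(speeds, key=lambda n: ideal[n] - shares[n], reverse=True)
    for name in by_remainder[:leftover]:
        shares[name] += 1
    return shares


# Assumed example: the GPU runs three times as fast as the CPU.
print(split_channels(64, {"cpu": 1.0, "gpu": 3.0}))  # {'cpu': 16, 'gpu': 48}
```

In a thermal-aware setting, the speed values would change as processor frequencies are adjusted, and the split would be recomputed accordingly.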

Open Access Issue
A Hybrid Unsupervised Clustering-Based Anomaly Detection Method
Tsinghua Science and Technology 2021, 26(2): 146-153
Published: 24 July 2020

In recent years, machine learning-based cyber intrusion detection methods have gained increasing popularity. The number and complexity of new attacks continue to rise; therefore, effective and intelligent solutions are necessary. Unsupervised machine learning techniques are particularly appealing to intrusion detection systems since they can detect known and unknown types of attacks as well as zero-day attacks. In the current paper, we present an unsupervised anomaly detection method, which combines Sub-Space Clustering (SSC) and One Class Support Vector Machine (OCSVM) to detect attacks without any prior knowledge. The proposed approach is evaluated using the well-known NSL-KDD dataset. The experimental results demonstrate that our method performs better than some of the existing techniques.

Open Access Issue
VirtCO: Joint Coflow Scheduling and Virtual Machine Placement in Cloud Data Centers
Tsinghua Science and Technology 2019, 24(5): 630-644
Published: 29 April 2019

Cloud data centers, such as Amazon EC2, host myriad big data applications using Virtual Machines (VMs). As these applications are communication-intensive, optimizing network transfer between VMs is critical to both application performance and the network utilization of data centers. Previous studies have addressed this issue by scheduling network flows with coflow semantics or optimizing VM placement with traffic considerations. However, coflow scheduling and VM placement have been conducted orthogonally. In fact, these two mechanisms are mutually dependent, and optimizing these two complementary degrees of freedom independently turns out to be suboptimal. In this paper, we present VirtCO, a practical framework that jointly schedules coflows and places VMs ahead of VM launch to optimize the overall performance of data center applications. We model the joint coflow scheduling and VM placement optimization problem, and propose effective heuristics for solving it. We further implement VirtCO with OpenStack and deploy it in a testbed environment. Extensive evaluation on real-world traces shows that, compared with state-of-the-art solutions, VirtCO reduces the average coflow completion time by up to 36.5%. This new framework is also compatible with and readily deployable within existing data center architectures.

Open Access Issue
Task-Aware Flow Scheduling with Heterogeneous Utility Characteristics for Data Center Networks
Tsinghua Science and Technology 2019, 24(4): 400-411
Published: 07 March 2019

With the continuous enrichment of cloud services, an increasing number of applications are being deployed in data centers. These emerging applications are often communication-intensive and data-parallel, and their performance is closely related to the underlying network. Because of their distributed nature, the applications consist of tasks that involve a collection of parallel flows. Traditional techniques to optimize flow-level metrics are agnostic to task-level requirements, leading to poor application-level performance. In this paper, we address the heterogeneous task-level requirements of applications and propose task-aware flow scheduling. First, we model tasks’ sensitivity to their completion time by utilities. Second, on the basis of Nash bargaining theory, we establish a flow scheduling model with heterogeneous utility characteristics, and analyze it using the Lagrange multiplier method and the Karush-Kuhn-Tucker (KKT) conditions. Third, we propose two utility-aware bandwidth allocation algorithms with different practical constraints. Finally, we present Tasch, a system that enables tasks to maintain high utilities and guarantees the fairness of utilities. To demonstrate the feasibility of our system, we conduct comprehensive evaluations with a real-world traffic trace. Compared with per-flow mechanisms, Tasch completes communication stages up to 1.4× faster on average, increases task utilities by up to 2.26×, and improves the fairness of tasks by up to 8.66×.
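
To make the Nash bargaining idea concrete, the sketch below solves a textbook special case: sharing one link of fixed capacity among tasks by maximizing the weighted sum of log-allocations. It is not the model or the algorithms from the paper. The function name `nash_bargaining_allocation`, the logarithmic utility, the zero disagreement point, and the single-link setting are assumptions chosen so that the KKT conditions have a closed-form solution.

```python
def nash_bargaining_allocation(capacity, weights):
    """Weighted Nash bargaining split of one link's bandwidth.

    Hypothetical textbook example, not the Tasch algorithm.
    Solves: maximize sum_i w_i * log(x_i)  subject to  sum_i x_i = capacity.

    Lagrangian: sum_i w_i * log(x_i) - lam * (sum_i x_i - capacity).
    Stationarity (KKT): w_i / x_i = lam, so x_i = w_i / lam.
    The capacity constraint then gives lam = sum(w) / capacity,
    hence x_i = capacity * w_i / sum(w).
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    if not weights or any(w <= 0 for w in weights):
        raise ValueError("weights must be a non-empty list of positive values")

    total_weight = sum(weights)
    return [capacity * w / total_weight for w in weights]


# Assumed example: three tasks with bargaining weights 1, 2, and 3
# share a link of capacity 12 (arbitrary bandwidth units).
print(nash_bargaining_allocation(12.0, [1, 2, 3]))  # [2.0, 4.0, 6.0]
```

With equal weights the split is even; larger weights stand in for tasks that are more sensitive to completion time and therefore receive more bandwidth.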

Open Access Issue
Distributed Storage System for Electric Power Data Based on HBase
Big Data Mining and Analytics 2018, 1(4): 324-334
Published: 02 July 2018

Managing massive electric power data is a typical big data application because electric power systems generate millions or billions of status, debugging, and error records every day. To guarantee the safety and sustainability of electric power systems, massive electric power data need to be processed and analyzed quickly to support real-time decisions. Traditional solutions typically use relational databases to manage electric power data. However, relational databases cannot efficiently process and analyze massive electric power data when the data size increases significantly. In this paper, we show how electric power data can be managed using HBase, a distributed database maintained by Apache. Our system consists of clients, an HBase database, status monitors, data migration modules, and data fragmentation modules. We evaluate the performance of our system through a series of experiments. We also show how HBase’s parameters can be tuned to improve the efficiency of our system.
