Support vector machines (SVMs) are supervised learning models traditionally employed for classification and regression analysis. In classification analysis, a set of training data is chosen, and each instance in the training data is assigned a categorical class. An SVM then constructs a model based on a separating plane that maximizes the margin between different classes. Although SVMs are among the most popular classification models because of their strong empirical performance, understanding the knowledge captured in an SVM remains difficult. SVMs are typically applied in a black-box manner, where the details of parameter tuning, training, and even the final constructed model are hidden from users. This is natural, since these details are often complex and difficult to understand without proper visualization tools. However, such an approach often causes problems, including trial-and-error tuning and suspicious users who are forced to trust these models blindly.
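
To make the notion of a margin-maximizing separating plane concrete, the following is a minimal sketch, not the method of the paper. It computes the geometric margin of two candidate planes on a hypothetical toy dataset; the function name `geometric_margin`, the data, and the two candidate planes are all assumptions chosen for illustration. An SVM would prefer the plane with the larger margin.

```python
import math

def geometric_margin(w, b, points, labels):
    """Smallest signed distance from any labelled point to the plane w.x + b = 0.

    A positive result means every point lies on the side matching its label.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, y in zip(points, labels)
    )

# Hypothetical toy data: two classes in 2-D, labelled -1 and +1.
points = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
labels = [-1, -1, 1, 1]

# Two candidate separating planes: the vertical lines x = 2 and x = 1.5.
centered = geometric_margin((1.0, 0.0), -2.0, points, labels)
off_center = geometric_margin((1.0, 0.0), -1.5, points, labels)
```

Both planes separate the classes, but the centered one leaves twice the distance to the nearest training point, which is the criterion the abstract refers to.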

The contribution of this paper is a visual analysis approach for building SVMs in an open-box manner. Our goal is to improve an analyst’s understanding of the SVM modeling process through a suite of visualization techniques that allow users to have full interactive visual control over the entire SVM training process. Our visual exploration tools have been developed to enable intuitive parameter tuning, training data manipulation, and rule extraction as part of the SVM training process. To demonstrate the efficacy of our approach, we conduct a case study using a real-world robot control dataset.

Open Access Issue

An Online Visualization System for Streaming Log Data of Computing Clusters

Jing Xia, Feiran Wu, Fangzhou Guo, Cong Xie, Zhen Liu, Wei Chen

Tsinghua Science and Technology 2013, 18(2): 196-205

Published: 30 April 2013

Abstract


Monitoring a computing cluster requires collecting and understanding log data generated at the core, computer, and cluster levels at run time. Visualizing the log data of a computing cluster is a challenging problem due to the complexity of the underlying dataset: it is streaming, hierarchical, heterogeneous, and multi-sourced. This paper presents an integrated visualization system that employs a two-stage streaming process mode. Prior to the visual display of the multi-sourced information, the data generated from the clusters is gathered, cleaned, and modeled within a data processor. The visualization, supported by a visual computing processor, consists of a set of multivariate and time-variant visualization techniques, including time-sequence charts, treemaps, and parallel coordinates. Novel techniques to illustrate temporal trends and abnormal status are also introduced. We demonstrate the effectiveness and scalability of the proposed system framework on a commodity cloud-computing platform.
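
The data-processing stage described above (gather, clean, model) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not specify the record format or the modeling step, so the field names (`t`, `node`, `core`, `load`), the functions `clean` and `model`, and the per-node averaging are all hypothetical. The sketch processes one small window of records rather than an unbounded stream.

```python
from collections import defaultdict

def clean(records):
    """Drop malformed records and normalize field types (hypothetical schema)."""
    for rec in records:
        try:
            yield {
                "t": int(rec["t"]),
                "node": str(rec["node"]),
                "core": int(rec["core"]),
                "load": float(rec["load"]),
            }
        except (KeyError, TypeError, ValueError):
            continue  # skip records that cannot be parsed

def model(records):
    """Roll core-level readings up to a per-node average for each time step."""
    sums = defaultdict(lambda: [0.0, 0])
    for rec in records:
        acc = sums[(rec["t"], rec["node"])]
        acc[0] += rec["load"]
        acc[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}

# Hypothetical gathered records from several sources; one is malformed.
raw = [
    {"t": 0, "node": "n1", "core": 0, "load": "0.25"},
    {"t": 0, "node": "n1", "core": 1, "load": "0.75"},
    {"t": 0, "node": "n2", "core": 0, "load": "bad"},
    {"t": 1, "node": "n1", "core": 0, "load": 1.0},
]
node_load = model(clean(raw))
```

The resulting dictionary, keyed by time step and node, is the kind of hierarchical summary that a second, visual stage could feed into a treemap or time-sequence chart.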
