EasySVM: A visual analysis approach for open-box support vector machines

Yuxin Ma; Wei Chen; Xiaohong Ma; Jiayi Xu; Xinxin Huang; Ross Maciejewski; Anthony K. H. Tung

doi:10.1007/s41095-017-0077-5

Research Article | Open Access

Yuxin Ma¹, Wei Chen¹, Xiaohong Ma¹, Jiayi Xu¹, Xinxin Huang¹, Ross Maciejewski², Anthony K. H. Tung³

¹ State Key Lab of CAD&CG, Zhejiang University, Hangzhou, 310058, China.

² Arizona State University, USA.

³ National University of Singapore, Singapore.

Abstract

Support vector machines (SVMs) are supervised learning models traditionally employed for classification and regression analysis. In classification analysis, a set of training data is chosen, and each instance in the training data is assigned a categorical class. An SVM then constructs a model based on a separating plane that maximizes the margin between different classes. Although SVMs are among the most popular classification models because of their strong empirical performance, understanding the knowledge captured in an SVM remains difficult. SVMs are typically applied in a black-box manner, where the details of parameter tuning, training, and even the final constructed model are hidden from users. This is natural, since these details are often complex and difficult to understand without proper visualization tools. However, such an approach often causes problems, including trial-and-error tuning and suspicious users who are forced to trust these models blindly.

The contribution of this paper is a visual analysis approach for building SVMs in an open-box manner. Our goal is to improve an analyst’s understanding of the SVM modeling process through a suite of visualization techniques that allow users to have full interactive visual control over the entire SVM training process. Our visual exploration tools have been developed to enable intuitive parameter tuning, training data manipulation, and rule extraction as part of the SVM training process. To demonstrate the efficacy of our approach, we conduct a case study using a real-world robot control dataset.
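As a rough illustration of the margin-maximizing classifier that the abstract describes, the following sketch trains a linear soft-margin SVM on a toy two-class dataset using full-batch subgradient descent on the regularized hinge loss. It is not the EasySVM system or the authors' implementation. The function names (`train_linear_svm`, `predict`), the data points, and the hyperparameters (`lam`, `lr`, `epochs`) are all assumptions chosen for illustration.

```python
# Illustrative sketch only: a tiny linear soft-margin SVM in pure Python.
# All names, data, and hyperparameters are hypothetical, not from the paper.

def train_linear_svm(points, labels, lam=0.01, lr=0.01, epochs=2000):
    """Minimize lam/2 * ||w||^2 + mean(max(0, 1 - y * (w.x + b)))."""
    dim = len(points[0])
    n = len(points)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        # Gradient of the regularization term.
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for x, y in zip(points, labels):
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            if margin < 1.0:
                # The point lies inside the margin or is misclassified,
                # so the hinge loss contributes a subgradient.
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating plane."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Toy, linearly separable training data with categorical labels -1 / +1.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0),
          (4.0, 4.0), (5.0, 4.0), (4.0, 5.0)]
labels = [-1, -1, -1, 1, 1, 1]

w, b = train_linear_svm(points, labels)
predictions = [predict(w, b, x) for x in points]
```

Even in this small setting, the regularization weight, step size, and number of epochs are typically picked by trial and error, which is the kind of tuning the abstract argues should be visible to the analyst.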

Keywords

support vector machines (SVMs); rule extraction; visual classification; high-dimensional visualization; visual analysis

Electronic Supplementary Material

Video

41095_2017_77_MOESM1_ESM.mp4

Computational Visual Media

Volume 3, Issue 2, June 2017

Pages 161-175

Cite this article:

Ma Y, Chen W, Ma X, et al. EasySVM: A visual analysis approach for open-box support vector machines. Computational Visual Media, 2017, 3(2): 161-175. https://doi.org/10.1007/s41095-017-0077-5