Department of Computer Science and Technology and Key Lab of Intelligent Computing and Signal Processing, Anhui University, Hefei 230601, China
Abstract
Mining from ambiguous data is very important in data mining. This paper discusses one of the tasks of mining from ambiguous data, known as the multi-instance problem. In the multi-instance problem, each pattern is a labeled bag that consists of a number of unlabeled instances. A bag is negative if all instances in it are negative, and positive if it contains at least one positive instance. Because the instances in a positive bag are not individually labeled, each positive bag is ambiguous. The mining aim is to classify unseen bags. The main idea of existing multi-instance algorithms is to find the true positive instances in positive bags, convert the multi-instance problem into a supervised problem, and obtain the labels of test bags by predicting the labels of their instances. In this paper, we mine multi-instance data from another point of view, i.e., by excluding the false positive instances in positive bags and predicting the label of an entire unknown bag. We propose an algorithm called Multi-Instance Covering kNN (MICkNN) for mining multi-instance data. Briefly, the constructive covering algorithm is first utilized to reorganize the structure of the original multi-instance data. Then, the kNN algorithm is applied to discriminate the false positive instances. In the test stage, we label a test bag directly according to the similarity between the unseen bag and the sphere neighbors obtained in the first two steps. Experimental results demonstrate that the proposed algorithm is competitive with most state-of-the-art multi-instance methods in both classification accuracy and running time.
References
[1] T. G. Dietterich, R. H. Lathrop, and T. Lozano-Pérez, Solving the multiple instance problem with axis-parallel rectangles, Artificial Intelligence, vol. 89, no. 1, pp. 31-71, 1997.
B. B. Ni, Z. Song, and S. C. Yan, Web image mining towards universal age estimator, in Proceedings of the 17th ACM International Conference on Multimedia, ACM, 2009, pp. 85-94.
A. Zafra, C. Romero, S. Ventura, and E. Herrera-Viedma, Multi-instance genetic programming for web index recommendation, Expert Systems with Applications, vol. 36, no. 9, pp. 11470-11479, 2009.
S. Andrews, I. Tsochantaridis, and T. Hofmann, Support vector machines for multiple-instance learning, Advances in Neural Information Processing Systems, vol. 15, pp. 561-568, 2002.
Z. H. Zhou, Y. Y. Sun, and Y. F. Li, Multi-instance learning by treating instances as non-I.I.D. samples, in Proceedings of the 26th Annual International Conference on Machine Learning, ACM, 2009, pp. 1249-1256.
Z. Jorgensen, Y. Zhou, and M. Inge, A multiple instance learning strategy for combating good word attacks on spam filters, The Journal of Machine Learning Research, vol. 9, no. 6, pp. 1115-1146, 2008.
J. B. Bi, Y. X. Chen, and J. Z. Wang, A sparse support vector machine approach to region-based image categorization, in IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2005, vol. 1, pp. 1121-1128.
Z. Y. Fu, A. Robles-Kelly, and J. Zhou, MILIS: Multiple instance learning with instance selection, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 5, pp. 958-977, 2011.
S. H. Feng and D. Xu, Transductive multi-instance multi-label learning algorithm with application to automatic image annotation, Expert Systems with Applications, vol. 37, no. 1, pp. 661-670, 2010.
D. T. Nguyen, C. D. Nguyen, R. Hargraves, L. A. Kurgan, and K. J. Cios, mi-DS: Multiple-instance learning algorithm, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 43, no. 1, pp. 143-154, 2013.
B. Babenko, M. H. Yang, and S. Belongie, Visual tracking with online multiple instance learning, in IEEE Conference on Computer Vision and Pattern Recognition, 2009, pp. 983-990.
B. Babenko, M. H. Yang, and S. Belongie, Robust object tracking with online multiple instance learning, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 8, pp. 1619-1632, 2011.
C. Leistner, A. Saffari, and H. Bischof, MIForests: Multiple-instance learning with randomized trees, in Proceedings of the 11th European Conference on Computer Vision: Part VI, Springer, 2010, pp. 29-42.
Y. Xie, Y. Y. Qu, C. H. Li, and W. S. Zhang, Online multiple instance gradient feature selection for robust visual tracking, Pattern Recognition Letters, vol. 33, no. 9, pp. 1075-1082, 2012.
M. Bellare, T. Ristenpart, and S. Tessaro, Multi-instance security and its application to password-based cryptography, in Advances in Cryptology - CRYPTO 2012, Springer, 2012, pp. 312-329.
Q. Zhang and S. A. Goldman, EM-DD: An improved multiple-instance learning technique, Advances in Neural Information Processing Systems, vol. 14, pp. 1073-1080, 2001.
Y. J. Han, Q. Tao, and J. Wang, Avoiding false positive in multi-instance learning, Advances in Neural Information Processing Systems, vol. 23, pp. 1-8, 2010.
M. Y. Kim and F. Torre, Gaussian processes multiple instance learning, in Proceedings of the 27th International Conference on Machine Learning, 2010, pp. 535-542.
[23] F. X. Li and C. Sminchisescu, Convex multiple-instance learning by estimating likelihood ratio, Advances in Neural Information Processing Systems, vol. 23, pp. 1360-1368, 2010.
J. T. Kwok and P. M. Cheung, Marginalized multi-instance kernels, in Proceedings of the 20th International Joint Conference on Artificial Intelligence, Morgan Kaufmann Publishers Inc., 2007, pp. 901-906.
[26] D. Zhang, F. Wang, L. Si, and T. Li, MIC: Maximum margin multiple instance clustering, in Proceedings of the 21st International Joint Conference on Artificial Intelligence, Morgan Kaufmann Publishers Inc., 2009, pp. 1339-1344.
[27] C. H. Li, I. Gondra, and L. Liu, An efficient parallel neural network-based multi-instance learning algorithm, The Journal of Supercomputing, vol. 62, no. 2, pp. 724-740, 2012.
[28] H. Y. Wang, Q. Yang, and H. B. Zha, Adaptive p-posterior mixture-model kernels for multiple instance learning, in Proceedings of the 25th International Conference on Machine Learning, ACM, 2008, pp. 1136-1143.
[29] J. Wang and J. D. Zucker, Solving the multiple-instance problem: A lazy learning approach, in Proceedings of the 17th International Conference on Machine Learning, Morgan Kaufmann Publishers Inc., 2000, pp. 1119-1125.
[30] T. Deselaers and V. Ferrari, A conditional random field for multiple-instance learning, in Proceedings of the 27th International Conference on Machine Learning, Morgan Kaufmann Publishers Inc., 2010, pp. 1119-1125.
[31] B. Babenko, N. Verma, P. Dollár, and S. J. Belongie, Multiple instance learning with manifold bags, in Proceedings of the 28th International Conference on Machine Learning, Morgan Kaufmann Publishers Inc., 2011, pp. 81-88.
[32] D. Zhang, Y. Liu, L. Si, J. Zhang, and R. D. Lawrence, Multiple instance learning on structured data, Advances in Neural Information Processing Systems, vol. 24, pp. 145-153, 2011.
[33] L. X. Jiang, Z. H. Cai, D. H. Wang, and H. Zhang, Bayesian citation-KNN with distance weighting, International Journal of Machine Learning and Cybernetics, pp. 1-7, 2013.
[34] O. Maron, Learning from ambiguity, Ph.D. dissertation, Massachusetts Institute of Technology, USA, 1998.
[35] L. Zhang and B. Zhang, A geometrical representation of McCulloch-Pitts neural model and its applications, IEEE Transactions on Neural Networks, vol. 10, no. 4, pp. 925-929, 1999.
[36] W. S. McCulloch and W. Pitts, A logical calculus of the ideas immanent in nervous activity, Bulletin of Mathematical Biology, vol. 5, no. 4, pp. 115-133, 1943.
Y. Li, D. M. J. Tax, R. P. W. Duin, and M. Loog, Multiple-instance learning as a classifier combining problem, Pattern Recognition, vol. 46, no. 3, pp. 865-874, 2013.
Fig. 1 The shape of a molecule changes as it rotates internal bonds[1].
2.1 Constructive covering algorithm
Constructive Covering Algorithm (CCA) was proposed by Zhang and Zhang[35] in 1999 as a constructive supervised learning algorithm for the McCulloch-Pitts neural model[36]. The main idea of CCA is to first map all patterns in the data set onto a d-dimensional sphere and then use sphere neighbors (covers) to divide the patterns.
Given a classification data set $X=\{(x_1,y_1),(x_2,y_2),\ldots,(x_N,y_N)\}$, where $x_i\in\mathbb{R}^d$ and $y_i\in\{1,2,\ldots,s\}$ represents the class of pattern $x_i$. Let $N$ and $d$ be the size and dimension of $X$, and let $x_i$ be a pattern in $X$. Let $X^j$ be the subset of $X$ which contains all the patterns of the $j$-th class, $j=1,2,\ldots,s$. The classification model created by CCA is a cover set $C=\{C_1,C_2,\ldots,C_m\}$, where each cover $C_i$ is described by its center $c_i$ and its radius $r_i$.
Since the dimension of the instances in an MI data set is identical, all input vectors can be restricted to a $d$-dimensional hyper-sphere $S^d$; the input vectors form a bounded set $D$ of the $d$-dimensional space.
Define a transformation $T$:
$$T(x)=\left(x,\sqrt{R^2-\|x\|^2}\right) \quad (1)$$
Use $T$ to do a projection $T: D \rightarrow S^d$, where $S^d$ is a $d$-dimensional sphere in the $(d+1)$-dimensional space. $R$ is the radius of the sphere; the value of $R$ must be greater than or equal to the maximum norm $\|x\|$ over all instances. $\sqrt{R^2-\|x\|^2}$ is the additional component of $x$, i.e., the value of the $(d+1)$-th dimension after projection, as shown in Fig. 2.
Fig. 2 Input vectors of instance $x$ and their projections.
After all the instances in $D$ are projected upward onto $S^d$ by the transformation $T$, construct a series of positive covers ("positive sphere neighbors") that only consist of instances from the positive bags and a series of negative covers ("negative sphere neighbors") that only consist of instances from the negative bags. This procedure is as follows.
To generate sphere neighbors that only contain instances belonging to the same class, first select an instance $x_i$ randomly. Let $A$ be the set of instances that have the same label as $x_i$; the set of instances that have the opposite label from $x_i$ is then denoted by $B$.
Then we find the nearest instance to $x_i$ in $B$ and the furthest instance to $x_i$ in $A$ that is still closer to $x_i$ than any instance in $B$; $d_1$ and $d_2$ measure the closeness of $x_i$ to these two instances, respectively. Note that in CCA the second instance must be strictly closer to $x_i$ than the first one.
The above procedure can be expressed as Eqs. (2) and (3):
$$d_1=\max\{\langle x_i,x_j\rangle \mid x_j\in B\} \quad (2)$$
$$d_2=\min\{\langle x_i,x_j\rangle \mid \langle x_i,x_j\rangle>d_1,\ x_j\in A\} \quad (3)$$
where $\langle x_i,x_j\rangle$ denotes the inner product between instances $x_i$ and $x_j$. Note that a bigger inner product indicates a smaller distance on the sphere, which is why the nearest instance in $B$ corresponds to the maximum in Eq. (2) and the furthest qualifying instance in $A$ to the minimum in Eq. (3).
Next, we calculate the radius of a sphere neighbor so that the sphere neighbor contains only instances belonging to the same class (see Fig. 3). Let $\theta$ be the radius of a sphere neighbor, measured by the same inner product. The following formula is utilized to calculate the value of $\theta$:
$$\theta=\frac{d_1+d_2}{2} \quad (4)$$
Fig. 3 A sphere neighbor for classification.
The basic constructive covering algorithm is shown in Algorithm 1.
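As an illustration, the following is a rough Python sketch of this cover construction, assuming a NumPy array X of instances (one per row) and a label vector y; the projection radius R is taken as the largest instance norm, the cover radius follows Eq. (4), and the function names project and build_covers are illustrative rather than the paper's.

```python
import numpy as np

def project(X, R):
    """Project d-dimensional instances onto the sphere S^d of radius R (Eq. (1))."""
    extra = np.sqrt(np.maximum(R ** 2 - np.sum(X ** 2, axis=1), 0.0))
    return np.hstack([X, extra[:, None]])

def build_covers(X, y):
    """Constructive covering: return a list of (center, threshold, label) triples.
    Closeness is measured by the inner product on the sphere (bigger = closer).
    Assumes both classes are present in y."""
    X = np.asarray(X, dtype=float)
    R = np.linalg.norm(X, axis=1).max()
    S = project(X, R)
    uncovered = set(range(len(S)))
    covers = []
    while uncovered:
        i = next(iter(uncovered))                          # pick an uncovered instance
        ip = S @ S[i]                                      # inner products with x_i
        diff = [j for j in range(len(S)) if y[j] != y[i]]
        same = [j for j in range(len(S)) if y[j] == y[i] and j != i]
        d1 = max(ip[j] for j in diff)                      # nearest heterogeneous instance, Eq. (2)
        inside = [ip[j] for j in same if ip[j] > d1]
        d2 = min(inside) if inside else ip[i]              # furthest homogeneous instance inside, Eq. (3)
        theta = (d1 + d2) / 2.0                            # cover radius as inner-product threshold, Eq. (4)
        covers.append((S[i], theta, y[i]))
        # same-class instances whose inner product exceeds theta are now covered
        uncovered -= {j for j in uncovered if y[j] == y[i] and ip[j] >= theta}
    return covers
```

In MICkNN, the labels passed to build_covers are simply the bag labels of the instances, so the resulting covers split into positive and negative sphere neighbors.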
We are inspired by an inherent feature of CCA: its learning result is a series of covers $C=\{C_1,C_2,\ldots,C_m\}$, each of which only contains samples belonging to the same class. It is this feature that makes CCA very suitable for breaking through and reorganizing the structure of MI data.
The k-Nearest Neighbor (kNN) algorithm[37] is one of the most popular learning algorithms in data mining. kNN is well known for its relatively simple implementation and decent results. The main idea of the kNN algorithm is to find a set of $k$ objects in the training data that are close to the test pattern, and to base the assignment of a label on the predominance of a particular class in this neighborhood. kNN is a lazy learning technique based on the distances of, and voting among, the $k$ nearest neighbors. Given a training set and a test pattern $x$, kNN computes the similarity (distance) between $x$ and its $k$ nearest neighbors, and the label of $x$ is assigned by majority vote among these neighbors.
The basic kNN algorithm is shown in Algorithm 2.
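For reference, a minimal sketch of such a kNN classifier (Euclidean distance, majority vote; names are illustrative):

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_test, k=3):
    """Label x_test by majority vote among its k nearest training patterns."""
    dists = np.linalg.norm(X_train - x_test, axis=1)   # Euclidean distance to every training pattern
    nearest = np.argsort(dists)[:k]                    # indices of the k closest patterns
    votes = Counter(y_train[j] for j in nearest)
    return votes.most_common(1)[0][0]                  # the predominant class wins
```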
3.2 Exclude the noise in positive bags
An example of the original distribution of MI data in the two-dimensional space is given in Fig. 4.
Fig. 4 The original distribution of MI bags before preprocessing.
Unfilled shapes represent feature vectors of positive bags and filled shapes represent feature vectors of negative bags. All points of the same shape denote feature vectors of the same bag. For example, the bag drawn as unfilled regular hexagons is a positive bag with four instances. As Fig. 4 depicts, because of the false positive instances in each positive bag, the instances in a positive bag follow no distribution pattern that they would have to abide by. Specifically, it is difficult to find a linear or non-linear dividing line that separates the positive bags from the negative bags precisely. Moreover, the distribution of instances within the same positive bag is random and discrete. From our perspective, it is precisely this irregular structure of the positive bags, i.e., their ambiguity, that leads to the inherent difficulty of the MI problem.
The learning result of CCA is a cover set $C=\{C_1,C_2,\ldots,C_m\}$, and it is this feature that makes CCA very suitable for breaking through and restructuring the structure of MI data. By regarding the covers in this set as the new structure of MI bags, the original structure of the bags is reorganized from irregular to regular, so that the false positive instances can be excluded.
In order to discriminate the noise in the positive bags, we construct a cover set $C^+$ that only contains instances from the positive bags and a cover set $C^-$ that only contains instances from the negative bags. Thus, the irregular, discrete structure of the bags can be broken through by converting the original data set into the cover set $C=C^+\cup C^-$, as depicted in Fig. 5. The new structure is very convenient for excluding the false positive instances in the positive sphere neighbors by using the kNN algorithm. The details are as follows.
Fig. 5 The bag structure of the MI data set after preprocessing. Instances from negative bags are covered by thick lines; instances from positive bags are covered by thin lines.
First, construct the cover set $C=C^+\cup C^-$ by using CCA. After this procedure, kNN is utilized to exclude the false positive sphere neighbors, i.e., the positive covers that mainly consist of false positive instances. In this kNN step, a cover obtained by CCA is considered as a whole. For each cover $C_i\in C^+$, if the majority of its $k$ nearest covers belong to the set $C^-$, we delete it from $C^+$ and add it to $C^-$. The distance between two covers can be calculated by using the Hausdorff distance. Edgar[38] and Wang and Zucker[29] pointed out two kinds of Hausdorff distance, namely the maximal Hausdorff distance (maxHD) and the minimal Hausdorff distance (minHD).
According to the definition, given two sets of instances $A=\{a_1,a_2,\ldots,a_m\}$ and $B=\{b_1,b_2,\ldots,b_n\}$, the maxHD is defined as
$$\mathrm{maxHD}(A,B)=\max\{h(A,B),\,h(B,A)\}$$
where
$$h(A,B)=\max_{a\in A}\min_{b\in B}\|a-b\|$$
The minHD is defined as
$$\mathrm{minHD}(A,B)=\min_{a\in A}\min_{b\in B}\|a-b\|$$
where $\|a-b\|$ is the Euclidean distance between instance $a$ and instance $b$.
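Both definitions translate directly into code; the following sketch assumes each cover or bag is given as a NumPy array with one instance per row.

```python
import numpy as np

def max_hausdorff(A, B):
    """maxHD(A, B) = max{h(A, B), h(B, A)} with h(A, B) = max_a min_b ||a - b||."""
    D = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)  # pairwise Euclidean distances
    return max(D.min(axis=1).max(), D.min(axis=0).max())

def min_hausdorff(A, B):
    """minHD(A, B): the smallest Euclidean distance between any a in A and b in B."""
    D = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
    return D.min()
```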
The complete description of excluding the noise in positive bags is illustrated in Algorithm 3.
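A minimal sketch of this exclusion step, under the reading above, might look as follows; it ranks covers by minHD (maxHD could be substituted), reuses the min_hausdorff helper from the previous sketch, and its function name and default k are illustrative.

```python
def exclude_false_positive_covers(pos_covers, neg_covers, k=3):
    """pos_covers / neg_covers: lists of NumPy arrays, one array of instances per cover.
    Returns (C_plus, C_minus) after moving false positive covers to the negative side."""
    kept_pos, all_neg = [], list(neg_covers)
    for cover in pos_covers:
        neighbors = [(c, +1) for c in pos_covers if c is not cover] + \
                    [(c, -1) for c in neg_covers]
        # rank the other covers by their minHD distance to this cover
        neighbors.sort(key=lambda pair: min_hausdorff(cover, pair[0]))
        labels = [lab for _, lab in neighbors[:k]]
        if labels.count(-1) > labels.count(+1):
            all_neg.append(cover)      # majority of nearest covers are negative: false positive cover
        else:
            kept_pos.append(cover)
    return kept_pos, all_neg
```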
After preprocessing the MI data in the first stage, the original structure of the MI data set has been converted into the positive cover set $C^+$ and the negative cover set $C^-$. In addition, a fair number of false positive instances in $C^+$ have been excluded.
3.3 Predict the labels of test bags
In this stage, a positive cover in $C^+$ or a negative cover in $C^-$ is treated as the new structure of a bag, and the kNN algorithm at the bag level is utilized to predict the real labels of the unknown bags. Because a large number of noisy instances in the positive bags have been excluded during the procedures above, it is quite convenient to predict the labels of test bags by using the kNN algorithm at the bag level. We compute the similarity between each test bag and its $k$ nearest covers; if there are more negative covers around a test bag, the label of the test bag is negative, and otherwise it is positive. The complete description of the prediction algorithm is illustrated in Algorithm 4.
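A minimal bag-level sketch consistent with this description, again measuring the similarity between a test bag (a NumPy array of its instances) and each cover with minHD:

```python
def predict_bag(test_bag, pos_covers, neg_covers, k=3):
    """Label a test bag by voting among its k nearest covers (bag-level kNN)."""
    scored = [(min_hausdorff(test_bag, c), +1) for c in pos_covers] + \
             [(min_hausdorff(test_bag, c), -1) for c in neg_covers]
    scored.sort(key=lambda pair: pair[0])              # closest covers first
    votes = [label for _, label in scored[:k]]
    return -1 if votes.count(-1) > votes.count(+1) else +1
```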