Object tracking using a convolutional network and a structured output SVM

Junwei Li; Xiaolong Zhou; Sixian Chan; Shengyong Chen

doi:10.1007/s41095-017-0087-3


Research Article | Open Access


Junwei Li¹, Xiaolong Zhou¹, Sixian Chan¹, Shengyong Chen²

1 Zhejiang University of Technology, Hangzhou, 310023, China.

2 Tianjin University of Technology, Tianjin, 300384, China.


Abstract

Object tracking remains a challenging problem in computer vision. In this paper, we present a novel method that models target appearance and combines it with structured output learning for robust online tracking within a tracking-by-detection framework. We take both convolutional features and hand-crafted features into account to robustly encode the target appearance. First, we extract convolutional features of the target using kernels generated from the initial annotated frame. To capture appearance variation during tracking, we propose a new strategy for updating the target and background kernel pool. Second, we employ a structured output SVM to refine the target’s location, mitigating the uncertainty in labeling samples as positive or negative. Compared with existing state-of-the-art trackers, our tracking method not only enhances the robustness of the feature representation, but also uses structured output prediction to avoid relying on heuristic intermediate steps to produce labeled binary samples. Extensive experimental evaluation on the challenging OTB-50 video sequences shows competitive results in terms of both success rate and precision, demonstrating the merits of the proposed tracking method.
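The structured output step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a linear scoring function, a box format of (x, y, width, height), and a loss of one minus bounding-box overlap, which is a common choice in structured output trackers but is not stated in the abstract. The names `overlap`, `structured_loss`, `predict`, and the `features` callback are hypothetical.

```python
from typing import Callable, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x, y, width, height)


def overlap(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def structured_loss(truth: Box, candidate: Box) -> float:
    """Assumed loss: 0 for a perfect match, 1 for disjoint boxes."""
    return 1.0 - overlap(truth, candidate)


def predict(
    weights: Sequence[float],
    candidates: Sequence[Box],
    features: Callable[[Box], Sequence[float]],
) -> Box:
    """Return the candidate box with the highest linear score.

    The tracker scores whole boxes directly, so no candidate has to be
    labeled as a positive or negative sample beforehand.
    """
    def score(box: Box) -> float:
        return sum(w * f for w, f in zip(weights, features(box)))

    return max(candidates, key=score)


if __name__ == "__main__":
    # Toy feature: negative horizontal distance from x = 3 (illustrative only).
    toy_features = lambda box: [-abs(box[0] - 3.0)]
    boxes = [(0.0, 0.0, 4.0, 4.0), (3.0, 0.0, 4.0, 4.0), (6.0, 0.0, 4.0, 4.0)]
    best = predict([1.0], boxes, toy_features)
    print(best, structured_loss((3.0, 0.0, 4.0, 4.0), best))
```

In a real tracker the `features` callback would return the appearance descriptor of the image region inside the box, and the weights would be learned online; both are left abstract here.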

Keywords

object tracking; convolutional network; structured learning; feature extraction


Computational Visual Media, Volume 3, Issue 4, December 2017, Pages 325-335

DOI: 10.1007/s41095-017-0087-3

Cite this article:

Li J, Zhou X, Chan S, et al. Object tracking using a convolutional network and a structured output SVM. Computational Visual Media, 2017, 3(4): 325-335. https://doi.org/10.1007/s41095-017-0087-3