Research Article | Open Access

Robust camera pose estimation by viewpoint classification using deep learning

Department of Science and Technology, Keio University, Japan.

Abstract

Camera pose estimation with respect to a target scene is an important technology for superimposing virtual information in augmented reality (AR). However, it is difficult to estimate the camera pose for all possible view angles because feature descriptors such as SIFT are not completely invariant to changes in perspective. We propose a novel method of robust camera pose estimation using multiple feature descriptor databases, one generated for each partitioned viewpoint, within which the feature descriptor of each keypoint is almost invariant. Our method estimates the viewpoint class of each input image using deep learning, based on a set of training images prepared for each viewpoint class. We present two ways to prepare these images for deep learning and database generation. In the first method, images are generated using a projection matrix to ensure robust learning in a range of environments with changing backgrounds. The second method uses real images to learn a given environment around a planar pattern. Our evaluation results confirm that our approach increases the number of correct matches and improves the accuracy of camera pose estimation compared to the conventional method.
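The first image-generation strategy described in the abstract — synthesizing training views of a planar pattern with a projection matrix, grouped into viewpoint classes — can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the intrinsic matrix `K`, the hemisphere sampling of camera positions, and the class partition (8 azimuth bins × 3 elevation bins) are all assumptions introduced here. For a point on the pattern plane z = 0, the full projection K[R | t] reduces to the 3×3 homography K[r1 r2 t], which is what a renderer would use to warp the pattern into each synthetic view.

```python
import numpy as np

def look_at_homography(K, azimuth, elevation, radius):
    """Homography mapping a planar pattern (on the z=0 plane) into the image
    of a camera placed at (azimuth, elevation) on a hemisphere of the given
    radius, looking at the pattern's origin."""
    # Camera centre in world coordinates.
    c = radius * np.array([
        np.cos(elevation) * np.cos(azimuth),
        np.cos(elevation) * np.sin(azimuth),
        np.sin(elevation),
    ])
    # Build a rotation whose z-axis points from the camera toward the origin.
    z = -c / np.linalg.norm(c)
    x = np.cross(np.array([0.0, 0.0, 1.0]), z)
    if np.linalg.norm(x) < 1e-8:          # degenerate top-down view
        x = np.array([1.0, 0.0, 0.0])
    x /= np.linalg.norm(x)
    y = np.cross(z, x)
    R = np.stack([x, y, z])               # world -> camera rotation
    t = -R @ c
    # Points on the z=0 plane project through H = K [r1 r2 t].
    H = K @ np.column_stack([R[:, 0], R[:, 1], t])
    return H / H[2, 2]

def viewpoint_class(azimuth, elevation, n_az=8, n_el=3):
    """Assign a viewpoint to one of n_az * n_el partitioned classes."""
    a = int(azimuth % (2 * np.pi) / (2 * np.pi) * n_az) % n_az
    e = min(int(elevation / (np.pi / 2) * n_el), n_el - 1)
    return e * n_az + a

# Assumed pinhole intrinsics for a 640x480 synthetic view.
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
H = look_at_homography(K, azimuth=0.3, elevation=1.0, radius=5.0)
corner = H @ np.array([0.1, 0.1, 1.0])    # project one pattern corner
print(corner[:2] / corner[2])
print(viewpoint_class(0.3, 1.0))
```

In a full pipeline, each `H` would drive an image warp (e.g., a perspective warp of the pattern texture) to produce one training image, labeled with its `viewpoint_class` for the CNN, while keypoint descriptors extracted from the warped views populate that class's database.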

Computational Visual Media
Pages 189-198
Cite this article:
Nakajima Y, Saito H. Robust camera pose estimation by viewpoint classification using deep learning. Computational Visual Media, 2017, 3(2): 189-198. https://doi.org/10.1007/s41095-016-0067-z


Revised: 25 July 2016
Accepted: 13 November 2016
Published: 06 December 2016
© The Author(s) 2016

This article is published with open access at Springerlink.com

The articles published in this journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.
