PDF (10.6 MB)
Collect
Submit Manuscript
Show Outline
Figures (5)

Tables (4)
Table 1
Table 2
Table 3
Table 4
Research Article | Open Access

On the role of geometry in geo-localization

Department of Electrical Engineering, Tel-Aviv University, Tel-Aviv, Israel
The Efi Arazi School of Computer Science, The Interdisciplinary Center Herzliya, Herzliya, Israel
Show Author Information

Abstract

Consider the geo-localization task of finding the pose of a camera in a large 3D scene from a single image. Most existing CNN-based methods use as input textured images. We aim to experimentally explore whether texture and correlation between nearby images are necessary in a CNN-based solution for the geo-localization task. To do so, we consider lean images, textureless projections of a simple 3D model of a city. They only contain information related to the geometry of the scene viewed (edges, faces, and relative depth). The main contributions of this paper are: (i) todemonstrate the ability of CNNs to recover camera pose using lean images; and (ii) to provide insight into the role of geometry in the CNN learning process.

References

[1]
Se, S.; Lowe, D.; Little, J. Mobile robot localization and mapping with uncertainty using scale-invariant visual landmarks. The International Journal of Robotics Research Vol. 21, No. 8, 735-758, 2002.
[2]
Lowe, D. G. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision Vol. 60, No. 2, 91-110, 2004.
[3]
Li, Y. P.; Snavely, N.; Huttenlocher, D. P. Location recognition using prioritized feature matching. In: Computer Vision - ECCV 2010. Lecture Notes in Computer Science, Vol. 6312. Daniilidis, K.; Maragos, P.; Paragios, N. Eds. Springer Berlin Heidelberg, 791-804, 2010.
[4]
Ramalingam, S.; Bouaziz, S.; Sturm, P. Pose estimation using both points and lines for geo-localization. In: Proceedings of the IEEE International Conference on Robotics and Automation, 4716-4723, 2011.
[5]
Bansal, M.; Daniilidis, K. Geometric urban geo-localization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 3978-3985, 2014.
[6]
Kendall, A.; Grimes, M.; Cipolla, R. PoseNet: A convolutional network for real-time 6-DOF camera relocalization. In: Proceedings of the IEEE International Conference on Computer Vision, 2938-2946, 2015.
[7]
Walch, F.; Hazirbas, C.; Leal-Taixé, L.; Sattler, T.; Hilsenbeck, S.; Cremers, D. Image-based localization using LSTMs for structured feature correlation. In: Proceedings of the IEEE International Conference on Computer Vision, 627-637, 2017.
[8]
Melekhov, I.; Ylioinas, J.; Kannala, J.; Rahtu, E. Image-based localization using hourglass networks. arXiv preprint arXiv:1703.07971, 2017.
[9]
Sattler, T.; Torii, A.; Sivic, J.; Pollefeys, M.; Taira, H.; Okutomi, M.; Pajdla, T. Are large-scale 3D models really necessary for accurate visual localization? In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 6175-6184, 2017.
[10]
Sivic, J.; Zisserman, A. Video Google: A text retrieval approach to object matching in videos. In: Proceedings 9th IEEE International Conference on Computer Vision, 1470-1477, 2003.
[11]
Robertsone, D.; Cipolla, R. An Image-based system for urban navigation. In: Proceedings of the British Machine Conference, 84.1-84.10, 2004.
[12]
Hays, J.; Efros, A. A. IM2GPS: Estimating geographic information from a single image. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, 1-8, 2008.
[13]
Bergamo, A.; Sinha, S. N.; Torresani, L. Leveraging structure from motion to learn discriminative codebooks for scalable landmark classification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 763-770, 2013.
[14]
Zhang, W.; Kosecka, J. Image based localization in urban environments. In: Proceedings of the 3rd International Symposium on 3D Data Processing, Visualization, and Transmission, 33-40, 2006.
[15]
Nister, D.; Stewenius, H. Scalable recognition with a vocabulary tree. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2161-2168, 2006.
[16]
Schindler, G.; Brown, M.; Szeliski, R. City-scale location recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1-7, 2007.
[17]
Irschara, A.; Zach, C.; Frahm, J.; Bischof, H. From structure-from-motion point clouds to fast location recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2599-2606, 2009.
[18]
Sattler, T.; Leibe, B.; Kobbelt, L. Fast image-based localization using direct 2D-to-3D matching. In: Proceedings of the International Conference on Computer Vision, 667-674, 2011.
[19]
Matei, B. C.; Vander Valk, N.; Zhu, Z.; Cheng, H.; Sawhney, H. S. Image to LIDAR matching for geotagging in urban environments. In: Proceedings of the IEEE Workshop on Applications of Computer Vision, 413-420, 2013.
[20]
Svärm, L.; Enqvist, O.; Oskarsson, M.; Kahl, F. Accurate localization and pose estimation for large 3D models. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 532-539, 2014.
[21]
Baatz, G.; Saurer, O.; Köser, K.; Pollefeys, M. Large scale visual geo-localization of images in mountainous terrain. In: Computer Vision - ECCV 2012. Lecture Notes in Computer Science, Vol. 7573. Fitzgibbon, A.; Lazebnik, S.; Perona, P.; Sato, Y.; Schmid, C. Eds. Springer Berlin Heidelberg, 517-530, 2012.
[22]
Svarm, L.; Enqvist, O.; Kahl, F.; Oskarsson, M. City-scale localization for cameras with known vertical direction. IEEE Transactions on Pattern Analysis and Machine Intelligence Vol. 39, No. 7, 1455-1461, 2017.
[23]
Piasco, N.; Sidibé, D.; Demonceaux, C.; Gouet-Brunet, V. A survey on visual-based localization: On the benefit of heterogeneous data. Pattern Recognition Vol. 74, 90-109, 2018.
[24]
Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1-9, 2015.
[25]
He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 770-778, 2016.
[26]
Kendall, A.; Cipolla, R. Modelling uncertainty in deep learning for camera relocalization. In: Proceedings of the IEEE International Conference on Robotics and Automation, 4762-4769, 2016.
[27]
Kendall, A.; Cipolla, R. Geometric loss functions for camera pose regression with deep learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 6555-6564, 2017.
[28]
Berlin Partner für Wirtschaft und Technologie GmbH. Berlin 3D city model. 2016. Available at https://www.businesslocationcenter.de/en/WA/B/seite0.jsp.
[29]
Zhang, C.; Bengio, S.; Hardt, M.; Recht, B.; Vinyals, O. Understanding deep learning requires rethinking generalization. arXiv preprint arXiv:1611.03530, 2016.
[30]
Russakovsky, O.; Deng, J.; Su, H.; Krause, J.; Satheesh, S.; Ma, S.; Huang, Z.; Karpathy, A.; Khosla, A.; Bernstein, M.; Berg, A. C.; Fei-Fei, L. ImageNet large scale visual recognition challenge. International Journal of Computer Vision Vol. 115, No. 3, 211-252, 2015.
[31]
Deng, J.; Dong, W.; Socher, R.; Li, L.-J.; Li, K.; Fei-Fei, L. ImageNet: A large-scale hierarchical image database. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 248-255, 2009.
[32]
OpenStreetMap Wiki contributors. OSM-3D.org.OpenStreetMap Wiki, 2018. Available athttps://wiki.openstreetmap.org/w/index.php?title=OSM-3D.org&oldid=2025859.
Computational Visual Media
Pages 103-113
Cite this article:
Kadosh M, Moses Y, Shamir A. On the role of geometry in geo-localization. Computational Visual Media, 2021, 7(1): 103-113. https://doi.org/10.1007/s41095-020-0196-2
Metrics & Citations  
Article History
Copyright
Rights and Permissions
Return