Research Article | Open Access

Benchmarking visual SLAM methods in mirror environments

School of Computer Science and Informatics, Cardiff University, Abacws Building, Senghennydd Rd, Cardiff CF24 4AG, UK
School of Engineering, Cardiff University, Queen’s Buildings, The Parade, Cardiff CF24 3AA, UK

Abstract

Visual simultaneous localisation and mapping (vSLAM) is widely used for indoor and outdoor navigation, where it is routinely exposed to visual complexities, particularly mirror reflections. Mirror presence (the time a mirror is visible and its average size in the frame) was hypothesised to impair localisation and mapping performance, with systems using direct techniques expected to perform worse. A dataset of image sequences recorded in mirror environments, MirrEnv, was therefore collected and used to evaluate the performance of existing representative methods. RGB-D ORB-SLAM3 and BundleFusion appear to show moderate degradation of absolute trajectory error with increasing mirror duration, whilst the remaining results did not show significantly degraded localisation performance. The generated mesh maps, however, proved highly inaccurate, with real and virtual (reflected) geometry colliding in the reconstructions. The likely sources of error and of robustness in mirror environments are discussed, and future directions are outlined for validating and improving vSLAM performance in the presence of planar mirrors. The MirrEnv dataset is available at https://doi.org/10.17035/d.2023.0292477898.
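
The localisation results summarised above are reported as absolute trajectory error (ATE). As a minimal illustrative sketch only (not the authors' evaluation code; the NumPy-only implementation and function names are assumptions), the following Python computes the ATE RMSE after a least-squares rigid (Horn/Umeyama-style, no scale) alignment of a time-synchronised estimated trajectory to ground truth:

    # Minimal sketch (NumPy only; hypothetical helper names) of absolute trajectory
    # error (ATE) between time-synchronised estimated and ground-truth camera
    # positions, each given as an (N x 3) array of translations.
    import numpy as np

    def align_rigid(est, gt):
        # Least-squares rigid (rotation + translation, no scale) alignment of est
        # onto gt via the Kabsch/Umeyama construction; returns R and t.
        mu_est, mu_gt = est.mean(axis=0), gt.mean(axis=0)
        H = (est - mu_est).T @ (gt - mu_gt)
        U, _, Vt = np.linalg.svd(H)
        D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # avoid reflections
        R = Vt.T @ D @ U.T
        t = mu_gt - R @ mu_est
        return R, t

    def ate_rmse(est, gt):
        # Root-mean-square translational error after rigid alignment.
        R, t = align_rigid(est, gt)
        aligned = est @ R.T + t
        return float(np.sqrt(np.mean(np.sum((aligned - gt) ** 2, axis=1))))

    # Synthetic usage example: a noisy, rigidly transformed copy of a path.
    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        gt = np.cumsum(rng.normal(scale=0.05, size=(200, 3)), axis=0)
        Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
        R_true = Q * np.sign(np.linalg.det(Q))  # force a proper rotation (det = +1)
        est = gt @ R_true.T + np.array([1.0, -2.0, 0.5])
        est += rng.normal(scale=0.01, size=est.shape)
        print("ATE RMSE (m): %.4f" % ate_rmse(est, gt))

A full evaluation toolchain would additionally handle timestamp association between estimate and ground truth and, for scale-ambiguous monocular trajectories, a Sim(3) alignment that also estimates scale.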

Electronic Supplementary Material

Video
41095_0329_ESM(1).avi
41095_0329_ESM(2).avi
41095_0329_ESM(3).avi

Computational Visual Media
Pages 215-241
Cite this article:
Herbert P, Wu J, Ji Z, et al. Benchmarking visual SLAM methods in mirror environments. Computational Visual Media, 2024, 10(2): 215-241. https://doi.org/10.1007/s41095-022-0329-x


Received: 27 July 2022
Accepted: 18 December 2022
Published: 03 January 2024
© The Author(s) 2023.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.
