We introduce a novel end-to-end deep-learning solution for rapidly estimating a dense spherical depth map of an indoor environment. Our input is a single equirectangular image registered with a sparse depth map, as provided by a variety of common capture setups. Depth is inferred by an efficient and lightweight single-branch network, which employs a dynamic gating system to jointly process dense visual data and sparse geometric data. We exploit the characteristics of typical man-made environments to efficiently compress multi-resolution features and find short- and long-range relations among scene parts. Furthermore, we introduce a new augmentation strategy to make the model robust to different types of sparsity, including those generated by various structured-light sensors and LiDAR setups. The experimental results demonstrate that our method provides interactive performance and outperforms state-of-the-art solutions in computational efficiency, adaptivity to variable depth sparsity patterns, and prediction accuracy for challenging indoor data, even when trained solely on synthetic data without any fine-tuning.
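To make the idea of training against varied sparsity patterns more concrete, the following is a minimal sketch, not the augmentation strategy of this work, whose details are not given here. It assumes two illustrative patterns: evenly spaced image rows standing in for the constant-elevation rings of a spinning multi-beam LiDAR in an equirectangular image, and uniform random sampling. All function names, the 50/50 choice between patterns, the ring counts, and the keep probabilities are hypothetical, and a depth of 0.0 is assumed to mark a missing sample.

```python
import random


def lidar_ring_mask(height, width, num_rings):
    """Keep `num_rings` evenly spaced rows (assumed stand-in for LiDAR rings)."""
    rows = {int((i + 0.5) * height / num_rings) for i in range(num_rings)}
    return [[r in rows for _ in range(width)] for r in range(height)]


def random_mask(height, width, keep_prob, rng):
    """Keep each pixel independently with probability `keep_prob`."""
    return [[rng.random() < keep_prob for _ in range(width)]
            for _ in range(height)]


def apply_mask(depth, mask):
    """Zero out every depth value whose mask entry is False."""
    return [[d if m else 0.0 for d, m in zip(depth_row, mask_row)]
            for depth_row, mask_row in zip(depth, mask)]


def augment_sparsity(depth, rng):
    """Draw one of the two assumed sparsity patterns and apply it."""
    height, width = len(depth), len(depth[0])
    if rng.random() < 0.5:
        # Hypothetical ring counts, capped at the image height.
        num_rings = min(height, rng.choice((2, 4, 8, 16)))
        mask = lidar_ring_mask(height, width, num_rings)
    else:
        # Hypothetical range of sampling densities.
        keep_prob = rng.uniform(0.01, 0.1)
        mask = random_mask(height, width, keep_prob, rng)
    return apply_mask(depth, mask)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy dense depth map: 8 rows by 16 columns, depth grows with the row index.
    dense = [[1.0 + r for _ in range(16)] for r in range(8)]
    sparse = augment_sparsity(dense, rng)
    kept = sum(v != 0.0 for row in sparse for v in row)
    print(f"kept {kept} of {8 * 16} depth samples")
```

Applying a freshly drawn mask to each training sample is one simple way to expose a network to several sparsity patterns; a real pipeline would more likely operate on tensors and model the sampling geometry of specific sensors.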