Research Article | Open Access

Neural 3D reconstruction from sparse views using geometric priors

BNRist, Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
Academy of Military Sciences, Beijing 100091, China


Abstract

Sparse-view 3D reconstruction has attracted increasing attention with the development of neural implicit 3D representations. Existing methods typically rely solely on 2D views and therefore require a dense set of input views for accurate 3D reconstruction. In this paper, we show that accurate 3D reconstruction can be achieved by incorporating geometric priors into neural implicit 3D reconstruction. Our method adopts the signed distance function as the 3D representation and learns a generalizable 3D surface reconstruction model from sparse views. Specifically, we build a more effective, sparse feature volume from the input views using corresponding depth maps, which can be provided by depth sensors or predicted directly from the input views. We recover finer geometric detail by imposing both depth and surface normal constraints, in addition to the color loss, when training the neural implicit 3D representation. Experiments demonstrate that our method both outperforms state-of-the-art approaches and generalizes well.
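
To make the combined training objective concrete, the following is a minimal sketch of how color, depth, and normal supervision might be assembled for a batch of rays. It is not the authors' implementation: the function name, the specific loss forms (L1 for color and depth, cosine distance for normals), and the weights lambda_d and lambda_n are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def reconstruction_loss(pred_rgb, gt_rgb,
                        pred_depth, gt_depth,
                        pred_normal, gt_normal,
                        lambda_d=0.1, lambda_n=0.05):
    """Combine color, depth, and normal supervision for a batch of rays.

    pred_* would be volume-rendered from the SDF-based implicit field;
    gt_depth / gt_normal come from a depth sensor or a prediction
    network, as described in the abstract. lambda_d and lambda_n are
    hypothetical weights; the paper's exact forms may differ.
    """
    # Photometric term, standard in neural implicit reconstruction.
    color_loss = F.l1_loss(pred_rgb, gt_rgb)

    # Depth prior: penalize deviation from sensor or predicted depth.
    depth_loss = F.l1_loss(pred_depth, gt_depth)

    # Normal prior: 1 - cosine similarity between rendered and prior
    # normals, averaged over the ray batch.
    normal_loss = (1.0 - F.cosine_similarity(pred_normal,
                                             gt_normal, dim=-1)).mean()

    return color_loss + lambda_d * depth_loss + lambda_n * normal_loss
```
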

Computational Visual Media
Pages 687-697
Cite this article:
Mu T-J, Chen H-X, Cai J-X, et al. Neural 3D reconstruction from sparse views using geometric priors. Computational Visual Media, 2023, 9(4): 687-697. https://doi.org/10.1007/s41095-023-0337-5


Received: 25 November 2022
Accepted: 31 January 2023
Published: 05 March 2023
© The Author(s) 2023.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.
