Research Article | Open Access

Multi-scale hash encoding based neural geometry representation

School of Mathematical Sciences, University of Science and Technology of China, Hefei 230026, China
Alibaba Artificial Intelligence Governance Laboratory, Alibaba Group, Hangzhou 310017, China

Abstract

Recently, neural implicit function-based representations have attracted increasing attention and are widely used to represent surfaces with differentiable neural networks. However, surface reconstruction from point clouds or multi-view images using existing neural geometry representations still suffers from slow computation and limited accuracy. To alleviate these issues, we propose a multi-scale hash encoding-based neural geometry representation that effectively and efficiently represents the surface as a signed distance field. Our novel network structure carefully combines low-frequency Fourier position encoding with multi-scale hash encoding; the initialization of the geometry network and the geometry features of the rendering module are redesigned accordingly. Our experiments demonstrate that the proposed representation is at least 10 times faster when reconstructing point clouds with millions of points, and that it significantly improves the speed and accuracy of multi-view reconstruction. Our code and models are available at https://github.com/Dengzhi-USTC/Neural-Geometry-Reconstruction.
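To make the abstract's pipeline concrete, below is a minimal PyTorch sketch of the encoding scheme it describes: a low-frequency Fourier positional encoding concatenated with a multi-resolution hash encoding, decoded by a small MLP into a signed distance. This is an illustration under stated assumptions, not the authors' implementation: the layer widths, level count, table size, and the nearest-vertex hash lookup (a full hash encoding interpolates the eight surrounding grid-vertex features trilinearly) are simplifications, and the paper's redesigned geometry-network initialization is omitted. The reference implementation is in the repository linked above.

```python
import math
import torch
import torch.nn as nn


class FourierEncoding(nn.Module):
    """Low-frequency sin/cos positional encoding of 3D points."""
    def __init__(self, num_freqs: int = 4):
        super().__init__()
        self.register_buffer("freqs", 2.0 ** torch.arange(num_freqs) * math.pi)

    def forward(self, x):                         # x: (N, 3)
        xb = x[..., None] * self.freqs            # (N, 3, F)
        return torch.cat([torch.sin(xb), torch.cos(xb)], dim=-1).flatten(-2)


class HashEncoding(nn.Module):
    """Simplified multi-resolution hash encoding: one learnable feature table
    per level, indexed by a spatial hash of the nearest grid vertex."""
    PRIMES = (1, 2654435761, 805459861)           # hash primes as in Mueller et al. 2022

    def __init__(self, num_levels=8, table_size=2**19, feat_dim=2,
                 base_res=16, growth=1.5):
        super().__init__()
        self.tables = nn.ParameterList(
            nn.Parameter(1e-4 * torch.randn(table_size, feat_dim))
            for _ in range(num_levels))
        self.res = [int(base_res * growth ** l) for l in range(num_levels)]
        self.table_size = table_size

    def forward(self, x):                         # x in [0, 1]^3, shape (N, 3)
        feats = []
        for table, res in zip(self.tables, self.res):
            idx = (x * res).long()                # nearest grid vertex at this level
            h = torch.zeros_like(idx[:, 0])
            for d, p in enumerate(self.PRIMES):   # XOR spatial hash of the vertex
                h = h ^ (idx[:, d] * p)
            feats.append(table[h % self.table_size])
        return torch.cat(feats, dim=-1)           # (N, num_levels * feat_dim)


class SDFNetwork(nn.Module):
    """Decodes [point, Fourier features, hash features] to a signed distance."""
    def __init__(self):
        super().__init__()
        self.fourier = FourierEncoding(num_freqs=4)    # low frequencies only
        self.hash = HashEncoding()
        in_dim = 3 + 3 * 2 * 4 + 8 * 2                 # 43 input channels
        self.mlp = nn.Sequential(                      # Softplus activation, as is
            nn.Linear(in_dim, 64), nn.Softplus(beta=100),  # common in SDF networks
            nn.Linear(64, 64), nn.Softplus(beta=100),
            nn.Linear(64, 1))

    def forward(self, x):
        return self.mlp(torch.cat([x, self.fourier(x), self.hash(x)], dim=-1))


if __name__ == "__main__":
    pts = torch.rand(1024, 3)                     # query points in the unit cube
    print(SDFNetwork()(pts).shape)                # torch.Size([1024, 1])
```

In this split, the multi-scale hash tables supply locally learned detail while the low-frequency Fourier terms give the MLP a smooth global signal, consistent with the combination the abstract describes; the surface itself is then extracted by querying the network on a dense grid and running Marching Cubes.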

Computational Visual Media
Pages 453–470
Cite this article:
Deng Z, Xiao H, Lang Y, et al. Multi-scale hash encoding based neural geometry representation. Computational Visual Media, 2024, 10(3): 453–470. https://doi.org/10.1007/s41095-023-0340-x


Received: 02 January 2023
Accepted: 28 February 2023
Published: 22 March 2024
© The Author(s) 2024.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.
