Learning conditional photometric stereo with high-resolution features

Yakun Ju; Yuxin Peng; Muwei Jian; Feng Gao; Junyu Dong

doi:10.1007/s41095-021-0223-y

Research Article | Open Access

Yakun Ju^¹, Yuxin Peng^², Muwei Jian^³, Feng Gao^¹, Junyu Dong^¹

^¹ Department of Computer Science and Technology, Ocean University of China, Qingdao 266100, China

^² Wangxuan Institute of Computer Technology, Peking University, Beijing 100871, China

^³ School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan 250002, China

Abstract

Photometric stereo aims to reconstruct 3D geometry by recovering the dense surface orientation of a 3D object from multiple images captured under differing illumination. Traditional methods normally adopt simplified reflectance models to make the surface orientation computable. However, the real reflectances of surfaces greatly limit the applicability of such methods to real-world objects. While deep neural networks have been employed to handle non-Lambertian surfaces, these methods are subject to blurring and errors, especially in high-frequency regions (such as crinkles and edges), caused by spectral bias: neural networks favor low-frequency representations and therefore exhibit a bias towards smooth functions. In this paper, we therefore propose a self-learning conditional network with multi-scale features for photometric stereo, which avoids blurred reconstruction in such regions. Our explorations include: (i) a multi-scale feature fusion architecture that simultaneously maintains high-resolution representations and deep feature extraction, and (ii) an improved gradient-motivated conditionally parameterized convolution (GM-CondConv) in our photometric stereo network, which uses different combinations of convolution kernels for varying surfaces. Extensive experiments on public benchmark datasets show that our calibrated photometric stereo method outperforms the state-of-the-art.
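To make the "simplified reflectance model" baseline concrete, the following is a minimal sketch of classical calibrated photometric stereo under a Lambertian assumption, solved per pixel by least squares. It is not the network proposed in this article. The function name `lambertian_normals`, the array layouts, and the synthetic one-pixel scene are all illustrative assumptions, and the sketch ignores shadows and specular highlights, which is exactly where such models tend to break down on real surfaces.

```python
import numpy as np


def lambertian_normals(images, light_dirs):
    """Classical Lambertian photometric stereo (illustrative sketch).

    images:     (m, h, w) array of intensities, one image per light.
    light_dirs: (m, 3) array of known unit light directions (calibrated case).
    Returns unit normals of shape (h, w, 3) and albedo of shape (h, w).
    """
    m, h, w = images.shape
    intensities = images.reshape(m, -1)  # (m, h*w), one column per pixel
    # Lambertian model: I = L @ g, where g = albedo * normal.
    # Solve for g at every pixel at once in the least-squares sense.
    g, *_ = np.linalg.lstsq(light_dirs, intensities, rcond=None)  # (3, h*w)
    albedo = np.linalg.norm(g, axis=0)
    normals = g / np.maximum(albedo, 1e-12)  # guard against division by zero
    return normals.T.reshape(h, w, 3), albedo.reshape(h, w)


# Hypothetical one-pixel scene: a known normal and albedo lit from four
# directions chosen so that no light is behind the surface (no shadows).
true_normal = np.array([0.0, 0.6, 0.8])
true_albedo = 0.5
lights = np.array([
    [0.0, 0.0, 1.0],
    [0.6, 0.0, 0.8],
    [0.0, 0.6, 0.8],
    [-0.6, 0.0, 0.8],
])
observed = (true_albedo * lights @ true_normal).reshape(4, 1, 1)

est_normals, est_albedo = lambertian_normals(observed, lights)
```

On this noise-free synthetic input the least-squares solve should recover the normal and albedo that generated the intensities. Real non-Lambertian surfaces violate the linear model, which is the limitation the abstract refers to.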

Keywords

3D reconstruction; photometric stereo; deep neural networks; normal estimation

Computational Visual Media

Volume 8, Issue 1, March 2022

Pages 105-118

DOI: 10.1007/s41095-021-0223-y

Cite this article:

Ju Y, Peng Y, Jian M, et al. Learning conditional photometric stereo with high-resolution features. Computational Visual Media, 2022, 8(1): 105-118. https://doi.org/10.1007/s41095-021-0223-y

Received: 10 January 2021

Accepted: 01 March 2021

Published: 27 October 2021

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.