Research Article | Open Access

JMNet: A joint matting network for automatic human matting

Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China.
AI Center at Visual China Group, Burlingame, CA 94010, USA.
School of Engineering and Computer Science, Victoria University of Wellington, New Zealand.

Abstract

We propose a novel end-to-end deep learning framework, the Joint Matting Network (JMNet), to automatically generate alpha mattes for human images. We utilize the intrinsic structures of the human body as seen in images by introducing a pose estimation module, which can provide both global structural guidance and a local attention focus for the matting task. Our network model includes a pose network, a trimap network, a matting network, and a shared encoder to extract features for the above three networks. We also append a trimap refinement module and utilize gradient loss to provide a sharper alpha matte. Extensive experiments have shown that our method outperforms state-of-the-art human matting techniques; the shared encoder leads to better performance and lower memory costs. Our model can process real images downloaded from the Internet for use in composition applications.
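The abstract mentions a gradient loss for sharper alpha mattes and the use of the mattes in composition, but does not give either formulation. The following is a minimal sketch of one common form of each, not the authors' implementation: `gradient_loss` and `composite` are hypothetical names, the loss is assumed to be a mean absolute difference of forward-difference gradients, and images are assumed to be single-channel 2D lists with values in [0, 1].

```python
def gradient_loss(pred, target):
    """Mean absolute difference between the finite-difference gradients
    of a predicted and a reference alpha matte (assumed formulation)."""
    h, w = len(pred), len(pred[0])
    total, count = 0.0, 0
    for y in range(h):
        for x in range(w):
            if x + 1 < w:  # horizontal forward difference
                g_pred = pred[y][x + 1] - pred[y][x]
                g_true = target[y][x + 1] - target[y][x]
                total += abs(g_pred - g_true)
                count += 1
            if y + 1 < h:  # vertical forward difference
                g_pred = pred[y + 1][x] - pred[y][x]
                g_true = target[y + 1][x] - target[y][x]
                total += abs(g_pred - g_true)
                count += 1
    return total / count if count else 0.0


def composite(alpha, foreground, background):
    """Standard compositing equation, per pixel: I = alpha * F + (1 - alpha) * B."""
    return [
        [a * f + (1.0 - a) * b for a, f, b in zip(row_a, row_f, row_b)]
        for row_a, row_f, row_b in zip(alpha, foreground, background)
    ]


# A hard edge versus a smeared edge: the loss penalises the smeared prediction.
sharp = [[0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 1.0]]
blurry = [[0.0, 0.25, 0.75, 1.0], [0.0, 0.25, 0.75, 1.0]]
loss_same = gradient_loss(sharp, sharp)
loss_blur = gradient_loss(blurry, sharp)
```

A loss of this kind is zero when the predicted matte has the same edges as the reference and grows as edges are smeared, which is why it is often paired with a per-pixel alpha loss when crisp boundaries matter.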

Computational Visual Media
Pages 215-224
Cite this article:
Wu X, Fang X-N, Chen T, et al. JMNet: A joint matting network for automatic human matting. Computational Visual Media, 2020, 6(2): 215-224. https://doi.org/10.1007/s41095-020-0168-6


Received: 13 January 2020
Revised: 13 January 2020
Accepted: 19 February 2020
Published: 14 April 2020
© The Author(s) 2020

This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.
