Research Article | Open Access

Reference-guided structure-aware deep sketch colorization for cartoons

Caritas Institute of Higher Education, Hong Kong SAR, China
Shenzhen University, Shenzhen 518060, China

Abstract

Digital cartoon production requires extensive manual labor to colorize sketches with visually pleasing color composition and color shading. During colorization, the artist usually takes an existing cartoon image as color guidance, particularly when colorizing related characters or an animation sequence. Reference-guided colorization is more intuitive than colorization with other hints, such as color points, scribbles, or text-based hints. Unfortunately, reference-guided colorization is challenging, since the style of the colorized image should match the style of the reference image in terms of both global color composition and local color shading. In this paper, we propose a novel learning-based framework that colorizes a sketch based on a color style feature extracted from a reference color image. Our framework contains a color style extractor that extracts the color feature from a color image, a colorization network that generates multi-scale output images by combining a sketch with the color feature, and a multi-scale discriminator that improves the realism of the output images. Extensive qualitative and quantitative evaluations show that our method outperforms existing methods, providing both superior visual quality and consistency with the style reference in the task of reference-based colorization.
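The abstract outlines three components: a color style extractor, a colorization network producing multi-scale outputs, and a multi-scale discriminator. The following is a minimal PyTorch sketch of how such a pipeline could be wired together; the module names, layer sizes, and the additive style-injection step are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch of the three components described in the abstract.
# All architectural details below are assumptions for demonstration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ColorStyleExtractor(nn.Module):
    """Encodes a reference color image into a global color style vector."""
    def __init__(self, style_dim=256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 64, 4, 2, 1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 4, 2, 1), nn.ReLU(inplace=True),
            nn.Conv2d(128, 256, 4, 2, 1), nn.ReLU(inplace=True),
        )
        self.fc = nn.Linear(256, style_dim)

    def forward(self, reference):
        h = self.conv(reference)
        h = F.adaptive_avg_pool2d(h, 1).flatten(1)    # global pooling -> style vector
        return self.fc(h)

class ColorizationNet(nn.Module):
    """Combines a 1-channel sketch with a style vector; emits multi-scale RGB outputs."""
    def __init__(self, style_dim=256):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(1, 64, 4, 2, 1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 4, 2, 1), nn.ReLU(inplace=True),
        )
        self.inject = nn.Linear(style_dim, 128)        # broadcast style into feature maps
        self.to_rgb_half = nn.Conv2d(128, 3, 3, 1, 1)  # coarse-scale output
        self.dec = nn.Sequential(
            nn.Upsample(scale_factor=2), nn.Conv2d(128, 64, 3, 1, 1), nn.ReLU(inplace=True),
            nn.Upsample(scale_factor=2), nn.Conv2d(64, 32, 3, 1, 1), nn.ReLU(inplace=True),
        )
        self.to_rgb_full = nn.Conv2d(32, 3, 3, 1, 1)   # full-resolution output

    def forward(self, sketch, style):
        h = self.enc(sketch)
        h = h + self.inject(style)[:, :, None, None]   # fuse sketch features with color style
        return self.to_rgb_half(h), self.to_rgb_full(self.dec(h))

class MultiScaleDiscriminator(nn.Module):
    """One patch discriminator per output scale."""
    def __init__(self, num_scales=2):
        super().__init__()
        self.heads = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(3, 64, 4, 2, 1), nn.LeakyReLU(0.2, inplace=True),
                nn.Conv2d(64, 1, 4, 1, 1),
            ) for _ in range(num_scales)
        ])

    def forward(self, images):                         # images: one tensor per scale
        return [head(img) for head, img in zip(self.heads, images)]

# Usage: colorize a 256x256 sketch with the style of a reference image.
sketch = torch.randn(1, 1, 256, 256)
reference = torch.randn(1, 3, 256, 256)
style = ColorStyleExtractor()(reference)
out_half, out_full = ColorizationNet()(sketch, style)
scores = MultiScaleDiscriminator()([out_half, out_full])
print(out_half.shape, out_full.shape)                  # (1, 3, 64, 64), (1, 3, 256, 256)
```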

Computational Visual Media
Pages 135-148
Cite this article:
Liu X, Wu W, Li C, et al. Reference-guided structure-aware deep sketch colorization for cartoons. Computational Visual Media, 2022, 8(1): 135-148. https://doi.org/10.1007/s41095-021-0228-6

Received: 21 January 2021
Accepted: 18 March 2021
Published: 27 October 2021
© The Author(s) 2021.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.
