Research Article | Open Access

Temporally consistent video colorization with deep feature propagation and self-regularization learning

Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
University of Chinese Academy of Sciences, Beijing 100049, China
Department of Electrical & Computer Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore 639798
Applied Research Center, Tencent PCG, Shenzhen, China

* Yihao Liu and Hengyuan Zhao contributed equally to this work.


Abstract

Video colorization is a challenging and highly ill-posed problem. Although recent years have witnessed remarkable progress in single-image colorization, relatively little research effort has been devoted to video colorization, and existing methods often suffer from severe flickering artifacts (temporal inconsistency) or unsatisfactory colorization. We address this problem from a new perspective, by jointly considering colorization and temporal consistency in a unified framework. Specifically, we propose a novel temporally consistent video colorization (TCVC) framework. TCVC effectively propagates frame-level deep features in a bidirectional way to enhance the temporal consistency of colorization. Furthermore, TCVC introduces a self-regularization learning (SRL) scheme to minimize the differences between predictions obtained using different time steps. SRL does not require any ground-truth color videos for training and can further improve temporal consistency. Experiments demonstrate that our method not only produces visually pleasing colorized videos, but also achieves clearly better temporal consistency than state-of-the-art methods. A video demo is provided at https://www.youtube.com/watch?v=c7dczMs-olE, while code is available at https://github.com/lyh-18/TCVC-Temporally-Consistent-Video-Colorization.
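The core self-regularization idea — penalizing disagreement between predictions obtained with different time steps, without any ground-truth color video — can be illustrated with a minimal sketch. This is a conceptual toy, not the paper's actual architecture: the simple neighbor-averaging "propagation" and the function names below are placeholders standing in for the learned bidirectional feature propagation network.

```python
import numpy as np

def propagate(features, stride):
    # Toy stand-in for bidirectional feature propagation: blend each
    # frame's features with its neighbors `stride` steps away in both
    # temporal directions (clamped at the sequence boundaries).
    T = len(features)
    out = []
    for t in range(T):
        prev_f = features[max(t - stride, 0)]
        next_f = features[min(t + stride, T - 1)]
        out.append((features[t] + 0.5 * prev_f + 0.5 * next_f) / 2.0)
    return np.stack(out)

def self_regularization_loss(features, strides=(1, 2)):
    # SRL in spirit: predictions made with different time steps should
    # agree, so penalize their mean absolute difference. Note that no
    # ground-truth color video appears anywhere in this term.
    preds = [propagate(features, s) for s in strides]
    return float(np.mean(np.abs(preds[0] - preds[1])))

# Random "frame features" for a 5-frame clip.
frames = [np.random.rand(4, 4) for _ in range(5)]
loss = self_regularization_loss(frames)
print(loss)
```

A temporally static sequence yields a loss of zero regardless of stride, which is the behavior the regularizer rewards: the less the output depends on the particular time step used, the more temporally consistent it is.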

Computational Visual Media
Pages 375-395
Cite this article:
Liu Y, Zhao H, Chan KCK, et al. Temporally consistent video colorization with deep feature propagation and self-regularization learning. Computational Visual Media, 2024, 10(2): 375-395. https://doi.org/10.1007/s41095-023-0342-8


Received: 07 December 2022
Accepted: 12 March 2023
Published: 03 January 2024
© The Author(s) 2023.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.
