Image resizing by reconstruction from deep features

Dov Danon; Moab Arar; Daniel Cohen-Or; Ariel Shamir

doi:10.1007/s41095-021-0216-x

Research Article | Open Access

Dov Danon¹,*, Moab Arar¹,*, Daniel Cohen-Or¹, Ariel Shamir²

¹ Tel Aviv University, Tel Aviv, 69978, Israel

² The Interdisciplinary Center Herzliya, Herzliya, 4610101, Israel

* Dov Danon and Moab Arar contributed equally to this work.

Abstract

Traditional image resizing methods usually work in pixel space and use various saliency measures. The challenge is to adjust the image shape while trying to preserve important content. In this paper, we perform image resizing in feature space, using the deep layers of a neural network, which contain rich semantic information. We directly adjust the image feature maps, extracted from a pre-trained classification network, and reconstruct the resized image using neural-network-based optimization. This novel approach leverages the hierarchical encoding of the network, in particular the high-level discriminative power of its deeper layers, which can recognize semantic regions and objects, thereby allowing their aspect ratios to be maintained. Reconstructing from deep features produces less noticeable artifacts than image-space resizing operators. We evaluate our method on benchmarks, compare it to alternative approaches, and demonstrate its strengths on challenging images.
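The idea of editing feature maps and then reconstructing an image by optimization can be illustrated with a deliberately small sketch. This is not the authors' implementation. It assumes a 1-D image row instead of a 2-D image, and a hand-written two-channel linear feature map (local average and local difference) standing in for the layers of a pre-trained classification network. The function names `features`, `shrink_features`, `reconstruct`, and `resize_row` are hypothetical, as is the choice of using the difference channel as the importance measure.

```python
def features(row):
    """Toy two-channel feature map of a 1-D row (assumed stand-in for a CNN layer)."""
    avg = [(row[i] + row[i + 1]) / 2.0 for i in range(len(row) - 1)]
    diff = [row[i + 1] - row[i] for i in range(len(row) - 1)]
    return avg, diff


def shrink_features(avg, diff, k):
    """Drop the k feature columns with the weakest edge response, keeping order."""
    order = sorted(range(len(diff)), key=lambda i: abs(diff[i]))
    drop = set(order[:k])
    keep = [i for i in range(len(diff)) if i not in drop]
    return [avg[i] for i in keep], [diff[i] for i in keep]


def reconstruct(avg_t, diff_t, steps=500, lr=0.1):
    """Gradient descent on the squared feature error, starting from a blank row."""
    n = len(avg_t) + 1
    row = [0.0] * n
    for _ in range(steps):
        avg, diff = features(row)
        grad = [0.0] * n
        for i in range(n - 1):
            ra = avg[i] - avg_t[i]    # residual of the average channel
            rd = diff[i] - diff_t[i]  # residual of the difference channel
            grad[i] += ra - 2.0 * rd
            grad[i + 1] += ra + 2.0 * rd
        row = [v - lr * g for v, g in zip(row, grad)]
    return row


def resize_row(row, k):
    """Narrow a row by k pixels: edit the features, then reconstruct pixels."""
    avg, diff = features(row)
    return reconstruct(*shrink_features(avg, diff, k))


row = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]
narrow = resize_row(row, 4)  # a row of 8 pixels
```

After columns are removed, neighbouring feature columns may disagree about the pixel they share, so the reconstruction step settles on the row whose features are closest to the edited ones in the least-squares sense. With `k = 0` the features are consistent and the original row is recovered.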

Keywords

image retargeting reconstruction deep seam carving image resizing

Electronic Supplementary Material

41095_2021_216_MOESM1_ESM.pdf (51.4 MB)

Computational Visual Media

Volume 7, Issue 4, December 2021

Pages 453-466

Cite this article:

Danon D, Arar M, Cohen-Or D, et al. Image resizing by reconstruction from deep features. Computational Visual Media, 2021, 7(4): 453-466. https://doi.org/10.1007/s41095-021-0216-x