Research Article | Open Access

Unsupervised natural image patch learning

Tel-Aviv University, Tel Aviv 6997801, Israel.
Stanford University, Stanford, CA 94305, USA.

Abstract

A metric for natural image patches is an important tool for analyzing images. An efficient means of learning one is to train a deep network to map an image patch to a vector space in which the Euclidean distance reflects patch similarity. Previous attempts learned such an embedding in a supervised manner, requiring the availability of many annotated images. In this paper, we present an unsupervised embedding of natural image patches, avoiding the need for annotated images. The key idea is that the similarity of two patches can be learned from the prevalence of their spatial proximity in natural images. Under this simple principle, many spatially nearby pairs are clearly outliers. However, as we show, these outliers do not harm the convergence of the metric learning. We show that our unsupervised embedding approach is more effective than a supervised one or one that uses deep patch representations. Moreover, we show that it naturally lends itself to an efficient self-supervised domain adaptation technique onto a target domain that contains a common foreground object.
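
As a rough illustration of the key idea, the sketch below samples patch locations so that spatially nearby patches act as similar pairs and distant ones as dissimilar pairs, then scores their embeddings with a margin-based triplet loss on Euclidean distances. The abstract does not specify the loss or the sampling scheme, so the triplet formulation, the function names `sample_triplet_coords` and `triplet_loss`, and the parameters `size`, `near`, and `margin` are all assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def sample_triplet_coords(height, width, size=8, near=4, rng=random):
    """Return top-left corners (y, x) of an anchor, a positive, and a negative patch.

    Assumption for this sketch: a patch whose corner lies within `near` pixels
    of the anchor counts as "similar"; one farther away counts as "dissimilar".
    """
    max_y, max_x = height - size, width - size
    # Require enough room that a far-away location always exists.
    if max_y < 0 or max_x < 0 or max(max_y, max_x) <= 2 * near:
        raise ValueError("image too small for this patch size and radius")

    ay, ax = rng.randint(0, max_y), rng.randint(0, max_x)

    # Positive: a small random offset from the anchor, clamped to the image.
    py = min(max(ay + rng.randint(-near, near), 0), max_y)
    px = min(max(ax + rng.randint(-near, near), 0), max_x)

    # Negative: rejection-sample a location outside the `near` window.
    while True:
        ny, nx = rng.randint(0, max_y), rng.randint(0, max_x)
        if max(abs(ny - ay), abs(nx - ax)) > near:
            return (ay, ax), (py, px), (ny, nx)


def triplet_loss(f_a, f_p, f_n, margin=1.0):
    """Hinge loss pushing the anchor closer to the positive than to the negative."""
    return max(0.0, math.dist(f_a, f_p) - math.dist(f_a, f_n) + margin)


# Usage: draw one triplet of patch locations from a 64x64 image.
anchor, positive, negative = sample_triplet_coords(64, 64, rng=random.Random(0))
```

Some sampled "positive" pairs will straddle an object boundary and be visually dissimilar; these are the outliers the abstract refers to, and in this sketch nothing filters them out.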

Electronic Supplementary Material

41095_2019_147_MOESM1_ESM.pdf (523.3 MB)

Computational Visual Media
Pages 229-237
Cite this article:
Danon D, Averbuch-Elor H, Fried O, et al. Unsupervised natural image patch learning. Computational Visual Media, 2019, 5(3): 229-237. https://doi.org/10.1007/s41095-019-0147-y


Revised: 23 April 2019
Accepted: 18 May 2019
Published: 22 August 2019
© The author(s) 2019

This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.
