Cross-depiction problem: Recognition and synthesis of photographs and artwork

Peter Hall; Hongping Cai; Wu Qi; Tadeo Corradi

doi:10.1007/s41095-015-0017-1

Research Article | Open Access

Peter Hall¹, Hongping Cai¹, Wu Qi², Tadeo Corradi¹

1 Department of Computer Science, University of Bath, UK.

2 School of Computer Science, University of Adelaide, Australia.

Abstract

Cross-depiction is the recognition, and synthesis, of objects whether they are photographed, painted, drawn, etc. It is a significant yet under-researched problem. Emulating the remarkable human ability to recognise and depict objects in an astonishingly wide variety of depictive forms is likely to advance both the foundations and the applications of computer vision. In this paper we motivate the cross-depiction problem, explain why it is difficult, and discuss some current approaches. Our main conclusions are (i) appearance-based recognition systems tend to be over-fitted to one depiction, (ii) models that explicitly encode spatial relations between parts are more robust, and (iii) recognition and non-photorealistic synthesis are related tasks.

Keywords

classification; synthesis; cross-depiction; feature; spatial layout; connectivity; representation

Computational Visual Media, Volume 1, Issue 2, June 2015, Pages 91-103

Cite this article:

Hall P, Cai H, Qi W, et al. Cross-depiction problem: Recognition and synthesis of photographs and artwork. Computational Visual Media, 2015, 1(2): 91-103. https://doi.org/10.1007/s41095-015-0017-1

Revised: 25 March 2015

Accepted: 20 May 2015

Published: 18 September 2015

This article is published with open access at Springerlink.com

This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.

Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.