Research Article | Open Access

Cross-depiction problem: Recognition and synthesis of photographs and artwork

Department of Computer Science, University of Bath, UK.
School of Computer Science, University of Adelaide, Australia.

Abstract

Cross-depiction is the recognition, and synthesis, of objects whether they are photographed, painted, drawn, etc. It is a significant yet under-researched problem. Emulating the remarkable human ability to recognise and depict objects in an astonishingly wide variety of depictive forms is likely to advance both the foundations and the applications of computer vision. In this paper we motivate the cross-depiction problem, explain why it is difficult, and discuss some current approaches. Our main conclusions are that (i) appearance-based recognition systems tend to be over-fitted to one depiction, (ii) models that explicitly encode spatial relations between parts are more robust, and (iii) recognition and non-photorealistic synthesis are related tasks.

Computational Visual Media
Pages 91-103
Cite this article:
Hall P, Cai H, Qi W, et al. Cross-depiction problem: Recognition and synthesis of photographs and artwork. Computational Visual Media, 2015, 1(2): 91-103. https://doi.org/10.1007/s41095-015-0017-1

Revised: 25 March 2015
Accepted: 20 May 2015
Published: 18 September 2015
© The Author(s) 2015

This article is published with open access at Springerlink.com

This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.

Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.
