Image Tagging by Semantic Neighbor Learning Using User-Contributed Social Image Datasets

Feng Tian; Xukun Shen; Xianmei Liu; Maojun Cao

doi:10.23919/TST.2017.8195340

Open Access

School of Computer and Information Technology, Northeast Petroleum University, Daqing 163318, China.

State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing 100191, China.

Abstract

The explosive growth in the number of images on the Internet poses the challenge of how to index, retrieve, and organize these resources effectively. Assigning proper tags to visual content is key to the success of many applications, such as image retrieval and content mining. Although recent years have witnessed many advances in image tagging, existing methods are limited by their reliance on high-quality, large-scale training data, which are expensive to obtain. In this paper, we propose a novel semantic neighbor learning method based on user-contributed social image datasets, which can be acquired from the Web’s inexhaustible social image content. In contrast to existing image tagging approaches that rely on high-quality image-tag supervision, we obtain weak supervision for our neighbor learning method through progressive neighborhood retrieval from noisy and diverse user-contributed image collections. The retrieved neighbor images are not only visually alike and partially correlated but also semantically related. We offer a step-by-step, easy-to-use implementation of the proposed method. Extensive experiments on several datasets demonstrate that the proposed method significantly outperforms competing methods.

Keywords

image tag; social image tagging; user-contributed datasets; semantic neighbor learning

Notation	Description
$\boldsymbol{x}_{i} \in \mathbb{R}^{m}$	Feature vector of the i-th image
$\boldsymbol{t}_{i} \in \mathbb{R}^{q}$	Original tag vector of the i-th image
$L = \{(\boldsymbol{x}_{i}, \boldsymbol{t}_{i})\}_{i=1}^{l}$	Training set
$\boldsymbol{T} \in \mathbb{R}^{l \times q}$	Original tag indicator matrix
$\boldsymbol{Y} \in \mathbb{R}^{l \times q}$	Replenished tag indicator matrix
$\boldsymbol{W} \in \mathbb{R}^{l \times l}$	Weight matrix of the neighbor graph
$\boldsymbol{X}_{i.}$	i-th row of $\boldsymbol{X}$
$\boldsymbol{X}_{.j}$	j-th column of $\boldsymbol{X}$
$L_{i}$	i-th semantic group
$BN(\boldsymbol{x}_{i})$	Semantically balanced neighborhood of $\boldsymbol{x}_{i}$
$\boldsymbol{C} \in \mathbb{R}^{n \times n}$	Similarity matrix of semantically consistent neighbors
$\boldsymbol{A} \in \mathbb{R}^{n \times q}$	Predicted tag indicator matrix
$\odot$	Element-wise product of two matrices or vectors
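
The notation above can be made concrete with a small sketch. It builds a toy training set, reads a row and a column of the tag indicator matrix, takes an element-wise product, and scores tags for a query image by plain distance-weighted voting over its nearest neighbors. The toy data, the helper names `hadamard` and `neighbor_vote`, the Euclidean distance, and the exponential weighting are all assumptions made for illustration; this is a generic neighbor-voting baseline, not the progressive, semantically balanced retrieval that the paper proposes.

```python
import math

# Toy training set: l = 4 images, m = 2 features, q = 3 tags (all values invented).
X = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.1]]   # rows are feature vectors x_i
T = [[1, 0, 0], [1, 1, 0], [0, 0, 1], [0, 1, 1]]       # rows are tag vectors t_i

row_1 = T[0]                    # T_{1.}: tags of the first image
col_2 = [t[1] for t in T]       # T_{.2}: which images carry the second tag


def hadamard(a, b):
    """Element-wise product of two equal-length vectors."""
    return [u * v for u, v in zip(a, b)]


shared = hadamard(T[1], T[3])   # tags carried by both the second and fourth image


def neighbor_vote(x, X, T, k=2):
    """Score each tag for x by distance-weighted voting over its k nearest images."""
    dists = [math.dist(x, xi) for xi in X]
    nearest = sorted(range(len(X)), key=dists.__getitem__)[:k]
    weights = [math.exp(-dists[j]) for j in nearest]   # closer images weigh more
    total = sum(weights)
    q = len(T[0])
    return [sum(w * T[j][c] for w, j in zip(weights, nearest)) / total
            for c in range(q)]


scores = neighbor_vote([0.05, 0.0], X, T)
print(scores)
```

In this toy case the query sits midway between the first two images, so the tag they share receives the highest score and the tag carried by only one of them receives roughly half of it.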

Dataset	Mean tags per image	Median tags per image	Maximum tags per image	Infrequent tags (ratio)
COREL5K	3.4	4	5	195 (0.75)
IAPR-TC12	5.7	5	23	217 (0.746)
ESP-GAME	4.7	5	15	201 (0.750)
MIRFLICKR-25000	8.9	13	27	947 (0.683)
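
The last column pairs a count with a ratio. A minimal sketch, assuming the ratio is the number of infrequent tags divided by the dataset's total vocabulary size (the table does not state this), recovers the vocabulary size each row would imply. The function name `implied_vocabulary_size` is hypothetical, and because the ratios are rounded the results are only approximate.

```python
# (infrequent tag count, reported ratio), copied from the table above.
infrequent = {
    "COREL5K": (195, 0.75),
    "IAPR-TC12": (217, 0.746),
    "ESP-GAME": (201, 0.750),
    "MIRFLICKR-25000": (947, 0.683),
}


def implied_vocabulary_size(count, ratio):
    """Vocabulary size implied by ratio = count / vocabulary size (an assumption)."""
    return count / ratio


implied = {name: implied_vocabulary_size(c, r) for name, (c, r) in infrequent.items()}
for name, size in implied.items():
    print(f"{name:16s} ~{size:.1f} tags")
```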

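Method	COREL5K P	COREL5K R	COREL5K F1	COREL5K N+	IAPR-TC12 P	IAPR-TC12 R	IAPR-TC12 F1	IAPR-TC12 N+	ESP-GAME P	ESP-GAME R	ESP-GAME F1	ESP-GAME N+	MIRFLICKR-25000 P	MIRFLICKR-25000 R	MIRFLICKR-25000 F1	MIRFLICKR-25000 N+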
SML	0.25	0.28	0.26	132	0.18	0.21	0.19	206	0.13	0.17	0.15	197	0.11	0.12	0.12	183
JEC	0.26	0.32	0.29	137	0.28	0.29	0.29	223	0.19	0.21	0.20	219	0.15	0.16	0.156	272
TagProp	0.32	0.40	0.36	158	0.44	0.32	0.37	251	0.38	0.25	0.30	237	0.21	0.18	0.20	284
GS	0.29	0.31	0.30	148	0.32	0.28	0.30	243	0.26	0.20	0.23	223	0.17	0.15	0.16	276
FastTag	0.31	0.34	0.324	152	0.45	0.24	0.31	279	0.47	0.21	0.30	246	0.22	0.17	0.19	293
SNL	0.43	0.45	0.44	185	0.53	0.39	0.45	281	0.53	0.32	0.40	253	0.32	0.23	0.27	318
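
As a consistency check on the table above, the sketch below recomputes F1 for the COREL5K columns, assuming F1 is the harmonic mean of the reported precision (P) and recall (R). The table does not say how F1 was computed, so this is an assumption, and small differences from rounding are expected.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R, reported F1) on COREL5K, copied from the table above.
corel5k = {
    "SML": (0.25, 0.28, 0.26),
    "JEC": (0.26, 0.32, 0.29),
    "TagProp": (0.32, 0.40, 0.36),
    "GS": (0.29, 0.31, 0.30),
    "FastTag": (0.31, 0.34, 0.324),
    "SNL": (0.43, 0.45, 0.44),
}

recomputed = {m: f1_score(p, r) for m, (p, r, _) in corel5k.items()}
for m, (p, r, reported) in corel5k.items():
    print(f"{m:8s} reported={reported:.3f} recomputed={recomputed[m]:.3f}")
```

Under this assumption every recomputed value agrees with the reported F1 to within rounding of the two-decimal inputs.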