Pattern Mining in Linked Data by Edge-Labeling

Xiang Zhang; Wenyao Cheng

doi:10.1109/TST.2016.7442500

Open Access

School of Computer Science and Engineering, Southeast University, Nanjing 210096, China.

Abstract

Link patterns are consensus practices that characterize how different types of objects are typically interlinked in linked data. Mining link patterns in large-scale linked data has been inefficient because of the computational complexity of mining algorithms and memory limitations. Partitioning strategies have been proposed to improve scalability, but the efficiency of these strategies and the completeness of their mining results remain open questions. In this paper, we propose a novel partitioning strategy for mining link patterns in large-scale linked data, in which linked data is partitioned according to edge-labeling rules: edges are first grouped into a primary multi-partition according to their labels, and a feedback mechanism then produces a secondary bi-partition based on a quick mining process. Link patterns discovered locally in the partitions are then merged into global patterns. Experiments show that our partitioning strategy is feasible and efficient.
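As a rough illustration of the primary multi-partition step, the sketch below groups the edges of a small linked-data graph by edge label and then combines per-partition pattern counts. It is a minimal sketch under stated assumptions, not the authors' implementation: the triple representation, the function names `partition_by_edge_label` and `merge_local_counts`, and the merge-by-summation rule are all hypothetical, and the feedback-driven secondary bi-partition is not modeled because the abstract does not specify it.

```python
from collections import defaultdict


def partition_by_edge_label(triples):
    """Group (subject, label, object) edges into one partition per edge label.

    Assumption: linked data is given as an iterable of 3-tuples, with the
    middle element acting as the edge label.
    """
    partitions = defaultdict(list)
    for subject, label, obj in triples:
        partitions[label].append((subject, obj))
    return dict(partitions)


def merge_local_counts(local_counts):
    """Combine per-partition pattern counts into global counts.

    Assumption: a pattern's global count is the sum of its local counts.
    The abstract does not describe the actual merge rule.
    """
    merged = defaultdict(int)
    for counts in local_counts:
        for pattern, count in counts.items():
            merged[pattern] += count
    return dict(merged)


# Hypothetical toy data, for illustration only.
triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "authorOf", "paper1"),
]
partitions = partition_by_edge_label(triples)
```

Each partition can then be mined independently, which is what keeps the memory footprint of any single mining run bounded by the size of one partition rather than the whole graph.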

Keywords

link pattern labeling partitioning scalability evaluation

Tsinghua Science and Technology

Volume 21, Issue 2, April 2016

Pages 168-175

Cite this article:

Zhang X, Cheng W. Pattern Mining in Linked Data by Edge-Labeling. Tsinghua Science and Technology, 2016, 21(2): 168-175. https://doi.org/10.1109/TST.2016.7442500