Open Access

Trajectory Big Data Processing Based on Frequent Activity

Amina Belhassena, Hongzhi Wang

School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China.

Abstract

With the rapid development and widespread use of the Global Positioning System (GPS) in devices such as smart phones and touch pads, many people share their personal experiences through the trajectories they record while visiting places of interest. Trajectory query processing has therefore emerged in recent years to help users find their best trajectories. However, given the huge number of trajectory points and the text descriptions attached to them, such as the activities users perform at these points, organizing the data in the index becomes tedious, and a parallel method becomes indispensable. In this paper, we investigate the problem of distributed trajectory query processing based on distance and frequent activities. A query is specified by the start and final points of the trajectory, a distance threshold, and a set of frequent activities associated with the points of interest along the trajectory. The query returns the shortest trajectory that includes the most frequent activities with high support and high confidence. To simplify query processing, we implement the Distributed Mining Trajectory R-Tree index (DMTR-Tree). We first organize the large trajectory dataset into distributed R-Tree indexes. Then, within each index, we apply the Apriori frequent-itemset algorithm at each point to select its frequent activity set. To speed up these computations, we use the Apache Spark cluster computing framework with MapReduce as the programming model. The experimental results show that the DMTR-Tree index and the query-processing algorithm are efficient and scalable.
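The frequent-activity step mentioned in the abstract can be illustrated with a minimal, single-machine sketch of Apriori-style mining over the activity sets recorded at one point of interest. This is not the authors' DMTR-Tree or Spark implementation: the function names `apriori` and `confidence`, the sample check-ins, and the threshold value are all assumptions made for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset whose support >= min_support.

    `transactions` is a list of activity sets (hypothetical data layout:
    one set per visit to a point of interest).
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of visits that contain every activity in `itemset`.
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: frequent single activities.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s
    frequent = dict(current)

    k = 2
    while current:
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


def confidence(frequent, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent, from mined supports."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]


# Hypothetical activity sets recorded at a single point of interest.
visits = [
    {"coffee", "reading"},
    {"coffee", "reading", "wifi"},
    {"coffee", "wifi"},
    {"reading"},
]
frequent_sets = apriori(visits, min_support=0.5)
```

With these sample visits, `frequent_sets` holds the three single activities plus the pairs `{coffee, reading}` and `{coffee, wifi}`; the pair `{reading, wifi}` falls below the threshold, so the three-activity candidate is pruned without being counted. In a distributed setting, the support counting is the part one would typically parallelize, for example one partition per index.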

Keywords

distributed R-tree trajectory frequent activity query

Tsinghua Science and Technology

Volume 24, Issue 3, June 2019

Pages 317-332

DOI: 10.26599/TST.2018.9010087

Cite this article:

Belhassena A, Wang H. Trajectory Big Data Processing Based on Frequent Activity. Tsinghua Science and Technology, 2019, 24(3): 317-332. https://doi.org/10.26599/TST.2018.9010087
