AI Chat Paper
Note: Please note that the following content is generated by AMiner AI. SciOpen does not take any responsibility related to this content.
{{lang === 'zh_CN' ? '文章概述' : 'Summary'}}
{{lang === 'en_US' ? '中' : 'Eng'}}
Chat more with AI
Article Link
Collect
Submit Manuscript
Show Outline
Outline
Show full outline
Hide outline
Outline
Show full outline
Hide outline
Regular Paper

Scalable and Adaptive Joins for Trajectory Data in Distributed Stream System

Institute of Artificial Intelligence, School of Computer Science and Technology, Soochow University, Suzhou 215006, China
Neusoft Corporation, Shenyang 110179, China
Show Author Information

Abstract

As a fundamental operation in LBS (location-based services), the trajectory similarity of moving objects has been extensively studied in recent years. However, due to the increasing volume of moving object trajectories and the demand of interactive query performance, the trajectory similarity queries are now required to be processed on massive datasets in a real-time manner. Existing work has proposed distributed or parallel solutions to enable large-scale trajectory similarity processing. However, those techniques cannot be directly adapted to the real-time scenario as it is likely to generate poor balancing performance when workload variance occurs on the incoming trajectory stream. In this paper, we propose a new workload partitioning framework, ART (Adaptive Framework for Real-Time Trajectory Similarity), which introduces practical algorithms to support dynamic workload assignment for RTTS (real-time trajectory similarity). Our proposal includes a processing model tailored for the RTTS scenario, a load balancing framework to maximize throughput, and an adaptive data partition manner designed to cut off unnecessary network cost. Based on this, our model can handle the large-scale trajectory similarity in an on-line scenario, which achieves scalability, effectiveness, and efficiency by a single shot. Empirical studies on synthetic data and real-world stream applications validate the usefulness of our proposal and prove the huge advantage of our approach over state-of-the-art solutions in the literature.

Electronic Supplementary Material

Download File(s)
jcst-34-4-747-Highlights.pdf (370.5 KB)

References

[1]
Trasarti R, Pinelli F, Nanni M et al. Mining mobility user profiles for car pooling. In Proc. the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 2011, pp.1190-1198.
[2]

Zheng K, Zheng Y, Yuan N J et al. Online discovery of gathering patterns over trajectories. IEEE Transactions on Knowledge and Data Engineering, 2014, 26(8): 1974-1988.

[3]

Li L, Kim J, Xu J et al. Time-dependent route scheduling on road networks. SIGSPATIAL Special, 2018, 10(1): 10-14.

[4]

Shang S, Ding R, Zheng K et al. Personalized trajectory matching in spatial networks. The International Journal on Very Large Data Bases, 2014, 23(3): 449-468.

[5]

Shang S, Chen L, Wei Z et al. Collective travel planning in spatial networks. IEEE Transactions on Knowledge and Data Engineering, 2016, 28(5): 1132-1146.

[6]
Su H, Zheng K, Wang H et al. Calibrating trajectory data for similarity-based analysis. In Proc. the 2013 ACM SIGMOD International Conference on Management of Data, June 2013, pp.833-844.
[7]

Su H, Zheng K, Huang J et al. Calibrating trajectory data for spatio-temporal similarity analysis. The International Journal on Very Large Data Bases, 2015, 24(1): 93-116.

[8]
Sun J, Xu J, Zhou R et al. Discovering expert drivers from trajectories. In Proc. the 34th International Conference on Data Engineering, April 2018, pp.1332-1335.
[9]

Shang S, Chen L, Wei Z et al. Parallel trajectory similarity joins in spatial networks. The International Journal on Very Large Data Bases, 2018, 27(3): 395-420.

[10]

Shang S, Chen L, Wei Z et al. Trajectory similarity join in spatial networks. Proceedings of the VLDB Endowment, 2017, 10(11): 1178-1189.

[11]

Xie D, Li F, Phillips J M. Distributed trajectory similarity search. Proceedings of the VLDB Endowment, 2017, 10(11): 1478-1489.

[12]
Toshniwal A, Taneja S, Shukla A et al. Storm@twitter. In Proc. the 2014 ACM SIGMOD International Conference on Management of Data, June 2014, pp.147-156.
[13]
Kulkarni S, Bhagat N, Fu M et al. Twitter heron: Stream processing at scale. In Proc. the 2015 ACM SIGMOD International Conference on Management of Data, May 2015, pp.239-250.
[14]
Zaharia M, Das T, Li H et al. Discretized streams: Fault-tolerant streaming computation at scale. In Proc. the 24th ACM SIGOPS Symposium on Operating Systems Principles, November 2013, pp.423-438.
[15]
Nasir M A U, Morales G D F, Garcia-Soriano D et al. The power of both choices: Practical load balancing for distributed stream processing engines. In Proc. the 31st IEEE International Conference on Data Engineering, April 2015, pp.137-148.
[16]
Nasir M A U, Morales G D F, Kourtellis N et al. When two choices are not enough: Balancing at scale in distributed stream processing. In Proc. the 32nd IEEE International Conference on Data Engineering, May 2016, pp.589-600.
[17]
Lin Q, Ooi B C, Wang Z et al. Scalable distributed stream join processing. In Proc. the 2015 ACM SIGMOD International Conference on Management of Data, May 2015, pp.811-825.
[18]

Elseidy M, Elguindy A, Vitorovic A et al. Scalable and adaptive online joins. Proceedings of the VLDB Endowment, 2014, 7(6): 441-452.

[19]
Wang H, Su H, Zheng K et al. An effectiveness study on trajectory similarity measures. In Proc. the 24th Australasian Database Conference, February 2013, pp.13-22.
[20]
Ying R, Pan J, Fox K et al. A simple efficient approximation algorithm for dynamic time warping. In Proc. the 24th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, October 2016, Article No. 21.
[21]
Vlachos M, Kollios G, Gunopulos D. Discovering similar multidimensional trajectories. In Proc. the 18th International Conference on Data Engineering, February 2002, pp.673-684.
[22]
Chen L, Özsu M T, Oria V. Robust and fast similarity search for moving object trajectories. In Proc. the 2005 ACM SIGMOD International Conference on Management of Data, June 2005, pp.491-502.
[23]
Frentzos E, Gratsias K, Theodoridis Y. Index-based most similar trajectory search. In Proc. the 23rd International Conference on Data Engineering, April 2007, pp.816-825.
[24]
Yanagisawa Y, Akahani J, Satoh T. Shape-based similarity query for trajectory of mobile objects. In Proc. the 3rd International Conference on Mobile Data Management, January 2003, pp.63-77.
[25]

Jiang Y, Li G, Feng J et al. String similarity joins: An experimental evaluation. Proceedings of the VLDB Endowment, 2014, 7(8): 625-636.

[26]

Gedik B. Partitioning functions for stateful data parallelism in stream processing. The International Journal on Very Large Data Bases, 2014, 23(4): 517-539.

[27]
Shah M A, Hellerstein J M, Chandrasekaran S et al. Flux: An adaptive partitioning operator for continuous query systems. In Proc. the 19th International Conference on Data Engineering, March 2003, pp.25-36.
Journal of Computer Science and Technology
Pages 747-761
Cite this article:
Fang J-H, Zhao P-P, Liu A, et al. Scalable and Adaptive Joins for Trajectory Data in Distributed Stream System. Journal of Computer Science and Technology, 2019, 34(4): 747-761. https://doi.org/10.1007/s11390-019-1940-x

351

Views

5

Crossref

N/A

Web of Science

4

Scopus

0

CSCD

Altmetrics

Received: 13 January 2019
Revised: 13 May 2019
Published: 19 July 2019
© 2019 Springer Science + Business Media, LLC & Science Press, China
Return