AI Chat Paper
Note: Please note that the following content is generated by AMiner AI. SciOpen does not take any responsibility related to this content.
{{lang === 'zh_CN' ? '文章概述' : 'Summary'}}
{{lang === 'en_US' ? '中' : 'Eng'}}
Chat more with AI
Article Link
Collect
Submit Manuscript
Show Outline
Outline
Show full outline
Hide outline
Outline
Show full outline
Hide outline
Regular Paper

O2iJoin: An Efficient Index-Based Algorithm for Overlap Interval Join

School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
Guangdong Key Laboratory of Popular High Performance Computers, Key Laboratory of Service Computing and Application, Shenzhen 518000, China

Recommended by ICPCSEE 2017

Show Author Information

Abstract

Time intervals are often associated with tuples to represent their valid time in temporal relations, where overlap join is crucial for various kinds of queries. Many existing overlap join algorithms use indices based on tree structures such as quad-tree, B+-tree and interval tree. These algorithms usually have high CPU cost since deep path traversals are unavoidable, which makes them not so competitive as data-partition or plane-sweep based algorithms. This paper proposes an efficient overlap join algorithm based on a new two-layer flat index named as Overlap Interval Inverted Index (i.e., O2i Index). It uses an array to record the end points of intervals and approximates the nesting structures of intervals via two functions in the first layer, and the second layer uses inverted lists to trace all intervals satisfying the approximated nesting structures. With the help of the new index, the join algorithm only visits the must-be-scanned lists and skips all others. Analyses and experiments on both real and synthetic datasets show that the proposed algorithm is as competitive as the state-of-the-art algorithms.

Electronic Supplementary Material

Download File(s)
jcst-33-5-1023-Highlights.pdf (784.6 KB)

References

[1]

Cafagna F, Böhlen M H. Disjoint interval partitioning. The VLDB Journal, 2017, 26(3): 447-466.

[2]

Bouros P, Mamoulis N. A forward scan based plane sweep algorithm for parallel interval joins. Proceedings of the VLDB Endowment, 2017, 10(11): 1346-1357.

[3]
Piatov D, Helmer S, Dignös A. An interval join optimized for modern hardware. In Proc. the 32nd IEEE International Conference on Data Engineering, May 2016, pp.1098-1109.
[4]
Dignös A, Böhlen M H, Gamper J. Overlap interval partition join. In Proc. the 2014 ACM SIGMOD International Conference on Management of Data, June 2014, pp.1459-1470.
[5]

Gao D, Jensen C S, Snodgrass R T, Soo M. Join operations in temporal databases. The VLDB Journal, 2005, 14(1): 2-29.

[6]
Elmasri R, Wuu G T J, Kim Y J. The time index: An access structure for temporal data. In Proc. the 16th International Conference on Very Large Databases, Aug. 1990, pp.1-12.
[7]
Son D, Elmasri R. Efficient temporal join processing using time index. In Proc. the 8th International Conference on Scientific and Statistical Data Base Management, Jun. 1996, pp.252-261.
[8]
Soo M D, Snodgrass R T, Jensen C S. Efficient evaluation of the valid-time natural join. In Proc. the 10th International Conference on Data Engineering, Feb. 1994, pp.282-292.
[9]
Sitzmann I, Stuckey P J. Improving temporal joins using histograms. In Proc. the 11th International Conference on Database and Expert Systems Applications, Sept. 2000, pp.488-498.
[10]
Gunadhi H, Segev A. Query processing algorithms for temporal intersection joins. In Proc. the 7th International Conference on Data Engineering, April 1991, pp.336-344.
[11]
Rana S P, Fotouhi F. Efficient processing of time-joins in temporal data bases. In Proc. the 3rd International Conference on Database Systems for Advanced Applications, April 1993, pp.427-432.
[12]
Pfser D, Jensen S J. Incremental join of time-oriented data. In Proc. the 11th International Conference on Scientific and Statistical Data Base Management, Jul. 1999, pp.232-243.
[13]
Zhang C, Naughton J, DeWitt D, Luo Q, Lohman G. On supporting containment queries in relational database management systems. In Proc. the 2001 ACM SIGMOD International Conference on Management of Data, May 2001, pp.425-436.
[14]
Zhang D, Tsotras V J, Seeger B. Efficient temporal join processing using indices. In Proc. the 18th International Conference on Data Engineering, Feb. 2002, pp.103-113.
[15]
Kriegel H P, Pötke M, Seidl T. Managing intervals efficiently in object-relational databases. In Proc. the 26th International Conference on Very Large Data Bases, Sept. 2000, pp.407-418.
[16]

Tsotras V J, Kangelaris N. The snapshot index: An I/O-optimal access method for timeslice queries. Information Systems, 1995, 20(3): 237-260.

[17]
Beckmann N, Seeger B. A revised R*-tree in comparison with related index structures. In Proc. the 2009 ACM SIGMOD International Conference on Management of Data, June 2009, pp.799-812.
[18]

Ozsoyoglu G, Snodgrass R T. Temporal and real-time databases: A survey. IEEE Transactions on Knowledge and Data Engineering, 1995, 7(4): 513-532.

[19]
Beckmann N, Kriegel H P, Schneider R, Seeger B. The R*-tree: An efficient and robust access method for points and rectangles. In Proc. the 1990 ACM SIGMOD International Conference on Management of Data, May 1990, pp.322-331.
[20]
Koudas N, Sevcik K C. Size separation spatial join. In Proc. the 1997 ACM SIGMOD International Conference on Management of Data, May 1997, pp.324-335.
[21]
Nobari S, Tauheed F, Heinis T, Karras P, Bressan S, Ailamaki A. TOUCH: In-memory spatial join by hierarchical data-oriented partitioning. In Proc. the 2013 ACM SIGMOD International Conference on Management of Data, Jun. 2013, pp.701-712.
[22]
Leung T Y C, Muntz R. Stream processing: Temporal query processing and optimization. In Temporal Databases: Theory, Design, and Implementation, Tassel A U, Clifford J, Gadia S et al. (eds.), Benjamin-Cummings, 1993, pp.329–355.
[23]
Lu H, Ooi B C, Tan K L. On spatially partitioned temporal join. In Proc. the 20th International Conference on Very Large Data Bases, Sept. 1994, pp.546-557.
[24]
Shen H, Ooi B C, Lu H. The TP-index: A dynamic and efficient indexing mechanism for temporal databases. In Proc. the 10th Int. Conf. Data Engineering, Feb. 1994, pp.274-281.
[25]
Arge L, Procopiuc O, Ramaswamy S et al. Scalable sweeping-based spatial join. In Proc. the 24th International Conference on Very Large Data Bases, Aug. 1998, pp.570-581.
[26]
Lo M L, Ravishankar C V. Spatial hash-joins. In Proc. the 1996 ACM SIGMOD International Conference on Management of Data, June 1996, pp.247-258.
[27]
Patel J M, Dewitt D J. Partition based spatial-merge join. In Proc. the 1996 ACM SIGMOD International Conference on Management of Data, Jun. 1996, pp.259-270.
[28]
Samet H. Foundation of Multidimensional and Metric Data Structures (1st edition). Morgan Kaufmann, 2006.
[29]
Enderle J, Hampel M, Seidl T. Joining interval data in relational databases. In Proc. the 2004 ACM SIGMOD International Conference on Management of Data, Jun. 2004, pp.683-694.
[30]
Zhang D, Tsotras V J. Index based processing of semi-restrictive temporal joins. In Proc. the 9th International Symposium on Temporal Representation and Reasoning, July 2002, pp.70-77.
[31]

Luo J Z, Shi S F, Wang H Z, Li J Z. FrepJoin: An efficient partition-based algorithm for edit similarity join. Frontiers of Information Technology & Electronic Engineering, 2017, 18(10): 1499-1510.

[32]
Wang C P, Wang C K, Wang H, Chen J, Ye X J. Preference join on heterogeneous data. In Proc. the 18th Asia Pacific Web Conference, Sept. 2016, pp.317-329.
[33]

Khayyat Z, Lucia W, Singh M, Ouzzani M, Papotti P, Quiané-Ruiz J A, Tang N, Kalnis P. Fast and scalable inequality joins. The VLDB Journal, 2017, 26(1): 125–150.

Journal of Computer Science and Technology
Pages 1023-1038
Cite this article:
Luo J-Z, Shi S-F, Yang G, et al. O2iJoin: An Efficient Index-Based Algorithm for Overlap Interval Join. Journal of Computer Science and Technology, 2018, 33(5): 1023-1038. https://doi.org/10.1007/s11390-018-1872-x

393

Views

3

Crossref

N/A

Web of Science

3

Scopus

0

CSCD

Altmetrics

Received: 21 June 2017
Revised: 18 June 2018
Published: 12 September 2018
©2018 LLC & Science Press, China
Return