AI Chat Paper
Note: Please note that the following content is generated by AMiner AI. SciOpen does not take any responsibility related to this content.
{{lang === 'zh_CN' ? '文章概述' : 'Summary'}}
{{lang === 'en_US' ? '中' : 'Eng'}}
Chat more with AI
PDF (2.7 MB)
Collect
Submit Manuscript AI Chat Paper
Show Outline
Outline
Show full outline
Hide outline
Outline
Show full outline
Hide outline
Open Access

Building a High-Performance Graph Storage on Top of Tree-Structured Key-Value Stores

School of Computer Science, Peking University, Beijing 100871, China
Ant Group, Beijing 100020, China
Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
Show Author Information

Abstract

Graph databases have gained widespread adoption in various industries and have been utilized in a range of applications, including financial risk assessment, commodity recommendation, and data lineage tracking. While the principles and design of these databases have been the subject of some investigation, there remains a lack of comprehensive examination of aspects such as storage layout, query language, and deployment. The present study focuses on the design and implementation of graph storage layout, with a particular emphasis on tree-structured key-value stores. We also examine different design choices in the graph storage layer and present our findings through the development of TuGraph, a highly efficient single-machine graph database that significantly outperforms well-known Graph DataBase Management System (GDBMS). Additionally, TuGraph demonstrates superior performance in the Linked Data Benchmark Council (LDBC) Social Network Benchmark (SNB) interactive benchmark.

References

[1]
Neo4j, https://neo4j.com, 2023.
[2]
A. Deutsch, Y. Xu, M. Wu, and V. Lee, TigerGraph: A native MPP graph database, arXiv preprint arXiv: 1901.08248, 2019.
[3]
Z. Fu, Z. Wu, H. Li, Y. Li, M. Wu, X. Chen, X. Ye, B. Yu, and X. Hu, GeaBase: A high-performance distributed graph database for industry-scale applications, Int. J. High Perform. Comput. Netw., vol. 15, nos. 1&2, pp. 12−21, 2019.
[4]
Symas Corp, Memcache benchmark, http://www.lmdb.tech/bench/memcache/, 2013.
[5]
[6]
Alibaba cloud, https://www.aliyun.com/, 2023.
[7]
Ldbc social network benchmark, https://ldbcouncil.org/benchmarks/snb/, 2023.
[8]
Twitter follower network 2010, https://snap.stanford.edu/data/twitter-2010.html, 2023.
[9]

L. Lu, T. S. Pillai, H. Gopalakrishnan, A. C. Arpaci-Dusseau, and R. H. Arpaci-Dusseau, WiscKey: Separating keys from values in SSD-conscious storage, ACM Trans. Storage, vol. 13, no. 1, pp. 5, 2017.

[11]
JanusGraph, https://janusgraph.org, 2023.
[13]
ArangoDB, https://www.arangodb.com/, 2023.
[14]
Apache Cassandra, https://cassandra.apache.org/, 2023.
[15]
Apache HBase, https://hbase.apache.org/, 2023.
[16]
Google Cloud Bigtable, https://cloud.google.com/bigtable/, 2023.
[18]
OrientDB, https://orientdb.com, 2023.
[19]
C. Kankanamge, S. Sahu, A. Mhedbhi, J. Chen, and S. Salihoglu, Graphflow: An active graph database, in Proc. 2017 ACM Int. Conf. Management of Data, Chicago, IL, USA, 2017, pp. 1695–1698.
[20]
C. Buragohain, K. M. Risvik, P. Brett, M. Castro, W. Cho, J. Cowhig, N. Gloy, K. Kalyanaraman, R. Khanna, J. Pao, et al., A1: A distributed in-memory graph database, in Proc. 2020 ACM SIGMOD Int. Conf. Management of Data, Portland, OR, USA, 2020, pp. 329–344.
[21]
Aws Neptune, https://aws.amazon.com/neptune/, 2023.
[22]
[23]
Azure Cosmos DB documentation, https://docs.microsoft.com/en-us/azure/cosmos-db, 2023.
[24]
X. Zhu, G. Feng, M. Serafini, X. Ma, J. Yu, L. Xie, A. Aboulnaga, and W. Chen, LiveGraph: A transactional graph storage system with purely sequential adjacency list scans, arXiv preprint arXiv: 1910.05773, 2019.
[25]
N. Bronson, Z. Amsden, G. Cabrera, P. Chakka, P. Dimov, H. Ding, J. Ferris, A. Giardullo, S. Kulkarni, H. Li, et al., TAO: Facebook’s distributed data store for the social graph, in Proc. 2013 USENIX Conf. Annual Technical Conf. (USENIX ATC 13), San Jose, CA, USA, 2013, pp. 49–60.
[26]
H. Chen, C. Li, J. Fang, C. Huang, J. Cheng, J. Zhang, Y. Hou, and X. Yan, Grasper: A high performance distributed system for OLAP on property graphs, in Proc. ACM Symp. on Cloud Computing, Santa Cruz, CA, USA, 2019, pp. 87–100.
Big Data Mining and Analytics
Pages 156-170
Cite this article:
Lin H, Wang Z, Qi S, et al. Building a High-Performance Graph Storage on Top of Tree-Structured Key-Value Stores. Big Data Mining and Analytics, 2024, 7(1): 156-170. https://doi.org/10.26599/BDMA.2023.9020015

1401

Views

230

Downloads

0

Crossref

0

Web of Science

0

Scopus

0

CSCD

Altmetrics

Received: 07 January 2023
Revised: 08 June 2023
Accepted: 19 June 2023
Published: 25 December 2023
© The author(s) 2023.

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).

Return