PDF (2.7 MB)
Collect
Submit Manuscript
Open Access

Building a High-Performance Graph Storage on Top of Tree-Structured Key-Value Stores

School of Computer Science, Peking University, Beijing 100871, China
Ant Group, Beijing 100020, China
Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
Show Author Information

Abstract

Graph databases have gained widespread adoption in various industries and have been utilized in a range of applications, including financial risk assessment, commodity recommendation, and data lineage tracking. While the principles and design of these databases have been the subject of some investigation, there remains a lack of comprehensive examination of aspects such as storage layout, query language, and deployment. The present study focuses on the design and implementation of graph storage layout, with a particular emphasis on tree-structured key-value stores. We also examine different design choices in the graph storage layer and present our findings through the development of TuGraph, a highly efficient single-machine graph database that significantly outperforms well-known Graph DataBase Management System (GDBMS). Additionally, TuGraph demonstrates superior performance in the Linked Data Benchmark Council (LDBC) Social Network Benchmark (SNB) interactive benchmark.

References

[1]
Neo4j, https://neo4j.com, 2023.
[2]
A. Deutsch, Y. Xu, M. Wu, and V. Lee, TigerGraph: A native MPP graph database, arXiv preprint arXiv: 1901.08248, 2019.
[3]
Z. Fu, Z. Wu, H. Li, Y. Li, M. Wu, X. Chen, X. Ye, B. Yu, and X. Hu, GeaBase: A high-performance distributed graph database for industry-scale applications, Int. J. High Perform. Comput. Netw., vol. 15, nos. 1&2, pp. 12−21, 2019.
[4]
Symas Corp, Memcache benchmark, http://www.lmdb.tech/bench/memcache/, 2013.
[5]
[6]
Alibaba cloud, https://www.aliyun.com/, 2023.
[7]
Ldbc social network benchmark, https://ldbcouncil.org/benchmarks/snb/, 2023.
[8]
Twitter follower network 2010, https://snap.stanford.edu/data/twitter-2010.html, 2023.
[9]

L. Lu, T. S. Pillai, H. Gopalakrishnan, A. C. Arpaci-Dusseau, and R. H. Arpaci-Dusseau, WiscKey: Separating keys from values in SSD-conscious storage, ACM Trans. Storage, vol. 13, no. 1, pp. 5, 2017.

[11]
JanusGraph, https://janusgraph.org, 2023.
[13]
ArangoDB, https://www.arangodb.com/, 2023.
[14]
Apache Cassandra, https://cassandra.apache.org/, 2023.
[15]
Apache HBase, https://hbase.apache.org/, 2023.
[16]
Google Cloud Bigtable, https://cloud.google.com/bigtable/, 2023.
[18]
OrientDB, https://orientdb.com, 2023.
[19]
C. Kankanamge, S. Sahu, A. Mhedbhi, J. Chen, and S. Salihoglu, Graphflow: An active graph database, in Proc. 2017 ACM Int. Conf. Management of Data, Chicago, IL, USA, 2017, pp. 1695–1698.
[20]
C. Buragohain, K. M. Risvik, P. Brett, M. Castro, W. Cho, J. Cowhig, N. Gloy, K. Kalyanaraman, R. Khanna, J. Pao, et al., A1: A distributed in-memory graph database, in Proc. 2020 ACM SIGMOD Int. Conf. Management of Data, Portland, OR, USA, 2020, pp. 329–344.
[21]
[22]
[23]
Azure Cosmos DB documentation, https://docs.microsoft.com/en-us/azure/cosmos-db, 2023.
[24]
X. Zhu, G. Feng, M. Serafini, X. Ma, J. Yu, L. Xie, A. Aboulnaga, and W. Chen, LiveGraph: A transactional graph storage system with purely sequential adjacency list scans, arXiv preprint arXiv: 1910.05773, 2019.
[25]
N. Bronson, Z. Amsden, G. Cabrera, P. Chakka, P. Dimov, H. Ding, J. Ferris, A. Giardullo, S. Kulkarni, H. Li, et al., TAO: Facebook’s distributed data store for the social graph, in Proc. 2013 USENIX Conf. Annual Technical Conf. (USENIX ATC 13), San Jose, CA, USA, 2013, pp. 49–60.
[26]
H. Chen, C. Li, J. Fang, C. Huang, J. Cheng, J. Zhang, Y. Hou, and X. Yan, Grasper: A high performance distributed system for OLAP on property graphs, in Proc. ACM Symp. on Cloud Computing, Santa Cruz, CA, USA, 2019, pp. 87–100.
Big Data Mining and Analytics
Pages 156-170
Cite this article:
Lin H, Wang Z, Qi S, et al. Building a High-Performance Graph Storage on Top of Tree-Structured Key-Value Stores. Big Data Mining and Analytics, 2024, 7(1): 156-170. https://doi.org/10.26599/BDMA.2023.9020015
Metrics & Citations  
Article History
Copyright
Rights and Permissions
Return