Cover Article

openGauss: An Enterprise-Grade Open-Source Database System

Department of Computer Science, Tsinghua University, Beijing 100084, China
Huawei Technologies Co., Ltd, Beijing 100088, China
China Mobile Information Technology Center, Beijing 100027, China

Abstract

We have built openGauss, an enterprise-grade open-source database system. openGauss has fulfilled its design goals of high performance, high availability, high security, and high intelligence. For high performance, it leverages NUMA (non-uniform memory access)-aware data access among multiple cores to enable efficient concurrent transaction processing, and symmetric multi-processing to make adaptive use of parallel processing resources. Moreover, memory-optimized tables (MOTs) are designed to put everything in memory. For high availability, a three-tier pooling architecture that shares storage among the master and standby instances, comprising both a distributed memory service (DMS) and a distributed storage service (DSS), is proposed to achieve 99.99% availability. For high security, it is a fully encrypted database that offers secure storage, efficient complex querying, and tamper-proofing. For high intelligence, an AI-based optimizer in the kernel and a self-driving platform named DBMind are demonstrated to achieve better performance and greater user-friendliness. openGauss has served over 150 enterprises and institutions since its release in 2020. We share the lessons we learned from its development and operation, and from our customers.

Electronic Supplementary Material

JCST-2403-14302-Highlights.pdf (352.8 KB)

Journal of Computer Science and Technology
Pages 1007-1028
Cite this article:
Li G-L, Wang J, Chen G. openGauss: An Enterprise-Grade Open-Source Database System. Journal of Computer Science and Technology, 2024, 39(5): 1007-1028. https://doi.org/10.1007/s11390-024-4302-2

Received: 20 March 2024
Accepted: 30 August 2024
Published: 05 December 2024
© Institute of Computing Technology, Chinese Academy of Sciences 2024