Open Access

Distributed Storage System for Electric Power Data Based on HBase

School of Computer Science and Engineering, Southeast University, Nanjing 211189, China.

Abstract

Managing massive electric power data is a typical big data application because electric power systems generate millions or billions of status, debugging, and error records every day. To guarantee the safety and sustainability of electric power systems, these data must be processed and analyzed quickly enough to support real-time decisions. Traditional solutions typically use relational databases to manage electric power data. However, relational databases cannot process and analyze such data efficiently once the data size grows significantly. In this paper, we show how electric power data can be managed using HBase, a distributed database maintained by the Apache Software Foundation. Our system consists of clients, an HBase database, status monitors, data migration modules, and data fragmentation modules. We evaluate the performance of our system through a series of experiments and show how HBase's parameters can be tuned to improve its efficiency.

Big Data Mining and Analytics
Pages 324-334
Cite this article:
Jin J, Song A, Gong H, et al. Distributed Storage System for Electric Power Data Based on HBase. Big Data Mining and Analytics, 2018, 1(4): 324-334. https://doi.org/10.26599/BDMA.2018.9020026