Research Article

Clustering and classification of energy meter data: A comparison analysis of data from individual homes and the aggregated data from multiple homes

Juan Sala¹, Rongling Li¹, Morten H. Christensen²
¹ Department of Civil Engineering, Technical University of Denmark, Lyngby, Denmark
² Department of Electrical Engineering, Technical University of Denmark, Lyngby, Denmark

Abstract

The transition towards a more sustainable environment requires new demand-side control systems that integrate renewable energy sources into energy systems. For this purpose, household energy meter data have been widely used in modelling, forecasting, and optimal control of energy use. However, the usability and reliability of household energy meter data have not been specifically addressed. In this study, we apply commonly used machine learning methods to the heating consumption data of (1) two individual homes in an apartment building and (2) the district heating substation of the same building, which covers 72 homes, to identify how the characteristics of the data affect the results of the analysis. Two clustering approaches based on the K-means algorithm were applied to group similar daily heating profiles. Using the clustering results, classification algorithms such as logistic regression and random forest were then applied to predict the heating consumption level with regard to weather conditions. The analysis showed that the substation data, which are the aggregated heating consumption of the 72 homes, are more reliable and valid for energy prediction than the data from the two individual homes. This is due to the large variation and uncertainty in the daily energy use of individual homes.
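
To make the clustering step concrete, the following is a minimal sketch of K-means applied to daily heating profiles. It is not the authors' implementation: the data are synthetic, the 24-value hourly resolution is assumed, and the names `kmeans` and `synthetic_day` as well as the rank-based centroid initialization are hypothetical choices made for reproducibility.

```python
import random


def kmeans(profiles, k, n_iter=50):
    """Lloyd's K-means on equal-length daily profiles (lists of floats).

    Assumption: initial centroids are the profiles at evenly spaced ranks
    of the daily total, which makes the result deterministic.
    """
    n = len(profiles)
    order = sorted(range(n), key=lambda i: sum(profiles[i]))
    ranks = [i * (n - 1) // max(k - 1, 1) for i in range(k)]
    centroids = [list(profiles[order[r]]) for r in ranks]
    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])),
            )
            for p in profiles
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Synthetic hourly heating profiles (arbitrary units), for illustration only.
rng = random.Random(42)


def synthetic_day(base):
    """One day of 24 hourly values: base level, morning peak, small noise."""
    return [
        base + (0.5 if 6 <= hour <= 8 else 0.0) + rng.uniform(-0.2, 0.2)
        for hour in range(24)
    ]


days = [synthetic_day(1.0) for _ in range(10)] + [synthetic_day(3.0) for _ in range(10)]
labels, centroids = kmeans(days, k=2)
```

In a workflow like the one described in the abstract, the resulting cluster labels could then serve as the target classes ("consumption levels") for a classifier trained on weather features; that step is omitted here.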

Building Simulation
Pages 103-117
Cite this article:
Sala J, Li R, Christensen MH. Clustering and classification of energy meter data: A comparison analysis of data from individual homes and the aggregated data from multiple homes. Building Simulation, 2021, 14(1): 103-117. https://doi.org/10.1007/s12273-019-0587-4


Received: 31 July 2019
Accepted: 08 October 2019
Published: 03 December 2019
© Tsinghua University Press and Springer-Verlag GmbH Germany, part of Springer Nature 2020