There is a growing demand for time series data analysis in industry. Apache IoTDB is a time series database designed for the Internet of Things (IoT) with enhanced storage and I/O performance. With the User-Defined Functions (UDFs) it provides, computation on time series can be executed directly in Apache IoTDB. To satisfy the most common requirements of industrial time series analysis, we create a UDF library, IoTDQ, on Apache IoTDB. This library integrates stream computation functions for data quality analysis, data profiling, anomaly detection, data repairing, etc. IoTDQ enables users to conduct a wide range of analyses, such as monitoring, error diagnosis, and equipment reliability analysis. It provides a framework for users to examine IoT time series with data quality problems. Experiments show that IoTDQ performs on par with mainstream alternatives and reduces I/O consumption for Apache IoTDB users.
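To make the anomaly detection category above concrete, the following is a minimal sketch of one common technique, flagging outliers by their absolute deviation around the median. It is an illustration only and is not IoTDQ's implementation or API: the function name `mad_outliers`, the threshold of 3.0, and the sample readings are all hypothetical.

```python
from statistics import median


def mad_outliers(values, threshold=3.0):
    """Return indices of points whose robust z-score exceeds the threshold.

    Hypothetical helper for illustration; not part of IoTDQ or Apache IoTDB.
    """
    med = median(values)
    # Median absolute deviation (MAD) around the median.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # More than half of the points equal the median; no usable scale.
        return []
    # 1.4826 makes the MAD comparable to a standard deviation
    # when the data are normally distributed.
    scale = 1.4826 * mad
    return [i for i, v in enumerate(values) if abs(v - med) / scale > threshold]


# Hypothetical sensor readings with one obvious spike at index 4.
readings = [20.1, 20.3, 19.9, 20.0, 35.7, 20.2, 20.1]
print(mad_outliers(readings))  # → [4]
```

The median and MAD are used here rather than the mean and standard deviation because a single large spike inflates the latter two and can mask itself.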
The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).