Open Access

Distributed and Weighted Extreme Learning Machine for Imbalanced Big Data Learning

Zhiqiong Wang, Junchang Xin, Hongxu Yang, Shuo Tian, Ge Yu, Chenren Xu, and Yudong Yao
Sino-Dutch Biomedical & Information Engineering School, Northeastern University, Shenyang 110169, China.
School of Computer Science & Engineering, Northeastern University, Shenyang 110169, China.
School of Electronics Engineering and Computer Science, Peking University, Beijing 100871, China.
Department of Electrical and Computer Engineering, Stevens Institute of Technology, Castle Point on Hudson, Hoboken, NJ 07030, USA.

Abstract

The Extreme Learning Machine (ELM) and its variants are effective in many machine learning applications, such as Imbalanced Learning (IL) or Big Data (BD) learning. However, they cannot cope with data that are both imbalanced and large in volume. This study addresses the IL problem in BD applications by proposing the Distributed and Weighted ELM (DW-ELM) algorithm, which is based on the MapReduce framework. First, it is shown that the core matrix multiplication operators are decomposable, which confirms the feasibility of parallel computation. Then, to further improve computational efficiency, an Improved DW-ELM algorithm (IDW-ELM) is developed that requires only one MapReduce job. Finally, the proposed DW-ELM and IDW-ELM algorithms are validated through experiments.
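The decomposability claim can be illustrated with a minimal sketch (an illustration under stated assumptions, not the paper's implementation). In weighted ELM, the output weights depend on U = HᵀWH and V = HᵀWT, where H is the hidden-layer output matrix, W a diagonal per-sample weight matrix, and T the target matrix. Because both U and V are sums of per-sample terms, each mapper can compute partial sums over its own data split and a reducer simply adds them; the function and variable names below are hypothetical.

```python
import numpy as np

def map_partial(H_split, w_split, T_split):
    """Mapper: partial sums U = H^T W H and V = H^T W T over one data split.
    W is diagonal, so it is stored as a per-sample weight vector w."""
    WH = w_split[:, None] * H_split               # rows scaled: w_i * h_i
    U = H_split.T @ WH                            # sum_i w_i h_i h_i^T
    V = H_split.T @ (w_split[:, None] * T_split)  # sum_i w_i h_i t_i^T
    return U, V

def reduce_sum(partials):
    """Reducer: matrix addition is associative, so partial sums just add up."""
    Us, Vs = zip(*partials)
    return sum(Us), sum(Vs)

def solve_beta(U, V, C=1.0):
    """Master: closed-form weighted ELM output weights,
    beta = (I/C + H^T W H)^{-1} H^T W T (regularized form from Zong et al.)."""
    L = U.shape[0]
    return np.linalg.solve(np.eye(L) / C + U, V)
```

Because U and V are only L-by-L and L-by-m (L hidden nodes, m output classes), the data shipped to the master is small no matter how many training rows each split holds, which is what makes the MapReduce formulation worthwhile.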

Tsinghua Science and Technology
Pages 160-173
Cite this article:
Wang Z, Xin J, Yang H, et al. Distributed and Weighted Extreme Learning Machine for Imbalanced Big Data Learning. Tsinghua Science and Technology, 2017, 22(2): 160-173. https://doi.org/10.23919/TST.2017.7889638


Received: 27 August 2016
Revised: 14 January 2017
Accepted: 18 January 2017
Published: 06 April 2017
© The author(s) 2017