Research Article | Open Access

Handling the Challenges of Small-Scale Labeled Data and Class Imbalances in Classifying the N and K Statuses of Rubber Leaves Using Hyperspectroscopy Techniques

Wenfeng Hu1,2, Weihao Tang1, Chuang Li1, Jinjing Wu1, Hong Liu1, Chao Wang2, Xiaochuan Luo1, Rongnian Tang1
1 School of Mechanical and Electrical Engineering, Hainan University, Haikou 570228, China
2 School of Electrical Engineering and Automation, Tianjin University, Tianjin 300072, China

†These authors contributed equally to this work.


Abstract

The nutritional status of rubber trees (Hevea brasiliensis) is closely tied to the production of natural rubber. Nitrogen (N) and potassium (K) levels in rubber leaves are 2 crucial indicators of the nutritional status of the rubber tree. Advanced hyperspectral technology can evaluate N and K statuses in leaves rapidly. However, training a spectral estimation model on a small and imbalanced dataset produces highly biased and uncertain results. The typical solution, laborious long-term nutrient stress and highly intensive data collection, sacrifices the speed and flexibility that make hyperspectral technology attractive. Therefore, a less intensive and more streamlined method, re-mining information from hyperspectral image data, was assessed. From this new perspective, a semisupervised learning (SSL) method and resampling techniques were employed to generate pseudo-labeled data and to rebalance the classes. Subsequently, a 5-class spectral classification model of the N and K statuses of rubber leaves was established. The SSL model based on random forest classifiers and mean sampling techniques yielded the best classification results on both the imbalanced and balanced datasets (weighted average precision 67.8/78.6%, macro-averaged precision 61.2/74.4%, and weighted recall 65.7/78.5% for the N status). All data and code are available on GitHub: https://github.com/WeehowTang/SSL-rebalancingtest. Ultimately, we propose an efficient way to rapidly and accurately monitor the N and K levels in rubber leaves, especially when labeled data are scarce and class ratios are imbalanced.
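
To make the pipeline described in the abstract concrete, the following is a minimal Python sketch of self-training with a random forest, class rebalancing, and the three reported metrics. It is not the authors' implementation (their code is in the linked repository). Several things here are assumptions for illustration only: the synthetic data from scikit-learn's `make_classification` stands in for leaf spectra, the function names `resample_to_mean`, `fit_balanced`, and `self_train` are hypothetical, the confidence threshold of 0.8 is arbitrary, and resampling every class to the mean class size is only one possible reading of "mean sampling".

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split


def resample_to_mean(X, y, rng):
    """Resample each class to the mean class size (assumed reading of 'mean sampling')."""
    classes, counts = np.unique(y, return_counts=True)
    target = int(round(counts.mean()))
    idx = []
    for c, n in zip(classes, counts):
        members = np.flatnonzero(y == c)
        # Oversample with replacement if the class is too small, else undersample.
        idx.append(rng.choice(members, size=target, replace=bool(n < target)))
    idx = np.concatenate(idx)
    return X[idx], y[idx]


def fit_balanced(X, y, rng, seed):
    """Fit a random forest on a rebalanced copy of the training data."""
    X_bal, y_bal = resample_to_mean(X, y, rng)
    return RandomForestClassifier(n_estimators=100, random_state=seed).fit(X_bal, y_bal)


def self_train(X_lab, y_lab, X_unlab, threshold=0.8, rounds=5, seed=0):
    """Self-training: repeatedly add confident pseudo-labels and refit."""
    rng = np.random.default_rng(seed)
    X_cur, y_cur, pool = X_lab.copy(), y_lab.copy(), X_unlab.copy()
    clf = fit_balanced(X_cur, y_cur, rng, seed)
    for _ in range(rounds):
        if len(pool) == 0:
            break
        proba = clf.predict_proba(pool)
        confident = proba.max(axis=1) >= threshold
        if not confident.any():
            break
        # Pseudo-label the confident samples and move them out of the pool.
        pseudo = clf.classes_[proba.argmax(axis=1)[confident]]
        X_cur = np.vstack([X_cur, pool[confident]])
        y_cur = np.concatenate([y_cur, pseudo])
        pool = pool[~confident]
        clf = fit_balanced(X_cur, y_cur, rng, seed)
    return clf, y_cur


# Synthetic, imbalanced 5-class data standing in for hyperspectral features.
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=10, n_classes=5,
    weights=[0.4, 0.25, 0.2, 0.1, 0.05], random_state=0,
)
X_lab, X_rest, y_lab, y_rest = train_test_split(
    X, y, train_size=60, stratify=y, random_state=0
)
X_unlab, X_test, _, y_test = train_test_split(
    X_rest, y_rest, test_size=200, stratify=y_rest, random_state=0
)

model, y_final = self_train(X_lab, y_lab, X_unlab)
pred = model.predict(X_test)
print("labeled + pseudo-labeled samples:", len(y_final))
print("weighted precision:", precision_score(y_test, pred, average="weighted", zero_division=0))
print("macro precision:   ", precision_score(y_test, pred, average="macro", zero_division=0))
print("weighted recall:   ", recall_score(y_test, pred, average="weighted", zero_division=0))
```

Reporting both weighted and macro averages, as the abstract does, is informative under class imbalance: the weighted average is dominated by the large classes, while the macro average gives each of the 5 classes equal weight and so exposes poor performance on rare classes.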

Plant Phenomics
Article number: 0154
Cite this article:
Hu W, Tang W, Li C, et al. Handling the Challenges of Small-Scale Labeled Data and Class Imbalances in Classifying the N and K Statuses of Rubber Leaves Using Hyperspectroscopy Techniques. Plant Phenomics, 2024, 6: 0154. https://doi.org/10.34133/plantphenomics.0154


Received: 23 April 2023
Accepted: 27 January 2024
Published: 20 March 2024
© 2024 Wenfeng Hu et al. Exclusive licensee Nanjing Agricultural University. No claim to original U.S. Government Works.

Distributed under a Creative Commons Attribution License 4.0 (CC BY 4.0).
