Open Access

GFDet: Multi-Level Feature Fusion Network for Caries Detection Using Dental Endoscope Images

College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310000, China
Department of Computer Science and Engineering, University of South Carolina, Columbia, SC 29208, USA
Future Science and Technology City Branch, Hangzhou Stomatology Hospital, Hangzhou 310000, China

Abstract

Early detection of dental caries with an endoscope can prevent complications such as pulpitis and apical infection. However, automatically identifying dental caries remains challenging because of variation in lesion size, low contrast, low saliency, and high inter-class similarity. To address these problems, we propose the Global Feature Detector (GFDet), which integrates the proposed Feature Selection Pyramid Network (FSPN) and Adaptive Assignment-Balanced Mechanism (AABM). Specifically, FSPN performs upsampling with the semantic information of adjacent feature layers to mitigate the semantic information loss caused by sharp channel reduction, and enhances discriminative features by aggregating fine-grained details with high-level semantics. In addition, a new label assignment mechanism is proposed that enables the model to select more high-quality samples as positives, alleviating the problem of small objects being easily ignored. We have also built an endoscopic dataset for caries detection, consisting of 1318 images labeled by five dentists. In experiments on the collected dataset, our model achieves an F1-score of 75.6%, outperforming state-of-the-art models by 7.1%.
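The top-down fusion idea that FSPN builds on, upsampling a coarser, semantically richer pyramid level and merging it with the finer level below it, can be sketched in plain Python. This is a minimal illustrative sketch of generic FPN-style fusion, not the paper's actual FSPN: the function names, nearest-neighbor upsampling, and element-wise addition are assumptions chosen for clarity, and real implementations operate on multi-channel tensors with learned lateral convolutions.

```python
def upsample_nearest(feat, factor=2):
    """Nearest-neighbor upsampling of a 2-D single-channel feature map
    represented as a list of lists."""
    out = []
    for row in feat:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out

def fuse_top_down(pyramid):
    """FPN-style top-down pass: starting from the coarsest (most semantic)
    level, upsample the running fused map and add it element-wise to the
    next finer level. Returns the fused pyramid, finest level first."""
    fused = [pyramid[-1]]
    for level in reversed(pyramid[:-1]):
        up = upsample_nearest(fused[0])
        merged = [[a + b for a, b in zip(r1, r2)]
                  for r1, r2 in zip(level, up)]
        fused.insert(0, merged)
    return fused

# A toy 3-level pyramid: 4x4, 2x2, and 1x1 single-channel maps.
p3 = [[1.0] * 4 for _ in range(4)]
p4 = [[2.0] * 2 for _ in range(2)]
p5 = [[4.0]]
out = fuse_top_down([p3, p4, p5])
print(out[0][0][0])  # finest fused level: 1 + (2 + 4) = 7.0
```

The sketch makes the abstract's point concrete: each finer level ends up carrying both its own fine-grained detail and the accumulated high-level semantics from every coarser level above it.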

Big Data Mining and Analytics
Pages 1362-1374
Cite this article:
Gao N, Li Y, Chen P, et al. GFDet: Multi-Level Feature Fusion Network for Caries Detection Using Dental Endoscope Images. Big Data Mining and Analytics, 2024, 7(4): 1362-1374. https://doi.org/10.26599/BDMA.2024.9020027