Oropharyngeal swabbing is a pre-diagnostic procedure used to test for various respiratory diseases, including COVID-19 and influenza A (H1N1). To improve testing efficiency, robots need a real-time, accurate, and robust sampling-point localization algorithm. However, current solutions rely heavily on visual input, which is not reliable enough for large-scale deployment. The transformer has significantly improved performance on image-related tasks and challenged the dominance of traditional convolutional neural networks (CNNs) in computer vision. Inspired by this success, we propose a novel self-aligning multi-modal transformer (SAMMT) that dynamically attends to different parts of unaligned feature maps, preventing the information loss caused by perspective disparity and simplifying the overall implementation. Unlike preexisting multi-modal transformers, our attention mechanism works in image space instead of embedding space, eliminating the need for a sensor registration process. To facilitate the multi-modal task, we collected an oropharynx localization/segmentation dataset annotated by trained medical personnel. This dataset is open-sourced and can be used for future multi-modal research. Our experiments show that our model improves localization performance by 4.2% over a purely visual model and reduces the pixel-wise error rate of the segmentation task by 16.7% compared to the CNN baseline.
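To make the "attention in image space" idea concrete, the following is a minimal sketch of cross-modal attention over unaligned spatial feature maps, assuming PyTorch and two modalities (here labeled RGB and depth for illustration). The class name, shapes, and fusion details are assumptions for exposition, not the authors' actual SAMMT implementation.

```python
# Hedged sketch: cross-modal attention computed over spatial positions of
# raw feature maps, so each query pixel can attend anywhere in the other
# modality without prior sensor registration. Illustrative only.
import torch
import torch.nn as nn

class CrossModalSpatialAttention(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.q = nn.Conv2d(channels, channels, kernel_size=1)
        self.k = nn.Conv2d(channels, channels, kernel_size=1)
        self.v = nn.Conv2d(channels, channels, kernel_size=1)
        self.scale = channels ** -0.5

    def forward(self, rgb_feat: torch.Tensor, depth_feat: torch.Tensor) -> torch.Tensor:
        b, c, h, w = rgb_feat.shape
        q = self.q(rgb_feat).flatten(2).transpose(1, 2)    # (B, HW, C)
        k = self.k(depth_feat).flatten(2)                  # (B, C, H'W')
        v = self.v(depth_feat).flatten(2).transpose(1, 2)  # (B, H'W', C)
        attn = torch.softmax(q @ k * self.scale, dim=-1)   # (B, HW, H'W')
        out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        return rgb_feat + out  # residual fusion of the two modalities

# Usage: fuse an RGB feature map with an unaligned depth feature map.
rgb = torch.randn(1, 64, 32, 32)
depth = torch.randn(1, 64, 32, 32)
fused = CrossModalSpatialAttention(64)(rgb, depth)
```

Because the attention weights span all spatial positions of the second modality, small misalignments between sensors are absorbed by the learned attention map rather than corrected by an explicit calibration step.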
This paper focuses on multi-modal Information Perception (IP) for Soft Robotic Hands (SRHs) using Machine Learning (ML) algorithms. A flexible Optical Fiber-based Curvature Sensor (OFCS) is fabricated, consisting of a Light-Emitting Diode (LED), a photosensitive detector, and an optical fiber. Bending the roughened optical fiber lowers the transmitted light intensity, which reflects the curvature of the soft finger. Combining the curvature and pressure information, multi-modal IP is performed to improve recognition accuracy. Recognition of gesture, object shape, size, and weight is implemented with multiple ML approaches, including the Supervised Learning Algorithms (SLAs) of K-Nearest Neighbor (KNN), Support Vector Machine (SVM), and Logistic Regression (LR), and the Unsupervised Learning Algorithm (un-SLA) of K-Means Clustering (KMC). Moreover, Optical Sensor Information (OSI), Pressure Sensor Information (PSI), and Double-Sensor Information (DSI) are adopted to compare the recognition accuracies. The experimental results demonstrate that the proposed sensors and recognition approaches are feasible and effective: the recognition accuracies obtained with the above ML algorithms and the three modes of sensor information exceed 85 percent for almost all combinations. Moreover, DSI is more accurate than single-modal sensor information, and the KNN algorithm with DSI outperforms the other combinations in recognition accuracy.
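The OSI/PSI/DSI comparison amounts to training the same classifier on each modality alone and on their concatenation. Below is a minimal sketch of that protocol with KNN, assuming scikit-learn and synthetic stand-ins for the optical and pressure readings; the feature dimensions, class count, and split are illustrative assumptions, not the paper's experimental setup.

```python
# Hedged sketch: compare single-sensor (OSI, PSI) vs double-sensor (DSI)
# recognition accuracy with KNN. Synthetic data stands in for real SRH
# sensor readings; real experiments would load measured signals instead.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n_samples, n_classes = 300, 4            # e.g. four object shapes (assumed)
osi = rng.normal(size=(n_samples, 5))    # optical curvature channels
psi = rng.normal(size=(n_samples, 5))    # pressure channels
y = rng.integers(0, n_classes, size=n_samples)

def knn_accuracy(features: np.ndarray, labels: np.ndarray) -> float:
    # Hold out 30% of samples and report test-set accuracy.
    X_tr, X_te, y_tr, y_te = train_test_split(
        features, labels, test_size=0.3, random_state=0)
    clf = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
    return accuracy_score(y_te, clf.predict(X_te))

# DSI simply concatenates both modalities before classification.
for name, feats in [("OSI", osi), ("PSI", psi),
                    ("DSI", np.hstack([osi, psi]))]:
    print(f"{name}: {knn_accuracy(feats, y):.3f}")
```

Swapping `KNeighborsClassifier` for SVM or Logistic Regression, or replacing the supervised split with K-Means clustering, reproduces the rest of the comparison grid described in the abstract.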