BNRist; and Key Laboratory of Pervasive Computing (Ministry of Education) and the Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
School of Electronic and Information Engineering, Beihang University, Beijing 100191, China
Department of Nuclear Medicine, Peking Union Medical College Hospital, Beijing 100730, China
School of Clinical Medicine, Tsinghua University
Beijing Tsinghua Changgung Hospital, Beijing 100084, China
Abstract
The accurate segmentation of medical images is crucial to medical care and research; however, many efficient supervised image segmentation methods require sufficient pixel-level labels. Such a requirement is difficult to meet in practice and even impossible in some cases, e.g., rare Pathoma images. Inspired by traditional unsupervised methods, we propose a novel Chan-Vese model based on the Markov chain for unsupervised medical image segmentation. It combines local information brought by superpixels with the global difference between the target tissue and the background. Based on the Chan-Vese model, we utilize weight maps generated by the Markov chain to model and solve the segmentation problem iteratively using the min-cut algorithm at the superpixel level. Our method exploits abundant boundary and local region information in segmentation and thus can handle images with intensity inhomogeneity and object sparsity. In our method, users gain the power of fine-tuning parameters to achieve satisfactory results for each segmentation; by contrast, the results of deep-learning-based methods are rigid. The performance of our method is assessed on four Computerized Tomography (CT) datasets. Experimental results show that the proposed method outperforms traditional unsupervised segmentation techniques.
References
[1] C. T. Yeo, T. Ungi, P. U. Thainual, A. Lasso, R. C. McGraw, and G. Fichtinger, The effect of augmented reality training on percutaneous needle placement in spinal facet joint injections, IEEE Trans. Bio-Med. Eng., vol. 58, no. 7, pp. 2031-2037, 2011.
A. Y. Yang, J. Wright, Y. Ma, and S. Sastry, Unsupervised segmentation of natural images via lossy data compression, Comput. Vis. Image Underst., vol. 110, no. 2, pp. 212-225, 2008.
B. Hui, Y. Liu, J. Qiu, L. Cao, L. Ji, and Z. He, Study of texture segmentation and classification for grading small hepatocellular carcinoma based on CT images, Tsinghua Science and Technology, vol. 26, no. 2, pp. 199-207, 2021.
G. Zheng, G. Han, and N. Q. Soomro, An inception module CNN classifiers fusion method on pulmonary nodule diagnosis by signs, Tsinghua Science and Technology, vol. 25, no. 3, pp. 368-383, 2020.
A. Kanezaki, Unsupervised image segmentation by backpropagation, presented at IEEE Int. Conf. Acoustics, Speech and Signal Processing (ICASSP), Calgary, Canada, 2018, pp. 1543-1547.
T. Cour, F. Benezit, and J. B. Shi, Spectral segmentation with multiscale graph decomposition, presented at 2005 IEEE Computer Society Conf. Computer Vision and Pattern Recognition (CVPR'05), San Diego, CA, USA, 2005, pp. 1124-1131.
[11] J. B. Shi and J. Malik, Normalized cuts and image segmentation, IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 8, pp. 888-905, 2000.
Y. Zhang, M. Brady, and S. M. Smith, Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm, IEEE Trans. Med. Imag., vol. 20, no. 1, pp. 45-57, 2001.
D. Comaniciu and P. Meer, Mean shift: A robust approach toward feature space analysis, IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 5, pp. 603-619, 2002.
X. Fan, M. Dai, C. Liu, F. Wu, X. Yan, Y. Feng, Y. Feng, and B. Su, Effect of image noise on the classification of skin lesions using deep convolutional neural networks, Tsinghua Science and Technology, vol. 25, no. 3, pp. 425-434, 2020.
L. C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs, IEEE Trans. Pattern Anal. Mach. Intell., vol. 40, no. 4, pp. 834-848, 2018.
Y. Y. Boykov and M. P. Jolly, Interactive graph cuts for optimal boundary & region segmentation of objects in N-D images, in Proc. 8th IEEE Int. Conf. Computer Vision, Vancouver, Canada, 2001, pp. 105-112.
[23] C. Rother, V. Kolmogorov, and A. Blake, "GrabCut": Interactive foreground extraction using iterated graph cuts, ACM Trans. Graph., vol. 23, no. 3, pp. 309-314, 2004.
C. M. Li, C. Y. Kao, J. C. Gore, and Z. H. Ding, Implicit active contours driven by local binary fitting energy, in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), Minneapolis, MN, USA, 2007, pp. 1-7.
C. M. Li, C. Y. Xu, C. F. Gui, and M. D. Fox, Distance regularized level set evolution and its application to image segmentation, IEEE Trans. Image Process., vol. 19, no. 12, pp. 3243-3254, 2010.
O. Veksler, Y. Boykov, and P. Mehrani, Superpixels and supervoxels in an energy optimization framework, in Computer Vision-ECCV 2010, K. Daniilidis, P. Maragos, and N. Paragios, eds. Berlin, Germany: Springer, 2010, pp. 211-224.
J. B. Shen, Y. F. Du, W. G. Wang, and X. L. Li, Lazy random walks for superpixel segmentation, IEEE Trans. Image Process., vol. 23, no. 4, pp. 1451-1462, 2014.
M. S. Aslan, A. Shalaby, and A. A. Farag, Clinically desired segmentation method for vertebral bodies, presented at 2013 IEEE 10th Int. Symp. Biomedical Imaging, San Francisco, CA, USA, 2013, pp. 840-843.
Huang Q, Zhou Y, Tao L, et al. A Chan-Vese Model Based on the Markov Chain for Unsupervised Medical Image Segmentation. Tsinghua Science and Technology, 2021, 26(6): 833-844. https://doi.org/10.26599/TST.2020.9010042
The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
Fig. 1  Overall segmentation flow diagram. Here, s represents the foreground node and t represents the background node.
Fig. 2  Results of the weight maps under different parameters. (a) Result of the superpixels; (b)-(d) weight maps corresponding to the superpixel at the red dot under different parameters, respectively.
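To illustrate how a Markov chain can turn superpixel similarities into a weight map, the sketch below builds a row-stochastic transition matrix from per-superpixel mean intensities and reads off the multi-step transition probabilities from a seed superpixel. The Gaussian kernel and the `sigma`, `steps`, and `seed_idx` parameters are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def weight_map(mean_intensity, sigma=0.1, steps=3, seed_idx=0):
    """Illustrative multi-step Markov-chain weight map over superpixels.

    mean_intensity: 1-D array of per-superpixel mean intensities in [0, 1].
    sigma, steps, and seed_idx are hypothetical parameters for this sketch.
    """
    mean_intensity = np.asarray(mean_intensity, dtype=float)
    # Pairwise intensity affinity via a Gaussian kernel.
    diff = mean_intensity[:, None] - mean_intensity[None, :]
    affinity = np.exp(-diff ** 2 / (2.0 * sigma ** 2))
    # Row-normalize to obtain the transition matrix P of a Markov chain.
    P = affinity / affinity.sum(axis=1, keepdims=True)
    # Probability of reaching each superpixel from the seed in `steps` steps.
    Pk = np.linalg.matrix_power(P, steps)
    return Pk[seed_idx]

# Two intensity clusters: superpixels 0-1 are dark, 2-3 are bright.
w = weight_map(np.array([0.10, 0.12, 0.80, 0.82]), seed_idx=0)
```

Because each row of the transition matrix sums to one, the resulting weight map is itself a probability distribution, concentrated on superpixels whose intensities resemble the seed's.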
4 Experiment
To evaluate our unsupervised model’s performance thoroughly, we conduct preliminary experiments on four datasets of varying difficulty and compare the results with those of other methods. In addition, we demonstrate the advantages of our proposed method over the Chan-Vese model in terms of initialization and iteration times.
4.1 Datasets
CVIP: This dataset is from Spine Web, an online collaborative platform. It provides a total of 349 spinal CT scans and the corresponding ground truths for five patients[31]. In some subsequent experiments on parameters, we use 19 CT slices of one patient (referred to as MINI-CVIP).
LUNG: This dataset contains 267 lung CT images and manual masks of the corresponding lungs; the images were acquired from an open competition, Finding and Measuring Lungs in CT Data.
SKULL, SPINE: This dataset is provided by a hospital; it contains numerous CT images of the skull and spine, as well as manual segmentation results.
4.2 Experimental settings
We conduct several experiments on real CT scans to evaluate the performance of the proposed method. Our method does not utilize any annotation information for segmentation.
Evaluation metrics: The accuracy of the segmented contours is quantified using the Dice Similarity Coefficient (DSC), which is calculated as follows:

DSC = 2|A ∩ B| / (|A| + |B|)

The Jaccard Similarity Coefficient (JSC) is also adopted as an evaluation metric to evaluate the accuracy of the proposed segmentation method; JSC is defined as follows:

JSC = |A ∩ B| / |A ∪ B|

where A is the area segmented by the algorithm, and B is the ground-truth area manually delineated.
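Both metrics are straightforward to compute from binary masks; a minimal NumPy sketch (the mask values below are illustrative):

```python
import numpy as np

def dsc(seg, gt):
    """Dice Similarity Coefficient between two binary masks."""
    inter = np.logical_and(seg, gt).sum()
    return 2.0 * inter / (seg.sum() + gt.sum())

def jsc(seg, gt):
    """Jaccard Similarity Coefficient between two binary masks."""
    inter = np.logical_and(seg, gt).sum()
    union = np.logical_or(seg, gt).sum()
    return inter / union

# Toy example: masks overlap on 2 of their 4 combined pixels.
seg = np.array([[1, 1, 0], [0, 1, 0]], dtype=bool)
gt  = np.array([[1, 1, 0], [0, 0, 1]], dtype=bool)
# intersection = 2, |seg| = |gt| = 3, union = 4
```

Note that DSC is always at least as large as JSC on the same pair of masks; the two are monotonically related.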
Implementation details: Some of the experiments compare our proposed method with the supervised method U-net; for each dataset, one portion is used as the training set and the remainder as the testing set, and all methods are evaluated on the testing set. Moreover, the spinal column occupies only a small part of the original CT scan. For a transparent and fair comparison, the segmentation is limited to a bounding box that contains the ground truth, and the DSC and JSC are computed within that region.
4.3 Results
4.3.1 Effect of superpixels
Superpixels are useful in a wide range of vision tasks, and they allow us to fully exploit boundary information in the subsequent segmentation. We test our proposed method both with and without superpixels. Without superpixels, each node of the undirected graph corresponds to a single pixel of the input image. As shown in Fig. 3, in images with intensity inhomogeneity, nodes with coarser granularity carry more boundary information. Therefore, superpixels provide more cues than single pixels, the segmentation results are less likely to fall into local optima, and we obtain more globally consistent results.
Fig. 3  Comparison of results with and without superpixels.
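The partition of the superpixel graph is ultimately obtained as an s-t min-cut. The self-contained sketch below runs Edmonds-Karp max-flow on a toy four-node chain with virtual source and sink terminals; the graph layout and capacities are invented for illustration and do not reproduce the paper's weight construction.

```python
from collections import deque

def min_cut_labels(n, edges, s_caps, t_caps):
    """Label n superpixel nodes foreground/background via an s-t min-cut
    (Edmonds-Karp max-flow on a small graph; names are illustrative).

    edges: list of (u, v, capacity) between superpixels (undirected);
    s_caps/t_caps: terminal capacities to source (foreground) and sink.
    """
    S, T = n, n + 1                           # virtual terminal nodes
    cap = [[0] * (n + 2) for _ in range(n + 2)]
    for u, v, c in edges:
        cap[u][v] += c
        cap[v][u] += c                        # undirected: capacity both ways
    for i in range(n):
        cap[S][i] += s_caps[i]
        cap[i][T] += t_caps[i]
    while True:                               # augment along shortest paths
        parent = {S: None}
        q = deque([S])
        while q and T not in parent:
            u = q.popleft()
            for v in range(n + 2):
                if v not in parent and cap[u][v] > 0:
                    parent[v] = u
                    q.append(v)
        if T not in parent:
            break
        bottleneck, v = float("inf"), T       # bottleneck along the path
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = T                                 # update residual capacities
        while parent[v] is not None:
            cap[parent[v]][v] -= bottleneck
            cap[v][parent[v]] += bottleneck
            v = parent[v]
    # Nodes still reachable from S in the residual graph are foreground.
    seen = {S}
    q = deque([S])
    while q:
        u = q.popleft()
        for v in range(n + 2):
            if v not in seen and cap[u][v] > 0:
                seen.add(v)
                q.append(v)
    return [i in seen for i in range(n)]

# Four superpixels in a chain; the weak 1-2 link is where the cut falls.
labels = min_cut_labels(
    4,
    [(0, 1, 5), (1, 2, 1), (2, 3, 5)],
    s_caps=[9, 9, 0, 0],
    t_caps=[0, 0, 9, 9],
)
```

The cut severs the cheapest boundary (the capacity-1 edge between nodes 1 and 2), so nodes 0 and 1 are labeled foreground and nodes 2 and 3 background.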
In our method, we adopt the standard SLIC algorithm. Given the single-channel characteristic of medical images, we apply the algorithm to the original intensity features rather than the XYLab features. To choose the parameters, we test the number of superpixels K on the MINI-CVIP dataset. The experiment shows that as the value of K decreases (i.e., the number of pixels in each superpixel increases), the segmentation performance stays stable within a certain range (as shown in Fig. 4) and then declines rapidly: a smaller K means more pixels in each superpixel, which ultimately reduces the boundary segmentation precision. Therefore, balancing accuracy and computational efficiency, we keep the number of pixels in each superpixel at approximately 10; that is, K is set to roughly one tenth of the total number of pixels.
Fig. 4  Influence of different numbers of superpixels on the segmentation result.
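For readers unfamiliar with SLIC, the simplified single-channel sketch below captures the core assign/update loop: pixels are clustered in a joint intensity-plus-position space. It omits the windowed search and connectivity enforcement of the full algorithm, and the parameter names (`K`, `m`, `iters`) are illustrative.

```python
import numpy as np

def simple_slic(img, K=16, m=0.1, iters=5):
    """Minimal SLIC-like superpixel sketch for single-channel images.

    img: 2-D float array in [0, 1]; K: target number of superpixels;
    m: compactness weight trading intensity against spatial distance.
    Computes distances to every center (fine for small images only).
    """
    h, w = img.shape
    S = max(1, int((h * w / K) ** 0.5))          # grid step between seeds
    ys, xs = np.mgrid[S // 2:h:S, S // 2:w:S]    # seed centers on a grid
    cy, cx = ys.ravel().astype(float), xs.ravel().astype(float)
    ci = img[ys.ravel(), xs.ravel()].astype(float)
    Y, X = np.mgrid[0:h, 0:w]
    for _ in range(iters):
        # Joint intensity + normalized spatial distance to each center.
        d_int = (img[None] - ci[:, None, None]) ** 2
        d_pos = ((Y[None] - cy[:, None, None]) ** 2 +
                 (X[None] - cx[:, None, None]) ** 2) / float(S ** 2)
        labels = np.argmin(d_int + m * d_pos, axis=0)
        for k in range(len(ci)):                 # recompute cluster centers
            mask = labels == k
            if mask.any():
                ci[k] = img[mask].mean()
                cy[k], cx[k] = Y[mask].mean(), X[mask].mean()
    return labels

# Toy image: left half dark, right half bright.
img = np.zeros((16, 16))
img[:, 8:] = 1.0
labels = simple_slic(img, K=16)
```

Decreasing `K` enlarges each superpixel, which mirrors the trade-off discussed above: coarser nodes are cheaper to process but track boundaries less precisely.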
4.3.2 Comparison of different methods
Table 1 shows the segmentation results of different methods. To demonstrate the merits of our model, we compare K-means clustering, the Chan-Vese model, the snake model, the Local Binary Fitting (LBF) model, and our method; on some datasets, we also compare with the supervised method U-net. In all the experiments, every method is evaluated on the corresponding testing set.
Table 1 shows that our method achieves the highest accuracy on the majority of the datasets. The K-means method focuses on global losses and does not perform well on images with intensity inhomogeneity. Compared with the snake and Chan-Vese models, our method considers local information and yields better segmentation results in all experiments. LBF, an implicit active contour model driven by local binary fitting energy, performs best on the CVIP dataset; however, processing each image takes a long time due to the convolution operations it introduces. Although U-net performs better on the SPINE dataset, this is mainly because spines have roughly the same shape and because an extensive training set of 87 images is provided.
Table 1  Segmentation accuracy of different methods on multiple datasets.

Method            CVIP           LUNG           SKULL          SPINE
                  DSC    JSC     DSC    JSC     DSC    JSC     DSC    JSC
K-means           0.918  0.902   0.817  0.811   0.901  0.841   0.921  0.918
Chan-Vese         0.965  0.934   0.836  0.821   0.919  0.851   0.970  0.941
Snake             0.954  0.924   0.823  0.801   0.898  0.837   0.956  0.947
LBF               0.970  0.943   0.865  0.841   0.921  0.849   0.975  0.954
U-net             0.958  0.929   -      -       0.912  0.843   0.976  0.954
Proposed method   0.968  0.939   0.876  0.865   0.923  0.858   0.973  0.948
As shown in Fig. 5, obtaining separated bones is difficult because no clear distinction exists between the gaps and the bones; thus, other areas can easily be mistaken for bone. With our method, the target boundary is fitted well. In addition, the similarity between superpixels is used to enhance local relevance and obtain a compact segmentation. To achieve a compact contour, the Chan-Vese model introduces a smoothing term, which leads to segmentation results that cannot fit the target boundary well, especially in images with intensity inhomogeneity. For the snake model, the result is often not locally smooth, and the final result may contain many isolated contours.
Fig. 5  Segmentation results of different methods. Each row represents a sample corresponding to the ground truth and the segmentation results under different methods.
Our proposed method has remarkable advantages in terms of initialization and iteration times. Compared with our model, the Chan-Vese model is more sensitive to initialization and, in many cases, does not converge to the target region. As shown in Fig. 6a, if the initial contour is around the target, the algorithm easily obtains the desired segmentation; otherwise, as shown in Fig. 6b, the result is poor because remote pixels are not considered. In addition, as shown in Figs. 7 and 8, our method iterates fewer than ten times in all experiments, whereas the Chan-Vese model requires dozens or even hundreds of iterations to converge. Moreover, our method has a definite stopping condition: iteration ends when the min-cut of the undirected graph no longer changes. By contrast, the Chan-Vese model's evolving contour produces different results under different iteration counts and stopping conditions, and once the evolving contour crosses the target contour during iteration, the error cannot be undone.
Fig. 6  Results of the Chan-Vese model and our model under different initializations. The green line represents the initial contour, the blue line is the segmentation result of the Chan-Vese model, and the red line is the result of our model.
Fig. 7  Results of the Chan-Vese model and our model under different numbers of iterations.
Fig. 8  Segmentation performance under different numbers of iterations (MINI-CVIP).
4.3.3 Ablation study
We further explore the parameters of our method. In practice, the segmentation results are adjusted mainly by four parameters: two weights for the global term and two for the local term, where the local pair plays the same role as the global pair but is used mainly for intensity-inhomogeneous images. As shown in Fig. 9, by varying the two weights, we find that the larger the foreground weight is, the more uniform the interior of the foreground target becomes; conversely, the larger the background weight is, the more consistent the interior of the background becomes. In addition, we conduct ablation studies, as shown in Table 2. Compared with using only global information, the weight maps we introduce bring a large improvement because they mainly consider the pixels in the surrounding neighborhood. Therefore, in actual use, images with intensity inhomogeneity should be segmented by adjusting the local weights.
Table 2  Ablation studies. ΔDSC denotes the change in DSC compared with using global and local information simultaneously.

Parameter setting             DSC      ΔDSC
Global + local information    0.9656   0
Global information only       0.9404   -0.0252
Local information only        0.9551   -0.0105
Fig. 9  Segmentation results under different parameters.