Journal of Guangdong University of Technology ›› 2023, Vol. 40 ›› Issue (06): 168-175. doi: 10.12052/gdutxb.230131

• Artificial Intelligence •

Cardiac Multiclass Segmentation Method Based on Self-attention and 3D Convolution

Zeng An1, Chen Xu-zhou1, Ji Yu-Zhu1, Pan Dan2, Xu Xiao-Wei3

  1. School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, China;
  2. School of Electronics and Information Technology, Guangdong Technical Normal University, Guangzhou 510665, China;
  3. Department of Cardiovascular Surgery, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Guangzhou 510080, China
  • Received: 2023-08-31 Online: 2023-11-25 Published: 2023-11-08
  • About the author: Zeng An (1978-), female, professor and doctoral supervisor. Her main research interests are image processing, pattern recognition, and artificial intelligence. E-mail: zengan@gdut.edu.cn
  • Funding:
    Key-Area Research and Development Program of Guangdong Province (2021B0101220006); Science and Technology Planning Project of Guangdong Province (2019A050510041); Natural Science Foundation of Guangdong Province (2021A1515012300); National Natural Science Foundation of China (61976058, 92267107); Science and Technology Planning Project of Guangzhou (202103000034, 202002020090)

Abstract: Cardiac multi-class segmentation is of great significance in medical imaging: it provides accurate cardiac structure information and assists clinical diagnosis. However, when multi-class semantic segmentation models are trained on high-resolution cardiac images, repeated downsampling causes a loss of deep features, which leads to discontinuous organs and incorrectly segmented edges in the segmented cardiac images. To address this challenge, this paper proposes 3DCSNet, a neural network based on self-attention and 3D convolution for cardiac multi-class segmentation. Specifically, a 3D feature fusion module and a 3D spatial perception module are introduced into the segmentation network. The former integrates self-attention and 3D convolution for parallel feature extraction and can effectively allocate attention weights within and between channels under the same dimension of the feature map. The latter integrates the self-attention mechanism to capture positional correlation information between different dimensions, which avoids losing important information during downsampling and further retains key deep features. Experimental results show that the proposed 3DCSNet outperforms several existing models on ImageCHD, a publicly available 3D computed tomography image dataset of congenital heart disease.

Key words: multi-class semantic segmentation, cardiac medical images, 3D convolution, self-attention mechanism, U-Net
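
The abstract describes a feature fusion module that runs self-attention and 3D convolution as parallel branches. The sketch below only illustrates that general idea on a tiny volume and is not the 3DCSNet implementation. Everything in it is an assumption made for illustration: the function names (`conv3d_mean`, `self_attention`, `fuse_parallel`) are hypothetical, the convolution uses a fixed 3x3x3 averaging kernel instead of learned weights, the attention uses identity projections (Q = K = V), and the two branches are fused by an elementwise sum, which the abstract does not specify.

```python
import math
from itertools import product


def conv3d_mean(vol, shape):
    """Local branch: 3x3x3 averaging convolution with zero padding, per channel.

    vol is a flat list of voxel feature vectors in (depth, height, width) order.
    """
    D, H, W = shape
    C = len(vol[0])
    out = []
    for d, h, w in product(range(D), range(H), range(W)):
        acc = [0.0] * C
        for dd, dh, dw in product((-1, 0, 1), repeat=3):
            z, y, x = d + dd, h + dh, w + dw
            # Neighbours outside the volume count as zeros (zero padding).
            if 0 <= z < D and 0 <= y < H and 0 <= x < W:
                v = vol[(z * H + y) * W + x]
                for c in range(C):
                    acc[c] += v[c] / 27.0
        out.append(acc)
    return out


def self_attention(vol):
    """Global branch: scaled dot-product self-attention over all voxels.

    Assumption: identity projections, so queries, keys and values are vol itself.
    """
    C = len(vol[0])
    scale = 1.0 / math.sqrt(C)
    out = []
    for q in vol:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in vol]
        m = max(scores)  # subtract the maximum for a numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append(
            [sum(wt * v[c] for wt, v in zip(weights, vol)) / total for c in range(C)]
        )
    return out


def fuse_parallel(vol, shape):
    """Run both branches on the same input and fuse them by elementwise sum."""
    local_feats = conv3d_mean(vol, shape)
    global_feats = self_attention(vol)
    return [[a + b for a, b in zip(l, g)] for l, g in zip(local_feats, global_feats)]


# Toy input: a 3x3x3 volume with two channels, every feature equal to 1.0.
shape = (3, 3, 3)
vol = [[1.0, 1.0] for _ in range(27)]
fused = fuse_parallel(vol, shape)
```

In this toy setting the convolution branch responds to the local neighbourhood, so border voxels see padded zeros, while the attention branch mixes information from every voxel regardless of distance. A real module would use learned convolution kernels and learned query, key and value projections.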

CLC number:

  • TP391.4