广东工业大学学报


基于多尺度卷积和注意力机制的病理图像分割网络

曾安1, 赖峻浩1, 杨宝瑶1, 潘丹2   

  1. 广东工业大学 计算机学院, 广东 广州 510006;
    2. 广东技术师范大学 电子与信息学院, 广东 广州 510665
  • 收稿日期:2023-11-11 出版日期:2024-09-27 发布日期:2024-09-27
  • Corresponding author: YANG Bao-yao (born 1991), female, associate professor and graduate supervisor; her research interests include pattern recognition and data mining, knowledge representation and processing, machine learning, and image/video processing and multimedia technology. E-mail: ybaoyao@gdut.edu.cn
  • First author: ZENG An (born 1978), female, professor and doctoral supervisor; her research interests include image processing, pattern recognition and artificial intelligence. E-mail: zengan@gdut.edu.cn
  • Supported by: National Natural Science Foundation of China (61976058, 92267107); Key-Area Research and Development Program of Guangdong Province (2021B0101220006); Science and Technology Program of Guangdong Province (2019A050510041); Natural Science Foundation of Guangdong Province (2021A1515012300); Science and Technology Program of Guangzhou (202103000034, 202002020090)

Pathology Image Segmentation Network Based on Multiscale Convolution and Attention Mechanism

Zeng An1, Lai Jun-hao1, Yang Bao-yao1, Pan Dan2   

  1. School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, China;
    2. School of Electronics and Information, Guangdong Polytechnic Normal University, Guangzhou 510665, China
  • Received:2023-11-11 Online:2024-09-27 Published:2024-09-27

摘要: 深度学习在病理图像分割中具有重要作用,然而现有深度学习方法在处理多尺度的病理图像分割任务上存在分割性能不足、泛化能力差等问题。针对上述问题,本文提出了一种基于多尺度卷积和注意力机制的病理图像分割网络。该网络通过多尺度卷积注意力模块,从不同尺度出发对特征图进行特征提取,并从空间维度出发关注全局上下文关联信息,有效过滤冗余噪声信息,提升网络处理多尺度病理图像数据的泛化能力;通过多尺度特征融合模块,将不同尺度的特征进行信息融合,丰富了特征图的边缘信息和细粒度信息,改善分割效果。在GlaS、MoNuSeg和Lizard数据集上分别进行实验,所提方法的Dice指标在3个数据集上分别为91.07%、81.00%、79.87%,IoU指标分别为84.13%、68.22%、67.26%。实验结果表明,本文所提方法能够有效分割病理图像,提升分割准确度,为临床诊断提供可靠依据。
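The multi-scale convolution attention idea summarized in the abstract can be illustrated with a minimal NumPy sketch: features are extracted with filters of several kernel sizes, fused by averaging across scales, and then reweighted by a spatial sigmoid attention map. All function names and design choices below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def box_filter(x, k):
    """Average each spatial location of a 2-D map over a k×k window (same padding)."""
    h, w = x.shape
    pad = k // 2
    xp = np.pad(x, pad, mode="edge")
    out = np.empty((h, w), dtype=float)
    for i in range(h):
        for j in range(w):
            out[i, j] = xp[i:i + k, j:j + k].mean()
    return out

def multiscale_spatial_attention(feat, scales=(3, 5, 7)):
    """feat: (C, H, W) feature map. Hypothetical sketch of the module:
    extract features at several scales, fuse them, then reweight spatial
    positions with a sigmoid attention map derived from the channel mean."""
    # Multi-scale branch: filter every channel at each kernel size,
    # then fuse the scales by averaging.
    ms = np.stack([
        np.stack([box_filter(c, k) for c in feat])   # per-channel filtering
        for k in scales
    ]).mean(axis=0)                                   # (C, H, W)
    # Spatial attention: squeeze channels, squash to (0, 1).
    attn = 1.0 / (1.0 + np.exp(-ms.mean(axis=0)))     # (H, W) sigmoid map
    return ms * attn[None, :, :]                      # broadcast over channels
```

A real implementation would use learned convolution weights rather than box filters; the sketch only shows the data flow of "multi-scale extraction, then spatial reweighting".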

关键词: 病理图像分割, UNet, 多尺度, 注意力机制

Abstract: Deep learning plays an essential role in the segmentation of pathological images. However, most existing deep learning methods still suffer from limited segmentation performance and poor generalization on multi-scale pathological image segmentation tasks. To address these issues, we propose a pathological image segmentation network based on multi-scale convolution and attention mechanisms. We design a multi-scale convolution attention module that extracts features at different scales and captures global contextual correlations along the spatial dimension, effectively filtering redundant noise and improving the network's generalization when handling multi-scale pathological image data. Additionally, we design a multi-scale feature fusion module that integrates features from different scales, enriching the edge and fine-grained information in the feature maps and improving segmentation results. Experiments were performed on the GlaS, MoNuSeg and Lizard datasets; the Dice scores of the proposed method were 91.07%, 81.00% and 79.87%, respectively, and the IoU scores were 84.13%, 68.22% and 67.26%, respectively. These results demonstrate that the proposed method can effectively segment pathology images, improve segmentation accuracy, and provide a reliable basis for clinical diagnosis.
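The Dice and IoU scores reported above are standard overlap metrics for binary segmentation masks. A minimal sketch of how they are defined (the authors' exact evaluation code may differ):

```python
import numpy as np

def dice_iou(pred, gt, eps=1e-7):
    """Dice coefficient and IoU (Jaccard index) for binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()            # |P ∩ G|
    union = np.logical_or(pred, gt).sum()             # |P ∪ G|
    dice = 2.0 * inter / (pred.sum() + gt.sum() + eps)
    iou = inter / (union + eps)
    return dice, iou
```

For a prediction covering half of a fully positive ground truth, Dice is 2/3 and IoU is 1/2, reflecting the identity Dice = 2·IoU / (1 + IoU).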

Key words: pathology image segmentation, UNet, multiscale, attention mechanism

CLC Number: TP391
[1] 宋杰, 肖亮, 练智超, 等. 基于深度学习的数字病理图像分割综述与展望[J]. 软件学报, 2021, 32(5): 1427-1460.
SONG J, XIAO L, LIAN Z C, et al. Overview and prospect of deep learning for image segmentation in digital pathology [J]. Journal of Software, 2021, 32(5): 1427-1460.
[2] TSAI A, YEZZI A, WELLS W, et al. A shape-based approach to the segmentation of medical imagery using level sets [J]. IEEE Transactions on Medical Imaging, 2003, 22(2): 137-154.
[3] HELD K, KOPS E R, KRAUSE B J, et al. Markov random field segmentation of brain MR images [J]. IEEE Transactions on Medical Imaging, 1997, 16(6): 878-886.
[4] RONNEBERGER O, FISCHER P, BROX T. U-net: convolutional networks for biomedical image segmentation[C]//Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference. Munich: Springer International Publishing, 2015: 234-241.
[5] XIAO X, LIAN S, LUO Z, et al. Weighted res-unet for high-quality retina vessel segmentation[C]//2018 9th International Conference on Information Technology in Medicine and Education (ITME). Hangzhou: IEEE, 2018: 327-331.
[6] AZAD R, AGHDAM E K, RAULAND A, et al. Medical image segmentation review: the success of u-net[EB/OL]. arXiv: 2211.14830(2022-11-27) [2024-03-12]. https://arxiv.org/abs/2211.14830
[7] ZHOU S, NIE D, ADELI E, et al. High-resolution encoder–decoder networks for low-contrast medical image segmentation [J]. IEEE Transactions on Image Processing, 2019, 29: 461-475.
[8] HATAMIZADEH A, TANG Y, NATH V, et al. Unetr: transformers for 3d medical image segmentation[C]//Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. Waikoloa: IEEE, 2022: 574-584.
[9] CAO H, WANG Y, CHEN J, et al. Swin-unet: unet-like pure transformer for medical image segmentation[C]//European Conference on Computer Vision. Tel-Aviv: Springer Nature Switzerland, 2022: 205-218.
[10] LI X, CHEN H, QI X, et al. H-denseunet: hybrid densely connected unet for liver and tumor segmentation from ct volumes [J]. IEEE Transactions on Medical Imaging, 2018, 37(12): 2663-2674.
[11] ZHOU Z, RAHMAN SIDDIQUEE M M, TAJBAKHSH N, et al. Unet++: A nested u-net architecture for medical image segmentation[C]//Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support: 4th International Workshop, DLMIA 2018, and 8th International Workshop. Granada: Springer International Publishing, 2018: 3-11.
[12] ÇICEK Ö, ABDULKADIR A, LIENKAMP S S, et al. 3D u-net: learning dense volumetric segmentation from sparse annotation[C]//Medical Image Computing and Computer-Assisted Intervention–MICCAI 2016: 19th International Conference. Athens: Springer International Publishing, 2016: 424-432.
[13] MILLETARI F, NAVAB N, AHMADI S A. V-net: fully convolutional neural networks for volumetric medical image segmentation[C]//2016 Fourth International Conference on 3D Vision (3DV). Stanford: IEEE, 2016: 565-571.
[14] CHEN J, LU Y, YU Q, et al. Transunet: transformers make strong encoders for medical image segmentation[EB/OL]. arXiv: 2102.04306(2021-02-08) [2024-03-12]. https://arxiv.org/abs/2102.04306
[15] DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: transformers for image recognition at scale [EB/OL]. arXiv: 2010.11929(2021-06-03) [2024-03-12]. https://arxiv.org/abs/2010.11929
[16] ZHANG Y, LIU H, HU Q. Transfuse: fusing transformers and CNNs for medical image segmentation[C]//Medical Image Computing and Computer Assisted Intervention–MICCAI 2021: 24th International Conference. Strasbourg: Springer International Publishing, 2021: 14-24.
[17] VALANARASU J M J, OZA P, HACIHALILOGLU I, et al. Medical transformer: gated axial-attention for medical image segmentation[C]//Medical Image Computing and Computer Assisted Intervention–MICCAI 2021: 24th International Conference. Strasbourg: Springer International Publishing, 2021: 36-46.
[18] WANG H, ZHU Y, GREEN B, et al. Axial-deeplab: stand-alone axial-attention for panoptic segmentation[C]//European Conference on Computer Vision. Glasgow: Springer International Publishing, 2020: 108-126.
[19] WANG H, CAO P, WANG J, et al. Uctransnet: rethinking the skip connections in u-net from a channel-wise perspective with transformer[C]//Proceedings of the AAAI Conference on Artificial Intelligence. Vancouver: AAAI, 2022, 36(3): 2441-2449.
[20] DIAO S, TIAN Y, HU W, et al. Weakly supervised framework for cancer region detection of hepatocellular carcinoma in whole-slide pathologic images based on multiscale attention convolutional neural network [J]. The American Journal of Pathology, 2022, 192(3): 553-563.
[21] ZHAO X, JIA H, PANG Y, et al. M2snet: multi-scale in multi-scale subtraction network for medical image segmentation [EB/OL]. arXiv: 2303.10894(2023-03-20) [2024-03-12]. https://arxiv.org/abs/2303.10894
[22] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Advances in Neural Information Processing Systems 30. Long Beach: Neural Information Processing Systems Foundation, 2017: 5999-6008.
[23] LI C, LI L, JIANG H, et al. Yolov6: a single-stage object detection framework for industrial applications[EB/OL]. arXiv: 2209.02976(2022-09-07) [2024-03-12]. https://arxiv.org/abs/2209.02976
[24] SIRINUKUNWATTANA K, PLUIM J P W, CHEN H, et al. Gland segmentation in colon histology images: the glas challenge contest [J]. Medical Image Analysis, 2017, 35: 489-502.
[25] KUMAR N, VERMA R, SHARMA S, et al. A dataset and a technique for generalized nuclear segmentation for computational pathology [J]. IEEE Transactions on Medical Imaging, 2017, 36(7): 1550-1560.
[26] GRAHAM S, JAHANIFAR M, AZAM A, et al. Lizard: a large-scale dataset for colonic nuclear instance segmentation and classification[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Montreal: IEEE, 2021: 684-693.
[27] OKTAY O, SCHLEMPER J, FOLGOC L L, et al. Attention u-net: learning where to look for the pancreas[EB/OL]. arXiv: 1804.03999(2018-05-20) [2024-03-12]. https://arxiv.org/abs/1804.03999