Journal of Guangdong University of Technology ›› 2025, Vol. 42 ›› Issue (1): 60-69. doi: 10.12052/gdutxb.240091

• Smart Medical •

A Method for Sparse-view Medical Image Reconstruction Based on Self-attention Neural Radiance Fields

Liao Haolin, Li Si   

  1. School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, China
  • Received: 2024-08-06 Published: 2025-01-14

Abstract: Sparse-view tomographic reconstruction is important for reducing radiation dose in clinical practice. In recent years, Implicit Neural Representation (INR) methods have been widely applied to sparse-view medical image reconstruction and have achieved competitive performance. However, traditional INR methods take each sampling point as an independent input, which neglects the inherent relations among neighboring sampling points and thus weakens reconstruction performance. To address this, this paper proposes a novel INR method that reorganizes neighboring sampling points on adjacent rays into multiple windows-of-interest, which are then fed into a Transformer query network equipped with a skip connection. By leveraging the self-attention mechanism of the Transformer network, the proposed method captures the intrinsic relations among sampling points within each window-of-interest, thereby improving the reconstructed image quality. Extensive numerical experiments are conducted in two tomographic imaging modalities: Cone-Beam Computed Tomography (CBCT) and parallel-beam Single-Photon Emission Computed Tomography (SPECT). The experimental results show that, compared with the advanced INR method Freq-NAF, the proposed method achieves superior reconstruction accuracy and visual quality under sparse-view conditions, in particular a 0.45 dB improvement in Peak Signal-to-Noise Ratio (PSNR) on the chest CBCT dataset.
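The window-of-interest idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the window shape (`ray_win` adjacent rays by `sample_win` consecutive samples), the single attention head, the random projection matrices, and the placement of the skip connection as a simple residual around the attention output are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def group_into_windows(feats, ray_win, sample_win):
    """Group per-point features of shape (rays, samples, dim) into windows.

    Each window holds the points from `ray_win` adjacent rays and
    `sample_win` consecutive samples (hypothetical window layout).
    Returns an array of shape (num_windows, ray_win * sample_win, dim).
    """
    R, S, D = feats.shape
    assert R % ray_win == 0 and S % sample_win == 0
    x = feats.reshape(R // ray_win, ray_win, S // sample_win, sample_win, D)
    x = x.transpose(0, 2, 1, 3, 4)  # bring the two window indices to the front
    return x.reshape(-1, ray_win * sample_win, D)

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def window_self_attention(windows, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention inside each window,
    followed by a residual (skip) connection (assumed placement)."""
    q, k, v = windows @ w_q, windows @ w_k, windows @ w_v
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    attn = softmax(scores, axis=-1)  # each point attends to all points in its window
    return windows + attn @ v, attn

rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 16, 4))  # 8 rays, 16 samples per ray, 4-dim features
windows = group_into_windows(feats, ray_win=2, sample_win=4)
w_q, w_k, w_v = (rng.normal(scale=0.5, size=(4, 4)) for _ in range(3))
out, attn = window_self_attention(windows, w_q, w_k, w_v)
print(windows.shape, out.shape, attn.shape)
```

In this sketch every point's output feature mixes information from the other points in its window, in contrast to a per-point network that processes each sampling point independently.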

Key words: implicit neural representation, sparse-view, medical image reconstruction, self-attention mechanism, window-of-interest

CLC Number: TP391