Abstract:
Accurate segmentation of rectal cancer lesions is crucial for preoperative planning and treatment strategy formulation. Existing rectal cancer magnetic resonance imaging (MRI) segmentation methods rely mainly on single-modality imaging input and fail to fully exploit the constraints that clinical structured information, such as serum tumor marker levels, can provide. Moreover, rectal cancer lesions often exhibit low contrast and ambiguous boundaries, and skip connections tend to introduce noise and redundant information when transmitting shallow features, which further increases segmentation difficulty. To address these issues, this paper proposes a novel segmentation network, termed PCM-UNet (Prompt-guided Clinical Mamba U-Net), which integrates clinical structured information via a prompt-guided Mamba framework. The core of the proposed method is a Promptable Feature Fusion Module (PFM), which uses the Contrastive Language–Image Pre-training (CLIP) model to encode clinical structured information into text embeddings. These embeddings are injected throughout the multi-stage feature learning process of the encoder–decoder architecture, enabling collaborative modeling of imaging features and clinical prompts. Furthermore, a Mixed Parallel Attention Module (MPAM) is introduced into the skip connections to adaptively filter cross-level features, thereby alleviating the interference caused by noise and redundant responses during decoding. Finally, a lightweight Frequency Domain Enhancement Module (FEM) is incorporated into the encoder to complement the representation of boundary and texture details.
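As an illustration of how clinical structured information can be turned into input for a text encoder such as CLIP, the following minimal sketch renders a patient record as a single prompt string. The function name `build_clinical_prompt`, the prompt template, the marker names, and the values are all hypothetical assumptions chosen for illustration; they are not taken from the proposed method or from the CRC-370 dataset.

```python
from typing import Mapping


def build_clinical_prompt(record: Mapping[str, float]) -> str:
    """Render structured clinical values as one text prompt (hypothetical template)."""
    # Sort the field names so the same record always yields the same prompt.
    parts = [f"{name}: {value:g}" for name, value in sorted(record.items())]
    return "Rectal cancer MRI; " + ", ".join(parts)


# Hypothetical serum tumor marker values, for illustration only.
prompt = build_clinical_prompt({"CEA (ng/mL)": 5.2, "CA19-9 (U/mL)": 37.0})
print(prompt)
```

A string of this kind would then be tokenized and passed to a text encoder to obtain the embedding that conditions the image features; that step is omitted here because it depends on the specific encoder and fusion design.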
Experiments conducted on the CRC-370 dataset, curated from data provided by the First Affiliated Hospital of Sun Yat-sen University and the 940th Hospital of the Joint Logistics Support Force of the Chinese People’s Liberation Army, show that the proposed method outperforms the best competing baseline by 2.21% and 2.25% in terms of Intersection over Union (IoU) and Dice coefficient, respectively. Ablation studies further indicate that PFM is the primary contributor to the performance improvement, while MPAM provides complementary benefits for cross-level feature transmission and FEM yields additional gains when applied to mid- and high-level features. These results indicate that incorporating patient-level clinical structured information into a Mamba-based segmentation framework effectively improves the segmentation of rectal cancer lesions in MRI.
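For reference, the two reported metrics can be computed from a predicted and a ground-truth binary mask as in the minimal sketch below. It uses the standard definitions of IoU and Dice; the function name `iou_and_dice`, the flattened-list mask representation, and the convention for two empty masks are assumptions made for this illustration and do not describe the evaluation code used in the experiments.

```python
from typing import Sequence


def iou_and_dice(pred: Sequence[int], target: Sequence[int]) -> tuple[float, float]:
    """Return (IoU, Dice) for two flattened binary masks of equal length."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Count pixels that are foreground in both masks, and in each mask alone.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_area = sum(1 for p in pred if p)
    target_area = sum(1 for t in target if t)
    union = pred_area + target_area - intersection
    if union == 0:
        # Both masks are empty: treated as a perfect match (assumed convention).
        return 1.0, 1.0
    iou = intersection / union
    dice = 2 * intersection / (pred_area + target_area)
    return iou, dice


# Toy masks: 3 predicted pixels, 3 true pixels, 2 of them shared.
pred = [1, 1, 1, 0, 0, 0]
target = [0, 1, 1, 1, 0, 0]
iou, dice = iou_and_dice(pred, target)
print(f"IoU={iou:.4f}, Dice={dice:.4f}")
```

The two metrics are monotonically related for a single mask pair, since Dice = 2·IoU / (1 + IoU), which is why improvements in one are usually accompanied by improvements in the other.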