MAC-MIL: Multi-head Attention-Challenging Multiple Instance Learning for Survival Analysis

Category: Prognosis

Status: MICCAI 2024 Challenge 7th Place Winner

MAC-MIL is a deep-learning model that predicts survival risk in prostate cancer patients from whole-slide images (WSIs). It targets biochemical recurrence prediction by combining a multi-head attention mechanism with random masking of the most highly attended patches, with the aim of improving robustness and interpretability.

Key Contributions

  1. Multi-head Attention-Based Patch Selection: The model introduces an attention-based mechanism to select relevant tissue patches from histopathology slides. This approach allows the model to focus on critical regions of interest, ensuring more precise predictions for biochemical recurrence in prostate cancer patients.

  2. Top-K Masking for Generalization: By introducing random masking of top patches, the model avoids overfitting to a select few regions, thus improving its generalizability across different samples. This random masking technique encourages the model to distribute attention more broadly across patches.

  3. Integration with Pretrained Foundation Models: MAC-MIL utilizes the UNI model, a foundation model pre-trained on histopathological images, for feature extraction. This integration improves the accuracy and quality of patch-level feature representation, enhancing the model’s performance in survival prediction.

  4. Attention-Based Survival Prediction: The project combines an attention mechanism with multiple instance learning (MIL) to aggregate patch-level features, yielding more accurate patient-level survival predictions. This combination provides a more interpretable and reliable prognosis for patients.

AI Implementation in MAC-MIL

MAC-MIL implements a range of AI techniques and architectures to improve survival analysis:

  • Foundation Model for Feature Extraction: The UNI model is used as a pretrained encoder to extract patch-level features from WSIs. The features extracted by UNI are robust due to the model’s pre-training on over 100 million images from various tissue types.

  • Multi-head Attention Block: After feature extraction, the attention block assigns importance scores to each patch. This multi-head attention mechanism allows the model to focus on different aspects of each patch, ensuring that the most diagnostically relevant regions are given priority.

  • Filtering and Masking: A filtering mechanism selects the top-N patches based on attention scores. During training, the top-K patches are then randomly masked with a set probability, which prevents the model from over-relying on a few patches and improves its ability to generalize.

  • Multiple Instance Learning: MIL aggregates patch-level features into a single slide-level representation. This aggregated representation is then used for both ISUP grade classification and survival prediction.

Methodology

1. Problem Statement and Dataset

The goal of MAC-MIL is to predict the survival risk of prostate cancer patients using WSIs. The dataset comprises WSIs, event indicators (censored or uncensored), and time-to-event data. Gleason grading is incorporated through ISUP grade labels, which are obtained using the PND model.

2. Feature Extraction

Each WSI is divided into smaller patches to reduce computational cost. The UNI foundation model is used to extract patch-level feature representations. This step ensures that each patch’s key features are captured, which is crucial for the model to understand tissue morphology and the severity of the cancer.
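The tiling step can be sketched as follows. This is a minimal illustration, not the project's code: the function name patch_grid, the 256-pixel patch size, and the non-overlapping stride are assumptions, since the page does not state the tiling parameters. Feature extraction with UNI would then run on each tile and is omitted here.

```python
from typing import Iterator, Optional, Tuple


def patch_grid(width: int, height: int, patch_size: int = 256,
               stride: Optional[int] = None) -> Iterator[Tuple[int, int]]:
    """Yield top-left (x, y) coordinates of patches that fit fully inside the slide.

    patch_size=256 is an assumed value; stride defaults to patch_size (no overlap).
    """
    step = stride if stride is not None else patch_size
    for y in range(0, height - patch_size + 1, step):
        for x in range(0, width - patch_size + 1, step):
            yield (x, y)


# Toy slide of 1024 x 512 pixels, tiled into 256 x 256 patches.
coords = list(patch_grid(1024, 512, patch_size=256))
```

In practice, background tiles are usually filtered out before feature extraction; that step is not described on this page and is left out of the sketch.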

3. Patch Selection

The patch selection mechanism uses multi-head attention to assign an importance score to each patch, and the top-N patches are selected for the final model training. The attention scores are computed using two fully connected layers followed by tanh and sigmoid activations, so the scoring combines linear projections with non-linear activations.
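One plausible reading of this scoring step is gated attention, where a tanh branch and a sigmoid branch are multiplied element-wise before a final projection to one score per head. The NumPy sketch below follows that reading. The weight names V, U, and w, the softmax over patches, and the averaging of heads before top-N selection are all assumptions made for illustration; the page does not specify them.

```python
import numpy as np


def gated_attention_scores(feats, V, U, w):
    """Gated attention scores (assumed form).

    feats: (n_patches, d) patch features
    V, U:  (d, h) weights of the tanh and sigmoid branches
    w:     (h, n_heads) projection to one logit per head
    Returns (n_patches, n_heads); each column sums to 1 over patches.
    """
    tanh_branch = np.tanh(feats @ V)
    sigmoid_branch = 1.0 / (1.0 + np.exp(-(feats @ U)))
    logits = (tanh_branch * sigmoid_branch) @ w
    logits = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=0, keepdims=True)


def select_top_n(scores, n):
    """Indices of the n patches with the highest head-averaged score (assumed rule)."""
    mean_score = scores.mean(axis=1)
    n = min(n, mean_score.shape[0])
    return np.argsort(-mean_score)[:n]


# Toy example with random features and weights (sizes are illustrative only).
rng = np.random.default_rng(0)
feats = rng.normal(size=(50, 16))
V = rng.normal(size=(16, 8))
U = rng.normal(size=(16, 8))
w = rng.normal(size=(8, 4))

scores = gated_attention_scores(feats, V, U, w)
top_idx = select_top_n(scores, n=10)
```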

4. Top-K Masking

To avoid overfitting, random masking is applied to the top-K patches, challenging the model to focus on different regions of the WSI during each training cycle. This process improves generalization and model robustness.
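A minimal sketch of the masking idea, assuming each of the top-K patches is dropped independently with probability p. The function name mask_top_k and the per-patch independent drop are assumptions; the page only says that random masking is applied to the top-K patches.

```python
import numpy as np


def mask_top_k(scores, k, p, rng):
    """Return a boolean keep-mask over patches.

    Each of the k highest-scoring patches is dropped independently with
    probability p (assumed scheme); all other patches are always kept.
    """
    keep = np.ones(scores.shape[0], dtype=bool)
    k = min(k, scores.shape[0])
    top_k = np.argsort(-scores)[:k]
    drop = rng.random(k) < p
    keep[top_k[drop]] = False
    return keep


rng = np.random.default_rng(0)
scores = np.array([0.1, 0.9, 0.3, 0.8, 0.5])

# Training-time use: zero out the attention of masked patches.
keep = mask_top_k(scores, k=2, p=0.5, rng=rng)
masked_scores = np.where(keep, scores, 0.0)
```

Masking of this kind is a training-time regularizer; at inference all selected patches would normally be kept.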

5. Risk Prediction

The features from the selected patches are aggregated using multiple-instance learning. A slide-level representation is generated by combining the patch-level features weighted by their attention scores. This representation is then passed through a multilayer perceptron (MLP) for ISUP grade classification and survival risk prediction. The final loss function combines a cross-entropy loss for ISUP classification and a CoxPH loss for survival analysis.
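The aggregation and the two loss terms can be sketched in NumPy as below. This is an illustration under stated assumptions, not the project's implementation: the Cox term uses the standard negative partial log-likelihood with Breslow-style handling of ties, and the weight lam that balances the two terms is hypothetical, since the page does not say how they are combined.

```python
import numpy as np


def attention_pool(feats, weights):
    """Slide-level representation: attention-weighted sum of patch features."""
    weights = weights / weights.sum()
    return weights @ feats


def cross_entropy(logits, labels):
    """Mean cross-entropy. logits: (n, n_classes); labels: (n,) integer classes."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())


def cox_ph_loss(risk, time, event):
    """Negative Cox partial log-likelihood, averaged over observed events.

    risk:  (n,) predicted log-risk scores
    time:  (n,) follow-up times
    event: (n,) 1 if the event was observed, 0 if censored
    """
    n_events = event.sum()
    if n_events == 0:
        return 0.0
    # Row i marks the patients still at risk at time[i] (time_j >= time_i).
    at_risk = time[None, :] >= time[:, None]
    shift = risk.max()  # numerical stability
    log_denominator = shift + np.log((at_risk * np.exp(risk - shift)).sum(axis=1))
    return float(-((risk - log_denominator) * event).sum() / n_events)


# Toy batch of three patients; ISUP grades 1-5 map to classes 0-4.
risk = np.array([2.0, 1.0, 0.0])
time = np.array([1.0, 2.0, 3.0])
event = np.array([1.0, 1.0, 1.0])
isup_logits = np.zeros((3, 5))
isup_labels = np.array([4, 2, 0])

lam = 1.0  # hypothetical weighting between the two terms
total_loss = cross_entropy(isup_logits, isup_labels) + lam * cox_ph_loss(risk, time, event)
```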

Results

  • Performance Metrics: The model’s performance is evaluated using the Concordance Index (C-index), a standard metric in survival analysis. MAC-MIL achieved a C-index of 0.6704, outperforming baseline models that used random patch selection or single attention heads.

  • Attention-Based Patch Selection: The C-index improved from 0.5900 (random patch selection) to 0.6342 with attention-based patch selection, highlighting the effectiveness of attention in selecting diagnostically relevant patches.

  • Multi-head Attention and Masking: Introducing multiple attention heads and top-K masking further improved the C-index to 0.6704, demonstrating the benefit of the model’s multi-head attention mechanism in survival prediction.
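For readers unfamiliar with the metric, the C-index is the fraction of comparable patient pairs in which the patient with the earlier observed event also received the higher predicted risk. The sketch below is a simple quadratic-time version of Harrell's C-index written for illustration; the evaluation code used in the challenge is not described on this page and may treat ties differently.

```python
def concordance_index(time, event, risk):
    """Harrell's C-index.

    A pair (i, j) is comparable when patient i had an observed event and
    time[i] < time[j]. It is concordant when risk[i] > risk[j]; tied risks
    count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue  # censored patients cannot anchor a comparable pair
        for j in range(n):
            if time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy example: the third patient is censored.
time = [1, 2, 3, 4]
event = [1, 1, 0, 1]
c_perfect = concordance_index(time, event, risk=[4, 3, 2, 1])
c_reversed = concordance_index(time, event, risk=[1, 2, 3, 4])
c_constant = concordance_index(time, event, risk=[0, 0, 0, 0])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which puts the reported scores of 0.5900, 0.6342, and 0.6704 in context.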

Interpretability

To ensure the model’s decisions are interpretable, two heatmaps were generated:

  1. ISUP Patch Heatmap: This visualizes the ISUP grade assigned to each patch, providing insights into the severity of cancer across different regions of the WSI.

  2. Attention Heatmap: The heatmap based on attention scores highlights the regions the model focused on during prediction, offering a transparent view of the model’s decision-making process.

These heatmaps allow clinicians to understand which tissue regions contribute most to the survival prediction, making MAC-MIL a valuable tool for explainable AI in healthcare.
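As a rough illustration of how an attention heatmap can be assembled, the sketch below places min-max normalized patch scores onto a grid indexed by patch position. The function name attention_heatmap, the 256-pixel patch size, and the normalization are assumptions; the page does not describe how the project renders its heatmaps.

```python
import numpy as np


def attention_heatmap(coords, scores, patch_size, width, height):
    """Place min-max normalized scores on a patch grid.

    coords: top-left (x, y) pixel coordinates of each patch
    scores: one attention score per patch
    Grid cells without a patch stay NaN so they can be drawn as background.
    """
    scores = np.asarray(scores, dtype=float)
    span = scores.max() - scores.min()
    norm = (scores - scores.min()) / span if span > 0 else np.zeros_like(scores)
    grid = np.full((height // patch_size, width // patch_size), np.nan)
    for (x, y), s in zip(coords, norm):
        grid[y // patch_size, x // patch_size] = s
    return grid


# Three tissue patches on a 512 x 512 slide; the fourth cell has no patch.
heatmap = attention_heatmap(
    coords=[(0, 0), (256, 0), (0, 256)],
    scores=[0.1, 0.3, 0.2],
    patch_size=256, width=512, height=512,
)
```

The resulting array can be upsampled and overlaid on a slide thumbnail with any plotting library.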

Discussion

Adding attention-based patch selection, multiple attention heads, and top-K masking raised the C-index from 0.5900 with random patch selection to 0.6704. The gain from random to attention-based selection (0.5900 to 0.6342) shows that choosing patches by attention leads to better survival predictions. Additionally, the ablation studies on multi-head attention and masking indicate that tuning the number of attention heads and the masking settings can improve accuracy without overfitting.

Conclusion

MAC-MIL demonstrates the potential of multi-head attention and multiple-instance learning in survival analysis for prostate cancer patients. By introducing attention-based patch selection and top-K masking, the model improves generalizability and robustness, as evidenced by the enhanced C-index. Integrating the UNI foundation model for feature extraction ensures high-quality patch representations, making MAC-MIL a powerful tool for predicting biochemical recurrence in prostate cancer. Future work could explore its applicability to other cancer types and expand the dataset for further validation.
