DEFUSE-MS: Deformation Field-Guided Spatiotemporal Graph-Based Framework for Multiple Sclerosis New Lesion Detection

Category:

GNN

Status:

Accepted at MICCAI 2025

DEFUSE-MS is a spatiotemporal graph neural network (STGNN) tailored for new lesion detection in Multiple Sclerosis (MS). It tackles the clinically urgent task of identifying new T2-weighted lesions by modeling longitudinal brain changes across MRI timepoints and leveraging deformation field (DF) information. On the MSSEG-II test set, DEFUSE-MS reports state-of-the-art lesion segmentation and detection accuracy while keeping false positives low in cases without new lesions.

Key Contributions of DEFUSE-MS

  • Deformation-Guided Spatiotemporal Graph Modeling: DEFUSE-MS introduces a heterogeneous graph formulation that captures spatial structure at each timepoint and lesion evolution through deformation field–guided temporal edges.

  • Heterogeneous Graph-Based Encoder-Decoder: The model uses a ViG-style encoder-decoder architecture with graph reasoning layers, integrating local CNN features and global temporal relationships for new lesion detection.

  • Learned Deformation Embeddings: Instead of raw DFs or pooled features, DEFUSE-MS learns DF embeddings as temporal edge attributes, enabling accurate modeling of subtle lesion-related deformation patterns.

  • Dual Spatial and Temporal Graph Reasoning: By modeling intra-timepoint and inter-timepoint relationships jointly, the model captures both structural consistency and pathological evolution.

AI Implementation in DEFUSE-MS

  • Graph Construction:

    • Nodes: Represent patches from baseline and follow-up MRIs.

    • Spatial Edges: KNN-based links within each scan, capturing intra-scan structural similarity.

    • Temporal Edges: One-to-one links between matched nodes across timepoints, weighted by learned DF embeddings.

  • HSTGM (Heterogeneous Spatiotemporal Graph Module):

    • Integrates convolutional features with spatiotemporal reasoning via three Max-Relative GNNs (baseline spatial, follow-up spatial, and temporal).

    • Conditional gating mechanisms modulate edge influence based on spatial and temporal attributes.

  • Training Strategy and Backbone:

    • Trained patch-wise with heavy augmentation around lesion voxels to mitigate class imbalance.

    • Uses a 3D U-Net backbone, extended with ViG-like spatiotemporal GNN modules.
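To make the graph reasoning step more concrete, here is a minimal NumPy sketch of a max-relative aggregation with an optional edge gate. It follows the general ViG-style idea (concatenate each node's features with the element-wise maximum of its neighbors' relative features, then apply a linear map). The function names, the sigmoid gate, and the weight shapes are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def temporal_gate(df_emb, w_gate):
    """Hypothetical gate: maps DF edge embeddings (E, D) to per-channel weights (E, C)."""
    return sigmoid(df_emb @ w_gate)


def max_relative_conv(x_dst, x_src, edges, weight, gate=None):
    """Max-relative aggregation (sketch).

    x_dst:  (N, C) features of the nodes being updated
    x_src:  (M, C) features of the nodes sending messages
            (same array as x_dst for a spatial graph)
    edges:  (2, E) integer array, row 0 = source index, row 1 = target index
    weight: (2C, C_out) linear map applied to [x, aggregated]
    gate:   optional (E, C) multiplicative weights on each edge message
    """
    src, dst = edges
    x_dst = np.asarray(x_dst, dtype=float)
    x_src = np.asarray(x_src, dtype=float)
    rel = x_src[src] - x_dst[dst]          # relative feature per edge
    if gate is not None:
        rel = gate * rel                   # assumed form of conditional gating
    agg = np.full_like(x_dst, -np.inf)
    np.maximum.at(agg, dst, rel)           # element-wise max per target node
    agg[np.isneginf(agg)] = 0.0            # nodes with no incoming edge
    return np.concatenate([x_dst, agg], axis=1) @ weight


# Usage: update follow-up nodes from matched baseline nodes (one-to-one edges).
rng = np.random.default_rng(0)
n, c, d = 6, 4, 3
base, follow = rng.normal(size=(n, c)), rng.normal(size=(n, c))
df_emb = rng.normal(size=(n, d))           # stand-in for learned DF embeddings
temporal_edges = np.stack([np.arange(n), np.arange(n)])
gate = temporal_gate(df_emb, rng.normal(size=(d, c)))
follow_updated = max_relative_conv(
    follow, base, temporal_edges, rng.normal(size=(2 * c, c)), gate
)
```

With one-to-one temporal edges the maximum runs over a single neighbor, so the DF-derived gate is the main factor that shapes the temporal message in this sketch. The two spatial GNNs would call the same function with `x_src` equal to `x_dst` and KNN edges.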

Methodology Overview

Data:

  • MSSEG-II challenge dataset, including 100 patients scanned across 15 centers.

  • Uses the updated ground truth (GT) with new-lesion annotations.

Preprocessing:

  • Rigid registration, brain extraction (ROBEX), bias correction (N4), and histogram matching.

  • DF estimated using the Demons algorithm for each patient's baseline and follow-up scan pair.

Graph Generation:

  • Patch-level feature maps extracted from baseline, follow-up, and DF.

  • Constructed as a heterogeneous graph with node and edge embeddings encoding spatiotemporal relationships.
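The graph layout described above (KNN spatial edges within each scan, one-to-one temporal edges across timepoints, DF features as temporal edge attributes) can be sketched as follows. This is a simplified NumPy illustration: the dictionary keys, the value of `k`, the feature sizes, and the use of Euclidean distance in feature space for KNN are assumptions, and random arrays stand in for real patch-level feature maps.

```python
import numpy as np


def knn_edges(features, k):
    """Return a (2, N*k) array of directed edges (row 0 = neighbor, row 1 = node)."""
    features = np.asarray(features, dtype=float)
    sq = np.sum(features ** 2, axis=1)
    # Pairwise squared Euclidean distances between node features.
    dist = sq[:, None] + sq[None, :] - 2.0 * features @ features.T
    np.fill_diagonal(dist, np.inf)             # exclude self-loops
    nbrs = np.argsort(dist, axis=1)[:, :k]     # k nearest neighbors per node
    targets = np.repeat(np.arange(features.shape[0]), k)
    return np.stack([nbrs.reshape(-1), targets])


def build_graph(base_feats, follow_feats, df_feats, k=3):
    """Assemble a heterogeneous graph as a plain dict (hypothetical layout)."""
    n = base_feats.shape[0]
    assert follow_feats.shape[0] == n and df_feats.shape[0] == n
    idx = np.arange(n)
    return {
        "spatial_baseline": knn_edges(base_feats, k),
        "spatial_followup": knn_edges(follow_feats, k),
        "temporal": np.stack([idx, idx]),      # node i at baseline <-> node i at follow-up
        "temporal_attr": df_feats,             # one DF feature vector per temporal edge
    }


# Usage with random stand-in features: 8 patches, 16 channels, 4-dim DF features.
rng = np.random.default_rng(0)
graph = build_graph(
    rng.normal(size=(8, 16)), rng.normal(size=(8, 16)), rng.normal(size=(8, 4))
)
```

Keeping the three edge sets separate mirrors the heterogeneous formulation: each edge type can then be processed by its own GNN.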

Inference:

  • Sliding-window, patch-based inference for full-volume prediction.

  • Combines graph outputs with the decoder to produce the final segmentation.

Results

  • State-of-the-Art Performance: Achieved an F1 score of 0.65, a sensitivity of 0.74, and a Dice score of 0.55 on the MSSEG-II test set.

  • Learned DF Embeddings Improve Accuracy: Outperformed models using max-pooled DFs or no temporal information.

  • Strong Generalization: Low false positives in no-lesion cases (1.5 mm³ mean FP volume).

  • Robust to Anatomical Variability: Combines spatial edge features and temporal learning for balanced sensitivity and specificity.
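For readers less familiar with these metrics, the following sketch shows how a voxel-wise Dice score and a false-positive volume in mm³ can be computed from binary masks. It is an illustration only: the function names are hypothetical, the convention of returning 1.0 when both masks are empty is an assumption, and the detection metrics (F1, sensitivity) may be computed per lesion under the challenge protocol, which this sketch does not reproduce.

```python
import numpy as np


def dice_score(pred, truth):
    """Voxel-wise Dice overlap between two binary masks."""
    pred, truth = np.asarray(pred, dtype=bool), np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # assumed convention when both masks are empty
    return float(2.0 * np.logical_and(pred, truth).sum() / denom)


def false_positive_volume_mm3(pred, truth, spacing_mm):
    """Volume of predicted voxels that lie outside the ground truth."""
    pred, truth = np.asarray(pred, dtype=bool), np.asarray(truth, dtype=bool)
    n_fp = np.logical_and(pred, ~truth).sum()
    return float(n_fp * np.prod(spacing_mm))


# Usage: 4 predicted voxels, 4 true voxels, 2 of them overlapping.
pred = np.zeros((4, 4, 4), dtype=bool)
truth = np.zeros((4, 4, 4), dtype=bool)
pred[0, 0, :4] = True
truth[0, 0, 2:4] = True
truth[1, 0, :2] = True
d = dice_score(pred, truth)
fp = false_positive_volume_mm3(pred, truth, spacing_mm=(1.0, 1.0, 1.0))
```

For a no-lesion case the ground-truth mask is empty, so the false-positive volume equals the full predicted volume.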

Ablation Findings

  • Temporal Edge Modeling Matters: Removing the DF input or replacing the learned embeddings with max-pooled DFs lowers performance across all metrics.

  • Spatial Edges Help Only When Temporal Context Exists: Without learned DF embeddings, adding spatial edges increases false positives.

  • Learned DF + Spatial Edges is the most effective combination.

Conclusion

DEFUSE-MS offers a novel, interpretable, and high-performing solution for detecting new MS lesions using longitudinal MRI. Its deformation field–guided spatiotemporal graph design provides precise modeling of disease progression and outperforms both CNN- and Transformer-based baselines. The integration of learned DF embeddings as edge attributes sets a new direction for modeling temporal changes in neurodegenerative disease imaging. Future directions include adding multi-modal inputs and validating generalizability across diverse cohorts.
