DEFUSE-MS is a spatiotemporal graph neural network (STGNN) designed to detect new lesions in Multiple Sclerosis (MS). It addresses the clinically important task of identifying new T2-weighted lesions by modeling longitudinal brain changes across MRI timepoints and using deformation field (DF) information to guide the temporal modeling. On the multi-center MSSEG-II dataset, DEFUSE-MS reports state-of-the-art lesion segmentation and detection accuracy while remaining interpretable and producing few false positives in cases without new lesions.
Key Contributions of DEFUSE-MS
Deformation-Guided Spatiotemporal Graph Modeling: DEFUSE-MS introduces a heterogeneous graph formulation that captures spatial structure at each timepoint and lesion evolution through deformation field–guided temporal edges.
Heterogeneous Graph-Based Encoder-Decoder: The model uses a ViG-style encoder-decoder architecture with graph reasoning layers, integrating local CNN features and global temporal relationships for new lesion detection.
Learned Deformation Embeddings: Instead of raw DFs or pooled features, DEFUSE-MS learns DF embeddings as temporal edge attributes, enabling accurate modeling of subtle lesion-related deformation patterns.
Dual Spatial and Temporal Graph Reasoning: By modeling intra-timepoint and inter-timepoint relationships jointly, the model captures both structural consistency and pathological evolution.
AI Implementation in DEFUSE-MS
Graph Construction:
Nodes: Represent patches from baseline and follow-up MRIs.
Spatial Edges: KNN-based links within each scan, capturing intra-scan structural similarity.
Temporal Edges: One-to-one links between matched nodes across timepoints, weighted by learned DF embeddings.
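The three graph components above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names (knn_edges, build_graph), the value of k, the use of Euclidean distance for the KNN search, and the dictionary layout are all assumptions made for the example. It also assumes baseline and follow-up patches are already in one-to-one correspondence, so temporal edge i links baseline node i to follow-up node i.

```python
import numpy as np

def knn_edges(features, k):
    """Directed KNN edges as a (2, N*k) array of (source, target) node indices.

    Each node receives edges from its k nearest neighbours in feature space
    (Euclidean distance is an assumption of this sketch).
    """
    features = np.asarray(features, dtype=float)
    n = features.shape[0]
    # Pairwise squared Euclidean distances, shape (N, N).
    sq = np.sum(features ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * features @ features.T
    np.fill_diagonal(dist, np.inf)                 # exclude self-loops
    neighbours = np.argsort(dist, axis=1)[:, :k]   # (N, k)
    targets = np.repeat(np.arange(n), k)
    sources = neighbours.reshape(-1)
    return np.stack([sources, targets])

def build_graph(base_feats, follow_feats, df_embed, k=4):
    """Assemble a heterogeneous graph from patch features (hypothetical layout).

    base_feats, follow_feats: (N, C) patch features per timepoint.
    df_embed: (N, D) DF embedding per matched patch pair, used as the
    attribute of the corresponding temporal edge.
    """
    n = base_feats.shape[0]
    assert follow_feats.shape[0] == n and df_embed.shape[0] == n
    idx = np.arange(n)
    return {
        "spatial_baseline": knn_edges(base_feats, k),
        "spatial_followup": knn_edges(follow_feats, k),
        "temporal": np.stack([idx, idx]),  # baseline node i -> follow-up node i
        "temporal_attr": df_embed,
    }
```

Keeping the three edge sets separate, rather than merging them into one adjacency matrix, is what makes the graph heterogeneous: each edge type can then be processed by its own GNN.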
HSTGM (Heterogeneous Spatiotemporal Graph Module):
Integrates convolutional features with spatiotemporal reasoning via three Max-Relative GNNs (baseline spatial, follow-up spatial, and temporal).
Conditional gating mechanisms modulate edge influence based on spatial and temporal attributes.
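To make the HSTGM description more concrete, the sketch below shows the max-relative aggregation used in ViG-style graph layers: each node concatenates its own feature with the element-wise maximum of the feature differences to its neighbours, then applies a linear map. The gating shown here is an assumption for illustration only. The exact form of the conditional gates is not specified above, so df_gate and the choice to scale messages before the max are hypothetical, as are all function and parameter names.

```python
import numpy as np

def max_relative_conv(x_dst, x_src, neighbours, weight, gate=None):
    """Max-relative graph convolution (sketch).

    x_dst:      (N, C) features of the nodes being updated.
    x_src:      (M, C) features of the nodes sending messages.
    neighbours: (N, K) integer indices into x_src.
    weight:     (2C, C_out) linear map applied to [x_i, max_j (x_j - x_i)].
    gate:       optional (N, K) values in [0, 1] that scale each message
                (hypothetical stand-in for the conditional gating).
    """
    rel = x_src[neighbours] - x_dst[:, None, :]   # (N, K, C) relative features
    if gate is not None:
        rel = rel * gate[:, :, None]
    agg = rel.max(axis=1)                          # (N, C) max over neighbours
    return np.concatenate([x_dst, agg], axis=1) @ weight

def df_gate(df_embed, w_gate):
    """Hypothetical gate: sigmoid of a linear projection of each DF embedding."""
    return 1.0 / (1.0 + np.exp(-(df_embed @ w_gate)))

# Temporal update: follow-up node i has exactly one neighbour, baseline node i.
rng = np.random.default_rng(0)
n, c, d = 5, 4, 3
x_base, x_follow = rng.normal(size=(n, c)), rng.normal(size=(n, c))
gate = df_gate(rng.normal(size=(n, d)), rng.normal(size=d))[:, None]  # (N, 1)
temporal_out = max_relative_conv(
    x_follow, x_base, np.arange(n)[:, None], rng.normal(size=(2 * c, c)), gate
)
```

The same function covers all three GNNs: for the two spatial graphs, x_dst and x_src are the same timepoint's features and neighbours comes from the KNN search; for the temporal graph, the source is the other timepoint.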
Loss and Training Strategy:
Trained patch-wise with heavy augmentation around lesion voxels to mitigate class imbalance.
Uses a 3D U-Net backbone extended with ViG-like spatiotemporal GNN modules.
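One common way to realize lesion-focused patch-wise training is to centre most training patches on a randomly chosen lesion voxel, with some jitter, and draw the rest uniformly. The sketch below illustrates that idea only; the sampling probability, the jitter range, and the function names are assumptions, not details taken from DEFUSE-MS.

```python
import numpy as np

def sample_patch_corner(label, patch_size, rng, lesion_prob=0.8):
    """Return the corner index of a training patch (sketch).

    With probability lesion_prob the patch is placed around a random lesion
    voxel (with jitter); otherwise it is placed uniformly at random.
    """
    shape = np.array(label.shape)
    size = np.array(patch_size)
    lesion_voxels = np.argwhere(label > 0)
    if len(lesion_voxels) > 0 and rng.random() < lesion_prob:
        centre = lesion_voxels[rng.integers(len(lesion_voxels))]
        # Jitter keeps the lesion inside the patch but off the exact centre.
        jitter = rng.integers(-(size // 4), size // 4 + 1)
        corner = centre - size // 2 + jitter
    else:
        corner = rng.integers(0, shape - size + 1)
    # Keep the patch fully inside the volume.
    return np.clip(corner, 0, shape - size)

def extract_patch(volume, corner, patch_size):
    """Crop the same spatial window from any co-registered volume."""
    window = tuple(slice(int(c), int(c) + int(s)) for c, s in zip(corner, patch_size))
    return volume[window]
```

Because baseline, follow-up, DF, and label volumes are co-registered, the same corner can be passed to extract_patch for each of them so that the crops stay aligned.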
Methodology Overview

Data:
MSSEG-II challenge dataset, including 100 patients scanned across 15 centers.
Uses the updated ground truth (GT) with new-lesion annotations.
Preprocessing:
Rigid registration, brain extraction (ROBEX), N4 bias field correction, and histogram matching.
A DF is estimated with the Demons algorithm for each patient's baseline and follow-up scan pair.
Graph Generation:
Patch-level feature maps are extracted from the baseline scan, the follow-up scan, and the DF.
These are assembled into a heterogeneous graph whose node and edge embeddings encode spatiotemporal relationships.
Inference:
Sliding-window, patch-based inference is used to predict the full volume.
Graph outputs are combined with the decoder to produce the final segmentation.
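Sliding-window inference can be sketched as follows: overlapping patches are predicted one at a time and the overlapping probabilities are averaged. This is a generic illustration under stated assumptions. The patch size, stride, channels-first input layout, and the predict_patch callable (standing in for the trained network) are hypothetical, and the volume is assumed to be at least one patch wide in every dimension.

```python
import numpy as np

def window_starts(length, patch, stride):
    """Start indices that cover [0, length) with windows of size `patch`."""
    starts = list(range(0, length - patch + 1, stride))
    if starts[-1] != length - patch:
        starts.append(length - patch)  # last window sits flush with the border
    return starts

def sliding_window_predict(volume, predict_patch, patch=(32, 32, 32), stride=(16, 16, 16)):
    """Average overlapping patch predictions into a full-volume probability map.

    volume:        (C, D, H, W) stacked inputs (e.g. baseline, follow-up, DF).
    predict_patch: callable mapping a (C, d, h, w) patch to (d, h, w) probabilities.
    """
    spatial = volume.shape[1:]
    prob_sum = np.zeros(spatial, dtype=float)
    count = np.zeros(spatial, dtype=float)
    for z in window_starts(spatial[0], patch[0], stride[0]):
        for y in window_starts(spatial[1], patch[1], stride[1]):
            for x in window_starts(spatial[2], patch[2], stride[2]):
                win = (slice(z, z + patch[0]), slice(y, y + patch[1]), slice(x, x + patch[2]))
                prob_sum[win] += predict_patch(volume[(slice(None),) + win])
                count[win] += 1.0
    return prob_sum / count  # every voxel is covered at least once
```

Averaging overlapping windows reduces seams at patch borders; a threshold on the averaged map then gives the binary new-lesion mask.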
Results
State-of-the-Art Performance: Achieved an F1 score of 0.65, a sensitivity of 0.74, and a Dice score of 0.55 on the MSSEG-II test set.
Learned DF Embeddings Improve Accuracy: Outperformed models using max-pooled DFs or no temporal information.
Strong Generalization: Few false positives in cases without new lesions (mean FP volume of 1.5 mm³).
Robust to Anatomical Variability: Combines spatial edge features and temporal learning for balanced sensitivity and specificity.
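For readers less familiar with the reported metrics, the sketch below shows how Dice, sensitivity, F1, and false-positive volume are commonly computed. It is illustrative only: it assumes Dice is computed on voxels and that F1 and sensitivity are computed from counts of true-positive, false-positive, and false-negative detections, with the matching of predicted to annotated lesions done beforehand. The function names and the convention of returning 1.0 for Dice when both masks are empty are choices made for this example.

```python
import numpy as np

def dice(pred, gt):
    """Voxel-wise Dice overlap, 2|P ∩ G| / (|P| + |G|)."""
    pred, gt = np.asarray(pred, dtype=bool), np.asarray(gt, dtype=bool)
    denom = pred.sum() + gt.sum()
    if denom == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * np.logical_and(pred, gt).sum() / denom

def detection_scores(tp, fp, fn):
    """F1 and sensitivity from detection counts (lesion matching done elsewhere)."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return f1, sensitivity

def fp_volume_mm3(pred, voxel_size_mm):
    """Predicted volume in mm³; for a no-lesion case this is all false positive."""
    return float(np.asarray(pred, dtype=bool).sum() * np.prod(voxel_size_mm))
```

Dice is undefined for cases without new lesions in the ground truth, which is why such cases are typically summarized by false-positive volume instead.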
Ablation Findings
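Temporal Edge Modeling Matters: Removing the DF or replacing the learned DF embeddings with max-pooled DFs lowers performance on all metrics.
Spatial Edges Help When Temporal Context Exists: Without learned DF embeddings, spatial edges increase false positives; their benefit appears only when they are combined with learned temporal edges.
Best Combination: Learned DF embeddings together with spatial edges are the most effective configuration.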
Conclusion
DEFUSE-MS offers a novel, interpretable, and high-performing approach to detecting new MS lesions in longitudinal MRI. Its deformation field–guided spatiotemporal graph design models lesion evolution between timepoints and outperforms both CNN- and Transformer-based baselines. Using learned DF embeddings as edge attributes suggests a new direction for modeling temporal change in neurological disease imaging. Future directions include adding multi-modal inputs and validating generalizability across more diverse cohorts.