Opening — Why this matters now

Medical AI is no longer struggling with accuracy. In constrained tasks like MRI-based brain tumour detection, convolutional neural networks routinely cross the 90% mark. The real bottleneck has shifted elsewhere: trust. When an algorithm flags—or misses—a tumour, clinicians want to know why. And increasingly, a single colourful heatmap is not enough.

This paper tackles that problem head-on. Instead of asking which explainability method is best, it asks a more pragmatic question: what happens when we layer them?

Background — The limits of single-method explainability

Explainable AI (XAI) has become mandatory rather than optional in healthcare. Grad-CAM highlights regions. LRP drills down to pixels. SHAP assigns numerical responsibility. Each method, on its own, offers a partial story.

The problem is that partial stories can mislead. Broad saliency maps may overgeneralise. Pixel-level relevance can overwhelm. Feature attribution without spatial grounding becomes abstract. Prior literature tends to deploy these tools in isolation, implicitly assuming that one lens is sufficient.

This assumption breaks down precisely where clinical risk is highest: borderline cases, partial tumours, and ambiguous slices.

Analysis — What the paper actually does

The authors design a custom CNN trained on FLAIR MRI slices derived from the BraTS 2021 dataset. Instead of relying on transfer learning, they build a lightweight architecture tailored to brain-only imaging, achieving 91.24% test accuracy after architectural and hyperparameter refinement.
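The paper's exact layer configuration is not reproduced in this review, but to make "lightweight architecture" concrete, here is a minimal PyTorch sketch of the kind of model it describes: a few convolutional blocks over single-channel FLAIR slices feeding a small classifier head. The 128×128 input size, channel widths, and dropout rate are illustrative assumptions, not the authors' values.

```python
# Illustrative sketch only: a lightweight CNN for single-channel FLAIR slices.
# Input size, channel widths, and dropout are assumptions, not the paper's values.
import torch
import torch.nn as nn

class LightweightTumourCNN(nn.Module):
    """Small CNN for binary tumour / no-tumour classification of 2D slices."""
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 128 -> 64
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 64 -> 32
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 32 -> 16
        )
        self.classifier = nn.Sequential(
            nn.Linear(64 * 16 * 16, 128), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(128, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        x = torch.flatten(x, 1)          # flatten spatial features for the classifier head
        return self.classifier(x)

model = LightweightTumourCNN()
print(model(torch.randn(1, 1, 128, 128)).shape)  # torch.Size([1, 2])
```

A model this small is cheap to probe layer by layer, which matters once you start stacking explanation methods on top of it.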

But performance is not the headline. The contribution lies in the explanation stack.

The workflow is deliberately structured:

  1. Grad-CAM identifies coarse regions of interest at the final convolutional layer.
  2. LRP propagates relevance backward to expose pixel-level contribution.
  3. SHAP quantifies how much each region supports or contradicts the tumour prediction.

These outputs are then presented side-by-side, producing a layered interpretability panel rather than a single explanatory artifact.
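The review does not tie the workflow to a specific toolkit, so the sketch below is a hedged illustration of the three-layer stack using Captum against the illustrative LightweightTumourCNN above: LayerGradCam supplies the coarse region map, LRP the pixel-level relevance, and GradientShap stands in as one gradient-based approximation of SHAP values. The authors' actual implementation and SHAP variant may differ.

```python
# Hedged sketch of the layered explanation stack, using Captum as the tooling
# (an assumption; the paper's implementation may differ). Reuses the
# illustrative LightweightTumourCNN defined above.
import torch
from captum.attr import LayerGradCam, LayerAttribution, LRP, GradientShap

model = LightweightTumourCNN().eval()
slice_tensor = torch.randn(1, 1, 128, 128)   # stand-in for one preprocessed FLAIR slice
target = 1                                   # index of the "tumour" class

# 1. Grad-CAM: coarse regions of interest at the final convolutional layer.
last_conv = model.features[6]                # third Conv2d in the illustrative model
cam = LayerGradCam(model, last_conv).attribute(slice_tensor, target=target)
cam = LayerAttribution.interpolate(cam, (128, 128))       # upsample to image resolution

# 2. LRP: relevance propagated backward to individual pixels.
lrp_map = LRP(model).attribute(slice_tensor, target=target)

# 3. SHAP-style attribution: GradientShap approximates how strongly each pixel
#    pushes the prediction toward or away from the tumour class.
baselines = torch.zeros(10, 1, 128, 128)     # black-image baseline distribution
shap_map = GradientShap(model).attribute(slice_tensor, baselines=baselines, target=target)

# The three maps form one interpretability panel for the same slice.
for name, attr in {"gradcam": cam, "lrp": lrp_map, "shap": shap_map}.items():
    print(name, tuple(attr.shape))
```

Rendering the three attribution maps side by side over the input slice is what turns separate outputs into the layered panel the paper describes.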

Findings — Accuracy improves, but insight improves more

Quantitatively, the improved model outperforms its baseline across all major metrics:

Metric      Original Model   Improved Model
Accuracy    84.76%           91.24%
Precision   0.919            0.961
Recall      0.762            0.862
F1-score    0.833            0.909
AUC         0.92             0.96
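A quick arithmetic check: the reported F1-scores are consistent with the precision and recall columns, since F1 is their harmonic mean.

```python
# Sanity check: F1 is the harmonic mean of precision and recall.
def f1(precision: float, recall: float) -> float:
    return 2 * precision * recall / (precision + recall)

print(round(f1(0.919, 0.762), 3))  # 0.833 (original model)
print(round(f1(0.961, 0.862), 3))  # 0.909 (improved model)
```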

Qualitatively, the explainability results are more interesting.

  • Clear tumours: All three XAI methods align, reinforcing confidence.
  • Non-tumour cases: Attention and relevance remain diffuse, with SHAP heavily weighted against tumour classification.
  • Partial tumours: This is where the framework earns its keep. Grad-CAM often appears uncertain, LRP reveals focused micro-patterns, and SHAP confirms that a majority of features still support tumour presence—even when visibility is poor.

In other words, disagreement between methods becomes a signal, not a flaw.
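The paper surfaces that signal visually, through the side-by-side panel. If one wanted to operationalise it, a simple approach (entirely illustrative, not something the paper claims to do) is to threshold each attribution map and score pairwise overlap, routing low-agreement slices to a human reader.

```python
# Illustrative only: turn cross-method disagreement into a number by
# thresholding each attribution map and measuring pairwise IoU overlap.
import itertools
import numpy as np

def binarise(attr: np.ndarray, quantile: float = 0.9) -> np.ndarray:
    """Keep the top fraction of absolute attribution values as a binary mask."""
    a = np.abs(attr)
    return a >= np.quantile(a, quantile)

def iou(a: np.ndarray, b: np.ndarray) -> float:
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return float(inter / union) if union else 1.0

def agreement_score(maps: dict) -> float:
    """Mean pairwise IoU across explanation methods; low values flag disagreement."""
    masks = [binarise(m) for m in maps.values()]
    pairs = list(itertools.combinations(masks, 2))
    return float(np.mean([iou(a, b) for a, b in pairs]))

# Example with random arrays standing in for Grad-CAM, LRP, and SHAP outputs.
rng = np.random.default_rng(0)
maps = {name: rng.normal(size=(128, 128)) for name in ("gradcam", "lrp", "shap")}
print(f"agreement = {agreement_score(maps):.2f}")  # low score -> route slice to a human reader
```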

Implications — From explanations to decision support

The combined XAI framework does not pretend to replace clinicians. Instead, it behaves like a second reader with annotated reasoning.

Practically, this matters in three ways:

  • Borderline cases benefit from layered confirmation rather than binary confidence.
  • False negatives become auditable, often traceable to image quality or partial visibility.
  • Model uncertainty is exposed rather than hidden, inviting human intervention.

That said, limitations remain. The system operates on single 2D slices, uses only FLAIR sequences, and demands cognitive effort to interpret multiple explanation layers. This is not plug-and-play clinical software. It is a proof of concept that explainability scales better in combination than in isolation.

Conclusion — Explainability works best in stereo

This paper quietly dismantles the idea that there is a single “best” XAI method for medical imaging. Instead, it shows that interpretability behaves more like triangulation: confidence emerges when different explanatory perspectives converge—or when their divergence is itself informative.

For high-stakes AI systems, especially in healthcare, this layered approach may be the difference between explainability as decoration and explainability as infrastructure.

Cognaptus: Automate the Present, Incubate the Future.