FAU Erlangen-Nürnberg
Institute of Micro- and Nanostructure Research
notebooks/week13_explainability.ipynb: train two tiny CNNs on 32×32 synthetic EM images. The clean model learns a real centre-disk defect (gradient saliency inside defect = 0.627, outside = 0.195, ratio 3.2×); the shortcut model learns a spurious corner artifact (corner saliency = 0.500, defect-region saliency = 0.054, ratio 9.3×). Occlusion saliency (patch=4) amplifies the contrast: inside = 0.698, outside = 0.006, ratio 109×. Both models achieve ≥ 0.99 test accuracy, so accuracy alone cannot detect the shortcut.
Global explanation averaged over 200 test images (left) vs local explanation for a single edge-crack image (right): different features dominate.
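The occlusion saliency named in the notebook summary (patch=4) can be sketched as follows. This is a minimal illustration, not the notebook's code: `toy_predict` is a hypothetical stand-in for a trained classifier, and the fill value of 0 and the synthetic image are assumptions.

```python
import numpy as np

def occlusion_saliency(predict, image, patch=4, fill=0.0):
    """Confidence drop when each patch x patch tile is replaced by `fill`."""
    base = predict(image)
    saliency = np.zeros_like(image, dtype=float)
    height, width = image.shape
    for r in range(0, height, patch):
        for c in range(0, width, patch):
            occluded = image.copy()
            occluded[r:r + patch, c:c + patch] = fill
            # A large drop means the model relied on this tile.
            saliency[r:r + patch, c:c + patch] = base - predict(occluded)
    return saliency

def toy_predict(img):
    """Hypothetical classifier: confidence grows with centre brightness."""
    centre = img[12:20, 12:20].mean()
    return 1.0 / (1.0 + np.exp(-8.0 * (centre - 0.5)))

rng = np.random.default_rng(0)
image = 0.1 * rng.random((32, 32))
image[12:20, 12:20] += 0.9          # bright centre "defect" (assumed shape)

sal = occlusion_saliency(toy_predict, image, patch=4)

defect_mask = np.zeros((32, 32), dtype=bool)
defect_mask[12:20, 12:20] = True
inside = sal[defect_mask].mean()
outside = sal[~defect_mask].mean()
print(f"inside = {inside:.3f}, outside = {outside:.3f}")
```

Because occlusion only needs forward passes, the same function works for models that are not differentiable, which is why it is useful as an independent check on gradient saliency.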
From interpretable (left, green) to explainable (right, orange): increasing model capacity requires post-hoc methods to recover a human-readable explanation.
Permutation importance for six features of the EM defect classifier (mean accuracy drop over 50 shuffles). Blue = physical EM feature; red/orange = artifact. The corner artifact has negligible importance because the clean model learned to ignore it.
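Permutation importance as described in the caption (mean accuracy drop over 50 shuffles) can be sketched with NumPy alone. The two-feature toy data and `toy_predict` below are hypothetical; the EM classifier and its six features are not reproduced here.

```python
import numpy as np

def permutation_importance(predict, X, y, n_repeats=50, seed=0):
    """Mean accuracy drop when one feature column is shuffled at a time."""
    rng = np.random.default_rng(seed)
    base_acc = np.mean(predict(X) == y)
    drops = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling breaks the link between feature j and the label.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            drops[j] += base_acc - np.mean(predict(X_perm) == y)
    return drops / n_repeats

def toy_predict(X):
    """Hypothetical model that uses feature 0 and ignores feature 1."""
    return (X[:, 0] > 0).astype(int)

rng = np.random.default_rng(1)
X = rng.normal(size=(400, 2))
y = (X[:, 0] > 0).astype(int)

importance = permutation_importance(toy_predict, X, y)
print(importance)
```

A feature the model ignores gets an importance of exactly zero, which mirrors the negligible bar of the corner artifact for the clean model.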
logit.backward(), then read input.grad: two lines of code added to any trained CNN.
Left: EM image with lime-circle defect region. Centre: gradient saliency map. Right: overlay. The clean model’s saliency concentrates inside the defect (inside = 0.627, outside = 0.195, ratio ≈ 3.2×), so the model attends to the physically meaningful region.
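The two-line recipe above can be sketched in PyTorch. The small network below is an untrained, hypothetical stand-in (the notebook's architecture is not given in the source), and taking the absolute gradient with max-normalisation is one common convention, assumed here.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical tiny CNN; in practice this would be the trained model.
model = nn.Sequential(
    nn.Conv2d(1, 4, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(4, 2),
)
model.eval()

# The input must track gradients so that .grad is populated.
image = torch.rand(1, 1, 32, 32, requires_grad=True)

logits = model(image)
target = logits.argmax(dim=1).item()
logit = logits[0, target]

logit.backward()                          # line 1: backpropagate the class logit
saliency = image.grad.abs().squeeze()     # line 2: read the input gradient

# Optional normalisation to [0, 1] for plotting.
saliency = saliency / (saliency.max() + 1e-12)
print(saliency.shape)
```

Backpropagating the logit rather than the softmax probability avoids the saturation that can flatten gradients for confident predictions.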
Left: EM image with lime-circle defect and corner artifact (red box). Centre: gradient saliency map. Right: overlay. The shortcut model’s saliency concentrates at the corner artifact, not the defect (corner = 0.500, defect region = 0.054, ratio ≈ 9.3×) — spatially coherent but physically wrong. Accuracy 0.990.
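The region ratios quoted in these captions are mean saliency in one mask divided by mean saliency in another. A minimal sketch, assuming a disk of radius 5 for the defect and a 4×4 corner box (both mask shapes are hypothetical), with a synthetic saliency map filled with the caption's shortcut-model values:

```python
import numpy as np

def region_mean(saliency, mask):
    """Average saliency over the pixels selected by a boolean mask."""
    return float(saliency[mask].mean())

yy, xx = np.mgrid[:32, :32]
defect_mask = (yy - 16) ** 2 + (xx - 16) ** 2 <= 5 ** 2   # assumed disk
corner_mask = np.zeros((32, 32), dtype=bool)
corner_mask[:4, :4] = True                                # assumed box

# Synthetic map: background level 0.02 is illustrative only.
saliency = np.full((32, 32), 0.02)
saliency[defect_mask] = 0.054    # defect-region value from the caption
saliency[corner_mask] = 0.500    # corner value from the caption

corner = region_mean(saliency, corner_mask)
defect = region_mean(saliency, defect_mask)
ratio = corner / defect

# Assumed rule of thumb: more saliency on the artifact than on the defect.
shortcut_suspected = ratio > 1.0
print(f"corner/defect ratio = {ratio:.1f}, shortcut suspected: {shortcut_suspected}")
```

The threshold of 1.0 is an assumption for illustration; a real check would compare against the ratio measured on a model known to be clean.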
SHAP feature attributions for the EM defect classifier: texture score dominates (φ=0.41); corner brightness has slightly negative attribution (φ=−0.03) — the clean model learned to essentially ignore the artifact.
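To make the signed attributions φ concrete, Shapley values can be computed exactly by enumerating feature subsets when there are only a few features. This sketch is not the `shap` library: it uses a single all-zero baseline, a hypothetical three-feature scoring function, and made-up coefficients, so the numbers do not correspond to the figure.

```python
import itertools
import math
import numpy as np

def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x relative to a single baseline."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their value from x, the rest from baseline.
        z = baseline.copy()
        for j in subset:
            z[j] = x[j]
        return f(z)

    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            weight = (math.factorial(k) * math.factorial(n - k - 1)
                      / math.factorial(n))
            for subset in itertools.combinations(others, k):
                phi[i] += weight * (value(subset + (i,)) - value(subset))
    return phi

def toy_score(z):
    """Hypothetical score: texture, edge contrast, corner brightness."""
    texture, edge, corner = z
    return 0.8 * texture + 0.1 * texture * edge - 0.05 * corner

x = np.array([1.0, 1.0, 1.0])
baseline = np.zeros(3)

phi = shapley_values(toy_score, x, baseline)
print(phi, phi.sum(), toy_score(x) - toy_score(baseline))
```

The attributions sum to the difference between the prediction and the baseline prediction (the completeness property), and a feature that lowers the score receives a negative φ, as the corner brightness does in the caption.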
[CLS]) over all patches → a spatial heat map showing which image regions the transformer “focused on.”
Left: synthetic EM grain image. Right: transformer attention map, with attention peaking along grain-boundary locations (high attention = yellow). Attention provides a spatial explanation without a backward pass.
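Turning attention weights into a spatial heat map can be sketched as follows. The assumptions are loud ones: token 0 is the [CLS] token, the image is split into an 8×8 grid of 4×4 patches, heads are averaged, and the queries and keys are random stand-ins for a real transformer layer.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along one axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
n_heads, grid, dim, patch = 4, 8, 16, 4
n_tokens = 1 + grid * grid               # [CLS] plus one token per patch

# Random queries/keys stand in for a real layer's projections.
q = rng.normal(size=(n_heads, n_tokens, dim))
k = rng.normal(size=(n_heads, n_tokens, dim))
attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(dim))   # (heads, tokens, tokens)

# Row 0 is the [CLS] query; drop its attention to itself, average heads.
cls_to_patches = attn[:, 0, 1:].mean(axis=0)
heat = cls_to_patches.reshape(grid, grid)

# Upsample each patch weight to its pixel footprint for overlay.
heat_full = np.kron(heat, np.ones((patch, patch)))
print(heat.shape, heat_full.shape)
```

No gradient is computed anywhere, which is the sense in which attention gives a spatial explanation without a backward pass.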
Latent-space illustration of training distribution (blue ellipse) vs OOD test inputs (red X = novel grain morphology, orange triangle = contamination/damage). The model is reliable only inside the ellipse.
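One way to make "inside the ellipse" operational is a Mahalanobis distance to the training latents with a quantile threshold. The source does not say which OOD score the course uses, so this is an assumed technique; the 2-D Gaussian latents, the 99 % quantile, and the two test points are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-D latent codes of the training set.
train_latents = rng.multivariate_normal(
    mean=[0.0, 0.0], cov=[[2.0, 0.8], [0.8, 1.0]], size=500
)

mu = train_latents.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(train_latents, rowvar=False))

def mahalanobis(z):
    """Distance of each row of z from the training distribution."""
    d = np.atleast_2d(z) - mu
    return np.sqrt(np.einsum("ij,jk,ik->i", d, cov_inv, d))

# Ellipse boundary: distance that contains 99 % of training points (assumed).
threshold = np.quantile(mahalanobis(train_latents), 0.99)

in_distribution = np.array([0.5, 0.2])
novel_morphology = np.array([6.0, -4.0])
is_ood = mahalanobis(np.stack([in_distribution, novel_morphology])) > threshold
print(is_ood)
```

A Gaussian ellipse is a crude model of a learned latent space, so a flag from this check is best read as a prompt to inspect the input rather than a verdict.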
Three trust failures from the course: data leakage (W4, red), poor calibration (W9, orange), and hallucination (W12, purple). All share one root cause: the model learned a spurious statistical association instead of the physical mechanism.
GroupKFold(n_splits=5), with groups=specimen_id passed to split(). Saliency diagnostic: if the saliency hotspot is at a scan artifact rather than the physical defect, suspect leakage.

| Method | What it shows | Works on | Best for EM |
|---|---|---|---|
| Permutation importance | Which feature degrades accuracy most | Any model | Tabular descriptors (composition, process params) |
| Input-gradient saliency | Which pixels have highest sensitivity | Differentiable CNN | Fast, cheap image attribution |
| Grad-CAM (Selvaraju et al., 2017) | Which conv feature-map regions activate | CNN | Spatial localisation, smoother than pixel saliency |
| Occlusion saliency | Which patch reduces confidence most | Any model | Robust cross-validation of gradient saliency |
| SHAP (Lundberg et al., 2017) | Signed, complete attribution per feature | Any model | Global feature importance, correlated features |
| Integrated Gradients (Sundararajan et al., 2017) | Pixel-complete attribution along baseline path | Differentiable CNN | Per-pixel attribution with mathematical guarantee |
| Attention maps (Vaswani et al., 2017) | Which image patches the transformer attended to | Transformers | ViTs for STEM/SEM images |
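The specimen-wise split mentioned before the table can be sketched with scikit-learn's `GroupKFold`, where the groups are passed to `split()`. The data here are synthetic and the sizes (10 specimens, 6 images each, 4 features) are assumptions; only the name `specimen_id` comes from the slides.

```python
import numpy as np
from sklearn.model_selection import GroupKFold

rng = np.random.default_rng(0)

# Hypothetical dataset: 10 specimens with 6 images each.
n_specimens, per_specimen = 10, 6
specimen_id = np.repeat(np.arange(n_specimens), per_specimen)
X = rng.normal(size=(specimen_id.size, 4))
y = rng.integers(0, 2, size=specimen_id.size)

cv = GroupKFold(n_splits=5)
overlaps = []
for train_idx, test_idx in cv.split(X, y, groups=specimen_id):
    # No specimen may contribute images to both sides of a fold.
    shared = set(specimen_id[train_idx]) & set(specimen_id[test_idx])
    overlaps.append(len(shared))

print(overlaps)
```

With a plain `KFold`, images of the same specimen would typically land in both train and test folds, which is the leakage route the saliency diagnostic is meant to expose.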
The 13-week arc: from raw EM arrays to trustworthy, explainable science. Each step added a new layer of rigour.
_shared/exam_mustknow.md: 10 statements per week, all 13 weeks now filled. Study each statement; be able to explain, apply, or calculate.
_shared/miniproject.md: dataset options A–D, timeline, deliverables, rubric.
_shared/exam_mustknow.md Week 9 statements.
jupyter nbconvert --to notebook --execute your_notebook.ipynb must exit 0.
_shared/exam_mustknow.md · _shared/miniproject.md
©Philipp Pelz - FAU Erlangen-Nürnberg - Data Science for Electron Microscopy