AI 4 Materials / KI-Materialtechnologie
FAU Erlangen-Nürnberg


Note
Why not LIME? LIME’s ad hoc perturbations and local linear surrogate do not satisfy the basic attribution axioms (completeness, sensitivity) that IG satisfies by construction. The 2026 syllabus uses SHAP + IG, not LIME.
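The completeness axiom is easy to verify numerically. The sketch below, using an illustrative quadratic model and a midpoint-rule path integral (both chosen for this example, not from the course code), checks that IG attributions sum to \(f(x) - f(\text{baseline})\):

```python
import numpy as np

# Toy model f(x) = sum_i w_i * x_i^2 with hypothetical weights w.
def f(x, w):
    return float(np.sum(w * x**2))

def grad_f(x, w):
    return 2 * w * x

def integrated_gradients(x, baseline, w, steps=200):
    # Midpoint-rule approximation of
    # IG_i = (x_i - b_i) * integral_0^1 df/dx_i(b + a (x - b)) da
    alphas = (np.arange(steps) + 0.5) / steps
    total = np.zeros_like(x)
    for a in alphas:
        total += grad_f(baseline + a * (x - baseline), w)
    return (x - baseline) * total / steps

w = np.array([0.5, -1.0, 2.0])
x = np.array([1.0, 2.0, 3.0])
b = np.zeros(3)
ig = integrated_gradients(x, b, w)

# Completeness: attributions sum to f(x) - f(baseline).
assert abs(ig.sum() - (f(x, w) - f(b, w))) < 1e-6
```

Running the same check with a LIME-style local linear fit gives no such guarantee; the surrogate's coefficients need not sum to the model's output difference.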
SHAP and IG tell us which inputs matter. They do not tell us what concept a hidden layer carries.
Sparse Autoencoders (SAEs) [@templeton_2024_scaling] decompose layer activations \(h\) into an over-complete, sparsely active dictionary of features: \[ f = \mathrm{ReLU}(Eh - b), \quad \hat h = D f, \quad \mathcal L = \|h - \hat h\|_2^2 + \lambda \|f\|_1. \] The \(\ell_1\) penalty acts on the post-ReLU feature activations \(f\), so only a few dictionary entries fire per input. Top-activating inputs per SAE feature tend toward monosemantic concepts: grain-boundary curvature, oxide stripe orientation, sample-tilt artefact.
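A minimal NumPy sketch of the forward pass and loss above; the dimensions, random weights, and \(\lambda\) are illustrative, not from any trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

d, m = 16, 64                     # layer width d, over-complete dictionary m > d
E = rng.normal(0, 0.1, (m, d))    # encoder weights (illustrative)
D = rng.normal(0, 0.1, (d, m))    # decoder = dictionary of feature directions
b = np.zeros(m)                   # encoder bias
lam = 1e-3                        # sparsity strength (illustrative)

h = rng.normal(size=d)            # one layer-activation vector
feat = np.maximum(E @ h - b, 0.0) # sparse feature activations f = ReLU(Eh - b)
h_hat = D @ feat                  # reconstruction h_hat = D f
loss = np.sum((h - h_hat) ** 2) + lam * np.sum(np.abs(feat))
```

In practice the dictionary columns of `D` are what gets inspected: for each feature, collect the inputs that drive its activation highest and look for a shared concept.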
Why this matters for materials certification
An SAE is exactly the Unit-5 autoencoder plus an \(\ell_1\) activation penalty. Architecture unchanged; loss adds one term.
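In code, "loss adds one term" is literally a one-line change. Function names below are illustrative stand-ins, not the Unit-5 course code:

```python
import numpy as np

def ae_loss(h, h_hat):
    # plain autoencoder: reconstruction error ||h - h_hat||_2^2
    return float(np.sum((h - h_hat) ** 2))

def sae_loss(h, h_hat, f, lam=1e-3):
    # identical reconstruction term + lam * ||f||_1 on the activations
    return ae_loss(h, h_hat) + lam * float(np.sum(np.abs(f)))

h = np.array([1.0, -2.0])
h_hat = np.array([0.5, -1.5])
f = np.array([3.0, 0.0, 0.0])   # sparse feature activations
# sae_loss exceeds ae_loss by exactly lam * ||f||_1
```

Because the forward pass is unchanged, any trained Unit-5 autoencoder can be converted to an SAE by swapping the loss and retraining.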
Honest limits.
Note
SHAP/IG: per-prediction explanation. SAE: global model audit. Different tools, complementary roles. Both are in the 2026 toolbox.



© Philipp Pelz - ML for Characterization and Processing