Mathematical Foundations of AI & ML
Unit 14: Explainability, Limits, and Trust

Prof. Dr. Philipp Pelz

FAU Erlangen-Nürnberg

Title + Unit 14 positioning

  • The final lecture of Mathematical Foundations of AI & ML. {.fragment}
  • From physics-informed learning (Unit 13) to the question: can we trust our models? {.fragment}
  • We synthesize the entire 14-unit arc into a coherent methodology for trustworthy ML. {.fragment}

Learning outcomes for Unit 14

By the end of this lecture, students can:

  • explain why explainability is a scientific and industrial mandate, {.fragment}
  • distinguish semantic structures (synonyms, taxonomies, ontologies), {.fragment}
  • perform and interpret perturbation-based sensitivity analysis, {.fragment}
  • assess where ML adds value in causal process chains and where it fails. {.fragment}

Why explainability is non-negotiable

  • Science demands understanding, not just prediction — a model that cannot be questioned cannot be falsified. {.fragment}
  • Industry demands accountability — engineers must justify decisions to stakeholders. {.fragment}
  • Regulation demands transparency — EU AI Act requires explanations for high-risk AI systems. {.fragment}
  • Explainability is not optional — it is a prerequisite for deploying ML in engineering. {.fragment}

The black-box problem

  • Deep neural networks achieve remarkable accuracy but offer no explanation for individual predictions. {.fragment}
  • A model predicting “this alloy will fail” without explaining why is unacceptable for safety-critical decisions. {.fragment}
  • Engineers need to know which factors drive the prediction and how confident the model is. {.fragment}
  • The black-box problem motivates the entire field of explainable AI (XAI) [@neuer2024machine]. {.fragment}

Explainability vs interpretability

Interpretability

  • The model itself is transparent and understandable. {.fragment}
  • Examples: linear regression, decision trees, small rule sets. {.fragment}

Explainability

  • Post-hoc methods that reveal the reasoning of complex models. {.fragment}
  • Examples: SHAP values, sensitivity analysis, attention visualization. {.fragment}
  • Trade-off: interpretable models may be less accurate; explainability adds complexity to accurate models. {.fragment}

Who needs explanations?

  • Scientists: full understanding (all levels) — to build knowledge. {.fragment}
  • Engineers: process and prediction level — to make decisions. {.fragment}
  • Regulators: data provenance and prediction justification — to ensure compliance. {.fragment}
  • Operators: actionable recommendations — to adjust process parameters. {.fragment}
  • Different audiences need different types and depths of explanation. {.fragment}

The cost of unexplainability

  • Rejected by regulators (cannot approve what cannot be explained). {.fragment}
  • Distrusted by domain experts (they will use their own judgment instead). {.fragment}
  • Impossible to debug (when predictions fail, no path to diagnosis). {.fragment}
  • Liability risk (who is responsible when an unexplained model causes harm?). {.fragment}

Explainability as scientific method

  • Science progresses by proposing models, deriving predictions, and testing them. {.fragment}
  • A model that cannot be questioned cannot be falsified — it fails Popper’s criterion. {.fragment}
  • ML models that only predict without explanation are tools, not science. {.fragment}
  • Making ML explainable elevates it to a scientific methodology. {.fragment}

Course context

  • Every unit has built toward this moment: {.fragment}
    • Loss minimization (Unit 1): what does the model optimize? {.fragment}
    • Generalization (Unit 7): does it work on new data? {.fragment}
    • Uncertainty (Unit 12): how confident is it? {.fragment}
    • Physics (Unit 13): does it respect known laws? {.fragment}
    • Explainability (Unit 14): can we understand and trust it? {.fragment}

Roadmap of today’s 90 min

  • 10–25 min: Semantic structures — digitizing meaning. {.fragment}
  • 25–40 min: Six levels of explainability (E1–E6). {.fragment}
  • 40–55 min: Sensitivity analysis — perturbation and beyond. {.fragment}
  • 55–65 min: Causality in process chains. {.fragment}
  • 65–75 min: Data manifold limits and trust. {.fragment}
  • 75–87 min: Course retrospective — the 14-unit arc. {.fragment}

Digitizing meaning: the challenge

  • ML models operate on numbers (tensors, vectors, matrices). {.fragment}
  • Domain knowledge is encoded in language and relationships. {.fragment}
  • Bridging this gap requires semantic structures that formalize meaning. {.fragment}
  • Without semantic structures, models cannot be grounded in domain understanding [@neuer2024machine]. {.fragment}

Synonyms and controlled vocabularies

  • Different terms for the same concept: “yield strength” = “elastic limit” = “\(R_e\)”. {.fragment}
  • Controlled vocabulary: a standardized list of terms with defined meanings. {.fragment}
  • Without synonym resolution, models may treat the same property as two separate features. {.fragment}
  • First step in any data integration pipeline. {.fragment}
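A controlled vocabulary can start as a simple lookup table from raw terms to canonical feature names. A minimal sketch in Python; the entries are illustrative, not a real standard:

```python
# Minimal synonym resolution against a controlled vocabulary.
# The vocabulary entries are illustrative, not a real standard.
SYNONYMS = {
    "yield strength": "yield_strength",
    "elastic limit": "yield_strength",
    "Re": "yield_strength",
    "hardness": "hardness",
    "HRC": "hardness",  # Rockwell C scale is a measure of hardness
}

def canonicalize(term: str) -> str:
    """Map a raw column name to its canonical vocabulary entry."""
    key = term.strip()
    if key not in SYNONYMS:
        raise KeyError(f"unknown term {term!r}: extend the vocabulary")
    return SYNONYMS[key]

# Three raw columns collapse onto two canonical features:
print({c: canonicalize(c) for c in ["yield strength", "elastic limit", "HRC"]})
```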

Taxonomies: hierarchical classification

  • Organize concepts in parent-child hierarchies: {.fragment}
    • Material > Metal > Steel > Stainless Steel > 316L. {.fragment}
  • Taxonomies enable inheritance: properties of “Metal” apply to all sub-categories. {.fragment}
  • They structure domain knowledge and guide feature selection. {.fragment}
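Such a hierarchy is straightforward to encode. A minimal sketch with property inheritance; the parent links and properties are illustrative:

```python
# Minimal taxonomy: parent links plus properties inherited down the chain.
PARENT = {
    "316L": "Stainless Steel",
    "Stainless Steel": "Steel",
    "Steel": "Metal",
    "Metal": "Material",
}
PROPERTIES = {
    "Metal": {"electrically_conductive": True},
    "Steel": {"base_element": "Fe"},
    "Stainless Steel": {"corrosion_resistant": True},
}

def inherited_properties(concept: str) -> dict:
    """Collect properties along the parent chain; children override parents."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT.get(concept)
    props = {}
    for node in reversed(chain):  # apply from root down to the leaf
        props.update(PROPERTIES.get(node, {}))
    return props

print(inherited_properties("316L"))
# {'electrically_conductive': True, 'base_element': 'Fe', 'corrosion_resistant': True}
```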

Ontologies: structured knowledge graphs

  • An ontology defines concepts, relationships, and constraints: {.fragment}
    • “Alloy hasProperty tensileStrength” {.fragment}
    • “tensileStrength measuredIn MPa” {.fragment}
    • “grainSize affects yieldStrength” {.fragment}
  • Richer than taxonomies: capture arbitrary relationships, not just hierarchies. {.fragment}
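A minimal sketch of an ontology as subject-predicate-object triples with a wildcard query; the relations mirror the bullets above:

```python
# Ontology as a list of (subject, predicate, object) triples.
TRIPLES = [
    ("Alloy", "hasProperty", "tensileStrength"),
    ("tensileStrength", "measuredIn", "MPa"),
    ("grainSize", "affects", "yieldStrength"),
]

def query(subject=None, predicate=None, obj=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o) for (s, p, o) in TRIPLES
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]

# Which quantities does grain size affect?
print(query(subject="grainSize", predicate="affects"))
```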

Why ontologies matter for ML

  • Enable deductive reasoning: if the model’s prediction violates a known ontological relationship, flag it. {.fragment}
  • Guide feature engineering: ontological relationships suggest which features to include. {.fragment}
  • Support consistency checking: predictions must be consistent with domain constraints. {.fragment}
  • Provide a framework for communicating model behavior to domain experts. {.fragment}

Ontologies for feature engineering

  • Ontological relationships encode domain knowledge about what matters: {.fragment}
    • “Composition determines phase” → include composition features. {.fragment}
    • “Processing affects microstructure” → include processing parameters. {.fragment}
  • This connects to Unit 13 (physics-informed learning): ontologies formalize the physics knowledge. {.fragment}

Materials ontology example

  • Causal chain: Composition \(\to\) Processing \(\to\) Microstructure \(\to\) Properties. {.fragment}
  • This is a process ontology — each arrow represents a physical mechanism. {.fragment}
  • Models should respect this chain: predicting properties from composition is valid; the reverse is an ill-posed inverse problem. {.fragment}

Checkpoint: semantic structures

  • Question: Your model uses “hardness” and “HRC” as separate features. What semantic issue exists?
  • Answer: They are synonyms — “HRC” is the Rockwell C hardness scale, a measure of “hardness”. Including both double-counts the same information and may confuse the model.

The six levels of explainability (E1–E6)

  • A structured framework for matching explanation depth to audience and purpose.
  • Each level addresses a different question about the model and its predictions.
  • Comprehensive explainability requires addressing all six levels.
  • Not every audience needs every level — match the explanation to the recipient [@neuer2024machine].

E1: Data level

  • Question: “What data was used?” {.fragment}
  • Covers: data provenance, quality, completeness, representativeness, biases. {.fragment}
  • Why it matters: a model is only as good as its data — garbage in, garbage out. {.fragment}
  • Output: data documentation, distribution plots, missing data reports. {.fragment}

E2: Process level

  • Question: “What physical process does this model relate to?” {.fragment}
  • Covers: the engineering context, the physical system, the measurement setup. {.fragment}
  • Why it matters: predictions must be interpreted in the context of the physical process. {.fragment}
  • Output: process flow diagrams, variable definitions, physical constraints. {.fragment}

E3: Feature level

  • Question: “Which input features matter most?” {.fragment}
  • Covers: feature importance, feature selection rationale, sensitivity analysis. {.fragment}
  • Why it matters: identifies which measurements drive predictions — guides data collection and process control. {.fragment}
  • Output: feature importance rankings, sensitivity plots. {.fragment}

E4: Model level

  • Question: “How does the model work?” {.fragment}
  • Covers: architecture description, hyperparameter choices, training protocol, convergence diagnostics. {.fragment}
  • Why it matters: enables reproduction, debugging, and comparison with alternative models. {.fragment}
  • Output: model documentation, training curves, architecture diagrams. {.fragment}

E5: Prediction level

  • Question: “Why this specific prediction?” {.fragment}
  • Covers: local explanations for individual predictions. {.fragment}
  • Methods: LIME (local linear approximation), SHAP (Shapley values), perturbation analysis. {.fragment}
  • Output: “This sample is predicted high-strength because carbon content is high and grain size is small.” {.fragment}

E6: Decision level

  • Question: “What action should be taken?” {.fragment}
  • Covers: mapping predictions to actionable recommendations with confidence. {.fragment}
  • Why it matters: the ultimate purpose of the model is to inform decisions. {.fragment}
  • Output: “Increase sintering temperature by 20°C (confidence: 85%).” {.fragment}

Matching level to audience

| Audience | Primary levels | Example explanation |
|---|---|---|
| Operator | E2 + E6 | “Adjust temperature; model is 90% confident” |
| Data scientist | E3 + E4 | “Feature X has highest SHAP value; 3-layer MLP” |
| Regulator | E1 + E5 | “Data from 500 samples; prediction driven by grain size” |
| Scientist | All | Full documentation and methodology |
  • Different stakeholders require different depth and focus.
  • Explanations must be tailored to the user’s technical background and decision-making needs.

Perturbation-based sensitivity analysis

  • Perturb one input feature \(j\) by \(\Delta\); observe the change in output:

\[ S_j = \frac{|f(\mathbf{x} + \Delta \mathbf{e}_j) - f(\mathbf{x})|}{|\Delta|} \]

  • High sensitivity: the output changes strongly when this feature is perturbed.
  • Low sensitivity: the feature has little effect on the prediction.
  • Simple, model-agnostic, and intuitive.
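A minimal model-agnostic sketch of this formula; the toy `f` stands in for any trained predictor:

```python
import numpy as np

# One-at-a-time perturbation sensitivity at a single point x:
# S_j = |f(x + Delta * e_j) - f(x)| / |Delta|
def local_sensitivity(f, x, delta=1e-2):
    """Sensitivity of f to each input feature at the point x."""
    base = f(x)
    S = np.empty(len(x))
    for j in range(len(x)):
        x_pert = x.copy()
        x_pert[j] += delta  # perturb feature j only
        S[j] = abs(f(x_pert) - base) / abs(delta)
    return S

# Toy stand-in for a trained model:
f = lambda x: 3.0 * x[0] ** 2 + 0.1 * x[1]
print(local_sensitivity(f, np.array([1.0, 1.0])))  # feature 0 dominates
```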

Global vs local sensitivity

  • Global sensitivity: average \(S_j\) across many data points — which features matter on average.
  • Local sensitivity: \(S_j\) at a specific point — which features matter for this prediction.
  • Global sensitivity guides feature selection; local sensitivity explains individual predictions.

Sensitivity analysis in practice

  • Vary each feature by a small percentage (or a fixed step \(\pm\Delta\)) while holding others constant.
  • Record the output change for each perturbation.
  • Rank features by average output sensitivity.
  • Visualize as a bar chart: a “tornado plot” of feature sensitivities; see the sketch below.
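A sketch of the full loop, using an additive step \(\Delta\) rather than a percentage and a toy stand-in model; the printed ranking is what a tornado plot visualizes:

```python
import numpy as np

# Average one-at-a-time sensitivities over many points, then rank.
def sensitivities(f, X, delta=1e-2):
    """|f(x + Delta e_j) - f(x)| / |Delta| for each row x of X, feature j."""
    n, d = X.shape
    S = np.empty((n, d))
    for i, x in enumerate(X):
        base = f(x)
        for j in range(d):
            x_pert = x.copy()
            x_pert[j] += delta
            S[i, j] = abs(f(x_pert) - base) / abs(delta)
    return S

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 3))
f = lambda x: 3.0 * x[0] ** 2 + 0.5 * x[1] + 0.01 * x[2]

S_global = sensitivities(f, X).mean(axis=0)  # global = average of local
for j in np.argsort(S_global)[::-1]:         # tornado-plot ordering
    print(f"feature {j}: mean sensitivity {S_global[j]:.3f}")
```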

Feature importance from sensitivity

  • High sensitivity \(\Rightarrow\) important feature — changes in it strongly affect predictions.
  • Low sensitivity \(\Rightarrow\) unimportant feature — it can potentially be removed.
  • But: sensitivity alone does not imply causation — it reveals association.
  • Combine with domain knowledge to interpret importance.

Sensitivity analysis: limitations

  • Assumes independence: one-at-a-time perturbation misses feature interactions.
  • Linear approximation: sensitivity at one point may not represent the full landscape.
  • No causal information: sensitivity shows association, not mechanism.
  • For interactions: use Sobol indices or SHAP (more expensive, more informative).

Beyond perturbation: SHAP values (brief)

  • SHAP (SHapley Additive exPlanations): allocates prediction contribution to each feature using game theory.
  • Based on Shapley values: fair allocation of the “payout” (prediction) to “players” (features).
  • Accounts for feature interactions.
  • Computationally expensive but provides the most principled feature attribution.
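For feature set \(F\), the Shapley value of feature \(j\) is

\[ \phi_j = \sum_{S \subseteq F \setminus \{j\}} \frac{|S|!\,(|F|-|S|-1)!}{|F|!} \left[ v(S \cup \{j\}) - v(S) \right], \]

where \(v(S)\) is the model output with only the features in \(S\) known. A minimal sketch of the exact computation for a handful of features, using the common simplification of filling unknown features with background means (production SHAP libraries use more careful estimates):

```python
import numpy as np
from itertools import combinations
from math import factorial

def shapley_values(f, x, background):
    """Exact Shapley attribution of f(x); exponential in the feature count."""
    d = len(x)
    baseline = background.mean(axis=0)

    def v(S):
        # Model output with features in S taken from x, the rest from baseline.
        z = baseline.copy()
        z[list(S)] = x[list(S)]
        return f(z)

    phi = np.zeros(d)
    for j in range(d):
        others = [i for i in range(d) if i != j]
        for size in range(d):
            for S in combinations(others, size):
                weight = factorial(size) * factorial(d - size - 1) / factorial(d)
                phi[j] += weight * (v(S + (j,)) - v(S))
    return phi  # sums to f(x) - f(baseline)  (efficiency property)

f = lambda z: 2.0 * z[0] + z[0] * z[1]  # includes an interaction term
x = np.array([1.0, 1.0])
background = np.zeros((10, 2))
print(shapley_values(f, x, background))  # [2.5 0.5]
```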

Causality vs correlation

  • ML models find correlations: features that co-occur with the output.
  • But correlation \(\neq\) causation: confounders can create spurious patterns.
  • Example: ice cream sales correlate with drowning rates (confounder: temperature).
  • Causal claims require interventional data or domain knowledge.
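A minimal simulation of the ice-cream example (all coefficients illustrative): the raw correlation is strong, but it vanishes once the confounder is regressed out:

```python
import numpy as np

rng = np.random.default_rng(0)
temp = rng.normal(20.0, 5.0, size=1000)                # confounder
ice_cream = 2.0 * temp + rng.normal(0.0, 2.0, 1000)    # caused by temp
drownings = 0.5 * temp + rng.normal(0.0, 2.0, 1000)    # caused by temp

print(np.corrcoef(ice_cream, drownings)[0, 1])         # strongly positive

# Condition on the confounder: regress out temp, correlate the residuals.
res_ic = ice_cream - np.polyval(np.polyfit(temp, ice_cream, 1), temp)
res_dr = drownings - np.polyval(np.polyfit(temp, drownings, 1), temp)
print(np.corrcoef(res_ic, res_dr)[0, 1])               # near zero
```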

Causal process chains

  • In manufacturing: Composition \(\to\) Processing \(\to\) Microstructure \(\to\) Properties.
  • The arrow direction encodes causation: changing composition causes different microstructure.
```mermaid
graph LR
    A[Composition] --> B[Processing]
    B --> C[Microstructure]
    C --> D[Properties]
```
  • ML can model these links, but the causal direction is known from physics, not learned from data [@neuer2024machine].

Detection vs prediction

  • Detection: “This sample has low hardness” — pattern recognition from measurements. ML excels here.
  • Prediction: “Changing carbon content will increase hardness” — causal claim. Requires causal model.
  • Most ML models perform detection (interpolation). Prediction (extrapolation with causal claims) requires more.

Where ML adds value in causal chains

  • Within the training distribution: ML provides fast, accurate detection and interpolation.
  • At the boundaries: uncertainty quantification (Unit 12) flags unreliable predictions.
  • Beyond the distribution: causal models (physics, experiments) are needed.
  • ML is most valuable when combined with domain knowledge, not as a replacement for it.

Deductive reasoning with ontologies

  • If the ontology states “grain size affects yield strength” but the model assigns zero importance to grain size:
    • Either the data lacks variation in grain size, or
    • The model has a problem.
  • Ontological consistency checking catches such issues automatically.
  • This connects explainability to domain validation.
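A minimal sketch of such a check; the "affects" edges and importance scores are illustrative placeholders for a real ontology and a real model's feature attributions:

```python
# Flag features the ontology says matter but the model ignores.
AFFECTS = [("grainSize", "yieldStrength"), ("carbonContent", "yieldStrength")]
TARGET = "yieldStrength"

# e.g. global sensitivities or mean |SHAP| values from the trained model:
importance = {"grainSize": 0.0, "carbonContent": 0.41, "furnaceId": 0.12}

for feature, target in AFFECTS:
    if target == TARGET and importance.get(feature, 0.0) < 1e-6:
        print(f"WARNING: ontology says {feature} affects {target}, but the "
              f"model assigns it ~zero importance; check for missing "
              f"variation in {feature} or a modeling problem.")
```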

Checkpoint: causality

  • Question: Your model finds that ice cream sales predict drowning rates. What’s the issue?
  • Answer: Confounding variable — temperature causes both. The model found a correlation, not a causal relationship.

Data manifold limits

  • ML models are only reliable within the data manifold (training distribution).
  • Extrapolation: predicting outside the training range is unreliable — the model has no information there.
  • Detection: use latent space density (Unit 10), reconstruction error (Unit 9), GP uncertainty (Unit 12).
  • Never trust predictions in regions where the model has not seen data.

Detecting extrapolation

  • Latent space density (Unit 10): low density = far from training data = potential extrapolation.
  • Reconstruction error (Unit 9): high error = input differs from learned patterns.
  • GP uncertainty (Unit 12): wide uncertainty bands = no nearby training data.
  • Ensemble disagreement: models disagree = uncertain = possible extrapolation.
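A minimal sketch of one such detector, scoring queries by their mean distance to the \(k\) nearest training points; the 95th-percentile threshold is an illustrative choice:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X_train = rng.normal(0.0, 1.0, size=(500, 4))  # toy training set

# Calibrate on training data (column 0 of kneighbors is the point itself).
nn = NearestNeighbors(n_neighbors=6).fit(X_train)
d_train, _ = nn.kneighbors(X_train)
threshold = np.percentile(d_train[:, 1:].mean(axis=1), 95)

def flag_extrapolation(X_new):
    """True where a query lies unusually far from all training data."""
    d, _ = nn.kneighbors(X_new, n_neighbors=5)
    return d.mean(axis=1) > threshold

X_new = np.vstack([np.zeros((1, 4)), 10.0 * np.ones((1, 4))])  # in / out
print(flag_extrapolation(X_new))  # [False  True]
```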

Inductive bias and trust

  • Every model has inductive bias — assumptions built into the model structure.
  • Linear model: assumes linear relationships. NN: assumes smooth functions (spectral bias).
  • Trust requires understanding what the model assumes and testing where those assumptions fail.
  • Physics-informed models (Unit 13) make their assumptions explicit — a trust advantage.

When models should NOT be trusted

  • Extrapolation beyond the training distribution.
  • Confounded features where correlation \(\neq\) causation.
  • Insufficient training data (high epistemic uncertainty).
  • Missing physics (model violates known constraints).
  • Poor calibration (predicted confidence does not match observed accuracy).

Building trustworthy ML systems

  • Uncertainty quantification (Unit 12): know what you don’t know.
  • Explainability (Unit 14): understand why predictions are made.
  • Domain validation: check predictions against physical knowledge.
  • Human oversight: experts review critical predictions.
  • Trustworthy ML = the combination of all four.

Course retrospective: the 14-unit arc

  • This course has been a journey from “what is learning?” to “can we trust what the model learned?”
  • Each unit built on the previous, creating a coherent methodology for engineering ML.
```mermaid
graph TD
    subgraph Foundations
    U1[Unit 1: Risk Minimization]
    U2[Unit 2: Linear Algebra/PCA]
    U3[Unit 3: Loss Functions]
    U4[Unit 4: NN Architectures]
    end
    subgraph Training
    U5[Unit 5: Backprop]
    U6[Unit 6: Loss Landscapes]
    U7[Unit 7: Bias-Variance]
    end
    subgraph Probabilistic
    U8[Unit 8: Probabilistic/Bayesian]
    U9[Unit 9: Representation Learning]
    U10[Unit 10: Latent Spaces]
    U11[Unit 11: Unsupervised Learning]
    end
    subgraph Trust
    U12[Unit 12: Uncertainty/GPs]
    U13[Unit 13: Physics-Informed]
    U14[Unit 14: Explainability/Trust]
    end
    Foundations --> Training
    Training --> Probabilistic
    Probabilistic --> Trust
```
  • Let us review the arc.

Units 1–4: Foundations

  • Unit 1: Learning as risk minimization — the mathematical framework.
  • Unit 2: Linear algebra tools — PCA, SVD, eigendecomposition.
  • Unit 3: Loss functions — regression and classification as optimization.
  • Unit 4: Neural network architectures — layers, activations, expressivity.

Units 5–7: Training and generalization

  • Unit 5: Backpropagation — the algorithm that makes training feasible.
  • Unit 6: Loss landscapes — optimization behavior, momentum, Adam.
  • Unit 7: Bias-variance tradeoff — regularization and cross-validation.

Units 8–11: Probabilistic and unsupervised methods

  • Unit 8: Probabilistic view — MLE, Bayesian inference, MAP.
  • Unit 9: Representation learning — autoencoders and manifold hypothesis.
  • Unit 10: Latent spaces — t-SNE, UMAP, embeddings.
  • Unit 11: Unsupervised learning — K-Means, GMM, EM algorithm.

Units 12–14: Uncertainty, physics, and trust

  • Unit 12: Uncertainty quantification — GPs, MC Dropout, ensembles.
  • Unit 13: Physics-informed learning — PINNs, data enrichment, Lagaris.
  • Unit 14: Explainability and trust — the culmination.

Exam-aligned summary: 10 course-wide must-know statements

  1. Learning = minimizing [ expected | empirical ] risk; [ empirical | validation ] risk is the tractable proxy. {.fragment}
  2. The bias-variance tradeoff governs [ model complexity | dataset size ] selection. {.fragment}
  3. Backpropagation enables efficient gradient computation in [ \(O(W)\) | \(O(W^2)\) ]. {.fragment}
  4. Regularization [ restricts | expands ] hypothesis space to improve generalization. {.fragment}
  5. Bayesian inference provides principled uncertainty quantification via the [ likelihood | posterior ]. {.fragment}
  6. Autoencoders learn compressed representations; linear AE = [ PCA | K-Means ]. {.fragment}
  7. The EM algorithm iteratively finds [ ML | MAP ] parameters for mixture models. {.fragment}
  8. GP uncertainty grows [ towards | away from ] data — honest epistemic uncertainty. {.fragment}
  9. PINNs embed physics into the [ loss | architecture ] to reduce data requirements. {.fragment}
  10. Explainability is a [ mandate | luxury ], not an optional add-on. {.fragment}

Exam preparation and farewell

  • Exam scope: Units 1–14. Focus on derivations (MLE, backprop, bias-variance, EM, GP posterior).
  • Preparation: work through all exercise problems; understand the “10 must-know statements” per unit.
  • Format: written exam — derivations, interpretations, design questions.
  • Thank you for an excellent semester. Good luck with the exam!