AI 4 Materials / KI-Materialtechnologie
FAU Erlangen-Nürnberg
| Tool | One-line summary | When to reach for it |
|---|---|---|
| PINN loss decomposition | \(J = J_{\text{data}} + \lambda J_{\text{phys}} + \lambda_b J_{\text{BC}}\) | PDE residual is known and cheap to evaluate |
| Automatic differentiation | Exact \(\partial f_\theta / \partial x\) inside the loss | Needed for any PDE/ODE residual term |
| Soft constraint | Penalty in the loss | Easy to add; constraint only approximate |
| Hard constraint | Built into the architecture (Lagaris) | Constraint must be exact, e.g. BCs, mass |
| DeepONet / FNO | Learn an operator \(u \mapsto G(u)\) | Many similar PDE instances, fast inference |
| HNN / equivariant nets | Symmetries / conservation by construction | Energy, momentum, point-group symmetry |
| SINDy | Sparse regression on a derivative dictionary | Discover the governing equation itself |
Note
We covered the math in MFML W13. Today we ask: which of these do I pick up in the lab, and what breaks?
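To make the first table row concrete, here is a minimal sketch of the generic three-term loss; `residual_fn`, `bc_fn`, and the weights are illustrative placeholders, not code from this course:

```python
import torch

# Generic three-term loss from the table above (illustrative sketch):
# data fit plus weighted physics and boundary-condition penalties.
def total_loss(model, batch, residual_fn, bc_fn, lam=1.0, lam_b=1.0):
    x, y = batch
    J_data = torch.mean((model(x) - y) ** 2)       # supervised fit
    J_phys = torch.mean(residual_fn(model) ** 2)   # PDE/constraint residual
    J_bc = torch.mean(bc_fn(model) ** 2)           # boundary/initial terms
    return J_data + lam * J_phys + lam_b * J_bc
```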
Each of these is a real failure mode. Constraints exist to make them impossible by construction, not to patch them after the fact.
A working checklist for any new ML model in our group:
Without the constraint, the network invokes a phase simply because it visually fits a peak it cannot otherwise explain. Adding \(J_{\text{phys}}\) forces it to attribute that intensity to a peak overlap or to the noise model instead.
| \(\lambda\) | \(J_{\text{data}}\) (validation MSE) | Physical-violation rate |
|---|---|---|
| 0 | 0.012 | 17 % |
| 0.1 | 0.013 | 9 % |
| 1 | 0.014 | 1.2 % |
| 10 | 0.018 | 0.0 % |
| 100 | 0.034 | 0.0 % |
Pick \(\lambda\) on validation by jointly tracking data error and physical-violation rate, never one alone.
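A hedged sketch of that selection loop (all names illustrative): train one model per \(\lambda\), discard candidates over a violation budget, then minimize validation error among the survivors.

```python
# Illustrative two-axis lambda selection; train_fn and eval_fn are
# placeholders: train_fn(lam) -> model, eval_fn(model) -> (val_mse, viol_rate).
def select_lambda(lambda_grid, train_fn, eval_fn, max_violation=0.01):
    results = {lam: eval_fn(train_fn(lam)) for lam in lambda_grid}
    # Filter on the physics axis first, then optimize the data axis.
    feasible = {lam: mv for lam, mv in results.items() if mv[1] <= max_violation}
    if not feasible:
        raise ValueError("No lambda meets the violation budget; widen the grid.")
    return min(feasible, key=lambda lam: feasible[lam][0])
```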
The forward-consistency term is just another physics constraint: it forces predictions to be self-consistent with a known forward operator.
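A minimal sketch, assuming the known operator is a linear map `A` whose columns are reference patterns; the linearity and all names are assumptions for illustration:

```python
import torch

def forward_consistency_loss(weights, spectrum, A):
    """Push predicted phase fractions through the known forward operator
    and penalize mismatch with the measurement.

    weights:  (B, n_phases) predicted fractions
    spectrum: (B, n_channels) measured data
    A:        (n_channels, n_phases) known forward operator
    """
    reconstructed = weights @ A.T   # prediction mapped through the physics
    return torch.mean((reconstructed - spectrum) ** 2)
```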
Report metrics on a held-out alloy family, not just on held-out spectra.
Note
A model that has 5% lower RMSE but 10× the violation rate is worse, not better. Two-axis reporting is the only honest way to compare constrained methods.
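One way to produce that two-axis, per-family report (a sketch; `violates` stands in for whatever problem-specific physics check applies):

```python
import numpy as np

def report_by_family(y_true, y_pred, families, violates):
    """Per held-out alloy family: (RMSE, violation rate), never one alone."""
    table = {}
    for fam in np.unique(families):
        m = families == fam
        rmse = np.sqrt(np.mean((y_true[m] - y_pred[m]) ** 2))
        vio = float(np.mean([violates(p) for p in y_pred[m]]))
        table[fam] = (rmse, vio)
    return table
```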
A neural network \(T_\theta(x,t)\) with a three-term loss:
\[
\begin{aligned}
J ={}& \underbrace{\frac{1}{N}\sum_i \big(T_\theta(x_i, t_i) - T_i^{\text{obs}}\big)^2}_{J_{\text{data}}} \\
&+ \lambda_{\text{PDE}} \cdot \frac{1}{M}\sum_j \left( \rho c_p\,\partial_t T_\theta - k\,\partial_{xx} T_\theta - q \right)^2_{(x_j,t_j)} \\
&+ \lambda_{\text{BC}}\, J_{\text{BC/IC}} \quad \text{(insulated edges, ambient at } t=0\text{)}.
\end{aligned}
\]
Collocation points \((x_j, t_j)\) are sampled densely between pyrometer pixels; that is where the physics supervises for free.
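A minimal sketch of the PDE residual with automatic differentiation (PyTorch here; `T_net` stands in for \(T_\theta\), and the material constants are placeholder values):

```python
import torch

def pde_residual(T_net, x, t, rho=7800.0, cp=500.0, k=40.0, q=0.0):
    """Heat-equation residual rho*cp*dT/dt - k*d2T/dx2 - q at (x, t)."""
    x = x.requires_grad_(True)
    t = t.requires_grad_(True)
    T = T_net(torch.stack([x, t], dim=-1)).squeeze(-1)
    ones = torch.ones_like(T)
    T_t = torch.autograd.grad(T, t, grad_outputs=ones, create_graph=True)[0]
    T_x = torch.autograd.grad(T, x, grad_outputs=ones, create_graph=True)[0]
    T_xx = torch.autograd.grad(T_x, x, grad_outputs=torch.ones_like(T_x),
                               create_graph=True)[0]
    return rho * cp * T_t - k * T_xx - q

# J_PDE is then the mean squared residual over collocation points (x_j, t_j):
# J_pde = (pde_residual(T_net, x_col, t_col) ** 2).mean()
```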
This is the PINN’s real value in the lab: not “we solved a PDE”, but “we obtained a quantity the sensor cannot measure”.
Two routes to the same symmetry:

Path 1: symmetry-augmented training. Apply the symmetry operations to the training data and let the network learn the invariance statistically; it holds only approximately.

Path 2: equivariant network architecture. The symmetry holds exactly, by construction.
Note
Pick by deployment scenario. Screening \(10^5\) candidates? Augmentation. Generating training data for a downstream physics simulator that requires exact symmetry? Equivariant.
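A sketch of Path 1, assuming the point group is available as a list of orthogonal matrices `ops` acting on atomic coordinates; Path 2 would instead build the same equivariance into every layer:

```python
import torch

def augment_batch(coords, labels, ops):
    """Apply a random group element to the inputs (labels assumed invariant).

    coords: (B, N, 3) atomic positions; ops: list of (3, 3) orthogonal matrices.
    """
    g = ops[torch.randint(len(ops), (1,)).item()]
    return coords @ g.T, labels
```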
Structural (hard). Predict non-negative increments and accumulate them: \[ \Delta\sigma_t = \text{softplus}\,\big(\text{NN}(x_t)\big) \ge 0, \qquad \sigma_t = \sigma_0 + \sum_{s \le t} \Delta\sigma_s. \]
Monotonicity is now exact. A non-negative output activation is a tiny architectural change with a hard guarantee.
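A minimal sketch of that structural constraint (`net`, `x_seq`, and `sigma0` are placeholders):

```python
import torch
import torch.nn.functional as F

def monotone_stress(net, x_seq, sigma0=0.0):
    """Non-negative increments, cumulatively summed: monotone by construction.

    x_seq: (B, T, d) per-step features; net maps them to (B, T, 1).
    """
    increments = F.softplus(net(x_seq)).squeeze(-1)   # (B, T), all >= 0
    return sigma0 + torch.cumsum(increments, dim=1)   # exactly non-decreasing
```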
Soft. Penalize negative slope inside the elastic region: \[ J_{\text{mon}} = \lambda_m \sum_{t : \epsilon_t < \epsilon_y} \max\!\left(0,\, -\frac{d\sigma_\theta}{d\epsilon}\right)^2. \]
Easier to add, but only approximate; behaves badly near the yield transition where the indicator function is itself uncertain.
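For contrast, a sketch of the soft version: compute \(d\sigma_\theta/d\epsilon\) by autodiff and penalize negative slopes, masked to the estimated elastic region:

```python
import torch

def monotonicity_penalty(sigma_net, eps, eps_y, lam_m=1.0):
    """J_mon: squared hinge on negative slope where eps < eps_y (uncertain!)."""
    eps = eps.requires_grad_(True)
    sigma = sigma_net(eps.unsqueeze(-1)).squeeze(-1)
    dsig = torch.autograd.grad(sigma, eps,
                               grad_outputs=torch.ones_like(sigma),
                               create_graph=True)[0]
    elastic = (eps < eps_y).float()
    return lam_m * torch.sum(elastic * torch.clamp(-dsig, min=0.0) ** 2)
```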
Note
For the math behind the patterns we used today, see MFML W13. Today we used the results; the derivations live there.

© Philipp Pelz - ML for Characterization and Processing