Machine Learning in Materials Processing & Characterization
Unit 8: Generalization, Robustness, and Process Windows

Prof. Dr. Philipp Pelz

FAU Erlangen-Nürnberg


0. Reliability in Materials ML

Beyond Accuracy: Generalization & Robustness

Today’s Learning Journey:

  • Generalization: Why training accuracy is a lie.
  • Robustness: Handling noise and distribution shift.
  • Cross-Validation: Rigorous performance estimation.
  • Process Windows: Mapping ML to safe operations.
  • Sensitivity: Identifying the true drivers of your model.

1. Generalization

The Goal: Unseen Performance

  • We care about how the model performs on new data, not the training set. (Sandfeld, 2024)
  • Overfitting: The model memorizes noise instead of learning physics.
  • Bias-Variance Tradeoff: Balancing underfitting (high bias, too simple) against overfitting (high variance, too complex).

Parsimony

“Prefer the simplest model that explains the data well.” (Occam’s Razor).
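The tradeoff can be made concrete with a small synthetic experiment (all data below is simulated for illustration): fit polynomials of increasing degree to noisy measurements and compare the error on the training set with the error on held-out data.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Synthetic "measurement": a smooth signal plus noise
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 40).reshape(-1, 1)
y = np.sin(2 * np.pi * x).ravel() + rng.normal(0, 0.2, 40)

x_tr, x_te, y_tr, y_te = train_test_split(x, y, random_state=0)

results = {}
for degree in (1, 3, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(x_tr, y_tr)
    results[degree] = (mean_squared_error(y_tr, model.predict(x_tr)),
                       mean_squared_error(y_te, model.predict(x_te)))
    print(f"degree {degree:2d}  train MSE {results[degree][0]:.4f}  "
          f"test MSE {results[degree][1]:.4f}")
```

Training error shrinks monotonically with model capacity, but the held-out error typically bottoms out at moderate complexity — the numerical face of Occam's Razor.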

2. Robustness to Noise

Stable Predictions

  • A robust model is insensitive to small perturbations (measurement noise).
  • Distribution Shift: Can your model handle data from a different microscope?
  • Outliers: Using robust loss functions (e.g., Huber loss) to ignore extreme artifacts.

Scientific Skepticism: Is it an outlier (error) or a rare event (discovery)?
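As a minimal sketch of a robust loss in action (synthetic data; the +40 offsets stand in for gross measurement artifacts), scikit-learn's `HuberRegressor` can be compared against ordinary least squares:

```python
import numpy as np
from sklearn.linear_model import HuberRegressor, LinearRegression

# Synthetic linear data, true slope = 2.0
rng = np.random.default_rng(1)
X = rng.uniform(0, 10, 50).reshape(-1, 1)
y = 2.0 * X.ravel() + rng.normal(0, 0.5, 50)
y[:3] += 40  # inject gross measurement artifacts (outliers)

ols = LinearRegression().fit(X, y)   # squared loss: pulled by outliers
hub = HuberRegressor().fit(X, y)     # Huber loss: linear penalty on large residuals
print("OLS slope:", ols.coef_[0], " Huber slope:", hub.coef_[0])
```

The Huber loss grows only linearly for large residuals, so a handful of artifacts cannot dominate the fit the way they do under squared error.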

3. Cross-Validation & Tuning

K-Fold Cross-Validation

  • Reusing limited data to get a stable performance estimate.
  • Group-based splits: Avoiding “Data Leakage” between samples. (Sandfeld, 2024)
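A minimal sketch of group-aware splitting, assuming measurements arrive as repeated scans of the same physical specimen (all data synthetic): `GroupKFold` guarantees that no specimen contributes to both training and test folds, which is exactly the leakage a naive random split would allow.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GroupKFold, cross_val_score

rng = np.random.default_rng(2)
n = 60
groups = np.repeat(np.arange(6), 10)   # 6 specimens, 10 scans each (assumed setup)
X = rng.normal(size=(n, 4))
y = X[:, 0] + 0.1 * rng.normal(size=n)

cv = GroupKFold(n_splits=3)            # whole specimens are held out together
scores = cross_val_score(RandomForestRegressor(random_state=0),
                         X, y, groups=groups, cv=cv)
print("R^2 per fold:", scores)
```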

Hyperparameter Tuning

  • Grid Search: Brute force.
  • Random Search: Often faster and more effective. (McClarren, 2021)
  • Bayesian Optimization: Intelligently searching the hyperparameter space.
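Random search can be sketched with `RandomizedSearchCV`, here tuning the regularization strength of a ridge regressor on synthetic data (the search range is an illustrative assumption):

```python
import numpy as np
from scipy.stats import loguniform
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import RandomizedSearchCV

X, y = make_regression(n_samples=100, n_features=5, noise=5.0, random_state=0)

# Sample alpha log-uniformly: random search covers orders of magnitude cheaply,
# which is why it often beats a grid of the same budget
search = RandomizedSearchCV(Ridge(),
                            {"alpha": loguniform(1e-3, 1e2)},
                            n_iter=20, cv=5, random_state=0)
search.fit(X, y)
print("best alpha:", search.best_params_["alpha"])
```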

4. Defining Process Windows

Mapping Safe Zones

  • Process Window: The region in \((P, v, T)\) space where the product is safe.
  • Using ML classifiers to draw the Operating Boundary.
  • Probabilistic Windows: Mapping model uncertainty to engineering safety factors.

Case Study: Laser Power vs. Scan Speed in Additive Manufacturing.
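As a toy version of this case study (entirely synthetic; the \(P/v\) energy-density band is an assumed stand-in for real melt-pool physics), a classifier can trace the operating boundary, and a probability threshold then carves out a conservative probabilistic window:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(3)
P = rng.uniform(100, 400, 200)    # laser power, W (synthetic)
v = rng.uniform(200, 1500, 200)   # scan speed, mm/s (synthetic)

# Toy ground truth: high density when the energy-density proxy P/v lies in a band
dense = ((P / v > 0.25) & (P / v < 0.8)).astype(int)

X = np.column_stack([P, v])
clf = make_pipeline(StandardScaler(), SVC(probability=True)).fit(X, dense)

# Probabilistic window: accept only settings where P(dense) clears a safety margin
proba = clf.predict_proba(X)[:, 1]
window = proba > 0.9
print(window.sum(), "of", len(P), "settings lie inside the conservative window")
```

Raising the probability threshold shrinks the window — this is how model uncertainty maps onto an engineering safety factor.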

5. Sensitivity Analysis

What drives the prediction?

  • Feature Importance: Which input has the most impact?
  • SHAP Values: Fairly distributing credit for a prediction among inputs.
  • Stability Analysis: Ensuring the model follows physical continuity.
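SHAP values need the external `shap` package; as a self-contained, model-agnostic stand-in, scikit-learn's permutation importance makes the same point (synthetic data; the third feature is deliberately inert):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 3))   # e.g. power, speed, hatch spacing (synthetic)
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + 0.05 * rng.normal(size=200)  # feature 3 unused

model = RandomForestRegressor(random_state=0).fit(X, y)

# Shuffle one feature at a time and measure the drop in score:
# features the model truly relies on cause a large drop
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
print("mean importance per feature:", result.importances_mean)
```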

6. Summary & Takeaways

Key Messages:

  1. Generalization is the only benchmark that matters.
  2. Robustness is required for deployment in the “dirty” real world.
  3. Cross-Validation prevents being fooled by a lucky data split.
  4. Process Windows turn ML predictions into actionable engineering maps.

Exercise Handoff

  • Implement 5-Fold Cross-Validation for the AM Meltpool dataset.
  • Perform a Random Search to optimize the CNN learning rate.
  • Map the Process Window for “High-Density” parts.

References

Sandfeld, Stefan (2024). Materials Data Science. Springer.
McClarren, Ryan G. (2021). Machine Learning for Engineers: Using Data to Solve Problems for Physical Systems. Springer.