Machine Learning in Materials Processing & Characterization
Unit 8: Generalization, Robustness, and Process Windows

Prof. Dr. Philipp Pelz

FAU Erlangen-Nürnberg


0. Reliability in Materials ML

Beyond Accuracy: Generalization & Robustness

Today’s Learning Journey:

  • Generalization: Why training accuracy is a lie.
  • Robustness: Handling noise and distribution shift.
  • Cross-Validation: Rigorous performance estimation.
  • Process Windows: Mapping ML to safe operations.
  • Sensitivity: Identifying the true drivers of your model.

1. Generalization

The Goal: Unseen Performance

  • We care about how the model performs on new data, not the training set. (Sandfeld, 2024)
  • Overfitting: The model memorizes noise instead of learning physics.
  • Bias-Variance Tradeoff: Balancing underfitting (high bias, too simple) against overfitting (high variance, too complex).

Parsimony

“Prefer the simplest model that explains the data well.” (Occam’s Razor).
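The tradeoff can be made concrete with a small synthetic experiment (all data below is simulated for illustration): fit polynomials of increasing degree to noisy measurements and compare the error on the training set with the error on held-out data.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Synthetic "measurement": a smooth signal plus noise
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 40).reshape(-1, 1)
y = np.sin(2 * np.pi * x).ravel() + rng.normal(0, 0.2, 40)

x_tr, x_te, y_tr, y_te = train_test_split(x, y, random_state=0)

results = {}
for degree in (1, 3, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(x_tr, y_tr)
    results[degree] = (mean_squared_error(y_tr, model.predict(x_tr)),
                       mean_squared_error(y_te, model.predict(x_te)))
    print(f"degree {degree:2d}  train MSE {results[degree][0]:.4f}  "
          f"test MSE {results[degree][1]:.4f}")
```

Training error shrinks monotonically with model capacity, but the held-out error typically bottoms out at moderate complexity — the numerical face of Occam's Razor.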

2. Robustness to Noise

Stable Predictions

  • A robust model is insensitive to small perturbations (measurement noise).
  • Distribution Shift: Can your model handle data from a different microscope?
  • Outliers: Using robust loss functions (e.g., Huber loss) to ignore extreme artifacts.

Scientific Skepticism: Is it an outlier (error) or a rare event (discovery)?
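As a minimal sketch of a robust loss in action (synthetic data; the +40 offsets stand in for gross measurement artifacts), scikit-learn's `HuberRegressor` can be compared against ordinary least squares:

```python
import numpy as np
from sklearn.linear_model import HuberRegressor, LinearRegression

# Synthetic linear data, true slope = 2.0
rng = np.random.default_rng(1)
X = rng.uniform(0, 10, 50).reshape(-1, 1)
y = 2.0 * X.ravel() + rng.normal(0, 0.5, 50)
y[:3] += 40  # inject gross measurement artifacts (outliers)

ols = LinearRegression().fit(X, y)   # squared loss: pulled by outliers
hub = HuberRegressor().fit(X, y)     # Huber loss: linear penalty on large residuals
print("OLS slope:", ols.coef_[0], " Huber slope:", hub.coef_[0])
```

The Huber loss grows only linearly for large residuals, so a handful of artifacts cannot dominate the fit the way they do under squared error.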

3. Cross-Validation & Tuning

K-Fold Cross-Validation

  • Reusing limited data to get a stable performance estimate.
  • Group-based splits: Avoiding “Data Leakage” between samples. (Sandfeld, 2024)
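A minimal sketch of group-aware splitting, assuming measurements arrive as repeated scans of the same physical specimen (all data synthetic): `GroupKFold` guarantees that no specimen contributes to both training and test folds, which is exactly the leakage a naive random split would allow.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GroupKFold, cross_val_score

rng = np.random.default_rng(2)
n = 60
groups = np.repeat(np.arange(6), 10)   # 6 specimens, 10 scans each (assumed setup)
X = rng.normal(size=(n, 4))
y = X[:, 0] + 0.1 * rng.normal(size=n)

cv = GroupKFold(n_splits=3)            # whole specimens are held out together
scores = cross_val_score(RandomForestRegressor(random_state=0),
                         X, y, groups=groups, cv=cv)
print("R^2 per fold:", scores)
```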

Hyperparameter Tuning

  • Grid Search: Brute force.
  • Random Search: Often faster and more effective. (McClarren, 2021)
  • Bayesian Optimization: Intelligently searching the hyperparameter space.
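Random search can be sketched with `RandomizedSearchCV`, here tuning the regularization strength of a ridge regressor on synthetic data (the search range is an illustrative assumption):

```python
import numpy as np
from scipy.stats import loguniform
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import RandomizedSearchCV

X, y = make_regression(n_samples=100, n_features=5, noise=5.0, random_state=0)

# Sample alpha log-uniformly: random search covers orders of magnitude cheaply,
# which is why it often beats a grid of the same budget
search = RandomizedSearchCV(Ridge(),
                            {"alpha": loguniform(1e-3, 1e2)},
                            n_iter=20, cv=5, random_state=0)
search.fit(X, y)
print("best alpha:", search.best_params_["alpha"])
```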

4. Defining Process Windows

Mapping Safe Zones

  • Process Window: The region in \((P, v, T)\) space where the product is safe.
  • Using ML classifiers to draw the Operating Boundary.
  • Probabilistic Windows: Mapping model uncertainty to engineering safety factors.

Case Study: Laser Power vs. Scan Speed in Additive Manufacturing.
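As a toy version of this case study (entirely synthetic; the \(P/v\) energy-density band is an assumed stand-in for real melt-pool physics), a classifier can trace the operating boundary, and a probability threshold then carves out a conservative probabilistic window:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(3)
P = rng.uniform(100, 400, 200)    # laser power, W (synthetic)
v = rng.uniform(200, 1500, 200)   # scan speed, mm/s (synthetic)

# Toy ground truth: high density when the energy-density proxy P/v lies in a band
dense = ((P / v > 0.25) & (P / v < 0.8)).astype(int)

X = np.column_stack([P, v])
clf = make_pipeline(StandardScaler(), SVC(probability=True)).fit(X, dense)

# Probabilistic window: accept only settings where P(dense) clears a safety margin
proba = clf.predict_proba(X)[:, 1]
window = proba > 0.9
print(window.sum(), "of", len(P), "settings lie inside the conservative window")
```

Raising the probability threshold shrinks the window — this is how model uncertainty maps onto an engineering safety factor.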

5. Sensitivity Analysis

What drives the prediction?

  • Feature Importance: Which input has the most impact?
  • SHAP Values: Fairly distributing credit for a prediction among inputs.
  • Stability Analysis: Ensuring the model follows physical continuity.
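SHAP values need the external `shap` package; as a self-contained, model-agnostic stand-in, scikit-learn's permutation importance makes the same point (synthetic data; the third feature is deliberately inert):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 3))   # e.g. power, speed, hatch spacing (synthetic)
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + 0.05 * rng.normal(size=200)  # feature 3 unused

model = RandomForestRegressor(random_state=0).fit(X, y)

# Shuffle one feature at a time and measure the drop in score:
# features the model truly relies on cause a large drop
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
print("mean importance per feature:", result.importances_mean)
```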

6. Summary & Takeaways

Key Messages:

  1. Generalization is the only benchmark that matters.
  2. Robustness is required for deployment in the “dirty” real world.
  3. Cross-Validation prevents being fooled by a lucky data split.
  4. Process Windows turn ML predictions into actionable engineering maps.

Exercise Handoff

  • Implement 5-Fold Cross-Validation for the AM Meltpool dataset.
  • Perform a Random Search to optimize the CNN learning rate.
  • Map the Process Window for “High-Density” parts.

References

Sandfeld, Stefan (2024). Materials Data Science. Springer.
McClarren, Ryan G. (2021). Machine Learning for Engineers: Using Data to Solve Problems for Physical Systems. Springer.