Machine Learning in Materials Processing & Characterization
Unit 4: From Classical Metrics to Learned Representations
FAU Erlangen-Nürnberg
By the end of this unit, you can:
Note
This morning’s MFML lecture covered the architecture, forward pass, and activation taxonomy of MLPs. We’ll recap that in 3 slides and then spend the rest of the lecture on what’s specific to materials data.
Slides 03–10
Stereology: The science of estimating 3D properties from 2D sections.
How we’ve characterized structure for over a century — the backbone of materials standards (ASTM, DIN).

Scalar features that condense complex 3D structures into single numbers:
These are our “hand-crafted features.”
Features based on human intuition:
Question: Can we keep more information while remaining computationally efficient?
Answer: Yes — learned representations (embeddings) compress images into vectors that preserve the task-relevant information.
Processing → Structure (Metric \(d\)) → Property (Hardness \(H\))
Hall-Petch: \(H = H_0 + k \cdot d^{-1/2}\)
A linear model using a physical descriptor — works well for simple cases!
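As a minimal sketch: fitting Hall-Petch is ordinary linear regression in \(x = d^{-1/2}\). The grain sizes and hardness values below are illustrative, not measured data:

```python
import numpy as np

# Hypothetical grain sizes (um) and Vickers hardnesses -- illustrative only
d = np.array([5.0, 10.0, 20.0, 40.0, 80.0])
H = np.array([250.0, 220.0, 200.0, 185.0, 175.0])

# Hall-Petch is linear in x = d^(-1/2): H = H0 + k * x
x = d ** -0.5
k, H0 = np.polyfit(x, H, 1)   # polyfit returns (slope, intercept) for degree 1
H_pred = H0 + k * x
```

Note the design choice: the non-linearity (\(d^{-1/2}\)) lives in the hand-crafted feature, so the model itself stays linear. That is exactly the pattern learned representations will later replace.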
Modern materials defy simple descriptions:
Simple metrics cannot fully describe the "S" (Structure) in the PSPP chain (Processing–Structure–Property–Performance). We need richer representations.
| Approach | Input | Features | Limitations |
|---|---|---|---|
| Classical | Micrograph → Metrics | Hand-crafted | Information loss |
| Modern | Micrograph → Network | Learned (embedding) | Need data |
From “Predicting with Descriptors” to “Learning Representations”
The model decides what features matter — not the scientist.
Slides 11–14
Note
We will not re-derive these here. If anything is unclear, the MFML deck (Unit 4) is your reference.
| Layer | Task | Activation | Why |
|---|---|---|---|
| Hidden | General | ReLU / Leaky ReLU | Cheap, no saturation for \(z > 0\) |
| Output | Regression (e.g., hardness) | Linear | Predictions span \(\mathbb{R}\) |
| Output | Binary classification | Sigmoid | Output ∈ (0, 1) interpreted as probability |
| Output | Multi-class classification | Softmax | Outputs sum to 1 over \(C\) classes |
Materials example: phase classification — Softmax outputs \(P(\text{FCC}) = 0.72\), \(P(\text{BCC}) = 0.25\), \(P(\text{HCP}) = 0.03\).
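The softmax head is a one-liner. The logits below are hypothetical values, chosen so the resulting probabilities roughly reproduce the slide's example:

```python
import numpy as np

def softmax(z):
    # Subtract the max logit for numerical stability before exponentiating
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical logits for (FCC, BCC, HCP) -- not from a trained model
logits = np.array([2.0, 0.94, -1.2])
p = softmax(logits)   # roughly (0.72, 0.25, 0.03), summing to 1
```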
The pieces — loss, gradient, update rule — are universal. What changes between materials problems is what goes into X, what y means, and how we encode the structure. That’s the rest of this lecture.
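Those universal pieces fit in a few lines of numpy. This is a sketch with random weights and dummy targets (layer sizes and learning rate are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny regression MLP: 12 inputs -> 8 hidden (ReLU) -> 1 linear output
D, Hn = 12, 8
W1 = rng.normal(0, 0.1, (D, Hn)); b1 = np.zeros(Hn)
W2 = rng.normal(0, 0.1, (Hn, 1)); b2 = np.zeros(1)

X = rng.normal(size=(32, D))          # mini-batch of feature vectors
y = rng.normal(size=(32, 1))          # dummy targets

# Forward pass and MSE loss
z1 = X @ W1 + b1
h = np.maximum(0.0, z1)               # ReLU hidden layer
y_hat = h @ W2 + b2                   # linear output for regression
loss = np.mean((y_hat - y) ** 2)

# Backward pass (chain rule) and one SGD update
g_y = 2.0 * (y_hat - y) / len(X)
g_W2 = h.T @ g_y; g_b2 = g_y.sum(0)
g_z1 = (g_y @ W2.T) * (z1 > 0)        # ReLU gradient: zero where z1 <= 0
g_W1 = X.T @ g_z1; g_b1 = g_z1.sum(0)
lr = 0.01
W1 -= lr * g_W1; b1 -= lr * g_b1
W2 -= lr * g_W2; b2 -= lr * g_b2

# After one small step, the training loss should have decreased
z1 = X @ W1 + b1
loss_after = np.mean((np.maximum(0.0, z1) @ W2 + b2 - y) ** 2)
```

Nothing here is specific to materials; what changes between problems is what goes into `X` and what `y` means.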
The MFML deck showed that an MLP can fit any function. The materials questions are:
Slides 15–24
Before any network can learn from a microstructure, we must turn it into a vector or tensor. The choice of encoding decides what the model can learn.
| Encoding | Input shape | What the model sees |
|---|---|---|
| Hand-crafted descriptors | \(\mathbb{R}^{D}\) (small) | Pre-distilled features |
| Raw flattened image | \(\mathbb{R}^{H \cdot W}\) (huge) | Every pixel, no spatial prior |
| Patch / windowed statistics | \(\mathbb{R}^{N \times D}\) | Local texture distributions |
| n-point statistics | \(\mathbb{R}^{D}\) | Spatial correlations, translation-invariant |
| Raw image with conv inductive bias | \(\mathbb{R}^{H \times W \times C}\) | Spatial features (next week) |
The classical features from Part 1 — grain size, aspect ratio, phase fraction, ODF coefficients — are just one possible vector representation of the microstructure.
Their advantage: physically interpretable. You can name every component of the input vector.
Their disadvantage: information-lossy and biased. Whatever physics you did not think to encode, the model cannot recover.
The two-point correlation function \(S_2(\mathbf{r})\):
\[S_2(\mathbf{r}) = P\bigl(\text{phase}(\mathbf{x}) = \alpha \;\wedge\; \text{phase}(\mathbf{x}+\mathbf{r}) = \alpha\bigr)\]
Why this is a good NN input: it’s translation-invariant by construction, low-dimensional (a few hundred numbers), and physically meaningful.
A practical compromise: feed the network \(S_2\) (informative, compact) rather than raw pixels (noisy, huge).
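One practical way to estimate \(S_2\) is via an FFT autocorrelation. This is a sketch assuming a binary phase map and periodic boundaries; the "microstructure" below is synthetic random noise, not a real micrograph:

```python
import numpy as np

def two_point_stats(phase_map):
    # Circular autocorrelation of the phase indicator via the FFT
    # (Wiener-Khinchin): S2[r] = mean over x of phase(x) * phase(x + r)
    f = np.fft.fft2(phase_map)
    s2 = np.fft.ifft2(f * np.conj(f)).real / phase_map.size
    return np.fft.fftshift(s2)   # move the r = 0 lag to the array center

# Synthetic binary phase map with phase fraction ~0.3
rng = np.random.default_rng(0)
img = (rng.random((64, 64)) < 0.3).astype(float)
s2 = two_point_stats(img)
# Sanity checks: S2 at r = 0 equals the phase fraction;
# at large |r| it decays toward (phase fraction)^2.
```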
For many materials problems, the input is not an image at all — it’s tabular:
This is the natural setting for an MLP. The input vector is small (\(D \sim 10\)–\(50\)), interpretable, and has well-defined physical units. Standardize each feature (\(\mu = 0\), \(\sigma = 1\)) before feeding the network.
| Input type | Typical \(D\) | Best architecture | Example task |
|---|---|---|---|
| Composition + process (tabular) | 10–50 | MLP (this lecture) | Predict hardness from alloy + treatment |
| Hand-crafted morphological descriptors | 5–50 | MLP | Predict fatigue life from grain stats |
| n-point statistics | 100–1000 | MLP / 1D conv | MKS-style property prediction |
| Raw 2D micrograph | \(10^4\)–\(10^7\) | CNN (next week) | Phase segmentation, defect detection |
Moving down the table, each encoding retains more of the \(\sim 10^6\) pixels of original information. But more information ≠ better model unless you have the data to support it; this trade-off drives the rest of the course.
A composition feature in [0, 1] (mass fraction) and a temperature in [300, 1500] K cannot be fed to the same MLP unscaled.
This is the single most common mistake in published materials-ML papers. We will revisit it formally in Unit 6 (transfer learning) and in MFML’s generalization week.
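A minimal standardization sketch (the feature values are illustrative; in practice, compute \(\mu\) and \(\sigma\) on the training split only and reuse them for validation and test data):

```python
import numpy as np

# Hypothetical tabular features: [C mass fraction, anneal T (K), time (h)]
X = np.array([[0.002, 1100.0, 2.0],
              [0.010,  900.0, 0.5],
              [0.004, 1300.0, 8.0]])

mu = X.mean(axis=0)
sigma = X.std(axis=0)
X_std = (X - mu) / sigma   # each column now has mean 0 and std 1
```

Without this step, the temperature column (hundreds of kelvin) would dominate the composition column (fractions of a percent) in every weighted sum the first layer computes.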
Slides 25–34
Predict tensile strength of a polycrystalline alloy from its microstructure.
| Model | Input | Parameters |
|---|---|---|
| Linear (Hall-Petch) | Mean grain size \(d^{-1/2}\) | 2 |
| Linear + descriptors | \(d^{-1/2}\), aspect ratio, porosity, pore-size dispersion | 5 |
| MLP | All 12 hand-crafted descriptors | \(\sim 1{,}000\) |
We use the same 12 descriptors for the linear-extended model and the MLP — so the only difference is how the inputs are combined.
| Method | Features | R² (test) |
|---|---|---|
| Hall-Petch | Mean grain size only | 0.72 |
| Linear + descriptors | 12 hand-crafted | 0.81 |
| MLP | 12 hand-crafted | 0.88 |
The MLP gains 7 points of R² (0.81 → 0.88) over the extended linear model on identical inputs; the gain comes purely from non-linear feature combinations.
Permutation importance reveals which inputs the MLP actually relies on:
Physical reading: the network rediscovered weakest-link statistics on its own. Hall-Petch was an average; failure is a tail-of-distribution phenomenon.
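The idea behind permutation importance, sketched on a synthetic stand-in (a noiseless function of two of three features, not the trained MLP from this case study): shuffle one feature column, re-score, and record the drop.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in: the target depends on features 0 and 2, not 1
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] + 0.5 * X[:, 2]

def predict(X):
    # Pretend this is the trained model (here: the true function)
    return 2.0 * X[:, 0] + 0.5 * X[:, 2]

def r2(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

base = r2(y, predict(X))
importance = []
for j in range(X.shape[1]):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])   # break the feature-target link
    importance.append(base - r2(y, predict(Xp)))
# The drop is large for feature 0, small for feature 2, zero for feature 1.
```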
A linear model in \(d^{-1/2}\) predicts the same strength gain whether you halve the mean grain size or halve the largest grain. Physically these are different operations — thermomechanical processing affects the tail of the grain-size distribution differently from the mean.
The MLP captures this because it can compose features non-linearly: e.g., \(\sigma_y \approx f(\bar{d}, d_{\max}, \mathrm{Var}[d])\) with \(f\) free to be non-additive.
Materials practice: try the linear baseline first. Only adopt the MLP if it improves cross-validation R² and you can defend its predictions on out-of-distribution test cases.
A common but invalid protocol:
Mitigation: feature engineering that normalizes out instrument-specific statistics, or — more powerfully — domain-randomized training, which we’ll revisit in Unit 6 (transfer learning).
What if we don’t have hand-crafted descriptors, only the raw image?
This is the failure mode that motivates next week’s lecture. CNNs use weight sharing and locality to make the parameter count manageable, allowing us to learn directly from pixels.
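A back-of-envelope count makes the motivation concrete (the image size and layer widths below are assumed for illustration):

```python
# Parameter count of a dense first layer on a flattened micrograph,
# versus a single small convolutional layer
H, W = 1024, 1024
hidden = 100
dense_params = H * W * hidden    # ~1.0e8 weights into just 100 hidden units
conv_params = 3 * 3 * 1 * 64     # one 3x3 conv layer with 64 filters: 576 weights
ratio = dense_params / conv_params
```

The conv layer reuses the same 576 weights at every image position; the dense layer learns a separate weight per pixel per unit. That weight sharing is the subject of next week's lecture.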
Key Takeaways:
Reading:
Next Week: Unit 5 — Convolutional Neural Networks for Microstructure Analysis

© Philipp Pelz - Machine Learning in Materials Processing & Characterization