Machine Learning for Characterization and Processing
Unit 10: Automation in microscopy and characterization

AI 4 Materials / KI-Materialtechnologie

Prof. Dr. Philipp Pelz

FAU Erlangen-Nürnberg


01. Intro & Motivation

The Manual Bottleneck

  • Materials science is becoming high-throughput.
  • 1000s of samples need characterization.
  • Human operators are expensive, prone to fatigue, and introduce bias.
  • Goal: Self-operating instruments that work 24/7.

The “Self-Driving” Microscope

  • Traditionally: Human \(\rightarrow\) Knob \(\rightarrow\) Image \(\rightarrow\) Interpretation.
  • Future: Agent \(\rightarrow\) Action \(\rightarrow\) Reward \(\rightarrow\) Discovery.
  • Defining objectives (e.g., “Find all Ni-rich precipitates”) instead of commands.

02. Instrumentation & Control Basics

Control Theory: Refresher

  • Feedback: Measure error, adjust control (e.g., thermostat).
  • Sensors: Detectors, beam current meters.
  • Actuators: Lenses, deflector coils, stage motors.
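
The feedback idea above can be illustrated with a minimal sketch of a proportional controller. The toy plant (the value simply moves by the control adjustment), the gain, and the function names are assumptions chosen for illustration; a real instrument responds with lag, noise, and hysteresis.

```python
def proportional_step(setpoint, measurement, gain):
    """Return a control adjustment proportional to the measured error."""
    error = setpoint - measurement
    return gain * error


def run_loop(setpoint=20.0, start=15.0, gain=0.5, steps=20):
    """Simulate a toy plant whose output moves by the control adjustment."""
    value = start
    history = [value]
    for _ in range(steps):
        # Sensor reads `value`; the actuator applies the adjustment.
        value += proportional_step(setpoint, value, gain)
        history.append(value)
    return history


history = run_loop()
```

With a gain of 0.5 the remaining error halves on every iteration, so the value approaches the setpoint geometrically.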

Why is Microscopy Control Hard?

  • Non-linear response: Magnetic lenses, saturation.
  • Hysteresis: Remanent magnetic fields.
  • High-dimensionality: Aligning an EM has 50+ interacting “knobs.”
  • State \(\mathbf{x}_t\): Position, focus, stigmation, illumination.

03. Reinforcement Learning (RL) Foundations

What is Reinforcement Learning?

  • (McClarren Ch 9.1)
  • Learning by Trial and Error.
  • No labels needed! Only a Reward Signal.
  • Agent (ML Model) \(\leftrightarrow\) Environment (The Microscope).

Key Components: State, Action, Reward

  • State: What the microscope “sees” (current image/signal).
  • Action: What the agent “does” (change lens current, move stage).
  • Reward: A scalar indicating how “good” the action was.
  • Policy \(\pi(s)\): Mapping from state to action.
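
A minimal sketch of the state, action, reward, policy loop, under strong simplifying assumptions: the hypothetical `ToyFocusEnv` exposes the defocus directly as its state (a real agent would only see an image), and the policy is hand-written rather than learned.

```python
import random


class ToyFocusEnv:
    """Hypothetical 1-D stand-in for a microscope: the state is the defocus."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.defocus = 0.0

    def reset(self):
        self.defocus = self.rng.uniform(-5.0, 5.0)
        return self.defocus

    def step(self, action):
        # Action: a change of lens setting, applied directly to the defocus.
        self.defocus += action
        reward = -abs(self.defocus)  # scalar reward: closer to focus is better
        return self.defocus, reward


def policy(state):
    """Hand-written policy pi(s): move at most one unit towards zero defocus."""
    if state > 0:
        return -min(1.0, state)
    return min(1.0, -state)


env = ToyFocusEnv()
state = env.reset()
rewards = []
for _ in range(10):
    action = policy(state)             # agent decides
    state, reward = env.step(action)   # environment responds
    rewards.append(reward)
```

In an RL setting the hand-written `policy` would be replaced by a trainable model, and the reward sequence would drive its updates.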

Policy Gradients (The Strategy)

  • (McClarren 9.2)
  • Turning decisions into a probability distribution.
  • Update the NN to make “good” decisions (high reward) more likely.
  • Exploration vs. Exploitation: Trying new things vs. using what works.
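
The policy-gradient idea can be sketched with a REINFORCE-style update on a toy three-action problem. This is an illustration under assumptions, not McClarren's implementation: the reward values, learning rate, and running-average baseline are hypothetical, and the "network" is reduced to one logit per action.

```python
import numpy as np

rng = np.random.default_rng(0)
true_reward = np.array([0.1, 0.9, 0.3])  # hypothetical mean reward per action
logits = np.zeros(3)                     # policy parameters, one per action
learning_rate = 0.1
baseline = 0.0                           # running average of observed rewards


def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()


for _ in range(2000):
    probs = softmax(logits)
    action = rng.choice(3, p=probs)      # sampling provides exploration
    reward = true_reward[action] + 0.1 * rng.normal()
    # Gradient of log pi(action) with respect to the logits (softmax policy).
    grad_log_pi = -probs
    grad_log_pi[action] += 1.0
    # Make actions with above-baseline reward more likely.
    logits += learning_rate * (reward - baseline) * grad_log_pi
    baseline += 0.05 * (reward - baseline)

final_probs = softmax(logits)
```

Early on the policy samples all actions roughly equally (exploration); as the logits separate, probability mass concentrates on the action with the highest mean reward (exploitation).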

04. Automation in Microscopy

Low-Level Automation: Autofocus

  • Traditional: Sweep lens current, pick max sharpness.
  • ML: Learn to jump directly to optimal focus from a single blurry image.
  • Reward: Image sharpness index (Laplacian, FFT high-freq).
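
A minimal sketch of the traditional sweep with a Laplacian-based sharpness index, using NumPy only. The synthetic noise image, the box blur as a stand-in for defocus, and the seven focus settings are assumptions for illustration.

```python
import numpy as np


def laplacian_variance(image):
    """Sharpness score: variance of a 4-neighbour discrete Laplacian."""
    img = np.asarray(image, dtype=float)
    lap = (img[:-2, 1:-1] + img[2:, 1:-1] + img[1:-1, :-2] + img[1:-1, 2:]
           - 4.0 * img[1:-1, 1:-1])
    return lap.var()


def box_blur(image, passes):
    """Crude defocus stand-in: repeated 3x3 mean filter with edge padding."""
    img = np.asarray(image, dtype=float)
    rows, cols = img.shape
    for _ in range(passes):
        padded = np.pad(img, 1, mode="edge")
        img = sum(padded[i:i + rows, j:j + cols]
                  for i in range(3) for j in range(3)) / 9.0
    return img


rng = np.random.default_rng(0)
sharp = rng.random((64, 64))  # synthetic in-focus image

# Hypothetical focus sweep: blur grows with distance from setting 3.
settings = range(7)
scores = [laplacian_variance(box_blur(sharp, abs(s - 3))) for s in settings]
best_setting = int(np.argmax(scores))
```

The same score can serve as the reward for a learned autofocus agent; the sweep is then replaced by a single predicted jump.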

Beam Alignment & Stigmation

  • Correcting for non-circular beams and tilt.
  • Agent learns to adjust deflector currents by observing beam shape.
  • ROI Selection: Automatically finding rare features in large samples.

Multi-Modal Data Fusion

  • Combining Images (SEM), Spectra (EDS), and Diffraction (EBSD).
  • Bayesian Sensor Fusion: Weighting each sensor by its precision.
  • A unified material state vector \(z = f(\text{Image}, \text{Spectrum}, \text{EBSD})\).
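
For independent Gaussian estimates of the same scalar quantity, weighting each sensor by its precision reduces to inverse-variance weighting. The sketch below assumes that simplified setting; the three estimates and their uncertainties are hypothetical numbers, not measurements.

```python
import numpy as np


def fuse(estimates, variances):
    """Precision-weighted fusion of independent Gaussian estimates."""
    estimates = np.asarray(estimates, dtype=float)
    precisions = 1.0 / np.asarray(variances, dtype=float)
    fused_variance = 1.0 / precisions.sum()
    fused_mean = fused_variance * (precisions * estimates).sum()
    return fused_mean, fused_variance


# Hypothetical estimates of one quantity from three modalities,
# each with its own standard deviation.
fused_mean, fused_variance = fuse(
    estimates=[0.30, 0.36, 0.33],
    variances=[0.04**2, 0.02**2, 0.08**2],
)
```

The fused variance is never larger than the smallest input variance, so adding even a noisy modality cannot make the combined estimate worse under these assumptions.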

UMAP for Streaming Anomaly Triage

The recipe.

  1. Embed each incoming SEM / micrograph frame with the lab’s CNN or DINOv2 feature extractor.
  2. Maintain a frozen reference UMAP (McInnes et al. 2018) of the nominal manifold, built once on a QC-verified set of frames.
  3. Project new frames into that fixed UMAP via umap-learn’s .transform() — no retraining at deploy time.
  4. Alarm on frames whose 2-D coordinates land \(> \tau\) standard deviations from the nominal centroid, where \(\tau\) is set by held-out nominal validation.
  5. Operator dashboards display the live UMAP with a coloured trail — the eye picks up drift before the threshold does.
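
Step 4 can be sketched as follows, assuming the 2-D coordinates have already been produced by the frozen reference (in umap-learn, by calling `.transform()` on a fitted `umap.UMAP` object). The function names, the per-axis standardization, and the choice of \(\tau\) as a quantile of held-out nominal distances are illustrative assumptions; synthetic Gaussian points stand in for real embeddings.

```python
import numpy as np


def fit_nominal(coords):
    """Centroid and per-axis standard deviation of the nominal reference."""
    coords = np.asarray(coords, dtype=float)
    return coords.mean(axis=0), coords.std(axis=0)


def distance(coords, centroid, scale):
    """Distance from the centroid in units of nominal standard deviations."""
    z = (np.asarray(coords, dtype=float) - centroid) / scale
    return np.linalg.norm(z, axis=1)


def calibrate_tau(held_out, centroid, scale, quantile=0.99):
    """Assumed rule: tau is a high quantile of held-out nominal distances."""
    return float(np.quantile(distance(held_out, centroid, scale), quantile))


rng = np.random.default_rng(0)
reference = rng.normal(size=(500, 2))  # stand-in for the frozen nominal UMAP
held_out = rng.normal(size=(200, 2))   # stand-in for held-out nominal frames

centroid, scale = fit_nominal(reference)
tau = calibrate_tau(held_out, centroid, scale)

new_frames = np.array([[0.1, -0.2], [8.0, 8.0]])  # one nominal, one far away
alarms = distance(new_frames, centroid, scale) > tau
```

A single centroid only makes sense if the nominal manifold forms one compact cluster; a multi-cluster reference would need a per-cluster or nearest-neighbour distance instead.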

Why UMAP and not t-SNE.

  • UMAP tends to preserve more global structure than t-SNE, so clusters of different defect types are more likely to stay separable across runs.
  • umap-learn supports .transform() on new points; standard t-SNE implementations (e.g., scikit-learn’s TSNE) offer no way to project an unseen frame into a fixed embedding.
  • Runs in seconds on a 1080Ti for \(\sim 10^4\) frames — cheap enough to recompute the reference monthly.

05. Case Study: Industrial Glass Cooling

Why Process Control?

  • Automation isn’t just for labs; it’s for manufacturing.
  • (McClarren Ch 9.4)
  • Problem: Cooling rate controls chemical reactions and physical stress.

RL Control Strategy

  • Physics: Coupled Radiation and Diffusion PDEs.
  • Input: Current Temp, Target Temp (Future).
  • Action: Change boundary temperature \(\Delta u\).
  • Reward: Inverse of squared difference from target.

  • Outcome: RL learns to “overheat” to reach targets faster, discovering system lags.

06. Synthesis & Self-Driving Labs

The “Self-Driving Lab” Framework

  • Automated Synthesis \(\rightarrow\) Automated Characterization \(\rightarrow\) ML Analysis \(\rightarrow\) Loop.
  • Integration of Units 1-14.
  • Challenges: Software APIs, data standards, and trust.

Conformal Classification — Emit Prediction Sets, Not Single Labels

The recipe (classification variant).

  • Train any classifier \(\hat f\) that emits softmax probabilities.
  • On a held-out calibration set, compute non-conformity scores

\[s_i = 1 - \hat f_{y_i}(x_i)\]

— one minus the predicted probability of the true class.

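  • Take \(\hat q\) as the \(\lceil (n+1)(1-\alpha) \rceil / n\) empirical quantile of the \(n\) calibration scores (approximately \(\text{Quantile}_{(1-\alpha)}(s)\) for large \(n\)).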
  • At test time, emit the prediction set

\[C(x) = \{\, y : 1 - \hat f_y(x) \le \hat q \,\}.\]
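
The recipe can be sketched in a few lines of NumPy. This is a minimal illustration, not a production implementation: the softmax outputs come from a synthetic three-class simulation rather than a trained classifier, the function names are hypothetical, and the threshold uses the finite-sample-corrected rank \(\lceil (n+1)(1-\alpha) \rceil\) from Angelopoulos and Bates (2023), which approaches the plain \((1-\alpha)\) quantile for large \(n\).

```python
import math
import numpy as np


def conformal_quantile(probs_cal, labels_cal, alpha):
    """Threshold q_hat from calibration softmax outputs and true labels."""
    n = len(labels_cal)
    # Non-conformity score: one minus the probability of the true class.
    scores = np.sort(1.0 - probs_cal[np.arange(n), labels_cal])
    k = math.ceil((n + 1) * (1.0 - alpha))  # finite-sample-corrected rank
    # If k exceeds n, every class must be included (scores never exceed 1).
    return 1.0 if k > n else float(scores[k - 1])


def prediction_set(probs, q_hat):
    """All classes y with 1 - f_y(x) <= q_hat."""
    return np.flatnonzero(1.0 - np.asarray(probs) <= q_hat).tolist()


def softmax(logits):
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)


def simulate(n):
    """Synthetic stand-in for a noisy 3-class classifier (not real data)."""
    labels = rng.integers(0, 3, size=n)
    logits = rng.normal(size=(n, 3))
    logits[np.arange(n), labels] += 2.0
    return softmax(logits), labels


probs_cal, labels_cal = simulate(500)
probs_test, labels_test = simulate(500)

q_hat = conformal_quantile(probs_cal, labels_cal, alpha=0.1)
sets = [prediction_set(p, q_hat) for p in probs_test]
coverage = np.mean([y in s for y, s in zip(labels_test, sets)])
```

With \(\alpha = 0.1\) the empirical coverage on exchangeable test data should be close to 90%. With this score a prediction set can also be empty, which a triage rule would typically treat like a multi-class set and escalate.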

Why this matters for automated defect detection.

  • Instead of class = scratch, the system emits class ∈ {scratch, stain} → send to operator.
  • The set size is a calibrated uncertainty signal — singleton sets mean the model is sure, multi-class sets mean it is not.
  • Plays cleanly with the §3-style closed-loop triage: singleton set → automate, multi-class set → escalate.

Recap: Unit 10

  • RL is the engine of automation.
  • Policy Gradients bridge control and deep learning.
  • Reward design is the most critical human task.
  • UMAP on a frozen nominal manifold gives streaming anomaly triage; conformal prediction sets turn classifiers into calibrated automate/escalate decisions.
  • Next: Handling the “unknown” (Uncertainty and Gaussian Processes).


References & Further Reading

  • McClarren (2021): Ch. 9 (Reinforcement Learning)
  • Murphy (2012): Ch. 11 (Data Fusion)
  • Neuer (2024): Ch. 7.3 (Automation & Causality)
  • McInnes, Healy & Melville (2018): “UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction.” arXiv preprint arXiv:1802.03426. (UMAP for streaming anomaly triage.)
  • Angelopoulos & Bates (2023): “A Gentle Introduction to Conformal Prediction and Distribution-Free Uncertainty Quantification.” Foundations and Trends in Machine Learning 16 (4): 494–591. https://doi.org/10.1561/2200000101. (Conformal prediction sets for automated defect classification.)

Example Notebook

Week 11: Anomaly Detection via Autoencoder — CahnHilliardDataset