ML for Characterization and Processing
Lecture 5: Convolutional Neural Networks for Microstructure Analysis
Prof. Dr. Philipp Pelz
FAU Erlangen-Nürnberg
Institute of Micro- and Nanostructure Research
Welcome
Week 5 — Convolutional Neural Networks for Microstructure Analysis
Goals for today:

- Explain the “Parameter Explosion” problem in MLPs
- Define convolution, kernels, stride, and padding
- Understand weight sharing and translation invariance
- Distinguish between Max and Average Pooling
- Review key CNN architectures (LeNet, AlexNet, ResNet)
- Explore materials science case studies (TEM, SEM)
Outline
- The Image Problem & MLP Limitations
- The Convolution Layer
- Architectural Principles: Local Connectivity & Weight Sharing
- The Pooling Layer & Hierarchical Features
- Key CNN Architectures
- Materials Science Case Studies
1. The Image Problem
Why MLPs Fail on High-Res Images
Microscopy images are often high-resolution (e.g., \(1024 \times 1024\) pixels).
Problems with standard MLPs:

- Parameter Explosion: A single hidden layer with 512 units for a 1 MP image requires >500M weights (a quick numerical check follows this list).
- Memory Cost: ~4 GB for one layer in double precision.
- Loss of Structure: Flattening into a 1D vector discards spatial correlations.
- No Invariance: Shifting a feature by one pixel makes it look “new” to the MLP.
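A quick back-of-the-envelope check of these numbers, in plain Python:

```python
# Sanity check of the parameter-explosion claim above.
pixels = 1024 * 1024          # flattened ~1 MP input
hidden = 512                  # units in the first dense layer
weights = pixels * hidden     # one weight per (pixel, unit) pair
print(f"{weights:,} weights")                    # 536,870,912 -> >500M
print(f"{weights * 8 / 1e9:.1f} GB in float64")  # ~4.3 GB for this layer alone
```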
2. The Convolution Layer
Discrete Convolution
The core operation of CNNs:

\[
(I * K)_{m,n} = \sum_{i} \sum_{j} I_{m-i,\,n-j}\, K_{i,j}
\]
- Kernel (Filter): A small matrix (e.g., \(3 \times 3\)) that “slides” over the image.
- Feature Map: The output highlighting specific patterns (edges, spots).
- Kernels as Detectors: Laplacian filters for edges, Gaussian filters for smoothing (a worked example follows this list).
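A minimal sketch of the sliding-kernel idea, applying a fixed Laplacian kernel to a toy image (in a trained CNN the kernel entries are learned; the image and `scipy` call here are purely illustrative):

```python
import numpy as np
from scipy.signal import convolve2d

# A bright square on a dark background stands in for a microscopy feature.
image = np.zeros((7, 7))
image[2:5, 2:5] = 1.0

# Fixed Laplacian kernel: responds to intensity changes, i.e., edges.
laplacian = np.array([[0.,  1., 0.],
                      [1., -4., 1.],
                      [0.,  1., 0.]])

feature_map = convolve2d(image, laplacian, mode="same")  # zero padding
print(feature_map)  # nonzero responses concentrate along the square's edges
```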
Stride and Padding
- Stride: Step size of the kernel. Stride \(> 1\) reduces output size (downsampling).
- Padding: Adding border pixels (usually zeros).
- Valid Padding: No padding, image shrinks.
- Same Padding: Preserves the input dimensions (shape checks for both options follow this list).
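A shape check of these options, assuming PyTorch's `nn.Conv2d` (the 8×8 input is illustrative):

```python
import torch
import torch.nn as nn

# Output size follows out = floor((n + 2p - k) / s) + 1.
x = torch.randn(1, 1, 8, 8)  # (batch, channels, height, width)

valid = nn.Conv2d(1, 1, kernel_size=3, stride=1, padding=0)  # valid padding
same  = nn.Conv2d(1, 1, kernel_size=3, stride=1, padding=1)  # same padding
down  = nn.Conv2d(1, 1, kernel_size=3, stride=2, padding=1)  # stride-2 downsampling

print(valid(x).shape)  # torch.Size([1, 1, 6, 6])  image shrinks
print(same(x).shape)   # torch.Size([1, 1, 8, 8])  dimensions preserved
print(down(x).shape)   # torch.Size([1, 1, 4, 4])  roughly halved
```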
3. Architectural Principles
Local Connectivity
Each neuron connects only to a small local patch (the kernel size). This drastically reduces parameters compared to “fully connected” layers.
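The reduction in numbers, assuming PyTorch (single-channel sizes chosen to match the slides; local connectivity combined with the weight sharing described next brings the count down to a handful of parameters):

```python
import torch.nn as nn

# One shared 3x3 kernel vs. a dense layer on a flattened 1 MP image.
conv = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3)
dense = nn.Linear(1024 * 1024, 512)

print(sum(p.numel() for p in conv.parameters()))   # 10 (3*3 weights + 1 bias)
print(sum(p.numel() for p in dense.parameters()))  # 536,871,424
```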
Weight Sharing
The same kernel is used across the entire image.

- Translation Equivariance: The same feature (e.g., a grain boundary) is detected regardless of its location; shifting the input simply shifts the feature map. Together with pooling, this yields approximate translation invariance (demonstrated in the sketch below).
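A short demonstration of this property, assuming PyTorch (the random kernel and toy input are illustrative):

```python
import torch
import torch.nn.functional as F

kernel = torch.randn(1, 1, 3, 3)     # one shared 3x3 filter
x = torch.zeros(1, 1, 8, 8)
x[0, 0, 2, 2] = 1.0                  # a single "feature" at (2, 2)
x_shift = torch.roll(x, shifts=(3, 3), dims=(2, 3))  # same feature at (5, 5)

y = F.conv2d(x, kernel, padding=1)
y_shift = F.conv2d(x_shift, kernel, padding=1)

# Shifting the input shifts the feature map by the same amount (up to edges).
print(torch.allclose(torch.roll(y, (3, 3), dims=(2, 3)), y_shift))  # True
```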
Receptive Field
- Definition: The region of the input image that affects a specific output neuron.
- Hierarchy: As layers are stacked, the receptive field grows, allowing deeper neurons to see larger structures (e.g., from edges to whole grains); see the sketch below.
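A small helper makes this concrete. The recursion \(r_l = r_{l-1} + (k_l - 1)\, j_{l-1}\), with jump \(j_l = j_{l-1}\, s_l\), is standard; the layer stacks below are illustrative:

```python
def receptive_field(layers):
    """layers: list of (kernel_size, stride) pairs, from input to output."""
    r, jump = 1, 1
    for k, s in layers:
        r += (k - 1) * jump  # each layer widens the field seen in the input
        jump *= s            # striding compounds the step between neurons
    return r

# Three stacked 3x3 convs see a 7x7 input patch; inserting a stride-2
# pooling step before the last conv widens it to 10x10.
print(receptive_field([(3, 1), (3, 1), (3, 1)]))          # 7
print(receptive_field([(3, 1), (3, 1), (2, 2), (3, 1)]))  # 10
```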
4. The Pooling Layer
Downsampling and Robustness
- Max Pooling: Returns the maximum value in a window. Preserves the most prominent signals (e.g., bright spots in TEM).
- Average Pooling: Returns the mean, giving a smoother downsampling (both modes are compared in the sketch after this list).
- Shift Invariance: Makes the representation less sensitive to small translations of features.
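A toy comparison of the two modes, assuming PyTorch (the 4×4 feature map is illustrative):

```python
import torch
import torch.nn as nn

# A feature map with one bright "spot", e.g., a diffraction peak.
x = torch.tensor([[[[0., 0., 0., 0.],
                    [0., 9., 0., 0.],
                    [0., 0., 1., 0.],
                    [0., 0., 0., 1.]]]])

print(nn.MaxPool2d(2)(x))  # [[9., 0.], [0., 1.]]    keeps the bright spot
print(nn.AvgPool2d(2)(x))  # [[2.25, 0.], [0., 0.5]] smoother, dilutes the peak
```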
Hierarchical Feature Maps
- First Layers: Low-level features (edges, grain boundaries).
- Middle Layers: Motifs, shapes, precipitate clusters.
- Deep Layers: Complex microstructural phases, martensite laths, melt pool morphologies.
5. Key CNN Architectures
- LeNet-5 (1998): Established the conv-pool-dense pattern for digit recognition.
- AlexNet (2012): The breakthrough for large-scale image classification (ImageNet). Used ReLU activations and GPU training.
- ResNet (2015): Introduced Skip Connections (Residual Blocks) to mitigate vanishing gradients, enabling networks with hundreds of layers (a minimal block is sketched below).
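A minimal residual block in the spirit of ResNet; this is a simplified sketch (no batch normalization, illustrative channel count), not the exact published architecture:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.conv2(self.relu(self.conv1(x)))
        return self.relu(out + x)  # skip connection: gradients flow past the convs

x = torch.randn(1, 16, 32, 32)
print(ResidualBlock(16)(x).shape)  # torch.Size([1, 16, 32, 32])
```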
6. Materials Science Case Studies
Phase Segmentation (TEM)
- Using U-Net to segment crystalline Au nanoparticles from amorphous backgrounds.
- Comparable to human expert performance.
Synthetic Data for SEM
- Training on synthetic Voronoi microstructures to segment real SEM grain boundaries.
- Overcomes the lack of large, hand-labeled microscopy datasets.
Property Prediction (Ising Model)
- Predicting temperature (\(T > T_c\) vs. \(T < T_c\)) directly from microstructural snapshots.
- Moving from “just pictures” to “predictive physics” through CNNs (a minimal classifier sketch follows).
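A hypothetical minimal classifier for such snapshots, in the spirit of the Week 5 notebooks (layer sizes here are illustrative, not the notebooks' exact architecture):

```python
import torch
import torch.nn as nn

# Two conv-pool stages and a linear head: logits for T < T_c vs. T > T_c.
model = nn.Sequential(
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 16 -> 8
    nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 8 -> 4
    nn.Flatten(),
    nn.Linear(16 * 4 * 4, 2),
)

# A batch of random +/-1 spin configurations stands in for Ising snapshots.
spins = torch.randint(0, 2, (32, 1, 16, 16)).float() * 2 - 1
print(model(spins).shape)  # torch.Size([32, 2])
```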
Summary
- CNNs solve the parameter explosion by using local connectivity and weight sharing.
- Convolutions act as learnable filters that discover hierarchical features.
- Pooling provides robustness and reduces dimensionality.
- Modern architectures (ResNet, U-Net) enable complex microstructural analysis.
- Case studies show success in TEM phase segmentation and SEM grain boundary detection.
Example Notebooks
Week 5: First CNN on Microstructures — IsingDataset (16×16)
Week 5: Full CNN Training — IsingDataset (64×64)