ML for Characterization and Processing
Lecture 5: Convolutional Neural Networks for Microstructure Analysis
Prof. Dr. Philipp Pelz
FAU Erlangen-Nürnberg
Institute of Micro- and Nanostructure Research
Welcome
Week 5 — Convolutional Neural Networks for Microstructure Analysis
Goals for today:

- Explain the “Parameter Explosion” problem in MLPs
- Define convolution, kernels, stride, and padding
- Understand weight sharing and translation invariance
- Distinguish between Max and Average Pooling
- Review key CNN architectures (LeNet, AlexNet, ResNet)
- Explore materials science case studies (TEM, SEM)
Outline
- The Image Problem & MLP Limitations
- The Convolution Layer
- Architectural Principles: Local Connectivity & Weight Sharing
- The Pooling Layer & Hierarchical Features
- Key CNN Architectures
- Materials Science Case Studies
1. The Image Problem
Why MLPs Fail on High-Res Images
Microscopy images are often high-resolution (e.g., \(1024 \times 1024\) pixels).
Problems with standard MLPs:

- Parameter Explosion: A single hidden layer with 512 units for a 1 MP image requires >500M weights (a quick numerical check follows this list).
- Memory Cost: ~4 GB for one layer in double precision.
- Loss of Structure: Flattening into a 1D vector discards spatial correlations.
- No Invariance: Shifting a feature by one pixel makes it look “new” to the MLP.
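A quick back-of-the-envelope check of these numbers, in plain Python:

```python
# Sanity check of the parameter-explosion claim above.
pixels = 1024 * 1024          # flattened ~1 MP input
hidden = 512                  # units in the first dense layer
weights = pixels * hidden     # one weight per (pixel, unit) pair
print(f"{weights:,} weights")                    # 536,870,912 -> >500M
print(f"{weights * 8 / 1e9:.1f} GB in float64")  # ~4.3 GB for this layer alone
```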
2. The Convolution Layer
Discrete Convolution
The core operation of CNNs:

\[
(I * K)_{m,n} = \sum_{i} \sum_{j} I_{m-i,\,n-j}\, K_{i,j}
\]
- Kernel (Filter): A small matrix (e.g., \(3 \times 3\)) that “slides” over the image.
- Feature Map: The output highlighting specific patterns (edges, spots).
- Kernels as Detectors: Laplacian filters for edges, Gaussian filters for smoothing (a worked example follows this list).
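A minimal sketch of the sliding-kernel idea, applying a fixed Laplacian kernel to a toy image (in a trained CNN the kernel entries are learned; the image and `scipy` call here are purely illustrative):

```python
import numpy as np
from scipy.signal import convolve2d

# A bright square on a dark background stands in for a microscopy feature.
image = np.zeros((7, 7))
image[2:5, 2:5] = 1.0

# Fixed Laplacian kernel: responds to intensity changes, i.e., edges.
laplacian = np.array([[0.,  1., 0.],
                      [1., -4., 1.],
                      [0.,  1., 0.]])

feature_map = convolve2d(image, laplacian, mode="same")  # zero padding
print(feature_map)  # nonzero responses concentrate along the square's edges
```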
Stride and Padding
- Stride: Step size of the kernel. Stride \(> 1\) reduces output size (downsampling).
- Padding: Adding border pixels (usually zeros).
- Valid Padding: No padding, image shrinks.
- Same Padding: Preserves the input dimensions (shape checks for both options follow this list).
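A shape check of these options, assuming PyTorch's `nn.Conv2d` (the 8×8 input is illustrative):

```python
import torch
import torch.nn as nn

# Output size follows out = floor((n + 2p - k) / s) + 1.
x = torch.randn(1, 1, 8, 8)  # (batch, channels, height, width)

valid = nn.Conv2d(1, 1, kernel_size=3, stride=1, padding=0)  # valid padding
same  = nn.Conv2d(1, 1, kernel_size=3, stride=1, padding=1)  # same padding
down  = nn.Conv2d(1, 1, kernel_size=3, stride=2, padding=1)  # stride-2 downsampling

print(valid(x).shape)  # torch.Size([1, 1, 6, 6])  image shrinks
print(same(x).shape)   # torch.Size([1, 1, 8, 8])  dimensions preserved
print(down(x).shape)   # torch.Size([1, 1, 4, 4])  roughly halved
```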
3. Architectural Principles
Local Connectivity
Each neuron connects only to a small local patch (the kernel size). This drastically reduces parameters compared to “fully connected” layers.
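The reduction in numbers, assuming PyTorch (single-channel sizes chosen to match the slides; local connectivity combined with the weight sharing described next brings the count down to a handful of parameters):

```python
import torch.nn as nn

# One shared 3x3 kernel vs. a dense layer on a flattened 1 MP image.
conv = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3)
dense = nn.Linear(1024 * 1024, 512)

print(sum(p.numel() for p in conv.parameters()))   # 10 (3*3 weights + 1 bias)
print(sum(p.numel() for p in dense.parameters()))  # 536,871,424
```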
Weight Sharing
The same kernel is used across the entire image.

- Translation Equivariance: The same feature (e.g., a grain boundary) is detected regardless of its location; shifting the input simply shifts the feature map. Together with pooling, this yields approximate translation invariance (demonstrated in the sketch below).
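A short demonstration of this property, assuming PyTorch (the random kernel and toy input are illustrative):

```python
import torch
import torch.nn.functional as F

kernel = torch.randn(1, 1, 3, 3)     # one shared 3x3 filter
x = torch.zeros(1, 1, 8, 8)
x[0, 0, 2, 2] = 1.0                  # a single "feature" at (2, 2)
x_shift = torch.roll(x, shifts=(3, 3), dims=(2, 3))  # same feature at (5, 5)

y = F.conv2d(x, kernel, padding=1)
y_shift = F.conv2d(x_shift, kernel, padding=1)

# Shifting the input shifts the feature map by the same amount (up to edges).
print(torch.allclose(torch.roll(y, (3, 3), dims=(2, 3)), y_shift))  # True
```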
Receptive Field
- Definition: The region of the input image that affects a specific output neuron.
- Hierarchy: As layers are stacked, the receptive field grows, allowing deeper neurons to see larger structures (e.g., from edges to whole grains); see the sketch below.
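A small helper makes this concrete. The recursion \(r_l = r_{l-1} + (k_l - 1)\, j_{l-1}\), with jump \(j_l = j_{l-1}\, s_l\), is standard; the layer stacks below are illustrative:

```python
def receptive_field(layers):
    """layers: list of (kernel_size, stride) pairs, from input to output."""
    r, jump = 1, 1
    for k, s in layers:
        r += (k - 1) * jump  # each layer widens the field seen in the input
        jump *= s            # striding compounds the step between neurons
    return r

# Three stacked 3x3 convs see a 7x7 input patch; inserting a stride-2
# pooling step before the last conv widens it to 10x10.
print(receptive_field([(3, 1), (3, 1), (3, 1)]))          # 7
print(receptive_field([(3, 1), (3, 1), (2, 2), (3, 1)]))  # 10
```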
4. The Pooling Layer
Downsampling and Robustness
- Max Pooling: Returns the maximum value in a window. Preserves the most prominent signals (e.g., bright spots in TEM).
- Average Pooling: Returns the mean, giving a smoother downsampling (both modes are compared in the sketch after this list).
- Shift Invariance: Makes the representation less sensitive to small translations of features.
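A toy comparison of the two modes, assuming PyTorch (the 4×4 feature map is illustrative):

```python
import torch
import torch.nn as nn

# A feature map with one bright "spot", e.g., a diffraction peak.
x = torch.tensor([[[[0., 0., 0., 0.],
                    [0., 9., 0., 0.],
                    [0., 0., 1., 0.],
                    [0., 0., 0., 1.]]]])

print(nn.MaxPool2d(2)(x))  # [[9., 0.], [0., 1.]]    keeps the bright spot
print(nn.AvgPool2d(2)(x))  # [[2.25, 0.], [0., 0.5]] smoother, dilutes the peak
```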
Hierarchical Feature Maps
- First Layers: Low-level features (edges, grain boundaries).
- Middle Layers: Motifs, shapes, precipitate clusters.
- Deep Layers: Complex microstructural phases, martensite laths, melt pool morphologies.
5. Key CNN Architectures
- LeNet-5 (1998): Established the conv-pool-dense pattern for digit recognition.
- AlexNet (2012): The breakthrough for large-scale image classification (ImageNet). Used ReLU activations and GPU training.
- ResNet (2015): Introduced Skip Connections (Residual Blocks) to mitigate vanishing gradients, enabling networks with hundreds of layers (a minimal block is sketched below).
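A minimal residual block in the spirit of ResNet; this is a simplified sketch (no batch normalization, illustrative channel count), not the exact published architecture:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.conv2(self.relu(self.conv1(x)))
        return self.relu(out + x)  # skip connection: gradients flow past the convs

x = torch.randn(1, 16, 32, 32)
print(ResidualBlock(16)(x).shape)  # torch.Size([1, 16, 32, 32])
```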
6. Materials Science Case Studies
Phase Segmentation (TEM)
- Using U-Net to segment crystalline Au nanoparticles from amorphous backgrounds.
- Comparable to human expert performance.
Synthetic Data for SEM
- Training on synthetic Voronoi microstructures to segment real SEM grain boundaries.
- Overcomes the lack of large, hand-labeled microscopy datasets.
Property Prediction (Ising Model)
- Predicting temperature (\(T > T_c\) vs. \(T < T_c\)) directly from microstructural snapshots.
- Moving from “just pictures” to “predictive physics” through CNNs (a minimal classifier sketch follows).
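A hypothetical minimal classifier for such snapshots, in the spirit of the Week 5 notebooks (layer sizes here are illustrative, not the notebooks' exact architecture):

```python
import torch
import torch.nn as nn

# Two conv-pool stages and a linear head: logits for T < T_c vs. T > T_c.
model = nn.Sequential(
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 16 -> 8
    nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 8 -> 4
    nn.Flatten(),
    nn.Linear(16 * 4 * 4, 2),
)

# A batch of random +/-1 spin configurations stands in for Ising snapshots.
spins = torch.randint(0, 2, (32, 1, 16, 16)).float() * 2 - 1
print(model(spins).shape)  # torch.Size([32, 2])
```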
Summary
- CNNs solve the parameter explosion by using local connectivity and weight sharing.
- Convolutions act as learnable filters that discover hierarchical features.
- Pooling provides robustness and reduces dimensionality.
- Modern architectures (ResNet, U-Net) enable complex microstructural analysis.
- Case studies show success in TEM phase segmentation and SEM grain boundary detection.
Example Notebooks
Week 5: First CNN on Microstructures — IsingDataset (16×16)
Week 5: Full CNN Training — IsingDataset (64×64)