ML for Characterization and Processing
Lecture 5: Convolutional Neural Networks for Microstructure Analysis

Prof. Dr. Philipp Pelz

FAU Erlangen-Nürnberg

Institute of Micro- and Nanostructure Research

Welcome

Week 5 — Convolutional Neural Networks for Microstructure Analysis

Goals for today:

  • Explain the “Parameter Explosion” problem in MLPs
  • Define convolution, kernels, stride, and padding
  • Understand weight sharing and translation invariance
  • Distinguish between Max and Average Pooling
  • Review key CNN architectures (LeNet, AlexNet, ResNet)
  • Explore materials science case studies (TEM, SEM)

Outline

  1. The Image Problem & MLP Limitations
  2. The Convolution Layer
  3. Architectural Principles: Local Connectivity & Weight Sharing
  4. The Pooling Layer & Hierarchical Features
  5. Key CNN Architectures
  6. Materials Science Case Studies

1. The Image Problem

Why MLPs Fail on High-Res Images

Microscopy images are often high-resolution (e.g., \(1024 \times 1024\) pixels).

Problems with standard MLPs:

  • Parameter Explosion: A single hidden layer with 512 units for a 1 MP image requires >500M weights (see the sketch below).
  • Memory Cost: ~4 GB for that one layer’s weights in double precision.
  • Loss of Structure: Flattening into a 1D vector ignores spatial correlations.
  • No Invariance: Moving a feature by one pixel makes it “new” to the MLP.
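A quick back-of-the-envelope check of these numbers (a minimal sketch; the 512-unit MLP layer matches the example above, while the 32-kernel conv layer is an illustrative choice):

```python
# Parameter count: one fully connected layer vs. one small conv layer
# for a 1024 x 1024 grayscale image (~1 MP).
H, W = 1024, 1024
n_inputs = H * W          # flattened input dimension
n_hidden = 512            # hidden units in the MLP layer

mlp_weights = n_inputs * n_hidden       # weight matrix only, biases ignored
conv_weights = 32 * 1 * 3 * 3           # 32 kernels, 1 input channel, 3x3 each

print(f"MLP layer:  {mlp_weights:,} weights "
      f"(~{mlp_weights * 8 / 1e9:.1f} GB in float64)")   # ~537M weights, ~4.3 GB
print(f"Conv layer: {conv_weights:,} weights")            # 288 weights
```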

2. The Convolution Layer

Discrete Convolution

The core operation of CNNs:

\[ (I * K)_{m,n} = \sum_{i} \sum_{j} I_{m-i,\, n-j} \, K_{i,j} \]

  • Kernel (Filter): A small matrix (e.g., \(3 \times 3\)) that “slides” over the image.
  • Feature Map: The output highlighting specific patterns (edges, spots).
  • Kernels as Detectors: Laplacian filters for edges, Gaussian for smoothing.
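A minimal PyTorch sketch of a fixed Laplacian kernel acting as an edge detector on a stand-in image (note that `conv2d` in deep learning frameworks actually computes cross-correlation, i.e., no kernel flip; in a trained CNN the kernel entries would be learned rather than hand-set):

```python
import torch
import torch.nn.functional as F

image = torch.randn(1, 1, 64, 64)    # (batch, channels, H, W); stand-in micrograph

# Hand-crafted 3x3 Laplacian kernel: responds strongly at edges.
laplacian = torch.tensor([[0., 1., 0.],
                          [1., -4., 1.],
                          [0., 1., 0.]]).reshape(1, 1, 3, 3)

feature_map = F.conv2d(image, laplacian, padding=1)   # padding=1 keeps 64x64
print(feature_map.shape)                              # torch.Size([1, 1, 64, 64])
```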

Stride and Padding

  • Stride: Step size of the kernel. Stride \(> 1\) reduces output size (downsampling).
  • Padding: Adding border pixels (usually zeros).
    • Valid Padding: No padding, image shrinks.
    • Same Padding: Preserves input dimensions.
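For input size \(n\), kernel size \(k\), padding \(p\), and stride \(s\), the output size is \(\lfloor (n + 2p - k)/s \rfloor + 1\). A short PyTorch sketch of the three common cases (channel counts are illustrative):

```python
import torch
import torch.nn as nn

x = torch.randn(1, 1, 64, 64)

valid = nn.Conv2d(1, 8, kernel_size=3, padding=0)              # "valid": shrinks
same = nn.Conv2d(1, 8, kernel_size=3, padding=1)               # "same": preserved
strided = nn.Conv2d(1, 8, kernel_size=3, padding=1, stride=2)  # downsampling

print(valid(x).shape)    # torch.Size([1, 8, 62, 62]);  (64 + 0 - 3)/1 + 1 = 62
print(same(x).shape)     # torch.Size([1, 8, 64, 64]);  (64 + 2 - 3)/1 + 1 = 64
print(strided(x).shape)  # torch.Size([1, 8, 32, 32]);  floor((64 + 2 - 3)/2) + 1 = 32
```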

3. Architectural Principles

Local Connectivity

Each neuron connects only to a small local patch (the kernel size). This drastically reduces parameters compared to “fully connected” layers.

Weight Sharing

The same kernel is used across the entire image.

  • Translation Equivariance: The same feature (e.g., a grain boundary) is detected regardless of its location; shifting the input simply shifts the feature map (see the demo below). Together with pooling, this yields approximate translation invariance.
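A minimal demo of this property with a random fixed kernel: shifting the input by 5 pixels shifts the feature map by the same amount (all values here are illustrative):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
kernel = torch.randn(1, 1, 3, 3)             # one fixed (random) shared kernel

image = torch.zeros(1, 1, 32, 32)
image[0, 0, 10, 10] = 1.0                    # a single bright "feature"
shifted = torch.roll(image, shifts=(5, 5), dims=(2, 3))

out = F.conv2d(image, kernel, padding=1)
out_shifted = F.conv2d(shifted, kernel, padding=1)

# The response to the shifted input equals the shifted response (away from borders).
print(torch.allclose(torch.roll(out, shifts=(5, 5), dims=(2, 3)), out_shifted))  # True
```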

Receptive Field

  • Definition: The region of the input image that affects a specific output neuron.
  • Hierarchy: As we stack layers, the receptive field increases, allowing deeper neurons to see larger structures (e.g., from edges to whole grains).
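A small sketch of this growth, using the standard recursion \(r_l = r_{l-1} + (k_l - 1) \prod_{i<l} s_i\) (the layer configuration below is illustrative):

```python
# Receptive field of stacked conv layers: each layer adds (k - 1) times the
# product of all previous strides.
layers = [(3, 1), (3, 1), (3, 2), (3, 1)]    # (kernel_size, stride) per layer

rf, jump = 1, 1                              # receptive field, effective stride
for k, s in layers:
    rf += (k - 1) * jump
    jump *= s
    print(f"kernel={k}, stride={s} -> receptive field {rf}x{rf}")
# Three stacked 3x3 stride-1 convs already see 7x7; after the strided layer,
# every further 3x3 conv grows the field twice as fast.
```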

4. The Pooling Layer

Downsampling and Robustness

  • Max Pooling: Returns the maximum value in a window. Preserves the most prominent signals (e.g., bright spots in TEM).
  • Average Pooling: Returns the mean. Provides a smoother downsampling.
  • Shift Invariance: Makes the representation less sensitive to small translations of features.
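A minimal comparison on a hand-made \(4 \times 4\) input (values chosen only to make the difference visible):

```python
import torch
import torch.nn as nn

x = torch.tensor([[[[1., 3., 2., 0.],
                    [4., 8., 1., 1.],
                    [0., 2., 5., 7.],
                    [1., 1., 6., 2.]]]])     # shape (1, 1, 4, 4)

max_pool = nn.MaxPool2d(kernel_size=2)      # keeps the brightest value per window
avg_pool = nn.AvgPool2d(kernel_size=2)      # averages each window (smoother)

print(max_pool(x))   # [[8., 2.], [2., 7.]]
print(avg_pool(x))   # [[4., 1.], [1., 5.]]
```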

Hierarchical Feature Maps

  • First Layers: Low-level features (edges, grain boundaries).
  • Middle Layers: Motifs, shapes, precipitate clusters.
  • Deep Layers: Complex microstructural phases, martensite laths, melt pool morphologies.

5. Key CNN Architectures

  • LeNet-5 (1998): Established the conv-pool-dense pattern for digit recognition.
  • AlexNet (2012): The breakthrough for large-scale image classification (ImageNet). Used ReLU and GPUs.
  • ResNet (2015): Introduced Skip Connections (Residual Blocks) to solve vanishing gradients, enabling networks with hundreds of layers.
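A minimal PyTorch sketch of a residual block in the spirit of ResNet (simplified: equal channel counts and stride 1, so the identity shortcut applies without a projection):

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)    # skip connection: gradients flow through "+ x"

block = ResidualBlock(16)
print(block(torch.randn(1, 16, 32, 32)).shape)   # torch.Size([1, 16, 32, 32])
```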

6. Materials Science Case Studies

Phase Segmentation (TEM)

  • Using U-Net to segment crystalline Au nanoparticles from amorphous backgrounds.
  • Comparable to human expert performance.
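A heavily simplified U-Net-style sketch for per-pixel binary segmentation (one downsampling stage and one skip connection; not the architecture used in the case study):

```python
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU())
        self.down = nn.MaxPool2d(2)
        self.mid = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)
        self.dec = nn.Sequential(nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(16, 1, 1))    # per-pixel logit

    def forward(self, x):
        skip = self.enc(x)                         # high-resolution features
        mid = self.mid(self.down(skip))            # coarse context
        up = self.up(mid)                          # back to full resolution
        return self.dec(torch.cat([up, skip], 1))  # skip connection: concatenate

model = TinyUNet()
print(model(torch.randn(1, 1, 64, 64)).shape)      # torch.Size([1, 1, 64, 64])
```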

Synthetic Data for SEM

  • Training on synthetic Voronoi microstructures to segment real SEM grain boundaries.
  • Overcomes the lack of large, hand-labeled microscopy datasets.
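A minimal NumPy sketch of the generation idea: label each pixel by its nearest random seed so that cells become grains, and mark label changes as boundaries (sizes and counts are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
H = W = 128
seeds = rng.uniform(0, H, size=(30, 2))        # 30 random grain seeds

yy, xx = np.mgrid[0:H, 0:W]
dist = (yy[..., None] - seeds[:, 0]) ** 2 + (xx[..., None] - seeds[:, 1]) ** 2
labels = dist.argmin(axis=-1)                  # Voronoi cell (grain) per pixel

# Grain-boundary mask: pixels whose label differs from a right/down neighbor.
boundary = np.zeros((H, W), dtype=bool)
boundary[:-1, :] |= labels[:-1, :] != labels[1:, :]
boundary[:, :-1] |= labels[:, :-1] != labels[:, 1:]
print(labels.shape, boundary.mean())           # label map, boundary pixel fraction
```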

Property Prediction (Ising Model)

  • Predicting temperature (\(T > T_c\) vs. \(T < T_c\)) directly from microstructural snapshots.
  • Moving from “just pictures” to “predictive physics” through CNNs.
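A small CNN classifier for \(16 \times 16\) snapshots in the spirit of the Week 5 notebooks (a minimal sketch; the notebooks define their own dataset and model):

```python
import torch
import torch.nn as nn

class IsingCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 16 -> 8
            nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 8 -> 4
        )
        self.classifier = nn.Linear(16 * 4 * 4, 2)   # logits: T < T_c vs. T > T_c

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

spins = torch.randint(0, 2, (4, 1, 16, 16)).float() * 2 - 1   # +-1 spin snapshots
print(IsingCNN()(spins).shape)                                # torch.Size([4, 2])
```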

Summary

  • CNNs solve the parameter explosion by using local connectivity and weight sharing.
  • Convolutions act as learnable filters that discover hierarchical features.
  • Pooling provides robustness and reduces dimensionality.
  • Modern architectures (ResNet, U-Net) enable complex microstructural analysis.
  • Case studies show success in TEM phase segmentation and SEM grain boundary detection.

Example Notebooks

Week 5: First CNN on Microstructures — IsingDataset (16×16)

Week 5: Full CNN Training — IsingDataset (64×64)