FAU Erlangen-Nürnberg
What does a neural network for an atomic system look like?
What this unit is not.
Recap — what we already have
Today — Unit 9 in one line
By the end of 90 minutes, you can:
Assumed primitives
Why “regular grid” is the catch today
The graph encoding
MG U7 contract still holds today. PBC are enforced at graph-construction time; cutoff and RBF parameters are documented; reproducibility starts with a deterministic graph.
The MG U6 picture
Today’s question
Setup
The failure mode
The four physical symmetries of an atomic-system property
Two ways to handle a symmetry
Scalar property: invariance
Vector / tensor property: equivariance
Cardinal rule. A scalar-only architecture can produce forces — via autograd through the energy. A vector-only architecture can produce energies — via an invariant readout. But mixing the two carelessly breaks physics.
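A minimal numpy check of the two behaviours, using a hypothetical toy energy built only from pairwise distances: the scalar is invariant under a rotation, while its (numerically differentiated) forces rotate with the atoms.

```python
import numpy as np

rng = np.random.default_rng(0)
pos = rng.normal(size=(4, 3))                       # 4 atoms at arbitrary positions

def toy_energy(p):
    """Hypothetical scalar built only from interatomic distances."""
    d = np.linalg.norm(p[:, None] - p[None, :], axis=-1)
    return np.sum(np.triu(d, k=1) ** 2)

def numerical_forces(p, eps=1e-5):
    """Central-difference forces F = -dE/dr."""
    f = np.zeros_like(p)
    for i in range(p.shape[0]):
        for k in range(3):
            dp = np.zeros_like(p); dp[i, k] = eps
            f[i, k] = -(toy_energy(p + dp) - toy_energy(p - dp)) / (2 * eps)
    return f

Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))        # random orthogonal matrix
if np.linalg.det(Q) < 0:
    Q[:, 0] *= -1                                   # make it a proper rotation

print(np.isclose(toy_energy(pos), toy_energy(pos @ Q.T)))                      # invariant scalar
print(np.allclose(numerical_forces(pos @ Q.T), numerical_forces(pos) @ Q.T))   # equivariant forces
```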
E(3): the Euclidean group in 3D
SE(3): orientation-preserving subgroup
Why this matters for materials
Permutation invariance via aggregation
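Why sum-aggregation delivers permutation invariance, in a few lines of PyTorch (all shapes illustrative): relabelling the atoms permutes the rows, but the pooled readout is unchanged.

```python
import torch

x = torch.randn(6, 16)                    # per-atom features: 6 atoms, 16 channels
perm = torch.randperm(6)                  # an arbitrary relabelling of the atoms

pooled = x.sum(dim=0)                     # sum-aggregation readout
pooled_after_perm = x[perm].sum(dim=0)

print(torch.allclose(pooled, pooled_after_perm))   # True: the readout cannot see atom order
```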
PBC at graph-construction time
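A sketch of the graph-construction step for a periodic cell, assuming ASE is available; the cutoff is an illustrative value and, per the MG U7 contract, a documented hyperparameter in practice.

```python
import numpy as np
from ase.build import bulk
from ase.neighborlist import neighbor_list

atoms = bulk("Cu", "fcc", a=3.6)          # one-atom primitive cell, PBC switched on
cutoff = 5.0                              # Å, illustrative value

# i, j: directed edge indices; S: integer cell shifts for edges crossing the boundary
i, j, S = neighbor_list("ijS", atoms, cutoff)

# edge vectors with the periodic shift applied, then edge lengths r_ij
vec = atoms.positions[j] - atoms.positions[i] + S @ np.asarray(atoms.cell)
r_ij = np.linalg.norm(vec, axis=1)

print(f"{len(i)} directed edges, r_min = {r_ij.min():.2f} Å, r_max = {r_ij.max():.2f} Å")
```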
Setting (schutt2017schnet?; schutt2018schnet?)
The central object: a continuous-filter convolution
The discrete-CNN obstruction
The continuous-filter answer
\[ W(r) = \text{MLP}\!\left(\text{RBF}(r)\right) \in \mathbb{R}^F \]
Interaction block
For each atom \(i\) and each interaction layer \(t = 1, \ldots, T\):
\[ \mathbf{x}_i^{(t+1)} = \mathbf{x}_i^{(t)} + \sum_{j \in N(i)} \mathbf{x}_j^{(t)} \odot W^{(t)}\!\left(r_{ij}\right) \]
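A compact PyTorch sketch of the two equations above: a Gaussian RBF expansion, the filter-generating MLP, and one residual interaction update. Layer sizes, activation, and the RBF grid are illustrative placeholders, not SchNet's published hyperparameters.

```python
import torch
import torch.nn as nn

class GaussianRBF(nn.Module):
    """Expand a scalar distance into a vector of Gaussian basis values."""
    def __init__(self, n_rbf=20, r_cut=5.0, gamma=10.0):
        super().__init__()
        self.register_buffer("centers", torch.linspace(0.0, r_cut, n_rbf))
        self.gamma = gamma
    def forward(self, r):                                # r: (E,) edge lengths
        return torch.exp(-self.gamma * (r[:, None] - self.centers) ** 2)   # (E, n_rbf)

class CFConv(nn.Module):
    """One continuous-filter convolution: x_i <- x_i + sum_j x_j * W(r_ij)."""
    def __init__(self, n_feat=64, n_rbf=20):
        super().__init__()
        self.rbf = GaussianRBF(n_rbf)
        self.filter_net = nn.Sequential(                 # W(r) = MLP(RBF(r))
            nn.Linear(n_rbf, n_feat), nn.SiLU(), nn.Linear(n_feat, n_feat))
    def forward(self, x, edge_index, r):                 # x: (N, F); edge_index: (2, E); r: (E,)
        src, dst = edge_index                            # edge j -> i: src = j, dst = i
        messages = x[src] * self.filter_net(self.rbf(r)) # element-wise filter per edge
        agg = torch.zeros_like(x).index_add_(0, dst, messages)   # permutation-invariant sum over N(i)
        return x + agg                                   # residual update
```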
Initialisation and readout
QM9 — the canonical molecular benchmark
SchNet’s QM9 result
Captures
Misses
The cutoff problem
The fix: a smooth envelope
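One common envelope choice, sketched in PyTorch (a cosine cutoff; the exact form in any given implementation may differ): multiply each filter by \(f_{\rm cut}(r)\) so messages decay smoothly to zero at the cutoff instead of being truncated.

```python
import math
import torch

def cosine_cutoff(r: torch.Tensor, r_cut: float = 5.0) -> torch.Tensor:
    """Smooth envelope: 1 at r = 0, exactly 0 for r >= r_cut, no discontinuity."""
    f = 0.5 * (torch.cos(math.pi * r / r_cut) + 1.0)
    return torch.where(r < r_cut, f, torch.zeros_like(r))

# inside CFConv.forward the messages would then be damped per edge, e.g.
#   messages = x[src] * self.filter_net(self.rbf(r)) * cosine_cutoff(r)[:, None]
```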
One-slide SchNet summary
The SchNet \(\to\) CGCNN move
The CGCNN setting (xie2018cgcnn?)
Why CGCNN matters
Edge feature vector \(u_{ij}\)
Atom feature vector \(\mathbf{v}_i\)
The split. Chemistry enters via \(\mathbf{v}_i\). Geometry enters via \(u_{ij}\). The message-passing step on the next slide is what combines them.
The update
For each atom \(i\), layer \(t\):
\[ z_{ij} = [\mathbf{v}_i^{(t)} \,\|\, \mathbf{v}_j^{(t)} \,\|\, u_{ij}] \]
\[ \mathbf{v}_i^{(t+1)} = \mathbf{v}_i^{(t)} + \sum_{j \in N(i)} \sigma\!\left(W_z\, z_{ij} + b_z\right) \odot g\!\left(W_s\, z_{ij} + b_s\right) \]
Reading the equation
Per-atom contributions
Crystal-level readout
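A PyTorch sketch of the gated update and the mean-pool crystal readout; feature dimensions and the choice \(g = \mathrm{softplus}\) are illustrative stand-ins for the published CGCNN settings. The edge features \(u_{ij}\) and atom features \(\mathbf{v}_i\) are assumed to come from the featurisation step above.

```python
import torch
import torch.nn as nn

class CGCNNConv(nn.Module):
    """v_i <- v_i + sum_j sigma(W_z z_ij + b_z) * g(W_s z_ij + b_s),  z_ij = [v_i || v_j || u_ij]."""
    def __init__(self, node_dim=64, edge_dim=40):
        super().__init__()
        z_dim = 2 * node_dim + edge_dim
        self.lin_gate = nn.Linear(z_dim, node_dim)       # W_z, b_z  (sigmoid gate)
        self.lin_core = nn.Linear(z_dim, node_dim)       # W_s, b_s  (message content)
    def forward(self, v, u, edge_index):                 # v: (N, node_dim); u: (E, edge_dim)
        src, dst = edge_index                            # edge j -> i: src = j, dst = i
        z = torch.cat([v[dst], v[src], u], dim=-1)       # z_ij = [v_i || v_j || u_ij]
        msg = torch.sigmoid(self.lin_gate(z)) * nn.functional.softplus(self.lin_core(z))
        agg = torch.zeros_like(v).index_add_(0, dst, msg)
        return v + agg                                   # residual node update

def crystal_readout(v: torch.Tensor) -> torch.Tensor:
    """Mean over atoms: per-atom contributions -> one fixed-length crystal vector."""
    return v.mean(dim=0)
```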
MG U7 sketched the schematic
CGCNN instantiates each piece
The pedagogical reason for ordering MG U7 before MG U9. U7 builds the interface; U9 fills in the implementation. Once a student has the U7 schematic, every architecture in §D–§F is a different way of filling slots in that schematic.
Materials Project benchmarks (2018–2020)
Industrial use cases
What CGCNN does not see
Why these blind spots matter
The MEGNet generalisation (chen2019megnet?)
Conditioning on external state
Readout via the global state
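A minimal sketch of state conditioning: broadcast a crystal-level state vector \(u\) (e.g. temperature, pressure) into the node update and concatenate it into the readout. This mirrors the MEGNet idea but is not the published implementation.

```python
import torch
import torch.nn as nn

class StateConditionedUpdate(nn.Module):
    """Node update that also sees a crystal-level state vector u_global."""
    def __init__(self, node_dim=64, state_dim=2):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(node_dim + state_dim, node_dim), nn.SiLU())
    def forward(self, v, u_global):                  # v: (N, node_dim); u_global: (state_dim,)
        u = u_global.expand(v.shape[0], -1)          # broadcast the state to every atom
        return v + self.mlp(torch.cat([v, u], dim=-1))

def readout_with_state(v, u_global):
    """Crystal-level readout that concatenates pooled node features with the state."""
    return torch.cat([v.mean(dim=0), u_global], dim=-1)
```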
MEGNet on Materials Project (chen2019megnet?). Multi-property heads (formation energy, band gap, bulk modulus, shear modulus) from a single trunk; competitive or better MAE than CGCNN on each.
The ALIGNN trick (choudhary2021alignn?)
Increasing geometric resolution
| Architecture | Distances | Global state | Angles |
|---|---|---|---|
| SchNet | yes | — | — |
| CGCNN | yes | — | — |
| MEGNet | yes | yes | — |
| ALIGNN | yes | — | yes |
| M3GNet | yes | — | yes (3-body) |
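The extra channel ALIGNN and M3GNet add is the bond angle. A numpy sketch of the raw quantity, enumerating \((j, i, k)\) triplets around each centre \(i\) (non-periodic, no line-graph machinery):

```python
import numpy as np
from itertools import combinations

def bond_angles(positions, neighbors):
    """Angles between every pair of bonds (i->j, i->k) around each centre atom i.

    positions: (N, 3) array; neighbors: dict {i: [j, k, ...]} from the graph step.
    Returns a list of (i, j, k, angle_in_degrees). Illustrative, non-periodic version.
    """
    out = []
    for i, nbrs in neighbors.items():
        for j, k in combinations(nbrs, 2):
            a = positions[j] - positions[i]
            b = positions[k] - positions[i]
            cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
            out.append((i, j, k, np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))))
    return out
```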
Which channel matters for which target
M3GNet (chen2022m3gnet?)
Architectural extensions over MEGNet
The three coupled outputs of an MLIP
Why the coupling is non-trivial
The three-term loss. \[\mathcal{L} = \lambda_E \|E - E^{\rm DFT}\|^2 + \lambda_F \sum_i \|\mathbf{F}_i - \mathbf{F}_i^{\rm DFT}\|^2 + \lambda_\sigma \|\sigma - \sigma^{\rm DFT}\|^2\]
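The loss transcribed into PyTorch, given model outputs and DFT labels; the \(\lambda\) weights are illustrative, not taken from any specific paper.

```python
import torch

def mlip_loss(E, F, S, E_dft, F_dft, S_dft, lam_E=1.0, lam_F=10.0, lam_S=0.1):
    """Weighted energy + force + stress loss; the lambda weights here are illustrative."""
    loss_E = (E - E_dft).pow(2).mean()
    loss_F = (F - F_dft).pow(2).sum(dim=-1).mean()   # sum over x, y, z; mean over atoms
    loss_S = (S - S_dft).pow(2).mean()               # 3x3 stress tensor (or a Voigt 6-vector)
    return lam_E * loss_E + lam_F * loss_F + lam_S * loss_S
```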
| Architecture | Best for | Cost |
|---|---|---|
| CGCNN | Scalar property prediction; cheap workhorse baseline. | Low. |
| MEGNet | State-conditioned properties; multi-task heads from a shared trunk. | Low–medium. |
| ALIGNN | Properties depending on bond angles (band gap, elastic constants). | Medium. |
| M3GNet | Universal MLIP across the periodic table; energy + forces + stress. | High (deployment). |
| NequIP / MACE / Allegro (§F) | Small-data MLIP; force-accurate models from \(\sim 10^4\) structures. | High (per step). |
Shared scaffolding
Shared limitation
Where the leaderboard sits in 2022–2023
The ceiling shifted from model to data
The autograd-from-invariants pathway
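What the autograd pathway looks like in code: predict an invariant energy from positions and differentiate. Any scalar-output model can stand in for `energy_model`; the toy energy below is purely illustrative.

```python
import torch

def energy_and_forces(energy_model, positions):
    """Forces as the negative gradient of a predicted invariant energy."""
    positions = positions.clone().requires_grad_(True)
    E = energy_model(positions)                                   # invariant scalar
    # create_graph=True lets force errors themselves be backpropagated during training
    F = -torch.autograd.grad(E, positions, create_graph=True)[0]  # (N, 3), equivariant
    return E, F

# toy usage: any scalar-valued model of the positions works
toy_energy = lambda p: (p[:, None, :] - p[None, :, :]).pow(2).sum()
E, F = energy_and_forces(toy_energy, torch.randn(5, 3))
```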
The equivariant pathway
The headline. For small-data MLIP regimes (~10⁴ structures, force labels), equivariant networks are the state of the art.
The irrep stack
Tensor products mix irreps
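A small sketch with the e3nn library (assuming it is installed), showing the two ingredients named above: features labelled by irreps, which transform with a block-diagonal Wigner-D matrix, and a tensor product that mixes them. Irrep choices are illustrative.

```python
import torch
from e3nn import o3

# Features labelled by irreps: 8 scalars (l=0, even) and 4 vectors (l=1, odd) per atom.
irreps = o3.Irreps("8x0e + 4x1o")
x = irreps.randn(10, -1)                     # 10 atoms, features laid out irrep by irrep

# Under a rotation R the feature vector transforms with a block-diagonal Wigner-D matrix:
# identity blocks for the scalars, 3x3 rotation blocks for the vectors.
R = o3.rand_matrix()
x_rot = x @ irreps.D_from_matrix(R).T

# A tensor product mixes irreps: vector (x) vector -> scalar + vector + rank-2 parts.
tp = o3.FullyConnectedTensorProduct("4x1o", "4x1o", "8x0e + 4x1e + 2x2e")
out = tp(x[:, 8:], x[:, 8:])                 # feed the 1o block (the 12 dims after the 8 scalars)
```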
NequIP (batzner2022nequip?)
The data-efficiency claim
Reference implementation: mir-group/nequip.

Allegro (musaelian2023allegro?)
MACE (batatia2022mace?)
When to go equivariant
When invariance + autograd is enough
Rule of thumb. Crossover around \(\sim 10^5\) structures with strong force labels. Below: NequIP / MACE. Above: M3GNet / OMat24-style invariant + scale.
This section is a preview.
What attention buys for crystals
Matformer (yan2022matformer?)
Graphormer (ying2021graphormer?)
The architectural shift. Local message passing \(\to\) global attention. Receptive field = entire cell, in one layer. Compute cost scales as \(O(N^2)\) in atoms — manageable for typical unit cells, expensive for large supercells.
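The \(O(N^2)\) statement in code: generic scaled dot-product attention in which every atom attends to every other atom in the cell. This is the mechanism only, not Matformer's or Graphormer's exact formulation.

```python
import torch

def global_attention(x):                        # x: (N, d) per-atom features
    d = x.shape[-1]
    scores = (x @ x.T) / d ** 0.5               # (N, N): every atom attends to every other atom
    weights = torch.softmax(scores, dim=-1)     # the O(N^2) object that limits large supercells
    return weights @ x                          # receptive field = the entire cell, in one layer

out = global_attention(torch.randn(40, 64))     # 40 atoms in the cell, 64 feature channels
```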
Foundation models reach materials (2023–2024)
Multi-modal materials models
The standard recipe (NLP-style)
Why this works for materials
The 2024–2026 default. New project? Start with a foundation-model checkpoint, fine-tune for a few hours. Train from scratch only if no relevant checkpoint exists.
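The recipe as a runnable toy sketch: freeze a pretrained trunk and train a small head on the new labels. The trunk here is a stand-in MLP on fixed-length inputs; in practice it would be a published foundation-model checkpoint taking graphs.

```python
import torch
import torch.nn as nn

# Stand-in for a pretrained foundation-model trunk; in practice this is loaded from a
# published checkpoint and consumes graphs rather than fixed-length vectors.
trunk = nn.Sequential(nn.Linear(128, 256), nn.SiLU(), nn.Linear(256, 64))
for p in trunk.parameters():
    p.requires_grad = False                     # freeze the pretrained encoder

head = nn.Sequential(nn.Linear(64, 32), nn.SiLU(), nn.Linear(32, 1))   # new task head
opt = torch.optim.AdamW(head.parameters(), lr=1e-3)

X, y = torch.randn(256, 128), torch.randn(256)  # toy fine-tuning set: descriptors -> property

for epoch in range(5):
    preds = head(trunk(X)).squeeze(-1)
    loss = nn.functional.mse_loss(preds, y)
    opt.zero_grad(); loss.backward(); opt.step()
```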
Where we are in 2026 (as of this lecture)
What is still missing
By data regime
By target type
Small-data (\(\sim 10^3\)–\(10^4\))
Large-data (\(\sim 10^5\)–\(10^8\))
The crossover. Around \(\sim 10^5\) structures with strong force labels. Below: equivariance + handcrafted features. Above: pretrained foundation models + scale.
The MG U8 contract
Applied to MG U9 architectures
Where we are
Unit 10 picks up here
The single sentence to leave with. Unit 9 produces the embedding; Unit 10 studies it.
Reading for next week
Recommended primary papers
Exercise (90 min, this afternoon)
Next week (Unit 10): representation learning — what to do with foundation-model embeddings. The supervised-architecture toolkit you learned today becomes the encoder for everything that follows.
Week 10: Regression on Nanoindentation data — baseline NN models

© Philipp Pelz - Materials Genomics