Materials Genomics
Unit 7: Graph-Based Crystal Representations

Prof. Dr. Philipp Pelz

FAU Erlangen-Nürnberg


§1 · Where We Stand

01. Unit goal

  • Move from fixed descriptors (U6 last week — Magpie / RDF / ACSF / SOAP) to learned representations
  • Treat a crystal as a graph: atoms = nodes, bonds = edges, periodicity = edge wrap-around
  • Build the math toolkit (PBC, RBF, invariance / equivariance, message passing) that U9, U10 will rely on

02. Learning objectives

By the end of this unit, students can:

  • Define a periodic crystal graph \(G=(V,E)\) with lattice-vector-aware edges
  • Implement PBC via the minimum-image convention and explain why naive cutoffs fail
  • Expand interatomic distances in a Gaussian RBF basis and reason about smooth cutoffs
  • Distinguish invariance (energy) from equivariance (forces, dipoles)
  • Write the generic message-passing update and instantiate it for CGCNN, MEGNet, SchNet
  • Choose readout consistent with extensive vs intensive targets and diagnose over-smoothing

03. Recap: prerequisite map

  • U2: Schrödinger / DFT — generates the labels (energies, forces)
  • U3: Crystal lattices, space groups, primitive cells
  • U6 (last week): Hand-crafted atom-centred descriptors (Magpie, RDF, ACSF, SOAP, ACE) — the classical analogue of today’s learned graphs
  • MFML W4-5: Feedforward NNs and backprop

Today’s step

Hand-crafted features \(\Rightarrow\) learned features by neighbour-aware message passing.

04. Why this matters: failure modes of fixed descriptors

  • Descriptors require a fixed-size vector \(\Rightarrow\) varying \(N\), varying stoichiometry break the input shape
  • Choice of cutoff, RBF grid, and ordering is manual and brittle
  • Permutation invariance enforced by hand (sort, average) \(\to\) information loss
  • Graphs absorb all three: variable size, permutation invariance, and learnable edge features
  • Bonus: graphs are data-efficient — weight sharing across atoms means few parameters per node

05. Reading map

  • Core theory: Sandfeld Ch. 2.2 (structure encoding)
  • Architectures: Neuer Ch. 4.5.1–4.5.4 (GNNs for engineering)
  • Deep learning reference: Murphy Ch. 35 (graph neural networks)
  • Equivariance: Bronstein et al., Geometric Deep Learning (2021)

§2 · Crystals as Graphs

06. Crystals as periodic graphs

Crystal structure \(\to\) graph \(G = (V, E)\) over the primitive unit cell plus periodic images.

  • Nodes \(V\): atoms in the primitive cell, \(|V| = N\)
  • Edges \(E\): pairs \((i, j, \vec{n})\) with \(i, j \in V\), image \(\vec{n}\in\mathbb{Z}^3\), within cutoff \(r_c\)
  • Node features \(h_i\): atomic number \(Z_i\), valence, electronegativity, group, period, oxidation state
  • Edge features \(e_{ij}\): displacement \(\vec{r}_{ij}\), distance \(d_{ij}\), optionally bond type / order

With lattice matrix \(\mathbf{T}=[\vec{a}_1,\vec{a}_2,\vec{a}_3]\in\mathbb{R}^{3\times 3}\), the lattice-aware displacement to image \(\vec{n}\) is

\[\vec{r}_{ij}(\vec{n}) \;=\; \vec{r}_j + \mathbf{T}\,\vec{n} - \vec{r}_i, \qquad d_{ij}(\vec{n}) = \|\vec{r}_{ij}(\vec{n})\|\]

Edge created iff \(d_{ij}(\vec{n}) \le r_c\) — a single ordered pair \((i,j)\) may yield several edges (one per image inside the cutoff sphere).

07. Periodic boundary conditions (PBC)

A finite-cell graph must reproduce the topology of the infinite lattice.

Minimum-image convention: replace the raw displacement by its closest periodic image,

\[\vec{r}_{ij} \;\longleftarrow\; \vec{r}_{ij} \;-\; \mathbf{T}\,\mathrm{round}\!\big(\mathbf{T}^{-1}\,\vec{r}_{ij}\big)\]

where \(\mathrm{round}\) acts component-wise on the fractional displacement.

Why a naive cutoff fails on small cells:

  • NaCl primitive cell: two atoms (Na at origin, Cl at \(\tfrac{a}{2}(\hat{x}+\hat{y}+\hat{z})\)), with conventional lattice constant \(a \approx 5.64\) Å
  • The six nearest Cl neighbours of Na sit at \(a/2 \approx 2.82\) Å, and all of them are periodic images; the in-cell Cl is \(a\sqrt{3}/2 \approx 4.88\) Å away, so a cutoff \(r_c = 4\) Å without image search finds no Na–Cl edge at all
  • Solution: tile the cell into a \(3\times 3\times 3\) supercell during neighbour search, or use the minimum-image rule plus image-index bookkeeping

Forgetting PBC \(\Rightarrow\) disconnected graph, missing nearest neighbours, wrong coordination — the most common silent bug in custom pipelines.
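
A minimal numpy check of the minimum-image rule, assuming a hypothetical 5 Å cubic cell with two atoms near opposite faces:

import numpy as np

T = 5.0 * np.eye(3)                      # hypothetical 5 Å cubic cell, rows = lattice vectors
r_ij = np.array([4.5, 0.0, 0.0])         # raw displacement: looks like a 4.5 Å separation
frac = r_ij @ np.linalg.inv(T)           # fractional displacement (0.9, 0, 0)
r_mic = r_ij - np.round(frac) @ T        # wrap to the closest image: (-0.5, 0, 0)

print(np.linalg.norm(r_ij), np.linalg.norm(r_mic))   # 4.5  0.5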

08. Graph construction workflow

CIF / POSCAR
    ↓
parse lattice  T  and fractional coords  {f_i}
    ↓
Cartesian coords  r_i = T · f_i
    ↓
neighbour search:  for each i, j, image n ∈ Z^3,
                   compute d_ij(n);  keep iff d_ij(n) ≤ r_c
    ↓
edge tensor    [src, dst, image, d_ij, r_ij]
node tensor    [Z_i, group, period, ...]
    ↓
batched DGL / PyG graph  →  GNN

  • Backends: pymatgen.analysis.graphs.StructureGraph, ase.neighborlist, jarvis.core.graphs, torch_geometric.transforms.RadiusGraph

09. Neighbour-cutoff strategies

Fixed cutoff \(r_c\)

  • All \(j\) with \(d_{ij} \le r_c\)
  • Pros: physically meaningful (bonding shell)
  • Cons: variable degree; sensitive to thermal expansion / pressure

\(k\)-nearest neighbours

  • Keep the \(k\) closest atoms regardless of distance
  • Pros: constant degree, easy batching
  • Cons: in sparse regions, “neighbours” can be unphysically far

Hybrid: \(k\)-NN with maximum cutoff (max_neighbors=12, r_c=8 Å) — the practical default in CGCNN, MEGNet, ALIGNN.

10. Pseudocode: lattice-aware neighbour list

import numpy as np
from itertools import product

def min_lattice_spacing(T):
    """Smallest perpendicular width of the cell (rows of T = lattice vectors)."""
    vol = abs(np.linalg.det(T))
    faces = [np.cross(T[(i + 1) % 3], T[(i + 2) % 3]) for i in range(3)]
    return min(vol / np.linalg.norm(f) for f in faces)

def build_edges(frac_coords, T, r_c):
    """Lattice-aware neighbour list; frac_coords is (N, 3), T rows are a_1..a_3."""
    N = len(frac_coords)
    cart = frac_coords @ T                # (N, 3) Cartesian positions
    # search every shell of periodic images that can intersect the r_c-sphere
    n_max = int(np.ceil(r_c / min_lattice_spacing(T)))
    edges = []
    for n in product(range(-n_max, n_max + 1), repeat=3):
        shift = np.array(n) @ T           # (3,) image translation T·n
        for i in range(N):
            for j in range(N):
                r_ij = cart[j] + shift - cart[i]
                d = np.linalg.norm(r_ij)
                if 1e-8 < d <= r_c:       # lower bound excludes the self-image
                    edges.append((i, j, n, d, r_ij))
    return edges
  • Cost \(\mathcal{O}(N^2 \cdot |\text{images}|)\); replace inner loops by KD-tree on the supercell for large \(N\).
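
As a usage example of build_edges above, a sanity check on the NaCl primitive cell from slide 07 (conventional \(a \approx 5.64\) Å assumed); it should recover the six-fold rocksalt coordination:

a = 5.64                                 # NaCl conventional lattice constant (Å)
T = (a / 2) * np.array([[0., 1., 1.],    # primitive FCC lattice vectors (rows)
                        [1., 0., 1.],
                        [1., 1., 0.]])
frac = np.array([[0.0, 0.0, 0.0],        # Na
                 [0.5, 0.5, 0.5]])       # Cl

edges = build_edges(frac, T, r_c=3.0)
print(sum(1 for (i, j, n, d, r) in edges if i == 0))    # 6 neighbours, all at 2.82 Å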

§3 · Encoding Geometry

11. Distance encoding: Gaussian RBF

Raw \(d_{ij}\) is a poor input to an MLP: it is a single scalar on which the energy depends sharply and non-linearly.

Expand into a Gaussian RBF basis of \(K\) centres \(\{\mu_k\}\):

\[e_{ij,k} \;=\; \exp\!\Big(-\,\frac{(d_{ij} - \mu_k)^2}{2\sigma^2}\Big), \qquad k = 1,\ldots,K\]

Typical grid (CGCNN / SchNet):

  • \(\mu_k\) from \(0\) to \(5\) Å in \(\Delta\mu = 0.1\) Å steps \(\Rightarrow K = 50\)
  • Width \(\sigma = 0.2\) Å (matches \(\Delta\mu\))
  • Yields a smooth, differentiable distance fingerprint for each edge

Why a smooth basis:

  • Gradient \(\partial e_{ij,k}/\partial \vec{r}_j\) is well-defined \(\Rightarrow\) forces \(\vec{F}_i = -\nabla_i E\) via autograd
  • Soft cutoff envelope \(f_c(d_{ij}) = \tfrac{1}{2}[\cos(\pi d_{ij}/r_c) + 1]\) avoids energy jumps as atoms cross \(r_c\)
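
A minimal sketch of the expansion and envelope, using the grid values from above (numpy assumed):

import numpy as np

mu = np.arange(0.0, 5.0, 0.1)            # K = 50 Gaussian centres, 0 … 4.9 Å
sigma = 0.2                              # width matching the grid spacing

def rbf(d):
    """Gaussian RBF fingerprint of a single distance d (Å)."""
    return np.exp(-(d - mu) ** 2 / (2 * sigma ** 2))

def f_cut(d, r_c=5.0):
    """Smooth cosine envelope: 1 at d = 0, 0 at d = r_c."""
    return 0.5 * (np.cos(np.pi * d / r_c) + 1.0) * (d <= r_c)

e_ij = rbf(2.82) * f_cut(2.82)           # smooth, differentiable 50-dim edge feature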

12. Edge engineering beyond pair distances

  • Bond order / chemical bond type (single, double, ionic) from valence rules
  • Triplet / angular features \(\theta_{ijk}\): needed for force fields, central in ALIGNN, GemNet
  • Dihedrals \(\phi_{ijkl}\): relevant for organic / molecular crystals
  • Coordination number \(Z_i^{\rm coord} = |\mathcal{N}(i)|\) as a node feature
  • Composition-derived priors: electronegativity, ionization energy, atomic radius — initialise \(h_i^{(0)}\) instead of one-hotting \(Z\)

Higher-order geometry \(\to\) stricter inductive bias, but more expensive graphs and longer training.

§4 · Symmetries: Invariance and Equivariance

13. Symmetries that physics demands

A property predictor on a crystal must respect three groups acting on positions:

  • Translation \(T_{\vec{t}}\): \(\{\vec{r}_i\} \to \{\vec{r}_i + \vec{t}\}\)
  • Rotation / reflection \(R \in O(3)\): \(\vec{r}_i \to R\,\vec{r}_i\)
  • Permutation \(\pi \in S_N\): relabelling of atoms

The right requirement depends on the target tensor type.

14. Invariance vs equivariance

Invariance (scalars):

\[f(\{R\vec{r}_i + \vec{t}\}) \;=\; f(\{\vec{r}_i\})\]

  • Energy \(E\), bandgap \(E_g\), formation energy \(\Delta H_f\)
  • Achieved by using only distances and angles as edge features

Equivariance (vectors / tensors):

\[\vec{F}\!\left(\{R\vec{r}_i + \vec{t}\}\right) \;=\; R\,\vec{F}(\{\vec{r}_i\})\]

  • Forces \(\vec{F}_i\), dipole \(\vec{\mu}\), polarisation, elastic tensor \(C_{ijkl}\)
  • Requires vector-valued node features that transform with \(R\)

Permutation invariance: \(f(\pi\cdot V) = f(V)\) for any \(\pi \in S_N\) — handled automatically by sum/mean readouts over nodes.

15. Why equivariance is non-negotiable for forces

Forces are gradients of an invariant scalar:

\[\vec{F}_i = -\nabla_{\vec{r}_i} E\]

If \(E\) is rotation-invariant and differentiable, then automatically

\[\vec{F}_i(R\,\{\vec{r}\}) \;=\; -\nabla_{R\vec{r}_i} E(R\,\{\vec{r}\}) \;=\; R\,\big(-\nabla_{\vec{r}_i} E(\{\vec{r}\})\big) \;=\; R\,\vec{F}_i(\{\vec{r}\})\]

  • Distance-only GNNs (CGCNN, SchNet, MEGNet) predict invariant scalars and obtain forces by autograd \(\Rightarrow\) correct equivariance, but limited capacity
  • Tensor-valued GNNs (NequIP, MACE, Allegro) carry \(O(3)\)-equivariant features at each node — they can predict vectors and higher-rank tensors directly
  • Without equivariance, learnt forces are not closed under rotation \(\Rightarrow\) MD trajectories drift
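
The identity can be checked numerically with a toy invariant energy and autograd (PyTorch assumed; the pair energy is a made-up Gaussian, not a real potential):

import torch

def energy(pos):
    """Toy rotation-invariant energy: Gaussians of all pairwise distances."""
    i, j = torch.triu_indices(len(pos), len(pos), offset=1)
    d = (pos[i] - pos[j]).norm(dim=-1)
    return torch.exp(-d ** 2).sum()

pos = torch.randn(4, 3, requires_grad=True)
F = -torch.autograd.grad(energy(pos), pos)[0]          # forces in the original frame

R, _ = torch.linalg.qr(torch.randn(3, 3))              # random orthogonal matrix
pos_R = (pos.detach() @ R.T).requires_grad_()
F_R = -torch.autograd.grad(energy(pos_R), pos_R)[0]    # forces in the rotated frame

print(torch.allclose(F_R, F @ R.T, atol=1e-5))         # True: F(R r) = R F(r)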

§5 · Message Passing

16. The message-passing template

A GNN layer updates each node from its neighbourhood. Generic form:

\[\boxed{\;h_i^{(\ell+1)} \;=\; U^{(\ell)}\!\Big(h_i^{(\ell)},\;\;\mathrm{AGG}_{j\in\mathcal{N}(i)}\, M^{(\ell)}\!\big(h_i^{(\ell)},\,h_j^{(\ell)},\,e_{ij}\big)\Big)\;}\]

  • \(M^{(\ell)}\): message function (MLP on neighbour and edge features)
  • \(\mathrm{AGG}\): permutation-invariant aggregator (sum, mean, max, attention)
  • \(U^{(\ell)}\): update function (gated MLP, GRU, residual block)

After \(L\) layers, node \(i\) has aggregated information from atoms within graph distance \(L\) — its receptive field.

17. Concrete instantiation: GCN-style update

Take the simplest crystal-graph variant:

\[h_i^{(\ell+1)} \;=\; \sigma\!\Big(W^{(\ell)}\,\big(h_i^{(\ell)} + \tfrac{1}{|\mathcal{N}(i)|}\sum_{j\in\mathcal{N}(i)} \alpha_{ij}\,h_j^{(\ell)}\big) + b^{(\ell)}\Big)\]

  • \(\alpha_{ij}\): edge weight, e.g. \(\alpha_{ij} = \mathrm{MLP}(e_{ij})\) or \(\alpha_{ij} = 1/\sqrt{d_id_j}\)
  • \(\sigma\): ReLU / SiLU
  • Sum aggregation \(\Rightarrow\) permutation invariance for free

18. Walk-through: 3-atom toy graph

A water-like fragment \(V = \{O, H_1, H_2\}\), edges \(\{(O,H_1),(O,H_2),(H_1,H_2)\}\), scalar features \(h_i^{(0)} = Z_i\).

Layer 1 (sum aggregation, \(\alpha_{ij}=1\), \(W=1\), \(\sigma=\mathrm{id}\)):

\[h_O^{(1)} = h_O^{(0)} + h_{H_1}^{(0)} + h_{H_2}^{(0)} = 8 + 1 + 1 = 10\]

\[h_{H_1}^{(1)} = 1 + 8 + 1 = 10, \qquad h_{H_2}^{(1)} = 1 + 8 + 1 = 10\]

Layer 2: now every node sees the second-shell sum,

\[h_O^{(2)} = 10 + 10 + 10 = 30\]

— the receptive field grows by one hop per layer. With realistic \(W\) and \(\sigma\), the same mechanism builds chemically meaningful local descriptors.
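
The walk-through maps to a few lines of numpy (identity weights as above):

import numpy as np

Z = np.array([8.0, 1.0, 1.0])            # h^(0) for O, H1, H2
A = np.array([[0., 1., 1.],              # fully connected triangle
              [1., 0., 1.],
              [1., 1., 0.]])

def layer(h):
    # sum aggregation with alpha_ij = 1, W = 1, sigma = identity
    return h + A @ h

h1 = layer(Z)                            # [10. 10. 10.]
h2 = layer(h1)                           # [30. 30. 30.]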

§6 · Crystal GNN Architectures

19. CGCNN — crystal graph convolutional NN

Xie and Grossman (2018) introduce the first widely-used crystal-graph model.

  • Nodes initialised with one-hot atomic-property vectors
  • Edge features: Gaussian-expanded distances \(e_{ij}\)
  • Gated convolution combining neighbour and edge features:

\[h_i^{(\ell+1)} \;=\; h_i^{(\ell)} + \sum_{j\in\mathcal{N}(i)} \sigma\!\big(W_z\,z_{ij}^{(\ell)} + b_z\big) \;\odot\; g\!\big(W_s\,z_{ij}^{(\ell)} + b_s\big)\]

with \(z_{ij}^{(\ell)} = h_i^{(\ell)} \Vert h_j^{(\ell)} \Vert e_{ij}\), sigmoid gate \(\sigma\) and softplus content \(g\).

  • Use case: formation energy, bandgap, bulk modulus on the Materials Project
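
A PyTorch sketch of the gated update above (feature dimensions and class name hypothetical; edges given as src/dst index arrays):

import torch
import torch.nn as nn
import torch.nn.functional as F

class CGCNNConv(nn.Module):
    """Gated crystal-graph convolution in the form of the equation above (sketch)."""
    def __init__(self, node_dim=64, edge_dim=50):
        super().__init__()
        z_dim = 2 * node_dim + edge_dim            # z_ij = h_i || h_j || e_ij
        self.W_z = nn.Linear(z_dim, node_dim)      # sigmoid gate branch
        self.W_s = nn.Linear(z_dim, node_dim)      # softplus content branch

    def forward(self, h, edge_index, e):
        src, dst = edge_index                      # message flows j (src) -> i (dst)
        z = torch.cat([h[dst], h[src], e], dim=-1)
        msg = torch.sigmoid(self.W_z(z)) * F.softplus(self.W_s(z))
        agg = torch.zeros_like(h).index_add_(0, dst, msg)   # sum over neighbours
        return h + agg                             # residual update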

20. MEGNet — global state vector

Chen et al. (2019) add a global node \(u\) that interacts with every atom and edge.

  • Edges, nodes, and the global vector are updated each layer:

\[u^{(\ell+1)} \;=\; \phi_u\!\Big(u^{(\ell)},\;\tfrac{1}{N}\!\sum_i h_i^{(\ell+1)},\;\tfrac{1}{|E|}\!\sum_{ij} e_{ij}^{(\ell+1)}\Big)\]

  • \(u\) encodes state variables: temperature, pressure, applied field
  • Same architecture handles molecules and crystals with a single global flag
  • Use case: state-aware property prediction, multi-task learning

21. SchNet — continuous-filter convolutions

Schütt et al. (2018) replace discrete edge weights with a continuous filter \(W: \mathbb{R}^+ \to \mathbb{R}^F\) evaluated at \(d_{ij}\).

\[h_i^{(\ell+1)} \;=\; h_i^{(\ell)} + \sum_{j\in\mathcal{N}(i)} h_j^{(\ell)} \,\odot\, W^{(\ell)}\!\big(d_{ij}\big)\]

  • \(W^{(\ell)}\) is itself an MLP applied to the RBF expansion of \(d_{ij}\)
  • Smooth in \(d_{ij}\) \(\Rightarrow\) analytically-differentiable energy \(\Rightarrow\) forces by autograd
  • Use case: potential energy surfaces, MD with ML interatomic potentials
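
A matching sketch of the continuous-filter convolution (names hypothetical; rbf is the Gaussian expansion of \(d_{ij}\) from §3):

import torch
import torch.nn as nn

class CFConv(nn.Module):
    """SchNet-style continuous-filter convolution (sketch)."""
    def __init__(self, node_dim=64, n_rbf=50):
        super().__init__()
        self.filter_net = nn.Sequential(           # W(d): filter generated from RBF(d_ij)
            nn.Linear(n_rbf, node_dim), nn.SiLU(),
            nn.Linear(node_dim, node_dim))

    def forward(self, h, edge_index, rbf):
        src, dst = edge_index
        W = self.filter_net(rbf)                   # (|E|, F) continuous filters
        msg = h[src] * W                           # element-wise modulation
        agg = torch.zeros_like(h).index_add_(0, dst, msg)
        return h + agg                             # residual update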

22. Equivariant GNNs in one slide

Distance-only models cannot represent vectorial features in their hidden layers. Equivariant GNNs carry \(O(3)\)-equivariant tensors:

  • Tensor Field Networks / NequIP: hidden features are spherical-tensor-valued; messages use Clebsch–Gordan tensor products
  • MACE: many-body equivariant messages via Atomic Cluster Expansion
  • Allegro: strictly-local equivariant messages, no global pooling
  • Cost: more expensive per layer, but far more data-efficient for forces, polarisation, elastic tensors

Rule of thumb: predicting scalars only \(\to\) CGCNN / SchNet is enough; predicting vectors / tensors \(\to\) go equivariant.

§7 · Readout, Depth, Practicalities

23. Readout / pooling

After \(L\) message-passing layers, aggregate node features into a graph-level vector \(h_G\):

\[\text{sum:}\quad h_G = \sum_{i\in V} h_i^{(L)} \qquad \text{mean:}\quad h_G = \tfrac{1}{N}\sum_{i\in V} h_i^{(L)}\]

\[\text{attention:}\quad h_G = \sum_{i\in V} \alpha_i\,h_i^{(L)}, \qquad \alpha_i = \frac{\exp(w^\top h_i^{(L)})}{\sum_j \exp(w^\top h_j^{(L)})}\]

  • Extensive targets (total energy, formation energy per cell): sum preserves \(E \propto N\)
  • Intensive targets (bandgap, bulk modulus, density): mean is correct; sum would scale wrongly with cell size
  • Attention when contributions are sparse (defects, active sites)

The mismatched-readout pitfall: training “energy / atom” with sum readout silently learns to predict cell size.
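
A sketch of sum and mean pooling on a disjoint batch, where the batch vector maps each node to its graph (torch assumed):

import torch

def readout(h, batch, mode="sum"):
    """Pool node features h (n_nodes, F) into per-graph vectors h_G."""
    n_graphs = int(batch.max()) + 1
    h_G = torch.zeros(n_graphs, h.shape[-1]).index_add_(0, batch, h)   # sum pooling
    if mode == "mean":                             # intensive targets
        counts = torch.bincount(batch, minlength=n_graphs).clamp(min=1)
        h_G = h_G / counts.unsqueeze(-1)
    return h_G                                     # mode == "sum": extensive targets

h = torch.randn(7, 16)                             # 7 atoms across two graphs
batch = torch.tensor([0, 0, 1, 1, 1, 1, 1])        # graph index per atom
print(readout(h, batch, "mean").shape)             # torch.Size([2, 16])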

24. Graph depth and over-smoothing

Each layer mixes neighbour features \(\Rightarrow\) after many layers, all node embeddings converge.

Empirically, \(L \ge 4\) on dense crystal graphs collapses node features:

\[\lim_{\ell \to \infty} \big\|h_i^{(\ell)} - h_j^{(\ell)}\big\| = 0 \quad \forall\, i, j \in V\]

— the GNN forgets which atom is which.

Mitigations:

  • Residual / skip connections: \(h_i^{(\ell+1)} = h_i^{(\ell)} + \mathrm{MP}(h_i^{(\ell)})\)
  • Jumping knowledge (Xu et al. 2018): readout concatenates \(h_i^{(1)}, \ldots, h_i^{(L)}\)
  • DenseNet-style dense skip: every layer reads every previous layer
  • Practical recipe: 3–4 message-passing layers + residual + Gaussian RBF edges — strong default for materials

25. Variable-size cells and batching

  • Crystal graphs vary in \(N\) and \(|E|\) — cannot stack into a single tensor
  • Disjoint-batch trick: concatenate all graphs into one large graph with block-diagonal adjacency; pooling uses a batch_index vector
  • Implemented by torch_geometric.data.Batch and dgl.batch
  • Weight matrices are shared across all atoms \(\Rightarrow\) same model handles primitive cells and supercells
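
A minimal PyG example of the disjoint-batch trick (random features, hypothetical edges):

import torch
from torch_geometric.data import Data, Batch

g1 = Data(x=torch.randn(2, 16), edge_index=torch.tensor([[0, 1], [1, 0]]))
g2 = Data(x=torch.randn(5, 16), edge_index=torch.tensor([[0, 1, 2], [3, 4, 0]]))

batch = Batch.from_data_list([g1, g2])   # one big graph, block-diagonal adjacency
print(batch.x.shape)                     # torch.Size([7, 16])
print(batch.batch)                       # tensor([0, 0, 1, 1, 1, 1, 1])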

26. Computational scaling

  • Neighbour search: \(\mathcal{O}(N \log N)\) with KD-tree, \(\mathcal{O}(N^2)\) naive
  • One message-passing layer: \(\mathcal{O}(|E|)\)
  • For 3D crystals at fixed \(r_c\), \(|E| \approx \bar{k}\,N\) with average degree \(\bar{k}\)
  • Total cost \(\mathcal{O}(L \cdot \bar{k} \cdot N)\): linear in cell size, vastly faster than DFT's \(\mathcal{O}(N^3)\)

Multi-modal extensions (atom + microstructure + literature graphs) exist but are niche — covered briefly in U14.

§8 · Failure Modes and Diagnostics

27. Failure mode: shortcut learning

  • If unit-cell volume strongly correlates with energy in the dataset, a GNN can learn to predict \(E\) from \(N\) alone, ignoring chemistry
  • Diagnostic: randomise atomic numbers while keeping geometry; if \(R^2\) stays high, the model is cheating
  • Diagnostic: fix chemistry, perturb cell volume; honest models follow \(E(V)\) curves

Always pair a graph baseline with a composition-only MLP to expose this.

28. Failure mode: cutoff and reproducibility artefacts

  • A hard cutoff \(r_c\) creates discontinuities: as \(d_{ij}\) crosses \(r_c\), edge appears/disappears \(\Rightarrow\) jumps in \(E\), divergent forces
  • Smooth envelope \(f_c(d_{ij}) = \tfrac{1}{2}[\cos(\pi d_{ij}/r_c) + 1]\) for \(d_{ij} \le r_c\), \(0\) otherwise — multiply edge messages by \(f_c\)
  • Small changes in \(r_c\) (e.g. \(4.0 \to 4.5\) Å) can flip graph topology — always log \(r_c\), RBF grid, neighbour-search algorithm

Reproducibility checklist: (i) cutoff, (ii) max neighbours, (iii) RBF \(\{\mu_k, \sigma\}\), (iv) PBC convention, (v) random seed.

29. Transferability and OOD behaviour

Transfer across chemistries

  • Oxides \(\to\) nitrides: feasible if elemental-feature initialisation is rich
  • Pure \(Z\)-embedding \(\to\) poor; physical features (electronegativity, radius) \(\to\) better

Out-of-distribution structures

  • Perovskite-trained \(\to\) spinel: GNNs degrade gracefully but predictions are biased
  • Use ensemble disagreement as a novelty flag
  • Detailed UQ machinery: U13

30. Baseline comparison protocol

When reporting a new GNN result, always compare against:

  1. Mean-target baseline (the trivial predictor)
  2. Composition-only MLP (no geometry) — exposes shortcut learning
  3. Classical descriptor baseline (PRDF / SOAP + ridge or RF)
  4. Strong public GNN (CGCNN or MEGNet trained on the same split)

Any new architecture must beat all four on both MAE and Spearman rank correlation.

31. Evaluation metrics for screening

  • MAE / RMSE: regression accuracy
  • Spearman \(\rho\) / Kendall \(\tau\): rank correlation — what matters for screening
  • Top-\(k\) recall: of the true top-\(k\) materials, how many are in the model’s top-\(k\)?
  • For discovery, ranking and recall outweigh absolute accuracy
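
The screening metrics in a few lines (scipy assumed; random arrays stand in for real predictions):

import numpy as np
from scipy.stats import spearmanr

def topk_recall(y_true, y_pred, k=100):
    """Of the true top-k materials, what fraction lands in the predicted top-k?"""
    top_true = set(np.argsort(y_true)[-k:])
    top_pred = set(np.argsort(y_pred)[-k:])
    return len(top_true & top_pred) / k

y_true = np.random.rand(1000)            # placeholder targets
y_pred = y_true + 0.1 * np.random.randn(1000)
rho, _ = spearmanr(y_true, y_pred)       # rank correlation for screening
print(rho, topk_recall(y_true, y_pred))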

§9 · Case Studies

32. Case studies in one slide

Bandgap

  • Intensive \(\Rightarrow\) mean readout
  • Needs deeper / global context (MEGNet wins)
  • MAE \(\approx 0.3\) eV on MP

Formation energy

  • Extensive (per cell) \(\Rightarrow\) sum readout, then \(/N\)
  • CGCNN / MEGNet MAE \(\approx 0.03\) eV/atom — near DFT precision

Elasticity

  • Sparse data (\(\sim 10^4\) entries)
  • Transfer learning: pre-train on energy, fine-tune on \(C_{ij}\)
  • Tensor target \(\Rightarrow\) favours equivariant GNNs

§10 · Bridges and Limits

33. Bridge back to Unit 6 (local atomic environments)

  • Each MP layer = one iterative refinement of a local atomic environment descriptor
  • Layer \(\ell\) “sees” atoms within graph distance \(\ell\) — analogous to atomic neighbourhood radius
  • This is exactly the picture U6 (last week) built with SOAP / ACSF / ACE, which can now be re-read as fixed-feature MP layers — today we make the feature map learnable

34. Bridge to Units 9 and 10

  • U9 (NN potentials): trains GNN energies + autograd forces on MD trajectories — equivariance becomes essential
  • U10 (representation learning): discards the property head; the GNN encoder produces a latent \(\mathbf{z}\) for self-supervised pre-training, contrastive learning, generative models

35. The limit of GNNs

  • GNNs are interpolators trained on DFT data — they inherit DFT’s biases (XC functional, smearing, \(k\)-grid)
  • They cannot discover physics absent from the training set (e.g. unconventional superconductivity)
  • Long-range Coulomb / dispersion: distance cutoff misses both unless explicit terms are added
  • Symmetry-breaking phases (charge density waves, magnetism) require labels that capture them

§11 · Exercises

36. Exercise overview

  • Task 1: Build a periodic crystal graph from a CIF using pymatgen + torch_geometric. Verify edge count under PBC.
  • Task 2: Train a 3-layer CGCNN on \(\sim 10^4\) ABX\(_3\) perovskites; report MAE, Spearman \(\rho\), top-100 recall for formation energy.
  • Task 3: Ablation — sweep \(r_c \in \{4, 6, 8\}\) Å and depth \(L \in \{2, 3, 4, 5\}\). Plot error vs depth; identify over-smoothing.
  • Task 4: Failure analysis — inspect 5 worst-error samples. Are they rare chemistries, large cells, or shortcut victims?

§12 · Wrap-Up

37. Unit 7 — Key Takeaways

  • A crystal graph \(G=(V,E)\) encodes atoms, bonds, and periodic images via \(\vec{r}_{ij}(\vec{n}) = \vec{r}_j + \mathbf{T}\vec{n} - \vec{r}_i\)
  • PBC with the minimum-image convention is mandatory — naive cutoffs disconnect small cells
  • Gaussian RBF turns distances into a smooth, differentiable basis that supports autograd forces
  • Invariance for scalar targets, equivariance for vector / tensor targets — forces require the latter
  • Generic message passing \(h_i^{(\ell+1)} = U(h_i^{(\ell)}, \mathrm{AGG}_j M(h_i^{(\ell)},h_j^{(\ell)},e_{ij}))\) instantiates CGCNN, MEGNet, SchNet
  • Sum readout for extensive targets, mean for intensive — never mix them
  • Depth \(L = 3\)–\(4\) + skip connections avoids over-smoothing
  • Always benchmark against mean / composition-only / SOAP baselines, and report rank metrics for screening

38. Looking back to Unit 6 and forward to Unit 9

  • U6 (last week) made local atomic environments explicit: ACSF, SOAP, ACE
  • Today’s MP layers are the learned version of those fixed many-body descriptors — same locality, different aggregation
  • Next stop: Unit 9 (NN interatomic potentials) — apply today’s GNN machinery + autograd to predict forces on MD trajectories

39. Example notebook

Week 5: Crystal graphs + CGCNN — ABX\(_3\) perovskites


References

Chen, Chi, Weike Ye, Yunxing Zuo, Chen Zheng, and Shyue Ping Ong. 2019. “Graph Networks as a Universal Machine Learning Framework for Molecules and Crystals.” Chemistry of Materials 31 (9): 3564–72. https://doi.org/10.1021/acs.chemmater.9b01294.
Schütt, K. T., H. E. Sauceda, P.-J. Kindermans, A. Tkatchenko, and K.-R. Müller. 2018. “SchNet – a Deep Learning Architecture for Molecules and Materials.” The Journal of Chemical Physics 148 (24): 241722. https://doi.org/10.1063/1.5019779.
Xie, Tian, and Jeffrey C. Grossman. 2018. “Crystal Graph Convolutional Neural Networks for an Accurate and Interpretable Prediction of Material Properties.” Physical Review Letters 120 (14): 145301. https://doi.org/10.1103/PhysRevLett.120.145301.