FAU Erlangen-Nürnberg
Part I — foundations of generative modelling for crystals
Part II — diffusion-based crystal generators (CDVAE, DiffCSP, MatterGen)
Part III — flow matching and autoregressive models (FlowMM, CrystaLLM)
Part IV — conditioning and constraints
Part V — downstream filtering, MLIP relaxation, DFT screening, GNoME, the active-learning loop
Closing — open challenges, takeaways, link to Unit 13 (uncertainty-aware discovery)
| Year | Model | Family |
|---|---|---|
| 2018 | CrystalGAN | GAN |
| 2020 | FTCP | VAE-like |
| 2022 | CDVAE | Diffusion + VAE |
| 2023 / 2024 | DiffCSP / DiffCSP++ | Diffusion |
| 2023 | GNoME (DeepMind) | GNN screening at scale |
| 2024 | MatterGen (MSR) | Diffusion + conditioning |
| 2024 | CrystaLLM | LLM / autoregressive |
| 2024 | FlowMM | Flow matching |
A crystal is a structured object with multiple types of variables:
Lattice matrix \(\mathbf{L}\in\mathbb{R}^{3\times 3}\) — continuous
Fractional coordinates \(\mathbf{X}\in[0,1)^{N\times 3}\) — continuous and periodic
Atom types \(\mathbf{A}\) — discrete (chemical species)
Number of atoms \(N\) — integer, varies per structure
What makes a generated structure good?
Common composite metric: S.U.N. = Stable (low energy above the convex hull after relaxation), Unique (not a duplicate within the generated set), Novel (not already in the training database).
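The U and N checks reduce to structure matching. A minimal sketch using pymatgen's `StructureMatcher` (the list names `structures` and `train_db` are illustrative; the S check needs a separate energy-above-hull calculation and is omitted):

```python
from pymatgen.analysis.structure_matcher import StructureMatcher

matcher = StructureMatcher()  # default tolerances

def deduplicate(structures):
    """Unique: keep one representative per match-equivalent group."""
    kept = []
    for s in structures:
        if not any(matcher.fit(s, k) for k in kept):
            kept.append(s)
    return kept

def is_novel(s, train_db):
    """Novel: no match against any training-set structure."""
    return not any(matcher.fit(s, ref) for ref in train_db)
```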
Unconditional — sample plausible structures from the learned data distribution.
Conditional — steer samples toward a target composition, symmetry, or property.
Start with data \(x_0\), apply a noising schedule:
\[q(x_t\mid x_{t-1}) = \mathcal{N}\!\left(x_t;\sqrt{1-\beta_t}\,x_{t-1},\beta_t\mathbf{I}\right)\]
After \(T\) steps, \(x_T\approx\mathcal{N}(0,\mathbf{I})\) regardless of \(x_0\).
Learn \(p_\theta(x_{t-1}\mid x_t)\) — the denoising step.
Training objective: predict the noise \(\epsilon\) that was added to obtain \(x_t\):
\[\mathcal{L} = \mathbb{E}_{t,x_0,\epsilon}\,\|\epsilon - \epsilon_\theta(x_t,t)\|^2\]
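A minimal PyTorch sketch of this objective, assuming a generic noise-prediction network `eps_model(x_t, t)` (hypothetical; real crystal models apply separate noising processes to lattice, coordinates, and atom types):

```python
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)          # linear noise schedule
alpha_bar = torch.cumprod(1.0 - betas, dim=0)  # \bar\alpha_t = prod_s (1 - beta_s)

def diffusion_loss(eps_model, x0):
    """L = E_{t, x0, eps} || eps - eps_theta(x_t, t) ||^2."""
    t = torch.randint(0, T, (x0.shape[0],))
    eps = torch.randn_like(x0)
    ab = alpha_bar[t].view(-1, *([1] * (x0.dim() - 1)))
    # Closed form of q(x_t | x_0): sqrt(ab) * x0 + sqrt(1 - ab) * eps
    x_t = ab.sqrt() * x0 + (1.0 - ab).sqrt() * eps
    return ((eps - eps_model(x_t, t)) ** 2).mean()
```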
An equivalent picture: learn the score \(\nabla_x\log p_t(x)\) instead of the noise — for the Gaussian forward process the two are related by \(\nabla_{x_t}\log p_t(x_t) = -\epsilon_\theta(x_t,t)/\sqrt{1-\bar\alpha_t}\), with \(\bar\alpha_t=\prod_{s\le t}(1-\beta_s)\).
CDVAE (Xie et al. 2022) — the first practical crystal generator with realistic S.U.N. rates.
DiffCSP (Jiao et al. 2023) — diffusion model that generates the lattice and fractional coordinates jointly.
DiffCSP++ (Jiao et al. 2024) — adds space-group conditioning during generation.
Modern crystal diffusion almost always uses equivariant networks: the predicted score must rotate with the structure and respect periodic translations, which E(3)-equivariant GNNs guarantee by construction.
MatterGen (Zeni et al. 2024, Microsoft Research) — diffusion model for property-conditioned generation.
Flow matching (Lipman et al. 2023) — a continuous-time alternative to diffusion.
Learn a vector field \(v_\theta(x,t)\) such that the trajectory
\[\dot x = v_\theta(x,t)\]
transports a simple base distribution \(p_0\) to the data distribution \(p_1\).
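A sketch of the standard conditional flow-matching recipe with a straight-line interpolation path (the names `vfield` and `x1` are illustrative; crystal models such as FlowMM use geometry-aware paths for periodic coordinates):

```python
import torch

def flow_matching_loss(vfield, x1):
    """Regress v_theta onto the velocity of the straight path
    x_t = (1 - t) * x0 + t * x1, which is simply x1 - x0."""
    x0 = torch.randn_like(x1)                          # base sample from p_0
    t = torch.rand(x1.shape[0], *([1] * (x1.dim() - 1)))
    x_t = (1.0 - t) * x0 + t * x1
    return ((vfield(x_t, t.flatten()) - (x1 - x0)) ** 2).mean()

@torch.no_grad()
def sample(vfield, shape, steps=20):
    """Euler integration of dx/dt = v_theta(x, t) from t=0 to t=1."""
    x = torch.randn(shape)
    for i in range(steps):
        t = torch.full((shape[0],), i / steps)
        x = x + (1.0 / steps) * vfield(x, t)
    return x
```

Because the learned trajectories are close to straight, a handful of Euler steps often suffices — the source of the low sampling cost in the comparison table below.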
FlowMM (Miller et al. 2024) — flow matching applied to crystals.
CrystaLLM (Antunes et al. 2024) — train a GPT-style model directly on CIF text.
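The mechanism, sketched with Hugging Face `transformers` (a stand-in, not CrystaLLM's released code; the `gpt2` checkpoint is a placeholder for a model actually fine-tuned on CIF corpora, without which the output is meaningless):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")          # placeholder checkpoint
lm = AutoModelForCausalLM.from_pretrained("gpt2")

# Prompting with the start of a CIF file conditions on composition "for free".
prompt = "data_NaCl\n_symmetry_space_group_name_H-M"
ids = tok(prompt, return_tensors="pt").input_ids
out = lm.generate(ids, max_new_tokens=200, do_sample=True, top_p=0.9)
print(tok.decode(out[0]))
```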
| Family | Sampling cost | S.U.N. rate | Conditioning | Notes |
|---|---|---|---|---|
| Diffusion | high (50–1000 steps) | strongest in 2023–2024 | flexible (classifier-free guidance) | dominant today |
| Flow matching | low (10–25 steps) | catching up fast | flexible (conditional vector fields) | deterministic ODE; likely default by 2026 |
| Autoregressive (LLM) | medium (token-by-token) | competitive | prompt-based | exploits LLM scaling |
| VAE / GAN | low (single pass) | low | limited | legacy / niche |
None of these are mutually exclusive — production pipelines often combine paradigms.
How do we ask for “a structure with bandgap \(\approx 2.0\) eV”?
Classifier guidance — train a separate property predictor \(p_\phi(y\mid x_t)\) on noisy samples and add its gradient \(\nabla_{x_t}\log p_\phi(y\mid x_t)\) to the score during sampling.
Classifier-free guidance (CFG) — train a single denoiser with the condition randomly dropped, then extrapolate between the conditional and unconditional predictions at sampling time.
For multi-property targets, mixed strategies (CFG plus a property-predictor head) are common.
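At sampling time, CFG is one line of arithmetic. A sketch, assuming a hypothetical denoiser that accepts an optional condition `y` (e.g. a target bandgap):

```python
def guided_eps(eps_model, x_t, t, y, w=2.0):
    """Classifier-free guidance: extrapolate from the unconditional to the
    conditional prediction. w = 0 is unconditional, w = 1 plain conditional,
    w > 1 pushes samples harder toward the condition."""
    eps_u = eps_model(x_t, t, y=None)   # condition dropped
    eps_c = eps_model(x_t, t, y=y)      # condition provided
    return eps_u + w * (eps_c - eps_u)
```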
A 2025-era production pipeline: generate candidates → cheap filters (charge balance, duplicate removal) → MLIP relaxation → DFT screening of the survivors → feed verified structures back into training (the active-learning loop).
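A sketch of the MLIP-relaxation stage with ASE, assuming a generated candidate saved as `candidate.cif` and the MACE-MP foundation potential as one example calculator (any ASE-compatible MLIP works):

```python
from ase.io import read
from ase.optimize import BFGS
from mace.calculators import mace_mp   # one MLIP option; assumed installed

# Relax a generated candidate before spending DFT on it.
atoms = read("candidate.cif")
atoms.calc = mace_mp(model="medium")       # pretrained foundation potential
BFGS(atoms, logfile=None).run(fmax=0.05)   # positions only; wrap the Atoms in
                                           # a cell filter to relax the lattice
e_per_atom = atoms.get_potential_energy() / len(atoms)
print(f"relaxed energy: {e_per_atom:.3f} eV/atom")
```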
GNoME (Merchant et al., Nature 2023, DeepMind) — Graph Networks for Materials Exploration; GNN-based stability screening at scale.

© Philipp Pelz - Materials Genomics