FAU Erlangen-Nürnberg
This deck is supplementary reading, not a standalone lecture.
You already have an embedding \(z\). Now what?
What this unit is not.
Recap — what we already have
Today — Unit 11 in one line
PCA
t-SNE (van der Maaten & Hinton, 2008)
perplexity controls neighbourhood size.
UMAP (McInnes et al., 2018)
n_neighbors, min_dist control granularity.
One-line decision rule. If you need axes you can name, use PCA. If you need clusters you can see, use t-SNE / UMAP, and then verify them.
Generic latent space (MFML W9 framing)
Materials latent space
Why PBC matters for \(z\)
Whose responsibility?
The equivariance promise
What goes wrong without it
Composition-only embeddings
Structure-aware embeddings
The recipe
Why both projections
What we see
What this says
What we see
Why this is non-trivial
The colour scale
Reading the stability map
Coarse separation
Per-element substructure
Within a chemistry family
Why prototypes work
The ABO\(_3\) slice
What we will see in §G
Five questions before you trust a latent-space figure
A figure that fails any of these questions is decoration.
Why UMAP layouts are not unique
n_neighbors \(\to\) different cluster topologies.
min_dist \(\to\) different cluster compactness.
What is robust vs what is not
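One way to separate robust structure from layout artefacts is to compare which points are neighbours in two layouts of the same samples. The sketch below is a minimal illustration under assumptions: the layouts are hypothetical hand-written 2-D coordinates rather than real UMAP output, and the helper names `knn_sets` and `mean_knn_overlap` are invented for this example. In practice the two layouts would come from runs with different n_neighbors, min_dist, or random seeds.

```python
def knn_sets(points, k):
    """For each 2-D point, return the index set of its k nearest neighbours."""
    neighbours = []
    for i, (xi, yi) in enumerate(points):
        # Rank all other points by squared distance; ties are broken by index.
        ranked = sorted(
            ((xi - xj) ** 2 + (yi - yj) ** 2, j)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
        neighbours.append({j for _, j in ranked[:k]})
    return neighbours


def mean_knn_overlap(layout_a, layout_b, k):
    """Mean Jaccard overlap of k-NN sets between two layouts of the same samples."""
    sets_a = knn_sets(layout_a, k)
    sets_b = knn_sets(layout_b, k)
    scores = [len(a & b) / len(a | b) for a, b in zip(sets_a, sets_b)]
    return sum(scores) / len(scores)


# Hypothetical layouts of the same five samples (illustrative numbers only).
layout_a = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0), (10.0, 0.0)]
# Same layout rotated by 90 degrees: coordinates change, neighbourhoods do not.
layout_rotated = [(-y, x) for x, y in layout_a]
# Same positions reassigned to different samples: neighbourhoods change.
layout_scrambled = [(0.0, 0.0), (5.0, 0.0), (1.0, 0.0), (10.0, 0.0), (6.0, 0.0)]

overlap_rotated = mean_knn_overlap(layout_a, layout_rotated, k=1)
overlap_scrambled = mean_knn_overlap(layout_a, layout_scrambled, k=1)
print(overlap_rotated, overlap_scrambled)
```

A rotated or mirrored layout keeps every neighbourhood intact, so a high overlap suggests the structure is a property of the data rather than of one particular run. Absolute positions and distances between clusters are not covered by this check.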
When PCA wins
What PCA cannot do
Use PCA and UMAP — they answer different questions.
The setting
Why this works
Outlier \(\neq\) noise
The outlier as a target
The novelty score
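The deck does not spell out a formula here, so the following is only one plausible choice, stated as an assumption: score a candidate by its mean Euclidean distance to its k nearest known materials in latent space. The function name `knn_novelty`, the value of k, and the toy 2-D vectors are all hypothetical.

```python
import math


def knn_novelty(query, reference, k=3):
    """Mean Euclidean distance from `query` to its k nearest reference embeddings."""
    if not 1 <= k <= len(reference):
        raise ValueError("k must be between 1 and the number of reference points")
    distances = sorted(math.dist(query, z) for z in reference)
    return sum(distances[:k]) / k


# Hypothetical 2-D latent vectors for known materials (illustrative only).
known = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]

score_inside = knn_novelty((0.5, 0.5), known, k=3)   # surrounded by known points
score_outside = knn_novelty((4.0, 4.0), known, k=3)  # far from every known point
print(score_inside, score_outside)
```

A score of this kind depends on the encoder, on the distance metric, and on how densely the reference set samples the space, so a high value flags a candidate for inspection rather than proving it is new.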
Caveats baked in
Two different anomalies
When each fires
The system
The latent map shows
The system
What the latent map shows
Two different jobs
They are adjoint, not redundant
The discovery loop in one slide
The latent map is a first filter
The famous example
\[\vec{\text{king}} - \vec{\text{man}} + \vec{\text{woman}} \approx \vec{\text{queen}}\]
The materials analogue
\[z_{\text{BaTiO}_3} - z_{\text{Ba}} + z_{\text{Sr}} \approx z_{\text{SrTiO}_3}\]
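The arithmetic above can be sketched as a nearest-neighbour lookup. This is a toy illustration under assumptions: the 4-D vectors are made up for the example and do not come from a trained encoder, elements and compounds are assumed to live in the same embedding space, and the helper name `analogy` is hypothetical.

```python
import math

# Hypothetical 4-D embeddings (illustrative numbers, not from a trained model).
embeddings = {
    "Ba":     [1.0, 0.0, 0.0, 0.0],
    "Sr":     [0.0, 1.0, 0.0, 0.0],
    "Ca":     [0.0, 0.0, 1.0, 0.0],
    "BaTiO3": [1.0, 0.0, 0.0, 1.0],
    "SrTiO3": [0.1, 0.9, 0.0, 1.0],
    "CaTiO3": [0.0, 0.0, 1.0, 1.0],
}


def analogy(source, remove, add, table):
    """Return (query vector, nearest entry) for z_source - z_remove + z_add."""
    query = [s - r + a for s, r, a in zip(table[source], table[remove], table[add])]
    # Exclude the three inputs so the lookup cannot trivially return one of them.
    candidates = {
        name: z for name, z in table.items() if name not in (source, remove, add)
    }
    nearest = min(candidates, key=lambda name: math.dist(query, candidates[name]))
    return query, nearest


query, nearest = analogy("BaTiO3", "Ba", "Sr", embeddings)
print(nearest)
```

The relation is approximate: the query vector lands near, not on, the target embedding, which is why the result is read off as a nearest neighbour rather than an exact match.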
The substitution vector
When it works, when it doesn’t
The interpolation path
\[z(t) = (1 - t)\, z_A + t\, z_B \quad t \in [0, 1]\]
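The interpolation formula translates directly into code. The sketch below samples evenly spaced values of \(t\) between two latent vectors; the function name `interpolate` and the 2-D endpoints are hypothetical, and decoding each point back to a material is assumed to happen elsewhere.

```python
def interpolate(z_a, z_b, n_points=5):
    """Return n_points latent vectors on the straight line from z_a to z_b."""
    if n_points < 2:
        raise ValueError("n_points must be at least 2")
    path = []
    for i in range(n_points):
        t = i / (n_points - 1)  # t runs from 0 to 1 inclusive
        path.append([(1 - t) * a + t * b for a, b in zip(z_a, z_b)])
    return path


# Hypothetical 2-D endpoints (illustrative only).
z_a = [0.0, 2.0]
z_b = [4.0, -2.0]
path = interpolate(z_a, z_b, n_points=5)
for point in path:
    print(point)
```

The first and last points reproduce \(z_A\) and \(z_B\), and every intermediate point is a convex combination of the two. Whether those intermediate points decode to physically sensible materials is a property of the encoder, not of the formula.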
Why latent interpolation beats raw
The gradient direction
The actionable axis
The design move
Caveats baked in
The series
Map as a curve in \(z\)
Without usable arithmetic
With usable arithmetic
Where arithmetic fails
The discipline
The trap
Why it’s wrong
The trap
Symptoms
The trap
The defence
The trap
The defence
The ablation idea
Why ablations matter
The protocol
What the comparison says
Five items, none optional
Without all five
The corpus
The encoder + projection
Three lobes
Sub-features
Supportable claims (with evidence)
Why these are supportable
Unsupportable narratives
The discipline
The end-to-end loop
The closed loop
The exercise repo
notebook/perovskite_latent.ipynb: full case-study notebook.
from_pretrained() one-liner.
Reproducibility is a requirement, not a bonus
Helps
Misleads when
What U12 does next
Plus the bridge to generative
What U13 does next
Why the latent space matters
Exercise (90 min, this afternoon)
n_neighbors).
Reading for next week
Next week (Unit 12): clustering and generative use of \(z\).
The single sentence to leave with: the materials latent space is a map, not a fact — read it carefully, navigate it deliberately, and challenge it always.

© Philipp Pelz - Materials Genomics