LCAO Recap and Choice of Basis
LCAO ansatz from Unit 2:
\[|\psi\rangle = \sum_{j=1}^{N_b} c_j\,|\chi_j\rangle\]
- Quality of result is bounded by the flexibility of \(\{\chi_j\}\)
- Basis families differ in: cusp behaviour, tail behaviour, integral evaluation cost, periodicity
- Trade-off: accuracy vs computational cost
Slater-Type Orbitals (STOs)
\[\chi(\mathbf{r}) \sim r^{n-1}\,e^{-\zeta r}\,Y_{\ell m}(\theta,\phi)\]
- \(\zeta\): orbital exponent
- \(Y_{\ell m}\): spherical harmonics, as for the hydrogen atom
- Physically correct: cusp at nucleus, exponential decay at infinity
- Few functions needed for high accuracy
STOs — The Catch
- Multicenter two-electron integrals (orbitals centred on up to four different atoms) have no simple closed-form solution
- Numerical integration is expensive
- Limits STOs to atoms and very small molecules
- Used in specialised codes (ADF) but not the standard choice
STOs: physically right, computationally awkward. We trade physics for tractable integrals.
Gaussian-Type Orbitals (GTOs)
Replace \(e^{-\zeta r}\) by a Gaussian \(e^{-\alpha r^2}\):
\[\chi(\mathbf{r}) \sim x^a y^b z^c\,e^{-\alpha r^2}\]
- Wrong cusp at the nucleus (smooth instead of pointed)
- Decays too fast at large \(r\)
- But: products of Gaussians on different centres are themselves Gaussians on a new centre
- Multicenter integrals become analytical — orders of magnitude faster
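The Gaussian product rule stated above can be checked numerically. The following is a minimal one-dimensional sketch; the function names and the example exponents and centres are arbitrary choices made for this illustration.

```python
import math

def gaussian(x, alpha, center):
    """Unnormalised 1D Gaussian exp(-alpha * (x - center)^2)."""
    return math.exp(-alpha * (x - center) ** 2)

def gaussian_product(alpha, a, beta, b):
    """Gaussian product theorem: return (prefactor, exponent, center)."""
    p = alpha + beta
    center = (alpha * a + beta * b) / p
    prefactor = math.exp(-alpha * beta / p * (a - b) ** 2)
    return prefactor, p, center

# Illustrative (arbitrary) exponents and centres
alpha, a = 0.8, -0.5
beta, b = 1.3, 1.0
K, p, P = gaussian_product(alpha, a, beta, b)

# Compare the pointwise product with the single Gaussian on a grid
grid = [-3.0 + 0.1 * i for i in range(61)]
max_dev = max(
    abs(gaussian(x, alpha, a) * gaussian(x, beta, b) - K * gaussian(x, p, P))
    for x in grid
)
print(f"new centre P = {P:.4f}, exponent p = {p:.2f}, max deviation = {max_dev:.2e}")
```

The new centre is the exponent-weighted mean of the two original centres. Applied to both orbital pairs, this reduces a four-centre integral to a two-centre one.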
Why GTOs Won
- Standard choice of essentially all molecular quantum-chemistry codes
- NWChem, PySCF, Gaussian, ORCA, Molpro — all GTO-based
- Speed advantage outweighs the loss in physical fidelity
- Cusp and tail errors are absorbed by using more Gaussians
The price of GTO speed is paid in basis-set size, not in algorithmic complexity.
Contracted Gaussians
A single GTO is a poor approximation to an STO; a fixed linear combination is much better:
\[\chi^{\rm CGTO}(\mathbf{r}) = \sum_{k=1}^{K} d_k\,g_k(\mathbf{r};\alpha_k)\]
- \(g_k\): primitive Gaussians with fixed exponents \(\alpha_k\)
- \(d_k\): contraction coefficients, fixed once and for all
- One contracted GTO ≈ one STO, evaluated as a sum of cheap Gaussians
- The SCF only varies the molecular-orbital coefficients \(c_j\), not the \(d_k\)
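As an illustration of a contraction, the sketch below evaluates the hydrogen 1s function of STO-3G next to the Slater function it approximates. The exponents and coefficients are the widely tabulated STO-3G values for hydrogen (scaled to \(\zeta = 1.24\)); the function names are chosen here for illustration only.

```python
import math

# Widely tabulated STO-3G parameters for the hydrogen 1s function (zeta = 1.24)
EXPONENTS = [3.42525091, 0.62391373, 0.16885540]
COEFFS = [0.15432897, 0.53532814, 0.44463454]
ZETA = 1.24

def primitive(r, alpha):
    """Normalised s-type primitive Gaussian."""
    return (2.0 * alpha / math.pi) ** 0.75 * math.exp(-alpha * r * r)

def cgto(r):
    """Contracted Gaussian: fixed linear combination of three primitives."""
    return sum(d * primitive(r, a) for d, a in zip(COEFFS, EXPONENTS))

def sto(r):
    """Normalised 1s Slater-type orbital."""
    return math.sqrt(ZETA ** 3 / math.pi) * math.exp(-ZETA * r)

def primitive_overlap(a, b):
    """Overlap of two normalised s primitives on the same centre."""
    return (2.0 * math.sqrt(a * b) / (a + b)) ** 1.5

# Norm of the contracted function from the primitive overlaps
norm = sum(
    di * dj * primitive_overlap(ai, aj)
    for di, ai in zip(COEFFS, EXPONENTS)
    for dj, aj in zip(COEFFS, EXPONENTS)
)

for r in (0.0, 0.5, 1.0, 2.0, 4.0):
    print(f"r = {r:3.1f}  STO = {sto(r):.4f}  CGTO = {cgto(r):.4f}")
print(f"<CGTO|CGTO> = {norm:.6f}")
```

The two functions agree well at intermediate distances, while the contracted Gaussian stays below the Slater function at the nucleus: the missing cusp.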
Basis-Set Naming Conventions (I)
- STO-3G (minimal): each STO approximated by 3 primitive Gaussians; one CGTO per occupied AO
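- 3-21G, 6-31G (split valence): each core orbital gets 1 CGTO; each valence orbital is split into an inner and an outer function for flexibility (6-31G: 6 primitives for the core, 3 + 1 for the valence)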
- 6-31G(d) = 6-31G* : add d-type polarisation functions on heavy atoms
- 6-31G(d,p) = 6-31G** : also add p-polarisation on hydrogen
Basis-Set Naming Conventions (II)
Correlation-consistent Dunning sets cc-pVnZ:
- \(n=\) D, T, Q, 5, 6 — double, triple, quadruple, … zeta
- Designed for systematic convergence to the basis-set limit
- “aug-” prefix adds diffuse functions (anions, weak interactions, excited states)
- Allow complete-basis-set extrapolation \(E(n) \to E(\infty)\)
Plane-wave codes (VASP, Quantum ESPRESSO) use a single kinetic-energy cutoff \(E_{\rm cut}\) instead — different convention, same idea.
Other Basis Choices
Plane waves
\[\chi_{\mathbf{k}}(\mathbf{r}) = e^{i\mathbf{k}\cdot\mathbf{r}}\]
- Delocalised; ideal for periodic crystals
- Systematic via \(E_{\rm cut}\)
- Need pseudopotentials or PAW for cores
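To make the role of \(E_{\rm cut}\) concrete, the sketch below counts the plane waves with \(|\mathbf{G}|^2/2 \le E_{\rm cut}\) (atomic units) in a cubic cell and compares the count with the continuum estimate \(V k_{\rm cut}^3/(6\pi^2)\). The cell size and cutoffs are hypothetical values chosen for illustration.

```python
import math

def count_plane_waves(box_length, e_cut):
    """Count G-vectors of a cubic cell with |G|^2 / 2 <= e_cut (atomic units)."""
    g_unit = 2.0 * math.pi / box_length
    n_max = int(math.sqrt(2.0 * e_cut) / g_unit)
    count = 0
    for n1 in range(-n_max, n_max + 1):
        for n2 in range(-n_max, n_max + 1):
            for n3 in range(-n_max, n_max + 1):
                g2 = g_unit ** 2 * (n1 * n1 + n2 * n2 + n3 * n3)
                if 0.5 * g2 <= e_cut:
                    count += 1
    return count

def estimate_plane_waves(box_length, e_cut):
    """Continuum estimate: volume * k_cut^3 / (6 pi^2)."""
    k_cut = math.sqrt(2.0 * e_cut)
    return box_length ** 3 * k_cut ** 3 / (6.0 * math.pi ** 2)

L = 10.0  # bohr, hypothetical cubic cell
n_low = count_plane_waves(L, 5.0)
n_high = count_plane_waves(L, 10.0)
print(n_low, round(estimate_plane_waves(L, 5.0)))
print(n_high, round(estimate_plane_waves(L, 10.0)))
```

The basis size grows as \(E_{\rm cut}^{3/2}\), so doubling the cutoff costs a factor of roughly 2.8 in basis functions.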
Wavelet & numerical bases
- Adaptive spatial resolution
- Mathematically clean systematic improvement
- Niche codes: BigDFT, ONETEP
- Numerical AOs: integrate on grids
Basis-Set Quality and Completeness
- Larger basis \(\Rightarrow\) more variational freedom \(\Rightarrow\) lower energy
- For a nested sequence of basis sets the HF energy decreases monotonically towards the HF limit as the basis grows (variational principle)
- For HF: typically saturated at quadruple- to quintuple-zeta quality (on the order of 50 to 100 basis functions per first-row atom)
- For correlated methods (MP2, CC): convergence is much slower
- Empirically: the correlation energy approaches \(E(\infty)\) as \(\sim n^{-3}\) in the cardinal number \(n\) of cc-pVnZ (D = 2, T = 3, …), so extrapolation is possible
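A two-point extrapolation follows directly from the \(n^{-3}\) model. The sketch below assumes \(E(n) = E(\infty) + A\,n^{-3}\) exactly and uses synthetic energies generated from that model, not real calculations; it is usually applied to the correlation energy only, since the HF part converges faster.

```python
def cbs_two_point(e_small, n_small, e_large, n_large):
    """Two-point extrapolation assuming E(n) = E_cbs + A * n**-3."""
    w_small, w_large = n_small ** 3, n_large ** 3
    return (w_large * e_large - w_small * e_small) / (w_large - w_small)

# Synthetic correlation energies generated from the model itself (not real data)
E_CBS_TRUE, A = -0.300, 0.150

def model_energy(n):
    return E_CBS_TRUE + A * n ** -3

e_tz, e_qz = model_energy(3), model_energy(4)   # cardinal numbers T = 3, Q = 4
e_cbs = cbs_two_point(e_tz, 3, e_qz, 4)
print(f"E(TZ) = {e_tz:.6f}, E(QZ) = {e_qz:.6f}, extrapolated = {e_cbs:.6f}")
```

With real data the model holds only approximately, so the extrapolated value carries a residual error that should be checked against a larger basis where affordable.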
Basis-Set Superposition Error (BSSE)
- When two molecules approach, each “borrows” basis functions from the other
- Artificially lowers interaction energies — looks like extra binding
- Worst for small basis sets, vanishes in the complete-basis limit
- Counterpoise correction (Boys-Bernardi): compute monomers in the full dimer basis
Always check basis-set convergence and BSSE for weak interactions (van der Waals, hydrogen bonds).
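Written out, the counterpoise-corrected interaction energy is as follows. In the notation used here (an assumption, not the slide's own), subscripts label the system and superscripts the basis, with all energies taken at the dimer geometry:

\[E_{\rm int}^{\rm CP} = E_{AB}^{AB} - E_{A}^{AB} - E_{B}^{AB}, \qquad \delta_{\rm BSSE} = \bigl(E_A^{A} - E_A^{AB}\bigr) + \bigl(E_B^{B} - E_B^{AB}\bigr) \ge 0\]

For a variational method each monomer energy can only go down when the partner's basis functions are added, which is why \(\delta_{\rm BSSE}\) is non-negative and the uncorrected interaction energy looks too attractive.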
The Mean-Field Idea
The hard part is electron-electron repulsion (written here in atomic units):
\[\sum_{i=1}^{N_e}\sum_{j>i}\frac{1}{|\mathbf{r}_i-\mathbf{r}_j|}\]
Hartree-Fock approximation: replace this by an average field. Each electron \(j\) feels the smeared-out density of the others through the Hartree potential:
\[V_{H,j}(\mathbf{r}) = \int\frac{\rho_{\ne j}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d\mathbf{r}'\]
The instantaneous correlation between electrons is lost — but the problem becomes separable into one-electron equations.
The Slater-Determinant Ansatz
Hartree-Fock restricts the trial wavefunction to a single Slater determinant of one-electron spin-orbitals:
\[|\psi_{\rm HF}\rangle = \frac{1}{\sqrt{N_e!}}\det\!\bigl[\phi_1\phi_2\cdots\phi_{N_e}\bigr]\]
- Antisymmetric by construction \(\Rightarrow\) obeys the antisymmetry principle for fermions
- Built from orbitals \(\{\phi_i\}\) to be determined variationally
- The Pauli exclusion principle is built in (two equal columns \(\Rightarrow\) zero)
The Fock Operator
Variational minimisation of \(\langle\psi_{\rm HF}|\hat{H}|\psi_{\rm HF}\rangle\) produces an effective one-electron eigenvalue problem with the Fock operator:
\[\hat{F} = -\frac{\hbar^2}{2m_e}\Delta - \frac{e^2}{4\pi\varepsilon_0}\sum_{k=1}^{N_{\rm at}}\frac{Z_k}{|\mathbf{r}-\mathbf{R}_k|} + \hat{V}_H - \hat{K}^{\rm exch}\]
- Kinetic energy + nuclear attraction (one-electron core)
- \(\hat{V}_H\): classical Hartree (Coulomb) potential
- \(\hat{K}^{\rm exch}\): exchange operator — non-classical, arises purely from antisymmetrisation
Coulomb and Exchange — Schematic
For two occupied orbitals \(\phi_i,\phi_j\):
\[J_{ij} = \iint \frac{|\phi_i(\mathbf{r}_1)|^2\,|\phi_j(\mathbf{r}_2)|^2}{|\mathbf{r}_1-\mathbf{r}_2|}\,d\mathbf{r}_1 d\mathbf{r}_2 \quad\text{(Coulomb)}\]
\[K_{ij} = \iint \frac{\phi_i^*(\mathbf{r}_1)\phi_j^*(\mathbf{r}_2)\phi_j(\mathbf{r}_1)\phi_i(\mathbf{r}_2)}{|\mathbf{r}_1-\mathbf{r}_2|}\,d\mathbf{r}_1 d\mathbf{r}_2 \quad\text{(Exchange)}\]
- \(J\): classical electrostatic repulsion of two charge clouds
- \(K\): pure quantum-statistical effect — no classical analogue
- Crucial: \(K\) cancels self-interaction (\(J_{ii} = K_{ii}\))
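The cancellation is easiest to see in the HF energy written in terms of these integrals. As a sketch, with \(h_{ii}\) denoting the one-electron (kinetic plus nuclear attraction) expectation value of \(\phi_i\), a symbol introduced here for illustration, and sums running over occupied spin-orbitals:

\[E_{\rm HF} = \sum_{i=1}^{N_e} h_{ii} + \frac{1}{2}\sum_{i=1}^{N_e}\sum_{j=1}^{N_e}\bigl(J_{ij} - K_{ij}\bigr)\]

The double sum may include \(i = j\) because \(J_{ii} - K_{ii} = 0\): no electron repels itself. \(K_{ij}\) vanishes for spin-orbitals of opposite spin, so exchange lowers the energy only between same-spin electrons.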
The Hartree-Fock Equations
The HF eigenvalue problem reads
\[\hat{F}\,\phi_i = \varepsilon_i\,\phi_i\]
- \(\phi_i\): molecular spin-orbital
- \(\varepsilon_i\): orbital energy (Lagrange multiplier from orthonormality constraints)
- Nonlinear: \(\hat{F}\) depends on the orbitals it tries to compute
- \(\Rightarrow\) must be solved self-consistently
The Roothaan-Hall Equations
Insert LCAO \(\phi_i = \sum_\mu C_{\mu i}\,\chi_\mu\) into the HF equations:
\[\boxed{\;\mathbf{F}\,\mathbf{C} = \mathbf{S}\,\mathbf{C}\,\boldsymbol{\varepsilon}\;}\]
- \(F_{\mu\nu} = \langle\chi_\mu|\hat{F}|\chi_\nu\rangle\): Fock matrix
- \(S_{\mu\nu} = \langle\chi_\mu|\chi_\nu\rangle\): overlap matrix (basis non-orthogonal)
- \(\boldsymbol{\varepsilon}\): diagonal matrix of orbital energies
- \(\mathbf{C}\): MO coefficients in the AO basis
A generalised eigenvalue problem of size \(N_b\) — solvable by standard linear algebra.
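For a two-function basis the generalised eigenvalue problem can be solved by hand, which makes the role of \(\mathbf{S}\) explicit. The sketch below solves \(\det(\mathbf{F} - \varepsilon\mathbf{S}) = 0\) for a symmetric 2×2 case; the matrix elements are hypothetical numbers, not taken from a real calculation.

```python
import math

def solve_roothaan_2x2(F, S):
    """Orbital energies of F C = S C eps for symmetric 2x2 F and S.

    Solves det(F - eps * S) = 0, which is a quadratic in eps.
    """
    (f11, f12), (_, f22) = F
    (s11, s12), (_, s22) = S
    a = s11 * s22 - s12 * s12
    b = -(f11 * s22 + f22 * s11 - 2.0 * f12 * s12)
    c = f11 * f22 - f12 * f12
    disc = math.sqrt(b * b - 4.0 * a * c)
    return sorted(((-b - disc) / (2.0 * a), (-b + disc) / (2.0 * a)))

# Hypothetical matrix elements (hartree) for two equivalent 1s functions
alpha, beta, s = -1.0, -0.5, 0.4
F = [[alpha, beta], [beta, alpha]]
S = [[1.0, s], [s, 1.0]]
eps_bonding, eps_antibonding = solve_roothaan_2x2(F, S)
print(f"bonding: {eps_bonding:.4f}, antibonding: {eps_antibonding:.4f}")
```

For this symmetric case the roots reduce to \((\alpha+\beta)/(1+s)\) and \((\alpha-\beta)/(1-s)\). For larger \(N_b\), codes typically orthogonalise the basis first and then call a standard symmetric eigensolver.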
The SCF Procedure
1. Guess initial orbitals / density matrix \(\mathbf{P}^{(0)}\)
2. Build \(\mathbf{S}\), core Hamiltonian, and \(\mathbf{F}^{(n)}\) from \(\mathbf{P}^{(n)}\)
3. Solve \(\mathbf{F}\mathbf{C} = \mathbf{S}\mathbf{C}\boldsymbol{\varepsilon}\)
4. Build new \(\mathbf{P}^{(n+1)}\) from occupied orbitals
5. Check convergence (\(\Delta E\), \(\Delta \mathbf{P}\)); if not converged, go to 2
Convergence is not guaranteed — DIIS, level shifting, fractional occupations are common numerical aids.
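The structure of the loop, and the effect of a simple convergence aid, can be seen in a toy model. The sketch below is not Hartree-Fock for a real molecule: it assumes a two-site, two-electron model in an orthonormal basis (\(\mathbf{S} = \mathbf{1}\)) whose Fock matrix depends on the site occupations through a hypothetical on-site repulsion `U`. All parameter values and function names are illustrative.

```python
import math

def lowest_eigenpair(a, b, d):
    """Lowest eigenvalue/eigenvector of the symmetric matrix [[a, b], [b, d]], b != 0."""
    lam = 0.5 * (a + d) - math.sqrt(0.25 * (a - d) ** 2 + b * b)
    x, y = b, lam - a
    norm = math.hypot(x, y)
    return lam, (x / norm, y / norm)

def occupations_from_fock(n1, e1=0.0, e2=0.5, t=1.0, U=2.0):
    """One Fock build plus diagonalisation: occupation of site 1 -> new occupation."""
    n2 = 2.0 - n1
    f11 = e1 + 0.5 * U * n1          # mean-field repulsion on site 1
    f22 = e2 + 0.5 * U * n2          # mean-field repulsion on site 2
    _, (c1, _c2) = lowest_eigenpair(f11, -t, f22)
    return 2.0 * c1 * c1             # lowest orbital is doubly occupied

def scf(n1=1.0, mix=0.5, tol=1e-10, max_iter=200):
    """Iterate to self-consistency; mix < 1 damps the density update."""
    for iteration in range(1, max_iter + 1):
        n1_new = occupations_from_fock(n1)
        if abs(n1_new - n1) < tol:
            return n1_new, iteration, True
        n1 = (1.0 - mix) * n1 + mix * n1_new
    return n1, max_iter, False

n1_scf, n_iter, converged = scf(mix=0.5)
_, _, undamped_converged = scf(mix=1.0)
print(f"damped: n1 = {n1_scf:.6f} after {n_iter} cycles, converged = {converged}")
print(f"undamped converged within 200 cycles: {undamped_converged}")
```

With these toy parameters the undamped iteration oscillates and does not reach the tolerance within 200 cycles, while 50% damping converges in a handful. DIIS and level shifting are more sophisticated remedies for the same problem.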
Cost of HF
- Dominated by two-electron integrals \(\propto N_b^4\)
- Storage and transformation of \((\mu\nu|\lambda\sigma)\) becomes the bottleneck
- Modern integral screening + linear-scaling tricks for large systems
- Each SCF cycle scales as \(\mathcal{O}(N_b^3)\) (diagonalisation) or \(\mathcal{O}(N_b^4)\) (Fock build)
HF is the cheapest ab initio wavefunction method, and the starting point for nearly every post-HF method.
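The \(N_b^4\) growth of the two-electron integrals can be made tangible by counting. The sketch below assumes real basis functions, so the eightfold permutational symmetry of \((\mu\nu|\lambda\sigma)\) applies, and estimates storage at 8 bytes per integral; the function name is chosen here for illustration.

```python
def unique_two_electron_integrals(n_basis):
    """Number of symmetry-unique (mu nu|lambda sigma) for real basis functions."""
    n_pairs = n_basis * (n_basis + 1) // 2      # unique (mu nu) pairs
    return n_pairs * (n_pairs + 1) // 2         # unique pairs of pairs

for n_basis in (10, 100, 1000):
    n_int = unique_two_electron_integrals(n_basis)
    gigabytes = n_int * 8 / 1e9                 # 8 bytes per double
    print(f"N_b = {n_basis:5d}: {n_int:.3e} integrals, {gigabytes:.3e} GB")
```

Symmetry saves roughly a factor of 8, but the count still grows as \(N_b^4/8\). This is why large calculations recompute integrals on the fly and screen out negligible ones instead of storing them all.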
RHF, ROHF, UHF
- RHF (Restricted): \(\alpha\) and \(\beta\) share spatial orbitals; closed-shell systems
- ROHF (Restricted Open-Shell): doubly occupied orbitals use one spatial function for both spins; the unpaired electrons sit in additional singly occupied orbitals
- UHF (Unrestricted): \(\alpha\) and \(\beta\) have different spatial orbitals
- UHF: more flexibility, lower energy for radicals, but suffers from spin contamination (\(\langle\hat{S}^2\rangle\) deviates from the exact \(S(S+1)\))
What HF Misses: Correlation Energy
By the variational principle and the single-determinant restriction:
\[E_{\rm HF}^{\rm limit} \ge E_0\]
The correlation energy is defined as the gap to the exact non-relativistic ground state:
\[E_{\rm corr} = E_0 - E_{\rm HF}^{\rm limit} \;<\; 0\]
- \(|E_{\rm corr}|\) is small relative to the total energy (typically below 1%) but chemically critical (~1 eV per bond)
- Captures the instantaneous electron-electron avoidance the mean field smears out
- HF typically underestimates bond lengths (bonds slightly too short) and underestimates binding energies
- Post-HF methods are designed to recover (most of) \(E_{\rm corr}\)
A Famous HF Failure: Dispersion
- HF leaves noble-gas dimers (He-He, Ar-Ar) unbound: the interaction curve is essentially purely repulsive
- Real atoms are bound by London dispersion — a pure correlation effect
- MP2 already gives qualitatively correct attractive wells
- Lesson: correlation is not optional for non-covalent interactions
Density Functional Theory
Why DFT?
- Wavefunction methods scale steeply: HF \(\mathcal{O}(N^4)\), MP2 \(\mathcal{O}(N^5)\), CCSD(T) \(\mathcal{O}(N^7)\)
- Truly exact methods (FCI) scale factorially
- DFT typically scales as \(\mathcal{O}(N^3)\) to \(\mathcal{O}(N^4)\) — affordable for \(\sim 1000\) atoms
- Workhorse of computational materials science: VASP, Quantum ESPRESSO, CP2K, ORCA, ABINIT
- Generates the bulk of data fueling materials genomics
The Central Object: The Electron Density
Instead of \(\psi(\mathbf{r}_1,\ldots,\mathbf{r}_{N_e})\), a function of \(3N_e\) spatial variables, work with
\[\rho(\mathbf{r}) = N_e\int|\psi(\mathbf{r},\mathbf{r}_2,\ldots,\mathbf{r}_{N_e})|^2\,d\mathbf{r}_2\cdots d\mathbf{r}_{N_e}\]
- Function of just 3 spatial variables, regardless of \(N_e\)
- Directly observable (X-ray scattering)
- Integrates to the number of electrons: \(\int\rho\,d\mathbf{r} = N_e\)
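The normalisation condition is easy to verify for the one case where the density is known in closed form. The sketch below integrates the hydrogen ground-state density \(\rho(r) = e^{-2r}/\pi\) (atomic units) on a radial grid; the grid parameters are arbitrary choices for this illustration.

```python
import math

def density_h1s(r):
    """Ground-state density of the hydrogen atom in atomic units."""
    return math.exp(-2.0 * r) / math.pi

def integrate_radial(f, r_max=30.0, n_steps=30000):
    """Midpoint-rule integral of f(r) * 4 pi r^2 from 0 to r_max."""
    h = r_max / n_steps
    total = 0.0
    for i in range(n_steps):
        r = (i + 0.5) * h
        total += f(r) * 4.0 * math.pi * r * r
    return total * h

n_electrons = integrate_radial(density_h1s)
print(f"integrated density = {n_electrons:.8f}")
```

The same check, with the density assembled from occupied orbitals, is a common sanity test of integration grids in DFT codes.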
First Hohenberg-Kohn Theorem (1964)
Theorem (HK1). The ground-state density \(\rho_0(\mathbf{r})\) uniquely determines the external potential \(V_{\rm ext}(\mathbf{r})\) — and hence the entire Hamiltonian — up to a constant.
Consequences:
- All ground-state properties are functionals of \(\rho_0\)
- Total energy: \(E[\rho_0]\)
- Wavefunction: \(\psi[\rho_0]\)
- In principle, knowing \(\rho_0(\mathbf{r})\) tells you everything
Second Hohenberg-Kohn Theorem
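Theorem (HK2). For a given external potential there exists an energy functional \(E[\rho]\), whose kinetic and electron-electron parts are universal (system-independent), such that the exact ground-state density minimises it: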
\[E_0 = \min_{\rho}\,E[\rho], \qquad \int\rho\,d\mathbf{r} = N_e\]
A variational principle in density space — analogous to \(\min_\psi\langle\psi|\hat{H}|\psi\rangle\), but over a much simpler object.
The catch: HK1 + HK2 are existence statements. They do not tell us the form of \(E[\rho]\).
Decomposing the Energy Functional
Schematically:
\[E[\rho] = T[\rho] + V_{\rm ne}[\rho] + J[\rho] + E_{\rm xc}[\rho]\]
- \(T[\rho]\): kinetic energy
- \(V_{\rm ne}[\rho]\): nucleus-electron attraction (known, classical)
- \(J[\rho]\): classical Coulomb repulsion of \(\rho\) with itself (known)
- \(E_{\rm xc}[\rho]\): exchange-correlation — the unknown piece
\(T[\rho]\) as a pure density functional is also unknown — Thomas-Fermi attempts give terrible accuracy.
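How poor the Thomas-Fermi kinetic functional is can be checked on the hydrogen atom, where the exact kinetic energy is 0.5 hartree. The sketch below evaluates \(T_{\rm TF}[\rho] = C_{\rm TF}\int\rho^{5/3}\,d\mathbf{r}\) with \(C_{\rm TF} = \tfrac{3}{10}(3\pi^2)^{2/3}\) for the exact 1s density; the function names and grid parameters are illustrative.

```python
import math

C_TF = 0.3 * (3.0 * math.pi ** 2) ** (2.0 / 3.0)   # Thomas-Fermi constant, atomic units

def density_h1s(r):
    """Ground-state density of the hydrogen atom in atomic units."""
    return math.exp(-2.0 * r) / math.pi

def thomas_fermi_kinetic(density, r_max=30.0, n_steps=30000):
    """T_TF = C_TF * integral of rho^(5/3) over all space (radial midpoint rule)."""
    h = r_max / n_steps
    total = 0.0
    for i in range(n_steps):
        r = (i + 0.5) * h
        total += density(r) ** (5.0 / 3.0) * 4.0 * math.pi * r * r
    return C_TF * total * h

t_tf = thomas_fermi_kinetic(density_h1s)
t_exact = 0.5   # exact kinetic energy of the hydrogen ground state (hartree)
relative_error = (t_tf - t_exact) / t_exact
print(f"T_TF = {t_tf:.4f} Ha, relative error = {relative_error:.1%}")
```

An error of this size in the kinetic energy alone is far larger than any chemical energy difference, which motivates computing the kinetic energy from orbitals instead.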
The Kohn-Sham Trick (1965)
Reintroduce orbitals to get \(T\) almost right. Define a fictitious non-interacting system with the same density \(\rho\):
\[\rho(\mathbf{r}) = \sum_{i=1}^{N_e}|\psi_i^{\rm KS}(\mathbf{r})|^2\]
The KS kinetic energy
\[T_s[\rho] = -\frac{\hbar^2}{2m_e}\sum_i\langle\psi_i^{\rm KS}|\Delta|\psi_i^{\rm KS}\rangle\]
is known exactly from the orbitals — and accounts for >99% of the true \(T[\rho]\).
The Kohn-Sham Equations
The KS orbitals satisfy a set of one-electron equations:
\[\left[-\frac{\hbar^2}{2m_e}\Delta + V_{\rm KS}[\rho](\mathbf{r})\right]\psi_i^{\rm KS} = \varepsilon_i\,\psi_i^{\rm KS}\]
The KS potential bundles all interactions:
\[V_{\rm KS}[\rho] = V_{\rm ext} + V_H[\rho] + V_{\rm xc}[\rho]\]
with \(V_{\rm xc} = \delta E_{\rm xc}/\delta\rho\) — the exchange-correlation potential.
- Same SCF cycle structure as Hartree-Fock
- Replaces the non-local exchange operator by a local, multiplicative \(V_{\rm xc}(\mathbf{r})\) (for LDA/GGA; hybrids keep a non-local part)
- The orbitals \(\psi_i^{\rm KS}\) are mathematical auxiliaries — not the true wavefunction
Status of \(E_{\rm xc}[\rho]\)
- Contains everything beyond classical Coulomb that a non-interacting reference cannot capture
- Exact form is unknown and (provably) very complicated
- Must be approximated — the entire art of practical DFT
- “DFT” as used in practice = “DFT with approximate \(E_{\rm xc}\)”
HK theorems guarantee an exact \(E[\rho]\) exists — they give us no way to construct it. DFT is exact in principle, approximate in practice.
LDA — Local Density Approximation
Use the XC energy of the uniform electron gas, evaluated at the local density:
\[E_{\rm xc}^{\rm LDA}[\rho] = \int \rho(\mathbf{r})\,\varepsilon_{\rm xc}\bigl(\rho(\mathbf{r})\bigr)\,d\mathbf{r}\]
- Simplest and oldest functional
- Surprisingly good geometries for solids (bulk metals, oxides)
- Overbinds: bond lengths too short, atomisation energies too high
- Inadequate for many molecular applications
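For the exchange part the uniform-gas result is known in closed form (Dirac exchange). In atomic units and for a spin-unpolarised density:

\[\varepsilon_x^{\rm LDA}(\rho) = -\frac{3}{4}\left(\frac{3}{\pi}\right)^{1/3}\rho^{1/3} \quad\Longrightarrow\quad E_x^{\rm LDA}[\rho] = -\frac{3}{4}\left(\frac{3}{\pi}\right)^{1/3}\int\rho(\mathbf{r})^{4/3}\,d\mathbf{r}\]

The correlation part has no such closed form; practical LDA functionals use parametrisations fitted to quantum Monte Carlo data for the uniform electron gas.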
GGA — Generalised Gradient Approximation
Add dependence on the density gradient \(\nabla\rho\):
\[E_{\rm xc}^{\rm GGA}[\rho] = \int f\bigl(\rho,\nabla\rho\bigr)\,d\mathbf{r}\]
- Distinguishes slowly- vs rapidly-varying densities
- Examples: PBE (solids, materials community standard), BLYP (chemistry community)
- Significant improvement over LDA for molecules
- Still misses long-range correlation (van der Waals)
Hybrid Functionals
Mix in a fraction of exact (HF) exchange:
\[E_{\rm xc}^{\rm hyb} = a\,E_x^{\rm HF} + (1-a)\,E_x^{\rm GGA} + E_c^{\rm GGA}\]
- B3LYP (\(a\!\approx\!0.20\)): chemistry workhorse for decades
- PBE0 (global hybrid, \(a = 0.25\)), HSE06 (screened, range-separated hybrid for solids)
- Better band gaps and barrier heights than pure GGA
- Computationally more expensive: HF exchange brings back \(\mathcal{O}(N^4)\)
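The one-parameter form above fits PBE0 directly. B3LYP is a three-parameter variant of the same idea; as a sketch of its standard definition:

\[E_{\rm xc}^{\rm B3LYP} = E_x^{\rm LDA} + a_0\bigl(E_x^{\rm HF} - E_x^{\rm LDA}\bigr) + a_x\,\Delta E_x^{\rm B88} + E_c^{\rm VWN} + a_c\bigl(E_c^{\rm LYP} - E_c^{\rm VWN}\bigr)\]

with \(a_0 = 0.20\), \(a_x = 0.72\), \(a_c = 0.81\). The HF exchange fraction \(a_0\) is the quantity quoted as \(a \approx 0.20\).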
Jacob’s Ladder of Functionals
- LDA — local density only
- GGA — density + gradient
- meta-GGA — also kinetic-energy density \(\tau\) (e.g. SCAN)
- Hybrid GGA — fraction of HF exchange (B3LYP, PBE0)
- Double hybrid — also a fraction of MP2 correlation (B2PLYP)
Higher rungs \(\Rightarrow\) generally more accurate, more expensive, less robustly transferable.
Accuracy does not improve monotonically with every rung for every system.
Pseudopotentials
For heavier atoms, replace the strong nuclear Coulomb + core electrons by a smooth effective potential:
\[-\frac{Z}{r} \;\longrightarrow\; V_{\rm ps}(r)\]
- Core electrons are chemically inert — wasteful to treat explicitly
- Removes rapid oscillations near the nucleus \(\Rightarrow\) smaller basis / lower \(E_{\rm cut}\)
- Norm-conserving, ultrasoft, PAW (projector-augmented-wave) variants
- Transferability between chemical environments must be tested
DFT Pitfall — Self-Interaction Error
- The Hartree term \(J[\rho]\) includes the repulsion of each electron with its own density; the exact \(E_{\rm xc}\) would cancel it
- HF’s exchange cancels this exactly; LDA/GGA cancel it only approximately
- Drives delocalisation: electrons artificially smeared over many sites
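- Toy example: the H\(_2^+\) dissociation curve goes to the wrong limit with LDA/PBE/B3LYP
- True dissociation: \(E({\rm H}_2^+)\to E({\rm H}) + E({\rm H}^+) = E({\rm H})\). Approximate functionals smear half an electron over each proton and end up far too low in energy at large separation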
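The size of the error can be estimated for the simplest one-electron system, the hydrogen atom, where the exact exchange energy equals \(-J[\rho]\) and the correlation energy is zero. The sketch below compares the Hartree self-repulsion with spin-polarised LDA exchange for the exact 1s density. It ignores LDA correlation, and the function names and grid parameters are illustrative assumptions.

```python
import math

C_X = 0.75 * (3.0 / math.pi) ** (1.0 / 3.0)   # Dirac exchange constant, atomic units

def density_h1s(r):
    """Ground-state density of the hydrogen atom in atomic units."""
    return math.exp(-2.0 * r) / math.pi

def hartree_potential_h1s(r):
    """Closed-form Hartree potential of the hydrogen 1s density."""
    return 1.0 / r - math.exp(-2.0 * r) * (1.0 + 1.0 / r)

def radial_integral(f, r_max=30.0, n_steps=30000):
    """Midpoint-rule integral of f(r) * 4 pi r^2 from 0 to r_max."""
    h = r_max / n_steps
    return h * sum(
        f((i + 0.5) * h) * 4.0 * math.pi * ((i + 0.5) * h) ** 2
        for i in range(n_steps)
    )

# Hartree self-repulsion of the single electron, J = 1/2 * int rho * V_H
j_self = 0.5 * radial_integral(lambda r: density_h1s(r) * hartree_potential_h1s(r))

# Fully spin-polarised LDA exchange: 2^(1/3) times the unpolarised expression
e_x_lsda = -(2.0 ** (1.0 / 3.0)) * C_X * radial_integral(
    lambda r: density_h1s(r) ** (4.0 / 3.0)
)

self_interaction_error = j_self + e_x_lsda   # exact exchange would give exactly zero
print(f"J = {j_self:.4f} Ha, E_x(LSDA) = {e_x_lsda:.4f} Ha, "
      f"residual = {self_interaction_error:.4f} Ha")
```

The residual of a few hundredths of a hartree is of the order of 1 eV, the same scale as chemical bond energies.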
DFT Pitfall — Van der Waals
- Standard LDA/GGA functionals lack long-range dispersion (London forces)
- Layered materials (graphite), molecular crystals, biological binding poorly described
- Remedies: DFT-D (Grimme empirical correction), VV10/non-local correlation, vdW-DF
- Standard reflex: always check whether dispersion is added when reading “DFT” results
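The simplest empirical correction adds a damped pairwise term to the DFT energy. As a sketch of the general DFT-D form (the scaling factor \(s_6\), the coefficients \(C_6^{ij}\) and the damping function \(f_{\rm damp}\) depend on the specific scheme and functional):

\[E_{\rm DFT\text{-}D} = E_{\rm DFT} + E_{\rm disp}, \qquad E_{\rm disp} = -s_6\sum_{i<j}\frac{C_6^{ij}}{R_{ij}^6}\,f_{\rm damp}(R_{ij})\]

The damping function switches the correction off at short range, where the functional already describes the interaction and the bare \(R^{-6}\) term would diverge.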
DFT Pitfall — Band Gaps
- KS gap \(\ne\) true fundamental gap (derivative discontinuity of \(E_{\rm xc}\))
- LDA/GGA systematically underestimate gaps by 30-50%
- Hybrids (HSE06) and meta-GGA (SCAN) close part of the gap
- GW many-body perturbation theory is the principled fix — but expensive
For ML on materials data: bandgap labels from PBE may need a calibration to experiment.
DFT — Summary
- DFT is not systematically improvable once a functional is chosen
- Equivalence with the exact Schrödinger equation is lost the moment we approximate \(E_{\rm xc}\)
- Quality depends on functional choice, basis, pseudopotential, dispersion correction
- Despite caveats — the workhorse of materials simulation