Week 1 Summary: What makes materials data special?
Cross-Book Summary
1. The Concept of Data-Based Modeling
- Data-based vs. First-Principles: Top-down ML vs. bottom-up physics.
- Traceability: White (explainable), Grey (hybrid), and Black-Box (opaque) models.
- Overfitting: A model with excess complexity fits noise in the training data and fails to generalize.
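To make the overfitting point concrete, here is a minimal sketch using NumPy polynomial fits. The linear ground truth, the noise level, and the sample sizes are assumptions chosen for illustration, not values from the course material.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed ground truth: a linear trend with Gaussian measurement noise.
def noisy_line(x):
    return 2.0 * x + rng.normal(0.0, 0.2, size=x.shape)

x_train = np.linspace(-1.0, 1.0, 10)      # small training set
x_test = rng.uniform(-1.0, 1.0, 200)      # unseen points from the same range
y_train = noisy_line(x_train)
y_test = noisy_line(x_test)

def train_and_test_mse(degree):
    """Fit a polynomial of the given degree and return (train MSE, test MSE)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train, test

train_lo, test_lo = train_and_test_mse(1)   # matches the true model complexity
train_hi, test_hi = train_and_test_mse(9)   # interpolates all 10 training points

print(f"degree 1: train={train_lo:.4f}  test={test_lo:.4f}")
print(f"degree 9: train={train_hi:.4f}  test={test_hi:.4f}")
```

The degree-9 fit drives the training error to essentially zero because it passes through every noisy point. Its error on unseen data is typically larger than that of the simple linear fit.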
2. Foundations of Data
- Types: Nominal, Ordinal, Cardinal, Binary.
- Scales: Nominal, Ordinal, Interval, Ratio.
- Uncertainty: A value is only interpretable with its units and a stated measurement error.
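One way to illustrate why the scale of a quantity matters is temperature. Celsius is an interval scale (differences are meaningful, ratios are not), while Kelvin is a ratio scale. The temperatures below are arbitrary illustration values.

```python
def celsius_to_kelvin(t_c):
    """Convert a Celsius temperature to Kelvin."""
    return t_c + 273.15

t1_c, t2_c = 10.0, 20.0  # arbitrary example temperatures in degrees Celsius

# Differences agree on both scales, so they are meaningful on an interval scale.
diff_celsius = t2_c - t1_c
diff_kelvin = celsius_to_kelvin(t2_c) - celsius_to_kelvin(t1_c)

# Ratios disagree: "twice as hot" only makes sense on the ratio scale (Kelvin).
ratio_celsius = t2_c / t1_c
ratio_kelvin = celsius_to_kelvin(t2_c) / celsius_to_kelvin(t1_c)

print(f"difference: {diff_celsius:.2f} degC vs {diff_kelvin:.2f} K")
print(f"ratio: {ratio_celsius:.3f} (Celsius) vs {ratio_kelvin:.3f} (Kelvin)")
```

The Celsius ratio of 2 is an artifact of where zero was placed. The Kelvin ratio shows the physical increase is only about 3.5 %.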
3. Materials Science Specifics
- PSPP Paradigm: Processing-Structure-Property-Performance dependency.
- Noise: Physical noise, aliasing, and instrument bias.
- Scarcity: High-quality data is expensive and rare.
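The aliasing item above can be sketched numerically. A signal above the Nyquist frequency, once sampled, is indistinguishable from a lower-frequency signal. The sampling rate and signal frequency here are assumed values for illustration only.

```python
import numpy as np

fs = 10.0                    # assumed sampling rate in Hz, so Nyquist is 5 Hz
t = np.arange(50) / fs       # 5 seconds of sample times
f_true = 9.0                 # assumed signal frequency, above Nyquist
f_alias = fs - f_true        # apparent frequency after sampling: 1 Hz

samples = np.sin(2 * np.pi * f_true * t)
# A 1 Hz sine with flipped sign produces exactly the same sample values.
alias = -np.sin(2 * np.pi * f_alias * t)

indistinguishable = np.allclose(samples, alias)
print("9 Hz and 1 Hz signals give the same samples:", indistinguishable)
```

Once the data are sampled this way, no amount of post-processing can tell the two signals apart, which is why the acquisition rate belongs in the metadata.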
90-Minute Lecture Strategy
Part 1: Introduction & Philosophy
- AI 4 Materials goals.
- Hype vs. Reality.
- Convergence of high-throughput, simulation, and ML.
Part 2: Models in Engineering
- Prediction vs. Explanation.
- Physics-based vs. Data-driven.
- Moving toward White-Box ML.
Part 3: Special Materials Data
- PSPP graph.
- Multi-modal data types.
- Small Data challenges.
- Physical Priors.
Part 4: Data Quality
- Categorizing data.
- Metadata and Units.
- Error propagation.
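As an illustration of error propagation, the sketch below applies first-order (Gaussian) propagation to a density computed from mass and volume. It assumes the two measurement errors are independent. The function name and the numerical values are hypothetical.

```python
import math

def density_with_uncertainty(mass, sigma_mass, volume, sigma_volume):
    """Return (rho, sigma_rho) for rho = mass / volume.

    First-order propagation with independent errors: the relative
    uncertainties of a quotient add in quadrature.
    """
    rho = mass / volume
    rel = math.sqrt((sigma_mass / mass) ** 2 + (sigma_volume / volume) ** 2)
    return rho, rho * rel

# Hypothetical measurement: 10.0 +/- 0.1 g and 2.0 +/- 0.05 cm^3.
rho, sigma_rho = density_with_uncertainty(10.0, 0.1, 2.0, 0.05)
print(f"rho = {rho:.2f} +/- {sigma_rho:.2f} g/cm^3")
```

Here the volume contributes a 2.5 % relative error against 1 % for the mass, so improving the volume measurement would reduce the final uncertainty the most.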
Part 5: CRISP-DM for Labs
- Adapting industrial standards.
- Deployment.
- Correlation does not imply causation.
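A minimal synthetic sketch of the correlation-versus-causation point: two variables that never influence each other still correlate strongly when both depend on a hidden third variable. The variable roles suggested in the comments are hypothetical, and all data are simulated.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000

z = rng.normal(size=n)               # hidden confounder (e.g. a processing temperature)
x = z + 0.5 * rng.normal(size=n)     # depends on z only (e.g. a structural feature)
y = z + 0.5 * rng.normal(size=n)     # depends on z only (e.g. a measured property)

raw_corr = np.corrcoef(x, y)[0, 1]

def residual(v, confounder):
    """Remove the linear effect of the confounder from v."""
    slope, intercept = np.polyfit(confounder, v, 1)
    return v - (slope * confounder + intercept)

# Partial correlation: correlate what is left after controlling for z.
partial_corr = np.corrcoef(residual(x, z), residual(y, z))[0, 1]

print(f"raw correlation:     {raw_corr:.2f}")
print(f"partial correlation: {partial_corr:.2f}")
```

The raw correlation is strong, yet it largely vanishes once the confounder is controlled for. In a lab setting this only works if the confounding variable was recorded in the first place.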
Quarto Website Update (Summary)
Summary for ML-PC Week 1:
- Transitions from physics-based to data-driven modeling.
- Highlights multi-modal, scarce materials data.
- Covers PSPP relationships, data scales, and uncertainty.
- Adapts CRISP-DM for scientific workflows.