KEY LEARNINGS
- The 'black box' problem refers to AI systems where the internal decision-making process is invisible or too complex for humans to understand.
- Interpretability means the model is transparent by design (like a decision tree whose learned rules can be read directly), while explainability involves applying post-hoc tools to approximate why a complex model made a decision; see the first sketch after this list.
- Opacity in AI creates significant governance risks, including the inability to detect bias, fix errors, or comply with regulations like GDPR.
- High-stakes decisions affecting human rights or safety often require inherently interpretable models rather than complex 'black boxes.'
- Governance requires a tiered approach to transparency, providing different levels of detail for users, auditors, and regulators; see the second sketch after this list.
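
The interpretability/explainability contrast above can be made concrete. The following is a minimal sketch, assuming scikit-learn (the section names no specific tooling): a shallow decision tree whose learned rules can be printed verbatim, versus a random-forest 'black box' explained after the fact with model-agnostic permutation importance.

```python
# Sketch: interpretable-by-design vs. post-hoc explainability (assumes scikit-learn).
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

X, y = make_classification(n_samples=500, n_features=5, random_state=0)
feature_names = [f"f{i}" for i in range(X.shape[1])]

# Interpretable by design: the fitted tree IS its own explanation.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(export_text(tree, feature_names=feature_names))

# Black box + post-hoc explainability: a model-agnostic tool approximates
# which inputs drove predictions, without exposing the model's internals.
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
result = permutation_importance(forest, X, y, n_repeats=10, random_state=0)
for name, score in zip(feature_names, result.importances_mean):
    print(f"{name}: importance {score:.3f}")
```

Note the asymmetry: the tree's printed rules are the model itself, while permutation importance only approximates the forest's behaviour from the outside. That gap is the core of the distinction.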
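The tiered-transparency point can likewise be sketched as a simple access structure. Everything below is illustrative: the audiences, field names, and tier contents are assumptions, not taken from any regulation or from the resources listed after this section.

```python
# Hypothetical sketch of tiered transparency: one decision record,
# disclosed at different levels of detail per audience.
from enum import Enum


class Audience(Enum):
    USER = "user"            # person affected by the decision
    AUDITOR = "auditor"      # internal or third-party reviewer
    REGULATOR = "regulator"  # supervisory authority


# Which fields of a decision record each audience may see (illustrative).
DISCLOSURE_TIERS = {
    Audience.USER: {"outcome", "plain_language_reason", "appeal_route"},
    Audience.AUDITOR: {"outcome", "plain_language_reason", "appeal_route",
                       "feature_attributions", "model_version"},
    Audience.REGULATOR: {"outcome", "plain_language_reason", "appeal_route",
                         "feature_attributions", "model_version",
                         "training_data_summary", "bias_test_results"},
}


def disclose(record: dict, audience: Audience) -> dict:
    """Return only the fields the given audience is entitled to see."""
    allowed = DISCLOSURE_TIERS[audience]
    return {k: v for k, v in record.items() if k in allowed}


record = {
    "outcome": "declined",
    "plain_language_reason": "income below threshold",
    "appeal_route": "contact support within 30 days",
    "feature_attributions": {"income": -0.6, "tenure": 0.1},
    "model_version": "2024-07",
    "training_data_summary": "internal dataset v3",
    "bias_test_results": "see audit Q2",
}
print(disclose(record, Audience.USER))  # only the user-facing fields
```

In practice the tier contents would be fixed by legal and contractual requirements rather than hard-coded sets, but the shape is the same: one decision record, several audience-specific views.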
- 📰 Rudin: Stop Explaining Black Box Models. Influential paper arguing for interpretable models in high-stakes decisions.
- 🌐 Google PAIR: Explainability Guide. Practical guidance on implementing explainable AI.
- 📄 ICO: Explaining Decisions Made with AI. UK regulator guidance on AI transparency requirements.
- Rudin, C. (2019). Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence, 1, 206–215.
- Molnar, C. (2022). Interpretable Machine Learning: A Guide for Making Black Box Models Explainable (2nd ed.).
- Selbst, A. D., & Barocas, S. (2018). The intuitive appeal of explainable machines. Fordham Law Review, 87, 1085.