Time commitment: 2-3 hours total
Learning Objectives
By the end of Phase 4, you should be able to:
- Understand what "hallucination" means and why it's not a bug
- Explain why you can't "make models fully explainable"
- Translate technical AI concepts to non-technical colleagues
- Feel confident in conversations about AI limitations
This is the capstone phase. Everything comes together here. Your ability to communicate about AI limits is now your most valuable skill.
Resource 9: How Transformers Work (Visual)
Title: The Illustrated Transformer
Link: https://jalammar.github.io/illustrated-transformer/
Time: 45-60 minutes
Note: This gets abstract. It's OK to find a peer to discuss it with. A visual understanding is enough; you don't need to internalize every detail.
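To make the post's central picture concrete, here is a minimal sketch of scaled dot-product attention in plain Python, using toy 2-d vectors instead of learned embeddings. The vectors and dimensions are illustrative assumptions, not anything from the article; the point is only the mechanic: each token's output is a weighted average of all value vectors, with weights from query-key similarity.

```python
import math

def softmax(xs):
    """Normalize a list of scores into probabilities that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values):
    """Scaled dot-product attention for a single head.
    Each output is a weighted mix of the value vectors, weighted by
    how well that position's query matches each key."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)  # the "attention" over input positions
        out = [sum(w * v[i] for w, v in zip(weights, values))
               for i in range(len(values[0]))]
        outputs.append(out)
    return outputs

# Three toy 2-d token vectors, used as queries, keys, and values alike.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
result = self_attention(tokens, tokens, tokens)
```

In a real transformer, queries, keys, and values come from separate learned projections of the token embeddings; this sketch reuses the same vectors for all three just to keep the arithmetic visible.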
Resource 10: Why LLMs Hallucinate
Title: Extrinsic Hallucinations in LLMs
Link: https://lilianweng.github.io/posts/2024-07-07-hallucination/
Time: 40-50 minutes
Core insight: Hallucinations are not bugs. They're the default. The model is trained to generate plausible text, not to fact-check.
Career changer insight: This is where you become the adult in the room. When someone says "let's fix hallucinations with better training," you explain why that's not how it works.
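The "plausible text, not fact-checking" point can be shown with a toy bigram model (an illustration I'm adding, not from the linked post). Trained on a tiny corpus that contains one fluent-but-false sentence, it reproduces the false pattern exactly as confidently as the true ones, because all it ever learns is which word tends to follow which.

```python
import random
from collections import defaultdict

# A toy training corpus: every sentence is fluent, but one is false.
corpus = ("paris is the capital of france . "
          "sydney is the capital of australia . "  # false, but fluent
          "rome is the capital of italy .").split()

# "Training": count which word follows which (a bigram language model).
follows = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev].append(nxt)

def generate(start, n=5, seed=0):
    """Pick each next word by learned pattern, never by fact-checking."""
    rng = random.Random(seed)
    words = [start]
    for _ in range(n):
        choices = follows.get(words[-1])
        if not choices:
            break
        words.append(rng.choice(choices))
    return " ".join(words)

text = generate("sydney")
```

The model happily completes "sydney is the capital of ..." because that pattern appeared in training. Nothing in the training objective distinguishes the false sentence from the true ones, which is the core of why "better training" alone can't eliminate hallucination.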
Resource 11: Interpretability Tools
Title: Explainability and Interpretability in Modern LLMs
Link: https://www.rohan-paul.com/p/explainability-and-interpretability
Time: 30 minutes
Key concept: We can't fully explain AI decisions. But we have tools (attention visualization, saliency maps) that show us parts of the reasoning.
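One simple interpretability technique in this family is occlusion (leave-one-out) saliency: remove each input token in turn and measure how much the model's score changes. The sketch below uses a stand-in scoring function (an assumption for illustration, not a real model) so the mechanic stays visible.

```python
def occlusion_saliency(tokens, score_fn):
    """Leave-one-out saliency: how much does the score drop when each
    token is removed? A bigger drop means a more influential token."""
    base = score_fn(tokens)
    return {tok: base - score_fn([t for t in tokens if t != tok])
            for tok in tokens}

# Stand-in scorer (a hypothetical toy, not a real model): counts
# sentiment-bearing words in the input.
POSITIVE = {"great", "love"}
def toy_score(tokens):
    return sum(1.0 for t in tokens if t in POSITIVE)

saliency = occlusion_saliency("i love this great phone".split(), toy_score)
```

With a real model you would re-run inference per occluded input, which is expensive but model-agnostic; that trade-off is exactly why such tools show us *parts* of a decision rather than a full explanation.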
Phase 4 Glossary Callout
| Term | What it means |
|------|---------------|
| Hallucination | Model confidently says false things |
| Confidence | Model's certainty about an answer (measured 0-1) |
| Attention | Which parts of the input the model focused on |
| Interpretability | Understanding how the model works (very hard) |
| Explainability | Tools to understand a specific decision (hard, but possible) |
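The glossary's "confidence" entry (a 0-1 number) comes from the softmax function, which turns a model's raw scores (logits) into probabilities. A minimal sketch with made-up logits shows why high confidence is not the same as correctness: the number measures the model's preference among candidates, not truth.

```python
import math

def softmax(logits):
    """Turn raw model scores into probabilities in [0, 1] that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

# Toy logits for three candidate next tokens. The model reports high
# "confidence" here regardless of whether the top token is factually true.
probs = softmax([5.0, 1.0, 0.5])
confidence = max(probs)
```

This is why Phase 4 talks about confidence *mismatches*: a hallucinated answer can carry a confidence just as high as a correct one.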
Phase 4 Try This: The Most Important Exercise
Communication Challenge:
Your CEO reads in the news that an LLM hallucinated. Your CEO asks: "Are our models safe? Can we prevent this?"
Write a 2-minute response (3-4 paragraphs) that:
- Explains what hallucination is (without jargon)
- Explains why it happens (without math)
- Explains what you do about it (mitigation, monitoring, guardrails)
This is the skill that makes you valuable. If you can do this, you've internalized Phase 4.
Phase 4 Capstone: Explaining AI to Others
You've learned how transformers work, why they fail, and how to detect failures. Now the most important skill: communicating this to people without technical backgrounds.
How to Explain Hallucinations
Simple version: "LLMs make up facts sometimes. They predict what words come next; they don't verify facts. So they confidently generate false information."
Why it happens: "The model was trained on internet text (which has false information). It learned patterns. It can't distinguish between patterns it learned from real facts vs. patterns from false claims."
What we do about it: "We monitor for confidence mismatches. We use fact-checking tools. We tell customers: assume the model is wrong until verified."
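The mitigation story above (monitor confidence, verify, escalate) can be sketched as a simple guardrail. Everything here is a hypothetical illustration: the function name, the 0.8 threshold, and the `verified` flag (standing in for an external fact-check) are my assumptions, not a specific product's API.

```python
def guardrail(answer, confidence, verified, threshold=0.8):
    """Release an answer only when confidence is high AND an external
    check passed; otherwise route it to human review.

    confidence: the model's 0-1 score for its answer (assumed given).
    verified:   result of an external fact-check (assumed given).
    """
    if verified and confidence >= threshold:
        return ("send", answer)
    return ("review", answer)

# A confident but unverified answer still gets escalated -- confidence
# alone is never trusted, which is the point of the mitigation above.
decision = guardrail("Sydney is the capital of Australia.", 0.97, verified=False)
```

The design choice worth explaining to a non-technical audience: the guardrail never tries to make the model stop hallucinating; it assumes hallucinations will happen and controls what reaches the customer.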
How to Explain "Black Box"
Simple version: "We don't fully understand why the model makes each decision. But we can observe what it paid attention to."
Analogy: "Like asking someone why they like a painting. They can point to colors they enjoyed, but explaining aesthetic judgment completely is impossible. We can observe, but not fully explain."