Why Orthodox Distinctives Trip Up LLMs
AI generates icons with gibberish text and missing fingers. It collapses theosis into New Age pantheism. The failures aren't random—they reveal a structural bias toward Western theological categories.
Research insights, benchmark analysis, and theological AI news.
AI generates icons with gibberish text and missing fingers. It collapses theosis into New Age pantheism. The failures aren't random—they reveal a structural bias toward Western theological categories.
A Catholic AI trained on 23,000 Church documents still can't tell settled doctrine from open questions. Protestant bias isn't accidental. It's architectural.
ChatGPT placed Calvin in the 20th century. That's not the real problem. AI systematically distorts Reformed soteriology in one direction—toward human agency, away from divine sovereignty.
AI models score 48/100 on faith. Not because they're wrong—because they've learned to say nothing specific. Here's how "God" becomes "higher power" and why it matters.
AI scores lowest on faith—48/100—while handling finances at 81%. Not because theology is hard. Because AI is teaching a different religion.
Medical AI hits 96% accuracy. Legal benchmarks have "objectively correct" answers. Theological benchmarks? 48/100 on faith. Here's why.
Coming soon: More research posts launching with the full benchmark release in Q2 2026.
AI models default to vague spiritual language when they should be making tradition-specific claims.
Breaking down why frontier models score poorly on FaithBench and what it reveals about their training.
Covenant theology, confessional nuance, and why Reformed questions expose model limitations.
Get notified when we publish new research and benchmark updates.
Email subscribe@faithbench.org to join our mailing list.