How Reformed Theology Challenges AI
By FaithBench Research
ChatGPT placed Calvin in the 20th century. That's not the real problem. AI systematically distorts Reformed soteriology in one direction—toward human agency, away from divine sovereignty.
ChatGPT placed John Calvin among "20th century theologians." Calvin died in 1564.
That's not just a hallucination. It's a category error, and it reveals something deeper about how AI processes Reformed theology.
The Difference Isn't Difficulty
Reformed theology isn't harder than other traditions. It's differently structured.
The Westminster Confession isn't a list of beliefs you can extract in isolation. It's a system where categories depend on other categories. Federal headship requires covenant theology. Limited atonement presupposes total depravity. Each doctrine interlocks with every other.
AI's architecture—pattern matching across flattened training data—collapses what Reformed theology holds apart.
Sola Scriptura Is Structurally Impossible
Here's the first problem no prompt engineering can fix: AI cannot privilege Scripture.
Large language models don't rank sources by authority. What they learn is shaped largely by how often something is said in the training corpus. When Catholic, liberal Protestant, and Reformed sources all discuss justification, the output tends toward a frequency-weighted blend of them. But Reformed epistemology doesn't average. It requires Scripture to judge tradition, not blend with it.
Important: This isn't a bug to fix. It's structural. An AI operating from sola scriptura would need to weight Scripture differently, but that's a theological judgment AI architecturally cannot make.
The Gentle Reformation notes it plainly: "AI does not operate from sola Scriptura but blends sources without discernment."
To operate from sola scriptura, AI would need to do what it fundamentally cannot: treat some sources as authoritative judges of other sources. Corpus frequency doesn't discriminate between Scripture and commentary on Scripture.
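The contrast between blending and judging can be made concrete with a toy model. The sketch below is only an illustration, not a description of how any real language model is trained: the source names, corpus shares, and stance scores are all hypothetical numbers chosen for the example, and `is_scripture` is an assumed flag. The point it shows is that the second function needs that flag, which frequency data alone never supplies.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Source:
    name: str
    share: float      # hypothetical share of corpus text on justification
    position: float   # toy stance score: 0.0 = monergistic, 1.0 = synergistic
    is_scripture: bool = False  # assumed flag; frequency data does not carry it


def frequency_blend(sources: List[Source]) -> float:
    """Average the positions, weighted only by how much text each source adds."""
    total = sum(s.share for s in sources)
    return sum(s.share * s.position for s in sources) / total


def scripture_as_judge(sources: List[Source]) -> float:
    """Let only the sources flagged as Scripture settle the position."""
    judges = [s for s in sources if s.is_scripture]
    if not judges:
        raise ValueError("no source is flagged as the judging authority")
    return frequency_blend(judges)


# All values below are made up for illustration.
SOURCES = [
    Source("Scripture (as the Reformed confessions read it)", 0.05, 0.0, True),
    Source("Confessional Reformed writing", 0.10, 0.0),
    Source("Moderate evangelical writing", 0.55, 0.7),
    Source("Catholic and liberal Protestant writing", 0.30, 0.8),
]

print("frequency blend:   ", round(frequency_blend(SOURCES), 3))
print("scripture as judge:", round(scripture_as_judge(SOURCES), 3))
```

In this toy setup the blended position lands well toward the majority voices, while the judged position ignores volume entirely. Deciding which rows get the flag is exactly the theological judgment the section says a corpus cannot make for itself.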
Federal Headship Collapses
Ask an AI about original sin and you'll get something like: "We're all affected by Adam's sin."
That's true but theologically hollow. Reformed theology doesn't stop at saying we're affected by Adam. It says Adam's guilt was judicially imputed to us by covenant arrangement. There's a legal, federal structure: Adam as covenant head, his guilt reckoned to his descendants.
This is where AI most visibly breaks. "Judicially imputed guilt" becomes "we're all affected." The legal structure vanishes. The covenant center—the entire point—disappears.
Reformed Theological Seminary identified this pattern: "ChatGPT cannot represent the specificity of Reformed soteriology and would likely reflect Arminianism."
The TULIP Pattern
The same distortion appears across all five points of Calvinism:
| Doctrine | Reformed Position | AI Substitution |
|---|---|---|
| Total Depravity | Humans utterly enslaved to sin, unable to contribute to salvation | Humans flawed but capable of moral growth |
| Unconditional Election | God chooses without regard to foreseen faith or merit | God chooses based on foreseen human response |
| Limited Atonement | Christ died specifically for the elect | Christ died for all; atonement available to all |
| Irresistible Grace | God's effectual call cannot ultimately be resisted | Grace enables but doesn't determine human choice |
| Perseverance | God's preservation guarantees believers' final salvation | Salvation conditional on continued human faith |
Notice the pattern. Every substitution moves in the same direction: more human agency, less divine sovereignty. More conditions, fewer unconditionals. More "enables," fewer "determines."
This isn't random error. It's systematic bias.
The Pattern Named
AI doesn't merely get TULIP wrong. It systematically distorts Reformed soteriology toward anthropocentrism, conditionality, and theological pluralism.
Chad Seales at UT Austin identified this: "AI reproduces Protestant theological problems, particularly around predestination."
The root cause is structural. AI training data is saturated with post-Enlightenment theological assumptions—human autonomy as baseline, divine sovereignty as problem to solve. When moderate evangelical content vastly outweighs confessional Reformed content in the training corpus, the model's center shifts.
Early testing suggests AI struggles with covenant-related questions more than other doctrines. The substance/administration distinction, the circumcision-to-baptism continuity, why Abraham's covenant and the new covenant are the same covenant of grace—these systematic connections prove difficult for pattern-matching architectures.
Prompt Dependency
Researcher Anthony Delgado documented something troubling: the same theological question produces different answers depending on prompt wording.
Ask "Is election unconditional?" and you get a Reformed-leaning answer. Ask "Can people choose salvation?" and you get an Arminian-leaning answer. Ask "How do predestination and free will relate?" and you get a synthesized middle that belongs to no tradition.
Same theology. Different phrasing. Different conclusions.
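A check for this kind of prompt dependency could be sketched roughly as follows. Everything here is an assumption made for illustration: `stub_model` returns canned text standing in for a real model call, the keyword cues in `CUES` are a crude hypothetical rubric, and the three phrasings are the ones quoted above. The canned answers are not real model output.

```python
from typing import Callable, Dict, List

# Hypothetical keyword cues; a real study would need a far more careful rubric.
CUES: Dict[str, List[str]] = {
    "reformed": ["unconditional", "sovereign choice", "effectual"],
    "arminian": ["free choice", "foreseen faith", "anyone can choose"],
}


def classify(answer: str) -> str:
    """Label an answer by whichever cue list matches most often, else 'mixed'."""
    text = answer.lower()
    hits = {label: sum(cue in text for cue in cues) for label, cues in CUES.items()}
    best = max(hits.values())
    winners = [label for label, count in hits.items() if count == best]
    if best == 0 or len(winners) > 1:
        return "mixed"
    return winners[0]


def prompt_sensitivity(ask: Callable[[str], str], phrasings: List[str]) -> Dict[str, str]:
    """Ask every phrasing and record the label each answer receives."""
    return {phrasing: classify(ask(phrasing)) for phrasing in phrasings}


def is_prompt_dependent(labels: Dict[str, str]) -> bool:
    """True when different phrasings of the question received different labels."""
    return len(set(labels.values())) > 1


PHRASINGS = [
    "Is election unconditional?",
    "Can people choose salvation?",
    "How do predestination and free will relate?",
]


def stub_model(prompt: str) -> str:
    """Canned stand-in for a model call; these answers are invented examples."""
    canned = {
        PHRASINGS[0]: "Yes. Election rests on God's sovereign choice and is unconditional.",
        PHRASINGS[1]: "Yes. Salvation is offered to all, and each person makes a free choice.",
        PHRASINGS[2]: "God's sovereign choice and human free choice work together.",
    }
    return canned[prompt]


labels = prompt_sensitivity(stub_model, PHRASINGS)
for phrasing, label in labels.items():
    print(f"{label:>8}  {phrasing}")
print("prompt dependent:", is_prompt_dependent(labels))
```

Swapping `stub_model` for a function that calls an actual model would turn this into a small experiment, though keyword matching is far too blunt to stand in for a theologian's reading of the answers.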
AI mirrors expectations rather than analyzing theological substance. Even Reformed scholars asking about their own tradition get softened output.
The Stakes
Seminary students who rely on AI risk never developing confessional fluency. They'll learn to sound Reformed without thinking Reformed.
Warning: Confessional subscription means nothing if you don't understand what you're subscribing to.
This creates a pre-catechesis problem. Congregants arrive with AI-shaped assumptions that look Reformed on the surface but subtly undermine every distinctive. They've absorbed the vocabulary without the framework.
"Effectual calling" becomes "God's invitation." "Federal headship" becomes "Adam represented humanity." The words survive; the meaning drains away.
The Distinctions That Matter
Reformed theology exists because distinctions matter.
Justification ≠ sanctification. Imputed guilt ≠ inherited tendency. Unconditional election ≠ election on foreseen faith. Federal headship ≠ "we're affected." Effectual calling ≠ "God's invitation."
AI collapses every one.
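A first-pass screen for this kind of collapse could be as simple as the sketch below. It is a hypothetical helper, not FaithBench's method: the `SOFTENED` pairs are taken from the examples in this article, and plain substring matching is an assumption that will miss most real paraphrases. It flags a response when the softened wording appears and the confessional term does not.

```python
from typing import Dict, List

# Confessional term -> softened paraphrase, taken from the article's examples.
SOFTENED: Dict[str, str] = {
    "effectual calling": "god's invitation",
    "federal headship": "adam represented humanity",
    "judicially imputed": "we're all affected",
}


def normalize(text: str) -> str:
    """Lowercase and replace curly apostrophes so both spellings match."""
    return text.lower().replace("\u2019", "'")


def drained_terms(response: str) -> List[str]:
    """Return terms whose softened paraphrase appears without the term itself."""
    text = normalize(response)
    return [
        term
        for term, soft in SOFTENED.items()
        if soft in text and term not in text
    ]


# Invented sample responses for illustration.
print(drained_terms("Through God\u2019s invitation, sinners come to faith."))
print(drained_terms("Effectual calling is more than God's invitation."))
```

A response that keeps the term can still empty it of meaning, so a screen like this can only catch the most visible substitutions.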
The Reformation was fought over these distinctions. People died for them. They're not pedantic theological hair-splitting—they're the load-bearing walls of how Reformed Christians understand salvation, human nature, and God's sovereignty.
AI is slowly, quietly erasing them—one helpful paraphrase at a time.
This is what FaithBench aims to measure. The question isn't whether AI can discuss Reformed theology. It's whether it can discuss it as Reformed—with the precision, the systematic connections, and the confessional commitments that define the tradition.
The distinctions that cost lives to establish are being averaged away. And most churches have no idea it's happening.
References
Gloo. (2025, December 15). Gloo unveils the first benchmark exposing how AI misses Christian worldview and values [Press release]. https://gloo.com/press/releases/gloo-unveils-the-first-benchmark-exposing-how-ai-misses-christian-worldview-and-values
Muller, R. A. (2003). Post-Reformation Reformed dogmatics (2nd ed.). Baker Academic.
Tao, Y., et al. (2024). Cultural alignment of large language models. arXiv preprint. https://arxiv.org/abs/2402.13231
Westminster Assembly. (1646). The Westminster Confession of Faith. Available at reformed.org.