Why AI Hallucinations Are Our Fault Too
January 13, 2026
“I asked ChatGPT to update my code to match best security practices and it cited standards that don’t exist. These things just make stuff up - you can’t trust them for anything serious.”
I’ve heard some version of this complaint dozens of times. And yes, LLMs hallucinate. They fabricate citations, invent statistics, create plausible-sounding nonsense with complete confidence. It’s a real problem. But before you write off AI as fundamentally unreliable, let’s talk about what you did to cause it.
What Hallucinations Actually Are
LLMs are trained to complete patterns, not verify truth. The training process is: here’s a massive corpus of text, learn to predict what comes next. The corpus contains millions of questions followed by confident answers. So the model learns the pattern of confident answers, not a separate skill of “knowing when I know things.”
When you ask an LLM a question, it’s not searching a database of facts and returning results. It’s navigating a probability space of “ways that questions like this get answered in my training data.” The model generates tokens that have high probability of following your prompt, weighted by patterns it learned across billions of examples.
Here’s the critical part: the model doesn’t know the difference between “confident answer to question I know” and “confident answer to question I’m interpolating.” Both look the same from inside the probability distribution. Both follow the learned pattern of question → confident response.
This isn’t a bug in the model. It’s the model working exactly as designed. You trained a system to complete text patterns, and when you ask it questions, it completes the pattern of answering questions. The hallucination problem is baked into the architecture.
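To make "navigating a probability space" concrete, here's a toy sketch in plain Python with invented numbers. The point isn't the math - it's that the model only ever sees scores over candidate tokens, with no separate channel that flags "this one is true":

```python
import math
import random

def softmax(logits):
    # Turn raw scores into a probability distribution over next tokens.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy vocabulary with made-up scores. "QRX-9000" is an invented standard,
# but the model only ever sees numbers - nothing distinguishes a real
# citation from a plausible-sounding fake one.
vocab = ["NIST", "ISO 27001", "CIS Controls", "QRX-9000"]
logits = [2.1, 1.9, 1.5, 1.2]

probs = softmax(logits)
random.seed(42)
next_token = random.choices(vocab, weights=probs)[0]
```

Notice that the fabricated standard still gets a nonzero probability. Sample enough tokens and it will come out, stated with the same confidence as everything else.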
The Real Problem: Workflow Mismatch
But here’s where we come in. We treat LLMs like magic oracles - throw massive, ambiguous questions at them and expect perfect answers. And we do get answers. Just not the ones we want.
Consider what you’re actually asking when you prompt: “Write me a comprehensive security framework.”
What you think you’re asking: “Take your knowledge of security best practices and create something useful for my context.”
What you’re actually asking: “Synthesize decades of industry knowledge across multiple domains, apply it to my specific context (which I haven’t fully specified), invent novel solutions where needed, resolve conflicts between competing standards, and do it all in one shot with no clarification or iteration.”
This is an insane ask. No human could do this. A human security architect would ask you dozens of questions: What’s your industry? What regulations apply? What’s your threat model? What’s your current maturity level? What resources do you have? What’s already in place?
But you didn’t prompt for those questions. You just said “comprehensive security framework” and expected the LLM to fill in all those gaps itself.
Every gap the LLM fills is a potential hallucination waiting to happen. It has to guess:
- Your industry context
- Applicable regulations
- What “comprehensive” means to you
- Which standards are relevant
- What your existing security posture looks like
- What level of detail you need
The task spans too much of the probability space. The model is forced to interpolate across too many dimensions where it has insufficient data about your specific situation. So it does what it learned to do: generates confident-sounding text that follows the pattern of “security framework documents.”
Some of that text will be grounded in real standards it saw during training. Some will be plausible-sounding inventions that match the pattern of security standards without corresponding to real ones. You won’t know which is which until you verify every claim. We set the model up to fail, then blame it for failing.
And those wayward predictions cascade. The first bullet point drifts only slightly from what was asked, the next step builds on that drift, and each small predictive error compounds on the last - until, by the end, the output is wildly astray.
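A rough back-of-envelope shows how fast this cascade bites. Suppose each generated step independently stays "on track" with some probability - the 0.95 here is an illustrative number, not a measured one:

```python
# If each step is on track with probability p, and later steps build on
# earlier ones, the chance the whole chain stays grounded decays
# geometrically with the number of steps.
def chance_fully_grounded(p_per_step: float, n_steps: int) -> float:
    return p_per_step ** n_steps

one_step = chance_fully_grounded(0.95, 1)       # 0.95
twenty_steps = chance_fully_grounded(0.95, 20)  # about 0.36
```

At 95% per step, a twenty-step chain is more likely to end up astray than grounded. That's why catching drift at step one matters far more than polishing step twenty.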
How Agentic Workflows Help Address This
I have a homegrown AI coding orchestrator I call Czarina. She’s not unique—tools like Devin, Cursor, and GitHub Copilot Workspace follow similar patterns: break down tasks, maintain context, verify incrementally. The difference between these tools and “ChatGPT wrote me broken code” isn’t the underlying model—it’s the workflow wrapper that forces verification loops.
Before Czarina touches code, I build the plan: break down goals into phases, identify components, think through dependencies. I’m doing systems architecture—the part requiring business context, existing codebase knowledge, organizational politics. Only then do I hand bounded tasks to Czarina: “Implement the authentication middleware following this specification.” “Create the database migration for these schema changes.” (And, let’s be honest here, it’s not ME, the human writing this stuff, it’s another LLM instance helping me write it)
The hallucination prevention isn’t just task decomposition—it’s the human maintaining context and boundaries while the AI executes within them.
Here’s how this works in practice with “build me a security framework”:
Step 1: “List the major categories typically covered in enterprise security frameworks”
- This stays grounded in training data
- Categories like “access control,” “data protection,” “incident response” are well-established
- Easy to verify against my knowledge
- If something looks weird, I catch it here
Step 2: “For each category, identify 2-3 industry-standard frameworks or guidelines that address it”
- Still within training distribution
- NIST, ISO 27001, CIS Controls are real, documented things
- Claims are specific enough to verify
- We’re building up verified knowledge, not guessing
Step 3: “For NIST access control guidelines, summarize the core requirements”
- Narrow, bounded task
- The model can draw on actual NIST documentation patterns
- Summary is checkable against source material
- Errors are obvious and caught early
Step 4: “Now synthesize these verified requirements into a coherent access control section”
- The synthesis step is constrained by verified inputs
- The model isn’t inventing requirements
- It’s organizing and expressing things we’ve already validated
- Creative work happens on a foundation of verified facts
Step 5: “Czar Review of All Subtasks”
- Every Czarina run has a Czar that is responsible for the full workflow.
- Each worker’s output is reviewed: did it match expectations?
- The Czar checks and validates completed work: “Worker 3, you implemented these methods as stubs - go back and flesh them out.”
- The Czar owns the final review from a full-context point of view, not just each subtask in isolation.
Each step is small enough to stay within the model’s training distribution. Each step is verifiable before we move forward. Errors get caught early and don’t compound. The model isn’t guessing about massive context - it’s executing clear, bounded tasks.
This is what agentic workflows actually mean. Not just “break tasks into steps,” but “break tasks into verifiable steps that keep the model grounded.”
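The loop those five steps describe can be sketched in a few lines. Everything here is hypothetical scaffolding - `ask_llm` and `verify` are stand-ins for whatever model call and review gate you actually use, not a real API:

```python
# A minimal sketch of a verify-as-you-go workflow: each bounded prompt
# sees only previously verified outputs, and nothing unverified ever
# becomes context for a later step.
def run_workflow(steps, ask_llm, verify):
    verified = []  # the growing base of checked facts
    for prompt in steps:
        output = ask_llm(prompt, context=list(verified))
        if not verify(prompt, output):
            # Halt immediately so errors never compound into later steps.
            raise ValueError(f"verification failed at: {prompt!r}")
        verified.append(output)
    return verified

# Toy usage: a fake "model" that just echoes, and a verifier that checks
# the output's shape. Swap in real calls and a real (human) review gate.
steps = [
    "List the major security framework categories",
    "Name 2-3 standards per category",
    "Summarize NIST access control requirements",
    "Synthesize verified requirements into a section",
]
results = run_workflow(
    steps,
    ask_llm=lambda p, context: f"draft for: {p}",
    verify=lambda p, o: o.startswith("draft"),
)
```

The design choice worth noting: verification sits inside the loop, not at the end. A single final review of twenty accumulated steps is exactly the oracle-question failure mode in miniature.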
Agentic workflows aren’t a magic bullet, though - errors can still slip through and compound across steps. The workflow just gives you far more chances to catch them early.
Intelligence Builds on Itself
This isn’t just about preventing errors. It’s about how intelligence actually works.
Humans don’t solve complex problems in single jumps either. We decompose, verify, build incrementally. When you tackle a hard problem, you don’t just generate the final answer. You work through it: gather information, test hypotheses, check your reasoning, iterate.
When we skip those grounding loops - the verification, the decomposition, the iteration - we get confident nonsense. We just call it “being confidently wrong” instead of “hallucination.”
The hallucination problem reveals something fundamental: complex reasoning requires grounding loops. You can’t maintain alignment with reality across too many inferential steps without checking back against ground truth.
LLMs have the same constraint, but it’s more visible because they can’t draw on lived experience the way humans can. When you ask a human architect to design a security framework, they’re not just generating text - they’re drawing on years of implemented systems, failed attempts, organizational politics, regulatory audits. All that tacit knowledge keeps them grounded even when they’re being creative.
The LLM only has training data. When you push it too far from that training distribution without grounding loops, it drifts into invention.
The solution isn’t “better models” (though that helps). The solution is better workflows that match how reasoning actually works.
A Note on Context Windows
Better models help, and yes, longer context windows provide more room for the model to track information. But longer context also means more room to drift from grounding. The fundamental principle remains: complex reasoning needs verification loops regardless of context window size.
Practical Takeaways
If you want to stop seeing hallucinations, change how you work with LLMs:
Stop asking oracle questions. “What’s the best approach to X?” is too big. Break down what you actually need to know. Start with “What are the common approaches to X?” Then “What are the tradeoffs of approach Y?” Then “Given constraints A and B, which approach would you recommend?”
Make assumptions explicit. Instead of letting the model guess about your context, prompt it to state its assumptions: “Before answering, tell me what you’re assuming about my use case.” This forces the model to externalize its guesses where you can correct them.
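One cheap way to operationalize this is a prompt wrapper. This is an illustrative sketch, not a library function - the wording is whatever works for your model:

```python
def with_explicit_assumptions(question: str) -> str:
    # Prepend an instruction that forces the model to surface its guesses
    # about your context before it commits to an answer.
    return (
        "Before answering, list every assumption you are making about my "
        "use case as a numbered list, then answer.\n\n"
        f"Question: {question}"
    )

prompt = with_explicit_assumptions("Which caching strategy should I use?")
```

Now the model's guesses are on the page where you can correct them, instead of silently baked into the answer.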
Verify incrementally. Don’t wait until the end to check if it’s right. Verify each component as you build. Treat the LLM output as draft material that needs validation, not finished product.
Use chain-of-thought prompting. Make the model show its work: “Walk through your reasoning step by step.” This both improves accuracy (the model catches more of its own errors) and makes hallucinations more obvious to you.
Treat LLMs like junior engineers. You wouldn’t hand a junior dev a vague requirement and expect production-ready code. You’d give them clear tickets, defined scope, verification gates. Do the same with your AI tools.
Use agentic rulesets liberally. Tell your agent your preferred Python version, and ensure it uses “docker compose” rather than the legacy “docker-compose”, with rules loaded by each agent at instantiation.
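As a sketch, a minimal ruleset file might look like this - the filename and the specific rules are illustrative, so adapt them to your own stack:

```markdown
<!-- rules.md - hypothetical ruleset loaded by each agent at instantiation -->
- Target Python 3.12; do not write code for older versions.
- Use `docker compose` (the CLI plugin), never the legacy `docker-compose` binary.
- Never invent package names; if a new dependency is needed, ask first.
```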
The Bottom Line
Hallucinations are real. LLMs will confidently state things that aren’t true. This is a legitimate limitation of the technology.
But hallucinations aren’t random. They emerge from the interaction between model capabilities and human workflow choices. When you ask questions that force the model to interpolate too far from its training distribution, you get hallucinations. When you provide grounding through decomposition and verification, you don’t.
The better you understand how to work WITH the model’s strengths and limitations, the less you’ll see hallucinations. You wouldn’t blame a calculator for giving wrong answers if you’re hitting the wrong buttons. Don’t blame the LLM for hallucinating when you’re asking questions it can’t possibly answer from within its training distribution.
The AI isn’t stark raving mad. You’re asking it questions it can’t answer, then being surprised when it tries anyway.
Next time you see a hallucination, ask yourself: Did I give it enough context? Did I break the task down? Did I verify incrementally? Did I make my requirements explicit?
Or did I just throw a massive ambiguous prompt at it and hope for magic?
Because one of those approaches works. And the other one gets you invented security standards and Reddit complaints about how AI can’t be trusted.
This is part of an ongoing series exploring what actually changes when AI can write code at human level. Not hype, not doom - just observations from the trenches.