Why AI Accountability Is Structurally Broken
- Russell E. Willis

- Feb 23
When an AI system denies your loan, flags you as a flight risk, or filters your résumé before a human ever sees it, the instinct is to ask: Who is responsible for this?
The honest answer is everyone — and therefore no one.
That is not a scandal. It is a structure. And until we understand the structure, we will keep responding to AI failures the wrong way.
The Question That Never Gets Answered
On January 15, 2021, the entire Dutch cabinet resigned after an investigation revealed that a government AI system had wrongly accused more than 26,000 families of childcare benefit fraud. Families lost homes. Marriages collapsed. Children were taken into state care. The financial damage ran into the hundreds of millions of euros.
When investigators tried to answer the basic question — who was responsible? — what they found was a system so thoroughly distributed that accountability had nowhere to land.
The politicians who approved the system did not design it. The civil servants who administered it did not control its logic. The engineers who built it did not determine how it would be deployed. The data scientists who trained it did not choose which historical patterns would carry weight. Each actor made defensible choices within their domain. No single actor saw or controlled the whole.
This is not a uniquely Dutch problem. It is not even a government problem. It is the defining structural challenge of AI deployment at scale — and it is happening right now, across healthcare, finance, employment, criminal justice, and content moderation, in organizations that consider themselves responsible.
The problem is not irresponsible people. The problem is that we keep applying accountability frameworks built for a simpler world to a fundamentally different kind of system.
Highway Thinking in a Web World
Traditional accountability assumes what I call a highway model of causation: linear, clear, and individually controlled. Someone makes a decision. That decision produces an outcome. That person answers for it. The logic is intuitive because it mirrors how we experience most human action. I did this. It caused that. I am responsible.
This model works reasonably well when causation is proximate, visible, and traceable to individual choices. It works for bridges and buildings. It works for product defects and contract disputes. It does not work for AI.
AI systems create what I call intersections within webs — points where engineers, training data, organizational incentives, vendors, regulatory frameworks, historical bias, and deployment contexts all converge simultaneously. Decisions do not emerge from any single node in this web. They emerge from the interaction of all of them. No single actor controls the intersection. And yet consequential decisions — about your mortgage, your bail, your employment, your medical care — flow out of it constantly.
Consider a hiring algorithm. Engineers design the architecture. Data scientists train it on historical hiring data. HR leaders define the requirements. Product managers set the optimization targets. Vendors supply the underlying model. Legal teams review for compliance. Historical hiring patterns shape what the training data treats as "success." Organizational incentives determine what gets measured. Social context defines what "qualified" means.
No one controls the intersection. And yet candidates are accepted or rejected by it every day.
When the system produces discriminatory outcomes — as Amazon's experimental hiring algorithm did when it systematically downgraded women's résumés — the finger-pointing begins. The engineers say they built what was specified. The vendor says the training data came from the client. The HR team says they trusted a validated tool. The executives say they relied on expert recommendations. Every statement is defensible. None of them adds up to accountability.
The Five Ways Accountability Disappears
This pattern is not random. Across every sector where AI operates at scale, the same architecture produces the same accountability vacuum. Five structural elements combine and compound to make responsibility impossible to locate.
First, automation displaces human judgment. AI promises scale, speed, and consistency. But automation does not merely assist human judgment — it progressively displaces it. Humans remain nominally "in the loop," but the loop is structured so that independent judgment becomes both harder and riskier. When the AI is usually right, questioning it feels like incompetence. When it is wrong and you followed it, accountability is murky. When you override it and are wrong, accountability is crystal clear. Over time, rational actors learn to trust the system — not because they believe it is infallible, but because the incentive structure makes trust the path of least resistance.
Second, opacity prevents meaningful contestation. Even when AI systems are technically "explainable," they often remain practically opaque. Explanations describe what happened, not whether it was appropriate. Feature importance scores and confidence intervals tell engineers something useful. They tell the person who was denied a mortgage nothing actionable. When Dr. Chen, the emergency physician in one of my case studies, tried to understand why her hospital's diagnostic AI had nearly missed a cardiac event, the vendor provided a technically comprehensive response that was clinically irrelevant. She could recount the system's output. She could not explain why it failed or how to prevent similar failures.
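To make that gap concrete, here is a minimal sketch of what such an "explanation" often looks like in practice. Every feature name, weight, and score below is invented for illustration; the point is what this kind of output does and does not answer.

```python
# A hypothetical explanation payload of the kind a lending model might emit.
# Every feature name, weight, and score here is invented for illustration.
explanation = {
    "decision": "DENY",
    "risk_score": 0.81,                  # model output; assume approval requires < 0.65
    "confidence_interval": (0.74, 0.88),
    "top_features": [                    # attribution-style weights, largest first
        ("zip_code_cluster_17", +0.22),
        ("debt_to_income_ratio", +0.18),
        ("network_credit_behavior", +0.11),
        ("account_age_months", -0.05),
    ],
}

# An engineer can work with this: which inputs pushed the score up,
# how wide the uncertainty band is, how close the call was.
for feature, weight in explanation["top_features"]:
    print(f"{feature:>26}: {weight:+.2f}")

# The applicant's questions are different, and nothing above answers them:
# Was using my zip code appropriate? What is "network_credit_behavior,"
# and is it even about me? What, concretely, could I change before reapplying?
```

The output is technically complete and practically mute: it describes what the model did without establishing whether any of it was appropriate.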
Third, the collection of personal data creates profound power asymmetries. AI systems rely on massive, often ambient data collection. You cannot meaningfully consent to something you cannot see or avoid. You cannot opt out of infrastructure. The systems know vastly more about you than you will ever know about them. When a lending algorithm incorporates your zip code, your spending patterns, your social network's credit behavior, and behavioral signals purchased from data brokers, and then classifies you as "elevated risk" — you have no meaningful way to know what drove that assessment or how to contest it.
Fourth, systemic risk multiplies silently. AI systems rarely operate in isolation. Similar models trained on similar data fail in the same ways at the same time. When Dr. Chen's hospital used a diagnostic AI built on data primarily from major urban academic medical centers, so did three thousand other hospitals using the same vendor's architecture. The bias that nearly killed her patient was not an isolated error — it was a correlated vulnerability distributed across an entire health system. Systemic risk of this kind is invisible to any single institution looking only at its own outcomes.
Fifth, authority is delegated without accountability. Organizations deploy AI to delegate decisions. But delegation rarely comes with a clear allocation of answerability. When the algorithm fails, the loan officer who trusted it, the product manager who deployed it, the engineer who designed it, and the executive who approved it can each point plausibly to someone else. Authority is concentrated in the system. Accountability evaporates among the humans.
These five elements do not merely coexist. They amplify one another. Automation increases opacity. Opacity makes delegation more dangerous. Delegation relies on data collection. Data collection fuels further automation. And as more organizations adopt similar systems, individual risks become systemic risks. The compounding effect is why isolated fixes do not work.
Why "Just Fix the Algorithm" Keeps Failing
The instinct, when an AI system produces harm, is to demand better design, tighter oversight, more comprehensive auditing. These responses are not wrong — but they are insufficient, because they treat a structural problem as a technical one.
You cannot solve opacity without confronting the complexity that produces it. You cannot solve delegation without rethinking the data practices beneath it. You cannot manage systemic risk from within a single organization. These problems interlock in ways that make piecemeal solutions ineffective.
There is also a deeper issue. When we frame AI accountability primarily as a technical challenge, we inadvertently protect the responsibility frameworks that are failing. We improve the tools without questioning the governance architecture that underpins them. We optimize the algorithm without asking who is responsible for what it is optimizing.
The accountability crisis is not fundamentally about bad code. It is about the mismatch between the complexity of AI systems and the simplicity of the frameworks we use to govern them. Traditional responsibility frameworks were built for a world of clear causation, individual control, and predictable outcomes. AI systems operate in a world of distributed agency, emergent behavior, and outcomes that no single actor designed or controls.
The Dutch case is instructive here. The investigation identified technical flaws in the algorithm. But the deeper failure was structural: no one had built governance capable of catching systematic harm at scale. The system had no meaningful mechanism for affected families to contest decisions they could not understand. It had no ongoing monitoring designed to detect disparate impact before it became catastrophic. It had no accountability structure that matched the actual complexity of the deployment.
From Accountability to Answerability
The shift I am advocating begins with a conceptual distinction that has significant practical consequences.
Accountability asks: Who caused this? Who should bear the consequences?
Answerability asks: How do we respond well to what we are participating in? What are we building, and what is it making us become?
Accountability looks backward. Answerability looks forward. Accountability assigns blame. Answerability shapes action. Accountability seeks control. Answerability seeks participation — genuine, ongoing engagement with the systems we are building and their effects on the people they touch.
This is not a retreat from responsibility. It is a more demanding form. Answerability does not let anyone off the hook. It distributes obligation differently — not by identifying the person who caused the harm, but by asking every participant in the system to take seriously their specific capacity to see, notice, and respond.
For the engineer, that means asking not just "does this model perform well?" but "what happens when this model fails, and who bears that cost?"
For the executive, it means asking not just "is this tool validated?" but "have we built governance structures that can actually catch problems when they emerge?"
For the policymaker, it means asking not just "does this system comply with existing law?" but "does existing law provide meaningful protection for the people this system affects?"
Answerability requires building organizations that can hear those questions — and structures that create obligation to respond to what they reveal.
What Changes When We Get This Right
Organizations that take answerability seriously begin to look different in specific ways. They invest in transparency that serves understanding, not just documentation. They build participatory governance structures that give affected communities genuine authority, not just advisory input. They treat monitoring and adaptation as core ongoing work, not a post-launch afterthought. They create accountability structures that match the actual complexity of their systems — distinguishing between domain accountability, integration accountability, and systemic accountability in ways that prevent responsibility from simply dissolving into distributed irrelevance.
And critically, they cultivate what I call responsible imagination — the discipline of asking not just "does this work?" but "should this exist? What world is this system helping to create? What are we becoming through building it?"
None of this is utopian. It is harder than compliance, slower than pure optimization, and more expensive than pretending the accountability vacuum does not exist. But the alternative — AI systems making consequential decisions about millions of people while responsibility evaporates into distributed complexity — is not just ethically intolerable. It is practically unsustainable. The failures will compound. The harms will accumulate. The regulatory and reputational consequences will eventually arrive.
The accountability crisis is real. It is structural. And it is addressable — not by finding better people to blame, but by building better systems of responsibility.
That work is already overdue.

Russell E. Willis, Ph.D., is an AI implementation consultant, strategic planning adviser, and author of AI and the Crisis of Control: How Leaders Can Reclaim Responsibility in the Age of AI (forthcoming from Archway Publications), which introduces the ASSUME Model and Five Pillars of responsible AI stewardship. He has spent fifty years at the intersection of technology and responsibility — as an engineer, academic, and entrepreneur. He works with executives and policymakers through Got Vision Consulting.



