If human oversight fails, how can we build AI systems that don’t?
Michael A Santoro argues that in high-stakes AI deployments, human overrides are unreliable; true safety comes from upfront design, explicit ethical trade-offs, and governance.

When AI systems are deployed by governments and non-profits, the default intuition is that higher-stakes environments require tighter human control. The more serious the consequences, the stronger the pull toward ‘human-in-the-loop’ intervention. That intuition is understandable, but it is also, in important respects, mistaken.
In routine functions, such as traffic management, permitting, and service delivery, errors are visible, distributed, and often reversible. In crisis environments, by contrast, decisions are compressed in time, stakes are elevated, and the margin for error is narrow.
In both settings, however, the critical management insight is the same: late-stage human intervention is not a reliable safeguard. It is often too slow, too inconsistent, and too dependent on the same biases and informational gaps that the system was meant to address. The more effective approach is not tighter downstream oversight, but stronger upstream design.
Consider a policing scenario in which an AI-supported dispatch system prioritises responses based on predicted risk. Following a series of incidents, it becomes apparent that the system is disproportionately directing police presence into minority neighbourhoods. The immediate reaction is often to intervene at the point of output: adjusting recommendations, second-guessing the system, or requiring human override before action is taken.
Yet this response addresses symptoms rather than causes. If the system is producing biased outputs, the issue lies in the model’s construction: the data it was trained on, the proxies it uses for risk, and the value judgements embedded in its optimisation criteria.
Effective guardrails, in this context, must operate at the level of model design. This includes scrutinising training data for historical bias, explicitly modelling disparate impact, and making normative choices about how to weigh competing objectives such as efficiency, fairness, and harm reduction.
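To make the idea concrete, a design-stage audit might compute a disparate impact ratio before the system ever reaches the street. The sketch below is illustrative only: the data, the district labels, and the 0.8 threshold (borrowed from the 'four-fifths' rule of thumb in US employment law) are assumptions, not a prescription from this article.

```python
# Illustrative sketch: a pre-deployment disparate impact check.
# All data, group labels, and thresholds here are hypothetical.

from collections import defaultdict

def disparate_impact_ratio(records, group_key, flagged_key):
    """Ratio of the lowest group's flag rate to the highest group's.

    records: iterable of dicts, e.g. {"group": "district_a", "flagged": True}.
    A ratio well below 1.0 signals that one group is being
    flagged far more often than another.
    """
    flagged = defaultdict(int)
    total = defaultdict(int)
    for r in records:
        total[r[group_key]] += 1
        flagged[r[group_key]] += int(r[flagged_key])
    rates = {g: flagged[g] / total[g] for g in total}
    return min(rates.values()) / max(rates.values()), rates

# Hypothetical dispatch recommendations from a shadow deployment.
records = (
    [{"group": "district_a", "flagged": True}] * 30
    + [{"group": "district_a", "flagged": False}] * 70
    + [{"group": "district_b", "flagged": True}] * 12
    + [{"group": "district_b", "flagged": False}] * 88
)

ratio, rates = disparate_impact_ratio(records, "group", "flagged")
print(f"per-district flag rates: {rates}")
print(f"disparate impact ratio: {ratio:.2f}")
if ratio < 0.8:  # the 'four-fifths' rule of thumb, an assumption here
    print("Guardrail tripped: revisit training data and risk proxies.")
```

The point of such a check is its timing: it runs before deployment, where the trade-offs belong, rather than as a post hoc override at the point of output.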
The most important factor to keep in mind is the same one that applies whenever governments design policies to be implemented at the community level. These are not purely technical decisions; they are policy choices that require democratic accountability. Attempting to correct for them after the fact, in the midst of live deployment, is operationally more complicated than it might seem: a human override in one part of the system can create unintended inefficiencies for other communities. The right time to make the requisite trade-offs is at the outset of the process.
A second example arises in counter-terrorism or emergency response. Imagine a system tasked with identifying potential threats based on vehicle movement and behavioural signals. A white van, in isolation, is not a meaningful indicator; there are too many such vehicles for the signal to be useful. Human operators on the ground may rely on contextual cues, such as unusual driving patterns, location, timing, or combinations of behaviours, to identify which vehicles warrant attention. The common assumption is that these contextual judgements must remain with human responders.
But this assumption overlooks a critical point: if these cues are sufficiently systematic to guide human decision-making, they can and should be incorporated into the model itself. The goal is not to replace human judgement with a crude proxy, but to formalise and integrate that judgement into a system that can apply it consistently at scale. When these signals are left unmodelled, the system remains blunt, and the burden shifts back to human operators working under pressure, with all the attendant risks of inconsistency and bias.
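As an illustration of what formalising such judgement could look like, the sketch below encodes several weak cues as explicit, auditable features and combines them into a single score. The feature names, weights, and thresholds are all invented for this example; in practice they would be set and validated through the upfront governance process described here.

```python
# Illustrative sketch: encoding contextual cues as explicit,
# auditable model features. Feature names and weights are
# hypothetical and would need validation and governance sign-off.

from dataclasses import dataclass

@dataclass
class VehicleObservation:
    erratic_driving: bool   # unusual driving pattern
    restricted_zone: bool   # near a sensitive location
    off_hours: bool         # present at an atypical time
    repeated_passes: int    # loops past the same point

def attention_score(obs: VehicleObservation) -> float:
    """Combine cues that, individually, are weak signals.

    A white van alone scores nothing; it is the combination
    of behaviours that raises the score, mirroring the
    contextual judgement a human operator would apply.
    """
    score = 0.0
    score += 0.3 if obs.erratic_driving else 0.0
    score += 0.25 if obs.restricted_zone else 0.0
    score += 0.15 if obs.off_hours else 0.0
    score += min(obs.repeated_passes, 4) * 0.1
    return score

# One weak cue stays below a (hypothetical) review threshold.
print(attention_score(VehicleObservation(False, True, False, 0)))  # 0.25
# Several cues together cross it, warranting attention.
print(attention_score(VehicleObservation(True, True, True, 3)))    # 1.0
```

Because the cues are explicit, they can be audited, debated, and revised; left in operators' heads, they cannot.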
In both cases, the temptation is to treat AI systems as tools that require continuous human correction. A more effective framing is to treat them as public institutions whose behaviour must be shaped by human governance in advance. This requires moving from an oversight paradigm to a guardrails paradigm. Oversight assumes that errors will occur and must be caught; guardrails aim to prevent those errors from arising in the first place.
For practitioners in risk management and crisis response, this shift has practical implications. First, governance must begin before deployment, with structured processes for defining objectives, identifying trade-offs, and stress-testing models under realistic conditions. Second, evaluation must include not only accuracy metrics, but also distributional effects, particularly in communities that have historically borne disproportionate risk. Third, accountability mechanisms must be built into design decisions from the start. When a system fails, the question should be: why was the failure not anticipated in the design process?
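As a sketch of the second implication, an evaluation harness might report per-group error rates alongside aggregate accuracy. Everything below (the groups, labels, and predictions) is synthetic, chosen only to show how a respectable headline number can conceal a skewed false-positive rate in one community.

```python
# Illustrative sketch: stress-test evaluation that reports
# distributional effects, not just aggregate accuracy.
# Groups, labels, and predictions are synthetic examples.

def group_report(y_true, y_pred, groups):
    """Per-group accuracy and false-positive rate."""
    stats = {}
    for g in set(groups):
        idx = [i for i, grp in enumerate(groups) if grp == g]
        correct = sum(y_true[i] == y_pred[i] for i in idx)
        negatives = [i for i in idx if y_true[i] == 0]
        false_pos = sum(y_pred[i] == 1 for i in negatives)
        stats[g] = {
            "accuracy": correct / len(idx),
            "false_positive_rate": (
                false_pos / len(negatives) if negatives else float("nan")
            ),
        }
    return stats

# Synthetic run: aggregate accuracy is 0.8, yet group "a" bears
# a 50% false-positive rate while group "b" sees none.
y_true = [0, 0, 0, 0, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 1]
groups = ["a", "a", "a", "a", "a", "b", "b", "b", "b", "b"]

for g, s in sorted(group_report(y_true, y_pred, groups).items()):
    print(g, s)
```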
None of this eliminates the role of human judgement. Managers must remain vigilant in monitoring results so that unintended consequences are caught early. Human expertise is most valuable in setting the parameters within which systems operate: in determining what counts as risk, in deciding how competing values are balanced, and in establishing which constraints are non-negotiable.
The paradox is that the highest-stakes applications of AI are precisely those in which we must rely less on ad hoc human intervention and more on disciplined, transparent system design. Trust in these systems will not be built through the promise of human override, but through demonstrable evidence that the systems and those who design them are worthy of trust.
Michael A Santoro is Professor of Management and Entrepreneurship at Santa Clara University, USA, specialising in business ethics, governance, and AI systems.