The question that has animated most public discourse about artificial intelligence for the past decade is: what can AI do? It is a question about capability — about the range of tasks that can be automated, the complexity of problems that can be solved, the speed at which decisions can be made. It is a question with impressive and accelerating answers.
But it is not, I would argue, the most important question. The question that matters more — and that receives far less attention — is a different one: what can human beings and AI systems accomplish together that neither can accomplish alone?
That is a question about collaboration: about the design of systems where human intelligence and artificial intelligence are genuinely complementary, where each does what it does best in a structure that allows their respective strengths to amplify rather than cancel each other. It is a harder question to answer than the capability question, because it requires understanding not just what AI can do but what humans do that AI cannot, and what conditions allow the two to combine productively.
"The question was never whether AI would be capable. It is whether we can build systems where human and artificial intelligence amplify each other's strengths — where the result is more human, not less."
Artificial intelligence, at the level of current capability, is extraordinarily good at a specific set of things. It processes information at scales and speeds that human cognition cannot match. It identifies patterns in large datasets that would be invisible to human analysts. It maintains consistent performance without fatigue. It applies rules reliably across millions of instances without the variance that characterises human judgment in high-volume, low-stakes decisions. It can generate plausible language, images, and code in response to natural language prompts with a facility that continues to surprise even those who build these systems.
What AI does not do — at least not in any current or near-horizon system — is understand context in the way that humans understand context. It does not navigate the ethical complexity of novel situations with the judgment that comes from having lived in and cared about a moral community. It does not build trust relationships over time through demonstrated reliability and genuine understanding of another person's needs. It does not exercise the practical wisdom — what Aristotle called phronesis — that allows a person to recognise which principles apply in a particular situation and how to apply them given the full texture of that situation.1
The obvious implication of this analysis is that the most productive AI deployments are not those that maximise AI autonomy but those that design for genuine complementarity — allocating to AI the tasks where its strengths dominate and to humans the tasks where human judgment, relationship, and accountability are irreducible. But this obvious implication turns out to be surprisingly difficult to implement.
The difficulty is partly technical. Building systems where human and AI capabilities are genuinely complementary requires understanding what each does well at a level of granularity that most AI deployment decisions do not achieve. Organisations tend to ask: "Can AI do this task?" rather than "What is the structure of this task, which elements benefit from AI capabilities, which require human judgment, and what interface would allow the two to work together optimally?"
But the difficulty is also organisational and cultural. Automation is a clean story. AI replaces a human function; costs fall; throughput rises. Collaboration is a messier story. Roles change. Skill requirements shift. Accountability becomes harder to assign. The workflow that produces the best outcomes may not be the workflow that is easiest to manage or to explain to a board. The evidence that collaboration produces better outcomes than either human-only or AI-only approaches is, in many domains, strong — but converting that evidence into operational practice is slow and organisationally demanding.2
The research on human-AI complementarity in decision-making is now substantial and largely consistent. Kahneman, Sibony, and Sunstein, in their work on noise in human judgment, identified conditions under which algorithmic assistance consistently reduces the variance in human decisions without eliminating the human judgment that drives accuracy in complex cases.3 Hemmer et al. (2025) found that teams combining human and AI judgment outperformed both humans alone and AI alone across a range of decision tasks — with the performance advantage largest in the conditions where neither was sufficient alone: high complexity combined with the need for contextual sensitivity.4
In the context of learning specifically — which is where much of my own research is concentrated — the evidence for complementarity is compelling. AI tutoring systems that adapt pacing, identify gaps, and provide consistent reinforcement have measurable positive effects on learning outcomes. But they produce the strongest effects not in isolation but in combination with human teachers who provide the relationship, motivation, and contextual judgment that AI systems cannot. The question of whether AI should replace teachers is, on this evidence, the wrong question. The right question is how AI and teachers can collaborate in ways that make teaching more effective and learning more accessible.5
If collaborative intelligence is the destination, the design challenge is considerable. It requires interface design that makes AI reasoning legible to human collaborators, so that human judgment can be applied meaningfully to AI outputs rather than simply overriding or deferring to them. It requires workflow design that identifies the right points for human review, escalation, and override. It requires training: not only of the humans who work with AI systems, but also of the organisations that deploy them and of the cultures that determine how human-AI interactions are valued and rewarded.
And it requires governance frameworks that treat human oversight not as a constraint on AI capability but as the condition under which AI performs best. My research on the HOF-AIDE framework found that enterprises which embedded human oversight architecturally — not as an audit mechanism but as a structural design feature — produced better outcomes than those that treated oversight as an optional layer.6 The governance architecture was not friction. It was the infrastructure of trust within which AI could operate effectively.
"The most important design challenge of the AI era is not building smarter machines. It is building systems where machines and people are smarter together than either is alone."
Collaborative intelligence asks something of humans as well as of AI systems. It asks us to be honest about what we do better than AI and what AI does better than us — without the defensiveness that comes from feeling threatened by the comparison, and without the hubris that comes from underestimating AI's genuine capabilities.
It asks us to design for the outcomes we actually want, not the outcomes that are easiest to automate. It also asks us to build the institutional and organisational structures that make human-AI collaboration work over time, rather than optimising for short-term efficiency gains that erode the human capabilities we will need later.
And it asks us to hold onto something that I think is at risk in the current moment of AI enthusiasm: the conviction that what makes outcomes good is not the intelligence of the system producing them but the human values, judgments, and relationships that determine what good means in the first place.
The future belongs to collaborative intelligence. But only if we design it that way.