Nobody Has Figured Out How to Show Users What the AI Is Doing

Three months ago, a product manager at a mid-size company used their company's new AI assistant to clean up their calendar. They'd asked it to "remove the duplicates and decline anything that conflicts with my focus time." Seemed simple.
The AI executed 47 calendar actions in eleven seconds. Declined invites from the CEO. Removed a standing one-on-one with a direct report that had been rescheduled three weeks prior. Sent automated decline messages to clients the PM hadn't actually wanted to decline.
The product worked exactly as designed. There was just no design for what "exactly as designed" would look like when the AI was doing the acting.
The Verb Shift That Changes Everything
The history of UX is mostly a history of designing around human-triggered actions. A user clicks a button. The system responds. The user sees the response. The user clicks another button.
Everything in the conventional UX toolkit — affordances, feedback, confirmation dialogs, progress indicators, undo — was built for this model. The human is the agent. The system is the tool. The user has full visibility into what they're asking for, because they're the one asking.
Agentic AI inverts this. The user states an intent. The AI interprets that intent, decides what actions to take, and executes them. The user's role in the loop shrinks from "actor" to "supervisor." The UX challenge — which nobody has really solved yet — is how to design a supervision interface for something you can't fully predict.
The existing patterns don't transfer cleanly. A progress indicator tells you that a thing you started is happening. It doesn't tell you that a thing the AI decided to do is about to happen or has already happened. A confirmation dialog assumes you know what you're confirming. AI action plans can involve dozens of interdependent steps that humans have no context to evaluate individually.
Three Properties That Make This Different
AI-triggered actions have three properties that distinguish them from human-triggered ones, and those properties drive the design challenge.
Opacity. When you click a button, you know what you clicked. When an AI takes an action, the action emerged from a chain of reasoning that wasn't visible to you. The action may be unexpected even if it's technically correct. Conventional UX doesn't have a pattern for "the system did something defensible but surprising."
Irreversibility asymmetry. Human actions trend toward reversibility because humans trigger them incrementally and can observe the effects. AI agents batch actions and often move through irreversible steps (sends, deletes, publishes) as confidently as reversible ones (drafts, stages, queues). There's no friction signal when an action is about to cross a threshold you can't undo.
Distance from intent. When you delete a file, you know you deleted it. When an AI deletes a file on your behalf, the connection between your intent ("clean up my desktop") and the specific action ("deleted project-brief-final-v3-REALLYFINAL.docx") is mediated by the AI's interpretation. Users often don't understand what the AI understood, which makes reviewing or appealing its decisions genuinely hard.
What's Actually Been Tried
A few patterns have emerged from products experimenting in this space. None of them are settled solutions, but they're instructive.
Pre-flight plans. Show the user a plan before execution. "Here's what I'm going to do: 1. Archive emails older than 30 days from the Promotions tab, 2. Create a folder for..." This seems intuitive, but in practice detailed plans are hard for non-technical users to evaluate, and the review step adds friction that undermines the value of the agent. Users tend to approve plans they don't understand and then complain about outcomes.
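Structurally, the pattern is simple; the judgment it demands from the user is not. A minimal sketch, with hypothetical names (`Plan`, `runWithPreflight`) rather than any shipping API:

```typescript
// Hypothetical sketch of a pre-flight plan as renderable data.
// Plan, PlannedAction, and runWithPreflight are illustrative names.

type PlannedAction = {
  description: string; // human-readable, e.g. "Archive 214 emails older than 30 days"
  reversible: boolean; // can this step be undone after it runs?
};

type Plan = {
  goal: string; // the user's stated intent, echoed back
  steps: PlannedAction[];
};

// Nothing executes until the user approves the whole plan at once --
// and all-or-nothing approval is exactly the weakness described above.
async function runWithPreflight(
  plan: Plan,
  approve: (plan: Plan) => Promise<boolean>,
  execute: (step: PlannedAction) => Promise<void>,
): Promise<void> {
  if (!(await approve(plan))) return; // user rejected: nothing runs
  for (const step of plan.steps) {
    await execute(step);
  }
}
```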
Post-hoc activity logs. Show users what happened after it happened. Better than nothing, and useful for auditing, but it addresses accountability rather than control. By the time you're reading the log, the irreversible actions have been taken.
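A minimal sketch makes the limitation visible (the `LogEntry` shape here is an assumption, not any standard):

```typescript
// Sketch of a post-hoc activity log: an append-only record written
// at execution time. Useful for audits, useless for stopping anything.

type LogEntry = {
  timestamp: Date;
  action: string;             // what the agent did, in plain words
  reversible: boolean;        // whether reversal is still possible
  undo?: () => Promise<void>; // present only while it is
};

const activityLog: LogEntry[] = [];

function record(entry: Omit<LogEntry, "timestamp">): void {
  // By the time anyone reads this, the action has already happened.
  activityLog.push({ timestamp: new Date(), ...entry });
}
```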
Confidence-graduated confirmation. Only interrupt the user for actions above a certain risk threshold. Low-stakes actions (read-only, easily reversible) execute automatically. High-stakes actions (delete, send, publish, purchase) require explicit approval. This is theoretically clean but requires the system to accurately classify action risk, which is harder than it sounds when context matters a lot.
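A sketch of the gating logic shows where the difficulty actually lives. The threshold check is trivial; the `classifyRisk` stub below, which naively keys off the verb alone, is the part real systems struggle with:

```typescript
// Sketch of confidence-graduated confirmation with an assumed
// three-tier risk model. classifyRisk is deliberately naive.

type Risk = "low" | "medium" | "high";

type Action = { verb: string; target: string };

// Assumption: risk derives from the verb alone. Real systems need
// context -- declining an internal invite and declining a client
// invite are the same verb with very different stakes.
function classifyRisk(action: Action): Risk {
  const irreversibleVerbs = ["delete", "send", "publish", "purchase"];
  if (irreversibleVerbs.includes(action.verb)) return "high";
  if (action.verb === "modify") return "medium";
  return "low"; // read-only or easily reversible
}

async function gate(
  action: Action,
  confirm: (a: Action) => Promise<boolean>,
  execute: (a: Action) => Promise<void>,
): Promise<void> {
  if (classifyRisk(action) === "high") {
    if (!(await confirm(action))) return; // interrupt only above threshold
  }
  await execute(action); // lower stakes run without interruption
}
```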
Undo queuing. Hold all actions in a reversible queue for a configurable period (30 seconds, 5 minutes) before executing. This is the most user-friendly approach and works well for small action sets. It doesn't scale to complex multi-step agents where later steps depend on earlier ones.
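A minimal sketch, assuming independent actions and a hypothetical `UndoQueue`:

```typescript
// Sketch of undo queuing: actions wait in a cancelable queue for a
// grace period before executing.

type Queued = { run: () => void; timer: ReturnType<typeof setTimeout> };

class UndoQueue {
  private pending = new Map<number, Queued>();
  private nextId = 0;

  constructor(private graceMs: number) {}

  // Schedule an action; it executes after graceMs unless canceled.
  enqueue(run: () => void): number {
    const id = this.nextId++;
    const timer = setTimeout(() => {
      this.pending.delete(id);
      run(); // grace period elapsed: the action becomes irreversible
    }, this.graceMs);
    this.pending.set(id, { run, timer });
    return id;
  }

  // "Undo" while still queued is just cancellation.
  cancel(id: number): boolean {
    const queued = this.pending.get(id);
    if (!queued) return false; // too late: already executed
    clearTimeout(queued.timer);
    this.pending.delete(id);
    return true;
  }
}

// Usage: a 30-second window before a send actually happens.
const queue = new UndoQueue(30_000);
const sendId = queue.enqueue(() => console.log("email sent"));
queue.cancel(sendId); // user changed their mind in time
```

The independence assumption is the whole problem: cancel step 3 after step 12 has queued against its output, and the agent is in a state nobody planned for.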
Apple's approach with iOS 18 Siri, where the assistant explains what it's about to do before doing it, represents one direction. Microsoft's Copilot has experimented with inline "accept/reject" controls on AI-generated edits. Neither has found a pattern that scales to genuinely autonomous agents.
The Design Problem Under the Design Problem
There's a reason no clean solution has emerged yet: the design problem is actually a trust calibration problem wearing UX clothes.
Users don't know how much to trust AI agents. They don't know when to supervise closely and when to let the agent run. They don't know what the failure modes look like or how bad they get. This isn't a UX failure — it's an accurate response to genuine uncertainty about system behavior.
The UX can make that uncertainty visible or invisible. It can create friction that trains users to supervise more, or friction fatigue that trains them to approve blindly. It can show them a detailed action log they won't read or a summary they'll trust too much.
What it can't do is solve the underlying epistemics. A user who doesn't know what the AI considers a "duplicate calendar event" can't meaningfully evaluate a plan that includes "remove duplicates." The interface can show them the plan, but the knowledge required to evaluate it doesn't come from the interface.
This is why the larger question of agentic UX patterns, the whole problem of designing for systems that act rather than respond, is still mostly open. The vocabulary exists. The patterns are being invented. But the field is designing for something it doesn't fully understand yet, which is exactly the condition under which design patterns tend to calcify into mistakes.
What Good Might Look Like
The honest answer is that the design for agentic AI is going to look different across risk levels, and the most important design decision is recognizing which category your product is in.
For low-stakes, high-reversibility agents: minimize interruption, maximize transparency through readable logs. Trust the user to check when they care to check.
For high-stakes, low-reversibility agents: front-load the confirmation. Not a binary approve/deny, but a structured review where the user can understand the consequences of the actions they're approving. This is slower. It should be slower.
For complex multi-step agents: rethink the interface model entirely. The action-by-action pattern doesn't work when there are 200 actions. What you need is something closer to goal alignment — confirm not the steps but the parameters of the task, and build in checkpoints at the moments of maximum irreversibility.
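One hypothetical shape for that checkpoint model, confirming contiguous irreversible batches instead of individual steps (all names illustrative):

```typescript
// Sketch of goal-level checkpointing: run freely through reversible
// steps, pause once at each irreversibility boundary.

type Step = { description: string; reversible: boolean };

async function runWithCheckpoints(
  steps: Step[],
  checkpoint: (batch: Step[]) => Promise<boolean>,
  execute: (step: Step) => Promise<void>,
): Promise<void> {
  let i = 0;
  while (i < steps.length) {
    // Reversible steps execute without interruption.
    while (i < steps.length && steps[i].reversible) {
      await execute(steps[i++]);
    }
    if (i >= steps.length) break;
    // About to cross an irreversibility boundary: surface the whole
    // contiguous irreversible batch as ONE checkpoint, not N prompts.
    let j = i;
    while (j < steps.length && !steps[j].reversible) j++;
    if (!(await checkpoint(steps.slice(i, j)))) return; // user halts here
    while (i < j) await execute(steps[i++]);
  }
}
```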
The PM with the calendar problem didn't need to see 47 confirmations. They needed one: "This will affect scheduling relationships with clients and executives. Do you want to proceed?"
The AI knew which actions were high stakes. The interface just didn't communicate it.
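As a closing sketch, with assumed categories rather than any real product's data model, that one confirmation is just an aggregation over the high-stakes subset of the batch:

```typescript
// Sketch only: one aggregate question instead of 47 prompts.

type CalendarAction = {
  verb: "decline" | "remove" | "reschedule";
  counterpart: "client" | "executive" | "internal";
};

function consolidatedConfirmation(actions: CalendarAction[]): string | null {
  const highStakes = actions.filter((a) => a.counterpart !== "internal");
  if (highStakes.length === 0) return null; // nothing worth interrupting for
  const groups = Array.from(new Set(highStakes.map((a) => a.counterpart + "s")));
  return `This will affect scheduling relationships with ${groups.join(" and ")}. Do you want to proceed?`;
}
```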
Cover: LED display by Benjamin Farren via Pexels — red "No Signal" indicator in dark setting