Intent Recognition Is Job Recognition
You type: “Summarize this Slack thread.”
The AI complies instantly. It generates a perfect, bulleted list of the conversation topics. It captures the main arguments. It even identifies the speakers correctly.
And you look at it, sigh, and scroll back up to read the raw messages yourself.
The summary wasn’t wrong. It was factually accurate. It was grammatically perfect.
It was just … useless.
You weren’t trying to get a generic summary. You were trying to find out if you were assigned a task while you were at lunch. Or you were trying to see if the engineering lead approved the deploy. Or you were checking if the client is angry.
Those are three different jobs: Accountability, Verification, and Risk Assessment.
A generic “summary” fails all three because it averages them out. It treats the prompt as a text-processing task rather than a decision-making task. This is the silent failure mode of conversational AI. The system understands the words (semantics) but misses the work (intent).
And until your agent can tell the difference, it will remain a novelty—fun to play with, but never safe to trust.
The Partial Prompt
In human-to-human delegation, we rarely say exactly what we mean. We give a shorthand instruction and rely on the other person to fill in the rest with shared context.
If a VP tells a Director, “Handle the Q3 update,” the Director knows the implied constraints without asking:
- Highlight the revenue beat (job: look successful).
- Bury the churn number in the appendix (job: avoid panic).
- Format it for the board deck (job: save time).
The prompt “Handle the Q3 update” is just a pointer to a massive, invisible bundle of requirements.
When a user types that same prompt into an AI, the AI doesn’t have the shared history. It doesn’t know the political landscape. It doesn’t know that “update” means “board deck slide,” not “three-paragraph email.”
So it generates the email. The user sees it and thinks: “This is dumb. I have to rewrite it anyway.” The model is smart enough, but the gap here is intent recognition. The system took the prompt literally instead of treating it as a hiring signal for a specific job.
The “Draft” Trap
The most dangerous word in a prompt is “Draft,” as in, “Draft a reply to this customer.”
To a semantic model, this is a clear instruction: generate text that responds to the previous text. To a product designer thinking in jobs-to-be-done terms, it is a minefield of ambiguity. The user is hiring the “Draft” command for one of three mutually exclusive jobs:
Job 1: The Soft Let-Down (Social Friction Reduction)
The user wants to say “no” to a feature request without damaging the relationship. Success criteria: Empathetic tone, clear "no," future consideration promise. Failure: A blunt rejection or a false promise.
Job 2: The Technical Stall (Buying Time)
The user doesn’t know the answer yet and needs to acknowledge receipt without committing to a timeline. Success criteria: Vague on dates, specific on "we are investigating," professional padding. Failure: Committing to a deadline that doesn’t exist.
Job 3: The De-Escalation (Risk Management)
The customer is furious. The user is panicking. Success criteria: Apologetic, ownership language, immediate next steps. Failure: Defensive tone or "I understand your frustration" clichés.
If your agent doesn’t know which job it’s doing, it will default to a generic "helpful" tone that fails all three. It will sound too cheery for a de-escalation and too vague for a hard "no."
The user spends more time editing the tone than they would have spent writing it from scratch.
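One way out of the trap is to make the jobs first-class objects in your agent’s design instead of leaving them implicit in a prompt template. Here is a minimal sketch in Python; the names and fields are illustrative, not a prescribed schema:

```python
from dataclasses import dataclass

@dataclass
class Job:
    """One concrete job a user might be hiring an ambiguous command for."""
    name: str
    success_criteria: list[str]
    failure_modes: list[str]

# The single command "Draft a reply" can be hired for three mutually
# exclusive jobs, each with its own definition of success and failure.
DRAFT_REPLY_JOBS = [
    Job(
        name="soft let-down",
        success_criteria=["empathetic tone", "clear 'no'", "promise of future consideration"],
        failure_modes=["blunt rejection", "false promise"],
    ),
    Job(
        name="technical stall",
        success_criteria=["vague on dates", "specific on 'we are investigating'"],
        failure_modes=["committing to a deadline that doesn't exist"],
    ),
    Job(
        name="de-escalation",
        success_criteria=["apologetic", "ownership language", "immediate next steps"],
        failure_modes=["defensive tone", "'I understand your frustration' clichés"],
    ),
]
```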
Designing the Calibration Turn
The fix isn’t to build a model that can read minds. The fix is conversational design that treats ambiguity as a trigger for calibration. When the intent is unclear, the agent shouldn’t guess. It should triangulate.
User: “Draft a reply.”
Agent: “Do you want to let them down gently, or just acknowledge receipt while we investigate?”
That single turn changes the interaction model entirely.
- It signals competence. The agent proves it understands the nuance of the work.
- It forces clarity. The user has to consciously select a job, which improves the output quality by an order of magnitude.
- It builds the "Partner" mental model. Partners ask clarifying questions. Tools just execute.
You don’t need a twenty-question survey. You need one question that splits the intent path.
“Are we fixing this code or just explaining it?”
“Is this analysis for you or for a presentation?”
“Do you want the short version or the audit trail?”
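Mechanically, the calibration turn is a single branch: when a prompt maps to more than one plausible job, ask the splitting question instead of generating. A rough sketch, reusing the Job structure from the previous section; the keyword matcher is a stand-in for whatever intent classifier you actually run:

```python
def classify_jobs(prompt: str) -> list[Job]:
    """Stand-in classifier: return every job this prompt could be hiring for."""
    return DRAFT_REPLY_JOBS if "draft" in prompt.lower() else []

def calibration_question(candidates: list[Job]) -> str:
    """Build the one question that splits the intent path."""
    options = ", or a ".join(job.name for job in candidates)
    return f"Before I write anything: is this a {options}?"

def respond(prompt: str) -> str:
    candidates = classify_jobs(prompt)
    if not candidates:
        return "[fall through to generic handling]"
    if len(candidates) == 1:
        return f"[executing: {candidates[0].name}]"  # intent is clear; just do the work
    # Ambiguous: don't guess and average the jobs out. Triangulate.
    return calibration_question(candidates)

print(respond("Draft a reply to this customer."))
# -> Before I write anything: is this a soft let-down, or a technical stall, or a de-escalation?
```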
From Command to Delegation
Trust in an agent follows a predictable ladder. You can measure your product’s maturity by where your users fall on it.
Level 1: The Command Line (Low Trust)
The user micromanages every token.
“Write a SQL query for the users table. Filter by active date. Group by region. Don’t use join. Format as code.”
The user is doing the thinking; the AI is just typing.
Level 2: The Intent (Medium Trust)
The user states the outcome.
“Get me the active user breakdown by region.”
The user trusts the AI to pick the right columns and syntax, but they still verify the logic.
Level 3: The Delegation (High Trust)
The user references a shared context.
“Run the Monday numbers.”
This is the holy grail. The user trusts the agent to know which numbers, what format, who needs them, and where to post them.
You cannot jump from Level 1 to Level 3 with better LLMs alone. You get there by consistently recognizing the job behind the prompt. Every time the agent correctly identifies “The Monday numbers” as a specific bundle of reporting tasks (Job Recognition), the user’s trust climbs a rung.
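Another way to picture Level 3: the shorthand phrase has become a key into a bundle of learned preferences. A hypothetical sketch of what “the Monday numbers” might resolve to; every field here is invented for illustration:

```python
# Level 3 delegation: a shorthand phrase resolves to a learned bundle of tasks.
# All values are illustrative; a real agent would accumulate them from
# corrections and confirmations over many Mondays.
SHARED_CONTEXT = {
    "the monday numbers": {
        "reports": ["active users by region", "week-over-week churn"],
        "format": "board deck slide",
        "audience": "leadership team",
        "deliver_to": "#exec-updates",
        "due": "9am Monday",
    },
}
```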
Building an Intent Library
If you’re building an agent, stop organizing your roadmap by features ("add PDF support," "add chart generation").
Organize it by intents. Take the top 50 prompts in your logs. They probably look like: "Fix this," "Make it better," "Shorten this," "What does this mean?"
There are divergent jobs hidden inside each one.
Take "Make it better" (Writing):
- The Punch-Up Job: It’s boring. (Add voice, vary sentence length).
- The Clarity Job: It’s confusing. (Simplify, remove jargon).
- The Formatting Job: It’s a wall of text. (Add bullets, headers).
If your agent treats "Make it better" as a generic instruction to "rewrite using better words," you are rolling the dice on which job the user actually has. Instead, recognize the ambiguity. Offer the paths.
"Want to punch up the tone, or just simplify the structure?"
The "Helpful Idiot" Problem
The worst kind of AI isn’t the one that refuses to answer. It’s the one that confidently answers the wrong job.
We call this the Helpful Idiot. It sees “Analyze this data” and immediately produces a correlation matrix, three charts, and a summary of the columns.
It looks impressive in a demo. But the user was looking for anomalies. They wanted to know why row 405 is empty. The Helpful Idiot buried the anomaly under a mountain of "insights" that nobody asked for. The user has to dig through the AI’s work just to get back to the raw data.
This creates negative work. The user now has to manage the AI and the original task.
You know you’ve nailed job recognition when the user stops treating the output as a first draft.
When a user reads an AI response and says, “Exactly,” or “Thank you,” it’s because they felt recognized.
They felt that the system understood the pressure they were under, the audience they were writing for, and the outcome they needed.
That feeling is rare in software. Most software feels like a tool you have to manipulate. An agent that recognizes jobs feels like a colleague you can lean on.
And that’s the way to move from “I’m testing it” to “I trust it.”