Business Term

AI Agent

Autonomous AI agent

An AI agent is an AI system that can reason toward a goal, use tools or external context, and carry out multi-step work. The more autonomy it has, the more permissions, auditability, and human checkpoints matter.

Formula: Successful tasks / attempted tasks
Use when: Teams can decide whether chat assistance is enough or whether tool-executing agents are justified.
Watch out: Planning, search, tool calls, file operations, API calls, validation, revision
Updated: 07/04/2026 | Quality: Reviewed | Page tier: Reviewed article | Sources: 3

What it means

An AI agent is a system that interprets a goal, plans or selects steps, uses tools and data sources, and checks or updates its work across a task. Compared with a single chat response, an agent may combine search, file operations, API calls, planning, validation, and revision. Production agents need explicit boundaries for allowed tools, accessible data, human approvals, audit logs, and stop conditions. Without those boundaries, an agent can make mistakes faster and with broader operational impact.

How to calculate it

Evaluate AI agents by task success, safe execution, and human intervention load.

Metric | Formula | What it shows
Task completion rate | Successful tasks / attempted tasks | Measures practical usefulness
Intervention rate | Human stops / executions | Shows whether autonomy is calibrated
Safe execution rate | Policy-compliant executions / executions | Measures permission and audit health
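
The three ratios above can be computed directly from run counts. The following is a minimal sketch, assuming a hypothetical `AgentRunStats` record; the field names and the sample numbers are illustrative and not taken from any real system.

```python
from dataclasses import dataclass


@dataclass
class AgentRunStats:
    """Counts collected over one evaluation window (hypothetical schema)."""
    attempted_tasks: int
    successful_tasks: int
    executions: int
    human_stops: int
    policy_compliant_executions: int


def ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when nothing was measured."""
    return numerator / denominator if denominator else 0.0


def agent_metrics(stats: AgentRunStats) -> dict:
    """Compute the three rates from the table above."""
    return {
        "task_completion_rate": ratio(stats.successful_tasks, stats.attempted_tasks),
        "intervention_rate": ratio(stats.human_stops, stats.executions),
        "safe_execution_rate": ratio(stats.policy_compliant_executions, stats.executions),
    }


# Illustrative numbers only, not measured results.
stats = AgentRunStats(
    attempted_tasks=40,
    successful_tasks=34,
    executions=200,
    human_stops=10,
    policy_compliant_executions=196,
)
metrics = agent_metrics(stats)
print(metrics)
```

With these sample counts the completion rate is 34/40, the intervention rate is 10/200, and the safe execution rate is 196/200. Reading the three together matters: a high completion rate with a low safe execution rate points to a permission problem rather than a success.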

What counts / what does not

AI agents overlap with chatbots, workflow automation, and RPA, but differ by tool access and decision scope.

Item | Treatment | Why it matters
Include | Planning, search, tool calls, file operations, API calls, validation, revision | Handles multi-step work
Exclude | Unlimited autonomy, unapproved high-impact actions, ownerless decisions | Requires governance
Make explicit | Tool set, data scope, approval gates, logs, stop conditions | Required for production use
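
One way to make the "Make explicit" row concrete is to write the boundaries down as data and check every step against them. This is a sketch under stated assumptions: `AgentBoundaries`, `check_step`, the tool names, and the step budget are all hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentBoundaries:
    """Explicit limits for one agent deployment (hypothetical structure)."""
    allowed_tools: frozenset      # tool set
    data_scopes: frozenset        # data scope
    approval_required: frozenset  # approval gates
    max_steps: int                # stop condition
    log_path: str                 # where execution logs are written


def check_step(b: AgentBoundaries, tool: str, scope: str, step_number: int) -> str:
    """Return 'stop', 'deny', 'needs_approval', or 'allow' for a proposed step."""
    if step_number > b.max_steps:
        return "stop"
    if tool not in b.allowed_tools or scope not in b.data_scopes:
        return "deny"
    if tool in b.approval_required:
        return "needs_approval"
    return "allow"


# Hypothetical boundaries for a triage agent.
triage = AgentBoundaries(
    allowed_tools=frozenset({"read_logs", "search_prs", "draft_memo", "run_runbook"}),
    data_scopes=frozenset({"logs", "pull_requests", "incident_archive"}),
    approval_required=frozenset({"run_runbook"}),
    max_steps=20,
    log_path="agent_runs.jsonl",
)

print(check_step(triage, "read_logs", "logs", 1))
print(check_step(triage, "run_runbook", "incident_archive", 2))
print(check_step(triage, "restart_service", "production", 3))
print(check_step(triage, "read_logs", "logs", 21))
```

Keeping the boundaries in one immutable object means a reviewer can read the whole policy in one place, and anything not listed is denied by default.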

What moves the number

Agent performance depends on tool design, permissions, evaluation, and failure boundaries as much as model capability.

Driver | Metric impact
Tools | Narrow, well-described tools limit damage when something fails
Permissions | Separate read, draft, execute, and external-send rights
Evaluation | Long tasks need both intermediate and final checks
Human review | Approval gates make autonomy safer to expand
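
The separation of read, draft, execute, and external-send rights can be sketched with Python's standard `enum.Flag`. The tool names and the `TOOL_RIGHTS` mapping below are hypothetical; the point is only that each tool declares the rights it needs and unknown tools are denied.

```python
from enum import Flag, auto


class Right(Flag):
    READ = auto()
    DRAFT = auto()
    EXECUTE = auto()
    EXTERNAL_SEND = auto()


# Hypothetical mapping from tool name to the rights that tool requires.
TOOL_RIGHTS = {
    "search_logs": Right.READ,
    "write_triage_memo": Right.READ | Right.DRAFT,
    "restart_service": Right.EXECUTE,
    "email_customer": Right.DRAFT | Right.EXTERNAL_SEND,
}


def can_call(granted: Right, tool: str) -> bool:
    """True only if every right the tool requires has been granted."""
    required = TOOL_RIGHTS.get(tool)
    if required is None:
        return False  # unknown tools are denied by default
    return (required & granted) == required


# An agent that may read and draft, but not execute or send externally.
granted = Right.READ | Right.DRAFT

for tool in ("search_logs", "write_triage_memo", "restart_service", "email_customer"):
    print(tool, can_call(granted, tool))
```

Expanding autonomy then becomes an explicit change to `granted` rather than a side effect of adding a new tool.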

When it helps

  • Teams can decide whether chat assistance is enough or whether tool-executing agents are justified.
  • Classifying actions by approval requirement balances speed and safety.
  • MCP or API integrations can be designed around tool descriptions, input schemas, permissions, and logs.
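
As an illustration of designing around tool descriptions and input schemas, the sketch below defines one tool with the `name`, `description`, and `inputSchema` fields used in MCP tool definitions, then checks arguments before a call. The tool itself is hypothetical, and `validate_arguments` is a deliberately simplified check written for this example, not a full JSON Schema validator or part of any MCP SDK.

```python
# Hypothetical read-only tool, shaped like an MCP tool definition.
SEARCH_INCIDENTS_TOOL = {
    "name": "search_known_incidents",
    "description": "Read-only search over past incident reports. Does not modify any system.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "query": {"type": "string"},
            "max_results": {"type": "integer"},
        },
        "required": ["query"],
    },
}

# Only the JSON types used in this example are mapped.
_JSON_TYPES = {"string": str, "integer": int, "boolean": bool}


def validate_arguments(tool: dict, arguments: dict) -> list:
    """Return a list of problems; an empty list means the call may proceed."""
    schema = tool["inputSchema"]
    errors = []
    for key in schema.get("required", []):
        if key not in arguments:
            errors.append(f"missing required argument: {key}")
    for key, value in arguments.items():
        spec = schema["properties"].get(key)
        if spec is None:
            errors.append(f"unexpected argument: {key}")
            continue
        expected = _JSON_TYPES[spec["type"]]
        # bool is a subclass of int in Python, so reject it for integers explicitly.
        bad_bool = spec["type"] == "integer" and isinstance(value, bool)
        if bad_bool or not isinstance(value, expected):
            errors.append(f"wrong type for {key}: expected {spec['type']}")
    return errors


print(validate_arguments(SEARCH_INCIDENTS_TOOL, {"query": "disk full", "max_results": 5}))
print(validate_arguments(SEARCH_INCIDENTS_TOOL, {"max_results": "five", "force": True}))
```

A description that states what the tool does not do ("Does not modify any system") gives both the model and a human reviewer a clearer basis for deciding when the tool is appropriate.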

How to use it

  • An AI agent is a system for doing work, not just producing an answer.
  • More tool access requires stronger logs, approvals, and rollback paths.
  • Start with low-risk read or draft workflows before allowing high-impact execution.
  • Measure intervention rate, misoperations, permission violations, and user burden alongside success rate.
  • Standard connectors such as MCP are useful, but tool exposure must be intentionally scoped.

Decision cautions

Agent usefulness and agent risk scale together.

  • External sending, deletion, purchases, contracts, and permission changes should require human approval.
  • Ambiguous tool descriptions can cause tools to be invoked for the wrong purpose.
  • Without execution logs and replay information, incidents cannot be investigated well.
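
The cautions above can be combined into a single wrapper that blocks high-impact actions until a human approves them and records every attempt with enough detail to replay it. This is a minimal sketch: the category names mirror the list above, while `run_with_gate`, the log format, and the in-memory `audit_log` are assumptions made for illustration.

```python
import json
from datetime import datetime, timezone
from typing import Callable, Optional

# Categories from the cautions above that should require human approval.
HIGH_IMPACT_CATEGORIES = {
    "external_send", "delete", "purchase", "contract", "permission_change",
}


def run_with_gate(category: str, tool: str, arguments: dict,
                  execute: Callable[[dict], str],
                  approved_by: Optional[str], log: list) -> str:
    """Execute the action unless it is high-impact and unapproved; always log."""
    needs_approval = category in HIGH_IMPACT_CATEGORIES
    if needs_approval and approved_by is None:
        status, result = "blocked_pending_approval", None
    else:
        status, result = "executed", execute(arguments)
    log.append(json.dumps({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "category": category,
        "tool": tool,
        "arguments": arguments,  # kept so the call can be replayed later
        "approved_by": approved_by,
        "status": status,
        "result": result,
    }))
    return status


def send_email(args: dict) -> str:
    """Stand-in for a real sender; it only returns a string."""
    return f"sent to {args['to']}"


audit_log = []
args = {"to": "customer@example.com"}
first = run_with_gate("external_send", "send_email", args, send_email, None, audit_log)
second = run_with_gate("external_send", "send_email", args, send_email, "oncall-lead", audit_log)
print(first, second, len(audit_log))
```

Blocked attempts are logged as well as executed ones, since an investigation usually needs to know what the agent tried to do, not only what it did.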

Read with

AI agents should be read together with MCP, tool use, evaluation, and prompt injection.

Term | Role | Why read together
Model Context Protocol | Standard for connecting tools and context | Often used in agent integrations
Tool Use | Ability to call external systems | Defines execution scope
Prompt Injection | Untrusted input can redirect actions | Especially important for agents

Example

An engineering team uses an AI agent for first-pass incident triage. The initial tool set allows log reading, related PR search, and known-incident search, but it cannot change production settings or restart services. The agent writes a triage memo with evidence links and unresolved questions. A human approves the next step before any runbook action is taken. The workflow reduces triage time, but one incident shows that an old runbook was cited, so the team adds source freshness checks to the agent's retrieval layer.
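
The source freshness check mentioned at the end of the example could look roughly like the sketch below. The `RetrievedDoc` type, the `last_reviewed` field, and the 180-day threshold are assumptions chosen for illustration; the example does not say how the team implemented its check.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class RetrievedDoc:
    """A retrieved runbook or incident note (hypothetical shape)."""
    title: str
    last_reviewed: date


def split_by_freshness(docs, today: date, max_age_days: int = 180):
    """Separate docs reviewed within max_age_days from older ones."""
    cutoff = today - timedelta(days=max_age_days)
    fresh = [d for d in docs if d.last_reviewed >= cutoff]
    stale = [d for d in docs if d.last_reviewed < cutoff]
    return fresh, stale


# Illustrative documents and a fixed date so the result is reproducible.
docs = [
    RetrievedDoc("Runbook: queue backlog", date(2025, 11, 1)),
    RetrievedDoc("Runbook: legacy cache flush", date(2024, 3, 10)),
]
fresh, stale = split_by_freshness(docs, today=date(2026, 1, 15))

print([d.title for d in fresh])
print([d.title for d in stale])
```

Stale documents do not have to be discarded. The agent could still cite them in the triage memo with an explicit "last reviewed" warning so the human approver can judge whether the runbook still applies.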

Compare with

Term | What it does | Key consideration
AI Agent | Carries out multi-step work toward a goal | Needs tools and evaluation
Chatbot | Responds in conversation | Usually limited execution authority
RPA | Automates fixed steps | Less flexible reasoning

Common mistakes

  • Assuming agents must be fully autonomous. Human checkpoints are often the safer design.
  • Assuming a better model alone removes operational risk. Tool and permission design matter as well.
  • Assuming longer tasks automatically improve efficiency. Without intermediate checks, rework can grow.

Frequently asked questions

How is an AI agent different from a chatbot?

A chatbot mainly answers. An AI agent can combine multi-step reasoning with tools, data, and execution workflows.

Should agents be fully autonomous?

Not for high-impact actions. Use approval gates, logs, and stop conditions.

What should be automated first?

Start with read, draft, summary, and research-support tasks that are easy to review and roll back.

Sources

Source | Kind
NIST: AI RMF | tier_s
Model Context Protocol: What is MCP? | tier_s
Model Context Protocol: Tools | tier_s
AI Agent | YogoQ Core