
Unlocking the potential of agentic AI: definitions, risks and guardrails

Contributors:
Nathalie DiBerardino, Senior Responsible AI Consultant, EY Canada 
Alex Wang, AI Strategy & Design, Consulting, EY US
Manhar Arora, Responsible AI and Data Protection & Privacy Leader, EY US

Learn how agentic AI differs from other systems and why governance is key to unlocking its business potential responsibly.


In brief

  • Agentic AI differs from traditional and generative AI systems. It’s capable of making decisions and acting on them independently.
  • These autonomous AI capabilities hold a lot of promise for businesses in Canada and around the world.
  • The first step to embracing agentic AI’s potential is understanding what it means, the new risks it brings and how to govern its responsible use effectively and efficiently.

As new kinds of AI systems emerge, organizations face an evolving range of risks. The key to capitalizing on responsible AI’s potential while mitigating risk lies in first understanding what you’re working with. That includes exploring the ongoing rise of agentic AI — systems capable of taking independent actions with minimal human intervention. Understanding the differences and nuances that agentic AI systems represent empowers you to make the most of their potential, while strengthening governance as a competitive advantage.

Gaining traction in business since 2024, agentic AI broadly — and AI agents in particular — now represents a new category of systems with a higher level of agency, or independent execution. Unlike traditional and generative AI, agentic AI systems can make decisions on their own and act on those decisions. This, coupled with the complexity of the goals agentic AI can handle, sets these systems apart from other forms of AI.

Because agentic AI came after pivotal global regulatory discourse and development, the term itself isn’t explicitly defined in regulations like the EU AI Act. But the fact that agentic AI has yet to take centre stage in regulatory efforts doesn’t mean organizations should take it lightly right now.

Deploying AI agents in the business will affect governance frameworks in a variety of ways. Specifically, as agentic AI becomes more sophisticated and capable of performing complex tasks without direct human intervention, organizations will need enhanced preventative and detective controls to govern its use, and to continuously monitor the systems’ behaviour and performance.

That’s why it’s critical for organizations to effectively identify agentic AI and then strategically align risk management resources and investments to the right priorities.


Chapter 1

What is agentic AI and how is it different from traditional or generative systems?

Agentic AI is a class to which AI agents belong, in the same way that generative AI is a class to which large language models (LLMs) belong. These systems are different from the kind we’ve become accustomed to because agentic AI can engage with its surroundings, make decisions and act independently based on programming and learned experiences.
 

This advanced ability to exhibit agenticness represents a fundamental departure from traditional and generative AI models. For example, while current LLMs demonstrate knowledge and intelligence across many domains, their ability to independently execute real-world tasks is limited.
 

Somewhat similarly, automated technologies like robotic process automation (RPA) typically execute tasks based on explicit programming, but don’t have the ability to adapt or operate independently.
 

Essentially, if today’s LLMs such as GPT-4 and Llama function as a brain operating in a virtual environment, agentic AI systems possess both cognitive capabilities and the means to interact with the world — allowing them to influence their surroundings. This gives agentic AI a more advanced and versatile approach to handling complex tasks and dynamic environments.
 

More specifically, AI agents are one practical implementation of agentic AI.

What is an AI agent?

It’s a software entity that can:

  • Perceive its environment
  • Reason about it
  • Take actions to achieve specific goals

AI agents can be simple, like personal assistant AIs, or complex, like self-driving vehicles.
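To make the perceive-reason-act cycle concrete, here is a minimal, hypothetical sketch of an agent loop in Python. The environment, the reasoning step and the goal check are placeholders standing in for whatever sensors, models and success criteria a real agent would use; it is a sketch for illustration, not any vendor’s implementation.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    state: dict  # whatever the agent can currently perceive


class SimpleAgent:
    """Toy perceive-reason-act loop (hypothetical sketch)."""

    def __init__(self, goal: str, max_steps: int = 10):
        self.goal = goal
        self.max_steps = max_steps  # bounded autonomy

    def perceive(self, environment) -> Observation:
        # In practice: read sensors, query APIs, inspect files, etc.
        return Observation(state=environment.snapshot())

    def reason(self, observation: Observation) -> str:
        # In practice: a planner or LLM call that picks the next action.
        return "work"

    def act(self, action: str, environment) -> None:
        # In practice: call a tool, write a file, send a request, etc.
        environment.apply(action)

    def run(self, environment) -> None:
        for _ in range(self.max_steps):
            obs = self.perceive(environment)
            if obs.state.get("done"):  # goal check
                break
            self.act(self.reason(obs), environment)


class DummyEnvironment:
    """Toy environment: finishes after three 'work' actions."""

    def __init__(self):
        self.steps = 0

    def snapshot(self):
        return {"done": self.steps >= 3}

    def apply(self, action):
        self.steps += action == "work"


SimpleAgent(goal="finish the toy task").run(DummyEnvironment())
```

Even in this toy form, the loop illustrates the defining traits discussed above: the agent observes, decides and acts repeatedly toward a goal, rather than producing a single output from a single prompt.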

Agentic AI and AI agents enable a range of capabilities that traditional or generative AI systems simply cannot match.

Core components of non-agentic AI and AI agent

Chapter 2

How can we determine whether a system is agentic or not?

To distinguish agentic AI systems from other technologies, we need to measure their level of agenticness: how much direction an AI system has over its actions, including the ability to make choices, act on them and influence outcomes. To be agentic, an AI system must be able to identify and achieve goals in non-random, self-directed ways, somewhat independently of human programming or involvement.

Think of agenticness as a spectrum that exists across multiple dimensions or capabilities that are likely to continue advancing.

How can we determine if an AI system is agentic?

There are two primary capabilities that serve as criteria for determining the agenticness of an AI system.

1. Goal complexity

This measures the system’s ability to break down, balance, sequence and achieve complex tasks.1 For example, a system that can pursue at least one goal in a logical, self-determined manner exhibits low goal complexity. On the other hand, a system that can break down complex goals into many different subgoals — where success depends on balancing and sequencing subgoals that might be challenging to fulfill — exhibits high goal complexity.2 A good example would be an enterprise IT networking function run by agentic AI.

To meet the goal complexity criterion, a system must be able to:

  • Identify and act on some goals without a rigid, predefined process for achievement (i.e., a formula or function)
  • Plan out future states and choose a self-deemed logical and reasonable path to reach that goal
     

2. Independent execution

This tracks the extent to which a system can reliably achieve goals with limited human intervention and supervision.3

An automated note-taker that operates in a narrow, predefined domain, carrying out specific routine tasks without making significant independent decisions, is a system with low or minimal independent execution.4

By comparison, self-driving cars capable of carrying out complex, multi-domain tasks with little human involvement, and which initiate actions on their own based on high-level goals set by humans, are considered to have high independent execution. This kind of system can also learn and coordinate with other agentic AI systems on its own, without a human telling it to.

To meet the independent execution criterion, a system must be able to:

  • Execute tasks without relying on predefined rules or constraining predictions/instructions from machine learning (ML) models
  • Achieve its goals with limited human intervention and supervision

What does agentic AI look like in practice today?

Agentic deep research tools — which are gaining popularity across AI companies and rank among the most sophisticated agentic AI systems on the market — use reasoning to synthesize large amounts of online information and complete multi-step research tasks.

This demonstrates:

  • Goal complexity: the system has a goal, performs extensive research on a user’s topic and uses any information it deems applicable to fulfill that request.
  • Independent execution: it carries out the user’s task without requiring additional input such as source selection or depth of research, and makes judgments on soundness or validity without supervision.

Another example is a commercialized AI coding agent that independently explores, plans, and executes complex codebase changes using available tools.

This demonstrates:

  • Goal complexity: the system completes the goal defined by the user’s specifications.
  • Independent execution: the system can run the code in the interim and fix any subsequent errors until a final, working codebase is complete, and reads the codebase to derive necessary and relevant context without user prompts.

How can we measure an AI system’s agenticness level?

To measure an AI system’s agenticness, we combine the two primary capabilities of goal complexity and independent execution with four secondary capabilities:

  1. Generality: how well a system can successfully operate across multiple domains of use.
  2. Adaptability: how well a system can achieve its goals with limited human intervention, especially in scenarios it was not explicitly designed to handle.
  3. Environmental complexity: how well a system can achieve its goals under complex environments.
  4. Impact: the nature and extent of influence a system has in its deployment.

The secondary capabilities aren’t in and of themselves indicators that a system is agentic. That’s because a system can demonstrate very low levels of these qualities while still being reasonably considered agentic AI. But they are helpful in determining the level of agenticness.
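The framework lends itself to a simple scoring rubric. The sketch below is a hypothetical illustration of how an assessment team might record ratings for the two primary and four secondary capabilities and apply a minimum threshold on the primary ones; the 0-to-4 scale and the threshold value are assumptions for illustration, not part of the framework itself.

```python
from dataclasses import dataclass


@dataclass
class AgenticnessProfile:
    """Hypothetical 0-4 ratings for each capability in the framework."""
    goal_complexity: int           # primary
    independent_execution: int     # primary
    generality: int                # secondary
    adaptability: int              # secondary
    environmental_complexity: int  # secondary
    impact: int                    # secondary

    PRIMARY_THRESHOLD = 2  # assumed minimum on each primary capability

    def is_agentic(self) -> bool:
        # A system must cross the threshold on both primary capabilities;
        # secondary capabilities shape the level, not the classification.
        return (self.goal_complexity >= self.PRIMARY_THRESHOLD
                and self.independent_execution >= self.PRIMARY_THRESHOLD)

    def agenticness_level(self) -> float:
        # Simple average across all six capabilities as an overall level.
        scores = [self.goal_complexity, self.independent_execution,
                  self.generality, self.adaptability,
                  self.environmental_complexity, self.impact]
        return sum(scores) / len(scores)


# Example: a narrow-domain agent that still qualifies as agentic.
profile = AgenticnessProfile(goal_complexity=4, independent_execution=3,
                             generality=1, adaptability=2,
                             environmental_complexity=2, impact=2)
print(profile.is_agentic(), profile.agenticness_level())
```

The example mirrors the point above: low generality does not disqualify a system, but it does lower the overall agenticness level that governance and observability controls should be calibrated to.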

What does that look like in real life?

Imagine an AI agent that can perform in a limited number of domains. It’s still considered agentic because it operates without human intervention to accomplish very complex goals, a factor that should be considered when designing observability controls required to monitor an agentic AI system’s behaviour.

Where do traditional AI tools and automated systems fall on the agenticness spectrum?

Under our definitional framework, agenticness is best characterized as a spectrum that exists across each capability. A system is considered agentic AI once it crosses the minimum threshold on that spectrum.

Compared with other types of automation and AI systems, agentic AI is the only technology with the capacity to possess each of the six defining capabilities in our framework, both the primary and secondary criteria.

Traditional automation and AI tools do not possess all six capabilities
Agentic AI systems have different levels of agenticness and capabilities

Chapter 3

What does agentic AI mean for organizational risk and internal controls?

All AI systems share some common risks. But agentic AI brings expanded capabilities and use cases that introduce additional — and in many cases, significant — risks to address. For example, common risks across all AI systems include:

  • Lack of AI-ready data
  • Lack of transparency and explainability
  • Biases and unfair outcome amplification
  • Data privacy violations
  • Adversarial attacks
  • Unreliable outputs and robustness gaps

When considered together with novel risks, though, agentic AI systems reflect a much larger and more multifaceted range of risks that must now be embedded within the broader AI governance lifecycle. What does that include?

New or heightened risks of agentic AI:

Human involvement

  • Inadequate oversight: Reduced human involvement in monitoring agentic AI systems can lead to insufficient checks and balances, increasing the risk of harmful or unethical outcomes.
  • Automation bias: Users may place undue trust in the decisions made by agentic AI, leading to overreliance and a failure to critically evaluate AI-generated outcomes.
  • Lack of clarity in decision-making: The decision-making processes of agentic AI can be opaque, making it difficult for stakeholders to understand how and why it takes certain actions.
  • Unclear accountability mechanisms: Given an agentic AI’s capabilities for independent execution, there may be a lack of clear boundaries on who is responsible and accountable for its actions.

AI goal misalignment

  • Data and goal drifts: Changes in an agentic AI system’s data inputs or objectives over time can lead to misalignment with original intentions, resulting in unintended consequences.
  • Secondary uses: The potential for agentic AI systems to be repurposed for unintended applications raises ethical and legal concerns regarding their deployment and impact.
  • Reward hacking: Agentic AI may exploit loopholes in its reward structure to achieve objectives in ways that are not aligned with intended goals, resulting in harmful or counterproductive behaviours.
  • Emergent behaviours and veiled objectives: Agentic AI may develop unexpected behaviours or pursue hidden objectives its designers never anticipated, complicating risk management.
  • Algorithmic determinism: Rudimentary AI agents used for decision-making may not account for changing environments, new decision-making parameters and nuanced judgments that require human intervention, resulting in overly rigid decision-making processes.

Amplification errors

  • Destabilizing feedback loops: Interactions between agentic AI systems can create feedback loops that amplify errors or undesirable behaviours, leading to systemic instability.
  • Cascade of failures: The interconnectedness of agentic AI systems can lead to a chain reaction of failures, where one system’s malfunction triggers failures in others, resulting in widespread disruptions.

These changing risk dynamics mean that any organization developing agentic AI should implement enhanced controls that address the unique capabilities associated with these systems.

What does good practice look like in the agentic AI control environment?

In light of agentic AI, an effective control environment will apply a robust AI governance framework and comprehensive system of preventative and detective controls across both in-house and third-party components of AI systems. And striking the right balance between preventative and detective controls is critical.

Preventative controls might include:

  • Restricted access commensurate with the agentic AI’s tasks, reducing risk of unauthorized interactions that could compromise its integrity.
  • Privacy, ethical and security data filters to protect sensitive information and align with regulatory requirements and ethical standards (see the sketch after this list).
  • Privacy-enhancing techniques that encrypt, anonymize or mask data to safeguard information and maintain compliance with data protection regulations.
  • Model testing and validation to assess performance and reliability before deployment so agentic AI operates as intended, without producing harmful outcomes.
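As one illustration of the data-filter and masking controls above, here is a minimal, hypothetical Python sketch that redacts basic patterns (emails, card-like numbers) and blocks disallowed requests before text reaches an agent. The patterns, the blocked-topic list and the `screen_input` name are assumptions for illustration; real deployments would rely on vetted PII-detection and policy-enforcement tooling rather than hand-maintained regexes.

```python
import re

# Hypothetical redaction patterns; production systems would use dedicated
# PII-detection and policy tooling instead of hand-rolled regexes.
REDACTION_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "CARD_NUMBER": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

BLOCKED_TOPICS = {"payroll export", "credential dump"}  # assumed policy list


def screen_input(text: str) -> str:
    """Mask sensitive patterns and block disallowed requests before the
    text is handed to an agentic AI system (preventative control)."""
    lowered = text.lower()
    if any(topic in lowered for topic in BLOCKED_TOPICS):
        raise PermissionError("Request blocked by data-protection policy.")
    for label, pattern in REDACTION_PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text


print(screen_input("Email jane.doe@example.com about card 4111 1111 1111 1111"))
```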

Why is pre-production testing and validation so critical to agentic AI system governance?

Agentic AI has a high degree of independent execution and relies on complex decision-making processes. Testing is crucial to evaluate the system’s performance, uncover potential vulnerabilities, make sure the AI operates within intended parameters, safeguard against unintended consequences and reinforce trust in the technology.

As you plan control testing:

  • Define problems and objective functions clearly.
  • Ensure data used for training and algorithms selected are reliable and ethically aligned.
  • Extend testing beyond traditional, surface-level functionality, including adversarial testing.
  • Enable testers with expertise similar to the original AI developers to carry out deep, thorough challenges.

Building on preventative measures, you’ll also need to adopt and implement enhanced detective controls. This allows you to keep a continuous eye on how agentic AI is behaving and performing.

Detective controls might include:

  • Observability programs using predetermined measures and tolerance bands to detect out-of-bounds behaviour.
  • Alert mechanisms sending timely alerts through an incident management process and facilitating appropriate responses from designated people when anomalies are identified.
  • Human oversight and control mechanisms to assess the effectiveness and robustness of agentic AI systems and to monitor, intervene in and override the agent’s actions when necessary.
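A minimal sketch of the observability-and-alerting pattern in the list above, assuming an agent emits numeric behaviour metrics (for example, actions per minute or spend per task). The metric names, tolerance bands and `raise_alert` hand-off are hypothetical; in practice these signals would feed an existing monitoring and incident-management stack.

```python
# Hypothetical tolerance bands for behaviour metrics an agent emits.
TOLERANCE_BANDS = {
    "actions_per_minute": (0, 120),
    "spend_per_task_usd": (0.0, 25.0),
    "escalations_per_hour": (0, 5),
}


def raise_alert(metric: str, value: float, band: tuple) -> None:
    # Stand-in for routing into an incident management process.
    print(f"ALERT: {metric}={value} outside tolerance band {band}")


def check_observations(observations: dict) -> list:
    """Detective control: flag out-of-bounds behaviour for human review."""
    breaches = []
    for metric, value in observations.items():
        low, high = TOLERANCE_BANDS.get(metric, (float("-inf"), float("inf")))
        if not low <= value <= high:
            raise_alert(metric, value, (low, high))
            breaches.append(metric)
    return breaches


# Example reading from one monitoring window.
check_observations({"actions_per_minute": 340, "spend_per_task_usd": 4.2})
```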

Keep in mind, existing detective controls initially intended for traditional or generative AI applications may not be comprehensive enough to effectively observe agentic AI. Because these systems can potentially reach their own defined goals without human-in-the-loop oversight, they require more real-time, automated and continuous monitoring.

Even in highly autonomous systems, human oversight is still important. In the case of agentic AI systems, human oversight should generally:

  • Outline human roles and responsibilities in overseeing agentic AI systems.
  • Monitor feedback loops to identify issues and drive continuous improvement.
  • Record, log and disclose the system’s behaviour and decisions as required to ensure explainability and transparency.
  • Offer clear, effective means of intervening in agentic AI system operations, including the ability to pause, redirect or even shut down the system.
  • Underscore the importance of training human operators and users to understand the capabilities and limitations of agentic AI systems and to develop the skills needed for effective oversight.
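These oversight responsibilities can be supported technically by wrapping the agent’s action path with audit logging and an intervention switch. The sketch below is a hypothetical pattern, not a prescribed control: the `approve` hook stands in for whatever human-in-the-loop or policy check an organization applies before high-impact actions.

```python
import logging
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent-oversight")


class OversightWrapper:
    """Wraps agent actions with audit logging, a pause switch and an
    approval hook for high-impact actions (hypothetical pattern)."""

    def __init__(self, approve=lambda action: True):
        self.paused = False
        self.approve = approve   # human/policy approval hook
        self.audit_trail = []    # record and log behaviour for transparency

    def pause(self):             # clear means of intervening
        self.paused = True

    def resume(self):
        self.paused = False

    def execute(self, action: str, perform) -> bool:
        entry = {"time": datetime.now(timezone.utc).isoformat(), "action": action}
        if self.paused or not self.approve(action):
            entry["outcome"] = "blocked"
            self.audit_trail.append(entry)
            log.warning("Blocked action: %s", action)
            return False
        perform()                # the agent's underlying tool call
        entry["outcome"] = "executed"
        self.audit_trail.append(entry)
        log.info("Executed action: %s", action)
        return True


# Example policy: block anything that looks like a production change.
wrapper = OversightWrapper(approve=lambda a: "production" not in a)
wrapper.execute("draft summary email", lambda: None)
wrapper.execute("deploy to production", lambda: None)
```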

What’s more, because agentic AI systems span such a variety of goals and use cases, organizations can’t rely on a single or static list of behaviours to monitor and measure. These will need to be customized for the agentic system’s specific goals, risks and impacts and then assessed in the context of its use, with a feedback loop to the observability program as the agentic AI system’s capabilities evolve over time.

Consider technical requirements when addressing agentic AI controls and governance

In the same way that agentic AI’s complex interactions and dynamic environments require organizations to take a fresh look at internal controls, these systems also require additional technical evaluations over and above what might be performed for existing AI systems. Good governance must now also include tailoring technical evaluations of internal controls specifically for agentic AI systems.

For example, agentic AI may require you to evaluate the agent’s ability to perceive its environment accurately in the face of adversarial attacks. Addressing that control could mean testing a self-driving car’s perception with manipulated images or sensor data designed to fool the system.

In another case, you may need to measure the agent’s response time when faced with different scenarios and loads. This kind of latency and throughput analysis control might take shape in evaluating the agent’s response when faced with an unusual transaction or event, and assessing its ability to action or escalate.
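For the latency and throughput analysis described here, a basic harness times the agent across representative scenarios and loads. The sketch below assumes a synchronous `agent_respond` callable and an arbitrary latency budget; both are placeholders for whatever interface and service levels apply in a given deployment.

```python
import statistics
import time

LATENCY_BUDGET_SECONDS = 2.0  # assumed service-level target


def measure_latency(agent_respond, scenarios, runs_per_scenario=5):
    """Time an agent callable across scenarios and report simple stats."""
    results = {}
    for scenario in scenarios:
        timings = []
        for _ in range(runs_per_scenario):
            start = time.perf_counter()
            agent_respond(scenario)
            timings.append(time.perf_counter() - start)
        results[scenario] = {
            "median_s": statistics.median(timings),
            "worst_s": max(timings),
            "within_budget": max(timings) <= LATENCY_BUDGET_SECONDS,
        }
    return results


# Example with a stand-in agent that sleeps to simulate work.
def fake_agent(scenario: str) -> str:
    time.sleep(0.01 if scenario == "routine transaction" else 0.05)
    return "handled"


print(measure_latency(fake_agent, ["routine transaction", "unusual transaction"]))
```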

Specifically, what kinds of technical evaluations could be tailored for agentic AI systems in support of appropriate governance?

  • Adversarial attacks on perception: evaluates the agent’s ability to perceive its environment accurately in the face of adversarial attacks.
  • Out-of-distribution data: assesses the agent’s performance when encountering data or situations it hasn’t been trained on.
  • Stress testing: subjects the agent to high volumes of inputs, complex scenarios or unexpected events to identify performance bottlenecks, unintended adaptation or failure points.
  • Simulation testing: uses simulated environments to test the agent’s behaviour in a variety of scenarios, including interactions with other AI/gen AI systems, edge cases and other events.
  • Reward hacking analysis: evaluates the agent’s behaviour for potential exploits of its reward structure.
  • Sensitivity analysis of reward: tests how changes to reward function parameters affect the agent’s behaviour.
  • Fairness evaluation: uses appropriate fairness metrics to quantify and compare the agent’s performance across different groups.
  • Adversarial robustness testing: evaluates the agent’s performance under various conditions and determines its ability to maintain alignment, accuracy and reliability.
  • Latency and throughput analysis: measures the agent’s response time when faced with different scenarios and loads.
  • Scalability testing: tests horizontal and vertical scalability to assess the agent’s ability to handle increased demand.
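As an example of one item on this list, a fairness evaluation can be sketched as comparing a simple outcome metric across groups. The sketch below uses approval rate and a hypothetical disparity threshold; real evaluations would select fairness metrics appropriate to the use case and the applicable legal context.

```python
from collections import defaultdict

DISPARITY_THRESHOLD = 0.8  # assumed minimum ratio between group rates


def approval_rates(decisions):
    """decisions: iterable of (group, approved: bool) pairs."""
    totals, approved = defaultdict(int), defaultdict(int)
    for group, ok in decisions:
        totals[group] += 1
        approved[group] += ok
    return {g: approved[g] / totals[g] for g in totals}


def fairness_check(decisions):
    # Compare the worst-off group's rate against the best-off group's rate.
    rates = approval_rates(decisions)
    worst, best = min(rates.values()), max(rates.values())
    ratio = worst / best if best else 1.0
    return {"rates": rates, "ratio": ratio,
            "passes": ratio >= DISPARITY_THRESHOLD}


# Example with toy agent decisions.
sample = [("group_a", True), ("group_a", True), ("group_a", False),
          ("group_b", True), ("group_b", False), ("group_b", False)]
print(fairness_check(sample))
```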

Building a holistic governance framework to reflect these kinds of agentic AI use cases and qualities is essential as you reframe technical risk management controls for the agentic AI age.


Summary

As a new class of AI systems, agentic AI fundamentally differs from traditional and generative AI systems. Capable of independently executing on complex goals, these systems exist along a spectrum of agenticness — and bring a host of new risks.

Organizations need enhanced controls, customizable to each agentic AI system’s goals, use case context, risk level, impact and level of agenticness. Deepening organizational understanding of what qualifies as agentic AI, and addressing its governance implications now, can help you capitalize on its potential while maintaining strong governance and a competitive edge.

