The shift from the age of automation to the Agentic era

Mulenga Agley

Let's start with something that is not 'agentic'...

  • A user fills out a lead form on a landing page.
  • A webhook pushes the form data to a Zap.
  • GPT-4 evaluates the quality of the lead based on demographic and firmographic inputs.
  • If the lead is qualified, GPT-4 drafts a personalised onboarding email.
  • A CRM action logs the email and tags the lead for follow-up.

This workflow, while powerful, is ultimately static. It cannot replan. It cannot handle ambiguity.
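
In code, the same pipeline is nothing more than a fixed function. The sketch below is illustrative: the model name, the lead-scoring prompt, and the `crm_log_and_tag` helper are placeholders for whatever your stack actually uses.

```python
import re
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def crm_log_and_tag(contact: str, email_body: str, tag: str) -> None:
    # Placeholder for the real CRM integration.
    print(f"Logged email to {contact} and tagged '{tag}'.")

def handle_lead_webhook(form_data: dict) -> None:
    # Step 1: score the lead with a fixed prompt and fixed criteria.
    raw = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content":
                   f"Rate this lead from 1 to 10 for fit. Reply with a number only.\n{form_data}"}],
    ).choices[0].message.content
    match = re.search(r"\d+", raw)
    score = int(match.group()) if match else 0

    # Step 2: one predefined branch -- qualified or not.
    if score >= 7:
        email = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content":
                       f"Draft a personalised onboarding email for this lead:\n{form_data}"}],
        ).choices[0].message.content
        crm_log_and_tag(form_data.get("email", "unknown"), email, tag="follow-up")

    # There is no step 3. If the lead is ambiguous or the email is weak,
    # nothing in this pipeline can notice, replan, or try a different path.
```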

An AI agent is a computational system equipped with:

  • Goals: A persistent objective it attempts to fulfill.
  • Memory: A mechanism for storing and retrieving prior knowledge or state.
  • Tools: Access to external interfaces and APIs through which it interacts with its environment.
  • Reasoning Frameworks: The ability to plan, evaluate, and revise its own behavior.

Unlike LLMs, which operate in a stateless, single-prompt paradigm, agents function in loops. They perceive, decide, act, observe outcomes, and iterate. Architecturally, this often means incorporating ReAct-style frameworks (reasoning + acting), retrieval-augmented generation (RAG), and modular tool use.

LLMs are like very smart interns waiting for instructions. AI agents are like junior product managers... imperfect, learning, but autonomous enough to ship value.

Most explanations of AI agents I’ve encountered tend to fall into two extremes. They’re either too technical for practical comprehension or far too basic to offer useful insight. That’s precisely why I’ve decided to write this article... Running Growthcurve, the world's leading growth marketing agency, I’ve worked with AI tools extensively across all aspects of the business, from operations to creative production.

I begin at the level of large language models, since that is the foundational concept most people interacting with AI are already familiar with. Popular AI chatbots like ChatGPT, Google Gemini, and Claude are built on top of large language models, or LLMs. These systems are highly capable of generating and editing text. The interaction framework is simple: a human provides an input, and the LLM produces an output based on the statistical patterns it learned during training. If I ask ChatGPT to write an email requesting a coffee meeting, my prompt becomes the input. The output is the drafted email, likely more polite than I’d usually be in real life. That is the entire transaction. Input leads to output, mediated by the model’s learned structure.
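
In code, that entire transaction is one stateless call. A minimal sketch using the OpenAI Python client (any chat-completion API follows the same shape):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# One input, one output. The model sees only what is in the prompt and
# retains nothing once the call returns.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user",
               "content": "Write a short, polite email asking Sam for a coffee meeting next week."}],
)
print(response.choices[0].message.content)
```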

This simplicity begins to reveal the system’s limitations when I shift the context. If I ask ChatGPT a question like “When is my next coffee chat?” it fails to deliver a useful answer. The system doesn’t have access to my calendar. This scenario clarifies two of the most critical constraints of LLMs. First, despite being trained on vast amounts of public and proprietary data, LLMs lack access to user-specific or internal company data unless it is explicitly provided at the time of the prompt. Second, LLMs are inherently passive. They don’t take initiative. They wait for a prompt and respond to it. Both of these traits are essential to remember as I move forward into more advanced concepts.

To bridge the capabilities gap, I began to work with AI workflows. This concept extends the basic utility of LLMs by integrating them into structured systems where multiple actions are executed in sequence. For instance, if I instruct the LLM to perform a search query on my Google Calendar every time I ask about a personal event, the system becomes marginally more intelligent. When I ask, “When is my coffee chat with Elon Husky?” the LLM, guided by predefined instructions, checks the calendar and retrieves the correct data. However, this reveals a second-order limitation. If I follow up by asking, “What will the weather be like that day?” the system fails again. The reason is simple. The LLM is locked into a fixed path. It was instructed only to search the calendar. The calendar contains no weather data. Therefore, the system cannot answer the question.
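
Expressed as code, the workflow's intelligence lives entirely in branches I wrote in advance. `search_google_calendar` and `summarise_with_llm` below are hypothetical stand-ins for the real Calendar API call and model call.

```python
def search_google_calendar(query: str) -> str:
    # Placeholder: a real workflow would call the Google Calendar API here.
    return "Coffee chat with Elon Husky, Thursday at 10:00."

def summarise_with_llm(question: str, context: str) -> str:
    # Placeholder: a real workflow would pass question + context to an LLM.
    return f"Based on your calendar: {context}"

def answer_question(question: str) -> str:
    # The human author decided the branches; the LLM never chooses its own path.
    if "coffee chat" in question.lower() or "meeting" in question.lower():
        events = search_google_calendar(question)
        return summarise_with_llm(question, context=events)
    # "What will the weather be like that day?" lands here: there is no
    # weather branch, so the workflow simply fails.
    return "Sorry, I can only answer questions about your calendar."
```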

This fixed execution path illustrates a defining characteristic of AI workflows: they can only follow predefined instructions created by a human. This structure is often called the control logic. Even if I add more steps, such as accessing weather data via an API or converting the output into speech with a text-to-speech model, the system still operates as a workflow. It doesn’t matter if the process has three steps or three thousand. If the human remains the core decision-maker defining every branch, it is not an AI agent. It is a workflow with no agency.

Retrieval-Augmented Generation, or RAG, is frequently discussed in this context. In technical terms, RAG is an approach that enables the model to look up information before producing an output, for example by accessing my calendar or a weather service as part of the output generation pipeline. RAG is not an agentic capability on its own. It is a functional enhancement of workflows. It allows the system to augment its answers with contextually retrieved information but still does not enable it to operate autonomously.
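
A hedged sketch of the retrieve-then-generate pattern. The retriever below is a naive keyword match standing in for a proper embedding and vector-store lookup; the shape of the pipeline is the point.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

DOCUMENTS = [
    "Calendar: coffee chat with Elon Husky on Thursday at 10:00.",
    "Calendar: board review on Friday at 14:00.",
]

def retrieve(query: str) -> list[str]:
    # Placeholder retriever: keyword overlap standing in for a real
    # embedding / vector-store lookup.
    words = set(query.lower().split())
    return [doc for doc in DOCUMENTS if words & set(doc.lower().split())]

def rag_answer(query: str) -> str:
    context = "\n".join(retrieve(query)) or "No relevant documents found."
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user",
                   "content": f"Answer using only this context:\n{context}\n\nQuestion: {query}"}],
    )
    return response.choices[0].message.content

# The lookup happens before generation, but the pipeline itself is still
# fixed: it always retrieves, always from the same store, always once.
```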

The transition into true agentic systems occurs when the human decision-maker is removed from the control loop. The essence of agency is this ability to make reasoning-driven decisions about how to complete a goal.

In technical architectures, this is often implemented using the ReAct framework, short for Reason + Act. Every AI agent must have the capacity to reason about its environment and the actions available to it and then take steps accordingly. ReAct is not a buzzword. It is a structural requirement. Without reasoning and action, a system cannot be considered agentic.
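
A compressed sketch of a Reason + Act loop. The JSON protocol, tool set, and stop condition are illustrative choices, and production code would validate the model's JSON rather than trusting `json.loads` directly.

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

TOOLS = {
    "search_calendar": lambda q: "Coffee chat with Elon Husky, Thursday at 10:00.",  # placeholder
    "get_weather": lambda q: "Thursday: 4°C, light snow.",                           # placeholder
}

SYSTEM = (
    "You are an agent. Each turn, reply with JSON only: "
    '{"thought": "...", "action": "search_calendar | get_weather | finish", "input": "..."}'
)

def react(goal: str, max_steps: int = 5) -> str:
    messages = [{"role": "system", "content": SYSTEM},
                {"role": "user", "content": goal}]
    for _ in range(max_steps):
        reply = client.chat.completions.create(model="gpt-4o", messages=messages)
        step = json.loads(reply.choices[0].message.content)        # reason
        if step["action"] == "finish":
            return step["input"]                                   # final answer
        observation = TOOLS[step["action"]](step["input"])         # act
        messages.append({"role": "assistant", "content": json.dumps(step)})
        messages.append({"role": "user", "content": f"Observation: {observation}"})  # observe
    return "Stopped after max_steps without finishing."

# react("When is my coffee chat with Elon Husky, and what will the weather be like that day?")
```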

Another defining trait of AI agents is their ability to iterate. If a LinkedIn post isn’t sufficiently engaging, a traditional workflow requires a human to make improvements. An AI agent, in contrast, can autonomously critique its own output. It can instantiate a secondary LLM to evaluate the post against predefined criteria, such as alignment with LinkedIn content heuristics, tone calibration, or engagement benchmarks. Based on this evaluation, the agent can revise the post. This feedback loop can be repeated until all quality conditions are satisfied. This process is autonomous. The human does not intervene.
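
A sketch of that critique loop, with an illustrative rubric; the specific criteria and the APPROVED convention are assumptions, not a standard.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def ask(prompt: str) -> str:
    return client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    ).choices[0].message.content

def write_linkedin_post(brief: str, max_rounds: int = 3) -> str:
    post = ask(f"Write a LinkedIn post about: {brief}")
    for _ in range(max_rounds):
        # A second model call acts as the critic. The criteria below are
        # illustrative, not a canonical rubric.
        critique = ask(
            "Review this LinkedIn post for hook strength, length under 1,300 "
            "characters, and a clear call to action. Reply APPROVED if it "
            f"passes, otherwise list concrete fixes.\n\n{post}"
        )
        if critique.strip().startswith("APPROVED"):
            return post
        post = ask(f"Revise the post using this feedback:\n{critique}\n\nPost:\n{post}")
    return post  # best effort after max_rounds
```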

This trait enables agents to escape the rigidity of workflows. In a real-world system, the agent can dynamically choose new tools, reroute its logic, evaluate results, and retry until the objective is achieved. Agents are goal-driven systems that operate in dynamic contexts with minimal human intervention.

The AI vision agent demo created by Andrew Ng provides a practical example. When I input the keyword “skier,” the agent first reasons about the semantic representation of that term: a person on skis moving quickly across snow. It then scans video footage, attempting to identify matching patterns. When it finds a probable match, it indexes the clip and returns it to the user. This eliminates the need for human pre-tagging. The agent is not retrieving tags. It is generating understanding, acting on that understanding, and delivering results. It performs reasoning, acts via external tools (in this case, video analysis), and iterates until it reaches a plausible match. That is the operational definition of agency in machine systems.

The interface between user experience and backend system complexity is an important consideration. The front end of an AI agent system can be deceptively simple. A user may interact with a single prompt or button click, unaware of the number of internal reasoning cycles, retrieval steps, or tool interactions taking place. The abstraction of complexity from the user is not incidental. It is the objective. However, for the system designer or the enterprise deploying AI agents, it is critical to understand the architecture beneath the surface.

An agentic system typically consists of the following key components:

  • A control loop responsible for initiating reasoning cycles.
  • A task planner which decomposes goals into actionable steps.
  • A memory store which provides persistent and contextual recall.
  • A toolchain registry to invoke specific capabilities (e.g. file access, API calls, databases).
  • An execution monitor to evaluate outcomes and determine if retries or replanning are required.
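
A skeletal sketch of how those components fit together. Every method body below is a stub; a real implementation wires them to an LLM, a vector store, and a tool layer.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Tool:
    name: str
    description: str
    run: Callable[[str], str]

@dataclass
class MemoryStore:
    entries: list[str] = field(default_factory=list)
    def recall(self, query: str) -> list[str]:
        return self.entries[-5:]           # placeholder: return recent context
    def remember(self, entry: str) -> None:
        self.entries.append(entry)

@dataclass
class Planner:
    def decompose(self, goal: str) -> list[str]:
        return [goal]                       # placeholder: one step per goal

@dataclass
class ExecutionMonitor:
    def succeeded(self, step: str, result: str) -> bool:
        return bool(result)                 # placeholder success check

@dataclass
class Agent:
    tools: dict[str, Tool]                  # toolchain registry
    memory: MemoryStore
    planner: Planner
    monitor: ExecutionMonitor

    def run(self, goal: str) -> list[str]:
        results = []
        for step in self.planner.decompose(goal):         # control loop
            tool = next(iter(self.tools.values()))        # placeholder tool selection
            result = tool.run(step)
            if not self.monitor.succeeded(step, result):  # evaluate; retry or replan here
                result = tool.run(step)
            self.memory.remember(result)
            results.append(result)
        return results
```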

In practical terms, when I instruct an agent to produce a result, such as identifying relevant news articles and turning them into branded social media posts, the system begins by breaking down the goal. It evaluates the required sub-tasks, such as article retrieval, summarization, copywriting, tone matching, and quality control. Each of these may be handled by a separate sub-agent or module, depending on system complexity.

The ability of agents to evaluate interim outputs is what differentiates them from brittle deterministic systems. If the post does not meet a minimum threshold, perhaps because it is too long for LinkedIn or lacks a strong call to action, the agent can self-correct. It re-engages the reasoning process, adjusts the prompt, or replaces the tool used for generation. This feedback mechanism introduces a degree of resilience into the system.

In a well-instrumented agent framework, error recovery is essential. Agents must be able to detect failure states, log those states, and reroute their logic accordingly. For instance, if an external API fails due to a rate limit, the agent must pause execution, check for alternate providers, or wait and retry. Failure to design this capability results in agentic systems that collapse under operational stress.
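
A sketch of that recovery behaviour: exponential backoff plus provider rotation. `RateLimitError` and the provider callables are placeholders for whatever your client library actually raises and exposes.

```python
import random
import time

class RateLimitError(Exception):
    # Placeholder for the provider-specific rate-limit exception.
    pass

def call_with_recovery(prompt: str, providers: list, max_attempts: int = 4) -> str:
    delay = 1.0
    for attempt in range(max_attempts):
        provider = providers[attempt % len(providers)]   # rotate to an alternate provider
        try:
            return provider(prompt)
        except RateLimitError:
            # Log the failure state, back off, and let the loop reroute.
            print(f"Rate limited on attempt {attempt + 1}; retrying in {delay:.1f}s")
            time.sleep(delay + random.uniform(0, 0.5))
            delay *= 2
    raise RuntimeError("All providers exhausted; escalate to a human or replan.")
```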

Persistent memory is another critical requirement. Agents must be able to access long-term context, including past decisions, user preferences, task history, and dynamic constraints. This is implemented using a combination of vector databases (for semantic memory), key-value stores (for structured memory), and episodic logs (for audit and traceability). These memory structures enable agents to simulate continuity, providing a coherent experience even when operating across sessions or changing environments.
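
A toy illustration of the three memory layers. The "embedding" here is a bag-of-words count so the example stays self-contained; a real system would use an embedding model with a vector database such as Pinecone, Weaviate, or FAISS.

```python
import math

def embed(text: str) -> dict[str, int]:
    # Naive bag-of-words "embedding" for illustration only.
    vec: dict[str, int] = {}
    for word in text.lower().split():
        vec[word] = vec.get(word, 0) + 1
    return vec

def cosine(a: dict[str, int], b: dict[str, int]) -> float:
    dot = sum(a[w] * b.get(w, 0) for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

class AgentMemory:
    def __init__(self):
        self.semantic: list[tuple[dict, str]] = []   # vector store: (embedding, text)
        self.structured: dict[str, str] = {}         # key-value store: preferences, constraints
        self.episodic: list[str] = []                # append-only log for audit and traceability

    def remember(self, text: str) -> None:
        self.semantic.append((embed(text), text))
        self.episodic.append(text)

    def recall(self, query: str, k: int = 3) -> list[str]:
        q = embed(query)
        ranked = sorted(self.semantic, key=lambda item: cosine(q, item[0]), reverse=True)
        return [text for _, text in ranked[:k]]

memory = AgentMemory()
memory.structured["preferred_tone"] = "direct, no fluff"
memory.remember("User prefers coffee chats before 11am.")
print(memory.recall("when does the user like meetings?"))
```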

Tool use remains central. An agent without access to tools is functionally equivalent to a chat interface. Tools may include HTTP clients, document parsers, image analyzers, browser emulators, and SQL interfaces. Each tool is wrapped with metadata indicating how it is invoked, what inputs it expects, and what outputs it returns. The agent’s planner queries this registry to select tools dynamically based on the task requirements.
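
A sketch of such a registry. The `http_get` entry and its schemas are illustrative; the point is that each tool carries metadata the planner can read when selecting tools at run time.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ToolSpec:
    name: str
    description: str
    input_schema: dict     # what inputs the tool expects
    output_schema: dict    # what it returns
    run: Callable[[dict], dict]

REGISTRY: dict[str, ToolSpec] = {}

def register(spec: ToolSpec) -> None:
    REGISTRY[spec.name] = spec

register(ToolSpec(
    name="http_get",
    description="Fetch a URL and return the response body.",
    input_schema={"url": "string"},
    output_schema={"status": "int", "body": "string"},
    run=lambda args: {"status": 200, "body": f"<placeholder for {args['url']}>"},  # stub
))

def describe_tools() -> str:
    # The planner feeds this catalogue to the LLM so it can choose tools
    # dynamically rather than following a hard-coded path.
    return "\n".join(f"{t.name}: {t.description}" for t in REGISTRY.values())
```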

The ability to invoke other agents is increasingly relevant. In advanced systems, agents operate as modular services. One agent handles scheduling, another handles data analysis, and another manages outbound communication. These agent systems communicate through message passing, shared memory, or pub-sub systems. They coordinate through handshake protocols or shared goal objects. The complexity of orchestration increases with the number of agents, requiring robust state tracking and role assignment.
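
A minimal sketch of agents as modular services exchanging messages over a shared queue. The roles and routing here are illustrative; a production system would use a real message broker or pub-sub layer.

```python
import queue

bus: "queue.Queue[dict]" = queue.Queue()   # shared message bus

def scheduler_agent(task: str) -> None:
    bus.put({"to": "analyst", "type": "analysis_request", "payload": task})

def analyst_agent(message: dict) -> None:
    summary = f"analysis of: {message['payload']}"           # placeholder analysis
    bus.put({"to": "outreach", "type": "draft_request", "payload": summary})

def outreach_agent(message: dict) -> None:
    print(f"Sending update based on {message['payload']}")   # placeholder communication

HANDLERS = {"analyst": analyst_agent, "outreach": outreach_agent}

scheduler_agent("Q3 pipeline review")
while not bus.empty():
    msg = bus.get()
    HANDLERS[msg["to"]](msg)   # simple router standing in for a pub-sub system
```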

From the user’s perspective, this entire architecture is abstracted behind a simple interface. This is intentional. The system’s job is not to surface its complexity. Its job is to solve problems. The average user should not need to understand memory vectorization, retrieval scoring, tool invocation protocols, or planning heuristics. They should only need to provide a goal and evaluate the output.

The final abstraction I want to clarify is the conceptual hierarchy between LLMs, workflows, and agents. The distinction can be formalized as follows:

In the case of a standard LLM usage pattern, the human provides an input. The LLM produces a single-step output. No reasoning beyond the prompt-response mechanism is performed.

In the case of an AI workflow, the human provides an input and defines a path. The LLM follows that path, potentially invoking external tools, but without independent decision-making authority. All branches and contingencies are predetermined by the system’s designer.

In the case of an AI agent, the human provides a goal. The LLM, embedded in a control loop and given access to tools and memory, reasons about the optimal steps to achieve that goal, executes actions, evaluates interim results, iterates as needed, and produces a final output. The defining attribute is that the LLM becomes a decision-maker within the system. It is no longer merely responding to instructions. It is determining its own plan of action.

This structural shift redefines what it means to deploy AI in operational environments. The inclusion of autonomous reasoning, memory-informed context awareness, dynamic tool invocation, and iterative quality control mechanisms turns what was once a passive language model into an active system of intelligence. Not just a tool, but an actor within your workflows.

As a system designer or deployment strategist, it is imperative to understand the internal composition and coordination architecture that underpins agentic systems. While end-user applications can obscure this complexity, the operational burden of designing agents that are both autonomous and reliable rests with the engineers and technical architects responsible for their behavior.

The defining shift in paradigm centers on delegation of authority. In traditional software, deterministic logic paths are authored by humans and executed linearly. In agentic systems, logic selection is delegated to a model capable of adaptive planning. This requires rigorous boundaries, including constrained tool access, safe execution contexts, logging instrumentation, and often human-in-the-loop approval gates for high-risk actions.

Model selection remains a critical variable. The LLM powering the agent must exhibit reliable chain-of-thought behavior, function well with structured outputs (e.g. function calls, JSON), and operate with high semantic fidelity across tasks. Where long-term task memory is required, persistent vector stores must be implemented using systems like Pinecone, Weaviate, or FAISS. Semantic embedding quality directly affects retrieval relevance and therefore response integrity.

Reasoning strategies may be guided using prompting scaffolds such as self-ask, tree-of-thought, or planner-actor architectures. These determine how the agent decomposes tasks and how intermediate steps are verified. Evaluation may rely on internal validators (other LLMs), external APIs, or even user feedback loops. These strategies ensure not just action, but verification.

Tool invocation must be secure and contextual. Each tool available to the agent is defined with input and output schemas, permission boundaries, error catchers, and timeout behaviors. When an agent chooses to use a tool, it must retrieve the correct usage pattern, format the input accordingly, handle the return structure, and adjust its subsequent reasoning based on the output. Misalignment in any of these stages leads to cascading failures.
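
A sketch of a guarded invocation, reusing the `ToolSpec` shape from the registry sketch above: validate inputs against the declared schema, enforce a timeout, and convert failures into observations the agent can reason about. The details are assumptions, not a canonical pattern.

```python
import concurrent.futures

def invoke_tool(spec, args: dict, timeout_s: float = 10.0) -> dict:
    # Input validation against the tool's declared schema.
    missing = [k for k in spec.input_schema if k not in args]
    if missing:
        return {"ok": False, "error": f"missing inputs: {missing}"}

    # Timeout and error boundaries around the actual call.
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(spec.run, args)
    try:
        return {"ok": True, "result": future.result(timeout=timeout_s)}
    except concurrent.futures.TimeoutError:
        return {"ok": False, "error": f"{spec.name} timed out after {timeout_s}s"}
    except Exception as exc:  # error catcher: surface the failure, don't crash the agent
        return {"ok": False, "error": f"{spec.name} failed: {exc}"}
    finally:
        pool.shutdown(wait=False)  # don't block the reasoning loop on a hung tool
```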

Critique modules provide a further layer of robustness. These modules act as reviewers that can accept, reject, or suggest revisions to the agent’s outputs. The critique step may be applied to writing quality, factual accuracy, compliance with policies, or adherence to brand tone. Agents can then replan based on this critique, closing the loop.

Evaluation metrics differ from traditional software. Agentic systems are evaluated on task completion rate, number of reasoning iterations, tool invocation accuracy, hallucination rate, and user satisfaction. Logs are parsed to identify failure modes, including invalid tool inputs, failed API calls, incomplete reasoning cycles, or infinite loops.

Execution environments vary. Some agents operate in stateless HTTP contexts with ephemeral memory, suitable for one-off task execution. Others operate in persistent containers or serverless functions with background triggers, enabling long-running agents capable of monitoring, reacting, and re-engaging over time. In advanced cases, agents operate in multi-agent networks, communicating via message queues or shared blackboards, coordinating distributed goals.

An AI agent is not a chatbot, not a workflow, not a single API call. It is a composite system capable of goal-driven reasoning, autonomous action, environmental awareness, tool-mediated execution, and outcome-based iteration.

These are the systems I now rely on every day, not in hypothetical future scenarios, but in live operational contexts across client engagements, internal automation, and infrastructure augmentation. They are not experimental prototypes. They are production-grade assets. The agentic era is not arriving. It is already operational.