You’ve all been listening to me talk about AI, and specifically LLMs (large language models), for some time now. If you are new to AI, you can catch up on some of the terminology by reading these: “The Rise of Super-intelligence” and “Navigating the AI Landscape.”
New LLMs, of course, continue to be developed and those that exist continue to evolve. OpenAI’s latest release of “OpenAI o1” is a recent example of ongoing LLM evolution. The AI industry itself is now evolving to adapt to the new technologies that it is creating. It is a truly amazing thing to observe!
One such technology we are already starting to hear a lot about is “agent technology.” Many AI industry figures predict that these new “AI agents” will eventually number in the millions, potentially billions! That’s a lot of agents!
There are, of course, a number of questions that such claims raise, with two big questions being: “What the hell is an agent?” and “Why would we ever need so many agents?”
I will attempt to answer these questions in this blog post. Given the capabilities that agents will eventually have, I think it’s important that we understand exactly what an agent is and how agents might be used.
The Gist of It
We all know what an app is, right? We can thank our phones for teaching us all about apps. Below is a screenshot I took of just one page out of 8 total that shows the many apps I have installed on my phone. Your phone’s app catalog probably looks similar, I would guess.
For the most part, if we want to run any of the apps installed on our phone, we have to interact with the phone. This might mean tapping on an app icon, or tapping on an icon and then using our voice to issue a command, or maybe just using a voice command like, “Hey Google!” The bottom line is that these phone apps all require us “to do something” before they will do whatever it is we want them to do.
It’s a process that is very familiar to all of us. It is also a process that is about to experience exponential change over the next 1-3 years. This won’t be a “refinement” of app technology, or the introduction of a new type of app. It’s going to be an entirely new way of getting work done, using AI. As Monty Python once said, “And now for something completely different!”
1995 Reimagined
The term “agent” has been in use for some years now. I’m not talking about “secret agents” or “FBI agents,” but rather computerized agents, agents that reside on your PC or phone and do work on your behalf.
I remember the first time I heard about computerized “agents.” It was way back in 1995! Hewlett-Packard, my employer, released a video in 1995 that later became known as the 1995 video. It was an animated video that was meant to display the power and the promise of a new technology HP was promoting called “NewWave.”
The star of the video was identified as a software agent, essentially a PC folder icon on your desktop that one could program to perform basic tasks like collecting data and printing a report. It was mind-blowing at the time. It did the trick, intoxicating both employees and potential customers as we collectively drank the 1995 Kool-Aid. Alas, the promise of the 1995 video never fully came to pass. The technology of the time simply couldn’t deliver it, and nothing could, at least not until recently.
So, what changed? Well, for one thing, 2017 happened. 2017 will go down in the annals of digital history as the year when virtually everything related to AI changed, thanks to the publication of a research paper titled “Attention Is All You Need.” The paper described an architecture that made training language models far more efficient than before. It also lets a model weigh every word in a passage against every other word, whether earlier or later in the text, so it can pick up contextual connections between words that older models tended to miss.
Transformers
I’ve been thinking about how to explain the significance of the Attention paper from 2017. This paper is widely regarded as a major milestone in the development of artificial intelligence, particularly in natural language processing. It introduced the Transformer architecture, which has fundamentally changed how we approach language models, much like earlier breakthroughs in technology reshaped their fields.
While I don’t claim to fully understand how Transformers work, I’ll do my best to describe them: Transformers enable natural language processing at scale, allowing all of the input words to be analyzed together rather than one at a time. This parallel processing capability enhances efficiency and helps identify contextual relationships between words. For a simple illustration, consider these two sentences:
- The cat is on the mat.
- It is very hungry.
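To make this a bit more concrete, here is a toy sketch in Python of the core calculation from the Attention paper (scaled dot-product attention), applied to the two sentences above. Please treat it as an illustration only: the vectors are hand-picked by me, whereas a real Transformer learns them during training and uses far more dimensions, heads, and layers. The names TOY_KEYS and QUERY_FOR_IT are my own inventions.

```python
import math

# Hand-picked toy vectors (NOT learned): [animal-like, object-like, function-word].
# A real Transformer learns these from data; this table is an assumption.
TOY_KEYS = {
    "the": [0.0, 0.0, 1.0],
    "cat": [1.0, 0.1, 0.0],
    "is": [0.0, 0.0, 1.0],
    "on": [0.0, 0.0, 1.0],
    "mat": [0.1, 1.0, 0.0],
    "it": [0.0, 0.0, 1.0],
    "very": [0.0, 0.0, 1.0],
    "hungry": [0.6, 0.0, 0.2],
}

# Hypothetical query for the word "It": "I am looking for something animal-like."
QUERY_FOR_IT = [3.0, 0.5, 0.0]


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query, keys):
    """Scaled dot-product attention weights for one query against all keys."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    return softmax(scores)


# Both sentences are scored together, not one word after another.
tokens = "The cat is on the mat It is very hungry".split()
keys = [TOY_KEYS[t.lower()] for t in tokens]
weights = attention_weights(QUERY_FOR_IT, keys)

best = tokens[max(range(len(tokens)), key=weights.__getitem__)]
print(best)  # prints: cat
```

The word with the largest weight comes out as “cat,” but only because I chose the vectors that way. The point is the mechanism: every word in both sentences is scored against “It” in a single pass.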
Traditional language models often processed text in a linear fashion, which could limit their ability to connect information across sentences. In contrast, Transformers can analyze the entire input simultaneously, allowing them to recognize that “It” in sentence 2 refers back to “cat” in sentence 1. This ability facilitates a deeper contextual understanding of the text.
What is an Agent?
Before looking at what Transformers make possible, it’s worth defining what is meant by an AI agent. An AI agent is a piece of software designed to perform tasks autonomously or semi-autonomously on behalf of a user, which it accomplishes through the use of various tools, including one or more AI systems. Using these tools and AI systems, agents can perceive their environment, make decisions, and take actions to achieve specific goals. Agents also vary in type and style, from simple chatbots that handle basic customer inquiries to complex systems capable of learning and adapting over time.
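If it helps to see the “perceive, decide, act” idea as code, here is a minimal sketch in Python. Everything in it is hypothetical: the SimpleAgent class, the inbox example, and the tool names are mine, and the decide step is a plain keyword check standing in for where a real agent would consult an AI model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class SimpleAgent:
    """A hypothetical agent: a goal, some tools, and a perceive-decide-act loop."""
    goal: str
    tools: Dict[str, Callable[[str], str]]
    log: List[str] = field(default_factory=list)

    def perceive(self, environment: Dict[str, str]) -> str:
        # Read whatever is waiting in the (made-up) inbox.
        return environment.get("inbox", "")

    def decide(self, observation: str) -> str:
        # Stand-in for an AI model: a real agent would ask an LLM here.
        if "invoice" in observation.lower():
            return "file_invoice"
        if observation:
            return "summarize"
        return "wait"

    def act(self, action: str, observation: str) -> str:
        # Use the matching tool, if the agent has one for this action.
        tool = self.tools.get(action)
        result = tool(observation) if tool else "nothing to do"
        self.log.append(f"{action}: {result}")
        return result

    def step(self, environment: Dict[str, str]) -> str:
        observation = self.perceive(environment)
        action = self.decide(observation)
        return self.act(action, observation)


tools = {
    "file_invoice": lambda text: "filed under Invoices",
    "summarize": lambda text: text[:20] + "...",
}
agent = SimpleAgent(goal="keep the inbox tidy", tools=tools)
result = agent.step({"inbox": "Invoice #12 from the plumber"})
print(result)  # prints: filed under Invoices
```

Notice that nobody tapped an icon. The agent looked at its surroundings, picked an action, and used a tool on its own.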
Transformers, and the language models built on them, significantly enhance the capabilities of these AI agents. With improved natural language processing, AI agents can now engage in more sophisticated interactions, understanding and generating human-like text with much better contextual awareness. For instance, AI agents powered by these language models can help with:
- Customer Service: Handle complex inquiries with greater understanding of context and nuance.
- Personal Assistants: Engage in more natural conversations, interpreting user intent more effectively.
- Content Creation: Assist in generating written or visual material, ensuring coherence and relevance.
- Data Analysis: Process large volumes of text data, extracting insights with depth and accuracy.
- Education: Provide personalized learning experiences by adapting to students’ needs.
The evolution from basic chatbots to tomorrow’s sophisticated AI agents reflects the advancements in underlying technologies like Transformers. As these language models continue to evolve, we can expect AI agents to become even more capable, potentially revolutionizing how we interact with technology and access information.
Please note that a key advantage of these AI agents is their autonomy. Unlike traditional phone apps that require direct user input, these agents can proactively perform tasks on your behalf, such as those listed above. They achieve this not through rigid programming alone, but through sophisticated training that allows them to understand context, make decisions, and take appropriate actions. It’s important to remember that while AI agents are trained rather than traditionally programmed, they often combine learned behaviors with predefined rules to operate effectively.
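Here is one way the mix of learned behavior and predefined rules could look, sketched in Python. This is an assumption on my part, not how any particular product works: propose_action is a keyword check pretending to be a trained model, and NEEDS_APPROVAL is a hypothetical hard-coded rule list.

```python
def propose_action(request: str) -> str:
    """Stand-in for a trained model; a real agent would call an LLM here."""
    text = request.lower()
    if "pay" in text:
        return "send_payment"
    if "remind" in text:
        return "set_reminder"
    return "ask_user"


# Predefined rule (hypothetical): these actions always need a human's OK.
NEEDS_APPROVAL = {"send_payment"}


def choose_action(request: str, user_approved: bool = False) -> str:
    """Let the 'model' propose an action, then let the fixed rules have the last word."""
    action = propose_action(request)
    if action in NEEDS_APPROVAL and not user_approved:
        return "ask_user"
    return action


print(choose_action("Remind me to call Mom"))                      # prints: set_reminder
print(choose_action("Pay the electric bill"))                      # prints: ask_user
print(choose_action("Pay the electric bill", user_approved=True))  # prints: send_payment
```

The learned part supplies the flexibility, and the rule supplies the guardrail: the agent can act on its own for low-stakes tasks but checks in before spending money.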
Why Do We Need So Many?
So, now you have an idea of what an AI agent is supposed to do, which answers the first question mentioned above: “What the hell is an agent?” Now let me try to answer why we may end up with millions of agents.
The expectation is that over the next 1-3 years AI agents will be designed for specific tasks or domains, which can lead to a wide variety of specialized agents. How many people do you know who own a cellphone? Well, each of them will probably want their own personal AI agent assistant. Wouldn’t you? I sure do! Over 97% of the US population currently owns a cellphone. Worldwide, more than 5 billion people own one!
Agents will also be developed for specific industries and domains, like education. Right now, according to OpenAI, o1 is performing at PhD level on benchmarks in a variety of fields, including physics, biology, and math, and the list is growing. I think it would be every parent’s dream if each of their kids could get a personal tutor assigned to them as they begin school, helping them learn. Think about it: a teacher who is always patient, always supportive, never gets angry or frustrated, is smart as hell in every subject offered, has been trained to exhibit the best teaching skills, and oh yeah, is always there, perhaps right in their pocket! There are 28.3 million school-age children.
Just between education and personal assistants, we’re already talking millions of AI agents. And that is barely scratching the surface! Across the world, billions of AI agents is certainly not out of the question.
And don’t forget: agents don’t have to work alone! In fact, there is tremendous potential when agents work together, perhaps by dividing up complex problems and solving the pieces. So it won’t be a one-to-one formula for agents. Each person will probably have multiple agents, and each city, each school, and, hell, each home and building will eventually have its own agents, trained to focus and deliver on the needs of their specific domain.
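As a rough picture of agents dividing up a job, here is a small Python sketch. The specialist agents, the SPECIALISTS table, and the coordinate function are all hypothetical names of mine. In a real system each specialist would be a full agent with its own model and tools, not a one-line function.

```python
from typing import Callable, Dict, List, Tuple


def calendar_agent(task: str) -> str:
    """Hypothetical specialist that only handles scheduling."""
    return f"calendar: scheduled '{task}'"


def shopping_agent(task: str) -> str:
    """Hypothetical specialist that only handles purchases."""
    return f"shopping: ordered '{task}'"


# Each domain gets its own specialist (an assumption for illustration).
SPECIALISTS: Dict[str, Callable[[str], str]] = {
    "calendar": calendar_agent,
    "shopping": shopping_agent,
}


def coordinate(subtasks: List[Tuple[str, str]]) -> List[str]:
    """Hand each (domain, task) pair to the matching specialist and collect the results."""
    results = []
    for domain, task in subtasks:
        specialist = SPECIALISTS.get(domain)
        if specialist is None:
            results.append(f"unassigned: no agent for '{domain}'")
        else:
            results.append(specialist(task))
    return results


# One big job ("plan the birthday party") split into pieces.
plan = [
    ("calendar", "dentist on Friday"),
    ("shopping", "birthday candles"),
    ("travel", "book flight"),
]
results = coordinate(plan)
for line in results:
    print(line)
```

The third task comes back unassigned because nobody has built a travel agent yet, which is one more reason to expect the number of agents to keep growing.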
The near future is going to be an Agent-based World out there!