After the novelty of generative artificial intelligence (AI) wore off, many people raised an important question: yes, it’s cool, but how can it make an impact in the real world? It was a valid question. Although AI chatbots can be seen as a one-stop shop for quickly searching for information, holding instant conversations, writing articles, and creating images or videos, their role is largely limited to that of a system where a human user has to continuously prompt it for output and monitor the result.

Even though its capabilities cannot be denied, and it has made a significant impact in improving the productivity of workers in certain fields, it lacked one key element that prevented it from becoming a building block that can take over tasks and truly automate them: decision-making. Generative AI can help with certain aspects of a person’s work today, but it cannot carry out a task on its own.

For example, you can ask it to write an email to a client letting them know about an unexpected delay, but it can’t send that message or respond to an angry reply the client sends back. Similarly, you can ask Gemini or ChatGPT for the “best smartphone for shooting videos,” and it might recommend the latest iPhone 16 Pro Max or the Samsung Galaxy S24 Ultra. But it won’t scour the web to find the best deal and buy the phone for you.

Realizing this gap, tech companies working on large language models (LLMs) started using the term AI agent. Researchers believe that AI agents can take a knowledge-based AI system and turn it into an action-taking system that can perform end-to-end tasks without human intervention.

The term gained prominence in the second half of 2024, and it is currently being treated as a panacea for all work-related problems. While there is some truth to this, is the technology really that transformative? The answer is a little complicated, but we’ll do our best to break it down and highlight the different aspects you should know about. Let’s dive in.

What is an AI agent?

Since the technology is still in its infancy, there is no agreed definition of what exactly constitutes an AI agent. IBM defines it as a system “that is capable of performing tasks autonomously on behalf of a user” by designing workflows and using tools. Similarly, Google, which last year announced its first AI agent called Project Mariner, describes it as a system that acts as an assistant to humans and helps them complete tasks.

A more comprehensive definition is given by Amazon, which describes it as “a software program that can interact with its environment, collect data, and use the data to perform self-defined tasks to accomplish predetermined goals. Humans set goals, but an AI agent independently chooses the best course of action to achieve those goals.”

Simply put, an AI agent can be thought of as an AI system that can perform an action instead of just telling the user about the action.

Breaking down the AI agent

A typical AI agent will have a large language model (LLM) as its brain. But it will also include other elements that enable it to put that intelligence into action. Typically, these additional parts are various sensors, mechanical components, encoders, or other software integrations.

Sensors enable an AI agent to collect data in various formats. These can be visual, sound, temperature, or electronic signals. Mechanical parts are typically used for embodied AI or robots that need to perform real-world actions such as lifting an object or moving it from one place to another. Encoders convert the various types of signals into information that can be processed by the LLM. Finally, software integrations give the agent the ability to perform tasks inside other applications and systems.
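The way these parts fit together can be illustrated with a minimal Python sketch. Everything in it is hypothetical: `toy_brain` is a hard-coded rule standing in for an LLM, and the names `Observation`, `encode`, and `Agent` are made up for illustration rather than taken from any real agent framework.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Observation:
    """A raw reading from a sensor (hypothetical structure)."""
    kind: str    # e.g. "temperature"
    raw: object  # the unprocessed value


def encode(obs: Observation) -> str:
    """Encoder: turn a raw signal into text the 'brain' can process."""
    return f"{obs.kind}={obs.raw}"


def toy_brain(prompt: str) -> str:
    """Stand-in for an LLM: a fixed rule, not a real model."""
    if prompt.startswith("temperature="):
        value = float(prompt.split("=", 1)[1])
        return "turn_on_fan" if value > 30.0 else "do_nothing"
    return "do_nothing"


class Agent:
    """Wires sensor input, encoder, brain, and integrations together."""

    def __init__(self, brain: Callable[[str], str],
                 integrations: Dict[str, Callable[[], str]]):
        self.brain = brain
        self.integrations = integrations

    def step(self, obs: Observation) -> str:
        action = self.brain(encode(obs))         # decide what to do
        handler = self.integrations.get(action)  # find a way to do it
        return handler() if handler else "no action taken"


agent = Agent(toy_brain, {"turn_on_fan": lambda: "fan switched on"})
print(agent.step(Observation("temperature", 35.0)))
```

The point of the sketch is the separation of roles: the brain only decides, and the agent can act only through the integrations it has been given.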

At this point, it is worth highlighting another important difference between AI models and AI agents. AI models have a pre-training dataset that forms their knowledge base, and they cannot reliably answer questions about anything that is not part of it. A good example of this was the early version of ChatGPT, which was not connected to the Internet and had a knowledge cut-off date. If it was asked a question about current affairs, it could not answer.

In contrast, AI agents, when integrated with related systems, can independently collect new data to solve problems that could not be solved from their existing training data alone. For example, Google’s Project Mariner can interact with a browser to find the best deals on smartwatches.
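The contrast can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: `TRAINING_DATA` stands in for a model’s frozen knowledge, and `fake_web_lookup` is a made-up stand-in for a live integration such as a browser; neither reflects how Project Mariner or any real product works internally.

```python
from typing import Callable, Dict, Optional

# Frozen "training data": everything the model knows, fixed at its cut-off.
TRAINING_DATA: Dict[str, str] = {"capital of france": "Paris"}


def static_model(question: str) -> Optional[str]:
    """A model without tools can only answer from its frozen knowledge."""
    return TRAINING_DATA.get(question.lower())


def fake_web_lookup(question: str) -> Optional[str]:
    """Hypothetical live data source; the result is invented for the demo."""
    live_data = {"best smartwatch deal": "Example Watch at ExampleShop"}
    return live_data.get(question.lower())


def agent_answer(question: str,
                 lookup_tool: Callable[[str], Optional[str]]) -> Optional[str]:
    """An agent tries its own knowledge first, then goes and gets new data."""
    answer = static_model(question)
    if answer is None:
        answer = lookup_tool(question)
    return answer


print(static_model("Best smartwatch deal"))                   # model alone
print(agent_answer("Best smartwatch deal", fake_web_lookup))  # agent with a tool
```

The model alone returns nothing for the smartwatch question, while the agent falls back on its tool and comes back with an answer.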

Another aspect of AI agents is the ability to handle complex tasks. AI agents are capable of advanced reasoning and thus can break a complex task into multiple simpler tasks and then complete them one after the other. This contextual understanding of a problem and the ability to figure out how to break it down is the core function of AI agents.

A good example of this is Gemini’s recently added Deep Research tool. Users can ask it to explain a technical or niche topic. The AI will then create a multi-step research plan, break the topic down into smaller parts, execute the plan by finding relevant research papers and articles, analyze the data it has collected, and compile everything into a detailed report.
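The plan, execute, and compile pattern described above can be sketched as follows. This is a simplified illustration, not how Deep Research is implemented: `make_plan` uses a fixed template where a real agent would ask an LLM to produce the plan, and `run_step` is a placeholder where a real agent would search the web or call tools.

```python
from typing import Callable, List


def make_plan(topic: str) -> List[str]:
    """Hypothetical planner: split a topic into fixed sub-tasks."""
    return [
        f"Define {topic}",
        f"Find sources on {topic}",
        f"Summarize findings on {topic}",
    ]


def run_step(step: str) -> str:
    """Placeholder executor; a real agent would search or call tools here."""
    return f"[done] {step}"


def research(topic: str,
             planner: Callable[[str], List[str]] = make_plan,
             executor: Callable[[str], str] = run_step) -> str:
    """Break the task down, complete each part in order, compile a report."""
    steps = planner(topic)
    results = [executor(step) for step in steps]  # one after the other
    return "\n".join([f"Report: {topic}"] + results)


print(research("AI agents"))
```

Passing the planner and executor in as parameters keeps the control loop separate from the intelligence, so either piece could be swapped for a model-backed version without changing the loop.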

Applications of AI agents

AI firms are pitching AI agents as a tool that can be used across industries and in many different scenarios. An agent can work as a voice assistant that performs device-specific functions (such as taking a picture or playing music). It can be built into an app or software and perform functions within it (such as buying a product through a browser-based agent). It can also be integrated into enterprise systems to detect fraud or find ways to improve various processes.

In addition, AI agents are touted as capable of transformative work in certain industries. In healthcare, they could be used for diagnosis, treatment recommendation, and drug discovery. In the automotive sector, they could be used to build self-driving cars. AI agents are also said to be able to pilot drones in disaster areas to collect and analyze data and offer actionable insights for rescue operations.

The technology also has applications in manufacturing through AI-powered robots, in the gaming industry as a game developer or as a non-playable character (NPC) within games, and in education to create personalized study plans and grade test papers. Applications in the fashion industry have also been suggested.

However, it’s important to note that while tech companies are marketing AI agents as a catch-all for all kinds of end-to-end intelligent automation, the current technology is largely task-specific rather than general-purpose, and limited to role-based or goal-based tools.

AI agents in 2025

With that being said, it’s important to ground our expectations and understand what we can realistically expect from AI agents in the current year. It is unlikely that AI agents will enter the workforce in any key sectors such as manufacturing, automobiles, healthcare, or education.

However, this year should mark the entry of AI agents into consumer electronics, mobile and desktop applications, as well as websites and platforms. Google’s Project Mariner, for example, could integrate with Google Chrome and help users make web purchases and search for files by the end of this year.

OpenAI is also rumored to be launching its own AI agent this year, which could further expand ChatGPT’s capabilities and allow it to perform certain actions on the user’s device and over the Internet. Anthropic’s computer use tool is also expected to see a global release and help users with their daily tasks on their devices.

Eventually, we should also see a shift where AI agents can mimic keystrokes, mouse movements, and clicks, and do more on devices. For example, by the end of the year, more agentic tools like the coding agent Devin could write code end-to-end, test it, find and fix bugs, and deploy it without human intervention. But counting on that within 2025 would be highly optimistic.

On the enterprise side, AI agents can play a major role in completing certain organizational tasks such as monitoring large amounts of data, generating analytical reports, and making recommendations and course corrections. They can also be used in some cybersecurity roles. Notably, Meta has said it already uses AI to ensure its guidelines are being followed. YouTube also uses AI to monitor copyright violations.

However, we don’t expect AI agents to take on any significant roles this year, as the technology is largely untested and its reliability remains in question. Businesses, especially public enterprises or those backed by large investors, are generally risk-averse and unlikely to give such systems access to sensitive data.

Problems with AI agents

Given the current trend in the tech space and the potential to disrupt a number of industries, it’s understandable why there is so much excitement about AI agents. However, looking beyond the rose-colored glasses, there are several issues with AI agents that need to be addressed before the technology can see mass adoption. If left unchecked, the technology could create several risks.

A major problem with AI agents is bias that originates in their training data and can lead to discriminatory results. This also highlights another issue: the lack of transparency. With complex algorithms and architectures, most AI agents are opaque systems where it is difficult to understand how and why they make certain decisions.

There are also security and privacy issues. From a security perspective, AI agents can be vulnerable to adversarial attacks, where malicious actors manipulate input data to trick the system. Additionally, because AI agents need to connect to multiple systems and collect large amounts of data to perform tasks, they also pose privacy risks.

With so many challenges, AI firms will have a tough job convincing businesses and individuals that the technology’s upsides outweigh its downsides. Regardless, it cannot be denied that AI agents will form a major part of AI announcements in 2025.


