Forget chatbots. AI agents are the future.

This week, a startup called Cognition AI caused a stir with the release of a demo showing an artificial intelligence program called Devin performing work usually done by well-paid software engineers. Chatbots like ChatGPT and Gemini can generate code, but Devin goes further: it plans how to solve a problem, writes the code, and then tests and implements it.

Devin’s creators have branded it an “AI software developer.” When asked to test how Meta’s open-source language model Llama 2 performed when accessed via different hosting companies, Devin laid out a step-by-step plan for the project, wrote the code needed to access the APIs and run the benchmarking tests, and built a website summarizing the results.

Staged demos are always difficult to judge, but Cognition shows Devin handling a wide range of impressive tasks. It amazed investors and engineers on X, drawing plenty of praise and even inspiring some memes, including predictions that Devin would soon be responsible for a wave of tech industry layoffs.

Devin is just the latest and most striking example of a trend I’ve been tracking for some time: the emergence of AI agents that can take action to solve a problem presented by a human, rather than just providing answers or advice. A few months ago I tried Auto-GPT, an open-source program that attempts to do useful things by taking actions on a person’s computer and the web. Recently I tested another program called vimGPT to see how the visual skills of new AI models can help these agents browse the web more efficiently.

I was impressed by my experiences with these agents. Yet for now, just like the language models that power them, they make quite a few mistakes. And when a piece of software is performing an action, not just generating text, one mistake can mean total failure, with potentially costly or dangerous consequences. Narrowing the range of tasks an agent can perform to, say, a specific set of software engineering chores may seem like a smart way to reduce the error rate, but there are still many potential ways to fail.

Startups aren’t the only ones building AI agents. Earlier this week I wrote about an agent called SIMA, developed by Google DeepMind, that plays video games, including the truly bonkers title Goat Simulator 3. By watching human players, SIMA learned how to perform more than 600 complex tasks, such as chopping down a tree or shooting an asteroid. Most importantly, it can perform many of these actions successfully even in an unfamiliar game. Google DeepMind calls it a “generalist.”

I suspect Google hopes these agents will eventually go beyond video games, perhaps helping users navigate the web or running software for them. But video games make a good sandbox for developing agents, providing complex environments in which they can be tested and improved. “Making them more accurate is something we’re actively working on,” Tim Harley, a research scientist at Google DeepMind, told me. “We have different ideas.”

You can expect a lot more news about AI agents in the coming months. Google DeepMind CEO Demis Hassabis recently told me that he plans to combine large language models with the work his company has done training AI programs to play video games, in order to develop more capable and reliable agents. “It’s definitely a huge area. We’re investing a lot in that direction, and I imagine others are as well,” Hassabis said, adding that he expects a further shift when these systems become more agent-like.
