Here comes the AI ​​bugs.

As creative AI systems like OpenAI’s ChatGPT and Google’s Gemini become more advanced, they are increasingly being worked on. Startups and tech companies are building AI agents and ecosystems on top of systems that can complete boring tasks for you: think automatically booking a calendar and possibly buying products. But as tools are given more freedom, it also increases the number of possible ways to attack them.

Now, in a demonstration of the dangers of connected, autonomous AI ecosystems, a group of researchers have created what they claim are the first artificial AI bugs—ones that can spread from one system to another. are, potentially stealing data or deploying malware. Process “What this basically means is that you now have the ability to launch or execute a new type of cyberattack that hasn’t been seen before,” says Ben Nassi, the Cornell Tech researcher behind the study.

Nasi, along with fellow researchers Steve Cohen and Ron Button, created a worm called Morse II, a nod to the original Morse computer worm that wreaked havoc on the Internet in 1988. Researchers explain how an AI worm can attack a generative AI email assistant to steal data from emails and send spam messages—breaking some of the security safeguards in ChatGPT and Gemini in the process.

The research, which was conducted in a test environment and not against a publicly available email assistant, comes as large language models (LLMs) are becoming increasingly multimodal, with images and video. Can also generate text. While AI-producing bugs have yet to be seen in the wild, several researchers say they are a security risk that startups, developers and tech companies should be concerned about.

Most creative AI systems work through feed prompts — text instructions that tell tools to answer a question or create an image. However, these signals can also be weaponized against the system. Jailbreaks can make a system ignore its security rules and spread toxic or malicious content, while instant injection attacks can give secret instructions to a chatbot. For example, an attacker could hide text on a webpage telling LLM to act as a scammer and ask for your bank details.

To create a generative AI worm, the researchers turned to a so-called “anti-self-replicating prompt.” It’s a prompt that triggers the generative AI model to output, in response, another prompt, the researchers say. In short, the AI ​​system is asked to formulate a set of further instructions in its responses. Researchers say it is broadly similar to traditional SQL injection and buffer overflow attacks.

To demonstrate how the worm might work, the researchers built an email system that can send and receive messages using generative AI, plugging into ChatGPT, Gemini, and the open source LLM, LLaVA . They then found two ways to exploit the system — by using a text-based self-repeating prompt and by embedding a self-repeating prompt inside an image file.

Leave a Comment