The Pentagon is working to address vulnerabilities in its AI systems that could allow attackers to fool them with visual tricks or manipulated signals. Its research program, Guaranteeing AI Robustness Against Deception (GARD), has been investigating these “adversarial attacks” since 2022.
Researchers have shown that seemingly innocuous artifacts can fool AI into misidentifying objects, with potentially disastrous consequences on the battlefield. For example, an AI could mistake a bus full of passengers for a tank if the bus is tagged with the right “visual noise”.
These concerns come amid broader public unease about the Pentagon’s development of autonomous weapons. In response, the Department of Defense recently updated its AI development rules, emphasizing “responsible behavior” and requiring approval for all deployed systems.
The modestly funded GARD program has made progress in developing defenses against such attacks, and it has already provided some of its tools to the Defense Department’s newly formed Chief Digital and AI Office (CDAO).
Some advocacy groups remain concerned, however. They fear that AI-powered weapons could misread a situation and attack without cause, even in the absence of any deliberate manipulation of signals, and in doing so inadvertently escalate conflicts in already volatile regions.
The Pentagon is actively modernizing its arsenal with autonomous weapons, highlighting the urgent need to address these threats and ensure the responsible development of this technology.
According to a statement from the Defense Advanced Research Projects Agency, GARD researchers from Two Six Technologies, IBM, MITRE, the University of Chicago, and Google Research have developed a virtual testbed, a toolbox, a benchmarking dataset, and training materials that are now widely available to the research community:
- The Armory Virtual Platform, available on GitHub, serves as a “testbed” for researchers who need repeatable, scalable, and robust assessments of adversarial defenses.
- The Adversarial Robustness Toolbox (ART) provides tools for developers and researchers to defend and evaluate their ML models and applications against multiple adversarial threats (see the sketch after this list).
- The Adversarial Patches Rearranged In COnText (APRICOT) dataset enables reproducible research on the real-world effectiveness of physical adversarial patch attacks on object detection systems.
- The Google Research Self-Study Repository includes “test dummies” that represent common ideas and approaches for building defenses.
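To make the ART entry above more concrete, here is a minimal sketch, not GARD’s or DARPA’s own code, of how ART’s Python API can wrap an existing PyTorch classifier, generate adversarial examples with the Fast Gradient Sign Method, and compare clean versus adversarial accuracy. The toy model, the random placeholder data, and parameter values such as `eps=0.05` are illustrative assumptions, not anything published by the program.

```python
# Minimal sketch: evaluating a classifier against an evasion attack with ART.
import numpy as np
import torch
import torch.nn as nn

from art.estimators.classification import PyTorchClassifier
from art.attacks.evasion import FastGradientMethod

# Hypothetical toy CNN standing in for a real object-recognition model.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Flatten(), nn.Linear(16 * 32 * 32, 10),
)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Wrap the model so ART's attacks and defenses can operate on it.
classifier = PyTorchClassifier(
    model=model,
    loss=loss_fn,
    optimizer=optimizer,
    input_shape=(3, 32, 32),
    nb_classes=10,
    clip_values=(0.0, 1.0),
)

# Placeholder batch of images and one-hot labels (random data for illustration).
x = np.random.rand(8, 3, 32, 32).astype(np.float32)
y = np.eye(10)[np.random.randint(0, 10, size=8)].astype(np.float32)

# Craft adversarial examples with the Fast Gradient Sign Method.
attack = FastGradientMethod(estimator=classifier, eps=0.05)
x_adv = attack.generate(x=x)

# Compare accuracy on clean vs. adversarially perturbed inputs.
clean_acc = (classifier.predict(x).argmax(1) == y.argmax(1)).mean()
adv_acc = (classifier.predict(x_adv).argmax(1) == y.argmax(1)).mean()
print(f"clean accuracy: {clean_acc:.2f}, adversarial accuracy: {adv_acc:.2f}")
```

The same wrapped classifier can also be passed to ART’s defensive components, such as adversarial training or input preprocessing, so that attack and defense evaluations share a single interface.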