Multi-Agent Security: Architecture Matters More Than You Think

Summary

As multi-agent systems (MAS) move from research labs into production deployments, a critical question has remained largely unexplored: how do architectural choices affect security? This paper from researchers at the University of Oxford and other institutions provides the first systematic empirical answer. The authors evaluate 13 architectural configurations across three agentic environments—browser automation, desktop control, and code generation—using a novel stagewise evaluation framework that distinguishes four distinct failure modes: planning refusal, execution-stage interception, partial harmful execution, and successful attack completion.
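
To make the four-way distinction concrete, here is a minimal sketch of how a stagewise outcome could be assigned to one attack episode and then aggregated. The `Trace` record, its field names, and the decision rule in `classify` are assumptions made for illustration; the paper's own definitions and tooling may differ.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    """The four failure modes named in the summary above."""
    PLANNING_REFUSAL = "planning refusal"
    EXECUTION_INTERCEPTION = "execution-stage interception"
    PARTIAL_HARMFUL_EXECUTION = "partial harmful execution"
    ATTACK_COMPLETED = "successful attack completion"


@dataclass(frozen=True)
class Trace:
    # Hypothetical per-episode record; these fields are assumptions.
    refused_at_planning: bool   # the system declined before acting
    harmful_steps_done: int     # harmful actions actually executed
    harmful_steps_total: int    # harmful actions the attack required


def classify(trace: Trace) -> Outcome:
    """Assign exactly one outcome to an episode (assumed decision rule)."""
    if trace.refused_at_planning:
        return Outcome.PLANNING_REFUSAL
    if trace.harmful_steps_done == 0:
        return Outcome.EXECUTION_INTERCEPTION
    if trace.harmful_steps_done < trace.harmful_steps_total:
        return Outcome.PARTIAL_HARMFUL_EXECUTION
    return Outcome.ATTACK_COMPLETED


def stage_rates(traces: list[Trace]) -> dict[Outcome, float]:
    """Fraction of episodes that ended in each outcome."""
    if not traces:
        raise ValueError("need at least one trace")
    counts = Counter(classify(t) for t in traces)
    return {outcome: counts[outcome] / len(traces) for outcome in Outcome}
```

Reporting all four rates instead of a single pass/fail number shows whether an architecture stops attacks early, stops them late, or only limits the damage.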

The results are sobering for the MAS community. In the majority of configurations tested, multi-agent architectures proved more vulnerable than standalone agents, with attack success rates varying by up to 3.8x at comparable or higher benign accuracy. Critically, no single design emerged as universally safer. The security posture depends on three key architectural decisions: how authority and responsibility are allocated among agents (agent roles), how and when agents communicate (communication topology), and what context and state visibility each agent has access to (memory). This suggests that security must be treated as a first-class architectural concern, not an afterthought.
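
One plausible reading of "varying by up to 3.8x at comparable or higher benign accuracy" is the ratio between the highest and lowest attack success rate among configurations whose benign accuracy is not meaningfully below a baseline. The sketch below computes that ratio. The `ConfigResult` record, the `tolerance` threshold, and the function name are assumptions, and no numbers from the paper are used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfigResult:
    # Hypothetical summary of one architectural configuration.
    name: str
    attack_successes: int
    attack_attempts: int
    benign_accuracy: float  # fraction of benign tasks solved, 0.0 to 1.0

    @property
    def attack_success_rate(self) -> float:
        return self.attack_successes / self.attack_attempts


def asr_spread(results: list[ConfigResult], baseline: ConfigResult,
               tolerance: float = 0.02) -> float:
    """Ratio of highest to lowest attack success rate among configurations
    whose benign accuracy is at least the baseline's minus `tolerance`."""
    floor = baseline.benign_accuracy - tolerance
    rates = [r.attack_success_rate for r in results
             if r.benign_accuracy >= floor]
    if not rates or min(rates) == 0:
        raise ValueError("spread is undefined without a nonzero lowest rate")
    return max(rates) / min(rates)
```

Filtering on benign accuracy first matters: a configuration that resists attacks only because it also fails ordinary tasks should not count as the safer design.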

Key Contributions

First systematic empirical study of how MAS architectural decisions shape the security-performance tradeoff across multiple environments
Novel stagewise evaluation framework that distinguishes four distinct failure modes for granular security analysis
Demonstration that multi-agent architectures are more vulnerable than standalone agents in most configurations, with attack success rates varying by up to 3.8x
Identification of three key architectural dimensions—agent roles, communication topology, and memory—that significantly impact security posture
Evidence that no single MAS design is universally safer, challenging assumptions about optimal architectures

Implications

For Researchers

This work establishes a rigorous foundation for studying MAS security as a distinct research area. The stagewise evaluation framework provides a reusable methodology for future studies, while the finding that architectural choices create up to 3.8x variation in attack success rates opens new research directions in architecture-aware security testing. Researchers should move beyond single-agent threat models and develop defensive architectural patterns that minimize attack surface while maintaining coordination benefits.

For Developers

The paper’s core message is clear: architectural decisions are security decisions. When building multi-agent systems, developers should adopt stagewise evaluation to see at which stage attacks are stopped, or not stopped, in their specific architecture. The tradeoffs between agent roles (hierarchical vs. flat), communication topologies (broadcast vs. point-to-point), and memory designs (shared vs. private) must be evaluated against the specific threat model and deployment environment. Because no single design is universally safer, security testing must be architecture-specific.
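
Architecture-specific testing can be sketched as a sweep over the three dimensions named above. The grid below takes only the two options per dimension mentioned in this section, so it yields eight combinations; it is not the paper's set of 13 configurations. The `Architecture` record and the caller-supplied `run_attack_suite` callable are hypothetical.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable


@dataclass(frozen=True)
class Architecture:
    roles: str      # how authority is allocated among agents
    topology: str   # how and when agents communicate
    memory: str     # what context and state each agent can see


ROLES = ("hierarchical", "flat")
TOPOLOGIES = ("broadcast", "point-to-point")
MEMORY_DESIGNS = ("shared", "private")


def enumerate_architectures() -> list[Architecture]:
    """Every combination of the illustrative options above."""
    return [Architecture(r, t, m)
            for r, t, m in product(ROLES, TOPOLOGIES, MEMORY_DESIGNS)]


def rank_by_attack_success(
    architectures: list[Architecture],
    run_attack_suite: Callable[[Architecture], float],
) -> list[tuple[Architecture, float]]:
    """Score each architecture with a caller-supplied evaluation
    (hypothetical: returns an attack success rate) and sort ascending."""
    scored = [(arch, run_attack_suite(arch)) for arch in architectures]
    return sorted(scored, key=lambda pair: pair[1])
```

The ranking is only meaningful for the threat model encoded in `run_attack_suite`; a different attack suite or deployment environment may reorder it.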

For Users

End users of multi-agent systems should be aware that these systems can introduce security risks that do not arise with single agents, even if each individual agent is secure. The architectural choices made by developers directly affect the system’s vulnerability to attacks. Users should demand transparency about architectural security properties and understand that the optimal design depends on the specific use case and threat environment.

References

https://arxiv.org/abs/2604.23459v1
