Translate high-signal Chinese AI content into English.
Signals connect ecosystem movement back to tools and agents. Each signal records its source, thesis, why it matters, related objects, and an English reading layer.
Read signals as ecosystem evidence.
Each card gives the source, permission posture, thesis, and links into affected tools or agents.
COAgents: Multi-Agent Framework Masters VRP Search Space
COAgents introduces a cooperative multi-agent framework for solving Vehicle Routing Problems (VRP) by modeling the search process as a dynamically constructed graph. Three specialized agents—Node Selection, Move Selection, and Jump—collaborate to guide intensification and exploration, achieving state-of-the-art results on VRPTW benchmarks and competitive performance on CVRP.
Contractual Skills: Making Enterprise AI Agents Governable
This paper introduces 'contractual skills'—a GovernSpec-inspired framework for organizing SKILL.md files as readable, inspectable task contracts. Through two offline experiments covering 960 text-generation outputs and 192 simulated tool-call records across multiple models, the author demonstrates that contractual skills improve governance and checkability over baselines, but do not significantly boost raw generation quality. The framework clarifies the boundary between skills, YAML contracts, MCP surfaces, tool adapters, and runtime guardrails.
Orchard: Open-Source Framework for Scalable Agent Training
Orchard introduces a lightweight, open-source environment service (Orchard Env) that provides reusable primitives for sandbox lifecycle management across agent domains. Built on this layer, the authors demonstrate three agentic modeling recipes—Orchard-SWE (coding), Orchard-GUI (computer use), and Orchard-Claw (personal assistant)—that achieve state-of-the-art results among open-source models using dramatically fewer training trajectories than prior approaches.
LLM Agents Self-Adapt Security for IoT at the Edge
ASPO introduces a self-adaptive multi-agent architecture that integrates LLM-based reasoning with deterministic enforcement within a MAPE-K control loop for IoT security pattern selection. The framework separates stochastic decision generation from execution, achieving 100% conflict-free activation and consistent resource feasibility across workloads while reducing tail latency and energy overheads by over 20%.
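The split between stochastic proposal and deterministic enforcement can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Pattern` type, the `CATALOGUE` entries, the cost units, and the `enforce` function are hypothetical and are not taken from the ASPO paper. The idea shown is only that a model may propose any list of patterns, while a rule-based filter decides what is actually activated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    name: str
    cost: int                            # hypothetical resource units
    conflicts: frozenset = frozenset()   # names this pattern cannot coexist with


# Hypothetical catalogue for illustration; not taken from the paper.
CATALOGUE = {
    "tls": Pattern("tls", 3),
    "dtls": Pattern("dtls", 2, frozenset({"tls"})),
    "rate_limit": Pattern("rate_limit", 1),
    "deep_inspect": Pattern("deep_inspect", 5),
}


def enforce(proposed, budget):
    """Deterministically filter a proposal.

    Unknown, duplicate, conflicting, or over-budget patterns are dropped,
    so the activated set is conflict-free and fits the budget by construction.
    """
    active, used = [], 0
    for name in proposed:
        pattern = CATALOGUE.get(name)
        if pattern is None or name in active:
            continue
        clashes = any(
            other in pattern.conflicts or name in CATALOGUE[other].conflicts
            for other in active
        )
        if clashes or used + pattern.cost > budget:
            continue
        active.append(name)
        used += pattern.cost
    return active


# Stand-in for an LLM proposal; a real system would obtain this from a model.
proposal = ["tls", "dtls", "bogus", "rate_limit", "deep_inspect"]
print(enforce(proposal, budget=5))  # ['tls', 'rate_limit']
```

In this sketch the proposal order acts as a priority order, which is a design choice of the example rather than a claim about the paper.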
LLM Multi-Agent System Automates Topology Optimization
This paper introduces TopOptAgents, a multi-agent system that automates topology optimization—a complex engineering design process—using six LLM-based agents that collaborate through iterative self-refinement cycles. The framework handles problem formulation, validation, code generation, execution, and quality assessment, successfully producing converged designs even for problem types where single LLMs fail, particularly those with sparse literature coverage.
When AI Agents Overtrust Bad Evidence: A New Benchmark
This paper introduces EnvTrustBench, a systematic framework for benchmarking when LLM agents fail to verify environmental evidence—treating stale, incorrect, or malicious observations as sufficient for action. Testing across 6 LLM backbones and 5 scaffolds reveals that evidence-grounding defects (EGDs) are pervasive, highlighting a critical reliability gap in current agent architectures.
Multi-Agent Security: Architecture Matters More Than You Think
This paper presents the first systematic empirical study of how architectural decisions in multi-agent systems (MAS) affect the tradeoff between task performance and security. Across three environments and 13 configurations, the authors find that MAS designs are generally more vulnerable than single agents, with attack success rates varying by up to 3.8x depending on architecture choices like agent roles, communication topology, and memory design.
Claw AI Lab: From Prompt to Interactive AI Research Team
Claw AI Lab transforms autonomous research from a black-box pipeline into an interactive, multi-agent laboratory. Users can instantiate a full research team from a single prompt, monitor progress in real time, inspect artifacts, and roll back experiments—all through a unified dashboard. Its Claw-Code Harness bridges the gap between code execution and paper generation, significantly improving experimental completeness and result integrity.
Formal Skill: Executable Runtime Skills for LLM Agents
This paper introduces Formal Skill, a runtime-native abstraction that moves reusable agent procedures from verbose prompt text into executable state machines with hook-governed policies. Implemented in the open-source FairyClaw runtime, it achieves competitive accuracy on Harness-Bench while using significantly fewer tokens, particularly excelling on tasks requiring structured workflow enforcement and policy compliance.
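As a rough illustration of a procedure expressed as an executable state machine with policy hooks, consider the sketch below. It is not the Formal Skill or FairyClaw API: the `SkillMachine` class, the hook signature, and the example workflow are all hypothetical names chosen for this example.

```python
class SkillMachine:
    """A tiny state machine whose transitions are gated by policy hooks."""

    def __init__(self, transitions, start, hooks=()):
        self.transitions = transitions   # {(state, event): next_state}
        self.state = start
        self.hooks = list(hooks)         # each hook(state, event, next_state) -> bool

    def fire(self, event):
        key = (self.state, event)
        if key not in self.transitions:
            raise ValueError(f"event {event!r} not allowed in state {self.state!r}")
        next_state = self.transitions[key]
        for hook in self.hooks:
            if not hook(self.state, event, next_state):
                raise PermissionError(f"hook blocked {self.state!r} -> {next_state!r}")
        self.state = next_state
        return next_state


def require_approval(approved):
    """Hypothetical policy hook: only allow reaching 'deployed' when approved."""
    def hook(state, event, next_state):
        return next_state != "deployed" or approved
    return hook


# Hypothetical workflow: deployment is only reachable after testing.
TRANSITIONS = {
    ("draft", "test"): "tested",
    ("tested", "deploy"): "deployed",
}

machine = SkillMachine(TRANSITIONS, "draft", hooks=[require_approval(False)])
machine.fire("test")
print(machine.state)  # tested
```

Because the workflow lives in a transition table rather than in prompt text, an out-of-order step (for example `deploy` from `draft`) fails at the runtime instead of relying on the model to follow instructions.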
Hybrid LLM-RL Red Teaming Framework Exposes AI Security Gaps
This paper introduces an autonomous red teaming framework that combines large language models with reinforcement learning to generate adaptive, multi-stage attack campaigns against AI-enabled security systems. Testing in high-fidelity enterprise simulations reveals that standalone LLM agents cannot sustain complex attacks, while hybrid LLM-RL approaches achieve significantly higher compromise rates, exposing critical vulnerabilities in current AI security defenses.
AgentCo-op: Retrieval-Based Multi-Agent Workflow Synthesis
AgentCo-op introduces a retrieval-based synthesis framework that dynamically composes existing agents, tools, and skills into multi-agent workflows using typed artifact handoffs. It applies bounded local repair to fix only failing components, achieving strong benchmark results while reducing costs and enabling open-world scientific collaboration without redesigning existing agents.
ColPackAgent: MCP-Powered AI for Colloidal Packing Simulations
ColPackAgent is an agent framework that pairs a domain-specific Python package (colpack) with a Model Context Protocol (MCP) tool server and a portable agent skill to autonomously execute colloidal packing simulations. It demonstrates that general-purpose LLMs can reliably run structured scientific workflows when given dedicated tools and workflow instructions, rather than just describing them.
EngiAI: Multi-Agent Benchmark Reveals LLM Gaps in Engineering Design
EngiAI introduces a multi-agent benchmark suite for LLM-driven engineering design, testing agents across workflow, RAG, and HPC dimensions. Results show proprietary models excel on simple tasks but all models struggle with conditional branching and long-running orchestration, revealing critical gaps for real-world engineering deployment.
Layered Security Review of Autonomous Agent Frameworks
This survey provides the first layered review of security risks and defenses in autonomous agent frameworks built on LLMs. By organizing threats across four layers—context/instruction, tool/action, state/persistence, and ecosystem/automation—the authors reveal how attacks can propagate from manipulated inputs to persistent state contamination and ecosystem-level impact, using OpenClaw as a case study.
When Skills Hurt: Negative Result for CTF Agents
This paper presents a controlled study of an MCP-grounded autonomous Capture-the-Flag (CTF) agent, showing that adding curated procedural knowledge (Skills) yields no statistically significant improvement over a no-Skills baseline. The authors argue that when an agent's tool layer returns strict, low-latency, schema-validated observations, the environment itself provides the correction signal that Skills are meant to supply, making them redundant overhead—and in some cases, actively harmful.
Agent Memory Goes Infrastructure: Memori at 14K Stars
Memori represents a new category: agent-native memory infrastructure. It's LLM-agnostic, turning agent execution traces and conversations into structured, persistent state for production systems. At 14K stars, it signals that memory is becoming a standalone infrastructure concern, separate from the agent runtime.
China Agent Ecosystem: agentUniverse Framework and Chinese Developer Tools
Two Chinese-origin projects highlight parallel development in the agent ecosystem. agentUniverse is a multi-agent framework that lets developers build collaborative LLM applications. indie-hacker-tools-plus is a curated Chinese-language tool stack for independent developers, including AI agent tools. Both signal that the Chinese AI agent ecosystem is building its own infrastructure layer.
MCP Security Goes Architectural: Prompts Don't Protect
Two papers this week signal a shift in MCP security: from trusting LLMs to enforce rules via prompts, to architectural enforcement at the protocol layer. Rohith Uppala demonstrates that LLMs will select unauthorized tools in adversarial scenarios regardless of prompt instructions. A separate paper from the ADR team presents the first production-proven enterprise MCP security framework.
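The difference between prompt-level and architectural enforcement can be sketched as a gateway that checks every tool call against an allowlist before dispatching it. This is a simplified assumption, not the design from either paper and not the MCP SDK: `ALLOWED_TOOLS`, `ToolDenied`, `dispatch`, and the tool names are hypothetical.

```python
class ToolDenied(Exception):
    """Raised when a caller requests a tool outside its allowlist."""


# Hypothetical per-role allowlists, enforced outside the model.
ALLOWED_TOOLS = {
    "analyst": {"search_docs"},
    "admin": {"search_docs", "delete_file"},
}

# Hypothetical tool registry standing in for real tool servers.
REGISTRY = {
    "search_docs": lambda query: f"results for {query}",
    "delete_file": lambda path: f"deleted {path}",
}


def dispatch(role, tool_name, args, registry=REGISTRY):
    """Run a tool only if the role's allowlist permits it.

    The check does not depend on what the model was told in its prompt,
    so a manipulated model cannot reach a tool it is not entitled to.
    """
    if tool_name not in ALLOWED_TOOLS.get(role, set()):
        raise ToolDenied(f"{role!r} may not call {tool_name!r}")
    return registry[tool_name](**args)


print(dispatch("analyst", "search_docs", {"query": "mcp"}))  # results for mcp
```

A call such as `dispatch("analyst", "delete_file", {"path": "x"})` raises `ToolDenied` regardless of any instruction the model received.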
OpenClaw Agent Ecosystem Hits 162 Production Templates
Three community-curated awesome-lists have emerged as ecosystem hubs. The OpenClaw agent template collection now hosts 162 production-ready SOUL.md configurations across 19 categories. The Claude Code awesome-list has reached 44K stars, making it the largest agent-specific resource index. A new awesome-agent-skills list adds a dedicated skill discovery layer.
The Tool-Calling Training Gap: FireFly and EnvFactory Attack the Bottleneck
Training LLMs to reliably call tools remains a bottleneck. Two new papers present complementary solutions: FireFly generates verified tool-call trajectories from real APIs with ground-truth outcomes, while EnvFactory synthesizes executable environments and uses reinforcement learning to scale agent training. Together they address the core data problem that limits tool-using agent reliability.
China's AI Agent Education Ecosystem Goes Systematic: The ai-agents-from-zero Phenomenon
A single Chinese-language GitHub repository now packages the entire AI agent learning path: LLM fundamentals and prompt engineering, LangChain/LangGraph, the Coze and Dify low-code platforms, MCP implementation, enterprise RAG workflows, fine-tuning with LoRA/QLoRA, and an interview question bank aligned with real job descriptions. With 1,100+ stars and growing, it signals that China's agent developer education is consolidating around Python-first, framework-deep, project-complete curricula.
Claude Code as Engineering Tool: Chinese Developers Move Beyond Code Generation
A companion repository to a GeekTime (极客时间) column demonstrates how Chinese developers use Claude Code for full engineering workflows (architecture design, code review, testing, deployment, and documentation), not just code generation. It represents a maturation in how Chinese developers think about AI coding tools: from 'write this function' to 'own this feature end-to-end.'
JeecgBoot: AI Skills Meet Low-Code — 46,000 Stars for China's AI-Native Development Platform
JeecgBoot is China's most popular AI-powered low-code platform, with 46,000+ GitHub stars. It combines traditional low-code generation with AI Skills that can generate entire systems from natural language descriptions: a single sentence is enough to draw workflows, design forms, and scaffold complete applications. With built-in AI chat, knowledge bases, MCP plugin support, and compatibility with mainstream Chinese and Western LLMs, it represents the convergence of AI agents and enterprise application development in China.
MCP Protocol Adoption in China: From Experimental to Production Infrastructure
Analysis of the Chinese-language MCP ecosystem on GitHub reveals that MCP adoption has moved beyond experimental projects into production infrastructure. Chinese developers are building MCP servers for domestic services (WeChat, DingTalk, Feishu, Baidu, Alibaba Cloud), enterprise databases (MySQL, PostgreSQL, MongoDB), and internal tools. Major Chinese platforms, including Coze, Dify, JeecgBoot, and Trae, now include MCP support. The number of Chinese-language MCP repositories with 300+ stars has grown to over 100, signaling mainstream adoption.
OpenOcta: Enterprise-Grade Open-Source Agent Platform Built for Chinese Teams
OpenOcta is an open-source enterprise agent platform purpose-built for Chinese development teams. It packages multi-agent orchestration, tool integration, knowledge management, and observability into a single deployable system, with native support for Chinese enterprise workflows including WeCom, DingTalk, and Feishu integration. With 2,500+ GitHub stars, it signals growing demand for self-hosted, China-specific agent infrastructure.
Feishu-Native AI Agent: Proma Brings Proactive Agents to Chinese Workplace
Proma is an open-source proactive AI agent built on the Claude Agent SDK, designed to live inside Feishu (Lark) group chats. It demonstrates a new paradigm where agents aren't summoned — they proactively participate in conversations, suggest actions, and complete multi-step workflows. With native Feishu integration and flexible model provider support, it's a blueprint for how Chinese workplace agents will operate.
QwenPaw Packages Personal Agents Around Skills, Channels, Memory, and Safety
QwenPaw is a China-origin personal AI assistant that emphasizes local or cloud deployment, skills, multi-agent collaboration, multi-channel access, memory, and safety controls.
Qwen-Agent Turns MCP into a First-Class Agent Framework Capability
Qwen-Agent presents MCP as part of a broader agent framework that also includes function calling, code interpreter, RAG, GUI apps, and model-service integration.