
Sub-agents
An agent can call another agent. There's no built-in InvokeAgent primitive — you write a thin [Tool] that wraps scope.AgentAI.RunAgentAsync. That wrapper is the sub-agent pattern.
public class KBResearchTool
{
[Tool("Run a deep knowledge-base research pass. Use when a question needs cross-referencing " +
"multiple articles. Returns a ResearchBrief with citations.")]
public static async Task<string> Research(ToolScope scope,
[Parameter("The research question, in the user's words.", required: true)] string question)
{
var run = await scope.AgentAI.RunAgentAsync(
agentUID: AI_Agents.KB_Research,
userMessage: question,
userUID: scope.CurrentUser, // sub-agent inherits the caller's ACL
chatUID: scope.CurrentChat, // groups runs + stitches citations
cancellationToken: scope.CancellationToken);
if (run is null || run.Status != AgentRunStatus.Completed)
return "{\"error\":\"Research did not complete.\"}";
scope.SetToolCallDisplayName("Researched knowledge base");
return run.Result; // JSON — KB_Research has an OutputSchema
}
}
return new KBResearchTool();
Why split work across agents:
| One big agent | Planner + specialists |
|---|---|
| Knows every tool and every prompt | Sees one focused tool per specialist |
| Prompt budget grows with each task | Specialist prompts stay internal |
| Shallow — picks tools one at a time | A specialist can loop its tools many times |
| One model for everything | Each specialist pins its own model — small planner, large specialist |
Three principles hold for every workflow:
- The planner is small and stateless — its job is choosing which specialist to call, not doing the work.
- Specialists are deep and typed — larger model, an
OutputSchema, a focused tool set. - Identity flows everywhere —
scope.CurrentUserandscope.CurrentChatpropagate, so every nested call is ACL-filtered as the same user and grouped under one chat.
No recursion guard. The runtime won't stop agent A from invoking A. Encode a closed allowed-names list in the wrapper, cap delegations in the planner prompt, and run the outer endpoint in Pooling mode — each RunAgentAsync defaults to a 5-minute timeout.