Date: January 2, 2026
Environment: opencode Windows
Executive Summary
We are implementing a Multi-Agent "Conductor" Pattern where a primary agent orchestrates specialized sub-agents (Researcher, Writer) using the Task tool.
We are encountering a critical issue where tasks exceeding a short duration (~30-60s) return only a session_id string to the conductor, and the sub-agent session appears to terminate immediately, failing to complete the work (e.g., writing a file).
We have attempted to configure timeouts at the Provider and Agent level, but the behavior persists, suggesting a hardcoded timeout in the Task tool implementation or a signal handling issue that kills the child process.
1. Our Architecture
We are using the Conductor Pattern to keep context windows clean and enforce role separation.
Conductor (@content-factory):
- Role: Orchestration only. Never writes content itself.
- Action: Calls Task(subagent_type="researcher", prompt="...").
- Expectation: Waits for sub-agent to finish and return a string (e.g., "Report saved to path/to/file.md").
Sub-Agents (@researcher, @writer):
- Role: Execute heavy work (Search, Write 2k words).
- Action: Use tools (perplexity-search, write), then return a string result.
The Workflow Attempted
// Conductor calls Researcher
task({
subagent_type: "researcher",
prompt: "Research 'Government IT Trends'. Use perplexity. Save report to 'research.md'. Return path.",
description: "Deep Research"
})
2. The Issue
Behavior A: Short/Simple Tasks (Success)
- Task: "Write 3 bullet points." (Pure generation, < 10s)
- Result: The Task tool returns the actual string content.
- Outcome: Success.
Behavior B: Medium/Complex Tasks (Failure)
- Task: "Research X using Perplexity and write a report." (Tool use + Gen, > 60s)
- Result: The Task tool returns a metadata block:
<task_metadata>
session_id: ses_47ddfbc5affe...
</task_metadata>
- Critical Failure: The requested file (research.md) is NEVER created.
- Implication: The sub-agent did not just "detach"; it stopped running entirely.
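To make the two behaviors distinguishable in the conductor's own logic, the returned string can be classified before it is trusted. The following is a minimal sketch, assuming the failure case looks exactly like the metadata block shown above; `classifyTaskResult` and `TaskOutcome` are hypothetical names and are not part of opencode.

```typescript
// Hypothetical helper, not an opencode API. It classifies the raw string
// that the Task tool hands back to the conductor.
type TaskOutcome =
  | { kind: "completed"; text: string }
  | { kind: "metadata_only"; sessionId: string };

// Assumption: the metadata block has the shape observed in Behavior B.
const METADATA_BLOCK =
  /<task_metadata>\s*session_id:\s*(\S+)\s*<\/task_metadata>/;

function classifyTaskResult(raw: string): TaskOutcome {
  const match = METADATA_BLOCK.exec(raw);
  if (match === null) {
    // No metadata block at all: treat the string as the sub-agent's answer.
    return { kind: "completed", text: raw.trim() };
  }
  // Anything left once the block is stripped is real sub-agent output.
  const remainder = raw.replace(METADATA_BLOCK, "").trim();
  if (remainder.length > 0) {
    return { kind: "completed", text: remainder };
  }
  // Only a session ID came back: the Behavior B failure case.
  return { kind: "metadata_only", sessionId: match[1] };
}
```

With a check like this, the conductor can stop and report the failing session ID instead of assuming that research.md exists.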
3. Debugging & Attempts
We performed extensive testing to isolate the cause.
Attempt 1: Global Provider Timeout
Hypothesis: The model provider is timing out.
Action: Updated opencode.json with extreme timeouts:
"provider": {
"openai": { "options": { "timeout": 600000 } } // 10 minutes
}
Result: No change. Long tasks still return only the session_id after ~60s.
Attempt 2: Agent-Level Timeout
Hypothesis: The agent definition needs its own timeout.
Action: Added timeout: 600000 to .opencode/agent/researcher.md.
Result: No change.
Attempt 3: Permission Fixes
Hypothesis: The sub-agent crashed due to missing permissions.
Action: Discovered that @researcher lacked the perplexity-search permission and granted it.
Result: "Medium" tasks (e.g., "Find 5 stats") started working. However, "Full Research" (multiple searches run sequentially) still fails and returns only the session_id.
Attempt 4: "Fire and Forget" (Async Polling)
Hypothesis: Maybe the session_id return is "working as designed" (async), and we just need to wait.
Action:
- Conductor sends task: "Write a 500-word essay to test.md."
- Task tool returns session_id.
- Conductor waits 60 seconds (bash sleep 60).
- Conductor checks for test.md.
Result: File never appears.
Conclusion: The sub-agent session is killed or suspended the moment the Task tool "times out" and returns the session ID. It does not continue in the background.
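A single sleep 60 followed by one check cannot distinguish "the file never appears" from "the file appears after 61 seconds", so a bounded polling loop gives a stronger version of the same test. This is a rough sketch only; `pollUntil` is a hypothetical helper, and the file check in the usage comment assumes a Node.js runtime.

```typescript
// Hypothetical helper: repeatedly runs a caller-supplied check until it
// passes or the deadline expires. Resolves to true on success, false on timeout.
async function pollUntil(
  check: () => boolean | Promise<boolean>,
  timeoutMs: number,
  intervalMs: number,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    if (await check()) {
      return true;
    }
    if (Date.now() >= deadline) {
      return false;
    }
    // Sleep for one interval before checking again.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Usage sketch (assumes Node.js and `import { existsSync } from "node:fs"`):
//   const appeared = await pollUntil(() => existsSync("test.md"), 600_000, 5_000);
```

If the file is still missing after a 10-minute window, that supports the conclusion that the sub-agent session stopped rather than merely detached.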
4. Workaround: Chunking
The only reliable solution we found is to artificially break tasks into micro-chunks (30-60s max):
- "Find 5 stats" (Return)
- "Find 3 examples" (Return)
- "Write report using stats+examples" (Return)
This defeats the purpose of an autonomous "Deep Research" agent and forces the Conductor to micro-manage.
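For reference, the chunked workaround amounts to a sequential pipeline in which each short task receives the results of the earlier ones. The sketch below illustrates that control flow only; `RunTask`, `Chunk`, and `runChunks` are hypothetical names, and `RunTask` merely stands in for a Task tool call rather than reflecting the actual opencode API.

```typescript
// Hypothetical stand-in for a Task tool call; not an opencode API.
type RunTask = (subagentType: string, prompt: string) => Promise<string>;

interface Chunk {
  subagentType: string;
  // Builds this chunk's prompt from the results of all earlier chunks.
  buildPrompt: (previous: string[]) => string;
}

async function runChunks(chunks: Chunk[], runTask: RunTask): Promise<string[]> {
  const results: string[] = [];
  for (const chunk of chunks) {
    const prompt = chunk.buildPrompt(results);
    const result = await runTask(chunk.subagentType, prompt);
    // Assumption: a reply that starts with the metadata block is the
    // failure case described above, so stop instead of continuing blindly.
    if (result.trim().startsWith("<task_metadata>")) {
      throw new Error(`Chunk ${results.length + 1} returned only session metadata`);
    }
    results.push(result);
  }
  return results;
}
```

Each chunk stays under the observed 30-60s limit, but the conductor now owns the sequencing and the hand-off of intermediate results, which is the micro-management described above.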
5. Request for Help
- Is there a timeout parameter for the Task tool itself (i.e., a client-side wait time)?
- Why does the sub-agent die? If the Conductor stops waiting, shouldn't the sub-agent session continue running on the server?
- Is wait: true supported? We need a way to force the Task tool to wait indefinitely (or 10m+) for the sub-agent to finish.