Task Tool Timeouts & Early Termination in Multi-Agent Conductor Pattern #6792

@DaveW001

Question

Date: January 2, 2026
Environment: opencode Windows

Executive Summary

We are implementing a Multi-Agent "Conductor" Pattern where a primary agent orchestrates specialized sub-agents (Researcher, Writer) using the Task tool.

We are encountering a critical issue where tasks exceeding a short duration (~30-60s) return only a session_id string to the conductor, and the sub-agent session appears to terminate immediately, failing to complete the work (e.g., writing a file).

We have attempted to configure timeouts at the Provider and Agent level, but the behavior persists, suggesting a hardcoded timeout in the Task tool implementation or a signal handling issue that kills the child process.

1. Our Architecture

We are using the Conductor Pattern to keep context windows clean and enforce role separation.

Conductor (@content-factory):

  • Role: Orchestration only. Never writes content itself.
  • Action: Calls Task(subagent_type="researcher", prompt="...").
  • Expectation: Waits for sub-agent to finish and return a string (e.g., "Report saved to path/to/file.md").

Sub-Agents (@researcher, @writer):

  • Role: Execute heavy work (Search, Write 2k words).
  • Action: Use tools (perplexity-search, write), then return a string result.

The Workflow Attempted

// Conductor calls Researcher
task({
  subagent_type: "researcher",
  prompt: "Research 'Government IT Trends'. Use perplexity. Save report to 'research.md'. Return path.",
  description: "Deep Research"
})

2. The Issue

Behavior A: Short/Simple Tasks (Success)

  • Task: "Write 3 bullet points." (Pure generation, < 10s)
  • Result: The Task tool returns the actual string content.
  • Outcome: Success.

Behavior B: Medium/Complex Tasks (Failure)

  • Task: "Research X using Perplexity and write a report." (Tool use + Gen, > 60s)
  • Result: The Task tool returns a metadata block:
<task_metadata>
session_id: ses_47ddfbc5affe...
</task_metadata>
  • Critical Failure: The requested file (research.md) is NEVER created.
  • Implication: The sub-agent did not just "detach"; it stopped running entirely.
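
To tell Behavior A from Behavior B programmatically, the Conductor could check whether the returned value is nothing but a task_metadata block. The following is a minimal sketch, assuming the Task tool result arrives as a plain string in the format quoted above. `parseTaskResult` and the `TaskResult` shape are hypothetical names for illustration, not part of opencode.

```typescript
// Hypothetical helper: classify a Task tool return value.
// Assumes the metadata block looks like the one quoted in this report.
type TaskResult =
  | { kind: "content"; text: string }
  | { kind: "metadata_only"; sessionId: string | null };

function parseTaskResult(raw: string): TaskResult {
  const block = /<task_metadata>([\s\S]*?)<\/task_metadata>/.exec(raw);
  if (block === null) {
    // No metadata block at all: treat the whole string as sub-agent output.
    return { kind: "content", text: raw.trim() };
  }
  // Anything outside the metadata block counts as real sub-agent output.
  const rest = raw.replace(block[0], "").trim();
  if (rest.length > 0) {
    return { kind: "content", text: rest };
  }
  // Only the metadata block came back: extract the session id if present.
  const id = /session_id:\s*(\S+)/.exec(block[1] ?? "");
  return { kind: "metadata_only", sessionId: id?.[1] ?? null };
}
```

A `metadata_only` result matches the failure case described here, so the Conductor can stop and report it instead of assuming the file exists.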

3. Debugging & Attempts

We performed extensive testing to isolate the cause.

Attempt 1: Global Provider Timeout
Hypothesis: The model provider is timing out.
Action: Updated opencode.json with extreme timeouts:

"provider": {
  "openai": { "options": { "timeout": 600000 } } // 10 minutes
}

Result: No change. Long tasks still return only a session_id after ~60s.

Attempt 2: Agent-Level Timeout
Hypothesis: The agent definition needs its own timeout.
Action: Added timeout: 600000 to .opencode/agent/researcher.md.
Result: No change.

Attempt 3: Permission Fixes
Hypothesis: The sub-agent crashed due to missing permissions.
Action: Discovered that @researcher lacked the perplexity-search permission and fixed it.
Result: "Medium" tasks (e.g., "Find 5 stats") started working. However, "Full Research" (running multiple searches sequentially) still fails and returns only a session_id.

Attempt 4: "Fire and Forget" (Async Polling)
Hypothesis: Maybe the session_id return is "working as designed" (async), and we just need to wait.
Action:

  • Conductor sends task: "Write a 500-word essay to test.md."
  • Task tool returns session_id.
  • Conductor waits 60 seconds (bash sleep 60).
  • Conductor checks for test.md.

Result: File never appears.
Conclusion: The sub-agent session is killed or suspended the moment the Task tool "times out" and returns the session ID. It does not continue in the background.
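
A single fixed 60-second sleep can miss a file that appears later, so a more thorough version of this test would poll repeatedly before giving up. Below is a minimal sketch of such a loop. `pollUntil` is a hypothetical helper, and both the `check` callback (for example, "does test.md exist?") and the `sleep` function are assumptions supplied by the caller rather than anything opencode provides.

```typescript
// Hypothetical polling helper; `check` and `sleep` are supplied by the caller.
async function pollUntil(
  check: () => boolean | Promise<boolean>,
  sleep: (ms: number) => Promise<void>,
  intervalMs: number,
  maxAttempts: number,
): Promise<boolean> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // Stop as soon as the condition holds (e.g. the output file exists).
    if (await check()) return true;
    // Skip the sleep after the final attempt; we are about to give up.
    if (attempt < maxAttempts) await sleep(intervalMs);
  }
  return false;
}
```

With `intervalMs` of 10000 and `maxAttempts` of 60, this would cover roughly ten minutes. If it still returns false, that strengthens the conclusion that the sub-agent session stopped rather than detached.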

4. Workaround: Chunking

The only reliable solution we found is to artificially break tasks into micro-chunks (30-60s max):

  • "Find 5 stats" (Return)
  • "Find 3 examples" (Return)
  • "Write report using stats+examples" (Return)

This defeats the purpose of an autonomous "Deep Research" agent and forces the Conductor to micro-manage.
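
For illustration, the chunking workaround amounts to the Conductor running a short sequential pipeline and feeding earlier results into later prompts. The sketch below assumes a hypothetical `RunTask` function that stands in for however the Conductor invokes the Task tool. `Chunk` and `runChunks` are likewise made-up names, and the metadata check assumes the block format quoted earlier in this report.

```typescript
// Hypothetical stand-in for invoking the Task tool; not an opencode API.
type RunTask = (subagent: string, prompt: string) => Promise<string>;

interface Chunk {
  subagent: string;
  // Builds the prompt from the results of all earlier chunks.
  prompt: (previous: readonly string[]) => string;
}

async function runChunks(chunks: Chunk[], runTask: RunTask): Promise<string[]> {
  const results: string[] = [];
  for (const chunk of chunks) {
    const output = await runTask(chunk.subagent, chunk.prompt(results));
    // Per this report, a bare metadata block means the sub-agent did not finish.
    if (/^\s*<task_metadata>[\s\S]*<\/task_metadata>\s*$/.test(output)) {
      throw new Error(`Chunk for ${chunk.subagent} returned only task metadata`);
    }
    results.push(output);
  }
  return results;
}
```

Failing fast on a metadata-only result keeps a stalled chunk from silently feeding an empty input into the final "write report" step.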

5. Request for Help

  • Is there a timeout parameter for the Task tool itself? (Client-side wait time).
  • Why does the sub-agent die? If the Conductor stops waiting, shouldn't the sub-agent session continue running on the server?
  • Is wait: true supported? We need a way to force the Task tool to wait indefinitely (or 10m+) for the sub-agent to finish.
