idea
add an integration test that spawns a claude-code instance with our plugin installed and verifies the CLI skill actually teaches the agent proper usage patterns.
flow
- spawn claude-code instance via agent SDK
/plugin marketplace add prefecthq/prefect-mcp-server
/plugin install prefect@prefect
- give it a mutation task (e.g., "cancel flow run X")
- assert it:
- uses
--no-prompt as top-level flag
- uses
prefect api for JSON output when needed
- doesn't fail on truncated IDs
questions
- how does the agent SDK handle plugin installation?
- can we mock the prefect API for deterministic testing?
- what's the right assertion model for "agent used skill correctly"?
refs
would need to understand claude agent SDK patterns - see claude-code-guide