This file is read by AI coding assistants (GitHub Copilot, Gemini Code Assist, Claude, etc.) to enforce project-wide conventions. Keep it up to date.
NEVER hardcode URLs, ports, secrets, API keys, or any environment-dependent values directly in source code.
- Always use environment variables with a sensible local-dev fallback:

      import os

      # ✅ CORRECT
      api_key = os.environ.get("GEMINI_API_KEY")

      # ❌ WRONG: hardcoded key
      api_key = "AIzaSy..."

      # ❌ WRONG: hardcoded URL
      redis_url = "redis://production-host:6379/0"
- Use the configuration system; don't redeclare config in every file:

      # ✅ Import from the canonical source
      from agent_forge.config import load_config
      config = load_config()

      # ❌ Don't redeclare per-file
      REDIS_URL = os.environ.get("REDIS_URL", "redis://localhost:6379/0")
- Never commit secrets. API keys, tokens, and credentials must come from environment variables, never from source. The `.env` file is in `.gitignore`.

| Prefix | Purpose | Example |
|---|---|---|
| `AGENT_FORGE_` | Application configuration | `AGENT_FORGE_AGENT_MAX_ITERATIONS` |
| `GEMINI_API_KEY` | LLM provider key (direct, no prefix) | `GEMINI_API_KEY` |
| `OPENAI_API_KEY` | LLM provider key (direct, no prefix) | `OPENAI_API_KEY` |
| `ANTHROPIC_API_KEY` | LLM provider key (direct, no prefix) | `ANTHROPIC_API_KEY` |

Document any new env var in `docs/spec.md` § Configuration and the project's `agent-forge.toml` defaults.
NEVER push directly to main. All changes must go through a feature branch and Pull Request.
- Always work on a branch, using these naming conventions:
  - `feat/<issue-number>-<short-description>` for features
  - `fix/<issue-number>-<short-description>` for bug fixes
  - `docs/<issue-number>-<short-description>` for documentation
  - `refactor/<short-description>` for refactoring
  - `test/<short-description>` for test additions
- Submit a Pull Request targeting `main`. Include a clear description and reference the issue (`Closes #N`).
- Merge only on explicit developer request. Never merge a PR autonomously; wait for the developer to say "merge", "you can merge", or equivalent.
- Never force-push to `main`. Force-push only on feature branches, and only if absolutely necessary.
- Clean up after merge: delete the feature branch (local + remote) and align local `main`.
- Use the `/start-issue` workflow when beginning work on any issue or task. Run the steps in `.agent/workflows/start-issue.md`.
- Automatically run `/finish-issue` when completing work. When work on any issue or task is done, always execute every step in `.agents/workflows/finish-issue.md`: verify coverage, run tests, lint, commit, push, open PR, wait for CI green, and merge. This workflow is mandatory, not optional. Do not skip steps or ask whether to run it.
- Update documentation with every user-facing change. Any change that modifies CLI flags, API endpoints, configuration options, deployment topology, or observable behavior must include corresponding documentation updates in the same branch. Review and update as needed:
  - `README.md`: features, project structure, usage examples
  - `docs/`: architecture, configuration, hosted-service, extending, testing
  - `docs/spec.md`: technical specification, interface contracts
  - Inline docstrings in changed modules

  Skip only for purely internal refactors with zero user-facing impact.
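
The branch naming conventions listed above can be checked mechanically, for example in a pre-push hook. The following is only a sketch: the `is_valid_branch` helper is hypothetical, and the restriction of descriptions to lowercase kebab-case is an assumption that the rules above do not state.

```python
import re

# feat/fix/docs branches carry an issue number; refactor/test branches do not.
# Assumption: descriptions are lowercase words joined by hyphens.
_BRANCH_PATTERN = re.compile(
    r"(?:(?:feat|fix|docs)/\d+-[a-z0-9]+(?:-[a-z0-9]+)*"
    r"|(?:refactor|test)/[a-z0-9]+(?:-[a-z0-9]+)*)"
)


def is_valid_branch(name: str) -> bool:
    """Return True if the branch name follows the naming conventions."""
    return _BRANCH_PATTERN.fullmatch(name) is not None
```

With this pattern, `feat/42-add-retry` and `refactor/config-loader` are accepted, while `main` and `feat/add-retry` (no issue number) are rejected.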
- Python 3.11+: use modern syntax such as `X | Y` unions, `match` statements, and `tomllib`
- Async-first: use `async`/`await` for I/O-bound operations (LLM calls, Docker, file I/O)
- ABCs for interfaces: all providers and tools implement abstract base classes
- Pydantic for validation: use Pydantic models for external data (config, API responses)
- Dataclasses for internals: use `@dataclass` for internal data structures

| Package | Purpose |
|---|---|
| `agent_forge.llm` | LLM provider adapters (Gemini, OpenAI, etc.) |
| `agent_forge.tools` | Built-in tools (file ops, shell, search) |
| `agent_forge.sandbox` | Docker sandbox management |
| `agent_forge.agent` | ReAct loop, state machine, prompts |
| `agent_forge.orchestration` | Task queue, event bus, workers |
| `agent_forge.observability` | Structured logging, tracing, cost tracking |

- Sandbox containers use `--network none` by default
- Never pass API keys into the sandbox
- Resource limits are mandatory: `--cpus`, `--memory`, `--pids-limit`
- All file operations are validated to stay within `/workspace`
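
The sandbox rules above could be enforced along the following lines. This is a sketch under stated assumptions, not the project's `agent_forge.sandbox` code. The helper names and the default limits (`1.0` CPU, `512m`, 128 PIDs) are hypothetical, and the path check is purely lexical, so it does not account for symlinks inside the container.

```python
import posixpath
from pathlib import PurePosixPath

WORKSPACE = PurePosixPath("/workspace")


def docker_run_args(
    image: str,
    cpus: str = "1.0",      # assumed default
    memory: str = "512m",   # assumed default
    pids_limit: int = 128,  # assumed default
) -> list[str]:
    """Build a `docker run` command with no network and explicit limits.

    No `-e`/`--env` flags are added, so host API keys never reach the sandbox.
    """
    return [
        "docker", "run", "--rm",
        "--network", "none",
        "--cpus", cpus,
        "--memory", memory,
        "--pids-limit", str(pids_limit),
        image,
    ]


def resolve_in_workspace(user_path: str) -> PurePosixPath:
    """Normalize a path and reject anything that escapes /workspace."""
    # Joining with an absolute path replaces the base, so "/etc/passwd"
    # is caught by the containment check below as well.
    candidate = PurePosixPath(posixpath.normpath(WORKSPACE / user_path))
    if candidate != WORKSPACE and WORKSPACE not in candidate.parents:
        raise ValueError(f"path escapes the workspace: {user_path!r}")
    return candidate
```

For instance, `resolve_in_workspace("src/../a.py")` normalizes to `/workspace/a.py`, whereas `"../etc/passwd"` and `"/etc/passwd"` both raise `ValueError`.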
Agent Forge is a generic coding agent framework — comparable to Claude Code, Codex, or Antigravity. It must remain domain-agnostic. Any feature tied to a specific use case (smart contract auditing, web security scanning, code migration, etc.) belongs in the extension layer, never in the core packages.
┌──────────────────────────────────────────────────────────────┐
│ CORE (agent_forge/*) │
│ Generic, domain-agnostic capabilities: │
│ LLM adapters, ReAct loop, sandbox, tools, profiles, │
│ orchestration, observability, CLI, hosted service shell │
├──────────────────────────────────────────────────────────────┤
│ EXTENSION LAYER (plugins/, skills/, workflows/) │
│ Domain-specific capabilities loaded at runtime: │
│ - plugins/proof-of-audit/ → audit profiles, detectors, │
│ report schemas, challenge evidence, multi-agent personas │
│ - plugins/<other-domain>/ → any future specialization │
│ - --profiles-dir, entry_points, skill files, workflows │
└──────────────────────────────────────────────────────────────┘
- Core packages must not import or reference domain-specific concepts. Terms like "reentrancy", "access control", "vulnerability", "finding", "severity", "detector" are audit-domain vocabulary; they do not belong in `agent_forge.*`.
- Use generic abstractions in core. A profile has `prompt_scope` (generic), not `detectors` (audit-specific). A report is a JSON artifact, not a "proof-of-audit report".
- Domain features are delivered via extensions:
  - Profiles → YAML files in a plugin's `profiles/` directory, loaded with `--profiles-dir`
  - Tools → Python entry points registered under `agent_forge.tools`
  - Prompts → Injected through the generic `prompt_scope` field on `AgentProfile`
  - Workflows → Markdown files in `.agent/workflows/`
- Test accordingly. Core tests must not depend on any domain-specific profile or plugin existing. Domain tests live alongside the plugin.

To add a "web-security-scanner" domain, create `plugins/web-security-scanner/` with its own profiles, tools, and workflows. Do not modify any file under `agent_forge/` to add web-security concepts.
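
One way to keep audit vocabulary out of the core is a crude text scan that a core test could run over each file under `agent_forge/`. The sketch below is hypothetical and not part of the project. It matches substrings case-insensitively, so generic uses of words such as "finding" would produce false positives that need an allowlist.

```python
import re

# Audit-domain vocabulary taken from the rule on core packages.
DOMAIN_TERMS = (
    "reentrancy",
    "access control",
    "vulnerability",
    "finding",
    "severity",
    "detector",
)

_TERM_PATTERN = re.compile(
    "|".join(re.escape(term) for term in DOMAIN_TERMS),
    re.IGNORECASE,
)


def domain_terms_in(source: str) -> set[str]:
    """Return the lowercase domain terms that appear in a piece of source text."""
    return {match.group(0).lower() for match in _TERM_PATTERN.finditer(source)}
```

A core test could read every `.py` file under `agent_forge/` and assert that `domain_terms_in(text)` is empty.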
Extensions can be separate installable packages — they do not need to live in this monorepo. A user installs the core agent and then adds domain capabilities:
pip install agent-forge # core framework
pip install agent-forge-proof-of-audit # audit profiles, tools, report schemas
pip install agent-forge-web-security # hypothetical web-security extension

The `plugins/` directory in this repo is a development convenience for first-party extensions. At runtime, extensions are discovered through:

- `entry_points`: Python's standard plugin mechanism (already used for tools via the `agent_forge.tools` group in `tools/plugins.py`). Future groups: `agent_forge.profiles`, `agent_forge.prompts`.
- `--profiles-dir`: CLI flag pointing to a directory of profile YAMLs.
- Config: `agent-forge.toml` can declare extension paths.

| Pattern | Purpose | Runner |
|---|---|---|
| `tests/unit/test_*.py` | Pure unit tests: no Docker, no external I/O | `make test-unit` |
| `tests/integration/test_*.py` | Tests with real Docker containers | `make test-integration` |
| `tests/e2e/test_*.py` | Full agent run on sample repos | `make test` (all) |

- Mock LLM responses, not tools. Tools should be tested against a real sandbox when possible. Use recorded/cached LLM responses (VCR pattern) for deterministic tests.
- Use `pytest` fixtures for sandbox setup/teardown:

      # ✅ CORRECT: use a fixture
      @pytest.fixture
      async def sandbox():
          sb = DockerSandbox()
          await sb.start("./tests/fixtures/sample_repo", SandboxConfig())
          yield sb
          await sb.stop()

      # ❌ WRONG: manual setup in test body

- Never mock the sandbox in integration tests. Integration tests exist to verify real Docker interactions.
- Use `respx` for HTTP mocking in LLM adapter unit tests:

      # ✅ CORRECT: mock HTTP, not the adapter
      respx.post("https://generativelanguage.googleapis.com/...").respond(json={...})
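
The recorded-response (VCR) idea mentioned above can be illustrated without third-party libraries. The `ReplayLLM` class below is a hypothetical stand-in for an LLM client: it replays responses from an in-memory "cassette" keyed by a hash of the prompt and fails loudly on any prompt that was never recorded. The project's real adapters and cassette format may differ.

```python
import asyncio
import hashlib


class ReplayLLM:
    """Replays recorded completions so tests stay deterministic and offline."""

    def __init__(self, cassette: dict[str, str]) -> None:
        self._cassette = cassette

    @staticmethod
    def key(prompt: str) -> str:
        """Stable cassette key for a prompt."""
        return hashlib.sha256(prompt.encode("utf-8")).hexdigest()

    async def complete(self, prompt: str) -> str:
        """Return the recorded response, or raise if none was recorded."""
        try:
            return self._cassette[self.key(prompt)]
        except KeyError:
            raise LookupError(f"no recorded response for prompt: {prompt!r}") from None


# Usage: build a cassette once, then replay it in tests.
cassette = {ReplayLLM.key("List the files."): "Calling tool: list_files"}
reply = asyncio.run(ReplayLLM(cassette).complete("List the files."))
```

Raising on unrecorded prompts makes prompt drift visible in the test run; a missing recording never falls through to a live API call.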
# Unit tests only (fast, no Docker needed)
make test-unit
# Integration tests (requires Docker)
make test-integration
# All tests with coverage
make test

- Run `make lint` before committing (ruff check + mypy)
- Run `make format` to auto-format (ruff format)
- All public functions and methods must have type hints
- Use Google-style docstrings for public APIs
- Follow Conventional Commits for all commit messages