Skip to content

add orchestrated agent skills and workflow#195

Open
tdejoigny-ledger wants to merge 4 commits into
masterfrom
tdj/AI_agents
Open

add orchestrated agent skills and workflow#195
tdejoigny-ledger wants to merge 4 commits into
masterfrom
tdj/AI_agents

Conversation

@tdejoigny-ledger

Copy link
Copy Markdown
Contributor

No description provided.

@codecov

codecov Bot commented Feb 17, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 90.81%. Comparing base (ac10944) to head (4927426).
⚠️ Report is 29 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #195   +/-   ##
=======================================
  Coverage   90.81%   90.81%           
=======================================
  Files           5        5           
  Lines          98       98           
  Branches       22       22           
=======================================
  Hits           89       89           
  Misses          9        9           
Flag Coverage Δ
unittests 90.81% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a multi-agent prompt framework under .github/prompts/ to orchestrate Ledger app development across implementation (C), QA (Ragger/Speculos), documentation, and review “gatekeeping”.

Changes:

  • Added an Orchestrator prompt defining a 4-phase Dev → QA → Doc → Review pipeline.
  • Added four “skill” prompt files specifying role responsibilities and checklists for C, QA, Doc, and Review agents.
  • Added a .github/prompts/README.md describing how to use the prompt suite.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 12 comments.

Show a summary per file
File Description
.github/prompts/ORCHESTRATOR.md Defines the multi-phase orchestration workflow and logging template.
.github/prompts/SKILL_LEDGER_C.md Dev-agent rules for embedded C implementation/build expectations.
.github/prompts/SKILL_LEDGER_QA.md QA-agent rules for Ragger/Speculos testing and snapshot workflow.
.github/prompts/SKILL_LEDGER_DOC.md Doc-agent rules for generating/maintaining APDU + design documentation.
.github/prompts/SKILL_LEDGER_REVIEW.md Review-agent “gatekeeper” checklist and verdict format.
.github/prompts/README.md Explains how to use the orchestrator + skills as a prompt workflow.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +86 to +95
4. **Provide Proof (MANDATORY):**
- After writing code, you **MUST execute** the Docker Build Command **yourself in the terminal**. Do NOT just display it to the user.
- **Start Clean:** Begin the command chain with `make clean`.
- **Filter Targets:** Read `[app].devices` from `ledger_app.toml` and generate build commands **ONLY** for the listed devices. Use the **Target → SDK Variable Mapping** table above to translate each device name to its `BOLOS_SDK` variable (e.g., `nanos+` → `$NANOSP_SDK`).
- **Check Compliance:** **MUST** end the command chain with the Rule Enforcer (`/opt/enforcer.sh`).
- **Enforcer Duration:** The enforcer runs `scan-build` for every target and can take **10-15 minutes**. Use a generous timeout (900s+). Consider running the compilation chain and enforcer as **two separate commands** to avoid timeout-induced SIGTERM.
- **Chain Commands:** Use `&&` to ensure the process stops on the first error.
- **Preserve Artifacts:** Stop after the last compilation step (Do not run a final clean).
- **Execution:** Use `run_in_terminal` (or equivalent tool) to launch the `docker run` command. Do NOT ask the user to run it manually.
5. **Iterate (The Fix Loop):**

Copilot AI Feb 17, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This prompt requires using a run_in_terminal tool, but this repository doesn't define such a tool and common chat integrations (e.g., GitHub Copilot Chat) can't execute Docker commands. Consider rephrasing to be tool-agnostic (e.g., "run the following commands locally") or documenting the specific integration that provides terminal execution.

Copilot uses AI. Check for mistakes.
- **Description** — What the command does.
- **Command table** — CLA, INS, P1, P2, Lc, CData with value + description for each field.
- **Response table** — Field name, length, description. Always end with SW `0x9000`.
- **Error Responses table** — Condition → SW code. Cover: wrong P1/P2 (`0x6A86`), wrong data length (`0x6D02`), invalid data (`0x6A80`), user rejection (`0x6985`).

Copilot AI Feb 17, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The example status word for "wrong data length" is incorrect (0x6D02 is not a standard length error and doesn't match what this codebase uses via StatusWords/SWO_WRONG_DATA_LENGTH). Prefer referencing the codebase constant (e.g., Errors.SWO_WRONG_DATA_LENGTH) or the standard 0x6700 to avoid generating inaccurate APDU docs.

Suggested change
- **Error Responses table** — Condition → SW code. Cover: wrong P1/P2 (`0x6A86`), wrong data length (`0x6D02`), invalid data (`0x6A80`), user rejection (`0x6985`).
- **Error Responses table** — Condition → SW code. Cover: wrong P1/P2 (`0x6A86`), wrong data length (`0x6700`), invalid data (`0x6A80`), user rejection (`0x6985`).

Copilot uses AI. Check for mistakes.
Comment on lines +58 to +65
1. **Analyze the Artifacts:** Read the changes in `src/`, the test directory (from `[pytest.standalone].directory` in `ledger_app.toml`), and `doc/`.
2. **Evaluate Logic:**
- Ask: *"Does this C code solve the user request?"*
- Ask: *"Does the Python test prove the feature works?"*
- Ask: *"Is the Documentation accurate?"*
3. **Produce Verdict:**
- **FAIL:** Return a list of specific "Blockers" if ANY principle is violated. Explicitly state what must be fixed.
- **PASS:** Output `✅ READY TO MERGE` ONLY if the feature is production-ready.

Copilot AI Feb 17, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

markdownlint (MD030) requires exactly one space after an ordered-list marker. Update this routine list (and the ordered list inside the blockquote examples below) to use 1. instead of 1. to avoid lint failures.

Copilot uses AI. Check for mistakes.
- **Code blocker** (security, logic, build) → Loop to **PHASE 1** (Dev).
- **Test blocker** (missing coverage, wrong assertion) → Loop to **PHASE 2** (QA).
- **Doc blocker** (wrong INS code, missing diagram) → Loop to **PHASE 3** (Doc).
You MUST pass the list of "Blockers" found by the Reviewer as high-priority constraints for the targeted phase.

Copilot AI Feb 17, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line ends with trailing spaces (used for a Markdown hard line break). The trailing-whitespace pre-commit hook will strip these spaces, so the hard break won't be preserved. Prefer an explicit <br> (allowed by MD033) or reflow the paragraph to avoid relying on trailing spaces.

Suggested change
You MUST pass the list of "Blockers" found by the Reviewer as high-priority constraints for the targeted phase.
You MUST pass the list of "Blockers" found by the Reviewer as high-priority constraints for the targeted phase.<br>

Copilot uses AI. Check for mistakes.
Comment thread .github/prompts/ORCHESTRATOR.md Outdated
Comment on lines +7 to +9
1. **Integration Tests (Ragger):** MANDATORY.
2. **Unit Tests (Host):** OPTIONAL (Recommended for complex logic).
3. **Fuzzing:** OPTIONAL (Recommended for security hardening).

Copilot AI Feb 17, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

markdownlint (MD030) requires exactly one space after an ordered-list marker. Update this ordered list (and the other ordered lists in this file) to use 1. instead of 1. to avoid lint failures.

Suggested change
1. **Integration Tests (Ragger):** MANDATORY.
2. **Unit Tests (Host):** OPTIONAL (Recommended for complex logic).
3. **Fuzzing:** OPTIONAL (Recommended for security hardening).
1. **Integration Tests (Ragger):** MANDATORY.
2. **Unit Tests (Host):** OPTIONAL (Recommended for complex logic).
3. **Fuzzing:** OPTIONAL (Recommended for security hardening).

Copilot uses AI. Check for mistakes.
Comment on lines +77 to +84
5. **Provide Proof (MANDATORY):**
- After writing tests, you **MUST execute** the Docker Test Command **yourself in the terminal**. Do NOT just display it to the user.
- **Step 1: Setup:** Activate venv and install requirements.
- **Step 2: Golden Run:** Run `pytest --golden_run` for **ALL** supported devices.
- **Step 3: Verification:** Run `pytest` (without golden_run flag) for **ALL** supported devices to confirm stability.
- **Chain Commands:** Use `&&` to ensure the process stops immediately if any step fails.
- **Execution:** Use `run_in_terminal` (or equivalent tool) to launch the `docker run` command. Do NOT ask the user to run it manually.
6. **Iterate (The Fix Loop):**

Copilot AI Feb 17, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This prompt requires using a run_in_terminal tool to execute Docker commands, but the repo doesn't define such a tool and many target chat environments cannot run commands. Consider rewriting to be tool-agnostic (provide commands to run) or explicitly documenting the required tooling/integration.

Copilot uses AI. Check for mistakes.
Comment thread .github/prompts/SKILL_LEDGER_DOC.md Outdated
Comment thread .github/prompts/README.md
Comment on lines +33 to +35
1. Open your AI Chat (or "Composer" / "Copilot Edits" mode).
2. **Reference the `@ORCHESTRATOR.md` file** in your context (e.g. `#file:ORCHESTRATOR.md`).
3. Type your high-level request.

Copilot AI Feb 17, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

markdownlint (MD030) requires exactly one space after an ordered-list marker. Update this ordered list (and other ordered lists in this file) to use 1. instead of 1. to avoid lint failures.

Copilot uses AI. Check for mistakes.
Comment thread .github/prompts/SKILL_LEDGER_C.md Outdated
tdejoigny-ledger and others added 3 commits February 17, 2026 18:29
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants