Type
feature (high confidence)
Description
Add a new CLI command (e.g., asm eval <skill-path>) that evaluates a skill against established best practices and produces a structured quality report with actionable feedback. This tool helps skill creators improve their skills before publishing or sharing.
The evaluator should check a skill's SKILL.md (and supporting files) against criteria derived from best practice sources (see #117), including:
- Structure & completeness — has required frontmatter fields (name, description, type), proper markdown structure, modes/steps documented
- Description quality — trigger description is specific and non-overlapping, concise but descriptive, uses action verbs
- Prompt engineering — uses progressive disclosure, sets clear degrees of freedom, avoids ambiguity, includes examples
- Context efficiency — avoids bloating the context window, uses references/templates instead of inline content, respects token budgets
- Safety & guardrails — includes error handling instructions, has confirmation steps for destructive actions, validates prerequisites
- Testability — acceptance criteria are testable, examples cover edge cases, outputs are verifiable
- Naming & conventions — follows naming conventions, uses imperative mood, labels are consistent
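As an illustration of the "Structure & completeness" check, here is a minimal sketch in Python. It assumes the frontmatter is a flat block of `key: value` lines delimited by `---` at the top of SKILL.md, which the issue does not specify. The helper names `parse_frontmatter` and `missing_required_fields` are hypothetical, and nested or multi-line values are not handled.

```python
# Required fields as listed in the "Structure & completeness" criterion.
REQUIRED_FIELDS = ("name", "description", "type")


def parse_frontmatter(text: str) -> dict[str, str]:
    """Parse flat `key: value` pairs from a leading '---' delimited block.

    Assumption: frontmatter is a simple YAML-like block; this is not a
    full YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return {}  # no closing delimiter: treat as having no frontmatter


def missing_required_fields(text: str) -> list[str]:
    """Return required fields that are absent or empty."""
    fields = parse_frontmatter(text)
    return [name for name in REQUIRED_FIELDS if not fields.get(name)]


sample = "---\nname: my-skill\ndescription: Review pull requests\n---\n# My Skill\n"
print(missing_required_fields(sample))  # ['type']
```

A real evaluator would likely use a proper YAML parser and report each missing field as a note in the Structure category.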
Output
A scored report with per-category ratings and specific improvement suggestions:
◆ Skill Evaluation: my-skill
┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄
Category │ Score │ Notes
──────────────────────┼───────┼──────────────────────────
Structure │ 9/10 │ ✓ All sections present
Description quality │ 6/10 │ Trigger too broad
Prompt engineering │ 7/10 │ Missing examples section
Context efficiency │ 8/10 │ Good use of references
Safety & guardrails │ 5/10 │ No confirmation steps
Testability │ 7/10 │ Criteria could be sharper
Naming & conventions │ 9/10 │ ✓ Follows conventions
──────────────────────┼───────┼──────────────────────────
Overall │ 73/100│
⚡ Top 3 improvements:
1. Add confirmation prompts before destructive actions
2. Narrow trigger description — overlaps with "code-review"
3. Add 2-3 usage examples in the skill body
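The issue does not say how the overall score is derived. One reading consistent with the sample report (51 of 70 points, roughly 73%) is an unweighted sum scaled to 100, with the top improvements taken from the lowest-scoring categories. The sketch below assumes exactly that; the function names are hypothetical.

```python
def overall_score(scores: dict[str, int], max_per_category: int = 10) -> int:
    """Scale the unweighted sum of category scores to a 0-100 value."""
    if not scores:
        return 0
    total = sum(scores.values())
    return round(100 * total / (max_per_category * len(scores)))


def weakest_categories(scores: dict[str, int], count: int = 3) -> list[str]:
    """Return the lowest-scoring categories; ties keep report order."""
    return sorted(scores, key=scores.get)[:count]


# Scores taken from the sample report above.
scores = {
    "Structure": 9,
    "Description quality": 6,
    "Prompt engineering": 7,
    "Context efficiency": 8,
    "Safety & guardrails": 5,
    "Testability": 7,
    "Naming & conventions": 9,
}
print(overall_score(scores))       # 73
print(weakest_categories(scores))  # ['Safety & guardrails', 'Description quality', 'Prompt engineering']
```

If some categories should count for more than others (for example, safety), a weighted average would be a natural variation; the issue leaves that open.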
Auto-Fix Mode (--fix)
In addition to reporting issues, the evaluator should provide a --fix flag that automatically corrects basic, deterministic problems in a skill's SKILL.md frontmatter and structure. This saves skill creators from manually fixing trivial issues that have clear, unambiguous solutions.
Usage:
asm eval <skill-path> --fix # evaluate and auto-fix basic issues
asm eval <skill-path> --fix --dry-run # show what would be fixed without modifying files
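The flag handling above could be sketched as follows. Python's `argparse` is used purely for illustration, since the issue does not say what asm is written in. Rejecting `--dry-run` without `--fix` is an assumption, as the issue only shows the two flags together.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the proposed `asm eval` subcommand."""
    parser = argparse.ArgumentParser(prog="asm")
    subcommands = parser.add_subparsers(dest="command", required=True)
    eval_cmd = subcommands.add_parser("eval", help="evaluate a skill")
    eval_cmd.add_argument("skill_path", metavar="skill-path",
                          help="path to a local skill directory")
    eval_cmd.add_argument("--fix", action="store_true",
                          help="evaluate and auto-fix basic issues")
    eval_cmd.add_argument("--dry-run", action="store_true",
                          help="show what would be fixed without modifying files")
    return parser


def parse_args(argv: list[str]) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumption: --dry-run is only meaningful in combination with --fix.
    if args.dry_run and not args.fix:
        parser.error("--dry-run requires --fix")
    return args


args = parse_args(["eval", "./my-skill", "--fix", "--dry-run"])
print(args.skill_path, args.fix, args.dry_run)
```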
Auto-fixable items:
| Problem | Auto-fix action |
| --- | --- |
| Missing version in frontmatter | Add version: 0.1.0 |
| Missing author / creator info | Add author: field with value from git config (user.name) or prompt |
| Missing effort field | Infer effort (XS/S/M/L/XL) from skill line count and complexity, add to frontmatter |
| Missing type field | Infer type from skill content (e.g., presence of code patterns → code, CLI commands → cli) |
| Missing description field | Extract first meaningful sentence from the skill body as description |
| Trailing whitespace / inconsistent line endings | Normalize whitespace |
| Missing blank line between sections | Insert blank lines per markdown best practices |
| Frontmatter field ordering | Reorder to canonical order: name, description, version, author, type, effort |
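Two of the fixes in the table, adding a default version and reordering fields, are simple enough to sketch. The example below assumes the frontmatter has already been parsed into a flat dict, and `fix_frontmatter` is a hypothetical helper name. Inferring author, effort, type, or description is deliberately left out, because the issue does not define those heuristics.

```python
# Canonical order and default version as given in the table above.
CANONICAL_ORDER = ["name", "description", "version", "author", "type", "effort"]
DEFAULT_VERSION = "0.1.0"


def fix_frontmatter(fields: dict[str, str]) -> tuple[dict[str, str], list[str]]:
    """Return (fixed fields, list of applied fixes) without mutating the input."""
    fixed = dict(fields)
    applied: list[str] = []

    if not fixed.get("version"):
        fixed["version"] = DEFAULT_VERSION
        applied.append(f"Added missing version: {DEFAULT_VERSION}")

    # Canonical fields first, then any unknown fields in their existing order.
    ordered = {key: fixed[key] for key in CANONICAL_ORDER if key in fixed}
    for key, value in fixed.items():
        if key not in ordered:
            ordered[key] = value

    # Report a reorder only if fields that were already present changed position.
    if [key for key in ordered if key in fields] != list(fields):
        applied.append("Reordered frontmatter fields to canonical order")

    return ordered, applied


fields = {"description": "Review pull requests", "name": "my-skill", "type": "code"}
fixed, applied = fix_frontmatter(fields)
print(list(fixed))  # ['name', 'description', 'version', 'type']
print(applied)
```

Returning the list of applied fixes alongside the result makes it straightforward to produce the "Auto-fixed N issues" summary.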
Auto-fix output example:
◆ Skill Evaluation: my-skill (--fix mode)
┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄
Auto-fixed 3 issues:
✓ Added missing version: 0.1.0
✓ Added missing author: "John Doe" (from git config)
✓ Added missing effort: M (inferred from 180 lines)
Remaining issues (manual fix needed):
⚠ Description quality: trigger too broad — overlaps with "code-review"
⚠ Safety: no confirmation steps for destructive actions
Overall: 73/100 → 82/100 (after auto-fix)
Constraints:
- Auto-fix only applies to deterministic, low-risk corrections (frontmatter fields, formatting)
- Content-level issues (description quality, prompt engineering, safety) are never auto-fixed — they require human judgment
- --dry-run shows a diff preview of proposed changes without writing to disk
- All fixes are applied to SKILL.md in place; a backup is created as SKILL.md.bak before modification
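The dry-run and backup constraints could be sketched as follows, using Python's standard `difflib` and `shutil` modules. The function name `apply_fix` and its signature are hypothetical; it assumes the corrected file content has already been computed elsewhere.

```python
import difflib
import shutil
from pathlib import Path


def apply_fix(skill_md: Path, fixed_text: str, dry_run: bool = False) -> str:
    """Return a unified diff of the proposed changes.

    With dry_run=True nothing is written. Otherwise SKILL.md is copied to
    SKILL.md.bak before being overwritten in place.
    """
    original_text = skill_md.read_text(encoding="utf-8")
    diff = "".join(
        difflib.unified_diff(
            original_text.splitlines(keepends=True),
            fixed_text.splitlines(keepends=True),
            fromfile=str(skill_md),
            tofile=f"{skill_md} (fixed)",
        )
    )
    if dry_run or fixed_text == original_text:
        return diff  # nothing to write, so no backup is created

    backup = skill_md.with_name(skill_md.name + ".bak")
    shutil.copy2(skill_md, backup)  # keep the original before modifying
    skill_md.write_text(fixed_text, encoding="utf-8")
    return diff
```

Skipping the backup when there is nothing to change is a design choice here, not something the issue requires. It avoids overwriting an earlier SKILL.md.bak with an identical copy.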
Related
Reporter Context
To support skill creators, add a tool that evaluates their skills against best practices.
The tool should also offer an auto-fix option for basic problems such as a missing version number, missing creator info, or a missing effort value.
Acceptance Criteria
Metadata
- Priority: medium
- Effort: L
- Suggested labels: feature, cli
Acceptance Criteria
- asm eval <skill-path> command that accepts a local skill directory path (high confidence)
- --fix flag auto-corrects basic frontmatter issues: missing version, author, effort, type, and description (high confidence)
- --fix --dry-run shows proposed fixes as a diff without modifying files (high confidence)
- Creates a SKILL.md.bak backup before modifying the original file (high confidence)
- --json output for machine consumption (medium confidence)
- Integration with asm publish as a pre-publish quality gate (medium confidence)