Releases: moonrunnerkc/skillcheck

v1.3.0

@moonrunnerkc moonrunnerkc released this 18 May 17:48

Full Changelog: v1...v1.3.0

v1.2.3

@moonrunnerkc moonrunnerkc released this 08 May 00:16

Added

  • --format github: outputs diagnostics as GitHub Actions workflow commands (::error, ::warning, ::notice) with proper escaping for file, line, and message properties. The GitHub Action now defaults to this format so PR annotations render automatically without a Python entrypoint.
  • .pre-commit-hooks.yaml: adds a skillcheck hook for pre-commit, matching SKILL.md files and passing filenames to the CLI.
  • CONTRIBUTING.md: documents the release convention (immutable patch tags plus a force-updated v1 moving major tag).
  • tests/__init__.py: makes the test package importable, fixing from tests.conftest in environments where another tests package shadows the path.
  • nargs="+" on the path argument: the CLI now accepts multiple paths (required by pre-commit's pass_filenames mode). Single-path usage is unchanged.
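
The multi-path change in the last item can be illustrated with a minimal argparse sketch. Only nargs="+" on the path argument comes from the notes above; the parser layout and help text are assumptions, not skillcheck's actual cli.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: only nargs="+" on "path" is taken from the release notes.
    parser = argparse.ArgumentParser(prog="skillcheck")
    parser.add_argument(
        "path",
        nargs="+",  # one or more paths; a bare invocation with no path is rejected
        help="one or more SKILL.md files or directories",
    )
    return parser


# A single path still parses; it just arrives as a one-element list.
single = build_parser().parse_args(["SKILL.md"])
# pre-commit passes every matching filename in one invocation.
multiple = build_parser().parse_args(["a/SKILL.md", "b/SKILL.md"])
```

With nargs="+", the parsed value is always a list, so downstream code can loop over paths without special-casing the single-file invocation.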

Changed

  • action.yml simplified to a two-step composite action that installs skillcheck via pip and runs it directly. The Python entrypoint (action/entrypoint.py) is no longer invoked; --format github handles PR annotations natively. The format input defaults to github (was json, which was ignored at runtime).
  • README GitHub Action section updated to reflect automatic PR annotations via --format github.
  • README pre-commit section added with a .pre-commit-config.yaml snippet.
  • README test count updated to 730.

Fixed

  • Path separators in --format github output normalized to forward slashes for Windows compatibility.
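
A rough sketch of how a diagnostic might be rendered as a workflow command with forward-slash paths. The escaping follows GitHub's documented workflow-command rules; the function names and the simple backslash replacement are assumptions, not skillcheck's implementation.

```python
def escape_data(value: str) -> str:
    # Message text: escape '%' first, then CR and LF, so the command stays on one line.
    return value.replace("%", "%25").replace("\r", "%0D").replace("\n", "%0A")


def escape_property(value: str) -> str:
    # Property values also escape ':' and ',', which delimit the property list.
    return escape_data(value).replace(":", "%3A").replace(",", "%2C")


def normalize_path(path: str) -> str:
    # Assumption: a plain separator swap is enough for paths reported on Windows.
    return path.replace("\\", "/")


def workflow_command(level: str, file: str, line: int, message: str) -> str:
    # level is one of "error", "warning", "notice".
    props = f"file={escape_property(normalize_path(file))},line={line}"
    return f"::{level} {props}::{escape_data(message)}"


command = workflow_command(
    "error", "skills\\demo\\SKILL.md", 3, "bad value: 100%\nsee docs"
)
```

Escaping '%' before the other characters matters: doing it last would re-escape the '%' signs introduced by the earlier replacements.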

Removed

  • The Python entrypoint (action/entrypoint.py) for annotation parsing and step summary generation is no longer used by the action. The action runs skillcheck directly.

v1.2.2

@moonrunnerkc moonrunnerkc released this 03 May 22:00

[1.2.2] - 2026-05-03

Added

  • compat.cursor-description-block-scalar rule (INFO by default). Flags description: >, description: >+, description: |, and description: |+ because Cursor's skills UI renders these as empty. The Cursor-safe form is description: >- (folded strip). Closes #1.
  • --strict-cursor flag promotes the new rule to ERROR and fails the run. Mirrors --strict-vscode.
  • cursor is now a valid --target-agent choice; promotes the rule to WARNING when set without --strict-cursor.
  • strict-cursor action input (action.yml) and INPUT_STRICT_CURSOR wiring (action/entrypoint.py).
  • TOML config: strict-cursor = true is now accepted in skillcheck.toml.
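
The rule and its severity promotion could be approximated as below. This is a sketch under stated assumptions: the regex-based detection and the function names are hypothetical, and description: |- is left unflagged only because the notes above do not list it.

```python
import re
from typing import Optional

# Block scalar indicators the release notes list as rendering empty in Cursor.
FLAGGED_INDICATORS = {">", ">+", "|", "|+"}

# A top-level "description:" key followed by a block scalar header.
_DESCRIPTION_HEADER = re.compile(
    r"^description:[ \t]*([>|][+-]?)[ \t]*(?:#.*)?$", re.MULTILINE
)


def find_block_scalar(frontmatter: str) -> Optional[str]:
    """Return the flagged indicator used by description, or None if it is safe."""
    match = _DESCRIPTION_HEADER.search(frontmatter)
    if match and match.group(1) in FLAGGED_INDICATORS:
        return match.group(1)
    return None


def severity(target_agent: Optional[str], strict_cursor: bool) -> str:
    # INFO by default, WARNING for --target-agent cursor, ERROR for --strict-cursor.
    if strict_cursor:
        return "ERROR"
    if target_agent == "cursor":
        return "WARNING"
    return "INFO"
```

Matching on the raw frontmatter text rather than the parsed YAML is deliberate in this sketch: once parsed, the folded and literal forms are ordinary strings and the indicator is no longer visible.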

Changed

  • frontmatter.name.required and frontmatter.description.required now append a hint when the missing field appears as a ## name: or ## description: markdown heading inside the frontmatter block. Frontmatter keys are YAML, not markdown; the hint nudges authors to drop the ## prefix. Closes #1.
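
One way such a hint could be produced, as a sketch only. The function name and the hint wording are hypothetical; the release notes specify only that the missing field appears as a ## name: or ## description: heading.

```python
import re
from typing import Optional


def heading_hint(frontmatter: str, field: str) -> Optional[str]:
    """Return a hint when `field` is written as a '## field:' markdown heading."""
    pattern = re.compile(rf"^##[ \t]+{re.escape(field)}:", re.MULTILINE)
    if pattern.search(frontmatter):
        # Hypothetical wording; the real message text is not quoted in the notes.
        return (
            f"found '## {field}:' in the frontmatter; "
            "frontmatter keys are YAML, so drop the '## ' prefix"
        )
    return None


hint = heading_hint("## name: demo\ndescription: Validate skills\n", "name")
no_hint = heading_hint("name: demo\n", "name")
```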

v1.2.1

Choose a tag to compare

@moonrunnerkc moonrunnerkc released this 03 May 22:00

[1.2.1] - 2026-05-03

Fixed

  • description.quality-score no longer flags verb-led descriptions starting with investigate, diagnose, triage, troubleshoot, examine, audit, inspect, compare, capture, normalize, or refactor. Expanded _ACTION_VERBS from 43 to 170 entries to cover investigation, inspection, search, code-work, output, comparison, logging, and normalization clusters. Closes #2.

v1.2.0

@moonrunnerkc moonrunnerkc released this 03 May 22:00
0eee715

[1.2.0] - 2026-04-29

Backward compatibility: previously-passing skills still pass. Some previously-failing skills now warn instead of error and produce exit code 0 instead of 1.

Added

  • template.detected info-level rule and src/skillcheck/template_detection.py module.
  • ECOSYSTEM_FIELDS classification for license, repository, homepage, and template.
  • Config support for [frontmatter] extension_fields in skillcheck.toml.
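
A minimal sketch of how the field classification might work. ECOSYSTEM_FIELDS and the rule IDs are the ones named in this release's notes; the SPEC_FIELDS contents shown are a placeholder subset, and classify_field is a hypothetical helper that takes the values of [frontmatter] extension_fields as a plain set.

```python
from typing import Optional, Set

# Placeholder subset; the full list of spec fields is not given in these notes.
SPEC_FIELDS = {"name", "description"}
ECOSYSTEM_FIELDS = {"license", "repository", "homepage", "template"}


def classify_field(field: str, extension_fields: Set[str]) -> Optional[str]:
    """Return the rule ID a frontmatter field would raise, or None if silent."""
    if field in SPEC_FIELDS:
        return None
    if field in ECOSYSTEM_FIELDS:
        return "frontmatter.field.ecosystem"  # info-level
    if field in extension_fields:
        return None  # user-declared extensions stay silent
    return "frontmatter.field.unknown"


# "team" stands in for a value a user might list under extension_fields.
user_extensions = {"team"}
results = {
    f: classify_field(f, user_extensions)
    for f in ("name", "license", "team", "env")
}
```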

Changed

  • frontmatter.name.reserved-word demoted from ERROR to WARNING; source tag changed from spec to advisory; message rewritten.
  • frontmatter.description.person-voice demoted from ERROR to WARNING; messages rewritten to acknowledge the heuristic.
  • Budget-message phrasing aligned with the spec's "recommended" language across sizing.* and disclosure.* rules.

Fixed

  • frontmatter.field.unknown no longer fires on license, repository, homepage, or template; these now produce info-level frontmatter.field.ecosystem diagnostics or are silent for user extensions.
  • Templates (placeholder content, template: true flag, or files under template/ or templates/ directories) no longer trigger deployment-blocking checks (frontmatter.name.directory-mismatch, compat.vscode-dirname, description.quality-score).
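
Template detection and the resulting rule skips might look roughly like this. The sketch covers only the template: true flag and the directory check; placeholder-content detection is omitted because the notes do not describe how it works. Function names are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Iterable, List, Mapping

TEMPLATE_DIRS = {"template", "templates"}

# Deployment-blocking checks the release notes say templates no longer trigger.
SKIPPED_FOR_TEMPLATES = {
    "frontmatter.name.directory-mismatch",
    "compat.vscode-dirname",
    "description.quality-score",
}


def is_template(path: str, frontmatter: Mapping[str, object]) -> bool:
    # Explicit opt-in via the frontmatter flag.
    if frontmatter.get("template") is True:
        return True
    # Otherwise look for a template/ or templates/ directory above the file.
    parents = PurePosixPath(path).parent.parts
    return any(part in TEMPLATE_DIRS for part in parents)


def applicable_rules(rule_ids: Iterable[str], template: bool) -> List[str]:
    return [r for r in rule_ids if not (template and r in SKIPPED_FOR_TEMPLATES)]
```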

Internal

  • Renamed config.KNOWN_FRONTMATTER_FIELDS to config.SPEC_FIELDS.
  • New template.detected rule wired into rules/__init__.py.
  • Frontmatter rule implementation split into smaller modules while preserving skillcheck.rules.frontmatter imports.
  • Root SKILL.md restored so skillcheck SKILL.md self-validation works from the repository root.
  • New fixture set under tests/fixtures/ covering ecosystem fields, user extensions, template detection, and demoted severities.

skillcheck 1.1.0

@moonrunnerkc moonrunnerkc released this 28 Apr 20:27
44620cc

An external audit against v1.0.1 surfaced eight repo defects: an unpinned GitHub Action install, gitignored evidence paths cited in the README, a top-level SKILL.md describing an unrelated skill, a missing @v0 tag the README claimed existed, exit-code 2 conflating tool-misuse with warning-only reports, an oversized cli.py, and a vague-word list that flagged context-dependent terms like "comprehensive". v1.1.0 fixes most of them, leaves the items listed under "Audit items not closed" open, and reverses one v1.0.1 behavior change that turned out wrong.

Behavior change

Warning-only runs now return exit code 0 by default. v1.0.1 made them return 2; that conflated valid runs that produced warnings with tool-misuse cases (missing path, conflicting flags, empty directory). CI consumers couldn't tell the difference. v1.1.0 splits them: warnings exit 0, input errors exit 2, errors stay at 1, semantic drift stays at 3. The new --warnings-as-errors flag escalates warning-only runs to exit 1 for pipelines that want warnings to block.

If your CI relied on v1.0.1's "warnings exit 2" behavior, add --warnings-as-errors to your skillcheck invocation, or pin to @v1.0.1 until you can update.
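
The exit-code split described above can be summarized in a short sketch. The codes themselves come from these notes; the function, its parameters, and the relative ordering of semantic drift versus --warnings-as-errors are assumptions.

```python
def exit_code(
    *,
    input_error: bool,
    errors: int,
    warnings: int,
    semantic_drift: bool,
    warnings_as_errors: bool = False,
) -> int:
    """Map a run's outcome to the v1.1.0 exit codes (hypothetical helper)."""
    if input_error:
        return 2  # tool misuse: missing path, conflicting flags, empty directory
    if errors:
        return 1  # validation errors
    if semantic_drift:
        return 3  # symbolic checks passed, ingested critique reported errors
    if warnings and warnings_as_errors:
        return 1  # opt-in escalation; placement after drift is an assumption
    return 0      # clean run, or warnings only


warning_only = exit_code(input_error=False, errors=0, warnings=2, semantic_drift=False)
escalated = exit_code(
    input_error=False, errors=0, warnings=2, semantic_drift=False,
    warnings_as_errors=True,
)
```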

Added

  • --warnings-as-errors flag.
  • Two regression tests guarding the description-scorer rubric.

Changed

  • action.yml install step pins skillcheck>=1.0.1. Until v1.1.0 is uploaded to PyPI, the install fails loudly instead of silently resolving to v0.2.0, which lacks the v1 features.
  • Description scorer no longer penalizes comprehensive, robust, or flexible in descriptions. Each can describe a concrete attribute when qualified ("comprehensive coverage of N file formats", "robust against malformed input"). The inclusion rubric is now documented inline. Verified against anthropics/skills: zero score changes across 17 files, because none of those skills use the dropped words. The rubric edit is a no-op against the current corpus; the two new regression tests are forward-looking guards, not regression evidence.
  • Description scorer verb matching collapsed from 86 entries (base + 3rd-person duplicates) to 42 base forms with stem normalization. Adding a new verb now only requires the base form.
  • README field-test citations replaced gitignored runs/... paths with reproducible commands.
  • README exit-code table documents the new semantics; flag table documents --warnings-as-errors.
  • README test count: 663 → 667.
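
Stem normalization for verb matching, as mentioned in the description-scorer item above, could work along these lines. The verb set is a hypothetical four-entry subset and the suffix handling is an assumption; the notes state only that third-person duplicates were replaced by base forms.

```python
# Hypothetical subset of base-form verbs; the real list has 42 entries.
ACTION_VERBS = {"generate", "analyze", "fix", "search"}


def base_form(word: str) -> str:
    """Reduce a third-person form ('generates', 'fixes') to its base verb."""
    word = word.lower()
    if word in ACTION_VERBS:
        return word
    # Try the longer suffix first so 'fixes' becomes 'fix', not 'fixe'.
    for suffix in ("es", "s"):
        stem = word[: -len(suffix)]
        if word.endswith(suffix) and stem in ACTION_VERBS:
            return stem
    return word


def starts_with_action_verb(description: str) -> bool:
    words = description.split()
    return bool(words) and base_form(words[0]) in ACTION_VERBS
```

Because the stem is only accepted when it is already in the verb set, unrelated words ending in "s" are not misread as verbs.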

Removed

  • Top-level git-commit-crafter SKILL.md from the repo root.
  • False @v0 tag claim from the README and CHANGELOG.

Why this is a minor and not a patch

The exit-code semantics change is observable in CI and not opt-in. Adding --warnings-as-errors is also a public-surface addition. Either alone would be a minor bump under semver; together they aren't a patch.

Audit items not closed

  • PyPI publish: the v1.1.0 sdist and wheel are built and pass twine check, but PyPI upload requires authenticated credentials and happens out-of-band. Until that runs, pip install skillcheck continues to install v0.2.0, and the pinned action install will refuse to run.
  • cli.py line count: the audit asked for a refactor toward main() under 100 lines and cli.py under 700. An attempted helper extraction met the main() target but pushed total file size from 1127 to 1172. The refactor was reverted; the file remains at its pre-audit size, with the audit's "deliberate choice" path left open for a follow-up.

skillcheck 1.0.1

@moonrunnerkc moonrunnerkc released this 28 Apr 15:06

skillcheck v1.0.1 commits a batch of post-v1.0.0 implementation work that had been sitting uncommitted, ships the docs corrections an end-to-end verification surfaced, and aligns the README, CHANGELOG, and CLI surface so they describe the same release.

There is one behavior change relative to v1.0.0: warning-only runs now return exit code 2. Errors return 1; semantic drift returns 3. CI consumers that previously relied on warning-only exiting 0 must update.

Changed

  • Warning-only CLI reports now return exit code 2. Exit code 1 remains errors; exit code 3 remains semantic drift.
  • README Exit Codes table row 0 now reads "no errors and no warnings".
  • README test count corrected from 653 to 663.
  • README JSON-stability promise updated from "0.x series" to "v1.x series".
  • README field-test numbers reframed as April 2026 snapshots against anthropics/skills, with a note that they will drift as upstream evolves.
  • action.yml format input description clarified: accepted but ignored at runtime; the action always invokes skillcheck with --format json so it can parse diagnostics for PR annotations and the step summary.
  • Development extras now include ruff>=0.6, mypy>=1.10, and types-PyYAML>=6.0.

Added

  • --semantic: guide-compatible shortcut that enables semantic-adjacent validation. In standalone mode it runs heuristic graph analysis; with ingested agent responses it merges those diagnostics.
  • --agent-reason: guide-compatible agent-workflow shortcut. Emits a combined critique and graph prompt packet so the calling agent can run both reasoning steps and feed JSON back through --ingest-critique and --ingest-graph.
  • --format md and --format agent: Markdown report output and agent-oriented next-action output.
  • skillcheck.toml config loading: top-level defaults for format, thresholds, target agent, strict VS Code mode, skip flags, ignored rule prefixes, graph analysis, semantic mode, history, and agent variants. CLI flags always win; the loader fills unset values.
  • Experimental --activation-hypotheses: generates likely natural-language routing triggers plus a discoverability entropy score. Routing caveat included in every report.
  • Machine-readable diagnostic metadata: JSON diagnostics now include source and confidence fields.
  • GitHub Action inputs for the v1.0 modes: semantic, analyze-graph, ingest-critique, critique-agent, ingest-graph, graph-agent, history, activation-hypotheses. The action still always emits JSON internally for PR annotations.
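
The precedence rule stated for skillcheck.toml above (CLI flags always win; the loader fills unset values) might be sketched as follows. It assumes unset CLI options are represented as None, and the key names are illustrative rather than the tool's actual option names.

```python
from typing import Any, Dict


def merge_config(cli_args: Dict[str, Any], file_config: Dict[str, Any]) -> Dict[str, Any]:
    """Fill options the CLI left unset (None) from the config file."""
    merged = dict(cli_args)
    for key, value in file_config.items():
        if merged.get(key) is None:
            merged[key] = value
    return merged


# Illustrative keys: the CLI set a format but left the threshold unset.
cli = {"format": "json", "min_desc_score": None}
from_toml = {"format": "md", "min_desc_score": 60, "history": True}
merged = merge_config(cli, from_toml)
```

Testing for None rather than falsiness keeps an explicit CLI value such as 0 or False from being overwritten by the file.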

Why this is a patch and not a minor

Every addition above either documents existing behavior, refines a flag, or is gated behind a new opt-in flag. There is one breaking-ish change: warning-only runs now exit 2 instead of 0. Strict semver would call that a minor bump. The judgment call here: v1.0.0 shipped with documentation that already implied the v2-style exit codes (and the v1.0.1 README makes it explicit), the prior "warnings exit 0" behavior was undocumented in the released README, and the change matches what users running this in CI would expect. If your CI pipeline depended on the old behavior, pin to @v1.0.0 rather than @v1 until you can update.

Verification

After installing skillcheck==1.0.1:

skillcheck --version
# skillcheck 1.0.1

skillcheck skills/skillcheck/SKILL.md --analyze-graph
# exit 0 with no errors and no warnings (only INFO diagnostics)

End-to-end verification was run against anthropics/skills at commit 5128e186 (18 SKILL.md files). All 26 documented flags exercised; all four exit codes (0, 1, 2, 3) reproduced; the action entrypoint produced byte-identical JSON to the CLI. Full report: see the v1.0.1 verification artifacts.

skillcheck 1.0.0

@moonrunnerkc moonrunnerkc released this 25 Apr 21:15

skillcheck v1.0.0 is the first major release. It adds agent-native semantic self-critique, heuristic capability graph extraction with five structural analyzers, and a per-skill validation history ledger on top of the v0.2.0 symbolic foundation. The tool is designed for two modes: when a calling agent is present it uses that agent for semantic analysis; when no agent is present it runs symbolic checks only. No LLM API keys required. Suitable for CI pipelines, local pre-commit hooks, or agent-loop integration.

Changed

  • Rewrote README end-to-end for v1.0 launch audience. New sections: "Why This Exists", "Modes" (five subsections: Symbolic, Heuristic Graph, Agent Critique, Agent Graph, History), "Maintainer Notes". Removed v0.2.0-era feature bullet list and duplicated section prose. Restructured Quick Start to lead with the agent-native workflow. Rebuilt Options table from live argparse audit; every flag matches its actual help text and default. Rebuilt Rules table from live rule module audit; added source-tag legend paragraph. Added inline v1.0 case study paragraph (full detail at docs/case-study-v1-real-world-runs.md). All cited diagnostics and output excerpts trace verbatim to field-test artifacts in runs/.
  • Added docs/case-study-v1-real-world-runs.md: full breakdown of the pre-3B field test covering 18 Anthropic skills (symbolic), mcp-builder through the full v1.0 pipeline (symbolic + heuristic graph + agent critique + agent graph), and 5 uxuiprinciples skills (strict VS Code mode). Documents three semantic.contradiction.detected errors on a skill that passed all symbolic checks, five graph.capability.orphaned patterns, and the recurring unknown-field pattern (license, homepage, env) across official catalogs.

Added

  • skills/skillcheck/SKILL.md: skillcheck's own SKILL.md, validating the tool against itself. Passes symbolic, graph, critique, and history validation with zero errors and zero warnings. Serves as the worked example for the Rules table in the README.
  • Self-host integration test suite (tests/test_self_host.py): confirms the bundled SKILL.md passes symbolic validation, all five graph analyzers, critique ingestion, agent graph ingestion with divergence analysis, full CLI pipeline, history round-trip, and description scoring threshold.
  • scripts/regen_self_host_fixtures.py: regenerates tests/fixtures/self_host/graph_clean.json from the live heuristic graph after skill edits.
  • Makefile with regen-self-host-fixtures target: runs the regen script against skills/skillcheck/SKILL.md.
  • --history flag: appends a validation record to the per-skill .skillcheck-history.json ledger next to the SKILL.md file. Off by default; existing invocations see no behavior change. Incompatible with emit modes.
  • --show-history flag: reads the per-skill ledger and prints it (text or JSON via --format), then exits 0. Skips all validation. Incompatible with emit modes and --history.
  • history.skill.regressed WARNING rule: fires when --history is active, the skill content hash matches a prior passing run, and the current run fails. Indicates a rule tightened or an agent surfaced a new finding.
  • history.write.failed WARNING rule: fires when --history is active but the ledger file cannot be written. Validation exit code is unaffected.
  • history.read.failed WARNING rule: fires when --history is active but the existing ledger cannot be read. Validation continues without regression check.
  • --emit-graph: emit mode. Prints the extracted capability graph (text or JSON) to stdout and exits 0. Identifies Capability, Input, and Output nodes plus requires/produces edges heuristically from heading structure and backtick references. Mutually exclusive with --analyze-graph, --emit-critique-prompt, and --ingest-critique.
  • --analyze-graph: augment mode. Extracts the capability graph from each file, runs all five graph analyzers, and merges diagnostics into the validation report. Compatible with --ingest-critique (both run; results merged per file). Graph WARNINGs do not fail validation or change the exit code.
  • Five graph rule checkers (all WARNING severity): graph.capability.orphaned, graph.input.unused, graph.output.unproduced, graph.capability.empty_description, graph.tool.unreferenced. No double-firing: body inputs and frontmatter tools are handled by separate analyzers.
  • graph_render module: render_graph_text and render_graph_json pure rendering functions. JSON output is deterministic (field order follows dataclass declaration).
  • merge_diagnostics public function in core.semantic and core.__init__. merge_critique_diagnostics is now a thin wrapper; existing callers unchanged.
  • --critique-agent {claude,codex,cursor}: select the prompt template variant for agent self-critique. Prompt framing is tuned per vendor; the schema, parser, and exit codes are identical across all agents. Requires --emit-critique-prompt or --ingest-critique. Records the agent name as critique_source in JSON output and as a header line in text output. Default: claude.
  • --emit-critique-prompt: print the agent self-critique prompt to stdout and exit 0. Use --format json to wrap in {"prompt": "..."}. In directory mode, prompts are separated by a delimiter line so downstream tools can split per-skill.
  • --ingest-critique PATH: read an agent self-critique JSON response from PATH (use - for stdin), convert to diagnostics, merge with symbolic results, and emit a unified report.
  • Exit code 3: symbolic validation passed but the ingested critique contains semantic errors (contradictions or findings with ERROR severity). Exit code 1 takes priority when symbolic errors exist.
  • --emit-graph-prompt: print the capability graph extraction prompt to stdout and exit 0. Use --graph-agent to select the vendor variant. In directory mode, prompts are separated by the same per-skill delimiter used by --emit-critique-prompt.
  • --ingest-graph PATH: read an agent graph extraction JSON response from PATH (use - for stdin), parse it into a CapabilityGraph with source="agent", run standard graph analyzers, run divergence analyzers against the heuristic baseline, and merge all diagnostics into the validation report.
  • --graph-agent {claude,codex,cursor}: select the prompt template variant for graph extraction. Framing is tuned per vendor; the schema, parser, and exit codes are identical across all agents. Requires --emit-graph-prompt or --ingest-graph. Default: claude. Records the agent name as graph_source in JSON output and as a header line in text output.
  • graph.contradiction.heuristic_disagreement (ERROR, source: agent): fires when an ingested agent graph claims an edge between two nodes that both appear in the heuristic graph but that edge is absent heuristically. Indicates a possible over-claimed capability. Only active when --ingest-graph is used.
  • Graph extraction prompt module (agents.graph_base, agents.graph_claude, agents.graph_codex, agents.graph_cursor): parallel to the critique prompt module. Claude variant uses XML tags and a full worked example; Codex uses markdown headers and a full worked example; Cursor uses a compact type signature only.
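
The graph.contradiction.heuristic_disagreement condition above can be expressed as a small set computation. This is a sketch, not the actual analyzer: the edge representation as (source, relation, target) tuples and the function name are assumptions.

```python
from typing import List, Set, Tuple

Edge = Tuple[str, str, str]  # (source node, relation, target node)


def heuristic_disagreements(
    heuristic_nodes: Set[str],
    heuristic_edges: Set[Edge],
    agent_edges: Set[Edge],
) -> List[Edge]:
    """Agent edges whose endpoints both exist heuristically but whose edge does not."""
    return sorted(
        edge
        for edge in agent_edges
        if edge[0] in heuristic_nodes
        and edge[2] in heuristic_nodes
        and edge not in heuristic_edges
    )


nodes = {"validate", "report", "SKILL.md"}
heuristic = {("validate", "requires", "SKILL.md")}
agent = {
    ("validate", "requires", "SKILL.md"),  # agrees with the heuristic graph
    ("validate", "produces", "report"),    # both nodes known, edge missing
    ("validate", "produces", "summary"),   # "summary" is unknown heuristically
}
disagreements = heuristic_disagreements(nodes, heuristic, agent)
```

Edges that touch a node the heuristic graph never saw are skipped here, matching the rule's wording that both nodes must appear in the heuristic graph.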

Verification

After installing skillcheck==1.0.0:

skillcheck --version
# skillcheck 1.0.0

skillcheck skills/skillcheck/SKILL.md --analyze-graph
# Should exit 0 with no errors and no warnings (only INFO diagnostics)

v0.2.0

@moonrunnerkc moonrunnerkc released this 12 Mar 01:28

What's new in 0.2.0

First feature release. Adds cross-agent compatibility checks, file reference validation, progressive disclosure budgeting, description quality scoring, and a drop-in GitHub Action.

Highlights

GitHub Action -- three lines of YAML to add skillcheck to any CI pipeline. PR annotations, job summary table, and JSON output included. See the README for setup.

Cross-agent compatibility warnings -- flags Claude Code-only fields, VS Code directory-name requirements, and fields with unverified behavior in Codex and Cursor. Full compatibility matrix across four agents.

File reference validation -- parses markdown links and frontmatter directives, verifies referenced files exist on disk, catches symlink escapes (CWE-59), and warns when references go deeper than one directory level.

Progressive disclosure budget -- three-tier token budgeting (metadata at ~100 tokens, body at <5,000, resources on demand). Flags oversized code blocks, large tables, and embedded base64.

Description quality scoring -- scores 0-100 for agent discoverability. Checks action verbs, trigger phrases, keyword density, specificity, and length. Enforce a minimum with --min-desc-score N.

YAML type coercion detection -- catches when yaml.safe_load silently converts bare values like true, 123, or null into non-string types. Clear fix advice included.
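
A sketch of the coercion check, operating on an already-parsed mapping such as the one yaml.safe_load returns for name: true and description: 123. The helper name, the choice of fields expected to be strings, and the result shape are assumptions.

```python
from typing import Any, List, Mapping, Sequence, Tuple


def coerced_fields(
    frontmatter: Mapping[str, Any],
    string_fields: Sequence[str] = ("name", "description"),
) -> List[Tuple[str, str]]:
    """Return (field, actual type name) for fields that should be strings but are not."""
    findings = []
    for field in string_fields:
        if field in frontmatter and not isinstance(frontmatter[field], str):
            findings.append((field, type(frontmatter[field]).__name__))
    return findings


# Shape of the mapping a YAML loader produces for bare true / 123 values.
parsed = {"name": True, "description": 123}
findings = coerced_fields(parsed)
```

The usual fix advice in this situation is to quote the value in the frontmatter (for example name: "true") so the loader keeps it as a string.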

New CLI flags

  • --strict-vscode promotes VS Code compatibility issues from INFO to ERROR
  • --target-agent {claude,vscode,all} scopes checks to a specific agent
  • --skip-dirname-check and --skip-ref-check for CI without filesystem context
  • -q/--quiet suppresses all output (exit code only)
  • --min-desc-score N enforces a minimum description quality threshold

Bug fixes

  • Fixed duplicate diagnostics for ../../ reference paths (both depth-exceeded and traverses-above fired; now only the most specific one does)
  • Corrected sizing rule descriptions in the README

Install

pip install skillcheck==0.2.0

Full changelog: CHANGELOG.md