moonrunnerkc · Apr 28, 2026
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -7,12 +7,33 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+## [1.1.0] - 2026-04-28
+
+An external audit against v1.0.1 surfaced eight repo defects ranging from documentation drift to an exit-code conflation that confused CI. v1.1.0 ships fixes for these, apart from the open items recorded in `RELEASE_NOTES_v1.1.0.md`, reverses one v1.0.1 behavior change that turned out wrong, and tightens the description scorer's vague-word rubric. The minor bump is driven by the exit-code semantics change (warning-only runs are now distinguished from input errors) and the new `--warnings-as-errors` flag.
+
+### Behavior change
+
+- Warning-only CLI reports now return exit code 0 by default, reversing v1.0.1's "warnings exit 2" decision. Exit code 2 is now reserved for tool-misuse / input errors (missing path, conflicting flags, empty directory) so CI consumers can distinguish tool misuse from skill-content findings. Pass `--warnings-as-errors` to escalate warning-only runs to exit code 1 for stricter gates. Errors remain exit 1; semantic drift remains exit 3.
+
 ### Added
+
+- `--warnings-as-errors` flag: escalate warning-only runs to exit 1 for CI configurations that want warnings to block.
 - `scripts/summarize_batch.py` and `tests/test_batch15_summarize.py`: maintainer-facing tool that consumes a directory of skillcheck batch-run artifacts (one directory per repo, one subdirectory per skill, paired `*.json` / `*.txt` reports per phase) and writes `summary.csv` plus `findings.md`. Invoked as `python scripts/summarize_batch.py <batch_dir>`. Not exposed as a console script, not wired into the GitHub Action; the action runs skillcheck against one path, this consumes outputs across many. Documented under Maintainer Notes in the README.
 - `tests/test_readme_test_count_claim.py`: parses the README's "N tests cover ..." sentence and asserts it matches `pytest --collect-only`. The next time the suite grows without bumping the README number, CI fails. Closes the recurring drift pattern that v1.0.1 had to correct twice.
 
 ### Changed
-- README test count bumped from 663 to 664 to include the new drift-guard test.
+
+- `action.yml` install step now requires `skillcheck>=1.0.1` (a minimum version, not an exact pin) so consumers fail loudly when no v1 release is installable instead of silently running v0.2.0.
+- Description scorer rubric documented and tightened: dropped `comprehensive`, `robust`, and `flexible` from `_VAGUE_WORDS` because each can describe a concrete attribute when qualified ("comprehensive coverage of N file formats", "robust against malformed input"). The inclusion rubric is now documented inline. Verified against `anthropics/skills` (17 SKILL.md files): zero score changes, because none of those skills use the dropped words. The rubric edit is a no-op against the current corpus; the new regression tests are forward-looking guards against scoring drift if the list is ever re-expanded.
+- Description scorer verb matching: collapsed `_ACTION_VERBS` from 86 entries (base + 3rd-person duplicates) to 42 base forms. Added `_is_action_verb()` to handle stem normalization across `-s`, `-es`, and `-ies` endings. Adding a new verb now only requires the base form.
+- README test count bumped from 663 to 667 to include the drift-guard test, two description-scorer regression tests, and the `--warnings-as-errors` test.
+- README field-test citations: replaced seven gitignored `runs/...` path references with the exact `skillcheck` commands needed to reproduce each finding. Readers can now verify the claims without access to private artifacts.
+- README exit-code table reflects the new semantics; flag table documents `--warnings-as-errors`.
+
+### Removed
+
+- Top-level `git-commit-crafter` SKILL.md from the repo root. It was unrelated to skillcheck and confused first-time readers; the canonical example lives at `skills/skillcheck/SKILL.md`.
+- False `@v0` tag claim from the README. Only `@v0.2.0` was ever pushed; the action-install snippet no longer suggests a tag that does not exist. CHANGELOG entries that referenced `@v0` corrected to `@v0.2.0`.
 
 ## [1.0.1] - 2026-04-28
 
@@ -71,7 +92,7 @@ End-to-end verification against `anthropics/skills` surfaced documentation drift
 ## [0.2.0] - 2026-03-11
 
 ### Added
-- **GitHub Action**: composite action (`moonrunnerkc/skillcheck@v0`) with PR annotations, job summary table, and JSON output. All CLI flags exposed as action inputs. Three lines of YAML to add to any CI pipeline.
+- **GitHub Action**: composite action (`moonrunnerkc/skillcheck@v0.2.0`) with PR annotations, job summary table, and JSON output. All CLI flags exposed as action inputs. Three lines of YAML to add to any CI pipeline.
 - **`__main__.py` entry point**: `python -m skillcheck` now works as an alternative to the console script.
 - **File reference validation**: parses markdown body for `[text](path)`, `![alt](path)`, and `source:`/`file:`/`include:` directives; verifies referenced files exist on disk; warns when references exceed one directory level from SKILL.md.
 - **Progressive disclosure budget**: three-tier token budgeting: metadata/frontmatter at ~100 tokens, body at <5,000 tokens, resources loaded on demand. Flags oversized code blocks (>50 lines), large tables (>20 rows), and embedded base64.

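The verb-matching entry above describes storing only base forms and normalizing `-s`, `-es`, and `-ies` endings at lookup time. A minimal sketch of that idea follows. The function names and the verb list here are illustrative assumptions, not skillcheck's actual `_ACTION_VERBS` or `_is_action_verb()` source.

```python
# Hypothetical sketch of base-form verb matching with suffix normalization.
# ACTION_VERBS and both function names are illustrative, not skillcheck's code.
ACTION_VERBS = frozenset(
    {"parse", "validate", "analyze", "fix", "watch", "verify", "apply", "generate"}
)


def candidate_stems(word: str) -> list[str]:
    """Return the word plus every base form its ending could map back to."""
    w = word.lower()
    stems = [w]
    if w.endswith("ies") and len(w) > 3:
        stems.append(w[:-3] + "y")  # verifies -> verify
    if w.endswith("es") and len(w) > 2:
        stems.append(w[:-2])        # fixes -> fix, watches -> watch
    if w.endswith("s") and len(w) > 1:
        stems.append(w[:-1])        # parses -> parse, generates -> generate
    return stems


def is_action_verb(word: str) -> bool:
    """True when the word, or one of its candidate stems, is a known base form."""
    return any(stem in ACTION_VERBS for stem in candidate_stems(word))
```

Trying every candidate stem, rather than picking one suffix rule, lets `parses` resolve to `parse` through the `-s` rule even though the `-es` rule also fires and yields the non-word `pars`. Words that merely end in `s`, such as `status`, fall through because none of their stems is in the set.
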
diff --git a/README.md b/README.md
@@ -69,7 +69,7 @@ skillcheck skills/            # recursive scan; finds every file named SKILL.md
 skillcheck SKILL.md --format json
 ```
 
-From the field test on Anthropic's official skills repository (18 skills, `runs/anthropics-corpus/01-symbolic-all.txt`, snapshot taken during v1.0 release prep in April 2026): four of eighteen files failed. `claude-api/SKILL.md` failed with `frontmatter.name.reserved-word` because the name contains the reserved word "claude". `template/SKILL.md` failed with `frontmatter.name.directory-mismatch` (name `template-skill`, directory `template`). Both files look correct on casual inspection.
+From the field test on Anthropic's official skills repository (18 skills, snapshot taken during v1.0 release prep in April 2026): four of eighteen files failed. `claude-api/SKILL.md` failed with `frontmatter.name.reserved-word` because the name contains the reserved word "claude". `template/SKILL.md` failed with `frontmatter.name.directory-mismatch` (name `template-skill`, directory `template`). Both files look correct on casual inspection. Reproduce: clone `anthropics/skills` and run `skillcheck skills/ --format text`.
 
 ### Heuristic Graph
 
@@ -83,7 +83,7 @@ skillcheck SKILL.md --emit-graph --format json
 
 Graph nodes: `Capability` (section headings), `Input` (backtick references required by capabilities), `Output` (backtick references produced by capabilities). Analyzers fire on orphaned capabilities with no declared I/O, unused inputs, unproduced outputs, capabilities with no description body, and `allowed-tools` entries not backtick-referenced in the body.
 
-From the field test on `mcp-builder/SKILL.md` (`runs/anthropics-mcp-builder/02-graph-analyze.txt`):
+From the field test on `mcp-builder/SKILL.md` (reproduce: `skillcheck skills/mcp-builder/SKILL.md --analyze-graph`):
 
 ```
    line 18  ⚠ warning  graph.capability.orphaned  Capability 'Understand Modern MCP Design'
@@ -109,7 +109,7 @@ skillcheck SKILL.md --agent-reason --format agent         # critique + graph pro
 
 `--critique-agent` selects a framing variant tuned for each platform (claude, codex, cursor). The schema and exit codes are identical across all variants.
 
-From the field test (`runs/anthropics-mcp-builder/04-critique-report.txt`): the symbolic run on `mcp-builder/SKILL.md` passed (exit 0), but the ingested critique returned exit 3 with three `semantic.contradiction.detected` errors. One:
+From the field test on `mcp-builder/SKILL.md`: the symbolic run passed (exit 0), but the ingested critique returned exit 3 with three `semantic.contradiction.detected` errors. One:
 
 ```
 ✗ error  semantic.contradiction.detected  Contradiction between 'Frontmatter
@@ -149,7 +149,7 @@ skillcheck SKILL.md --show-history --format json
 
 When `--history` is active and the current run fails on content that matched a prior passing run, skillcheck emits `history.skill.regressed` (WARNING). This surfaces rule tightening or new agent findings without requiring manual output comparison.
 
-From the field test (`runs/anthropics-mcp-builder/08-history.txt`):
+From the field test (reproduce: `skillcheck skills/mcp-builder/SKILL.md --history && skillcheck skills/mcp-builder/SKILL.md --show-history`):
 
 ```
 History ledger: SKILL.md
@@ -172,7 +172,7 @@ Three lines to add skillcheck to any CI pipeline:
     path: skills/
 ```
 
-Pin to `@v1` for the latest patch within the v1.0 major-version line, or `@v1.0.0` for an immutable release. The `@v0` tag remains in place for existing CI configurations.
+Pin to `@v1` for the latest patch within the v1 major-version line, or `@v1.0.0` for an immutable release.
 
 Failures block the PR. Errors and warnings appear as inline diff annotations on the changed files. The workflow run page gets a Markdown summary table. For the complete list of action inputs and outputs, see [`action.yml`](action.yml).
 
@@ -188,7 +188,7 @@ The v1.0 graph and critique modes are available as action inputs. Example with s
 
 ## Output
 
-Text output (default), excerpt from `runs/anthropics-corpus/01-symbolic-all.txt`:
+Text output (default), excerpt from a run against the Anthropic skills corpus:
 
 ```
 ✗ FAIL  skills/claude-api/SKILL.md
@@ -245,6 +245,7 @@ The JSON schema is stable. It will not change in a backward-incompatible way wit
 | `--min-desc-score N` | | Minimum description quality score (0-100); below this triggers a warning |
 | `--target-agent {claude,vscode,all}` | `all` | Scope compatibility checks to a specific agent |
 | `--strict-vscode` | `false` | Promote VS Code compatibility issues to errors |
+| `--warnings-as-errors` | `false` | Escalate warning-only runs to exit code 1 (default for warning-only is 0) |
 | `--semantic` | `false` | Enable semantic-adjacent validation; standalone mode runs heuristic graph analysis |
 | `--agent-reason` | `false` | Emit a combined critique + graph prompt packet for the calling agent |
 | `--emit-critique-prompt` | `false` | Print agent self-critique prompt to stdout and exit 0 |
@@ -264,12 +265,12 @@ The JSON schema is stable. It will not change in a backward-incompatible way wit
 
 | Code | Meaning | Example invocation |
 |---|---|---|
-| `0` | No errors and no warnings | `skillcheck skills/skillcheck/SKILL.md` |
-| `1` | One or more errors found | `skillcheck SKILL.md` when the name is invalid |
-| `2` | Warning-only report or input error | `skillcheck SKILL.md --max-lines 1` |
+| `0` | No errors (warning-only counts as a clean pass by default) | `skillcheck skills/skillcheck/SKILL.md` |
+| `1` | One or more errors found, or warnings with `--warnings-as-errors` | `skillcheck SKILL.md` when the name is invalid |
+| `2` | Input error: missing path, empty directory, conflicting flags, malformed argument | `skillcheck nonexistent.md` |
 | `3` | Symbolic passed but ingested critique found semantic errors | `skillcheck SKILL.md --ingest-critique response.json` when the agent reported contradictions |
 
-Exit code 1 takes priority over 3 when symbolic errors also exist.
+Pass `--warnings-as-errors` to escalate warning-only runs to exit 1 for stricter CI gates. Exit code 1 takes priority over 3 when symbolic errors also exist; code 2 is reserved for tool-misuse cases so CI can distinguish them from skill-content findings.
 
 ## Rules
 
@@ -320,7 +321,7 @@ Source tags: `spec` rules derive from the agentskills.io specification or agent-
 
 ## Case Study
 
-We ran skillcheck against three corpora during v1.0 release prep (April 2026 snapshots): Anthropic's official skills repository (18 skills), the `mcp-builder` skill through the full v1.0 pipeline, and five skills from the uxuiprinciples/agent-skills collection. Full run artifacts: `runs/anthropics-corpus/`, `runs/anthropics-mcp-builder/`, `runs/uxuiprinciples-corpus/`.
+We ran skillcheck against three corpora during v1.0 release prep (April 2026 snapshots): Anthropic's official skills repository (18 skills), the `mcp-builder` skill through the full v1.0 pipeline, and five skills from the uxuiprinciples/agent-skills collection. To reproduce, clone each upstream repo and run `skillcheck <path>` (the case study below records the exact invocations).
 
 The symbolic run of the Anthropic corpus returned four failures from eighteen files (exit 1). All four files look correct on review: two had second-person voice in the description, one used "claude" as part of the name (reserved word per spec), and the template skill had a name/directory mismatch. The deeper finding came from running `mcp-builder` through the critique pipeline: the symbolic run passed (exit 0), but the ingested agent critique returned exit 3 with three `semantic.contradiction.detected` errors. The skill's frontmatter offers Python and TypeScript as equal options; its body unconditionally recommends TypeScript in Phase 1.3. That inconsistency means any agent following the Python path hits an unresolved decision point. No static linter catches it. See [docs/case-study-v1-real-world-runs.md](docs/case-study-v1-real-world-runs.md) for the full breakdown.
 
@@ -347,7 +348,7 @@ pip install -e ".[dev]"
 python3 -m pytest tests/ -q
 ```
 
-664 tests cover all rule modules, CLI exit codes, graph analyzers, divergence detection, critique parsing, history round-trips, and the full self-host pipeline against `skills/skillcheck/SKILL.md`. Fixtures are in `tests/fixtures/`; every rule has at least one positive and one negative test case. `tests/test_readme_test_count_claim.py` asserts this count matches `pytest --collect-only`, so any future suite change has to update the number in the same commit or CI fails.
+667 tests cover all rule modules, CLI exit codes, graph analyzers, divergence detection, critique parsing, history round-trips, and the full self-host pipeline against `skills/skillcheck/SKILL.md`. Fixtures are in `tests/fixtures/`; every rule has at least one positive and one negative test case. `tests/test_readme_test_count_claim.py` asserts this count matches `pytest --collect-only`, so any future suite change has to update the number in the same commit or CI fails.
 
 ## Maintainer Notes
 

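The README paragraph above says `tests/test_readme_test_count_claim.py` compares the "N tests cover ..." sentence with `pytest --collect-only`. One way such a guard could be structured is sketched below. The helper names are hypothetical, and the sketch assumes pytest's quiet collection output ends with a summary line of the form "N tests collected"; the actual test file may work differently.

```python
import re
import subprocess
import sys

# Hypothetical helpers; the real drift-guard test may be structured differently.
CLAIM_RE = re.compile(r"\b(\d+) tests cover\b")
# Assumption: quiet collection output contains an "N tests collected" summary.
COLLECTED_RE = re.compile(r"\b(\d+) tests? collected\b")


def claimed_count(readme_text: str) -> int:
    """Extract N from the README's 'N tests cover ...' sentence."""
    match = CLAIM_RE.search(readme_text)
    if match is None:
        raise ValueError("README has no 'N tests cover' sentence")
    return int(match.group(1))


def collected_count(pytest_output: str) -> int:
    """Extract N from pytest's collection summary line."""
    match = COLLECTED_RE.search(pytest_output)
    if match is None:
        raise ValueError("no 'N tests collected' summary line found")
    return int(match.group(1))


def run_collect_only() -> str:
    """Run collection in a subprocess and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-m", "pytest", "--collect-only", "-q"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

A guard test would then assert `claimed_count(readme_text) == collected_count(run_collect_only())`, so a suite change that does not touch the README number fails in the same commit.
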
diff --git a/RELEASE_NOTES_v1.1.0.md b/RELEASE_NOTES_v1.1.0.md
@@ -0,0 +1,37 @@
+# skillcheck 1.1.0
+
+An external audit against v1.0.1 surfaced eight repo defects, including an unpinned GitHub Action install, gitignored evidence paths cited in the README, a top-level SKILL.md describing an unrelated skill, a missing `@v0` tag the README claimed existed, exit-code 2 conflating tool-misuse with warning-only reports, an oversized `cli.py`, and a vague-word list that flagged context-dependent terms like "comprehensive". v1.1.0 fixes all of them except the items listed under "Audit items not closed" below, and reverses one v1.0.1 behavior change that turned out wrong.
+
+## Behavior change
+
+Warning-only runs now return exit code **0** by default. v1.0.1 made them return 2; that conflated valid runs that produced warnings with tool-misuse cases (missing path, conflicting flags, empty directory). CI consumers couldn't tell the difference. v1.1.0 splits them: warnings exit 0, input errors exit 2, errors stay at 1, semantic drift stays at 3. The new `--warnings-as-errors` flag escalates warning-only runs to exit 1 for pipelines that want warnings to block.
+
+If your CI relied on v1.0.1's "warnings exit 2" behavior, add `--warnings-as-errors` to your skillcheck invocation, or pin to `@v1.0.1` until you can update.
+
+## Added
+
+- `--warnings-as-errors` flag.
+- Two regression tests guarding the description-scorer rubric.
+
+## Changed
+
+- `action.yml` install step now requires `skillcheck>=1.0.1`. Until v1.1.0 is uploaded to PyPI, the install fails loudly rather than silently resolving to v0.2.0.
+- Description scorer no longer penalizes `comprehensive`, `robust`, or `flexible` in descriptions. Each can describe a concrete attribute when qualified ("comprehensive coverage of N file formats", "robust against malformed input"). The inclusion rubric is now documented inline. Verified against `anthropics/skills`: zero score changes across 17 files, because none of those skills use the dropped words. The rubric edit is a no-op against the current corpus; the two new regression tests are forward-looking guards, not regression evidence.
+- Description scorer verb matching collapsed from 86 entries (base + 3rd-person duplicates) to 42 base forms with stem normalization. Adding a new verb now only requires the base form.
+- README field-test citations replaced gitignored `runs/...` paths with reproducible commands.
+- README exit-code table documents the new semantics; flag table documents `--warnings-as-errors`.
+- README test count: 663 → 667.
+
+## Removed
+
+- Top-level `git-commit-crafter` SKILL.md from the repo root.
+- False `@v0` tag claim from the README and CHANGELOG.
+
+## Why this is a minor and not a patch
+
+The exit-code semantics change is observable in CI and not opt-in. Adding `--warnings-as-errors` is also a public-surface addition. Either alone would be a minor bump under semver; together they aren't a patch.
+
+## Audit items not closed
+
+- **PyPI publish**: the v1.1.0 sdist and wheel are built and pass `twine check`, but PyPI upload requires authenticated credentials and happens out-of-band. Until that upload runs, `pip install skillcheck` continues to install v0.2.0, and the action's `skillcheck>=1.0.1` install step will fail rather than run.
+- **`cli.py` line count**: the audit asked for a refactor toward `main()` under 100 lines and `cli.py` under 700. An attempted helper extraction met the `main()` target but pushed total file size from 1127 to 1172. The refactor was reverted; the file remains at its pre-audit size, with the audit's "deliberate choice" path left open for a follow-up.
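
The exit-code semantics described under "Behavior change" in these release notes can be summarized as a small decision function. This is a sketch under stated assumptions, not skillcheck's `cli.py`: `RunOutcome` and `resolve_exit_code` are hypothetical names, and the ordering of escalated warnings ahead of exit 3 is an assumption the notes do not spell out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunOutcome:
    """Hypothetical summary of one run; not a skillcheck type."""
    input_error: bool = False   # missing path, empty directory, conflicting flags
    errors: int = 0             # symbolic rule errors
    warnings: int = 0           # symbolic rule warnings
    semantic_errors: int = 0    # errors from an ingested critique


def resolve_exit_code(outcome: RunOutcome, warnings_as_errors: bool = False) -> int:
    """Map a run outcome to the v1.1.0 exit codes described in the notes."""
    if outcome.input_error:
        return 2  # tool misuse, kept separate from skill-content findings
    if outcome.errors > 0:
        return 1  # symbolic errors take priority over exit 3
    if warnings_as_errors and outcome.warnings > 0:
        # Assumption: escalated warnings are ordered like errors, ahead of exit 3.
        return 1
    if outcome.semantic_errors > 0:
        return 3  # symbolic passed, ingested critique found semantic errors
    return 0      # clean, or warning-only without --warnings-as-errors
```

Under this model a pipeline that relied on v1.0.1's behavior sees `resolve_exit_code(RunOutcome(warnings=2))` return 0, and gets a blocking exit 1 only when `warnings_as_errors=True` is passed.
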
diff --git a/SKILL.md b/SKILL.md
diff --git a/action.yml b/action.yml
@@ -108,7 +108,7 @@ runs:
         if [ -n "$INPUT_VERSION" ]; then
           python -m pip install --quiet "skillcheck==$INPUT_VERSION"
         else
-          python -m pip install --quiet skillcheck
+          python -m pip install --quiet "skillcheck>=1.0.1"
         fi
 
     - name: Run skillcheck