PatternInspector false-positives on benign data: blocked-command regexes match anywhere in the full command string

### Summary

`PatternInspector` matches every blocked-bash pattern against the **entire raw command string** with no shell-context awareness:

`.claude/hooks/security/inspectors/PatternInspector.ts:79`
```ts
return new RegExp(pattern, 'i').test(command);
```

`.claude/PAI/DOCUMENTATION/Security/Patterns.example.yaml:10`
```
# matchesPattern() does regex.test(fullCommand) — matches anywhere in string
```

Several destructive-command rules are short, bare substrings (e.g. the `mkfs` blocked pattern). Because the match is case-insensitive and unanchored against the whole command, these rules fire on **benign data** that merely *contains* the substring — opaque identifiers, filenames, hashes, quoted strings, or heredoc bodies — and the command is hard-denied (`exit 2`).

### Reproduction

All three of these are legitimate, non-destructive commands that get blocked:

1. **Identifier contains the substring.** Append an opaque id that happens to contain `mkfs` case-insensitively (e.g. an id like `aB3Smkf​Sz9` → lowercases to `...smkfs...`):
   ```
   printf '%s\n' "aB3SmkfSz9" >> ids.txt
   ```
   → BLOCKED: "Filesystem format" (the `mkfs` rule matched inside the identifier).

2. **Searching FOR the term.** A grep whose *argument* contains a blocked token is blocked by its own argument:
   ```
   grep "mkfs" somefile
   ```
   → BLOCKED.

3. **Heredoc/regex literals as data.** A Python/Bash heredoc that references `rm`, `dd`, or `dump` as regex literals (not as commands) is blocked because the tokens appear anywhere in the command text.

### Impact

Any automation that handles arbitrary identifiers or free text — data migrations, log processing, search, batch scripts — hits spurious denials. A common secondary effect: users rephrase the command to work around it, and language like "avoid the hook" then trips the auto-mode classifier as an apparent bypass — compounding the friction for a benign action.

### Why it's subtle (design nuance for any fix)

Not all patterns should change. **Secret-egress** patterns (`sk-ant-`, `sk_live_`, `whsec_`, `PRIVATE KEY`, …) *intentionally* scan the full text including data — a leaked key inside an argument should match. The false-positive problem is specific to **destructive-command** patterns (`mkfs`, `rm`, `dd`, `diskutil …`), which describe a *command being invoked*, not arbitrary text.

### Suggested fix

- Make destructive-command patterns **command-position aware** — anchor to start-of-command / after `;` `|` `&` `(` / word boundaries — or tokenize the command (and/or strip quoted-string & heredoc bodies) before matching this class.
- Keep secret-egress patterns scanning the full command text unchanged.
- Minimal illustrative tightening for the format rule:
  ```
  mkfs   →   (?:^|[\s;&|(])mkfs(?:\.\w+)?\b
  ```
  This still blocks a real `mkfs.ext4 /dev/disk2` while no longer matching the substring inside an identifier/filename/quoted string.

### Environment

PAI v5.0.0 (paths verified against `Releases/v5.0.0/.claude/...` on `main`).


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

PatternInspector false-positives on benign data: blocked-command regexes match anywhere in the full command string #1335

Summary

Reproduction

Impact

Why it's subtle (design nuance for any fix)

Suggested fix

Environment

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Uh oh!

PatternInspector false-positives on benign data: blocked-command regexes match anywhere in the full command string #1335

Description

Summary

Reproduction

Impact

Why it's subtle (design nuance for any fix)

Suggested fix

Environment

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions