chore: merge 256-a11y-detect-fix (a11y research docs + watch test fix) #275

Merged
pietgk merged 8 commits into main from chore/merge-256-a11y-docs on Jun 1, 2026
@pietgk pietgk commented Jun 1, 2026

Cleanup merge of the stale 256-a11y-detect-fix branch.

What's included

  • Accessibility research docs under docs/archive/spec/accessibility-research/ (relocated to follow the docs/spec → docs/archive/spec rename on main):
    • a11y-detect-fix-vision.md, a11y-detect-fix-vision-challenge.md
    • a11y-reference-storybook-specs.md, devac-stories.md
    • plan-a11y-detect-fix-zag-devac-1.md, test-scan-storybook-initial-plan-ideas.md
    • small additions to ramblings.md
  • watch.test.ts: increased macOS FSEvents test timeouts for reliability under parallel load (the only code change).

Dropped from the source branch

Scratch / unfinished files were intentionally excluded: tmp-v1.md, tmp-v2.md, tmp-v3..md, and the typo-named ai-will not eleminate-the-need-for-accessibility-professionals.md.

Notes

🤖 Generated with Claude Code

pietgk and others added 8 commits February 7, 2026 16:21
Add two new sections to the a11y detect & fix vision:
- Section 11: Apple HIG criteria that go beyond WCAG for native mobile
- Section 12: Proven XCTest + Storybook pipeline from ~/ws/app

Update existing sections (3, 4, 6, 7, 8, 10) to integrate Apple HIG
as a distinct testing dimension and foreground the proven PoC in Phase 4.
Add 5 glossary terms and 4 new open questions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The industry consensus that 25-35% of accessibility testing requires
human expertise was established with pre-LLM studies (Deque 2022, W3C).
Research from 2024-2025 shows LLMs can address most of this gap,
reducing the genuinely human-required portion to 5-10%.

Adds Sections 13-14 with three-tier decomposition (LLM-replaceable,
LLM+automation, genuinely human), criterion-by-criterion LLM capability
ratings, 7 research studies with evidence, counterargument analysis
(accessiBe FTC fine, Baymard error rates), and auditor expertise
breakdown. Updates coverage tables and bottom line throughout.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…s, RN test host, diagrams

- Upgrade all WCAG references from 2.1 to 2.2 with new criteria analysis
  (9 new SC including 2.5.8 Target Size, 3.2.6 Consistent Help, 3.3.8
  Accessible Auth; removal of 4.1.1 Parsing)
- Add Section 15: Storybook vs E2E testing gap — documents the 59/67
  component-level vs 8 page-level axe-core rule split, cross-page criteria,
  and runtime interaction criteria that need full-page E2E
- Add Section 16: React Native Storybook Test Host — establishes
  architectural need for dedicated test Expo app for native a11y scanning
- Add 5 Mermaid diagrams: testing layers pyramid, Storybook vs E2E
  comparison, phase progression Gantt, LLM tier pie chart, RN native pipeline
- Update coverage tables with E2E layer row, raising automated ceiling
  from ~55-65% to ~63-75% (pre-LLM) and ~78-95% (with LLM)
- Add LLM capability ratings for new WCAG 2.2 criteria in Section 13

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Replace block-beta pyramid with flowchart TD (block-beta renders flat)
- Simplify Storybook vs E2E diagram (remove cramped subgraph internals)
- Fix Gantt chart syntax: remove unsupported %q format, colons and
  special characters in task names, use explicit date format
- Remove <br/> tags from RN pipeline flowchart node labels

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Split multi-line node labels into separate connected nodes to avoid
  literal \n rendering in Storybook vs E2E diagram
- Use tickInterval 1month and short month format (%b) for Gantt chart
  to prevent overlapping x-axis labels
- Shorten Phase 5 label to fit better

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Cross-reference our a11y automation vision with Byrne-Haber's practitioner
critique to calibrate DevAC's capability claims. Maps 11 accessibility task
domains against our tier model, identifies where evidence supports our
position (Lopez-Gil 87.18%) and where practitioner concerns should temper
claims (mobile SR, context-heavy evaluation). Includes calibrated capability
matrix and risk management principles to avoid overlay-style overclaiming.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The watch file change detection tests were flaky due to three issues:
- 100ms watcher stabilization wait too short for macOS FSEvents under
  parallel test load (increased to 500ms)
- 2000ms detection timeouts too tight given the chokidar atomic:true
  delay (100ms) + debounce + FSEvents latency (increased to 5000ms)
- Performance test assertion threshold (2000ms) equaled the
  waitForCondition timeout, causing boundary race (separated to 10s
  wait / 5s assertion)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

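The boundary race described in the watch-test commit above can be illustrated with a minimal sketch. The body of `waitForCondition` here is a hypothetical stand-in (the commit names the helper but does not show its signature), and `measureDetection`, `WAIT_TIMEOUT_MS`, and `ASSERT_THRESHOLD_MS` are illustrative names, not identifiers from watch.test.ts. The idea is only that the wait timeout should be strictly larger than the duration the test asserts on, so a slow but passing run is not cut off exactly at the threshold.

```typescript
// Hypothetical polling helper (assumed shape, not the project's real one):
// resolves true as soon as `check` passes, or false once `timeoutMs` elapses.
async function waitForCondition(
  check: () => boolean,
  timeoutMs: number,
  intervalMs: number = 10,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (check()) return true;
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  // One final check so a condition met right at the deadline still counts.
  return check();
}

// Values taken from the commit message: 10s wait, 5s assertion threshold.
const WAIT_TIMEOUT_MS = 10000;
const ASSERT_THRESHOLD_MS = 5000;

// Illustrative wrapper: waits with the generous timeout and reports the
// elapsed time, leaving the stricter performance assertion to the caller.
async function measureDetection(
  check: () => boolean,
): Promise<{ detected: boolean; elapsedMs: number }> {
  const start = Date.now();
  const detected = await waitForCondition(check, WAIT_TIMEOUT_MS);
  return { detected, elapsedMs: Date.now() - start };
}
```

A test would then assert `detected` first and `elapsedMs < ASSERT_THRESHOLD_MS` second. When both limits were 2000ms, a detection arriving near the limit could fail on either check depending on timer jitter.
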
Brings in accessibility-research docs (relocated to docs/archive/spec/ per
the docs restructure on main) and the macOS FSEvents watch-test timeout fix.

Dropped scratch files from the branch: tmp-v1/v2/v3..md and the typo-named
"ai-will not eleminate..." file.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@pietgk pietgk merged commit 3071cb2 into main Jun 1, 2026
3 checks passed
@pietgk pietgk deleted the chore/merge-256-a11y-docs branch June 1, 2026 15:54