feat: implement OCR regression harness with golden dataset and CI integration #521
…nd automated workflow support
|
@bytebinders Great news! 🎉 Based on an automated assessment of this PR, the linked Wave issue(s) no longer count against your application limits. You can now already apply to more issues while waiting for a review of this PR. Keep up the great work! 🚀 |
|
@Cedarich , could you drop a few more GitHub issues for me to work on? I’m ready for more 🔧🙂 |
|
Please fix the workflow |
|
@Cedarich the issue was a missing file, sampl_001.png, but I have fixed it. |
|
Fix workflow |
…d field extraction
|
@bytebinders is attempting to deploy a commit to the Cedarich's projects Team on Vercel. A member of the Team first needs to authorize it. |
|
fixed |
|
@Cedarich fixed |
|
Please fix the failing Python test |
|
@Cedarich fixed |
|
Merge everything and let's finish this up. |
|
@Cedarich even though I know I won't get any points for this, I still managed to finish it. Thanks! |
|
Thank you so much for the contribution. You will get points for it: go to the #tickets channel on Discord, scroll up a bit until you see the "I didn't receive points for an issue" button, click it, and select this issue. The points will then be sent to you. |
That only works within 7 days after the wave, and I know it. I just want to deliver what I started.
|
@Cedarich, thank you for trusting me. That alone means a lot to me and is greatly appreciated. Thanks so much! |
|
Thank you so much 🙏🏿 See you next wave 🌊 |
✅ PR Description:
What was done:
- app/ai-service/regression_harness/: the regression harness.
- regression_harness/dataset/: the golden dataset.
- evaluator.py: compares actual OCR output against expected fields, supporting text normalization, error classification, and confidence tracking.
- cli.py: runs suites locally, providing human-readable console summaries and machine-readable JSON reports for CI artifacts.
- .github/workflows/ocr-regression.yml: triggers automatically on OCR-related changes.
- README.md: covers tool usage, adding new samples, and maintenance.
Why it was done:
To establish a reliable, low-maintenance testing infrastructure that ensures OCR extraction accuracy is preserved as the AI models, prompts, or preprocessing steps evolve.
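For reviewers skimming this description, here is a rough sketch of the kind of comparison evaluator.py is described as doing: normalize text, classify each field, and carry confidence through to a JSON-friendly summary. This is not the code in this PR. The function names (normalize, evaluate, summarize), the four status labels, and the {"value", "confidence"} shape of the OCR output are all assumptions made for illustration.

```python
from __future__ import annotations

import json
import re
import unicodedata
from dataclasses import asdict, dataclass
from typing import Optional


def normalize(text: str) -> str:
    """Assumed normalization: NFKC, case-fold, collapse runs of whitespace."""
    text = unicodedata.normalize("NFKC", text).casefold()
    return re.sub(r"\s+", " ", text).strip()


@dataclass
class FieldResult:
    field: str
    status: str  # hypothetical labels: "match", "mismatch", "missing", "unexpected"
    expected: Optional[str]
    actual: Optional[str]
    confidence: Optional[float]


def evaluate(expected: dict, actual: dict) -> list:
    """Compare OCR output with golden fields.

    `expected` maps field name -> golden string.
    `actual` maps field name -> {"value": str, "confidence": float}
    (a hypothetical shape, not necessarily what the real service returns).
    """
    results = []
    for name, want in expected.items():
        got = actual.get(name)
        if got is None or not got.get("value"):
            results.append(FieldResult(name, "missing", want, None, None))
            continue
        same = normalize(got["value"]) == normalize(want)
        status = "match" if same else "mismatch"
        results.append(
            FieldResult(name, status, want, got["value"], got.get("confidence"))
        )
    # Fields the OCR produced that the golden sample does not list.
    for name in sorted(actual.keys() - expected.keys()):
        got = actual[name]
        results.append(
            FieldResult(name, "unexpected", None, got.get("value"), got.get("confidence"))
        )
    return results


def summarize(results: list) -> dict:
    """Build a JSON-serializable report: counts per status plus mean confidence."""
    counts = {}
    for r in results:
        counts[r.status] = counts.get(r.status, 0) + 1
    scored = [r.confidence for r in results if r.confidence is not None]
    return {
        "counts": counts,
        "mean_confidence": sum(scored) / len(scored) if scored else None,
        "fields": [asdict(r) for r in results],
    }


if __name__ == "__main__":
    golden = {"name": "Jane Doe", "total": "42.00", "date": "2024-01-01"}
    ocr = {
        "name": {"value": "  JANE   doe ", "confidence": 0.9},
        "total": {"value": "42.80", "confidence": 0.5},
    }
    print(json.dumps(summarize(evaluate(golden, ocr)), indent=2))
```

Keeping the classification separate from the summary means the same results list could feed both a console table and a JSON artifact, which is the split the cli.py bullet describes.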
How it was verified:
Summary of Work:
Required line:
Closes #464