fix(question-generator): support nested MinerU output in mimic mode#250
I checked the failing CI jobs on April 6, 2026 (run 24019400631). The current red checks are caused by the existing Python 3.10 dependency matrix rather than this question-generator fix. The actual failure is Import Check (Python 3.10), which stops during dependency installation with:
The Test Summary job is failing only because that import-check job fails. I also checked a recent upstream dev run from April 2, 2026 (run 23902376382), and it shows the same Python 3.10 import-check failure before this PR. I will open a separate minimal fix for the dependency marker so this PR can be evaluated on its actual code changes.
Follow-up: I opened a separate minimal dependency fix here: #251. If that lands first, rerunning the checks on this PR should remove the unrelated Python 3.10 import-check failure.
I pushed a follow-up commit to this PR that includes the same Python 3.10 dependency marker fix as #251, so the branch now contains both the question-generator fix and the CI/dependency unblock. At the moment, GitHub shows the new Tests run for this PR (run 24031098632, created on April 6, 2026) as action_required rather than failed, so the updated checks have not executed yet. Once workflow approval is granted and the run starts, this PR should be evaluated against the latest branch contents instead of the earlier failing run.
Summary
Why
Issue #182 shows that MinerU can emit markdown inside a nested parsed-output directory, while the extractor only checked one preferred directory or the paper root. In that case the parse step succeeds, but question extraction still fails because the markdown is never discovered.
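For reviewers who want the failure mode in concrete terms, here is a minimal sketch of the kind of lookup that avoids it. It is not the code in this PR: the function name `find_markdown`, the `preferred_subdir` parameter, and the directory names used below are hypothetical placeholders. The idea is to keep the two existing checks (preferred directory, then paper root) and fall back to a recursive search only when both come up empty.

```python
from pathlib import Path
from typing import Optional


def find_markdown(paper_dir: Path, preferred_subdir: str = "auto") -> Optional[Path]:
    """Return one markdown file for a parsed paper, or None if none exists.

    Hypothetical helper: the name, signature, and default subdirectory are
    assumptions made for illustration, not the project's actual API.
    """
    # Step 1: the two locations described above (preferred dir, then root).
    for candidate_dir in (paper_dir / preferred_subdir, paper_dir):
        if candidate_dir.is_dir():
            matches = sorted(candidate_dir.glob("*.md"))
            if matches:
                return matches[0]

    # Step 2: fallback for nested output. Search the whole tree and prefer
    # the shallowest match, breaking ties by path for determinism.
    nested = sorted(
        paper_dir.rglob("*.md"),
        key=lambda p: (len(p.relative_to(paper_dir).parts), str(p)),
    )
    return nested[0] if nested else None
```

Preferring the shallowest match is a design choice in this sketch, made so that a top-level document wins over any markdown that happens to sit deeper in the tree; the actual fix may rank candidates differently.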
Testing
Closes #182