
fix: add missing commas in LLM prompts to ensure valid JSON output #258

Open
Himanshuwagh wants to merge 1 commit into VectifyAI:main from Himanshuwagh:issues-wagh

Conversation

@Himanshuwagh

Title: Fix JSON parsing errors by adding missing commas in LLM prompts

Summary

This PR fixes intermittent KeyError exceptions and JSON parsing failures by ensuring that every LLM prompt in pageindex/page_index.py specifies a syntactically valid JSON output structure.

The Problem

Several prompts were missing the comma between the "thinking" key and the key that follows it. When the model followed the prompt's format literally, it produced invalid JSON, which led to errors such as:

KeyError: 'toc_detected' (and similar for 'completed', 'page_index_given_in_toc')
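To illustrate the failure mode (a minimal sketch; the model output and the parsing fallback shown here are hypothetical, not the actual code in page_index.py): if the response parser falls back to an empty dict on invalid JSON, any later key lookup raises the KeyError reported above.

```python
import json

# Hypothetical model output that follows the broken prompt format literally:
# the missing comma between "thinking" and "toc_detected" makes it invalid JSON.
model_output = '{"thinking": "The page lists chapter titles" "toc_detected": "yes"}'

def parse_response(text):
    """Lenient parse; assumes the caller falls back to an empty dict on failure."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return {}

result = parse_response(model_output)
try:
    result['toc_detected']
except KeyError as exc:
    print(f"KeyError: {exc}")  # prints: KeyError: 'toc_detected'
```

Whether the lookup succeeds therefore depends entirely on the prompt requesting well-formed JSON in the first place.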

Changes Made

Added trailing commas to the "thinking" lines in the following functions within pageindex/page_index.py:

  • check_title_appearance (Line 34)
  • check_title_appearance_in_start (Line 62)
  • toc_detector_single_page (Line 112)
  • check_if_toc_extraction_is_complete (Line 132)
  • check_if_toc_transformation_is_complete (Line 150)
  • detect_page_index (Line 213)
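The shape of the fix can be sketched as follows (the template excerpts are hypothetical; the real prompt wording in pageindex/page_index.py differs, but the missing comma is the same):

```python
import json

# Hypothetical before/after excerpts of a prompt's requested reply format.
before_fix = '''{
    "thinking": "<your reasoning>"
    "toc_detected": "<yes or no>"
}'''

after_fix = '''{
    "thinking": "<your reasoning>",
    "toc_detected": "<yes or no>"
}'''

def is_valid_json(text: str) -> bool:
    """Check whether a prompt's example structure parses as JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

print(is_valid_json(before_fix))  # False: comma missing after the "thinking" value
print(is_valid_json(after_fix))   # True
```

Running a check like this against each prompt's example structure is a quick way to confirm all six templates now parse cleanly.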

Testing

  • Verified that the example structures in the affected prompts now parse as valid JSON.
  • Confirmed that this resolves the KeyError raised when the LLM follows the prompt's format literally.

