Fix missing label retry state leaking across ontology processing#272
Fix missing label retry state leaking across ontology processing#272
Conversation
There was a problem hiding this comment.
One minor issue: silent success on page retrieval failure
Suppose paging.page(page, size).all at line 110 raises a StandardError (e.g., a SPARQL timeout, network. error, or triplestore issue).
Step 1 — The error is caught and swallowed (lines 109-117):
begin
page_classes = paging.page(page, size).all # raises StandardError
total_pages = page_classes.total_pages
page_len = page_classes.length
rescue StandardError => e
page_classes = [] # line 114
logger.error(...) # line 115 — logged, but execution continues
logger.flush
endAfter the rescue: page_classes = [], total_pages = 0 (from line 107 initialization).
Step 2 — The retry block is skipped (line 120):
if total_pages > 0 && page_classes.empty? && ...Since total_pages = 0, this entire block (lines 120-147) is skipped. The retry logic never fires — the error is treated as "there are simply no pages."
Step 3 — The loop breaks (lines 149-156):
if page_classes.empty?
if total_pages > 0
...
else
logger.info("Ontology #{acr} contains #{total_pages} pages...") # line 153
end
break # line 155
endThe loop exits cleanly with a log message that says the ontology contains 0 pages — as if this is a normal, expected condition rather than an error.
Step 4 — Success status is set (lines 187-191):
# set the status on actions that have completed successfully
callbacks.each do |_, callback|
if callback[:status]
@submission.add_submission_status(callback[:status]) # line 189 — marks RDF_LABELS as SUCCESS
@submission.save
end
endThe submission is marked with the RDF_LABELS success status even though no labels were actually processed — the page fetch failed with an exception.
Step 5 — handle_missing_labels rescue is never triggered (lines 33-38):
Since loop_classes returned normally (no exception propagated), the rescue Exception => e at line 33 never fires, so the error status is never set.
Suggested fix
Re-raise the exception after logging on the initial page fetch, so it reaches the error status path in handle_missing_labels. The retry-loop rescue (lines 128-138) can remain as-is since the retry block already handles exhausted retries by raising at line 145.
# lines 109-117: re-raise on initial fetch failure
begin
page_classes = paging.page(page, size).all
total_pages = page_classes.total_pages
page_len = page_classes.length
rescue StandardError => e
page_classes = []
logger.error("Error retrieving page #{page} for #{acr}: #{e.class}: #{e.message}")
logger.flush
raise e # let handle_missing_labels rescue set the error status
end
Addresses