@lakhmanisahil commented Jan 5, 2026

Fixes #201

Summary

This PR implements a 15-minute timeout for builds to prevent them from remaining indefinitely in the RUNNING state. It introduces a new TIMED_OUT build state and enforces timeout handling independently in the builder and progress updater, following maintainer guidance.


Changes Made

1. Core Timeout Infrastructure

  • Added BuildState.TIMED_OUT to represent builds terminated due to timeout
  • Introduced a shared timeout constant (BUILD_TIMEOUT_SECONDS = 900) in common/config.py
  • Added time_started_running to BuildInfo to track when a build actually begins execution
  • Added error_message to BuildInfo to store timeout and failure details
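
A minimal sketch of how these pieces could fit together. Only BuildState.TIMED_OUT, BUILD_TIMEOUT_SECONDS, time_started_running, error_message, and time_created come from this PR; the PENDING state, the enum values, and the dataclass layout are assumptions made for illustration and may not match the actual definitions.

```python
import time
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional

# Shared constant (15 minutes), as described for common/config.py.
BUILD_TIMEOUT_SECONDS = 900


class BuildState(Enum):
    PENDING = "PENDING"      # hypothetical initial state, not named in the PR
    RUNNING = "RUNNING"
    SUCCESS = "SUCCESS"
    FAILURE = "FAILURE"
    TIMED_OUT = "TIMED_OUT"  # new state introduced by this PR


@dataclass
class BuildInfo:
    # Field layout is an assumption; only the last two fields are new.
    state: BuildState = BuildState.PENDING
    time_created: float = field(default_factory=time.time)
    time_started_running: Optional[float] = None  # set on entering RUNNING
    error_message: Optional[str] = None           # timeout or failure details
```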

2. Builder (builder/builder.py)

  • Enforced timeout on all build subprocess steps (configure, clean, build) using timeout=BUILD_TIMEOUT_SECONDS
  • Added handling for subprocess.TimeoutExpired:
    • Terminates the running process
    • Aborts the build cleanly
    • Records a clear timeout error message
  • Added __check_if_timed_out() to detect if the progress updater has already marked a build as TIMED_OUT
  • Builder checks timeout state between build steps for early termination
  • Skips archive generation for timed-out builds
  • Ensures cleanup runs even when a build times out or fails
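
The subprocess handling above could look roughly like the sketch below. The helper name run_step and its return shape are hypothetical; the only parts taken from this PR are the use of timeout=BUILD_TIMEOUT_SECONDS and the handling of subprocess.TimeoutExpired.

```python
import subprocess
import sys

BUILD_TIMEOUT_SECONDS = 900


def run_step(cmd, timeout=BUILD_TIMEOUT_SECONDS):
    """Run one build step and return (ok, error_message).

    Hypothetical helper; the real builder may structure this differently.
    """
    try:
        subprocess.run(cmd, timeout=timeout, check=True)
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child and waits for it before
        # re-raising, so the direct child is gone by the time we get here.
        return False, f"Build step timed out after {timeout} seconds"
    except subprocess.CalledProcessError as exc:
        return False, f"Build step failed with exit code {exc.returncode}"
    return True, None
```

For example, `run_step([sys.executable, "-c", "import time; time.sleep(5)"], timeout=0.5)` returns a failure with a timeout message rather than blocking for the full five seconds.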

3. Build Manager (build_manager/manager.py)

  • Added mark_build_timed_out() to safely transition builds to TIMED_OUT
  • Prevents overriding terminal states (SUCCESS, FAILURE)
  • Automatically records time_started_running when a build enters the RUNNING state
  • Extended BuildInfo.to_dict() using getattr() to maintain backward compatibility with existing Redis entries
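
To illustrate the state-transition rules, here is a sketch that uses an in-memory dict as a stand-in for the Redis-backed store. mark_build_timed_out() and time_started_running come from this PR; TERMINAL_STATES, add_build(), set_state(), and the storage layout are assumptions for the example.

```python
import time
from enum import Enum


class BuildState(Enum):
    RUNNING = "RUNNING"
    SUCCESS = "SUCCESS"
    FAILURE = "FAILURE"
    TIMED_OUT = "TIMED_OUT"


# States that a timeout must never overwrite.
TERMINAL_STATES = {BuildState.SUCCESS, BuildState.FAILURE}


class BuildManager:
    def __init__(self):
        self._builds = {}  # build_id -> dict; stand-in for Redis

    def add_build(self, build_id, state):
        self._builds[build_id] = {
            "state": state,
            "time_started_running": None,
            "error_message": None,
        }

    def set_state(self, build_id, state):
        build = self._builds[build_id]
        # Record the start time the first time the build enters RUNNING.
        if state is BuildState.RUNNING and build["time_started_running"] is None:
            build["time_started_running"] = time.time()
        build["state"] = state

    def mark_build_timed_out(self, build_id, error_message):
        """Return True if the build was moved to TIMED_OUT."""
        build = self._builds.get(build_id)
        if build is None or build["state"] in TERMINAL_STATES:
            return False
        build["state"] = BuildState.TIMED_OUT
        build["error_message"] = error_message
        return True
```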

4. Progress Updater (build_manager/progress_updater.py)

  • Added __check_build_timeout() invoked from the existing periodic update loop
  • Timeout is measured from time_started_running (not time_created), so time spent before the build starts running does not count toward the limit
  • Handles edge cases where time_started_running is not yet available
  • Marks builds as TIMED_OUT once the timeout threshold is exceeded
  • Added handling for TIMED_OUT in state and progress update paths
  • Includes clear logging for timeout detection
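
The check performed on each pass of the periodic loop can be sketched as a pure function. The function name and signature are hypothetical, and treating a missing time_started_running as "not timed out yet" is an assumption about how the edge case is handled.

```python
import time

BUILD_TIMEOUT_SECONDS = 900


def has_timed_out(state, time_started_running, now=None,
                  timeout=BUILD_TIMEOUT_SECONDS):
    """Return True if a RUNNING build has exceeded the timeout."""
    if state != "RUNNING":
        return False
    if time_started_running is None:
        # Start time not recorded yet; skip and re-check on the next pass
        # (assumed behaviour).
        return False
    now = time.time() if now is None else now
    return now - time_started_running > timeout
```

Passing `now` explicitly keeps the check deterministic in tests; the periodic loop would call it without that argument.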

5. Web API (web/app.py)

  • Added /api/builds/<build_id>/status endpoint for lightweight polling
  • Returns build state, progress, and timeout error information
  • Improved robustness of get_all_builds() by skipping and logging individual build errors instead of failing the entire request
  • Updated API usage to align with recent changes (remote_name / commit_ref)
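
The two API behaviours above can be sketched without any web framework. Both function names, the payload keys, and the load_build callback are hypothetical; the actual route handler in web/app.py may differ.

```python
import logging

logger = logging.getLogger(__name__)


def build_status_payload(build_id, build):
    """Shape a lightweight status response from a stored build dict."""
    return {
        "build_id": build_id,
        "state": build.get("state"),
        "progress": build.get("progress"),
        # Populated for timed-out or failed builds, otherwise None.
        "error_message": build.get("error_message"),
    }


def collect_builds(build_ids, load_build):
    """Load every build, skipping and logging the ones that fail."""
    results = []
    for build_id in build_ids:
        try:
            results.append(load_build(build_id))
        except Exception:
            # One bad entry should not fail the whole listing.
            logger.exception("Skipping build %s", build_id)
    return results
```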

Implementation Details

As suggested by the maintainer, timeout handling is implemented independently in two places:

  1. Builder subprocess timeout
    Each subprocess call is bounded by BUILD_TIMEOUT_SECONDS. If exceeded, the process is terminated and the build is aborted.

  2. Progress updater timeout detection
    The existing periodic task checks whether any RUNNING build has exceeded the timeout since entering the running state and marks it as TIMED_OUT.

Both mechanisms coordinate via BuildState.TIMED_OUT:

  • Either mechanism may trigger the timeout
  • Builder exits early if the progress updater has already marked the build as timed out
  • No direct coupling between the two workflows

This preserves separation of responsibilities while keeping behavior consistent.
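
A sketch of the builder side of this coordination, assuming the builder can ask whether the build has been marked TIMED_OUT. run_build, is_timed_out, and cleanup are hypothetical names; in this PR the check is done by __check_if_timed_out().

```python
def run_build(steps, is_timed_out, cleanup):
    """Run named steps in order, stopping early on an external timeout.

    steps:        list of (name, callable) pairs, e.g. configure/clean/build
    is_timed_out: callable returning True once the build is TIMED_OUT
    cleanup:      callable that always runs, even on early exit or error
    Returns (completed_step_names, finished).
    """
    completed = []
    try:
        for name, step in steps:
            if is_timed_out():
                # The progress updater already marked the build; skip the
                # remaining steps (including archive generation).
                return completed, False
            step()
            completed.append(name)
        return completed, True
    finally:
        cleanup()
```

The builder only reads the shared state here, so neither workflow needs to call into the other.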


Testing

  • Built and tested locally using Docker
  • Verified subprocess timeouts terminate long-running builds
  • Verified progress updater correctly marks timed-out builds
  • Confirmed both mechanisms operate independently
  • Verified timed-out builds do not generate archives
  • Confirmed timeout errors are logged and exposed via API
  • No circular imports or startup errors observed

Backward Compatibility

  • New fields (time_started_running, error_message) are accessed via getattr()
  • No schema or Redis migration required
  • Existing builds continue to function normally
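
As an illustration of the getattr() approach, the sketch below serializes an object that was created before the new fields existed. LegacyBuildInfo and the free-standing to_dict() are stand-ins for the example; the real BuildInfo.to_dict() has more fields.

```python
class LegacyBuildInfo:
    """Stand-in for an entry stored before this PR (no new attributes)."""

    def __init__(self, build_id, state):
        self.build_id = build_id
        self.state = state


def to_dict(info):
    return {
        "build_id": info.build_id,
        "state": info.state,
        # getattr() with a default avoids AttributeError on old entries.
        "time_started_running": getattr(info, "time_started_running", None),
        "error_message": getattr(info, "error_message", None),
    }
```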

Future Work (not included)

  • Make timeout configurable via environment variable
  • UI support for retrying timed-out builds

Supporting Images (checked for 2 minutes)

[Images: Screenshot from 2026-01-05 22-47-47, Screenshot from 2026-01-05 22-44-02, Screenshot from 2026-01-05 22-43-46]

@lakhmanisahil (Author) commented

Hello @shiv-tyagi,
I’ve reviewed all the suggested changes. I’ll address them and push an update shortly.

As suggested, we use time_started instead of time_started_running

Co-authored-by: Shiv Tyagi
Development

Successfully merging this pull request may close these issues.

Introduce a Timed-Out state for builds