ci: DH-19408: PNAP Integration #417

Open
stanbrub wants to merge 39 commits into deephaven:main from stanbrub:pnap-integration

stanbrub (Collaborator) commented Feb 4, 2026

Wrote an integration with Phoenix NAP (our new bare metal provider).

  • Used the PNAP REST API (tried Terraform, OpenTofu, and cloud-init, but could not get any of them working, even with help from support)
  • We now run as a user instead of root
  • Upgraded from Ubuntu 22.04 to 24.04
  • Most changes have to do with the change in providers, but some are because of Ubuntu 24.04
  • We now have two wait states to check after deploy
    • Wait for the server to be up and reachable over SSH
    • Wait for APT to be unlocked (done updating) after first login
  • We now turn off all auto-updaters for both auto-provisioned and always-on servers
  • We now turn off ASLR in an attempt to minimize benchmark run variability
  • Purging expired servers is now our responsibility (Equinix let us set an expiration when creating a server)
    • Every 24 hours we purge ephemeral servers that have been alive for at least 24 hours
    • This only affects auto-provisioned servers, which are used about 10% of the time compared to the 9 hrs of the nightly always-on server. GitHub Actions outages that would affect cleanup (always()) routines occurred ~27 times last year, so the probability of the purge-expired function having an impact is 0.28%. (Gotta love Copilot)
    • It's still necessary to have the fallback
  • Updated the secrets document (The public one is here, and the actual keys are in the private vault)
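
The two wait states described above can be sketched as a small polling helper. This is a minimal illustration, not the code from this PR: the function names `wait_for`, `ssh_ready`, and `apt_unlocked` are hypothetical, and the lock file paths are the standard Ubuntu ones rather than anything confirmed by the changed scripts.

```shell
#!/usr/bin/env bash

# Retry a command until it succeeds or the attempts run out.
# Usage: wait_for <max_attempts> <sleep_seconds> <command> [args...]
wait_for() {
  local max_attempts=$1 sleep_seconds=$2
  shift 2
  local attempt
  for ((attempt = 1; attempt <= max_attempts; attempt++)); do
    if "$@"; then
      return 0
    fi
    # Pause only between attempts, not after the last one.
    if ((attempt < max_attempts)); then
      sleep "$sleep_seconds"
    fi
  done
  return 1
}

# Wait state 1 (hypothetical helper): the host accepts a non-interactive ssh login.
# $1 is a user@host string.
ssh_ready() {
  ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true 2>/dev/null
}

# Wait state 2 (hypothetical helper, run on the server): no process holds the
# dpkg/APT lock files. sudo is assumed because the lock holders run as root.
apt_unlocked() {
  ! sudo fuser /var/lib/dpkg/lock-frontend /var/lib/apt/lists/lock >/dev/null 2>&1
}

# Example (not executed here):
#   wait_for 60 10 ssh_ready "user@host" && echo "ssh is up"
#   wait_for 60 10 apt_unlocked && echo "apt is free"
```

Keeping the retry loop separate from the checks means both wait states share one timeout policy, and further checks can be added without touching the loop.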

Copilot AI left a comment

Pull request overview

This PR migrates the benchmark infrastructure from Equinix to Phoenix NAP (PNAP) as the bare metal provider, with several related infrastructure improvements.

Changes:

  • Integrated Phoenix NAP REST API for bare metal server provisioning and management, replacing Equinix Metal
  • Upgraded GitHub Actions runners and deployed servers from Ubuntu 22.04 to 24.04
  • Migrated from root user execution to non-root user with proper Docker group permissions
  • Implemented automated server expiration and cleanup for ephemeral benchmark systems
  • Added comprehensive wait states for server readiness and APT availability
  • Disabled automatic system updates on benchmark servers to ensure consistent test environments
  • Updated documentation to reflect new provider requirements and secrets
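
The expiration rule (purge ephemeral servers that have been alive at least 24 hours) comes down to an age comparison. The sketch below shows only that selection step, under stated assumptions: the function name `select_expired` and the `<server_id> <created_epoch_seconds>` input format are hypothetical, and the provider API calls that would list and delete servers are left out because they are not shown in this conversation.

```shell
#!/usr/bin/env bash

# Print the ids of servers whose age is at least max_age_seconds.
# Usage: select_expired <now_epoch_seconds> <max_age_seconds>
# Input on stdin: one "<server_id> <created_epoch_seconds>" pair per line
# (a hypothetical format, not the provider's response format).
select_expired() {
  local now=$1 max_age_seconds=$2 id created
  while read -r id created; do
    # Skip blank or incomplete lines.
    [[ -n $id && -n $created ]] || continue
    if ((now - created >= max_age_seconds)); then
      printf '%s\n' "$id"
    fi
  done
}

# Example (not executed here): ids older than 24 hours as of now.
#   select_expired "$(date +%s)" 86400 < servers.txt
```

Passing the current time in as an argument, rather than calling `date` inside the function, keeps the rule easy to check with fixed inputs.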

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 27 comments.

| File | Description |
| --- | --- |
| src/main/java/io/deephaven/benchmark/controller/DeephavenDockerController.java | Removed sudo prefix from all docker commands to support non-root user execution |
| docs/ForkSetup.md | Updated secrets documentation for Phoenix NAP provider and added reference to private vault |
| .github/workflows/remote-benchmarks.yml | Upgraded to Ubuntu 24.04 runners, added purge job for expired servers, added METAL_PROJECT_ID to environment |
| .github/workflows/adhoc-exist-remote-benchmarks.yml | Upgraded to Ubuntu 24.04 runners, fixed default test class list spacing |
| .github/workflows/adhoc-auto-remote-benchmarks.yml | Upgraded to Ubuntu 24.04 runners, updated copyright year, changed server plan, removed METAL_EXPIRE parameter, added METAL_PROJECT_ID |
| .github/scripts/setup-test-server-remote.sh | Added APT lock waiting, disabled automatic updates, configured SSH security, changed all paths from /root to /${HOME}, added usermod for docker group, added DEBIAN_FRONTEND export |
| .github/scripts/run-benchmarks-remote.sh | Updated paths from /root to /${HOME}, added userHome variable substitution in properties |
| .github/scripts/manage-deephaven-remote.sh | Updated paths from /root to /${HOME} |
| .github/scripts/fetch-results-local.sh | Updated path to use /home/${USER} format |
| .github/scripts/build-server-distribution-remote.sh | Updated paths from /root to /${HOME} |
| .github/scripts/build-docker-image-remote.sh | Updated paths from /root to /${HOME}, added copyright header |
| .github/scripts/build-benchmark-artifact-remote.sh | Updated paths from /root to /${HOME}, added copyright header |
| .github/scripts/adhoc.sh | Complete rewrite of deploy-metal, delete-metal functions for PNAP API, added purge-metal function, added getApiToken function, updated examples |
| .github/resources/terraform.tfstate | Added empty Terraform state file |
| .github/resources/*-scale-benchmark.properties | Updated docker.compose.file path to use ${userHome} variable substitution |
| .github/resources/adhoc-server-deploy.json | New JSON template for PNAP server deployment configuration |
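
For readers unfamiliar with the host tuning mentioned in this PR (disabling automatic updates and turning off ASLR), here is a rough sketch of what such a step could look like on a stock Ubuntu host. It is an assumption-laden illustration, not the contents of setup-test-server-remote.sh: the helper names `run` and `quiet_benchmark_host` are hypothetical, and the systemd unit names are the stock Ubuntu ones, which may differ from what the PR actually disables.

```shell
#!/usr/bin/env bash

# Run a command, or only print it when DRY_RUN=1 (useful for review).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: reduce background activity and run-to-run variability.
quiet_benchmark_host() {
  # Stop and disable the periodic APT jobs and unattended upgrades
  # (stock Ubuntu unit names; assumed, not taken from the PR).
  run sudo systemctl disable --now apt-daily.timer apt-daily-upgrade.timer unattended-upgrades.service
  # Turn off address space layout randomization until the next reboot.
  run sudo sysctl -w kernel.randomize_va_space=0
}

# Example (not executed here): preview the commands without changing the host.
#   DRY_RUN=1 quiet_benchmark_host
```

Note that `sysctl -w` does not persist across reboots; making the setting permanent would need a file under /etc/sysctl.d, which this sketch does not write.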

@stanbrub stanbrub requested a review from Copilot February 5, 2026 18:54
@stanbrub stanbrub changed the title from "ci: DH-19408: Pnap Integration" to "ci: DH-19408: PNAP Integration" Feb 5, 2026
Copilot AI left a comment

Pull request overview

Copilot reviewed 22 out of 22 changed files in this pull request and generated 20 comments.

@stanbrub stanbrub requested a review from cpwright February 6, 2026 00:05