WIP: [ci] move CI container images to GHCR, workflows to this repo #7109
Draft
jameslamb wants to merge 33 commits into master from ci/image-builds
+573 −3,055
Changes from 21 commits
Commits (33):
b0c4ad6 [ci] move CI container images to GHCR (jameslamb)
30b8679 push (jameslamb)
a7fcbf8 fix env (jameslamb)
c6ccf66 empty commit to re-trigger CI (jameslamb)
6bfed04 lowercase repo (jameslamb)
5ba87cb cannot use github.workspace (jameslamb)
8df5568 fix OS, reduce duplication (jameslamb)
5f99901 fix job names (jameslamb)
dbb7b96 work around newer CMake + old pocl (jameslamb)
22fca6d try forcing C++11 (jameslamb)
95f1462 upgrade to PoCL 7.1 (jameslamb)
6671e4a try building the same way on x86_64 (jameslamb)
946bb7d fix CPU flag (jameslamb)
45735f0 try images in CI (jameslamb)
c0ed153 use -dev label (jameslamb)
e56817f fix image URIs (jameslamb)
5fef0f5 try new OpenCL headers (jameslamb)
d1151d5 temporarily skip check-wheel-contents (some OpenCL headers are gettin… (jameslamb)
dc3a465 align OpenCL versions (jameslamb)
193fd23 Merge branch 'master' of github.com:microsoft/LightGBM into ci/image-… (jameslamb)
f4f655a try getting more information from 'clinfo' (jameslamb)
b09a916 try using LLVM instead (jameslamb)
7121551 try installing Intel OpenCL support (jameslamb)
9a58b2e ensure the PoCL ICD loader gets installed (jameslamb)
9c99f6e check LLC_HOST_CPU values, try skipping intel driver installation (jameslamb)
83732ef target cortex-a53 (jameslamb)
a4b8ace try compiling support for more Arm CPUs (jameslamb)
4b6b1d8 comment out some CI, start adding docs (jameslamb)
08ef096 comment out even more CI (jameslamb)
d62591c get more debugging information, install 'all' component (jameslamb)
aff8267 try hand-writing the ICD file (jameslamb)
0540af4 fix test code (jameslamb)
8f1b053 more fixes (jameslamb)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| * |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,73 @@ | ||
| FROM quay.io/pypa/manylinux_2_28_aarch64 | ||
|
|
||
| # use 'bash' for RUN steps | ||
| SHELL ["/bin/bash", "-euo", "pipefail", "-c"] | ||
|
|
||
| # install packages | ||
| RUN <<EOF | ||
| yum update -y | ||
|
|
||
| yum install --nodocs -y \ | ||
| epel-release | ||
|
|
||
| yum install --nodocs -y \ | ||
| clang-devel \ | ||
| gcc-c++ \ | ||
| hwloc-devel \ | ||
| llvm-devel \ | ||
| llvm-static \ | ||
| ocl-icd-devel \ | ||
| sudo | ||
|
|
||
| yum module install --nodocs -y \ | ||
| llvm-toolset | ||
|
|
||
| yum clean all | ||
|
|
||
| rm -rf /var/cache/yum | ||
| EOF | ||
|
|
||
| RUN <<EOF | ||
| # install a newer CMake than what the package manager has | ||
| curl -sL https://cmake.org/files/v3.23/cmake-3.23.1-linux-aarch64.sh -o cmake.sh | ||
| chmod +x cmake.sh | ||
| ./cmake.sh --prefix=/usr/local --exclude-subdir | ||
| rm -f ./cmake.sh | ||
|
|
||
| # build PoCL | ||
| # | ||
| # NOTE: If this is updated, check if CL_TARGET_OPENCL_VERSION in cmake/IntegratedOpenCL.cmake | ||
| # needs to be updated (see comments there for links). | ||
| git clone \ | ||
| --depth 1 \ | ||
| --branch v7.1 \ | ||
| https://github.com/pocl/pocl.git | ||
|
|
||
| # explanations for some flags: | ||
| # | ||
| # * -DCMAKE_{C,CXX}_COMPILER: DEVTOOLSET_ROOTPATH is where manylinux puts the gcc toolset | ||
| # * -DLLC_HOST_CPU="generic": passed to clang's -march/-mcpu flag. | ||
| # | ||
| cmake \ | ||
| -B pocl/build \ | ||
| -S pocl \ | ||
| -DCMAKE_BUILD_TYPE=release \ | ||
| -DCMAKE_C_COMPILER="${DEVTOOLSET_ROOTPATH}/usr/bin/gcc" \ | ||
| -DCMAKE_CXX_COMPILER="${DEVTOOLSET_ROOTPATH}/usr/bin/g++" \ | ||
| -DENABLE_DOXYGEN=OFF \ | ||
| -DENABLE_EXAMPLES=OFF \ | ||
| -DENABLE_HOST_CPU_DEVICES=ON \ | ||
| -DENABLE_HWLOC=ON \ | ||
| -DENABLE_POCLCC=ON \ | ||
| -DENABLE_SPIRV=ON \ | ||
| -DENABLE_TESTS=OFF \ | ||
| -DENABLE_VALGRIND=OFF \ | ||
| -DINSTALL_OPENCL_HEADERS=OFF \ | ||
| -DLLC_HOST_CPU=generic \ | ||
| -DPOCL_DEBUG_MESSAGES=OFF \ | ||
| -DPOCL_INSTALL_ICD_VENDORDIR=/etc/OpenCL/vendors | ||
|
|
||
| cmake --build pocl/build -j4 | ||
|
|
||
| cmake --install pocl/build | ||
| EOF |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| * |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,102 @@ | ||
| FROM quay.io/pypa/manylinux_2_28_x86_64 | ||
|
|
||
| # ensure that libraries like libc++ built in this image can be found by the linker | ||
| #ENV LD_LIBRARY_PATH="/usr/local/lib64:${LD_LIBRARY_PATH}:/usr/local/lib" | ||
|
|
||
| # use 'bash' for RUN steps | ||
| SHELL ["/bin/bash", "-euo", "pipefail", "-c"] | ||
|
|
||
| # install packages | ||
| RUN <<EOF | ||
| yum update -y | ||
|
|
||
| yum install --nodocs -y \ | ||
| epel-release | ||
|
|
||
| yum install --nodocs -y \ | ||
| clang-devel \ | ||
| gcc-c++ \ | ||
| hwloc-devel \ | ||
| llvm-devel \ | ||
| llvm-static \ | ||
| ocl-icd-devel \ | ||
| sudo | ||
|
|
||
| yum module install --nodocs -y \ | ||
| llvm-toolset | ||
|
|
||
| yum clean all | ||
|
|
||
| rm -rf /var/cache/yum | ||
| EOF | ||
|
|
||
| RUN <<EOF | ||
| # install a newer CMake than what the package manager has | ||
| curl -sL https://cmake.org/files/v3.23/cmake-3.23.1-linux-$(arch).sh -o cmake.sh | ||
| chmod +x cmake.sh | ||
| ./cmake.sh --prefix=/usr/local --exclude-subdir | ||
| rm -f ./cmake.sh | ||
|
|
||
| # build PoCL | ||
| git clone \ | ||
| --depth 1 \ | ||
| --branch v7.1 \ | ||
| https://github.com/pocl/pocl.git | ||
|
|
||
| # explanations for some flags: | ||
| # | ||
| # * -DCMAKE_{C,CXX}_COMPILER: DEVTOOLSET_ROOTPATH is where manylinux puts the gcc toolset | ||
| # * -DLLC_HOST_CPU="x86-64": passed to clang's -march/-mcpu flag. see https://github.com/chromebrew/chromebrew/pull/9176#issuecomment-1891751465 | ||
| # | ||
| cmake \ | ||
| -B pocl/build \ | ||
| -S pocl \ | ||
| -DCMAKE_BUILD_TYPE=release \ | ||
| -DCMAKE_C_COMPILER="${DEVTOOLSET_ROOTPATH}/usr/bin/gcc" \ | ||
| -DCMAKE_CXX_COMPILER="${DEVTOOLSET_ROOTPATH}/usr/bin/g++" \ | ||
| -DENABLE_DOXYGEN=OFF \ | ||
| -DENABLE_EXAMPLES=OFF \ | ||
| -DENABLE_HOST_CPU_DEVICES=ON \ | ||
| -DENABLE_HWLOC=ON \ | ||
| -DENABLE_POCLCC=ON \ | ||
| -DENABLE_SPIRV=ON \ | ||
| -DENABLE_TESTS=OFF \ | ||
| -DENABLE_VALGRIND=OFF \ | ||
| -DINSTALL_OPENCL_HEADERS=OFF \ | ||
| -DLLC_HOST_CPU="x86-64" \ | ||
| -DPOCL_DEBUG_MESSAGES=OFF \ | ||
| -DPOCL_INSTALL_ICD_VENDORDIR=/etc/OpenCL/vendors | ||
|
|
||
| cmake --build pocl/build -j4 | ||
|
|
||
| cmake --install pocl/build | ||
| EOF | ||
|
|
||
| # Install Java | ||
| RUN yum install -y \ | ||
| java-1.8.0-openjdk-devel.x86_64 \ | ||
| && yum clean all | ||
|
|
||
| ENV JAVA_HOME_8_X64=/usr/lib/jvm/java | ||
| ENV JAVA_HOME=$JAVA_HOME_8_X64 | ||
|
|
||
| # Install SWIG | ||
| RUN curl -sLk https://sourceforge.net/projects/swig/files/swig/swig-4.0.2/swig-4.0.2.tar.gz/download -o swig.tar.gz \ | ||
| && tar -xzf swig.tar.gz \ | ||
| && cd swig-4.0.2 \ | ||
| && ./configure --prefix=/usr/local --without-pcre \ | ||
| && make \ | ||
| && make install \ | ||
| && cd .. \ | ||
| && rm -f ./swig.tar.gz \ | ||
| && rm -rf ./swig-4.0.2 | ||
|
|
||
| # Install miniforge | ||
| RUN curl -sL "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-$(uname -m).sh" -o miniforge.sh \ | ||
| && chmod +x miniforge.sh \ | ||
| && ./miniforge.sh -b -p /opt/miniforge \ | ||
| && rm -f ./miniforge.sh \ | ||
| && /opt/miniforge/bin/conda clean -a -y \ | ||
| && chmod -R 777 /opt/miniforge | ||
|
|
||
| ENV CONDA=/opt/miniforge/ |
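
The two Dockerfiles above download the same CMake installer, once with a hard-coded `aarch64` and once with `$(arch)`. A minimal sketch of how that URL could be derived in one place, assuming the `https://cmake.org/files/v<major>.<minor>/` layout seen in both files holds for other versions; `cmake_installer_url` is a hypothetical helper name, not something in the PR:

```shell
# Hypothetical helper (not part of the PR): build the CMake installer URL
# for a given version and CPU architecture.
# The body runs in a subshell "( ... )" so its variables stay local.
cmake_installer_url() (
    version="$1"                    # e.g. "3.23.1"
    cpu_arch="${2:-$(uname -m)}"    # e.g. "x86_64" or "aarch64"; defaults to the build host
    series="v${version%.*}"         # strip the patch component: "3.23.1" -> "v3.23"
    printf 'https://cmake.org/files/%s/cmake-%s-linux-%s.sh\n' \
        "${series}" "${version}" "${cpu_arch}"
)

# Example usage (not executed here):
#   curl -sL "$(cmake_installer_url 3.23.1)" -o cmake.sh
```

With something like this, both images could share an identical download step and differ only in the base image and the `-DLLC_HOST_CPU` value.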
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,63 @@ | ||
| name: ci-images | ||
|
|
||
| on: | ||
| # TODO(jameslamb): remove 'push' trigger before merging | ||
| push: | ||
| workflow_dispatch: | ||
| inputs: | ||
| tag-suffix: | ||
| type: string | ||
| default: "-dev" | ||
| description: | | ||
| Suffix for image tags, including a leading "-" if desired. | ||
| Set to the empty string to overwrite the main images used by LightGBM's CI. | ||
|
|
||
| jobs: | ||
| build-and-push-images: | ||
| name: build-ci-images (${{ matrix.tag }}) | ||
| runs-on: ${{ matrix.os }} | ||
| permissions: | ||
| contents: read | ||
| packages: write | ||
| attestations: write | ||
| id-token: write | ||
| strategy: | ||
| fail-fast: false | ||
| # NOTE: Cannot use "{{ github.repository }}" because that'd return "{org}/LightGBM" which results in | ||
| # an error like "repository name must be lowercase". | ||
| matrix: | ||
| include: | ||
| - os: ubuntu-24.04-arm | ||
| dockerfile: .ci/ci-images/manylinux_2_28_aarch64/Dockerfile | ||
| # TODO(jameslamb): revert hard-coded tag before merging | ||
| tag: ci-manylinux_2_28_aarch64-dev | ||
| # tag: ci-manylinux_2_28_aarch64${{ inputs.tag-suffix }} | ||
| - os: ubuntu-latest | ||
| dockerfile: .ci/ci-images/manylinux_2_28_x86_64/Dockerfile | ||
| # TODO(jameslamb): revert hard-coded tag before merging | ||
| tag: ci-manylinux_2_28_x86_64-dev | ||
| # tag: ci-manylinux_2_28_x86_64${{ inputs.tag-suffix }} | ||
| steps: | ||
| - name: Checkout repository | ||
| uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # v6.0.1 | ||
| - name: Log in to the Container registry | ||
| uses: docker/login-action@184bdaa0721073962dff0199f1fb9940f07167d1 # v3.5.0 | ||
| with: | ||
| registry: ghcr.io | ||
| username: ${{ github.actor }} | ||
| password: ${{ github.token }} | ||
| - name: Build and push Docker image | ||
| id: push | ||
| uses: docker/build-push-action@263435318d21b8e681c14492fe198d362a7d2c83 # v6.18.0 | ||
| with: | ||
| context: null | ||
| file: ${{ matrix.dockerfile }} | ||
| push: true | ||
| tags: ghcr.io/${{ github.repository_owner }}/lightgbm:${{ matrix.tag }} | ||
| # create an attestation to prove that this image came from a workflow in this repository | ||
| - name: Generate artifact attestation | ||
| uses: actions/attest-build-provenance@e8998f949152b193b063cb0ec769d69d929409be # v2.4.0 | ||
| with: | ||
| subject-name: ghcr.io/${{ github.repository_owner }}/lightgbm | ||
| subject-digest: ${{ steps.push.outputs.digest }} | ||
| push-to-registry: true |
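
The NOTE in the workflow explains why `github.repository` is avoided: its mixed-case value is rejected as an image name. The workflow sidesteps this by combining `github.repository_owner` with a hard-coded `lightgbm`. As a rough sketch of an alternative, the repository path could be lowercased in a shell step before the image URI is built. `ghcr_image_uri` is a hypothetical helper name, and passing it `GITHUB_REPOSITORY` (the environment variable GitHub Actions sets for each run) is an assumption about how it would be used:

```shell
# Hypothetical helper (not part of the PR): build a GHCR image URI with a
# lowercased repository path, since registries reject uppercase repository names.
# The body runs in a subshell "( ... )" so its variables stay local.
ghcr_image_uri() (
    repository="$1"    # e.g. "microsoft/LightGBM"
    tag="$2"           # e.g. "ci-manylinux_2_28_x86_64-dev"
    repo_lower="$(printf '%s' "${repository}" | tr '[:upper:]' '[:lower:]')"
    printf 'ghcr.io/%s:%s\n' "${repo_lower}" "${tag}"
)

# Example usage inside a workflow step (not executed here):
#   ghcr_image_uri "${GITHUB_REPOSITORY}" "ci-manylinux_2_28_x86_64-dev"
```

The hard-coded approach in the workflow is simpler. Lowercasing at runtime mainly helps if the same workflow has to run unchanged in forks with differently named repositories.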
Picking a somewhat-arbitrary place to start a thread.
Right now, the images are building successfully, but Python tests with `device="gpu"` are all failing.
The `gpu source` job (where LightGBM's default `device` is set to `"gpu"`) has 238 failures, which at a glance look like #3679.
The `bdist_wheel` jobs (which just run a single test checking that OpenCL support was compiled in successfully) are also failing, on both x86_64 and aarch64 (build link).
Ideas I'm looking into:
`Boost` to work with a new PoCL?
I'm going to focus on the `gpu source` builds first, because those don't rely on anything in https://github.com/microsoft/LightGBM/blob/master/cmake/IntegratedOpenCL.cmake and so should be a more minimal way to investigate this.
Noticing that the CI job running with an NVIDIA GPU is working: https://github.com/microsoft/LightGBM/actions/runs/20581695041/job/59110391374?pr=7109
So I guess it's just that these jobs are no longer successfully targeting the host CPUs on the GitHub runners? I'll look into that.
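
One way to start checking whether the CPU OpenCL implementation is even visible to the loader is to inspect the ICD registrations. On Linux, the OpenCL ICD loader finds implementations through `.icd` files in a vendor directory; the Dockerfiles in this PR point PoCL at `/etc/OpenCL/vendors`. Each `.icd` file names the library to load. The sketch below is only a diagnostic aid under those assumptions, and `check_opencl_icds` is a hypothetical name, not something in this PR:

```shell
# Hypothetical diagnostic (not part of the PR): list OpenCL ICD registrations
# and flag absolute library paths that do not exist on disk.
# The body runs in a subshell "( ... )" so "exit" only sets the function's status.
check_opencl_icds() (
    vendor_dir="${1:-/etc/OpenCL/vendors}"
    status=0
    found=0
    for icd_file in "${vendor_dir}"/*.icd; do
        # if the glob matched nothing, the literal pattern comes back; skip it
        [ -e "${icd_file}" ] || continue
        found=1
        while IFS= read -r library || [ -n "${library}" ]; do
            [ -n "${library}" ] || continue
            case "${library}" in
                /*)
                    # absolute path: the file itself must exist
                    if [ -e "${library}" ]; then
                        echo "OK: ${icd_file} -> ${library}"
                    else
                        echo "MISSING: ${icd_file} -> ${library}"
                        status=1
                    fi
                    ;;
                *)
                    # bare library name: left to the dynamic linker's search path
                    echo "SONAME: ${icd_file} -> ${library} (resolved by the dynamic linker)"
                    ;;
            esac
        done < "${icd_file}"
    done
    if [ "${found}" -eq 0 ]; then
        echo "no .icd files found in ${vendor_dir}"
        status=1
    fi
    exit "${status}"
)

# Example usage inside the CI image (not executed here):
#   check_opencl_icds /etc/OpenCL/vendors
```

A passing check only shows that the registration is in place. It says nothing about whether PoCL can create a device for the runner's CPU, which still needs a tool such as `clinfo` (already used in this branch's commits).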