Skip to content

Add pool membership controller logic#21

Merged
jlebon merged 13 commits into
mainfrom
milestone3a-pool-membership-sync
May 20, 2026
Merged

Add pool membership controller logic#21
jlebon merged 13 commits into
mainfrom
milestone3a-pool-membership-sync

Conversation

@jlebon

@jlebon jlebon commented May 13, 2026

Copy link
Copy Markdown
Collaborator

This covers milestone 3a of the implementation plan.

See individual commits!

jlebon added 2 commits May 13, 2026 12:19
The controller needs to watch/update Nodes (for labeling and
cordon state) and manage BootcNodes (create/delete per pool
membership, read status for rollout decisions).

Assisted-by: Pi (Claude Opus 4.6)
Set up the controller to watch Nodes and BootcNodes objects. For Node
objects, we need a mapper to map it back to the related pool(s).

Use predicates for the Node watch to try to filter on only events we
care about.

Assisted-by: Pi (Claude Opus 4.6)
@jlebon jlebon force-pushed the milestone3a-pool-membership-sync branch from c9a51e2 to df4042b Compare May 14, 2026 00:52
Comment thread internal/controller/bootcnodepool_controller.go Outdated
Comment thread internal/controller/bootcnodepool_controller.go Outdated
Comment thread internal/controller/bootcnodepool_controller.go Outdated
Comment thread internal/controller/membership_test.go Outdated
Comment thread internal/controller/membership_test.go Outdated
@alicefr

alicefr commented May 19, 2026

Copy link
Copy Markdown
Collaborator

I'm really not a big fan on reimplementing the test logic for polling and the matches. It makes the code much harder to read. Also, with the increase of the test we will require the more and more complex logic.

@alicefr

alicefr commented May 19, 2026

Copy link
Copy Markdown
Collaborator

I would also implement some unit testing for the reconcile loop for the bootcnode pool, there are a lot of corner cases there.

Comment thread internal/controller/bootcnodepool_controller.go
Comment thread internal/controller/bootcnodepool_controller.go
Comment thread internal/controller/bootcnodepool_controller.go
jlebon added 8 commits May 19, 2026 16:39
Implement the core reconciliation loop that matches Nodes to
BootcNodePools via nodeSelector and maintains the corresponding
BootcNode objects.

On each reconcile, the controller computes matching nodes vs owned
BootcNodes and creates/deletes as needed. This pretty much follows the
logic set out by the ARCHITECTURE doc.

Assisted-by: Pi (Claude Opus 4.6)
First, tweak the guidelines in the ARCHITECTURE doc around pool
conflicts slightly: we continue on to reconcile uncontested nodes,
but otherwise leave contested nodes to their current owners (but still
degrade the pool to surface this to admins). This just seems more in
line with K8s' "graceful degradation" philosophy.

Then, actually implement pool conflict detection. This is easy to detect
simply by catching `IsAlreadyExists()` at `BootcNode` creation time.

Assisted-by: Pi (Claude Opus 4.6)
Now that we have a real controller, this test doesn't make sense
anymore. As soon as the pool is created, the controller will start
managing it and update its status.

The point was to sanity-check the CRD definition itself, but we'll
implicitly test that in a much more meaningful way as we actually
implement more of the controller logic and test various situations that
exercise the full status schema.
Prep for testing the new controller.

While we're here, enable logging as well.
Add tests covering the new controller pool membership logic.

Assisted-by: Pi (Claude Opus 4.6)
This is pretty standard stuff. I'm sure there's hundreds of third-party
packages out there which offer an API for this. But meh, it's simple
code.

In the process, we significantly improve our error-handling in those
wait loops where before we considered all errors indiscriminately when
we should really only handle IsNotFound differently, which is a pet
peeve of mine.
This doesn't really add much value beyond what envtests test today but
it does exercise the controller in a real cluster. The test here is
pretty basic and I don't think it's worth to test all the same corner
cases already covered by the envtests.

These e2e tests will get much more meaningful and exciting once we
actually have a daemon to mutate the node.

Assisted-by: Pi (Claude Opus 4.6)
@jlebon jlebon force-pushed the milestone3a-pool-membership-sync branch from df4042b to 9c00f2d Compare May 20, 2026 02:20
@jlebon

jlebon commented May 20, 2026

Copy link
Copy Markdown
Collaborator Author

Thanks for the review! Updated for comments. I tried to fold all the changes into their respective commits but kept the gomega switchover as a separate commit to avoid that pain.

I'm really not a big fan on reimplementing the test logic for polling and the matches. It makes the code much harder to read. Also, with the increase of the test we will require the more and more complex logic.

Thanks, I wasn't deeply familiar with gomega and you made me take a closer look. I was under the impression that it was a part of ginkgo, but no it clearly can work without it. Switched over to that now!

I would also implement some unit testing for the reconcile loop for the bootcnode pool, there are a lot of corner cases there.

Hmm, this patch series does add quite a few envtests here though. Is there anything obvious that's missing?

Comment thread internal/controller/crd_test.go Outdated
Comment thread internal/controller/crd_test.go Outdated
Comment thread internal/controller/crd_test.go Outdated
Comment thread internal/controller/membership_test.go Outdated
Comment thread test/e2e/controller_test.go Outdated
Comment thread test/e2e/e2eutil/env.go Outdated
Comment thread internal/controller/membership_test.go Outdated
Comment thread internal/controller/membership_test.go Outdated
@alicefr

alicefr commented May 20, 2026

Copy link
Copy Markdown
Collaborator

Hmm, this patch series does add quite a few envtests here though. Is there anything obvious that's missing?

We are not testing the path when a node is deleted and that the corresponding bootc should be deleted as well.

jlebon added 3 commits May 20, 2026 09:20
It makes a bunch of our test assertions easier to write, and it
has various polling helpers already that allow us to get rid of our
homegrown version.
That seems like a nice way to complete the story and verifies that we
correctly clear a pool's Degraded condition due to conflicts.
Verify that a node being deleted causes its `BootcNode` to also be
deleted.
@jlebon jlebon force-pushed the milestone3a-pool-membership-sync branch from 9c00f2d to aa2478c Compare May 20, 2026 13:26
@jlebon

jlebon commented May 20, 2026

Copy link
Copy Markdown
Collaborator Author

Updated for comments!

Hmm, this patch series does add quite a few envtests here though. Is there anything obvious that's missing?

We are not testing the path when a node is deleted and that the corresponding bootc should be deleted as well.

Added!

@jlebon jlebon merged commit b15e720 into main May 20, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants