Open
Conversation
This is needed for libxl integration. tapdisk waits to transition to InitWait until after "hotplug-status" is connected. However libxl doesn't run its hotplug scripts, which write "hotplug-status", until the backend transitions to InitWait. With both sides waiting, progress is not made and connecting the blktap device times out and fails. Make tapback transition to InitWait earlier to resolve this issue under libxl. Place the transition to InitWait in tapback_backend_create_device() after the xenstore feature nodes have been written. InitWait is a signal to the frontend that such nodes have been written. This matches blkback's behaviour. It should also be fine since tapback still won't advance to Connected without the other setup like physical-device-path and hotplug-status. VBDs can be reconnected. When pv-grub is used, it connects the VBD, loads the kernel, disconnects the VBD. It then re-sets the frontend state to XenbusStateInitialising so that the new kernel will find and connect the VBD. tapback and blkback handle this case differently. When blkback observes the frontend transition to XenbusStateInitialising, and the backend is XenbusStateClosed, the backend transitions to XenbusStateInitWait. When tapback observes the frontend transition to XenbusStateInitialising, the backend checks for hotplug_status_connected to be true before switching XenbusStateInitWait. For tapback, this serves a second purpose for setting XenbusStateInitWait initially as well. With tapback setting XenbusStateInitWait earlier, follow the blkback model to transition to XenbusStateInitWait when the backend was previously in XenbusStateClosed. The end result shouldn't change, but it will remove an extra write and watch trigger for XenbusStateInitWait. Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
libxl doesn't clean up tapdisks because it doesn't call the hotplug cleanup scripts: libxl: debug: libxl_event.c:1043:devstate_callback: backend /local/domain/0/backend/vbd3/5/2048/state wanted state 6 but it was removed libxl: debug: libxl_event.c:849:libxl__ev_xswatch_deregister: watch w=0xf82ba0 wpath=/local/domain/0/backend/vbd3/5/2048/state token=1/2: deregister slotnum=1 libxl: debug: libxl_device.c:1156:device_backend_callback: Domain 5:calling device_backend_cleanup libxl: debug: libxl_event.c:863:libxl__ev_xswatch_deregister: watch w=0xf82ba0: deregister unregistered libxl: error: libxl_device.c:1169:device_backend_callback: Domain 5:unable to remove device with path /local/domain/0/backend/vbd3/5/2048 - rc -6 The backend state cannot be found because tapback deleted the entire backend subtree. tapback shouldn't remove the backend nodes when the frontend is removed, because the nodes contain the information on how to clean up. Leave the nodes and allowed them to be removed by the toolstack. Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Contributor
|
I'll run this through our test systems and check nothing goes bang, |
Contributor
|
OK, as this stands right now it causes Xenopsd to fail with timeouts so it can't be immediately merged in this form. I'll get somebody to look at the xenopsd failures and see if that can be made to work compatibly. |
Contributor
Author
|
@MarkSymsCtx Thanks so much for testing it - sorry it didn't work... |
Contributor
|
@jandryuk no problems, we'll look into whether we can get xenopsd to function compatibly with this. It looked to be timing out waiting on a state change, which possibly means that it's missed that the transition has already occurred. |
Contributor
Author
|
This branch puts the changes from this PR behind a new |
hubot
pushed a commit
to xen-project/xen
that referenced
this pull request
May 28, 2024
Add entry for backendtype=tap support in libxl. blktap needs some changes to work with libxl, which haven't been merged. They are available from this PR: xapi-project/blktap#394 Signed-off-by: Jason Andryuk <jason.andryuk@amd.com> Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
These two patches are the minimal needed for integration with libxl.
The InitWait handling needs a little modification to set it earlier. libxl needs to see InitWait to run hotplug scripts which will write hotplug-status = "connected". Setting InitWait earlier won't allow transitioning to Connected any earlier, so I think it should be okay with XAPI. Howver, it has not been tested under XAPI.
The deleting of the xenstore backend is no longer performed. libxl expects to find the backend to run the hotplug scripts to perform cleanup. Later on, libxl will delete the entire domain sub-tree, so xenstore will get cleaned up that way. Again, untested under XAPI. Maybe XAPI does similar sub-tree cleanup so this isn't necessary?
If this isn't acceptable as-is, would a command line option & global flag for a "libxl mode" be acceptable to change behavior?