legacy_apps: xlnx: add api to poll apps on vdev status#100
legacy_apps: xlnx: add api to poll apps on vdev status#100tnmysh wants to merge 2 commits intoOpenAMP:mainfrom
Conversation
| break; | ||
| } | ||
| } | ||
| platform_poll_on_vdev_reset(priv, rdev); |
There was a problem hiding this comment.
This imposes that platform_poll_on_vdev_reset is defined in all machine. For instance this breaks build on Linux PC.
/usr/bin/ld: CMakeFiles/rpmsg-echo-shared.dir/rpmsg-echo.c.o: in function `rpmsg_echo_app':
rpmsg-echo.c:(.text+0x37f): undefined reference to `platform_poll_on_vdev_reset'
then platform_poll seems deprecated
Why not just keeping platform_poll but adding struct rpmsg_device *rpdev parameter?
There was a problem hiding this comment.
Hmm... But that means, all the current support will move to the virtio reset polling. Is that okay?
Sorry I missed the Linux build on PC.
There was a problem hiding this comment.
Hmm... But that means, all the current support will move to the virtio reset polling. Is that okay?
Not sure to understand your point, the implementation of platform_poll is already machine dependent.
There was a problem hiding this comment.
I mean all for all places where platform_poll is used, is it okay to use platform_poll_on_vdev_reset? Can we deprecate use of platform_poll ? I don't see any reason to keep platform_poll honestly. What do you think?
There was a problem hiding this comment.
Without platform_poll, there is no notification of new incoming rpmsg (see the remoteproc_get_notification() function). It seems to me that your platform_poll_on_vdev_reset function is an extension of platform_poll with added management of the vdev status reset.
From my point of view, platform_poll is a more generic name. I would update it instead of creating a new function.
In the openamp-system-reference, I don’t think there is a need to apply the deprecation process, as the objective is only to provide reference code.
There was a problem hiding this comment.
@arnopo Thanks for the suggestion. Even if rpmsg_retarget is an example, it is a part of the open-amp library. It is still not part of the system-reference. It is possible, that people have defined their own interfaces like platform_poll and passing into rpmsg_retarget. IMHO if we decide to change the rpmsg_retarget interface, then we should treat it just same as any other library interface that is exported to the users and go through the deprecation process of an interface. Because as of now, it is still part of the library.
I think we should aim for long-term solution instead, and remove polling from the service, and do it in the system-referece example instead.
There was a problem hiding this comment.
@arnopo Thanks for the suggestion. Even if
rpmsg_retargetis an example, it is a part of the open-amp library. It is still not part of the system-reference. It is possible, that people have defined their own interfaces likeplatform_polland passing intorpmsg_retarget. IMHO if we decide to change the rpmsg_retarget interface, then we should treat it just same as any other library interface that is exported to the users and go through the deprecation process of an interface. Because as of now, it is still part of the library.I think we should aim for long-term solution instead, and remove polling from the service, and do it in the system-referece example instead.
If implementing the long-term solution directly is ok with you, then I am also in favor of this approach.
There was a problem hiding this comment.
HI @arnopo,
I did analysis on rpmsg_retarget.c file as well. I think the implementation standpoint it's correct. So, when open call is made, it waits for the confirmation from the Host that the file was created. So the polling happens to wait for that confirmation. We can't expect the polling happen in the system-reference app code, because it is required as part of open call. Without it the file fd won't be received from the host.
This means, our only option is to change the interface of rpc_poll, and include second argument there.
That also means to go through the deprecation process for that interfaces/typedef.
I think the easiest option instead is, to maintain two polling API in the system reference for now, and deprecate rpc_poll later.
There was a problem hiding this comment.
HI @arnopo,
I might have a way to achieve the same result, without changing the platform_poll interfaces. I will test it next week and update you. Please ignore above message till then.
Thank you.
There was a problem hiding this comment.
@arnopo I pushed the latest changes. Please check. I don't have to change platfrom_poll interface now, and still can poll on vdev flag.
The trick is, I can retrieve vdev from the list of remoteproc vdev devices. Today we have only one remoteproc_vdev for each remoteproc, so I just get the first remoteproc_vdev from the list, and from there I can get virtio device.
Let me know if this looks good.
Thanks.
In case of the remoteproc repeat attach-detach, RPU firmware should be able to re-create the virtio devices. For this, the firmware should poll on the vdev status, and if the status is 0, then it should destroy the virtio and rpmsg devices. Then it should re-create them and wait for next driver ready status from the linux kernel. The vdev status 0 is usually done during remoteproc detach operation. Mailbox notification is expected from the Linux side on vdev status reset. Signed-off-by: Tanmay Shah <tanmay.shah@amd.com>
echo and matrix multiply apps were not able to run after attach()->detach()->attach() command sequence from linux side. Solve this problem, by not stopping app, and re-creating rpmsg devices after detection of vdev reset. This allows the app to run again on next callback when linux sets DRIVER_OK bit. Signed-off-by: Tanmay Shah <tanmay.shah@amd.com>
252144f to
4608802
Compare
| if (!rpdev) { | ||
| LPERROR("Failed to create rpmsg virtio device.\r\n"); | ||
| break; | ||
| ret = 0; |
There was a problem hiding this comment.
| ret = 0; | |
| ret = -1; |
| NULL, NULL); | ||
| if (!rpdev) { | ||
| LPERROR("Failed to create rpmsg virtio device.\r\n"); | ||
| ret = -EINVAL; |
There was a problem hiding this comment.
sometime -1 is returned, sometime -EINVAL. What about using EPERM or ENXIO everywhere
| metal_err("failed to get vdev status %d\n", ret); | ||
| return ret; | ||
| } | ||
| } while (vdev_status); |
There was a problem hiding this comment.
Except if I miss something this breaks the virtio spec protocol. The virtio driver must not set the status to 0.
2.4 Device Reset
The driver may want to initiate a device reset at various times; notably, it is required to do so during device initialization and device cleanup.
The mechanism used by the driver to initiate the reset is transport specific.
2.4.1 Device Requirements: Device Reset
A device MUST reinitialize device status to 0 after receiving a reset.
A device MUST NOT send notifications or interact with the queues after indicating completion of the reset by reinitializing device status to 0, until the driver re-initializes the device.
2.4.2 Driver Requirements: Device Reset
The driver SHOULD consider a driver-initiated reset complete when it reads device status as 0.
here the sequence should be
- a reset request should come from the other side
- The machine detect the reset in platform_poll() function and a a specif error ( perhaps -EPIPE).
- generic application stop the rpmsg and restart it
There was a problem hiding this comment.
So as per this virtio standard, this behavior is already broken in the linux kernel side. Because today we see virtio status set to 0 on detach() call. Perhaps rproc_reset_rsc_table_on_detach this function is restoring the default resource table in the linux kernel which sets vdev_status to 0 on detach.
There was a problem hiding this comment.
Yes you are right, the Linux should not clean the vdev, status
There was a problem hiding this comment.
How about this. We keep polling up until DRIVER_OKAY bit is set by the linux. If it's not setup anymore, then we stop polling. I think that's not breaking the virtio standard.
There was a problem hiding this comment.
How about this. We keep polling up until DRIVER_OKAY bit is set by the linux. If it's not setup anymore, then we stop polling. I think that's not breaking the virtio standard.
Actually, that's not a good idea too. Because as per the spec, driver can't choose specific bits to reset.
Instead to reset the device, driver will set the status 0. So, are we not resetting the device when we issue detach() command?
There was a problem hiding this comment.
From my point of view, we should think with a system-level approach rather than going with a temporary solution.
What about a solution based on a new status in the resource table, as discussed during the last OpenAMP RP call?
We should probably discuss the solution on the Linux kernel side before moving forward with this PR. The fact that Linux does not respect the virtio specification should help us reach a solution.
There was a problem hiding this comment.
@arnopo Yes that's what I was thinking too.
The only problem is the backward compatibility from the linux standpoint.
If we implement the new status based on the resource table, and the remote firmware is polling based on current vdev status, then we may break the existing firmwares.
So, I was thinking in the linux side:
if (new_vdev_status in rsc tbl) then:
do not reset vdev status on detach, instead rely on fw to reset vdev status
else:
reset vdev status, and notify firmware.
endif
We can discuss this in the next system-reference or rp call. Which comes first.
| break; | ||
| } | ||
| } | ||
| ret = platform_poll(priv); |
There was a problem hiding this comment.
This make break all machines implementation not only the AMD ones.
There was a problem hiding this comment.
I don't know how it can break. Remote firmware will still work as it is.
There was a problem hiding this comment.
Not tested but, , for instance, seems not compatible with following codes:
k
|
From 4/15/2026 OpenAMP System Reference call:
|
In case of the remoteproc repeat attach-detach, RPU firmware should be able to re-create the virtio devices. For this, the firmware should poll on the vdev status, and if the status is 0, then it should destroy the virtio and rpmsg devices. Then it should re-create them and wait for next driver ready status from the linux kernel. The vdev status 0 is usually done during remoteproc detach operation. Mailbox notification is expected from the Linux side on vdev status reset.