@jscheffl
Contributor

@jscheffl jscheffl commented Oct 7, 2025

So far, the remote Edge worker uses a single-threaded loop to fetch, process, and monitor tasks, upload result logs, and send heartbeats. This limits the parallelism achievable on an Edge worker, i.e. how many tasks can be handled in parallel.

This PR is a major refactor that uses Python AsyncIO for task handling in order to improve concurrency when many tasks are running.
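To illustrate the idea (this is not the code in this PR), here is a minimal sketch of how a heartbeat and many supervised jobs can share one event loop. All names (`run_job`, `heartbeat`, `worker_loop`) and the intervals are hypothetical, and `asyncio.sleep` stands in for waiting on a real task process.

```python
import asyncio


async def run_job(job_id: int, results: list) -> None:
    """Hypothetical stand-in for one supervised task; the sleep simulates waiting."""
    await asyncio.sleep(0.1)
    results.append(job_id)


async def heartbeat(beats: list, interval: float, stop: asyncio.Event) -> None:
    """Record a timestamp every `interval` seconds until `stop` is set."""
    loop = asyncio.get_running_loop()
    while not stop.is_set():
        beats.append(loop.time())
        try:
            # Wake up early if a stop is requested, otherwise after `interval`.
            await asyncio.wait_for(stop.wait(), timeout=interval)
        except asyncio.TimeoutError:
            pass


async def worker_loop(job_count: int):
    """Run all jobs concurrently while the heartbeat keeps ticking."""
    results = []
    beats = []
    stop = asyncio.Event()
    heartbeat_task = asyncio.create_task(heartbeat(beats, 0.02, stop))
    # The jobs spend their time waiting, so they interleave on a single thread.
    await asyncio.gather(*(run_job(i, results) for i in range(job_count)))
    stop.set()
    await heartbeat_task
    return results, beats
```

Calling `asyncio.run(worker_loop(100))` should take roughly as long as a single job, because the waits overlap instead of running one after another, and the heartbeat is not blocked while jobs are in flight.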

Note: Before merging, this needs proper review and testing, as a lot of logic and libraries change, with the risk of degraded quality/stability due to implementation glitches. UPDATE: WIP status removed; I did a lot of (local) testing and I think it is ready now.

Tested with 100 concurrent tasks (mostly sleeping) on my machine, and the worker used only 10-20% of CPU (compared to the 10GB of RAM that all the tasks needed). So with the re-implementation in AsyncIO the worker scales far better than before, where I estimated that 10-15 tasks could be handled.

FYI @dheerajturaga - as you are also a user; committer status still pending, but looking forward to a review!
@dabla As you do a lot of AsyncIO on your side, looking forward to a review from you as well!
@AutomationDev85 - Would also like your feedback!

@boring-cyborg boring-cyborg bot added area:providers provider:edge Edge Executor / Worker (AIP-69) labels Oct 7, 2025
@jscheffl jscheffl force-pushed the feature/make-edge-worker-using-async-loop branch from 16dce9a to e5de8ad Compare October 12, 2025 00:14
@jscheffl jscheffl force-pushed the feature/make-edge-worker-using-async-loop branch from e5de8ad to 5a0ef71 Compare November 11, 2025 21:13
@dabla
Contributor

dabla commented Dec 12, 2025

That will be a nice improvement @jscheffl. Looking forward to this!

@dabla
Contributor

dabla commented Dec 12, 2025

I've just checked the code and saw a possible "improvement" regarding the following code in the worker:

if worker_info.state == EdgeWorkerState.MAINTENANCE_REQUEST:
    logger.info("Maintenance mode requested!")
    EdgeWorker.maintenance_mode = True
elif (
    worker_info.state in [EdgeWorkerState.IDLE, EdgeWorkerState.RUNNING]
    and EdgeWorker.maintenance_mode
):
    logger.info("Exit Maintenance mode requested!")
    EdgeWorker.maintenance_mode = False
if EdgeWorker.maintenance_mode:
    EdgeWorker.maintenance_comments = worker_info.maintenance_comments
else:
    EdgeWorker.maintenance_comments = None
if worker_info.state == EdgeWorkerState.SHUTDOWN_REQUEST:
    logger.info("Shutdown requested!")

We could encapsulate the state checks in the WorkerInfo dataclass through properties. The code above would then be easier to read and also easier to test, because the state checks can be tested directly on the WorkerInfo dataclass. So I would add the following properties to the WorkerInfo dataclass:

@property
def is_maintenance(self) -> bool:
    return self.state == EdgeWorkerState.MAINTENANCE_REQUEST

@property
def is_running_or_idle(self) -> bool:
    return self.state in [EdgeWorkerState.IDLE, EdgeWorkerState.RUNNING]

@property
def is_shutdown(self) -> bool:
    return self.state == EdgeWorkerState.SHUTDOWN_REQUEST

Then you could rewrite the code as below and add dedicated tests for the above properties on the WorkerInfo dataclass:

if worker_info.is_maintenance:
    logger.info("Maintenance mode requested!")
    EdgeWorker.maintenance_mode = True
elif worker_info.is_running_or_idle and EdgeWorker.maintenance_mode:
    logger.info("Exit Maintenance mode requested!")
    EdgeWorker.maintenance_mode = False
if EdgeWorker.maintenance_mode:
    EdgeWorker.maintenance_comments = worker_info.maintenance_comments
else:
    EdgeWorker.maintenance_comments = None
if worker_info.is_shutdown:
    logger.info("Shutdown requested!")
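The dedicated tests could look roughly like the sketch below. To keep it self-contained, `EdgeWorkerState` and `WorkerInfo` are simplified stand-ins: the enum string values and the field list are assumptions, not the provider's real definitions, and `check_state_properties` is a hypothetical helper name.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EdgeWorkerState(str, Enum):
    """Simplified stand-in; the string values are placeholders."""

    IDLE = "idle"
    RUNNING = "running"
    MAINTENANCE_REQUEST = "maintenance_request"
    SHUTDOWN_REQUEST = "shutdown_request"


@dataclass
class WorkerInfo:
    """Simplified stand-in carrying only the fields used in the snippet above."""

    state: EdgeWorkerState
    maintenance_comments: Optional[str] = None

    @property
    def is_maintenance(self) -> bool:
        return self.state == EdgeWorkerState.MAINTENANCE_REQUEST

    @property
    def is_running_or_idle(self) -> bool:
        return self.state in [EdgeWorkerState.IDLE, EdgeWorkerState.RUNNING]

    @property
    def is_shutdown(self) -> bool:
        return self.state == EdgeWorkerState.SHUTDOWN_REQUEST


def check_state_properties() -> None:
    """Check each property against every state, without needing a running worker."""
    expected = {
        EdgeWorkerState.IDLE: (False, True, False),
        EdgeWorkerState.RUNNING: (False, True, False),
        EdgeWorkerState.MAINTENANCE_REQUEST: (True, False, False),
        EdgeWorkerState.SHUTDOWN_REQUEST: (False, False, True),
    }
    for state, flags in expected.items():
        info = WorkerInfo(state=state)
        actual = (info.is_maintenance, info.is_running_or_idle, info.is_shutdown)
        assert actual == flags, f"unexpected flags for {state}"


check_state_properties()
```

With pytest, the same table could be expressed through `pytest.mark.parametrize`, so each state shows up as its own test case.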

WDYT?

@jscheffl
Contributor Author

WDYT?

Yes, the code is not perfect and the state management has grown over time. I am also not 100% happy with it. If you have a good idea and some time... as always, contributions are welcome.

@jscheffl jscheffl force-pushed the feature/make-edge-worker-using-async-loop branch 5 times, most recently from ad8fc74 to 53921f2 Compare January 3, 2026 22:38
@dabla
Contributor

dabla commented Jan 6, 2026

Really like the refactorings being done here, looking forward to testing it once it's done ;-)

@jscheffl
Contributor Author

jscheffl commented Jan 6, 2026

Almost ready for review. I need to hunt down one bug that I saw (I had no time, was distracted by family). I hope I can make it ready for review by EOB today.

@jscheffl jscheffl force-pushed the feature/make-edge-worker-using-async-loop branch from 53921f2 to 507ba20 Compare January 6, 2026 20:37
@jscheffl jscheffl requested a review from dabla January 6, 2026 20:40
@jscheffl jscheffl marked this pull request as ready for review January 6, 2026 20:41