[RFC] [podman] Collect podman Data for All Users#4290
[RFC] [podman] Collect podman Data for All Users#4290sahithiRavindranath wants to merge 1 commit into
Conversation
| if user == 'root': | ||
| command = "" | ||
| else: | ||
| # Check if user has any containers before collecting data |
There was a problem hiding this comment.
Worth to move command = "sudo -u "+ user from line 88 here.
There was a problem hiding this comment.
I've removed all embedded sudo calls and now use the runas parameter instead, as requested by @TurboTurtle.
| if self.get_option('allusers'): | ||
| non_root_users = self.exec_cmd("lslogins -u -o USER --noheadings") | ||
| if non_root_users['status'] == 0: | ||
| for user in non_root_users['output'].splitlines() |
|
|
||
| self.add_cmd_tags({ | ||
| f'{command} podman images': 'podman_list_images', | ||
| f'{command} podman ps': 'podman_list_containers' |
There was a problem hiding this comment.
Do we (and should we) support adding one tag for multiple command outputs, @TurboTurtle ? This is being tagged for each and every user, incl. root.
There was a problem hiding this comment.
We do, and I believe we should. I'm not immediately opposed to adding the same tag that root gets to non-root users, but I'd suggest checking with at least the Insights team to see if this would throw off anything for them.
There was a problem hiding this comment.
Current implementation uses the same tag for all users. Should I switch to unique tags per user (e.g., podman_list_images_root, podman_list_images_user1) or keep the shared tag approach?"
| command = "" | ||
| else: | ||
| # Check if user has any containers before collecting data | ||
| cmd = self.exec_cmd(f"{command} podman ps -aq") |
There was a problem hiding this comment.
This fails if the user does not have permissions to access ${PWD} where the root is. E.g. trying this on a containerised AAP deployment where podman containers run under my aap user:
# who am I
root pts/0 2026-04-13 12:03 (10.44.32.126)
# pwd
/root/sos-main-pr4290
# sudo -u aap podman ps -aq
cannot chdir to /root/sos-main-pr4290: Permission denied
# su - aap -c "podman ps -aq"
8a634b8461cb
.. list of podman container ids ..
#
(but I dont know if the su - {user} -c .. is the best approach here)
There was a problem hiding this comment.
Fixed! Using the runas parameter instead of embedded sudo solves this - the sos framework handles working directory permissions properly. I've also added exception handling to gracefully skip users that can't be accessed
7db6e7b to
273aa73
Compare
|
Congratulations! One of the builds has completed. 🍾 You can install the built RPMs by following these steps:
Please note that the RPMs should be used only in a testing environment. |
| else: | ||
| command = "sudo -u "+ user |
There was a problem hiding this comment.
nack.
We do not permit embedded sudo calls in plugins. For any collection that needs to be done as a non-root user, you need to use the runas parameter to add_cmd_output(). I believe we have a single exception to this today in pulp, but it is something we have worked to eliminate over the last few years.
There was a problem hiding this comment.
Fixed! All embedded sudo calls removed. Now using runas parameter throughout.
| for user in users: | ||
| if not user: | ||
| continue | ||
| if user == 'root': | ||
| command = "" |
There was a problem hiding this comment.
Rather than nesting everything under a potentially large item loop like this, it'd be cleaner to instead break the collection into another method (or several, even) and call that method from the loop, with the user as an optional argument. The default here would be to run as root, and always be fired, with non-root iterations only firing if the users list above is populated at all (i.e. root should not be included).
This would be a cleaner approach and would also allow for easier future modification and any per-user validations we may want to do.
There was a problem hiding this comment.
broke the loop into several helper functions and calling the those function for root user always regardless if non-root_list is populated at all.
d22a871 to
d4f5786
Compare
|
I can see there are 1 test case failing on Debian and 2 on Ubuntu: Debian 13 Error: ModuleNotFoundError: No module named 'pkg_resources' This is nothing to do with my changes I think. The module is needed for package build. Ubuntu 25.10 DEB Results: PASS 254 | ERROR 0 | FAIL 1 | SKIP 26 | WARN 0 | INTERRUPT 0 | CANCEL 0 Job Time: 990.12 s Failed Test: 152-tests/report_tests/encryption_tests.py:EncryptedCleanedReportTest.test_archive_gpg_encrypted This is something to do with Encryption, which I think is not related to my patch. Ubuntu 25.10 Snap @pmoravec @TurboTurtle It seems the above test case failures are unrelated to the changes introduced by this pull request. Please confirm if that is not the case. |
|
@sahithiRavindranath if you rebase to main, the |
d4f5786 to
6d1b755
Compare
TurboTurtle
left a comment
There was a problem hiding this comment.
Looking very solid, just a few minor touch points below.
| try: | ||
| cmd = self.exec_cmd("podman ps -aq", runas=user) | ||
| return cmd['status'] == 0 and cmd['output'].strip() | ||
| except subprocess.SubprocessError: | ||
| return False |
There was a problem hiding this comment.
We shouldn't ever be throwing a subprocess exception here via exec_cmd(), or at least we don't intend to be doing so. Did you run into one? I'd rather fix that within Plugin (separately to this PR), than capture that here. The intent behind exec_cmd(), like anything else that culminates in a call to sos_get_command_output(), is that any exception results in a non-zero status value rather than trickling back up to the call.
There was a problem hiding this comment.
Yes, encountered this during testing when switching to certain users:
subprocess.SubprocessError: Exception occurred in preexec_fn.
File "/usr/lib64/python3.9/subprocess.py", line 1838, in _execute_child
raise child_exception_type(err_msg)
The exception occurs at utilities.py line 346 during Popen() initialization when preexec_fn fails during user switching (permission issues).
I've removed the try-except from the plugin. Can the exception handling be added in sos_get_command_output() as below?
diff --git a/sos/utilities.py b/sos/utilities.py
index 2646ab22..4910acbe 100644
--- a/sos/utilities.py
+++ b/sos/utilities.py
@@ -10,7 +10,7 @@ import os
import pwd
import re
import inspect
-from subprocess import Popen, PIPE, STDOUT
+from subprocess import Popen, PIPE, STDOUT, SubprocessError
import logging
import fnmatch
import errno
@@ -407,6 +407,11 @@ def sos_get_command_output(command, timeout=TIMEOUT_DEFAULT, stderr=False,
if e.errno == errno.ENOENT:
return {'status': 127, 'output': "", 'truncated': ''}
raise e
+ except SubprocessError:
+ # Handle exceptions in preexec_fn (e.g., user switching failures)
+ # Return 126 (command cannot execute)
+ return {'status': 126, 'output': "", 'truncated': ''}
finally:
if hasattr(_output, 'close'):
_output.close()There was a problem hiding this comment.
@TurboTurtle Following up on the diff above. The SubprocessError handling would prevent preexec_fn exceptions by returning status 126 instead. Thoughts on this approach?
| # XDG_RUNTIME_DIR is required for rootless podman to access | ||
| # user-specific runtime files and sockets | ||
| 'XDG_RUNTIME_DIR': f"/run/user/{pwd_user.pw_uid}" |
| # | ||
| # See the LICENSE file in the source distribution for further information. | ||
|
|
||
| import shlex |
There was a problem hiding this comment.
We haven't seen the need to shlex.quote the entities this is being added to previously, or at least none has been reported.
I'm curious what causes the need here? Not saying doing this is a problem or anything, just wondering.
There was a problem hiding this comment.
Added as defensive coding for edge cases with special characters in container/image/volume names. However, since it hasn't been needed historically, I've removed it per your feedback. All shlex.quote() calls and the import have been removed.
0e80347 to
4035372
Compare
|
|
||
| def _user_has_containers(self, user): | ||
| """Check if user has any containers before collecting data""" | ||
| cmd = self.exec_cmd("podman ps -aq", runas=user) |
There was a problem hiding this comment.
This does not scale well when there are tens or hundreds of users (I hope lslogins -u -o USER --noheadings does not list AD / LDAP users..?). I think it is fine for now, but we might need to improve it in future.
There was a problem hiding this comment.
Ack. I'll optimize it in future iterations.
TurboTurtle
left a comment
There was a problem hiding this comment.
Minor change I'd like to see, should have caught it earlier, my apologies for the review delay.
| non_root_users = self._get_non_root_users() | ||
| for user in non_root_users: | ||
| if self._user_has_containers(user): | ||
| self._collect_for_user(user, f'{user}/', subcmds) |
There was a problem hiding this comment.
One last change request here. Instead of putting user dirs in the top-level plugin directory, let's nest them, perhaps under a users dir or somesuch. Just to keep things a little more tidy.
There was a problem hiding this comment.
Fixed. User directories are now nested under users/
Issue Description: The current sos report only gathers podman data for the root user. It is unable to collect container data belonging to non‑root users because sos report runs with elevated (sudo) privileges. Solution: Added a new plugin option, allusers, to the podman plugin. When enabled, this option collects podman container data for all non‑root users which has container running on the system in addition to the root user. Signed-off-by: Sahithi Ravindranath <sahithi.Ravindranath@ibm.com>
|
rebased to main |
Issue Description:
The current sos report only gathers podman data for the root user. It is unable to collect container data belonging to non‑root users because sos report runs with elevated (sudo) privileges.
Solution:
Added a new plugin option, allusers, to the podman plugin. When enabled, this option collects podman container data for all non‑root users which has container running on the system in addition to the root user.
Please place an 'X' inside each '[]' to confirm you adhere to our Contributor Guidelines