IHS-193 Add support for file upload/download for CoreFileObject #792
Conversation
Walkthrough

Adds end-to-end file upload/download support to the SDK. New GraphQL multipart helpers (MultipartBuilder, multipart POST/request builders) and extended GraphQL types support file uploads. Introduces PreparedFile and FileHandler/FileHandlerSync for streaming uploads and downloads. InfrahubClient/InfrahubClientSync gain multipart upload execution, proxy-building, and streaming GET helpers. Node classes gain file state and APIs (is_file_object, upload_from_path, upload_from_bytes, clear_file, download_file) and validation. Protocol types include CoreFileObject. Extensive unit tests and pylint exceptions were added.

🚥 Pre-merge checks: ✅ 3 passed
Deploying infrahub-sdk-python with Cloudflare Pages

| Latest commit: | 66c9871 |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://6650b7f0.infrahub-sdk-python.pages.dev |
| Branch Preview URL: | https://gma-20260202-ihs193.infrahub-sdk-python.pages.dev |
Codecov Report

❌ Patch coverage is

```
@@            Coverage Diff             @@
##    infrahub-develop    #792     +/-  ##
==========================================
+ Coverage      80.36%   80.46%   +0.09%
==========================================
  Files            115      117       +2
  Lines           9865    10222     +357
  Branches        1504     1546      +42
==========================================
+ Hits            7928     8225     +297
- Misses          1415     1462      +47
- Partials         522      535      +13
==========================================
```

Flags with carried forward coverage won't be shown.

... and 1 file with indirect coverage changes
Use two distinct methods instead of one to clearly show intent:

- select_content_for_upload
- select_file_for_upload
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@infrahub_sdk/file_handler.py`:
- Around line 98-118: The handler handle_error_response should guard against
non-JSON responses by wrapping each exc.response.json() call in a try/except
(catching json.JSONDecodeError or ValueError) and falling back to
exc.response.text; for 401/403, if json parsing fails treat the body text as a
single message (or empty) when building the AuthenticationError message, and for
404 use the fallback text as the detail when constructing NodeNotFoundError;
preserve the original HTTPStatusError as the cause (raise ... from exc).
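The guard described above can be sketched in isolation. Note that `FakeResponse` and `extract_error_messages` are illustrative names for this sketch, not the SDK's actual API:

```python
import json


class FakeResponse:
    """Stand-in for an httpx.Response in this sketch (hypothetical)."""

    def __init__(self, text: str):
        self.text = text

    def json(self):
        return json.loads(self.text)


def extract_error_messages(response: FakeResponse) -> list[str]:
    # Wrap .json() so non-JSON bodies (e.g. HTML error pages from a proxy)
    # fall back to the raw response text instead of raising a second error.
    try:
        payload = response.json()
    except (json.JSONDecodeError, ValueError):
        return [response.text] if response.text else []
    errors = payload.get("errors", []) if isinstance(payload, dict) else []
    return [err.get("message", "") for err in errors]
```

The same try/except wrapping, applied around each `exc.response.json()` call and combined with `raise ... from exc`, preserves the original `HTTPStatusError` as the cause while still producing a usable message.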
In `@infrahub_sdk/node/node.py`:
- Around line 1160-1163: The error raised when self._file_object_support is true
and self._file_content is None references the wrong helper method name; update
the ValueError text in the block that checks self._file_object_support /
self._file_content (near usage of self._schema.kind) to instruct callers to use
the correct methods select_file_for_upload() or select_content_for_upload()
instead of set_file().
- Around line 2049-2052: The error message inside the check for
self._file_object_support and self._file_content should reference the actual
synchronous setter method name instead of the non-existent set_file(); update
the message to instruct callers to use the real sync setter (e.g.,
set_file_content()) — modify the ValueError string that currently mentions
set_file() so it references set_file_content() (or the actual sync method used
in this class) and keep the rest of the message intact; target the block that
checks _file_object_support and _file_content and update the message formatting
around _schema.kind accordingly.
🧹 Nitpick comments (1)
infrahub_sdk/client.py (1)
997-1035: Avoid mutating caller-provided `variables` in file-upload helpers. Both async and sync paths mutate the incoming dict. Prefer a new dict to avoid side effects.

Suggested refactor

```diff
-        variables = variables or {}
-        variables["file"] = None
+        variables = {**(variables or {}), "file": None}
@@
-        variables = variables or {}
-        variables["file"] = None
+        variables = {**(variables or {}), "file": None}
```

Also applies to: 2055-2093
I would prefer the
Force-pushed from e9e0f9e to 1b304a6
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@infrahub_sdk/protocols.py`:
- Around line 110-115: The CoreFileObject class (and the other affected
generated blocks referenced at lines 234-237, 675-680, 799-802) were edited
manually but this file is generated; revert any hand edits to these definitions
and re-run the generation task or update the source schema so the generator
emits the correct definitions (e.g., restore CoreFileObject and the other
generated classes/fields to match the generator output), do not make direct
changes in infrahub_sdk/protocols.py, and commit the regenerated file instead.
🧹 Nitpick comments (1)
infrahub_sdk/node/node.py (1)
224-288: Reuse `_validate_file_object_support` to avoid duplicated checks.
`upload_from_path` and `upload_from_bytes` repeat the same FeatureNotSupportedError logic; reusing the helper keeps error messaging consistent and centralized.

♻️ Suggested refactor

```diff
-        if not self._file_object_support:
-            raise FeatureNotSupportedError(
-                f"File upload is not supported for {self._schema.kind}. Only nodes inheriting from CoreFileObject support file uploads."
-            )
+        self._validate_file_object_support(
+            message=(
+                f"File upload is not supported for {self._schema.kind}. "
+                "Only nodes inheriting from CoreFileObject support file uploads."
+            )
+        )
@@
-        if not self._file_object_support:
-            raise FeatureNotSupportedError(
-                f"File upload is not supported for {self._schema.kind}. Only nodes inheriting from CoreFileObject support file uploads."
-            )
+        self._validate_file_object_support(
+            message=(
+                f"File upload is not supported for {self._schema.kind}. "
+                "Only nodes inheriting from CoreFileObject support file uploads."
+            )
+        )
```

Also applies to: 509-512
It was agreed to use `upload_from_path(path)` and `upload_from_bytes(content, name)`. Also updated protocols, as suggested; it seems a valid error.
Force-pushed from 1b304a6 to 72047a7
ogenstad left a comment
It would be good with some integration tests once we have things updated on the Infrahub side.
Actionable comments posted: 5
🤖 Fix all issues with AI agents
In `@infrahub_sdk/client.py`:
- Around line 997-1033: Summary: The method _execute_graphql_with_file mutates
the caller's variables dict by assigning variables["file"] = None; make a copy
first to avoid leaking state. Fix: inside _execute_graphql_with_file (and the
other similar block) replace the in-place modification with a shallow copy
(e.g., vars_copy = dict(variables) or variables.copy() after handling None) and
set vars_copy["file"] = None, then use vars_copy for the request; ensure you
handle variables being None before copying. Reference symbols:
_execute_graphql_with_file, variables, "file".
- Around line 1060-1090: The multipart upload retries can send an empty file
because the passed file-like object has been consumed; in the async
_post_multipart method (decorated with handle_relogin) seek the incoming
file_content to the start (file_content.seek(0)) before calling
MultipartBuilder.build_payload so the payload is built from the beginning of the
stream, and apply the identical change to the sync counterpart (the sync
_post_multipart or similarly named method used elsewhere) and any other
occurrences (the other multipart methods around the second block referenced) so
retries re-upload the full file.
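The rewind-before-retry behavior can be demonstrated in isolation. `build_payload` here is an illustrative stand-in, not the actual `MultipartBuilder` API:

```python
import io


def build_payload(file_content: io.BytesIO) -> bytes:
    # Rewind first: a retried request (e.g. after a relogin) would otherwise
    # read from an already-exhausted stream and upload an empty body.
    file_content.seek(0)
    return file_content.read()


stream = io.BytesIO(b"file data")
first_attempt = build_payload(stream)
retry_attempt = build_payload(stream)  # the retry still gets the full content
```

Without the `seek(0)`, the second call would return `b""`, which is exactly the empty-file symptom described above.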
In `@infrahub_sdk/ctl/branch.py`:
- Around line 122-123: The two occurrences of "pc.created_by" (used in
proposed_change_table.add_row and format_timestamp) currently use a blanket "#
type: ignore[attr-defined]" without justification; fix this by either updating
the type definitions for the object that pc is (so created_by is declared on
that model) or by replacing the ignore with an explicit, checkable
assertion/cast (e.g., assert isinstance(pc, ExpectedProposedChange) or cast to
the model type) and add a short inline comment explaining why created_by is
guaranteed to exist (e.g., "created_by is populated by the API vX response for
created proposed changes"); update the annotation for pc or add the assert/cast
next to the accesses of pc.created_by so mypy no longer needs the ignore and
future readers see the rationale.
In `@infrahub_sdk/node/node.py`:
- Around line 249-275: The docstring example for upload_from_bytes is unsafe
because it stores a file handle in self._file_content and then closes it (via
the with block) before node.save() runs, causing I/O errors; update the example
and text in upload_from_bytes to show one of two correct patterns: (a) read the
file into bytes and call upload_from_bytes(content=f.read(), name=...) outside a
with, or (b) keep the handle open and call node.save() inside the with block
(e.g., with open(..., "rb") as f: node.upload_from_bytes(content=f, name="...");
node.save()). Mention that this applies to nodes inheriting from CoreFileObject
and reference upload_from_bytes, self._file_content and save() in the docstring.
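Pattern (a) can be illustrated with a stub. `StubNode` stands in for a node inheriting from CoreFileObject; its internals are hypothetical and only mimic the hazard described above:

```python
import io


class StubNode:
    """Minimal stand-in for a CoreFileObject-backed node (hypothetical)."""

    def __init__(self) -> None:
        self._file_content = None
        self.saved_payload = None

    def upload_from_bytes(self, content, name: str) -> None:
        self._file_content = content

    def save(self) -> None:
        # Reading here would fail if a now-closed file handle had been stored
        content = self._file_content
        self.saved_payload = content if isinstance(content, bytes) else content.read()


node = StubNode()
# Pattern (a): read into bytes, so the handle may safely close before save()
with io.BytesIO(b"payload") as f:
    data = f.read()
node.upload_from_bytes(content=data, name="report.bin")
node.save()
```

Pattern (b) is the alternative: keep the handle open by calling `save()` inside the `with` block.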
In `@tests/unit/sdk/test_file_object.py`:
- Around line 78-105: The test uses type: ignore[union-attr] on attributes like
node.contract_start and node.contract_end in
test_node_create_with_file_uses_multipart (and the other listed tests) without
justification; replace each bare ignore by either asserting/casting the node to
the correct runtime type or adding a brief justification comment: e.g., use an
explicit runtime assertion (assert isinstance(node, InfrahubNode) or cast via
typing.cast) before accessing .contract_start/.contract_end, or add a one-line
comment explaining why the union attribute is present and safe (e.g., "# safe:
test constructs node with schema that guarantees contract_start attribute");
update the occurrences around InfrahubNode/InfrahubNodeSync instantiation so
mypy no longer needs type: ignore or the ignore is accompanied by the short
justification comment.
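The assert-based narrowing can be sketched generically. `Contract` and `get_node` are illustrative names, not types from the test suite:

```python
class Contract:
    contract_start = "2026-01-01"


def get_node() -> object:
    # Return type is deliberately wide, mimicking a union-typed factory
    return Contract()


node = get_node()
# A runtime assertion narrows the type for mypy, replacing a bare type: ignore;
# it also documents (and checks) the assumption the ignore was hiding.
assert isinstance(node, Contract)
start = node.contract_start
```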
🧹 Nitpick comments (3)
infrahub_sdk/node/node.py (3)
282-288: Consider moving async/sync file upload preparation to their respective classes.
`_get_file_for_upload` is an `async def` on `InfrahubNodeBase`, which means `InfrahubNodeSync` inherits an async method it never uses. Similarly, `_get_file_for_upload_sync` is inherited by `InfrahubNode`. This pattern follows from the existing codebase, but the guideline says to never mix async/sync inappropriately. Consider moving `_get_file_for_upload` to `InfrahubNode` and `_get_file_for_upload_sync` to `InfrahubNodeSync` to keep the separation clean, similar to how `_process_hierarchical_fields` is defined separately on each class.
791-824: Consider adding `@overload` signatures for better type narrowing.
`FileHandler.download` already uses `@overload` to distinguish `bytes` (no dest) vs `int` (with dest). Without matching overloads here, callers always receive `bytes | int` and must manually narrow. This is a minor DX concern.

Example overloads

```python
from typing import overload

@overload
async def download_file(self, dest: None = None) -> bytes: ...

@overload
async def download_file(self, dest: Path) -> int: ...

async def download_file(self, dest: Path | None = None) -> bytes | int: ...
```
1186-1210: Extract duplicated file-upload mutation logic into a helper.
The file-upload-or-regular-mutation block is repeated verbatim in `create` and `update` (both async and sync, 4 copies total). Consider extracting a helper like `_execute_mutation(query, input_data, tracker, timeout)` that encapsulates the "file" branch, the try/finally cleanup, `clear_file`, and the else branch. This would reduce the surface area for bugs if the logic needs to change later.

Also applies to: 1228-1252
This change adds support to create, update, and retrieve nodes whose schemas implement CoreFileObject.

The proposed user-exposed API follows these key points:

- `node.upload_from_path(path)` to select a file from disk for upload (streams during upload)
- `node.upload_from_bytes(content, name)` to set content for upload (supports bytes or BinaryIO)
- `node.download_file(dest)` to download a file to memory or stream it to disk

Being able to stream a file to or from disk is important in order to support large files and to avoid loading them completely into memory (which would be problematic for 1GB+ files in general).

The choice of `upload_from_path` and `upload_from_bytes` is to prevent a collision with a potential attribute or relationship called `file` in the schema. That is also the reason why the `file` GraphQL parameter sits outside the `data` one instead of inside it.

Here we introduce a couple of components to try to make our code SOLID (it's not much for now, but it's honest work):

- `FileHandler`/`FileHandlerSync`: dedicated classes for file I/O operations
- `MultipartBuilder`: GraphQL Multipart Request Spec payload building

It sure won't make our code SOLID, but it won't add to the burden for now.
So, for a user who has loaded a schema, using our SDK will look like this:
Upload a file when creating a node
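A sketch of the upload flow, assuming an async InfrahubClient and a schema kind (here "MyFileKind", a hypothetical name) that inherits from CoreFileObject:

```python
from pathlib import Path


async def upload_report(client) -> None:
    # "MyFileKind" is a hypothetical schema kind inheriting from CoreFileObject
    node = await client.create(kind="MyFileKind", name="report")
    # Select the file from disk; it is streamed during the save() call,
    # so the content is never fully loaded into memory
    node.upload_from_path(Path("/tmp/report.pdf"))
    await node.save()
```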
Download a file from a node
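And a sketch of the download side, using the `download_file` API described above (`node` is assumed to be a fetched CoreFileObject node):

```python
from pathlib import Path


async def fetch_report(node) -> bytes:
    # No destination: the whole file is returned as bytes in memory
    return await node.download_file()


async def stream_report(node, dest: Path) -> int:
    # With a destination path: content is streamed to disk,
    # and the number of bytes written is returned
    return await node.download_file(dest)
```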