Add version-specific schema processing using schema_features #2934
Review skipped: auto reviews are disabled on base/target branches other than the default branch.
📝 Walkthrough
This PR introduces schema versioning support and feature detection for JSON Schema and OpenAPI. It exposes new public enums, adds version detection utilities, and implements version-aware feature flags and data format mappings across parser classes.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
📚 Docs Preview: https://pr-2934.datamodel-code-generator.pages.dev
from datamodel_code_generator.enums import JsonSchemaVersion, OpenAPIVersion

if TYPE_CHECKING:
    from datamodel_code_generator.types import Types
Check notice
Code scanning / CodeQL
Unused import
Copilot Autofix
AI 3 days ago
To fix the problem, remove the unused import of Types from datamodel_code_generator.types inside the if TYPE_CHECKING: block. This resolves the static analysis warning without changing any runtime behavior, since if TYPE_CHECKING: blocks are ignored at runtime.
Concretely, in src/datamodel_code_generator/parser/schema_version.py, delete the line from datamodel_code_generator.types import Types that appears under if TYPE_CHECKING:. No other code changes or new imports are needed.
@@ -13,7 +13,7 @@
 from datamodel_code_generator.enums import JsonSchemaVersion, OpenAPIVersion

 if TYPE_CHECKING:
-    from datamodel_code_generator.types import Types
+    ...


 @dataclass(frozen=True)
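The claim that `if TYPE_CHECKING:` blocks are skipped at runtime can be checked with a small standalone sketch. It does not use the project's code; `describe` and the `Decimal` import are illustrative stand-ins.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by static type checkers; this import never runs.
    from decimal import Decimal


def describe(value: "Decimal") -> str:
    """The quoted annotation is never evaluated at runtime."""
    return f"value={value}"


print(TYPE_CHECKING)  # False at runtime
print(describe(3))    # value=3
```

Because the guarded import never executes, removing it cannot change runtime behavior; it only matters if an annotation still refers to the imported name.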
    discriminator_support: bool

    @classmethod
    def from_openapi_version(cls, version: OpenAPIVersion) -> OpenAPISchemaFeatures:
Check notice
Code scanning / CodeQL
Explicit returns mixed with implicit (fall through) returns
Copilot Autofix
AI 3 days ago
In general, to fix “explicit returns mixed with implicit returns”, ensure that every code path in a function with a non-None return annotation ends with an explicit return, including the fall-through at the end of the function. This avoids any possibility of returning None implicitly.
For OpenAPISchemaFeatures.from_openapi_version, the match currently has case OpenAPIVersion.V30: and a fallback case _: that both return cls(...). To eliminate any potential implicit return, we can simply add a final return after the match that returns a sensible default instance. Because the case _: branch already represents the default behavior, the safest and behavior-preserving change is to construct and return the same values as in the wildcard case at the end of the function. In practice, this return should be unreachable, but it satisfies the analyzer and guarantees the function never returns None, even if the match were later changed in a way that forgets a return.
Concretely, within src/datamodel_code_generator/parser/schema_version.py, after line 123 (the end of the case _: block in from_openapi_version), add an explicit return cls(...) with the same arguments as in the wildcard case. No new imports or additional helpers are needed.
@@ -121,6 +121,17 @@
                     nullable_keyword=False,
                     discriminator_support=True,
                 )
+        # Fallback return to avoid any implicit None; mirrors the default case.
+        return cls(
+            null_in_type_array=True,
+            defs_not_definitions=True,
+            prefix_items=True,
+            boolean_schemas=True,
+            id_field="$id",
+            definitions_key="$defs",
+            nullable_keyword=False,
+            discriminator_support=True,
+        )


 SchemaFeaturesT = TypeVar("SchemaFeaturesT", bound=JsonSchemaFeatures)
CodSpeed Performance Report
Merging #2934 will not alter performance
Actionable comments posted: 0
🧹 Nitpick comments (4)
src/datamodel_code_generator/parser/schema_version.py (1)
190-190: Remove unused noqa directives.
The `# noqa: PLC0415` comments on lines 190 and 251 are not needed as the PLC0415 rule is not enabled in your Ruff configuration.
🔎 Proposed fix
- from datamodel_code_generator.types import Types  # noqa: PLC0415
+ from datamodel_code_generator.types import Types
Apply the same change on line 251.
Also applies to: 251-251
src/datamodel_code_generator/parser/openapi.py (1)
176-176: Remove unused noqa directive.
The `# noqa: PLC0415` comment on line 176 is not needed as the PLC0415 rule is not enabled.
🔎 Proposed fix
- from datamodel_code_generator.parser.schema_version import (  # noqa: PLC0415
+ from datamodel_code_generator.parser.schema_version import (
      OpenAPISchemaFeatures,
      detect_openapi_version,
  )
src/datamodel_code_generator/__init__.py (1)
46-46: LGTM! Clean public API exposure.
The new version enums and detection functions are properly integrated:
- Enums imported from `enums` module
- Detection functions configured for lazy loading
- All additions included in `__all__` exports
The lazy import pattern is consistent with existing entries like `generate_dynamic_models`.
However, the `# noqa: F822` comments on lines 987-989 can be removed as Ruff is not flagging these lines:
🔎 Proposed fix
- "clear_dynamic_models_cache",  # noqa: F822
- "detect_jsonschema_version",  # noqa: F822
- "detect_openapi_version",  # noqa: F822
+ "clear_dynamic_models_cache",
+ "detect_jsonschema_version",
+ "detect_openapi_version",
Also applies to: 50-50, 54-54, 936-937, 974-974, 979-979, 986-986, 988-989
src/datamodel_code_generator/parser/jsonschema.py (1)
929-941: Consider using `_get_array_items` helper in existing code.
The `_get_array_items` staticmethod correctly handles version-aware array item resolution (`prefixItems` vs `items`), but it's not called anywhere in this file.
Similar logic exists in `parse_array_fields` (lines 2959-2970) that could potentially be refactored to use this helper for consistency and DRY.
Is this method intended for:
- Future refactoring of `parse_array_fields`?
- Use by openapi.py (listed as a dependent)?
- Public API for version-aware item handling?
If not actively used yet, consider either integrating it now or adding a TODO comment explaining its intended use.
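To make the `prefixItems` vs `items` distinction concrete, a version-aware helper could look roughly like the sketch below. This is not the PR's `_get_array_items`; the function name, signature, and return shape are assumptions made for illustration.

```python
from typing import Any


def positional_item_schemas(schema: dict[str, Any], *, prefix_items: bool) -> list[Any]:
    """Return the per-position (tuple) item schemas of an array schema.

    `prefix_items` is a hypothetical feature flag: True for drafts that
    define `prefixItems` (2020-12), False for earlier drafts.
    """
    if prefix_items and isinstance(schema.get("prefixItems"), list):
        return schema["prefixItems"]
    items = schema.get("items")
    # Before 2020-12, a list-valued "items" described the tuple positions.
    return items if isinstance(items, list) else []
```

A caller such as an array-field parser could then treat both spellings uniformly, which is the consistency benefit the comment above is pointing at.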
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- src/datamodel_code_generator/__init__.py
- src/datamodel_code_generator/enums.py
- src/datamodel_code_generator/parser/jsonschema.py
- src/datamodel_code_generator/parser/openapi.py
- src/datamodel_code_generator/parser/schema_version.py
- tests/data/expected/main/openapi/same_name_objects.py
- tests/parser/test_schema_version.py
💤 Files with no reviewable changes (1)
- tests/data/expected/main/openapi/same_name_objects.py
🧰 Additional context used
🧬 Code graph analysis (6)
tests/parser/test_schema_version.py (2)
- src/datamodel_code_generator/enums.py (3): JsonSchemaVersion (243-254), OpenAPIVersion (257-265), VersionMode (268-276)
- src/datamodel_code_generator/parser/schema_version.py (7): JsonSchemaFeatures (20-81), OpenAPISchemaFeatures (85-123), detect_jsonschema_version (139-165), detect_openapi_version (168-182), from_version (43-81), from_openapi_version (99-123), get_data_formats (261-277)
src/datamodel_code_generator/enums.py (1)
- src/datamodel_code_generator/model/enum.py (1): Enum (39-121)
src/datamodel_code_generator/__init__.py (1)
- src/datamodel_code_generator/enums.py (3): JsonSchemaVersion (243-254), OpenAPIVersion (257-265), VersionMode (268-276)
src/datamodel_code_generator/parser/openapi.py (2)
- src/datamodel_code_generator/enums.py (1): OpenAPIVersion (257-265)
- src/datamodel_code_generator/parser/schema_version.py (3): OpenAPISchemaFeatures (85-123), detect_openapi_version (168-182), from_openapi_version (99-123)
src/datamodel_code_generator/parser/schema_version.py (2)
- src/datamodel_code_generator/enums.py (2): JsonSchemaVersion (243-254), OpenAPIVersion (257-265)
- src/datamodel_code_generator/types.py (1): Types (955-994)
src/datamodel_code_generator/parser/jsonschema.py (2)
- src/datamodel_code_generator/enums.py (1): JsonSchemaVersion (243-254)
- src/datamodel_code_generator/parser/schema_version.py (3): JsonSchemaFeatures (20-81), detect_jsonschema_version (139-165), from_version (43-81)
🪛 Ruff (0.14.10)
src/datamodel_code_generator/__init__.py
987-987: Unused noqa directive (unused: F822)
Remove unused noqa directive
(RUF100)
988-988: Unused noqa directive (unused: F822)
Remove unused noqa directive
(RUF100)
989-989: Unused noqa directive (unused: F822)
Remove unused noqa directive
(RUF100)
src/datamodel_code_generator/parser/openapi.py
176-176: Unused noqa directive (non-enabled: PLC0415)
Remove unused noqa directive
(RUF100)
src/datamodel_code_generator/parser/schema_version.py
190-190: Unused noqa directive (non-enabled: PLC0415)
Remove unused noqa directive
(RUF100)
251-251: Unused noqa directive (non-enabled: PLC0415)
Remove unused noqa directive
(RUF100)
src/datamodel_code_generator/parser/jsonschema.py
790-790: Unused noqa directive (non-enabled: PLC0415)
Remove unused noqa directive
(RUF100)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: py312-isort5 on Ubuntu
- GitHub Check: py312-isort6 on Ubuntu
- GitHub Check: py312-isort7 on Ubuntu
- GitHub Check: 3.13 on Windows
- GitHub Check: 3.11 on Windows
- GitHub Check: py312-pydantic1 on Ubuntu
- GitHub Check: 3.12 on Windows
- GitHub Check: 3.10 on Windows
- GitHub Check: 3.14 on Windows
- GitHub Check: Analyze (python)
- GitHub Check: benchmarks
🔇 Additional comments (13)
src/datamodel_code_generator/enums.py (1)
243-276: LGTM! Well-structured version enums.
The three new enums (`JsonSchemaVersion`, `OpenAPIVersion`, `VersionMode`) are cleanly defined with clear docstrings explaining their purpose and the `Auto` variant behavior. The enum values follow standard naming conventions for their respective specifications.
src/datamodel_code_generator/parser/schema_version.py (3)
19-123: LGTM! Feature flags correctly mapped across versions.
The frozen dataclasses provide immutable feature flag sets for each schema version. The version-to-feature mappings are accurate:
- Draft 4's use of `id` vs later drafts' `$id`
- Boolean schemas introduced in Draft 6
- `$defs` replacing `definitions` in 2019-09
- OpenAPI 3.0's `nullable` keyword vs 3.1's type arrays
The default case for `Auto` and future versions sensibly enables all modern features.
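The mapping described in this comment can be sketched as follows. This is a simplified illustration, not the PR's `JsonSchemaFeatures`: `Draft`, `Features`, and `features_for` are hypothetical names, and only four of the flags are modelled.

```python
from dataclasses import dataclass
from enum import Enum


class Draft(Enum):  # hypothetical stand-in for JsonSchemaVersion
    DRAFT_4 = "draft-04"
    DRAFT_6 = "draft-06"
    DRAFT_7 = "draft-07"
    DRAFT_2019_09 = "2019-09"
    DRAFT_2020_12 = "2020-12"
    AUTO = "auto"


@dataclass(frozen=True)
class Features:  # reduced subset of the flags named in this PR
    boolean_schemas: bool
    prefix_items: bool
    id_field: str
    definitions_key: str


def features_for(version: Draft) -> Features:
    """Map a draft to its feature flags, defaulting to the newest behavior."""
    if version is Draft.DRAFT_4:
        return Features(boolean_schemas=False, prefix_items=False,
                        id_field="id", definitions_key="definitions")
    if version in (Draft.DRAFT_6, Draft.DRAFT_7):
        return Features(boolean_schemas=True, prefix_items=False,
                        id_field="$id", definitions_key="definitions")
    if version is Draft.DRAFT_2019_09:
        return Features(boolean_schemas=True, prefix_items=False,
                        id_field="$id", definitions_key="$defs")
    # 2020-12, AUTO, and anything newer: enable all modern features.
    return Features(boolean_schemas=True, prefix_items=True,
                    id_field="$id", definitions_key="$defs")
```

Freezing the dataclass means a parser can cache one instance per schema without worrying that downstream code mutates the flags.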
139-165: LGTM! Robust version detection with sensible fallbacks.
The detection priority is well-designed:
- Explicit `$schema` field (with type check)
- Heuristics (`$defs` vs `definitions`)
- Draft 7 fallback (most widely used)
The string type check on line 156 prevents errors when `$schema` contains unexpected values.
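The three-step priority above could be sketched like this. It is not the PR's `detect_jsonschema_version`: the function name, the string labels, the substring lookup table, and the draft chosen by the `$defs` heuristic are all assumptions.

```python
from typing import Any

# Hypothetical lookup: substrings of well-known $schema URIs -> draft label.
_KNOWN_DRAFTS = {
    "draft-04": "draft-04",
    "draft-06": "draft-06",
    "draft-07": "draft-07",
    "2019-09": "2019-09",
    "2020-12": "2020-12",
}


def guess_draft(schema: dict[str, Any]) -> str:
    """Guess the JSON Schema draft: $schema first, then heuristics, then Draft 7."""
    declared = schema.get("$schema")
    if isinstance(declared, str):  # ignore non-string $schema values
        for marker, draft in _KNOWN_DRAFTS.items():
            if marker in declared:
                return draft
    if "$defs" in schema:
        return "2020-12"  # assumption: which draft the heuristic picks
    # A "definitions" key or no hint at all: fall back to Draft 7.
    return "draft-07"
```

The `isinstance` check is what keeps a schema with, say, a numeric `$schema` from raising; it simply falls through to the heuristics.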
168-182: LGTM! Clean OpenAPI version detection.
The implementation correctly uses `startswith` to handle patch versions (e.g., 3.0.3, 3.1.0) and includes a type check to handle non-string values. Defaulting to V31 (latest) is appropriate.
src/datamodel_code_generator/parser/openapi.py (1)
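As a rough illustration of the OpenAPI detection described in the 168-182 comment, prefix matching on the document's `openapi` field might look like the sketch below. The function name and the string return values are hypothetical; the PR's `detect_openapi_version` returns enum members.

```python
from typing import Any


def guess_openapi_version(spec: dict[str, Any]) -> str:
    """Guess the OpenAPI minor version from the top-level `openapi` field."""
    declared = spec.get("openapi")
    if isinstance(declared, str):  # non-string values fall through to the default
        if declared.startswith("3.0"):
            return "3.0"
        if declared.startswith("3.1"):
            return "3.1"
    return "3.1"  # default to the latest supported version
```

Matching on the prefix rather than the full string is what lets patch releases such as 3.0.3 resolve without listing each one.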
173-182: LGTM! Consistent schema features implementation.
The `schema_features` property follows the same pattern as `JsonSchemaParser`, using `cached_property` for efficient lazy evaluation and local imports to avoid circular dependencies. The fallback to `Auto` when `raw_obj` is unavailable is appropriate.
tests/parser/test_schema_version.py (1)
1-423: LGTM! Excellent test coverage.
This comprehensive test suite validates:
- Version detection across all JSON Schema drafts and OpenAPI versions
- Edge cases (non-string values, missing fields, heuristics)
- Feature flag mappings for each version
- Immutability of frozen dataclasses
- Lazy import exposure from the main module
- Data format mappings with and without OpenAPI-specific formats
- Parser integration with real raw_obj data
The tests provide strong confidence in the correctness of the new versioning infrastructure.
src/datamodel_code_generator/parser/jsonschema.py (7)
31-31: LGTM! Clean imports for version-aware schema processing.
The imports of `JsonSchemaVersion` and `JsonSchemaFeatures` enable the new version-specific behavior throughout the parser.
Also applies to: 91-91
195-200: LGTM! Helpful documentation clarification.
The enhanced docstring properly documents that `Discriminator` is an OpenAPI-specific concept and explains why it's located in the JSON Schema parser (circular import avoidance).
519-533: LGTM! Well-designed for version-specific format handling.
Adding the `data_formats` parameter with a default fallback enables version-aware and parser-specific type mappings while maintaining backward compatibility.
748-756: LGTM! Good extensibility point for parser-specific formats.
The `_data_formats` cached property provides a clean override point for subclasses (e.g., OpenAPI parser) while maintaining backward compatibility.
757-772: LGTM! Correctly integrated version-aware format mappings.
The method now uses `self._data_formats` instead of the global mapping, enabling parser-specific and version-specific format resolution while respecting custom type mappings.
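The override point described in these comments can be pictured with a small sketch. The class names, the `resolve` method, and the format table below are hypothetical; only the idea of a `cached_property` that subclasses replace comes from the review.

```python
from functools import cached_property

# Hypothetical base table: schema type -> format -> target type name.
BASE_FORMATS = {"string": {"default": "str", "date": "date"}}


class BaseParser:
    @cached_property
    def _data_formats(self) -> dict:
        # Default: the shared mapping, as before the change.
        return BASE_FORMATS

    def resolve(self, type_: str, format_: str) -> str:
        formats = self._data_formats.get(type_, {})
        return formats.get(format_, formats.get("default", "Any"))


class OpenAPILikeParser(BaseParser):
    @cached_property
    def _data_formats(self) -> dict:
        # Copy before extending so the shared table stays untouched.
        merged = {name: dict(formats) for name, formats in BASE_FORMATS.items()}
        merged["string"]["binary"] = "bytes"
        return merged
```

Routing lookups through the instance property rather than a module-level table is what allows each parser, and potentially each detected version, to see its own set of formats.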
3617-3628: LGTM! Correct version-aware root ID handling.
The `root_id_context` now properly uses `schema_features.id_field` to support both Draft 4's `"id"` and Draft 6+'s `"$id"`, with appropriate fallback checks for lenient compatibility with mixed-version schemas.
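A lenient lookup of this kind might be sketched as below. The helper name and the exact fallback order are assumptions; the source only states that the version-appropriate field is preferred and that fallback checks exist.

```python
from typing import Any, Optional


def find_root_id(schema: dict[str, Any], id_field: str) -> Optional[str]:
    """Return the root identifier, preferring the version-appropriate key."""
    # Try the preferred key first, then both spellings as a lenient fallback.
    for key in (id_field, "$id", "id"):
        value = schema.get(key)
        if isinstance(value, str):
            return value
    return None
```

With this ordering a Draft 4 schema that mistakenly uses `$id` still resolves, while a schema carrying both keys follows the detected version.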
774-796: Remove unused noqa directive.
Line 790 has an unused `noqa` directive for `PLC0415` (import-outside-toplevel) which is not enabled in your Ruff configuration.
Otherwise, the version-aware `schema_paths` and `schema_features` implementation looks excellent. The lenient approach of checking both `$defs` and `definitions` (with version-appropriate prioritization) provides good compatibility with mixed-version schemas.
🔎 Proposed fix
  def schema_features(self) -> JsonSchemaFeatures:
      """Get schema features based on detected version."""
-     from datamodel_code_generator.parser.schema_version import (  # noqa: PLC0415
+     from datamodel_code_generator.parser.schema_version import (
          JsonSchemaFeatures,
          detect_jsonschema_version,
      )
⛔ Skipped due to learnings
Learnt from: koxudaxi
Repo: koxudaxi/datamodel-code-generator
PR: 2799
File: src/datamodel_code_generator/model/pydantic/__init__.py:43-43
Timestamp: 2025-12-25T09:22:22.481Z
Learning: In datamodel-code-generator project, defensive `# noqa: PLC0415` directives should be kept on lazy imports (imports inside functions/methods) even when Ruff reports them as unused via RUF100, to prepare for potential future Ruff configuration changes that might enable the import-outside-top-level rule.
Learnt from: koxudaxi
Repo: koxudaxi/datamodel-code-generator
PR: 2681
File: tests/cli_doc/test_cli_doc_coverage.py:82-82
Timestamp: 2025-12-18T13:43:16.235Z
Learning: In datamodel-code-generator project, Ruff preview mode is enabled via `lint.preview = true` in pyproject.toml. This enables preview rules like PLR6301 (no-self-use), so `noqa: PLR6301` directives are necessary and should not be removed even if RUF100 suggests they are unused.
2e1b066 to 9e57fb9
18f34ac to 837b245
4c3c9e9 to 811cf13
72d1df4 to dbc8045
dbc8045 to dd2aef2
* Add schema_features property to parsers for version detection
* Add version-specific schema processing using schema_features (#2934)
* Add --jsonschema-version and --openapi-version CLI options
* Add --schema-version and --schema-version-mode CLI options
* Regenerate CLI docs
* Add version-specific schema processing using schema_features
* Implement flag-based behavior control for schema version
* Add comprehensive version-specific feature checks with exclusive_as_number flag
* Replace getattr with direct config access for schema_version_mode
* docs: update llms.txt files Generated by GitHub Actions
* docs: update CLI reference documentation and prompt data 🤖 Generated by GitHub Actions
* Add SchemaFeaturesT generic type parameter to Parser
* Fix test snapshot: add exclusive_as_number field
* Refactor: genericize _create_default_config using _get_config_class
* Add parameterized e2e tests for --schema-version and --schema-version-mode
* docs: update CLI reference documentation and prompt data 🤖 Generated by GitHub Actions
* docs: update llms.txt files Generated by GitHub Actions
* Refactor: use _config_class_name class variable instead of method override
* docs: update CLI reference documentation and prompt data 🤖 Generated by GitHub Actions
* docs: update llms.txt files Generated by GitHub Actions
* Add Schema Version Support to docs navigation
* docs: update llms.txt files Generated by GitHub Actions
* Docs: add detailed unsupported features tables
* docs: update llms.txt files Generated by GitHub Actions
* Docs: add version info to unsupported features tables
* docs: update llms.txt files Generated by GitHub Actions
* Add e2e tests for schema version error handling and strict mode warnings
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Summary by CodeRabbit
New Features
Tests