docs: Fix upgrade guide API examples for FileScanConfigBuilder and ParquetSource #19397
Pull request overview
This PR fixes incorrect API examples in the upgrade guide documentation for migrating from ParquetExecBuilder to the new FileScanConfigBuilder and DataSourceExec pattern. The examples were updated to reflect the current API implementation, correcting method signatures, parameter types, and builder patterns that had evolved since the original documentation was written.
Key Changes
- Updated `ParquetSource` instantiation to use `TableSchema` parameter instead of parquet options
- Replaced `FileScanConfig::new()` with `FileScanConfigBuilder` pattern and removed the schema parameter
- Changed `with_projection()` to `with_projection_indices()` with proper error handling
- Updated final execution plan creation to use `DataSourceExec::from_data_source()`
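To make the shape of these changes concrete, here is a minimal sketch of a builder whose source carries the schema and whose projection setter is fallible. Every type below (`MockTableSchema`, `MockParquetSource`, `MockFileScanConfigBuilder`, `MockScanConfig`) and the helper `build_scan` are local stand-ins invented for illustration. They are not DataFusion's types, and the real signatures may differ.

```rust
// Local stand-in types: these are NOT DataFusion's real types or signatures.
#[derive(Debug, Clone, PartialEq)]
struct MockTableSchema {
    columns: Vec<String>,
}

#[derive(Debug, Clone, PartialEq)]
struct MockParquetSource {
    schema: MockTableSchema,
}

impl MockParquetSource {
    // The source owns the schema, so the builder does not take it separately.
    fn new(schema: MockTableSchema) -> Self {
        Self { schema }
    }
}

#[derive(Debug, PartialEq)]
struct MockScanConfig {
    url: String,
    source: MockParquetSource,
    projection: Option<Vec<usize>>,
}

struct MockFileScanConfigBuilder {
    url: String,
    source: MockParquetSource,
    projection: Option<Vec<usize>>,
}

impl MockFileScanConfigBuilder {
    // Only a URL and a source are needed; there is no schema parameter.
    fn new(url: &str, source: MockParquetSource) -> Self {
        Self { url: url.to_string(), source, projection: None }
    }

    // Fallible setter: rejects indices that fall outside the source schema.
    fn with_projection_indices(mut self, indices: Option<Vec<usize>>) -> Result<Self, String> {
        if let Some(ix) = &indices {
            let n = self.source.schema.columns.len();
            if let Some(bad) = ix.iter().find(|&&i| i >= n) {
                return Err(format!("projection index {bad} out of bounds for {n} columns"));
            }
        }
        self.projection = indices;
        Ok(self)
    }

    fn build(self) -> MockScanConfig {
        MockScanConfig { url: self.url, source: self.source, projection: self.projection }
    }
}

// Hypothetical helper: builds a config over a two-column schema and
// propagates the setter's error with `?`.
fn build_scan(indices: Vec<usize>) -> Result<MockScanConfig, String> {
    let schema = MockTableSchema { columns: vec!["a".into(), "b".into()] };
    let source = MockParquetSource::new(schema);
    let config = MockFileScanConfigBuilder::new("file:///", source)
        .with_projection_indices(Some(indices))?
        .build();
    Ok(config)
}
```

Calling `build_scan(vec![0, 1])` succeeds, while `build_scan(vec![2])` returns an error because the stand-in schema has only two columns. The point of the sketch is the `?` after the fallible setter, which is what "proper error handling" amounts to in a chained builder.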
This is an upgrade guide for an older version (which had a different API) -- Why would the upgrade match the current version?
Apologies, I misunderstood the purpose of the upgrade guide itself. 😅
Hi @mag1c1an1
I am not sure where your screenshots come from. This one shows the arguments from
The actual released 51 docs are here:
I wonder if you are trying to upgrade to the
Hi @alamb. I'm very sorry that I only saw your comment now. I upgraded to version 51, not the main branch. What puzzles me is the following description in the upgrading guide:
But in the API docs, the new function of
In my understanding, the code block above means that the API has changed in 51 and that users should make the necessary modifications: remove the code starting with '- ' and replace it with the code starting with '+ '. Am I mistaken in my interpretation? All the code blocks come from the "DataFusion 51" section of the upgrading guide. I hope I have conveyed my intention clearly.
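The reading of the diff block described above (drop the '- ' lines, keep the '+ ' lines) can be sketched as a small function. The name `apply_diff_snippet` is hypothetical, and the sketch assumes the markers are exactly `- ` and `+ ` at the start of a line, with every other line treated as unchanged context.

```rust
/// Apply a diff-style snippet: drop lines marked "- ", keep lines marked "+ "
/// (without the marker), and pass unmarked lines through unchanged.
fn apply_diff_snippet(snippet: &str) -> String {
    let mut out: Vec<&str> = Vec::new();
    for line in snippet.lines() {
        if line.starts_with("- ") {
            // Old code: the guide asks the reader to remove it.
            continue;
        } else if let Some(rest) = line.strip_prefix("+ ") {
            // New code: keep it, minus the marker.
            out.push(rest);
        } else {
            // Context line: unchanged by the upgrade.
            out.push(line);
        }
    }
    out.join("\n")
}
```

Applied to the two-line example from the guide (`- let source = ParquetSource::default();` followed by `+ let source = ParquetSource::new(table_schema);`), the result is the single `ParquetSource::new(table_schema)` line. That is why a reader upgrading to 51 would expect that call to compile against the 51 API.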
@mag1c1an1 you're correct that the upgrade guide is incorrect. It seems the PR was originally slated for v51 but was pushed to v52, and I guess we forgot to amend the upgrade guide. @ShashidharM0118 please double-check where we're supposed to be fixing the upgrade guide, as per what @mag1c1an1 stated. cc @adriangb
…upgrade-guide-api-mismatch
Apologies if I put docs in the wrong version and caused confusion. This was a big change and I'm sure I made mistakes. It seems like there's already a lot of discussion underway, and I myself am confused after reading this about what is in what version 😄. I'm happy to help clarify anything if needed; otherwise I will let the work continue.
Thanks @mag1c1an1 and @Jefffrey for the clarification! I've updated the docs
**Key changes:**

1. **FileSource constructors now require TableSchema**: All built-in file sources now take the schema in their constructor:

```diff
- let source = ParquetSource::default();
+ let source = ParquetSource::new(table_schema);
+ let source = ParquetSource::new(TableParquetOptions::default());
```
This is still not correct @ShashidharM0118; can you please double-check my comment above? We should not modify the existing text in place under v51, because this change is not supposed to be in v51; it is meant to be in v52. Moreover, these edits don't make sense. The existing text still states:
FileSource constructors now require TableSchema:
But the example has been changed so that this is no longer the case, which makes the section inconsistent.
Hi @ShashidharM0118, thanks for your contribution to #19393!
Regarding the update, could you please remove the sections I mentioned? Since these API changes haven't been introduced in v51 yet, this part remains the same for users upgrading from v50 or earlier. Keeping it might cause confusion, so it's best to revert those specific examples to match the current v51 implementation.







Which issue does this PR close?
Closes #19393
Rationale for this change
The upgrade guide contained incorrect API examples that didn't match the current implementation
What changes are included in this PR?
`ParquetExecBuilder` migration example in the upgrade guide to use correct API:
- `ParquetSource::new(parquet_options)` to `ParquetSource::new(table_schema)` with proper `TableSchema` creation
- `FileScanConfig::new(url, file_schema, source)` to `FileScanConfigBuilder::new(url, source)` (removed schema parameter)
- `with_table_partition_cols()` method call
- `with_projection()` to `with_projection_indices()` with proper error handling
- `DataSourceExec::from_data_source()`
Are these changes tested?
Are there any user-facing changes?
No