Skip to content

Conversation

@chiaweh2
Copy link
Contributor

@chiaweh2 chiaweh2 commented Dec 14, 2025

Change Summary

This pull request improves support for opening kerchunk datasets with the kerchunk engine, particularly distinguishing between local and remote reference files. It also adds new tests to ensure the correct behavior for both remote HTTPS URLs and local files when using the kerchunk engine.

Related issue number

Related to the PR #758. This fix bug of local access on the PR and also add unittest on both local and remote reference file access using Kerchunk engine.

Checklist

  • Unit tests for the changes exist
  • Tests pass on CI
  • Documentation reflects the changes where applicable

Enhancements to dataset opening logic:

  • Updated the _open_dataset function in source.py to avoid using fsspec when loading local reference files while engine set to kerchunk, ensuring correct handling by checking that xarray_open_kwargs['engine'] != 'kerchunk' before using fsspec.open_local.

Expanded test coverage:

  • Added test_open_dataset_kerchunk_engine to verify that remote kerchunk reference files accessed via HTTPS are properly opened with the kerchunk engine.
  • Added test_open_dataset_kerchunk_engine_local to verify that local kerchunk reference files are handled correctly, including testing with specific storage_options for remote S3 access.

@chiaweh2 chiaweh2 requested a review from mgrover1 as a code owner December 14, 2025 04:06
Copy link
Collaborator

@mgrover1 mgrover1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! Thanks for adding a test here too

Copy link
Collaborator

@mgrover1 mgrover1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wait - we need to make sure kerchunk is installed in the testing environment too

@chiaweh2
Copy link
Contributor Author

Thanks for the quick review! I did add the kerchunk package in the previous PR #758 in requirement.txt. Or is the test environment set somewhere else? Thanks!

@mgrover1
Copy link
Collaborator

Thanks for the quick review! I did add the kerchunk package in the previous PR #758 in requirement.txt. Or is the test environment set somewhere else? Thanks!

It needs to included here
https://github.com/intake/intake-esm/blob/main/ci/environment-upstream-dev.yml

As in the testing infrastructure, it does not install all the dependencies in the requirements.txt file

@chiaweh2
Copy link
Contributor Author

Looks like python3.10 is in conflict with the kerchunk version for the python3.10 CI test.

@andersy005
Copy link
Member

Looks like python3.10 is in conflict with the kerchunk version for the python3.10 CI test.

can we drop python3.10 in a separate PR and close this issue

?

@chiaweh2
Copy link
Contributor Author

@mgrover1 @andersy005 just checking to see do I need to do anything for the failed testing? Many thanks!

@mgrover1
Copy link
Collaborator

@mgrover1 @andersy005 just checking to see do I need to do anything for the failed testing? Many thanks!

correct! I am planning on submitting a PR tonight with the fix; then we can just merge the main branch in here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants