Class initialiser #7

Merged
ChrisSoderberg-ONS merged 23 commits into main from class_initialiser
Mar 2, 2026

Conversation

@Jday7879
Collaborator

Proposed Changes

Related Issues

  • Related to #
  • Fixes #

Pre-requisites

This section may not be fully required if the branch is not merging into main.
Please indicate items that aren't necessary and why, with comments around incomplete checks.

  • Version number has been incremented, according to SemVer
  • Changelog has been updated, listing changes to this version, using the Keep a Changelog format
  • New features are tested
  • New features are documented using the numpydoc docstring format
  • Other relevant package documentation is updated
  • For new functionality, examples are included in the docs or a feature request has
    been made for it/them
  • Required workflows and pre-commits succeed

        else:
            continue

    def _convert_schema_dtypes(self):
Collaborator

Wondering if this is necessary? Without this function, when supplying "int", "str", etc., the types still get coerced to PySpark types, which I assume pandera is doing for us.

Collaborator Author

I found that the schema was being formatted correctly, but without converting the strings into PySpark dtypes it would not actually trigger the type checks or the other checks. I can look into it further in the future and will add a backlog ticket to review this.
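For readers following the thread, the conversion being discussed can be sketched roughly as below. This is a minimal illustration, not the code in this PR: `build_pyspark_dtype_map` and `resolve_dtypes` are hypothetical names, the `"int"` entry is an assumption (only the `"float"`, `"string"`, `"str"` and `"bool"` entries appear in the diff), and the schema is assumed to be a plain column-to-dtype mapping.

```python
from typing import Any, Dict, Mapping


def build_pyspark_dtype_map() -> Dict[str, Any]:
    """Return a string-to-PySpark-dtype lookup (hypothetical helper)."""
    # Imported lazily so resolve_dtypes can be used and tested without PySpark.
    from pyspark.sql import types as T

    return {
        "int": T.IntegerType(),  # assumed entry, not shown in the diff
        "float": T.FloatType(),
        "string": T.StringType(),
        "str": T.StringType(),
        "bool": T.BooleanType(),
    }


def resolve_dtypes(
    columns: Mapping[str, Any], dtype_map: Mapping[str, Any]
) -> Dict[str, Any]:
    """Swap string dtype names for the dtype objects held in dtype_map."""
    resolved: Dict[str, Any] = {}
    for name, dtype in columns.items():
        if isinstance(dtype, str):
            key = dtype.strip().lower()
            if key not in dtype_map:
                # Fail loudly rather than silently skipping a type check.
                raise ValueError(
                    f"Unsupported dtype {dtype!r} for column {name!r}"
                )
            resolved[name] = dtype_map[key]
        else:
            # Already a dtype object, so pass it through unchanged.
            resolved[name] = dtype
    return resolved
```

With PySpark installed, `resolve_dtypes({"age": "int", "name": "str"}, build_pyspark_dtype_map())` would return the same columns keyed to dtype objects. Raising on unknown names is a design choice for the sketch, so that an unsupported string is reported rather than left unconverted.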

"float": T.FloatType(),
"string": T.StringType(),
"str": T.StringType(),
"bool": T.BooleanType(),
Collaborator

Anything for timestamps?

Collaborator Author

Nice spot, will add now.

Collaborator

ChrisSoderberg-ONS left a comment

Just a couple of comments on the PySparkValidator class

@Jday7879
Collaborator Author

Jday7879 commented Mar 2, 2026

Thanks Chris, I've updated this and added support for date, datetime and timestamp types. I'll expand on these and check they work once further examples and tests are developed (I would do it here, but this branch is already getting quite big!)
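As a rough illustration of what the added temporal entries might look like, here is a minimal sketch. The alias table and the `temporal_dtype` helper are hypothetical and not taken from the PR; in particular, mapping "datetime" to `TimestampType` is an assumption. `DateType` and `TimestampType` are existing classes in `pyspark.sql.types`.

```python
from typing import Any

# Assumed alias table: names from the comment above mapped to
# pyspark.sql.types class names. The "datetime" target is a guess.
TEMPORAL_TYPE_NAMES = {
    "date": "DateType",
    "datetime": "TimestampType",
    "timestamp": "TimestampType",
}


def temporal_dtype(name: str) -> Any:
    """Return a PySpark dtype instance for a temporal alias (hypothetical)."""
    # Look the alias up first so unknown names raise KeyError
    # before PySpark is imported.
    class_name = TEMPORAL_TYPE_NAMES[name.strip().lower()]

    # Imported lazily so the alias table can be inspected without PySpark.
    from pyspark.sql import types as T

    return getattr(T, class_name)()
```

Keeping the aliases as class names rather than instances is only a convenience for the sketch; a mapping in the style of the diff would hold `T.DateType()` and `T.TimestampType()` directly.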

@ChrisSoderberg-ONS
Collaborator

@Jday7879 approved!

ChrisSoderberg-ONS merged commit 59fecc5 into main on Mar 2, 2026
6 checks passed