Conversation
```python
else:
    continue


def _convert_schema_dtypes(self):
```
Wondering if this is necessary? Without this function, when supplying `"int"`, `"str"`, etc., the types still get coerced to PySpark types, which I assume pandera is doing for us.
I found that the schema was being formatted correctly, but without converting the strings into PySpark dtypes it would not actually trigger the type checks or the other checks. I'll take a closer look in the future and will add a backlog ticket to review.
```python
"float": T.FloatType(),
"string": T.StringType(),
"str": T.StringType(),
"bool": T.BooleanType(),
```
Anything for timestamps?
Nice spot, will add now.
ChrisSoderberg-ONS left a comment:
Just a couple of comments on the PySparkValidator class.
Thanks Chris, I've updated and added support for date, datetime and timestamp types. I'll expand these and check that they work when further examples and tests are developed (I would do it here, but this branch is already getting quite big!).
@Jday7879 approved!
Proposed Changes
Related Issues
Pre-requisites
This section may not be fully required if the branch is not merging into main. Please indicate any items that aren't necessary and why, with comments on incomplete checks.