Refactor legacy MinorTraceChemistry into the Ocotillo schema via backfill job #600

@kbighorse

Summary

Implement a repeatable, idempotent backfill job to migrate legacy NMA_MinorTraceChemistry records into the new Ocotillo schema (Observation, Sample, Parameter, Notes tables).

Source feature spec: tests/features/nma-chemistry-minortracechemistry-refactor.feature

Requirements

Core backfill (idempotent)

  • Create Observation records from legacy NMA_MinorTraceChemistry rows, keyed by GlobalID → nma_pk_chemistryresults
  • Link each Observation to its Sample via SamplePtID → nma_pk_chemistrysample
  • Re-running the job must not create duplicates
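The idempotency requirement can be illustrated with a minimal sketch. It assumes an in-memory dict standing in for the Observation table; the `Observation` dataclass and `backfill_observations` function are hypothetical names, not part of the Ocotillo codebase. A real job would rely on a unique constraint on nma_pk_chemistryresults and an upsert in the database instead.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    # Hypothetical in-memory stand-in for the Ocotillo Observation model.
    nma_pk_chemistryresults: str  # legacy GlobalID, the idempotency key
    nma_pk_chemistrysample: str   # legacy SamplePtID, used to find the Sample
    value: Optional[float] = None


def backfill_observations(legacy_rows, store):
    """Upsert one Observation per legacy row, keyed by GlobalID.

    `store` is a dict (GlobalID -> Observation) standing in for the table.
    Returns the number of newly created records.
    """
    created = 0
    for row in legacy_rows:
        key = row["GlobalID"]
        existing = store.get(key)
        if existing is None:
            store[key] = Observation(key, row["SamplePtID"], row.get("SampleValue"))
            created += 1
        else:
            # Re-run: refresh fields in place instead of inserting a duplicate.
            existing.nma_pk_chemistrysample = row["SamplePtID"]
            existing.value = row.get("SampleValue")
    return created
```

Running the function twice over the same rows creates records only on the first pass; the second pass updates them in place.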

Field mapping

| Legacy Field | Target | Notes |
|---|---|---|
| GlobalID | Observation.nma_pk_chemistryresults | Idempotency key |
| SamplePtID | Sample linkage via nma_pk_chemistrysample | Also links to Thing |
| Analyte | Parameter.parameter_name (matrix = "water") | Create Parameter if needed |
| SampleValue | Observation.value | |
| Units | Observation.unit | |
| AnalysisDate | Observation.observation_datetime | |
| AnalysisMethod | Observation.analysis_method_name | Preserve as-is (e.g. "Field analysis", "EPA 200.8") |
| AnalysesAgency | Observation.analysis_agency | |
| Uncertainty | Observation.uncertainty | |
| Symbol = < | Observation.detect_flag = false | Value is detection limit, not detected concentration |
| Volume | Sample.volume | Populated on related Sample |
| VolumeUnit | Sample.volume_unit | Populated on related Sample |
| Notes | Notes table record | target_table="observation", note_type="Chemistry Observation" |
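
The field mapping could be expressed as a single function that splits one legacy row into per-table payloads. This is a sketch under assumptions: `map_legacy_row` is a hypothetical name, payloads are plain dicts rather than ORM objects, and the note's `content` key is invented for illustration. The issue only specifies the `<` case for detect_flag, so other rows leave it unset here.

```python
def map_legacy_row(row):
    """Split one legacy NMA_MinorTraceChemistry row into target payloads.

    Returns (observation, sample, parameter, note); note is None when the
    legacy Notes field is empty. Unmapped fields (SamplePointID, OBJECTID,
    WCLab_ID) are deliberately never copied.
    """
    observation = {
        "nma_pk_chemistryresults": row["GlobalID"],
        "value": row.get("SampleValue"),
        "unit": row.get("Units"),
        "observation_datetime": row.get("AnalysisDate"),
        "analysis_method_name": row.get("AnalysisMethod"),  # preserved as-is
        "analysis_agency": row.get("AnalysesAgency"),
        "uncertainty": row.get("Uncertainty"),
    }
    if row.get("Symbol") == "<":
        # The value is a detection limit, not a detected concentration.
        observation["detect_flag"] = False

    sample = {
        "nma_pk_chemistrysample": row["SamplePtID"],
        "volume": row.get("Volume"),
        "volume_unit": row.get("VolumeUnit"),
    }
    parameter = {"parameter_name": row.get("Analyte"), "matrix": "water"}

    note = None
    if row.get("Notes"):
        note = {
            "target_table": "observation",
            "note_type": "Chemistry Observation",
            "content": row["Notes"],  # hypothetical column name
        }
    return observation, sample, parameter, note
```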

Unmapped fields (ignored)

  • SamplePointID, OBJECTID, WCLab_ID — not persisted in new schema
  • Volume and VolumeUnit are not ignored, but they are written to Sample, not Observation

Orphan prevention

  • Skip legacy records whose SamplePtID does not match an existing Sample
  • Report count of skipped records with reason (missing Sample linkage)
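
A minimal sketch of the skip-and-report step, assuming the set of existing Sample keys has already been loaded. `partition_orphans` and the reason string are hypothetical names chosen for illustration.

```python
from collections import Counter


def partition_orphans(legacy_rows, known_sample_ids):
    """Separate rows with a matching Sample from orphans.

    `known_sample_ids` is the set of nma_pk_chemistrysample values that
    already exist. Returns (linked_rows, skipped), where `skipped` counts
    skipped rows by reason.
    """
    linked = []
    skipped = Counter()
    for row in legacy_rows:
        if row.get("SamplePtID") in known_sample_ids:
            linked.append(row)
        else:
            skipped["missing Sample linkage"] += 1
    return linked, skipped
```

Only the `linked` rows would then be passed on to the backfill; the `skipped` counter feeds the job's summary report.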

Linkage integrity

  • Each Observation must link to its Sample and the Thing associated with that Sample (no orphaned observations)
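
A post-run integrity check might look like the sketch below. It assumes two plain mappings extracted from the database; the function name and the idea of representing the Thing link as an id that may be None are illustrative assumptions, not the Ocotillo API.

```python
def find_orphaned_observations(observations, samples):
    """Return keys of observations lacking a Sample or a Thing.

    `observations` maps nma_pk_chemistryresults -> nma_pk_chemistrysample.
    `samples` maps nma_pk_chemistrysample -> thing id (None if unlinked).
    """
    orphaned = []
    for obs_key, sample_key in observations.items():
        # Orphaned if the Sample is missing, or the Sample has no Thing.
        if sample_key not in samples or samples[sample_key] is None:
            orphaned.append(obs_key)
    return orphaned
```

An empty result means every Observation resolves to both a Sample and a Thing.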

Acceptance Criteria

All scenarios in the feature file pass:

  • Backfill creates Observation records and can be re-run without duplicates
  • Volume and VolumeUnit populate the related Sample
  • Observations link to Sample (and Thing) by SamplePtID
  • AnalysisMethod values are preserved as-is
  • Notes are stored in the Notes table and linked to the Observation
  • Symbol < sets detect_flag to false
  • Unmapped legacy fields are not persisted
  • Orphan legacy records are skipped and reported
