Water Level Import Reference.md

Water Level Import Reference

This page documents the water-level CSV bulk-upload feature in detail: the import workflow, transaction model, field mappings, validation rules, and operational notes.

For a broader overview of all CSV import surfaces, see CSV Ingestion and Validation. To compare this importer with the well inventory importer, see Well Inventory Import Reference.

What This Importer Does

Both of these entry points call the same service: bulk_upload_water_levels(...).

CLI: oco water-levels bulk-upload --file <path>
API: POST /observation/groundwater-level/bulk-upload

The importer reads a CSV file, validates each row, finds an existing well by name, and then creates or updates:

field_event
field_activity
sample
observation
field_event_participant for non-empty staff columns
contact records for staff names when no matching name + organization row already exists

It also links each imported groundwater-level sample to the structured participant who matches measuring_person.

It does not create new wells.

High-Level Flow

The importer runs in this order:

Read the CSV.
Normalize header aliases.
Check that all required headers are present.
Validate each row.
Look up the target well for each valid row.
Create or update database records.
Commit all successful rows at the end.

Rows with validation errors are collected in validation_errors. A bad row does not stop other rows from importing.

Workflow Overview

CSV Parsing and Header Standardization

The service accepts a file path, raw bytes, or a binary file-like object. CSV parsing uses Python's csv.DictReader.

Before validation, certain header aliases are mapped to their standard names:

Accepted alias	Standard name
`measurement_date_time`	`water_level_date_time`
`sampler`	`measuring_person`
`mp_height_ft`	`mp_height`

The following headers are silently ignored:

hold(not saved)
cut(not saved)

Any other unrecognized columns are also ignored — only known fields are processed.

If a row contains both an alias and its standard name, the importer keeps the later non-empty value in CSV column order. A later blank value does not overwrite an earlier non-blank one.

BOM warning: This importer does not handle UTF-8 files saved with a byte-order mark (utf-8-sig). Unlike the well inventory importer, a BOM will be left attached to the first header name, which will cause a missing-column error for that field.

Header validation

After aliases are normalized, the following required columns must be present:

well_name_point_id
field_event_date_time
field_staff
water_level_date_time
measuring_person
sample_method

Header validation runs before any data is written. If a required column is missing, the importer fails and produces file-level errors like CSV missing required column 'field_staff'.

Row validation

Each row goes through two validation phases:

Format validation (WaterLevelCsvRow):

parses timestamps and optional numeric fields
checks that enum fields contain allowed values
enforces cross-field constraints.

Database validation:

looks up well_name_point_id to confirm it matches exactly one water well
determines the effective measuring-point height
checks the corrected depth-to-water against the well depth.

Rows that fail validation are collected into validation_errors; other rows continue processing.

Record creation and update

For each valid row, the importer:

Builds a sample name using a fixed pattern: <well.name>-WL-<UTC YYYYMMDDHHMM>.
Checks for an existing groundwater-level sample for that well with the same sample name.
Normalizes field_staff, field_staff_2, and field_staff_3 into an ordered participant list without changing the CSV input shape.
Creates or updates records in field_event, field_activity, sample, and observation. Records are created if no match exists and updated if one does.
Creates or reuses structured participant data for each non-empty staff value.
Sets sample.field_event_participant_id to the unique participant matching measuring_person.

The groundwater-level Parameter is looked up once per file. If no matching parameter exists, the importer creates one with:

parameter_name = groundwater level
matrix = groundwater
parameter_type = Field Parameter
default_unit = ft

Every imported sample is stored with:

sample_matrix = groundwater
qc_type = Normal.

Well lookups are cached during the run, so repeated rows for the same well do not repeat the database query.

If measuring_person cannot be matched to exactly one participant for the field event, the importer rejects that row with a human-readable row error.

Response and exit behavior

The service returns a response with the following structure:

summary.total_rows_processed
summary.total_rows_imported
summary.validation_errors_or_warnings
water_levels
validation_errors

Exit behavior depends on the outcome:

Outcome	CLI exit code	API status
At least one row imported (even with other failures)	`0`	`200 OK`
Zero rows imported and errors present	`1`	`400 Bad Request`
Uploaded file is empty	—	`400 Bad Request`

Transaction Model

The water-level importer is not all-or-nothing at the row level, but it still uses one final commit for the file.

One SQLAlchemy session is opened for the entire file
Header and row validation run before any inserts or updates
All record writes run inside a single outer transaction
Each row gets its own savepoint (session.begin_nested()), so a failure on one row rolls back only that row
Successfully written rows stay pending in the outer transaction until a final session.commit()

Important consequence: other database sessions do not see imported rows until the final commit. If that final commit fails, all rows from that file roll back, including rows that had succeeded earlier.

Rerun Behavior

The water-level importer uses update-on-match behavior. This differs from the well inventory importer, which skips records that already exist.

How matches are identified: well name + activity type groundwater level + sample name <well.name>-WL-<UTC YYYYMMDDHHMM>

Because the sample name is derived from the measurement time converted to UTC, matching is UTC-based — not based on the raw timestamp string from the CSV.

When a matching sample is found:

The existing Sample record is reused
The linked FieldEvent date and notes are updated
FieldActivity.notes is updated
Structured participants are created or reused idempotently for the imported staff names
sample.field_event_participant_id is set or updated to the unique participant matching measuring_person
The groundwater-level Observation is updated if it exists
If the sample exists but has no groundwater-level observation, a new one is created without affecting other observations on that sample

If two rows in the same file share the same well and measurement time (after UTC conversion), they point to the same sample — the later row overwrites the values written by the earlier one.

Field Mapping Reference

CSV Column	Schema Field	DB Table	DB Column	Validation / Processing	Write Behavior
`well_name_point_id`	`well_name_point_id`	`thing`	`name`	Required. Must resolve to exactly one water well.	Lookup only — no well record is created or modified.
`field_event_date_time`	`field_event_date_time`	`field_event`	`event_date`	Required. ISO 8601 datetime. Timestamps without a timezone are assumed `America/Denver`, converted to UTC.	Sets or updates the field-event timestamp.
`field_staff`	`field_staff`	`contact`, `field_event_participant`	`name`, `participant_role`	Normalized with additional staff fields into an ordered participant list.	Resolves or creates a `Contact` for the staff name and creates or reuses a `FieldEventParticipant` with role `Lead`.
`field_staff_2`	`field_staff_2`	`contact`, `field_event_participant`	`name`, `participant_role`	Optional. Blank → `None`.	Resolves or creates a `Contact` and creates or reuses a `FieldEventParticipant` with role `Participant`.
`field_staff_3`	`field_staff_3`	none	none	Optional. Blank → `None`.	Resolves or creates a `Contact` and creates or reuses a `FieldEventParticipant` with role `Participant`.
`water_level_date_time` / `measurement_date_time`	`water_level_date_time`	`sample`	`sample_date`	Required. Alias via `measurement_date_time`. Timestamps without a timezone are assumed `America/Denver`, converted to UTC.	Sets or updates the sample timestamp.
`water_level_date_time` / `measurement_date_time`	`water_level_date_time`	`sample`	`sample_name`	Same processing as above.	Derives `<well>-WL-<UTC YYYYMMDDHHMM>`.
`water_level_date_time` / `measurement_date_time`	`water_level_date_time`	`observation`	`observation_datetime`	Same processing as above.	Sets or updates the observation timestamp.
`measuring_person` / `sampler`	`measuring_person`	`sample`	`field_event_participant_id`	Required. Must match one of `field_staff`, `field_staff_2`, or `field_staff_3`. Alias via `sampler`.	Links the sample to the unique structured participant matching the measuring person.
`sample_method`	`sample_method`	`sample`	`sample_method`	Required. Must be one of the allowed values (see below). Case-insensitive. Short aliases accepted (see below).	Sets or updates the sample method.
`sample_method`	`sample_method`	`sample`	`sample_matrix`	n/a	Always stored as `groundwater`.
`sample_method`	`sample_method`	`sample`	`qc_type`	n/a	Always stored as `Normal`.
`mp_height` / `mp_height_ft`	`mp_height`	`observation`	`measuring_point_height`	Optional float. Alias via `mp_height_ft`. Falls back to the well's current measuring-point height if blank.	Saves the measuring-point height used for the observation.
`depth_to_water_ft`	`depth_to_water_ft`	`observation`	`value`	Optional float. Must be `>= 0` when present.	Stored as the groundwater-level measurement value.
`depth_to_water_ft`	`depth_to_water_ft`	`observation`	`unit`	n/a	Always stored as `ft`.
`level_status`	`level_status`	`observation`	`groundwater_level_reason`	Required when `depth_to_water_ft` is blank. Must be one of the allowed values; short aliases accepted (see below).	Stored in a structured column, not as notes text.
`data_quality`	`data_quality`	`observation`	`nma_data_quality`	Optional. Must be one of the allowed values.	Stored in a structured column, not as notes text.
`water_level_notes`	`water_level_notes`	`field_event`	`notes`	Optional. Blank → `None`.	Stored as the field-event freeform notes only. Staff names are not written into note text.
`water_level_notes`	`water_level_notes`	`sample`	`notes`	Optional. Blank → `None`.	Stored on the sample.
`water_level_notes`	`water_level_notes`	`observation`	`notes`	Optional. Blank → `None`.	Stored on the observation.

The parameter table is also updated indirectly (see Record creation and update).

Tables not written by this workflow: note. The importer uses table note/notes columns, but it does not create rows in the polymorphic note table.

Validation Rules

Datetime handling

field_event_date_time and water_level_date_time accept ISO 8601 strings
Timestamps without a timezone are assumed to be America/Denver and converted to UTC before writing
water_level_date_time must be greater than or equal to field_event_date_time

Blank value handling

Optional text fields: blank becomes None.
Optional numeric fields: blank becomes None.
Required fields: blank after trimming is rejected.

Enum-backed fields

sample_method, level_status, and data_quality are validated against allowed-value lists built from lexicon categories in core/enums.py. Matching is case-insensitive and whitespace-tolerant. The standardized value is written to the database regardless of how it was entered in the CSV.

sample_method accepted values:

CSV value	Stored as
`electric tape`	`Electric tape measurement (E-probe)`
`steel tape`	`Steel-tape measurement`

level_status accepted values:

CSV value	Stored as
`dry`	`Site was dry`
`obstructed` / `obstruction`	`Obstruction was encountered in the well (no level recorded)`
`flowing`	`Site was flowing. Water level or head couldn't be measured w/out additional equipment.`
`flowing recently`	`Site was flowing recently.`
`pumped`	`Site was being pumped`
`pumped recently`	`Site was pumped recently`
`not affected`	`Water level not affected`
`other`	`Other conditions exist that would affect the level (remarks)`

Cross-field rules

At least one of depth_to_water_ft or level_status must be present per row
measuring_person must match one of field_staff, field_staff_2, or field_staff_3
measuring_person must resolve to exactly one structured participant for the field event; duplicate matching staff names on the same row cause the row to fail
depth_to_water_ft must be >= 0 when present — negative values are rejected

Well lookup

well_name_point_id must match exactly one water well — zero or multiple matches both produce a row-level error
Matched wells are cached by name for the duration of the file run

Measuring-point height and depth check

The measuring-point height used for an observation is determined in this order:

CSV mp_height
Existing Thing.measuring_point_height from the database
None

If both the CSV and the well record have a measuring-point height and the values differ, the CSV value is used and a warning is added to validation_errors.

When depth_to_water_ft, a measuring-point height, and a well depth are all present, the importer checks:

depth_to_water_ft - measuring_point_height < well_depth

Rows that fail this check are rejected.

Summary and Warning Semantics

summary.validation_errors_or_warnings is not a simple count of entries in validation_errors.

The service counts the number of unique row numbers found in messages matching the pattern Row N: .... This means:

Multiple errors for the same row count as one
Warning rows (e.g., measuring-point height mismatches) are counted the same way as error rows
Messages that are not scoped to a specific row fall back to a simple message count

Both warnings and errors are reported in the same validation_errors array.

Operational Notes

File handling behavior

Behavior	Detail
Extra columns	Ignored
Column order	Does not matter
Alias headers	Accepted
Empty file (API)	Rejected with `400 Uploaded file is empty` before parsing
BOM (`utf-8-sig`)	Not handled — causes a missing-column error for the first field
Duplicate headers	Not explicitly rejected — behavior is undefined if the same standard column name appears more than once
Row count limit	None enforced
Delimiter validation	None — files using semicolons or tabs as delimiters will surface as missing required-column errors

Differences from the well inventory importer

Dimension	Water level importer	Well inventory importer
Rerun behavior	Updates existing records	Skips existing records
Transaction model	Best-effort per row (savepoints)	Best-effort per row (savepoints)
Creates wells or locations	No — links to existing wells only	Yes
Creates contacts or notes	Reuses or creates `contact` rows for staff names. Does not create rows in the polymorphic `note` table.	Creates contacts and note-backed well metadata.
Structured field participants	Yes. Creates or reuses `field_event_participant` rows and sets `sample.field_event_participant_id` when `measuring_person` resolves uniquely.	Yes for `field_event_participant` rows during well inventory import, but does not currently set `sample.field_event_participant_id` on the groundwater-level sample it creates.
BOM support	No	Yes (`utf-8-sig`)
Row count limit	None	2,000
`level_status` / `data_quality` storage	Structured columns	Notes text

Known Gaps

The importer assumes measurements are depth-to-water values.
Elevation-based and pressure-transducer bulk uploads are not supported through this flow.
The bulk upload can create or update records, but it cannot delete them.
The CSV/schema contract still supports only three staff columns (field_staff, field_staff_2, field_staff_3), even though the importer now normalizes them internally into an iterable participant list.
Reruns update matching records in place, but there is no explicit policy for partial field changes beyond normal update behavior.

Water Level Import Reference.md

Water Level Import Reference

What This Importer Does

High-Level Flow

Workflow Overview

CSV Parsing and Header Standardization

Header validation

Row validation

Record creation and update

Response and exit behavior

Transaction Model

Rerun Behavior

Field Mapping Reference

Validation Rules

Datetime handling

Blank value handling

Enum-backed fields

Cross-field rules

Well lookup

Measuring-point height and depth check

Summary and Warning Semantics

Operational Notes

File handling behavior

Differences from the well inventory importer

Known Gaps

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Clone this wiki locally