-
Notifications
You must be signed in to change notification settings - Fork 3
Water Level Import Reference.md
This page documents the water-level CSV bulk-upload feature in detail: the import workflow, transaction model, field mappings, validation rules, and operational notes.
For a broader overview of all CSV import surfaces, see CSV Ingestion and Validation. To compare this importer with the well inventory importer, see Well Inventory Import Reference.
Both of these entry points call the same service: bulk_upload_water_levels(...).
- CLI:
oco water-levels bulk-upload --file <path> - API:
POST /observation/groundwater-level/bulk-upload
The importer reads a CSV file, validates each row, finds an existing well by name, and then creates or updates:
field_eventfield_activitysampleobservation-
field_event_participantfor non-empty staff columns -
contactrecords for staff names when no matchingname + organizationrow already exists
It also links each imported groundwater-level sample to the structured participant who matches measuring_person.
It does not create new wells.
The importer runs in this order:
- Read the CSV.
- Normalize header aliases.
- Check that all required headers are present.
- Validate each row.
- Look up the target well for each valid row.
- Create or update database records.
- Commit all successful rows at the end.
Rows with validation errors are collected in validation_errors. A bad row does not stop other rows from importing.
The service accepts a file path, raw bytes, or a binary file-like object. CSV parsing uses Python's csv.DictReader.
Before validation, certain header aliases are mapped to their standard names:
| Accepted alias | Standard name |
|---|---|
measurement_date_time |
water_level_date_time |
sampler |
measuring_person |
mp_height_ft |
mp_height |
The following headers are silently ignored:
hold(not saved)cut(not saved)
Any other unrecognized columns are also ignored — only known fields are processed.
If a row contains both an alias and its standard name, the importer keeps the later non-empty value in CSV column order. A later blank value does not overwrite an earlier non-blank one.
BOM warning: This importer does not handle UTF-8 files saved with a byte-order mark (
utf-8-sig). Unlike the well inventory importer, a BOM will be left attached to the first header name, which will cause a missing-column error for that field.
After aliases are normalized, the following required columns must be present:
well_name_point_idfield_event_date_timefield_staffwater_level_date_timemeasuring_personsample_method
Header validation runs before any data is written. If a required column is missing, the importer fails and produces file-level errors like CSV missing required column 'field_staff'.
Each row goes through two validation phases:
Format validation (WaterLevelCsvRow):
- parses timestamps and optional numeric fields
- checks that enum fields contain allowed values
- enforces cross-field constraints.
Database validation:
- looks up
well_name_point_idto confirm it matches exactly one water well - determines the effective measuring-point height
- checks the corrected depth-to-water against the well depth.
Rows that fail validation are collected into validation_errors; other rows continue processing.
For each valid row, the importer:
- Builds a sample name using a fixed pattern:
<well.name>-WL-<UTC YYYYMMDDHHMM>. - Checks for an existing groundwater-level sample for that well with the same sample name.
- Normalizes
field_staff,field_staff_2, andfield_staff_3into an ordered participant list without changing the CSV input shape. - Creates or updates records in
field_event,field_activity,sample, andobservation. Records are created if no match exists and updated if one does. - Creates or reuses structured participant data for each non-empty staff value.
- Sets
sample.field_event_participant_idto the unique participant matchingmeasuring_person.
The groundwater-level Parameter is looked up once per file. If no matching parameter exists, the importer creates one with:
parameter_name = groundwater levelmatrix = groundwaterparameter_type = Field Parameterdefault_unit = ft
Every imported sample is stored with:
sample_matrix = groundwater-
qc_type = Normal.
Well lookups are cached during the run, so repeated rows for the same well do not repeat the database query.
If measuring_person cannot be matched to exactly one participant for the field event, the importer rejects that row with a human-readable row error.
The service returns a response with the following structure:
summary.total_rows_processedsummary.total_rows_importedsummary.validation_errors_or_warningswater_levelsvalidation_errors
Exit behavior depends on the outcome:
| Outcome | CLI exit code | API status |
|---|---|---|
| At least one row imported (even with other failures) | 0 |
200 OK |
| Zero rows imported and errors present | 1 |
400 Bad Request |
| Uploaded file is empty | — | 400 Bad Request |
The water-level importer is not all-or-nothing at the row level, but it still uses one final commit for the file.
- One SQLAlchemy session is opened for the entire file
- Header and row validation run before any inserts or updates
- All record writes run inside a single outer transaction
- Each row gets its own savepoint (
session.begin_nested()), so a failure on one row rolls back only that row - Successfully written rows stay pending in the outer transaction until a final
session.commit()
Important consequence: other database sessions do not see imported rows until the final commit. If that final commit fails, all rows from that file roll back, including rows that had succeeded earlier.
The water-level importer uses update-on-match behavior. This differs from the well inventory importer, which skips records that already exist.
How matches are identified: well name + activity type groundwater level + sample name <well.name>-WL-<UTC YYYYMMDDHHMM>
Because the sample name is derived from the measurement time converted to UTC, matching is UTC-based — not based on the raw timestamp string from the CSV.
When a matching sample is found:
- The existing
Samplerecord is reused - The linked
FieldEventdate and notes are updated -
FieldActivity.notesis updated - Structured participants are created or reused idempotently for the imported staff names
-
sample.field_event_participant_idis set or updated to the unique participant matchingmeasuring_person - The groundwater-level
Observationis updated if it exists - If the sample exists but has no groundwater-level observation, a new one is created without affecting other observations on that sample
If two rows in the same file share the same well and measurement time (after UTC conversion), they point to the same sample — the later row overwrites the values written by the earlier one.
| CSV Column | Schema Field | DB Table | DB Column | Validation / Processing | Write Behavior |
|---|---|---|---|---|---|
well_name_point_id |
well_name_point_id |
thing |
name |
Required. Must resolve to exactly one water well. | Lookup only — no well record is created or modified. |
field_event_date_time |
field_event_date_time |
field_event |
event_date |
Required. ISO 8601 datetime. Timestamps without a timezone are assumed America/Denver, converted to UTC. |
Sets or updates the field-event timestamp. |
field_staff |
field_staff |
contact, field_event_participant
|
name, participant_role
|
Normalized with additional staff fields into an ordered participant list. | Resolves or creates a Contact for the staff name and creates or reuses a FieldEventParticipant with role Lead. |
field_staff_2 |
field_staff_2 |
contact, field_event_participant
|
name, participant_role
|
Optional. Blank → None. |
Resolves or creates a Contact and creates or reuses a FieldEventParticipant with role Participant. |
field_staff_3 |
field_staff_3 |
none | none | Optional. Blank → None. |
Resolves or creates a Contact and creates or reuses a FieldEventParticipant with role Participant. |
water_level_date_time / measurement_date_time
|
water_level_date_time |
sample |
sample_date |
Required. Alias via measurement_date_time. Timestamps without a timezone are assumed America/Denver, converted to UTC. |
Sets or updates the sample timestamp. |
water_level_date_time / measurement_date_time
|
water_level_date_time |
sample |
sample_name |
Same processing as above. | Derives <well>-WL-<UTC YYYYMMDDHHMM>. |
water_level_date_time / measurement_date_time
|
water_level_date_time |
observation |
observation_datetime |
Same processing as above. | Sets or updates the observation timestamp. |
measuring_person / sampler
|
measuring_person |
sample |
field_event_participant_id |
Required. Must match one of field_staff, field_staff_2, or field_staff_3. Alias via sampler. |
Links the sample to the unique structured participant matching the measuring person. |
sample_method |
sample_method |
sample |
sample_method |
Required. Must be one of the allowed values (see below). Case-insensitive. Short aliases accepted (see below). | Sets or updates the sample method. |
sample_method |
sample_method |
sample |
sample_matrix |
n/a | Always stored as groundwater. |
sample_method |
sample_method |
sample |
qc_type |
n/a | Always stored as Normal. |
mp_height / mp_height_ft
|
mp_height |
observation |
measuring_point_height |
Optional float. Alias via mp_height_ft. Falls back to the well's current measuring-point height if blank. |
Saves the measuring-point height used for the observation. |
depth_to_water_ft |
depth_to_water_ft |
observation |
value |
Optional float. Must be >= 0 when present. |
Stored as the groundwater-level measurement value. |
depth_to_water_ft |
depth_to_water_ft |
observation |
unit |
n/a | Always stored as ft. |
level_status |
level_status |
observation |
groundwater_level_reason |
Required when depth_to_water_ft is blank. Must be one of the allowed values; short aliases accepted (see below). |
Stored in a structured column, not as notes text. |
data_quality |
data_quality |
observation |
nma_data_quality |
Optional. Must be one of the allowed values. | Stored in a structured column, not as notes text. |
water_level_notes |
water_level_notes |
field_event |
notes |
Optional. Blank → None. |
Stored as the field-event freeform notes only. Staff names are not written into note text. |
water_level_notes |
water_level_notes |
sample |
notes |
Optional. Blank → None. |
Stored on the sample. |
water_level_notes |
water_level_notes |
observation |
notes |
Optional. Blank → None. |
Stored on the observation. |
The parameter table is also updated indirectly (see Record creation and update).
Tables not written by this workflow: note. The importer uses table note/notes columns, but it does not create rows in the polymorphic note table.
-
field_event_date_timeandwater_level_date_timeaccept ISO 8601 strings - Timestamps without a timezone are assumed to be
America/Denverand converted to UTC before writing -
water_level_date_timemust be greater than or equal tofield_event_date_time
- Optional text fields: blank becomes
None. - Optional numeric fields: blank becomes
None. - Required fields: blank after trimming is rejected.
sample_method, level_status, and data_quality are validated against allowed-value lists built from lexicon categories in core/enums.py. Matching is case-insensitive and whitespace-tolerant. The standardized value is written to the database regardless of how it was entered in the CSV.
sample_method accepted values:
| CSV value | Stored as |
|---|---|
electric tape |
Electric tape measurement (E-probe) |
steel tape |
Steel-tape measurement |
level_status accepted values:
| CSV value | Stored as |
|---|---|
dry |
Site was dry |
obstructed / obstruction
|
Obstruction was encountered in the well (no level recorded) |
flowing |
Site was flowing. Water level or head couldn't be measured w/out additional equipment. |
flowing recently |
Site was flowing recently. |
pumped |
Site was being pumped |
pumped recently |
Site was pumped recently |
not affected |
Water level not affected |
other |
Other conditions exist that would affect the level (remarks) |
- At least one of
depth_to_water_ftorlevel_statusmust be present per row -
measuring_personmust match one offield_staff,field_staff_2, orfield_staff_3 -
measuring_personmust resolve to exactly one structured participant for the field event; duplicate matching staff names on the same row cause the row to fail -
depth_to_water_ftmust be>= 0when present — negative values are rejected
-
well_name_point_idmust match exactly one water well — zero or multiple matches both produce a row-level error - Matched wells are cached by name for the duration of the file run
The measuring-point height used for an observation is determined in this order:
- CSV
mp_height - Existing
Thing.measuring_point_heightfrom the database None
If both the CSV and the well record have a measuring-point height and the values differ, the CSV value is used and a warning is added to validation_errors.
When depth_to_water_ft, a measuring-point height, and a well depth are all present, the importer checks:
depth_to_water_ft - measuring_point_height < well_depth
Rows that fail this check are rejected.
summary.validation_errors_or_warnings is not a simple count of entries in validation_errors.
The service counts the number of unique row numbers found in messages matching the pattern Row N: .... This means:
- Multiple errors for the same row count as one
- Warning rows (e.g., measuring-point height mismatches) are counted the same way as error rows
- Messages that are not scoped to a specific row fall back to a simple message count
Both warnings and errors are reported in the same validation_errors array.
| Behavior | Detail |
|---|---|
| Extra columns | Ignored |
| Column order | Does not matter |
| Alias headers | Accepted |
| Empty file (API) | Rejected with 400 Uploaded file is empty before parsing |
BOM (utf-8-sig) |
Not handled — causes a missing-column error for the first field |
| Duplicate headers | Not explicitly rejected — behavior is undefined if the same standard column name appears more than once |
| Row count limit | None enforced |
| Delimiter validation | None — files using semicolons or tabs as delimiters will surface as missing required-column errors |
| Dimension | Water level importer | Well inventory importer |
|---|---|---|
| Rerun behavior | Updates existing records | Skips existing records |
| Transaction model | Best-effort per row (savepoints) | Best-effort per row (savepoints) |
| Creates wells or locations | No — links to existing wells only | Yes |
| Creates contacts or notes | Reuses or creates contact rows for staff names. Does not create rows in the polymorphic note table. |
Creates contacts and note-backed well metadata. |
| Structured field participants | Yes. Creates or reuses field_event_participant rows and sets sample.field_event_participant_id when measuring_person resolves uniquely. |
Yes for field_event_participant rows during well inventory import, but does not currently set sample.field_event_participant_id on the groundwater-level sample it creates. |
| BOM support | No | Yes (utf-8-sig) |
| Row count limit | None | 2,000 |
level_status / data_quality storage |
Structured columns | Notes text |
- The importer assumes measurements are depth-to-water values.
- Elevation-based and pressure-transducer bulk uploads are not supported through this flow.
- The bulk upload can create or update records, but it cannot delete them.
- The CSV/schema contract still supports only three staff columns (
field_staff,field_staff_2,field_staff_3), even though the importer now normalizes them internally into an iterable participant list. - Reruns update matching records in place, but there is no explicit policy for partial field changes beyond normal update behavior.