Merged
2 changes: 1 addition & 1 deletion BACKLOG.md
@@ -167,7 +167,7 @@ Description formats:
| 151 | Feature | [[E09] Shadow track rendering and lifecycle](https://github.com/debrief/debrief-future/issues/373) — render previous solution as faded overlay on map during recalculation; auto-display on any recalc trigger, dismiss button in Analysis Log, replacement on subsequent recalc, transient-only (requires #150) | 4 | 5 | 3 | 12 | Medium | proposed |
| ~~180~~ | ~~Infrastructure~~ | ~~[[E10] Platform registry — unified vessel class + platform tree](docs/ideas/E10-nl-assisted-catalog-discovery.md) — `shared/data/platform-registry.yaml` with platforms as leaf instances under their class; Python + TypeScript loaders; seeded with 10 known platforms~~ | ~~5~~ | ~~3~~ | ~~4~~ | ~~12~~ | ~~Medium~~ | ~~complete~~ |
| ~~181~~ | ~~Infrastructure~~ | ~~[[E10] LinkML schema update — per-platform override fields](docs/ideas/E10-nl-assisted-catalog-discovery.md) — optional `display_name`, `nationality`, `vessel_class`, `vessel_type`, `vessel_role`, `domain` on TrackProperties; `debrief:platforms` STAC extension replacing flat aggregates; regen Pydantic + TS types (requires #180)~~ | ~~5~~ | ~~2~~ | ~~4~~ | ~~11~~ | ~~Medium~~ | ~~complete~~ |
| 182 | Enhancement | [[E10] Import handler warnings for unregistered platforms](docs/ideas/E10-nl-assisted-catalog-discovery.md) — check extracted `platform_id` values against registry after import; log warnings listing unregistered IDs (import still succeeds) (requires #180) | 3 | 1 | 5 | 9 | Low | proposed |
| ~~182~~ | ~~Enhancement~~ | ~~[[E10] Import handler warnings for unregistered platforms](docs/ideas/E10-nl-assisted-catalog-discovery.md) — check extracted `platform_id` values against registry after import; log warnings listing unregistered IDs (import still succeeds) (requires #180)~~ | ~~3~~ | ~~1~~ | ~~5~~ | ~~9~~ | ~~Low~~ | ~~complete~~ |
| 183 | Feature | [[E10] Save-time registry resolution](docs/ideas/E10-nl-assisted-catalog-discovery.md) — resolve each TRACK `platform_id` against registry tree at save; overlay analyst-set overrides; emit fully resolved `debrief:platforms` on item.json (requires #180, #181) | 5 | 3 | 4 | 12 | Medium | proposed |
| 184 | Infrastructure | [[E10] Nuke + regenerate sample catalog](docs/ideas/E10-nl-assisted-catalog-discovery.md) — delete `preview/workspace/samples/local-store/`, re-import 72 legacy files through enriched pipeline; populate registry; all schema tests pass (requires #182, #183) | 4 | 2 | 4 | 10 | Medium | proposed |
| 185 | Feature | [[E10] CQL2 `array_filter` evaluator](docs/ideas/E10-nl-assisted-catalog-discovery.md) — extend `shared/components/src/filter-engine/` to evaluate `array_filter()` for compound predicates on `platforms[]`; matchers + CQL2-JSON serialization; unit tests (requires #181) | 5 | 3 | 4 | 12 | Medium | proposed |
5 changes: 5 additions & 0 deletions docs/CHANGELOG.md
@@ -3,6 +3,11 @@
## [Unreleased]

### Added
- **Import Platform Warnings** (#182) — Post-parse validation checks extracted `platform_id` values against the platform registry after import; emits advisory `UNREGISTERED_PLATFORM` warnings for unregistered platforms. Import always succeeds regardless of registry coverage. [E10 Phase 2]
  - New function: `_validate_platform_ids()` in `import_catalog.py` with registry loading and graceful fallback
  - Warning codes: `UNREGISTERED_PLATFORM` (per platform per file), `REGISTRY_UNAVAILABLE` (registry load failure)
  - Tests: 17/17 passing (9 unit + 8 integration), 344 existing tests unaffected
  - Evidence: `specs/182-import-platform-warnings/evidence/test-summary.md`, `usage-example.md`, `sample-warnings.json`
- **LinkML Per-Platform Override Fields** (#181) — Six optional override fields on TrackProperties, new PlatformRecord entity, and `debrief:platforms` structured array replacing flat aggregates on STAC extension. Full consumer code migration across filter engine, VS Code, web-shell, and Python services. [E10 Phase 1]
  - Schema: 3 LinkML YAML files modified, VesselDomainEnum moved to common.yaml for cross-module use
  - New entity: PlatformRecord (id required + 6 optional classification fields)
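The dedup-and-warn behaviour described in the #182 changelog entry can be sketched as a minimal, self-contained example. Note this is a simplified stand-in, not the actual `debrief_io` code: `ImportWarning` here is a plain dataclass rather than the real Pydantic model, and a set of registered IDs replaces the registry's `resolve()` lookup.

```python
from dataclasses import dataclass


@dataclass
class ImportWarning:
    """Simplified stand-in for the debrief_io ImportWarning model (sketch)."""
    file: str
    code: str
    message: str


def validate_platform_ids(features, file_rel, registered_ids, warnings):
    """Sketch of the post-parse check: deduplicate platform_id values across
    all features, then emit one UNREGISTERED_PLATFORM warning per unknown ID."""
    platform_ids = set()
    for feature in features:
        pid = feature.get("properties", {}).get("platform_id", "")
        if pid and pid.strip():
            platform_ids.add(pid)
    for pid in sorted(platform_ids):
        if pid not in registered_ids:  # stand-in for registry.resolve(pid) is None
            warnings.append(ImportWarning(
                file=file_rel,
                code="UNREGISTERED_PLATFORM",
                message=f"Platform '{pid}' is not registered in the platform registry",
            ))


# Two features carry PHANTOM, but IDs are deduplicated per file,
# so only one warning is emitted for it.
features = [
    {"properties": {"platform_id": "NELSON"}},
    {"properties": {"platform_id": "PHANTOM"}},
    {"properties": {"platform_id": "PHANTOM"}},
]
warnings: list[ImportWarning] = []
validate_platform_ids(features, "mixed.rep", {"NELSON"}, warnings)
```

The warnings are advisory only: the import succeeds whether or not any platform is registered.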
1 change: 1 addition & 0 deletions docs/evidence-index.md
@@ -16,3 +16,4 @@ Tracks evidence artifacts captured for each feature. Used to assess coverage and
| 118-sensor-rendering | 2 | md | 2026-04-10 | current | [#413](https://github.com/debrief/debrief-future/pull/413) |
| 180-platform-registry | 3 | md, txt | 2026-04-13 | current | [#416](https://github.com/debrief/debrief-future/pull/416) |
| 181-linkml-platform-overrides | 3 | md | 2026-04-13 | current | [#421](https://github.com/debrief/debrief-future/pull/421) |
| 182-import-platform-warnings | 3 | md, json | 2026-04-13 | current | [#422](https://github.com/debrief/debrief-future/pull/422) |
1 change: 1 addition & 0 deletions services/io/pyproject.toml
@@ -20,6 +20,7 @@ classifiers = [
dependencies = [
    "pydantic>=2.12.5",
    "debrief-schemas",
    "debrief-data",
]

[project.scripts]
47 changes: 46 additions & 1 deletion services/io/src/debrief_io/import_catalog.py
@@ -13,12 +13,15 @@
import uuid
from datetime import UTC, datetime
from pathlib import Path
from typing import Any
from typing import TYPE_CHECKING, Any

from debrief_io.models import ImportFileError, ImportResult, ImportWarning
from debrief_io.parser import parse
from debrief_schemas import SensorData # noqa: TC001 — runtime model_dump()

if TYPE_CHECKING:
    from debrief_data import PlatformRegistry

logger = logging.getLogger(__name__)

# Supported extensions for import
@@ -186,6 +189,30 @@ def _count_feature_kinds(features: list[dict[str, Any]]) -> tuple[int, int, int]
    return tracks, sensors, narratives


def _validate_platform_ids(
    features: list[dict[str, Any]],
    file_rel: str,
    registry: PlatformRegistry,
    warnings: list[ImportWarning],
) -> None:
    """Check platform IDs against registry; append warnings for unregistered ones."""
    platform_ids: set[str] = set()
    for feature in features:
        pid = feature.get("properties", {}).get("platform_id", "")
        if pid and pid.strip():
            platform_ids.add(pid)

    for pid in sorted(platform_ids):
        if registry.resolve(pid) is None:
            warnings.append(
                ImportWarning(
                    file=file_rel,
                    code="UNREGISTERED_PLATFORM",
                    message=f"Platform '{pid}' is not registered in the platform registry",
                )
            )

def _merge_deferred_sensors(
    catalog_path: Path,
    deferred_sensors: dict[str, list[tuple[str, list[SensorData]]]],
Expand Down Expand Up @@ -289,6 +316,7 @@ def import_legacy_data(
FileNotFoundError: If source_dir does not exist.
FileExistsError: If catalog_path already exists.
"""
from debrief_data import RegistryError, load_registry # isort: skip
from debrief_stac.assets import add_asset
from debrief_stac.catalog import create_catalog
from debrief_stac.features import add_features
@@ -305,6 +333,19 @@

    result = ImportResult(catalog_path=str(catalog_path))

    # Load platform registry for post-parse validation (best-effort)
    registry: PlatformRegistry | None = None
    try:
        registry = load_registry()
    except (FileNotFoundError, RegistryError) as e:
        result.warnings.append(
            ImportWarning(
                file="",
                code="REGISTRY_UNAVAILABLE",
                message=f"Platform registry could not be loaded: {e}. Platform validation skipped.",
            )
        )

    # Collect source files
    source_files = sorted(
        f
@@ -351,6 +392,10 @@ def import_legacy_data(
                    ImportWarning(file=file_rel, code=sw.code, message=sw.message)
                )

        # Validate platform IDs against registry (advisory only)
        if registry is not None and parse_result.features:
            _validate_platform_ids(parse_result.features, file_rel, registry, result.warnings)

        if not parse_result.features:
            if not parse_result.pending_sensor_data:
                result.warnings.append(
179 changes: 179 additions & 0 deletions services/io/tests/test_import_catalog.py
@@ -438,3 +438,182 @@ def test_basic_report(self) -> None:
        assert "Files succeeded: 9" in report
        assert "Files failed: 1" in report
        assert "Total tracks: 20" in report


# --- REP content helpers for platform validation tests ---

# REP lines for a registered platform (NELSON)
_REP_REGISTERED = (
    "951212 050000.000 NELSON @C 22 11 10.63 N 21 41 52.37 W 269.7 2.0 0\n"
    "951212 050100.000 NELSON @C 22 11 10.58 N 21 42 2.98 W 269.7 2.0 0\n"
)

# REP lines for an unregistered platform (PHANTOM)
_REP_UNREGISTERED = (
    "951212 050000.000 PHANTOM @C 22 11 10.63 N 21 41 52.37 W 269.7 2.0 0\n"
    "951212 050100.000 PHANTOM @C 22 11 10.58 N 21 42 2.98 W 269.7 2.0 0\n"
)

# Mixed: one registered, one unregistered
_REP_MIXED = _REP_REGISTERED + _REP_UNREGISTERED

class TestPlatformValidationIntegration:
    """Integration tests for platform registry validation in the import pipeline."""

    def test_registered_platforms_no_warnings(self, tmp_path: Path) -> None:
        """Import REP file with registered platforms — no UNREGISTERED_PLATFORM warnings."""
        source = tmp_path / "source"
        source.mkdir()
        (source / "boat1.rep").write_text((FIXTURES / "boat1.rep").read_text())

        catalog = tmp_path / "catalog"
        result = import_legacy_data(source, catalog)

        assert result.files_succeeded >= 1
        unreg_warns = [w for w in result.warnings if w.code == "UNREGISTERED_PLATFORM"]
        assert len(unreg_warns) == 0

    def test_unregistered_platforms_produce_warnings(self, tmp_path: Path) -> None:
        """Import REP file with unregistered platform — correct warning emitted, import succeeds."""
        source = tmp_path / "source"
        source.mkdir()
        (source / "mixed.rep").write_text(_REP_MIXED)

        catalog = tmp_path / "catalog"
        result = import_legacy_data(source, catalog)

        assert result.files_succeeded == 1
        assert result.files_failed == 0
        unreg_warns = [w for w in result.warnings if w.code == "UNREGISTERED_PLATFORM"]
        assert len(unreg_warns) == 1
        assert "PHANTOM" in unreg_warns[0].message
        assert unreg_warns[0].file == "mixed.rep"

    def test_dpf_unregistered_platforms_produce_warnings(self, tmp_path: Path) -> None:
        """Import DPF file with unregistered platform — correct warning emitted."""
        source = tmp_path / "source"
        source.mkdir()
        # Read existing DPF and inject an unregistered track name
        dpf_content = (FIXTURES / "sample.dpf").read_text()
        dpf_content = dpf_content.replace('Name="COLLINGWOOD"', 'Name="GHOST_SHIP"', 1)
        (source / "modified.dpf").write_text(dpf_content)

        catalog = tmp_path / "catalog"
        result = import_legacy_data(source, catalog)

        assert result.files_succeeded == 1
        unreg_warns = [w for w in result.warnings if w.code == "UNREGISTERED_PLATFORM"]
        assert any("GHOST_SHIP" in w.message for w in unreg_warns)

    def test_registry_unavailable_still_succeeds(
        self, tmp_path: Path, monkeypatch: pytest.MonkeyPatch
    ) -> None:
        """When registry cannot be loaded, import succeeds with REGISTRY_UNAVAILABLE warning."""
        import debrief_data

        # Monkeypatch load_registry to raise FileNotFoundError
        def _failing_load(*_args: object, **_kwargs: object) -> None:
            raise FileNotFoundError("registry not found")

        monkeypatch.setattr(debrief_data, "load_registry", _failing_load)

        source = tmp_path / "source"
        source.mkdir()
        (source / "boat1.rep").write_text((FIXTURES / "boat1.rep").read_text())

        catalog = tmp_path / "catalog"
        result = import_legacy_data(source, catalog)

        assert result.files_succeeded >= 1
        assert result.files_failed == 0
        reg_warns = [w for w in result.warnings if w.code == "REGISTRY_UNAVAILABLE"]
        assert len(reg_warns) == 1
        # No UNREGISTERED_PLATFORM warnings when registry is unavailable
        unreg_warns = [w for w in result.warnings if w.code == "UNREGISTERED_PLATFORM"]
        assert len(unreg_warns) == 0

    def test_all_unregistered_import_still_succeeds(self, tmp_path: Path) -> None:
        """Import with only unregistered platforms still succeeds (US2)."""
        source = tmp_path / "source"
        source.mkdir()
        (source / "unknown.rep").write_text(_REP_UNREGISTERED)

        catalog = tmp_path / "catalog"
        result = import_legacy_data(source, catalog)

        assert result.files_succeeded == 1
        assert result.files_failed == 0
        assert result.total_tracks >= 1
        unreg_warns = [w for w in result.warnings if w.code == "UNREGISTERED_PLATFORM"]
        assert len(unreg_warns) == 1

    def test_deduplication_many_positions_one_warning(self, tmp_path: Path) -> None:
        """File with many positions for one unregistered platform — exactly one warning (US3)."""
        source = tmp_path / "source"
        source.mkdir()
        # Create 50 position records for one unregistered platform
        lines = []
        for i in range(50):
            minute = f"{i:02d}"
            lines.append(
                f"951212 05{minute}00.000 CONTACT_X @C 22 11 10.63 N 21 41 52.37 W 269.7 2.0 0\n"
            )
        (source / "many_positions.rep").write_text("".join(lines))

        catalog = tmp_path / "catalog"
        result = import_legacy_data(source, catalog)

        assert result.files_succeeded == 1
        unreg_warns = [w for w in result.warnings if w.code == "UNREGISTERED_PLATFORM"]
        assert len(unreg_warns) == 1
        assert "CONTACT_X" in unreg_warns[0].message

    def test_multiple_unregistered_one_warning_each(self, tmp_path: Path) -> None:
        """File with 3 unregistered platforms — exactly 3 warnings (US3)."""
        source = tmp_path / "source"
        source.mkdir()
        content = (
            "951212 050000.000 ALPHA_X @C 22 11 10.63 N 21 41 52.37 W 269.7 2.0 0\n"
            "951212 050100.000 ALPHA_X @C 22 11 10.58 N 21 42 2.98 W 269.7 2.0 0\n"
            "951212 050000.000 BRAVO_X @C 22 11 10.63 N 21 41 52.37 W 269.7 2.0 0\n"
            "951212 050100.000 BRAVO_X @C 22 11 10.58 N 21 42 2.98 W 269.7 2.0 0\n"
            "951212 050000.000 CHARLIE_X @C 22 11 10.63 N 21 41 52.37 W 269.7 2.0 0\n"
            "951212 050100.000 CHARLIE_X @C 22 11 10.58 N 21 42 2.98 W 269.7 2.0 0\n"
        )
        (source / "multi.rep").write_text(content)

        catalog = tmp_path / "catalog"
        result = import_legacy_data(source, catalog)

        assert result.files_succeeded == 1
        unreg_warns = [w for w in result.warnings if w.code == "UNREGISTERED_PLATFORM"]
        assert len(unreg_warns) == 3

    def test_batch_import_file_attribution(self, tmp_path: Path) -> None:
        """Batch import with different unregistered platforms in different files (US4)."""
        source = tmp_path / "source"
        source.mkdir()

        # File A: unregistered VESSEL_X
        (source / "file_a.rep").write_text(
            "951212 050000.000 VESSEL_X @C 22 11 10.63 N 21 41 52.37 W 269.7 2.0 0\n"
            "951212 050100.000 VESSEL_X @C 22 11 10.58 N 21 42 2.98 W 269.7 2.0 0\n"
        )

        # File B: unregistered VESSEL_Y
        (source / "file_b.rep").write_text(
            "951212 050000.000 VESSEL_Y @C 22 11 10.63 N 21 41 52.37 W 269.7 2.0 0\n"
            "951212 050100.000 VESSEL_Y @C 22 11 10.58 N 21 42 2.98 W 269.7 2.0 0\n"
        )

        catalog = tmp_path / "catalog"
        result = import_legacy_data(source, catalog)

        assert result.files_succeeded == 2
        unreg_warns = [w for w in result.warnings if w.code == "UNREGISTERED_PLATFORM"]
        assert len(unreg_warns) == 2

        warn_map = {w.message.split("'")[1]: w.file for w in unreg_warns}
        assert warn_map["VESSEL_X"] == "file_a.rep"
        assert warn_map["VESSEL_Y"] == "file_b.rep"