
V4 colocated dv writes#19

Draft
stevenzwu wants to merge 13 commits into main from v4_colocated_dv_writes


@stevenzwu

Owner

Stacked implementation up to Phase 6, following the plan in apache#16694.

@stevenzwu force-pushed the v4_colocated_dv_writes branch 4 times, most recently from 4df2623 to f2e1162 (June 16, 2026 18:34)
@stevenzwu force-pushed the v4_colocated_dv_writes branch 16 times, most recently from 727354a to 847273c (June 22, 2026 04:35)
thswlsqls and others added 6 commits June 22, 2026 10:48
…pache#16924)

SparkValueConverter.convertToSpark returned a new
UnsupportedOperationException for STRUCT, LIST, and MAP instead of
throwing it, so the exception object would be passed downstream as a
value. Throw it to match the method's own default branch.
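
The difference between returning and throwing the exception can be illustrated with a minimal sketch. The class, enum, and method names below are hypothetical stand-ins, not the real SparkValueConverter code; only the return-versus-throw shape is taken from the commit message.

```java
// Hypothetical stand-in for the converter; not the real SparkValueConverter.
final class ReturnVersusThrowSketch {
  enum Kind { STRING, STRUCT, LIST, MAP }

  // Buggy shape: the exception is constructed and returned like any other value,
  // so the caller receives an exception object where it expects converted data.
  static Object convertBuggy(Kind kind, Object value) {
    switch (kind) {
      case STRING:
        return value;
      case STRUCT:
      case LIST:
      case MAP:
        return new UnsupportedOperationException("Complex types currently not supported");
      default:
        throw new UnsupportedOperationException("Unsupported type: " + kind);
    }
  }

  // Fixed shape: the exception is thrown, matching the default branch.
  static Object convertFixed(Kind kind, Object value) {
    switch (kind) {
      case STRING:
        return value;
      case STRUCT:
      case LIST:
      case MAP:
        throw new UnsupportedOperationException("Complex types currently not supported");
      default:
        throw new UnsupportedOperationException("Unsupported type: " + kind);
    }
  }
}
```

Because the method's return type is broad enough to hold an exception instance, the compiler accepts the buggy variant, which is presumably why the mistake went unnoticed.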
Remove path filters from the ASF allowlist workflow so it runs for all pull requests and pushes to main.

This surfaces upstream approved-pattern drift as a visible check failure even when a pull request does not edit workflow files.

Fixes apache#16914

Generated-by: Codex

Co-authored-by: Codex <codex@openai.com>
…value is null (apache#16826)

* kafka-connect: evolve table schema when record schema is updated but value is null

Signed-off-by: Thomas Thornton <thomaswilliamthornton@gmail.com>

* kafka connect: fix nested schema evolution when parent evolves, add map key evolution

Signed-off-by: Thomas Thornton <thomaswilliamthornton@gmail.com>

* Fix style

Signed-off-by: Thomas Thornton <thomaswilliamthornton@gmail.com>

* kafka connect: improve docs and testing for evolve schema when value is null

Signed-off-by: Thomas Thornton <thomaswilliamthornton@gmail.com>

* kafka connect: defer nested null value evolution when parent evolves, drop map key recursion

Signed-off-by: Thomas Thornton <thomaswilliamthornton@gmail.com>

---------

Signed-off-by: Thomas Thornton <thomaswilliamthornton@gmail.com>
@stevenzwu force-pushed the v4_colocated_dv_writes branch from 847273c to c24aa44 (June 22, 2026 23:55)
…4 write paths

Adds status(), dataSequenceNumber(), fileSequenceNumber(), and firstRowId()
setters to TrackedFileBuilder. These bypass the TrackingBuilder.added/from
chain and construct the Tracking directly in build(), needed for v4
manifest references (explicit data/file sequence numbers without a source
TrackedFile) and v4 non-ADDED transitions.
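
A rough sketch of what such direct setters might look like follows. Both classes are simplified, hypothetical stand-ins: the field set, the EntryStatus values, and the null check in build() are assumptions for illustration and do not reproduce the actual TrackedFileBuilder or Tracking types.

```java
// Hypothetical, simplified stand-ins; the real Tracking and TrackedFileBuilder differ.
final class TrackingSketch {
  enum EntryStatus { EXISTING, ADDED, DELETED, REPLACED, MODIFIED }

  final EntryStatus status;
  final Long dataSequenceNumber;
  final Long fileSequenceNumber;
  final Long firstRowId;

  TrackingSketch(
      EntryStatus status, Long dataSequenceNumber, Long fileSequenceNumber, Long firstRowId) {
    this.status = status;
    this.dataSequenceNumber = dataSequenceNumber;
    this.fileSequenceNumber = fileSequenceNumber;
    this.firstRowId = firstRowId;
  }
}

final class TrackedFileBuilderSketch {
  private TrackingSketch.EntryStatus status;
  private Long dataSequenceNumber;
  private Long fileSequenceNumber;
  private Long firstRowId;

  TrackedFileBuilderSketch status(TrackingSketch.EntryStatus newStatus) {
    this.status = newStatus;
    return this;
  }

  TrackedFileBuilderSketch dataSequenceNumber(long sequenceNumber) {
    this.dataSequenceNumber = sequenceNumber;
    return this;
  }

  TrackedFileBuilderSketch fileSequenceNumber(long sequenceNumber) {
    this.fileSequenceNumber = sequenceNumber;
    return this;
  }

  TrackedFileBuilderSketch firstRowId(long rowId) {
    this.firstRowId = rowId;
    return this;
  }

  // Builds the tracking struct directly from the explicit values,
  // without deriving anything from a source file (assumed behavior).
  TrackingSketch build() {
    if (status == null) {
      throw new IllegalStateException("status must be set explicitly");
    }
    return new TrackingSketch(status, dataSequenceNumber, fileSequenceNumber, firstRowId);
  }
}
```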

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@stevenzwu force-pushed the v4_colocated_dv_writes branch from c24aa44 to 13e208d (June 23, 2026 04:59)
Introduce two package-private helpers used by the v4 manifest writer
path:

TrackedFileWrapper is a StructLike adapter exposing a TrackedFile via
positional access matching TrackedFile.schemaWithContentStats(...). It
mirrors V4Metadata.ManifestEntryWrapper for the new content_entry
layout, supporting Parquet writes without materializing intermediate
records.

ContentEntryAdapter converts legacy ManifestEntry and ManifestFile
inputs into TrackedFileStruct rows. The leaf factories accept any
ManifestEntry whose file is DATA or EQUALITY_DELETES, so
ManifestWriter.V4Writer and V4DeleteWriter share one entry point. A
DV-specific overload takes ManifestEntry<DataFile> for colocated
deletion vector emission used by Phase 6's REPLACED/MODIFIED pairs.
fromManifestFile remains for root manifest references and accepts
the writer_format_version override (1 for v4 leaves, 0 for v3 leaves
carried through a v3->v4 upgrade).

Status derivation is delegated to TrackingBuilder; content stats
construction goes through MetricsUtil.fromMetrics. ManifestEntry.Status
is mapped to EntryStatus in toEntryStatus(...), the only mapping
needed since the legacy enum has no REPLACED or MODIFIED.
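
The one-directional mapping described above could look roughly like this. The enums are local, hypothetical copies (the constant order is an assumption); only the fact that the legacy enum lacks REPLACED and MODIFIED comes from the commit message.

```java
// Hypothetical local enums; the real ManifestEntry.Status and EntryStatus live in Iceberg core.
final class StatusMappingSketch {
  enum LegacyStatus { EXISTING, ADDED, DELETED }

  enum EntryStatus { EXISTING, ADDED, DELETED, REPLACED, MODIFIED }

  // Every legacy status has a direct counterpart, so the mapping is total.
  // REPLACED and MODIFIED are never produced from legacy input.
  static EntryStatus toEntryStatus(LegacyStatus status) {
    switch (status) {
      case EXISTING:
        return EntryStatus.EXISTING;
      case ADDED:
        return EntryStatus.ADDED;
      case DELETED:
        return EntryStatus.DELETED;
      default:
        throw new IllegalArgumentException("Unknown status: " + status);
    }
  }
}
```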

Per the v4 plan, validation of writer_format_version against the
supported set lives in the Phase 2 ContentEntryReader, not at storage
or write time. The adapter validates only on the manifest factory,
the single caller that may legitimately set 0.

No callers wired yet; round-trip tests land with the Phase 2 reader
and writer rewrite.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
stevenzwu and others added 5 commits June 22, 2026 22:53
…add matching reader

V4Writer and V4DeleteWriter now emit content_entry Parquet rows via
TrackedFileWrapper/ContentEntryAdapter rather than the legacy manifest_entry
Avro shape. ContentEntryReader and ContentEntryManifestReaderAdapter project
content_entry rows back to ManifestEntry<DataFile/DeleteFile> so all downstream
consumers (ManifestGroup, MergingSnapshotProducer rewrite paths) work
unchanged.

Read-path dispatch in ManifestFiles is layered:
1. Avro manifests are always legacy (no file inspection).
2. Snapshot-tree callers thread an Integer writerFormatVersion hint through
   the new package-private read overloads: 1 routes to ContentEntryReader,
   0 routes to legacy.
3. Callers without a hint (tests writing-then-reading, ad-hoc tooling) fall
   back to inspecting the Parquet footer schema for field id 134 (content_type)
   or 147 (tracking). The footer read is delegated to InternalParquet via
   DynMethods so core has no compile-time dependency on iceberg-parquet.
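
The three layers above can be sketched as a single decision function. Everything here is illustrative: the class and method names are hypothetical, detecting Avro by file extension is an assumption, and the footer lookup is modeled as a lazy Supplier rather than the DynMethods call into InternalParquet.

```java
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical sketch of the layered dispatch; not the actual ManifestFiles code.
final class ReadDispatchSketch {
  enum Reader { LEGACY, CONTENT_ENTRY }

  // Field ids named in the commit message.
  static final int CONTENT_TYPE_FIELD_ID = 134;
  static final int TRACKING_FIELD_ID = 147;

  static Reader choose(
      String path, Integer writerFormatVersionHint, Supplier<Set<Integer>> footerFieldIds) {
    // 1. Avro manifests are always legacy; the footer supplier is never invoked.
    if (path.endsWith(".avro")) {
      return Reader.LEGACY;
    }

    // 2. A hint threaded from the snapshot tree decides without file inspection.
    if (writerFormatVersionHint != null) {
      if (writerFormatVersionHint == 1) {
        return Reader.CONTENT_ENTRY;
      } else if (writerFormatVersionHint == 0) {
        return Reader.LEGACY;
      }
      throw new IllegalArgumentException(
          "Unsupported writer format version: " + writerFormatVersionHint);
    }

    // 3. No hint: fall back to looking for content_entry field ids in the footer schema.
    Set<Integer> ids = footerFieldIds.get();
    boolean isContentEntry =
        ids.contains(CONTENT_TYPE_FIELD_ID) || ids.contains(TRACKING_FIELD_ID);
    return isContentEntry ? Reader.CONTENT_ENTRY : Reader.LEGACY;
  }
}
```

Passing the footer lookup as a supplier keeps the cheaper checks first, so the file is only opened when neither the format nor a hint settles the question.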

Key design choices:
- TrackedFile.schemaWithContentStats omits partition and content_stats when
  their struct types are empty (Parquet rejects empty groups).
- TrackedFileWrapper uses hasPartition/hasContentStats flags to map positions
  dynamically when either optional group is absent.
- V4Writer.add(DataFile) bypasses Delegates.suppressFirstRowId so per-entry
  firstRowId is stored in the tracking struct rather than at manifest level.
- ContentEntryReader.setEntry uses wrapAppendPreservingFirstRowId for ADDED
  entries so firstRowId read from the tracking struct is not re-suppressed.
- ContentEntryAdapter preserves firstRowId for EXISTING entries so uncommitted
  manifests can round-trip per-entry row IDs.
- ContentEntryManifestReaderAdapter applies the same committed/uncommitted
  firstRowId nullification logic as ManifestReader.idAssigner.
- ContentEntryManifestReaderAdapter.iterator tracks ordinal position and sets
  fileOrdinal and manifestLocation on each BaseFile to match Avro reader behavior.
- Parquet.readSchema(InputFile) is a new public helper that returns just the
  Iceberg-converted file schema; InternalParquet.readSchema delegates to it
  for the DynMethods entry point.
- v4 spec forbids content_type=POSITION_DELETES (PR apache#16025); three
  TestManifestReader tests that write standalone position-delete files / DV
  delete files are guarded with assumeThat isLessThan(4) and will be removed
  once PR apache#16677 (or its successor) gates v4 out of the broad parameterized
  test suite during incubation.
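
To illustrate the dynamic position mapping mentioned for TrackedFileWrapper, here is a minimal sketch. The five-field layout is hypothetical and much smaller than the real content_entry schema; only the idea of shifting positions when the partition or content_stats group is absent is taken from the notes above.

```java
import java.util.Arrays;

// Hypothetical layout; the real content_entry schema has more fields in a different order.
final class PositionMappingSketch {
  static final int CONTENT_TYPE = 0;
  static final int LOCATION = 1;
  static final int PARTITION = 2;
  static final int CONTENT_STATS = 3;
  static final int TRACKING = 4;
  private static final int LOGICAL_FIELD_COUNT = 5;

  // physicalToLogical[i] is the logical field stored at written position i.
  private final int[] physicalToLogical;

  PositionMappingSketch(boolean hasPartition, boolean hasContentStats) {
    int[] mapping = new int[LOGICAL_FIELD_COUNT];
    int next = 0;
    for (int logical = 0; logical < LOGICAL_FIELD_COUNT; logical++) {
      // Skip optional groups that were omitted from the written schema.
      if (logical == PARTITION && !hasPartition) {
        continue;
      }
      if (logical == CONTENT_STATS && !hasContentStats) {
        continue;
      }
      mapping[next++] = logical;
    }
    this.physicalToLogical = Arrays.copyOf(mapping, next);
  }

  int size() {
    return physicalToLogical.length;
  }

  int logicalPosition(int physicalPosition) {
    return physicalToLogical[physicalPosition];
  }
}
```

With both groups absent, written position 2 resolves to the tracking field instead of the partition field, which is the shift a positional accessor has to account for.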

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Introduce the v4 root manifest write/read pair, the replacement for the
manifest list in format version 4.

RootManifestWriter emits content_entry Parquet rows (content_type=DATA_MANIFEST
or DELETE_MANIFEST) with manifest_info counts from each ManifestFile. The
writer accepts an explicit writer_format_version (0 for legacy v3 leaves
carried over in a v3->v4 upgrade; 1 for v4 content_entry leaves) so Phase 5's
SnapshotProducer can set it correctly per entry.

RootManifestReader reconstructs GenericManifestFile objects from those rows.
Direct data-file entries (the small-write optimization) are skipped with a
DEBUG log; that path is deferred to a future phase.

RootManifests is the static factory (analogous to ManifestLists) with write()
and two read() overloads.

TestRootManifest covers round-trips for data/delete manifests, key metadata,
multiple manifests, legacy writer_format_version=0 entries, and the version
guard on write().

The empty Parquet group limitation is resolved by using a single dummy optional
boolean field (field id 99999/_unpartitioned) for the partition struct and a
separate dummy (field id 99998/_no_stats) for content_stats; both are always
null on write and ignored on read.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Phase 4 of the v4 metadata-tree write-path plan. Adds
rootManifestLocation() to the Snapshot API, plumbs formatVersion
and rootManifestLocation fields through BaseSnapshot with a
constructor that enforces exactly one of manifest-list or
root-manifest is set, updates SnapshotParser to read/write the
new root-manifest JSON key, and dispatches cacheManifests() to
RootManifests.read() for format version >= 4.
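
The "exactly one of" constructor check could be expressed along these lines. This is a hypothetical, stripped-down class rather than BaseSnapshot itself; the exception type and message are assumptions.

```java
// Hypothetical stand-in for the location fields of a snapshot; not BaseSnapshot.
final class SnapshotLocationSketch {
  private final String manifestListLocation;
  private final String rootManifestLocation;

  SnapshotLocationSketch(String manifestListLocation, String rootManifestLocation) {
    // Both null or both non-null are invalid: exactly one location must be set.
    if ((manifestListLocation == null) == (rootManifestLocation == null)) {
      throw new IllegalArgumentException(
          "Exactly one of manifest-list or root-manifest must be set");
    }
    this.manifestListLocation = manifestListLocation;
    this.rootManifestLocation = rootManifestLocation;
  }

  String manifestListLocation() {
    return manifestListLocation;
  }

  String rootManifestLocation() {
    return rootManifestLocation;
  }
}
```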

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Wire SnapshotProducer.apply() to write a root manifest (.parquet) instead
of a manifest list (.avro) when the table format version is >= 4. The v4
path uses RootManifests.write(), derives ADDED/EXISTING status per manifest
from snapshotId comparison, and carries firstRowId + addedRows for row
lineage. The v1-v3 path is unchanged. The commit() cleanup now resolves
the committed location through both manifestListLocation() and
rootManifestLocation() so it handles both v3 and v4 snapshots.
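
A small sketch of the two decisions described above: deriving the per-manifest status from a snapshot id comparison, and resolving the committed location for either format. The names are hypothetical, and treating a missing snapshot id as EXISTING is an assumption made for this example.

```java
// Hypothetical helpers; not the actual SnapshotProducer code.
final class RootEntryStatusSketch {
  enum EntryStatus { EXISTING, ADDED }

  // A manifest written by the snapshot being committed is ADDED;
  // anything carried over from an earlier snapshot is EXISTING.
  static EntryStatus statusFor(Long manifestSnapshotId, long currentSnapshotId) {
    boolean addedHere = manifestSnapshotId != null && manifestSnapshotId == currentSnapshotId;
    return addedHere ? EntryStatus.ADDED : EntryStatus.EXISTING;
  }

  // v4 snapshots carry a root manifest location, v1-v3 snapshots a manifest list location.
  static String committedLocation(String manifestListLocation, String rootManifestLocation) {
    return rootManifestLocation != null ? rootManifestLocation : manifestListLocation;
  }
}
```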

RootManifestWriter gains a three-argument add() overload that accepts an
explicit EntryStatus, needed to emit EXISTING for carried-over leaf
manifests in multi-snapshot root manifests.

TestV4SnapshotProducer covers: single append (root manifest .parquet,
DATA_MANIFEST entry ADDED with writer_format_version=1), two appends
(first leaf EXISTING, second ADDED), and delete-file (rewritten leaf ADDED,
unchanged leaf EXISTING, deleted file DELETED in rewritten leaf).

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…airing

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
@stevenzwu force-pushed the v4_colocated_dv_writes branch from 13e208d to cdc8e56 (June 23, 2026 06:02)

7 participants