Enable LargeListArray support in Parquet reader schema validation #513

@callmepandey

Summary

Follow-up to #502. The data conversion layer now supports LargeListArray (64-bit offsets) via ProjectRecordBatch, but the Parquet reader's schema validation still rejects LARGE_LIST types.

Problem

ValidateParquetSchemaEvolution in parquet_schema_util.cc:177-180 only accepts ::arrow::Type::LIST:

case TypeId::kList:
  if (arrow_type->id() == ::arrow::Type::LIST) {
    return {};
  }
  break;

When reading a Parquet file containing LargeListArray columns, the reader fails with:

Cannot read Iceberg type: list from Parquet type: large_list<...>

Proposed Solution

Update the validation to accept both list types:

case TypeId::kList:
  if (arrow_type->id() == ::arrow::Type::LIST ||
      arrow_type->id() == ::arrow::Type::LARGE_LIST) {
    return {};
  }
  break;

This is safe because:

  1. Iceberg's ListType is a single logical list type; it does not distinguish between LIST and LARGE_LIST
  2. The projection layer (ProjectRecordBatch) already handles both offset widths via the templated ProjectListArrayImpl<> (see the sketch after this list)
  3. Both represent the same logical "list" concept, differing only in offset width (32-bit vs. 64-bit)
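
To make point 2 concrete, here is a minimal sketch of the templated pattern (the helper name TotalListElements is hypothetical; ProjectListArrayImpl<> is the actual function in the projection layer). Arrow exposes the offset width as offset_type on each list array class, so a single template instantiates for both:

#include <cstdint>

#include <arrow/api.h>

// One implementation covers both array classes: offset_type is int32_t for
// arrow::ListArray and int64_t for arrow::LargeListArray; the traversal is
// otherwise identical.
template <typename ListArrayType>
int64_t TotalListElements(const ListArrayType& array) {
  using OffsetType = typename ListArrayType::offset_type;
  OffsetType total = 0;
  for (int64_t i = 0; i < array.length(); ++i) {
    total += array.value_length(i);  // value_length() returns OffsetType
  }
  return static_cast<int64_t>(total);
}

// Instantiates for both: TotalListElements<arrow::ListArray> and
// TotalListElements<arrow::LargeListArray>.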

Files to Change

  • src/iceberg/parquet/parquet_schema_util.cc - Update ValidateParquetSchemaEvolution
  • src/iceberg/test/parquet_test.cc - Add an integration test that reads LargeListArray data through the full reader pipeline (a rough setup sketch follows below)
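
For the new test, here is a rough sketch of the setup half: it builds a large_list<int32> column with Arrow and writes it with store_schema() so the Arrow-level LARGE_LIST type survives the Parquet round trip (without the stored schema metadata, the column reads back as a plain list). The function name is hypothetical, and the read-side assertions are omitted because they depend on the iceberg-cpp test harness:

#include <memory>
#include <string>

#include <arrow/api.h>
#include <arrow/io/file.h>
#include <parquet/arrow/writer.h>
#include <parquet/properties.h>

arrow::Status WriteLargeListFile(const std::string& path) {
  // Build a large_list<int32> column with values [[1, 2], [3]].
  arrow::LargeListBuilder list_builder(arrow::default_memory_pool(),
                                       std::make_shared<arrow::Int32Builder>());
  auto* values =
      static_cast<arrow::Int32Builder*>(list_builder.value_builder());
  ARROW_RETURN_NOT_OK(list_builder.Append());
  ARROW_RETURN_NOT_OK(values->AppendValues({1, 2}));
  ARROW_RETURN_NOT_OK(list_builder.Append());
  ARROW_RETURN_NOT_OK(values->Append(3));
  std::shared_ptr<arrow::Array> column;
  ARROW_RETURN_NOT_OK(list_builder.Finish(&column));

  auto schema = arrow::schema(
      {arrow::field("ids", arrow::large_list(arrow::int32()))});
  auto table = arrow::Table::Make(schema, {column});

  ARROW_ASSIGN_OR_RAISE(auto sink, arrow::io::FileOutputStream::Open(path));
  // store_schema() embeds the Arrow schema in the file metadata so the
  // column is reconstructed as LARGE_LIST rather than LIST on read.
  auto arrow_props =
      parquet::ArrowWriterProperties::Builder().store_schema()->build();
  return parquet::arrow::WriteTable(*table, arrow::default_memory_pool(), sink,
                                    /*chunk_size=*/1024,
                                    parquet::default_writer_properties(),
                                    arrow_props);
}

The test would then open this file through the iceberg-cpp Parquet reader and assert that schema validation succeeds and the list values round-trip intact.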
