Avro decoder can't handle a reader schema with no fields #9608
Description
Describe the bug
An application that needs to count records in an Avro file without decoding any fields may pass a reader schema to that effect.
In the current implementation, RecordDecoder creates a RecordBatch from decoded column arrays without the row_count option, which results in an error when there are no columns to decide the number of rows from.
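The failure mode can be illustrated with a small standalone model. This is only a sketch: `Batch` and `build_batch` are hypothetical stand-ins written for this example, not the arrow-rs types, and they track column lengths only. In arrow-rs itself, the explicit count would presumably be supplied via `RecordBatch::try_new_with_options` with `RecordBatchOptions::with_row_count`.

```rust
/// Simplified, hypothetical stand-in for a record batch.
/// Only the length of each column is tracked, not the data.
#[derive(Debug, PartialEq)]
pub struct Batch {
    pub column_lens: Vec<usize>,
    pub num_rows: usize,
}

/// Builds a batch. The row count is taken from `row_count` when given,
/// otherwise inferred from the first column. With no columns and no
/// explicit count there is nothing to infer from, so an error is returned.
pub fn build_batch(column_lens: Vec<usize>, row_count: Option<usize>) -> Result<Batch, String> {
    let num_rows = match (row_count, column_lens.first()) {
        (Some(n), _) => n,
        (None, Some(&len)) => len,
        (None, None) => {
            return Err("must either specify a row count or at least one column".to_string());
        }
    };
    // Every column must agree with the chosen row count.
    if column_lens.iter().any(|&len| len != num_rows) {
        return Err(format!("all columns must have {} rows", num_rows));
    }
    Ok(Batch { column_lens, num_rows })
}
```

In this model, `build_batch(vec![], None)` fails in the same way as the reported error, while `build_batch(vec![], Some(n))` yields a column-less batch of `n` rows, which is the behavior the decoder would need when the reader schema has no fields.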
To Reproduce
Create an arrow-avro reader with a reader schema matching the top-level record of the Avro content (e.g. an OCF file) schema, but listing no fields, e.g.
{
"type": "record",
"name": "topLevelRecord",
"fields": []
}
Use the appropriate read API to read batches from the file.
The following error is reported: "Invalid argument error: must either specify a row count or at least one column"
Expected behavior
The reader returns batches with no columns, but with row counts determined by the batch size option and any other flags affecting batch composition (i.e. the row counts should be the same as if the full writer schema had been read).
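As a rough illustration of the expected row counts, the sketch below computes the per-batch sizes for a given number of records. It assumes the simplest case, where each batch is filled up to the batch size and only the last one may be shorter; it ignores any other flags that affect batch composition. The function name `expected_batch_rows` is hypothetical and not part of arrow-avro.

```rust
/// Returns the row count of each batch when `total_records` records are
/// split into batches of at most `batch_size` rows (hypothetical helper).
pub fn expected_batch_rows(total_records: usize, batch_size: usize) -> Vec<usize> {
    assert!(batch_size > 0, "batch size must be positive");
    let mut rows = Vec::new();
    let mut remaining = total_records;
    while remaining > 0 {
        // Take a full batch, or whatever is left for the final batch.
        let n = remaining.min(batch_size);
        rows.push(n);
        remaining -= n;
    }
    rows
}
```

For example, 2500 records with a batch size of 1024 would give batches of 1024, 1024 and 452 rows, regardless of whether the reader schema lists all fields or none.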