feat: support array format for create_index columns #1008

Open
jamesvillarrubia wants to merge 1 commit into xataio:main from jamesvillarrubia:feat/create-index-array-format

@jamesvillarrubia commented Dec 17, 2025

Problem

Multi-column indexes require predictable column ordering for optimal query planner performance. The current OpCreateIndex implementation stores columns in a map[string]IndexField, which has non-deterministic iteration order in Go. This causes index columns to be created in random order rather than the user-specified order.

Example:

columns:
  product_id: {}
  is_active: {}
  deleted_at: {}

May generate CREATE INDEX ... (deleted_at, product_id, is_active) instead of the intended order.

Additionally, per-column settings (collation, sort order, nulls order, operator classes) were added in v0.10.0, but the map format makes it unclear which column should come first in multi-column indexes.

Solution

Support both array and map formats for the columns field:

Array Format (Recommended)

columns:
  - name: product_id
  - name: is_active
  - name: deleted_at
    sort: DESC
  • Explicitly preserves column order
  • Consistent with create_table operation format
  • Required for multi-column indexes where order matters

Map Format (Backward Compatible)

columns:
  product_id: {}
  is_active: {}
  • Still supported for single-column indexes
  • YAML map key order is preserved during parsing (this relies on the parser keeping keys in document order; the YAML 1.2 spec itself defines mappings as unordered)
  • Deprecation warning logged for multi-column/partial indexes

Implementation

  • OpCreateIndex.Columns internally stores []IndexColumn (array)
  • Custom UnmarshalJSON/UnmarshalYAML methods detect format and convert appropriately
  • Map format converted to array while preserving YAML key insertion order
  • Execution pipeline uses ordered array throughout (no map conversion)
  • JSON schema uses oneOf to validate both formats
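
One way the format detection listed above could work for JSON is sketched here. This is an illustration under assumptions, not the code in op_create_index_unmarshal.go: the `Columns` type, the trimmed-down `IndexColumn` struct, and the `parseColumns` helper are all hypothetical. The sketch inspects the first byte to distinguish array from object input and, for objects, walks the token stream with `json.Decoder` so keys are visited in document order rather than passing through a Go map.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// IndexColumn is a simplified, hypothetical column entry.
type IndexColumn struct {
	Name string `json:"name"`
	Sort string `json:"sort,omitempty"`
}

// Columns is a hypothetical ordered column list that accepts both formats.
type Columns []IndexColumn

// UnmarshalJSON accepts either an array of columns or an object keyed by column name.
func (c *Columns) UnmarshalJSON(data []byte) error {
	trimmed := bytes.TrimSpace(data)
	if len(trimmed) == 0 {
		return errors.New("columns: empty input")
	}
	switch trimmed[0] {
	case '[':
		// Array format: decode into a plain slice to avoid recursing into this method.
		var cols []IndexColumn
		if err := json.Unmarshal(trimmed, &cols); err != nil {
			return err
		}
		*c = cols
		return nil
	case '{':
		// Map format: read tokens one by one so keys keep their document order.
		dec := json.NewDecoder(bytes.NewReader(trimmed))
		if _, err := dec.Token(); err != nil { // consume the opening '{'
			return err
		}
		var cols []IndexColumn
		for dec.More() {
			tok, err := dec.Token()
			if err != nil {
				return err
			}
			name, ok := tok.(string)
			if !ok {
				return fmt.Errorf("columns: unexpected key token %v", tok)
			}
			var col IndexColumn
			if err := dec.Decode(&col); err != nil { // decode the settings object
				return err
			}
			col.Name = name
			cols = append(cols, col)
		}
		*c = cols
		return nil
	default:
		return errors.New("columns: expected array or object")
	}
}

// parseColumns is a small convenience wrapper used for the demonstration.
func parseColumns(data []byte) (Columns, error) {
	var cols Columns
	err := json.Unmarshal(data, &cols)
	return cols, err
}

func main() {
	fromMap, err := parseColumns([]byte(`{"zebra": {}, "alpha": {"sort": "DESC"}}`))
	if err != nil {
		panic(err)
	}
	fromArray, err := parseColumns([]byte(`[{"name": "zebra"}, {"name": "alpha", "sort": "DESC"}]`))
	if err != nil {
		panic(err)
	}
	fmt.Println(fromMap)
	fmt.Println(fromArray)
}
```

Both inputs end up as the same ordered slice, which is why the rest of the pipeline would not need to know which format the migration file used.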

Deprecation Strategy

Map format is deprecated only for:

  • Multi-column indexes (2+ columns)
  • Partial indexes (indexes with predicates)

Single-column indexes can continue using map format indefinitely.

Warning message:

DEPRECATION WARNING: Map format for columns is deprecated for multi-column 
and partial indexes. Use array format instead: columns: [{name: col1}, {name: col2}]. 
Map format does not preserve column order which is critical for index performance.
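
The deprecation rule above reduces to a small predicate. The sketch below is an assumption about how such a check might look, not the contents of op_create_index_format.go; `columnsFormat`, `formatArray`, `formatMap`, and `shouldWarnMapFormat` are hypothetical names.

```go
package main

import "fmt"

// columnsFormat records which syntax the migration file used (hypothetical type).
type columnsFormat int

const (
	formatArray columnsFormat = iota
	formatMap
)

// shouldWarnMapFormat reports whether a deprecation warning applies:
// only map-format input, and only for multi-column or partial indexes.
func shouldWarnMapFormat(format columnsFormat, numColumns int, predicate string) bool {
	if format != formatMap {
		return false
	}
	return numColumns >= 2 || predicate != ""
}

func main() {
	fmt.Println(shouldWarnMapFormat(formatMap, 1, ""))                   // single column, no predicate
	fmt.Println(shouldWarnMapFormat(formatMap, 3, ""))                   // multi-column
	fmt.Println(shouldWarnMapFormat(formatMap, 1, "deleted_at IS NULL")) // partial index
	fmt.Println(shouldWarnMapFormat(formatArray, 3, ""))                 // array format
}
```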

Changes

| Component | Description |
| --- | --- |
| types.go | Changed OpCreateIndex.Columns from map to []IndexColumn |
| op_create_index_unmarshal.go | Custom unmarshalers for format detection (~174 lines) |
| op_create_index_format.go | Deprecation warning logic (~61 lines) |
| dbactions.go | Use ordered array in execution pipeline |
| schema.json | Added oneOf validation for both formats |
| create_index.go | Generate array format from SQL |
| Documentation | Updated create_index.mdx with examples |
| Tests | 3 new test files (~700 lines) |

Test Coverage

Unit Tests (op_create_index_execution_test.go):

  • Verifies actual SQL statement column order
  • Runs 10 iterations to detect non-deterministic behavior
  • Tests non-alphabetical ordering (zebra, alpha, beta)
  • Tests column options (DESC, NULLS LAST, collations)
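
To make the kind of check these unit tests perform concrete, here is a hedged sketch of rendering a CREATE INDEX statement from an ordered column slice. It is not pgroll's SQL generation: `createIndexSQL`, `quoteIdent`, and this `IndexColumn` struct are hypothetical, and the `Sort` and `Nulls` values are interpolated as-is on the assumption that they were validated earlier. The per-column clause order (COLLATE, then sort direction, then NULLS ordering) follows PostgreSQL's CREATE INDEX syntax.

```go
package main

import (
	"fmt"
	"strings"
)

// IndexColumn is a simplified, hypothetical column entry with per-column options.
type IndexColumn struct {
	Name    string
	Collate string // optional collation name
	Sort    string // "", "ASC" or "DESC" (assumed validated by the caller)
	Nulls   string // "", "FIRST" or "LAST" (assumed validated by the caller)
}

// quoteIdent wraps an identifier in double quotes, doubling any embedded quotes.
func quoteIdent(s string) string {
	return `"` + strings.ReplaceAll(s, `"`, `""`) + `"`
}

// createIndexSQL renders the statement with columns in exactly the slice order.
func createIndexSQL(index, table string, cols []IndexColumn) string {
	parts := make([]string, 0, len(cols))
	for _, c := range cols {
		p := quoteIdent(c.Name)
		if c.Collate != "" {
			p += " COLLATE " + quoteIdent(c.Collate)
		}
		if c.Sort != "" {
			p += " " + c.Sort
		}
		if c.Nulls != "" {
			p += " NULLS " + c.Nulls
		}
		parts = append(parts, p)
	}
	return fmt.Sprintf("CREATE INDEX %s ON %s (%s)",
		quoteIdent(index), quoteIdent(table), strings.Join(parts, ", "))
}

func main() {
	cols := []IndexColumn{
		{Name: "zebra"},
		{Name: "alpha", Sort: "DESC", Nulls: "LAST"},
		{Name: "beta"},
	}
	// Render repeatedly; with a slice the output is identical every time.
	for i := 0; i < 3; i++ {
		fmt.Println(createIndexSQL("idx_example", "items", cols))
	}
}
```

Using deliberately non-alphabetical names (zebra, alpha, beta) means an accidental sort or map round-trip would change the output and be caught by a string comparison.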

Integration Tests (op_create_index_test.go):

  • Creates multi-column indexes in real PostgreSQL
  • Queries pg_indexes to validate column order
  • Verifies non-alphabetical ordering (status, user_id, created_at)

Format Tests (op_create_index_format_test.go):

  • JSON/YAML array format parsing
  • JSON/YAML map format conversion
  • YAML key order preservation
  • Backward compatibility with existing migrations
  • Mixed format support in single migration

Example

# examples/55_create_multicolumn_index.yaml
name: create_multicolumn_index
operations:
  - create_index:
      name: idx_products_lookup
      table: products
      columns:
        - name: status
        - name: category_id  
        - name: created_at
          sort: DESC

Testing

✅ All 50+ existing create_index tests pass
✅ New execution test catches column ordering bugs
✅ Integration test validates real PostgreSQL indexes
✅ Format tests verify both JSON and YAML parsing
✅ Backward compatibility with map format maintained
✅ No linter errors

Migration Path

Existing migrations continue to work unchanged. Users can adopt array format gradually:

Before:

columns:
  user_id: {}
  created_at: {sort: DESC}

After:

columns:
  - name: user_id
  - name: created_at
    sort: DESC

@jamesvillarrubia jamesvillarrubia force-pushed the feat/create-index-array-format branch from 9499bfd to f8302c6 Compare December 17, 2025 22:08
@jamesvillarrubia jamesvillarrubia force-pushed the feat/create-index-array-format branch from f8302c6 to 3bf9888 Compare December 17, 2025 22:13
@jamesvillarrubia jamesvillarrubia marked this pull request as draft December 17, 2025 22:17
@jamesvillarrubia jamesvillarrubia marked this pull request as ready for review December 17, 2025 22:47
@jamesvillarrubia jamesvillarrubia force-pushed the feat/create-index-array-format branch from afc6df7 to a62ed66 Compare December 17, 2025 23:55
@jamesvillarrubia jamesvillarrubia force-pushed the feat/create-index-array-format branch from a62ed66 to 85a523e Compare December 17, 2025 23:56
@jamesvillarrubia jamesvillarrubia force-pushed the feat/create-index-array-format branch from 85a523e to 1b5cf6e Compare December 17, 2025 23:59
@jamesvillarrubia jamesvillarrubia force-pushed the feat/create-index-array-format branch from 1b5cf6e to b087cb0 Compare December 18, 2025 00:02

Add support for array format in create_index.columns while maintaining
full backward compatibility with existing map format.

## Problem

Multi-column indexes need predictable column ordering for query planner
optimization. The map format (v0.10.0+) doesn't preserve order, causing
non-deterministic index creation where columns could appear in any order.

## Solution

Support both formats in a single 'columns' field:

Array format (preferred):
  columns:
    - name: user_id
    - name: created_at
      sort: DESC

Map format (legacy, still works):
  columns:
    user_id: {}
    created_at:
      sort: DESC

## Implementation

- Custom UnmarshalJSON/UnmarshalYAML (~160 lines) detect and convert formats
- OpCreateIndex stores columns as []IndexColumn internally
- Execution pipeline uses ordered array (no map conversion)
- YAML map format preserves key insertion order
- Deprecation warnings logged for multi-column/partial indexes using maps

## Backward Compatibility

All existing migrations work unchanged:
- Single-column map format: no warnings
- Multi-column map format: deprecation warning logged
- Array format: preferred for all cases

No breaking changes. Users migrate at their own pace.
@jamesvillarrubia jamesvillarrubia force-pushed the feat/create-index-array-format branch from b087cb0 to 887cb36 Compare December 18, 2025 00:10
@andrew-farries
Collaborator

Hi @jamesvillarrubia, thanks for the PR.

It looks as though the use of oneOf is causing problems with type generation. You can see this locally with make generate.

We have two choices for how to resolve #1001:

IMO it would be better to make a clean breaking change to the migration format now (and have the automated support for rewriting migration files via pgroll update that #1009 provides).
