bug: FeatureStore.materialize() OOM-crashes on large time ranges with no built-in workaround #6307

@alan-gauthier-jt

Summary

FeatureStore.materialize() and FeatureStore.materialize_incremental() load the entire requested time range into memory in a single pass. On production deployments this causes out-of-memory (OOM) crashes that are silent, difficult to recover from, and currently require external orchestration to work around.

Steps to reproduce

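from datetime import datetime

from feast import FeatureStore

fs = FeatureStore(repo_path=".")  # adjust to the feature repository path

# One call covers the whole window, so the full range is read in one pass.
fs.materialize(
    start_date=datetime(2026, 1, 1),
    end_date=datetime(2026, 3, 1),  # 59-day window
)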

With a high-frequency feature view (e.g. 10-second ETL batches → 8 640 rows/day per entity × many entities), a worker with ≤ 8 GB RAM will exhaust memory and crash with no informative error from Feast.
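For a sense of scale, here is a rough back-of-envelope estimate. The 8 640 rows/day figure corresponds to one row every 10 seconds per entity (86 400 s / 10 s). The entity count (100) and the in-memory row size (200 bytes) are illustrative assumptions, not measurements from any deployment.

```python
from datetime import datetime

SECONDS_PER_DAY = 86_400


def rows_in_window(start, end, interval_seconds, n_entities):
    """Estimate the rows a single materialize() call must read for the window."""
    days = (end - start).total_seconds() / SECONDS_PER_DAY
    rows_per_entity_per_day = SECONDS_PER_DAY // interval_seconds
    return int(days * rows_per_entity_per_day * n_entities)


start, end = datetime(2026, 1, 1), datetime(2026, 3, 1)

# Assumed values: 100 entities, 200 bytes per row once loaded in memory.
rows = rows_in_window(start, end, interval_seconds=10, n_entities=100)
approx_gib = rows * 200 / 2**30

print(f"{rows:,} rows, about {approx_gib:.1f} GiB")
```

Under these assumptions even 100 entities already exceed an 8 GB worker, before counting any intermediate copies made during conversion.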

Root cause

Both materialize() and materialize_incremental() delegate to materialize_single_feature_view exactly once per feature view, passing the full [start, end] window. The underlying data source query materialises the entire range in one shot — there is no pagination, chunking, or streaming at the Feast SDK layer.

Impact

| Scenario | Effect |
| --- | --- |
| Multi-day / multi-week backfill | Worker OOM crash |
| Sub-minute sensor data at scale | Worker OOM crash |
| Large number of entities × long window | Worker OOM crash |
| Crash recovery | most_recent_end_time is not committed until the entire range succeeds, so a crash forces a full re-run |

Current workaround

Users must implement their own loop outside Feast:

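from datetime import datetime, timedelta

# fs is the FeatureStore instance used in the reproduction steps.
start, end = datetime(2026, 1, 1), datetime(2026, 3, 1)
chunk = timedelta(hours=6)
cursor = start
while cursor < end:
    next_cursor = min(cursor + chunk, end)
    # Each call reads only one 6-hour window, which bounds peak memory.
    fs.materialize(start_date=cursor, end_date=next_cursor)
    cursor = next_cursor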

This is error-prone, not integrated with materialize_incremental's watermark, and must be reimplemented by every affected user.
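As an illustration of what each user ends up rewriting, the loop can be factored into a small helper. This is only a sketch: `iter_windows` and `materialize_in_chunks` are hypothetical names, not part of Feast, and the helper takes the materialize callable as a parameter so it can be exercised without a feature store.

```python
from datetime import datetime, timedelta


def iter_windows(start, end, chunk):
    """Yield consecutive (lo, hi) windows that together cover start..end."""
    if chunk <= timedelta(0):
        # A non-positive chunk would make the loop below spin forever.
        raise ValueError("chunk must be a positive timedelta")
    cursor = start
    while cursor < end:
        hi = min(cursor + chunk, end)
        yield cursor, hi
        cursor = hi


def materialize_in_chunks(materialize, start, end, chunk):
    """Call materialize(lo, hi) once per window; return the number of calls."""
    count = 0
    for lo, hi in iter_windows(start, end, chunk):
        materialize(lo, hi)
        count += 1
    return count


# Dry run with a recorder in place of fs.materialize.
seen = []
n_calls = materialize_in_chunks(
    lambda lo, hi: seen.append((lo, hi)),
    datetime(2026, 1, 1),
    datetime(2026, 1, 2),
    timedelta(hours=6),
)
```

With a real store this would be called as `materialize_in_chunks(fs.materialize, start, end, timedelta(hours=6))`. Adjacent windows share a boundary timestamp; whether a row exactly on that boundary is read twice depends on the offline store's query bounds, which this sketch does not address. It also does nothing about the watermark problem described above.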

Expected behaviour

Feast should provide a native chunk_size option so users can cap peak memory usage without external orchestration. Chunking should:

  • Be opt-in and backward-compatible (default: no chunking).
  • Support a project-level default in feature_store.yaml under materialization.chunk_size.
  • Commit most_recent_end_time per chunk so a crash mid-run allows materialize_incremental to resume from the last committed chunk.
  • Be available via both the Python SDK and the CLI.
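The per-chunk commit semantics requested above can be illustrated with a toy simulation. Nothing here is Feast code: `ToyWatermarkStore`, `materialize_chunked`, and `flaky_write` are hypothetical stand-ins, and the sketch only assumes that the watermark is advanced after each chunk succeeds.

```python
from datetime import datetime, timedelta


class ToyWatermarkStore:
    """Stand-in for wherever most_recent_end_time is persisted."""

    def __init__(self, initial):
        self.most_recent_end_time = initial


def materialize_chunked(store, end, chunk_size, write_chunk):
    """Process watermark..end in chunks, committing the watermark per chunk."""
    cursor = store.most_recent_end_time
    while cursor < end:
        hi = min(cursor + chunk_size, end)
        write_chunk(cursor, hi)          # may raise, e.g. on a worker crash
        store.most_recent_end_time = hi  # commit only after the chunk succeeds
        cursor = hi


start, end = datetime(2026, 1, 1), datetime(2026, 1, 2)
store = ToyWatermarkStore(start)
attempts = []


def flaky_write(lo, hi):
    """Record every attempt and fail on the third one to simulate a crash."""
    attempts.append((lo, hi))
    if len(attempts) == 3:
        raise MemoryError("simulated worker failure")


try:
    materialize_chunked(store, end, timedelta(hours=6), flaky_write)
except MemoryError:
    pass

watermark_after_crash = store.most_recent_end_time

# A second run resumes from the committed watermark, not from the start.
materialize_chunked(store, end, timedelta(hours=6), flaky_write)
```

In this simulation the first two chunks are never re-run: the second invocation retries only the window that failed and then continues to the end of the range.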

Related

A PR implementing the above is available for review: #6277
