Skip to content

fix(query-grammar): handle set_field(None) on Exists without panic#2921

Open
masumi-ryugo wants to merge 1 commit into
quickwit-oss:mainfrom
masumi-ryugo:fix/query-grammar-set-field-none-panic
Open

fix(query-grammar): handle set_field(None) on Exists without panic#2921
masumi-ryugo wants to merge 1 commit into
quickwit-oss:mainfrom
masumi-ryugo:fix/query-grammar-set-field-none-panic

Conversation

@masumi-ryugo

Copy link
Copy Markdown

What

UserInputLeaf::set_field panicked with "Exist query without a field isn't allowed" when called with None on a UserInputLeaf::Exists. This was reachable straight from the strict parse_query(&str) API: query_grammar::literal builds an Exists leaf via alt((range, set, exists, regex, term_or_phrase)) even when no leading field: prefix was parsed (opt(field_name)), and then routes through leaf.set_field(field_name) with field_name == None.

Repro

A 4-byte cargo-fuzz repro ((*'\n) drove this in single-digit minutes from an empty corpus on tantivy-query-grammar/main:

thread '<unnamed>' panicked at query-grammar/src/user_input_ast.rs:51:30:
Exist query without a field isn't allowed
   …
   7: query_grammar::literal
              at query-grammar/src/query_grammar.rs:374

Reachable from tantivy_query_grammar::parse_query(&str), which is the public API tantivy's higher-level QueryParser::parse_query is built on. Any tantivy user feeding attacker-controlled query strings (search box input from end users, log query DSL, etc.) crashes the process on this.

Fix

Switch the call to preserve the existing field when None is passed, matching the no-op intent in set_default_field for the same leaf shape:

// before
UserInputLeaf::Exists { field: _ } => UserInputLeaf::Exists {
    field: field.expect("Exist query without a field isn't allowed"),
},
// after
UserInputLeaf::Exists { field: existing } => UserInputLeaf::Exists {
    field: field.unwrap_or(existing),
},

The strict parser still produces the same AST it would have produced with a non-None field; the only behavior change is that malformed user-supplied query strings no longer crash the process. The lenient parse_query_lenient path was unaffected because it routes through a different conversion that already tolerated the missing field.

Tests

Two regression tests in user_input_ast::tests:

  • test_set_field_none_on_exists_does_not_panic — direct set_field(None) call on an Exists leaf, asserting the existing field is preserved.
  • test_parse_query_fuzz_repro_does_not_panic — end-to-end via parse_query("(*'\\n") on the fuzz repro byte string.

The existing 54 tests in query-grammar continue to pass.

Source of the repro

The repro is from a cargo-fuzz harness I'm prototyping for tantivy-query-grammar's parse_query and parse_query_lenient entry points. The harness lives on a separate branch on my fork (fuzz/initial-harnesses) — I'm holding off on proposing it as a standalone tracking issue until I see whether a similar finding cadence in this crate justifies the integration cost. Happy to send that as its own follow-up if the maintainers find this kind of fuzz-found bug worth keeping rolling.

`UserInputLeaf::set_field` panicked with
"Exist query without a field isn't allowed" when called with `None`
on a `UserInputLeaf::Exists`. This was reachable straight from the
strict `parse_query(&str)` API, which builds an `Exists` leaf via
`alt((range, set, exists, regex, term_or_phrase))` even when no
leading `field:` prefix was parsed (`opt(field_name)`), then routes
through `leaf.set_field(field_name)` with `field_name == None`.

A 4-byte cargo-fuzz repro (`(*'\n`) drove this in single-digit
minutes from an empty corpus on `tantivy-query-grammar/main`:

  thread '<unnamed>' panicked at query-grammar/src/user_input_ast.rs:51:30:
  Exist query without a field isn't allowed
       …
       7: query_grammar::literal at query-grammar/src/query_grammar.rs:374

Switch the call to preserve the existing field when `None` is passed,
matching the no-op intent in `set_default_field` for the same leaf
shape. The strict parser still produces the same AST it would have
produced with a non-`None` field; the only behavior change is that
malformed user-supplied query strings no longer crash the process.

Two regression tests in `tests`:
- `test_set_field_none_on_exists_does_not_panic` — direct
  `set_field(None)` call on an `Exists` leaf.
- `test_parse_query_fuzz_repro_does_not_panic` — end-to-end via
  `parse_query("(*'\\n")` on the fuzz repro byte string.

The lenient `parse_query_lenient` path was unaffected because it
goes through a different conversion that already tolerates the
missing field.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant