Contributor

@Jefffrey Jefffrey commented Jan 5, 2026

Which issue does this PR close?

  • N/A

Rationale for this change

I was reading through some window-related code and decided to fix and update some documentation.

What changes are included in this PR?

Minor fixes to formatting, wording, and links. Also adds a doc section on LimitEffect (introduced by #18029).

Are these changes tested?

Doc changes.

Are there any user-facing changes?

Doc changes.

@github-actions github-actions bot added the logical-expr Logical plan and expressions label Jan 5, 2026
/// # use datafusion_macros::user_doc;
/// # use std::sync::Arc;
///
/// #[user_doc(
Contributor Author

This requires pulling in datafusion-macros as a dev dependency, but to me it is more representative of how you would implement a UDWF.

}

- /// the effect this function will have on the limit pushdown
+ /// The effect this function will have on limit pushdowns through a window bound.
Contributor Author

@avantgardnerio would you be able to double-check the documentation here, since you worked on the original implementation?

Contributor

The goal was to move away from boolean APIs tied to specific optimizer rules (e.g. support_filter_pushdown() -> bool), because those mean each UDAF needs to know about each optimizer rule.

Instead the idea was to move towards natural mathematical properties such as cardinality_effect() -> CustomEnum. This does not specifically relate to an optimizer rule, it merely exposes information about what the LogicalPlanNode / UDAF / etc does.

By separating these concerns, we no longer need to update everything each time we invent a new optimizer rule (in fact, the window_push_past_limit rule was able to use the pre-existing cardinality_effect() from a previous optimizer rule).

This is a long-winded way to say: I think the update to this documentation is incorrect in that it is overly specific.

I'm sure the world would go on if we merged as is, but I think it helps understanding to leave it correctly vague.
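
To make the distinction above concrete, here is a minimal, self-contained sketch of the "expose a property, not a per-rule flag" idea. It does not use DataFusion: the enum, trait, function types, and the two consumer functions are all hypothetical stand-ins invented for illustration, and the real `cardinality_effect()` and `limit_effect()` APIs may differ in shape and variants.

```rust
// Hypothetical sketch: none of these names are DataFusion's actual API.

/// A rule-agnostic property: how a function changes the number of rows.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CardinalityEffect {
    /// Output has exactly as many rows as the input.
    Equal,
    /// Output has at most as many rows as the input.
    LowerEqual,
    /// Nothing is known about the output row count.
    Unknown,
}

/// Implementors describe what they do, not which optimizer rules apply.
trait DescribesCardinality {
    /// Conservative default: claim nothing.
    fn cardinality_effect(&self) -> CardinalityEffect {
        CardinalityEffect::Unknown
    }
}

/// Hypothetical function that emits one output row per input row.
struct RowPreserving;
impl DescribesCardinality for RowPreserving {
    fn cardinality_effect(&self) -> CardinalityEffect {
        CardinalityEffect::Equal
    }
}

/// Hypothetical function that may drop rows.
struct Filtering;
impl DescribesCardinality for Filtering {
    fn cardinality_effect(&self) -> CardinalityEffect {
        CardinalityEffect::LowerEqual
    }
}

/// Hypothetical function that relies on the conservative default.
struct Opaque;
impl DescribesCardinality for Opaque {}

/// First consumer: the exact output row count, when it can be known.
fn exact_row_count(f: &dyn DescribesCardinality, input_rows: usize) -> Option<usize> {
    match f.cardinality_effect() {
        CardinalityEffect::Equal => Some(input_rows),
        _ => None,
    }
}

/// Second consumer, added later: it reuses the same property, so none of
/// the implementations above had to change.
fn row_count_upper_bound(f: &dyn DescribesCardinality, input_rows: usize) -> Option<usize> {
    match f.cardinality_effect() {
        CardinalityEffect::Equal | CardinalityEffect::LowerEqual => Some(input_rows),
        CardinalityEffect::Unknown => None,
    }
}

fn main() {
    let input_rows = 10;
    println!("exact, row-preserving: {:?}", exact_row_count(&RowPreserving, input_rows));
    println!("exact, filtering:      {:?}", exact_row_count(&Filtering, input_rows));
    println!("bound, filtering:      {:?}", row_count_upper_bound(&Filtering, input_rows));
    println!("bound, opaque:         {:?}", row_count_upper_bound(&Opaque, input_rows));
}
```

With a boolean-per-rule design, adding the second consumer would have meant a new method on the trait and a new decision in every implementation; here each implementation states one fact and each consumer interprets it independently.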

Contributor Author

So does that mean we're planning to alter/remove this limit_effect() API, even though it was recently introduced?

I agree that this is overly specific, but considering we introduced a new API to WindowUDFImpl just for this optimizer rule, I felt it would help users understand why this API exists and how it affects things in DataFusion. As the documentation currently stands on main, I find it hard to understand its purpose, so I assume it may be the same for other users. The best way I found to understand it was via an example usage (so far the only usage).

Do you have a suggestion on how we could generalize this documentation in a way that still helps users understand what it is for?

Contributor

Perhaps we are overthinking it:

Suggested change
- /// The effect this function will have on limit pushdowns through a window bound.
+ /// the effect this function will have on the limit pushdown (e.g. through a window bound)

@Jefffrey Jefffrey marked this pull request as ready for review January 5, 2026 06:44
Labels

logical-expr Logical plan and expressions

2 participants