[flake8-return] Fix RET504 false positive when variable is read in except/finally#25441
[flake8-return] Fix RET504 false positive when variable is read in except/finally#25441anishgirianish wants to merge 4 commits into
Conversation
|
| code | total | + violation | - violation | + fix | - fix |
|---|---|---|---|---|---|
| RUF100 | 1 | 1 | 0 | 0 | 0 |
Linter (preview)
ℹ️ ecosystem check detected linter changes. (+1 -0 violations, +0 -0 fixes in 1 projects; 55 projects unchanged)
mlflow/mlflow (+1 -0 violations, +0 -0 fixes)
ruff check --no-cache --exit-zero --no-fix --output-format concise --preview
+ mlflow/telemetry/track.py:31:32: RUF100 [*] Unused `noqa` directive (unused: `RET504`)
Changes by rule (1 rules affected)
| code | total | + violation | - violation | + fix | - fix |
|---|---|---|---|---|---|
| RUF100 | 1 | 1 | 0 | 0 | 0 |
ntBre
left a comment
There was a problem hiding this comment.
Thanks again for looking into this. As noted in my other comment, I was poking at this locally along with Codex because this still didn't feel like quite the right approach to me. I guess my idea from my previous comment was to try to move more state into the single ReturnVisitor rather than having a nested AugAssignFinder and the nested any_over_body calls, which do additional local traversals within the main ReturnVisitor traversal. This was kind of tricky to get working, but I think Codex eventually came up with an interesting patch:
Details
diff --git a/crates/ruff_linter/resources/mdtest/flake8-return/unnecessary-assign.md b/crates/ruff_linter/resources/mdtest/flake8-return/unnecessary-assign.md
index 8d77011f81..9b1c9ff2b3 100644
--- a/crates/ruff_linter/resources/mdtest/flake8-return/unnecessary-assign.md
+++ b/crates/ruff_linter/resources/mdtest/flake8-return/unnecessary-assign.md
@@ -132,6 +132,33 @@ def f():
log(x)
```
+A conditional rebind doesn't necessarily kill the `try`'s value:
+
+```py
+def f():
+ try:
+ x = foo()
+ return x
+ finally:
+ if cond():
+ x = "done"
+ log(x)
+```
+
+An earlier unconditional rebind still kills the `try`'s value:
+
+```py
+def f():
+ try:
+ x = foo()
+ return x # error: [unnecessary-assign]
+ finally:
+ x = "done"
+ if cond():
+ x = "other"
+ log(x)
+```
+
## `finally` doesn't read the name (RET504 fires)
```py
diff --git a/crates/ruff_linter/src/rules/flake8_return/rules/function.rs b/crates/ruff_linter/src/rules/flake8_return/rules/function.rs
index 9db7dc3a4e..faf944fbfb 100644
--- a/crates/ruff_linter/src/rules/flake8_return/rules/function.rs
+++ b/crates/ruff_linter/src/rules/flake8_return/rules/function.rs
@@ -8,8 +8,8 @@ use ruff_python_ast::token::TokenKind;
use ruff_python_ast::visitor::Visitor;
use ruff_python_ast::whitespace::indentation;
use ruff_python_ast::{self as ast, Decorator, ElifElseClause, Expr, Stmt};
-use ruff_python_semantic::SemanticModel;
use ruff_python_semantic::analyze::visibility::is_property;
+use ruff_python_semantic::{Scope, SemanticModel};
use ruff_python_trivia::{SimpleTokenKind, SimpleTokenizer, is_python_whitespace};
use ruff_source_file::LineRanges;
use ruff_text_size::{Ranged, TextRange, TextSize};
@@ -568,16 +568,13 @@ pub(crate) fn unnecessary_assign(checker: &Checker, function_stmt: &Stmt) {
let Some(function_scope) = checker.semantic().function_scope(function_def) else {
return;
};
- for (assign, return_, stmt) in &stack.assignment_return {
+ for (assign, return_, stmt, enclosing_tries) in &stack.assignment_return {
// Identify, e.g., `return x`.
let Some(value) = return_.value.as_ref() else {
continue;
};
- let Expr::Name(ast::ExprName {
- id: returned_id, ..
- }) = value.as_ref()
- else {
+ let Expr::Name(returned_name) = value.as_ref() else {
continue;
};
@@ -597,7 +594,16 @@ pub(crate) fn unnecessary_assign(checker: &Checker, function_stmt: &Stmt) {
continue;
};
- if returned_id != assigned_id {
+ if returned_name.id != *assigned_id {
+ continue;
+ }
+
+ if is_observed_by_enclosing_finally(
+ checker.semantic(),
+ function_scope,
+ returned_name,
+ enclosing_tries,
+ ) {
continue;
}
@@ -669,6 +675,57 @@ pub(crate) fn unnecessary_assign(checker: &Checker, function_stmt: &Stmt) {
}
}
+fn is_observed_by_enclosing_finally(
+ semantic: &SemanticModel,
+ function_scope: &Scope,
+ returned_name: &ast::ExprName,
+ enclosing_tries: &[&ast::StmtTry],
+) -> bool {
+ let Some(returned_binding_id) = semantic.resolve_name(returned_name) else {
+ return false;
+ };
+ let bindings = function_scope
+ .get_all(returned_name.id.as_str())
+ .collect::<Vec<_>>();
+ let Some(returned_binding_index) = bindings
+ .iter()
+ .position(|binding_id| *binding_id == returned_binding_id)
+ else {
+ return false;
+ };
+ // A new binding copies the references from the binding it shadows. Walk the
+ // later bindings too, but ignore a reference when a later binding dominates it.
+ let later_bindings = &bindings[..returned_binding_index];
+
+ std::iter::once(returned_binding_id)
+ .chain(later_bindings.iter().copied())
+ .flat_map(|binding_id| semantic.binding(binding_id).references())
+ .map(|reference_id| semantic.reference(reference_id))
+ .filter(|reference| {
+ reference.is_load()
+ && reference.range() != returned_name.range()
+ && enclosing_tries.iter().any(|stmt_try| {
+ stmt_try
+ .finalbody
+ .iter()
+ .any(|stmt| stmt.range().contains_range(reference.range()))
+ })
+ })
+ .any(|reference| {
+ let Some(reference_node) = reference.expression_id() else {
+ return true;
+ };
+ !later_bindings.iter().any(|binding_id| {
+ let binding = semantic.binding(*binding_id);
+ binding.scope == reference.scope_id()
+ && binding.source.is_some_and(|source| {
+ semantic.node(source).start() < reference.start()
+ && semantic.dominates(source, reference_node)
+ })
+ })
+ })
+}
+
/// RET505, RET506, RET507, RET508
fn superfluous_else_node(
checker: &Checker,
diff --git a/crates/ruff_linter/src/rules/flake8_return/visitor.rs b/crates/ruff_linter/src/rules/flake8_return/visitor.rs
index 8d42d812de..1c75c59583 100644
--- a/crates/ruff_linter/src/rules/flake8_return/visitor.rs
+++ b/crates/ruff_linter/src/rules/flake8_return/visitor.rs
@@ -1,5 +1,3 @@
-use ruff_python_ast::helpers::{StoredNameFinder, any_over_body};
-use ruff_python_ast::statement_visitor::{self, StatementVisitor};
use ruff_python_ast::{self as ast, ElifElseClause, Expr, Identifier, Stmt};
use rustc_hash::FxHashSet;
@@ -35,8 +33,12 @@ pub(super) struct Stack<'data> {
/// The `assignment`-to-`return` statement pairs in the current function.
/// TODO(charlie): Remove the extra [`Stmt`] here, which is necessary to support statement
/// removal for the `return` statement.
- pub(super) assignment_return:
- Vec<(&'data ast::StmtAssign, &'data ast::StmtReturn, &'data Stmt)>,
+ pub(super) assignment_return: Vec<(
+ &'data ast::StmtAssign,
+ &'data ast::StmtReturn,
+ &'data Stmt,
+ Vec<&'data ast::StmtTry>,
+ )>,
}
pub(super) struct ReturnVisitor<'semantic, 'data> {
@@ -140,11 +142,9 @@ impl<'a> Visitor<'a> for ReturnVisitor<'_, 'a> {
// return x
// ```
Stmt::Assign(stmt_assign) => {
- if !is_observed_by_enclosing_try(stmt_assign, &self.enclosing_tries) {
- self.stack
- .assignment_return
- .push((stmt_assign, stmt_return, stmt));
- }
+ self.stack
+ .assignment_return
+ .push((stmt_assign, stmt_return, stmt, self.enclosing_tries.clone()));
}
// Example:
// ```python
@@ -157,17 +157,13 @@ impl<'a> Visitor<'a> for ReturnVisitor<'_, 'a> {
if let Some(stmt_assign) =
with.body.last().and_then(Stmt::as_assign_stmt)
{
- if !has_conditional_body(with, self.semantic)
- && !is_observed_by_enclosing_try(
- stmt_assign,
- &self.enclosing_tries,
- )
- {
+ if !has_conditional_body(with, self.semantic) {
self.stack.assignment_return.push((
- stmt_assign,
- stmt_return,
- stmt,
- ));
+ stmt_assign,
+ stmt_return,
+ stmt,
+ self.enclosing_tries.clone(),
+ ));
}
}
}
@@ -240,88 +236,3 @@ pub(crate) fn has_conditional_body(with: &ast::StmtWith, semantic: &SemanticMode
false
})
}
-
-/// Returns `true` if the `finally` clause of an enclosing `try` reads the
-/// assigned name. The `finally` runs after the `return`, so removing the
-/// assignment would hide the value from it.
-///
-/// `except` handlers are **not** checked: they are alternative control-flow
-/// paths to the `return`. If an exception fires, the assignment never
-/// completed, so the handler reads whatever value the name had before the
-/// `try` either way; removing the assignment does not change that.
-fn is_observed_by_enclosing_try(
- stmt_assign: &ast::StmtAssign,
- enclosing_tries: &[&ast::StmtTry],
-) -> bool {
- let [Expr::Name(target)] = stmt_assign.targets.as_slice() else {
- return false;
- };
- let name = target.id.as_str();
- enclosing_tries
- .iter()
- .any(|stmt_try| finalbody_observes(name, &stmt_try.finalbody))
-}
-
-/// `Stmt::AugAssign` targets count as reads: `x += 1` loads `x` first, but
-/// the AST encodes the target as `Store`, so a load-only walk misses it.
-fn finalbody_observes(name: &str, finalbody: &[Stmt]) -> bool {
- let load_of = |expr: &Expr| {
- matches!(
- expr,
- Expr::Name(ast::ExprName { id, ctx, .. })
- if ctx.is_load() && id == name
- )
- };
- for stmt in finalbody {
- let body = std::slice::from_ref(stmt);
- let mut finder = AugAssignFinder { name, found: false };
- finder.visit_body(body);
- if finder.found || any_over_body(body, load_of) {
- return true;
- }
- if is_top_level_kill(stmt, name) {
- return false;
- }
- }
- false
-}
-
-struct AugAssignFinder<'a> {
- name: &'a str,
- found: bool,
-}
-
-impl<'a> StatementVisitor<'a> for AugAssignFinder<'_> {
- fn visit_stmt(&mut self, stmt: &'a Stmt) {
- if self.found {
- return;
- }
- if let Stmt::AugAssign(ast::StmtAugAssign { target, .. }) = stmt
- && let Expr::Name(ast::ExprName { id, .. }) = target.as_ref()
- && id == self.name
- {
- self.found = true;
- return;
- }
- statement_visitor::walk_stmt(self, stmt);
- }
-}
-
-/// Conditional rebinds (inside `if`/`for`/...) don't count, since they may
-/// not execute.
-fn is_top_level_kill(stmt: &Stmt, name: &str) -> bool {
- let targets: &[Expr] = match stmt {
- Stmt::Assign(ast::StmtAssign { targets, .. }) => targets,
- Stmt::AnnAssign(ast::StmtAnnAssign {
- target,
- value: Some(_),
- ..
- }) => std::slice::from_ref(target.as_ref()),
- _ => return false,
- };
- let mut finder = StoredNameFinder::default();
- for target in targets {
- finder.visit_expr(target);
- }
- finder.names.contains_key(name)
-}Some of this binding manipulation might still be a bit much, but this captures the general outline of my idea: store more information during the main ReturnVisitor traversal and look up this state directly in the rule code. I think this is more analogous to the approach for nonlocal and global variables here:
ruff/crates/ruff_linter/src/rules/flake8_return/rules/function.rs
Lines 605 to 608 in db5aa0a
Alternatively, I also found a very simple patch like this to be pretty compelling:
diff --git a/crates/ruff_linter/src/rules/flake8_return/rules/function.rs b/crates/ruff_linter/src/rules/flake8_return/rules/function.rs
index 9db7dc3a4e..876b5e28e4 100644
--- a/crates/ruff_linter/src/rules/flake8_return/rules/function.rs
+++ b/crates/ruff_linter/src/rules/flake8_return/rules/function.rs
@@ -617,6 +617,15 @@ pub(crate) fn unnecessary_assign(checker: &Checker, function_stmt: &Stmt) {
else {
continue;
};
+ // A `return` can run `finally` suites that reference the assigned variable after the
+ // return expression is evaluated.
+ if assigned_binding
+ .references()
+ .map(|reference_id| checker.semantic().reference(reference_id))
+ .any(|reference| reference.range().start() > value.end())
+ {
+ continue;
+ }
// Check if there's any reference made to `assigned_binding` in another scope, e.g, nested
// functions. If there is, ignore them.
if assigned_bindingI think this option is biased toward more false negatives, but the sheer simplicity is pretty compelling, to me at least. I think I would probably lean toward the very simple approach, but I'm curious to get your thoughts.
I also think this might be helped someday by more general control flow analysis, which is part of the reason I'd be happy with a conservative, simple approach for now.
| matches!( | ||
| expr, | ||
| Expr::Name(ast::ExprName { id, ctx, .. }) | ||
| if ctx.is_load() && id == name |
There was a problem hiding this comment.
I looked at this for a while with Codex, and it was only able to come up with a couple of additional niche cases. The first is a del statement like:
def foo():
try:
x = 1
return x
finally:
del xwhere removing the binding can cause an UnboundLocalError. So we just need to check:
| if ctx.is_load() && id == name | |
| if (ctx.is_load() || ctx.is_del()) && id == name |
and then it pointed out that it's technically possible for a Call to locals() or vars() to capture a reference, but I think that's too far-fetched to worry much about. It suggested a match arm like this to handle that case:
Expr::Call(ast::ExprCall {
func, arguments, ..
}) => {
matches!(
func.as_ref(),
Expr::Name(ast::ExprName { id, ctx, .. })
if ctx.is_load()
&& arguments.is_empty()
&& matches!(id.as_str(), "locals" | "vars")
)
}…an enclosing finally
Summary
Fixes #17292
RET504 flagged assignments as unnecessary even when the variable was read in
finally/except, breaking runtime behavior on fix.Checks that the binding has only one reference (the return itself) before flagging
Test Plan