[mypyc] feat: ForFilter generator helper for builtins.filter #19643

Open
wants to merge 29 commits into
base: master
b3f3eab
[mypyc] feat: ForFilter generator helper for builtins.filter
BobTheBuidler Aug 12, 2025
67818c6
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 12, 2025
74b7a6e
fix: add filter to ir fixtures
BobTheBuidler Aug 12, 2025
eeb09ab
fix: run tests
BobTheBuidler Aug 12, 2025
ddc13b8
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 12, 2025
fc12cea
test with None
BobTheBuidler Aug 12, 2025
5ce8148
Merge branch 'for-filter' of https://github.com/BobTheBuidler/mypy in…
BobTheBuidler Aug 12, 2025
54ad04e
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 12, 2025
eae9209
IR cases for testing C calls
BobTheBuidler Aug 12, 2025
9941d54
feat: handle native calls and primitive ops
BobTheBuidler Aug 12, 2025
71b27ef
Merge branch 'for-filter' of https://github.com/BobTheBuidler/mypy in…
BobTheBuidler Aug 12, 2025
5237f0b
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 12, 2025
d68b833
Update run-loops.test
BobTheBuidler Aug 12, 2025
c9680dc
Update for_helpers.py
BobTheBuidler Aug 12, 2025
5bf4b22
test primitive op
BobTheBuidler Aug 12, 2025
c39bb4a
feat: use speciailizers
BobTheBuidler Aug 12, 2025
8e43b2e
Merge branch 'for-filter' of https://github.com/BobTheBuidler/mypy in…
BobTheBuidler Aug 12, 2025
9dceb9a
Revert "Update for_helpers.py"
BobTheBuidler Aug 12, 2025
7c8053f
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 12, 2025
8aff832
add to docs
BobTheBuidler Aug 12, 2025
0d2c019
Merge branch 'for-filter' of https://github.com/BobTheBuidler/mypy in…
BobTheBuidler Aug 12, 2025
cec1a5d
Update for_helpers.py
BobTheBuidler Aug 13, 2025
5170a10
Merge branch 'master' into for-filter
BobTheBuidler Aug 13, 2025
ba5a978
Merge branch 'master' into for-filter
BobTheBuidler Aug 13, 2025
572793c
Update native_operations.rst
BobTheBuidler Aug 14, 2025
55ed2d6
Merge branch 'master' into for-filter
BobTheBuidler Aug 14, 2025
dbbbb57
Update for_helpers.py
BobTheBuidler Aug 16, 2025
0bc1d26
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 16, 2025
7d56fa9
Update for_helpers.py
BobTheBuidler Aug 16, 2025
2 changes: 2 additions & 0 deletions mypyc/doc/native_operations.rst
@@ -54,3 +54,5 @@ These variants of statements have custom implementations:
* ``for ... in seq:`` (for loop over a sequence)
* ``for ... in enumerate(...):``
* ``for ... in zip(...):``
* ``for ... in filter(...):``
Collaborator

Have you considered supporting list(filter(...)) as well -- this seems quite common (in a follow-up PR)?

Contributor Author

Yep, I actually have that drafted already, but it won't be a special case for list(filter(...)). It will be a special case for [list|tuple|set](some_builtin_we_have_a_helper_for_in_for_helpers(...)), which will account for any builtin we have ForGenerator helpers for.

Contributor Author

@BobTheBuidler BobTheBuidler Aug 15, 2025

FWIW, this was part of the intent behind the list-built-from-range tests.

I wasn't actually testing that we can build a list from a range; I was preparing IR to reflect how this helper would change the C implementation. It will work for map, filter, range, zip, enumerate, and future ops with special-case gen helpers.
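
As a rough illustration of the semantics such a follow-up would need to preserve, the sketch below spells out what list(filter(func, iterable)) computes as a plain loop. The names `build_list_from_filter` and `is_even` are hypothetical and exist only for this example; this is not the mypyc implementation.

```python
from typing import Callable, Iterable, List, Optional, TypeVar

T = TypeVar("T")


def build_list_from_filter(
    func: Optional[Callable[[T], object]], iterable: Iterable[T]
) -> List[T]:
    """Hypothetical helper: list(filter(func, iterable)) written as a loop."""
    result: List[T] = []
    for item in iterable:
        # filter(None, ...) keeps items that are themselves truthy.
        keep = item if func is None else func(item)
        if keep:
            result.append(item)
    return result


def is_even(n: int) -> bool:
    return n % 2 == 0
```

The same loop shape would apply to tuple(...) and set(...), with only the container and the append step changing.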

* ``for ... in itertools.filterfalse(...):``
Collaborator

As mentioned in my comment, this seems rare enough that it doesn't seem worth it to support it. Every special case adds some overhead to users, as they may try to remember all the stdlib features with optimized primitives.

Contributor Author

@BobTheBuidler BobTheBuidler Aug 15, 2025

Is there currently any API for a "mypyc plugin", like a mypy plugin? Or are there future plans for one?

It could be cool to have a mypyc functools plugin, a mypyc itertools plugin, etc., to cover special-casing of things that don't necessarily belong in the main mypy repo.

Something like itertools.accumulate would see quite large performance increases if we could handle the math operations in C. (This is something I wanted/needed for my own libs and was driving towards with these generator helpers, but I'll just fork mypy for now to implement it.)

Compiling optimized C code for other fun things like functools.lru_cache and functools.cached_property would also be really cool (and fast, and useful in niche cases) if there were a way to do it without polluting the main repo.
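
For context on the itertools.accumulate point, here is a minimal sketch of the loop that accumulate performs for integer addition. `running_sum` is a hypothetical name used only for illustration, and the sketch makes no claim about how a compiled version would be implemented or how fast it would be.

```python
import itertools
from typing import Iterable, Iterator


def running_sum(values: Iterable[int]) -> Iterator[int]:
    """Hypothetical loop form of itertools.accumulate(values) for ints."""
    total = 0
    for value in values:
        # With native ints, this addition is the arithmetic a compiler could keep unboxed.
        total += value
        yield total
```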

Contributor Author

@BobTheBuidler BobTheBuidler Aug 15, 2025

Though I do think some functools special-casing might still be appropriate in the main repo, if not now then as a longer-term goal. Even though that isn't the current way things are done, it seems like we've already started in that direction (see the singledispatch feature set).

Contributor Author

What if we just implement this but don't document it, so that it adds no overhead for users?

81 changes: 81 additions & 0 deletions mypyc/irbuild/for_helpers.py
@@ -22,6 +22,7 @@
    SetExpr,
    TupleExpr,
    TypeAlias,
    Var,
)
from mypyc.ir.ops import (
    ERR_NEVER,
@@ -491,6 +492,16 @@ def make_for_loop_generator(
            for_list = ForSequence(builder, index, body_block, loop_exit, line, nested)
            for_list.init(expr_reg, target_type, reverse=True)
            return for_list

        elif (
            expr.callee.fullname == "builtins.filter"
            and len(expr.args) == 2
            and all(k == ARG_POS for k in expr.arg_kinds)
        ):
            for_filter = ForFilter(builder, index, body_block, loop_exit, line, nested)
            for_filter.init(index, expr.args[0], expr.args[1])
            return for_filter

    if isinstance(expr, CallExpr) and isinstance(expr.callee, MemberExpr) and not expr.args:
        # Special cases for dictionary iterator methods, like dict.items().
        rtype = builder.node_type(expr.callee.expr)
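
The guard added in this hunk only matches calls with exactly two plain positional arguments. As a rough illustration using the standard `ast` module rather than mypy's own node classes, the sketch below applies the same shape check. `is_simple_filter_call` is a hypothetical name, and unlike the check in the PR it matches the bare name `filter` instead of the resolved `builtins.filter` fullname.

```python
import ast


def is_simple_filter_call(source: str) -> bool:
    """Return True if `source` is a call like filter(f, xs) with two positional args."""
    node = ast.parse(source, mode="eval").body
    return (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == "filter"
        and len(node.args) == 2
        # Star-args such as filter(*pair, xs) are not plain positional arguments.
        and not any(isinstance(arg, ast.Starred) for arg in node.args)
        and not node.keywords
    )
```

Calls that fail the check would simply fall through to whatever other handling applies to the iterable expression.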
@@ -1166,3 +1177,73 @@ def gen_step(self) -> None:
    def gen_cleanup(self) -> None:
        for gen in self.gens:
            gen.gen_cleanup()


class ForFilter(ForGenerator):
    """Generate optimized IR for a for loop over filter(f, iterable)."""

    def need_cleanup(self) -> bool:
        # The wrapped for loops might need cleanup. We might generate a
        # redundant cleanup block, but that's okay.
        return True

    def init(self, index: Lvalue, func: Expression, iterable: Expression) -> None:
        self.filter_func_def = func
        if (
            isinstance(func, NameExpr)
            and isinstance(func.node, Var)
            and func.node.fullname == "builtins.None"
        ):
            self.filter_func_val = None
        else:
            self.filter_func_val = self.builder.accept(func)
        self.iterable = iterable
        self.index = index

        self.gen = make_for_loop_generator(
            self.builder,
            self.index,
            self.iterable,
            self.body_block,
            self.loop_exit,
            self.line,
            is_async=False,
            nested=True,
        )

    def gen_condition(self) -> None:
        self.gen.gen_condition()

    def begin_body(self) -> None:
        # 1. Assign the next item to the loop variable
        self.gen.begin_body()

        # 2. Call the filter function
        builder = self.builder
        line = self.line
        item = builder.read(builder.get_assignment_target(self.index), line)

        if self.filter_func_val is None:
            result = item
        else:
            fake_call_expr = CallExpr(self.filter_func_def, [self.index], [ARG_POS], [None])

            # I put this here to prevent a circular import
            # from mypyc.irbuild.expression import transform_call_expr

            # result = transform_call_expr(builder, fake_call_expr)
            result = builder.accept(fake_call_expr)

        # Now, filter: only enter the body if func(item) is truthy
        cont_block, rest_block = BasicBlock(), BasicBlock()
        builder.add_bool_branch(result, rest_block, cont_block)
        builder.activate_block(cont_block)
        builder.nonlocal_control[-1].gen_continue(builder, line)
        builder.goto_and_activate(rest_block)
        # At this point, the rest of the loop body (user code) will be emitted

    def gen_step(self) -> None:
        self.gen.gen_step()

    def gen_cleanup(self) -> None:
        self.gen.gen_cleanup()
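
In plain Python terms, the control flow that ForFilter emits corresponds roughly to the loop below: iterate over the underlying iterable directly, test each item, and `continue` past the body when the test is falsy. This is only a model of the behavior under that reading of the diff; `run_filtered_loop` and `body` are hypothetical names, and the real class generates IR blocks rather than calling a Python callback.

```python
from typing import Callable, Iterable, List, Optional, TypeVar

T = TypeVar("T")


def run_filtered_loop(
    func: Optional[Callable[[T], object]],
    iterable: Iterable[T],
    body: Callable[[T], None],
) -> None:
    """Hypothetical plain-Python model of the control flow ForFilter emits."""
    for item in iterable:  # the wrapped generator assigns the next item
        # filter(None, ...) tests the item itself; otherwise call func(item).
        result = item if func is None else func(item)
        if not result:
            continue  # falsy result: skip the body and move to the next item
        body(item)  # truthy result: the user's loop body runs


seen: List[int] = []
run_filtered_loop(lambda n: n > 2, [1, 2, 3, 4], seen.append)
```

Because the loop walks the underlying iterable itself, no intermediate filter object needs to be created in this model.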
21 changes: 21 additions & 0 deletions mypyc/test-data/fixtures/ir.py
@@ -7,6 +7,8 @@
    overload, Mapping, Union, Callable, Sequence, FrozenSet, Protocol
)

from typing_extensions import Self

_T = TypeVar('_T')
T_co = TypeVar('T_co', covariant=True)
T_contra = TypeVar('T_contra', contravariant=True)
@@ -406,3 +408,22 @@ class classmethod: pass
class staticmethod: pass

NotImplemented: Any = ...

_T1 = TypeVar("_T1")
_T2 = TypeVar("_T2")

class map(Generic[_S]):
    @overload
    def __new__(cls, func: Callable[[_T1], _S], iterable: Iterable[_T1], /) -> Self: ...
    @overload
    def __new__(cls, func: Callable[[_T1, _T2], _S], iterable: Iterable[_T1], iter2: Iterable[_T2], /) -> Self: ...
    def __iter__(self) -> Self: ...
    def __next__(self) -> _S: ...

class filter(Generic[_T]):
    @overload
    def __new__(cls, function: None, iterable: Iterable[_T | None], /) -> Self: ...
    @overload
    def __new__(cls, function: Callable[[_T], Any], iterable: Iterable[_T], /) -> Self: ...
    def __iter__(self) -> Self: ...
    def __next__(self) -> _T: ...