-
-
Notifications
You must be signed in to change notification settings - Fork 3k
[mypyc] feat: specialize any
and all
using for loop helpers if possible [1/2]
#19948
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
mypyc/test-data/irbuild-basic.test
Outdated
r12 = list_get_item_unsafe r2, r9 | ||
r13 = cast(str, r12) | ||
__mypyc_all_item__ = r13 | ||
r14 = CPyStr_IsTrue(__mypyc_all_item__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The boolean check for the all
call is now specialized per the dtype of the iterable, and we no longer have to use python's iterator protocol for the input
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On master this seemed to just call the stdlib all
using a vector call? So the change is that we now generate a for loop for this, right?
builder.goto(loop_exit) | ||
builder.activate_block(true_block) | ||
|
||
index_type = builder._analyze_iterable_item_type(arg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should I deduplicate this block or is this fine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be useful to deduplicate, since the only difference seems to be true/false switch, so a helper would just need a bool flag that indicates whether it's any
or all
. Not a big deal though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets just keep it as-is, the intent of the code is explicit and clear
mypyc/irbuild/for_helpers.py
Outdated
|
||
def create_synthetic_nameexpr(index_name: str, index_type: Type) -> NameExpr: | ||
"""This helper spoofs a NameExpr to use as the lvalue in one of the for loop helpers.""" | ||
index = NameExpr(index_name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I probably need something here to ensure no name collisions globally, but not sure what
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my other comment I suggested adding _<line>_column
suffix. Another thing we use is to use the temp_counter
defined in IRBuilder
to get an increasing suffix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should create_synthetic_nameexpr be an IRBuilder method instead of going here then?
I think it could be used to refactor a few other places within the codebase where NameExpr are artificially created
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've implemented the temp_counter but left this function where it is for now
any
and all
using for loop helpers if possible [1/2]
40f0ddd
to
92818b2
Compare
667e30b
to
41c7c9b
Compare
mypyc/irbuild/for_helpers.py
Outdated
|
||
def create_synthetic_nameexpr(index_name: str, index_type: Type) -> NameExpr: | ||
"""This helper spoofs a NameExpr to use as the lvalue in one of the for loop helpers.""" | ||
index = NameExpr(index_name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my other comment I suggested adding _<line>_column
suffix. Another thing we use is to use the temp_counter
defined in IRBuilder
to get an increasing suffix.
arg = expr.args[0] | ||
if isinstance(arg, GeneratorExpr): | ||
return any_all_helper(builder, arg, builder.false, lambda x: x, builder.true) | ||
elif expr_has_specialized_for_helper(builder, arg): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It took me a while to understand why this check is useful. The idea is that if we'd have a generic for loop (PyObject_GetIter
etc.) we can as well call the stdlib function?
What would happen if we'd always created the for loop here, since it would likely avoid the overhead of calling any
using generic call op, and the namespace dict lookup? If that sounds feasible, can you run some microbenchmarks to check if it's useful? I think it might be worth it especially if the iterable is small (0 or 1 items), which is quite common in practice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"It took me a while to understand why this check is useful. The idea is that if we'd have a generic for loop (PyObject_GetIter etc.) we can as well call the stdlib function?"
Uh. No. While I appreciate your assessment, it wasn't exactly so thought-out. My only intent was to fall back to the existing logic, whatever that may be, if there is no supported ForHelper.
I need to think about how to implement this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait I want to confirm, you're saying if we just remove the if check, we're good to go without any other changes? I suppose that makes sense since the for_loop_helper does have the generic ForGenerator class but I wonder if we're missing any edge cases where that might cause failure
builder.goto(loop_exit) | ||
builder.activate_block(true_block) | ||
|
||
index_type = builder._analyze_iterable_item_type(arg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be useful to deduplicate, since the only difference seems to be true/false switch, so a helper would just need a bool flag that indicates whether it's any
or all
. Not a big deal though.
mypyc/test-data/irbuild-basic.test
Outdated
r12 = list_get_item_unsafe r2, r9 | ||
r13 = cast(str, r12) | ||
__mypyc_all_item__ = r13 | ||
r14 = CPyStr_IsTrue(__mypyc_all_item__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On master this seemed to just call the stdlib all
using a vector call? So the change is that we now generate a for loop for this, right?
for more information, see https://pre-commit.ci
if desc: | ||
obj = builder.accept(obj_expr) | ||
return builder.primitive_op(desc, [obj], expr.line) | ||
if node := type_expr.node: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was definitely not necessary but it bothered me how ugly that snippet was
This PR modifies the
any
andall
call specializers to use our for loop helpers in cases where they can be used.This is only marginally helpful now but will become more impactful with the addition of
ForMap
andForFilter
helpers, as well as any future helpers.While the private helpers might seem unnecessary at this point, I intend to use them for future specialization use cases such as
list(iterable)
andtuple(iterable)
as discussed in #19649 , andsum(iterable)
which will come shortly after this PR.This PR is ready for review
Update: The
sum(iterable)
is now ready in #19950. It's a bit more involved than this one so I'm going to leave it separate.Update: The helpers are no longer private