gh-143732: add specialization for FOR_ITER #148745

Open
NekoAsakura wants to merge 9 commits into python:main from NekoAsakura:gh-143732/for-iter-type-recording

Contributor

@NekoAsakura NekoAsakura commented Apr 19, 2026

| Benchmark | main | branch | ratio |
|---|---|---|---|
| for_iter_dict_items | 110.23 ms ± 1.74 ms | 99.29 ms ± 1.24 ms | 1.11× faster |
| for_iter_dict_keys | 85.20 ms ± 1.03 ms | 76.20 ms ± 1.02 ms | 1.12× faster |
| for_iter_dict_values | 84.91 ms ± 1.15 ms | 76.27 ms ± 1.24 ms | 1.11× faster |
| for_iter_set | 96.88 ms ± 1.05 ms | 87.46 ms ± 1.16 ms | 1.11× faster |
| for_iter_reversed | 79.99 ms ± 0.77 ms | 72.03 ms ± 0.79 ms | 1.11× faster |
| for_iter_enumerate | 123.65 ms ± 1.94 ms | 110.14 ms ± 1.82 ms | 1.12× faster |
| for_iter_zip | 131.47 ms ± 1.38 ms | 122.17 ms ± 2.87 ms | 1.08× faster |
| for_iter_list | 59.21 ms ± 1.17 ms | 59.48 ms ± 1.06 ms | 1.00× slower |
| for_iter_tuple | 53.78 ms ± 0.61 ms | 53.77 ms ± 0.74 ms | 1.00× faster |
| for_iter_range | 72.62 ms ± 0.80 ms | 72.55 ms ± 1.08 ms | 1.00× faster |

Member

@cocolato cocolato left a comment

LGTM, thanks for doing this!

Comment thread Python/bytecodes.c Outdated
}

- macro(FOR_ITER) = _SPECIALIZE_FOR_ITER + _FOR_ITER;
+ macro(FOR_ITER) = _SPECIALIZE_FOR_ITER + _RECORD_NOS_GEN_FUNC + _RECORD_NOS_TYPE + _FOR_ITER;
Member

case JIT_SYM_RECORDED_GEN_FUNC_TAG:
return &PyGen_Type;

We don't need _RECORD_NOS_GEN_FUNC here, because the GEN_FUNC_TAG recorded here does not contribute to the current optimization.

Contributor Author

Every FOR_ITER specialisation's record list must be a prefix of FOR_ITER's.
_RECORD_NOS_GEN_FUNC writes a gen func or NULL to slot 0, matching what FOR_ITER_GEN reads from it.
https://github.com/python/cpython/actions/runs/24623062259/job/71997168333
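
The prefix rule stated above can be illustrated with a small standalone check. This is only a sketch under stated assumptions, not CPython's actual implementation: the enum, the arrays, and `record_list_is_prefix` are hypothetical names, and the assumption that FOR_ITER_GEN records exactly one entry (the gen func in slot 0) is inferred from the comment rather than confirmed by the source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the recording uops named in the thread. */
typedef enum {
    RECORD_NOS_GEN_FUNC_SKETCH,
    RECORD_NOS_TYPE_SKETCH
} record_kind;

/* Returns true if `spec` (n_spec entries) is a prefix of `base` (n_base entries). */
static bool
record_list_is_prefix(const record_kind *spec, size_t n_spec,
                      const record_kind *base, size_t n_base)
{
    if (n_spec > n_base) {
        return false;
    }
    for (size_t i = 0; i < n_spec; i++) {
        if (spec[i] != base[i]) {
            return false;
        }
    }
    return true;
}

/* Record list of the generic FOR_ITER as written in the new macro. */
static const record_kind FOR_ITER_RECORDS[] = {
    RECORD_NOS_GEN_FUNC_SKETCH, RECORD_NOS_TYPE_SKETCH
};
/* Assumed record list of FOR_ITER_GEN: only slot 0 (gen func or NULL). */
static const record_kind FOR_ITER_GEN_RECORDS[] = {
    RECORD_NOS_GEN_FUNC_SKETCH
};
/* What FOR_ITER would record if _RECORD_NOS_GEN_FUNC were dropped. */
static const record_kind TYPE_ONLY_RECORDS[] = {
    RECORD_NOS_TYPE_SKETCH
};
```

With these lists, the FOR_ITER_GEN list is a prefix of the two-entry FOR_ITER list, but not of the type-only list, which is the situation the reply describes.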

PyType_Watch(TYPE_WATCHER_ID, (PyObject *)probable);
_Py_BloomFilter_Add(dependencies, probable);
sym_set_type(iter, probable);
int32_t orig_target = (this_instr - 1)->target;
Member

@cocolato cocolato Apr 19, 2026

We should add an assert to make sure the last uop is _RECORD_NOS_TYPE
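
A rough sketch of what such an assert could look like, assuming "the last uop" refers to the instruction at `this_instr - 1` that the quoted code already reads. The struct, the opcode constants, and the helper name are hypothetical and do not reflect CPython's real uop layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical opcode values, for illustration only. */
enum {
    OP_RECORD_NOS_TYPE_SKETCH = 1,
    OP_FOR_ITER_SKETCH = 2
};

/* Hypothetical, simplified uop record: an opcode plus a jump target. */
typedef struct {
    int opcode;
    int32_t target;
} uop_sketch;

/* Reads the target from the preceding uop, after checking that it
 * really is the type-recording uop the optimizer expects to find. */
static int32_t
target_of_preceding_record(const uop_sketch *this_instr)
{
    assert((this_instr - 1)->opcode == OP_RECORD_NOS_TYPE_SKETCH);
    return (this_instr - 1)->target;
}
```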

Member

@markshannon markshannon left a comment

I've a few suggestions inline.

Comment thread Python/optimizer_bytecodes.c Outdated
Comment thread Python/optimizer_bytecodes.c
Comment thread Python/bytecodes.c
}

tier2 op(_ITER_NEXT_INLINE, (iternext_fn/4, iter, null_or_index -- iter, null_or_index, next)) {
volatile iternextfunc iternext_v = (iternextfunc)iternext_fn;
Member

@markshannon markshannon May 1, 2026

Why volatile? It shouldn't be necessary.
Also function pointers may not be the same size as normal pointers.
Can you add assert(sizeof(iternextfunc) == sizeof(uintptr_t)); to be on the safe side?
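
To illustrate the size concern, here is a minimal sketch of carrying a function pointer through a `uintptr_t` operand with the suggested check in place. `iternextfunc_sketch` is a stand-in for CPython's `iternextfunc` (using `void *` instead of `PyObject *` so the example is self-contained), and the pack/unpack helpers are hypothetical names. Converting a function pointer to an integer is implementation-defined in C, so this relies on the usual behaviour of mainstream compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for CPython's iternextfunc; void * replaces PyObject *. */
typedef void *(*iternextfunc_sketch)(void *);

/* Compile-time version of the suggested check. */
static_assert(sizeof(iternextfunc_sketch) == sizeof(uintptr_t),
              "function pointers must fit exactly in a uintptr_t operand");

/* Trivial "next" function used to exercise the round trip. */
static void *
next_identity(void *iter)
{
    return iter;
}

/* Store the function pointer as an integer operand. */
static uintptr_t
pack_iternext(iternextfunc_sketch fn)
{
    /* Runtime form of the assert requested in the review. */
    assert(sizeof(iternextfunc_sketch) == sizeof(uintptr_t));
    return (uintptr_t)fn;
}

/* Recover the function pointer from the integer operand. */
static iternextfunc_sketch
unpack_iternext(uintptr_t operand)
{
    return (iternextfunc_sketch)operand;
}
```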

Contributor Author

Because mmap pages usually sit too far from tp_iternext for a rel32 offset to reach, we have to use volatile to force the compiler to emit callq *%rax (target read from a register) instead of callq <rel32> (target baked in as a fixed relative offset). Otherwise the call jumps to the wrong address and segfaults.
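
The pattern described in this reply can be sketched in isolation as follows. `next_fn`, `add_one`, and `call_via_volatile` are hypothetical names. Whether a particular compiler really emits an indirect call here depends on the compiler and its flags, so the comment about code generation reflects the author's explanation rather than a guarantee.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical callee type, standing in for iternextfunc. */
typedef int (*next_fn)(int);

static int
add_one(int x)
{
    return x + 1;
}

/* Calls the function whose address arrives as an integer operand. */
static int
call_via_volatile(uintptr_t target, int arg)
{
    /* The pointer object itself is volatile, so the compiler has to
     * load it at the call site and cannot treat the target as a known
     * constant. Per the reply above, the intent is an indirect call
     * through a register instead of a fixed relative call. */
    volatile next_fn fn_v = (next_fn)target;
    return fn_v(arg);
}
```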


bedevere-app Bot commented May 1, 2026

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

And if you don't make the requested changes, you will be poked with soft cushions!

@read-the-docs-community

Documentation build overview

📚 cpython-previews | 🛠️ Build #32505275 | 📁 Comparing 0f69071 against main (4b33308)

  🔍 Preview build  

92 files changed · + 1 added · ± 91 modified
