perf(semantic): remove a branch from `record_ast_node` #4273

overlookmotel · 2024-07-15T11:04:20Z

Remove an unnecessary branch from record_ast_node. As suggested by @rzvxa in #4263 (review).

graphite-app · 2024-07-15T11:04:25Z

Your org has enabled the Graphite merge queue for merging into main

Add the label “merge” to the PR and Graphite will automatically add it to the merge queue when it’s ready to merge. Or use the label “hotfix” to add to the merge queue as a hot fix.

You must have a Graphite account and log in to Graphite in order to use the merge queue. Sign up using this link.

overlookmotel · 2024-07-15T11:04:35Z

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more

perf(semantic): remove a branch from record_ast_node #4273 👈
perf(semantic): inline ast record functions #4272
main

This stack of pull requests is managed by Graphite. Learn more about stacking.

Join @overlookmotel and the rest of your teammates on Graphite

codspeed-hq · 2024-07-15T11:09:48Z

CodSpeed Performance Report

Merging #4273 will not alter performance

_{Comparing 07-15-perf_semantic_remove_a_branch_from_record_ast_node_ (019b4d3) with main (f9d3f2e)}

Summary

✅ 32 untouched benchmarks

rzvxa · 2024-07-15T11:44:48Z

Huh, It is consistently slower? How does that make sense? I can't remember if the benchmark runs with CFG or not, I think the None check is just fewer CPU cycles with no CFG runs. What does the benchmark look like with CFG enabled?

This reverts commit 788269e.

overlookmotel · 2024-07-15T11:51:31Z

@rzvxa I've run the benchmarks 3 times, and each time it shows a negative effect. NB: CFG is not enabled on semantic benchmark.

I am guessing what's going on is one of:

Because record_ast_nodes and retrieve_recorded_ast_node also call self.cfg.is_some(), this value is warm in cache, so faster to check.
Because record_ast_nodes and retrieve_recorded_ast_node also call self.cfg.is_some(), using self.cfg.is_some() here too allows compiler to see it's a repeated check and make some optimization. This seems unlikely to me, as calls to record_ast_node are so far away from record_ast_nodes.
Vec::last_mut is slower than Option::is_some. This surprises me as I'd assume first thing Vec::last_mut would do is a bounds check. That check would be whether an 8-byte value is 0, same as Option::is_some check is.

But this is all just guesswork. No idea really! Any thoughts?

Regardless, given the (small) perf regression, I think we should not merge this.

rzvxa · 2024-07-15T12:01:54Z

Yes, Let's close it. I initially suggested this to have fewer checks in the CFG pass but since the linter benchmark(which contains CFG) is untouched it clearly wasn't the case.

Sorry to waste your time mate😓

overlookmotel · 2024-07-15T12:02:46Z

Not a waste of time at all. Negative results are also results.

Add a comment referencing conclusions of #4273.

overlookmotel · 2024-07-16T13:37:22Z

I thought about it more, and I have another theory. Maybe the thing is that Vec::last_mut is a more complicated function, and often used, so compiler doesn't inline it. So the difference between Vec::last_mut and Option::is_some is that Vec::last_mut adds the overhead a function call. Added to that, it may be optimized to expect the Vec to have entries, so returning None is the cold path and incurs a branch mis-prediction.

Maybe.

@rzvxa Still no idea if this is correct, but thought you might be interested.

rzvxa · 2024-07-16T22:59:14Z

Thanks for having this conversation with me.

I thought about it more, and I have another theory. Maybe the thing is that Vec::last_mut is a more complicated function, and often used, so compiler doesn't inline it. So the difference between Vec::last_mut and Option::is_some is that Vec::last_mut adds the overhead a function call. Added to that, it may be optimized to expect the Vec to have entries, so returning None is the cold path and incurs a branch mis-prediction.

Maybe.

@rzvxa Still no idea if this is correct, but thought you might be interested.

I've tested this with a few other approaches such as checking indices instead of using the last_mut and checks for is_some have outperformed them all. Rustc is just really good at handling options(as it should be).

overlookmotel · 2024-07-17T11:10:26Z

Thanks for having this conversation with me.

I thought about it more, and I have another theory. Maybe the thing is that Vec::last_mut is a more complicated function, and often used, so compiler doesn't inline it. So the difference between Vec::last_mut and Option::is_some is that Vec::last_mut adds the overhead a function call. Added to that, it may be optimized to expect the Vec to have entries, so returning None is the cold path and incurs a branch mis-prediction.
Maybe.
@rzvxa Still no idea if this is correct, but thought you might be interested.

I've tested this with a few other approaches such as checking indices instead of using the last_mut and checks for is_some have outperformed them all. Rustc is just really good at handling options(as it should be).

Or maybe the explanation was this one:

Because record_ast_nodes and retrieve_recorded_ast_node also call self.cfg.is_some(), this value is warm in cache, so faster to check.

github-actions bot added the A-semantic Area - Semantic label Jul 15, 2024

overlookmotel mentioned this pull request Jul 15, 2024

perf(semantic): inline ast record functions #4272

Merged

overlookmotel marked this pull request as ready for review July 15, 2024 11:05

overlookmotel marked this pull request as draft July 15, 2024 11:22

Boshen force-pushed the 07-15-perf_semantic_inline_ast_record_functions branch from fbdd40f to f9d3f2e Compare July 15, 2024 11:43

Boshen changed the base branch from 07-15-perf_semantic_inline_ast_record_functions to main July 15, 2024 11:49

overlookmotel added 3 commits July 15, 2024 11:49

perf(semantic): remove a branch from record_ast_node

9f14952

Dummy comment to re-run benchmark

6adfb66

Revert "Dummy comment to re-run benchmark"

019b4d3

This reverts commit 788269e.

Boshen force-pushed the 07-15-perf_semantic_remove_a_branch_from_record_ast_node_ branch from 64f74b5 to 019b4d3 Compare July 15, 2024 11:49

overlookmotel requested a review from rzvxa July 15, 2024 11:51

rzvxa closed this Jul 15, 2024

overlookmotel mentioned this pull request Jul 15, 2024

refactor(semantic): comment why extra CFG enabled check #4274

Merged

overlookmotel added a commit that referenced this pull request Jul 15, 2024

refactor(semantic): comment why extra CFG enabled check (#4274)

639fd48

Add a comment referencing conclusions of #4273.

Boshen deleted the 07-15-perf_semantic_remove_a_branch_from_record_ast_node_ branch August 12, 2024 06:31

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

perf(semantic): remove a branch from `record_ast_node` #4273

perf(semantic): remove a branch from `record_ast_node` #4273

Uh oh!

overlookmotel commented Jul 15, 2024 •

edited

Loading

Uh oh!

graphite-app bot commented Jul 15, 2024

Uh oh!

overlookmotel commented Jul 15, 2024

Uh oh!

codspeed-hq bot commented Jul 15, 2024 •

edited

Loading

Uh oh!

rzvxa commented Jul 15, 2024 •

edited

Loading

Uh oh!

overlookmotel commented Jul 15, 2024 •

edited

Loading

Uh oh!

rzvxa commented Jul 15, 2024

Uh oh!

overlookmotel commented Jul 15, 2024

Uh oh!

overlookmotel commented Jul 16, 2024 •

edited

Loading

Uh oh!

rzvxa commented Jul 16, 2024

Uh oh!

overlookmotel commented Jul 17, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

perf(semantic): remove a branch from record_ast_node #4273

perf(semantic): remove a branch from record_ast_node #4273

Uh oh!

Conversation

overlookmotel commented Jul 15, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

graphite-app bot commented Jul 15, 2024

Your org has enabled the Graphite merge queue for merging into main

Uh oh!

overlookmotel commented Jul 15, 2024

Uh oh!

codspeed-hq bot commented Jul 15, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

CodSpeed Performance Report

Merging #4273 will not alter performance

Summary

Uh oh!

rzvxa commented Jul 15, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

overlookmotel commented Jul 15, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rzvxa commented Jul 15, 2024

Uh oh!

overlookmotel commented Jul 15, 2024

Uh oh!

overlookmotel commented Jul 16, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rzvxa commented Jul 16, 2024

Uh oh!

overlookmotel commented Jul 17, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

perf(semantic): remove a branch from `record_ast_node` #4273

perf(semantic): remove a branch from `record_ast_node` #4273

overlookmotel commented Jul 15, 2024 •

edited

Loading

codspeed-hq bot commented Jul 15, 2024 •

edited

Loading

rzvxa commented Jul 15, 2024 •

edited

Loading

overlookmotel commented Jul 15, 2024 •

edited

Loading

overlookmotel commented Jul 16, 2024 •

edited

Loading