-
-
Notifications
You must be signed in to change notification settings - Fork 773
perf(ast, ast_codegen): optimize AST structs' memory layouts #4404
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf(ast, ast_codegen): optimize AST structs' memory layouts #4404
Conversation
Your org has enabled the Graphite merge queue for merging into mainAdd the label “merge” to the PR and Graphite will automatically add it to the merge queue when it’s ready to merge. Or use the label “hotfix” to add to the merge queue as a hot fix. You must have a Graphite account and log in to Graphite in order to use the merge queue. Sign up using this link. |
|
Warning This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
This stack of pull requests is managed by Graphite. Learn more about stacking. |
CodSpeed Performance ReportMerging #4404 will not alter performanceComparing Summary
|
tasks/ast_codegen/src/layout.rs
Outdated
| // well known types | ||
| // TODO: generate const assertions for these in the ast crate | ||
| Span => { 64: Layout::of::<oxc_span::Span>(), 32: Layout::of::<oxc_span::Span>() }, | ||
| Atom => { 64: Layout::of::<oxc_span::Atom>(), 32: Layout::of::<oxc_span::Atom>() }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For Span and Atom, can we mark them #[ast] and have codegen parse the files in oxc_span, same as it does for oxc_ast? We need both types to be made #[repr(C)] anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I strongly believe we shouldn't extend the ast marker to types outside of the AST crate. Those one-off types can always be made repr(C) without all this infra. Span is already in the ideal order, And Vec type doesn't expose its fields so it can be reordered manually. Atom is a transparent type so no need for this stuff whatsoever.
But that's just my preference, I still haven't seen any real need to process those types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With that said, I have nothing against hard coding them as opposed to using a layout of method. I just did it for the sake of consistency if we regenerate on multiple architectures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For Span and Atom we can mark them #[repr(C)] and hard-code them if you prefer. But:
- AST transfer needs to know the field order for
Span(so we'd have to hard-code that too). - We can't avoid extending outside of
oxc_astcrate, because there are various types inoxc_syntaxcrate which are part of the AST e.g.AssignmentOperator. These types need#[repr(C)]and explicit discriminants. So it seems to me that simplest way to do that is#[ast]+ codegen.
What's the reason for your opposition to codegen reaching into other crates?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've mulled over this, I guess you are right it would be easier if we do so.
I wanted to keep the code generation's input limited to AST types, Vec, Box, Span, and Atom obviously aren't AST types. But your argument for types defined in oxc_syntax is valid.
I'm not familiar with all AST-related stuff in the oxc_syntax crate, But All I can remember is unit enums. So they should be easy to make repr(C).
Initially, I was slightly biased against doing this but now I'm natural. It sure helps not to have to worry about those types. Even if they all are simple enums.
I'll do it in a follow-up PR.
|
@overlookmotel I have a problem understanding something with repr C and option types. Can you help me with it? Take a look at this struct definition: oxc/crates/oxc_ast/src/ast/ts.rs Lines 829 to 836 in b1b66e2
If you get the size of the Is it correct in the first place? If my assumptions are right; How does it work on FFI? I guess option doesn't mean nullable for that pointer types are a better fit, But packing this information in structures wouldn't break the C compatibility? Does it only happen on pointers because there is free space in the address? Right now I only fold the option if there is room at the end of type(either the last field of struct or if all enum variants have 1 bit headroom). Should I also consider all the padding in between fields as viable space for folding options? Edit:I went with my intuition but it might very well be wrong so please let me know if I'm doing it incorrectly. |
34cacf6 to
38b4fc8
Compare
671d536 to
5112832
Compare
|
To clarify one thing:
"Niche" means a pattern of bits which is illegal for the type to use. In the case of I think the idea for FFI is that if you're getting a pointer from C, that C pointer can be either null or non-null. That maps cleanly onto Rust's The rules around niches as far as I understand them are:
Same thing for all the Some types have multiple niches. e.g.
(By the way, these complications around niches in enums is what motivated the "squashed enums" PR #3115. It was intractable to figure out what niches and discriminants Rust would produce when you have multiple enums, each with niches, nested enum-in-enum-in-enum). If a struct has a field which has a niche, then the struct inherits that niche too (as you've found). The hard part is when a struct has more than 1 field with a niche - which niche does the struct use? I don't know the answer to that one - but we don't need an answer anyway right now - if Padding cannot be used as a niche - I don't entirely understand that one, but pretty sure it's true. NB: One oddity is that One last thing: |
|
Thank you, I'll rework it to consider niches correctly. |
38b4fc8 to
af18a31
Compare
0ebd4eb to
d8867e2
Compare
7d25dc6 to
719efba
Compare
af18a31 to
fbaa37a
Compare
719efba to
b23e99f
Compare
e822ba6 to
425f59c
Compare
07ae601 to
d84a2db
Compare
edf5eb8 to
ed8675c
Compare
31a5a1c to
42740c6
Compare
ed8675c to
7118f04
Compare
42740c6 to
8cdd9d9
Compare
7118f04 to
4cc58bf
Compare
8cdd9d9 to
d0297db
Compare
4cc58bf to
9579449
Compare
d0297db to
717cd87
Compare
717cd87 to
01338e2
Compare
9579449 to
97c92f9
Compare
d07ce36 to
0c52c0d
Compare
18f1433 to
67e0203
Compare
375b347 to
297401d
Compare
67e0203 to
80e3195
Compare
|
Just to say, please don't delete this branch. We will want to return to this after we have added |

No description provided.