Skip to content

Conversation

@DaniPopes
Copy link
Collaborator

@DaniPopes DaniPopes commented Aug 22, 2023

Split off from #582

Summary:

  • i256_sign:
    • is split into the two possible variants as before (with the const bool), so we can use it with an immutable reference, avoiding copies in i256_cmp (see slt, sgt impls). i256_sign_compl gets optimized away like i256_sign::<true> before
    • by sorting Sign variants we can leverage derived Eq, Ord impls, which are can be further optimized by the compiler. Also, by using -1, 0, 1 it will be the exact same cmp::Ordering and the signum function result, which helps in the sign function (where we transmute num != 0 directly to Sign)
  • i256_div: if condition generates better code than a fully exhaustive match (branchless vs multiple branches). I don't know why, I would assume the two to be identical ... godbolt: https://godbolt.org/z/939obPrsj
  • other small changes are for readability: overflowing().0 == wrapping(), >> == wrapping_shr

Copy link
Collaborator

@gakonst gakonst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall supportive, defer to dragan for more

Comment on lines -142 to +134
Sign::Minus => {
let shifted = ((op2.overflowing_sub(U256::from(1)).0) >> shift)
.overflowing_add(U256::from(1))
.0;
two_compl(shifted)
}
Sign::Plus | Sign::Zero => op2.wrapping_shr(shift),
Sign::Minus => two_compl(op2.wrapping_sub(ONE).wrapping_shr(shift).wrapping_add(ONE)),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine if we are OK with going from overflowing -> wrapping

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can simplify this to ((op2-ONE)>>shift)+1) as checks for op2 == zero and op1 >= 256 are done and shr in background calls overflowing_shr inside ruint.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense in the context that ruint operations are overloaded to be wrapping, but it might not be obvious without this knowledge. I wrote them with explicit wrapping, normally I'd use operators but this is one of the very few instances where I'd rather use explicit methods

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wrapper_* are nicer here then overflowing_* so I will just merge PR.
But I am curious why you think this is one of those instances, bearing in mind that we already check for op2 == zero and op1 >= 256

Comment on lines +81 to 86
match first_sign.cmp(&second_sign) {
// note: adding `if first_sign != Sign::Zero` to short circuit zero comparisons performs
// slower on average, as of #582
Ordering::Equal => first.cmp(second),
o => o,
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems fine

Copy link
Member

@rakita rakita left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm! Had a few small nits, really like what you did with Sign

.0;
two_compl(shifted)
}
Sign::Plus | Sign::Zero => op2.wrapping_shr(shift),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shift size is checked so it is okay to use *op2 >>shift without wrapping_shr:
https://doc.rust-lang.org/std/primitive.u8.html#method.wrapping_shl

wrapping_shr calls overflowing_shr and even ordinary shr calls wrapping_shr

Comment on lines -142 to +134
Sign::Minus => {
let shifted = ((op2.overflowing_sub(U256::from(1)).0) >> shift)
.overflowing_add(U256::from(1))
.0;
two_compl(shifted)
}
Sign::Plus | Sign::Zero => op2.wrapping_shr(shift),
Sign::Minus => two_compl(op2.wrapping_sub(ONE).wrapping_shr(shift).wrapping_add(ONE)),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can simplify this to ((op2-ONE)>>shift)+1) as checks for op2 == zero and op1 >= 256 are done and shr in background calls overflowing_shr inside ruint.

@DaniPopes
Copy link
Collaborator Author

Noticed one more thing (a774b29), we can drop the zero check in div and mod after doing the operation, since it allows for more optimizations at function return and results in better codegen overall.

@rakita rakita merged commit 37b0192 into bluealloy:main Aug 27, 2023
@DaniPopes DaniPopes deleted the ir-i256 branch August 27, 2023 10:10
mikelodder7 pushed a commit to LIT-Protocol/revm that referenced this pull request Sep 12, 2023
* perf(interpreter): improve i256 instructions

* chore: remove unused code, address review

* perf: drop zero check after dividing
Evalir pushed a commit to Evalir/revm that referenced this pull request Sep 14, 2023
* perf(interpreter): improve i256 instructions

* chore: remove unused code, address review

* perf: drop zero check after dividing
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants