Merged
Changes from 7 commits
5 changes: 5 additions & 0 deletions .changeset/forty-dodos-visit.md
@@ -0,0 +1,5 @@
+---
+'openzeppelin-solidity': minor
+---
+
+`Strings`: Added a utility function for converting an address to checksummed string.
51 changes: 43 additions & 8 deletions contracts/utils/Strings.sol
Collaborator

Redesigned toChecksumHexString to avoid double allocation.

  • removed the need for _unsafeSetHexString
  • removed the need for HEX_DIGITS_UPPERCASE

Member

Right, this is much cleaner. Thanks!

Contributor Author

Great changes, thanks!

@@ -11,6 +11,7 @@ import {SignedMath} from "./math/SignedMath.sol";
  */
 library Strings {
     bytes16 private constant HEX_DIGITS = "0123456789abcdef";
+    bytes16 private constant HEX_DIGITS_UPPERCASE = "0123456789ABCDEF";
     uint8 private constant ADDRESS_LENGTH = 20;

     /**
@@ -63,17 +64,15 @@ library Strings {
      * @dev Converts a `uint256` to its ASCII `string` hexadecimal representation with fixed length.
      */
     function toHexString(uint256 value, uint256 length) internal pure returns (string memory) {
-        uint256 localValue = value;
+        if (length < Math.log256(value) + 1) {
Collaborator

Computing the log is quite expensive. Checking at the end is cheaper when everything works fine, which is the case we should optimize for. Any reason not to keep the check at the end?
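
For illustration, the check-at-the-end approach described in this comment can be sketched in JavaScript. This is a toy model using BigInt, not the Solidity code under review. The thrown `Error` only stands in for the `StringsInsufficientHexLength` custom error.

```javascript
// Toy model of toHexString(value, length) with the length check done last.
// `value` is a BigInt; `length` is the number of bytes to render.
function toHexString(value, length) {
  const HEX_DIGITS = '0123456789abcdef';
  const buffer = new Array(2 * length + 2);
  buffer[0] = '0';
  buffer[1] = 'x';
  let localValue = value;
  // Fill the nibbles right to left, mirroring the original Solidity loop.
  for (let i = 2 * length + 1; i > 1; --i) {
    buffer[i] = HEX_DIGITS[Number(localValue & 0xfn)];
    localValue >>= 4n;
  }
  // Any bits left over mean that `length` bytes were not enough.
  if (localValue !== 0n) {
    throw new Error(`StringsInsufficientHexLength(${value}, ${length})`);
  }
  return buffer.join('');
}
```

On the success path the only extra work is a single comparison after the loop, whereas an up-front check has to compute a logarithm before any digit is written.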

+            revert StringsInsufficientHexLength(value, length);
+        }
+
         bytes memory buffer = new bytes(2 * length + 2);
         buffer[0] = "0";
         buffer[1] = "x";
-        for (uint256 i = 2 * length + 1; i > 1; --i) {
-            buffer[i] = HEX_DIGITS[localValue & 0xf];
-            localValue >>= 4;
-        }
-        if (localValue != 0) {
-            revert StringsInsufficientHexLength(value, length);
-        }
+        _unsafeSetHexString(buffer, 2, value);
+
         return string(buffer);
     }

@@ -85,10 +84,46 @@ library Strings {
         return toHexString(uint256(uint160(addr)), ADDRESS_LENGTH);
     }

+    /**
+     * @dev Converts an `address` with fixed length of 20 bytes to its checksummed ASCII `string` hexadecimal
+     * representation, according to EIP-55.
+     */
+    function toChecksumHexString(address addr) internal pure returns (string memory) {
+        bytes memory lowercase = new bytes(40);
+        uint160 addrValue = uint160(addr);
+        _unsafeSetHexString(lowercase, 0, addrValue);
+        bytes32 hashedAddr = keccak256(abi.encodePacked(lowercase));
Collaborator

@Amxx, Jun 4, 2024

You often write abi.encodePacked(x) when x is already a bytes value (here and on line 107).

Used like this, abi.encodePacked is the identity function: the output exactly matches the input, so it is not necessary. What does happen, however, is that memory is allocated for the copy, and there is also a copy loop (or an mcopy if we are lucky).

Anyway, this increases costs and leaks memory, so we should avoid it!
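
As a rough analogy only, the point can be modelled in JavaScript: packing a single byte string yields the same bytes, but in freshly allocated storage. The helper name `encodePackedBytes` is hypothetical and is not part of any library.

```javascript
// Toy model of abi.encodePacked(x) for a single dynamic `bytes` argument:
// the packed encoding is just the raw bytes, written into new memory.
function encodePackedBytes(x) {
  // `new Uint8Array(typedArray)` allocates a new array and copies the contents.
  return new Uint8Array(x);
}

const input = Uint8Array.of(0xde, 0xad, 0xbe, 0xef);
const packed = encodePackedBytes(input);
// Same contents, but a distinct allocation that the caller did not need.
const sameContents = packed.every((byte, i) => byte === input[i]);
const sameObject = packed === input;
```

Here `sameContents` is true while `sameObject` is false, which is the wasted copy the comment refers to.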


+        bytes memory buffer = new bytes(42);
+        buffer[0] = "0";
+        buffer[1] = "x";
+        uint160 hashValue = uint160(bytes20(hashedAddr));
+        for (uint256 i = 41; i > 1; --i) {
+            uint8 digit = uint8(addrValue & 0xf);
+            buffer[i] = hashValue & 0xf > 7 ? HEX_DIGITS_UPPERCASE[digit] : HEX_DIGITS[digit];
+            addrValue >>= 4;
+            hashValue >>= 4;
+        }
+        return string(abi.encodePacked(buffer));
+    }
Collaborator

@Amxx, Jun 4, 2024

This entire function should be done "in place". Allocating two buffers is a waste.
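
One way to picture the in-place idea is the JavaScript sketch below. It assumes a single 42-byte buffer that already holds "0x" followed by the 40 lowercase hex digits, and it takes the Keccak-256 digest of those 40 digits as an input rather than computing it. The function name `applyChecksumCaseInPlace` and its parameters are hypothetical, and this is not the implementation that was merged.

```javascript
// Hypothetical helper: applies EIP-55 casing to `buffer` in place.
// `buffer`: 42 ASCII bytes, "0x" followed by 40 lowercase hex digits.
// `hash`: the 32-byte Keccak-256 digest of those 40 hex digits,
//         supplied by the caller (assumption: computed elsewhere).
function applyChecksumCaseInPlace(buffer, hash) {
  for (let i = 0; i < 40; ++i) {
    // i-th nibble of the hash: high nibble for even i, low nibble for odd i.
    const byte = hash[i >> 1];
    const nibble = i % 2 === 0 ? byte >> 4 : byte & 0x0f;
    const pos = i + 2; // skip the "0x" prefix
    // Only 'a'..'f' (97..102) can change; digits '0'..'9' (48..57) are left alone.
    if (nibble > 7 && buffer[pos] > 96) {
      buffer[pos] ^= 0x20; // flip the ASCII case bit: 'a' -> 'A'
    }
  }
  return buffer; // same object, no second allocation
}
```

Because the lowercase digits and the final checksummed digits differ only in one bit per letter, a single buffer can serve as both the hash input and the output.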


     /**
      * @dev Returns true if the two strings are equal.
      */
     function equal(string memory a, string memory b) internal pure returns (bool) {
         return bytes(a).length == bytes(b).length && keccak256(bytes(a)) == keccak256(bytes(b));
     }
+
+    /**
+     * @dev Sets the hexadecimal representation of a value in the specified buffer starting from the given offset.
+     *
+     * NOTE: This function does not check that the `buffer` can allocate `value` without overflowing. Make sure
+     * to check whether `Math.log256(value) + 1` is larger than the specified `length`.
+     */
+    function _unsafeSetHexString(bytes memory buffer, uint256 offset, uint256 value) private pure {
+        for (uint256 i = buffer.length; i > offset; --i) {
+            buffer[i - 1] = HEX_DIGITS[value & 0xf];
+            value >>= 4;
+        }
+    }
 }
12 changes: 12 additions & 0 deletions test/utils/Strings.test.js
Member

There's a Rust implementation of this checksum algorithm in Foundry (as seen in cast), so it should be relatively trivial to open a PR requesting that it be exposed through VM.sol, as with Base64.

With that, we could fuzz the implementation, which would be extremely valuable.
It's not required for this PR, but it's something to consider given that the changes we're making are somewhat relevant.

Contributor Author

Totally. Fuzzing should be used in some of the other utils as well.

Member

Agreed. Feel free to open PRs adding fuzzing or Halmos formal verification to the utils where you think it makes sense.

@@ -120,6 +120,18 @@ describe('Strings', function () {
     });
   });

+  describe('toChecksumHexString address', function () {
+    it('converts a random address', async function () {
+      const addr = '0xa9036907dccae6a1e0033479b12e837e5cf5a02f';
+      expect(await this.mock.getFunction('$toChecksumHexString(address)')(addr)).to.equal(ethers.getAddress(addr));
+    });
+
+    it('converts an address with leading zeros', async function () {
+      const addr = '0x0000e0ca771e21bd00057f54a68c30d400000000';
+      expect(await this.mock.getFunction('$toChecksumHexString(address)')(addr)).to.equal(ethers.getAddress(addr));
+    });
+  });
Member

@ernestognw, Jun 3, 2024

There's a little consideration in the documentation of getAddress:

  "If %%address%% contains both upper-case and lower-case, it is assumed to already be a checksum address and its checksum is validated, and if the address fails its expected checksum an error is thrown."

None of these tests are using mixed-case letters, so I'd recommend adding .toLowerCase() according to the same docs:

  "If you wish the checksum of %%address%% to be ignore, it should be converted to lower-case (i.e. .toLowercase()) before being passed in."

Even better, let's rewrite these tests:

const addresses = [...]

describe('toChecksumHexString address', function () {
  for (const addr of addresses) {
    it(`converts ${addr}`, async function () {
      expect(await this.mock.getFunction('$toChecksumHexString(address)')(addr.toLowerCase())).to.equal(
        ethers.getAddress(addr),
      );
    });
  }
});

I'm pushing a commit


   describe('equal', function () {
     it('compares two empty strings', async function () {
       expect(await this.mock.$equal('', '')).to.be.true;