Skip to content

Conversation

@nickva
Copy link
Contributor

@nickva nickva commented Apr 2, 2021

Previously, mango tests with objects as keys were failing on CentOS 6 and CentOS 7. The reason for the failures was that old libicu collation algorithms didn't consider the <<255,255,255,255>> as the highest sortable string as CouchDB intends it to be. Later versions of libicu, at least as old as 59, started to do that. However, as long as we support CentOS 7 we can fix the issue by explicitly checking for the highest marker.

This was referenced Apr 2, 2021
@bessbd
Copy link
Member

bessbd commented Apr 2, 2021

Thank you for this PR, @nickva !
When I opened 4bccaa5 , my thought was that the sentinel value(s) should never actually be passed to "low level" code. Ie. it's a nice thing that those sentinels could actually be items that "order well", but they aren't items that are to be compared with others.
Having said that, I'd be +1 on this change, too.

What do you think?

@nickva
Copy link
Contributor Author

nickva commented Apr 3, 2021

I left a more complete response in #3488 (comment) but the idea is it's a bit more general and works along the lines of how current libicu version does (treating FFFF as special), so when we stop supporting CentOS 7 we can remove just that one max handling section and let libicu do the right thing.

@nickva nickva force-pushed the fix-centos-7-icu-collation-issue branch 3 times, most recently from b7f9e68 to 30a60bb Compare April 5, 2021 12:55
Previously, mango tests with objects as keys were failing on CentOS 6 and
CentOS 7. The reason for the failures was that old libicu collation algorithms
didn't consider the `<<255,255,255,255>>` as the highest sortable string as
CouchDB intends it to be. Later versions of libicu, at least as old as 59,
started to do that
https://www.unicode.org/reports/tr35/tr35-collation.html#tailored_noncharacter_weights.
However, as long as we support CentOS 7 we can fix the issue by explicitly
checkign for the highest marker.
@nickva nickva force-pushed the fix-centos-7-icu-collation-issue branch from 30a60bb to b651dd6 Compare April 5, 2021 13:21
@bessbd
Copy link
Member

bessbd commented Apr 5, 2021

I left a more complete response in #3488 (comment) but the idea is it's a bit more general and works along the lines of how current libicu version does (treating FFFF as special), so when we stop supporting CentOS 7 we can remove just that one max handling section and let libicu do the right thing.

I still don't get why (lower level) C code should be aware of a specific higher level (marker) constant (in erlang - caller code). - https://github.com/apache/couchdb/pull/3490/files#diff-e07aa2cf62bca24d714bb7b5d125f77a906db4145a60672e532e27a324b7dcf7R69-R73 . Sounds like a Separation of concerns violation.

Nevertheless, I'm confident that we're better off with this PR merged than without. +1

Copy link
Member

@davisp davisp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@nickva nickva merged commit f6f81be into main Apr 5, 2021
@nickva nickva deleted the fix-centos-7-icu-collation-issue branch April 5, 2021 18:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants