PS-11073 [8.0]: Fix underestimations in dict_index_node_ptr_max_size() #5945

Open
polchawa-percona wants to merge 1 commit into percona:8.0 from polchawa-percona:PS-11073-8.0

@polchawa-percona polchawa-percona commented May 13, 2026

https://perconadev.atlassian.net/browse/PS-11073

  • Spatial indexes: the first field stores the MBR (DATA_MBR_LEN bytes), not the original geometry payload. Without this special case, the generic path uses dict_col_get_max_size(), which returns ULINT_MAX for DATA_GEOMETRY, so rec_max_size wraps around and the resulting node_ptr_max_size estimate is too small.

  • The fast path was incorrectly taken whenever dict_col_get_fixed_size() returned non-zero. When dict_index_add_col() has zeroed field->fixed_len (because fixed_len > DICT_MAX_FIXED_COL_LEN = 768, e.g. CHAR(255) utf32 = 1020 bytes), the field is actually variable-length-encoded and needs 1-2 extra header bytes that the fast path skipped. The fast path is now taken only when field->fixed_len != 0.

  • Added a sanity ut_ad(field_max_size != ULINT_MAX) on the slow path.

  • A debug-only runtime cross-check is added in btr_cur_search_to_nth_level() under the new DBUG keyword "check_node_ptr_size_estimation": when the tree is latched in BTR_MODIFY_TREE mode, the actual rec_offs_size(node_ptr_offsets) must not exceed the estimated node_ptr_max_size. Tests enable this DBUG keyword to verify that the estimator is a true upper bound.
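
To make the two underestimations concrete, here is a minimal, self-contained sketch of the estimator logic before and after the fix. It is a toy model, not the InnoDB code: `ToyField`, `make_field`, `length_header_bytes`, `buggy_estimate` and `fixed_estimate` are hypothetical names, `kMbrLen = 32` is an assumed value for DATA_MBR_LEN, the "1 byte below 256, otherwise 2" length-header rule is a simplification of the "1-2 extra header bytes" mentioned above, and record headers and the child page number are left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Toy stand-ins for illustration only; none of these are InnoDB types.
constexpr std::size_t kUnknownMax =
    std::numeric_limits<std::size_t>::max();  // plays the role of ULINT_MAX
constexpr std::size_t kMaxFixedColLen = 768;  // DICT_MAX_FIXED_COL_LEN (from the PR text)
constexpr std::size_t kMbrLen = 32;           // ASSUMED value for DATA_MBR_LEN

struct ToyField {
  std::size_t col_fixed_size;  // column-level fixed size, 0 if variable
  std::size_t col_max_size;    // column-level max size, kUnknownMax for geometry
  std::size_t fixed_len;       // index-field fixed_len, zeroed when too large
};

// Mimics the zeroing of field->fixed_len described for dict_index_add_col().
ToyField make_field(std::size_t col_fixed_size, std::size_t col_max_size) {
  const std::size_t fixed_len =
      col_fixed_size > kMaxFixedColLen ? 0 : col_fixed_size;
  return ToyField{col_fixed_size, col_max_size, fixed_len};
}

// Simplified assumption: 1 length byte for short fields, 2 otherwise.
std::size_t length_header_bytes(std::size_t max_size) {
  return max_size < 256 ? 1 : 2;
}

// Old behaviour: fast path keyed on the column's fixed size, no spatial case.
std::size_t buggy_estimate(const std::vector<ToyField>& fields) {
  std::size_t total = 0;
  for (const ToyField& f : fields) {
    if (f.col_fixed_size != 0) {
      total += f.col_fixed_size;  // skips the length bytes a zeroed fixed_len needs
      continue;
    }
    // Unsigned addition wraps silently when col_max_size is kUnknownMax.
    total += f.col_max_size + length_header_bytes(f.col_max_size);
  }
  return total;
}

// New behaviour: MBR special case, fast path keyed on field->fixed_len.
std::size_t fixed_estimate(const std::vector<ToyField>& fields, bool is_spatial) {
  std::size_t total = 0;
  for (std::size_t i = 0; i < fields.size(); ++i) {
    const ToyField& f = fields[i];
    if (is_spatial && i == 0) {
      total += kMbrLen;  // first field of a spatial index holds the MBR
      continue;
    }
    if (f.fixed_len != 0) {
      total += f.fixed_len;  // genuinely fixed-length field
      continue;
    }
    assert(f.col_max_size != kUnknownMax);  // mirrors the added ut_ad()
    total += f.col_max_size + length_header_bytes(f.col_max_size);
  }
  return total;
}
```

In this toy model a single CHAR(255) utf32 field (`make_field(1020, 1020)`) comes out at 1020 bytes under the old rule and 1022 under the new one, which is the missing 2-byte length header. A spatial index whose first field reports `kUnknownMax` wraps to a tiny total under the old rule, while the new rule counts `kMbrLen` for it.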

@polchawa-percona polchawa-percona self-assigned this May 13, 2026
@polchawa-percona polchawa-percona force-pushed the PS-11073-8.0 branch 3 times, most recently from de198bc to b410041 May 13, 2026 09:49
@polchawa-percona polchawa-percona marked this pull request as ready for review May 13, 2026 09:50
… utf32 PK indexes

https://perconadev.atlassian.net/browse/PS-11073

* Spatial indexes: the first field stores the MBR (DATA_MBR_LEN bytes),
  not the original geometry payload.  Without this special case the
  generic path uses dict_col_get_max_size() which returns ULINT_MAX for
  DATA_GEOMETRY, causing rec_max_size to wrap around to a too-small
  node_ptr_max_size estimate.

* Fast path was incorrectly taken whenever dict_col_get_fixed_size()
  returned non-zero. When dict_index_add_col() has zeroed
  field->fixed_len (because fixed_len > DICT_MAX_FIXED_COL_LEN = 768,
  e.g. CHAR(255) utf32 = 1020 bytes), the field is actually variable-
  length-encoded and needs 1-2 extra header bytes that the fast path
  skipped. Take the fast path only when field->fixed_len != 0.

* Added a sanity ut_ad(field_max_size != ULINT_MAX) on the slow path.

A debug-only runtime cross-check is added in btr_cur_search_to_nth_level()
under the new DBUG keyword "check_node_ptr_size_estimation": when latched
in BTR_MODIFY_TREE the actual rec_offs_size(node_ptr_offsets) must not
exceed the estimated node_ptr_max_size.  Tests enable this DBUG keyword
to verify the estimator is a true upper bound.
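
The shape of the debug cross-check can be sketched as follows. This is only an illustration under stated assumptions: `ToyDebugKeywords`, `ToyLatchMode`, `node_ptr_estimate_holds` and `debug_check_node_ptr_size` are hypothetical stand-ins, the sketch does not use the real DBUG facility, and the actual check lives in btr_cur_search_to_nth_level(). Only the keyword name and the "actual size must not exceed the estimate" condition come from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical stand-in for a debug-keyword registry (NOT the real DBUG API).
struct ToyDebugKeywords {
  std::set<std::string> enabled;
  bool is_set(const std::string& keyword) const {
    return enabled.count(keyword) != 0;
  }
};

// Hypothetical latch modes; only the tree-modifying mode triggers the check.
enum class ToyLatchMode { kModifyLeaf, kModifyTree };

// Returns true when the check passes or does not apply.
bool node_ptr_estimate_holds(const ToyDebugKeywords& dbg, ToyLatchMode mode,
                             std::size_t actual_node_ptr_size,
                             std::size_t node_ptr_max_size) {
  if (!dbg.is_set("check_node_ptr_size_estimation")) {
    return true;  // keyword not enabled: nothing is checked
  }
  if (mode != ToyLatchMode::kModifyTree) {
    return true;  // only checked when the tree is latched for modification
  }
  // The estimator must be a true upper bound on the real node pointer size.
  return actual_node_ptr_size <= node_ptr_max_size;
}

// Debug-build style wrapper: aborts if the estimate was too small.
void debug_check_node_ptr_size(const ToyDebugKeywords& dbg, ToyLatchMode mode,
                               std::size_t actual_node_ptr_size,
                               std::size_t node_ptr_max_size) {
  assert(node_ptr_estimate_holds(dbg, mode, actual_node_ptr_size,
                                 node_ptr_max_size));
}
```

Gating the check behind a keyword keeps it out of normal debug runs, so only tests that opt in pay for it and can fail on an underestimate.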