Aniketsy
Contributor

This PR adds a check in IntervalIndex.from_arrays to raise a TypeError when integer arrays have mismatched signedness. Includes a unit test to verify the behavior.

Please let me know if my approach or fix needs any improvements. I’m open to feedback and happy to make changes based on suggestions.
Thank you!

@Aniketsy
Contributor Author

@jbrockmendel please review these changes when you get a chance.

    copy: bool = False,
    dtype: Dtype | None = None,
) -> IntervalIndex:
    # Check for mismatched signed/unsigned integer dtypes
Member

i think it would make more sense to do this on the IntervalArray.from_arrays method

Contributor Author

Thanks! I will update changes in IntervalArray.from_arrays

@Aniketsy
Contributor Author

Aniketsy commented Oct 3, 2025

@jbrockmendel I’ve updated this as per your suggestions.

) -> Self:
    # Check for mismatched signed/unsigned integer dtypes
    left_dtype = getattr(left, "dtype", None)
    right_dtype = getattr(right, "dtype", None)
Member

i think putting this after the _maybe_convert_platform_interval calls would be more robust. e.g. if one is a list and the other is uint64?

Also is it just int vs uint we care about, or also e.g. int32 vs int64?
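To illustrate the list-versus-uint64 concern, here is a minimal sketch using NumPy only. It uses `np.asarray` as a stand-in for the conversion step, which is an assumption; the real `_maybe_convert_platform_interval` helper may behave differently.

```python
import numpy as np

left = [0, 1]                             # plain list: no dtype attribute yet
right = np.array([1, 2], dtype="uint64")

# Before conversion, a getattr-based check sees None for the list,
# so any dtype comparison would be skipped.
print(getattr(left, "dtype", None))

# After conversion (np.asarray stands in for the real helper),
# both sides expose a dtype and the mismatch becomes visible.
left_arr = np.asarray(left)
print(left_arr.dtype.kind, right.dtype.kind)
```

Running the check only after both inputs are arrays means the signed/unsigned mismatch cannot slip through just because one input was a list.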

Contributor Author

We are checking int vs uint.
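As a rough sketch of what an int-versus-uint check means in terms of NumPy dtype kinds: the helper name `has_mismatched_signedness` is hypothetical and is not taken from the PR. Note that a kind-based check lets int32 vs int64 through, which is the distinction the question above is asking about.

```python
import numpy as np


def has_mismatched_signedness(left: np.ndarray, right: np.ndarray) -> bool:
    """Hypothetical helper: True when one array is signed and the other unsigned."""
    # dtype.kind is "i" for signed integers and "u" for unsigned integers,
    # regardless of bit width.
    return {left.dtype.kind, right.dtype.kind} == {"i", "u"}


signed = np.array([0, 1], dtype="int64")
unsigned = np.array([1, 2], dtype="uint64")
narrow = np.array([1, 2], dtype="int32")

print(has_mismatched_signedness(signed, unsigned))  # signed vs unsigned
print(has_mismatched_signedness(signed, narrow))    # same kind, different width
```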

Member

is there a reason not to move this to after the _maybe_convert_platform_interval calls?

Contributor Author

No, I have updated it and also added a whatsnew note. Please let me know if this needs improvement.


# Check for mismatched signed/unsigned integer dtypes
left_dtype = getattr(left, "dtype", None)
right_dtype = getattr(right, "dtype", None)
Member

the getattr should be unnecessary. the attribute should always be there now that this is moved to after the _maybe_convert_platform_interval calls

Contributor Author

Thanks, got it. I have updated the code and removed the None check for dtype.

@Aniketsy
Contributor Author

@jbrockmendel could you please check if this CI failure is related to the changes in this PR?

FAILED pandas/tests/internals/test_internals.py::TestBlockManager::test_astype[float16] - SystemError: <method 'astype' of 'numpy.ndarray' objects> returned a result with an exception set

@Aniketsy Aniketsy force-pushed the fix-intervalindex-signedness branch from c756fbc to 5b67c6e Compare October 13, 2025 03:54
@Aniketsy Aniketsy requested a review from rhshadrach as a code owner October 13, 2025 03:54
@jbrockmendel
Member

FAILED pandas/tests/internals/test_internals.py::TestBlockManager::test_astype[float16] - SystemError: <method 'astype' of 'numpy.ndarray' objects> returned a result with an exception set

I'm pretty sure that's affecting all PRs and is unrelated to this.

if (
    left_dtype.kind in "iu"
    and right_dtype.kind in "iu"
    and left_dtype.kind != right_dtype.kind
Member

can't we just compare if left.dtype != right.dtype at this point?

Contributor Author

Yes, this will be clearer. I will update it.

Contributor Author

@jbrockmendel I don't think we should use if left.dtype != right.dtype: the goal is only to reject int64 vs uint64, and the strict comparison causes failures in other cases. Should I revert to the previous check?

Member

what cases? im pretty sure we always want matching dtypes

Contributor Author

=========================== short test summary info ============================
FAILED pandas/tests/indexes/categorical/test_astype.py::TestAstype::test_astype - TypeError: Left and right arrays must have matching dtypes. Got float64 and int64.
FAILED pandas/tests/indexes/interval/test_constructors.py::TestFromArrays::test_mixed_float_int[int64-float64] - TypeError: Left and right arrays must have matching dtypes. Got int64 and float64.
FAILED pandas/tests/indexes/interval/test_constructors.py::TestFromArrays::test_mixed_float_int[float64-int64] - TypeError: Left and right arrays must have matching dtypes. Got float64 and int64.
FAILED 

I got these check failures after applying the changes.

Member

Looks like it is casting mixed int/float to float/float in ensure_simple_new_inputs. So putting this check after that should do the trick

Contributor Author

Sure, that makes sense.

Contributor Author

Just to confirm before applying the changes: should I move this check after ensure_simple_new_inputs, or should I use a strict if left.dtype != right.dtype check after that step?

Member

i think so yes
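The ordering discussed in this thread (cast mixed int/float first, then compare dtypes strictly) can be sketched as follows. This is a NumPy-only illustration under stated assumptions: `coerce_then_check` is a hypothetical name, and the float promotion via `np.result_type` only approximates what `ensure_simple_new_inputs` is described as doing here; it is not the pandas implementation.

```python
import numpy as np


def coerce_then_check(left, right):
    """Hypothetical sketch: promote int/float pairs to float, then require equal dtypes."""
    left = np.asarray(left)
    right = np.asarray(right)

    # Assumed stand-in for the casting step: only mixed integer/float pairs
    # are promoted. Signed/unsigned pairs are deliberately left alone,
    # since NumPy would otherwise promote int64/uint64 to float64.
    kinds = {left.dtype.kind, right.dtype.kind}
    if kinds in ({"i", "f"}, {"u", "f"}):
        common = np.result_type(left.dtype, right.dtype)
        left = left.astype(common)
        right = right.astype(common)

    # Strict comparison, run only after the casting step.
    if left.dtype != right.dtype:
        raise TypeError(
            "Left and right arrays must have matching dtypes. "
            f"Got {left.dtype} and {right.dtype}."
        )
    return left, right


# Mixed int/float is accepted because it is cast before the check.
l, r = coerce_then_check(np.array([0, 1], dtype="int64"), np.array([1.0, 2.0]))
print(l.dtype, r.dtype)

# Signed vs unsigned still fails the strict check.
try:
    coerce_then_check(
        np.array([0, 1], dtype="int64"), np.array([1, 2], dtype="uint64")
    )
except TypeError as err:
    print(err)
```

With this ordering the strict comparison no longer trips on the int64/float64 cases from the test summary above, but it would also reject int32 vs int64, which a kind-based check would allow.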

dtype=dtype,
)

# Check for mismatched signed/unsigned integer dtypes after casting
Member

im confused as to why this affects from_arrays, since that goes through simple_new and not __new__

Successfully merging this pull request may close these issues.

BUG: IntervalIndex.from_arrays with int64 vs uint64 arrays

2 participants