Skip to content

Conversation

@nrc
Copy link
Contributor

@nrc nrc commented Jul 16, 2024

Which issue does this PR close?

Addresses apache/datafusion#6327

Mentioned in #4969.

Closes #6113

Rationale for this change

Will support using extract with intervals in DataFusion. It's also desirable functionality in Arrow for it's own sake, although I don't have a directly motivating use case.

Are there any user-facing changes?

The date_part function will now work with intervals

@github-actions github-actions bot added the arrow Changes to the arrow crate label Jul 16, 2024
@nrc
Copy link
Contributor Author

nrc commented Jul 16, 2024

apache/datafusion#11501 demonstrates how this change supports extract on intervals in DataFusion.

@nrc
Copy link
Contributor Author

nrc commented Jul 22, 2024

@alamb @Jefffrey would either of you be able to review this PR please? (Sorry if I'm missing process for requesting a review or you're not the best folk, it's my first Arrow PR).

@alamb
Copy link
Contributor

alamb commented Jul 25, 2024

@alamb @Jefffrey would either of you be able to review this PR please? (Sorry if I'm missing process for requesting a review or you're not the best folk, it's my first Arrow PR).

No problem @nrc -- I am sorry for the delay in reviewing. I think @tustvold has less time to review PRs than he used to, and the rest of us are just trying to work through reviews as much as possible. Sadly, review bandwidth is almost always our most limited resource

I am checking it out now

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @nrc -- I think this PR looks really nice . Sorry for the delay in reviewing.

impl ExtractDatePartExt for PrimitiveArray<IntervalYearMonthType> {
fn date_part(&self, part: DatePart) -> Result<Int32Array, ArrowError> {
match part {
DatePart::Year => Ok(self.unary_opt(|d| Some(d / 12))),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense as IntervalYearMonth stores the number of whole months: https://docs.rs/arrow/latest/arrow/datatypes/enum.IntervalUnit.html

👍

ensure_returns_error(&Time64NanosecondArray::from(vec![0]));
}

// IntervalDayTimeType week, day, hour, miute, second, mili, u, nano; invalid month, year; ignores the other part
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these comments seem like they belong with the other tests test_interval_day_time_array and test_interval_month_day_nano_array 🤔

assert_eq!(0, actual.value(0));
assert_eq!(42, actual.value(1));
assert_eq!(1042, actual.value(2));
assert_eq!(MILLISECONDS_IN_DAY as i32 + 1, actual.value(3));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it somewhat strange that only for MILLISECONDS_IN_DAY we have a defined constant 🤔

assert_eq!(10, actual.value(1));
assert_eq!(35, actual.value(2));

// Times follow from nanos, but are not affected by months or dats.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Times follow from nanos, but are not affected by months or dats.
// Times follow from nanos, but are not affected by months or days.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proposed to fix in #6140

@alamb
Copy link
Contributor

alamb commented Jul 25, 2024

I took the liberty of merging up from main to pick up #6116

@alamb
Copy link
Contributor

alamb commented Jul 25, 2024

FYI @Jefffrey and @tustvold -- I'll plan to merge this tomorrow unless there are any concerns or anyone would like more time to reivew

@Dandandan Dandandan changed the title Implement data_part for intervals Implement date_part for intervals Jul 26, 2024
@alamb
Copy link
Contributor

alamb commented Jul 27, 2024

Thanks again @nrc

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

arrow Changes to the arrow crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Implement date_part for Interval

2 participants