Skip to content
Prev Previous commit
Next Next commit
fixing typos and lints
  • Loading branch information
Alon Agmon committed Aug 28, 2024
commit afc9c13b8da05ec764d183dc3a355952ab0c5ae6
6 changes: 3 additions & 3 deletions crates/integrations/datafusion/src/physical_plan/scan.rs
Original file line number Diff line number Diff line change
Expand Up @@ -182,7 +182,6 @@ fn convert_filters_to_predicate(filters: &[Expr]) -> Option<Predicate> {
///
/// * `Some(Predicate)` if the expression could be successfully converted.
/// * `None` if the expression couldn't be converted to an Iceberg predicate.
///
fn expr_to_predicate(expr: &Expr) -> Option<Predicate> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could use visitor pattern to make code more cleaner to avoid code bomb if supporting more expressions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. I'm not a DataFusion expert but I think this makes sense being implemented as a visitor of Expr, probably by implementing a https://docs.rs/datafusion-common/41.0.0/datafusion_common/tree_node/trait.TreeNodeVisitor.html

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @sdd, @FANNG1 , I appriciate it
I will try to refactor this, though I am wondering whether Visitor will indeed be the most suitable. Let me know what you think, but I think visitor shines when we have to run different logic on different kinds of elements (chained in some way) while we want to keep the logic in one place - i.e., the Visitor. whereas here we have one kind of element - Expr - which is an enum that can be deconstructed in different ways, for example -

Expr::Column(col), op, Expr::Literal(lit)
OR 
Expr::BinaryExpr(left_expr), Operator::Or, Expr::BinaryExpr(right_expr)

so what Im trying to say is that using visitor will simply move the matching complexity to another place - to the visitor.
Does this make sense?
I will continue to try and refactor this but please let me know what you think

match expr {
Expr::BinaryExpr(BinaryExpr { left, op, right }) => {
Expand All @@ -206,7 +205,7 @@ fn expr_to_predicate(expr: &Expr) -> Option<Predicate> {
}
}
// Third option arm (inner OR), e.g. x < 1 OR y > 10
// if one is unsuported, we fail the predicate
// if one is unsupported, we fail the predicate
(Expr::BinaryExpr(left_expr), Operator::Or, Expr::BinaryExpr(right_expr)) => {
let left_pred = expr_to_predicate(&Expr::BinaryExpr(left_expr.clone()))?;
let right_pred = expr_to_predicate(&Expr::BinaryExpr(right_expr.clone()))?;
Expand Down Expand Up @@ -248,11 +247,12 @@ fn scalar_value_to_datum(value: &ScalarValue) -> Option<Datum> {

#[cfg(test)]
mod tests {
use super::*;
use datafusion::arrow::datatypes::{DataType, Field, Schema};
use datafusion::common::DFSchema;
use datafusion::prelude::SessionContext;

use super::*;

fn create_test_schema() -> DFSchema {
let arrow_schema = Schema::new(vec![
Field::new("foo", DataType::Int32, false),
Expand Down
6 changes: 3 additions & 3 deletions crates/integrations/datafusion/src/table.rs
Original file line number Diff line number Diff line change
Expand Up @@ -18,19 +18,19 @@
use std::any::Any;
use std::sync::Arc;

use crate::physical_plan::scan::IcebergTableScan;
use async_trait::async_trait;
use datafusion::arrow::datatypes::SchemaRef as ArrowSchemaRef;
use datafusion::catalog::Session;
use datafusion::datasource::{TableProvider, TableType};
use datafusion::error::Result as DFResult;
use datafusion::logical_expr::Expr;
use datafusion::logical_expr::{BinaryExpr, TableProviderFilterPushDown};
use datafusion::logical_expr::{BinaryExpr, Expr, TableProviderFilterPushDown};
use datafusion::physical_plan::ExecutionPlan;
use iceberg::arrow::schema_to_arrow_schema;
use iceberg::table::Table;
use iceberg::{Catalog, NamespaceIdent, Result, TableIdent};

use crate::physical_plan::scan::IcebergTableScan;

/// Represents a [`TableProvider`] for the Iceberg [`Catalog`],
/// managing access to a [`Table`].
pub(crate) struct IcebergTableProvider {
Expand Down