-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Description
Is your feature request related to a problem or challenge?
At the moment ballista overrides LogicalExtensionCodec::try_decode_file_format & LogicalExtensionCodec::try_encode_file_format providing support for:
file_format_codecs: vec![
Arc::new(ParquetLogicalExtensionCodec {}),
Arc::new(CsvLogicalExtensionCodec {}),
Arc::new(JsonLogicalExtensionCodec {}),
Arc::new(ArrowLogicalExtensionCodec {}),
Arc::new(AvroLogicalExtensionCodec {}),
],as seen at 1.
Should we want to integrate ballista with datafusion python we would need to provide a custom LogicalExtensionCodec implementing same logic or reuse ballista LogicalExtensionCodec implementation, which would complicate integration a bit.
As this file types are supported out of the box in datafusion would it make sense to implement encoder/decoder or them in DefaultLogicalExtensionCodec?
Describe the solution you'd like
Should this proposal make sense, implement support for it in DefaultLogicalExtensionCodec similar to what is supported in ballista already:
fn try_decode_file_format(
&self,
buf: &[u8],
ctx: &datafusion::prelude::SessionContext,
) -> Result<Arc<dyn datafusion::datasource::file_format::FileFormatFactory>> {
let proto = FileFormatProto::decode(buf)
.map_err(|e| DataFusionError::Internal(e.to_string()))?;
let codec = self
.file_format_codecs
.get(proto.encoder_position as usize)
.ok_or(DataFusionError::Internal(
"Can't find required codec in file codec list".to_owned(),
))?;
codec.try_decode_file_format(&proto.blob, ctx)
}
fn try_encode_file_format(
&self,
buf: &mut Vec<u8>,
node: Arc<dyn datafusion::datasource::file_format::FileFormatFactory>,
) -> Result<()> {
let mut blob = vec![];
let (encoder_position, _) =
self.try_any(|codec| codec.try_encode_file_format(&mut blob, node.clone()))?;
let proto = FileFormatProto {
encoder_position,
blob,
};
proto
.encode(buf)
.map_err(|e| DataFusionError::Internal(e.to_string()))
}as seen at 2
Describe alternatives you've considered
alternative would be to keep everything as it is.
Additional context
No response