Summary
There is a bug in the zero-copy Arrow IPC deserializer. The pointer validation logic in ArrowTypeInfoExt::from_array is vulnerable to an integer overflow, which allows a node to inadvertently construct out-of-bounds ArrayData from shared memory.
Motivation & Impact
dora-rs relies on zero-copy shared memory for high-performance IPC. The dora-core deserializer reads the buffer lengths provided by the publisher over the IPC channel. Because the current bounds check uses unchecked addition that can wrap around, an exceptionally large b.len() sent by a buggy or misconfigured upstream node will bypass the boundary check. This leads to out-of-bounds memory reads in downstream operators, or crashes the dora-daemon with a stack overflow, taking down the dataflow.
The Issue
In ArrowTypeInfoExt::from_array (libraries/core/src/metadata.rs, lines 72-76), the validation math uses an unchecked + operator:
if ptr as usize + b.len() > region_start as usize + region_len {
    eyre::bail!("ptr ends after region");
}
If b.len() is excessively large (e.g., near usize::MAX) and overflow checks are disabled (the default in release builds), ptr + b.len() wraps around to a small value. The condition then evaluates to false, so a wildly out-of-bounds length is accepted as valid.
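The wraparound can be illustrated in isolation. The following is a minimal sketch, not the dora-core code: the function names `wrapped_end` and `buggy_check_rejects` are hypothetical, plain `usize` values stand in for the pointers, and `wrapping_add` is used so the wrapping behaviour is explicit regardless of build profile.

```rust
/// Hypothetical helper: the end address as computed by an addition that
/// wraps on overflow (made explicit here with `wrapping_add`).
fn wrapped_end(ptr: usize, len: usize) -> usize {
    ptr.wrapping_add(len)
}

/// Hypothetical stand-in for the comparison in the issue. Returns true
/// when the check would bail with "ptr ends after region".
fn buggy_check_rejects(ptr: usize, len: usize, region_start: usize, region_len: usize) -> bool {
    wrapped_end(ptr, len) > region_start.wrapping_add(region_len)
}
```

With a region starting at 0x1000 and 4096 bytes long, a length of `usize::MAX - 2048` makes the computed end wrap to 2047, which is below the region end, so the check does not reject it. An ordinary oversized length such as 5000 is still rejected.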
How to Reproduce
- Create a minimal Dora node that emits a dora_message with an inflated BufferOffset.
let mut type_info = ArrowTypeInfo::empty();
type_info.buffer_offsets = vec![BufferOffset { offset: 0, len: usize::MAX - 2048 }];
node.send_output_sample("test_output".into(), type_info, Default::default(), Some(sample))?;
- Set up a simple topology in dataflow.yml routing test_output into a standard downstream node.
nodes:
  - id: sender
    build: cargo build -p test_node --bin sender
    path: target/debug/sender.exe
    outputs:
      - test_output
  - id: receiver
    build: cargo build -p test_node --bin receiver
    path: target/debug/receiver.exe
    inputs:
      test_input: sender/test_output
- Run dora up then dora start dataflow.yml --attach.
Error
The unvalidated length causes the downstream consumer to hard-crash and sever the IPC connection:
thread 'main' (4076) has overflowed its stack
[ERROR]
start dataflow failed: RPC transport error
Caused by:
0: the channel was disconnected due to a critical error
1: could not read from the transport
2: An existing connection was forcibly closed by the remote host. (os error 10054)
Proposed Fix
Use Rust's checked_add to verify the bounds inside the unsafe block:
let end_ptr = (ptr as usize)
    .checked_add(b.len())
    .ok_or_else(|| eyre::eyre!("Arrow buffer length integer overflow"))?;
let region_end = (region_start as usize)
    .checked_add(region_len)
    .ok_or_else(|| eyre::eyre!("Region length integer overflow"))?;
if end_ptr > region_end {
    eyre::bail!("ptr ends after region");
}
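To make the proposed check easy to test on its own, it could be factored into a small helper. This is only a sketch under stated assumptions: `check_buffer_end` is a hypothetical name, it takes plain `usize` addresses instead of raw pointers, and it returns `String` errors in place of `eyre` so the snippet has no external dependencies.

```rust
/// Hypothetical helper mirroring the proposed fix. Returns the buffer's end
/// address if `ptr + len` neither overflows nor runs past the region end.
fn check_buffer_end(
    ptr: usize,
    len: usize,
    region_start: usize,
    region_len: usize,
) -> Result<usize, String> {
    // Overflow in the buffer end is treated as an error instead of wrapping.
    let end_ptr = ptr
        .checked_add(len)
        .ok_or_else(|| "Arrow buffer length integer overflow".to_string())?;
    // The region end is computed with the same checked arithmetic.
    let region_end = region_start
        .checked_add(region_len)
        .ok_or_else(|| "Region length integer overflow".to_string())?;
    if end_ptr > region_end {
        return Err("ptr ends after region".to_string());
    }
    Ok(end_ptr)
}
```

With this shape, the inflated length from the reproduction (`usize::MAX - 2048`) produces the overflow error, while a buffer that ends exactly at the region boundary is still accepted.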