Skip to content

Integer overflow in Arrow IPC bounds check #1478

@swar09

Description

@swar09

Summary
There is a bug in the zero-copy Arrow IPC deserializer. The pointer validation logic in ArrowTypeInfoExt::from_array is vulnerable to an integer overflow, which allows a node to inadvertently construct out-of-bounds ArrayData from shared memory.

Motivation & Impact
dora-rs relies on zero-copy shared memory for high-performance IPC. The dora-core deserializer reads the buffer lengths provided by the publisher over the IPC channel. Because the current bounds check uses wrapping addition, an exceptionally large b.len() sent by a buggy or misconfigured upstream node will bypass the boundary check. This leads to out-of-bounds memory reads in downstream operators or causes the dora-daemon to panic due to a stack overflow, taking down the dataflow.

The Issue
In ArrowTypeInfoExt::from_array (libraries/core/src/metadata.rs, lines 72-76), the validation math uses an unchecked + operator:

if ptr as usize + b.len() > region_start as usize + region_len {
    eyre::bail!("ptr ends after region");
}

If b.len() is excessively large (e.g., near usize::MAX), ptr + b.len() wraps to a small positive integer. The condition evaluates to false, treating a wildly out-of-bounds length as perfectly valid.

How to Reproduce

  1. Create a minimal Dora node that emits a dora_message with an inflated BufferOffset.
let mut type_info = ArrowTypeInfo::empty();
type_info.buffer_offsets = vec![BufferOffset { offset: 0, len: usize::MAX - 2048 }];
node.send_output_sample("test_output".into(), type_info, Default::default(), Some(sample))?;
  1. Set up a simple topology in dataflow.yml routing test_output into a standard downstream node.
nodes:
  - id: sender
    build: cargo build -p test_node --bin sender
    path: target/debug/sender.exe
    outputs:
      - test_output

  - id: receiver
    build: cargo build -p test_node --bin receiver
    path: target/debug/receiver.exe
    inputs:
      test_input: sender/test_output
  1. Run dora up then dora start dataflow.yml --attach.

Error
The un-validated length causes the upstream consumer to hard crash and sever the IPC connection:

thread 'main' (4076) has overflowed its stack

[ERROR]
start dataflow failed: RPC transport error
Caused by:
   0: the channel was disconnected due to a critical error
   1: could not read from the transport
   2: An existing connection was forcibly closed by the remote host. (os error 10054)

Use standard Rust checked_add bounds verification inside the unsafe block:

let end_ptr = (ptr as usize).checked_add(b.len())
    .ok_or_else(|| eyre::eyre!("Arrow buffer length integer overflow"))?;
let region_end = (region_start as usize).checked_add(region_len)
    .ok_or_else(|| eyre::eyre!("Region length integer overflow"))?;

if end_ptr > region_end {
    eyre::bail!("ptr ends after region");
}

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions