Skip to content

PAX Header Parsing Fails in some xattr cap encodings #428

@sleyva-jumptrading

Description

@sleyva-jumptrading

Found this while building a bootc OS image (layers parsed with tar-rs).

thread 'main' panicked at src/main.rs:115:31:
called `Result::unwrap()` on an `Err` value: Custom { kind: Other, error: "malformed pax extension" }
note: run with `RUST_BACKTRACE=1` to display a backtrace

With some debug logging, I see the parser starts a new PAX record in the middle of the previous one because the value contains a newline byte (0x0A):

New PAX Record!
line hex (42 bytes): 35 37 20 53 43 48 49 4c 59 2e 78 61 74 74 72 2e 73 65 63 75 72 69 74 79 2e 63 61 70 61 62 69 6c 69 74 79 3d 01 00 00 02 02 00
PAX record (as text): 57 SCHILY.xattr.security.capability=\x01\x00\x00\x02\x02\x00
line len: 42, reported len (in record): 57

Inside the container I run this:

for f in $(getcap -r / 2>/dev/null | cut -d' ' -f1); do
  echo -n "$f: "
  getfattr -n security.capability --only-values "$f" 2>/dev/null | od -An -tx1 -v | grep -q " 0a " && echo "CONTAINS 0x0A!" 
done

/usr/local/telegraf/bin/telegraf: CONTAINS 0x0A!

Further confirmation:

getfattr -n security.capability --only-values /usr/local/telegraf/bin/telegraf | od -An -tx1 -v | cat -n
  1  01 00 00 02 02 00 0a 00 00 00 00 00 00 00 00 00
  2  00 00 00 00

Removing the file cap makes the image boot:

setcap -r /usr/local/telegraf/bin/telegraf

It looks like tar-rs parses PAX entries using newline-delimited lines. That treats the embedded 0x0A (newline) inside the binary xattr value as the end of the record, so the actual bytes read (42) don’t match the declared length (57), and the parser returns Err("malformed pax extension").

This PAX key is SCHILY.xattr.security.capability (Linux file capabilities). The value is binary and may contain 0x0A. Some tar tools accept such records by trusting the length field and treating the value as opaque bytes.

Proposal:

  • Add a lenient parsing mode that:
    • reads the decimal length first,
    • then reads exactly that many bytes (requiring that the last byte is '\n'),
    • splits keyword and value at the first '=' and allows embedded newlines in the value.
  • Keep the current strict behavior by default if desired (spec-compliant), or guard lenient behavior behind a feature flag.

Happy to work on this, just wanted to see if there was interest prior to me submitting a PR!

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions