Skip to content
Open
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
36 changes: 36 additions & 0 deletions srsly/tests/test_yaml_api.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@

from .._yaml_api import yaml_dumps, yaml_loads, read_yaml, write_yaml
from .._yaml_api import is_yaml_serializable
from ruamel.yaml import YAML
from ruamel.yaml.comments import CommentedMap
from .util import make_tempdir

Expand Down Expand Up @@ -90,3 +91,38 @@ def test_is_yaml_serializable(obj, expected):
assert is_yaml_serializable(obj) == expected
# Check again to be sure it's consistent
assert is_yaml_serializable(obj) == expected


class Malicious:
init_count = 0

def __new__(cls):
cls.init_count += 1
return object.__new__(cls)


def test_yaml_safe():
"""Old versions of PyYAML / ruamel.yaml were unsafe by default,
with `yaml.load` allowing arbitrary code execution.
Test that srsly does not allow deserializing arbitrary Python objects.
"""

m = Malicious()
buf = StringIO()
yaml = YAML(typ="full")
yaml.dump(m, buf)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be a little more convincing if this test used srsly.write_yaml or srsly.yaml_dumps instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neither of those functions, by design, allow you to serialize an arbitrary python object.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A test demonstrating that would be more useful than what's here IMO. I searched and there are zero uses of srsly.ruamel_yaml in the explosion stack outside srsly itself. Testing the public API is what's important.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test does not test srsly.ruamel_yaml.

This test verifies that a YAML-legal malicious payload, which can only be crafted either with unwrapped ruamel.yaml.YAML or PyYAML, cannot cause arbitrary code execution in srly.yaml_load.

I've now added a test that shows that you get a meaningful error message in srsly.write_yaml when trying to write arbitrary objects, and removed the lines that use ruamel.yaml.YAML() to load the payload back in order to prevent confusion.

I searched and there are zero uses of srsly.ruamel_yaml in the explosion stack outside srsly itself.

Are you suggesting to not introduce the dummy srsly/ruamel_yaml.py etc. backwards compatibiity modules, incurring in a slight risk that third-party packages may (trivially) break?

payload = buf.getvalue()
assert payload.startswith("!!python/object:")

prev_count = Malicious.init_count
with pytest.raises(ValueError, match="python/object"):
yaml_loads(payload)
# No arbitrary code execution happened
assert Malicious.init_count == prev_count

# By default, even the wrapped ruamel.yaml does not allow it;
# you need to explicitly opt in with `typ='unsafe'` (deprecated).
yaml2 = YAML()
buf.seek(0)
_ = yaml2.load(buf)
assert Malicious.init_count == prev_count