fix: use safe deserialization and tarfile extraction by gn00295120 · Pull Request #70 · NVIDIA/personaplex

gn00295120 · 2026-03-22T00:52:04Z

Summary

Add weights_only=True to all torch.load() calls to prevent arbitrary code execution via pickle deserialization (CVE-2025-32434, CWE-502)
Add filter='data' to all tarfile.extractall() calls to prevent path traversal attacks (CVE-2007-4559, CWE-22)

Files Changed

| File | Fix |
| --- | --- |
| `moshi/moshi/models/lm.py` line ~979 | `torch.load(..., weights_only=True)` |
| `moshi/moshi/models/loaders.py` lines ~160, ~217, ~295 | `torch.load(..., weights_only=True)` |
| `moshi/moshi/server.py` lines ~333, ~349 | `tarfile.extractall(..., filter='data')` |
| `moshi/moshi/offline.py` line ~145 | `tarfile.extractall(..., filter='data')` |

Security Impact

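torch.load() without weights_only=True deserializes arbitrary Python objects via pickle, so loading a malicious checkpoint file can execute attacker-controlled code. The weights_only=True flag restricts unpickling to tensors, primitive types, and an allowlist of safe types. CVE-2025-32434 describes a bypass of weights_only=True in PyTorch 2.5.1 and earlier, so the flag is a reliable mitigation only on PyTorch 2.6.0 or later.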
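To make the mechanism concrete, here is a minimal standard-library sketch of why plain unpickling is dangerous and how an allowlisting unpickler blocks it. It does not use PyTorch and is not PyTorch's implementation of weights_only; the names `Payload`, `AllowlistUnpickler`, `restricted_loads`, and `demo` are hypothetical, and the payload only creates a directory inside a temporary folder.

```python
import io
import os
import pickle
import tempfile
from collections import OrderedDict


class Payload:
    """Object whose pickle stream tells the unpickler to call os.mkdir(path)."""

    def __init__(self, path: str) -> None:
        self.path = path

    def __reduce__(self):
        # On load, pickle calls the returned callable with the returned args.
        return (os.mkdir, (self.path,))


class AllowlistUnpickler(pickle.Unpickler):
    """Resolve only explicitly allowlisted globals; reject everything else."""

    ALLOWED = {("collections", "OrderedDict")}

    def find_class(self, module, name):
        if (module, name) in self.ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    return AllowlistUnpickler(io.BytesIO(data)).load()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        marker = os.path.join(tmp, "created-by-payload")
        blob = pickle.dumps(Payload(marker))

        # The restricted loader refuses to resolve the os.mkdir global.
        try:
            restricted_loads(blob)
            blocked = False
        except pickle.UnpicklingError:
            blocked = True
        ran_restricted = os.path.exists(marker)

        # The plain loader runs the payload as a side effect of loading.
        pickle.loads(blob)
        ran_plain = os.path.exists(marker)

        # Plain containers of primitives still load through the allowlist.
        benign = restricted_loads(pickle.dumps(OrderedDict(weight=[1.0, 2.0])))
    return blocked, ran_restricted, ran_plain, benign
```

The same idea, applied with a much larger and tensor-aware allowlist, is what makes a weights-only load safer than a full pickle load.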

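tarfile.extractall() without filter='data' can be exploited via crafted archives containing ../ path components (tar slip / directory traversal). The filter='data' argument (added in Python 3.12 and backported to the security releases 3.8.17, 3.9.17, 3.10.12, and 3.11.4) neutralizes absolute paths, rejects members and links that would land outside the destination, refuses special files such as devices, and drops risky metadata such as setuid bits.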

Test Plan

Verify model loading still works with weights_only=True on standard .pt checkpoints
Verify tarfile extraction of voices.tgz and dist.tgz succeeds with filter='data'
Confirm no regressions in existing test suite

Copilot

Pull request overview

This PR hardens model loading and archive extraction paths to reduce the risk of unsafe deserialization (PyTorch checkpoints) and tar path traversal during asset extraction.

Changes:

Add weights_only=True to torch.load() calls across model-loading codepaths.
Add filter='data' to tarfile.extractall() calls when extracting voices.tgz and dist.tgz.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `moshi/moshi/server.py` | Uses `tarfile.extractall(..., filter='data')` when extracting downloaded archives. |
| `moshi/moshi/offline.py` | Uses `tarfile.extractall(..., filter='data')` when extracting voice prompt archives. |
| `moshi/moshi/models/loaders.py` | Loads PyTorch checkpoints with `weights_only=True` for safer deserialization. |
| `moshi/moshi/models/lm.py` | Loads saved voice prompt embeddings with `weights_only=True`. |

Copilot · 2026-03-22T00:54:44Z

moshi/moshi/server.py

        logger.info(f"extracting {voices_tgz} to {voices_dir}")
        with tarfile.open(voices_tgz, "r:gz") as tar:
-            tar.extractall(path=voices_tgz.parent)
+            tar.extractall(path=voices_tgz.parent, filter='data')


tarfile.TarFile.extractall() accepts the filter kwarg only on Python 3.12+ and on the security releases that backported it (3.10.12+, 3.11.4+). This repo declares requires-python = ">= 3.10", so on earlier 3.10/3.11 patch releases this call raises TypeError: extractall() got an unexpected keyword argument 'filter' at runtime. Consider either (a) raising the minimum supported Python to 3.12 (and documenting it), or (b) feature-detecting support for filter (e.g., via hasattr(tarfile, 'data_filter') or inspect.signature) and falling back to a manual “safe extract” implementation that rejects absolute paths, .. traversal, and symlinks/hardlinks.

Copilot · 2026-03-22T00:54:44Z

moshi/moshi/server.py

        if not dist.exists():
            with tarfile.open(dist_tgz, "r:gz") as tar:
-                tar.extractall(path=dist_tgz.parent)
+                tar.extractall(path=dist_tgz.parent, filter='data')


Same compatibility issue as above: passing filter='data' to extractall() raises TypeError on interpreters without extraction filters (Python 3.10 before 3.10.12 and 3.11 before 3.11.4), and moshi/pyproject.toml currently allows Python 3.10+. Please add a feature-detected fallback safe extraction path (or raise the minimum Python version accordingly).

Copilot · 2026-03-22T00:54:45Z

moshi/moshi/offline.py

        log("info", f"extracting {voices_tgz} to {voices_dir}")
        with tarfile.open(voices_tgz, "r:gz") as tar:
-            tar.extractall(path=voices_tgz.parent)
+            tar.extractall(path=voices_tgz.parent, filter='data')


tarfile.extractall(..., filter='data') is not supported on Python 3.10 before 3.10.12 or 3.11 before 3.11.4, where it raises TypeError at runtime. Since the project currently supports Python >=3.10, please add a compatibility fallback (manual safe extraction) or raise the minimum supported Python version.

fix: use safe deserialization and tarfile extraction

820aa38

Add weights_only=True to torch.load() calls to prevent arbitrary code execution via pickle (CVE-2025-32434). Add filter='data' to tarfile.extractall() to prevent path traversal (CVE-2007-4559). Addresses CWE-502, CWE-22.

Copilot AI review requested due to automatic review settings March 22, 2026 00:52

Copilot started reviewing on behalf of gn00295120 March 22, 2026 00:52

Copilot AI reviewed Mar 22, 2026

fix: add Python < 3.12 fallback for safe tarfile extraction

ea8e97c

gn00295120 wants to merge 2 commits into NVIDIA:main from gn00295120:fix/unsafe-deserialization-and-tarfile
