Fix unsloth-zoo ascii encoding issue #332
Merged
Python raises this error when you run a training script with uv run file.py:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 in position 31733: ordinal not in range(128)
The same script works fine when run from a notebook.
The problem is that Python’s default text encoding is determined by the OS locale, and in many environments that resolves to ASCII.
A Jupyter kernel, on the other hand, runs with UTF-8.
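To see which side of this difference an environment falls on, a small diagnostic like the one below can be run both under uv run and inside a notebook cell. This is only a sketch: the helper name describe_text_encoding is made up for illustration, and it reports what the standard library exposes rather than anything specific to unsloth-zoo.

```python
import locale
import sys


def describe_text_encoding() -> dict:
    """Collect the settings that decide how Python decodes text by default.

    Hypothetical helper for illustration; not part of unsloth or unsloth-zoo.
    """
    return {
        # Encoding that open() uses when no explicit encoding= is passed.
        "preferred_encoding": locale.getpreferredencoding(False),
        # True when the interpreter was started in UTF-8 mode.
        "utf8_mode": bool(sys.flags.utf8_mode),
        # Encoding of the standard output stream, if it has one.
        "stdout_encoding": getattr(sys.stdout, "encoding", None),
    }


if __name__ == "__main__":
    for name, value in describe_text_encoding().items():
        print(f"{name}: {value}")
```

If preferred_encoding reports an ASCII variant in the failing environment and UTF-8 in the notebook, that matches the explanation above.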
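unsloth-zoo does not seem to work when the default encoding is ASCII: byte 0xf0 in the error is the first byte of a four-byte UTF-8 character (an emoji, for example), which the ASCII codec cannot decode. There is an open PR, but it has not been merged yet.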
Until unsloth fixes the issue, we can enforce UTF-8 encoding as a temporary workaround.
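One way such a workaround could look is sketched below. It is not necessarily the change made in this PR: it assumes the script is started through a small launcher, and the names utf8_env and run_with_utf8 are hypothetical. It relies on the PYTHONUTF8 environment variable, which turns on Python's UTF-8 mode and has to be set before the interpreter starts, which is why the script runs in a child process.

```python
import os
import subprocess
import sys


def utf8_env(base=None):
    """Return a copy of the environment with Python's UTF-8 mode switched on.

    Hypothetical helper; `base` defaults to the current process environment.
    """
    env = dict(os.environ if base is None else base)
    # PYTHONUTF8=1 makes the child interpreter default to UTF-8 for text I/O,
    # regardless of the OS locale.
    env["PYTHONUTF8"] = "1"
    return env


def run_with_utf8(script_args):
    """Run `python <script_args>` in a child process with UTF-8 mode enabled."""
    completed = subprocess.run(
        [sys.executable, *script_args],
        env=utf8_env(),
        check=False,
    )
    return completed.returncode


if __name__ == "__main__":
    # Example: python launcher.py file.py --some-flag
    sys.exit(run_with_utf8(sys.argv[1:]))
```

The same effect should be achievable without a launcher by exporting the variable in the shell, for example PYTHONUTF8=1 uv run file.py, since the variable is read at interpreter startup.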