Improve device compatibility (Apple Silicon/MPS) and error handling #1044

Open
AliHamzaAzam wants to merge 2 commits into OpenTalker:main from AliHamzaAzam:fix/device-compatibility

@AliHamzaAzam

What

Device compatibility and type handling (6c9be06)

  • inference.py selects mps automatically when CUDA is unavailable but MPS is available (the --cpu flag still forces CPU).
  • src/facerender/modules/{util,dense_motion,keypoint_detector}.py: tensor creation now uses explicit dtype=/device= arguments instead of the legacy .type(string) round-trip. The string path has no MPS equivalent (there is no torch.mps.FloatTensor), so coordinate grids and heatmap padding crashed on MPS devices. A small helper (_ref_to_device_dtype) keeps the old string argument working for any external callers.
  • src/facerender/animate.py: audio trim/resample now uses scipy instead of pydub. pydub depends on audioop, which was removed in Python 3.13, and scipy is already a dependency.
  • src/face3d/util/my_awing_arch.py, src/face3d/util/preprocess.py, src/utils/preprocess.py: NumPy compatibility fixes. These drop uses of the removed np.float alias and the removed np.VisibleDeprecationWarning category, and extract scalars explicitly where newer NumPy no longer implicitly converts size-1 arrays.
  • src/utils/face_enhancer.py: GFPGAN is imported lazily inside the enhancer function, so it is only a hard dependency when an enhancer is actually requested.
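
For reviewers, the device-selection order described in the first bullet can be sketched as a small pure function. This is an illustration only, not the code in inference.py: the function name select_device and its boolean parameters are hypothetical, chosen so the priority order (--cpu, then CUDA, then MPS, then CPU) can be read and tested without PyTorch installed.

```python
def select_device(force_cpu: bool, cuda_available: bool, mps_available: bool) -> str:
    """Return a device string using the priority order described above.

    Hypothetical helper for illustration; the real logic lives in inference.py.
    """
    if force_cpu:
        # The --cpu flag overrides any accelerator that happens to be present.
        return "cpu"
    if cuda_available:
        # CUDA keeps its existing priority.
        return "cuda"
    if mps_available:
        # Fall back to Apple Silicon's Metal backend when CUDA is missing.
        return "mps"
    # No accelerator found.
    return "cpu"


if __name__ == "__main__":
    # Apple Silicon machine: no CUDA, MPS present.
    print(select_device(force_cpu=False, cuda_available=False, mps_available=True))
    # Same machine with --cpu passed.
    print(select_device(force_cpu=True, cuda_available=False, mps_available=True))
```

In a real script the two availability flags would typically come from torch.cuda.is_available() and torch.backends.mps.is_available().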

Error handling (a74164b)

  • src/utils/croper.py: raising a plain string (raise 'string') itself fails with a TypeError on Python 3, which masked the real failure when no landmark is detected. It now raises RuntimeError with the same message.
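
A minimal, standalone reproduction of the masking behaviour, for anyone who wants to confirm it locally. The function names and the message text are placeholders for illustration, not the actual code or wording in croper.py.

```python
# Placeholder message; the real text in croper.py may differ.
MESSAGE = "no landmark detected"


def crop_old_style():
    # Python 3 only allows raising BaseException subclasses, so this line
    # raises TypeError and the intended message never reaches the caller.
    raise MESSAGE


def crop_fixed():
    # Raising a real exception type preserves the message.
    raise RuntimeError(MESSAGE)


def failure_of(fn):
    """Call fn and return (exception class name, exception text)."""
    try:
        fn()
    except Exception as exc:
        return type(exc).__name__, str(exc)
    return None


if __name__ == "__main__":
    print(failure_of(crop_old_style))
    print(failure_of(crop_fixed))
```

With the old form, the caller sees a TypeError about exception types rather than the landmark message, which is why the original failure was hard to diagnose.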

Why

Stock SadTalker assumes CUDA and an older NumPy/Python toolchain; on a non-CUDA machine it either crashes at device selection or deep inside the facerender modules. These patches come from integrating SadTalker into a production macOS pipeline on Apple Silicon, where inference now runs end-to-end on MPS.

Tested on

  • MacBook Pro 14" — Apple M3 Pro (14-core GPU), 18 GB unified memory, macOS 27.0
  • Python 3.13.5, PyTorch 2.8.0, NumPy 2.2.6
  • Primarily the MPS path (end-to-end inference as part of the pipeline above). The CUDA selection logic is unchanged and was not re-tested.
