Changes from 1 commit

Commits (48)
- 5e5e9db (Oct 22, 2025): Fix: WAN 2.2 I2V boundary detection, AdamW8bit OOM crash, and add gra…
- 12e2b37 (Oct 28, 2025): Improve video training with better bucket allocation
- a1f70bc (Oct 28, 2025): Fix MoE training: per-expert LR logging and param group splitting
- a2749c5 (Oct 29, 2025): Add progressive alpha scheduling and comprehensive metrics tracking f…
- 86d107e (Oct 29, 2025): Merge remote-tracking branch 'upstream/main'
- c91628e (Oct 29, 2025): Update README with comprehensive fork documentation and alpha schedul…
- 61143d6 (Oct 29, 2025): Add comprehensive beginner-friendly documentation and UI improvements
- 96b1bda (Oct 29, 2025): Remove sponsors section from README - this is a fork without sponsors
- bce9866 (Oct 29, 2025): Fix confusing expert metrics display - add current training status
- bd45a9e (Oct 29, 2025): Fix UnboundLocalError: remove redundant local 'import os'
- abbe765 (Oct 29, 2025): Add metrics API endpoint and UI components for real-time training mon…
- edaf27d (Oct 29, 2025): Fix: Always show Loss Trend Analysis section with collection progress
- a551b65 (Oct 29, 2025): Fix: SVG charts now display correctly - add viewBox for proper coordi…
- 1682199 (Oct 30, 2025): Fix: Downsample metrics to 500 points and lower phase transition thre…
- 885bbd4 (Oct 30, 2025): Add comprehensive training recommendations based on research
- 705c5d3 (Oct 30, 2025): Fix TRAINING_RECOMMENDATIONS for motion training
- 54c059a (Oct 30, 2025): Fix metrics to use EMA instead of simple averages
- 20b3c12 (Oct 30, 2025): FIX CRITICAL BUG: Training loop re-doing checkpoint step on resume
- 226d19d (Oct 30, 2025): Remove useless checkpoint analyzer script
- 66978dd (Oct 30, 2025): Fix: Export EMA metrics to JSONL for UI visualization
- fa12a08 (Oct 30, 2025): Fix: Optimizer state loading counting wrong number of params for MoE
- 264c162 (Oct 30, 2025): Fix: Set current_expert_name for metrics tracking
- aecc467 (Oct 31, 2025): Fix alpha scheduler not loading for MoE models on resume
- b1ea60f (Nov 4, 2025): feat: Add SageAttention support for Wan models
- 20d689d (Nov 4, 2025): Fix CRITICAL metrics regression: boundary misalignment on resume + ad…
- 8b8506c (Nov 4, 2025): Merge feature/sageattention-wan-support into main
- 6a7ecac (Nov 4, 2025): docs: Update README with SageAttention and metrics fixes
- 850db0f (Nov 4, 2025): docs: Update installation instructions to use PyTorch nightly
- 26e9bdb (Nov 4, 2025): docs: Major README overhaul - Focus on Wan 2.2 I2V optimization
- 88785a9 (Nov 4, 2025): docs: Fix Blackwell CUDA requirements - CUDA 13.0 not 12.8
- 0cacab8 (Nov 4, 2025): Fix: torchao quantized tensors don't support copy argument in .to()
- 3ad8bfb (Nov 4, 2025): Fix critical FP16 hardcoding causing low-noise training instability
- 8589967 (Nov 4, 2025): Fix metrics UI cross-contamination in per-expert windows
- 47dff0d (Nov 4, 2025): Fix FP16 hardcoding in TrainSliderProcess mask processing
- eeeeb2e (Nov 4, 2025): Fix LR scheduler stepping to respect gradient accumulation
- f026f35 (Nov 5, 2025): CRITICAL: Fix VAE dtype mismatch in Wan encode_images
- c7c3459 (Nov 5, 2025): CRITICAL: Revert CFG-zero to be optional (match Ostris Nov 4 update)
- 728b46d (Nov 5, 2025): CRITICAL: Fix multiple SageAttention bugs causing training instability
- 7c9b205 (Nov 5, 2025): Additional SageAttention and VAE dtype refinements
- 1d9dc98 (Nov 5, 2025): Fix rotary embedding application to match Diffusers WAN reference
- 67445b9 (Nov 5, 2025): Add temporal_jitter parameter for video frame sampling
- ab59f00 (Nov 5, 2025): Document temporal_jitter feature in README
- 80ff3db (Nov 5, 2025): Fix VAE dtype handling for WAN 2.2 I2V training to prevent blurry sam…
- 384ce94 (Nov 6, 2025): Fix MoE UI metrics bugs and optimizer state restoration
- b7cf917 (Nov 7, 2025): Disable SageAttention for training (inference-only)
- 55b1dc2 (relaxis, Nov 7, 2025): Revise README for SageAttention and feature updates
- fd208dc (relaxis, Nov 7, 2025): Update README to reflect changes and optimizations
- e1570af (relaxis, Nov 7, 2025): Revise README for alpha scheduling and metrics updates
Fix: torchao quantized tensors don't support copy argument in .to()
Fixes RuntimeError when loading models with torchao quantization. The
_ensure_cpu_pinned function now checks if a tensor is quantized before
attempting to move it to CPU, avoiding the use of copy=True for quantized
tensors that don't support this argument (e.g., AffineQuantizedTensor).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
AI Toolkit Contributor and claude committed Nov 4, 2025
commit 0cacab851277d79bc7eeb09ecd3be542c24cd0ca
13 changes: 11 additions & 2 deletions toolkit/memory_management/manager_modules.py
```diff
@@ -98,10 +98,19 @@ def _is_quantized_tensor(t: Optional[torch.Tensor]) -> bool:
 def _ensure_cpu_pinned(t: Optional[torch.Tensor]) -> Optional[torch.Tensor]:
     if t is None:
         return None
+    # Check if quantized BEFORE moving to CPU, as some quantized tensor types
+    # (e.g., torchao's AffineQuantizedTensor) don't support the copy argument
+    is_quantized = _is_quantized_tensor(t)
+
     if t.device.type != "cpu":
-        t = t.to("cpu", copy=True)
+        # Use copy=True for regular tensors, but not for quantized tensors
+        if is_quantized:
+            t = t.to("cpu")
+        else:
+            t = t.to("cpu", copy=True)
 
-    if _is_quantized_tensor(t):
+    # Don't attempt to pin quantized tensors; many backends don't support it
+    if is_quantized:
         return t
     if torch.cuda.is_available():
         try:
```
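The branch the commit introduces can be exercised in isolation. The sketch below is a minimal stand-in, not the toolkit's code: `FakeTensor`, `FakeQuantizedTensor`, and `ensure_cpu` are hypothetical names, and the quantized class simply omits the `copy` keyword from its `.to()` signature to mirror the behavior attributed to torchao's `AffineQuantizedTensor`.

```python
class FakeTensor:
    """Stand-in for a regular tensor: .to() accepts the copy keyword."""

    def __init__(self, device: str = "cuda"):
        self.device = device

    def to(self, device: str, copy: bool = False):
        # A real tensor would optionally copy its storage; the sketch only
        # needs to record the destination device.
        return type(self)(device)


class FakeQuantizedTensor(FakeTensor):
    """Stand-in for a quantized tensor whose .to() takes no copy keyword."""

    def to(self, device: str):
        # Passing copy=True here raises TypeError, mimicking the reported
        # failure when loading torchao-quantized models.
        return type(self)(device)


def ensure_cpu(t):
    """Move t to CPU, using copy=True only for non-quantized tensors."""
    if t is None:
        return None
    is_quantized = isinstance(t, FakeQuantizedTensor)
    if t.device != "cpu":
        # The guarded dispatch from the patch: quantized tensors get a plain
        # .to("cpu"), regular tensors keep the explicit copy.
        t = t.to("cpu") if is_quantized else t.to("cpu", copy=True)
    return t
```

Calling `FakeQuantizedTensor().to("cpu", copy=True)` directly raises `TypeError`, which is the crash the guard avoids; `ensure_cpu` moves both tensor kinds to CPU without tripping it.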