44 commits
5e5e9db
Fix: WAN 2.2 I2V boundary detection, AdamW8bit OOM crash, and add gra…
Oct 22, 2025
12e2b37
Improve video training with better bucket allocation
Oct 28, 2025
a1f70bc
Fix MoE training: per-expert LR logging and param group splitting
Oct 28, 2025
a2749c5
Add progressive alpha scheduling and comprehensive metrics tracking f…
Oct 29, 2025
86d107e
Merge remote-tracking branch 'upstream/main'
Oct 29, 2025
c91628e
Update README with comprehensive fork documentation and alpha schedul…
Oct 29, 2025
61143d6
Add comprehensive beginner-friendly documentation and UI improvements
Oct 29, 2025
96b1bda
Remove sponsors section from README - this is a fork without sponsors
Oct 29, 2025
bce9866
Fix confusing expert metrics display - add current training status
Oct 29, 2025
bd45a9e
Fix UnboundLocalError: remove redundant local 'import os'
Oct 29, 2025
abbe765
Add metrics API endpoint and UI components for real-time training mon…
Oct 29, 2025
edaf27d
Fix: Always show Loss Trend Analysis section with collection progress
Oct 29, 2025
a551b65
Fix: SVG charts now display correctly - add viewBox for proper coordi…
Oct 29, 2025
1682199
Fix: Downsample metrics to 500 points and lower phase transition thre…
Oct 30, 2025
885bbd4
Add comprehensive training recommendations based on research
Oct 30, 2025
705c5d3
Fix TRAINING_RECOMMENDATIONS for motion training
Oct 30, 2025
54c059a
Fix metrics to use EMA instead of simple averages
Oct 30, 2025
20b3c12
FIX CRITICAL BUG: Training loop re-doing checkpoint step on resume
Oct 30, 2025
226d19d
Remove useless checkpoint analyzer script
Oct 30, 2025
66978dd
Fix: Export EMA metrics to JSONL for UI visualization
Oct 30, 2025
fa12a08
Fix: Optimizer state loading counting wrong number of params for MoE
Oct 30, 2025
264c162
Fix: Set current_expert_name for metrics tracking
Oct 30, 2025
aecc467
Fix alpha scheduler not loading for MoE models on resume
Oct 31, 2025
b1ea60f
feat: Add SageAttention support for Wan models
Nov 4, 2025
20d689d
Fix CRITICAL metrics regression: boundary misalignment on resume + ad…
Nov 4, 2025
8b8506c
Merge feature/sageattention-wan-support into main
Nov 4, 2025
6a7ecac
docs: Update README with SageAttention and metrics fixes
Nov 4, 2025
850db0f
docs: Update installation instructions to use PyTorch nightly
Nov 4, 2025
26e9bdb
docs: Major README overhaul - Focus on Wan 2.2 I2V optimization
Nov 4, 2025
88785a9
docs: Fix Blackwell CUDA requirements - CUDA 13.0 not 12.8
Nov 4, 2025
0cacab8
Fix: torchao quantized tensors don't support copy argument in .to()
Nov 4, 2025
3ad8bfb
Fix critical FP16 hardcoding causing low-noise training instability
Nov 4, 2025
8589967
Fix metrics UI cross-contamination in per-expert windows
Nov 4, 2025
47dff0d
Fix FP16 hardcoding in TrainSliderProcess mask processing
Nov 4, 2025
eeeeb2e
Fix LR scheduler stepping to respect gradient accumulation
Nov 4, 2025
f026f35
CRITICAL: Fix VAE dtype mismatch in Wan encode_images
Nov 5, 2025
c7c3459
CRITICAL: Revert CFG-zero to be optional (match Ostris Nov 4 update)
Nov 5, 2025
728b46d
CRITICAL: Fix multiple SageAttention bugs causing training instability
Nov 5, 2025
7c9b205
Additional SageAttention and VAE dtype refinements
Nov 5, 2025
1d9dc98
Fix rotary embedding application to match Diffusers WAN reference
Nov 5, 2025
67445b9
Add temporal_jitter parameter for video frame sampling
Nov 5, 2025
ab59f00
Document temporal_jitter feature in README
Nov 5, 2025
80ff3db
Fix VAE dtype handling for WAN 2.2 I2V training to prevent blurry sam…
Nov 5, 2025
384ce94
Fix MoE UI metrics bugs and optimizer state restoration
Nov 6, 2025
Fix alpha scheduler not loading for MoE models on resume
When resuming training for MoE models (high_noise/low_noise), the alpha
scheduler state file wasn't found: the code looked for expert-specific
scheduler files (_high_noise_alpha_scheduler.json or
_low_noise_alpha_scheduler.json), but the actual file is shared across
experts (just _alpha_scheduler.json).

This caused the alpha scheduler to reset to the foundation phase instead
of continuing from the saved phase (e.g., emphasis), resulting in
incorrect alpha values after resume.

Fix: Strip expert suffix from filename before looking for alpha scheduler.
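
A minimal standalone sketch of the resolution logic, for illustration
(the helper name and example paths are hypothetical, not part of the
codebase):

def resolve_alpha_scheduler_path(checkpoint_path: str) -> str:
    # MoE experts save checkpoints as ..._high_noise.safetensors /
    # ..._low_noise.safetensors, but both share one _alpha_scheduler.json,
    # so the expert suffix must be stripped before checking for the file.
    scheduler_file = checkpoint_path.replace('.safetensors', '_alpha_scheduler.json')
    for suffix in ('_high_noise_alpha_scheduler.json', '_low_noise_alpha_scheduler.json'):
        scheduler_file = scheduler_file.replace(suffix, '_alpha_scheduler.json')
    return scheduler_file

# Both experts resolve to the same shared state file:
# resolve_alpha_scheduler_path('out/run_high_noise.safetensors')
#   -> 'out/run_alpha_scheduler.json'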

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
AI Toolkit Contributor and claude committed Oct 31, 2025
commit aecc467366e65115fec29fecaa45be9a614d84b0
3 changes: 3 additions & 0 deletions jobs/process/BaseSDTrainProcess.py
@@ -880,6 +880,9 @@ def load_weights(self, path):
         if hasattr(self.network, 'alpha_scheduler') and self.network.alpha_scheduler is not None:
             import json
             scheduler_file = path.replace('.safetensors', '_alpha_scheduler.json')
+            # For MoE models, strip expert suffix (_high_noise, _low_noise) since scheduler is shared
+            scheduler_file = scheduler_file.replace('_high_noise_alpha_scheduler.json', '_alpha_scheduler.json')
+            scheduler_file = scheduler_file.replace('_low_noise_alpha_scheduler.json', '_alpha_scheduler.json')
             print_acc(f"[DEBUG] Looking for alpha scheduler at: {scheduler_file}")
             if os.path.exists(scheduler_file):
                 try:
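
Design note: doing the strip with two string replacements keeps the fix
minimal and is a no-op for non-MoE checkpoints, whose filenames never
contain an expert suffix. With this change, both experts of a run (e.g.,
the hypothetical out/run_high_noise.safetensors and
out/run_low_noise.safetensors) resolve to the same
out/run_alpha_scheduler.json, so resuming either expert restores the
shared phase state instead of resetting to the foundation phase.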