@lessw2020
Contributor

This PR adds:
1. The base GroupGemmKernel from the CUTLASS 4.0 CuTe DSL.
2. A CutlassGroupGemm strategy that runs it with the DeepSeek model.
3. A converter class that handles PyTorch-to-CuTe conversion as well as group-GEMM-specific CuTe preparation.
4. An update to model.py that adds it as an available group GEMM.
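
For readers less familiar with the operation, a grouped GEMM runs one independent matrix product per group (in an MoE model, typically one per expert). The sketch below is a plain-Python reference of those semantics only. It is not the CUTLASS kernel or any code from this PR, and the names `grouped_gemm_reference`, `tokens`, `weights`, and `group_sizes` are hypothetical, chosen for illustration. It assumes rows are already sorted so that each group's rows are contiguous.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain (M x K) @ (K x N) product on nested lists."""
    k = len(b)
    n = len(b[0])
    assert all(len(row) == k for row in a), "inner dimensions must match"
    return [[sum(row[p] * b[p][j] for p in range(k)) for j in range(n)] for row in a]


def grouped_gemm_reference(
    tokens: Matrix, weights: List[Matrix], group_sizes: List[int]
) -> Matrix:
    """Multiply each contiguous slice of `tokens` by its own weight matrix.

    Hypothetical reference semantics: group i owns the next group_sizes[i]
    rows of `tokens` and multiplies them by weights[i].
    """
    assert len(weights) == len(group_sizes), "one weight matrix per group"
    assert sum(group_sizes) == len(tokens), "group sizes must cover all rows"
    out: Matrix = []
    start = 0
    for weight, size in zip(weights, group_sizes):
        # An empty group contributes no rows.
        out.extend(matmul(tokens[start:start + size], weight))
        start += size
    return out


tokens = [[1, 2], [3, 4], [5, 6]]
weights = [
    [[1, 0], [0, 1]],  # group 0: identity
    [[2, 0], [0, 2]],  # group 1: scale by 2
]
result = grouped_gemm_reference(tokens, weights, group_sizes=[1, 2])
print(result)  # [[1, 2], [6, 8], [10, 12]]
```

A fused kernel performs all of these per-group products in a single launch instead of looping on the host, which is the motivation for using a dedicated grouped GEMM.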

[Image: Screenshot 2025-06-22 at 12:20:41 PM]

@facebook-github-bot added the CLA Signed label Jun 22, 2025
@kwen2501 requested review from drisspg and msaroufim June 23, 2025 16:00
