Pull requests: HazyResearch/ThunderKittens
added a docker container (only try on GPU wont work on CPU)
#146
opened Aug 24, 2025 by
Kushagra481
updated Sep 17, 2025
Document and simplify fp16 -> fp8 conversion
#140
opened Aug 3, 2025 by
melonedo
updated Sep 15, 2025
Fix unit test Makefile bugs for A100 and 4090 results in linker failure due to cuda libraries missing
#145
opened Aug 20, 2025 by
tomflinda
updated Sep 15, 2025
40% build speedup: get torch and python include paths without subprocesses
#119
opened May 7, 2025 by
technillogue
updated Sep 15, 2025
Do not pass `l_vec` and o to mha_forward in h100_bench
#92
opened Feb 19, 2025 by
acforvs
updated Sep 15, 2025
global_to_shared.cuh row accessing fixes
#95
opened Feb 21, 2025 by
dylanllim
updated Sep 15, 2025
Add Semaphore Support for `cp.async` loads (Non-TMA Load Patterns)
#97
opened Mar 5, 2025 by
SohamGovande
updated Sep 15, 2025
Use `-gencode` instead of -arch in mla_decode
#126
opened Jun 6, 2025 by
lucifer1004
updated Sep 15, 2025
I created a website to write documentation!
#89
opened Feb 14, 2025 by
prateekshukla1108
updated Sep 15, 2025
Remove unnecessary device and stream syncs
#129
opened Jun 15, 2025 by
Edenzzzz
updated Sep 15, 2025
implement group gemm for contiguous case
#136
opened Jul 22, 2025 by
XiaobingSuper
updated Sep 15, 2025
Added `LCSF` template implementation support for FlashAttention Backward.
#135
opened Jul 22, 2025 by
KuangjuX
updated Sep 15, 2025
2 tasks done
Add Implementation of Native Sparse Attention
#137
opened Jul 22, 2025 by
yukavio
updated Sep 15, 2025
Remove redundant register declarations.
#134
opened Jul 22, 2025 by
KuangjuX
updated Sep 15, 2025