Commit 411d53d

inocsin authored and mikeiovine committed
[https://nvbugs/5284463][fix] fix ada fp8 group gemm lacks shared memory (NVIDIA#9044)
Signed-off-by: Vincent Zhang <vinczhang@nvidia.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
1 parent ee9b687 commit 411d53d

1 file changed: +3 -3 lines changed

cpp/tensorrt_llm/kernels/cutlass_kernels/cutlass_heuristic.cpp

Lines changed: 3 additions & 3 deletions
@@ -177,13 +177,13 @@ std::vector<CutlassTileConfig> get_candidate_tiles(
     {
         if (sm == 89 || sm >= 120)
         {
-            return {CutlassTileConfig::CtaShape16x256x128_WarpShape16x64x128,
-                CutlassTileConfig::CtaShape32x128x64_WarpShape32x32x64,
+            return {CutlassTileConfig::CtaShape32x128x64_WarpShape32x32x64,
                 CutlassTileConfig::CtaShape64x128x64_WarpShape64x32x64,
                 CutlassTileConfig::CtaShape64x64x128_WarpShape32x64x64,
                 CutlassTileConfig::CtaShape128x64x64_WarpShape64x32x64,
                 CutlassTileConfig::CtaShape128x256x64_WarpShape64x64x64,
-                CutlassTileConfig::CtaShape256x128x64_WarpShape64x64x64};
+                CutlassTileConfig::CtaShape256x128x64_WarpShape64x64x64,
+                CutlassTileConfig::CtaShape16x256x128_WarpShape16x64x128};
         }
         else
         {
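
The commit message does not say why the position of CtaShape16x256x128 in the list matters, only that the Ada FP8 group GEMM "lacks shared memory". As a rough illustration of that constraint, the sketch below estimates the shared memory needed to stage the A and B operand tiles of one CTA. The helper `estimate_smem_bytes`, the `TileShape` struct, the 3-stage pipeline depth, and the "A tile plus B tile" formula are all assumptions made for this example and are not taken from TensorRT-LLM or CUTLASS. The 99 KB figure is the maximum shared memory per thread block that the CUDA documentation lists for compute capability 8.9.

```cpp
#include <cstddef>

// Hypothetical helper types for illustration only; not part of TensorRT-LLM.
struct TileShape
{
    int m;
    int n;
    int k;
};

// Rough estimate: each pipeline stage holds an M x K tile of A and an
// N x K tile of B. Real kernels may need extra space (epilogue, padding).
constexpr std::size_t estimate_smem_bytes(TileShape cta, int stages, std::size_t elem_bytes)
{
    return static_cast<std::size_t>(stages)
        * (static_cast<std::size_t>(cta.m) * cta.k + static_cast<std::size_t>(cta.n) * cta.k)
        * elem_bytes;
}

// Maximum shared memory per thread block on compute capability 8.9 (99 KB).
constexpr std::size_t kAdaMaxSmemPerBlock = 99 * 1024;

// Assumed values for this sketch: FP8 operands (1 byte) and 3 pipeline stages.
constexpr std::size_t kFp8Bytes = 1;
constexpr int kAssumedStages = 3;

// CtaShape16x256x128: 3 * (16*128 + 256*128) = 104448 bytes, above 99 KB.
static_assert(estimate_smem_bytes({16, 256, 128}, kAssumedStages, kFp8Bytes) > kAdaMaxSmemPerBlock,
    "16x256x128 exceeds the Ada per-block limit under these assumptions");

// CtaShape32x128x64: 3 * (32*64 + 128*64) = 30720 bytes, well within 99 KB.
static_assert(estimate_smem_bytes({32, 128, 64}, kAssumedStages, kFp8Bytes) <= kAdaMaxSmemPerBlock,
    "32x128x64 fits under these assumptions");
```

Under this simplified formula, CtaShape16x256x128 has the largest per-stage operand footprint of the seven tiles in the list (34816 bytes, against 24576 bytes for the 128x256x64 and 256x128x64 shapes), so it is the candidate most likely to run into the limit as the stage count grows.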
