## ❓ Questions and Help

Thank you for such a wonderful job! My question is: when the model is trained not on 8 GPUs but on fewer devices, such as 4 or 2 GPUs, what should we be careful about? For example, in the original maskrcnn-benchmark, `POST_NMS_TOPK_TRAIN` should be set to `1000 * batch_per_gpu`.

Thanks for your attention! I look forward to your reply.
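To make the question concrete, here is a minimal sketch of the kind of adjustment I mean, assuming detectron2-style config keys (`get_cfg`, `SOLVER.*`, `MODEL.RPN.POST_NMS_TOPK_TRAIN`) and purely illustrative numbers, not recommended values. The last line mirrors the per-batch behavior I described from maskrcnn-benchmark; whether it still applies here is exactly what I am unsure about.

```python
# Sketch: adjusting a training config when moving from 8 GPUs to fewer.
# All numbers are illustrative; key names follow detectron2's config.
from detectron2.config import get_cfg

num_gpus = 4       # instead of the reference 8
batch_per_gpu = 2  # images per GPU

cfg = get_cfg()
cfg.SOLVER.IMS_PER_BATCH = num_gpus * batch_per_gpu

# Linear scaling rule: scale the learning rate with the total batch size
# relative to the 8-GPU reference, and stretch the schedule accordingly.
reference_batch = 16
scale = cfg.SOLVER.IMS_PER_BATCH / reference_batch
cfg.SOLVER.BASE_LR = 0.02 * scale
cfg.SOLVER.MAX_ITER = int(90000 / scale)
cfg.SOLVER.STEPS = tuple(int(s / scale) for s in (60000, 80000))

# The adjustment my question is about: in the original maskrcnn-benchmark
# this threshold was applied per batch, so it scaled with batch_per_gpu.
# I do not know whether the same scaling is needed here.
cfg.MODEL.RPN.POST_NMS_TOPK_TRAIN = 1000 * batch_per_gpu
```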