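If the number of classes is greater than or equal to the batch size, it should be possible to use MultipleNegativesRankingLoss: each batch can then be built from (anchor, positive) pairs drawn from distinct classes, so the in-batch negatives really are negatives. This might improve performance.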
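To make the idea concrete, here is a rough sketch of the batching it would need. It is not taken from any existing implementation: the function name `distinct_class_batches` and its parameters are hypothetical, and it assumes plain lists of texts and class labels. It only shows how to form batches in which every (anchor, positive) pair comes from a different class.

```python
import random
from collections import defaultdict


def distinct_class_batches(texts, labels, batch_size, seed=0):
    """Yield batches of (anchor, positive, label) with at most one pair per class.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)

    # Group the texts by their class label.
    by_class = defaultdict(list)
    for text, label in zip(texts, labels):
        by_class[label].append(text)

    # A class needs at least two examples to form an (anchor, positive) pair.
    usable = [c for c, items in by_class.items() if len(items) >= 2]
    if len(usable) < batch_size:
        raise ValueError("need at least batch_size classes with >= 2 examples")

    rng.shuffle(usable)

    # Take batch_size distinct classes per batch; leftover classes are dropped.
    for start in range(0, len(usable) - batch_size + 1, batch_size):
        batch = []
        for c in usable[start:start + batch_size]:
            anchor, positive = rng.sample(by_class[c], 2)
            batch.append((anchor, positive, c))
        yield batch
```

Because no two pairs in a batch share a class, the positive of one pair can safely act as a negative for every other anchor in that batch, which is what MultipleNegativesRankingLoss assumes. How these batches would be wired into the training loop is left open here.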
I would like to get comments about this idea and might provide a PR later.
Does anyone have an opinion on this?