I have tested your epoll and io_uring examples: I get 250k req/sec with your epoll example and only 220k with io_uring. I also get 250k with my own epoll implementation, which confirms that we are both using epoll efficiently.
I'm running Clear Linux with a 5.7 kernel. Do you have any hints on how I can reproduce your results?