The client doesn't do it. You put your front ends behind a load balancer like an ELB, or use a reverse proxy like Nginx.
Edit: And yes, round robin is the most commonly used load distribution technique, and works very well assuming each request has a roughly equivalent unit of work cost.
I'm surprised you wouldn't run into cases where the requests being rate-limited end up unevenly distributed between servers. There are assumptions you could add that would make that not a problem, but I'm surprised they'd hold.
Each server keeps a map of tokens per client, refilled at a fixed rate.
That per-server rate is the total global token refresh rate divided by the number of servers (equivalently, the refill interval is the global interval multiplied by the number of servers).
The end result is the same as long as requests are spread evenly, but now the servers share no state and you have eliminated the bottleneck of a central token bucket.
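For anyone who wants to see the shape of it, here is a minimal Python sketch of the per-server buckets described above. It is only an illustration, not anyone's production code: the class name `LocalRateLimiter`, the `burst` parameter, and the choice to split burst capacity evenly across servers are all assumptions of mine, and it refills lazily on each request rather than on a timer, which gives the same token counts.

```python
import time


class LocalRateLimiter:
    """Per-server token buckets, one per client, with no shared state.

    Hypothetical sketch: names and the `burst` parameter are illustrative.
    """

    def __init__(self, global_rate, num_servers, burst, clock=time.monotonic):
        # Each server refills at its share of the global rate (tokens/second).
        self.rate = global_rate / num_servers
        # Assumption: the burst allowance is split evenly across servers too.
        self.capacity = burst / num_servers
        self.clock = clock
        # client_id -> (tokens, time of last refill)
        self.buckets = {}

    def allow(self, client_id, cost=1.0):
        now = self.clock()
        # A client seen for the first time starts with a full bucket.
        tokens, last = self.buckets.get(client_id, (self.capacity, now))
        # Lazy refill: credit the tokens accrued since the last request.
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        allowed = tokens >= cost
        if allowed:
            tokens -= cost
        self.buckets[client_id] = (tokens, now)
        return allowed


if __name__ == "__main__":
    # 100 tokens/s globally over 4 servers: this server refills at 25/s.
    limiter = LocalRateLimiter(global_rate=100, num_servers=4, burst=100)
    print(limiter.allow("client-a"))
```

Note this only matches a central bucket if the load balancer really does spread each client's requests evenly; a client whose traffic lands mostly on one server will be throttled at that server's share.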