GPUs are a poor choice for this nowadays. They have fairly high latency and are designed for embarrassingly parallel computations. Sure, you can do 10 gigabits/sec, but because you're batching, your average latency ends up high-ish.
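To put a rough number on the batching cost, here's a back-of-the-envelope sketch. It assumes 10 GbE carrying minimum-sized 64-byte frames, with the standard 20 bytes of preamble and inter-frame gap per frame; the batch sizes are made up for illustration, and it only counts the time spent waiting for a batch to fill, not the transfer to the GPU or the kernel launch.

```python
# Rough arithmetic: how long a batch takes to fill at 10 GbE line rate.
# Batch sizes below are illustrative assumptions, not measured values.
LINK_BPS = 10e9            # 10 gigabits/sec
MIN_FRAME_BYTES = 64       # minimum Ethernet frame
WIRE_OVERHEAD_BYTES = 20   # 7 preamble + 1 start delimiter + 12 inter-frame gap


def packets_per_second(link_bps=LINK_BPS, frame_bytes=MIN_FRAME_BYTES):
    """Maximum packet rate for a given frame size on the wire."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_bps / bits_on_wire


def batch_fill_time_us(batch_size, pps):
    """Microseconds needed to collect batch_size packets arriving at pps."""
    return batch_size / pps * 1e6


pps = packets_per_second()
print(f"line rate: {pps / 1e6:.2f} Mpps")
for batch in (64, 256, 1024):
    fill = batch_fill_time_us(batch, pps)
    # The first packet in a batch waits the whole fill time; the average is about half.
    print(f"batch {batch:5d}: fill {fill:6.2f} us, mean wait ~{fill / 2:6.2f} us")
```

That's the best case, too: at anything below line rate the batch takes longer to fill, so the wait per packet goes up unless you flush on a timeout.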
In terms of non-CPU hardware for routing, FPGAs are still king if you need a bit of flexibility. In theory, a rule table can be defined in configs and then flashed onto the FPGA for line-speed packet processing. Even then, given how slow FPGAs are to flash, CPUs are still king when general packet processing is needed.
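For a feel of what "a rule table defined in configs" might look like, here's a minimal software model. The prefixes, port names, and the `RULES` / `build_table` / `lookup` names are all hypothetical, and this is plain Python doing a longest-prefix match, not anything you'd actually flash onto an FPGA.

```python
import ipaddress

# Hypothetical config: destination prefix -> output port.
RULES = {
    "10.0.0.0/8": "eth1",
    "10.1.0.0/16": "eth2",
    "0.0.0.0/0": "eth0",   # default route
}


def build_table(rules):
    """Parse the config and sort so the most specific prefix is checked first."""
    table = [(ipaddress.ip_network(prefix), port) for prefix, port in rules.items()]
    table.sort(key=lambda entry: entry[0].prefixlen, reverse=True)
    return table


def lookup(table, addr):
    """Return the port of the longest matching prefix, or None if nothing matches."""
    ip = ipaddress.ip_address(addr)
    for network, port in table:
        if ip in network:
            return port
    return None


table = build_table(RULES)
print(lookup(table, "10.1.2.3"))
print(lookup(table, "10.2.3.4"))
print(lookup(table, "8.8.8.8"))
```

The linear scan is the part hardware gets rid of: the table is small and static, so the compare can be laid out in parallel, whereas changing `RULES` here is a dict edit rather than a re-flash.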
Interesting... looks like it's effective for routing minimum-sized packets. As an aside, this is the epitome of good web design. Fast, clear, proper line-widths... I'm in love :)