For some reason inference seems to be overlooked. A lot of “ink” has been spilled over GPUs for training tasks but at the end of the day if you can’t do inference you can’t serve users and you can’t make money.