| HN Mirror



	by bnprks 777 days ago
	It seems the abstract's claims of speed and energy efficiency relative to an RTX 3090 apply when the GPU is using a batch size of 1. I wonder if someone with more experience can comment on how much throughput a GPU can gain from a larger batch size without severely harming latency (and how power consumption might change). From a hardware cost perspective, the AWS f1.2xlarge instances they used are $1.65/hr on-demand, vs. say $1.29/hr for an A100 from Lambda Labs. Using FPGAs is a very interesting line of thinking, but I'm not sure this really describes a viable competitor to GPUs, even for inference-only scenarios.

1 comment

dhruvdh 777 days ago

The FPGA being used is, I believe, one of the lowest-specced SKUs.

AWS instance prices are more of a supply/demand/availability thing; it would be more interesting to compare from a total cost of ownership or perf-power-area perspective.