Hacker News new | ask | show | jobs
by liuliu 1082 days ago
Yes, computes are carried out in FP16 (so there is no compute efficiency gains, might be latency reductions due to memory-bandwidth saving). These savings are not realized yet because no custom kernels introduced yet.