HN Mirror



	by briansm 10 days ago
	Strange that they are feeding raw audio in. Even in humans, there is a hardware transform to the frequency domain (the cochlea) before data is fed to the brain. Effectively doing this part inside the LLM seems inefficient.

1 comment

nialse 10 days ago

The FFT just computes the DFT, which is essentially a matrix multiplication, or two. No need for fancy conversions, just a huge amount of training data and a very large array.
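To make the matrix-multiplication point concrete, here is a minimal sketch in plain Python. It assumes "or two" means one real matrix for the cosine part and one for the sine part, which is one reading of the comment and not something the thread confirms. The helper names (`dft_matrices`, `dft_by_matmul`, `fft`) and the 16-sample test signal are made up for illustration.

```python
import cmath
import math


def dft_matrices(n):
    """Build the two real n x n matrices of the DFT: cosine and negative sine."""
    cos_m = [[math.cos(2 * math.pi * k * t / n) for t in range(n)] for k in range(n)]
    sin_m = [[-math.sin(2 * math.pi * k * t / n) for t in range(n)] for k in range(n)]
    return cos_m, sin_m


def matvec(m, x):
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]


def dft_by_matmul(x):
    """DFT of a real signal as two real matrix multiplications."""
    cos_m, sin_m = dft_matrices(len(x))
    re = matvec(cos_m, x)  # real part of the spectrum
    im = matvec(sin_m, x)  # imaginary part of the spectrum
    return [complex(r, i) for r, i in zip(re, im)]


def fft(x):
    """Reference recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


# Hypothetical test signal: a sine with 3 cycles over 16 samples.
signal = [math.sin(2 * math.pi * 3 * t / 16) for t in range(16)]
spectrum = dft_by_matmul(signal)
reference = fft(signal)

# Largest disagreement between the two methods (floating-point noise only).
max_err = max(abs(a - b) for a, b in zip(spectrum, reference))
```

Both routes give the same spectrum. The difference is cost: the matrix form takes on the order of n² multiplications per frame, while the FFT takes on the order of n log n, which is roughly the inefficiency the parent comment is pointing at.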
