HN Mirror



	by MichealCodes 239 days ago
	I don't think we've had the transformer moment for audio training yet, but yes, in theory audio-first models will be much more capable.

1 comment

Particularly interesting would be transformations between tokenised audio and tokenised text.

I recall someone once telling me that up to 90% of communication can be non-verbal, so when an LLM sticks to just text, it may be getting as little as 10% of the data.