	by minimaxir 847 days ago
	It is possible to create audio/speech embeddings using a model like CLAP (https://huggingface.co/laion/larger_clap_music_and_speech). The results aren't good for nearest-neighbor vector lookup, however.
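
	For anyone who wants to try this, here is a rough sketch of what the embed-then-lookup flow could look like. The helper names (embed_audio, cosine, nearest_neighbors) are made up for illustration. The transformers calls follow the model card as I understand it, and the 48 kHz sampling rate is an assumption based on the CLAP feature extractor default, so check both against your installed version. The lookup is a plain brute-force cosine search, not any particular vector database.

```python
import math

# Model ID taken from the URL in the comment above.
MODEL_ID = "laion/larger_clap_music_and_speech"


def embed_audio(waveforms, sampling_rate=48000):
    """Return one embedding (list of floats) per input waveform.

    Assumes `transformers` and `torch` are installed and that the
    processor accepts `audios=` (newer releases may prefer `audio=`).
    """
    import torch
    from transformers import ClapModel, ClapProcessor

    processor = ClapProcessor.from_pretrained(MODEL_ID)
    model = ClapModel.from_pretrained(MODEL_ID)
    inputs = processor(
        audios=waveforms, sampling_rate=sampling_rate, return_tensors="pt"
    )
    with torch.no_grad():
        features = model.get_audio_features(**inputs)
    return features.tolist()


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbors(query, corpus, k=3):
    """Brute-force top-k lookup: returns (index, similarity) pairs,
    most similar first."""
    scored = [(i, cosine(query, vec)) for i, vec in enumerate(corpus)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

	Usage would be something like nearest_neighbors(query_vec, corpus_vecs, k=5), where both come from embed_audio. Brute force is fine for a few thousand clips; whether the neighbors are actually meaningful is a separate question, per the caveat above.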