


	by vvolhejn 196 days ago
	Václav from Kyutai here. Thanks for the bug report! A workaround for now is to chunk the text into smaller parts, where the model is more reliable. We already do some chunking in the Python package. There is also a fancier way to chunk that ensures the stitched-together parts continue well (teacher-forcing), but we haven't implemented that yet.
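
To make the chunking workaround concrete, here is a minimal sketch of sentence-aligned chunking. It is not the chunking done in the Python package: `chunk_text` and `max_chars` are hypothetical names, and the 200-character default is an arbitrary assumption, not a recommended value for the model.

```python
import re


def chunk_text(text: str, max_chars: int = 200) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chars.

    Hypothetical helper for illustration only; not part of any TTS package.
    """
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
            # A single sentence longer than max_chars is kept whole
            # rather than being cut mid-sentence.
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk would then be synthesized separately and the audio concatenated. Splitting on sentence boundaries is a design choice here, on the assumption that joins between sentences are less audible than joins inside them.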

1 comments

mgaudet 196 days ago

Is this just sort of expected for these models? Should users of this expect only truncation or can hallucinated bits happen too?

I also find Javert in particular seems to put in huge gaps and spaces... side effect of the voice?

vvolhejn 190 days ago

> Is this just sort of expected for these models? Should users of this expect only truncation or can hallucinated bits happen too?

Basically, yes, sort of expected: we don't have detailed enough control to prevent it fully. We can measure how much it happens and train better models, but no 100% guarantee. The bigger the model, the less this happens, but this one is tiny, so it's not the sharpest tool in the shed. Hallucinated bits can theoretically happen but I haven't observed it with this model yet.