Hacker News
by
gac3
207 days ago
Was this trained on the same data as Dia 1?
1 comment
gac3
207 days ago
It would be interesting to know which improvements come from the architecture, which from the data, and which from the different tokenizer.