Hacker News
by danielhanchen, 115 days ago
Unsure, but most likely they use YaRN, and they may (or may not) have trained a bit more on long context.