by refulgentis 957 days ago
More or less. There's stuff you can do to extend the context window of an existing model fairly easily, e.g. on a LoRA-type training budget, O($1000).

But in practice, even when a max output token count equal to the context size was enabled, the model simply couldn't make use of it, no matter how many prompt engineering tricks I threw at it.[1] And I've heard anecdotally that the same is true of that LoRA-type technique.

[1] TL;DR, about 1/5th the actual length: write 100 pages, 3 paragraphs each, number the pages as you go and write 1 page at a time until 100. Also write out "I have written page N and need to write 100 pages total" after each page.

Inevitably it would "get tired" and skip ahead, like "end page 23...now page 100".
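
For anyone who wants to reproduce this, here's a rough sketch of how the test could be scripted. It is not the original prompt (the footnote is only a summary at about 1/5th the length), and `build_prompt` and `find_page_skips` are hypothetical helper names made up for illustration. It assumes the model echoes the "I have written page N..." line verbatim, which it may not.

```python
import re
from typing import List, Tuple

# Hypothetical helper: paraphrases the prompt from footnote [1].
# The real prompt was roughly 5x longer; this is only the gist.
def build_prompt(total_pages: int = 100, paragraphs: int = 3) -> str:
    return (
        f"Write {total_pages} pages, {paragraphs} paragraphs each. "
        "Number the pages as you go and write 1 page at a time "
        f"until page {total_pages}. After each page, write out "
        f'"I have written page N and need to write {total_pages} pages total".'
    )

# Matches the progress line the prompt asks the model to emit after each page.
MARKER = re.compile(r"I have written page (\d+) and need to write (\d+) pages total")

# Hypothetical helper: returns (last_page, next_page) pairs wherever the
# reported page numbers are not consecutive, starting from page 1.
def find_page_skips(output: str) -> List[Tuple[int, int]]:
    skips = []
    prev = 0
    for match in MARKER.finditer(output):
        page = int(match.group(1))
        if page != prev + 1:
            skips.append((prev, page))
        prev = page
    return skips
```

Feed the model's full output to `find_page_skips`; an empty list means every page from 1 up to the last reported one showed up in order, and a pair like `(23, 100)` is the "get tired" jump described above. Note it only checks the progress lines, not whether each page actually has 3 paragraphs.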