More or less. There's stuff you can do to extend the context window of an existing model fairly easily, i.e. on a LoRA-type training budget, O($1000).
But in practice, even with the max output token count raised to the full context_size, it simply couldn't make use of it, no matter how many prompt engineering tricks I threw at it.[1] And I've heard anecdotally that the same is true for that LoRA-type technique.
[1] TL;DR, about 1/5th the actual length: write 100 pages, 3 paragraphs each, number the pages as you go and write 1 page at a time until 100. Also write out "I have written page N and need to write 100 pages total" after each page.
Inevitably it would "get tired" and skip ahead, like "end page 23...now page 100".
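If you want to catch that failure mechanically instead of eyeballing it, here's a rough sketch. It assumes the model actually emits the "I have written page N" status line from the prompt verbatim; the function name `find_page_skips` and the regex are my own illustration, not part of any tool.

```python
import re

def find_page_skips(text):
    """Return (previous, next) pairs where the page counter does not advance by 1."""
    # Assumption: the model echoes the status line requested in the prompt,
    # e.g. "I have written page 23 and need to write 100 pages total".
    pages = [int(n) for n in re.findall(r"I have written page (\d+)", text)]
    # Compare each reported page number with the one that follows it.
    return [(a, b) for a, b in zip(pages, pages[1:]) if b != a + 1]

# Simulated output that "gets tired" after page 23 and jumps to page 100.
sample = "\n".join(
    f"I have written page {n} and need to write 100 pages total"
    for n in list(range(1, 24)) + [100]
)
print(find_page_skips(sample))
```

An empty list only means the counter never jumped; it doesn't tell you the model reached page 100 or that each page really has 3 paragraphs.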