Hacker News
by
phreeza
108 days ago
But this misses exactly the gap OP seems to have: going from a next-token predictor (a language model in the classical sense) to an instruction-finetuned, RLHF-ed and "harnessed" tool.
js8
107 days ago
The book has a sequel: https://www.manning.com/books/build-a-reasoning-model-from-s...
It will give you an answer to the extent anybody can.