| HN Mirror


by aidan_mclau 735 days ago

Hey! Essay author here.

>The cool thing about using modern LLMs as an eval/policy model is that their RLHF propagates throughout the search.

>Moreover, if search techniques work on the token level (likely), their thoughts are perfectly interpretable.

I suspect a search world is substantially more alignment-friendly than a large model world. Let me know your thoughts!

1 comment

Your webpage is broken for me. The page appears briefly, then there's a French error message telling me that an error occurred and I can retry.

Mobile Safari, phone set to French.

I'm in the same situation (mobile Safari, French phone), but if you use Chrome it works.

It fixed itself (?)