by hansvm
410 days ago
> definitions

Sure, perhaps. Take it up with the authors.

> make sense...goal

That's not necessarily the goal. Alignment definitely filters the available response distribution, but the result of alignment and fine-tuning can be higher entropy than the original. E.g., how many people complain about text being "obvious LLM garbage"? A wider range of styles and a more entropic solution would fall out of fine-tuning in a world where the graders cared about such things.

Alignment is also a fuzzy, human problem. Is a model more aligned if it never describes DIY EMPs and often considers interesting philosophical components? Or if it never says anything outside of the median opinion range? The former solution has a lot more entropy than the latter and isn't particularly well reflected in available training data, so fine-tuning, even for the purpose of alignment, could easily increase entropy.
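
To make the "filtering can still raise entropy" point concrete, here's a toy sketch. The two distributions are made-up numbers over four hypothetical response styles, not measurements from any real model; the only thing it shows is that removing an option and flattening what remains can leave you with more Shannon entropy than a peaked distribution over everything.

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits of a discrete distribution (zero terms skipped)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical base model: four response styles, heavily peaked on one.
base = [0.85, 0.05, 0.05, 0.05]

# Hypothetical fine-tuned model: the last style is filtered out entirely,
# but the (assumed) graders rewarded variety among the remaining three.
tuned = [1 / 3, 1 / 3, 1 / 3, 0.0]

h_base = entropy_bits(base)
h_tuned = entropy_bits(tuned)

print(f"base:  {h_base:.3f} bits")
print(f"tuned: {h_tuned:.3f} bits")
print("tuned has higher entropy:", h_tuned > h_base)
```

The support shrank from four options to three, yet the tuned distribution is the more entropic one, since entropy depends on how evenly mass is spread and not just on how many options survive.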