|
> This is true for almost everything politicians promote as a "solution" :-)

In their defense, this time they may not even be aware.

> So for most people, they actually see g which is something like g(f(H)))

It is interesting to consider the entailment structure of g(f(H)). f(H), or A, is entailed from H by f. g(f(H)), let us call it A', is entailed from f(H) by g. Using |- as the symbol for entailment, or "entails":
f |- f(H)
g |- g(f(H))
All included plugins, embeddings, and other API endpoints in g will definitely affect the results, turning answer A, or f(H), into a modified answer A', or g(f(H)). However, what entails f? Nothing in the context of applying g. To a first approximation, design D and training T are what entail f. T includes all human feedback (training pairs, not direct edits to the entailment in f), and also previous chat histories H for subsequent releases f'. However, this inclusion sits behind decision hierarchies, and its effects on f are largely outside users' or researchers' precise or accurate control. Training pairs can be added to fine-tune, e.g., to filter out violent content, but the results can only be checked downstream of the new f', and fine-tuning retains previous entailments, hence acting similarly to a g().
(D,T) |- f
Let us consider programming, where software S entails output o. What [subject] entails software [direct object]? Programmers P, directly and reliably. Programs can be affected both precisely and accurately. If we want to modify a program, we do not wrap it inside another program g and hope that the functional composition yields the desired result, since that result will invariably contain entailment from the program we are stuck with. Instead, we edit it.
S |- o
P |- S
S |- P
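To make the contrast between wrapping and editing concrete, here is a minimal Python sketch. Every name in it (`f`, `g`, `wrapped`, `f_edited`) is a hypothetical stand-in chosen for illustration, and the toy `f` is not a claim about how any real model behaves: it is just a function with two quirks, one of which the wrapper cannot undo.

```python
def f(history: str) -> str:
    """Hypothetical stand-in for the fixed f: it shouts and appends '!!!'."""
    return history.upper() + "!!!"


def g(answer: str) -> str:
    """Wrapper g: it only sees f's output, so it can only post-process it."""
    return answer.rstrip("!")


def wrapped(history: str) -> str:
    """The composition g(f(H)); f itself is left untouched."""
    return g(f(history))


def f_edited(history: str) -> str:
    """The programmer's route (P |- S): edit the source so the quirks never arise."""
    return history


if __name__ == "__main__":
    h = "Hello World"
    print(f(h))         # raw output of f
    print(wrapped(h))   # g removed the '!!!', but the original casing is gone
    print(f_edited(h))  # the edited program returns exactly what was wanted
```

In this toy case g can strip the trailing exclamation marks, but it cannot recover the casing that f discarded, so the composed result still carries entailment from f. Editing the function removes the unwanted behaviour at its source.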
Programming is interactive, between programmers and software, whereas with LLMs users are downstream when it comes to directly affecting the entailment patterns in f. The human-like chat UX of LLMs leads us to think that chatting with them, giving them human feedback, prompts, or guidelines, or adding plugins and context will naturally let us affect f, and that, since humans can be manipulated, so can LLMs. The core of the matter is that f, which entails issues, cannot be affected easily, precisely, or accurately, whatever g we may want to wrap around it. |
RLHF is Reinforcement Learning from Human Feedback.
Thus you cannot get ChatGPT without humans in the loop, which makes it quite sensitive and irreproducible.