by mgkimsal
1191 days ago
> It does have great promise in an open-source self-hosted incarnation not controlled by external actors, however.

I'm not entirely sure about even that. My very limited understanding is that a core requirement is the initial data: the large language models(?). Which of these you can use, and how they were initially developed or populated, will influence the answers you get and how the system may evolve or "learn". Instead of trusting an external corp to run the service, you need to trust whichever actors are building the base data sets, and be concerned about what sort of bias may be inherent in them. Or do I have this totally wrong?
|
Model refinement (fine-tuning) seemingly has much lower training requirements than building the base model, putting it within reach of smaller organizations or wealthy individuals. If you don't like the refinement dataset, it will likely be feasible to bootstrap your own off someone else's LLM. See what Stanford did with Alpaca.