by eurekin
829 days ago
I once had the pleasure of delving into automotive mechanical engineering. Of course, most, if not all, of the material ingested by OpenAI was obvious marketing straight from the brands' websites. I restarted the conversation multiple times, with explicit rules forbidding certain phrases. No matter what I did, I couldn't make it stop throwing out stuff like "best in class", "advanced", and "sophisticated". There will be demand for GPTs trained on actual engineering material, and that could be a huge game changer for that market.
Look at WizardLM Uncensored: https://www.reddit.com/r/LocalLLaMA/comments/1384u1g/wizardl...
The author just deleted from the training data any content containing specific words likely to bias it. The test afterwards showed it worked. Reusing their concept, I think we could just remove, or edit for honesty, the common words and phrases found in marketing material. You’ve given some good examples.
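To make that concrete, here is a minimal Python sketch of the two options (drop the whole document, or edit the phrases out). It is not the WizardLM author's actual script. The phrase list and the function names `drop_flagged` and `strip_flagged` are made up for illustration; the three phrases are the ones quoted in the parent comment.

```python
import re

# Hypothetical phrase list; these three come from the parent comment.
MARKETING_PHRASES = ["best in class", "advanced", "sophisticated"]

# One case-insensitive pattern that matches any listed phrase as whole words.
_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(p) for p in MARKETING_PHRASES) + r")\b",
    re.IGNORECASE,
)


def drop_flagged(docs):
    """Option 1: keep only documents that contain none of the phrases."""
    return [d for d in docs if not _PATTERN.search(d)]


def strip_flagged(doc):
    """Option 2: delete the phrases and collapse the leftover whitespace."""
    cleaned = _PATTERN.sub("", doc)
    return re.sub(r"\s{2,}", " ", cleaned).strip()
```

A plain phrase match like this misses variants such as "best-in-class" and will also cut legitimate uses of "advanced", so a real pipeline would likely need a longer list and some manual review.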
We could also do that for “scientific” papers which oversell their results, or for anything else that presents a claim as certain: rewrite it to say that source(s) X claimed Y. Foundational materials, which trainers vet for quality, would get a lot more training runs before, during, and after riskier material.
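One simple way to give vetted material more training runs is to repeat it in the data mix. A rough Python sketch, assuming the documents are already split into a vetted list and a riskier list; `build_epoch` and `vetted_repeats` are hypothetical names, and the repeat count of 3 is arbitrary:

```python
import random


def build_epoch(vetted, risky, vetted_repeats=3, seed=0):
    """Return one shuffled epoch where each vetted doc appears vetted_repeats times."""
    # Repeat the vetted documents, then append each riskier document once.
    epoch = list(vetted) * vetted_repeats + list(risky)
    # Shuffle with a fixed seed so vetted docs are spread around the
    # riskier ones and the order is reproducible.
    random.Random(seed).shuffle(epoch)
    return epoch
```

Shuffling only spreads the vetted documents around the riskier ones on average. A stricter before/during/after schedule would need explicit ordering.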
I think there’s a lot of potential here in just trimming the fat out of otherwise useful documents. The LLMs we build to support the work might also become great lie detectors.