Tangential: you can finetune something like flan-ul2 to do quote extraction using examples generated from chatgpt. If you have a good enough GPU, it should help cut down costs significantly
Nice, that sounds like it's worth exploring. Much appreciated.
Again though, it's the zero-effort part that's appealing. I'm on a very small team and getting that to close to the same standard will take time for a ham-fisted clod like myself. Worth giving a shot all the same though, thanks again.
The zero shot ability is convenient. But for tasks that you need to get done millions of times, I’d much rather spend $10 on GPU compute and maybe a day of training data generation to train a T5 which I then “own”.
Also, running your own specialized model locally can be much faster than using someone’s API.
Maybe one day you’ll be able to tell ChatGPT what kind of model you need and it’ll automatically select the right architecture, gather the training data, and commission the training using the cheapest and/or fastest provider. :)
was referring to "(iii) use output from the Services to develop models that compete with OpenAI; (iv) except as permitted through the API, use any automated or programmatic method to extract data or output from the Services, including scraping, web harvesting, or web data extraction;" ~ https://openai.com/policies/terms-of-use
I think I missed the exception for API, how ever not sure where they are, but seems to be fine based on alpaca. Also interesting they are so hard on web scraping and and extraction, lol. But wow, that is a poorly worded paragraph.
Can you elaborate? Did some brief Google searching but had issues putting it together. We have thousands of documents and data stores we'd like to parse using GPT-3.5 (or the new ChatGPT API) and have been thinking of pretraining to cut things down. Thank you!
Again though, it's the zero-effort part that's appealing. I'm on a very small team and getting that to close to the same standard will take time for a ham-fisted clod like myself. Worth giving a shot all the same though, thanks again.