| HN Mirror

	by avibhu 1196 days ago
	Tangential: you can finetune something like flan-ul2 to do quote extraction using examples generated from chatgpt. If you have a good enough GPU, it should help cut down costs significantly
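	A minimal sketch of the data-prep step this implies, assuming you already have (text, quote) pairs produced by ChatGPT. The prompt prefix, the function names, and the JSONL field names are all hypothetical and not from any particular library; the substring check is an added assumption that drops generated quotes which do not appear verbatim in the source text.

```python
import json

# Hypothetical instruction prefix; use whatever wording you fine-tune with.
PROMPT_PREFIX = "Extract the key quote from the text: "


def to_seq2seq_records(examples):
    """Turn (text, quote) pairs into input/target records for a seq2seq fine-tune.

    Pairs are skipped when the quote is empty or is not a verbatim
    substring of the text, since such a quote cannot be an extraction.
    """
    records = []
    for text, quote in examples:
        text, quote = text.strip(), quote.strip()
        if not quote or quote not in text:
            continue
        records.append({"input": PROMPT_PREFIX + text, "target": quote})
    return records


def write_jsonl(records, path):
    """Write one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
```

	The resulting file can then be loaded by whichever training script you use for the model.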

3 comments

specproc 1196 days ago

Nice, that sounds like it's worth exploring. Much appreciated.

Again though, it's the zero-effort part that's appealing. I'm on a very small team, and getting that close to the same standard will take time for a ham-fisted clod like myself. Worth giving a shot all the same though, thanks again.

leobg 1196 days ago

The zero shot ability is convenient. But for tasks that you need to get done millions of times, I’d much rather spend $10 on GPU compute and maybe a day of training data generation to train a T5 which I then “own”.

Also, running your own specialized model locally can be much faster than using someone’s API.
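A rough way to sanity-check that trade-off is a break-even count: how many calls it takes before the one-off training spend is repaid by the per-call saving. This is only a sketch; the function name and both per-call costs are hypothetical inputs you would have to measure yourself, and it ignores the day of human time spent generating training data.

```python
import math


def break_even_calls(fixed_cost_usd, api_cost_per_call_usd, local_cost_per_call_usd=0.0):
    """Return the number of calls after which a locally run model is cheaper.

    fixed_cost_usd: one-off training spend (e.g. the $10 of GPU compute).
    api_cost_per_call_usd: hypothetical cost of one hosted API call.
    local_cost_per_call_usd: hypothetical cost of one local inference call.
    Returns None when the local model is not cheaper per call.
    """
    saving_per_call = api_cost_per_call_usd - local_cost_per_call_usd
    if saving_per_call <= 0:
        return None
    return math.ceil(fixed_cost_usd / saving_per_call)
```

For a task run millions of times, even a small per-call saving makes the break-even point a small fraction of the total volume.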

specproc 1195 days ago

Sure, purely a time issue for me. I'm not the most skilled in this area, and I've got a load of core stuff I need to keep on top of.

I think we're not far off having something equivalent that can be pulled from Huggingface and run on a near-consumer-grade GPU.

For now, I'll hang tight and see how things progress. Don't disagree.

leobg 1195 days ago

Maybe one day you’ll be able to tell ChatGPT what kind of model you need and it’ll automatically select the right architecture, gather the training data, and commission the training using the cheapest and/or fastest provider. :)

pfdietz 1196 days ago

It's interesting what you can do with ChatGPT with few-shot learning. It generalizes at the drop of a hat, often correctly.

winddude 1196 days ago

Don't they have in the ToS you aren't allowed to use outputs for training downstream? Which is a little ridiculous, considering it's ToS.

But yeah, the cheap cost and lack of training are making me take a long hard look at how I'm implementing more traditional NLP solutions.

swyx 1196 days ago

> Don't they have in the ToS you aren't allowed to use outputs for training downstream?

you mean this? "Data submitted through the API is no longer used for service improvements (including model training) unless the organization opts in" https://openai.com/blog/introducing-chatgpt-and-whisper-apis

winddude 1187 days ago

was referring to "(iii) use output from the Services to develop models that compete with OpenAI; (iv) except as permitted through the API, use any automated or programmatic method to extract data or output from the Services, including scraping, web harvesting, or web data extraction;" ~ https://openai.com/policies/terms-of-use

I think I missed the exception for the API; however, I'm not sure where they are, but it seems to be fine based on Alpaca. Also interesting that they are so hard on web scraping and extraction, lol. But wow, that is a poorly worded paragraph.

hooande 1196 days ago

I do this. It works.

icelancer 1195 days ago

Can you elaborate? Did some brief Google searching but had issues putting it together. We have thousands of documents and data stores we'd like to parse using GPT-3.5 (or the new ChatGPT API) and have been thinking of pretraining to cut things down. Thank you!

hooande 1193 days ago

contact me at the email in my profile
