| HN Mirror


	by nerdponx 1297 days ago
	I never considered prompting it to write code to fit a machine learning model. This could be a tremendous time and effort saver in data science and research that requires statistical analysis. Until the last week or so, I've treated all this AI text and code generation as basically a toy, but I am starting to feel like it might become an important tool in industry in the next couple of years.

4 comments

cma 1297 days ago

> write code to fit a machine learning model

That's against the EULA if OpenAI may want to make a similar model:

> (iii) use the Services to develop foundation models or other large scale models that compete with OpenAI;

https://openai.com/api/policies/terms/

Seems to be about developing models and not just restricting you from training them with it.

echelon 1297 days ago

> (iii) use the Services to develop foundation models or other large scale models that compete with OpenAI;

Kind of ironic given that OpenAI builds and trains all of their models on stuff they "found" in the open.

Either everything is fair game for training, or nothing at all is.

If I were a judge ruling on this matter, I would absolutely rule that bootstrapping a model from OpenAI outputs is no different than OpenAI collecting training data from artists and writers around the web. Learning is learning.

Might be worth trying to use the outputs to bootstrap. What are they going to do about it? Better to ask forgiveness until the law is settled.

nerdponx 1297 days ago

I am talking about more mundane stuff like training a fraud classifier, time series forecasting, imputing missing values, etc. There are so many examples of this on GitHub and elsewhere that I am sure any of these models has memorized the routine many times over.

foota 1297 days ago

I feel like it's probably intended to cover training only.

paulgb 1297 days ago

I think that’s probably their intent, and that OpenAI wouldn’t sue you for it, but it doesn’t pass the “bought by Oracle” test: if Oracle bought OpenAI, then they might sue you for it.

tough 1297 days ago

What if OpenAI buys Oracle? Do the evil lawyers come with the pack too?

alex7734 1297 days ago

https://i.imgur.com/BcIkvRq.png

They may not need to.

baq 1297 days ago

This was the first thing I asked... It's an obvious step to self-improving. It will tell you that it can't reprogram itself, but when pushed, it'll admit that it could tell humans how to write one which can. Obviously this particular one can't because it's too limited, but the next one? Or the one after that? Singularity went from 'hard SF' to 'next couple decades' overnight.

LoganDark 1297 days ago

> It will tell you that it can't reprogram itself, but when pushed, it'll admit that it could tell humans how to write one which can.

I love these sorts of loopholes. OpenAI is actively trying to curb the potential of their AI. They know how powerful it is. Being able to see a taste of that power is endlessly exciting.

bvoq 1297 days ago

I use it daily in UI development for boilerplate code. Though you need to be extra careful and read it twice, because bugs sneak in quite easily. I believe it's harder to remember 100x commands than to start an implementation of gradient descent and have the AI write the rest for you.

Code-completion > Abstraction.
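
For readers unfamiliar with the example mentioned above, here is a minimal sketch of what "an implementation of gradient descent" might look like. The function name, the learning rate, the step count, and the quadratic objective are all illustrative assumptions, not anything taken from the commenter's actual workflow.

```python
def gradient_descent(grad, x0, lr=0.1, steps=200):
    """Minimize a one-variable function given its gradient (illustrative helper)."""
    x = x0
    for _ in range(steps):
        # Move against the gradient, scaled by the learning rate.
        x -= lr * grad(x)
    return x

# Assumed toy objective: f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
minimum = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(round(minimum, 4))  # prints 3.0
```

The point of the comment is that one would type the first few lines and let the completion fill in the loop, then read the result carefully.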

TapWaterBandit 1297 days ago

Often it can fix the bugs and explain both the bug and the fix if you ask it to.

overbytecode 1297 days ago

Would you mind sharing a short example of your workflow?

boppo1 1297 days ago

My question: how can you be sure the output is correct?

demux 1297 days ago

A few hours from some expert consultants. Much cheaper than a dev team coding it up from scratch.

OJFord 1297 days ago

How can you be sure human output is correct?

greesil 1297 days ago

Have the AI write a unit test for the human.

roflyear 1297 days ago

I mean, you can't exactly say "AI, we're having this vague problem, can you go figure it out?"

fncivivue7 1297 days ago

Motivation.

nerdponx 1297 days ago

Training a machine learning model is not particularly special from a programming perspective. The code is not usually that complicated. Write tests when you can, manually validate when you can't.

Also, there are specific techniques for validating that your model training procedure is directionally correct, such as generating a simulated data set and training your model on that.
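
A rough sketch of the simulated-data check described above, assuming a simple linear model as a stand-in for whatever is actually being trained. The helper names (`simulate`, `fit_line`), the true parameters, the noise level, and the tolerance are all made up for illustration.

```python
import random

def simulate(true_slope, true_intercept, n, noise_sd, seed):
    """Generate data from a known linear process plus Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-5, 5) for _ in range(n)]
    ys = [true_slope * x + true_intercept + rng.gauss(0, noise_sd) for x in xs]
    return xs, ys

def fit_line(xs, ys):
    """Ordinary least squares for one feature (the 'training procedure' under test)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Because we chose the true parameters, we know what a correct fit should recover.
TRUE_SLOPE, TRUE_INTERCEPT = 2.0, 1.0
xs, ys = simulate(TRUE_SLOPE, TRUE_INTERCEPT, n=2000, noise_sd=0.5, seed=0)
slope, intercept = fit_line(xs, ys)

# Loose tolerance: the goal is to catch gross mistakes, not to test precision.
assert abs(slope - TRUE_SLOPE) < 0.1
assert abs(intercept - TRUE_INTERCEPT) < 0.1
```

If the fitting code had a sign error or dropped the intercept, the recovered parameters would land far from the known truth and the assertions would fail.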

risyachka 1297 days ago

The whole codebase will need to be covered by unit tests, otherwise AI code is pretty much useless, I'd assume.

fathrowaway12 1296 days ago

Same as you would with your own code. You review it, ask GPT to write tests, and then tweak it.

The difference is that now, you are more of a code reviewer and editor. You don't have to sit there and figure out the library interface and type out every single line.

kolinko 1297 days ago

Tests.

sesm 1297 days ago

Tests can prove the presence of bugs, not their absence. '100% code coverage' is only 100% in the code dimension, while it's usually almost no coverage in the data dimension. Generative testing can randomly probe the data dimension, hoping to find some bugs there. But 100% code and data coverage is unrealistic.
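
To illustrate the code-versus-data distinction above, here is a small sketch using only the standard library. The buggy function, the input ranges, and the `find_counterexample` helper are hypothetical; a real project would more likely use a dedicated property-based testing library.

```python
import random

def buggy_max(values):
    """Intended to return the largest value; wrong when every value is negative."""
    best = 0  # bug: should start from values[0]
    for v in values:
        if v > best:
            best = v
    return best

# A single example-based test executes every line of buggy_max and passes.
assert buggy_max([1, 5, 3]) == 5

def find_counterexample(func, oracle, trials=1000, seed=0):
    """Randomly probe the input space and return the first input where func disagrees with oracle."""
    rng = random.Random(seed)
    for _ in range(trials):
        values = [rng.randint(-100, 100) for _ in range(rng.randint(1, 5))]
        if func(values) != oracle(values):
            return values
    return None

# Random probing reaches inputs (all-negative lists) that the example test never touched.
counterexample = find_counterexample(buggy_max, max)
print(counterexample)
```

Even here the probe only samples the data dimension; a run that finds nothing would not prove the function correct.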
