by osterbit2 1179 days ago
To anyone who may be pasting code along the lines of 'convert this SQL table schema into a [pydantic model|JSON Schema]' where you're pasting in the text: just ask it instead to write you a [python|go|bash|...] function that reads in a text file and 'converts an SQL table schema to output x' or whatever. Related/not-related: a replacement for the pandas docs is another great and safe use case.

Point is, for a meaningful subset of high-value use-cases you don't need to move your important private stuff across any trust boundaries, and it still can be pretty helpful...so just calling that out in case that's useful to anyone...
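For illustration, here is a rough sketch of the kind of function you might ask for. It runs locally, so the real schema never leaves your machine. Everything in it is an assumption made for the example: the names `sql_schema_to_json_schema` and `convert_file`, the small type mapping, and the limitation to a single simple `CREATE TABLE` statement. It is not a real SQL parser.

```python
import json
import re
from pathlib import Path

# Assumed, deliberately small mapping from SQL type names to JSON Schema types.
SQL_TO_JSON_TYPE = {
    "int": "integer", "integer": "integer", "bigint": "integer",
    "smallint": "integer", "serial": "integer",
    "real": "number", "float": "number", "double": "number",
    "numeric": "number", "decimal": "number",
    "boolean": "boolean", "bool": "boolean",
    "text": "string", "varchar": "string", "char": "string",
    "date": "string", "timestamp": "string",
}

# Definitions starting with these keywords are table constraints, not columns.
CONSTRAINT_KEYWORDS = {"primary", "foreign", "unique", "constraint", "check"}

# Matches one CREATE TABLE statement and captures the table name and its body.
CREATE_TABLE = re.compile(
    r'create\s+table\s+(?:if\s+not\s+exists\s+)?["`]?(\w+)["`]?\s*\((.*)\)',
    re.IGNORECASE | re.DOTALL,
)


def sql_schema_to_json_schema(sql_text: str) -> dict:
    """Convert one simple CREATE TABLE statement into a JSON Schema dict."""
    match = CREATE_TABLE.search(sql_text)
    if match is None:
        raise ValueError("no CREATE TABLE statement found")
    table_name, body = match.group(1), match.group(2)

    properties = {}
    required = []
    # Split on commas that are not inside parentheses, e.g. NUMERIC(10, 2).
    for column_def in re.split(r",(?![^()]*\))", body):
        tokens = column_def.split()
        if len(tokens) < 2 or tokens[0].lower() in CONSTRAINT_KEYWORDS:
            continue
        column_name = tokens[0].strip('"`')
        # Drop any length/precision suffix: VARCHAR(255) -> varchar.
        base_type = re.sub(r"\(.*", "", tokens[1]).lower()
        # Unknown types fall back to "string" (an assumption of this sketch).
        properties[column_name] = {"type": SQL_TO_JSON_TYPE.get(base_type, "string")}
        if re.search(r"not\s+null|primary\s+key", column_def, re.IGNORECASE):
            required.append(column_name)

    return {
        "title": table_name,
        "type": "object",
        "properties": properties,
        "required": required,
    }


def convert_file(path: str) -> str:
    """Read a local .sql file and return the JSON Schema as formatted text."""
    sql_text = Path(path).read_text(encoding="utf-8")
    return json.dumps(sql_schema_to_json_schema(sql_text), indent=2)
```

You would then run something like `print(convert_file("schema.sql"))` on your own machine (the file name is hypothetical). The only thing that ever crossed the trust boundary was the generic request for the function.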

3 comments

At first I was impressed by how easy it was to reach a data model with chatgpt, then I laughed as I tried to tweak it and use it. I realized it didn't really have any model concepts and was just using its various KB.

I am unsure whether the so-called AI can think in models; so far it seems not, but it is still an impressive assistive tool if you keep its limitations in mind.

Another area where it falls short is logic. My daughter has a lot of fun with the book "What Is the Name of This Book?", but she was struggling with the "map of baal" explanation: her answer was a certain map, yet the book had another answer, and I had a third one because I interpreted a proposition differently. I never got an answer from ChatGPT whose reasoning was free of contradictions. It also turned out the book had been mistranslated into French, so one of its propositions was changed (C, both A and B were knaves) but the answer was not.

> At first I was impressed by how easy it was to reach a data model with chatgpt, then I laughed as I tried to tweak it and use it. I realized it didn't really have any model concepts and was just using its various KB.

> I am unsure whether the so-called AI can think in models; so far it seems not, but it is still an impressive assistive tool if you keep its limitations in mind.

I don't know. I'm using it for exactly that ("here's a problem, come up with a data model") and it gives a great starting point.[0]

Not perfect, but after that it's easy to tweak it the old-fashioned way.

I find its data modelling capabilities (in the domain I'm using it for - API services) to be roughly on par with a mid-level developer (for a handwavy definition of "mid-level").

[0] https://apibakery.com/demo/ai/

Did you prime it before asking, so it was answering in the appropriate context?
I've been doing that since day one. I can't believe people are pasting real data into these corporate black boxes.
What about Google Docs, Office 365, Github, AWS, Azure, Google Cloud, JIRA, Zendesk, etc?

What is different about ChatGPT (if anything)?

We have data standards and agreements with those companies, we pay them to have expectations. Even then, we're strict about what touches vendor servers and it's audited and monitored. Accounts are managed by us and tied into onboarding and offboarding. If they have a security incident, they notify, there's response and remediation.

ChatGPT seems to be used more like a fast Stack Overflow, except people aren't thinking of it like a forum where others will see their question, so they aren't as cautious. We're just waiting for some company's data to show up remixed into an answer for someone else and then plastered all over the internet for the infosec lulz of the week.

> We have data standards and agreements with those companies, we pay them to have expectations. Even then, we're strict about what touches vendor servers and it's audited and monitored. Accounts are managed by us and tied into onboarding and offboarding.

For every company like yours there are hundreds that don't. People use free Gmail addresses for sensitive company stuff, paste random things in random pastebins, put their private keys in public repos, etc.

Yes, data leaks from OpenAI are bound to happen (again), and they should beef up their security practices.

But thinking people are using only ChatGPT in an insecure way vastly overestimates their security practices elsewhere.

The solution is education, not avoiding new tools.

Doesn't OpenAI explicitly say that your Q/A on the free ChatGPT are stored and sent to human reviewers to be put in their RL database? Now of course we can't be sure what Google, AWS etc. do with the data on disks there, but it would be a pretty big scandal if some whistleblower eventually came out and said that Google employees sit and laugh at private bucket contents on GCP or private Google Docs. So there's a difference in stated intention at least.
Who in their right mind is using free ChatGPT through that shitty no-good web interface of theirs, which can barely handle two queries-and-replies before grinding to a halt? Surely everyone is using the pay-as-you-go API keys and any one of the alternative frontends or integrations?

And, IIRC, pay-as-you-go API requests are explicitly not used for training data. I'm sad GPT-4 isn't there yet - except for those who won the waitlist lottery.

It's really funny to see these types of comments. I would assume a vast majority of users are using the Web interface, particularly in a corporate context where an account for the API could take ages or not be accepted.

If people were smart and performed according to best practices, articles like this one would not be necessary.

I mean, if you're using a free web interface in a corporate context, you may just as well use a paid API with your personal account - either way, you're using it of your own volition, and not as approved by your employer. And getting API keys to a ChatGPT equivalent (i.e. GPT-3.5) takes... a minute, maybe less.

I am honestly confused how people can use this thing with the interface OpenAI runs. The app has been near-unusable for me, for months, on every device I tried it on.

> and any one of the alternative frontends or integrations?

And what sort of understanding do you have with the alternative frontends/integrations about how they handle your API keys and data? This might be a better solution for a variety of reasons but it doesn't automatically mean your data is being handled any better or worse than by openai.com

I wonder what the distribution of tokens / sec at OpenAI is between the free ChatGPT, paid ChatGPT, and APIs. I’d have to think the free interface is getting slammed. Quite the scaling project, and still nowhere near peaking.
To quote a children's TV show: "Which ones of these things are not like the other ones?"

Some of those are document tools working on language / knowledge. Others are infrastructure, working on ... whatever your infra does, and your infra manages your data (knowledge).

If you read their data policies, you'll find they are not the same.

I wouldn't put sensitive work data/employer IP in a personal Google Doc (et al.) either, no?
Don't use any of it.
To your average user who interfaces with these figurative black boxes with a black box in their hand, how is this particular black box any different than the other black boxes that this user hands their data to every second of every day?
there are plenty of disallowed 'black boxes' within the federal sphere; chatgpt is just yet another.

to take a stab at your question, though : my cell phone doesn't learn to get better by absorbing my telecommunications; it's just used as a means to spy on my personal life by The Powers That Be. The primary purpose of my cell phone is for the conveyance of telecommunications.

ChatGPT hoards data for training and self-improvement in its current state. Its whole modus operandi involves the capture of data, rather than it being used for that tangentially. It could not meaningfully exist without training on something, and at this stage of the game the trend is to self-train with user data.

Until that trend changes people should probably be a bit more suspect about what kind of stuff gets thrown into the training bin.

Those typically have MSAs with legalese where the parties stipulate what they will and will not do, often whether or not it's zero-knowledge, and often an option to have your own instance encryption keys.

If people are using the free version of chatGPT then it’s unlikely there is a contract between the companies and more likely just a terms of use applied by chatGPT and ignored by the users.

No idea
I simply don't give a crap if my employer loses data. I don't care if my carelessness costs my employer a billion bucks down the line as I won't be working for them next year.
Writing that is a really good way to end up on the wrong side of a civil suit.
I have an add-on where every other sentence is generated by ChatGPT. Good luck holding me liable for a robot's actions.
"I do not take any kind of responsibility about what I'm doing, or not doing, or thinking about doing or not doing, or thinking about whenever I should be doing or not doing, or thinking about whenever I should be thinking about doing or not doing".
Unless you can prove a given sentence was generated by ChatGPT, it will be assumed it wasn't.
As a morally questionable answering robot, however, I must ask: why should everything else be tainted by the machinery, but not evidence like text?
Why don’t you feel any responsibility?
I am treating my employment like a corporation would. Risks I do not pay for and do not benefit from mitigating are waste that could allow me to transfer time back to my own priorities, increasing my personal "profit."
Not who you replied to, but if you agree, even a little, with the phrase, "the social contract between employees & employers is broken in the US"... well it goes both ways.
Do you really think the people asking ChatGPT to write their code can make that abstraction?

The fact that they can't do this is the whole reason they have to use ChatGPT.

I use it because it's 10-100x more interesting, fun, and fast as a way to program, instead of me having to personally hand-craft hundreds of lines of boilerplate API interaction code every time I want to get something done.

Besides, it's not like it puts out great code (or even always working code), so I still have to read everything and debug it. And sometimes it writes code that is just fine and fit for purpose and horrendously ugly, so I still have to scrap everything and do it myself.

(And then sometimes I spend 10x as long doing that, because it turns out it's also just plain good fun to grow an aesthetic corner of the code just for the hell of it, too — as long as I don't have to.)

And even after all that extra time is factored back in: it's still way faster and more fun than the before-times. I'm actually enjoying building things again.

Pair-programming with ChatGPT is like having an idiot-savant friend who always surprises you. Doesn’t matter if the code is horrible, amazing, or something in between. It’s always interesting.

And I agree it’s fun. Maybe it’s the simulated social interaction without consequences. I can be completely honest with my robot friend about the shitty or awesome code and no one’s feelings are going to get hurt. ChatGPT will just keep trying to be helpful.

People aren’t using ChatGPT because they can’t do it themselves, they’re using it to save time.
You can be an experienced developer with years of building complex applications behind you and still find ChatGPT useful. I've found it useful for documenting individual methods, explaining my own or others' code, writing unit test methods, or just adding boilerplate stuff, which saves me an hour that I use elsewhere.
I think many people find ChatGPT useful specifically because they have years of experience building complex applications.

If you know exactly what you want to ask of it, and have the ability to evaluate and verify what it produces, it's incredible what you can get out of it. Sure it's nothing I couldn't have done otherwise... eventually. The productivity it enables is worth every cent.

Easily the best $20 I've spent in ages; they should have run with the initial idea of charging $42.

But holy moly, anyone putting confidential information into it needs to stop.

I’ve been doing this kind of thing pretty regularly for the past few weeks, even though I know how to do any of the tasks in question. It’s usually still faster, even when taking the time to anonymize the details; and I don’t paste anything I wouldn’t put on a public gist (lots of “foo, bar”, etc)
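As a rough sketch of what that anonymizing step can look like when done mechanically rather than by hand: swap the sensitive identifiers for neutral placeholders before pasting, and map them back in whatever comes out. The function names `anonymize` and `restore` and the `placeholder_N` naming are assumptions made up for this example, and it only handles identifier-like names (letters, digits, underscores).

```python
import re


def anonymize(text: str, sensitive_names: list[str]) -> tuple[str, dict[str, str]]:
    """Replace each sensitive name with a neutral placeholder.

    Returns the rewritten text and a placeholder -> original name mapping.
    Assumes the text does not already contain strings like "placeholder_0".
    """
    mapping = {}
    # Longest names first, so longer identifiers are handled before shorter ones.
    ordered = sorted(sensitive_names, key=len, reverse=True)
    for index, name in enumerate(ordered):
        placeholder = f"placeholder_{index}"
        mapping[placeholder] = name
        # \b keeps the match to whole identifiers only.
        text = re.sub(rf"\b{re.escape(name)}\b", placeholder, text)
    return text, mapping


def restore(text: str, mapping: dict[str, str]) -> str:
    """Put the original names back into text that uses the placeholders."""
    for placeholder, name in mapping.items():
        # A function replacement avoids backslash-escape surprises in `name`.
        text = re.sub(rf"\b{re.escape(placeholder)}\b", lambda _m, n=name: n, text)
    return text
```

For example, `anonymize(query, ["acme_invoices", "acme_clients"])` gives you a version of the query you could put on a public gist, and `restore` puts the real names back into the answer locally. It won't catch sensitive values you forgot to list, so it's a convenience, not a guarantee.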
Precisely because I can abstract it is why I use ChatGPT. It can do the boring, tedious, repetitive stuff instead of me and has shown me the joy of using programming to solve ACTUAL problems yet again, instead of having to spend hours on unimportant problems like "how do I do X with library Y".