by embedding-shape 76 days ago
Yeah, I suppose, but how do I get sufficiently high-quality synthetic data without either sending the original data to OpenAI/Anthropic or using local models, none of which seem strong enough to generate that "sufficiently high-quality synthetic data" in the first place?

You could rent GPU time yourself and use it to run a higher-quality local model (e.g. one of the Chinese "close to frontier" ones). That's not guaranteed to preserve privacy, of course, but it at least avoids sending the data directly to OpenAI/Anthropic.
I couldn't, as this is essentially handing over the most private data I have to third parties. I don't really care what country they're based in; it's not a possibility.