| HN Mirror

	by californical 49 days ago
	> the safeguards will limit effectiveness through methods such as prompt modification, steering vectors, or parameter-efficient fine-tuning (PEFT)

	Holy crap that is dark. I like learning about ML for fun, and now I have to assume that their model is intentionally misinforming me to sabotage my learning? It is absolutely bananas that somebody decided that was ok behavior.

claysmithr 49 days ago

time to support open source and local models

jcgl 49 days ago

I don’t see how that helps, unless you actually mean open source, rather than open weights like most people do. Without everything that goes into the model, including training data, these things are opaque.

Spooky23 49 days ago

Actual open source is hard without a big war chest that allows you to flagrantly steal the training data.

philipkglass 49 days ago

The raw training data is so large that very few parties could host it for free even if there weren't copyright barriers.

But I think you could have a full open source training software pipeline that's set up to work with Wikipedia, Common Crawl, Books3, Library Genesis, Anna's Archive, and whatever other useful data sets people can name. There would just be a step where you have to provide your own copy of Library Genesis (or whatever subset of it you have managed to obtain).

jcgl 49 days ago

That may very well be the case. In fact, I'm nearly certain that you're right. But it doesn't change the fact that open weight models are altogether insufficient on a number of important dimensions regarding freedom and transparency. And so often (such as the comment I replied to, I think), even technical people seem to just ignore the difference. Open weights are just weights. No amount of open-washing changes that.

seb1204 49 days ago

Honest question, I wonder why that is? Surely we have smart humans that did not read and learn "all the books". Can AI not be trained by re-reading material multiple times to reinforce?

c-linkage 48 days ago

Start up a SETI@home-style open source LLM training project! Assuming there is a way to merge the sub-models trained on each user's home PC into a larger model...

jcgl 48 days ago

That's not something anyone knows how to do reliably, right? It sounds a lot like the problem where transformers can't be updated or taught over time.
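
For illustration only, the simplest form of "merging" is a weighted average of each contributor's parameters, in the spirit of federated averaging. The sketch below is an assumption about what such a merge step could look like, not something proposed in this thread: the names `merge_weighted`, `Params`, and `num_examples` are hypothetical, and parameters are plain lists of floats rather than real tensors. It also presumes every contributor trained the same architecture, and it says nothing about whether averaging independently trained LLMs yields a useful model, which is exactly the open question here.

```python
from typing import Dict, List, Sequence

# Hypothetical representation: parameter name -> flat list of values.
Params = Dict[str, List[float]]


def merge_weighted(models: Sequence[Params], num_examples: Sequence[int]) -> Params:
    """Average parameters across contributors, weighted by examples seen.

    A naive sketch of a federated-averaging-style merge. It assumes all
    contributors share identical parameter names and shapes.
    """
    if not models or len(models) != len(num_examples):
        raise ValueError("need one example count per model")
    total = sum(num_examples)
    if total <= 0:
        raise ValueError("total example count must be positive")

    reference = models[0]
    merged: Params = {}
    for name, ref_values in reference.items():
        acc = [0.0] * len(ref_values)
        for params, n in zip(models, num_examples):
            if set(params) != set(reference):
                raise ValueError("models have different parameter names")
            values = params[name]
            if len(values) != len(ref_values):
                raise ValueError(f"shape mismatch for parameter {name!r}")
            weight = n / total  # contributors with more data count for more
            for i, v in enumerate(values):
                acc[i] += weight * v
        merged[name] = acc
    return merged
```

For example, `merge_weighted([{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}], [1, 3])` weights the second contributor three times as heavily and returns `{"w": [2.5, 3.5]}`.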

lostphilosopher 49 days ago

Someone could write a cyberpunk Three Body Problem with this premise.

crabmusket 49 days ago

They kinda did (though it's more inspired by Trusting Trust than AI)

https://corecursive.com/coding-machines-with-don-and-krystal...

kreelman 49 days ago

TLDR :-)

This comment is not entirely on point with yours; it circles around and above it looking for lift, though.

If you're not doing work that requires your code to stay in home-nation data centres, Claude for DeepSeek, DeepClaude (https://github.com/aattaran/deepclaude), is a great way to get better at using Claude-like tools for software development. It even does a pretty good job of putting together cover letters for job applications...

Using DeepClaude is much cheaper than using Claude. For hobby projects, I've found it useful. A recipe (for cooking) management app I've made took a couple of hours to put together and cost US$0.50. Claude is far more expensive.

The downsides of DeepClaude for many are:

- DeepSeek is a Chinese corporation so the Chinese Communist Party may ask for data if it wants it.

- DeepClaude isn't as fast as normal Claude, though it's still pretty fast and I think fast enough (YMMV).

- DeepClaude might not be as optimised for various code issues that Claude may be able to solve more quickly or effectively.

- The same safeguards are probably on DeepSeek, but you won't be "wasting" as much money as you might on using Claude.

Inference-focused hardware (https://www.youtube.com/watch?v=nvPqHoVSenE, AI-generated speech) may in the medium term cut cost and energy use enough, relative to hosted LLM tools like Claude, to make local LLMs more attractive.

Inference-focused hardware would make running Open Source models like DeepSeek on local machines far cheaper, and control over safeguards would return to the end user.

Hopefully this leads to a localised LLM provision market where local businesses provide varieties of these "local" LLM services. Here, local could mean anything from on-premises through to state or nationally based LLM services. Eventually, government orgs outside of the US may demand this kind of LLM use, in the same way governments legally require data to be stored within national borders for many critical government functions.

A bloke can dream I guess...

...Could affordable inference-focused hardware also cause the bottom to fall out of these stock-market-bending valuations for AI corps and their datacentre obsessions?... Not to mention the societal costs caused by the AI super corps building these data centres. At the moment, they're nearly making a profit... They seem almost like speculative companies... Is that a term?
