Hacker News
by indigochill 994 days ago
> For example, if you ask AI to perform certain editing jobs on a piece of content that may be somehow connected to a political or a religious issue it will refuse to do so.

Some AI services (especially those run by megacorps with lively legal departments) are gimped this way (usually, as I understand it, by a sort of "thought police" model running on top of the core model), but once you get to self-hostable models, not all of them have such limitations.
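To make the "model on top of the core model" idea concrete, here is a minimal sketch. Everything in it is hypothetical: `core_model`, `policy_filter`, and the topic list are stand-ins made up for illustration, and a real service would likely use a trained classifier rather than keyword matching. It is not a description of how any particular provider does it.

```python
# Hypothetical topic list; a real filter would be a separate classifier model.
BLOCKED_TOPICS = {"politics", "religion"}
REFUSAL = "Sorry, I can't help with that."


def core_model(prompt: str) -> str:
    """Stand-in for the underlying model: it just echoes an 'edited' prompt."""
    return f"[edited] {prompt}"


def policy_filter(prompt: str) -> bool:
    """Return True if the prompt touches a blocked topic (keyword check only)."""
    lowered = prompt.lower()
    return any(topic in lowered for topic in BLOCKED_TOPICS)


def hosted_service(prompt: str) -> str:
    """Hosted setup: the filter runs first and can refuse before the core model is called."""
    if policy_filter(prompt):
        return REFUSAL
    return core_model(prompt)


def self_hosted(prompt: str) -> str:
    """Self-hosted setup without the extra layer: the core model sees everything."""
    return core_model(prompt)
```

With this shape, the same editing request can be refused by `hosted_service` and handled by `self_hosted`, even though the core model is identical in both.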

1 comment

How good are the self-hostable models, though, compared to those run by megacorps?
Well, "good" has a few dimensions to it:

1. Speed of output (not very fun to wait multiple seconds for each letter to be output)

2. Coherence of output (how far back does the model remember the context of the conversation?)

3. Variety of output (how's the diversity of the model's vocabulary? How about topics it can plausibly discuss?)

You can easily get comparable speed, so there's nothing really of interest to compare there.

I haven't done particularly strenuous coherence comparisons, but for my uses, at least, megacorp and self-hosted models are pretty comparable. That said, you do need the better models to get the best coherence, simply because they retain more tokens in memory (a larger context window).
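The "retain more tokens in memory" point can be sketched roughly like this. It assumes a simple keep-the-newest-turns policy and counts whitespace-separated words as tokens; both are simplifications for illustration (real models use subword tokenizers, and `visible_context` is a made-up helper, not any library's API).

```python
def visible_context(turns, max_tokens):
    """Return the most recent turns that fit within max_tokens.

    Tokens are approximated as whitespace-separated words (an assumption;
    real tokenizers split text into subword units).
    """
    kept = []
    used = 0
    # Walk backwards from the newest turn and stop once the budget is spent.
    for turn in reversed(turns):
        cost = len(turn.split())
        if used + cost > max_tokens:
            break
        kept.append(turn)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


conversation = [
    "My dog is named Biscuit",  # 5 words
    "He likes long walks",      # 4 words
    "What should I feed him",   # 5 words
]

small = visible_context(conversation, max_tokens=9)   # smaller budget
large = visible_context(conversation, max_tokens=20)  # larger budget
```

Here `small` no longer contains the turn that names the dog, so a model limited to that window has nothing to go on for "him", while `large` still holds the whole conversation.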

Variety is, in my opinion, where the megacorp models still rule. Most of my dabbling has been with models designed to be writing assistants. They can certainly generate plausible strings of words and follow a general theme, but they barely "know" anything (generally, when using them to write fiction, you provide a "factbook" that they can work from). ChatGPT, by comparison, can generate plausible responses to a surprising breadth of technical questions, although its answers definitely feel like they come from scraping certain online sources, since it's decent at answering devops questions but bad at obscure grammar and physics questions, at least in my experience.
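For anyone unfamiliar with the "factbook" approach, a rough sketch of the idea: the facts get pasted in front of the writing request so the model can draw on them. The `build_prompt` helper, the prompt layout, and the sample facts are all made up for illustration; actual writing-assistant tools each have their own format.

```python
def build_prompt(factbook, instruction):
    """Prepend factbook entries to the instruction as a plain-text preamble.

    factbook maps a subject to a short fact about it. The layout below is an
    assumed format, not one required by any particular tool.
    """
    lines = ["Facts to stay consistent with:"]
    for subject, fact in factbook.items():
        lines.append(f"- {subject}: {fact}")
    lines.append("")  # blank line between the facts and the request
    lines.append(instruction)
    return "\n".join(lines)


# Hypothetical story facts.
factbook = {
    "Mira": "ship's navigator, afraid of open water",
    "The Heron": "a slow cargo barge with a patched hull",
}

prompt = build_prompt(factbook, "Continue the scene as the storm hits.")
```

The resulting `prompt` string is what would be sent to the model, so the "knowledge" lives in the prompt rather than in the model itself.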