Hacker News
by surfingdino 998 days ago
I have used AI to produce an audio version and translations of my short story. I am not impressed. The best tools generate high-quality audio but miss a lot of the nuance of the written text and sometimes distort simple words. When it comes to translation, 80-90% of the sentences had to be fixed for grammar, spelling, or other reasons (modifications to the meaning of the sentence, bad handling of gender in gendered languages). I came to the conclusion that writers, editors, proofreaders, and translators have nothing to fear from AI. I also noticed that AI is programmed to avoid getting its creators into legal/PR trouble. For example, if you ask AI to perform certain editing jobs on a piece of content that may be somehow connected to a political or a religious issue it will refuse to do so. Not much help if you want to produce a piece of writing that may be a commentary on such issues, be it an article or a screenplay.
2 comments

> For example, if you ask AI to perform certain editing jobs on a piece of content that may be somehow connected to a political or a religious issue it will refuse to do so.

Some AI services (especially those run as services by megacorps with lively legal departments) are gimped this way (usually as a sort of "thought police" model running on top of the core model, as I understand), but once you get to self-hostable models not all have such limitations.

How good are the self-hostable models, though, compared to those run by megacorps?
Well, "good" has a few dimensions to it:

1. Speed of output (not very fun to wait multiple seconds for each letter to be output)

2. Coherence of output (how far back does the model remember the context of the conversation?)

3. Variety of output (how's the diversity of the model's vocabulary? How about topics it can plausibly discuss?)

You can easily get comparable speed, so nothing of interest to really compare there.

I haven't done particularly strenuous coherence comparisons, but for my uses, at least, megacorp and self-hosted models are pretty comparable, though you do need the better models to get the best coherence, simply because they retain more tokens in memory.

Variety is, in my opinion, where the megacorp models still rule. Most of my dabbling has been with models designed to be writing assistants, and they can certainly generate plausible strings of words and follow a general theme, but they barely "know" anything (generally, when using them to write fiction, you would provide them with a "factbook" that they can work from). ChatGPT, by comparison, can generate plausible responses to a surprising breadth of technical questions. It definitely has a feeling of being generated from scraping certain online sources, though, since it's decent at answering devops questions but bad at obscure grammar and physics questions, at least in my experience.

Doesn't say much if we don't know what AI you're talking about. The best LLM is far ahead of everything else. If it's GPT-4, then for translation, editing, and proofreading it is definitely something to fear.
It's one of the big three. I am deliberately not naming it, because I do not want to steer the conversation into "You clearly should have used LLM X". That's not the point here.
> That's not the point here.

Of course it is. If you say not to worry about technology X and you're not using the state of the art, then the claim is meaningless. "The big three" is meaningless too. The best model is far ahead of the rest.

Well, don't be coy then, what's "the best model" and how "far ahead of the rest" is it?
I already said it's GPT-4 and that it's much better than the rest.