Hacker News
by johnnyanmac 189 days ago
It's not training on books, but it will answer questions about the book you're reading. Doesn't pass the sniff test.

>My device, my content

I don't think you own the Kindle store and servers used to train the AI.

3 comments

There are LLMs that can process a 1-million-token context window. Amazon Nova 2, for one, even though it's definitely not the highest-quality model. You just put the whole book in context and have the LLM answer questions about it. And given that the domain is pretty limited, you can just store the KV cache for the most popular books on SSD, eliminating quite a bit of cost.
You could also fill the context with just the book portion that you've read. That'd be a sure-fire way to fulfill Amazon's "spoiler-free" promise.
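A minimal sketch of that "only what you've read" idea, assuming the reading position is available as a character offset into the book text. The function name, the prompt wording, and the `<excerpt>` delimiters are all hypothetical, and nothing here reflects how Amazon actually implements the feature:

```python
def build_spoiler_free_prompt(book_text: str, chars_read: int, question: str) -> str:
    """Build a prompt that contains only the part of the book already read.

    Hypothetical helper: `chars_read` is assumed to be the reader's current
    position, expressed as a character offset into `book_text`.
    """
    if chars_read < 0:
        raise ValueError("chars_read must be non-negative")

    # Slicing past the end is safe in Python; it simply returns the whole book.
    visible_text = book_text[:chars_read]

    # The text is supplied at inference time as context; no training is involved.
    return (
        "Answer using only the excerpt below. "
        "If the excerpt does not contain the answer, say so.\n\n"
        f"<excerpt>\n{visible_text}\n</excerpt>\n\n"
        f"Question: {question}"
    )
```

The returned string would then be sent to whatever long-context model is in use. Anything past the reader's position never reaches the model, so it cannot leak into the answer from the supplied text (though the model could still know the plot from elsewhere).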
Are you implying that an LLM needs to be trained on a specific piece of text to answer questions about it?
If you want proper answers, yes. If you want to rely on whatever Reddit or TikTok says about the book, then I guess at that point you're fine with hallucinations and others doing the thinking for you anyway. Hence the issues brought up in the article.

I wouldn't trust an LLM for anything more than the most basic questions if it didn't actually have text to cite.

Luckily, the LLM has the text to cite: it can be passed in at inference time, which is legally distinct from training on the data.
Having access to the text and being trained on the text are two different things.
> It's not training on books, but it will answer questions about the book you're reading. Doesn't pass the sniff test.

What do you mean? Presumably the implication is that it will essentially read the book (or search through it) in order to answer questions about it. An LLM can of course summarize text that's not in its training set.

"Reads the book" is the issue, yes. It's possible they aren't training. But to be frank, we're long past giving tech companies the benefit of the doubt that they won't attempt to train on every little thing fed into their servers.

Happy to be proven wrong, though.