by badtension 1080 days ago
And yet you didn't answer them at all.

I can expand on that (I'm the other developer working on the project).

> Seen a few of these. Are you all working on providing an easy way to maybe use LLMs for chatting/search without sending my data to OpenAI? If yes, how will you verify the quality is "reasonable"?

We're working on building a helpful AI assistant, with or without OpenAI. We use offline SentenceTransformer models for search and OpenAI (currently) for chat.
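To give a rough idea of what embedding-based search means in practice, here is a minimal sketch. It is not Khoj's code: the 3-d vectors are made up by hand and stand in for what an embedding model (such as a SentenceTransformer model) would produce from the text, and `cosine` and `rank` are hypothetical helper names used only for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query_vec, notes):
    """Return note texts sorted from most to least similar to the query."""
    scored = [(cosine(query_vec, vec), text) for text, vec in notes]
    return [text for _, text in sorted(scored, key=lambda s: s[0], reverse=True)]


# Toy vectors, invented for illustration. A real embedding model would
# compute them from the note text and would use many more dimensions.
notes = [
    ("Handed over the Corolla keys, buyer paid cash", [0.9, 0.1, 0.0]),
    ("Grocery list: eggs, milk, bread", [0.0, 0.2, 0.9]),
    ("Renewed gym membership", [0.1, 0.9, 0.2]),
]

# Stand-in for the embedding of the query "sold my car for".
query_vec = [0.8, 0.2, 0.1]

ranked = rank(query_vec, notes)
print(ranked[0])
```

The point is that the query and the top note share no keywords; they only end up close together in vector space, which is what lets natural language queries work where keyword matching would miss.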

To let users verify quality: for search, you have to judge the quality of the results returned. For chat, we pass along the references (from your docs) used to generate the response. A lot more should be done, and we're open to suggestions.

We also have our own chat quality test suite that "benchmarks" chat capabilities (via pytest).

> How is this better than Rewind, Needl, Mem, etc all the personal search engine that have been doing the rounds lately from various knowledge bases? Is the selling point that it's Open-source? Also if Apple improves spotlight, I wonder how useful this will be.

- I've tried Rewind. It's a neat project with a slick UI, no doubt about it. But (1) it has a cold-start problem (you can only search stuff you've opened since you installed Rewind) and (2) it's limited to Mac (M1+) machines. Khoj will index all supported files across your data sources, and it can run on other machines easily.

- Needl, based on their homepage, seems to provide fuzzy/keyword-based search. Khoj search works offline and supports natural language queries (e.g. search for "sold my car for" and it'll find notes about your Toyota Corolla or Ferrari).

- Mem.ai is pretty neat as well. We'd love to add all the features they have. With Khoj you can self-host if you prefer or use Khoj cloud if you want to sync across devices. And it integrates into your existing tools (Emacs, Obsidian and Web)

In summary, being open-source is a critical differentiator for Khoj: for an AI assistant to be trustworthy, you need to be able to see what the code is doing. Beyond that, each of these tools also takes a different approach to AI assistance.