by Kiro 1080 days ago
You're being overly critical. You can definitely control the alignment of your assistant with prompt engineering and embeddings. They never say they control the underlying model.

It's an open source project and I don't see why you need to be so obnoxious about it.


Are they being obnoxious without cause though?

The Khoj website says, and I quote:

> Khoj's offline AI models allow you to find information using natural language queries. Search using terms that are similar to what you're looking for, rather than exact or fuzzy matches. Khoj search works offline. So if you self-host your data never leaves your machine and search works without internet.

Emphasis mine.

It seems somewhat disingenuous.

I get it, parts of it run offline, parts of it use the openai api… but that’s not what it says on the box.

Why is the project making a song and dance about self hosting and being open source when it’s just another openai app?

If it’s not just another openai wrapper, cut the openai part of it out and pitch it that way, sure.

…but as it stands, I’m pretty sceptical.

Lots of people are doing the “ai magic” tech demo stuff at the moment, but when you cut them off from the openai api the magic goes away and what’s left isn't very good or interesting.

Maybe this is different? …but it doesn’t look like it; and since they’re tied up with the openai api and you can’t use it without that, how would I even tell?

>> Khoj search works offline. So if you self-host your data never leaves your machine and search works without internet.

> Emphasis mine.

> It seems somewhat disingenuous.

I've been trying it. Khoj search does work offline. Khoj chat (they are literally separate functions in the app) requires an OpenAI key and, if you give it one, uses OpenAI.

Yes! It's a bit more than "somewhat disingenuous" to say a system built to use the OpenAI API works with you to make sure "your data never leaves your machine".

That's like saying I invented a new form of transportation where your feet never leave the ground, but in actuality I'm just a travel agent sending you to the airport.

"your data never leaves your machine" is only mentioned in the Search section, where it is true. No-one reading that would assume it applied to everything, considering the last two sentences above, in the Chat section, explicitly say it's using OpenAI.

Really feels like people are nitpicking and hating on this project for no good reason. I feel sorry for the authors.

It feels like you are reading too much into this. Really don't understand all the bashing here. It's open source software for building things using OpenAI. Do you think LangChain is similarly disingenuous? Or the Vercel AI SDK?

Neither of those things claims:

> So if you self-host your data never leaves your machine

You're quoting the paragraph under "Search", describing their search engine. I feel you're misrepresenting it.

Anyway, I definitely don't think this deserves to be described as "a simple interface for the OpenAI-API drenched in fake buzzwords boosted to the top of HN to scam investors" or "twitter get-rich-quick-guru level lousy and fake, and is clearly boosted to the top of HN".

Horrible reactions in this thread to open source software you can fork to use whatever you want. Really disappointing.

LangChain works completely offline with an appropriate LLM/API backend and a vector store if needed.

No-one is preventing you from creating a PR or forking this project to add whatever backend you want. Did LangChain fully cover all backends on release? Are you not allowed to release a project that only supports OpenAI?

You really need to explain what you are hating on here.

STFU, all I said is that what they claim is not the state of their project today. Don't even get me started on their alignment BS.

It says open source AI personal assistant.

The AI isn't open source, and sending your data to a third party isn't really trustworthy or personal.

I understand your concerns, but let me zoom out a little here and talk about the nature of open source.

Open source means that the source code which is developed for a piece of software is fully open (i.e., anyone can read, fork, or modify the code behind what they are installing).

According to your definition, it would be really hard to do anything that is fully, end to end, open source. We've developed the code on Macs, hosted the code on GitHub, written plugins for Obsidian and GitHub, hosted the website on AWS. All of those are closed-source software.

https://www.redhat.com/en/topics/open-source/what-is-open-so...

That being said, we are planning to integrate an open source LLM soon. When we added chat, OpenAI just had the best one, but the space is changing so quickly. We're both super enthusiastic about seeing all the open source tooling for this stuff evolve.

The problem is not that there is "glue" to closed-source apps. It's that the essential core of your product, without which your product has no content or meaningful use — is _someone else's closed-source model._

If I market "a totally creative-commons blockbuster Hollywood movie", but my actual product is just a creative-commons-licensed set of driving directions to some nearby movie theater where you can buy tickets to see the same copyrighted movies anyone else is offering, then _the fundamental essence_ of what I'm offering is not, in fact, creative-commons. I sold people on a _movie_ with that license, and then failed to deliver.

That's what you've done here.

To be clear, the fatal flaw is that your marketing is dishonest about what your product currently is, not that your product is something nobody wants. I'd recommend either making your marketing honest, or else making your product live up to what your marketing promises.

_Then_ you do the PR push on HN.

> it would be really hard to do anything that is fully, end to end, open source. We've developed the code on Macs, hosted the code on GitHub, written plugins for Obsidian and GitHub, hosted the website on AWS.

Yeah, nobody did open source before those things existed...

I don't understand the presumption that the AI should be open source here. If I release an open source SDK for talking to an API, it's still open source even if the underlying API isn't.

> I don't understand the presumption that the AI should be open source here.

Because it literally says "open-source AI".

It says "open-source AI assistant".

Exactly, it doesn't say AI open source assistant. Everything mentioned after open source should be open source.

Be serious.

I'm dead serious. The assistant talking to the AI is open source. How else would you describe it? You guys are really doing everything in your power to discredit this. I really don't understand the hostile attitude.