Show HN: google-like search for workplace knowledge (open-source) (github.com)
15 points by Roeylalazar 1192 days ago
3 comments

Thanks for sharing! Looks great. Care to share your thoughts on the model decisions in the models.py file? I assume you have a take on the speed vs. storage cost vs. relevance quality (and/or licensing) of these models compared with the others listed at https://huggingface.co/spaces/mteb/leaderboard and https://www.sbert.net/docs/pretrained_models.html:

  bi_encoder = SentenceTransformer('multi-qa-MiniLM-L6-cos-v1')

  cross_encoder_small = CrossEncoder('cross-encoder/ms-marco-TinyBERT-L-2-v2')
  cross_encoder_large = CrossEncoder('cross-encoder/ms-marco-MiniLM-L-6-v2')

  qa_model = pipeline('question-answering', model='deepset/roberta-base-squad2')
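
For context on the trade-off being asked about: a bi-encoder paired with a cross-encoder is commonly used for retrieve-then-rerank, where a cheap scorer scans every document and a costlier scorer reorders only the top few. The sketch below assumes that pattern; the thread does not confirm how gerev actually wires these models together. The two scorers are toy stand-ins (word overlap and word-pair order), not the real models, and every name in it is hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Scorer = Callable[[str, str], float]


def token_overlap(query: str, doc: str) -> float:
    """Cheap stage-1 stand-in (assumption): Jaccard overlap of word sets."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    union = q | d
    return len(q & d) / len(union) if union else 0.0


def ordered_match(query: str, doc: str) -> float:
    """Costlier stage-2 stand-in (assumption): share of adjacent query
    word pairs that also appear adjacent, in the same order, in the doc."""
    q_words, d_words = query.lower().split(), doc.lower().split()
    q_pairs = list(zip(q_words, q_words[1:]))
    d_pairs = set(zip(d_words, d_words[1:]))
    if not q_pairs:
        return 0.0
    return sum(pair in d_pairs for pair in q_pairs) / len(q_pairs)


def retrieve_then_rerank(
    query: str,
    docs: Sequence[str],
    cheap_score: Scorer = token_overlap,
    costly_score: Scorer = ordered_match,
    top_k: int = 3,
) -> List[Tuple[float, str]]:
    # Stage 1: score every document with the cheap scorer, keep top_k.
    candidates = sorted(docs, key=lambda d: cheap_score(query, d), reverse=True)[:top_k]
    # Stage 2: run the costly scorer only on the shortlisted candidates.
    scored = [(costly_score(query, d), d) for d in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    corpus = [
        "setup test env for auth service",
        "auth service test env teardown notes",
        "lunch menu for friday",
        "service env auth test setup for",
    ]
    for score, doc in retrieve_then_rerank("setup test env for auth service", corpus):
        print(f"{score:.2f}  {doc}")
```

The cost argument is that the second scorer only ever sees top_k documents, so a slower but more accurate model stays affordable; presumably a small versus large cross-encoder is the same trade-off one level down.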

Cool stuff! Excited to see where this goes
tl;dr: I built gerev, an open-source workplace search engine (you could say it's a privacy-centric glean.com alternative).

Hi, it's Roey. I'm the co-founder of gerev.ai ("gerev" means a single sock in Hebrew).

Gerev is a Google-like search engine for workplace apps. It lets you find anything from code snippets to conversations to relevant docs.

It supports natural language queries, so a query like "how to setup test env for auth service?" yields (a snippet extracted from a Confluence page):

  curl ...eu.amazonaws.com/setup_auth.sh | sh
  export PYTEST_PLUGINS=auth.test_plugin.AuthPlugin
  pytest -v --...

Gerev is 100% self-hosted, as easy as "docker run".