|
|
|
|
|
by rdli
763 days ago
|
|
I'm working on something like this! It's simple in concept, but there are lots of fiddly bits. A big one is performance (at least, without spending $$$$$ on GPUs.) I haven't found that much in terms of how to tune/deploy LLMs on commodity cloud hardware, which is what I'm trying this out on. |
|
Also, don’t discount plain old BM25 and fastText. For many queries, keyword or bag-of-words based search works just as well as fancy 1536 dim vectors.
You can also do things like tokenize your text using the tokenizer that GPT-4 uses (via tiktoken for instance) and then index those tokens instead of words in BM25.