by RicoElectrico 656 days ago
I wonder if it will work on https://github.com/organicmaps/organicmaps

So far, two similar solutions I tested crapped out on non-ASCII characters, because Python's UTF-8 decoder is quite strict about it.
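For anyone hitting the same thing, here is a minimal sketch of the failure mode and one lenient fallback. Valid UTF-8 with non-ASCII characters decodes fine; the strict decoder only raises on bytes that are not valid UTF-8 (for example, a file saved as Latin-1). The helper name `read_text_lenient` is made up for illustration and is not taken from any of the tools discussed here.

```python
def read_text_lenient(raw: bytes) -> str:
    """Decode bytes as UTF-8, falling back to replacement characters.

    Hypothetical helper: strict decoding is tried first so that clean
    files are returned unchanged; only invalid byte sequences are
    replaced with U+FFFD instead of raising UnicodeDecodeError.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("utf-8", errors="replace")


# Valid UTF-8 with non-ASCII characters round-trips without trouble.
ok = read_text_lenient("Zażółć gęślą jaźń".encode("utf-8"))

# A Latin-1 encoded byte (0xE9 for "é") is not valid UTF-8 on its own,
# so the strict decoder would raise here; the fallback substitutes U+FFFD.
lossy = read_text_lenient(b"caf\xe9")
```

Whether replacing bad bytes is acceptable depends on the use case; for embedding source code it is probably better than aborting the whole run, but that is a judgment call.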


OP's cofounder here. Thanks for pointing out this test case. It surfaced that we weren't handling symlinks properly. With this fix, I was able to successfully embed and index most of the repo (though I stopped at 100 embedding jobs so that we don't burn through OpenAI credits).

P.S. You'll see a bunch of warnings for e.g. binary files that are ignored. https://github.com/Storia-AI/repo2vec/commit/1864102949e7203...
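To illustrate the general idea (this is a rough sketch, not the code from the linked commit), a repo walker can skip symlinks and warn on binary files roughly like this. The function name `iter_text_files` and the NUL-byte heuristic for detecting binaries are assumptions made for the example.

```python
import os
import warnings


def iter_text_files(root: str):
    """Yield paths of regular, non-binary files under root.

    Hypothetical sketch: symlinks are skipped rather than followed, and a
    file is treated as binary if its first 8 KiB contains a NUL byte.
    """
    # followlinks=False keeps os.walk from descending into symlinked dirs.
    for dirpath, _dirnames, filenames in os.walk(root, followlinks=False):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symlinked files (may dangle or loop)
            with open(path, "rb") as f:
                chunk = f.read(8192)
            if b"\0" in chunk:
                warnings.warn(f"Skipping binary file: {path}")
                continue
            yield path
```

The NUL-byte check is a cheap heuristic and will misclassify some files (UTF-16 text, for instance), so treat it as a starting point.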

OP here! I love this stress test. Will index and get back to you!