| HN Mirror



	by skp1995 630 days ago
	ohh interesting.. I tried it on a couple of big repos and it was a bit of a miss for me. How large are the codebases you work on? I want to get a sense check on where the behavior deteriorates with embedding + GPT-3.5 based reranker search (not sure if they are doing more now!)

1 comments

yen223 630 days ago

Largest repo I used with Cursor was about 600,000 lines long


skp1995 630 days ago

that's a good metric to aim for... creating a full local index for 600k lines is pretty expensive, but there are a bunch of heuristics which can take us pretty far:

- looking at git commits
- making use of recently accessed files
- keyword search

If I set these constraints and allow for maybe around 2 LLM round trips we can get pretty far in terms of performance.
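
To make the idea concrete, here is a rough sketch of how those three signals could be combined to shortlist files before spending any LLM round trips. This is only an illustration, not how Cursor or any other tool actually does it: the names `keyword_score` and `rank_candidates`, the weights, and the assumption that the recently committed and recently accessed paths are already available as sets are all made up for the example.

```python
from collections import Counter
import re


def keyword_score(query: str, text: str) -> int:
    """Count occurrences of the query's words in the text (case-insensitive)."""
    words = Counter(re.findall(r"\w+", text.lower()))
    return sum(words[term] for term in set(re.findall(r"\w+", query.lower())))


def rank_candidates(files, recently_committed, recently_accessed, query, top_k=3):
    """Rank file paths using cheap signals instead of a full index.

    files: dict mapping path -> file contents
    recently_committed / recently_accessed: sets of paths (assumed to be
    collected elsewhere, e.g. from git history and editor state)
    The bonus weights below are arbitrary placeholders.
    """
    scored = []
    for path, text in files.items():
        score = float(keyword_score(query, text))
        if path in recently_committed:
            score += 2.0  # hypothetical bonus for files touched in recent commits
        if path in recently_accessed:
            score += 3.0  # hypothetical bonus for files the user opened recently
        scored.append((score, path))
    # Highest score first; ties broken alphabetically for deterministic output.
    scored.sort(key=lambda item: (-item[0], item[1]))
    # Drop files with no signal at all.
    return [path for score, path in scored[:top_k] if score > 0]
```

The shortlist from `rank_candidates` would then be what gets passed to the model, so the roughly 2 round trips are spent on a handful of files rather than the whole repo.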
