|
|
|
|
|
by throwup238
404 days ago
|
|
An important Cursor feature that no one else seems to have implemented yet is documentation indexing. You give it a base URL and it crawls and generates embeddings for API documentation, guides, tutorials, specifications, RFCs, etc in a very language agnostic way. That plus an agent tool to do fuzzy or full text search on those same docs would also be nice. Referring to those @docs in the context works really well to ground the LLMs and eliminate API hallucinations Back in 2023 one of the cursor devs mentioned [1] that they first convert the HTML to markdown then do n-gram deduplication to remove nav, headers, and footers. The state of the art for chunking has probably gotten a lot better though. [1] https://forum.cursor.com/t/how-does-docs-crawling-work/264/3 |
|