Need Help Optimizing JSON Search with Cached Path Traversal (github.com)
3 points by abdheshnayak 618 days ago
1 comment

Hey HN,

I’ve built a project called search-in-json (https://github.com/abdheshnayak/search-in-json) that searches JSON objects of any structure using regex. It also returns the path to each match, which makes nested data easier to navigate. However, I’m facing a performance problem and could use some help improving it.

Current Issue: Right now, every search starts traversing from the root of the JSON, even if parts of the data have already been visited by an earlier search. This works fine for small JSON objects but becomes slow with larger, deeply nested structures.
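
To make the problem concrete, here is a minimal Python sketch of that kind of root-first search. It is not the project's actual code: the names `iter_leaves` and `search` are hypothetical, and it assumes the regex is matched against stringified leaf values only (not keys).

```python
import re
from typing import Any, Iterator, List, Tuple

Path = Tuple[Any, ...]  # object keys (str) and array indices (int)

def iter_leaves(node: Any, path: Path = ()) -> Iterator[Tuple[Path, Any]]:
    """Yield (path, value) for every scalar leaf, depth-first, in document order."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from iter_leaves(child, path + (key,))
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from iter_leaves(child, path + (index,))
    else:
        yield path, node

def search(doc: Any, pattern: str) -> List[Path]:
    """Naive search: walks the whole document from the root on every call."""
    regex = re.compile(pattern)
    return [path for path, value in iter_leaves(doc) if regex.search(str(value))]

doc = {"user": {"name": "alice", "tags": ["admin", "ops"]}, "note": "alice was here"}
hits = search(doc, r"alice")
```

Each call to `search` costs a full traversal, so N searches over the same document cost N traversals.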

What I’m Trying to Achieve: I want to cache previously traversed paths so that:

    If a search cursor lands past nodes that have already been visited, traversal can resume from the last known point instead of restarting at the root.
    This would reduce redundant traversals and improve performance, especially for large JSON files with multiple searches.
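
One possible way to get this behaviour, sketched in Python under some assumptions: keep a single paused depth-first walk over the document, record every leaf it has produced, and let each new search replay the recorded leaves before resuming the walk. `CachedSearcher` and its methods are hypothetical names, and the sketch assumes the document is not mutated between searches.

```python
import re
from typing import Any, Iterator, List, Optional, Tuple

Path = Tuple[Any, ...]

def iter_leaves(node: Any, path: Path = ()) -> Iterator[Tuple[Path, Any]]:
    """Yield (path, value) for every scalar leaf, depth-first."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from iter_leaves(child, path + (key,))
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from iter_leaves(child, path + (index,))
    else:
        yield path, node

class CachedSearcher:
    """Walks the document lazily and remembers every leaf already visited."""

    def __init__(self, doc: Any) -> None:
        self._walker = iter_leaves(doc)           # paused, resumable traversal
        self._seen: List[Tuple[Path, str]] = []   # leaves visited so far
        self.visits = 0                           # leaves pulled from the real document

    def _leaves(self) -> Iterator[Tuple[Path, str]]:
        # Replay cached leaves first, then resume the paused traversal.
        i = 0
        while True:
            if i >= len(self._seen):
                try:
                    path, value = next(self._walker)
                except StopIteration:
                    return  # the whole document has been visited
                self.visits += 1
                self._seen.append((path, str(value)))
            yield self._seen[i]
            i += 1

    def find_first(self, pattern: str) -> Optional[Path]:
        regex = re.compile(pattern)
        for path, text in self._leaves():
            if regex.search(text):
                return path
        return None

    def find_all(self, pattern: str) -> List[Path]:
        regex = re.compile(pattern)
        return [path for path, text in self._leaves() if regex.search(text)]
```

With this shape, a `find_first` that stops early leaves the walk paused at that point, and a later search only touches the unvisited remainder of the document. The trade-off is that the cache holds one stringified copy of every visited leaf.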

Challenges:

    Efficient caching: How to store paths in memory in a way that makes them quick to reuse.
    Edge cases: Handling complex JSON structures with nested objects and arrays.
    Minimal memory overhead: Avoid using too much memory for caching paths.
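
For the caching and memory points, one candidate structure (an assumption, not something the project currently does) is a flat index with parent pointers: each node stores only its parent's index and its own key, so shared path prefixes are stored once and full paths are rebuilt on demand. The hypothetical `build_index` below also uses an explicit stack rather than recursion, so very deep nesting does not hit Python's recursion limit.

```python
from typing import Any, List, Tuple

def build_index(doc: Any) -> Tuple[List[int], List[Any], List[Any]]:
    """Flatten doc into parallel lists: parents[i], keys[i], values[i] for node i.

    Node 0 is the root. values[i] is a reference to the original object,
    not a copy. Nodes are not stored in document order.
    """
    parents: List[int] = [-1]
    keys: List[Any] = [None]
    values: List[Any] = [doc]
    stack = [0]  # indices of nodes whose children still need to be added
    while stack:
        i = stack.pop()
        node = values[i]
        if isinstance(node, dict):
            children = list(node.items())
        elif isinstance(node, list):
            children = list(enumerate(node))
        else:
            continue  # scalar leaf: nothing to expand
        for key, child in children:
            parents.append(i)
            keys.append(key)
            values.append(child)
            stack.append(len(values) - 1)
    return parents, keys, values

def path_of(i: int, parents: List[int], keys: List[Any]) -> Tuple[Any, ...]:
    """Rebuild the full path of node i by following parent pointers to the root."""
    parts = []
    while parents[i] != -1:
        parts.append(keys[i])
        i = parents[i]
    return tuple(reversed(parts))

doc = {"a": {"b": [10, 20]}, "c": "x"}
parents, keys, values = build_index(doc)
path = path_of(values.index(20), parents, keys)
```

A search can then scan `values` for matching leaves and call `path_of` only for the hits, which keeps the per-node cache cost to one integer and one key reference.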

How You Can Help:

    Suggestions on data structures that might work well for caching paths.
    Advice on algorithmic improvements for path reuse.
    Any experience with similar traversal optimizations in tree-like data structures.
Here’s the link to the project: https://github.com/abdheshnayak/search-in-json

Looking forward to your ideas and suggestions! Any help would be greatly appreciated.

Thanks! Abdhesh