|
Congratulations on the launch - this looks great and I've been waiting for years for something like this! As a researcher who mostly uses Python, and explores/navigates a large number of repos for a short time, often written by other researchers not necessarily trained in software best practices, I was always frustrated (and surprised) that there was no VS Code extension or tool that gave me a quick overview/visualization to get a high level gist of different modules and code/data flow! I tried this with a bunch of small open-source repos and it works great! I imagine using LLM might be a hard no for some people/enterprises - any plans to use stand-alone licenses with small local models? It seems like for what LLM is doing here (if I understand it right, help label the modules in natural language and perhaps help organize them into this hierarchy/modules) you don't necessarily need a SoTA model, right? Also, this could be coming from LLMs, but I see that the visualizations are more biased towards terminology used in web-dev? (for example, one of my robot related repo was organized into front-end, back-end, etc. with I guess is kinda right but not exactly lol). It would be nice to see an interactive visualization where I can iterate on the initial viz with information I know, e.g., I drag and drop a module or rename it and then you probably do another pass with this feedback and LLMs and update my overall visualization with more domain specific labels and partitions? Edit: Exploring CodeViz on a few more repos, and it seems like you have a set of hardcoded labels for the highest hierarchy in the architecture diagram? (so far, I've only seen Users, Databases, Backend, Frontend, and Shared Components). I am guessing this is something passed on in the prompts? It'll be nice to allow user to define their own set of labels/partitions at one or more levels and then try to create an architecture visualization that fits into these labels/constraints (although I am guessing at some point you have to be wary of hallucinations?) |
The top level categorizations are indeed fixed, however the nodes themselves can be arbitrary. We've found this helps with grouping and organization while still allowing for the flexibility required to accommodate different systems. I'm curious, are there any categories missing here that could be added?
Currently, we categorize by: Frontend (UI/UX elements), Backend (API/Business/Data Access), DB(persisting storage), External Services (Backends maintained outside codebase), Shared Components (internally maintained libraries and helpers)