by inspector14 2244 days ago
this looks great.

I would use it in a heartbeat if there were an option for a one-time license / activation fee and the ability to use it offline without associating the graphs with an account & communicating them back to a central server. my guess is that there may be more folks like me who work at companies that require a certain level of anonymity or security regarding sensitive information like database schemas. just a thought!


I'd echo this - I know a decent number of DBAs/data devs who need to generate an ERD once a year for some giant overview thing for their new VP or whatever, and Visio just doesn't cut it, Graphviz has a million ways to slice it, etc.

Maybe that market isn't that interesting to you, but the value prop of 100 bucks a month (since most real schemas that actually need visualization are going to be >50 tables) for one schema, which I then get to go cancel or w/e, isn't that strong. It'd be enough that I'd consider writing that Graphviz layer I keep screwing around with.

Thanks, good to know of the alternate constraints in other companies. Pricing aside, indeed it would be a different challenge tech-wise to have this as an offline tool.

My motivation for building this stemmed from the use case of small-to-medium dev teams. We were using offline tools (e.g. MySQL Workbench) as part of our dev process and trying to keep it updated as documentation, which was quite a nightmare to keep in sync between different devs. In this case having a central server was the silver bullet.

Curious - do you all currently use other tools (e.g. Workbench) for this?

It could be interesting to separate the visualization functionality from the syncing/sharing aspect.

For example, if you store the schema representation as a logical dump (CREATE statements) in a git repo, syncing/sharing becomes trivial. This also provides a branch workflow for collaborative editing, and the commit history serves as an authoritative changelog.

From this point of view, it could be compelling to have an offline visualization tool that simply operates on top of the current local filesystem state of a repo. Ideally this could be paired with a self-hosted server/daemon that can generate a visualization of any arbitrary commit of a remote repo on GitHub, GitLab, etc.
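To make the idea concrete, here is a minimal sketch of what such a local tool could look like, under some loud assumptions: the repo holds plain `.sql` files of CREATE TABLE statements, relationships are declared with `REFERENCES` clauses, and the function names (`parse_schema`, `to_dot`, `render_repo`) are made up for illustration. The regexes only handle simple, unqualified table names, so this is not a real SQL parser.

```python
import re
from pathlib import Path

# Hypothetical sketch: these patterns cover only simple DDL with
# unqualified table names (no `db.table`, no quoted names with spaces).
TABLE_RE = re.compile(
    r'CREATE\s+TABLE\s+(?:IF\s+NOT\s+EXISTS\s+)?[`"]?(\w+)[`"]?',
    re.IGNORECASE,
)
REF_RE = re.compile(r'REFERENCES\s+[`"]?(\w+)[`"]?', re.IGNORECASE)


def parse_schema(sql):
    """Map each table name to the sorted list of tables it references."""
    tables = {}
    # Naive statement split; a ';' inside a string literal would break it.
    for statement in sql.split(";"):
        match = TABLE_RE.search(statement)
        if match is None:
            continue
        body = statement[match.end():]
        tables[match.group(1)] = sorted(set(REF_RE.findall(body)))
    return tables


def to_dot(tables):
    """Render the table graph as Graphviz DOT text."""
    lines = ["digraph schema {", "  node [shape=box];"]
    for name in sorted(tables):
        lines.append(f'  "{name}";')
    for name in sorted(tables):
        for target in tables[name]:
            lines.append(f'  "{name}" -> "{target}";')
    lines.append("}")
    return "\n".join(lines)


def render_repo(repo_dir):
    """Read every .sql file under repo_dir and return DOT for the schema."""
    files = sorted(Path(repo_dir).rglob("*.sql"))
    sql = ";\n".join(path.read_text() for path in files)
    return to_dot(parse_schema(sql))
```

Since it only reads the working tree, checking out a different branch or commit and re-running it would give the diagram for that version; the DOT output can then be piped to Graphviz, e.g. `dot -Tsvg`.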

Disclosure: I'm the author of an open source schema management tool https://skeema.io which is designed to support a repo-driven workflow for DDL execution. So I have a heavy bias towards storing schemas declaratively, as repos of CREATE statements.

> Pricing aside, indeed it would be a different challenge tech-wise to have this as an offline tool.

Maybe not an offline tool, then, but a hostable server (or isolated enterprise deployment, if you must) instead of a central cloud service. A virtual-appliance Docker image (that you can keep updated upstream) would be ideal, I think.

Maybe a Docker image for rich clients.
What’s sensitive about a (normalized) database schema?

I can see a schema definition being “secret sauce” (i.e. a competitive advantage), but I can’t see it being literally dangerous for the company to publish (e.g. because it contains customer PII), unless you’re doing something very strange.

...in which case, that makes me want to know about the schema even more! There are probably some interesting lessons in there, if just “don’t do this; we deeply regret that we did.”

I think lots of people would be rather embarrassed to post their company's database structures. And lots of databases have table prefixes that can easily be traced to a company or product.
A lot of companies have blanket policies prohibiting sending any IP to an unvetted vendor.

In quant finance, for example, a DB schema might reveal details of a proprietary strategy that the firm does not want a third party to see.

It can be a security risk. For example, imagine if a popular web framework or ORM is found to have an exploit involving some particular data type, when combined with auto-generated HTML forms. If the companies using the framework are known, and their DB schemas are publicly available, this could be a huge target for attackers.

I'd imagine it can also be a legal concern. For example, a schema may reveal the presence of a soft-delete column, which can conflict with GDPR's right to erasure if "deleted" rows are actually retained. If the schema is made public, this could draw unwanted legal attention, even if the column is no longer actively used by any application code.