Hacker News
Ask HN: Would more apps build with Git back-end if there’d be a solid SDK?
9 points by hannes_paul 1330 days ago
Hi HN,

We are building a git-based localization solution (https://github.com/inlang/inlang) and we are wondering why there aren't more apps making use of git as their back-end.

Building on git has a bunch of advantages that are much harder to replicate with different architectures (automations, version control, minimal integration management, etc.), but very few projects use git to its full potential.

Right now, building on git is full of tedious workarounds, but do you think people would use a git-based architecture for more projects if there were a solid SDK to build with?

7 comments

Have you seen [libgit2](https://libgit2.org/) and the C# libgit2sharp? Both seem to be reasonable, albeit low-level, interfaces to a repo.

My opinion is that you’ll still desire some other data store for indexing and searching as your application grows.

I once built a blog engine using git as the storage. As a result, my blog users can view their update histories and also collaborate on a post through a pull-request-like process.

Behind the scenes, I used libgit2. By default, libgit2 uses the filesystem as its storage backend, which is more difficult to scale than a database.

You can replace the default backend with your own database backend, but it seems to require lots of work.

If there were a solid SDK built for the above cloud use case, things could be easier for me.

It would be useful if you elaborated on specific use cases.

I think git as a DB is not that useful. I'd guess in most cases you'd be better off with SQLite.

Regarding the collaboration use case, I think it's also not that interesting outside of the usual code stuff.

Maybe some git-inspired features, such as diffs or merge requests, could be built, but I don't think you'd necessarily need the real git for that.

Can you provide a few examples of apps you have in mind?

Are you indicating a service would fetch files from git based on user requests?

Oh, I see how that was unclear. I don't mean using git as a database for user requests, but git as a back-end for collaborative applications. For example, in our case, translators and devs have to collaborate to achieve localization. Normally, translators would work in some isolated cloud application, and then data pipelines between said application and the git repo are built. But why doesn't the translator-facing app just build directly on git? Then editing the translation strings would happen on the git repo, and there would be no syncing between two sources of truth (the git repo and the cloud solution).
I think this is a pattern that does get used. The method I've seen is that the English language text is saved in files in particular directories with text keys. Translators add similar files with different text for the same keys and commit to the repo. Whether this is done manually or automated isn't a big deal. The app could even run the `git` commands and operate on the output of those commands and filesystem--there's nothing particularly hard to parse.
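As a rough sketch of the "run `git` and operate on its output" idea: the Python below shells out to `git status --porcelain` and parses the result. The `locales/` directory and the helper names are assumptions made up for illustration, not anything from inlang or another tool.

```python
import subprocess


def git_status(repo_dir):
    """Run `git status --porcelain` in repo_dir and return the raw text."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def parse_porcelain(output):
    """Parse porcelain v1 lines ('XY path') into (status, path) pairs.

    XY is a two-character status code, followed by one space and the path.
    """
    entries = []
    for line in output.splitlines():
        if len(line) < 4:
            continue  # too short to be a status line
        entries.append((line[:2], line[3:]))
    return entries


def changed_translations(output, locale_dir="locales/"):
    """Return changed paths under locale_dir (a hypothetical layout)."""
    return [
        path
        for _status, path in parse_porcelain(output)
        if path.startswith(locale_dir)
    ]
```

Calling `changed_translations(git_status("."))` inside a repo would list the translation files with uncommitted changes. One caveat: git quotes paths that contain unusual characters, and renames show up as `old -> new`, which this sketch does not handle.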
I think asking translators to use git is a hard sell.
Its dirty core could be hidden behind some pastel-colored buttons, perhaps.
The value of git is distributed operation and tamper-evidence. What would be the value of these attributes for a cloud-based app?

To me it seems more logical to use versioned data in a database that only the cloud app accesses.

The database idea of versioning is usually useless for implementing version control the way git does it. You could store git objects in a relational database, and that makes a lot of sense (because of transactions and better data integrity guarantees).
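To make the "git objects in a relational database" idea concrete, here is a minimal sketch using SQLite. It hashes blobs the way git does (SHA-1 over `blob <size>\0` plus the content) and stores them zlib-compressed. The table layout and function names are assumptions for illustration; this is not how libgit2's pluggable backends are wired up.

```python
import hashlib
import sqlite3
import zlib


def git_blob_id(content: bytes) -> str:
    """Compute the object id git assigns to a blob with this content."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def open_store(path=":memory:"):
    """Open (or create) the object table. The schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS objects ("
        "oid TEXT PRIMARY KEY, data BLOB NOT NULL)"
    )
    return conn


def put_blob(conn, content: bytes) -> str:
    """Store a blob inside a transaction; identical content is deduplicated."""
    oid = git_blob_id(content)
    raw = b"blob %d\x00" % len(content) + content
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO objects (oid, data) VALUES (?, ?)",
            (oid, zlib.compress(raw)),
        )
    return oid


def get_blob(conn, oid: str) -> bytes:
    """Load a blob by id and strip the 'blob <size>\\0' header."""
    row = conn.execute(
        "SELECT data FROM objects WHERE oid = ?", (oid,)
    ).fetchone()
    if row is None:
        raise KeyError(oid)
    _header, _sep, content = zlib.decompress(row[0]).partition(b"\x00")
    return content
```

Trees and commits would need the same treatment, plus a table for refs, so this covers only the content-addressed part of the picture.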

The biggest issue with git for me is that it's only usable for plain text. There are Dolt and a few other projects that try to solve this.

> The value of git is distributed operation and tamper-evidence. What would be the value of these attributes for a cloud-based app?

Cloud-based apps are very often distributed internally. For example, consider a book store that has data in a database and replicates it to Elasticsearch and some data analytics platform. Being able to use data versioning for those replications could be useful.

I can't tell which parts are agreeing or disagreeing with the quoted statements. Is doing versioning explicitly in a database necessarily more complicated than with git histories?

Using a distributed datastore should make operations simpler and more transparent without having to think about the distributed nature, other than latency of operations. So if using a distributed datastore achieves distributed nature, is the tamper-evidence what's needed and missing or something else? For a cloud-based app, redundancy/fault-tolerance is the concern rather than being distributed unless you're running a globally distributed platform that can't be partitioned.

Datomic/Datalog[0] is an example where the log of changes is the primary 'store' and a coherent view at any point in time is constructed from the log.

[0] https://en.wikipedia.org/wiki/Datomic
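A toy illustration of that "log is the primary store" idea, in Python. This is not Datomic's API or data model; the `Fact` and `LogStore` names are made up for the sketch. It only shows how a view at any point in time can be rebuilt by replaying an append-only log.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Fact:
    tx: int                  # transaction number, strictly increasing
    key: str
    value: Any
    retracted: bool = False  # True means "this key no longer holds"


class LogStore:
    """Append-only log; nothing is ever updated or deleted in place."""

    def __init__(self):
        self._log = []
        self._tx = 0

    def assert_fact(self, key, value):
        """Append a new value for key and return its transaction number."""
        self._tx += 1
        self._log.append(Fact(self._tx, key, value))
        return self._tx

    def retract(self, key):
        """Append a retraction for key and return its transaction number."""
        self._tx += 1
        self._log.append(Fact(self._tx, key, None, retracted=True))
        return self._tx

    def as_of(self, tx):
        """Rebuild the key/value view as it stood after transaction tx."""
        view = {}
        for fact in self._log:
            if fact.tx > tx:
                break  # the log is ordered by tx, so we can stop here
            if fact.retracted:
                view.pop(fact.key, None)
            else:
                view[fact.key] = fact.value
        return view
```

This gives history and point-in-time reads, but note what is missing compared to git: there is a single linear log, so no branches and no merging.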

>Is doing versioning explicitly in a database necessarily more complicated than with git histories?

What "versioning" means depends heavily on the database. "Versioning" in typical relational databases like Postgres does not provide many of the expected git features, like branching and the ability to materialize data for an arbitrary point in time, unless you build something extra on top (or just use the db as git storage). There are databases like Dolt where the idea of "versioning" is similar to git's.

> is the tamper-evidence what's needed and missing or something else?

I always work in controlled environments, so I don't care much about tamper-evidence (unless it's used for detecting integrity issues, perhaps).

I didn't know Datomic; that looks like a solution for half of the problem (distribution and history). I may still be missing something more human-focused like git, where you have branches and collaboration, and can push or pull when needed.

I was just commenting, not disagreeing.

git is not that usable, with its endless merge conflicts and fragile extensions. I built my text backup solutions (e.g. a wiki) with RCS, to store only the diffs. Much easier to use than git, and rock-solid.
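For anyone unfamiliar with the "store only the diffs" approach, a minimal Python sketch follows. It is not RCS's file format or algorithm; it uses the standard library's `difflib.SequenceMatcher` to build a forward delta for simplicity, and the function names are made up for illustration.

```python
import difflib


def make_delta(old_lines, new_lines):
    """Describe new_lines as copies from old_lines plus added lines."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged run: record only the range, not the text.
            delta.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            # New text has to be stored verbatim.
            delta.append(("add", new_lines[j1:j2]))
        # "delete" needs no entry: the old lines are simply not copied.
    return delta


def apply_delta(old_lines, delta):
    """Rebuild the new revision from the old one and a delta."""
    out = []
    for op in delta:
        if op[0] == "copy":
            out.extend(old_lines[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out
```

Keeping one full revision plus a chain of such deltas is enough to reconstruct every version of a text file.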
libgit2 works pretty well. Are you referring to content-addressable storage in general or a "git-based architecture" specifically?