by newfeatureok 2305 days ago
One interesting thing you can do with CouchDB is that you can have a webapp where a user can specify their own database and credentials and it works over HTTP(s). That's pretty unique. I'd love to see a SaaS using CouchDB and their "on-premise" offering just means the user provides their own database. I'm not sure how payment would work though - perhaps some verification proxy?

Firebase is the gold standard for offline apps (as a service). CouchDB replaces Cloud Firestore, and Keycloak replaces Authentication. I haven't seen OSS equivalents of Cloud Functions, ML Kit, and the other things (e.g. In-App Messaging and Cloud Messaging). It'd be nice to have the entire stack of Firebase bundled as a group of OSS projects, including CouchDB.

Sad to see that per-doc access control didn't make it into 3.0. Hopefully it'll be in 3.1.

7 comments

Cloudant on IBM Cloud is CouchDB API/replication compatible and offers support for Apache CouchDB (1). Also, OpenWhisk integrates nicely with CouchDB/Cloudant, which can even serve as OpenWhisk's backing persistence (2).

(1) https://www.ibm.com/cloud/blog/announcements/announcing-supp... (2) https://github.com/apache/openwhisk/blob/master/tools/db/REA...

Cloudant is awesome, but it's way too expensive IMHO.
Partitioned dbs are supposed to allow you to query more cheaply; I haven't implemented those yet.
They’ve recently shifted their pricing scheme to be more on-demand; before that, you needed to do multi-tenant at very small scale, or buy dedicated clusters.

We have dedicated clusters on Cloudant and they’ve run quite smoothly for many years. Someday we might switch to the on-demand IBM Cloud pricing, but haven’t done it yet.

Send me an email (in profile). Would love to chat and see what we can do for you.
If you indeed work for Cloudant, please consider trying to convince someone to invest in PouchDB. It looks mostly unmaintained and it would be in IBM's and the community's interest to keep it running!
I really can't wait for the per-doc permissions, because I'm building something very similar to what you're describing, and with CouchDB! I'm focusing on the database and auth side first and then adding functions.

So shameless plug if you're interested in signing up for the alpha: https://www.aspen.cloud

Yeah, I'm still disappointed that the MongoDB API outpaced the CouchDB Replication Protocol in general adoption. As nice as Cloudant can be some of the time, I know that my IT group would be a lot happier if we could use Cosmos DB (and/or if Cloudant would just directly support Azure data centers again).

Every now and again I wonder if I could implement the CouchDB Replication Protocol on top of Cosmos DB with a presumably hairy ball of Azure Functions, while hoping someone beats me to needing that to exist and scratches that itch for me. (Cosmos DB's changes feed is so almost right for the job it hurts, because it sounds like it should be easy, and yet I assume it won't be.)

For a Cloud Functions-like project, OpenFaaS seems promising. I’ve been watching it but have not yet had the chance to use it.
I didn't understand. You mean it's unique to work over http(s)?
I mean your CouchDB instance itself is represented by a host and port and your application's data could be stored there, with a native HTTP-based API to access said data. This is in contrast to most databases, where you would need a driver and the database is accessible only in the "back-end".
> itself is represented by a host and port and your application's data could be stored there

All databases are represented by a host and a port. I think you mean CouchDB offers an HTTP-based API that allows queries to be run without requiring a database-specific library and that, because it's HTTP, it can be accessed via a browser.
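
To make the "no database-specific library" point concrete, here is a minimal Python sketch that uses only the standard library's HTTP client. The host, database name, and credentials are made-up placeholders standing in for whatever a user would supply, and `build_all_docs_request` and `fetch_rows` are hypothetical helper names; the only CouchDB pieces assumed are the `_all_docs` endpoint and HTTP Basic auth.

```python
import base64
import json
import urllib.request
from urllib.parse import quote


def build_all_docs_request(base_url, db, username, password, limit=10):
    """Build (but do not send) a GET request for a database's _all_docs endpoint."""
    url = f"{base_url.rstrip('/')}/{quote(db, safe='')}/_all_docs?include_docs=true&limit={limit}"
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Basic {token}", "Accept": "application/json"},
        method="GET",
    )


def fetch_rows(request):
    """Send the request and return the "rows" list from the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)["rows"]


# Hypothetical user-supplied settings; nothing goes over the network
# until fetch_rows(req) is called against a real server.
req = build_all_docs_request("https://couch.example.com:6984", "notes", "alice", "s3cret")
print(req.get_method(), req.full_url)
```

Swapping in a different user's database would only mean changing the base URL and credentials, which is roughly the "bring your own database" idea from the top of the thread.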

That was my point exactly. An HTTP API is very cool but very far from being unique to CouchDB (see InfluxDB, ClickHouse, Prometheus, etc.)
Haha, yes, exactly. I omitted the most important part - the [HTTP-based API].
To take that further, I really like the idea of running something locally like PouchDB and then letting it sync with a remote CouchDB using the replication protocol.
That's where I fell in love with the CouchDB world. Building offline-first databases in PouchDB, and letting that sync to any HTTP address that speaks the replication protocol, is sometimes a dream. In practice there are so many hurdles, sadly. (CORS, CSPs, firewalls, not enough things speaking the replication protocol that should, ...)
I’m just getting into PouchDB and am liking it. I love the idea, for sure. I ran into a replication issue running through a proxy that had something to do with sessions being cached, but that was more the fault of the proxy.

My biggest current concern is client side search and size. I’ve been developing a private journalling/notes app with some fairly particular bells and whistles for my own personal use. Although it’s mostly just text, I want to store a lot of text. I would much prefer the search not happen on the server, as I’d like to encrypt all data that hits the server.

Have you used PouchDB Quick Search? If so, in your experience, can it handle full-text search on about 1,000-5,000 documents with about 10 KB worth of text each?

Ideally I’d also like to store data URIs for PNG sketches, and maybe photos. But I know photos especially would balloon the database size quite a bit, and I'm worried I’d hit client-side storage limits extremely quickly. (I think I read some mobile devices have a 50 MB limit, but I haven’t researched it that thoroughly yet.)

I've not tried quick search. For the most part in the applications I've worked on I've just relied on the main primary key index (the _id field) for most lookups.

Generally I'm using a `folder/structure/ULID` approach to keys and it's really easy with start_key and end_key on allDocs to grab an entire "folder" at a time. I've had some pretty large "folders" and not seen too much trouble. At this point the biggest application I worked on pulls a lot of folders into Redux on startup and so far (knock on wood) performance seems strong. (ULIDs [1] are similar to GUIDs but are timestamp ordered lexicographically so synchronizations leave a stable sort within the folder when just pulling by _id order.)
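
A rough sketch of that key scheme, in Python with only the standard library. The `ulid` function follows the layout in the ULID spec [1] (48-bit millisecond timestamp plus 80 random bits, Crockford Base32), but it is a simplified illustration, not a full implementation (no monotonic counter). `folder_range` is a hypothetical helper that builds the `startkey`/`endkey` query parameters for CouchDB's `_all_docs` over HTTP (CouchDB also accepts the `start_key`/`end_key` spellings). Using `\ufff0` as the upper bound is a common prefix-scan convention, assumed here, and the folder names are made up.

```python
import json
import os
import time
from urllib.parse import urlencode

CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def ulid(timestamp_ms=None, randomness=None):
    """Return a 26-character ULID: 48-bit millisecond timestamp + 80 random bits."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    if randomness is None:
        randomness = os.urandom(10)
    value = (timestamp_ms << 80) | int.from_bytes(randomness, "big")
    # 26 Base32 characters cover 130 bits; the top 2 bits are always zero.
    return "".join(CROCKFORD[(value >> shift) & 31] for shift in range(125, -1, -5))


def folder_range(folder):
    """Return _all_docs query parameters selecting every _id under `folder/`."""
    prefix = folder.rstrip("/") + "/"
    # Keys are JSON-encoded strings; "\ufff0" acts as a high upper bound.
    return {"startkey": json.dumps(prefix), "endkey": json.dumps(prefix + "\ufff0")}


# Two keys in the same hypothetical folder, created one millisecond apart.
key_old = "journal/2020/" + ulid(timestamp_ms=1_582_000_000_000)
key_new = "journal/2020/" + ulid(timestamp_ms=1_582_000_000_001)
print(sorted([key_new, key_old]) == [key_old, key_new])  # creation order == sort order
print(urlencode(folder_range("journal/2020")))
```

Because the timestamp occupies the leading characters, plain string ordering of the `_id` values matches creation order within a folder.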

At least as far as my queries have been and what my applications' needs have been, PouchDB is as fast or faster than the equivalent server-side queries (accounting for HTTPS time of flight), especially now that all modern browsers have good IndexedDB support. (There were some performance concerns I had previously when things fell back to WebSQL or worse, such as various iOS IndexedDB polyfills built on top of bad WebSQL polyfill implementations, and also a brief attempt that did not go well to use Couchbase Mobile on iOS only.)

Photos have been the bane of my applications' existence, but not for client-side reasons. I had PouchDB on top of IndexedDB handling hundreds of photos without breaking a sweat and those size limits all have nice opt-in permission dialogs for IndexedDB if you exceed them. Where I found all of the pain in working with photos was server side. CouchDB supports binary attachments, but the Replication Protocol is really dumb at handling them. Trying to replicate/synchronize photos was always filled with HTTP timeouts due to hideously bloated JSON requests (because things often get serialized as Base64), to the point where I was restricting PouchDB to only synchronize a single document at a time (and that was painfully slow). Binary attachments would balloon CouchDB's own B-Tree files badly and its homegrown database engine is not great with that (sharding in 3.0 would help, presumably). Other replication protocol servers had their own interesting limits on binary attachments; Couchbase in my tests didn't handle them well either and Cloudant turned out to have attachment size limits that weren't obvious and would result in errors, though at least their documentation also kindly pointed out that Cloudant was not intended to be a good Blob store and recommended against using binary attachments (despite CouchDB "supporting" them). (It sounds like the proposed move to FoundationDB in CouchDB 4.0 would also hugely shake up the binary attachment game. The 8 MB document limit already eliminates some of the photos I was seeing from iOS/Android cameras.)

I'd imagine you'd have all the same replication problems with large data URIs (as it was the Base64 encoding during transfers that seemed the biggest trouble), without the benefits of how well PouchDB handles binary attachments (because of how well browsers today have optimized IndexedDB to handle binary Blobs).
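
For a sense of scale on the Base64 overhead, here is a back-of-the-envelope calculation in Python. The 6 MiB photo size is a made-up example, and `base64_size` is a hypothetical helper name; the formula is just the standard padded Base64 rule of 4 output bytes per 3 input bytes.

```python
import base64


def base64_size(raw_bytes):
    """Size in bytes of standard padded Base64 output for `raw_bytes` input bytes."""
    return 4 * ((raw_bytes + 2) // 3)


photo_bytes = 6 * 1024 * 1024  # hypothetical 6 MiB camera photo
encoded = base64_size(photo_bytes)
print(f"{photo_bytes} raw bytes -> {encoded} Base64 bytes ({encoded / photo_bytes:.2f}x)")

# Cross-check the formula against the standard library on a small sample.
sample = bytes(1000)
assert len(base64.b64encode(sample)) == base64_size(len(sample))
```

That is roughly a one-third size increase before any JSON framing is added on top.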

The approach I've been slowly moving towards is using `_local` documents (which don't replicate) with attached photos in PouchDB, metadata documents that do replicate with name, date, captions, ULID, resource paths/bucket IDs (and comments or whatever else makes sense) and a Blurhash [2] so there's at least a placeholder to show when photos haven't replicated, and side-banding photo replication to some other Blob storage option (S3 or Azure Storage). It's somewhat disappointing to need two entirely different replication paths (and have to secure both) and multiple storage systems in play, but I haven't found a better approach.
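
A minimal sketch of what that document split might look like, in Python. Every field name, the `photo_documents` helper, the bucket path, and the Blurhash placeholder are assumptions for illustration only; the one CouchDB convention relied on is that ids starting with `_local/` are not replicated.

```python
import json


def photo_documents(ulid, folder, name, taken_at, caption, blob_path, blurhash):
    """Return (replicated metadata doc, id of the non-replicating _local doc)."""
    metadata = {
        "_id": f"{folder}/{ulid}",
        "type": "photo",
        "name": name,
        "date": taken_at,
        "caption": caption,
        "blob_path": blob_path,  # where the full-size photo lives in external Blob storage
        "blurhash": blurhash,    # placeholder to render until the photo itself arrives
    }
    local_id = f"_local/photo-{ulid}"  # _local/ ids are skipped by replication
    return metadata, local_id


# All values below are hypothetical; the ULID is the example from the ULID spec.
meta, local_id = photo_documents(
    ulid="01ARZ3NDEKTSV4RRFFQ69G5FAV",
    folder="photos/2020",
    name="IMG_0001.jpg",
    taken_at="2020-02-26T12:00:00Z",
    caption="Example caption",
    blob_path="example-bucket/photos/01ARZ3NDEKTSV4RRFFQ69G5FAV.jpg",
    blurhash="<blurhash string computed on the client>",
)
print(json.dumps(meta, indent=2))
print(local_id)
```

The metadata document stays small enough to replicate cheaply, while the photo bytes travel over the separate Blob storage path.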

[1] https://github.com/ulid/spec

[2] https://blurha.sh/

Cloudant on IBM Cloud. Full disclosure: I used it to support the application layer on IBM Cloud.
I'm doing this with BigQuery for Logflare (logflare.app).