| > I’m not clear what you mean by limiting "not only particular document but database". I’ve limited document size to 10mb and ratelimited updates to 10 per second. Client starts to update document with random data 10 requests per second. As far as I understand couch stores all versions at least some time. This means that this one client could fill space on my server 100mb/s. There is no such issues with postgress, and no one allow clients execute raw queries on database without any application server. Document only 10mb but database is huge. > What kind of "expensive" query are you envisioning? I have never used couch, so I don’t know what could be expensive. May be some lookup without index or something like this. Sorry for my ignorance, is it true that if I limit couch only to replication it will not be any not indexed lookups? Looks like implement secure system with couch is very hard but I can’t find any best practices, mostly only authentication and basic validation. |
Ah! Now we are getting somewhere! You're concerned about someone filling your disk.
OK, let's modify your scenario a little. Instead of updating an existing document, they create a new document each time. This is a malicious client: why make updates that'll get cleaned up in a few minutes when they can make the data permanent?
So, CouchDB allows these writes, and now your disk is full.
What does Postgres with a custom API do? Allows these writes, and now your disk is full.
You're allowing 10MB documents because that makes sense for your application, right? So your Postgres table is going to have a binary column or some other column meant to hold bulk data, and your API is going to accept it.
If it doesn't make sense, lower the max document size. Apply validations to limit which fields can be written and how big they can be. In Postgres this is called your "schema". Couch being "schemaless", that job falls to your validation function. Couch is no different from any other schemaless database such as Mongo, RethinkDB and FoundationDB in this regard.
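To make "your validation function is your schema" concrete, here is a rough sketch of that logic. It is written in TypeScript for readability; in CouchDB the real thing is a plain JavaScript `validate_doc_update` function stored in a design document, and it also receives `oldDoc`, `userCtx` and `secObj` arguments that this sketch ignores. The recipe fields and the size limits are made up for illustration.

```typescript
// Hypothetical "recipe" schema: field names and limits are assumptions.
type Doc = { [key: string]: unknown };

const ALLOWED_FIELDS = ["title", "description"];
const MAX_TITLE_CHARS = 200;
const MAX_DESCRIPTION_CHARS = 10000;

// Reject the write unless the field is a non-empty string within the limit.
function requireString(doc: Doc, field: string, maxChars: number): void {
  const value = doc[field];
  if (typeof value !== "string" || value.length === 0 || value.length > maxChars) {
    // CouchDB treats a thrown { forbidden: ... } object as a rejected write.
    throw { forbidden: field + " must be a non-empty string of at most " + maxChars + " characters" };
  }
}

function validateDocUpdate(newDoc: Doc): void {
  // Let deletions through without checking the body.
  if (newDoc["_deleted"] === true) {
    return;
  }
  for (const key of Object.keys(newDoc)) {
    // Underscore-prefixed fields (_id, _rev, ...) are reserved by CouchDB.
    // Note: this sketch does not limit attachments; handle those separately.
    if (key.charAt(0) === "_") {
      continue;
    }
    if (ALLOWED_FIELDS.indexOf(key) === -1) {
      throw { forbidden: "unexpected field: " + key };
    }
  }
  requireString(newDoc, "title", MAX_TITLE_CHARS);
  requireString(newDoc, "description", MAX_DESCRIPTION_CHARS);
}
```

With this in place, a recipe with a title and a description passes, while a document carrying an extra `carMileage` field or a 10MB description is rejected before it is stored.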
Also, your rate limiting here is weak. If I can post to your server at 100 MB/s, I can saturate a 1 GB/s link with only 10 clients. It doesn't matter if you reject my posts: if I can send them to the server, I can DoS you pretty easily.
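For anyone who wants to plug in their own limits, the arithmetic behind these numbers is only a few lines. The 10MB document size, 10 requests per second and 1 GB/s link come from this thread; the 500 GB disk is a made-up figure for illustration.

```typescript
// Data rate one client can push, given the per-document and per-second limits.
function writeRateMBps(docSizeMB: number, requestsPerSecond: number): number {
  return docSizeMB * requestsPerSecond;
}

// Number of such clients needed to fill a link of the given capacity.
function clientsToSaturate(linkMBps: number, perClientMBps: number): number {
  return Math.ceil(linkMBps / perClientMBps);
}

// Seconds for one client to fill a disk if every write is kept.
function secondsToFillDisk(diskGB: number, perClientMBps: number): number {
  return (diskGB * 1000) / perClientMBps;
}

const perClient = writeRateMBps(10, 10);           // 100 MB/s
const clients = clientsToSaturate(1000, perClient); // 10 clients for a 1 GB/s link
const fillTime = secondsToFillDisk(500, perClient); // 5000 s for a hypothetical 500 GB disk

console.log(perClient, clients, fillTime);
```

Dropping the document limit to 1MB and the rate to 1 request per second changes the per-client figure from 100 MB/s to 1 MB/s, which is the kind of knob that matters regardless of which database sits behind it.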
The main thing Postgres gives you here is that it requires you to define your schema upfront (unless you use JSON columns, in which case it joins the schemaless club above). Couch will happily let you skip that. So someone wants to write a record of their car maintenance into your recipe book app? Couch is fine with that. But take a step back. What actually stops them from putting that in the "description" column of your Postgres recipe app? Not much. So you have to think about what's important. Do I actually need to make sure these are all the same "shape"? If so, I need a validation function. If I can just shrug and say "garbage in, garbage out", then I just need controls around how much data they can insert, but hey, I needed that for Postgres anyway.
> Sorry for my ignorance, is it true that if I limit couch only to replication it will not be any not indexed lookups?
Correct (enough). The entirety of CouchDB is built around efficient replication. While it's not going to use a formal "index", getting all of the changes after a specific update sequence is an efficient operation.
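For reference, "getting all of the changes after a specific point" is what the `_changes` endpoint does when given a `since` parameter. Here is a small sketch that only builds the request URL; the host and database name are placeholders, and the helper itself is hypothetical rather than part of any CouchDB client library.

```typescript
// Build the URL for a CouchDB changes-feed request.
// `since` is the sequence value to resume from ("0" means from the beginning).
function changesUrl(baseUrl: string, db: string, since: string, limit: number): string {
  const base = baseUrl.replace(/\/+$/, ""); // drop any trailing slashes
  const query = "since=" + encodeURIComponent(since) + "&limit=" + String(limit);
  return base + "/" + encodeURIComponent(db) + "/_changes?" + query;
}

// Placeholder host and database name.
console.log(changesUrl("http://localhost:5984/", "recipes", "0", 100));
// prints "http://localhost:5984/recipes/_changes?since=0&limit=100"
```

A replicating client remembers the last sequence value it saw and passes it back as `since` on the next request, so each round only asks for what changed in between.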