Hacker News
Ask HN: DB Hosting Choice
2 points by ninadmhatre 3710 days ago
Hi Guys,

I have an idea for a web app that would need a DB to store user data. I initially zeroed in on MongoDB, as I was already using it on my machine and I knew there were managed hosting solutions for it.

Now that the time has come to make a final decision about the DB (development is about 80% done), I am confused. Managed MongoDB hosting costs $30+/month; Cassandra is another option, but I could not find any reliable hosting solution for it; and DynamoDB has complicated pricing (read/write units??). So I am asking: what choices do I have?

A little bit about the requirements: I estimate around 12 KB per user, so 100,000 users would come to about 1.2 GB / year (I know 100,000 is very optimistic, but this is just for the calculation). I am looking for a DB that can store binary/blob data, so it's not limited to NoSQL DBs.
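
A quick sanity check of the storage estimate, as a rough sketch. It assumes decimal units (1 KB = 1,000 bytes, 1 GB = 10^9 bytes) and ignores any database overhead such as indexes or replication; the function name is just for illustration.

```python
# Back-of-the-envelope storage estimate.
# Assumptions: decimal units, raw payload only, no DB or index overhead.
BYTES_PER_USER = 12 * 1000   # 12 KB per user, from the post
USERS = 100_000              # optimistic user count, from the post


def total_gigabytes(bytes_per_user: int, users: int) -> float:
    """Return the total raw storage in GB (1 GB = 10**9 bytes)."""
    return bytes_per_user * users / 10**9


if __name__ == "__main__":
    print(f"{total_gigabytes(BYTES_PER_USER, USERS):.1f} GB")
```

Under these assumptions the total comes to roughly 1.2 GB, which is small enough to fit on most entry-level hosting plans.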

I am good with Unix/Linux, so I can do the devops work myself, but I don't think I can do it round the clock. That makes managed hosting a good option, but it goes beyond my budget, as the web app would be free to use. I can spend $20 a month: hosting ($5) + DB ($15).

Any input would be helpful...

1 comments

One should never store binary data inside a database, because it is difficult or impossible to index properly, consumes an inordinate amount of space, and can grind an RDBMS to a halt. Store the files in a filesystem and store the paths to the files in the database. Refrain from using database engines which cannot provide instantaneous atomicity, consistency, isolation, and durability (ACID). Otherwise you will experience data corruption and thus availability issues, possibly both transient and silent in nature. Also refrain from using databases which do not provide ANSI SQL, as you will eventually need SQL's consistent data manipulation mechanism.
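
One way to read the "files on disk, paths in the database" advice, as a minimal sketch only. It uses Python's standard `sqlite3` module; the table name `user_files`, the helper names, and the hash-based directory layout are assumptions made for illustration, not something specified in this thread.

```python
import hashlib
import sqlite3
import tempfile
from pathlib import Path


def open_db(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the (hypothetical) user_files table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS user_files ("
        "id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, path TEXT NOT NULL)"
    )
    return conn


def save_blob(conn: sqlite3.Connection, root: Path, user_id: int, data: bytes) -> Path:
    """Write the bytes to a file under root and record only its path in the DB."""
    digest = hashlib.sha256(data).hexdigest()
    path = root / digest[:2] / digest          # shard by hash prefix
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    with conn:                                 # commit on success, roll back on error
        conn.execute(
            "INSERT INTO user_files (user_id, path) VALUES (?, ?)",
            (user_id, str(path)),
        )
    return path


def load_blob(conn: sqlite3.Connection, user_id: int) -> bytes:
    """Look up the most recent path for the user and read the file back."""
    row = conn.execute(
        "SELECT path FROM user_files WHERE user_id = ? ORDER BY id DESC LIMIT 1",
        (user_id,),
    ).fetchone()
    if row is None:
        raise KeyError(user_id)
    return Path(row[0]).read_bytes()


if __name__ == "__main__":
    conn = open_db()
    with tempfile.TemporaryDirectory() as tmp:
        save_blob(conn, Path(tmp), 1, b"\x00\x01 example bytes")
        print(load_blob(conn, 1))
```

Note that the file write and the database insert are two separate steps here, so a crash between them can leave an orphaned file; a real deployment would need a cleanup or write-then-rename strategy.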

To avoid the managed hosting penalty, you can start with SQLite and do the replication and clustering within your application. To capitalize on RDBMS capabilities, use PostgreSQL with Citus or Oracle RAC. Stay away from MySQL, as it silently corrupts data and has no OS authentication, making it difficult to deploy automatically.

Using the file system for storage, the "index" could be code, e.g. a Python dictionary, a Clojure hash map, or a JavaScript object. This might simplify building a first-iteration product and avoid the technical and cognitive overhead of learning and properly implementing an unfamiliar DBMS.
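
A minimal sketch of that idea in Python, assuming one file per record and a plain dictionary as the index, persisted as JSON. The class name `FileStore`, the `index.json` file, and the numbered file names are all hypothetical, and there is no locking, deletion, or crash safety here.

```python
import json
import tempfile
from pathlib import Path


class FileStore:
    """Store each value as a file; keep a dict mapping key -> file name."""

    def __init__(self, root: Path):
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)
        self.index_path = self.root / "index.json"
        self.index = {}
        if self.index_path.exists():
            # Reload the dictionary "index" written by an earlier run.
            self.index = json.loads(self.index_path.read_text())

    def put(self, key: str, data: bytes) -> None:
        # Reuse the file name when overwriting, otherwise allocate a new one.
        name = self.index.get(key) or f"{len(self.index):08d}.bin"
        (self.root / name).write_bytes(data)
        self.index[key] = name
        self.index_path.write_text(json.dumps(self.index))

    def get(self, key: str) -> bytes:
        # Raises KeyError when the key is unknown.
        return (self.root / self.index[key]).read_bytes()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store = FileStore(Path(tmp))
        store.put("user:1", b"profile bytes")
        print(store.get("user:1"))
```

This is enough for a single-process prototype; once several processes write at the same time, the missing locking and atomic updates are exactly what a DBMS would provide.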
Thanks for the reply and tips, but my question was more about whether to choose self-hosting or managed hosting.
Always choose self-hosting. Not only will you learn a lot about operations and proper hardware design, servicing, and redundancy, but you will also have maximum control.