HN Mirror

	by Keverw 1887 days ago
	I have a startup idea and want to make sure it scales. I was thinking of S3, but I don't like vendor lock-in. I'm not that far along yet, so I was considering SeaweedFS, or even going crazy enough to write my own storage system: use a database like CockroachDB or MongoDB to store the metadata, then replicate pieces of each file to "chunk servers". However, cleaning up deleted files and so on seems like a bit of a pain. Instead of going top down, I was thinking of letting each node hold a copy of the metadata and scan its own chunks, rather than having the central database try to manage every node, and then having a process that handles under-replicated files. However, if you can adjust the number of replicas for, say, a popular file, you'd then need to coordinate which extra copies to remove when scaling back down. Maybe that's a bit optimistic. I'm kind of disappointed that the file storage solutions seem more complicated, and that nothing is as simple to set up as some of the newer databases like CockroachDB or MongoDB. I feel like reinventing the wheel is kind of bad, and I'd rather let people who are more expert in this field handle this stuff, but I hate the idea of vendor lock-in and being forced to use other people's servers. Self-hosting would be nice, from a single node for testing to a cluster spanning multiple datacenters. Maybe there's a solution out there; I've done some searching and just seem to go in circles. I did see one system, but if you wanted to add or remove nodes in the future, you couldn't just "drain" a chunk server by moving its data elsewhere.
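One possible way to approach the "which extra copies do we remove" and "drain a chunk server" problems described above is rendezvous (highest-random-weight) hashing. This is a swapped-in technique, not something the post proposes: every node ranks the candidate nodes for a chunk with the same hash, so nodes that each hold a copy of the metadata can agree on placement without a central coordinator. The sketch below is only an illustration under those assumptions; `rank_nodes`, `plan_chunk`, and the node names are hypothetical, and it ignores failures during copying.

```python
import hashlib


def rank_nodes(chunk_id, nodes):
    """Order nodes for one chunk by a per-chunk hash score (rendezvous hashing)."""
    def score(node):
        digest = hashlib.sha256(f"{chunk_id}:{node}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    # The node name breaks ties so every caller gets the same order.
    return sorted(nodes, key=lambda n: (score(n), n), reverse=True)


def plan_chunk(chunk_id, holders, eligible_nodes, replicas):
    """Return (copies, deletes) that move one chunk toward its desired placement.

    holders:        set of nodes that currently store the chunk
    eligible_nodes: nodes allowed to store data (leave out draining or dead nodes)
    replicas:       desired replica count for this chunk
    """
    desired = set(rank_nodes(chunk_id, eligible_nodes)[:replicas])
    copies = sorted(desired - holders)
    # Extras are only dropped once every desired replica exists,
    # so a chunk is never left with fewer copies mid-move.
    deletes = sorted(holders - desired) if not copies else []
    return copies, deletes


if __name__ == "__main__":
    nodes = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical chunk servers
    ranked = rank_nodes("chunk-42", nodes)
    holders = set(ranked[:3])

    # Scaling a popular file back down from 3 replicas to 2.
    print(plan_chunk("chunk-42", holders, nodes, replicas=2))

    # Draining the top-ranked node: exclude it from the eligible set.
    remaining = [n for n in nodes if n != ranked[0]]
    print(plan_chunk("chunk-42", holders, remaining, replicas=3))
```

Because the ranking depends only on the chunk ID and the node names, removing a node from the eligible list does not reshuffle the order of the others. Draining therefore turns into "copy to the next node in the ranking, then delete from the drained node", and scaling down always removes the lowest-ranked holder.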