|
Having inherited an openstack install with the swift implementation implemented on earlier generation of pods, these are the issues I and my cohorts ran into: 1) The object replication between the pods will be hard on the array, as the object replication is simply an rsync wrapped inside nested for loops. If you have a bunch of small files with a lot of tenants, it'll hurt. 2) While swift allows you to simply unmount a bad disk, change around the ring files and let replication do its thing, there are actual issues. First of all, out of band smart monitoring of bad sectors actually cases the disk to pre-empt some sata commands and do the smart checks first. On a heavily loaded cluster, under a smart check for a bad block count could kill the drive, and take out the sata controller along with it. We've downed storage pods before by that way. The only way we got around it was to take a pod offline once a week, run all our smart checks, then put it back. 3) To replace a drive, you have to open the machine up and power it off. As any old operator will tell you, drives that have been running for awhile do not like to be turned off. If you are to power off a machine to replace a bad drive, do realize that you might actually break more drives just from the power cycle. 4) Once you change a drive out and use the ring replication of files to rebuild your storage pods, your entire storage cluster will take a non-trivial hit. 5) Last but not least, it is almost of tantamount importance to move the proxy, account and container services away from the hardware that also host the object servers. It's probably good to note that the account and container meta data is stored in sqlite files, with fsync writes. If you add and remove a bunch of files to multiple accounts, the container service is going to get hit first, then the object service. Further more, every single transaction to every meta data/data service, including replication, is federated through the proxy servers. If you look at a swift cluster, the proxy services take up a large chunk of the processing space. Source: Was Openstack admin for a research group in Middle Tennessee. Ran an openstack cluster with 5 gen1.5 pods for the entire swift service, then moved account/container/proxy to three 12 core 2630s with 64 gigs of ram. Cluster was for a DARPA vehicular design project, first part fielded 3000 clients, second part fielded about 1/10 of that, but with more files and bigger files (these were cad and test results respectively). |