by rdl 4799 days ago
This is a good example of bad legal/PR turning a company from a fairly well respected new security company to a joke.

Tokenization, which CipherCloud does, could actually be done fairly securely if you had a decent amount of local storage. They IIRC use a FIPS HSM for local key storage in their local appliance (I talked to one of their founders at a security event a year or two ago and was initially suspicious of their claims, but it seemed adequate for certain use cases based on how they were using it -- maybe things have changed). It's fundamentally not too different from when Stripe gives you a user key vs. PCI information.

Basically, if you can correctly identify certain fields as sensitive and others as not, and force all your traffic through a proxy, you could put totally unrelated random tokens in the sensitive fields, and then do search locally on the appliance, rather than on the untrusted service. E.g. if you wanted to use Salesforce, but keep customer addresses secret (because they were super-confidential government sites or meth labs or something), you could still put names in Salesforce and do everything else, but just put a random string in for addresses; do address searches on the proxy, either going from single record to address or maybe even "give me all the records in Missouri". There is no magic here. Someone could do an open source implementation for any specific site (via scraping or a public API) easily. The difficulty is doing it for many sites, and keeping it updated, supporting it, and selling it to the Fortune 500.
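
A minimal sketch of the random-token idea described above, in Python. It is not how CipherCloud or any other product works; the `TokenVault` class, the `tok_` prefix, and the `SENSITIVE_FIELDS` set are hypothetical names made up for illustration, and a real appliance would need persistent, access-controlled storage rather than an in-memory dict.

```python
import secrets


class TokenVault:
    """Appliance-side mapping between random tokens and real values."""

    def __init__(self):
        self._by_token = {}  # token -> real value; never leaves the proxy

    def tokenize(self, value):
        # A fresh random token per stored value: the token has no
        # mathematical relationship to the value it stands for.
        token = "tok_" + secrets.token_hex(16)
        self._by_token[token] = value
        return token

    def detokenize(self, token):
        return self._by_token[token]

    def search(self, needle):
        # Search runs locally over the real values and returns the tokens
        # that can then be looked up on the remote service.
        needle = needle.lower()
        return {t for t, v in self._by_token.items() if needle in v.lower()}


SENSITIVE_FIELDS = {"address"}  # hypothetical field classification


def outbound(record, vault):
    """Rewrite a record on its way to the untrusted service."""
    return {k: vault.tokenize(v) if k in SENSITIVE_FIELDS else v
            for k, v in record.items()}


def inbound(record, vault):
    """Restore real values in a record coming back from the service."""
    return {k: vault.detokenize(v) if k in SENSITIVE_FIELDS else v
            for k, v in record.items()}


vault = TokenVault()
original = {"name": "Acme", "address": "12 Main St, Springfield, Missouri"}
stored = outbound(original, vault)     # what the SaaS app would see
hits = vault.search("missouri")        # "all the records in Missouri"
restored = inbound(stored, vault)      # what the local user would see
```

Because every call to `tokenize` draws a new token, two records with the same address get unrelated tokens, which is what pushes search (and any equality join) onto the proxy.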

I don't know if they've been pushed to do stupid stuff, or if they just have horrible marketing/PR now (which is weird since they raised a fuckton of VC), or what.

3 comments

Agreed, no magic here. I rolled a quick version using Squid and GreasySpoon. Got it to work on SFDC and Gmail inside of a day. Using tags around the encrypted content and a regex, you can then feed the content into the decryption engine. Search works, etc. You could even use a unique IV per user to add a level of security, but it is by no means rock solid. It would, however, address some of the frequency analysis concerns, since if the encryption (tokenization??) was cracked it would only reveal the contents for a single user. That would work for the Gmail side, but doing it in SFDC is a whole other issue, and unless they have some Harry Potter stuff going on, is likely huff and puff.
Maybe the correct response here is an open source version of CipherCloud, built on open/published principles (to make it easy to verify the level of security provided).
I would be happy to post my code, but honestly the process is so embarrassingly simple that I'm sure others could do it better. Setting up the Squid proxy with SSL bump was more difficult than the code, as there are some great libraries out there. Using a reverse proxy and an ICAP server, you need to parse all content using something like jsoup (regex if you really wanna hack). Jsoup grabs the element, you run it through a great encryption library like Bouncy Castle, and you then add some unique identifiers around it (!!) so that you can later find the encrypted content with simple parsing and decrypt it. Plop it back into the content using your trusty GreasySpoon. And voilà, magic! All persisted data is encrypted. When data is pulled out you simply parse for the unique tag, and then run it through the decryption side. There are a number of things that you can do to increase the security of this implementation, and with a little tweaking it works for searching and such, so Gmail is no problem. An app like SFDC with joins between records would be significantly more difficult to do properly. Doing it improperly is trivial, as you could just use all of the same keys and IVs per org (the unit of work in SFDC).
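
For readers who want to see the tag-and-parse step, here is a rough Python sketch of it (the setup above was Java with jsoup and Bouncy Castle; this is not that code). The `[[enc:...]]` marker format and all function names are assumptions, and `toy_cipher` is a deliberately insecure stand-in so the example runs with only the standard library. Swap in a real authenticated cipher from a vetted library before using anything like this.

```python
import base64
import re

# Hypothetical marker format; any delimiters unlikely to occur in real
# content would do.
OPEN, CLOSE = "[[enc:", "]]"
MARKED = re.compile(re.escape(OPEN) + r"([A-Za-z0-9_\-=]*)" + re.escape(CLOSE))


def toy_cipher(data):
    # STAND-IN ONLY, NOT ENCRYPTION: a fixed-byte XOR that is its own
    # inverse. It exists so the wrap/unwrap plumbing can be exercised.
    return bytes(b ^ 0x5A for b in data)


def wrap(plaintext, encrypt):
    """Encrypt a field and surround it with markers before it is persisted."""
    blob = encrypt(plaintext.encode("utf-8"))
    return OPEN + base64.urlsafe_b64encode(blob).decode("ascii") + CLOSE


def unwrap_all(text, decrypt):
    """Find every marked span in a response body and decrypt it in place."""
    def restore(match):
        blob = base64.urlsafe_b64decode(match.group(1))
        return decrypt(blob).decode("utf-8")
    return MARKED.sub(restore, text)


# Request path: the sensitive field is wrapped before the service stores it.
page = "Dear " + wrap("Alice", toy_cipher) + ", your order shipped."

# Response path: the proxy scans the body for markers and restores them.
restored_page = unwrap_all(page, toy_cipher)
```

The markers are what make the response path cheap: the proxy does not need to understand the page structure on the way back, only to find the tagged spans.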
The response will be along the lines of "lacks our secret patent-pending military grade algorithms".
It sounds like they don't have anyone with actual PR experience. The standpoint they are taking is very old school and stems from an angry reaction.

If instead they had entered the discussion with a sliver of respect and honesty it would have been great. Instead, many people have been introduced to them in a negative and untrustworthy atmosphere, and this has certainly tarnished their reputation. Despite what they say, not all exposure is good exposure.

People should learn from their example.

I'm not sure if it's that they have no PR experience in the company, or just don't consider StackExchange/HN/Reddit to be worthy of a serious effort.

IMO, this is the kind of thing founders should handle personally once it happens. Maybe guided by a PR person or an investor, but a founder giving an adequate response gets graded on a curve, and is thus a lot more effective than a completely polished PR/marketing person.

It's nothing like Stripe's tokenization, because not all tokens are equal. Their tokens have inherent patterns which aid frequency analysis. That is exactly the point of the SE discussion which got the DMCA notice.
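
To make the frequency-analysis point concrete, here is a small Python illustration. It is not a reconstruction of any vendor's scheme: it only contrasts a generic deterministic token (same input, same token) with a purely random one, using made-up data and hypothetical function names.

```python
import hashlib
import hmac
import secrets
from collections import Counter

KEY = secrets.token_bytes(32)  # hypothetical appliance-held key


def deterministic_token(value):
    # Same value always maps to the same token, so equality search works
    # on the remote side, but repeats stay visible.
    digest = hmac.new(KEY, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def random_token(value):
    # Unrelated to the value; repeats are invisible to the remote side.
    return secrets.token_hex(8)


def count_shape(tokens):
    """How often each distinct token occurs, most frequent first."""
    return sorted(Counter(tokens).values(), reverse=True)


# Made-up column of state names with a skewed distribution.
states = ["Missouri"] * 5 + ["Ohio"] * 2 + ["Utah"]

det_shape = count_shape(deterministic_token(s) for s in states)
rnd_shape = count_shape(random_token(s) for s in states)
```

With deterministic tokens the count profile of the column survives tokenization, so anyone who knows roughly how the real values are distributed can guess which token is which without touching the key. With random tokens every entry looks unique.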
Yeah, I've never looked at CipherCloud's security in depth, but you could do tokenization in a fairly secure way. There's essentially a triangle of security, functionality-of-SaaS-app, and complexity of the proxy.

One issue is that access patterns might leak information, so if you wanted maximum security you'd end up doing crazy things like heavily caching or accessing extra "chaff records" periodically. Well before that point you'd probably just give up on the SaaS app entirely.
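
A naive sketch of the "chaff records" idea in Python, assuming the proxy knows the full set of record IDs. The function name and the `chaff_per_real` parameter are hypothetical, and uniformly random decoys are only the simplest possible choice, not a claim that this hides access patterns well.

```python
import secrets


def with_chaff(wanted, all_ids, chaff_per_real=3):
    """Return a shuffled batch of record IDs: the real ones plus decoys."""
    rng = secrets.SystemRandom()
    wanted_set = set(wanted)
    decoy_pool = [i for i in all_ids if i not in wanted_set]
    # Never ask for more decoys than exist.
    n_decoys = min(len(decoy_pool), chaff_per_real * len(wanted))
    batch = list(wanted) + rng.sample(decoy_pool, n_decoys)
    rng.shuffle(batch)  # so position in the batch reveals nothing
    return batch


all_ids = ["r%d" % i for i in range(20)]
batch = with_chaff(["r7"], all_ids)  # the proxy fetches all of these
```

The proxy fetches every ID in the batch and discards the decoys locally, which multiplies traffic to the service by roughly `chaff_per_real + 1`. That cost is part of why giving up on the SaaS app starts to look attractive.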