Show HN: Cryptomove – security via data concealment, last-line data defense (cryptomove.com)
31 points by bburshteyn 3907 days ago
4 comments

This project seems to hinge on: "Cryptomove renders futile any attack on the data store even before the attack begins – the attacker just can never find the data" and talks about how high-entropy it is.

Except, attacks on the actual cipher are doomed to failure anyways unless terribly implemented. So what exactly is it defending from?

The whole "continuously encrypts" stuff is also a bit suspect - does that not mean the key must be in memory on those devices all the time?

At best I'd guess this is some sort of traffic analysis prevention system. But if you can read/write data on-demand, then it's just adding noise. The word "traffic" doesn't appear in the concepts PDF so they aren't using a common term for it, if that is indeed what they are preventing. Edit: It seems to be much more than that, including a language, an encrypted distributed protocol for the language, storage systems, etc. Which sorta makes me wonder even more what it's aimed at: Was it not possible to achieve the goals (what are they?) while re-using existing tech?

The site should really, really, spell out clearly exactly what this is really preventing, beyond what e.g. a cloud-sync'd TrueCrypt volume provides. Otherwise it looks a lot like a fishy product. (Not saying it is; not trying to be negative. Just that there's zero clear info on what attacks this prevents.)

But there's a ton of work done here, all documented in detail (if light on the purpose/overview). That's very well done, as a lot of projects skimp on that part.

Edit: It links to Hello language. Which has a pretty SF-startupy-looking page. But very little info. I'd expect at least a prominent link to some example code, showing how Hello does stuff better. Or why, if the goal is distributed systems, a new language is needed, instead of libraries. (Or how it contrasts with Erlang.) Is it safe? I see one mention of pointers, saying no pointer arithmetic is allowed. How does that mesh with seamless C++ interop? (Is it even possible to have seamless C++ interop and still be memory safe?)

Hi Michael,

I tried to address each of your questions one after another below. As a general remark, this 1.0.5 is an alpha version, some things will be changed in the upcoming updates, including those that harden the code.

An attacker may want to get individual files saved in the store for any reason. Some of them are explained in the Guide and on the website:

"Just encrypting the file content and its name, without preventing its identification, still jeopardizes security. In case the attacker gets a hold on the ciphertext, they might succeed in decrypting the ciphertext. Alternatively, they might blackmail the data owner despite being unable to decrypt the data. Also, encryption methods considered safe today might become unsafe tomorrow because of the progress in the deciphering techniques. In some cases, the attacker may apply unlimited computing resources thus succeeding in a brute force attack. Ultimately, the attacker may possess a secret algorithm that deciphers the seemingly impregnable encryption scheme. Therefore, having the attacker being able to identify the encrypted data is potentially as dangerous as having the attacker succeeding in stealing cleartext or deciphering encrypted data."

Cryptomove data passwords are usually not kept in memory but only used to generate symmetric encryption keys. The data keys are used only once on the client -- for encryption and decryption, and then discarded unless the client works in 'trust' mode where it does keep the keys in the client memory for each next get/put operation, until the time the user switches the trust mode off. When the trust is off, the user must enter the password(s) for each next get/put kind of operation.

Each server keeps its server password in memory. Currently, that virtual memory is swappable, but in the next version (coming very soon) the key will be stored in locked physical memory. In addition, plans exist to make sure the daemons are not dumpable or amenable to ptrace() calls (to prevent debuggers and other processes from peeking inside). As I have said before, this is an alpha version -- the very first proof of concept -- and we are aware of numerous enhancements that must still be applied.

Hello language has an interface to encrypt its network traffic, but that interface is not used in the current Cryptomove version.

Obviously, in general many different technologies can be used to build the same software. Cryptomove is built using Hello because this language allows for a convenient transfer of bulk and individual data as copy parameters in its remote method calls and return values, and because it allows for convenient scheduling on multiple threads and for controlled processing of asynchronous events. Other Hello features used in Cryptomove include native interaction with C++, which allows reuse of any system calls and auxiliary C/C++ libraries.

Cryptomove protects against attacks that attempt to find any number of individual files in the store, because even stealing the encrypted data poses a security risk, as noted in the above quote. Unlike TrueCrypt, Cryptomove does not work on the disk or partition level, but places data parts inside encrypted files within a file system. The advantage of this approach is that the data store owners can employ any file-based management system or policy to store and manage the data, including data distribution policies between different levels of storage, increasing/decreasing store capacity, providing online backup/restore, etc.

Thank you very much for your kind words -- we really tried very hard for this first alpha release to be useful in tryouts. More things are still to come, including those I spelled out above, others mentioned in http://cryptomove.com/news/.

Regarding your last comments about Hello. There are a lot of examples of Hello programs in the Hello User Guide and on the website itself (http://www.amsdec.com/documentation/). The white paper at http://www.amsdec.com/wp-content/uploads/2015/10/hellowhitep... explains the rationale behind developing Hello and its design goals.

When one embeds C++ code inside a Hello program, the C++ code has read/write access to the surrounding Hello data. Hello has no spelled-out memory management policy, but the current runtime uses reference counting. You are correct: if one is not careful, one can ruin memory with unsafe embedded C++ code. Still, Hello can prohibit execution of any portion of the embedded C++ code at runtime, and can also disable embedding C++ code at translation time in protected packages. It seems many 'safe' systems allow for embedding lower level programming languages, say Java allows native C, C allows assembler, etc. -- Hello is not different in this regard. All of them rely on the programmer being careful with the code at hand.

Hope the above helps to clear at least some of your questions.

Best Regards, Boris.

One correction -- obviously C is not safe; the example above of C with embedded assembler is perhaps just to illustrate a pairing of higher and lower level languages in the same source code.

Sorry for confusion.

Hi, I am Boris, creator of Cryptomove (and the Hello distributed programming language with which it is built -- see www.amsdec.com).

Welcome any and all feedback. Looking forward to working with initial users to get to product market fit!

Thanks and please enjoy! Follow us on www.twitter.com/cryptomove.

Impressive amount of effort here. How is this different/better than using a strong static encryption for data and then using "normal" redundant cloud storage services like backblaze? Also, why did you choose Nettle's pbkdf2_hmac_sha256 for key derivation, and how does a user-supplied password end up mixed with server generated ones? (or am I confused altogether?)

Thank you for your kind words!

The technical difference is that in Cryptomove data are broken apart so that the parts travel around the store -- either on the same or between different servers. There is no assumption that the store is in any dedicated location. In fact, Cryptomove is not a service -- it is software that anyone can employ to build a private store either within one server or spread out between several servers, as long as they can connect via TCP/IP and have access to mounted disks with directories.

I do not know the internal implementation of Backblaze or other storage services. But the Cryptomove architecture allows for the saved data to be broken apart, and for encryption and movement of the resulting data parts. When one forces access to a disk through which the parts move, or even to the memory of a server through which a part moves, there is practically no way to know which part belongs to which original file. This way, even if the attacker gets hold of the data, there is no easy way to find where the data of interest is. The idea is for Cryptomove to be the last line of defense -- even if the data is stolen there is no way to know which data is which. Even if one freezes all servers involved (one actually needs to know which servers are involved, which is again a very hard task), the sheer number of parts with encrypted names makes the identification task very hard.

I have chosen pbkdf2_hmac_sha256 for key derivation because it is described as a hash generation function with a very low probability of collisions and reversing. Cryptomove uses Nettle because it is a free, open source, low-level cryptographic library.
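For readers unfamiliar with password-based key derivation, here is a minimal sketch of the general idea using Python's standard `hashlib.pbkdf2_hmac` rather than Nettle. The salt size, iteration count, and key length are illustrative assumptions, not Cryptomove's actual parameters, and `derive_key` is a hypothetical helper name.

```python
import hashlib
import os

def derive_key(password: str, salt: bytes,
               iterations: int = 100_000, length: int = 32) -> bytes:
    """Derive a symmetric key from a password with PBKDF2-HMAC-SHA256.

    iterations and length are assumed values chosen for illustration.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=length
    )

# A fresh random salt per stored item; it is kept alongside the ciphertext
# so that the same key can be re-derived from the password later.
salt = os.urandom(16)
key = derive_key("example data password", salt)
```

The same password and salt always yield the same key, so only the password needs to be re-entered for a later get operation; the derived key itself can be discarded after use.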

When the file is saved, the client daemon encrypts data parts using the keys generated from the user-supplied password. After these parts are mutually XOR-ed and bit-scattered, they are directed to the servers. Then these parts get encrypted with the server password on each server during their forward movement -- each server encrypts the part on top of the previous server's encryption. Actually, after the part is encrypted, it immediately moves onto another server, so that getting the part on a server while knowing that server's password is useless -- the part has been encrypted on a different server. On the way back, the part first gets decrypted, then re-encrypted again.

When the part is restored from the store back to the client following the 'get' command, it gets decrypted on the way back by each server, and finally all parts get decrypted by the client, which reassembles them back into the original file. Hope the above helps at least in PART :-)

It looks cool :)

Can users run storage servers?

Does it work via Tor?

Can servers run as Tor onion services?

Thank you for your enthusiasm -- we think Cryptomove is cool too!

There are no special provisions right now to have storage in the cloud or on any special servers. This alpha version has servers making TCP/IP connections. The data is stored in directories that have to reside on mounted disks. So, as long as TCP/IP connection works and directories are available, the servers may reside anywhere; as long as disks are mounted, they can also reside anywhere (say, NFS).

Obviously, versions beyond alpha might choose to have a hierarchy of storage devices -- from memory disks to mounted disks to cloud storages.

Cryptomove does not use any Tor interface.

The current version randomly sends data parts from one server to another via a TCP/IP connection, or moves the part within the local store. Therefore, the path traveled is not known in advance. However, the movement is star-like -- after the part hops randomly N steps, it retracts back to the point of origin, retracing the same path.
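The star-like movement can be sketched as a random walk followed by the same walk in reverse. This is only an illustration: `plan_round_trip` is a hypothetical name, and picking each hop uniformly from the server list (which may pick the current server again, standing in for a local move) is an assumption about how the randomness works.

```python
import random

def plan_round_trip(base, servers, steps, rng):
    """Return (forward, back): a random forward path of `steps` hops
    starting at `base`, and the return leg retracing it in reverse."""
    forward = [base]
    for _ in range(steps):
        # Assumed policy: any server may be chosen, including the current one.
        forward.append(rng.choice(servers))
    # Retrace from the second-to-last stop back down to the base.
    back = forward[-2::-1]
    return forward, back

rng = random.Random()  # unseeded, so the path is not known in advance
forward, back = plan_round_trip("base", ["s1", "s2", "s3"], steps=4, rng=rng)
```

Here `forward` holds the base plus four hops, and `back` lists the four stops visited on the way home, ending at the base.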

On the forward path, each data part is encrypted (AES) on each server with that server's password, so at the end of the path of N steps it is encrypted N times. When retracting, on each server it gets decrypted, and then re-encrypted again.

When it comes back to the base server, it immediately starts another random path forward, again being encrypted over and over while moving away from the base, and being decrypted/re-encrypted on the way back.

By default, the movement is not frequent -- each part moves once a day. If there are many parts, then one would see constant movement of random parts in random directions. However, the data user may increase the default movement frequency. Similarly, system owners (Admins) can slow down or accelerate the frequency of any part that travels through a particular server.

Thanks.

Let's say that I had a server at <https://dbshmc5frbchaum2.onion>. Could I point Cryptomove to, for example, <https://dbshmc5frbchaum2.onion.to>?

Does Cryptomove require UDP?

Yes, you can do that. Please see section 4.4.3 "Cluster Membership" from the Guide, on page 50 -- it explains how to set hosts to connect to and hosts to prohibit connections from.

Cryptomove does not require UDP.

Thanks. I will definitely test this.

How do you ensure that you don't forget data while you're moving it around?

When the part lands on a server, it initiates a periodic pulse -- a sequence of messages sent from that server back to the base server from where the part had originated. Each message traverses the path traveled so far in the backward direction, and drops a little encrypted file that contains a reference to the previous server. This way, when a user gets the file from the store, the servers follow the tracks of each part.

When a part leaves a server to another server, it stops the current pulse -- the target server initiates a new pulse.

Each pulse erases the previous track file on each server it travels through.
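One way to picture the pulse and track files is as a breadcrumb trail. The sketch below is an interpretation, not Cryptomove's implementation: it assumes each track points one hop closer to the part, that the path has no repeated servers, and that the server currently holding the part keeps no outgoing track. The names `send_pulse` and `locate` and the in-memory `tracks` dictionary (standing in for the encrypted track files) are hypothetical.

```python
def send_pulse(tracks, part_id, path):
    """Simulate one pulse. `path` lists servers from the base to the server
    currently holding the part. Each server on the way back gets a track
    naming the next hop toward the part, replacing any older track."""
    # Assumption: the holder itself has no track; drop a stale one if present.
    tracks.get(path[-1], {}).pop(part_id, None)
    for i in range(len(path) - 1, 0, -1):
        toward_part = path[i]
        server = path[i - 1]
        tracks.setdefault(server, {})[part_id] = toward_part

def locate(tracks, part_id, base):
    """Follow tracks from the base until a server has none for this part."""
    server = base
    seen = {server}
    while part_id in tracks.get(server, {}):
        server = tracks[server][part_id]
        if server in seen:
            raise RuntimeError("track loop detected")
        seen.add(server)
    return server

tracks = {}
send_pulse(tracks, "part-1", ["base", "s1", "s2"])
holder = locate(tracks, "part-1", "base")
```

After the part moves and a new pulse runs, the overwritten tracks lead to the new holder instead of the old one.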

What if two neighboring servers fail at the same time?

Currently, no fail-over of any kind is implemented in Cryptomove.

If one or more servers go down, all parts that currently reside on those servers obviously remain there. When a down server comes back again, it restarts the movement of all parts that used to reside there before the failure.

This may hamper delivery of data parts upon a restore request in case some parts reside on the down server. However, the parts of the saved file are always duplicated on the client before they are directed to the servers. Thus, if enough servers are still up, the restore request may still fetch copies that are on the up servers and whose path back to the base also goes through the up servers.

Again, currently copies of the same data part travel independently and randomly. In the worst case scenario it may happen that all of them end up on the down server, or that for all of them the path back to their base server has a down server. This, however, seems unlikely if there are enough copies and up servers.
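A rough back-of-the-envelope model of that worst case, under assumptions that are not in the original: each server is down independently with probability `p_down`, each copy needs `path_len` servers up to get back to its base, and the copies' paths do not share servers. The function name is hypothetical.

```python
def prob_restore_fails(p_down: float, path_len: int, copies: int) -> float:
    """Probability that every copy of a part is unreachable, assuming
    independent server failures and non-overlapping return paths."""
    # One copy is reachable only if all servers on its return path are up.
    p_copy_reachable = (1.0 - p_down) ** path_len
    # The restore fails for this part only if all copies are unreachable.
    return (1.0 - p_copy_reachable) ** copies

few = prob_restore_fails(p_down=0.05, path_len=8, copies=2)
many = prob_restore_fails(p_down=0.05, path_len=8, copies=5)
```

Under this model the failure probability shrinks geometrically with the number of copies and grows with path length, which matches the intuition in the comment above. Real paths can overlap, so the true figure may be worse.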

Also, when a server decides to push a data part onto another server, it only does it onto a server that is up. All servers maintain keep-alive heartbeats with the members of their clusters, so they know which cluster servers are up and which are down. Of course, it may happen that a server goes down in the middle of a data piece transmission. In this case, if it is the source server, it will restart transmission upon its own restart. If it is the target server, the source server will receive a timeout or an exception, and will re-transmit the same part later to an online server (might even be the same target server that went down in case it had come back again).
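The heartbeat and retry behavior described above could look roughly like this. It is a sketch only: the `Cluster` class, the timeout value, the `send` callback, and the retry limit are all hypothetical, and marking a target as down after a timeout is an assumed policy.

```python
import random

class Cluster:
    """Tracks the last heartbeat time seen from each cluster member."""
    def __init__(self, members, timeout=5.0):
        self.timeout = timeout
        self.last_seen = {m: None for m in members}

    def heartbeat(self, member, now):
        self.last_seen[member] = now

    def up_members(self, now):
        return [m for m, t in self.last_seen.items()
                if t is not None and now - t <= self.timeout]

def push_part(cluster, send, part, now, rng, max_attempts=3):
    """Push `part` to a randomly chosen up server; on timeout, try again.
    Returns the server that accepted the part, or None if none did."""
    for _ in range(max_attempts):
        candidates = cluster.up_members(now)
        if not candidates:
            return None
        target = rng.choice(candidates)
        try:
            send(target, part)
            return target
        except TimeoutError:
            # Assumed policy: consider it down until its next heartbeat.
            cluster.last_seen[target] = None
    return None
```

A caller would invoke `heartbeat` whenever a keep-alive arrives and call `push_part` when a part is due to move; a `None` result means the part stays where it is until a later attempt.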

How does the re-encryption work? Does the data get de-crypted then re-encrypted, or is it always double encrypted? Maybe I'm not understanding the "mutating" crypto part. I think it would help me if you explained that part more.

Parts travel from the base server deep into the network no more than a specified number of steps (8 by default). The path is always random; it may include loops and even hopping onto the same server. When it reaches the max length, the part returns back along the same path but in reverse order.

Before a part gets to the base server, the client encrypts it with the data password supplied by the user.

On the forward path, each data part is encrypted (AES) on each server with that server's password, so at the end of the path of N steps it is encrypted N times. When retracting, on each server it gets decrypted, and then re-encrypted again.

When it comes back to the base server, it immediately starts another random path forward, again being encrypted over and over while moving away from the base, and being decrypted/re-encrypted on the way back.

When a user retrieves the file, the client asks the base servers to retrieve the relevant parts. The parts travel back to the base servers retracing the path, being decrypted on each of the servers. After all needed parts arrive, the client fetches them from the base servers and decrypts them one last time using the data password.
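The layering described in this reply can be sketched as follows. To stay within the Python standard library, the sketch swaps AES for a toy XOR keystream built from SHA-256; that stand-in is not secure and is not what Cryptomove uses. The function names and keys are hypothetical, and the sketch covers only the store and retrieve legs, not the continuous re-encryption while a part wanders.

```python
import hashlib

def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream. Stand-in for AES only;
    applying it twice with the same key restores the input."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        stream = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ s for b, s in zip(chunk, stream))
    return bytes(out)

def store(part: bytes, data_key: bytes, server_keys: list) -> bytes:
    """Client encrypts first, then each server on the forward path adds a layer."""
    blob = toy_cipher(data_key, part)
    for key in server_keys:
        blob = toy_cipher(key, blob)
    return blob

def retrieve(blob: bytes, data_key: bytes, server_keys: list) -> bytes:
    """Servers peel their layers in reverse path order, then the client decrypts.
    (XOR layers happen to commute; a real block cipher needs this exact order.)"""
    for key in reversed(server_keys):
        blob = toy_cipher(key, blob)
    return toy_cipher(data_key, blob)

path_keys = [b"server-1-key", b"server-2-key", b"server-3-key"]
blob = store(b"one data part", b"client-data-key", path_keys)
restored = retrieve(blob, b"client-data-key", path_keys)
```

A part captured at the far end of the path carries every layer, so recovering it takes the client key plus the key of every server it passed through.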

Cryptomove means "Secret Purple".