Hacker News new | ask | show | jobs
by bodyfour 4518 days ago
Unfortunately it includes my biggest gripe about IMAP -- the requirement that messages are given a server-side message number. Maybe breaking with IMAP on this would make interop too hard though.

This requirement is convenient for the client but makes implementing robust servers harder than it need be. If instead each message was identified by a {timestamp,UUID} tuple then multiple MX servers could do final delivery to an eventually-consistent shared store. The requirement for strict, durable message ordering is the one thing that forces hard synchronization among multiple machines doing delivery.

POP3 actually made a step this direction with the common "UIDL" extension; IMAP4 felt like going backwards on this.

While I'm on the subject, another less-common POP3 extension I liked was "XTND XMIT". This gave a simple way for the POP3 client to send mail; sadly it never really caught on in favor of just using SMTP submission. Originally though this only worked OK because SMTP was open-relay. Once spam put an end to that it became complicated to send a message when accessing your mail remotely. Then to solve that problem we had to graft an auth layer on top of the existing SMTP protocol and upgrade all of the clients and servers to support it. If only "XTND XMIT" had won in the beginning everybody would be submitting outgoing mail over that already-authenticated channel and so much work could have been avoided.

So anyway, those are my two requests for any "IMAP killer": allow the server to use unique (but non-sequential) IDs for messages and provide a way to submit outgoing messages over the same channel.

4 comments

I'm in two minds about the UID stuff. There are advantages and disadvantages.

But one thing that you MUST have for sensible synchronisation is the global MODSEQ. Otherwise there's no single state you can use to know if there's new data on the server short of (as you do with POP3) fetching the entire UIDL list every time.

The way we do this with Cyrus replication is to reinject the message if we get a UID clash - so you can have delivery at both ends of a (theoretical - not yet finished) master<=>master replica pair. Upon discovering the mismatch, BOTH messages get new UIDs larger than any yet used, which brings all clients into agreement.

But anyway - our unique identifiers in JMAP can be anything the server would like. We use 'fNUMBERuNUMBER' at FastMail, which is a folder/uid pair - but a more modern server could use UUIDs that don't change for moves between folders. We could even implement it in Cyrus on top of the Digest.SHA1 field if we built a reverse mapping database and deduplicated on it.

But synchronised modseqs... I don't see a way around that. We've been discussing "modseq validity range" stuff - where a delivery logs an intent to use a modseq, and then clears the intent upon commit. Any request that includes a range past that intent needs to wait for the commit to complete or abort.

For anti-entropy in a distributed system, one could use a combination of the following:

1. Get everything created since last sync. A created index (as opposed to updated index) would not need to be resorted for subsequent updates. This would get only new mail.

2. A fixed-size Merkle tree, 8 blocks per level, 7 levels deep, with around 262144 blocks at the base. This would be the best tree configuration for syncing from a few thousand up to 40 million emails. This would identify all mail updated since last connection, within 4 roundtrips with a minimum of bandwidth. The tree should be incrementally updated using XOR at both client and server whenever an update is made. It's possible to use a tree with less depth for fewer roundtrips but with support for less emails or requiring more bandwidth.

3. Push sync. Server would push all creates/updates to client while client is connected.

Specifically, you want a Merkle tree where the final bucket is based on the lowest N bits of the message arrival time, probably at a granularity of a several seconds. In the typical sync case all of the unseen messages will then be clustered in a few buckets and the Merkle tree will be very efficient.

For instance, if each bucket contains 8 seconds worth of arrival time, then 2^17 buckets mean they'll wrap around only every ~12 days. So if you are syncing after a day away there will be only 8% of the buckets potentially dirty (even if you get an incredible flood of mail!)

Having the bucket use based on time also allows you to avoid some of the roundtrip costs when doing frequent syncing. In a single request, the client can speculatively send its hashes of buckets that are likely to have changed since its last sync time along with its root hash. The server finds which buckets changed, sending those new bucket contents -- it can then see that the top level hash must now agree on both sides and you're done in one RT.

Of course it's possible that other buckets have changed in rare edge cases; in this case the top hashes will not match and you have to do the full Merkle descent to resync.

That's an interesting idea to hash emails to Merkle buckets by arrival time. It may lead to uneven distribution of emails to buckets though, increasing the amount of bandwidth required to sync a bucket in the worst case.

There are other optimizations you can bring in when you sync the tree:

1. If you are working your way down the remote's tree and you notice that your local signature is zero for the equivalent remote's node signature, then you know that your entire subtree is empty, and you can short-circuit and start downloading entire buckets.

2. Conversely, if you are working your way down the remote's tree and you notice that your local signature is present but the equivalent remote's node signature is zero, then you know that your entire subtree has data, but the remote's entire subtree is empty, and you can short-circuit and start uploading entire buckets.

3. As you work your way down sections of the tree, you can start to build an idea of the average email to bucket ratio. If several buckets are only likely to contain at most one email, then you can short-circuit again.

I think the LSB bits of arrival time is actually a really good distribution long-term (remember: we want the distribution to be uneven in the short term, but even in the long term) Maybe you could improve it a little bit by making each bin a prime number of seconds so automated messages sent at the same time every day can hit each bucket eventually. i.e. instead of 8 second granularity, use 7 or 11.
(Author of the spec here)

A fixed-size Merkle tree is a great idea if we wanted to sync the entire set of messages between distributed MX destinations. In fact, we should probably consider that approach for our server <-> server replication. However, JMAP is aimed at client <-> server sync. There are a number of different issues to consider here. The client may have limited or no cache (a webmail app for instance must normally start anew each time, and needs to efficiently fetch just what it needs; if we had to wait for a 60GB mailbox to sync every time we loaded our webmail, well, obviously that's not going to work. A mobile mail client often only wants to download the first few items in the mailbox given the current sort order so it can display them to the client. The JMAP spec allows a client to download, say, just the list of the first 50 messages, newest first, in the Inbox (single round trip). Then, at a later point, it can ask for and receive a space-efficient delta update of what has changed within those first 50 messages in the Inbox, regardless of how big the Inbox really is, or what other messages may have been delivered elsewhere (including lower down the Inbox, but outside the top 50). Again in a single round trip. This is really powerful: you can see that by comparing the refresh time of a folder in the FastMail mobile web app with the iOS or Android native mail app (over IMAP); the mobile web app is much quicker.

The other thing to remember is that the metadata about the message is mutable, and needs to be kept in sync as well. Again, the modseq system described in JMAP allows this to happen efficiently. If you have a distributed server system with multiple masters, you will again use a different system to keep those in sync (as they each assign new modseqs, whereas the client does not). However, this too can be dealt with relatively easily: if the modseq for a message differs between two masters, it must be set on both to max(the higher of the two modseqs, the next highest global modseq on the server with the currently lower modseq). This will ensure that both will end up in the same state, and that the client will get refresh its state for that message if it has to, no matter which master it last synced with.

So essentially, there's a difference in what you want in a protocol for server <-> server synchronisation for distributed mail backends compared to client <-> server sync for mail apps. JMAP is focussed on the latter, but tries to make sure it doesn't include assumptions that severely limit the former. The spec does not require a particular format of message ids: they do not have to be based around IMAP's ascending UIDs (although the sync algorithm for partial mailbox listings may be slightly less efficient if not).

> if we had to wait for a 60GB mailbox to sync every time we loaded our webmail, well, obviously that's not going to work

First, the Merkle tree would just be of the IDs not the message contents. I.e. the problem it is trying to solve is for the heavy-weight client that wants to resync its idea of which message UIDs are contained in a mailbox. Which ones it then requests metadata (or full-contents) is up to it.

It's true that for a ephemeral client (like a webapp) you don't even want to get the whole list of 100K messages in your inbox (although you probably will ask the server for a count of them) However, here having a sequential message number hardly helps either. For example, when I click on "sort by date" in my mail client it shows me the messages by sending date, not by what integer IMAP UID they were given.

The problem of "give me the metadata for the top 100 messages sorted by criteria X" is just something that has to be offered by the server. This is no different than features like searching -- if bandwidth were infinite the client could do it, but it's not so you need to do the work near where the data is stored.

The server will have to maintain data structures to make these kinds of queries efficient. The important point however, is that these are conceptually just caches of information that is in the mailstore. Therefore there is no need to keep that synced between far-flung servers.

> The other thing to remember is that the metadata about the message is mutable, and needs to be kept in sync as well

Yes, message data and metadata are separate problems. If two servers both try to update the metadata for message X at the same instant one needs to win (which one is probably not that important) If two servers both try to add new messages to a folder at the same instant they both need to win, ideally without having to coordinate at all in advance.

For the metadata you can use a modseq, a Lamport timestamp, or probably even just a {wallclock time,server ID} tuple. Assuming good time synchronization is reasonable for a mail cluster. If two clients both try to change message flags within 10ms of each other it probably isn't important who wins as long as somebody definitively does.

I've been waiting a bit to reply to this to see if there's a polite way to answer, but there isn't.

You suffer from a disease, and it's far too bloody common amongst people who design IETF standards.

You're optimising a rare case, where in the worst case you would be re-iding the bulk of the messages delivered in a time period of a few hours, leading clients to have to re-fetch the metadata again (note: it's only the modseq that would change in our design - the UUID is still globally unique and unchaged).

That's no worse than "FETCH 1:* FLAGS" now.

And in exchange for this slight benefit, you would cause EVERY SINGLE CLIENT EVERYWHERE to implement a significantly more complex protocol which requires 4 roundtrips to resync.

Talk about a dog with a very big, very waggy tail.

I'm sorry, but I can't take this suggestion seriously. I've seen too many protocols which are designed with this sort of focus on the worst cases. You have to make sure they don't totally suck - but the important place to optimise is the common case.

...

It turns out there is a way you can gain almost all of the benefits you're looking for - which is delayed MODSEQ allocation. Newly delivered messages can be created with no MODSEQ. When you sync with another either client or server, you allocate a MODSEQ higher than than any which the other end has seen.

Which means you only pay an extra cost if a JMAP peer or client connects at both ends during the link downtime.

The reinjection on clash (CSMA/CD for numbers :-) works in close-knit clusters but has issues with network splits. Imagine one server is in SF and the other is in Amsterdam, and a network problem makes them unable to talk for a few hours. In the mean time, mail gets delivered and read on each server. Then they reconnect to each other and you find a storm of conflicts to resolve.

Also, I strongly believe that per-message UUIDs should always start with the delivery time. That way as long as your servers are all reasonably synchronized you can get reasonable "sort by folder order" behavior without any sequential ID.

You're right that the modseq issue is hard, but the Merkle tree suggestion from @jorangreef can address that. Refer to my reply to him for more details.

Some IMAP servers allow you to use a send folder to inject mail to be sent. I think that's a pretty reasonable way to go about it with the way IMAP works, but afaik no client implements it.

Agreed on the messageid thing. I frankly think that IMAP is too messy to try too hard for interop with. Using a relatively similar data model and being able to work correctly with a maildir should be considered enough.

But I love this idea and hope it takes off in spite of that. It's really time to revamp the interface to mail servers to be more friendly to webmail. It'll make a huge difference to the level of choice of webmail providers/software.

Do those IMAP servers allow separate control of the envelope recipients, or does it just take them directly from the message headers? To be fair, POP3's "XTND XMIT" had that flaw as well (IIRC qpopper just passed the message to "sendmail -t")

A proper mail submission command would preserve parity with SMTP by keeping envelope and headers distinct. I just want to leverage the fact that I've already auth'ed the user and shouldn't have to redo that to a separate server.

Also, presumably one of the advantages to the REST+JSON design of JMAP is to make it easy for javascript clients to use it. Since they can't do direct SMTP anyway there will presumably be some REST service that they will use for submission.

It just seems that it should be part of the same standard. Most mail readers are also mail senders.

> Unfortunately it includes my biggest gripe about IMAP -- the requirement that messages are given a server-side message number. Maybe breaking with IMAP on this would make interop too hard though.

I don't understand this criticism, the criticism is usually that IMAP does not require maintaining UIDs (the server can announce UIDNOTSTICKY[0])

> While I'm on the subject, another less-common POP3 extension I liked was "XTND XMIT". This gave a simple way for the POP3 client to send mail

https://news.ycombinator.com/item?id=4826447

> IMAP has actually been extended to support that feature (not just kind of randomly, but by the IETF "Lemonade" working group), even allowing (in concert with other extensions) compositing a message from parts (such as attachments) on the server without having to download them first to the client.

> However, and I think this is actually more relevant in this context, Mark Crispin was an "opponent". (Note: the following e-mail snippet by Mark Crispin, was written years after the Lemonade Submit proposal expired, and thereby should be considered to already take into account that context.)

>> For many years, there have been various proposals to add mail sending capabilities to mail access protocols such as POP and IMAP.

>> These proposals are always strongly opposed. It is one of the "attractive nuisances" of email protocols. The value of the capability is obvious to many people, but the high cost of having it in POP or IMAP is much less obvious.

>> I am one of the opponents. For the past 25 or so years we have been in the overwhelming majority. It is quite unlikely that this concensus will change. If anything, it has become stronger in recent years.

>> […]

>> If you're not really that curious, suffice it to say that the people who design the protocols and systems understand the attraction but have excellent reasons not to do it.

http://mailman2.u.washington.edu/pipermail/imap-uw/2009-Janu...

[0] http://mailman2.u.washington.edu/pipermail/imap-protocol/200...

Mark Crispin was wrong. The underlying model of IMAP is strong, but a lot of the complexity of mail access comes from clients working around things which are either broken or missing in the protocol.

Their reasons aren't excellent, they are some theoretical notion of purity which leads to things like the "MOVE" capability taking years to finally happen, and everyone implementing copy+store+expunge independently and poorly in the meanwhile.

He's dead now, so he can't tell me I'm a feckless Gen X with no clue how to do real protocols - but there's a reason why proprietary protocols like ActiveSync get traction, and it's not because you need to pay a licence fee - it's because they actually solve problems that need solving.

(Author of the spec here)

We will be adding sending to the JMAP spec, although support may be optional. The way it currently works at FastMail is a bit too specific to our architecture though, so we want to clean it up before adding it to the spec.

>>> These proposals are always strongly opposed. It is one of the "attractive nuisances" of email protocols. The value of the capability is obvious to many people, but the high cost of having it in POP or IMAP is much less obvious.
Lucky I'm not considering adding it to either POP or IMAP then. :)
Good luck with your doom-repeating experiment, I guess.
> the criticism is usually that IMAP does not require maintaining UIDs

Yes, exactly. UIDs should be sticky, but they should also be allocatable by widely distributed nodes without the need for central coordination.

If your UIDs are 32-bit monotonically-increasing integers then this is impossible. If they are 128-bit random numbers you get it for free. If you prefix them with a timestamp you even get reasonable ordering.

The entire UIDNOTSTICKY problem is a result of IMAP UIDs being so restricted.

I can see why a proposal as complicated as Lemonade failed to get traction, but I find the argument pretty thin. The "attractive nuisance" was the old open-relay SMTP infrastructure. If you were writing a POP client in 1991 you wouldn't think twice about sending mail since you just had to hit a SMTP server (didn't even have to be the "right" one!) and the mail would get delivered. When this all came crashing down far more effort got put into authenticated SMTP (talk about not separating concerns!) than would have been required to just get message submission right in the first place.

You could `sha1` the messages, similar to how `git` deals with blobs, trees, etc?