
by kragen 210 days ago

There are a lot of possible design alternatives to TCP within the "create a reliable stream of data on top of an unreliable datagram layer" space:

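• Full-duplex connections are probably a good idea, but certainly are not the only way, or the most obvious way, to create a reliable stream of data on top of an unreliable datagram layer. TCP itself also supports a half-duplex mode: even if one end sends FIN, the other end can keep transmitting as long as it wants. This was probably also a good idea, but it's certainly not the only obvious choice.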

• Sequence numbers on messages or on bytes?

• Wouldn't it be useful to expose message boundaries to applications, the way 9P, SCTP, and some SNA protocols do?

• If you expose message boundaries to applications, maybe you'd also want to include a message type field? Protocol-level message-type fields have been found to be very useful in Ethernet and IP, and in a sense the port-number field in UDP is also a message-type field.

• Do you really need urgent data?

• Do servers need different port numbers? TCPMUX is a straightforward way of giving your servers port names, like in CHAOSNET, instead of port numbers. It only creates extra overhead at connection-opening time, assuming you have the moral equivalent of file descriptor passing on your OS. The only limitation is that you have to use different client ports for multiple simultaneous connections to the same server host. But in TCP everyone uses different client ports for different connections anyway. TCPMUX itself incurs an extra round-trip time delay for connection establishment, because the requested server name can't be transmitted until the client's ACK packet, but if you incorporated it into TCP, you'd put the server name in the SYN packet. If you eliminate the server port number in every TCP header, you can expand the client port number to 24 or even 32 bits.

• Alternatively, maybe network addresses should be assigned to server processes, as in AppleTalk (or IP-based virtual hosting before HTTP/1.1's Host: header, or, for TLS, before SNI became widespread), rather than assigning network addresses to hosts and requiring port numbers or TCPMUX to distinguish multiple servers on the same host?

• Probably SACK was actually a good idea and should have always been the default? SACK gets a lot easier if you ack message numbers instead of byte numbers.

• Why is acknowledgement reneging allowed in TCP? That was a terrible idea.

• It turns out that measuring round-trip time is really important for retransmission, and TCP has no way of measuring RTT on retransmitted packets, which can pose real problems for correcting a ridiculously low RTT estimate, which results in excessive retransmission.

• Do you really need a PUSH bit? C'mon.

• A modest amount of overhead in the form of erasure-coding bits would permit recovery from modest amounts of packet loss without incurring retransmission timeouts, which is especially useful if your TCP-layer protocol requires a modest amount of packet loss for congestion control, as TCP does.

• Also you could use a "congestion experienced" bit instead of packet loss to detect congestion in the usual case. (TCP did eventually acquire CWR and ECE, but not for many years.)

• The fact that you can't resume a TCP connection from a different IP address, the way you can with a Mosh connection, is a serious flaw that seriously impedes nodes from moving around the network.

• TCP's hardcoded timeout of 5 minutes is also a major flaw. Wouldn't it be better if the application could set that to 1 hour, 90 minutes, 12 hours, or a week, to handle intermittent connectivity, such as with communication satellites? Similarly for very-long-latency datagrams, such as those relayed by single LEO satellites. Together this and the previous flaw have resulted in TCP largely being replaced for its original session-management purpose with new ad-hoc protocols such as HTTP magic cookies, protocols which use TCP, if at all, merely as a reliable datagram protocol.

• Initial sequence numbers turn out not to be a very good defense against IP spoofing, because that wasn't their original purpose. Their original purpose was preventing the erroneous reception of leftover TCP segments from a previous incarnation of the connection that have been bouncing around routers ever since; this purpose would be better served by using a different client port number for each new connection. The ISN namespace is far too small for current LFNs anyway, so we had to patch over the hole in TCP with timestamps and PAWS.
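To put a number on that last point: TCP sequence numbers count bytes in a 32-bit space, so the space wraps after 2³² bytes. The rough arithmetic below shows how quickly a sender saturating the link gets there. The link rates are illustrative assumptions, and the function name is made up for this sketch.

```python
SEQ_SPACE_BYTES = 2 ** 32  # TCP sequence numbers are 32 bits and count bytes


def seconds_to_wrap(link_bits_per_second: float) -> float:
    """Time for a sender saturating the link to consume the whole sequence space."""
    return SEQ_SPACE_BYTES * 8 / link_bits_per_second


# Illustrative link rates, not measurements.
for label, rate in [("10 Mbit/s", 10e6), ("1 Gbit/s", 1e9), ("10 Gbit/s", 10e9)]:
    print(f"{label:>10}: wraps in about {seconds_to_wrap(rate):,.1f} s")
```

At 10 Gbit/s the space wraps in roughly 3.4 seconds, which is short enough that an old duplicate segment can land inside a later window unless something like the timestamp check in PAWS rejects it.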

4 comments

zackmorris 208 days ago

I just want to say that it's refreshing to stumble onto someone commenting in the same style that I do. Where most people see things that are good enough, hard to fix or innovative, I see things for their fatal flaws, how they should have been done right from the start and why they are obvious. So I'll just add my list of gripes about TCP that in many ways ruined the internet for decades, and maybe still do:

  - TCP should have been a reliability layer above UDP, not beside it (made P2P harder than it should be, mainly burdening teleconferencing and video games)
  - Window size field bytes should have been arbitrary length
  - Checksum size field bytes should have been arbitrary length and the algorithm should have been optionally customizable
  - Ports should have been unique binary strings of arbitrary length instead of numbers, and not limited in count (as mentioned)
  - Streams should have been encrypted by default, with clear transmission as the special case (symmetric key encryption was invented before TCP)
  - IP should have connected to an arbitrary peer ID, not a MAC address, for resumable sessions if network changes (maybe only securable with encryption)
  - Encrypted streams should not have been on a special port for HTTPS (not TCP's fault)
  - IP address field bytes should have been arbitrary length (not TCP's fault)
  - File descriptors could have been universal instead of using network sockets, unix sockets, files, pipes and bind/listen/accept/select (not TCP's fault)
  - Streams don't actually make sense in the first place, we needed state transfer with arbitrary datagram size and partial sends/ranges (not TCP's fault)

Linking this to my "why your tunnel won't work" checklist:

https://news.ycombinator.com/item?id=44713493

I want to add that the author of the article wrote one of the cleanest and most concise summaries of the TCP protocol that I've ever read.


musicale 210 days ago

AppleTalk didn't get much love for its broadcast (or possibly multicast?) based service discovery protocol - but of course that is what inspired mDNS. I believe AppleTalk's LAN addresses were always dynamic (like 169.x IP addresses), simplifying administration and deployment.

I tend to think that one of the reasons Linux containers are needed for network services is that DNS traditionally only returns an IP address (rather than address + port) so each service process needs to have its own IP address, which in Linux requires a container or at least a network namespace.

AppleTalk also supported a reliable transaction (basically request-response RPC) protocol (ATP) and a session protocol, which I believe were used for Mac network services (printing, file servers, etc.). Certainly easier than serializing/deserializing byte streams.


kragen 210 days ago

Does "session protocol" mean that it provided packet retransmission and reordering, like TCP? How does that save you serializing and deserializing byte streams?

I agree that, given the existing design of IP and TCP, you could get much of the benefit of first-class addresses for services by using, for example, DNS-SD, and that is what ZeroConf does. (It is not a coincidence that the DNS-SD RFC was written by a couple of Apple employees.) But, if that's the way you're going to be finding endpoints to initiate connections to, there's no benefit to having separate port numbers and IP addresses. And IP addresses are far scarcer than just requiring a Linux container or a network namespace: there are only 2³² of them. But it is rare to find an IP address that is listening on more than 64 of its 2¹⁶ TCP ports, so in an alternate history where you moved those 16 bits from the port number to the IP address, we would have one thousandth of the IP-address crunch that we do.
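The "one thousandth" figure is just the ratio of port numbers available per address to ports actually in use. A quick check, treating the 64-listening-ports figure above as the rough assumption it is:

```python
PORT_BITS = 16
ASSUMED_PORTS_IN_USE = 64  # the rough upper bound suggested above, not a measurement

ports_per_address = 2 ** PORT_BITS
# If the 16 port bits became address bits, each of today's addresses would turn
# into 65536 addresses, of which a busy host would need about 64.
expansion_factor = ports_per_address // ASSUMED_PORTS_IN_USE

print(expansion_factor)  # 1024
```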

Historically, possibly the reason that it wasn't done this way is that port numbers predated the DNS by about 10 years.


musicale 210 days ago

The session protocol was for sessions with servers and was used for AFP (AppleShare file servers) I believe.

The higher level protocols were built on ATP which was message based.

ADSP was a stream protocol that could be used for remote terminal access or other applications where byte streams actually made sense.

> Historically, possibly the reason that it wasn't done this way is that port numbers predated the DNS by about 10 years.

Predated or postdated?

My understanding is that DNS can potentially provide port numbers, but this is not widely used or supported.


kragen 210 days ago

DNS postdated port numbers.

Mockapetris's DNS RFCs are from 01983, although I think I've talked to people who installed DNS a year or two before that. Port numbers were first proposed in RFC 38 in 01970 https://datatracker.ietf.org/doc/html/rfc38

> The END and RDY must specify relevant sockets in addition to the link number. Only the local socket name need be supplied

and given actual numbers in RFC 54, also in 01970 https://datatracker.ietf.org/doc/html/rfc54

> Connections are named by a pair of sockets. Sockets are 40 bit names which are known throughout the network. Each host is assigned a private subset of these names, and a command which requests a connection names one socket which is local to the requesting host and one local to the receiver of the request.

> Sockets are polarized; even numbered sockets are receive sockets; odd numbered ones are send sockets. One of each is required to make a connection.

In RFC 129 in 01971 we see discussion about whether socketnames should include host numbers and/or user numbers, still with the low-order bit indicating the socket's gender (emissive or receptive). https://datatracker.ietf.org/doc/html/rfc129

RFC 147 later that year https://datatracker.ietf.org/doc/html/rfc147 discusses within-machine port numbers and how they should or should not relate to the socketnames transmitted in NCP packets:

> Previous network papers postulated that a process running under control of the host's operating system would have access to a number of ports. A port might be a physical input or output device, or a logical I/O device (...)

> A socket has been defined to be the identification of a port for machine to machine communication through the ARPA network. Sockets allocated to each host must be uniquely associated with a known process or be undefined. The name of some sockets must be universally known and associated with a known process operating with a specified protocol. (e.g., a logger socket, RJE socket, a file transfer socket). The name of other sockets might not be universally known, but given in a transmission over a universally known socket, (c. g. the socket pair specified by the transmission over the logger socket under the Initial Connection Protocol (ICP). In any case, communication over the network is from one socket to another socket, each socket being identified with a process running at a known host.

RFC 167 the same year https://datatracker.ietf.org/doc/html/rfc167 proposes that socketnames not be required to be unique network-wide but just within a host. It also points out that you really only need the socketname during the initial connection process, if you have some other way of knowing which packets belong to which connections:

> Although fields will be helpful in dealing with socket number allocation, it is not essential that such field designations be uniform over the network. In all network transactions the 32-bit socket number is handled with its 8-bit host number. Thus, if hosts are able to maintain uniqueness and repeatability internally, socket numbers in the network as a whole will also be unique and repeatable. If a host fails to do so, only connections with that offending host are affected.

> Because the size, use, and character of systems on the network are so varied, it would be difficult if not impossible to come up with an agreed upon particular division of the 32-bit socket number. Hosts have different internal restrictions on the number of users, processes per user, and connections per process they will permit.

> It has been suggested that it may not be necessary to maintain socket uniqueness. It is contended that there is really no significant use made of the socket number after a connection has been established. The only reason a host must now save a socket number for the life of a connection is to include it in the CLOSE of that connection.

RFC 172 in June https://datatracker.ietf.org/doc/html/rfc172 proposes using port 3 for the second version of FTP:

> [6] It seems that socket 1 has been assigned to logger. Socket 3 seems a reasonable choice for File Transfer.

This updates the first version in RFC 114 in April https://datatracker.ietf.org/doc/html/rfc114 which said:

> [16] It seems that socket 1 has been assigned to logger and socket 5 to NETRJS. Socket 3 seems a reasonable choice for the file transfer process.

RFC 196 the same year https://datatracker.ietf.org/doc/html/rfc196 proposes to use port 5 to receive mail and/or print jobs:

> Initial Connection will be as per the Official Initial Connection Protocol, Documents #2, NIC 7101, to a standard socket not yet assigned. A candidate socket number would be socket #5.

In RFC 204 in August https://www.rfc-editor.org/rfc/rfc204.html Postel publishes the first list of port number assignments:

> I would like to collect information on the use of socket numbers for "standard" service programs. For example Loggers (telnet servers) Listen on socket 1. What sockets at your host are Listened to by what programs?

> Recently Dick Watson suggested assigning socket 5 for use by a mail-box protocol (RFC196). Does any one object ? Are there any suggestions for a method of assigning sockets to standard programs? Should a subset of the socket numbers be reserved for use by future standard protocols?

> Please phone or mail your answers and commtents to (...)

Amusingly in retrospect, Postel did not include an email address, presumably because they didn't have email working yet.

FTP's assignment to port 3 was confirmed in RFC 265 in November:

> Socket 3 is the standard preassigned socket number on which the cooperating file transfer process at the serving host should "listen". (*)The connection establishment will be in accordance with the standard initial connection protocol, (*)establishing a full-duplex connection.

In May of 01972 Postel published a list as RFC 349 https://www.rfc-editor.org/rfc/rfc349.html:

> I propose that there be a czar (me ?) who hands out official socket numbers for use by standard protocols. This czar should also keep track of and publish a list of those socket numbers where host specific services can be obtained. I further suggest that the initial allocation be as follows:

        Sockets         Assignment
        0-63            Network wide standard functions
        64-127          Host specific functions
        128-239         Reserved for future use
        240-255         Any experimental function

> and within the network wide standard functions the following particular assignment be made:

        Socket          Assignment
           1            Telnet
           3            File Transfer
           5            Remote Job Entry
           7            Echo
           9            Discard

Note that ports 7 and 9 are still assigned to echo and discard in /etc/services, although Telnet and FTP got moved to ports 23 and 21, respectively.

    tcpmux          1/tcp                           # TCP port service multiplexer
    echo            7/tcp
    echo            7/udp
    discard         9/tcp           sink null
    discard         9/udp           sink null
    systat          11/tcp          users
    daytime         13/tcp
    daytime         13/udp
    netstat         15/tcp
    qotd            17/tcp          quote
    chargen         19/tcp          ttytst source
    chargen         19/udp          ttytst source
    ftp-data        20/tcp
    ftp             21/tcp
    fsp             21/udp          fspd
    ssh             22/tcp                          # SSH Remote Login Protocol
    telnet          23/tcp

So, internet port numbers in their current form are from 01971 (several years before the split between TCP and IP), and DNS is from about 01982.

In December of 01972, Postel published RFC 433 https://www.rfc-editor.org/rfc/rfc433.html, obsoleting the RFC 349 list with a list including chargen and some other interesting services:

       Socket          Assignment

       1               Telnet
       3               File Transfer
       5               Remote Job Entry
       7               Echo
       9               Discard
       19              Character Generator [e.g. TTYTST]

       65              Speech Data Base @ ll-tx-2 (74)
       67              Datacomputer @ cca (31)

       241             NCP Measurement
       243             Survey Measurement
       245             LINK

The gap between 9 and 19 is unexplained.

RFC 503 https://www.rfc-editor.org/rfc/rfc503.html from 01973 has a longer list (including systat, datetime, and netstat), but also listing which services were running on which ARPANet hosts, 33 at that time. So RFC 503 contained a list of every server process running on what would later become the internet.

Skipping RFC 604, RFC 739 from 01977 https://www.rfc-editor.org/rfc/rfc739.html is the first one that shows the modern port number assignments (still called "socket numbers") for FTP and Telnet, though those presumably dated back a couple of years at that point:

      Specific Assignments:

         Decimal   Octal     Description                      References
         -------   -----     -----------                      ----------
         Network Standard Functions
         1         1         Old Telnet                              [6]
         3         3         Old File Transfer                   [7,8,9]
         5         5         Remote Job Entry                       [10]
         7         7         Echo                                   [11]
         9         11        Discard                                [12]
         11        13        Who is on or SYSTAT
         13        15        Date and Time
         15        17        Who is up or NETSTAT
         17        21        Short Text Message
         19        23        Character generator or TTYTST          [13]
         21        25        New File Transfer                 [1,14,15]
         23        27        New Telnet                        [1,16,17]
         25        31        Distributed Programming System      [18,19]
         27        33        NSW User System w/COMPASS FE           [20]
         29        35        MSG-3 ICP                              [21]
         31        37        MSG-3 Authentication                   [21]

Etc. This time I have truncated the list. It also has Finger on port 79.
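The Octal column is simply the same socket number written in base 8, which is why Discard shows up as both 9 and 11. A quick check against a few rows of the table above:

```python
# (decimal, octal-as-printed) pairs copied from the RFC 739 excerpt above
rows = [(7, "7"), (9, "11"), (21, "25"), (23, "27"), (31, "37")]

for decimal, octal_text in rows:
    # format(n, "o") renders n in base 8 without Python's "0o" prefix
    assert format(decimal, "o") == octal_text
    print(f"{decimal:>3} decimal = {format(decimal, 'o'):>3} octal")
```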

You say, "My understanding is that DNS can potentially provide port numbers, but this is not widely used or supported." DNS SRV records have existed since 01996 (proposed by Troll Tech and Paul Vixie in RFC 2052 https://www.rfc-editor.org/rfc/rfc2052), but they're really only widely used in XMPP, in SIP, and in ZeroConf, which was Apple's attempt to provide the facilities of AppleTalk on top of TCP/IP.
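For readers who haven't met SRV records: each one carries a priority, a weight, a port, and a target host, so the port comes from the DNS rather than from a well-known number. Clients try the lowest priority value first and spread load among equal-priority records in proportion to weight. Below is a rough sketch of that selection, using made-up records rather than a real DNS lookup. The host names, the `SrvRecord` type, and `pick_server` are all hypothetical, and this simplifies the exact ordering procedure in the SRV specification.

```python
import random
from typing import List, NamedTuple, Optional


class SrvRecord(NamedTuple):
    priority: int  # lower value is tried first
    weight: int    # relative share among records of equal priority
    port: int
    target: str


def pick_server(records: List[SrvRecord],
                rng: Optional[random.Random] = None) -> SrvRecord:
    """Pick one record: lowest priority wins, ties broken by weighted random choice."""
    rng = rng or random.Random()
    best = min(r.priority for r in records)
    candidates = [r for r in records if r.priority == best]
    weights = [r.weight for r in candidates]
    if sum(weights) == 0:
        return rng.choice(candidates)  # all weights zero: treat them as equal
    return rng.choices(candidates, weights=weights, k=1)[0]


# Hypothetical answer for _xmpp-client._tcp.example.org
records = [
    SrvRecord(10, 60, 5222, "xmpp1.example.org."),
    SrvRecord(10, 40, 5222, "xmpp2.example.org."),
    SrvRecord(20, 0, 5223, "backup.example.org."),
]
chosen = pick_server(records)
print(f"connect to {chosen.target} port {chosen.port}")
```

A real client would fall back to the priority-20 record only after the priority-10 targets fail; the sketch picks a single first choice and leaves that retry loop out.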


Animats 210 days ago

• Full-duplex connections are probably a good idea, but certainly are not the only way, or the most obvious way, to create a reliable stream of data on top of an unreliable datagram layer. TCP itself also supports a half-duplex mode—even if one end sends FIN, the other end can keep transmitting as long as it wants. This was probably also a good idea, but it's certainly not the only obvious choice.

Much of that comes from the original applications being FTP and TELNET.

• Sequence numbers on messages or on bytes?

Bytes, because the whole TCP message might not fit in an IP packet. This is the MTU problem.
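One way to see the benefit of numbering bytes: a sender can retransmit the same data cut into different-sized segments (for example after the path MTU shrinks) and the receiver can still place every byte. This is a toy sketch, not real TCP. The function names, the ISN, and the segment sizes are invented, and wraparound handling is limited to the modulo shown.

```python
SEQ_MOD = 2 ** 32  # sequence numbers are 32-bit and wrap


def segment(data: bytes, isn: int, mss: int):
    """Cut data into (sequence_number, payload) pairs of at most mss bytes."""
    return [((isn + off) % SEQ_MOD, data[off:off + mss])
            for off in range(0, len(data), mss)]


def reassemble(segments, isn: int, total_len: int) -> bytes:
    """Place each payload by its byte offset from the ISN; duplicates just overwrite."""
    buf = bytearray(total_len)
    for seq, payload in segments:
        off = (seq - isn) % SEQ_MOD
        buf[off:off + len(payload)] = payload
    return bytes(buf)


data = b"the quick brown fox jumps over the lazy dog"
isn = 4_294_967_290  # chosen near the top of the space so the numbers wrap

first_try = segment(data, isn, mss=16)   # original transmission
retransmit = segment(data, isn, mss=7)   # same bytes, smaller segments

# Mix segments from both transmissions, as a receiver might see after loss.
mixed = first_try[:1] + retransmit[2:]
assert reassemble(mixed, isn, len(data)) == data
```

With per-message sequence numbers, the retransmitted pieces would not line up with the originals, so the receiver could not combine them this way.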

• Wouldn't it be useful to expose message boundaries to applications, the way 9P, SCTP, and some SNA protocols do?

Early on, there were some message-oriented, rather than stream-oriented, protocols on top of IP. Most of them died out. RDP was one such. Another was QNet.[2] Both still have assigned IP protocol numbers, but I doubt that an RDP packet would get very far across today's internet.

This was a real gap. TCP is not a great message-oriented protocol.

• Do you really need urgent data?

The purpose of urgent data is so that when your slow Teletype is typing away, and the recipient wants it to stop, there's a way to break in. See [1], p. 8.

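• It turns out that measuring round-trip time is really important for retransmission, and TCP has no way of measuring RTT on retransmitted packets, which can pose real problems for correcting a ridiculously low RTT estimate, which results in excessive retransmission.

Yes, reliable RTT measurement is a problem.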
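The usual workaround is Karn's rule: don't take an RTT sample from any segment that was retransmitted, since you can't tell which copy the ACK answers, and back the timer off instead. Here is a minimal sketch combining that with the standard smoothed estimator. The gains and the 1-second floor follow RFC 6298; the class and method names are invented, the 60-second cap is a choice made for this sketch, and clock granularity is ignored.

```python
class RttEstimator:
    """Smoothed RTT and retransmission timeout, with Karn's rule for samples."""

    ALPHA = 1 / 8   # gain for the smoothed RTT
    BETA = 1 / 4    # gain for the RTT variance
    MIN_RTO = 1.0   # seconds
    MAX_RTO = 60.0  # seconds; an upper bound chosen for this sketch

    def __init__(self):
        self.srtt = None
        self.rttvar = None
        self.rto = self.MIN_RTO  # initial timeout before any sample

    def on_ack(self, sample_seconds: float, was_retransmitted: bool) -> None:
        if was_retransmitted:
            return  # Karn's rule: the sample is ambiguous, so discard it
        if self.srtt is None:  # first valid measurement
            self.srtt = sample_seconds
            self.rttvar = sample_seconds / 2
        else:
            # Variance is updated first, using the previous smoothed RTT.
            self.rttvar = ((1 - self.BETA) * self.rttvar
                           + self.BETA * abs(self.srtt - sample_seconds))
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * sample_seconds
        self.rto = min(self.MAX_RTO,
                       max(self.MIN_RTO, self.srtt + 4 * self.rttvar))

    def on_timeout(self) -> None:
        self.rto = min(self.MAX_RTO, self.rto * 2)  # exponential backoff


est = RttEstimator()
est.on_ack(0.200, was_retransmitted=False)  # accepted
est.on_ack(5.000, was_retransmitted=True)   # ignored under Karn's rule
print(est.srtt, est.rto)
```

The sketch also shows why a too-low estimate is hard to correct: if every segment gets retransmitted, every sample is thrown away, and only the backoff in `on_timeout` moves the timer. The TCP timestamps option was added later in part to make samples from retransmitted segments unambiguous.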

• Do you really need a PUSH bit? C'mon.

It's another legacy thing to make TELNET work on slow links. Is it even supported any more?

• Also you could use a "congestion experienced" bit instead of packet loss to detect congestion in the usual case. (TCP did eventually acquire CWR and ECE, but not for many years.)

Originally, there was ICMP Source Quench for that, but Berkeley didn't put it in BSD, so nobody used it. Nobody was sure when to send it or what to do when it was received.

• The fact that you can't resume a TCP connection from a different IP address, the way you can with a Mosh connection, is a serious flaw that seriously impedes nodes from moving around the network.

That would require a security system to prevent hijacking sessions.
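For a sense of what such a security system might look like: identify the session by a connection ID plus a shared key instead of by the source address, and accept a packet from a new address only if its authentication tag verifies. The sketch below uses HMAC-SHA256 from Python's standard library as a stand-in. Mosh itself uses authenticated encryption rather than a bare MAC, and the packet layout, names, and key handling here are all invented for illustration.

```python
import hashlib
import hmac
import os
import struct

TAG_LEN = 32  # bytes in an HMAC-SHA256 tag


def seal(key: bytes, conn_id: int, counter: int, payload: bytes) -> bytes:
    """Build a packet: 8-byte connection ID, 8-byte counter, payload, tag."""
    header = struct.pack("!QQ", conn_id, counter)
    tag = hmac.new(key, header + payload, hashlib.sha256).digest()
    return header + payload + tag


class Session:
    """Server-side state keyed by connection ID, not by the peer's address."""

    def __init__(self, key: bytes, conn_id: int):
        self.key, self.conn_id = key, conn_id
        self.last_counter = -1
        self.peer_addr = None  # updated whenever a valid packet arrives

    def receive(self, packet: bytes, from_addr):
        """Return the payload if the packet is authentic and fresh, else None."""
        if len(packet) < 16 + TAG_LEN:
            return None
        body, tag = packet[:-TAG_LEN], packet[-TAG_LEN:]
        expected = hmac.new(self.key, body, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            return None  # forged or corrupted: a hijacker without the key ends up here
        conn_id, counter = struct.unpack("!QQ", body[:16])
        if conn_id != self.conn_id or counter <= self.last_counter:
            return None  # wrong session or a replayed packet
        self.last_counter = counter
        self.peer_addr = from_addr  # the client may have roamed; follow it
        return body[16:]


key = os.urandom(32)
session = Session(key, conn_id=42)
session.receive(seal(key, 42, 0, b"hello"), ("192.0.2.10", 50000))
session.receive(seal(key, 42, 1, b"still me"), ("198.51.100.7", 61000))  # new network
print(session.peer_addr)
```

The counter is what stops an attacker from replaying an old authentic packet from their own address to redirect the session. Establishing the shared key in the first place is the hard part and is outside this sketch.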

[1] https://archive.org/stream/rfc854/rfc854.txt_djvu.txt

[2] https://en.wikipedia.org/wiki/List_of_IP_protocol_numbers


musicale 210 days ago

> The fact that you can't resume a TCP connection from a different IP address, the way you can with a Mosh connection, is a serious flaw that seriously impedes nodes from moving around the network

This 100% !! And basically the reason mosh had to be created in the first place (and it probably wasn't easy.) Unfortunately mosh only solves the problem for ssh. Exposing fixed IP addresses to the application layer probably doesn't help either.

So annoying that TCP tends to break whenever you switch wi-fi networks or switch from wi-fi to cellular. (On iPhones at least you have MPTCP, but that requires server-side support.)
