We did something similar in our Tarantella product many years ago. However, we quickly discovered that many companies operate network infrastructure that verifies protocols. For example, they would check that whatever happened on port 443 was valid SSL and nothing else.
In the end we modified our clients to include a decoy cipher suite in the SSL negotiation. That kept the network happy, and was enough for our multiplexer to then internally route to the correct backend.
Yup, that's also why I added the ability to tunnel anything over a transport (the only implemented one being TLS).
You can get the SSH client to connect over this either by using an openssl s_client trick, or by just using my little tunnel tool (https://github.com/joushou/tunnel).
Uh, wow, the popularity just skyrocketed. It's 00:17 and I just came home from a concert, so bear with me maybe being a bit incoherent.
First of all, I'd like to thank my wife, who helped motivate me, my parents, who you know, did the thing that brought me here, my work, for paying me to browse news and occasionally code...
So, why Go? Because it was easiest that way. Try to take a look at how simple the ProxyConn thing is, which is the core of the entire thing. Go is a tool that I found suitable for the task. It also brings things like easy cross-compilation after the code has been written (Seriously, GOOS=windows go build on my mac and I get a PE32+, GOOS=darwin go build on my linux desktop and I get a Mach-O binary). You also get to interact easily with the wonderful Go infrastructure, in case you don't want to redirect to external services.
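To give a feel for why the proxying core can stay small in Go, here is a minimal sketch of a bidirectional copy between two connections. It is not the actual ProxyConn code from serve2; the names proxy and echoThroughProxy are made up for illustration, and the in-memory net.Pipe connections stand in for a real client and backend.

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// proxy copies bytes in both directions between a and b. When either
// direction finishes, both connections are closed so the other
// direction unblocks as well.
func proxy(a, b net.Conn) {
	done := make(chan struct{}, 2)
	pipe := func(dst, src net.Conn) {
		io.Copy(dst, src) // returns on EOF or on the first error
		done <- struct{}{}
	}
	go pipe(a, b)
	go pipe(b, a)
	<-done
	a.Close()
	b.Close()
	<-done
}

// echoThroughProxy is a hypothetical demo: it sends msg through the
// proxy to an in-memory echo backend and returns the reply.
func echoThroughProxy(msg string) (string, error) {
	client, front := net.Pipe()
	back, backend := net.Pipe()
	go proxy(front, back)

	// Stand-in backend that echoes back whatever it receives.
	go io.Copy(backend, backend)

	defer client.Close()
	if _, err := client.Write([]byte(msg)); err != nil {
		return "", err
	}
	reply := make([]byte, len(msg))
	if _, err := io.ReadFull(client, reply); err != nil {
		return "", err
	}
	return string(reply), nil
}

func main() {
	reply, err := echoThroughProxy("ping")
	fmt.Println(reply, err)
}
```

In a real multiplexer, one side would be the accepted client connection and the other a freshly dialed backend. The copy loop itself would not need to know which protocol is flowing through it.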
Why make something that already exists? Well, I didn't know that there were so many solutions, I just thought "Hmm, I wonder if I can make this work in a nice fashion...". That's not going to stop me from continuing development of my own, however. I have ideas I want to try, features I want to implement. It's powering everything on my own servers, and the stealthiness of SSH over TLS (NOT just SSH on port 443) has been (ab)used multiple times already.
Is my solution better than the others? Well, I of course like my own solution, but with few exceptions, I think I bring some interesting flexibility to the game. A lot of solutions seem to be fixed for a certain purpose, such as SSH and HTTPS, usually just in their regular variants. I don't really care what you want to serve, as long as you can write down some unique bytes in the config for serve2d, or write a more complicated handler directly for serve2. Add transports if you want. Want SSH over TLS over WebSocket over IP over avian carriers? Add the transports.
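The "unique bytes" idea can be sketched roughly as follows: look at the first bytes a client sends and pick a backend by prefix. This is only an illustration under assumptions. The rule type, the rules table and the detect function are hypothetical names, and this is not serve2d's config format or matching code.

```go
package main

import (
	"bytes"
	"fmt"
)

// rule pairs a protocol name with the leading bytes that identify it.
// Both the names and the patterns are illustrative.
type rule struct {
	name   string
	prefix []byte
}

var rules = []rule{
	{"ssh", []byte("SSH-")},     // SSH identification string, e.g. "SSH-2.0-..."
	{"tls", []byte{0x16, 0x03}}, // TLS handshake record, major version 3
	{"http", []byte("GET ")},    // one HTTP method; others would need their own rules
}

// detect returns the name of the first rule whose prefix matches the
// start of header, or "unknown" if no rule matches.
func detect(header []byte) string {
	for _, r := range rules {
		if bytes.HasPrefix(header, r.prefix) {
			return r.name
		}
	}
	return "unknown"
}

func main() {
	fmt.Println(detect([]byte("SSH-2.0-OpenSSH_6.7\r\n")))
	fmt.Println(detect([]byte{0x16, 0x03, 0x01, 0x00, 0xa5}))
	fmt.Println(detect([]byte("GET / HTTP/1.1\r\n")))
	fmt.Println(detect([]byte{0x00, 0x01}))
}
```

Prefix matching like this only works for protocols where the client speaks first, which is why server-initiated protocols are awkward to multiplex this way.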
I'm going to sleep now, and I hope the world hasn't self-ignited before I wake up. It's really exciting and motivating to see how people react to private projects.
There are a number of work and wifi proxies which won't allow out a lot of traffic, based on port. A multiplexer like this can work around such (assumed) well-intentioned but poorly implemented protections.
Wouldn't a VPN or tunnelling solution be better? You pass all your traffic through a fixed port on one host, then unwrap it and use the web as though the restriction wasn't there.
Nothing stops you from passing a VPN through a TCP connection on port 80 or 443. There are few protocols people haven't tunnelled IP over (DNS included...)
I often have to work behind "smoothwall". AFAIK, it only allows HTTP over port 80 and HTTPS on port 443 - any other protocol on any other port gets blocked, including other protocols over port 80/443.
To bypass this, I wrote a simple ruby script to tunnel TCP connections, while adding fake HTTP headers to get through the firewall.
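The Ruby script isn't shown, but the general trick of dressing up a TCP payload with fake HTTP headers might look something like this sketch. Everything here is an assumption: the function names addFakeHeader and stripFakeHeader, the choice of a POST request, and the header fields. Whether a given firewall accepts such traffic depends entirely on how deeply it inspects it.

```go
package main

import (
	"bytes"
	"fmt"
	"strconv"
)

// addFakeHeader prepends a made-up HTTP request header to payload so
// the first bytes on the wire look like an ordinary HTTP POST.
func addFakeHeader(host string, payload []byte) []byte {
	h := "POST / HTTP/1.1\r\n" +
		"Host: " + host + "\r\n" +
		"Content-Length: " + strconv.Itoa(len(payload)) + "\r\n" +
		"\r\n"
	return append([]byte(h), payload...)
}

// stripFakeHeader removes everything up to and including the first
// blank line and returns the remaining payload. The boolean is false
// if no header terminator was found.
func stripFakeHeader(data []byte) ([]byte, bool) {
	i := bytes.Index(data, []byte("\r\n\r\n"))
	if i < 0 {
		return nil, false
	}
	return data[i+4:], true
}

func main() {
	wire := addFakeHeader("example.com", []byte("SSH-2.0-demo\r\n"))
	payload, ok := stripFakeHeader(wire)
	fmt.Printf("%q %v\n", payload, ok)
}
```

A real tunnel would apply this to a stream on both ends rather than to a single buffer, and would need a matching fake response header for traffic in the other direction.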
Generally these well-intentioned networks block the usual VPN and tunneling solutions (think kid and a school network with a firewall that only allows FTP, HTTP, and HTTPS).
At which point you simply use these ports for your VPN.
OpenVPN can use TCP, it can use UDP, it can use whatever port you like, it can even use a fixed key to turn all traffic into random noise (obfuscating protocol structures).
SSH with SOCKS5 tunnelling enabled does exactly this. The trick is that VPNs aren't allowed very often, so that's where the TLS tunnel trick steps in. You can use a VPN over TLS with serve2d if you want. I just find ssh -D5000 easier to set up quickly "in the field" than a VPN, especially seeing that it doesn't require server configuration.
It's not a bad idea, though I'm not sure why someone would want to do this on their own in a bolt-on manner. For an example of a place where demux is encouraged, see RFC 6335:
Conservation of the port number space is required because this space is a limited resource, so applications are expected to participate in the traffic demultiplexing process where feasible. The port numbers are expected to encode as little information as possible that will still enable an application to perform further demultiplexing by itself.
[...]
IANA strives to assign only one assigned port number per service or application.
Why do I know this? Consul by default uses 5-6 ports, and one of them conflicts with an IETF registered port in use by our data center provider. Was hoping that the RFCs gave some ammo for a resolution in my favor, came out more humble and wondering if Consul really needed that many ports.
Modern apps seem to do their best to require lots of port numbers, often with very little justification. It's incredibly annoying.
Some of it is understandable, e.g. when one port is for stuff that is always meant to stay behind your firewall, and another for public access. In that case it actually makes things easier to do properly. But so often it appears to be entirely gratuitous.
> Why? The entire point of ports is so you don't have to do this.
There are use cases - many nefarious cases but some real world cases.
Let's say your university allows everything through 80 but blocks everything else. You could SSH home or stream content.
Another use case could be that you have a web site and game server on same server. An interesting implementation could be to listen on port 80 so that they could connect to yourserver.com via HTTP and view information about the game/server and connect on the same port. Interesting but not very practical.
I suppose an acceptable/respectable use case would be custom load balancers or honey pots.
Well, back in high school I used a similar idea to run SSH to my home PC via port 80 -- the network restrictions meant a VPN wouldn't work, and it was the only way to get access to run compilers I wanted to play with without relying on exploits to get around the system's restrictions.
So here's what hopefully won't be considered a trolling question: I have seen a lot of "100% Go" projects over the past several years and that's usually presented as a big feature. Some pretty trivial things have been redone as brand new in Go, and then suddenly gain lots of attention. What is so magical about a project written in Go vs C, Python, Ruby, Rust, JS, etc.? As a user of the software I won't care what it's written in, if it's done well. If it's done poorly, I am much more likely to look for better alternatives than to fix it (if only I had about 240 hours in a day...), so what's the advantage?
To me, an advantage in usability is having a PPA with properly built .deb packages. If I have to use a language-specific package manager that I don't already use regularly, you've likely lost me, unless I really need this functionality. If it doesn't come with a proper daemon mode (correct forking, PID file support, proper file or syslog logging), sample config file, man page, or an init file, that's even worse. I am much less likely to use this in any type of "production" environment if I have to maintain those pieces myself. Running things in a screen session is so "I'm running a Minecraft server".
That is not to criticize your work. You've done a great job! serve2d looks very interesting and I might actually have to give it a try sometime.
I think the "100% Go" stuff is appealing in that you just have one file that works across OSes, with minimal screwing around.
For things that become part of the OS, yes, I'd rather they come via some install approach that includes the necessary integration. But for anything else, I think a lot of our packaging approaches are dedicated to saving disk space and RAM, which is something that matters way less to me now than it did 15-20 years ago when CPAN and APT were designed. In 2000, disk prices were circa $10/GB [1]; now we're looking at $0.50/GB of zippy SSD [2] or $0.03/GB of spinning rust [3]. RAM is similarly about 2 orders of magnitude cheaper. [4] Given that, it makes a lot more sense to burn space to minimize the chance of a library version conflict or other packaging issue.
Another thing that has changed greatly is the pace of updates. 15-20 years ago, weekly releases sounded impossible to most. Now it's common, and some places are releasing hourly or faster. [5] Thanks to things like GitHub, the whole notion of a release is getting hazy: I see plenty of things where you just install from the latest; every merge to master is in effect a new release.
Given that, I think both Go and Docker are pioneering approaches that are much more in sync with the current computing environment. I'm excited to see where they get to.
The number 1 selling point of something written in Go is that it's much easier to package. The result of a compilation is a standalone binary that can be copy-pasted everywhere, as long as the architecture matches what was input at compilation time. This means:
- no more having to deal with dependencies at packaging time, which makes packagers' job simpler because all they have to care about is the one and only standard way to retrieve dependencies and build the binary. (Much like the standard way of doing things in C would be ./configure && make && make install, with the added bonus point that the dependencies are also taken into account). This also means that there's a higher chance that the software will be packaged in the distribution of your choice, because the bar is lower
- no more having to deal with dependencies at runtime, because each binary has everything it needs inside of itself. In practice this means "scp as a deploying method". It's an even lower common denominator than packages.
> If it doesn't come with a proper daemon mode (correct forking, PID file support, proper file or syslog logging), sample config file, man page, or an init file, that's even worse.
This is orthogonal to the choice of programming language, though. On top of that, I believe the application shouldn't deal with forking, it's the job of your supervision system to deal with daemons. All an application has to do is log whatever happens on STDERR and let the system handle that.
How exactly do you not have to worry about dependencies? Does the Go standard library include every routine you could possibly need and is always 100% correct and bug free? If you build your static binary today and tomorrow there is a vulnerability in your libssl dependency of choice, don't you now have to recompile and redistribute a new binary? Seems like a terrible and insecure way to do things. Instead of a distro developer worrying about security updates, you have signed up to do that yourself.
As for logging, there are loads of logging libraries that support both stdout and file logging. My policy is to support both for my project and it has been almost no burden so far (in Python and in C). Not everything is containers, and having a feature like logging does not mean it cannot be used in a container.
> If you build your static binary today and tomorrow there is a vulnerability in your libssl dependency of choice, don't you now have to recompile and redistribute a new binary
Technically there's only one ssl library you should use, it's the standard one. This doesn't change your overall point that when a part of the program must be upgraded, the whole binary must be upgraded as well and re-deployed, which I totally agree with. If your software is a server that you host yourself and you have full control over the deployment chain, as is the mindset behind Go, then re-deploying a dependency or re-deploying a binary is more or less the same.
Regarding logging, I'm really partial to the approach advertised by 12 factors (http://12factor.net/logs): let your software handle the business, let the supervisor handle the software's lifecycle, and handle the logfiles outside of the software, because there are factors specific to the hosting machine that in my opinion shouldn't be the concern of the software.
> This also means that there's a higher chance that the software will be packaged in the distribution of your choice, because the bar is lower
Static linking and bundling of dependencies is a no-no in most distributions. If anything, the Go model is a headache for package maintainers to deal with.
Have you ever actually tried producing a statically linked C/C++ binary? I've been programming in C/C++ for 10+ years. Static linking is a huge pain. My latest efforts have led me to create holy build boxes inside carefully controlled Docker-based environments just to be able to produce binaries that work on every Linux. With Go you can just run a single command to cross-compile binaries that work everywhere. Minimal setup required, no expert knowledge required.
That's a very weird way to look at it. Static linking was there before everything else. gcc/ld and, well, any other C/C++ toolchain can do that as well. There is a reason this isn't usually done. It's like you are trying to spin a bad thing into something good.
The reason this isn't usually done is that executable size was significant relative to storage capacity up until the early 2000s or so, and people tried to economize by deduping common parts of their executables via shared libraries / DLLs. This worked well enough to catch on, but came with an extremely high cost in added complexity, and over the years a whole layer of additional infrastructure was created in order to manage it. The industry progressed, storage capacity grew dramatically, and executable sizes stopped mattering, but the use of shared libraries / DLLs continued out of inertia. As time passed, people started asking - why are we doing all this? And some of them invented a reason, which was the idea that one could swap out pieces of existing executables after installation, and thereby fix security problems in an application without needing to involve the application's developer in the process. This works about as well as you'd expect if you had spent years trying to fit all the rough edges of various third-party libraries together with varying degrees of success, but the idea caught on as a popular post-hoc justification for the huge layer of complexity we're all continuing to maintain long after its original justification became obsolete.
As is no doubt obvious from my tone, I'm not buying it and am very happy to see signs of a pendulum-swing back toward static linking and monolithic executables.
Just because you don't like the single binary that works everywhere, doesn't mean that others find it a problem. One approach doesn't fit every possible situation.
I've done it for Windows, Linux and Mac before. Note that these solutions freeze the Python side of things but do not freeze the platform side. For example, they do not include the system libraries. Consequently, running the frozen Python app on a system that is a different distro, an older or newer OS version, or has different system packages installed often leads to the frozen Python app not being able to start.
A staid, solid, conservative outlook, a good perspective for others to realize is out there. I would say that concerns like daemon mode and logging are a lot less in vogue these days: a program ought to concern itself with running and outputting to stdout, and if you have needs past these it's expected you have tooling you can deploy that makes that happen.
Daemonization is at least a fairly standard feature, but with logging there are so many people with such varied concerns that getting fancy, trying to meet people's many needs, can lead to a lot of program bloat very quickly. Instead of going at these on a case-by-case basis, and now that we are more container-centric, it makes sense to run in the foreground, put your output on stdout, and let the rest of the system support that utterly uncomplex pattern.
I think we're learning pretty quickly that 100% Go (or Rust, or Python, or Perl, or OCaml, or ...) is a good idea for security. Especially if you're dispatching between ssh and ssl services.
Go has an advantage over scripting in speed, over C/C++ in memory management, over strict FP languages in popularity, and over Rust in being stable and known for longer.
serve2d looks promising, can see some good uses for it.
If anyone's interested, Corkscrew is an alternative solution, allows you to disguise SSH as HTTP/HTTPS traffic, useful for getting through restrictive firewalls to administer remote machines:
I guess the best way to find alternative solutions is to write one and have people tell you there are others like it! I genuinely didn't know there were so many doing equivalent things.
I do think serve2d is considerably more flexible than Corkscrew, though, being able to tunnel anything over TLS if needed. Or other transports, if more are added. I also let you use the same ports for other things.
MySQL seems to use a server initiated protocol (which I always find to be a terrible idea, as it means that an evil client has to do very little to trigger a larger amount of traffic in return, potentially to a spoofed address).
Postgres (which I would recommend over MySQL any day) seems to have a saner client-initiated protocol. I only read part of the spec, but I'll try to see what pattern would need matching later today.