by danudey 3974 days ago
Setting up an IPSec VPN from a Linux server to Amazon VPC and running data over it. There was a host of documentation on how to do similar things with the appropriate tools, but as always, it was document A with 40% of the puzzle, document B with a non-overlapping 30% of the puzzle, and document C with an overlapping 40% of the puzzle… at which point I realized that all three documents were using different approaches/conventions/etc.

The documentation for the available tools seemed to assume either that a) you understood IPSec well enough and only needed to know how to use this one tool, or b) you already knew everything you needed to know, minus a few hints on the syntax of individual files.

Eventually I got everything working, but performance was abysmal. Sometimes. Sometimes SSH sessions opened instantly. Sometimes they opened slowly but then worked fine afterwards. Some tools were awful and others worked okay.

Eventually I realized that the IPSec configuration set up two tunnels to Amazon, but only set up actual routing (defining endpoints) for one of them. Thus Amazon was load-balancing packets over both tunnels and my Linux implementation was dropping 50% of packets. For established TCP connections this was fine because we had basically zero latency to VPC so retransmits (for what we were doing) were almost free since they would be discovered when the next packet arrived successfully, but for SYN/ACK packets a drop would result in an annoying wait.
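To put rough numbers on that "annoying wait", here's a minimal sketch of the handshake delay under a simplified model. The assumptions are mine, not measured: each handshake attempt is lost independently with probability 0.5, the initial retransmission timeout is 1 second (the RFC 6298 default), and the timeout doubles after every loss. The function name and parameters are made up for illustration.

```python
def handshake_delay_distribution(p_drop=0.5, initial_rto=1.0, max_retries=5):
    """Return (delay_seconds, probability) pairs for a TCP handshake.

    Entry k is the case where the first k attempts are dropped and
    attempt k+1 gets through. Assumes independent drops and a
    retransmission timeout that doubles after each loss.
    """
    outcomes = []
    for k in range(max_retries + 1):
        # Time spent waiting on k timeouts: rto * (1 + 2 + ... + 2^(k-1))
        delay = initial_rto * (2 ** k - 1)
        # k consecutive drops followed by one success
        prob = (p_drop ** k) * (1 - p_drop)
        outcomes.append((delay, prob))
    return outcomes


def expected_delay(outcomes):
    """Average extra delay over the handshakes that eventually succeed."""
    total_prob = sum(prob for _, prob in outcomes)
    return sum(delay * prob for delay, prob in outcomes) / total_prob


if __name__ == "__main__":
    for delay, prob in handshake_delay_distribution():
        print(f"{prob:7.2%} of connections wait {delay:4.0f}s")
```

Under those assumptions, half of new connections open instantly, a quarter wait about a second, an eighth wait about three seconds, and so on, which roughly matches the "sometimes instant, sometimes slow" behaviour. Real stacks differ in timer values and retry counts, so treat the numbers as illustrative only.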

Unfortunately, the tools don't allow you to define redundant/overlapping routes, so I couldn't set up two tunnels. I had to configure one tunnel and leave the other one down so AWS wouldn't try to send data over it, and then hope that endpoint didn't go down at an inopportune time before I'd either set up some kind of load-balancing scheme on my internal network (internal BGP maybe? ugh!) or given up on the project entirely.

After weeks of working on this specific task (the VPN setup), making literally zero progress some days, googling for literal hours with no useful results, and trying various permutations, I finally got it working. I felt like I was the only person on the planet who'd ever done this, since I was pretty sure that no one on the internet had ever written about it, at least.

Even though the project was ultimately scrapped, I still feel like I learned a lot, and maybe I should feel like it was wasted time, but it also felt like quite an achievement to succeed.

1 comment

This is funny because I am basically about to try to diagnose a _very_ similar issue with a VPN tunnel between a Cisco ASA and AWS. I'm also seeing SYN/ACK packets occasionally being dropped and TCP connections ending up in a WAIT state.