by malux85 906 days ago
Reverse engineering a file format or protocol is almost a rite of passage for programmers, it is incredibly fun and rewarding, something I'd recommend every mid-level or senior programmer try at least once.

A few years ago I was using LiDAR scanners from a manufacturer that didn't provide a Linux driver, only a Windows one. The way it worked was that you programmed the firmware to fire UDP packets at a specified IP and port, and when the device powered up it would push a continuous stream of data to you: 300,000 points a second.

So I started capturing these UDP packets and decoding them with Python. Eventually I had to write a plugin in C to do the high-performance parsing and bit packing, but nothing beats that feeling when you're stumped on what a bit of data means and then a eureka moment hits you in the shower, and the project advances!
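For anyone curious what that first Python stage tends to look like, here is a minimal sketch. The packet layout (a 4-byte little-endian timestamp followed by 3-byte distance/intensity records), the port 9000, and the names `decode_packet` and `listen` are all made up for illustration; the real device's format is not described above.

```python
import socket
import struct

# Hypothetical layout, NOT the real device format:
# 4-byte little-endian timestamp, then repeated
# (uint16 distance, uint8 intensity) records.
HEADER = struct.Struct("<I")
POINT = struct.Struct("<HB")


def decode_packet(payload: bytes):
    """Split one UDP payload into a timestamp and a list of point tuples."""
    (timestamp,) = HEADER.unpack_from(payload, 0)
    body = payload[HEADER.size:]
    # Ignore trailing bytes that do not form a whole record.
    usable = len(body) - len(body) % POINT.size
    points = list(POINT.iter_unpack(body[:usable]))
    return timestamp, points


def listen(host: str = "0.0.0.0", port: int = 9000, max_packets: int = 10):
    """Yield decoded packets from the port the device was told to target."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        for _ in range(max_packets):
            payload, _addr = sock.recvfrom(65535)
            yield decode_packet(payload)
```

Decoding offline against saved captures first, and only then binding a live socket, keeps the guessing loop fast: change the struct format, rerun, compare.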

Some of the worst code of my life, written 20 years ago when I was a teenager and now posted openly on my GitHub, came from reverse engineering a custom chat server protocol. I wrote my own client to replace the Java applet. And to have logs.

The catch is... I didn't have any Internet connection. I was going to an internet cafe, logging onto the chat server, and chatting, while recording the connection with Wireshark.

At home, I'd print the hex + ASCII connection dump on my dot matrix printer, and use a highlighter and ballpoint pen to mark the fields of the message packet.
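The hex + ASCII layout is easy to reproduce if you want to try the same pen-and-paper approach on a capture. This is only a sketch of the usual 16-bytes-per-row format, not the exact output of the tool I used back then; `hexdump` is a name I'm choosing here.

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Format bytes as rows of: offset, hex bytes, printable ASCII."""
    pad = width * 3 - 1  # width of a full row of "xx xx xx ..." text
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Non-printable bytes are shown as dots in the ASCII column.
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{pad}}  |{ascii_part}|")
    return "\n".join(lines)


if __name__ == "__main__":
    print(hexdump(b"HELLO\x00\x01 some captured payload bytes"))
```

Fixed-width rows are what make the highlighter trick work: a field that recurs at the same offset in every message lines up in the same column on every page.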

Then I'd code something around it, plan new tests, compile a new version of the app and... take it with me on a hard drive to the internet cafe to test the next day, or the next weekend.

I think I was way smarter and more goal-oriented than I am today.

I did this too! Having no internet at home, I would take a box of floppy disks to the library every day and save articles on C programming, OpenGL programming, and the raw weather satellite images from the NOAA - I was trying to re-create the 3D weather fly-over I saw in Jurassic Park when I was 7. I agree - having that disconnection was ultimately good because it made me think a lot for myself rather than just google a solution. It developed my fundamentals a lot.
>is almost a rite of passage for programmers, it is incredibly fun and rewarding,

Only when you are doing it for yourself or when it's a known undertaking. It can be very frustrating when you are integrating with some hardware, you are 99% complete, you've told everyone you are ready to ship, and the last 1% turns out to be a surprise protocol reverse engineering project.

Well tbh, failing to deliver after overcommitting is also a good experience.
Lesson is to not say it's ready before it's ready.
That’s what I was gonna say. Reverse engineering was the only way in the ’90s, as documentation was scarce. I had to reverse anything if I wanted to understand how it worked.

Here is an extractor I wrote for Westwood PAK and LucasArts LFD files when I was 16: https://gist.github.com/ssg/e3e9654612be916336c01e104b10ddc7

I picked apart some of the Dark Forces files myself as a kid. The GOB file was pretty obvious - I was familiar with Doom WADs, it's basically the same. I figured out the graphics formats by setting up VGA 320x200 mode with the contents of a .PAL file, and dumping the graphics file into screen memory. Then I looked at the noise around the familiar patterns to figure out all the run/skip length stuff that wasn't just a pixel value.
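A rough modern equivalent of the dump-it-into-screen-memory trick, for anyone who wants to try it without a VGA card: map each byte through the palette and write an image you can open in a viewer. This is a sketch under assumptions, namely that the palette is 256 RGB triples of 6-bit VGA values (768 bytes); the function names are made up, and nothing here decodes the run/skip data.

```python
WIDTH, HEIGHT = 320, 200  # VGA mode 13h dimensions


def load_palette(raw: bytes):
    """Turn 768 bytes of assumed 6-bit VGA values into 256 8-bit RGB tuples."""
    if len(raw) < 768:
        raise ValueError("expected at least 768 palette bytes")
    return [
        # Scale 0..63 up to 0..255 by replicating the top bits.
        tuple(((v & 0x3F) << 2) | ((v & 0x3F) >> 4) for v in raw[i:i + 3])
        for i in range(0, 768, 3)
    ]


def render_raw(data: bytes, palette) -> bytes:
    """Treat the first 64000 bytes as palette indices, like screen memory."""
    frame = data[:WIDTH * HEIGHT].ljust(WIDTH * HEIGHT, b"\x00")
    rgb = bytearray()
    for index in frame:
        rgb.extend(palette[index])
    return bytes(rgb)


def to_ppm(rgb: bytes) -> bytes:
    """Wrap raw RGB bytes in a binary PPM header so image viewers can open it."""
    return b"P6\n%d %d\n255\n" % (WIDTH, HEIGHT) + rgb
```

Writing `to_ppm(render_raw(graphics_bytes, load_palette(pal_bytes)))` to a file gives the same kind of picture: recognizable shapes where the bytes are plain pixel values, and noise where they are headers or run/skip codes.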
Fantastic! I love how we’ve been through the same stuff despite being perhaps continents apart.
Agreed, reverse engineering a job submission and validation protocol that ran over a Unix socket was some of the most fun I ever had on the clock.

I wound up basically re-implementing the software that was listening to create a test suite by the time it was all said and done.

Even the less challenging stuff can be fun. After seeing that SimCity-themed Chicago zoning map, I got to thinking about whether it would be possible to do the same for the Bay Area. Turns out the cities I've looked at so far publish all sorts of machine-readable information. The fun part is in finding and exfiltrating it.