
by jboggan 3431 days ago

This was about 5 years ago.

It was a mess from the get-go since I was applying for a data scientist position. At the time I didn't have the slightest idea what working as a data scientist entailed, nor that despite 7 years unsuccessfully pursuing a PhD in bioinformatics I wasn't sufficiently statistically oriented to excel as one. I just knew that I was good at math and had taught myself enough Perl and Ruby to be dangerous. I should have been applying for software/data engineer roles but I didn't know what that work was like at the time. The only engineer I knew had lived across the country from me and seemed more like a god than a regular human.

So after doing not-terribly in Kaggle's machine learning competition (which I did through some unholy Perl scripting and this bizarre network propagation model which had nothing to do with any normal machine learning techniques) I viewed Kaggle as a dream job. I think I did some phone interviews, I let them know that I was already local (crashing on couches and staying in transient hotels), and they scheduled me for some in-person interviews.

Waking up with flea bites from the dogs staying in my transient hotel next door to The Armory, I walked outside to find my motorcycle placed upright on its stand but somewhat demolished and a large pile of its broken parts piled politely on the seat. No note. I had about 20 minutes to make it to my interview. That would have been enough time, lane-splitting on a functional motorcycle, but I hadn't left myself enough to take public transportation instead of my busted ride. I brushed the jetsam off and tried to ride to the interview.

I had no left or right footpegs; my clutch lever was bent; my front brake lever had snapped off and was barely two inches long; my handlebars had been wrenched into a 135° bend forward and upward on the right-hand side. As I rode this shambling contraption to my interview with my left foot anchored on top of the shift lever and my right foot reached back on the passenger peg I must have resembled a Street Fighter character diving forward in some sort of headlong right hook. I couldn't shift out of 1st and I couldn't really brake.

I arrived at the interview miraculously on-time but absolutely drenched in sweat from the harrowing journey through hostile traffic. So we'll start with that good first impression. My interviewers seemed to all have PhDs and significant post-doc experience and more than half of my interviews consisted of questions about why I dropped out of my PhD program. I mean, repeated hammering about it. I didn't really want to talk about it in great detail because I was emotional about it, mainly because my research supervisor died unexpectedly at the age of 42 and the whole situation still makes me sad to this day. So instead of actually asking questions about the company, or the role, or my skills, about half the time was spent being grilled about this sad story of my graduate career. So I was in a wonderful mood at that point and was happy to finally move on to the technical assessment because I was worried about tearing up in front of the interviewer.

THE TECHNICAL ASSESSMENT: a bunch of softball questions that I don't remember followed by a standard algorithms question. Come to think of it I don't think I got any statistics/ML questions because we got stuck on the algorithms question. The problem was that I never took CS courses, nor had I studied the CLRS book at the time, so I didn't know the "expected" way of doing some problems. I tend to get too creative.

The problem: given a list of n words (strings of length k), return all sets of anagrams. The somewhat clever O(n k ln k) solution is to sort the letters of each string and then look at all the sorted strings that have multiple words mapping to them. The cleverer solution is to build a dictionary for each word, using the letters as keys and the number of times each appears as the value, and then group the words whose dictionaries match. This is linear in the word length to construct, though it takes a memory penalty that is usually nugatory.
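For readers who want to see the two standard approaches side by side, here is a minimal Python sketch. The function names are made up for illustration, and using a frozenset of (letter, count) pairs as the dictionary key is just one convenient way to make the per-word letter counts hashable; it is not necessarily what the interviewers had in mind.

```python
from collections import Counter, defaultdict


def group_by_sorted_letters(words):
    """Group words whose sorted letters match (O(k log k) per word)."""
    groups = defaultdict(list)
    for word in words:
        # Anagrams share the same sorted-letter string.
        groups["".join(sorted(word))].append(word)
    # Keep only the keys that more than one word maps to.
    return [group for group in groups.values() if len(group) > 1]


def group_by_letter_counts(words):
    """Group words by letter counts (counting is linear in word length)."""
    groups = defaultdict(list)
    for word in words:
        # Counter is not hashable, so freeze its (letter, count) pairs.
        signature = frozenset(Counter(word).items())
        groups[signature].append(word)
    return [group for group in groups.values() if len(group) > 1]
```

Both functions return the same groups for the same input; they differ only in how the per-word key is computed.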

But oh no, I didn't think of either of these bog-standard solutions off the bat. Not having ever practiced this problem I immediately got creative, having too much fun with it. My initial thought was to create a dictionary where the keys were integers and the values were arrays of words. The integer keys were produced by reading the characters of the word in sequence, mapping the alphabet to the first 26 prime numbers, and taking the product of all the primes represented by the characters in the word. By the fundamental theorem of arithmetic (a.k.a. unique factorization into primes), any anagram would create the same integer key, no non-anagram could, and all words mapping to that key would get pushed into the value array. Voila, and a very compact data structure to boot!
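The prime-product idea can be sketched in a few lines of Python. This is an illustration rather than the code written at the whiteboard: the names `PRIMES`, `prime_key`, and `group_by_prime_product` are hypothetical, and the sketch assumes words contain only the lowercase letters a through z. Python integers are arbitrary precision, so the product cannot overflow here, though the keys grow with word length.

```python
from collections import defaultdict

# The first 26 primes, one per lowercase letter a..z.
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
          43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]


def prime_key(word):
    """Multiply the primes assigned to each letter (assumes a-z only)."""
    key = 1
    for ch in word:
        key *= PRIMES[ord(ch) - ord("a")]
    return key


def group_by_prime_product(words):
    """Group words sharing a prime-product key, i.e. anagrams of each other."""
    groups = defaultdict(list)
    for word in words:
        groups[prime_key(word)].append(word)
    # Keep only the keys that more than one word maps to.
    return [group for group in groups.values() if len(group) > 1]
```

For example, `prime_key("cab")` is 5 * 2 * 3 = 30, and so is `prime_key("abc")`, because multiplication does not care about letter order.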

This was really, really bad. Basing the correctness of your interview solution on Euclid's Elements is never going to get you any traction with software engineers. As we all know, engineers are allowed to interview candidates to make themselves feel smarter and more fortunate for already having a job. Getting into an argument about prime number factorization with a candidate isn't going to support either of those goals.

So I was gloating a bit, feeling good for finding such a nice clean solution after that bit of emotional wrenching, figuring I had explained it well with a little math flair and eager to move on to the next problem.

"Um, are you sure that works? Will the numbers always work out like that?" 'Yes, that's why I chose the first 26 prime numbers, it wouldn't work with composite numbers.' ". . . Could you do it a different way? Could you do it another way that isn't this solution?" 'Could I do it slower than this? Sure, but why?'

It seemed like they had a list of possible answers on their interview script and I had gone way off of it. Now, the funny thing was I couldn't think of the sorting solution or the comparing dictionary solution, and I had a correct and well-explained answer that was superior in runtime and memory requirements (and before you object with arbitrary length words and integer overflow I'm just going to stop you with replacing the product of the primes with the sum of their logs and . . . I can explain but this margin is too small to contain). I could only think of really dumb things, like calculating every possible permutation of every single word and comparing them exactly. In fact I was kind of taking a perverse delight in trying to figure out the most inefficient way I could answer their question, seeing how I was very certain there was no better answer than what I had already come up with first. They kept asking me, "is there another way you could do this? Faster than comparing all permutations?" and I kept returning to my prime number solution and they kept saying "but is there ANOTHER WAY you could do this?" Round and round this went and I never got the solution they were looking for, whatever it was.

I don't even remember the rest of the interview process as I had the distinct feeling that it ended prematurely. I didn't catch that at the time since this was to be the first of my in-person interviews in the tech world, but looking back I'm sure they hadn't originally planned on hustling me out the door after a half-hearted tour of the loft workspace.

There are of course two sides to every story, and I'm sure there's someone from Kaggle who tells a story about a leather-clad sweaty-toothed madman who kept prattling on about his dead professor trying to score sympathy points and who gesticulated wildly for an hour while screaming FUNDAMENTAL THEOREM OF ARITHMETIC in a poorly disguised Southern accent. But, regardless of their perspective I can definitely assert that was the worst interview I'd ever had. I went 'home' to buy some calamine lotion, call my insurance company, and tell my mom that the interview at the 'dream job' didn't quite pan out.


mailshanx 3431 days ago

Thank you, this is the best HN comment i have read in a year (or possibly more)!

Your solution to the anagram problem is very ingenious, indeed. I have seen this question a bunch of times and know all the "standard" approaches - but your solution, which i've never heard of before, is correct, clever and far superior. Kudos to you for coming up with it under such stressful conditions!

Your description of the interview process is very sad. The interviewers' attitudes and response induced an icky feeling in me.

I think that the attitude you encountered at Kaggle is an artifact of rather poor quality, insecure and insufficiently experienced / accomplished engineers. Whenever i interact with really accomplished and senior technical people, the conversations tend to be of stellar quality - they like to discuss actual past work in depth, are able to comprehend new (and perhaps unusual to them) concepts, and are generally a lot more respectful and pleasant. On the other hand, everything you described reeks of junior, inexperienced and rather insecure technical talent.

This reminds me of my own very-first tech interview. I had spent 5 years in a research lab, working on autonomous vehicles, writing software for things like signal processing algorithms, error correction codes, ML / reinforcement learning algorithms optimised to run on power-constrained devices and the like.

I went to interview with a VC-funded startup. They were looking for "experts in signal processing and machine learning experts". I figured my years of signal processing / ML experience might be a good fit. After the initial pleasantries, i get asked "How will you design a server that can detect anagrams of any word in the english language?"

For the life of me, i couldn't come up with a reasonable solution. I went home and hung my head in shame.

jboggan 3430 days ago

Thanks for the compliment, I didn't realize it would turn out so long. I think I ought to start writing more often.

I joke about the interview process existing to inflate the ego of the interviewer, but it isn't that far off. Everyone wants to be Google so they have to "act like Google". Meanwhile Google stopped doing that five years ago, but they're not going to tell you that.

I don't know what happened at that Kaggle interview and I'm not going to single out the people or the culture there because it really could have happened anywhere. Maybe I could ascribe some of it to having so many people coming from academia, since we (the PhD and post-doc dropouts) tend to have such enormous chips on our shoulders. But it's just our interview culture in general, so much unconscious bias.

I really would love for interviewers to look at your resume, say "well you've been gainfully employed doing this for 5 years, let's delve into your soft skills and see if you are actually a communication / working fit for this particular team" instead of just trying to see if you meet some arbitrary algorithmic bar and hope that correlates with job success.