Hacker News new | ask | show | jobs
by Jemaclus 2840 days ago
Not the person you're replying to, but I'll share one of my favorites:

Given an address such as "123 Main St, Boston, MA 00215", break it up into its component parts of: street number, street name, city, state, zip.

(Note: I usually just ask for pseudocode, but if they want to write real code, they can.)

The reason I like The Address Problem (TM) is that it's a real problem I've solved before. I've built address verification systems for some large US companies that you've almost certainly heard of. Most companies in the consumer space deal with some sort of address input, whether it's StitchFix deliveries or Redfin for real estate. Most of those companies outsource it, but still... you store customer addresses in your database, right? :)

I also like it because the domain knowledge is already known for anyone who has spent any amount of time in the United States (where I reside). It's intuitively obvious to any American or even recent immigrant which part of the address is the street number, which part is the city, and which part is the zip code. You don't have to know anything special coming into it -- you don't need to know what a binary tree is or what a linked list is or any of the bajillion other CS concepts that I learned in college and see in interview questions and have yet to use in actual real-world use cases. All you need to know is how to read the front of an envelope or an entry on Yelp.

Another reason I like it is that there are some really obvious ways to go, and there are some really complex ways to go. Arriving at a solution that _works_ isn't really the point. Any programmer out of boot camp can do that, and there are a billion complexities to addresses that most people don't think about. What's more interesting to me is how you arrive at your solution and is that solution something that you can build upon to handle various edge cases.

For example, a lot of candidates will say "I'd use a regular expression," at which point I'll ask them to write one. They usually struggle a bit with it, but even if they do get it, I throw a curveball: "What if some inputs have no commas?" Then they have to modify their regex to handle comma and comma-less addresses.

Some candidates will start off looking at the beginning of the string and saying something like "I'll look for a substring of Street/St or Avenue/Ave or similar", to which I can throw the curveball of "123 Main Street E, Boston, MA 00215" or a more fun one, "123 St Francis Ave, Boston, MA 00215"

I really like the address question because of the endless curveballs I can throw at it. What happens if the user leaves off the ZIP code? What if they abbreviate the city to "SF" instead of "San Francisco"? What about apartment numbers? How do you handle pre-directions and post-directions? What's the city for "123 Main Street West Palm Beach FL 33409" and how do you know?

What reasonable assumptions can you make about an address? For example, there are 50 states + a handful of territories, so states should _theoretically_ be easy to parse out... What other assumptions can we make about addresses? (Gotta be careful with this one -- people often assume that street names end in "Street" or "Avenue" or their abbreviations and forget about post-directions!)

Since the domain knowledge is so simple even a 5th grader understands, the real challenge comes down to problem solving. There's no "trick" to it. You just work through the problem until you get stuck or we run out of time. Giving hints is fairly straightforward as well.

At the end of this part of the interview, I often allow them to give me some thoughts on how they did. If they could do this whole 30 minutes over again, what would they do differently? Maybe not use regex? :) How do they think the US Postal Service handles addresses? Or Google?

Again, there's no right or wrong answer for these. I'm more interested in whether they exhibit self-awareness, whether they can identify what went wrong earlier, and whether they can learn from their mistakes.

(In case you were wondering, the _real_ answer I'm hoping for is "I'd ask Google Maps." I've had exactly two people give me this response in the 8+ years I've been giving this challenge. I don't know if it's because they think it's a dumb answer, if they know that I expect a code-related answer, or if it never occurs to them... but I like to think someone who wants to see what Google says isn't the type that constantly wants to reinvent the wheel. Note that NOT giving this answer is not a deal-breaker. I generally give the benefit of the doubt to candidates here.)

3 comments

I think my response would depend on how you phrased the question.

If you said “here’s an address, how would you parse it?” it would feel like the purpose of the interview is to use this task to see how I could. Saying Google Maps then seems quite facile.

If you said something like “you’re writing a web app and have been asked to add a new feature that would parse the submitted address and do X with it. Delivery of this feature is urgent. How would you parse the address?” then Google Maps seems like a much more natural answer.

I think the difficulty with such tasks is that it can be hard for the candidate to know what the confines of the test environment are - what kind of response is expected etc

If one person came up with a complex algorithm and another person told me that this must be an issue that many companies face and there must be a service that specializes in this and the first thing they would do is to look for a third party provider. I would hire the second person unless we were the company specializing in address look up.

There is nothing worse than a developer who doesn’t know the difference between which problems are part of the business domain that they are suppose to be writing code for and problems that other people have already solved.

And for this particular use case, I happen to know about CASS certified software. SmartyPants is $1000/month for unlimited lookups. Why would I waste company resources on writing that myself? The yearly cost is less than a month’s fully allocated salary for a senior developer in a major metropolitan city.

Yep, exactly. :)
> (In case you were wondering, the _real_ answer I'm hoping for is "I'd ask Google Maps." I've had exactly two people give me this response in the 8+ years I've been giving this challenge. I don't know if it's because they think it's a dumb answer, if they know that I expect a code-related answer, or if it never occurs to them... but I like to think someone who wants to see what Google says isn't the type that constantly wants to reinvent the wheel. Note that NOT giving this answer is not a deal-breaker. I generally give the benefit of the doubt to candidates here.)

I'm surprised that this rarely seems to be brought up as a solution. It was the first approach I had in mind since you really can't split and format all those potentially mistyped addresses reliably without looking them up in a database/directory of valid addresses.

It feels very obvious, however admittedly in an interview situation I might actually assume that the interviewer doesn't want me to use the third-party look up approach as it feels like cheating, which is somewhat absurd since it's the correct approach.

Yes, that's the conclusion I generally draw as well. It's an interview, and they think I expect them to write code. Personally, I'd answer with "I'd ask Google what they think, but if you want me to actually do it myself, I'd start with..."