by jheriko 4511 days ago
"It took me 2 hours to figure out how to parse the RSS feed, extract the data I need using regex (since it was all inside a single tag) and present it in a table view."

i hate to be negative or damning in any way - you make many valid points. however what you did there is an awful performance imo. i suspect it comes from living in web circles too much. i mean... you used a regex? this is one of the rare cases where i would write a c program with old fashioned procedural logic and expect it to be done much faster and to a higher quality.

also yes, good programmers perform very well in unfamiliar environments. not having the api you want is a common real world problem - learning new apis on the fly is a vital developer skill. the idea of an unfamiliar api should be neither daunting nor challenging - unless it is of exceptionally poor quality (poor naming, no docs, no samples, so you must reverse engineer everything - but that shouldn't stop you either).

also, i hear a lot about the value of regexes - be careful, this is a web centric view. regexes are a very limited parsing/recognition tool and outside of web development - they go unused for most such problems - they either aren't powerful enough or add a needless layer of complexity in the general case.

as a concrete example most compiler generators will use regex for lexing but not for parsing at all - even repeated regex on a tokenised one line string is nowhere near as useful or practical as ll, lalr or especially glr parsing.
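A rough Python sketch of that split, under stated assumptions: a regex handles the lexing, and a small hand-written recursive descent (LL-style) parser handles the nested structure that a regex alone cannot track. The toy arithmetic grammar and the names `tokenize` and `parse` are hypothetical, chosen only for illustration.

```python
import re

# Lexing: one regex recognises each token (an integer or an operator).
TOKEN = re.compile(r"\s*(?:(?P<num>\d+)|(?P<op>[-+*/()]))")


def tokenize(text):
    """Split the input into (kind, value) tokens using the regex above."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at offset {pos}")
        kind = m.lastgroup  # "num" or "op", whichever alternative matched
        value = m.group(kind)
        tokens.append((kind, int(value) if kind == "num" else value))
        pos = m.end()
    return tokens


def parse(tokens):
    """Evaluate tokens with a recursive descent parser.

    Grammar (an assumption for this sketch):
        expr   := term (('+' | '-') term)*
        term   := factor (('*' | '/') factor)*
        factor := NUMBER | '(' expr ')'
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else (None, None)

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def expr():
        value = term()
        while peek() in (("op", "+"), ("op", "-")):
            _, op = advance()
            rhs = term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term():
        value = factor()
        while peek() in (("op", "*"), ("op", "/")):
            _, op = advance()
            rhs = factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    def factor():
        kind, val = peek()
        if kind == "num":
            advance()
            return val
        if (kind, val) == ("op", "("):
            advance()
            value = expr()  # recursion handles arbitrary nesting
            if peek() != ("op", ")"):
                raise SyntaxError("expected ')'")
            advance()
            return value
        raise SyntaxError("expected a number or '('")

    result = expr()
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing input")
    return result
```

For example, `parse(tokenize("2 * (3 + 4) - 5"))` evaluates the nested expression; the regex never has to know about the parentheses matching up.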

they also add a layer of complexity... someone famously said something like "oh, you used a regex... now you have /two/ problems"

good luck though. given more time and practice you will learn to eat these interviews up then spit them out with you rejecting them for making a poor first impression on you as a prospective employer... interviews work both ways after all.

5 comments

> i hate to be negative or damning in any way - you make many valid points. however what you did there is an awful performance imo. i suspect it comes from living in web circles too much. i mean... you used a regex? this is one of the rare cases where i would write a c program with old fashioned procedural logic and expect it to be done much faster and to a higher quality.

You were able to come to these conclusions with no knowledge of the content of the data stream, just assumptions?

> also yes, good programmers perform very well in unfamiliar environments.

I think your definition of "good programmer" is nearing unicorn territory.

> not having the api you want is a common real world problem - learning new apis on the fly is a vital developer skill. the idea of an unfamiliar api should be neither daunting nor challenging - unless it is of exceptionally poor quality (poor naming, no docs, no samples, so you must reverse engineer everything - but that shouldn't stop you either).

By the sound of it, there was no API here. He was thrown a raw RSS feed and had to figure out what was in it and how to extract the relevant information.

> also, i hear a lot about the value of regexes - be careful, this is a web centric view. regexes are a very limited parsing/recognition tool and outside of web development - they go unused for most such problems - they either aren't powerful enough or add a needless layer of complexity in the general case.

NLP makes heavy use of regexes for tasks where GLR parsers are overkill or where we have to fix character encoding or other such data noise.

> good luck though. given more time and practice you will learn to eat these interviews up then spit them out with you rejecting them for making a poor first impression on you as a prospective employer... interviews work both ways after all.

Exactly what we need to do, teach to the test. That will keep interviews effective.

> teach to the test

Yeah. At some point the candidate will become a professional candidate. I would rather stay a professional developer instead.

> i mean... you used a regex?

That was my initial impression, but after I reread the sentence I interpreted it as he used an RSS parser to get to the element with the data and then used a regex to extract the information from the data.

He didn't specify how the data was formatted inside the RSS element, which leads me to believe it might be something custom and not a simple CSV. This would be a perfectly valid use case for a regex.

If the data was formatted like this:

<outer><inner>1</inner></outer><outer><inner>Content: This is some random content. Value: 500 SomethingElse: 23423</inner></outer>

and the problem was to extract the value, the somethingelse and the content, then a regex using extractors would probably help a lot.
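A minimal Python sketch of that approach, assuming the made-up format above: an XML parser walks to each `<inner>` element, and a regex with named groups pulls the fields out of its text. The wrapping `<root>` element, the field names, and the `extract_records` helper are assumptions for illustration, not the actual feed from the interview.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical payload from the example above, wrapped in a <root> element
# (an assumption) so that it is a well-formed XML document.
DOC = (
    "<root>"
    "<outer><inner>1</inner></outer>"
    "<outer><inner>Content: This is some random content. "
    "Value: 500 SomethingElse: 23423</inner></outer>"
    "</root>"
)

# Named groups act as the "extractors" for the three fields.
FIELDS = re.compile(
    r"Content:\s*(?P<content>.*?)\s*"
    r"Value:\s*(?P<value>\d+)\s*"
    r"SomethingElse:\s*(?P<something_else>\d+)"
)


def extract_records(xml_text):
    """Return one dict per <inner> element whose text matches FIELDS."""
    records = []
    for inner in ET.fromstring(xml_text).iter("inner"):
        match = FIELDS.search(inner.text or "")
        if match:  # elements such as <inner>1</inner> are skipped
            records.append({
                "content": match.group("content"),
                "value": int(match.group("value")),
                "something_else": int(match.group("something_else")),
            })
    return records


records = extract_records(DOC)
```

The XML library handles the markup and the regex only ever sees the free-form text inside one element, which is the division of labour described above.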

I had to do a pretty similar task in the interview process to land my current role. They provided me with an example of a poorly written program to parse the RSS feed.

Their example was using regex to parse. The first thing I did was rip it out. I think knowing when to use regex and when not to use it is a valuable skill.

Let's throw you into a high-stress environment, give you data you've never worked with before and, on top of that, just for funzies, throw some namespaced SOAP at you. Oh, and here's the WSDL; we want you to use that too, no parsing it on your own by hand.

A lot of people don't parse XML as much anymore. Most API providers have moved over to serving up JSON and pretty much ask developers to only use that API.

Also, this company fucked up. So you can't provide access to an API, but "here, read this RSS feed" instead of something like serving up a static JSON file from a server somewhere?

As for the regex, there's shipping and there's doing it right. For all we know, they had to have it done before the 2 hours were up, so they cut corners and got it done within the time requirement.

Reading an XML feed instead of JSON is not comparable in difficulty to your examples at all. Getting XML parsed is not rocket science, just use a library. That was probably not the hardest part of the problem.
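To illustrate the "just use a library" point, here is a short Python sketch using the standard library's `xml.etree.ElementTree`. The sample feed and the `parse_items` helper are made up for illustration; the actual feed from the interview is unknown.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 feed standing in for the real one.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item><title>First post</title><link>http://example.com/1</link></item>
    <item><title>Second post</title><link>http://example.com/2</link></item>
  </channel>
</rss>"""


def parse_items(rss_text):
    """Return a (title, link) tuple for every <item> in the feed."""
    root = ET.fromstring(rss_text)
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in root.iter("item")
    ]


items = parse_items(SAMPLE_RSS)
```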

We have no idea where the expectation of working with their JSON API on the interview came from, the RSS feed could be their standard test problem. They didn't "fuck up", I'd even say their filter (reasonable or not) is working as intended.

I'd say it depends. There are lots of things that can really throw people for a loop. Maybe you are using an editor they aren't familiar with, or perhaps you don't have your environment set up for gradle projects. There's so much more to this that we don't know. I'm willing to give the benefit of the doubt.

A way to get around excuses like these is to encourage people to bring in their own systems. Explain that you know your setup might be awkward and that it's probably better if they work on something they are familiar with. Besides, during the pairing session, you shouldn't be writing too much anyway.