|
I can do one quickly. I need to take affiliations from papers and work out which organisation(s) they're talking about. How would you solve this problem, assuming the affiliations are already extracted for you? What are the top-level concerns, how do you break the problem down, how might your problems scale, etc. There are a lot of questions I'd expect to get to, and this should be done along with the team, kind of like we're all working on the problem together. I find it useful to see how well people can talk through the problem, and it can lead easily into questions about licensing and rights of reuse, types of errors, etc. If they suggest an approach they've used before, can they explain the likely failure cases and benefits? Are there workarounds or detection methods? For example, if you're doing text classification then TF-IDF + SVM is a solid first thing to try, and there are easy ways it can fail which we could talk about. You can cover a lot of that in an hour, and it tests whether someone can explain a potential solution to the team effectively, just as they would have to on a day-to-day basis. We can bring up specific types of problems that we face within it and what we've tried, and we can constrain the problem more or lead someone to starting points if it's a bit overwhelming. edit - I guess this would fall under some data science fundamentals, but the approach I think works for CS fundamentals. What data structures could you use? What are the tradeoffs? It's not about finding the one optimal solution, but about how to proceed. |
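To make the affiliation question concrete, here is a minimal sketch of the sort of first answer a candidate might give: normalise the string, then fuzzy-match each comma-separated part against a list of known organisations. Everything here is an assumption for illustration, not what the commenter's team actually does: the `KNOWN_ORGS` list, the `normalise` and `match_affiliation` helpers, and the `0.6` threshold are all hypothetical, and it uses only Python's standard-library `difflib`.

```python
import re
from difflib import SequenceMatcher

# Hypothetical reference list. A real system would load a much larger
# registry, which is where the licensing and reuse questions come in.
KNOWN_ORGS = [
    "University of Cambridge",
    "Massachusetts Institute of Technology",
    "University of Oxford",
]


def normalise(text):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def match_affiliation(affiliation, orgs=KNOWN_ORGS, threshold=0.6):
    """Return (org, score) for the best fuzzy match.

    Returns (None, score) when the best score is below the threshold,
    so callers can route unmatched strings to manual review.
    """
    best_org, best_score = None, 0.0
    # Assumption: affiliations look like "dept, institution, city, country",
    # so each comma-separated part is compared on its own.
    parts = [normalise(p) for p in affiliation.split(",")]
    for org in orgs:
        target = normalise(org)
        for part in parts:
            score = SequenceMatcher(None, part, target).ratio()
            if score > best_score:
                best_org, best_score = org, score
    if best_score < threshold:
        return None, best_score
    return best_org, best_score


if __name__ == "__main__":
    print(match_affiliation("Dept. of Physics, University of Cambridge, UK"))
    print(match_affiliation("12345"))
```

The failure cases are the interesting part to talk through. This loop compares every affiliation part with every known organisation, so it scales poorly as the registry grows. It misses abbreviations such as "MIT". It returns only one organisation even when the string names several. And it can confuse institutions with similar names. Each of those leads to a data-structure or tradeoff discussion, for example an inverted index over tokens to cut down the candidates.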
This is screening for specific domain knowledge (text processing), not general programming aptitude. That's OK if you want specific kinds of prior knowledge on Day 1, but it is not a way to hire generally smart people.
> I guess this would fall under some data science fundamentals, but the approach I think works for CS fundamentals. What data structures could you use? What are the tradeoffs? It's not about finding the one optimal solution, but about how to proceed.
This is exactly how most algorithm interview questions work.
I'm trying to understand what the OP meant by "real problems" not "academic puzzles". It sounded like they avoided hard algorithms yet "got a sense" of CS fundamentals somehow.