| HN Mirror

by Thiez 3426 days ago

> I do use the whiteboard for trivial CS questions limited to 5-10 minutes. Think fizzbuzz and string reversal.

Why do people keep saying string reversal is easy? Text is one of the hardest things out there. Unless you expect your programmers to support ASCII only?

Why not reverse a list or an array? That does sound like something that is doable in 5-10 minutes.
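For comparison, a minimal sketch of the list version in Python. The function name `reverse_in_place` is made up for illustration; the two-pointer swap is one common approach, not the only one.

```python
def reverse_in_place(items):
    """Reverse a list in place by swapping from both ends toward the middle."""
    left, right = 0, len(items) - 1
    while left < right:
        # Swap the outermost pair, then move both indices inward.
        items[left], items[right] = items[right], items[left]
        left += 1
        right -= 1
    return items
```

No encoding questions arise here because every element is a single, self-contained unit.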

3 comments

adrianratnapala 3425 days ago

This is why I like the question. As an interviewer, I will leave that matter open and see if the candidate asks for clarification. If so, bonus! But I will tell them to just solve for ASCII.

The worst-case scenario is a "self-hazing" where someone sees the encoding difficulties and freezes up instead of asking for clarification. I've never seen that happen, though. In my experience, everyone -- even the good ones -- just assumes ASCII.

And I am happy enough if they do it correctly.

ryandrake 3425 days ago

Thank you. A great response to "reverse a string" is "What's the encoding?" If it's anything but ASCII, you're in for a long whiteboarding session. If the interviewer doesn't know what you're talking about, back away slowly...

jameshart 3425 days ago

Even in ASCII you have multibyte sequences to worry about: what's the reverse of CRLF?
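A small Python sketch of the CRLF point. Both function names are hypothetical, and treating CRLF as one unit is only one possible answer to the question, not an established rule.

```python
import re


def naive_reverse(text):
    """Reverse character by character, splitting CRLF into LF followed by CR."""
    return text[::-1]


def reverse_keeping_crlf(text):
    """Reverse while treating each CRLF pair as a single unit (an assumption)."""
    # The alternation tries "\r\n" first, so a CR immediately followed by LF
    # stays together; DOTALL lets "." match a lone "\n" as well.
    units = re.findall(r"\r\n|.", text, flags=re.DOTALL)
    return "".join(reversed(units))
```

With the input `"ab\r\ncd"`, the naive version yields `"dc\n\rba"`, while the second keeps the line ending intact as `"dc\r\nba"`.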

digler999 3425 days ago

What makes other encodings hard? The two things that come to my mind are byte length and the comparison function. If the encoding used a fixed number of bytes per character, then it should just be a matter of swapping n bytes at a time instead of 1 byte. What else is difficult about non-ASCII encodings?
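The "swap n bytes at a time" idea can be sketched in Python roughly as follows. The name `reverse_fixed_width` is hypothetical, and UTF-32 is assumed in the usage note as an example of a fixed-width encoding.

```python
def reverse_fixed_width(data: bytes, width: int) -> bytes:
    """Reverse a byte string in chunks of `width` bytes."""
    if width <= 0 or len(data) % width != 0:
        raise ValueError("data length must be a positive multiple of width")
    # Cut the buffer into fixed-size units, then emit them in reverse order.
    units = [data[i:i + width] for i in range(0, len(data), width)]
    return b"".join(reversed(units))
```

For example, `reverse_fixed_width("h\u00e9llo".encode("utf-32-le"), 4).decode("utf-32-le")` gives `"oll\u00e9h"`. This only reverses code points, though; it does nothing for sequences of several code points that form one visible symbol.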

detaro 3425 days ago

e.g. in UTF-8 a codepoint is encoded in varying byte lengths (so you have to split into codepoints and then reverse), and, a lot more difficult, a sequence of multiple codepoints can be combined to form a symbol. Simplest case would be something like "ö" encoded as "o" (U+006F) followed by a combining diaeresis (U+0308).
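To illustrate the combining-diaeresis case, here is a rough Python sketch. Both function names are hypothetical, and attaching marks via `unicodedata.combining` is a simplification: it is not full grapheme-cluster segmentation and will miss other kinds of multi-code-point sequences.

```python
import unicodedata


def reverse_code_points(text):
    """Reverse code point by code point (Python strings index by code point)."""
    return text[::-1]


def reverse_with_combining_marks(text):
    """Reverse while keeping combining marks attached to their base character."""
    clusters = []
    for ch in text:
        # A nonzero combining class means the code point modifies the
        # preceding base character, so glue it onto the previous cluster.
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return "".join(reversed(clusters))
```

Given `"o\u0308k"` (an "ö" built from two code points, then "k"), the first function returns `"k\u0308o"`, moving the diaeresis onto the "k". The second returns `"ko\u0308"`.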

Other fun special cases: 🇺🇸 is U+1F1FA REGIONAL INDICATOR SYMBOL LETTER U followed by U+1F1F8 REGIONAL INDICATOR SYMBOL LETTER S, and should if possible be displayed as a US flag (otherwise it falls back to the text "US"). Should reversing it create 🇸🇺 (replacing the flag with the characters "SU"), or still show the flag? (I'm not even sure there isn't a case where both orderings are valid country codes and it would change to a different flag.)
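A short Python sketch of the flag example. The helper `regional_indicators_to_letters` and the constant names are made up for illustration, and the helper assumes its input contains only regional indicator symbols.

```python
# U+1F1FA followed by U+1F1F8, the two code points described above.
US_FLAG = "\U0001F1FA\U0001F1F8"

# REGIONAL INDICATOR SYMBOL LETTER A; the other 25 letters follow in order.
REGIONAL_INDICATOR_A = 0x1F1E6


def regional_indicators_to_letters(text):
    """Map each regional indicator symbol to its plain ASCII letter."""
    return "".join(
        chr(ord(ch) - REGIONAL_INDICATOR_A + ord("A")) for ch in text
    )
```

`regional_indicators_to_letters(US_FLAG)` gives `"US"`, and applying it to `US_FLAG[::-1]` gives `"SU"`, which shows that a code-point reversal turns the pair into a different two-letter sequence.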

Similarly, emoji can be formed from a sequence with combining characters in between, which doesn't display correctly if reversed codepoint by codepoint.

ryandrake 3425 days ago

Some examples: If you're dealing with UTF-8, which is very common, you need to handle variable-length characters. If you're working with UTF-16, you need to handle surrogate pairs. Neither is the end of the world, but the basic "array walking" string reversal methods you'd expect from a whiteboarding session wouldn't work.
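A rough Python sketch of how "array walking" goes wrong in both encodings. All three function names are hypothetical; the helpers only demonstrate the failure and are not a fix.

```python
def reverse_utf8_bytes_naively(text):
    """Encode as UTF-8 and reverse byte by byte, ignoring character boundaries."""
    return text.encode("utf-8")[::-1]


def reverse_utf16_units_naively(text):
    """Encode as UTF-16 (little endian) and reverse the 2-byte code units."""
    data = text.encode("utf-16-le")
    units = [data[i:i + 2] for i in range(0, len(data), 2)]
    return b"".join(reversed(units))


def decodes_cleanly(data, encoding):
    """Return True if `data` is valid in `encoding`, False otherwise."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False
```

Pure ASCII input survives both functions. A two-byte UTF-8 character such as `"\u00e9"` ends up with its continuation byte first, and a character outside the Basic Multilingual Plane such as `"\U0001F600"` ends up with its surrogates swapped, so `decodes_cleanly` returns `False` for both results.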

stale2002 3425 days ago

Well, OK, if you are using C or something and traversing raw bytes, it is hard.

But if you are in Python, given a string object, and told to reverse it without using a built-in reverse method... yeah, that's easy. You should be able to do that.

People are usually expecting the latter.
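The kind of answer this describes might look like the following Python sketch. It assumes "without using the reverse method" rules out `reversed()` and slice tricks, and the function name is made up.

```python
def reverse_string(text):
    """Build the reversed string by walking the indices from last to first."""
    chars = []
    for index in range(len(text) - 1, -1, -1):
        chars.append(text[index])
    # Join once at the end rather than concatenating inside the loop.
    return "".join(chars)
```

Because Python strings index by code point, this avoids the byte-level problems, but it still separates combining marks from their base characters.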
