| Unfortunately, I cannot agree with you completely. If there is no language constraint and no system resource constraint, then for the problem as we understand it so far, using Java will be the fastest and easiest way, without a hashmap. Load the complete file as a string (depending on the size of the data set; a Java String holds at most 2^31 - 1 characters), then use string.indexOf() to get the best result. On HotSpot, indexOf() is backed by a JVM intrinsic, i.e. optimized native code, which is typically much faster than a hand-written implementation. My gut tells me that it's weird to use a hashmap to do string lookup. Everybody knows a hashmap is used to look up key-value pairs. The real reasons for not using a hashmap here are: 1. a hashmap's lookup is O(1) on average (O(n) only in the worst case), but that does not include the build cost. If the data set is huge, it takes a long time to build the hashmap, since the whole table is resized and rehashed each time the number of elements grows past the capacity threshold. 2. I believe the underlying implementation of indexOf() uses a sort of algorithm called "automata", or something similar, to do a fast search within a string. So there are lots of alternative solutions. Don't always think there is only one. I'm not in this field, and I'm not interested in getting into it too much. But I don't think the best answer is that tiny change. This is why I suggested considering whether you are doing application-level optimization or changing a system-level algorithm. Building software is a lot more than code manipulation. Understanding the requirements is the first step in the SDLC (Software Development Life Cycle). |
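For reference, the load-then-indexOf() approach described in the comment above could look roughly like the sketch below. This is only an illustration under assumptions: the class name `SubstringLookup`, the helpers `loadAll` and `contains`, and the temporary sample file are all hypothetical, and the data is assumed to be UTF-8 text that fits in a single String.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class SubstringLookup {
    // Hypothetical helper: read the whole file into one String.
    // Assumes UTF-8 content small enough to fit in memory.
    public static String loadAll(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    // Hypothetical helper: true if needle occurs anywhere in haystack.
    public static boolean contains(String haystack, String needle) {
        return haystack.indexOf(needle) >= 0;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample data, written to a temp file for the demo.
        Path file = Files.createTempFile("words", ".txt");
        Files.write(file, "alpha\nbeta\ngamma\n".getBytes(StandardCharsets.UTF_8));

        String data = loadAll(file);
        System.out.println(data.indexOf("beta"));    // prints 6
        System.out.println(contains(data, "delta")); // prints false

        Files.delete(file);
    }
}
```

One design difference worth keeping in mind: indexOf() matches a substring anywhere in the text, so "bet" would also be found here, whereas a hashmap lookup matches whole keys only. Which behavior is wanted depends on the requirement.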
The best answer is the one that passes the test. Using Ruby it is pretty easy to do so, but you can do it any way you like.