| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by anxrn 5532 days ago

- Which programming language 'complicates' storing list objects in a hash map? Downright trivial in all languages I can think in (Java, C++, Perl, Ruby, Python)

- What do you mean by uniqueness of elements? There's no mention of sorting by a secondary criterion, if that's what you mean. Which 'standard' sorting algorithm removes duplicates?

- What do you mean by 'unique key maps to a different bucket'? You'd only have to worry about this if you're designing a hashing algorithm, clearly that's not the problem here. You're hashing on age (hence guaranteeing partition on age), and simply adding objects to a list in the hash. Are you implying we'll have to invent a new hash map to solve this problem?

This is literally less than 10 LoC in any high-level language. That 'complexity' surely seems worth O(n) running time?

1 comments

anonymoushn 5532 days ago

My understanding of bnoordhuis' proposed was that the elements of the hash table would be students, rather than lists of students.

link

bnoordhuis 5531 days ago

The language was C and my solution (adapted to the article) was something like this:

    // TODO enforce! see also: soylent green
    #define MAX_AGE 100

    struct student {
        const char *name;
        int age;
    };

    struct students {
        long n_students;
        struct student *students[10e6];
    };

    struct students all_students[MAX_AGE];

    void insert(struct student *s) {
        struct students *ss = &all_students[s->age];
        ss->students[ss->n_students++] = s;
    }

So O(1) insertion and O(n) post-processing to stitch the arrays together again. Seems like pretty acceptable performance to me but the Google guy thought otherwise, the answer he was looking for was (probably) a counting sort.

I tried to argue that my solution was a fair bit faster and simpler to boot. That's probably why I didn't get a follow-up interview - no-one likes a smart arse. :-)

link

anonymoushn 5530 days ago

Thanks, I'm sorry I misunderstood your brief description.

There's no hash table here, this is just bucket sort. I don't think anyone should care about the difference between counting sort and bucket sort, though. The only thing I would mind about this is that it allocates 100x the necessary memory, but I'm sure you wouldn't actually do that when it matters.

link