Hacker News new | ask | show | jobs
by asQuirreL 98 days ago
Post can be summarised quite succinctly:

Everything was slow because sorting was taking a lot of time. Sorting was slow because its comparator was taking ~6 read locks on every comparison, and was cloning large structures to avoid holding the lock for a long time. The first fix was to access just the information needed to avoid the clones, the second fix was to cache exactly the data needed for sorting after the underlying data was updated, and use that for the comparators without needing to take the underlying lock.

I'm looking forward to the next post about how cache consistency is tough.

1 comments

By far my favorite feature of lodash is the sortby function, in which instead of providing a custom comparator as most std libraries’ sort() function offers, provides a way to substitute a simpler object for the comparator to chew on. If your comparator needs to go beyond a couple nested conditionals, to actual data transform or grabbing locks, then that nasty little logn term in the runtime can take your sort from using 20% of the time budget for an expensive operation to taking 120%. Especially when you consider the memory bandwidth sloshing around L3 or the main memory bus to do the full table scan logn times.

I think the world would be better off if this wasn’t in a third part library in a single programming language. Iirc Ruby is the only language I know with it as a built-in.

Rust's slice of T [T] provides [T]::sort_by_cached_key which is a stable IPN sort which lets you provide a key transform callable f, which it will call at most once for each item to be sorted, sorting by comparing the (cached) results from that callable.

https://doc.rust-lang.org/std/primitive.slice.html#method.so...

However ..._by_cached_key is not provided for Rust's unstable sort because the unstable sort doesn't require an allocator and necessarily a cache does need an allocator.

Yeah this is a function called sort_by but I don’t think it’s doing the same thing.

    let (initial_values, stream) = (initial_values, stream)
            .filter(filter)
            .sort_by(new_sorter_lexicographic(vec![
                // Sort by latest event's kind.
                Box::new(new_sorter_latest_event()),
                // Sort rooms by their recency.
                Box::new(new_sorter_recency()),
                // Finally, sort by name.
                Box::new(new_sorter_name()),
            ]))
            .dynamic_head_with_initial_value(page_size, limit_stream);

That’s an api that would make an OG Java developer get tingles.

sortBy should be locking each object once and I’m reasonably sure this is happening at least three times. Author ends up approximating _.sortBy() at the bottom by introducing a struct.

This is well known in Perl as the Schwartz transform