| > but then just assumes that computational complexity and O-notation are known. At the bottom, you can find a reference to the previously published chapters. Complexity & big-O was addressed in the initial. (This is part of a book, all the org details are explained in part 1) > There is no path compression. The second variant is path compression. Yes, it may also be implemented in find, but it will make the code more complicated, in my view. > And then, the uf-union operation: No practical union-find implementation accesses the list of all entries in this operation. You are correct here. I'll make a change, thanks. > Also, uf-add seems to allow adding new entries to the data structure but does not add them to the points list This isn't needed here as it is supposed that we already have the list of points, it's just not arranged for efficient disjoint test. I, actually, had the reference to it, initially, but removed as I considered it redundant. I
ll think of adding it back. |
I stand corrected, thanks. I knew this was part of a book and did go to the sidebar to check the titles of older chapters, but there the first part is called '"Programming Algorithms" Book', not 'Introduction & Complexity' like at the bottom. Going only by the title, I didn't think there would be complexity there. My fault.
> The second variant is path compression.
Yes, but only on the "add" operation. It's more important to have it elsewhere. Imagine a set of elements {a, b, c, d, e} where you do a sequence of pairwise union operations on (a, b), (b, c), etc., and end up with the data structure degenerated into a linked list a -> b -> c -> d -> e. If you now repeatedly check whether a and b are in the same class, you end up searching the entire list twice, every time you do this search.
You could try adding path compression in union, but I don't think a complete, efficient solution is possible: You would always be able to construct small trees that you would have to link to produce bigger, deeper trees.
That's why path compression is always where it matters and can make a useful difference: In find. In the example above, depending on the details, you would have one or two full linear searches, and then every query on a and b would be constant time.
> Yes, it may also be implemented in find, but it will make the code more complicated, in my view.
That's one of the reasons why union-find might not be the best introductory data structure. As for your view, to some extent, when you teach standard stuff, your view is not as important as actually teaching what you claim to teach. If you claim to teach union-find, you don't get to pick something somewhat similar but with radically different complexity.
> This isn't needed here [...]. I ll think of adding it back.
No, don't! You're right that a full list of nodes is never needed by the data structure itself. A user of the data structure might need one, but also they might not. I have in the past used union-find in settings where it was more convenient not to have a complete list of entries.