| I understand your concern, but I don't really think the problem is within Clojure people: When I started out explaining the structure to other people and that most operations have O(log n) runtime, people dismissed Clojure as being too slow for practical purposes. Part of this is because programmers, not Clojure devs in particular, connotate O(log n) with "much slower than O(1)". However, this data structure is incredibly much faster than the "O(log n)" people are generally used to. And this is the problem. As you yourself say, most programmers usually associate/assume "Big-O notation is intended to describe how the time required for an operation grows as the number of items being handled gets large". But formally, this is not the case at all. O(g(x)) tells you the that the worst case input to the algorithm will require less operations (usually operations, sometimes memory or other things) than the function g multiplied by some positive constant. g is usually determined by the size of the input. (This definition is also simplified somewhat, but it illustrates the problem) We can say that ArrayLists.add have O(n) runtime and that quicksort have O(n²) runtime, but they both also have O(n³) and O(n!) runtime. In addition, Big-O does not describe how the algorithm runs in the average case, or how frequent the worst case is. In and of itself, the Big-O notation tells us almost nothing about an algorithm – you barely conclude anything from it. I guess that is part of why it has become this very simplified term with a very wibbly-wobbly definition from time to time. So when you're saying that calling it "practically O(1)" is lying, I agree if this was in an academic context. But for the average programmer (regardless of language!), O(g(x)) is usually thought of as the "worst case runtime for an algorithm". Since the actual runtime is – well – effectively constant, I considered it okay. --- That being said, I have formally explained the persistent vector if you're interested in the academic version. The background chapter in http://hypirion.com/thesis.pdf should cover it. |
I would be more sympathetic to your argument that O(log n) is "practically" O(1) if you would put it in context. It's certainly true that how much the difference matters depends on the application. For someone doing real-time work (HFT, for example) it could be critical. For the average web app, it matters extremely little; the programmer productivity gains offered by functional collections far outweigh the small cost in CPU cycles. I totally agree with that argument, and have made it myself many times, but I still wouldn't say that log time is "practically" constant time for all applications.