by karencarits 1244 days ago
> For example, in R, we try to avoid loops because they are very inefficient

This was true before, but the performance of for loops has improved a lot in recent years, and while vectorization is still faster, for loops are no longer a no-no.

See https://www.r-bloggers.com/2022/02/avoid-loops-in-r-really/

4 comments

It's a really sticky misconception. I've seen many beginners telling others to "never ever use loops in R", and so you end up with nested sapply()s or whatever soon-to-be-deprecated tidyverse functions are in vogue that nobody can reason about.
Agreed. The most common reason loops become bottlenecks is people "adding onto" vectors or dataframes. This causes a whole new vector to be created, the data from the old one copied into it, and then the new data filled in at the end. You'll rarely notice the performance hit unless you stick it in a loop that runs tens of thousands of times.

For those who want to avoid it and still use a loop, you can create a vector beforehand with the final length and fill it in. If you don't know the final length, create a vector with a good guess for length, double its length whenever it gets full, and then crop off the unused tail when you're done.
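A minimal sketch of the doubling strategy described above, written in Python rather than R so that it runs with only the standard library. The names `collect_even_squares`, `limit` and `initial_capacity` are made up for illustration. Python lists already grow efficiently on their own, so this only demonstrates the bookkeeping; in R the same pattern would apply to a vector created up front with a guessed length.

```python
def collect_even_squares(limit, initial_capacity=4):
    """Collect i*i for each i in range(limit) whose square is even.

    The number of results is not known up front, so the buffer starts
    at a guessed length and doubles whenever it fills up.
    """
    buf = [0] * max(1, initial_capacity)  # preallocated with a guessed length
    used = 0                              # number of slots actually filled

    for i in range(limit):
        value = i * i
        if value % 2 != 0:
            continue
        if used == len(buf):
            # Buffer is full: double it once instead of growing by one
            # element on every iteration.
            buf.extend([0] * len(buf))
        buf[used] = value
        used += 1

    # Crop off the unused tail.
    del buf[used:]
    return buf
```

Because the buffer doubles, the number of resize operations grows with the logarithm of the final length rather than with the length itself, which is what keeps the loop cheap.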

So Rob Pike’s rules 1 and 2 again:

Rule 1. You can't tell where a program is going to spend its time. Bottlenecks occur in surprising places, so don't try to second guess and put in a speed hack until you've proven that's where the bottleneck is.

Rule 2. Measure. Don't tune for speed until you've measured, and even then don't unless one part of the code overwhelms the rest.

https://users.ece.utexas.edu/~adnan/pike.html
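Rule 2 in practice might look something like the sketch below, which uses `timeit` from Python's standard library. The function names `sqrt_loop`, `sqrt_map` and `measure` are invented for this example, and no timings are quoted because they depend on the machine and interpreter version. The point is only to check that two variants agree and then time both, rather than guess.

```python
import math
import timeit


def sqrt_loop(values):
    """Explicit loop that fills a preallocated list."""
    out = [0.0] * len(values)
    for i, v in enumerate(values):
        out[i] = math.sqrt(v)
    return out


def sqrt_map(values):
    """Same computation, with the loop pushed into the built-in map()."""
    return list(map(math.sqrt, values))


def measure(func, values, repeat=3, number=10):
    """Best-of-`repeat` wall time in seconds for `number` calls of func(values)."""
    return min(timeit.repeat(lambda: func(values), repeat=repeat, number=number))


if __name__ == "__main__":
    data = list(range(10_000))
    # Check that both variants agree before comparing their speed.
    assert sqrt_loop(data) == sqrt_map(data)
    for func in (sqrt_loop, sqrt_map):
        print(f"{func.__name__}: {measure(func, data):.4f} s")
```

Taking the minimum over several repeats is the usual way to reduce noise from other processes on the machine.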

That holds true in general, but when doing numerical calculations on a large amount of data, taking speed into account is necessary. You usually know the approximate time penalty for not doing so, and can weigh the extra time spent coding versus the time spent waiting for results.

For example, if I am writing a toy neural network with a small dataset I don't care how optimized it is, or if it runs slowly on the CPU.

But when training a large network on a large amount of data it is well worth spending extra effort from the start to ensure as much work as possible is done on a GPU, and writing it so that it can support multiple GPUs.

That's some pretty generic premature optimization cargo culting.

If you have a huge data set and some understanding of what you're doing, the bottlenecks will be pretty obvious.

Apparently not obvious enough for people to estimate in advance if they should avoid looping.

Pike's point is not just to avoid premature optimization. It’s to measure bottlenecks, because as languages and hardware change, what you thought you knew to be true might become outdated.

One of the reasons why Matt Godbolt created Compiler Explorer was to prove to teammates that what was pretty obvious to them actually wasn't.
In Julia I do not need the Godbolt compiler explorer. The macros `@code_llvm` and `@code_native` show me the LLVM IR and native code for a function.

  julia> @code_llvm debuginfo=:none 5.0 + 3
  define double @"julia_+_156"(double %0, i64 signext %1) #0 {
  top:
    %2 = sitofp i64 %1 to double
    %3 = fadd double %2, %0
    ret double %3
  }

  julia> @code_native debuginfo=:none 5.0 + 3
  ...
    vcvtsi2sd %rdi, %xmm1, %xmm1
    vaddsd %xmm0, %xmm1, %xmm0
    retq
  ...
There's a reason Godbolt was made for C/C++ rather than python/R. In a fast language you need to know what the compiler is doing to know what's slow. In a slow language, the slow part is pretty much always just "code that does anything in the language".
Python is only slow because so far there has been a huge disregard for JIT implementations, versus how other dynamic languages have decided to deal with performance issues.
Avoid the allure of premature optimization
But embrace the repulsion from belated pessimization. As Len Lattanzi said, it's the leaf of no good.
> so you end up with nested sapply()s

And that's usually not even vectorizing anything, it just hides the for-loop that is buried somewhere in the apply-code...

Does the article you linked not show that a loop is 10x slower than vectorization for computing square roots? The fact that 10x is better than the 60x slowdown for vapply isn't really evidence that loops are a reasonable alternative to vectorization yet.
The obsession with CPU speed almost always confuses me in these topics. Time it takes to program is way more important, and that’s where a terse language like R shines. The base/most common functions are almost always executing C anyway. It’s kind of like Lisp in that it’s easy to write slow code, but who cares if it’s “fast enough”? Also, it’s almost always easy to speed up if necessary at the R level, and R’s C API is also easy to use for numeric computing/optimization, which is exposed at the C level if you want to use it.
It depends. Take for example any omic dataset where you might need to run a GLM on ~500,000 rows. Code I've seen for this operation can take anywhere from 30 minutes to 2 days.

My takeaway here is that, sure, for one operation the speed is not that critical, but there is always the case where that one operation will be used close to a million times in one analysis and then it all adds up. On top of that, if it's implemented in C then the invocation from R to C and back will be happening that many times, which adds to the slowness.

Yes, I use R, Julia, and Python from time to time depending on the case and my mood and they all have their advantages and disadvantages.

R is more than fast enough for straightforward prototypical analyses where a lot of the code is calling C or something lower level and you're not introducing something "new" to the interpreter system. But if you want to do some unusual optimization there's going to be something that bottlenecks everything unless you go into C/C++/Fortran yourself, and then Julia is a good compromise. I've had times when Julia didn't save any time whatsoever, and other times when it took something that would literally run over a week at least in R and it was done in 30 minutes in Julia.

Having said that, the more I use Julia the more I find myself scratching my head about it. It's very elegant but it's just low-level enough that sometimes I wonder if it's worth it over, say, modern C++ or something similarly low level, which tends to have nice abstracted libraries that have accumulated over the years. I also have the general impression, mentioned in a controversial post discussed here on HN, that a lot of Julia libraries I've used just don't quite work for mysterious reasons I've never been able to figure out. Everything with Julia has gotten better with time but I still have this sense that I could put a lot of time into some codebase, and have it just hit a wall because of some dependency that's not operating as documented.

There's kind of an embarrassment of riches in numerical computing today, and yet I still have the feeling there's room for something else. Maybe that's the mythical golden language that's lured all sorts of language developers since the beginning though.

I have been thinking the same and had similar timing experiences. As Julia is lower level than R/Python, there are a lot of annoying things to take care of that are not needed in R/Python. And then why not use, say, Rust? Or just Rcpp in R. We just did a small test program in Rust that is called very often on the command line and takes a couple of seconds to run. Very happy with the experience. Same run speed as Julia, 10 times faster than R/Python, and no 60 second load time like Julia.
Julia 1.9, now in beta, implements native code caching. Precompiling a Julia package now creates a native shared library, a ".so", ".dylib", or ".dll" file. For some packages, this lowers load time considerably. It may take some time before many packages take full advantage of this.

The promise of Julia is that you can have the high-level interface and the low-level code in the same language. The alternative would be coding the low level code in Rust or C and then creating bindings for Python or R.

For a while Julia made the most sense for long-running code that is executed almost as often as it is modified (e.g. scientific computing). In this situation Rust or C static compilation times become a hindrance. As ahead-of-time and static compilation features get added to Julia, this scope will expand.

Yes I follow this. The load time keeps getting better. And am looking forward to 1.9.

I really don't want to come across as negative, Julia is a fantastic language, and my hope is that it will continue its impressive improvement path.

But to follow from the thread's sentiment, I have the feeling Julia lives in an unstable equilibrium. It is lower level than R/Python but doesn't quite deliver the benefits of Rust/C/Fortran/C++. I find my colleagues gravitate to one of the 2 equilibria.

Maybe your last paragraph crystallizes it. If one lives in the REPL, Julia is wonderful. Not how I work. I prefer the command line. Have new data, run code on it. Data changes in real time, code not. My code may run millions of times on different operating systems and only infrequently change.

One of the key points of Julia is that the language you use for performance critical parts is also Julia. That applies to both the libraries like DataFrames.jl and for situations where you'd drop to a lower level language when optimising. I think being productive in Fortran or C++ is unrealistic for most scientific programmers.
It is a trade-off, and the sweet spot has a lot to do with the specific context and background. Run speed matters a lot when the difference is between having to run your code on a dataset for half an hour vs through the whole night. Once you have prototyped your code, you are gonna use it more and more (not to mention runs in order to tweak parameters or validate results), and R's speed is not satisfying enough for my work. Python and MATLAB are easy and fast enough to program in, and much faster for tasks that are computing-heavy. If I were getting into C I would not have saved as much time as I would have put into learning how to run e.g. parallel tasks there safely. Moreover, R is not always faster to program in; real (i.e. tidyverse-style) R is quite idiosyncratic, and if you come from a programming rather than a statistics background it will probably take more time to learn than it is worth, unless it is something important in your work environment.
When someone understands what is happening when their program executes they will write faster programs without much more effort.

You might like writing slow programs, but that doesn't mean people like using them.

Sorry, not following the logic here. From the article, vectorization[1] is more than 10 times faster than a loop. How is this an endorsement of "for" loops?

   [1] Vectorization is more than ten times faster than the naive loop.