Hacker News new | ask | show | jobs
by elgatonegro 2006 days ago
> Lisp has extremely powerful code generation, but makes serious performance compromises

This is a serious exaggeration.

Common Lisp has extremely good compilers that can meet C performance.

There are plenty of Scheme implementations (I use Chez) with very good performance characteristics too.

4 comments

I do wish the C compiler was as fast as e.g. SBCL's compiler in terms of compile time, because multiple rounds of dependent macro compilation eats up a lot of time.

I think Lisps tend to optimize for throughput, but games have very strict latency requirements. Garbage collection pauses could cause frame pacing issues (not that C solves that completely, but it is at least not a built in disadvantage of idiomatic use of the language)

Newer concurrent java GCs (shenandoah, zgc) provide consistent ≤1ms pause times independent of heap size. That's a reasonable latency for most soft real-time tasks.
Writing a low latency garbage collector (with decent throughput) is a non-trivial engineering problem and requires both high calibre engineers with specialized skills and a lot of funding. I fear the time lisp community has had either are long gone.

How much did the development of Azul's C4 or Oracle's Shenandoah cost in terms of money, talent and time and how much has this work lowered the barrier for less well resourced languages? I don't get the impression that the answer to this question is: "enough that we will soon see low-latency, high-throughput garbage collectors becoming the norm".

Some older VMs like Erlang's also do very well in avoiding GC pauses.

In Erlang's, the key is that garbage collection is per process. (Note that Erlang processes are analogous to Golang green threads, not OS prcessses.)

And in the BEAM the garbage collector is only invoked when the heap and stack meet which means for most short-lived processes it never runs:

https://erlang.org/doc/apps/erts/GarbageCollection.html#over...

(btw, I'm enjoying the Reactor podcast, keep it up)

> (btw, I'm enjoying the Reactor podcast, keep it up)

Thank you! Since we don't get many comments on reactor.am, it's hard to tell if it's useful for others to share our masterminds.

Can you show some concrete examples? Non-idiotimatic Common Lisp compiled with SBCL can certainly come within a single digit integer factor of C, but your claim is much stronger. As for Chez, I have yet to see some code that compares favorably to even a fast scripting language, let alone a medium or high performance compiled language, and that's even if the code is littered with fixnum version of operators etc. Not what I'd call "very good performance characteristics" unless compared to bash or python.
I have seen tight functions compiled with SBCL matching and occasionally exceeding the speed of the C version. SBCL is a compiler which produces very good code. If you add enough type information so that SBCL can infer the types of all operations, you will get performance which compares very well to C.
There is no such thing as idiomatic Common Lisp.
CL-PPCRE is faster than the C version
It used to be, and when Edi Weitz first wrote it over a decade ago it was a very elegant example of the power of having a language that makes efficient code gen at runtime both possible and easy -- a single developer was able to outperform libraries with dozen of man years of investment over a beer-bet. But I'd be extremely suprised if that were still the case or if it were even within the same order of magnitude as the fastest regexp engines.

Also, despite the fact that scenarios that can leverage "jit-for-free" are both a best case scenario for lisps and not that rare in different fields (from firewall rules to stencil computations) to the best of my knowledge, even in this niche lisp plays absolutely no role in practice. To be clear, I don't think this due to any inherent shortcoming of lisp itself, indeed I suspect it's mostly due to brain-drain.

Rare to see Chez brought up, I've been increasingly integrating it into my own work recently. Its performance¹ and maturity of the runtime keep surprising me.

¹ Frequently on par or even better than LuaJIT, though it can take some work to get it there.

The power of luaJIT is that isn't doesn't take much work to get there. If you do normal things it ends up being very fast by default.
Yeah, no, I agree. But my experience with it has also been that as soon as you start writing anything except simple code, its performance starts to suffer, while Chez handles complex abstractions more gracefully. So I suppose comparing them directly might be a bit unfair both ways.
What kind of things end up being slow in LuaJIT?
Can't claim I remember much detail given how long I haven't worked with it. IIRC one of the issues had to do with tossing around anonymous functions inevitably causing it to give up on JITing and drop out into the interpreter
> Common Lisp has extremely good compilers that can meet C performance.

Provided memory is no object.

In general it takes five times as much memory for a GC'd program to be as performant as one with explicit memory management. See: https://www.cics.umass.edu/~emery/pubs/gcvsmalloc.pdf

There's a reason why the most interesting work these days is being done in and on languages like Rust, which has no GC but still saves you 90% of the work and close to 100% of the pain of bugs that are inherent to explicit memory management.

Usually it boils down to cargo cult against GC based languages instead of learning how to use all their features, including those for C++ like resource management.

https://devblogs.microsoft.com/aspnet/grpc-performance-impro...

While I can't speak for any experimental new garbage collectors, I did analyze the SBCL generational GC's implementation. I encourage anyone interested in performance to look over GC code and realize how much work they are doing.

That GC in particular has to stop the world, traverse the entire fresh heap, guesses whether something is even a pointer (and can guess wrong, causing a memory leak!), and does all this at arbitrary times, unless you do non-idiomatic things like manage your own memory in preallocated arrays, or use C/asm-powered memory mapped buffers [0].

Nothing comes for free, and some domains can't pay the price for GC (and choose to spend those resources elsewhere). Games, in my opinion, are largely still in that category.

[0] https://m.youtube.com/watch?v=S7nEZ3TuFpA

> unless you do non-idiomatic things

I wonder what exactly makes code "idiomatic" in a GC language. It is one of the greatest misconceptions about GC languages that you shouldn't worry about allocations at all. For me, GC mostly means the correctness guarantees which come with it and the convenience, that I don't have to track every single allocation and mange it manually. However, it still means that I should be aware of all the larger allocations and avoid them as in any other language, if I am looking for performance. Not reusing allocated memory where reasonable is detrimental to the performance in any language. And of course, the choice of the right GC approach can also help with applications like games. Lispworks even provided a Lisp environment for a Nasa probe, the GC had real-time capabilities.

Like the ones written in Unreal?

https://docs.unrealengine.com/en-US/ProgrammingAndScripting/...

Just because a language has a GC doesn't mean one needs to use it 100% of the time, which is exactly what is behind the .NET 5 performance improvements that top C++.

By making use of stack allocations, value types, memory slices, native heap allocations.

Features that Common Lisp also supports, and I bet Allegro and LispWorks are better than SBCL in the performance chapter.

It depends on what you talk about with "performance". Some years ago, Allegro had the faster GC than SBCL, but SBCL certainly had the best code generator of the three Lisps. So unless your runtime is constrained by the GC, SBCL really shines for performance. And it gives you a lot of knobs to optimize the memory usage.
SBCL generates the best code for toy benchmarks and number crunching applications. Throw it against a CLOS heavy codebase and you'll be surprised at how it runs VS Clozure CL and the aforementioned implementations.

ACL's GC is still the best hands down (and of course ABCL by virtue of the JVM). I think SBCL is still sporting a conservative GC on x86.

> In general it takes five times as much memory for a GC'd program to be as performant as one with explicit memory management.

In this blanket form, the statement is just wrong. Yes, with GC you need to have a larger heap space, as unreferenced objects will remain on the heap until collected and you want to have enough heap space so collections are infrequent enough, that a lot of objects can be collected (especially with generational GC, you want low survivor rates in the youngest generation).

However, how much space you want to reserve for that depends on many factors. Usually the extra space is proportional to the allocation and deallocation rates, not the total heap size. If you have lots of data on the heap which is long-living, this doesn't count to the extra space. Which leads to the allocation behavior of your program in general. If you want best performance, your program shouldn't blindly create garbage, but only, where it is needed. A lot of data can be stack allocated, so not counting towards the GC. And of course, you can have some amount of memory manually managed (depending on language), for bulk data. Be it entirely allocated outside the GCed heap or by keeping references alive to memory that manually gets reused. In all of these cases, this doesn't really count towards the extra space calculation.

The programming language used plays a huge role in this and the paper you quoted uses Java, which is a quite allocation-happy language, so the heap pressure is higher and you need more extra space to be performant.

At 3x memory you still get comparable performance, and memory is fairly cheap. In particular, memory sizes continue to scale, unlike CPU speeds, making GC an increasingly favourable proposition.
The end result is games that use 8GB of RAM like modded Minecraft. I was perfectly happy with 8GB of RAM for the entire system but no, that one game was so inefficient that I had to upgrade my entire system.

I will keep saying this over and over again. It's only cheap if you don't waste it. Once you are willing to waste it no amount of RAM will be good enough.