by est31 1638 days ago
> C# always had value types with configurable in-memory layout. It also has a very good mmap solution. It also allows hand-optimizing things using unsafe blocks.

And C has inline assembly. Doesn't mean that most C code will use inline assembly.

Back in 2009, a lot of git utilities were still written in scripting languages. Not sure when it started, but the porting activity of those utilities to C is still ongoing. So the maintainers still want to use a lower level language today.

In other projects in the VCS space, we are seeing a similar trend. Hg, originally a project written in Python, is being rewritten in Rust by Facebook, one of the big users of it.

Sure, maybe you could have used C# together with some niche features. But it's not going to be fun compared to a language that has zero cost abstractions and that runs on the bare metal.

Even if your problem domain demands a managed environment, like extensibility with plugins, I still suggest you use Rust together with wasm. It's the first choice thanks to its great type system, powerful static analyzer, and first-class support for resource management that garbage-collected languages lack.


Is there a term for this phenomenon yet?

"if another language is being discussed, Rust must be forced into the discussion, no matter how tenuous the connection"

I think in this scenario it's totally germane to mention Rust because the problem described in the linked post is exactly the problem that Rust was designed to solve: providing sufficiently precise control over low-level runtime behavior that you never hit a "sorry, it's not possible to do that optimization in this language" situation, while still (arguably? hopefully?) qualifying as a "higher-level language" in the relevant sense. In particular, every problem with Java that the post describes has a straightforward solution in Rust, and this kind of thing is why Rust exists instead of, e.g., Mozilla just rewriting Firefox in an existing managed language with a garbage collector.

That being said, GP seems to imply that Rust should be the default choice for basically every problem, which goes way too far. Not every application needs this kind of low-level control. Maybe even most don't (although I look forward to a future where it's easy to drop into Rust from a managed language when you hit a performance wall; I think this has been mostly achieved for Python, but not yet for other languages). But some do, and it sure sounds like Git's one of them.

Rust is a low level language no matter how productive it may be.

The memory layout will simply leak into the program architecture and will have to be altered on refactors — something which is transparent with managed languages.

What do you mean here by memory layout? For instance, the order of fields in a Rust struct can (theoretically) change by recompiling. It's not defined by the order of fields in the definition.
On a language level, high-level APIs will necessarily expose details such as (mut) references, Box, and so on. That is not a problem at all given the problem domain, but in my opinion it is not possible to make a language that is both low-level and high-level at the same time (and it is not really needed either).
Unless you add a repr(C) attribute for C interop.
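
To make the field-ordering point concrete, here is a minimal Rust sketch. The struct names and fields are hypothetical, chosen only for illustration. With the default representation the compiler is free to reorder fields, so only a lower bound on the size can be relied on; with `repr(C)` the fields follow declaration order with C-style padding, which on typical platforms (where `u32` is 4-byte aligned) gives 12 bytes here.

```rust
use std::mem::size_of;

// Hypothetical struct for illustration; not taken from the thread.
// Default (Rust) representation: the compiler may reorder fields,
// so the layout need not follow declaration order.
#[allow(dead_code)]
struct DefaultLayout {
    a: u8,
    b: u32,
    c: u8,
}

// repr(C): fields are laid out in declaration order with C-style padding.
#[allow(dead_code)]
#[repr(C)]
struct CLayout {
    a: u8,
    b: u32,
    c: u8,
}

/// Size in bytes of the default-layout struct (the compiler's choice).
pub fn default_size() -> usize {
    size_of::<DefaultLayout>()
}

/// Size in bytes of the repr(C) struct (fixed by the C layout rules).
pub fn c_size() -> usize {
    size_of::<CLayout>()
}
```

In practice current compilers tend to pack the default-layout version more tightly than the `repr(C)` one, but that is an observation about today's toolchains rather than a guarantee.
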
Git is the subject of the linked e-mail. Mercurial is the big contender to git that is not written in a C language. Their response to Hg's performance issues was not to use or create some Python feature that allows them to speed up some fast paths, but to use a proper low level language in the first place, which happens to be Rust. I'm not sure you can get more relevant to the discussion than this.

The trend seems to go away from high level languages in the VCS space. Developer time is one of the most expensive resources that FANG pays for, so any kind of investment in performance improvements is going to pay off quite well.

> Is there a term for this phenomenon yet? "if another language is being discussed, Rust must be forced into the discussion, no matter how tenuous the connection"

Rustrusion

This always happens with whatever language is in vogue at the time. Now it’s Rust. It used to be Go (which still has a little juice left). Before that, Clojure and Haskell both had runs. And before that… hell, I remember when Java was talked about this way.

This is the natural order of things and is good.

And the proper term for introducing Rust should be “oxidation”.

Elixir, RoR, and Node.js (and Python a couple of times) spring to mind. Some of those languages have found a niche. But a lot of new languages made older languages nicer, because the older ones adopted their language/framework features.
Arguably, carcinization [0].

[0] https://en.wikipedia.org/wiki/Carcinisation

I’m not a…rustafarian?…but we didn’t get as cross when C# was mentioned above, in a thread about Java and C. In fact it’s the top comment at my time of reading.
Well, I would prefer that people would discuss alternate systems programming languages when "C is fast" comes up.

We could use some perspective from, say, Ada programmers. Unfortunately, none of them ever seem to show up.

> say, Ada programmers.

I stand summoned.

> Unfortunately, none of them ever seem to show up.

We do from time to time, but people assume our language is dead (it isn't). I learned it last year and I've been very impressed by how simple it is, given the speed you get with it.

It was a "big language" at the time, but now it's a language smaller than Rust or C++ which offers good performance with straightforward syntax. Ada also has a package manager now which includes toolchain install.

Ada has inline assembly, easy usage of compiler intrinsics, dead-simple binding to C, built-in multi-tasking (which includes CPU pinning), a good standard library, RAII, and real honest-to-goodness built-in, not-null-terminated strings. It's a compiled language, so you get good speed in general, but the built-in concurrency really does help work which can be split up. Ada 202x is getting even finer grained parallelism (parallel for-loops) in the language itself to even further help this.

- https://alire.ada.dev/

- https://learn.adacore.com/

- https://github.com/pyjarrett/programming-with-ada

- https://en.wikibooks.org/wiki/Ada_Programming

> but people assume our language is dead

And/or a lot of misconceptions. I showed up many times as well with those links, and explanations and whatnot.

I recommend https://blog.adacore.com/, too. Ada/SPARK is great when you want formal verification, and your checks to be done by GNATprove; statically, instead of dynamically. FWIW, you can disable runtime checks in Ada.

I also commented https://docs.adacore.com/live/wave/spark2014/html/spark2014_... not too long ago. The whole documentation is useful anyway. You can prove the absence of memory leaks, among a lot of other stuff!

> And/or a lot of misconceptions.

I've tried too. I have an article about some of these:

- https://pyjarrett.github.io/programming-with-ada/clearing-th...

I've heard all sorts of things about ADA. The main thing keeping me from delving in has been the lack of general info about it. Thank you for the links! I'll be taking a look through these. What kinds of projects are people building in ADA these days? I'm interested in it primarily for robotics.
I use Ada as my alternative to C, when I don't feel like doing C++.

I've written a few tools for myself, including a command line code discover tool for large code bases (tens of millions of lines). There's a bunch of embedded work being done with it.

Make sure you use "Ada" rather than "ADA". Some people might give you trouble about it--it's not an acronym, just a name :)

Ada is a bit verbose for my tastes. Nim [1] is fast like C - I have yet to really find anything rewritten in Nim to be slower. It's safe-ish like Rust (there is an easily identifiable subset of unsafe constructs). It's kind of like Ada, but with Lisp-like syntax macros/meta programming and Python-like block indentation (Lisp folks always said they "read by indentation" anyway). Nim also has user definable operators and many other features. Compile times are very small while the stdlib is big-ish.

Small sample statistics, but three or four times now I have re-written Rust in Nim and the Nim ran faster. Once you can do inline assembly/intrinsics in a PL, most "real world" benchmarks reduce to a measure of dev patience/time/energy not the language. They also become "multi-language" solutions (if you count SIMD asm as a language which I think one should). Even slow Python allows C/Cython modules which in the real world are absolutely fair game, and you can call SIMD intrinsics from Cython pretty easily, too. Since we have few ways to quantify dev patience/attention objectively, these "my PL is faster than yours" discussions are usually pretty pointless.

[1] https://nim-lang.org/

They don't show up because they're not out evangelizing at every opportunity they get.
And perhaps that's why other languages are more popular?

Akin's Laws of Spacecraft Design are appropriate here:

> 20. A bad design with a good presentation is doomed eventually. A good design with a bad presentation is doomed immediately.

The old term for .NET/Java was "Managed" languages. "Managed C++", "C# is a managed language", because they all manage your memory for you.

Rust's primary language feature - the borrow checker - is about adding compile-time checks on resource management (mainly memory), and the original article talks about boxed vs. value types being a major source of inefficiency.

So talking about Rust in a comparison of C and Java mentioning memory indirection bottlenecks seems about the most relevant place to discuss it.
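
As a rough illustration of the boxed vs. value distinction, here is a minimal Rust sketch. The `Point` type and function names are hypothetical, not from the article. A slice of `Point` stores the values inline in one contiguous buffer, while a slice of `Box<Point>` stores pointers, so each element access goes through one extra indirection to a separate heap allocation.

```rust
/// A small value type; `Copy` so it can be stored and passed by value.
#[derive(Clone, Copy)]
pub struct Point {
    pub x: f64,
    pub y: f64,
}

/// Sums `x` over points stored inline: one contiguous buffer, no pointer chasing.
pub fn sum_x_inline(points: &[Point]) -> f64 {
    points.iter().map(|p| p.x).sum()
}

/// Sums `x` over individually heap-allocated points: each element is a pointer
/// that must be followed before the field can be read.
pub fn sum_x_boxed(points: &[Box<Point>]) -> f64 {
    points.iter().map(|p| p.x).sum()
}

/// Bytes per element when points are stored by value.
pub fn inline_element_size() -> usize {
    std::mem::size_of::<Point>()
}

/// Bytes per element when points are boxed (just the pointer).
pub fn boxed_element_size() -> usize {
    std::mem::size_of::<Box<Point>>()
}
```

Both functions compute the same result; the difference is only in where the data lives, which is the kind of cost the article attributes to boxed types.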

When people talk about C# and Java, they mostly mean application development. You rarely hear of these languages in systems programming (doable, just rare). Rust is at the C/C++ level when it comes to systems programming; it eliminates a lot of C/C++ issues and yet adds features found in Java, C#, and even Haskell. People just don't know enough about Rust to criticize it, and yet they see it mentioned everywhere. I can understand if some feel a bit "fed up" seeing Rust brought up in a non-Rust thread. But I do agree with you, Rust is very relevant for discussion here.
The article title is "Why is C Faster than Java".

I would expect to see Java, C#, C, C++, and Rust mentioned quite a bit in the threads here. It's all relevant.

Based on the article, the title should be, "Why is C Git faster than JGit." It's literally nothing but that.
I believe not mentioning Rust whenever possible is strictly forbidden as "mean behavior" in the Rust Code of Conduct.
It's actually the opposite. If anything, being evangelical about Rust is heavily discouraged.

The truth is Rust is an amazing language, with its own warts (async, Pin, etc.), but there is pent-up demand for a language that fits its description: a non-manual, non-GC, low-level-oriented language. It's no wonder some projects are switching to Rust.

What exactly is switching to Rust?
Hg, in context of this discussion, but even Dropbox moved some of its software to Rust.
Hype
Can hardly blame people for talking about modern languages in a discussion about obsolete ones.
> Can hardly blame people for talking about modern languages in a discussion about obsolete ones.

The point is that the issue does not involve people discussing "modern languages", just mindlessly shoehorning references to Rust into any discussion involving any application of a language which is not Rust.

I get Rust fanboys are excited about their hobby, but this sort of obsessive "when the only tool you have is a hammer" discussion is very tiring and fruitless, and only conveys a poor image of Rust's community.

So, let me get this straight: We have a thread about a programming language (Java), then it gets compared to another programming language (C#), then it gets compared to a third one (C) and no one bats an eye. But when Rust is mentioned it's because of "fanboys". Yeah, sure.
> So, let me get this straight: We have a thread about a programming language (Java) (...)

No, you really don't. If you read the thread you're commenting on, you'll notice it's about C#.

The very first comment of the thread you're discussing in, and also the top post of this discussion, is, and I quote:

> It is quite interesting that most of the problems mentioned don't exist in recent version of C# on .NET Core, considering all the similarities of C# and Java. (...)

And somehow Rust fanboys parachute into the discussion to yet again talk about their hammer handling all nails and nail-like problems.

The thread I'm seeing is a top-level comment about C#, a reply that is on-topic and mentions Rust, and also assembly, Python, Hg, "scripting languages", and wasm.

Rust is exactly as relevant here as any of those other items, but people are getting really upset about the Rust mention.

I think in a discussion that already started by comparing different performance characteristics in different languages in a VCS, it's not at all out of line to bring up the fact that another VCS is being rewritten into any particular language. It seems to me that the anti-Rust sentiment is far more disruptive and off-topic here than the mention of Rust in the first place was.

> But it's not going to be fun compared to a language that has zero cost abstractions

C# has them. For instance, interfaces used as generic type constraints are zero cost.

Another thing, some C# abstractions are very low cost. Critically for this thread, the Span<T> abstraction is low cost, pretty much the same thing as a pointer+length in C. It's easy to design an abstraction which uses spans of bytes backed by a memory-mapped file, and the performance is going to be pretty similar to C.
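
For readers who want to see the pointer+length idea in code, here is a minimal sketch in Rust rather than C#, using a byte slice (`&[u8]`) as a rough analogue of a span of bytes. The helper name `read_u32_le` is hypothetical. The slice could be backed by a vector, an array, or a memory-mapped file; the function does not need to know which.

```rust
/// Hypothetical helper: reads a little-endian u32 at `offset` from a byte view.
/// Returns None if the requested range does not fit inside the view.
pub fn read_u32_le(bytes: &[u8], offset: usize) -> Option<u32> {
    // checked_add guards against overflow for very large offsets.
    let end = offset.checked_add(4)?;
    // `get` performs the bounds check against the slice length.
    let chunk = bytes.get(offset..end)?;
    let mut buf = [0u8; 4];
    buf.copy_from_slice(chunk);
    Some(u32::from_le_bytes(buf))
}

/// Size of a `&[u8]` measured in machine words. On current compilers this is 2:
/// one word for the data pointer and one for the length.
pub fn slice_ref_size_in_words() -> usize {
    std::mem::size_of::<&[u8]>() / std::mem::size_of::<usize>()
}
```

The view itself carries no allocation and no ownership, which is why this kind of abstraction can stay close to raw pointer arithmetic in cost while keeping bounds checks.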

> C# has them. For instance, interfaces used as generic type constraints are zero cost.

Depends on what we mean by 'zero cost'. For instance, interface constraints themselves may not have a 'cost', but there are many cases where this means that the calls involving that generic type will be virtual (unless you're doing fun patterns like 'where TComparer : struct, IEqualityComparer<T>'). If you poke around at the internals of System.Linq you'll see there's a lot of checking to use specialized types depending on the collection in order to minimize costs.
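
The same static vs. virtual trade-off can be sketched in Rust, as an analogy only; this is not C# and says nothing about what the .NET JIT actually emits. The `Comparer` trait and the function names are hypothetical. A generic bound is compiled separately for each concrete type, so the call can be resolved at compile time, while a trait object goes through a vtable at run time.

```rust
/// Hypothetical comparer abstraction, loosely analogous to an equality comparer interface.
pub trait Comparer {
    fn equal(&self, a: u32, b: u32) -> bool;
}

/// A zero-sized comparer that uses plain equality.
pub struct Exact;

impl Comparer for Exact {
    fn equal(&self, a: u32, b: u32) -> bool {
        a == b
    }
}

/// Static dispatch: a separate copy is compiled for each concrete `C`,
/// so the call to `equal` is known at compile time and can be inlined.
pub fn count_static<C: Comparer>(cmp: &C, xs: &[u32], needle: u32) -> usize {
    xs.iter().filter(|&&x| cmp.equal(x, needle)).count()
}

/// Dynamic dispatch: one compiled copy; each call to `equal` goes through a vtable.
pub fn count_dynamic(cmp: &dyn Comparer, xs: &[u32], needle: u32) -> usize {
    xs.iter().filter(|&&x| cmp.equal(x, needle)).count()
}
```

Both versions return the same counts; whether the difference matters depends on how hot the call site is.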

And that's what you'll see a lot of in the .NET Standard bits; even in the past we've had some fairly low cost abstractions in places. SocketAsyncEventArgs, if a little arcane at first, is a good design for its time, and System.Linq.Expressions has been a great way for users to minimize the cost of things like reflection without having to write bytecode.

That said, some abstractions are deceptively costly; the 'new' generic constraint is definitely not zero cost, unless that got fixed in 6.0.

> unless you're doing fun patterns like 'where TComparer : struct, IEqualityComparer<T>'

These fun patterns are precisely the generic type constraints I mentioned in my comment. I do use them when performance matters, here’s an open-source example: https://github.com/Const-me/Vrmac/blob/1.2/Vrmac/Draw/Main/I... That code is from a 2D vector graphics library, and these interface methods may be called at 10 kHz frequency or more. Displays are often 60 Hz, and the methods are called a couple of times for every vector path being rendered.

> If you poke around at the internals of System.Linq you'll see there's a lot of checking to use specialized types depending on the collection in order to minimize costs.

Linq is awesome, but I’m pretty sure it was designed for usability first, performance second. I tend to avoid Linq (and dynamic memory allocations in general; delegates are using the heap) on performance-critical paths. YMMV but in most of the code I write, these performance-critical paths are taking way under 50% of my code bases.

> 'new' generic constraint is definitely not zero cost

If you mean the overhead of Activator.CreateInstance<T> when generic code calls new() with the generic type, I’m not 100% certain but I think it’s fixed now. According to https://source.dot.net/, that standard library method is marked with [Intrinsic] attribute, the runtime and JIT probably have optimizations for value types.

C doesn't have inline assembly; it is a common language extension.

An ISO C certified compiler isn't required to support it.

You should read ISO/IEC 9899:2011 J.5.10 "The asm keyword". It's the same section in the C18 standard. It's the bit describing the way an ISO C certified compiler shall provide inline assembly.
I am fully aware of it, it clearly specifies that it is implementation specific.

Two C certified compilers for the same platform are free to provide completely different behaviours for what asm is supposed to do.

Anyone that cares about compilers does actually read ISO documents.

The comparison is against Java because it has certain feature parity with C#. And it is right, C# code can be brought closer to C level of performance with less effort than in Java.