Hacker News
by otabdeveloper4 1638 days ago
There's no reason to program in C# if you already know C++.
3 comments

Other than memory safety, simplicity, dependency management and build tooling, the .NET standard libraries, the open source library ecosystem, and so on...
All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us?
> Other than memory safety, simplicity, dependency management and build tooling, the .NET standard libraries, the open source library ecosystem

Not feeding the troll, but outside memory safety, everything else you list exists in the C++ ecosystem, generally with better alternatives than in C#.

And as soon as you touch mmap, unsafe code or native code in C#, you lose memory safety too anyway.

> Not feeding the troll, but outside memory safety, everything else you list exists in the C++ ecosystem, generally with better alternatives than in C#.

Oh? I don't think you can dispute that the C++ standard library is very limited and the state of dependency management / build tooling is very poor. And that actually limits the usability of the open source library ecosystem quite a lot; maybe there are more C++ libraries out there, but you can't just type what you want into the NuGet search bar and get on with using it.

Simplicity is in the eye of the beholder, but the very weak semantics of C++ templates mean you can't reason compositionally about C++ code, whereas in C# it's relatively easy to have a codebase that you can reliably understand piecemeal.

> And as soon as you touch mmap, unsafe code or native code in C#, you lose memory safety too anyway.

In principle yes, but if you keep those points very rare then you can subject them to extra review etc. at a level that would be impractical with a C++ codebase (where even "a + b" is undefined behaviour in the general case). Memory safety vulnerabilities in real-world C# codebases are rare.
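To make the "a + b" remark concrete: signed integer overflow is undefined behaviour in C++, so adding two `int` values is only well-defined when the result fits. Below is a minimal sketch of a checked addition. `checked_add` is a hypothetical helper name, not a standard library function.

```cpp
#include <limits>

// Hypothetical helper: stores a + b in `out` and returns true, or returns
// false (leaving `out` untouched) if the mathematical result does not fit
// in an int. The range check happens before the addition, so the
// overflowing (undefined) operation is never executed.
bool checked_add(int a, int b, int& out) {
    const int kMax = std::numeric_limits<int>::max();
    const int kMin = std::numeric_limits<int>::min();
    if (b > 0 && a > kMax - b) return false;  // would overflow upward
    if (b < 0 && a < kMin - b) return false;  // would overflow downward
    out = a + b;
    return true;
}
```

A caller would write something like `int sum; if (checked_add(x, y, sum)) { ... }`. The point is only that plain `x + y` carries a precondition the language does not check for you.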

>simplicity

I haven't spent a lot of hours in the C++ world, but it never felt simple

- N compilers, N package managers, N ways to do everything

> I haven't spent a lot of hours in the C++ world, but it never felt simple

C++ is not simple.

But presenting C# (or Java) as "simple" is equally hypocritical. The JVM or the CLR and their associated frameworks are monsters of complexity, engineering and legacy that require close to an entire lifetime to be mastered entirely.

C# (or Java) are "accessible", meaning a newbie developer can produce something halfway baked in these languages relatively quickly.

And this is something you cannot say about C++.

But they are not in any way "simple".

I don't think you're talking about the same thing.

Just because the JVM or CLR is complex *doesn't* mean that writing good C# / Java requires proficiency at the CLR/JVM level, or that it is hard because of that.

>meaning a newbie developer can produce something halfway baked in these languages relatively quickly.

A newbie developer can produce mediocre solutions in all of those: C#, Java, C++.

The difference is that in the C#/Java world it may be slow, and in the C++/C world it may be exploitable (more likely) <snark>.

Anyway, in my world very often it's not about internals, but about modeling skills, about OOP, testability. Those are some of the ways of measuring how good the code is.

Good system modeling skills are way above technology.

How exactly are they not simple? Well, not C#, because it has a bit of a feature-creep problem similar to C++, but Java is a really tiny language compared to... anything.

And you don’t have to be a master of the JVM because chances are you are not a gcc/clang maintainer and yet you can write performant-enough correct code.

N ways to do something, but in exchange you can get good solutions in C++. In the C# world you are locked into a mediocre compiler, with a mediocre package manager, a substandard (and complicated!) build system and an unacceptable code formatter, for example.

>with a mediocre package manager

What do you mean? Since .NET Core it has always worked flawlessly for me.

>unacceptable code formatter

Hmm? That's a preference, not an argument.

By package manager I mean NuGet. The last time I used .NET (one year ago), ".NET Core" was a target platform and had already been renamed to ".NET".

No, that's not a preference. I'm not complaining about a lack of options; I really don't care how code looks, as long as it all looks the same. And it fails at that. It quite often simply takes the code as it is and indents it a little bit. Clang-format (and rustfmt and dart format and plenty of others) gives you the nice, tidy and homogeneous code layout I expect from an auto-formatter.

"everything else you list exists in the C++ ecosystem, generally with better alternatives than in C#."

That seems a little questionable. Maybe "sometimes better"?

The other people you work with may not. That's the #1 reason to choose a language, despite my own knowledge of C++ :D

Memory safety? Garbage collection? Library ecosystem?

RAII is a kind of garbage collection, too.

Longer-lived objects still need (trickier) manual deallocation.

> RAII is a kind of garbage collection, too.

No it isn't. With RAII, you can look where an object gets constructed and know exactly where it will be destructed. With garbage collection, you can't, and in fact there's no guarantee that it ever will be. Also, with garbage collection, you can save references to whatever you like, wherever you like, for as long as you like. With RAII, you need to make sure you don't create any dangling references or use any dangling pointers.
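A small sketch of the "you know exactly where it will be destructed" claim. `Tracer` and `run_scopes` are hypothetical names made up for this illustration. Each object appends to a log when it is constructed and destroyed, so the order can be inspected afterwards.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical type that logs its own construction and destruction.
class Tracer {
public:
    Tracer(std::string name, std::vector<std::string>& log)
        : name_(std::move(name)), log_(log) {
        log_.push_back("ctor " + name_);
    }
    ~Tracer() { log_.push_back("dtor " + name_); }

    // Non-copyable, so each Tracer logs exactly one ctor/dtor pair.
    Tracer(const Tracer&) = delete;
    Tracer& operator=(const Tracer&) = delete;

private:
    std::string name_;
    std::vector<std::string>& log_;
};

std::vector<std::string> run_scopes() {
    std::vector<std::string> log;
    {
        Tracer outer("outer", log);
        {
            Tracer inner("inner", log);
        }  // inner is destroyed here, at the closing brace
        log.push_back("between");
    }  // outer is destroyed here
    return log;
}
```

The returned log reads "ctor outer", "ctor inner", "dtor inner", "between", "dtor outer": destruction is tied to the closing brace of the scope, not to a later collector run.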

No, with RAII you still need to design your program around who owns each object, and thus who should clean it up. You end up with borrowing, move semantics and others. With (Tracing/Copying) Garbage Collection, none of this exists.

Not to mention, Copying GC also solves memory fragmentation, which C++ still suffers from unless you also design your allocations carefully around sizes of types.

> No, with RAII you still need to design your program around who owns each object, and thus who should clean it up

With or without RAII you should design your program around who owns each object, unless you want to end up with an unmaintainable mess leaking file descriptors, network sockets, native memory buffers or trying to access resources after closing them. Which is why Cassandra and Netty implement their own reference counting.

> Not to mention, Copying GC also solves memory fragmentation

Not really. It only moves the problem elsewhere so it doesn't look like fragmentation. Compacting GC needs additional memory to have room to allocate from, and that amount of memory is substantial unless you want to do more GC than any useful work. Also it is not free from fragmentation most of the time - the heap is defragmented only at the moment right after compaction. As soon as your program logically frees a memory region (by dropping a path to it), you have temporary fragmentation until the next GC cycle, because that region is not available for allocation immediately. And there is internal fragmentation caused by object headers needed to store marking flags for GC - which can consume a huge amount of memory if your data is divided into tiny chunks.

> which C++ still suffers from unless you also design your allocations carefully around sizes of types

Modern allocators split allocations into size buckets automatically.
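As a rough illustration of size bucketing (a toy sketch, not how any particular allocator is implemented): requests are rounded up to a small set of size classes, so a freed block can be reused by any later request in the same class. `size_class` and the power-of-two scheme with a 16-byte minimum are assumptions made for the example; production allocators generally use finer-grained classes.

```cpp
#include <cstddef>
#include <limits>

// Hypothetical size-class function: rounds a request up to the next power
// of two, with a 16-byte minimum. Requests too large to round up without
// overflowing std::size_t are returned unchanged (treated as their own class).
std::size_t size_class(std::size_t request) {
    const std::size_t limit = std::numeric_limits<std::size_t>::max() / 2;
    if (request > limit) return request;
    std::size_t bucket = 16;
    while (bucket < request) bucket *= 2;
    return bucket;
}
```

With this scheme a 17-byte and a 30-byte request both land in the 32-byte class, which trades some internal waste for freed blocks that are interchangeable within a class.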

> Compacting GC needs additional memory to have room to allocate from, and that amount of memory is substantial unless you want to do more GC than any useful work.

Not in the case of a mark-compact collector, which works entirely in place, or a mark-region collector such as Immix [0], which only copies a small fraction of the heap.

> Also it is not free from fragmentation most of the time - the heap is defragmented only at the moment right after compaction.

An improvement would be to perform more frequent "partial" collections, such as in the Train algorithm [1]. But some collectors (such as Immix again) avoid compaction until fragmentation is considered bad enough, which seems like a fair compromise.

> And there is internal fragmentation caused by object headers needed to store marking flags for GC - which can consume a huge amount of memory if your data is divided into tiny chunks.

The description of Doug Lea's allocator [2] suggests there are also "object headers" of a sort on allocated data in dlmalloc. You could probably steal mark bits from those headers, but it is common to use a separate marking bit/bytemap, kept apart from the space where objects are allocated, which thus has none of the fragmentation you describe.

[0] https://www.cs.utexas.edu/users/speedway/DaCapo/papers/immix...

[1] https://beta.cs.au.dk/Papers/Train/train.html

[2] http://gee.cs.oswego.edu/dl/html/malloc.html
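A minimal sketch of the separate mark bitmap idea, under assumed parameters: the heap is treated as a range of byte offsets divided into fixed-size granules, with one mark bit per granule kept in a side table. `MarkBitmap` and its methods are hypothetical names, and there is no bounds checking.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical side table: one mark bit per fixed-size heap granule, stored
// apart from the objects themselves. Offsets are assumed to lie inside the
// heap; nothing here validates them.
class MarkBitmap {
public:
    MarkBitmap(std::size_t heap_bytes, std::size_t granule_bytes)
        : granule_(granule_bytes),
          bits_((heap_bytes / granule_bytes + 63) / 64, std::uint64_t{0}) {}

    // Sets the mark bit of the granule containing heap_offset.
    void mark(std::size_t heap_offset) {
        const std::size_t i = heap_offset / granule_;
        bits_[i / 64] |= std::uint64_t{1} << (i % 64);
    }

    bool is_marked(std::size_t heap_offset) const {
        const std::size_t i = heap_offset / granule_;
        return ((bits_[i / 64] >> (i % 64)) & 1u) != 0;
    }

    // Resets every mark bit, e.g. before the next marking phase.
    void clear() { bits_.assign(bits_.size(), std::uint64_t{0}); }

    // Space used by the side table itself.
    std::size_t bitmap_bytes() const {
        return bits_.size() * sizeof(std::uint64_t);
    }

private:
    std::size_t granule_;
    std::vector<std::uint64_t> bits_;
};
```

With 16-byte granules, a 1024-byte heap needs 64 bits, so `bitmap_bytes()` reports 8. The table is not free, but its cost is one bit per granule regardless of how many objects there are, and it adds nothing to the objects themselves.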

> Not in the case of a mark-compact collector, which works entirely in place, or a mark-region collector such as Immix [0], which only copies a small fraction of the heap.

The mutator always allocates from a contiguous memory region. It can't allocate from the memory that was logically released, but not yet collected. So it needs more total memory than the amount of live memory in use at any time, unless you have an infinitely fast GC (which you don't have). In order to avoid too frequent GC cycles, or to allow it to run in the background, you need to make that additional amount of memory substantial.

JVM GCs typically try to keep low GC overhead (within single %), which often results in crazy high memory use, like 10x the size of the live memory set.
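A back-of-the-envelope model of that trade-off, with loudly stated assumptions: each collection does work proportional to the live set, and between two collections the program can allocate only the free part of the heap (heap size minus live size). Constant factors are ignored, so only the shape of the curve is meaningful. `gc_work_per_allocated_byte` is a hypothetical name.

```cpp
#include <cassert>

// Toy model (an assumption, not a measurement): one GC cycle costs
// `live_bytes` units of work and frees up `heap_bytes - live_bytes` bytes
// for new allocation, so the amortized GC work per allocated byte is
// live / (heap - live).
double gc_work_per_allocated_byte(double live_bytes, double heap_bytes) {
    assert(live_bytes >= 0.0 && heap_bytes > live_bytes);
    return live_bytes / (heap_bytes - live_bytes);
}
```

In this model a heap of 2x the live set costs one unit of GC work per allocated byte, while a heap of 10x the live set costs about 0.11: low overhead is bought with a lot of spare memory.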

> but it is common to use a separate marking bit/bytemap

Sure, you can place it wherever you wish, but it still requires additional space.

Fortunately, with GC, you can avoid thinking about many small objects you constantly allocate along the way. Most of them will get collected in the next GC run as a young generation going out of function / block scope. Some of them will travel down the call graph and may end up long-living, then eventually collected.

But I agree: for anything that you want to deallocate deterministically, or at least soon enough, you need to track ownership, and care about the lifetimes. Such objects are relatively few, though.

> Most of them will get collected in the next GC run as a young generation going out of function / block scope.

Depends on the use case. Not if you're storing them in a long-lived collection. Also heap allocation is costly, even in languages with fast heap allocation. It is still an order of magnitude slower than stack allocation.

> But I agree: for anything that you want to deallocate deterministically, or at least soon enough, you need to track ownership, and care about the lifetimes

It is not only that. You need ownership not only to determine lifetimes.

You need to know it in order to be able to tell if, having a reference to an object, you're allowed to update it and in what way. Is it the only reference? If it is shared, who also has it and what can it do with it? If I call "foo" on it, will I cause a "problem at a distance" for another shareholder? Being able to answer such questions directly by looking at the code makes it way easier to navigate in a big project written by other people.

In C++ if I can see a simple value or a value wrapped in a unique_ptr, I know that I can update it safely and nothing else holds a reference. If I see a shared_ptr, I can expect it is shared, so I have been warned. The intent is clear. In Rust it is even safer, because the compiler enforces that what I see is really what I get (it is not just relying on conventions).
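A short sketch of how the signature alone signals intent. `Config`, `bump_retries` and `bump_shared` are hypothetical names. As noted, this is a convention in C++: nothing stops someone from keeping a raw pointer or reference to the same object.

```cpp
#include <memory>
#include <utility>

struct Config {
    int retries = 3;
};

// Takes unique_ptr by value: the caller must std::move its pointer in, so
// by convention this function is now the sole owner and can mutate freely.
std::unique_ptr<Config> bump_retries(std::unique_ptr<Config> cfg) {
    cfg->retries += 1;
    return cfg;  // ownership is handed back to the caller
}

// Takes shared_ptr: the signature itself warns that other holders may
// exist, and every one of them will observe this mutation.
void bump_shared(const std::shared_ptr<Config>& cfg) {
    cfg->retries += 1;
}
```

After `auto a = std::make_shared<Config>(); auto b = a; bump_shared(a);`, reading `b->retries` gives 4. That is exactly the "problem at a distance" the `shared_ptr` in the signature is warning about.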

On the flip side, GC-based languages tend to invite a style of coding where reference aliasing is everywhere and there is no clear ownership. I can see a reference to something and I have no idea what kind of reference it is and what I can safely do with it. It is just like a C pointer. I need to rely on code comments which could be wrong (or read a million lines of code).

I meant tracing garbage collection. I'd say that something like 95% of allocations in real-world code can be done straightforwardly with RAII, or could be if the language supported it (and indeed gain maintainability benefits from being forced into an RAII-centric paradigm). But the remaining 5% is a real pain, and distributed over a wide variety of problems in a wide variety of domains. So tracing GC really does make life a lot easier, if you can afford it.

The freedom to reference anything easily from any place is a double-edged sword. I agree it makes 5% of hard issues go away, but on the flip side it makes the other 95% more complex. Tracing GC is a "goto" of memory management. You may argue goto is a good thing because it offers you freedom to jump from anywhere to anywhere and you're not tied to constraints enforced by loops and functions. We all know this is not the case. Similarly, being able to make a reference from anywhere to anywhere leads to programs that are hard to reason about. We should optimize for readability, not the ease of writing.

There is no reason why you could not, in principle, have Rust-style compile-time borrow checking in a managed language.

As an extreme example (that I have occasionally thought about doing though probably won't), you could fork TypeScript and add ownership and lifetime and inherited-mutability annotations to it, and have the compiler enforce single-ownership and shared-xor-mutable except in code that has specifically opted out of this. As with existing features of TypeScript's type system, this wouldn't affect the emitted code at all—heap allocations would still be freed nondeterministically by the tracing GC at runtime, not necessarily at the particular point in the program where they stop being used—but you'd get the maintainability benefits of not allowing unrestricted aliasing.

(Since you wouldn't have destructors, you might need to use linear instead of affine types, to ensure that programmers can't forget to call a resource object's cleanup method when they're done with it. Alternatively, you could require https://github.com/tc39/proposal-explicit-resource-managemen... to be used, once that gets added to JavaScript.)

Of course, if you design a runtime specifically to be targeted by such a language, more becomes possible. See https://without.boats/blog/revisiting-a-smaller-rust/ for one sketch of what this might look like.