by messe 372 days ago
With regard to the casting example, you could always wrap the cast in a function:

    const std = @import("std");

    // Reinterpret x as signed, widen it (sign-extending), then reinterpret as T.
    fn signExtendCast(comptime T: type, x: anytype) T {
        const ST = std.meta.Int(.signed, @bitSizeOf(T));
        const SX = std.meta.Int(.signed, @bitSizeOf(@TypeOf(x)));
        return @bitCast(@as(ST, @as(SX, @bitCast(x))));
    }

    export fn addi8(addr: u16, offset: u8) u16 {
        return addr +% signExtendCast(u16, offset);
    }
This compiles to the same assembly, is reusable, and makes the intent clear.
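A few test cases make it easy to sanity-check the wrapper. This is only a sketch, assuming Zig 0.11 or later (where `@bitCast` takes a single argument). The test name and sample addresses are made up for illustration, and `addi8` drops `export` here just to keep the snippet self-contained under `zig test`.

```zig
const std = @import("std");

// Reinterpret x as signed, widen it (sign-extending), then reinterpret as T.
fn signExtendCast(comptime T: type, x: anytype) T {
    const ST = std.meta.Int(.signed, @bitSizeOf(T));
    const SX = std.meta.Int(.signed, @bitSizeOf(@TypeOf(x)));
    return @bitCast(@as(ST, @as(SX, @bitCast(x))));
}

fn addi8(addr: u16, offset: u8) u16 {
    return addr +% signExtendCast(u16, offset);
}

test "offset byte is treated as a signed 8-bit value" {
    // 0x01 is +1, 0xFF is -1, 0x80 is -128.
    try std.testing.expectEqual(@as(u16, 0x1001), addi8(0x1000, 0x01));
    try std.testing.expectEqual(@as(u16, 0x0FFF), addi8(0x1000, 0xFF));
    try std.testing.expectEqual(@as(u16, 0x0F80), addi8(0x1000, 0x80));
    // Wrapping addition: 0 + (-1) wraps around to 0xFFFF.
    try std.testing.expectEqual(@as(u16, 0xFFFF), addi8(0x0000, 0xFF));
}
```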

Yes, that's a good solution for this 'extreme' example. But in other cases I think the compiler should make better use of the available information to reduce 'redundant casting' when narrowing (like the fact that the result of `a & 15` is guaranteed to fit into a u4, etc.). But I know that the Zig team is aware of those issues, so I'm hopeful that this stuff will improve :)
This is something I used to agree with, but implicit narrowing is dangerous, enough so that I'd rather be more explicit most of the time nowadays.

The core problem is that you're changing the semantics of that integer as you change types, and if that happens automatically then the compiler can't protect you from typos, vibe-coded defects, or any of the other ways kids are generating almost-correct code nowadays. You can mitigate that with other coding patterns (like requiring type parameters in any potentially unsafe arithmetic helper functions and banning builtins which aren't wrapped that way), but under the swiss cheese model of error handling it still massively increases your risky surface area.

The issue is more obvious on the input side of that expression and with a different mask. E.g.:

  const a: u64 = 42314;
  const even_mask: u4 = 0b0101;
  _ = a & even_mask;
Should `a` be lowered to a u4 for the computation, or `even_mask` promoted, or, however we handle the internals, should the result sometimes be lowered to a u4? Arguably not. The mask is designed to extract even bit indices, but we're definitely going to only extract the low bits. The only safe instance of implicit conversion in this pattern is when you intend to only extract the low bits for some purpose.

What if `even_mask` is instead a comptime_int? You still have the same issue. That was a poor use of comptime ints since now that implicit conversion will always happen, and you lost your compiler errors when you misuse that constant.
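To make the hazard concrete, here is a rough sketch of what current Zig does with those operands, assuming a recent release where peer type resolution widens the narrower integer. `full_even_mask` and `mask_ci` are hypothetical names added for comparison; they are not from the snippet above.

```zig
const std = @import("std");

test "a narrow even-bit mask silently extracts only the low bits" {
    const a: u64 = 42314; // 0xA54A
    const even_mask: u4 = 0b0101;
    const full_even_mask: u64 = 0x5555_5555_5555_5555; // hypothetical intended mask

    // The u4 mask is widened to u64, so the result type stays u64.
    const r = a & even_mask;
    try std.testing.expect(@TypeOf(r) == u64);

    // Only bits 0 and 2 are tested; the low nibble of 0xA54A is 0b1010.
    try std.testing.expectEqual(@as(u64, 0), r);

    // The full-width mask keeps every even-indexed bit: 0x0540.
    try std.testing.expectEqual(@as(u64, 0x0540), a & full_even_mask);

    // A comptime_int mask coerces to u64 just as quietly.
    const mask_ci = 0b0101;
    try std.testing.expect(@TypeOf(a & mask_ci) == u64);
}
```

Neither spelling of the mask produces a compile error, which is the point: the types alone can't tell the compiler that the mask was meant to cover all 64 bits.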

Back to your proposal of something that should always be safe: implicitly lowering `a & 15` to a u4. The danger is in using it outside its intended context, and given that we're working with primitive integers you'll likely have a lot of functions floating around capable of handling the result incorrectly, so you really want to at least use the _right_ integer type to have a little type safety for the problem.

For a concrete example, code like that (able to be implicitly lowered because of information obvious to the compiler) is often used in fixed-point libraries. The fixed-point library though does those sorts of operations with the express purpose of having zeroed bits in a wide type to be able to execute operations without loss of precision (the choice of what to do for the final coalescing of those operations when precision is lost being a meaningful design choice, but it's irrelevant right this second). If you're about to do any nontrivial arithmetic on the result of that masking, you don't want to accidentally put it in a helper function with a u4 argument, but with implicit lowering that's something that has no guardrails. It requires the programmer to make zero mistakes.
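A toy version of that failure mode, as a sketch only: `fracWide` is a hypothetical helper standing in for a fixed-point library's masking step, and the 4-bit fraction is an assumption chosen to match the `a & 15` example. `std.math.mul` is used so the overflow shows up as an error value instead of a panic.

```zig
const std = @import("std");

// Hypothetical fixed-point helper: keeps the low 4 bits (the fractional
// part) but stays in the wide type so later arithmetic has headroom.
fn fracWide(x: u16) u16 {
    return x & 0xF;
}

test "masked value needs the wide type for follow-up arithmetic" {
    const raw: u16 = 0x2F;
    const wide = fracWide(raw); // 15, still a u16

    // In the wide type, 15 * 3 = 45 fits comfortably.
    try std.testing.expectEqual(@as(u16, 45), wide * 3);

    // Narrowing is explicit today; once narrowed, the same multiply overflows.
    const narrow: u4 = @intCast(wide);
    try std.testing.expectError(error.Overflow, std.math.mul(u4, narrow, 3));
}
```

With implicit lowering, the `@intCast` line would disappear and nothing at the call site would hint that the headroom was gone.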

That example might seem a little contrived, and this isn't something you'll run into every day, but every nontrivial project I've worked on has had _something_ like that, where implicit narrowing is extremely dangerous and also extremely easy to accidentally do.

What about the verbosity? IMO the point of verbosity is to draw your attention to code that you should be paying attention to. If you're in a module where implicit casting would be totally fine, then make a local helper function with a short name to do the thing you want. Having an unsafe thing be noisy by default feels about right though.
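Such a local helper can be tiny. A minimal sketch, where `lo4` is a hypothetical name and the single-argument `@truncate` assumes Zig 0.11 or later:

```zig
const std = @import("std");

// Hypothetical module-local helper: a short name wrapping one explicit
// narrowing, so call sites stay quiet but the intent is still greppable.
fn lo4(x: u64) u4 {
    return @truncate(x);
}

test "lo4 keeps only the low nibble" {
    try std.testing.expectEqual(@as(u4, 0xA), lo4(42314)); // 42314 == 0xA54A
    try std.testing.expectEqual(@as(u4, 0xF), lo4(0xFF));
}
```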

you could give the wrapper function a funny name like @"sign-cast" to force the eye to be drawn to it.
Yeah but what is up with all that "." and "@"? Yes, I know what they are used for, but it is noise for me (i.e. "annotation noise"). This is why I do not use Zig. Zig is more like a lighter C++, not a C replacement, IMO.

I agree with everything flohofwoe said, especially this: "C is clearly too sloppy in many corners, but Zig might (currently) swing the pendulum a bit too far into the opposite direction and require too much 'annotation noise', especially when it comes to explicit integer casting in math expressions".

Seems like I will keep using Odin and give C3 a try (still have yet to!).

Edit: I quite dislike that the downvote is used for "I disagree, I love Zig". sighs. Look at any Zig projects, it is full of annotation noise. I would not want to work with a language like that. You might, that is cool. Good for you.

Despite all bashes that I do at C, I would be happy if during the last 40 years we had gotten at least fat pointers, official string and array vocabulary types (instead of everyone getting their own like SDS and glib), namespaces instead of mylib_something, proper enums (like enum class in C++, enums in C# and so forth), fixing the pointer decay from array to &array[0], less UB.

While Zig fixes some of these issues, the amount of @ feels like being back in Objective-C land, and yeah, there are too many uses of dots and stars.

Then again, I am one of those that actually enjoys using C++, despite all its warts and the ways of WG21 nowadays.

I also dislike the approach with source code only libraries and how importing them feels like being back in JavaScript CommonJS land.

Odin and C3 look interesting, the issue is always what is going to be the killer project, that makes reaching for those alternatives unavoidable.

I might not be a language XYZ cheerleader, but occasionally I do have to just get my hands dirty and do the needful for a happy customer, regardless of my point of view on XYZ.

> Yeah but what is up with all that "." and "@"

"." = the "namespace" (in this case an enum) is implied, i.e. the compiler can derive it from the function signature / type.

"@" = a language built-in.
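As a small illustration of both (a sketch, assuming a Zig version where `std.meta.Int` and `std.builtin.Signedness` exist; the test name is made up), the `.signed` from the earlier snippet is shorthand for the fully qualified enum value, and `@bitSizeOf` is a compiler built-in and not a library function:

```zig
const std = @import("std");

test "dot infers the enum type, @ marks a built-in" {
    // `.signed` is resolved against the parameter type, std.builtin.Signedness.
    const Short = std.meta.Int(.signed, 8);
    const Long = std.meta.Int(std.builtin.Signedness.signed, 8);
    try std.testing.expect(Short == Long);
    try std.testing.expect(Short == i8);

    // `@bitSizeOf` is provided by the compiler itself.
    try std.testing.expectEqual(@as(u16, 16), @as(u16, @bitSizeOf(u16)));
}
```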

I know what these are, but they are noise to me.
It's not annotation noise however, it's syntax noise.
Thanks for the correction. Is it really not "annotation"? What makes the difference?
You're not providing extra information to the compiler or clarifying the intent, but merely following the requirements of the language when writing . to infer the type or @ to use a built-in function.
C++'s `::` vs Zig's `.`

C++'s `__builtin_` (or arguably `_`/`__`) vs Zig's `@`

I hate C++, too.
It is waaaaaaay less noisy than c++

C syntax may look simpler but reading zig is more comfy bc there is less to think about than c due to explicit allocator.

There is no hidden magic with zig. Only ugly parts. With c/c++ you can hide so much complexity in a dangerous way

FWIW: I hate C++, too.
You might try out Nim. It has a low annotation noise level. The python like syntax feels odd for a systems language at first but since it’s statically typed it works well. The simpler syntax seems to work very well with LLMs too.

Basically it’s a better C/C++ for me and sits between C and C++ in complexity.

I tried Zig for a while years ago but found the casts and other annotations frustrating. And at the time the language was pretty unstable in how it applied those rules. Plus I’ve never found a use for custom allocators, even on embedded.

the line noise is really ~only there for dangerous stuff, where slowing down a little bit (both reading and writing) is probably a good idea.

as for the dots, if you use zig quite a bit you'll see that dot usage is incredibly consistent, and not having the dots will feel wrong, not just in an "I'm used to it"/stockholm syndrome sense, but you will feel for example that C is wrong for not having them.

for example, the use of dot to signify "anonymous" for a struct literal. why doesn't C have this? the compiler must make a "contentious" choice if something is a block or a literal. by contentious i mean the compiler knows what it's doing but a quick edit might easily make you do something unexpected
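A short sketch of that distinction, with `Point` and `sum` as hypothetical names used only for illustration: in Zig a leading dot always means a literal, and bare braces always mean a block, so neither the reader nor the parser has to guess.

```zig
const std = @import("std");

const Point = struct { x: i32, y: i32 };

fn sum(p: Point) i32 {
    return p.x + p.y;
}

test "dot marks a literal, bare braces mark a block" {
    // `.{ ... }` is a struct literal; its type is inferred from the parameter.
    try std.testing.expectEqual(@as(i32, 3), sum(.{ .x = 1, .y = 2 }));

    // Bare braces are always a block of statements.
    var total: i32 = 0;
    {
        total += 1;
    }
    try std.testing.expectEqual(@as(i32, 1), total);
}
```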