by tialaramex 283 days ago
So as a concrete example:

https://odin.godbolt.org/z/8onn4hxP1

This brief example makes a hash map, then demonstrates that calling a sub-routine which makes its own distinct hash map doesn't change ours. But once we destroy our hash map and call the sub-routine again, we can still use the variable for our (destroyed) hash map (!), and in the cases I saw, doing so reveals the contents of that other hash map from the sub-routine instead.

Now, in C or C++ if you do this that's Undefined Behaviour and the symptoms I saw (and which you're likely to see if you follow the link) are just one of arbitrarily many ways that could manifest.

In Rust of course the equivalent code won't compile because the hash map is gone so we can't just go around using it after that.
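
For illustration, a rough Rust analogue of the linked example might look like the sketch below. It is not a translation of the Odin code: `sub_routine`, `use_after_drop_demo`, and the map contents are made-up names and values. The use-after-destroy line is left commented out because, with it enabled, rustc rejects the program.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the sub-routine in the linked example:
// it builds its own, unrelated map and returns how many entries it holds.
fn sub_routine() -> usize {
    let mut other: HashMap<&str, i32> = HashMap::new();
    other.insert("other", 99);
    other.len()
}

// Returns (entries in our map before it was destroyed,
//          total entries seen across both sub-routine calls).
fn use_after_drop_demo() -> (usize, usize) {
    let mut ours: HashMap<&str, i32> = HashMap::new();
    ours.insert("ours", 1);

    let before = sub_routine(); // does not touch `ours`
    let ours_len = ours.len();

    drop(ours); // `ours` is moved into `drop` and destroyed here

    let after = sub_routine();
    // Uncommenting the next line is rejected at compile time:
    // error[E0382]: borrow of moved value: `ours`
    // println!("{:?}", ours.get("ours"));

    (ours_len, before + after)
}

fn main() {
    let (ours_len, sub_total) = use_after_drop_demo();
    println!("ours held {ours_len} entry; sub-routine maps held {sub_total} in total");
}
```

The point of the sketch is only that the destroyed variable stops being usable as far as the compiler is concerned, so there is no later read whose result would need defining.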

And in Odin, well, as Ginger Bill has explained, Odin does not have Undefined Behaviour, so... this has behaviour which, er, Odin has not defined? Does that make you feel warm and tingly, or do you feel like Bill just wasted time arguing semantics?


Well, sure. If that’s what you mean by having UB then it’s trivially true for any language with a C FFI for example, or Rust with unsafe. I guess what I consider UB is when the compiler exploits UB for optimizations, like discarding code that could invoke UB, which as far as I know Odin doesn’t do.
> I guess what I consider UB is when the compiler exploits UB for optimizations, like discarding code that could invoke UB, which as far as I know Odin doesn’t do.

This statement is incoherent. UB is undefined behavior, and it existed long before any compiler exploited it and isn't (circularly) defined by whether the Odin compiler exploits it.

Well, the statement is circular, but colloquial speech doesn't have to be coherent. It's my understanding of what people generally mean when they complain about UB in the C standard, and it seems to be what gingerbill means too.

My take on what he is saying is that the Odin compiler won't exploit the fact that some behavior is platform-defined, or only knowable at runtime, in order to do aggressive optimizations.

https://xcancel.com/TheGingerBill/status/1495004577531367425

To point out that use after free is possible in Odin is not really a gotcha unless you really are just arguing semantics. That's by design, just like use after free is possible in C or C++ or Rust too.

> To point out that use after free is possible in Odin is not really a gotcha unless you really are just arguing semantics

In a language with UB, the use after free is UB. Which explains the nonsensical results.

If you're pretty sure this all makes sense, I recommend one tiny tweak to further unsettle you: change either (but not both) of the int types in my example to u8 instead. Now the use after free also results in type confusion. Odin has no idea this isn't the same type, so the machine code generated is for one type but the actual bits are for a different type with a different layout.

Bill's go-to is to blame somebody else, it's the operating system, or even the CPU which should define what happens and so it's not his fault. The thread you linked does this. But for type confusion those are Odin's types, nobody else can define how Odin works, the answer must come from Bill. What is supposed to happen here? Linux didn't define your programming language, Intel didn't define your programming language, this is your fault Bill.

Your understanding is completely wrong. UB means "undefined behavior" -- behavior that is not specified by the language standard or the implementation. UB being exploited by the compiler is a separate issue. Saying that there is no UB is saying that there's no undefined behavior; it is certainly not merely saying that the compiler doesn't exploit it.

I programmed in C for over 30 years and was a member of the C Standards Committee, which originated the language about undefined behavior ... I know what I'm talking about.

> To point out that use after free is possible in Odin is not really a gotcha unless you really are just arguing semantics. That's by design, just like use after free is possible in C or C++ or Rust too.

This completely misses the point and is a failure to understand at every level. Being able to use memory after it has been freed is not by design -- no one intends it, no one wants it. It's undefined behavior, and a program that does it is buggy. It's possible only because it's so hard to detect or prevent. To do so requires escape analysis, lifetime declarations, borrow checking, etc. And no, use after free is not possible in Rust -- not in safe code. It's hard to respond to that statement without being rude, so I will say no more.

Well, first of all, I guess I am wrong! Hey, I'm not Bill, just a user of the language.

A couple of clarifications, though: I did mean unsafe Rust, not the safe subset. No need to get rude!

Second of all, I am of course not under the illusion that Odin prevents use-after-free (and thus, technically, it does allow UB I guess). I just don't think Bill is either. So clearly he doesn't mean UB by the same definition as you do.

_My_ use of UB has always been in the context of what a compiler will do during optimization, and the discussion I've seen in the context of C compilers is that they perform optimizations that remove code or change code in surprising ways because the way the code was written technically resulted in UB. But I'm neither a spec writer nor a compiler author, so I don't really care that much about the actual definition of the term.

Anyway, best of luck in convincing Bill to use the term correctly as well! I won't mention UB when talking about the benefits of Odin in the future. :)

> So clearly he doesn't mean UB by the same definition as you do.

Wrong.

> so I don't really care that much about the actual definition of the term.

Yes, it's evident that you don't care what's true or about being accurate.

> Anyway, best of luck in convincing Bill to use the term correctly as well!

He does use it correctly, but his claims that Odin has no UB are incorrect.

Over and out.

As someone who knows what they're talking about, I'm curious to hear from you why you consider use-after-free undefined behavior rather than unspecified behavior.

Because as far as I know, both undefined behavior and unspecified behavior are behaviors that are specified by neither the language standard nor the implementation. So what's the difference?

A useful example to consider might be an uninitialized integer in C++:

In C++ 98 "int x; foo(x);" is Undefined Behaviour. We said there's an integer named x, then, without initializing x, we passed it as a parameter to the function foo. Evaluating the uninitialized value means literally anything is allowed to happen: the program crashes, deletes all the JPEGs larger than 15 kilobytes, displays "Boo!" on the screen. Anything the program could have done is something it might now do.

In C++ 26 "int x; foo(x);" is merely Erroneous Behaviour. The value of x is Unspecified, but it does have a value. This program might pass any integer to foo -- perhaps your compiler provides a nice compiler setting to choose one, maybe the person building the program picked 814 and so this calls foo(814).

This constraint is an enormous difference. In a sense the Unspecified behaviour is defined, it's just not specific. That variable x will have some integer value, so, maybe zero, or 814, but it can't be a string, or negative infinity, and evaluating it will just do what it would do for its value, whatever that was.

If you don't find that example illuminating enough, try another from a very different language: a data race in Java.

Now, in C++ (or Odin, though of course Ginger Bill will insist otherwise ad nauseam) the data race is UB. Game over, you lose, anything might happen.

But in Java the data race has Unspecified behaviour. Specifically, when the race happens the variable we raced has some definite value, and it's a value it definitely could have had at or before this moment in a sequentially consistent program. For example, say we're counting clowns starting from zero: thread A is counting 800 clowns and thread B is counting 600 clowns, but both threads race on the same counter. Legitimate values we might see at the end include zero all the way through 1400 clowns. -1 isn't possible, 1600 isn't possible, but 1234 is entirely plausible. That's Unspecified behaviour.
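
Safe Rust has no data races, so the clown count can't be reproduced literally there, but the bounded-outcome idea can be sketched with an atomic counter and a deliberately split load and store. This is an analogy under that assumption, not a model of Java's memory model, and `count_clowns` is a hypothetical name. Updates can be lost, so the total may fall short of 1400, but it is always some definite count that never exceeds 1400.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

// Counts `a` clowns on one thread and `b` on another. The increment is
// deliberately split into a separate load and store, so one thread's
// store can overwrite the other's and increments can be lost.
fn count_clowns(a: u32, b: u32) -> u32 {
    let counter = AtomicU32::new(0);
    thread::scope(|s| {
        for n in [a, b] {
            let counter = &counter;
            s.spawn(move || {
                for _ in 0..n {
                    let seen = counter.load(Ordering::Relaxed);
                    counter.store(seen + 1, Ordering::Relaxed);
                }
            });
        }
    });
    // All scoped threads have finished by the time `scope` returns.
    counter.load(Ordering::Relaxed)
}

fn main() {
    let total = count_clowns(800, 600);
    // The exact total varies from run to run, but it stays within bounds.
    assert!(total <= 1400);
    println!("counted {total} clowns");
}
```

Every store writes one more than a value that was actually in the counter, which is why the result cannot overshoot the number of increments.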

This is a simple use after free on the stack. Is that UB?
Yes, of course ... the content of freed memory, on the stack or otherwise, is not defined. (And this is not in fact on the stack.)
1. I'm fairly certain you have to use make to get onto the heap.

2. Odin zeroes out memory when declaring a variable unless you explicitly opt out with `---`. This defines the state of memory when allocated.

Ordinarily you'd be correct that you need the weird make overload, but I had no cause to invoke make. I just told Odin that we don't care and it's fine; check the first line. Whether that feature is a good idea in this language I couldn't say.

Odin's decision to zero initialize local variables isn't relevant here.

Huh. Wasn't aware of that feature. Good to know.

I didn't fully flesh out the point about initializing local variables. What part of your code is undefined? You deleted the memory, and the compiler reused it. Then you re-accessed that same memory. That's just part of working with computers. The initialization comment was meant to say that everything from creating data to releasing it is defined. To be compliant with the Odin compiler spec, it's defined from start to end.

Not OP but:

> What part of your code is undefined?

Using a variable (`some_map` in this case) after `delete`ing it doesn't seem like something languages usually define in their specification. Does Odin define that?

I don't really get the distinction between adding the dynamic-literals feature flag and using unsafe in Rust? Like, if he had called it #+unsafe dynamic-literals, would that have been better?
I'm going to guess that you've never written, and perhaps never read, any unsafe Rust, because you (in common with several Rust critics) seem to be imagining it like a switch you can turn on somehow to disable the safety rules, and that's not what it is at all.
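
As a minimal sketch of that point (the name `unsafe_demo` and the values are invented for illustration): an `unsafe` block permits a few extra operations, such as dereferencing a raw pointer, but ordinary move and borrow checking still applies inside it.

```rust
// Returns 42 by reading through a raw pointer while the Box is alive.
fn unsafe_demo() -> i32 {
    let boxed = Box::new(41);
    let p: *const i32 = &*boxed;

    // SAFETY: `boxed` is still alive here, so `p` points at a valid i32.
    let value = unsafe { *p } + 1;

    drop(boxed); // the Box is moved into `drop` and freed

    // Wrapping the use in `unsafe` does not make this compile; it is
    // still rejected with error[E0382] (value used after being moved):
    // unsafe { println!("{}", *boxed); }

    value
}

fn main() {
    println!("{}", unsafe_demo());
}
```

Reading `*p` after the `drop` would compile, since raw pointers are not tracked, but that is the programmer breaking the contract the `unsafe` block asked them to uphold, not a rule being switched off.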

This Odin feature flag just allows me to write what I meant in Odin, I can write it all out by hand using make and so on without the feature flag, but it's more annoying to spell that all out which presumably was the impetus for this feature flag in the first place. The flag didn't somehow "cause" the unsafety, that's an insane take.

Odin isn't very well documented, so as somebody who was writing Odin just to explain the problem here, the easier option avoids trying to guess which of a dozen undocumented functions with names that may be related to hash maps is the "right" function to do what I meant, I can just write what I meant and turn on the feature flag to acknowledge that it involved allocation.

I can't reply to the reply to this one (guess the thread is getting too deep), but I just wanted to clarify that I'm not a Rust critic. I use Rust, I like Rust, I have no problem with Rust. OK great, moving on...
Isn't there a cheap way to implement a gentle refusal to compile code when it can't be guaranteed to avoid such behavior?
No, definitely not. The solution is something like Rust, with its lifetimes and borrow checker ... not cheap at all.
You mean a borrow checker? That exists, but not for Odin.