| HN Mirror



	by ape4 3637 days ago
	Shouldn't a C `int` be converted to Rust's `isize`? I think that captures the spirit better.

6 comments

SilasX 3637 days ago

Not to look a gift horse in the mouth, but it seems like Corrode misses some other chances to use idiomatic Rust:

1. Rust's `fn main` doesn't need to return anything.

2. The arguments to main aren't mutated, so Rust doesn't need to declare them as mutable.

3. Ditto for the argument to printf.

Anyone know how easy it is to recognize and code for such cases in the transpiler?

Edit: It looks like they might have opposite design goals [1]: "Corrode aims to produce Rust source code which behaves exactly the same way that the original C source behaved, if the input is free of undefined and implementation-defined behavior. ... If a programmer went to the trouble to put something in, I want it in the translated output; if it's not necessary, we can let the Rust compiler warn about it." (Edit2: cleaned up and numbered)

[1] https://github.com/jameysharp/corrode#design-principles
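To make points 1 and 2 concrete, here is a minimal sketch of the difference being discussed. It is not Corrode's actual output: `c_style_main` and `rust_style_main` are hypothetical names, and the "literal" version only imitates the shape a syntax-directed translation would tend to have (mutable parameters, an integer return value).

```rust
// Literal-style translation (hypothetical): C parameters are mutable locals,
// so a syntax-directed translator keeps `mut` and the `int` return value.
#[allow(unused_mut)] // without this, rustc warns that `mut` is unnecessary
fn c_style_main(mut argc: i32) -> i32 {
    println!("argc = {}", argc);
    0
}

// Hand-written idiomatic version: no `mut`, no return value.
fn rust_style_main(arg_count: usize) {
    println!("argc = {}", arg_count);
}

fn main() {
    let arg_count = std::env::args().count();
    let status = c_style_main(arg_count as i32);
    rust_style_main(arg_count);
    println!("C-style exit status: {}", status);
}
```

Removing the `#[allow(unused_mut)]` line makes the compiler emit the warning the Corrode README alludes to, which is the "let the Rust compiler warn about it" approach.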


kerkeslager 3636 days ago

I think that keeping an exact one-to-one mapping makes this tool a lot more useful. There's no telling what code depends on C idioms that would be broken by using a Rust idiom instead. Generating 100% equivalent code means that programmers can make intelligent decisions about when to switch over to Rust idioms as they continue developing the program.


Retra 3636 days ago

Yeah, once you've got equivalent Rust, the rest is just optimization that should probably be implemented in the Rust compiler. No reason to put that stuff in the niche transpiler.


masklinn 3636 days ago

> Anyone know how easy it is to recognize and code for such cases in the transpiler? Edit: It looks like they might have opposite design goals

Yes, the author has explicitly noted that they want a compiler that is as syntax-directed as possible; semantic changes would go against that grain. In that spirit, idiomatic alterations would be the domain of Rust-land fixers and linters (e.g. `cargo wololo` or `cargo clippy | rustfix`).


SilasX 3636 days ago

So you could chain Corrode with one of those to get a C-to-idiomatic-Rust converter?

FWIW, I googled those; Clippy and rustfix just seemed to be linters that can't detect things like "you're not mutating this so drop `mut`", and I couldn't find wololo.


pshc 3637 days ago

1. A special case could be added for `main`, but it's no big deal.

2. This seems difficult as the C arguments were mutable; the algorithm would have to start doing analysis rather than direct translation.

3. Quite difficult to "know" that this printf doesn't write to its arguments, especially since the printf is manually declared.


SilasX 3633 days ago

Regarding 1., if you're still reading, it looks like they discuss what they'd have to do to move `main` to its correct Rust type:

https://github.com/jameysharp/corrode/issues/20


cesarb 3637 days ago

No, most real-world C code will expect a C `int` to be 32 bits, while `isize` is often 64 bits.

On the other hand, at least for Unix systems `long` is often equivalent to Rust's `isize`: 32 bits for 32-bit architectures, and 64 bits for 64-bit architectures, so it would make sense to convert `long` to `isize`.
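The widths can be checked directly on a given target. The sketch below uses the C type aliases from `std::os::raw`; the helper names (`c_int_bits` and so on) are made up for illustration, and the results depend on the platform the program is compiled for.

```rust
use std::mem::size_of;
use std::os::raw::{c_int, c_long};

/// Width in bits of the platform's C `int`, as seen from Rust.
fn c_int_bits() -> usize {
    size_of::<c_int>() * 8
}

/// Width in bits of the platform's C `long`.
fn c_long_bits() -> usize {
    size_of::<c_long>() * 8
}

/// Width in bits of Rust's pointer-sized signed integer.
fn isize_bits() -> usize {
    size_of::<isize>() * 8
}

fn main() {
    println!("c_int : {} bits", c_int_bits());
    println!("c_long: {} bits", c_long_bits());
    println!("isize : {} bits", isize_bits());
    // Typically true on Unix-like targets, false on 64-bit Windows.
    println!("long matches isize: {}", c_long_bits() == isize_bits());
}
```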


dbaupp 3637 days ago

They're different types. isize is ssize_t (well, intptr_t), in that it is tied to the size of the address space, while C's int is not constrained. In fact, it is usually 32 bits, even on 64-bit architectures, where isize is 64 bits.
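A small sketch of what "tied to the size of the address space" means in practice: an address round-trips through `isize`, but an arbitrary `isize` value need not fit in a 32-bit C `int`. The helper names here are hypothetical.

```rust
use std::convert::TryFrom;

/// Returns true if `value` can be stored in a 32-bit C-style `int` without loss.
fn fits_in_c_int(value: isize) -> bool {
    i32::try_from(value).is_ok()
}

/// Converts a pointer to `isize` and back, reporting whether it survived.
fn pointer_round_trips(p: *const i32) -> bool {
    let addr = p as isize; // like casting to intptr_t in C
    let q = addr as *const i32;
    p == q
}

fn main() {
    let x = 7i32;
    println!("pointer round-trips via isize: {}", pointer_round_trips(&x));
    println!("1000 fits in a 32-bit int: {}", fits_in_c_int(1000));
    // False on 64-bit targets, true on 32-bit ones.
    println!("isize::MAX fits in a 32-bit int: {}", fits_in_c_int(isize::MAX));
}
```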


wahern 3636 days ago

Wow. So I did some sleuthing and apparently in Rust the maximum size of an object must fit in `isize`, not `usize`. That means on 32-bit architectures you can't have arrays larger than 2 GB, whereas on Linux and similar systems 32-bit processes have access to 3 GB and even the full 4 GB of address space. It actually matters for things like mmap'ing files.
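The limit can be illustrated as follows. This is a sketch, assuming a recent Rust release in which `std::alloc::Layout::from_size_align` rejects sizes above `isize::MAX` (older releases checked differently); `max_object_bytes_for` and `layout_accepts` are hypothetical helper names.

```rust
use std::alloc::Layout;

/// Largest object size in bytes when sizes must fit in a signed integer
/// of `pointer_bits` bits (i.e. the value of isize::MAX on such a target).
fn max_object_bytes_for(pointer_bits: u32) -> u64 {
    (1u64 << (pointer_bits - 1)) - 1
}

/// Whether the standard library accepts a byte-aligned layout of this size.
fn layout_accepts(size: usize) -> bool {
    Layout::from_size_align(size, 1).is_ok()
}

fn main() {
    // 2^31 - 1 bytes: just under 2 GiB on a 32-bit target.
    println!("32-bit limit: {} bytes", max_object_bytes_for(32));
    println!("isize::MAX accepted: {}", layout_accepts(isize::MAX as usize));
    println!(
        "isize::MAX + 1 accepted: {}",
        layout_accepts(isize::MAX as usize + 1)
    );
}
```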

Technically, C's `int` is constrained. C defines a minimum range of values for all the datatypes. The minimum range for `int` is -32767 to +32767; for `long` it is -2147483647 to +2147483647. Though the discerning pedant will claim, ex post, to target something like POSIX (which increases the bound on `int`, defines `char` as 8 bits, etc.) if you point out improper use of `int`.

One irony of criticisms against C is that people argue it's too low level, but that's often because people treat it as too low-level. For example, novice C programmers think of C integer types in terms of bit representations and infer value ranges. Good C programmers think of C integer types in terms of representable values, understand that bit representation (specifically, hardware representation) is almost always irrelevant, and understand how to leverage the unspecified upper bounds on value ranges to improve the longevity and portability of their software.

Languages which emphasize fixed-width integers are, in some sense, a retrogression. The real problem with C integer types is you won't see the folly in poor assumptions until it's too late. Languages like Ada addressed this with explicit ranges. But I guess that was too burdensome. Fixed-width integers are an appeasement of lazy programming. I admit to being lazy and using fixed-width integers in C more than I should, but at least I feel dirty about it.

Many of the compromises Rust makes are clearly informed by the _particular_ experiences of the core team. For example, the fact that most Rust developers are of the belief that malloc failure is not recoverable (a big hold-up in adding catch_unwind) is a reflection of their experience with large desktop software. Desktop software has very complex, interdependent, and less fine-grained transaction-oriented state. Recovering from malloc failure is very hard and of little benefit. Most server software, by contrast, has more natural and consistent transactional characteristics. Logical tasks have less interdependent state, so it's both easier and more beneficial to be able to recover from malloc failure.

I think some of the choices wrt integer types are similarly informed.


Manishearth 3636 days ago

> the fact that most Rust developers are of the belief that malloc failure is not recoverable

This is untrue. The true statement is similar, but has different implications -- malloc failure is usually not recoverable, and nonrecoverable malloc failure should be the default, for the problem space Rust targets (which encompasses more than low-level things). You can recover from malloc in Rust, it just requires some extra work.
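As an illustration of what "some extra work" can look like in current Rust, here is a sketch using `Vec::try_reserve_exact`, a standard-library method that was stabilized well after this thread. `try_zeroed_buffer` is a hypothetical helper, and this is only one possible approach, not the one the comment necessarily had in mind.

```rust
use std::collections::TryReserveError;

/// Allocates a zero-filled buffer, reporting allocation failure as an error
/// instead of aborting the process.
fn try_zeroed_buffer(len: usize) -> Result<Vec<u8>, TryReserveError> {
    let mut buf = Vec::new();
    buf.try_reserve_exact(len)?; // the fallible step
    buf.resize(len, 0); // capacity is already reserved, so this does not reallocate
    Ok(buf)
}

fn main() {
    // An impossible request is reported as an error rather than a crash.
    match try_zeroed_buffer(usize::MAX) {
        Ok(_) => println!("unexpectedly succeeded"),
        Err(e) => println!("allocation failed, recovering: {}", e),
    }

    // Fall back to a smaller request and carry on.
    let buf = try_zeroed_buffer(1024).expect("small allocation should succeed");
    println!("fallback buffer of {} bytes", buf.len());
}
```

Ordinary `Vec::with_capacity` or `push` keep the default behavior of aborting on allocation failure, which matches the "nonrecoverable by default" point above.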


DanWaterworth 3637 days ago

I'm in no way connected to the project. Perhaps you should file an issue.


pcwalton 3637 days ago

Only for ILP64 ABIs, which aren't common.


nemaar 3637 days ago

On some architectures `int` is 32 bits while `isize` is actually 64 bits, so no, that translation is definitely not the ideal one.
