| HN Mirror



	by DaGardner 3636 days ago
	Like many "transpilers" / compilers, whatever you might name them, it lacks example input and output. I want to see what my new Rust code base looks like: does it compile with some heuristics, or does it just map C 1:1 to Rust primitives?

2 comments

DanWaterworth 3636 days ago

Here you go. It didn't like my stdio.h. Apparently enums and unions aren't supported, but:

    extern int printf(char *, ...);

    int main(int argc, char argv[]) {
        printf("Hello, world!\n");
        return 0;
    }

Was turned into:

    extern {
        fn printf(arg1 : *mut u8, ...) -> i32;
    }
    #[no_mangle]
    pub unsafe fn main(mut argc : i32, mut argv : *mut u8) -> i32 {
        printf(b"Hello, world!\n\0".as_ptr() as (*mut u8));
        0i32
    }

edit: Also worth noting, it removes all comments. I believe this to be a limitation of language-c [1]

[1] https://hackage.haskell.org/package/language-c


Animats 3636 days ago

It transliterates C to Rust all right, but the Rust isn't any safer than the C that goes in. Note the representation of a null-terminated string: it's an unsafe pointer to a byte. That's what it was in C, transliterated unsafely to Rust. Some safe Rust representation for C arrays is needed.
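As a rough illustration of what a checked representation could look like, here is a minimal sketch using the standard library's `std::ffi::CStr`. The helper name `c_literal_to_str` is hypothetical, and nothing here is Corrode output; it only shows that the same byte literal can be validated once instead of being passed around as a raw `*mut u8`.

```rust
use std::ffi::CStr;

/// Borrow a NUL-terminated byte slice as a checked C string and return its
/// contents without the terminator. Returns `None` if the bytes are not a
/// valid C string (missing or interior NUL) or are not valid UTF-8.
/// (`c_literal_to_str` is a hypothetical helper name used for illustration.)
fn c_literal_to_str(bytes: &[u8]) -> Option<&str> {
    // Validation happens here, once; afterwards the value is safe to use.
    let c_str = CStr::from_bytes_with_nul(bytes).ok()?;
    c_str.to_str().ok()
}
```

With this, `c_literal_to_str(b"Hello, world!\n\0")` yields the text without the trailing NUL, while a slice with no terminator yields `None`. When a raw pointer is still needed for an `extern` call, `CStr::as_ptr` provides one.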

From the description of how it translates a FOR loop, it does so by compiling it down to the primitive operations and tests. A Rust FOR loop does not emerge. That needs idiom recognition for the common cases including, at least, "for (i=0; i<n; i++) {...}".
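To make the difference concrete, here is a hand-written sketch, not actual Corrode output. The first function has roughly the shape a syntax-directed translation of `for (i = 0; i < n; i++) sum += a[i];` would take; the second is what idiom recognition would ideally produce. Both function names are made up for this example.

```rust
/// Roughly the shape of a direct translation: explicit counter, test, and
/// increment, exactly as the C loop spells them out.
fn sum_transliterated(a: &[i32]) -> i32 {
    let n = a.len();
    let mut sum = 0i32;
    let mut i = 0usize;
    while i < n {
        sum += a[i];
        i += 1;
    }
    sum
}

/// The same computation as an idiomatic Rust `for` loop over the slice,
/// with no index variable and no manual bounds test.
fn sum_idiomatic(a: &[i32]) -> i32 {
    let mut sum = 0i32;
    for x in a {
        sum += *x;
    }
    sum
}
```

Recognizing that the first form can become the second requires proving that `i` is only used as an index and is not modified in the loop body, which is the analysis step a direct translation skips.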

This is a big job, but it's good someone started on it.


lemming 3636 days ago

This is explained in the readme:

A Rust module that exactly captures the semantics of a C source file is a Rust module that doesn't look very much like Rust. ;-) I would like to build a companion tool which rewrites parts of a valid Rust program in ways that have the same result but make use of Rust idioms. I think it should be separate from this tool because I expect it to be useful for other folks, not just users of Corrode. I propose to call that program "idiomatic", and I think it should be written in Rust using the Rust AST from syntex_syntax.


the8472 3636 days ago

Couldn't that be a more general second pass, Rust -> Rust?


ape4 3636 days ago

Shouldn't a C `int` be converted to Rust's `isize`? I think that captures the spirit better.


SilasX 3636 days ago

Not to look a gift horse in the mouth, but it seems like Corrode misses some other chances to use idiomatic Rust:

1. Rust's `fn main` doesn't need to return anything.

2. The arguments to main aren't mutated, so Rust doesn't need to declare them as mutable.

3. Ditto for the argument to printf.

Anyone know how easy it is to recognize and code for such cases in the transpiler?

Edit: It looks like they might have opposite design goals [1]: "Corrode aims to produce Rust source code which behaves exactly the same way that the original C source behaved, if the input is free of undefined and implementation-defined behavior. ... If a programmer went to the trouble to put something in, I want it in the translated output; if it's not necessary, we can let the Rust compiler warn about it." (Edit2: cleaned up and numbered)

[1] https://github.com/jameysharp/corrode#design-principles


kerkeslager 3636 days ago

I think that keeping an exact one-to-one mapping makes this tool a lot more useful. There's no telling what code depends on C idioms that would be broken by using a Rust idiom instead. Generating 100% equivalent code means that programmers can make intelligent decisions about when to switch over to Rust idioms as they continue developing the program.


Retra 3636 days ago

Yeah, once you've got equivalent Rust, the rest is just optimization that should probably be implemented in the Rust compiler. No reason to put that stuff in the niche transpiler.


masklinn 3636 days ago

> Anyone know how easy it is to recognize and code for such cases in the transpiler? Edit: It looks like they might have opposite design goals

Yes, the author has explicitly noted that they want a compiler that is as syntax-directed as possible; semantic changes would go against that grain. In that spirit, idiomatic alterations would be the domain of Rust-land fixers and linters (e.g. `cargo wololo` or `cargo clippy | rustfix`).


SilasX 3636 days ago

So you could chain Corrode with one of those to get a C-to-idiomatic-Rust converter?

FWIW, I googled those; Clippy and rustfix just seemed to be linters that can't detect things like "you're not mutating this so drop `mut`", and I couldn't find wololo.


pshc 3636 days ago

1. A special case could be added for `main`, but it's no big deal.

2. This seems difficult as the C arguments were mutable; the algorithm would have to start doing analysis rather than direct translation.

3. Quite difficult to "know" that this printf doesn't write to its arguments, especially since the printf is manually declared.


SilasX 3633 days ago

Regarding 1., if you're still reading, it looks like they discuss what they'd have to do to move `main` to its correct Rust type:

https://github.com/jameysharp/corrode/issues/20


cesarb 3636 days ago

No, most real-world C code will expect a C `int` to be 32 bits, while `isize` is often 64 bits.

On the other hand, at least for Unix systems `long` is often equivalent to Rust's `isize`: 32 bits for 32-bit architectures, and 64 bits for 64-bit architectures, so it would make sense to convert `long` to `isize`.


dbaupp 3636 days ago

They're different types. isize is ssize_t (well, intptr_t), in that it is tied to the size of the address space, while C's int is not constrained. In fact, it is usually 32 bits, even on 64-bit architectures, where isize is 64 bits.
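The sizes are easy to check on whatever target you compile for. A small sketch, using the platform type aliases `c_int` and `c_long` from `std::os::raw` (the function name `integer_sizes` is just for illustration):

```rust
use std::mem::size_of;
use std::os::raw::{c_int, c_long};

/// Sizes in bytes of (C `int`, C `long`, Rust `isize`) on the target this
/// is compiled for. `isize` always matches the pointer width; the C types
/// follow the platform's C ABI instead.
fn integer_sizes() -> (usize, usize, usize) {
    (size_of::<c_int>(), size_of::<c_long>(), size_of::<isize>())
}
```

On a typical 64-bit Linux target one would expect `int` to be 4 bytes and both `long` and `isize` to be 8, but the values depend on the target ABI, so treat that as an expectation to verify and not a guarantee.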


wahern 3636 days ago

Wow. So I did some sleuthing and apparently in Rust the maximum size of an object must fit in isize, not usize. That means on 32-bit architectures you can't have arrays larger than 2GB, whereas on Linux and similar systems 32-bit processes have access to 3GB and even the full 4GB of address space. It actually matters for things like mmap'ing files.
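The limit can be observed from safe code. The sketch below assumes a current Rust toolchain, since `Vec::try_reserve_exact` was stabilized well after this thread; the helper name `can_reserve` is hypothetical.

```rust
/// Try to reserve `bytes` bytes in a fresh `Vec<u8>` and report whether the
/// request was accepted, without panicking or aborting on failure.
/// (`can_reserve` is a hypothetical helper name used for illustration.)
fn can_reserve(bytes: usize) -> bool {
    let mut buf: Vec<u8> = Vec::new();
    buf.try_reserve_exact(bytes).is_ok()
}
```

A request of `isize::MAX as usize + 1` bytes is rejected as a capacity overflow before the allocator is even consulted, regardless of how much address space the process has.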

Technically, C's int is constrained. C defines a minimum range of values for all the datatypes. The minimum range for int is -32767 to +32767. long is -2147483647 to +2147483647. Though the discerning pedant will claim, ex post, to target something like POSIX (which increases the bound on int, defines char as 8 bits, etc) if you point out improper use of int.

One irony of criticisms against C is that people argue it's too low level, but that's often because people treat it as too low-level. For example, novice C programmers think of C integer types in terms of bit representations and infer value ranges. Good C programmers think of C integer types in terms of representable values, understand that bit representation (specifically, hardware representation) is almost always irrelevant, and understand how to leverage the unspecified upper bounds on value ranges to improve the longevity and portability of their software.

Languages which emphasize fixed-width integers are, in some sense, a retrogression. The real problem with C integer types is you won't see the folly in poor assumptions until it's too late. Languages like Ada addressed this with explicit ranges. But I guess that was too burdensome. Fixed-width integers is an appeasement of lazy programming. I admit to being lazy and using fixed-width integers in C more than I should, but at least I feel dirty about it.

Many of the compromises Rust makes are clearly informed by the _particular_ experiences of the core team. For example, the fact that most Rust developers are of the belief that malloc failure is not recoverable (a big hold-up in adding catch_unwind) is a reflection of their experience with large desktop software. Desktop software has very complex, interdependent, and less fine-grained transaction-oriented state. Recovering from malloc failure is very hard and of little benefit. Most server software, by contrast, has more natural and consistent transactional characteristics. Logical tasks have less interdependent state, so it's both easier and more beneficial to be able to recover from malloc failure.

I think some of the choices wrt integer types are similarly informed.


Manishearth 3636 days ago

> the fact that most Rust developers are of the belief that malloc failure is not recoverable

This is untrue. The true statement is similar, but has different implications -- malloc failure is usually not recoverable, and nonrecoverable malloc failure should be the default, for the problem space Rust targets (which encompasses more than low-level things). You can recover from malloc in Rust, it just requires some extra work.
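For a sense of what that extra work can look like today, here is a minimal sketch built on `Vec::try_reserve_exact`, an API that did not exist in stable Rust when this thread was written. The function names and the fallback policy are assumptions made for the example, not an established pattern from any project discussed here.

```rust
use std::collections::TryReserveError;

/// Allocate a zero-filled buffer of `len` bytes, returning an error instead
/// of aborting if the memory cannot be reserved.
fn zeroed_buffer(len: usize) -> Result<Vec<u8>, TryReserveError> {
    let mut buf: Vec<u8> = Vec::new();
    buf.try_reserve_exact(len)?;
    // Capacity already covers `len`, so this resize does not reallocate.
    buf.resize(len, 0);
    Ok(buf)
}

/// Recover from a failed large allocation by retrying with a smaller size,
/// and fall back to an empty buffer if even that fails.
fn buffer_with_fallback(preferred: usize, fallback: usize) -> Vec<u8> {
    zeroed_buffer(preferred)
        .or_else(|_| zeroed_buffer(fallback))
        .unwrap_or_default()
}
```

For example, `buffer_with_fallback(usize::MAX, 8)` cannot satisfy the first request and returns an 8-byte buffer instead. Note that on systems with memory overcommit a reservation can succeed and still fail later when the pages are touched, so this only covers failures the allocator reports up front.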


DanWaterworth 3636 days ago

I'm in no way connected to the project. Perhaps you should file an issue.


pcwalton 3636 days ago

Only for ILP64 ABIs, which aren't common.


nemaar 3636 days ago

On some architectures `int` is 32 bits while `isize` is actually 64 bits, so no, that translation is definitely not the ideal one.


steveklabnik 3636 days ago

  > Because the project is still in its early phases, it is not yet
  > possible to translate most real C programs or libraries.

It is currently trying to port over semantics exactly, so the Rust code is far from idiomatic Rust. Doesn't mean it's not useful, just saying that it's trying to be 1:1.


moosingin3space 3636 days ago

I guess the next stage would involve translating common non-idiomatic patterns into idiomatic Rust. Looks like this could be a job for a community-managed database!


Manishearth 3636 days ago

On the rust subreddit someone tongue-in-cheek suggested `cargo clippy | rustfix` to be used in conjunction with this tool for better rust code.

But that actually could work! Clippy has a ton of lints that make your code more idiomatic, and rustfix basically takes diagnostic output and applies suggestions (still WIP).

Clippy is geared towards making human-written unidiomatic code better, so it might not catch some silly things in this tool's output, but it certainly could be extended to do that.


moosingin3space 3636 days ago

I haven't used nightly much, what all does Clippy do?


Manishearth 3636 days ago

It tells you about places where you can improve your code. Possible pitfalls, style issues, documentation issues, unidiomatic code, everything.

It's a developer tool, so you can use rustup to switch to nightly to run Clippy (and use stable otherwise) and not impose nightly on the rest of the people who use the project. We have plans for making Clippy a tool that you can fetch via rustup without requiring nightly.


shepmaster 3636 days ago

Check out Clippy online! Go to http://play.integer32.com/, paste in your code, click "Clippy".


masklinn 3636 days ago

It's a linter.


stcredzero 3636 days ago

This is best handled on a per-project or per-organization basis. I would have such a project concentrate on the tooling for maintaining and developing such databases.
