by acqq 4685 days ago
I took a look at one "real" example. I admit I haven't spent much time with the language; I just want to get the idea by looking at code that really does something, which I like more than starting with boring tutorials:

https://github.com/mozilla/rust/blob/master/src/libextra/jso...

I know it's the least interesting thing, but for me the simplest rule to recognize Rust vs Go is that at the moment Go has the nice and minimalistic := and Rust the clumsy (for my aesthetics at least) "let."

Still, I understand that Rust attempts to solve more than Go in some aspects, and I really hope it can succeed. For that, yay Rust!

Can somebody explain to me the reason for constructs like this:

    self.error(~"trailing characters")

vs

    self.parse_ident("ull", Null),

I read that ~ means "owned box on heap." Why should something that is de facto a string literal be owned and on the heap? Isn't the need to allocate and box string literals on the heap something that makes the language unnecessarily slow? And why is there a need to explicitly box something that's known at compile time in some places but not in others?

> I know it's the least interesting thing, but for me the simplest rule to recognize Rust vs Go is that at the moment Go has the nice and minimalistic := and Rust the clumsy (for my aesthetics at least) "let."

It is not possible for us to have ":=" because we don't know whether to parse a pattern or an expression without a prefix token like "let". Rust's pattern language is far more expressive than Go's and one grammar will not cover both.
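To illustrate the pattern-versus-expression point, here is a small sketch in present-day Rust syntax (not the 2013 dialect used in this thread; the `Point` struct and `sum_parts` function are made up for illustration). What follows `let` is a pattern, which can take tuples and structs apart, while the same-looking text in expression position builds a value instead:

```rust
// Hypothetical type, used only for this illustration.
struct Point {
    x: i32,
    y: i32,
}

fn sum_parts() -> i32 {
    // Here `(a, b)` is a *pattern*: it takes the tuple apart.
    let (a, b) = (1, 2);

    // On the right-hand side, `Point { x: 3, y: 4 }` is an *expression*
    // that builds a value...
    let p = Point { x: 3, y: 4 };
    // ...while similar-looking text after `let` is a pattern that
    // destructures it into the variables `x` and `y`.
    let Point { x, y } = p;

    a + b + x + y
}

fn main() {
    println!("{}", sum_parts());
}
```

Without a leading keyword, a parser reading `Point { x, y }` would not know until it reached a later token whether it was looking at a pattern or an expression.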

> Why should something that is de facto a string literal be owned and on the heap? Isn't the need to allocate and box string literals on the heap something that makes the language unnecessarily slow?

Probably the `error` method explicitly asked for a heap-allocated string, perhaps because it wants to pass it to someone who will later free it. (Unlike in C, it would be a type error to attempt to free a constant string in read-only memory, which is what a plain string literal is in Rust.)

Does it mean that to do

    self.error(~"trailing characters")

there has to be a complete allocation of enough memory to store a copy of the whole string, then copying of all the characters there, only in order for the target function to be able to do "free"? If the intention of the language is to "properly" work with a lot of strings (and it should be), it would be good not to have glaring inefficiencies. Wouldn't it be a good optimization to have the free routine check if the pointer is inside the static area and then not do anything? That way you can pass the pointer to the static area, avoiding copying and also avoiding freeing. And the only thing needed is that free has the "static_begin" and "static_end" addresses.

There are also other possibilities: knowing that heap-allocated things have the lowest bits of their addresses set to 0, you can mark things using the lowest bits of the pointers.
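As a rough sketch of the low-bit idea (in present-day Rust, working on plain integers rather than real pointers, with made-up names `tag_static`, `is_static`, and `untag`): if every address involved is at least 2-byte aligned, bit 0 is free to carry a "this is static, do not free" flag. This only shows the bit arithmetic under that alignment assumption; it is not how Rust's allocator or `~` pointers work.

```rust
/// Flag stored in bit 0 of an address (hypothetical scheme).
const STATIC_TAG: usize = 0b1;

/// Marks an address as pointing to static data.
/// Assumes `addr` is at least 2-byte aligned, so bit 0 is unused.
fn tag_static(addr: usize) -> usize {
    debug_assert!(addr & STATIC_TAG == 0, "address must be 2-byte aligned");
    addr | STATIC_TAG
}

/// Returns true if the address carries the "static" flag.
fn is_static(tagged: usize) -> bool {
    tagged & STATIC_TAG != 0
}

/// Clears the flag to recover the original address.
fn untag(tagged: usize) -> usize {
    tagged & !STATIC_TAG
}

fn main() {
    // A heap allocation of a u64: its address is a multiple of the
    // type's alignment, so the lowest bit is zero.
    let boxed = Box::new(0u64);
    let heap_addr = &*boxed as *const u64 as usize;
    assert_eq!(heap_addr % std::mem::align_of::<u64>(), 0);
    assert!(!is_static(heap_addr));

    // A made-up, aligned address standing in for static data.
    let tagged = tag_static(0x1000);
    assert!(is_static(tagged));
    assert_eq!(untag(tagged), 0x1000);
}
```

Note that string literals are not generally guaranteed to be 2-byte aligned, so a real scheme would have to arrange that separately.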

I know, I maybe have too low-level an approach. But why not: as soon as you want to be more convenient than C, you have to think low level too.

Still what I like is that at least at the moment this explicit declaration of boxing gives a nice feeling that nothing is done "behind the back" of the programmer, which is good -- the main problem of C++ is that you can't know if somebody hid something very nasty behind some plain innocent looking construct, or even what the compiler will implicitly do or won't.

EDIT-addition: Regarding "pattern": if := were a single token that is not allowed in patterns, wouldn't it be OK then? Is there a valid sequence in patterns that looks like a combination of those single operators, or something similar?

> If the intention of the language is to "properly" work with a lot of strings (and it should be), it would be good not to have glaring inefficiencies. Wouldn't it be a good optimization to have the free routine check if the pointer is inside the static area and then not do anything?

You can implement that yourself, for example by using an enum or a custom smart pointer. The moment you start adding more magic to "free" beyond "call free", you become less of a low-level language.

Can I ask in the language if the address points to the item constructed by the compiler in the static area?

The thing I miss most in C is the possibility of some kind of "introspection": reaching out to the info that the compiler or linker has to know anyway. As far as I know, the D language is very good for such things.

You can use lifetimes to enforce that a certain thing is static data, and combined with an enum, you get the best of both worlds: compile-time constants require no allocations, yet the type is still flexible enough to allow run-time construction:

  enum StringRef {
      Static(&'static str),
      Owned(~str)
  }
and then `error` would take `StringRef` and be called like:

  self.error(Static("trailing characters"))
  // or
  self.error(Owned(fmt!("%u trailing characters", count)))
(There was even a pull request that added this and the corresponding one for vectors to the stdlib, but it hasn't landed (yet): https://github.com/mozilla/rust/pull/7599)
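For readers on present-day Rust, where `~str` and `fmt!` no longer exist, a rough equivalent of the sketch above can be written with the standard library's `std::borrow::Cow<'static, str>`, which is either a borrowed `&'static str` or an owned `String`. The `ParseError` struct and this `error` function are hypothetical stand-ins, not the actual JSON parser API:

```rust
use std::borrow::Cow;

/// Hypothetical error type; the real parser's type may differ.
#[derive(Debug)]
struct ParseError {
    msg: Cow<'static, str>,
}

/// Accepts either a string literal (no allocation) or an owned `String`.
fn error(msg: impl Into<Cow<'static, str>>) -> ParseError {
    ParseError { msg: msg.into() }
}

fn main() {
    let count = 3;

    // A literal is stored as a borrowed reference to static data.
    let fixed = error("trailing characters");
    // A message built at run time is stored as an owned String.
    let dynamic = error(format!("{} trailing characters", count));

    assert!(matches!(fixed.msg, Cow::Borrowed(_)));
    assert!(matches!(dynamic.msg, Cow::Owned(_)));
    println!("{} / {}", fixed.msg, dynamic.msg);
}
```

When a `ParseError` is dropped, only the owned variant frees memory, which is the "free only if not static" behavior asked about earlier, decided by the enum tag rather than by inspecting the address.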

> Can I ask in the language if the address points to the item constructed by the compiler in the static area?

It'd require some OS-specific magic. But you could probably do it.

I'm new to Rust and I'm not familiar with this code, but here's an attempt at an explanation anyways:

The "error" function returns a struct containing the error string. I suspect the intention is to allow dynamically constructed error messages like "syntax error at line 4, char 3" (although it doesn't seem to be doing this anywhere). Since the string might be dynamically allocated, it has to be freed when the struct is freed, which obviously can't happen if the struct just has a pointer to the literal in static memory. (You could flag the string somehow as "should be freed/should not be freed", as you suggest in your other comment, but this has its own tradeoffs)

If you wanted an "error" function that would take string literals directly (and only string literals) you would do something like this: https://gist.github.com/bct/6300740

The "parse_ident" function doesn't return any portion of the input string, so it can just use a borrowed reference.
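A minimal sketch of that distinction in present-day Rust (the names `Error`, `make_error`, and `starts_with_ident` are hypothetical and only loosely modeled on the parser being discussed): a function that merely inspects a string can take a borrowed `&str`, while a function that stores the message in a returned struct needs something it can own.

```rust
/// Hypothetical error struct that owns its message, so the message is
/// freed together with the struct.
struct Error {
    msg: String,
}

/// Stores the message in the returned value, so it takes ownership.
fn make_error(msg: String) -> Error {
    Error { msg }
}

/// Only reads `ident` during the call and keeps nothing, so a borrow suffices.
fn starts_with_ident(input: &str, ident: &str) -> bool {
    input.starts_with(ident)
}

fn main() {
    // A literal can be passed directly where a borrow is expected.
    assert!(starts_with_ident("ull, 1]", "ull"));

    // The owning version needs an allocation, even for a literal.
    let err = make_error("trailing characters".to_string());
    println!("{}", err.msg);
}
```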

> If you wanted an "error" function that would take string literals directly (and only string literals)

And how about some convenient "either or"?

~str is mutable. error probably doesn't need a mutable string, though. Either it's just an overlooked API (json is an old part of the codebase and doesn't really see a lot of attention), or it's what pcwalton said about passing the string to something down the line.

~str isn't inherently mutable; it's an owned string, and so is mutable if placed in a mutable variable.

The reason it's used here is so that the error messages can be constructed at runtime, e.g. `fmt!("line %u: trailing characters", line_number)`. If it were using &'static str (i.e. a compile-time literal), then one could only use hard-coded error messages.
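In present-day Rust, where `~str` has become `String` and `fmt!` has become `format!`, both points can be sketched roughly as follows (the function `trailing_error` is made up for illustration): mutability comes from the binding rather than the type, and an owned string lets the message be built at run time.

```rust
/// Builds an error message at run time. The result has to be an owned
/// string, because the text does not exist at compile time.
fn trailing_error(line_number: u32) -> String {
    format!("line {}: trailing characters", line_number)
}

fn main() {
    // Immutable binding: calling `msg.push_str(..)` here would not compile.
    let msg = trailing_error(4);

    // Moving the same owned string into a `mut` binding makes it mutable.
    let mut msg = msg;
    msg.push_str(" (ignored)");

    assert_eq!(msg, "line 4: trailing characters (ignored)");
}
```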