Hacker News new | ask | show | jobs
by miki123211 54 days ago
This isn't specific to Rust or Typescript. You can do this in basically any language.

Imagine you have to distinguish between unescaped and escaped strings for security purposes. Even with a dynamically typed language, you can keep escaped strings as an Escaped class, with escape(str)->Escaped and dangerouslyAssumeEscaped(str)->Escaped functions (or static methods). There's a performance cost to this, so that's a tradeoff you have to weigh, but it is possible.

Another way of doing this is Application Hungarian[1], though that relies on the programmer more than it does on the compiler.

[1] https://www.joelonsoftware.com/2005/05/11/making-wrong-code-...

5 comments

> This isn't specific to Rust or Typescript. You can do this in basically any language.

This just isn't true.

In any dynamic language you would not get these guarantees at compile time. You'd get random failures at runtime. That's not safety of any kind.

Also, part of the goal of languages like Haskell is that they help you think about your code before it runs. All of that is lost.

> Imagine you have to distinguish between unescaped and escaped strings for security purposes

That would be a nightmare in many languages. You'd have to rewrite large parts of the code to be compatible with one or both. And in many languages you'd have to duplicate your code entirely.

In other languages, the result would so ugly, you would never want to touch that code. Imagine doing this with say, templates in C++.

>There's a performance cost to this

There is no performance cost in Haskell! This is entirely undone by the compiler.

Also, because the compiler understands what's going on at a much higher level, you can do things like deriving code. You can say that your classified strings behave like your regular strings in most contexts, like say, they're the same for the purpose of printing but not for the purpose of equality, in one line.

> There's a performance cost to this

That part is (de facto) required for dynamically typed languages, but not for statically typed ones where the newtype constructor/deconstructor can be elided at compile time. Rust and C++ especially both do the latter by having true value types available for wrappers that evaporate into zero extra machine code.

But then just this moment I wondered: do any major runtimes using models with no static type info manage to do full newtype elision in the JIT and only box on the deopt path? What about for models with some static type info but no value types, like Java? (Java's model would imply trickiness around mutability, but it might be possible to detect the easy cases still.) I don't remember any, but it could've shown up when I wasn't looking.

> [Performance cost] is (de facto) required for dynamically typed languages, but not for statically typed ones where the newtype constructor/deconstructor can be elided at compile time.

For a single value, it should reliably be free in any reasonable statically typed language that meet your other criteria.

For a collection, it may still be de facto required. Unwrapping a set of addresses into a set of strings takes unnecessary cycles, an unsafe coercion, sufficiently sophisticated affordances around coercion that it can be safe, or a smart enough optimiser. At least some static languages have at least one of these; I'm not sure all do, and certainly not all have always had.

Well, java can do escape analysis, so a wrapper with a single field may end up as a local variable of the embedded field.

As for other JVM languages like Kotlin and Scala, they have basically what "newtype" is, but it can only be completely erased in the byte code when they have a single field.

Escape analysis that sinks a local allocation is great in itself, but for newtypes for things like “trusted HTML vs plain text”, I feel like the primary benefits are deeply interprocedural. The type constraint is encoding a promise that can be carried from one end of the code base to the other, and where you can know for sure when you're writing a module whether you're on one side of a barrier or the other. I would tend to expect this to result in patterns that aren't well-handled by escape analysis.

What I'm imagining for my curiosity about the dynamic case would look more like “JS/Lua/whatever engine detects that in frob(x) calls, x is always shaped like { foo: ‹string› } and its object identity is unused, so it replaces the calling convention for frob internally, then propagates that to any further callers”, and it might do the same thing when storing one of those in fields of other objects of known shapes, etc. until eventually it hits a boundary where the constraint isn't known to hold and has to be ready to materialize the wrapper object there.

Kotlin and Scala sound like they're doing the Rust/C++ thing at the bytecode level, if it's being “erased”, so just the static case again but with different concrete levels for machine vs language.

Java escape analysis is very weak, much weaker than what stack allocation and moving allows in languages like C, C++, Rust.
Depends on which JVM you are taling about.
What you cannot do is compile-time safety guarantees, and in languages like Rust type system isn't strong enough to do some advanced compile-time guarantees (via types). So no, you cannot do this in basically any language (unless you turn it into Haskell).
Rust (and Scala) type systems are somewhat stronger and more expressive in some areas than Haskell. Weaker in some other. But it’s not a clear cut that Haskell type system offers more safety guarantees.

  > But it’s not a clear cut that Haskell type system offers more safety guarantees.
You can log operands in your implementation of Rust's Add trait, or compute running sum of one of the operands. In Haskell it is not possible unless you use unsafePerformIO from System.IO.Unsafe.

Haskell's type system controls effects available to the code. This can be used to implement programs adhering to specific formulas of Linear Temporal Logic or implementing a protocol specification, where operations on any given phase and side are restricted by type system.

I used Haskell's type system to prevent crossing of clock domains in the hardware description eDSL. Also, it was of great help in the CPU simulator description, fixing available commands and resources for different CPU models.

Even the logic of Rust's borrow checker was expressible in Haskell as early as February 2009 - there was HList, there was ParameterizedMonad and that's about what one needs for implementation of borrow checker.

Can you give some examples? What is Haskell’s type system capable of that you can’t express in rust?
Caveat emptor, I'm not a Haskeller, just an admirer. But Haskell let's you run functions over the type system itself very elegantly, so you can have a type that is derived from a type that composes three types that are derived from several functions' declared runtime behavior. This kind of type inference can provide arbitrarily rich information about anything in the entire program, letting you encode more than just range-types etc at compile time.

This comes with some design drawbacks. I think Rust's borrow checker would be implementable but unreasonable in Haskell: Haskell already does lazy-evaluation on types to enable its arbitrary depth of type expressivity. But the borrow checker also wouldn't really make sense for Haskell because the default programming model uses a GC. I think Linear Haskell might be a kind of Rust-in-Haskell, though.

Again, caveat emptor.

I've spent long time writing Haskell and now write Rust professionally, so let me tell you something. The stuff where you really want to "program in types" lives beyond Haskell, in languages like Agda, Idris, some HoTT stuff etc. anyways.

In principle the more developed the type system is – the closer it to not distinct between types and values. Caviat is that its "type inference" gets worse.

So, in those more developed languages, you could have type-level proofs (guarantees) that your calculator produces correct results, as a proof, as theorems. That 2+3 will be 5, not as runtime test assertion, but as a theorem, that no other result is possible no matter what happens. Or that your parser will never fail on valid JSON etc. but nobody guarantees it's going to be a pleasant thing to write, maintain, and debug. And compile times will probably be terrible.

What the parents describe can be done with almost any language.
No you cannot
They described the use of abstract data types. One can certainly use those in most languages - for sure in C.
> runST :: (forall s. ST s a) -> a

> The rank-2 type (that is, the type s is scoped within the parenthesis and can't escape) of runST ensures that the mutable references created inside the computation cannot escape due to being tagged with the type s. Internally, all sorts of imperative nonsense may occur. Externally, the function is pure. The world outside the boundary gets none of the mutation, only the result.

C does not have parametric polymorphism, nor rank-2 quantifications, so no, this cannot be done in C.

This was not the original claim in the thread I responded to which as "even with a dynamically typed language, you can keep escaped strings as an Escaped class, with escape(str)->Escaped and dangerouslyAssumeEscaped(str)->Escaped functions (or static methods)." so this was about abstract data types and not polymorphism.

Regardless, you can also have some limited parametric polymorphism in C with macros. This is very poor, but parametric polymorphism in Rust is based on monomorphization so it is also quite poor. You can also have higher-order polymorphism in C but then you need to use subtype polymorphism.

> You can do this in basically any language.

You can do it in Assembly. That doesn't mean it's cost effective.

And categorically: the issue isn’t what “I’d” do, my habits often match my habits, it’s what other project members will be doing (including future degenerate versions of myself assumed to be some combination of busy, tired, stressed and drunk).

The Confucian philosophy that people act like water coming down a mountain, seeking the path of least resistance comes to play.

Haskell, OCaml, F#, and their ilk can yield beautiful natural domain languages where using the types wrong is cost prohibitive. In languages without those guarantees every developer needs discipline to avoid shortcuts, and review needs increase, and time-pressure discussions rehashed.

Costs are a skill issue ;-)
You demonstrate well the problem: yes anything that is computable can be than in any computation system. That's not what discussions about tooling are about.

If a tool can help enforce some ways of doing things, or if it doesn't constrain people much, that has consequences for the type of work that gets done with them and the systems you encounter running out there that you might be invited or find the need to work with.

"I can do it" is exactly the wrong answer. "How can I guarantee that others will do it" is the point being made.

Does everybody _need_ to do it?
Everybody? Need? Obvious answer is no.

Some people, teams and orgs can benefit from it. "I don't need it" is missing the point. "Not everybody needs it" is missing the same point from a different direction.

> You can do this in basically any language.

Tell me you haven't understood what a type system does without...

A type system mathematically proves, at compile time, what a program can and cannot do. No, you can't "do this in basically any language".

It's ironic that "shift left" is generally considered a good thing', but when you point out that you can shift a significant majority of checks left all the way to compile time, they say "no, not like that!"