by jerf 3368 days ago
You have to model sending a message across the cluster as marshaling into a binary form and unmarshaling it again. I don't mean that you "should" model it that way... you have to model it that way, because that's what is happening. Therefore, when receiving a message, you really only ever get a Maybe Message or Either Message Error or whatever you want to model it as. The act of unmarshaling the message back into the local representation is also when you check whether it conforms to the type restrictions you think it should have.
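
As a rough illustration of that "Either Message Error" shape, here is a minimal Python sketch. The `Ping` message, its fields, and the use of JSON as the wire format are all assumptions made for the example; nothing in the thread specifies them.

```python
import json
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Ping:
    """Hypothetical message type; the fields are illustrative only."""
    sender: str
    seq: int


@dataclass(frozen=True)
class DecodeError:
    reason: str


def decode_ping(raw: bytes) -> Union[Ping, DecodeError]:
    """Unmarshal bytes into a Ping, or report why that was impossible."""
    try:
        obj = json.loads(raw)
    except ValueError as exc:  # covers malformed JSON and undecodable bytes
        return DecodeError(f"not valid JSON: {exc}")
    if not isinstance(obj, dict):
        return DecodeError("expected a JSON object")
    sender, seq = obj.get("sender"), obj.get("seq")
    if not isinstance(sender, str):
        return DecodeError("field 'sender' must be a string")
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(seq, int) or isinstance(seq, bool):
        return DecodeError("field 'seq' must be an integer")
    return Ping(sender=sender, seq=seq)
```

The caller has to branch on the result before it can touch a `Ping`, which is the point: the failure case is part of the receive operation's type rather than an afterthought.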

Because you must already model this as a process that can fail, I don't think it does break the static typing model at all. In fact I routinely "statically type" messages coming from things that were actually emitted by dynamic languages!

What gets tricky is if you try to model this as a process that can't fail. But the problem there isn't static typing, it's a specific instance of the general principle that you cannot build robust systems on the assumption that networks can't fail.

I also think this is an instance of the general misunderstanding about static types, which I understand deeply because I once held it, that static types somehow prevent errors. They don't. What they do is provide a gateway that says "in order to get into this type, you must meet these criteria, and the compiler is going to statically check that you've verified these criteria". A static typing system doesn't force things through that gateway, it forces you to check whether things fit through that gateway, and do something with the things that don't. Then, it also allows you to strictly declare that everything that uses that type is statically checked to be "behind" that gateway, so there are no other ways around it to get in, thus creating a space in which you can count on the fact that the values have been checked for certain properties and you can now write code that counts on those without constantly checking them. A statically typed system faced with the task of, say, parsing a number out of a string, does not prevent a user from sending me a string of "xyz"; it just prevents me from just sending it through the system as-is.
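
The "gateway" idea can be sketched in Python with type hints, assuming an external checker such as mypy is in use. `Quantity`, `parse_quantity`, and `double` are hypothetical names. Python does not enforce `NewType` at runtime, so this relies on the convention that only the parser constructs the type; a language like Haskell could enforce that by hiding the constructor.

```python
from typing import NewType, Optional

# Hypothetical "gateway" type: a checker such as mypy treats Quantity as
# distinct from int. By convention, only parse_quantity constructs it.
Quantity = NewType("Quantity", int)


def parse_quantity(text: str) -> Optional[Quantity]:
    """Return a Quantity if text is a non-negative decimal integer, else None."""
    stripped = text.strip()
    if not (stripped.isascii() and stripped.isdigit()):
        return None
    return Quantity(int(stripped))


def double(q: Quantity) -> int:
    # No re-validation here: the type records that the check already happened.
    return q * 2
```

A user can still send `"xyz"`; `parse_quantity("xyz")` simply yields `None`, and the caller has to decide what to do with it before anything typed as `Quantity` exists.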


> everything that uses that type is statically checked to be "behind" that gateway

In a distributed system, the largest the "gateway" can reliably be is a single node, because you don't get guarantees about the code that other nodes in the system are running. Even the single node case poses difficulties, because I believe in OTP the upgrade path means you have to transfer state during upgrades. What if the types of the state during the upgrade don't exactly match? Can multiple types of a thing exist simultaneously? How are these types versioned? etc... it gets complicated.

> Therefore, when receiving a message, you really only ever get a Maybe Message or Either Message Error or whatever you want to model it as.

Sure, you can receive messages as "Object" and then cast/parse them inside the node. Does that mesh with the vision of what people have when they want to bring static typing to erlang?

---

The hard part about thinking about OTP is not just the message passing, but also the myriad deployment & upgrade & versioning scenarios.

I am a fan of static typing over dynamic typing in everything else, i.e. normal programs... just not _OTP-style_ erlang for distributed systems.

Even thinking about something like a gen_server (http://erlang.org/doc/man/gen_server.html) makes my head hurt... though if someone can figure out a way to do it that's faithful, more power to them.

> Can multiple types of a thing exist simultaneously? How are these types versioned? etc... it gets complicated.

I don't use Erlang, but I have developed an Actor system for C# [1] which is based on its (and Akka's) concepts. Clearly without a static type-checker for the whole distributed system we have to manually get involved and patch the old and new so that we can hot swap processes. Versioning I've found is best done by maintaining the old process that accepts the old message format, maps it to the new one, and then forwards it on to the new process that accepts the new message format. Any other node that is lagging behind will continue to work, and any new one will send to the new address for the process.
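
A minimal Python sketch of that forwarding arrangement, not taken from echo-process: `GreetV1`, `GreetV2`, the `"en"` default, and `make_legacy_process` are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class GreetV1:          # old contract (hypothetical)
    name: str


@dataclass(frozen=True)
class GreetV2:          # new contract (hypothetical): adds a language field
    name: str
    language: str


def upgrade(msg: GreetV1) -> GreetV2:
    # Assumed default: senders on the old format are treated as English.
    return GreetV2(name=msg.name, language="en")


def make_legacy_process(
    forward: Callable[[GreetV2], None],
) -> Callable[[GreetV1], None]:
    """Build the handler that stays at the old address and forwards upgraded messages."""
    def handle(msg: GreetV1) -> None:
        forward(upgrade(msg))
    return handle
```

For instance, `make_legacy_process(inbox.append)` with `inbox = []` gives a handler that, when passed `GreetV1("Ada")`, leaves `GreetV2("Ada", "en")` in the new process's inbox. Lagging nodes keep using the old address and never need to know the new format exists.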

This isn't really rocket science, and if you stick to a few basic rules it tends to work out just fine. That doesn't mean that type safety goes out of the window, it just means that in creating a distributed process you must accept that you can't retire the old contract without it causing potential problems.

Apologies if I'm missing your point about OTP, but ultimately it seems that at some point (as the GP says) you are marshalling a message into a text or binary format, and then unmarshalling. At that point if the unmarshalled static type doesn't match the type that the process expects, then it will be off to the dead-letter queue. I don't really see how that's any different to giving the wrong type to a function in a dynamic language, or using an incorrectly typed variable that is picked up by a compiler in a statically typed language. In each case it's type checking at the earliest possible opportunity.
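
To make the dead-letter behaviour concrete, here is a small Python sketch under stated assumptions: `Process`, `tell`, and the in-memory `dead_letters` list are hypothetical stand-ins, not the echo-process API.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Generic, List, Type, TypeVar

T = TypeVar("T")


@dataclass
class Process(Generic[T]):
    """Hypothetical actor: accepts only messages of one declared type."""
    accepts: Type[T]
    handler: Callable[[T], None]
    dead_letters: List[Any] = field(default_factory=list)

    def tell(self, msg: Any) -> bool:
        # The check runs at the mailbox boundary, before any handler code.
        if isinstance(msg, self.accepts):
            self.handler(msg)
            return True
        self.dead_letters.append(msg)
        return False
```

The handler itself is only ever invoked with a value of the declared type, so the mismatch is caught at the earliest point the receiving node can see it.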

[1] https://github.com/louthy/echo-process

'Sure, you can receive messages as "Object" and then cast/parse them inside the node. Does that mesh with the vision of what people have when they want to bring static typing to erlang?'

No, that's not how you do it. You marshal things directly into the desired types. Check out either aeson for Haskell or how Go does things via either the json modules or the generic Text/Binary Marshaler/Unmarshaler.

"but also the myriad deployment & upgrade & versioning scenarios."

The answer to all of those things is mostly that even a lot of Erlang shops don't use live upgrading. You really have to have a very particular use case for that to be the best solution vs. a rolling upgrade and server restarts. Even if the language is capable of it, it still requires you to write services that can handle being upgraded, and it's much easier to write services that can handle being restarted, especially since you 100% have to write that anyhow because services get restarted anyhow. Most people don't have that use case. Web services certainly don't have that use case.

Once you drop that, it's a lot simpler.

"Even thinking about something like a gen_server"

gen_server is partially as complicated as it is as a side-effect of other decisions in the language. While the concept of a gen_server is a strength in Erlang, the specific implementation of gen_server as this "behavior" thing is mind-blowingly complicated for what you actually get. (It reminds me of Python's "metaclasses". I spent many hours wrapping my head around what that was, but in the end, all that it amounts to is what is now called a class decorator, which is way more sensible. A metaclass isn't a class decorator in theory, but in practice, class decorators are way easier to understand and cover 99.9% of the use cases, if not 100%.) When I implemented supervisor trees in Go, my solution for gen_server/gen_fsm/gen_* was just to... not. Behaviors are just a very, very weird half-object-ish system with a lot of limitations. They are easily replaced by simply having some sort of "interface" system, be it via conventional classes or interfaces. It's why you don't see "behaviors" as Erlang defines them anywhere else. Erlang has a lot to learn from and copy from, but that part isn't it.

After using hot code loading in production for the last 5 years, I don't see why you wouldn't use it, when it's right there. Maybe it's less thought to do a rolling restart, but it's a lot more effort expended by everything in the system to rebuild all the state that was in your processes.

A behavior is simply a list of functions you've declared that your module will export -- and a convention on what they might do. gen_server.erl is going to make lots of callbacks into your code, and rather than pass a huge list of funs, instead we pass the module name, and gen_server calls the exported functions from that module (this style means all the callbacks will hit your new code if you hot load, without you doing anything special; processing type changes is up to you, of course)
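
A loose Python analogy of that arrangement, heavily simplified: the real gen_server callbacks take more arguments and return tagged tuples, and `GenericServer`, `check_behavior`, and `counter` are hypothetical names for this sketch only. The server keeps a reference to the callback module and looks functions up by name on each call.

```python
from types import SimpleNamespace
from typing import Any

REQUIRED_CALLBACKS = ("init", "handle_call")  # the "behavior" in this sketch


def check_behavior(module: Any) -> None:
    """Fail early if the callback module lacks a required function."""
    missing = [name for name in REQUIRED_CALLBACKS
               if not callable(getattr(module, name, None))]
    if missing:
        raise TypeError(f"callback module is missing: {missing}")


class GenericServer:
    """Holds state and a reference to the callback module, not to its functions."""

    def __init__(self, module: Any, arg: Any) -> None:
        check_behavior(module)
        self.module = module
        self.state = module.init(arg)

    def call(self, request: Any) -> Any:
        # Looked up by name on every call, so replaced code is used at once.
        reply, self.state = self.module.handle_call(request, self.state)
        return reply


# Stand-in for a callback module: a running total.
counter = SimpleNamespace(
    init=lambda start: start,
    handle_call=lambda n, total: (total + n, total + n),
)
```

If `counter.handle_call` is reassigned while a `GenericServer(counter, 10)` is alive, the next `call` uses the new function with the existing state, which mirrors the hot-load property described above. Whether the old state still fits the new code is left to the author, as the comment says.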

I know how aeson works, but the details of how it parses text into a HashMap that you can extract fields from into a data structure are somewhat beside the point. I'll grant you that there are static solutions for message passing, sure.

It seems odd to me that they would include a unique feature like live-updating if it shouldn't be used.

I grant that live-updates and gen_servers may be anti-patterns, but my assumption was to consider the effects of static types on OTP and these are part of it.

If you identify some subset of erlang+OTP that is easier in some ways, great, I'm all for it.

I am just pointing out some complexities without making assumptions about what should be included or discarded. (I do not know what erlang shops do in the small or in the large.)

Perhaps what we want then is static types for "OTP-Lite"

I think even live-updating could be statically typed. Basically the live-update is a collection of functions that map every data type in the old process into the corresponding type in the new process. In the dynamically-typed case, these functions are just the identity. In the statically-typed case, if the new type has a new attribute, your mapping function has to define a reasonable default value. If you can't do that, your dynamic live-update would have gone badly anyway.

The main problem with pushing a typechecked live-upgrade in one shot is that you'll need to put a big lock around the distributed system. (A non-upgraded node messaging an upgraded one would be fine, because the upgraded one knows the conversion function, but what happens in the reverse scenario?)

It could be done without a big lock by splitting into three steps:

1) Push an upgrade that changes the types and adds the conversion functions. The valid type is the union of the old type and the new type. Wait until all nodes complete the upgrade.

2) Push an upgrade that instructs the nodes to convert their data and start using the new types by default. Wait until all nodes complete the upgrade.

3) Push an upgrade that removes the old types and conversion functions.
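
The three steps above can be sketched in Python as follows. `CounterV1`, `CounterV2`, the default `step=1`, and the `use_new_types` flag are hypothetical, and the downgrade function `to_v1` is an assumption about what the "conversion functions" would need to include while some peers still run old code.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class CounterV1:        # old state shape (hypothetical)
    count: int


@dataclass(frozen=True)
class CounterV2:        # new state shape (hypothetical): adds a step size
    count: int
    step: int


# Step 1: every node accepts the union and knows the conversions.
AnyCounter = Union[CounterV1, CounterV2]


def to_v2(state: AnyCounter) -> CounterV2:
    if isinstance(state, CounterV2):
        return state
    # Assumed default for the new attribute.
    return CounterV2(count=state.count, step=1)


def to_v1(state: AnyCounter) -> CounterV1:
    # Needed while some peers have not upgraded yet.
    return CounterV1(count=state.count)


def outgoing(state: AnyCounter, use_new_types: bool) -> AnyCounter:
    # Step 2 flips use_new_types once all nodes have finished step 1.
    return to_v2(state) if use_new_types else to_v1(state)
```

Step 3 then amounts to deleting `CounterV1`, `to_v1`, and the flag once no node sends the old shape any more.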

Why is this a limitation only of type-checked functions?

The problem is whether the function can or cannot handle the content of the new message, and that's completely orthogonal.

> [...] etc... it gets complicated

Which is exactly why we want to employ static types: in order to catch the difficulties in implementing it correctly. We describe the complications in the type system, through a model that captures them, to allow compiler errors -- rather than runtime errors -- to guide us in implementing it correctly.

Types only hinder getting an invalid program to compile -- which is exactly what we want.

In general, sure - but this post is about erlang/OTP, and the way you're speaking in generalities makes me think that you're just trying to persuade me about and champion the value of static types in general.

To digress slightly, consider an example from another domain, although I would rather keep this discussion about erlang. Now, Haskell is the only well-known language that has lazy (non-strict) semantics. Over the years many folks have proposed to make Haskell strict by default, alleviating some of the headaches that occur from non-strict evaluation. However appealing that may be, it would be a sad day if that occurred, because we'd lose the only language for understanding how lazy/non-strict evaluation affects how we design programs, while there are countless strict languages, and lazy/non-strict evaluation has some very nice properties indeed.

Now to bring this back to erlang/OTP... sure, it is very nice when we add static types to erlang because we get all the nice things that static types provide, but we also lose some things. There are some features in erlang/OTP that are very dynamic, and forcing a static type system simply kills those features. I think that would be a sad day for erlang, because you'd lose the ability to design distributed systems utilizing the full range of behaviors that the erlang/OTP system offers. There are already other actor systems in the world that offer static typing. You don't need erlang to build those systems. There is only one erlang/OTP, and it has some very unique features that none of the others have.

Say, if we're talking about javascript, which runs at the level of a program on a single machine, I say bring on the types. If we have some other statically-typed actor system that works well for certain use cases, great. If we're talking about erlang/OTP, which is designed specially for fully distributed systems, I say let it be.