Hacker News
by chris_armstrong 471 days ago
OCaml

The compiler is very fast, even over large codebases.

Mostly trying to bring AWS tooling to the platform[1], or experimenting with cross-compilation[2] using another lesser-known systems language, Zig.

[1] https://github.com/chris-armstrong/smaws/ [2] https://github.com/chris-armstrong/opam-cross-lambda

7 comments

I've used a lot of programming languages and the kind of groove you can get into with OCaml is hard to match. You can just dive into an enormous, unfamiliar codebase and make changes to it with so much more confidence. But while it's reasonably fast, it's also higher level than Rust so you don't have to struggle quite so much with forms like `Arc<Mutex<HashMap<String, Box<dyn Processor + Send + Sync>>>>` everywhere.

Re: AWS tooling, have you seen https://github.com/solvuu/awsm ?

It generates code for all 300+ AWS services and produces both Async and Lwt forms. Should be fairly extensible to Eio.

I worked on this. Let me know if you want to tag team.

I want to like OCaml but OPAM is just so bad... and tooling is super important (it's one of the reasons Go is popular at all). Windows support is also an afterthought. There's no native debugger as far as I can tell. This is before you even get to the language, which definitely has its own big flaws (e.g. the lack of native 64-bit integers that MrMacCall mentioned).

The syntax is also not very friendly IMO. It's a shame because it has a lot of great ideas and a nice type system without getting all monad in your face. I think with better tooling and friendlier syntax it could have been a lot more popular. Too late for that though; it's going to stay consigned to Jane Street and maybe some compilers. Everyone else will use Rust and deal with the much worse compile time.

> The syntax is also not very friendly IMO.

Very true. There's an alternate syntax for OCaml called "ReasonML" that looks much more, uh, reasonable: https://reasonml.github.io/

The OCaml syntax was discussed a long time ago between the developers and the whole community, and the agreement was that the community is happy with the current/original syntax. ReasonML was created for developers more familiar with JavaScript, but it was not very successful in attracting new developers, as they usually look more at the semantics of the language along with the syntax (and that is where OCaml's type system shines). Strictly speaking, there is a long list of ML family languages that share many properties of OCaml's syntax, and what counts as a ‘reasonable’ syntax is open to debate. JavaScript and Python were not mainstream languages when OCaml was developed, and it made much more sense to create a syntax in line with the ML family of powerful languages available at the time. Once you have programmed a bit in OCaml, syntax is not a problem; learning to program in a functional paradigm and getting the most out of it is the real challenge.
> (e.g. the lack of native 64-bit integers that MrMacCall mentioned.)

They exist. I think you just mean that `int` is 63-bit and you need to use the operators specialized for `Int64.t` to get the full precision.

How can you access the full 64 bits if "one bit is reserved for the OCaml runtime"? (The link is in my original post's thread.)
The usual int type is 63 bits. You can get a full 64 bit int, it just isn't the default.
The docs say, "one bit is reserved for the OCaml runtime", so doesn't that mean that one of the bits (likely the high bit) is unavailable for the programmer's use?

I mean, I understand "reserved" to mean either "you can't depend upon it if you use it", or "it will break the runtime if you use it".

So the "one bit" you refer to is what makes the standard int 63 bits rather than 64. If you could do things with it, it would indeed break the runtime: that bit is what tells the runtime that you're working with an int rather than a pointer. But full, real 64-bit integers are available in the base language, and the same goes for 32-bit.
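A minimal sketch of that difference, using only the standard library. The names `int_overflowed` and `widened` are made up for illustration, and the concrete numbers printed depend on whether the build is 64-bit or 32-bit:

```ocaml
(* Plain int: one bit goes to the runtime tag, so overflow happens one bit
   early (at 2^62 on a 64-bit build). *)
let int_overflowed = max_int + 1            (* wraps around to min_int *)

(* Int64: widen the same value first, then increment it with Int64
   operations. Int64 literals carry an L suffix, e.g. 1L. *)
let widened = Int64.add (Int64.of_int max_int) 1L

let () =
  Printf.printf "int:   max_int + 1 = %d\n" int_overflowed;
  Printf.printf "int64: max_int + 1 = %Ld\n" widened
```

The plain `int` result wraps to a negative number, while the `Int64` result stays positive because the extra bit is available there.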
Why is opam bad? Compared to what? Could you elaborate?
1. I've found it to be extremely buggy, often in confusing ways. E.g. there was a bug where it couldn't find `curl` if you were in more than 32 Linux groups.

2. It has some kind of pinning system that is completely incomprehensible. For example you can do `opam install .`, which works fine, and then `git switch some_other_branch; opam install .` and it will actually still install the old branch?? Honestly I've never figured out what on earth it's trying to do but me and my colleagues have had constant issues with it.

> Compared to what?

Compared to good tooling like Cargo and Go and NPM and uv (if you give it some slack for having to deal with Python).

It's better than Pip, but that doesn't take much.

In my case I have not found opam buggy at all, and I have never found it confusing, though that last point may be personal taste. The bug you mention is something I have never experienced with opam on Linux or macOS, and I am sure that if you report it, the developers will look into it.

As for your point 2, I don't understand the issue. There is an opam switch, which works perfectly for me, no issues at all. Like any other tool, it is better to read the manual to understand how it works.

Cargo and opam are not really comparable; the next generation of dune probably could be, but at this moment it makes no sense to compare two utilities that are so different. Comparing it with pip, the Julia package manager, etc. is fine. Personally, I like opam more than npm and pip.

Interesting, thanks. I have been using opam, but since I am all alone and by myself, I never hit the cases you mentioned.
>The syntax is also not very friendly IMO.

Why do you think that the syntax is not very friendly?

Not saying you are wrong, just interested to know.

Have you tried esy?
I've read some part of the book Real World OCaml, by Yaron Minsky and Anil Madhavapeddy.

https://dev.realworldocaml.org/

I also saw this book OCaml from the Very Beginning by John Whitington.

https://ocaml-book.com/

I have not read that one yet. But I know about the author, from having come across his PDF tools written in OCaml, called CamlPDF, earlier.

https://github.com/johnwhitington/camlpdf

>CamlPDF is an OCaml library for reading, writing and modifying PDF files. It is the basis of the "CPDF" command line tool and C/C++/Java/Python/.NET/JavaScript API, which is available at http://www.coherentpdf.com/.

My problem with OCaml is just that there is no stepping debugger for VScode. I'd use it except for that.
Yes

Symbolic debuggers seem to be going out of fashion.

It's my understanding that OCaml does not allow its programs to specify the size and signedness of its ints, so no 16-bit unsigned, 32-bit signed, etc...

Being a huge fan of F# v2 who has ditched all MS products, I didn't think OCaml was able to be systems-level because its integer vars can't be precisely specified.

I'd love to know if I'm wrong about this. Anyone?

You’re wrong, not sure where you got that conception, but the int32/64 distinction is in the core language, with numerous libraries (e.g. stdint, integers) providing the full spectrum.
Thanks. They're not in the basic-data-types, but you are correct, they are available in the stdint module, which has a pub date from Oct 19, 2022. It can be found here:

> https://opam.ocaml.org/packages/stdint/

It's been a while since I investigated OCaml, so I guess this is a recent addition and is obviously not a part of the standard integer data types (and, therefore, the standard language), that not only have no signedness, and only have Int32 and Int64, but have "one bit is reserved for OCaml's runtime operation".

The stdint package also depends on Jane Street's "Dune", which they call a "Fast, portable, and opinionated build system". I don't need or want any of its capabilities.

As well, the issues page for stdint has a ton of open issues that are more than a year old, so, as I understood, OCaml does not, like F#, have all sizes and signedness of ints available in its fundamental language. Such a language is simply not a good fit for system-level programming, where bit-banging is essential. Such low-level int handling is simply not a part of the language, however much it may be able to be bolted on.

I just want to install a programming language, with its base compiler and libraries and preferably with man pages, open some files in vi, compile, correct, and run. That is my requirement for a "systems-level" language.

I would never in my life consider OCaml with opam and Dune for building systems-level software. I wish it could, but it's not copacetic for the task, whose sole purpose is to produce clean, simple, understandable binaries.

Thanks for helping me understand the situation.

> which has a pub date from Oct 19, 2022

I think you're misinterpreting this. That's just the date the most recent version of the library was published. The library is something like 15 years old.

> the standard integer data types (and, therefore, the standard language), that not only have no signedness

I'm not sure what you mean by this - they're signed integers. Maybe you just mean that there aren't unsigned ints in the stdlib?
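On the unsigned question: the stdlib has no separate unsigned integer types, but, as far as I know, since OCaml 4.08 the `Int32` and `Int64` modules expose unsigned operations over the same bit patterns. A minimal sketch, with variable names invented for illustration:

```ocaml
(* The bit pattern with all 64 bits set: -1 when read as signed,
   2^64 - 1 when read as unsigned. *)
let all_ones = -1L

(* The %Lu conversion prints an int64 as an unsigned decimal. *)
let as_unsigned_decimal = Printf.sprintf "%Lu" all_ones

(* Unsigned comparison: all_ones is the largest value, not a negative one. *)
let is_larger = Int64.unsigned_compare all_ones 1L > 0

(* Unsigned division: (2^64 - 1) / 2 = 2^63 - 1. *)
let half = Int64.unsigned_div all_ones 2L

(* Same idea at 32 bits. unsigned_to_int returns None when the value does
   not fit in a native int, which can only happen on 32-bit builds. *)
let u32 = Int32.unsigned_to_int (-1l)

let () =
  Printf.printf "unsigned view of -1L: %s\n" as_unsigned_decimal;
  Printf.printf "larger than 1 (unsigned): %b\n" is_larger;
  Printf.printf "half: %Ld\n" half
```

This covers comparison, division and printing; packages such as stdint add distinct unsigned types on top if the type-level distinction matters.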

> and only have Int32 and Int64, but have "one bit is reserved for OCaml's runtime operation".

The "one bit is reserved" is only true for the `int` type (which varies in size depending on the runtime between 31 and 63 bits). Int32 and Int64 really are normal 32- and 64-bit ints. The trade-off is that they're boxed (although IIRC there is work being done to unbox them) so you pay some extra indirection to use them.

> The stdint package also depends on Jane Street's "Dune", which they call a "Fast, portable, and opinionated build system". I don't need or want any of its capabilities.

Most packages are moving this way. Building OCaml without a proper build system is a massive pain and completely inscrutable to most people; Dune is a clear step forward. You're free to write custom makefiles all the time for your own code, but most people avoid that.

> The library is something like 15 years old.

It's not clear from the docs, but, yeah, I suspected that might be the case. Thanks.

> I'm not sure what you mean by this - they're signed integers. Maybe you just mean that there aren't unsigned ints in the stdlib?

Yes, that's what I mean. And doesn't that mean that it's fully unsuitable for systems programming, as this entire topic is focused on?

> The "one bit is reserved" is only true for the `int` type (which varies in size depending on the runtime between 31 and 63 bits).

I don't get it. What is it reserved for then, if the int size is determined when the runtime is built? How can that possibly affect the runtime use of ints? Or is any build of an OCaml program able to target (at compile-time) either 32- or 64-bit targets, or does it mean that an OCaml program build result is always a single format that will adapt at runtime to being in either environment?

Once again, I don't see how any of this is suitable for systems programming. Knowing one's runtime details is intrinsic at design-time for dealing with systems-level semantics, by my understanding.

> Building OCaml without a proper build system

But I don't want to build the programming language, I want to use it. Sure, I can recompile gcc if I need to, but that shouldn't be a part of my dev process for building software that uses gcc, IMO.

It looks to me like JaneStreet has taken over OCaml and added a ton of apparatus to facilitate their various uses of it. Of course, I admit that I am very specific and focused on small, tightly-defined software, so multi-target, 3rd-party utilizing software systems are not of interest to me.

It looks to me like OCaml's intrinsic install is designed to facilitate far more advanced features than I care to use, and that looks like those features make it a very ill-suited choice for a systems programming language, where concise, straightforward semantics will win the day for long-term success.

Once again, it looks like we're all basically forced to fall back to C for systems code, even if our bright-eyed bushy tails can dream of nicer ways of getting the job done.

Thanks for your patient and excellent help on this topic.

> I don't get it. What is it reserved for then, if the int size is determined when the runtime is built? How can that possibly affect the runtime use of ints?

Types are fully erased after compilation of an OCaml program. However, the GC still needs to know things about the data it is looking at - for example, whether a given value is a pointer (and thus needs to be followed when resolving liveness questions) or is plain data. Values of type `int` can be stored right alongside pointers because they're distinguishable - the lowest bit is always 0 for pointers (this is free by way of memory alignment) and 1 for ints (this is the 1 bit ints give up - much usage of ints involves some shifting to keep this property without getting the wrong values).
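The encoding described above can be modelled in a few lines. This is only an illustration of the arithmetic, assuming a 64-bit target and using `Int64` to stand in for the raw machine word; `tag`, `untag` and `is_immediate` are hypothetical helper names, not runtime functions:

```ocaml
(* Encode an OCaml int n as the machine word 2n + 1:
   shift left by one, then set the low bit. *)
let tag (n : int) : int64 =
  Int64.logor (Int64.shift_left (Int64.of_int n) 1) 1L

(* Decode: an arithmetic shift right drops the tag bit and keeps the sign. *)
let untag (w : int64) : int =
  Int64.to_int (Int64.shift_right w 1)

(* The check the GC effectively performs on a word: low bit 1 means an
   immediate int, low bit 0 means an aligned pointer. *)
let is_immediate (w : int64) : bool =
  Int64.logand w 1L = 1L

let () =
  List.iter
    (fun n -> Printf.printf "%d -> word %Ld -> %d\n" n (tag n) (untag (tag n)))
    [0; 1; -1; 42]
```

Because every tagged int is odd and every aligned address is even, the two can share the same fields without any extra type information at run time.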

Other types of data (such as Int64s, strings, etc) can only be handled (at least at function boundaries) by way of a pointer, regardless of whether they fit in, say, a register. Then the whole block that the pointer points to is tagged as being all data, so the GC knows there are no pointers to look for in it.
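One way to observe this from inside a program is the `Obj` module. It is an unsafe, implementation-facing part of the standard library, used here only to peek at representations. The `describe` helper is made up for this sketch, and the results assume the standard native or bytecode runtime (other backends such as js_of_ocaml represent values differently):

```ocaml
(* Classify a value by its runtime representation. *)
let describe (name : string) (v : Obj.t) : string =
  if Obj.is_int v then name ^ ": immediate (tagged int)"
  else if Obj.tag v = Obj.custom_tag then name ^ ": boxed custom block"
  else if Obj.tag v = Obj.string_tag then name ^ ": string block"
  else name ^ ": other block"

let () =
  print_endline (describe "42" (Obj.repr 42));
  print_endline (describe "42L" (Obj.repr 42L));
  print_endline (describe "42l" (Obj.repr 42l));
  print_endline (describe "hello" (Obj.repr "hello"))
```

On the standard runtime I would expect the plain `int` to be reported as immediate and the `Int64`/`Int32` values as boxed blocks, matching the description above.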

> Or is any build of an OCaml program able to target (at compile-time) either 32- or 64-bit targets, or does it mean that an OCaml program build result is always a single format that will adapt at runtime to being in either environment?

To be clear, you have to choose at build time what you're targeting, and the integer size is part of that target specification (most processor architectures these days are 64-bit, for example, but compilation to JavaScript treats JavaScript as a 32-bit platform, and of course there's still support for various 32-bit architectures).
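A program can report what it was built for through the `Sys` module. A small sketch; `target_summary` is a hypothetical helper name:

```ocaml
(* Summarize the target this program was compiled for. *)
let target_summary () : string =
  let backend =
    match Sys.backend_type with
    | Sys.Native -> "native"
    | Sys.Bytecode -> "bytecode"
    | Sys.Other name -> name
  in
  Printf.sprintf "backend=%s word_size=%d int_size=%d max_int=%d"
    backend Sys.word_size Sys.int_size max_int

let () = print_endline (target_summary ())
```

On a typical 64-bit native build this should report a 64-bit word and a 63-bit `int`; the values are fixed for a given build and do not adapt at run time.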

> Knowing one's runtime details is intrinsic at design-time for dealing with systems-level semantics, by my understanding.

Doesn't this mean that C can't be used for systems programming? You don't know the size of `int` there, either.

> But I don't want to build the programming language, I want to use it.

I meant building OCaml code, not the compiler.

As I commented above, Int32 and Int64 have been part of the standard library since at least the 4.x OCaml versions (we are now at 5.3), so normally all of them are available when you install any distribution of OCaml. Note that there is also a type named nativeint (which, I think, is the kind of int that you were looking for in all your comments and posts) and it is part of the standard library too. So, in summary:

Int type (the one you dislike for systems programming)

Int32 type (part of the standard library, one of those you were looking for)

Int64 type (part of the standard library, one of those you were looking for)

Nativeint (part of the standard library, maybe the one you were looking for)

The stdint library is another option, which can be convenient in some cases, but you don't need it for Int32 and Int64, nor for Nativeint.
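The four types listed above, side by side, using only the standard library. The `one_*` and `two_*` names are placeholders for illustration:

```ocaml
let one_int : int = 1          (* tagged, Sys.int_size bits *)
let one_i32 : int32 = 1l       (* l suffix, always 32 bits *)
let one_i64 : int64 = 1L       (* L suffix, always 64 bits *)
let one_nat : nativeint = 1n   (* n suffix, machine word size *)

(* Each boxed type has its own module of operations. *)
let two_int = one_int + one_int
let two_i32 = Int32.add one_i32 one_i32
let two_i64 = Int64.add one_i64 one_i64
let two_nat = Nativeint.add one_nat one_nat

let () =
  Printf.printf "int       : %2d bits, max %d\n" Sys.int_size max_int;
  Printf.printf "int32     : 32 bits, max %ld\n" Int32.max_int;
  Printf.printf "int64     : 64 bits, max %Ld\n" Int64.max_int;
  Printf.printf "nativeint : %2d bits, max %nd\n"
    Nativeint.size Nativeint.max_int
```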

The modules Int64 and Int32 are part of the OCaml standard library. You mentioned in your comments that dune or Jane Street is needed to have this functionality, but they are part of the standard library; really, they are part of core OCaml development. For example, you can even use the Bigarray library with these types as well as int8 and int16, signed and unsigned. You also have platform-native signed integers (32 bits on 32-bit architectures, 64 bits on 64-bit architectures) with Bigarray.nativeint_elt as part of the standard library, so all these types are there.
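A short sketch of the Bigarray element kinds mentioned above, assuming OCaml 4.07 or later, where `Bigarray` needs no extra library. The array names are invented for the example:

```ocaml
open Bigarray

(* Four unsigned bytes stored unboxed in a C-layout array. *)
let bytes_ = Array1.create int8_unsigned c_layout 4

(* Four signed 16-bit elements. *)
let shorts = Array1.create int16_signed c_layout 4

let () =
  Array1.fill bytes_ 0;
  Array1.fill shorts 0;
  bytes_.{0} <- 255;      (* fits in an unsigned byte *)
  bytes_.{1} <- 256;      (* only the low 8 bits are kept *)
  shorts.{0} <- 32767;    (* largest signed 16-bit value *)
  shorts.{1} <- 32768;    (* low 16 bits kept, then read back as signed *)
  Printf.printf "bytes : %d %d\n" bytes_.{0} bytes_.{1};
  Printf.printf "shorts: %d %d\n" shorts.{0} shorts.{1}
```

Elements are read and written as ordinary `int` values; the narrowing to the element width happens on each store.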

You also mention that Int32 and Int64 are recent; however, these modules were already part of OCaml in the 4.x versions of the compiler and standard library (we are now at 5.3).

Note that in OCaml you can use C libraries, and it is quite common to handle Int32, Int64, signed values, etc. that way.

> F# v2

What does that mean?

The second version of F#, where they implemented generics, before they got into the type provider stuff.
What is the ML programming language? They say OCaml is the same thing with a different name; is that true?
Can a systems programming language use garbage collection? I don't think so.
You'd be surprised.

In the 1980s, complete workstations were written in Lisp down to the lowest level code. With garbage collection of course. Operating system written in Lisp, application software written in Lisp, etc.

Symbolics Lisp Machine

https://www.chai.uni-hamburg.de/~moeller/symbolics-info/fami...

LMI Lambda http://images.computerhistory.org/revonline/images/500004885...

We're talking about commercial, production-quality, expensive machines. These machines had important software like 3D design software, CAD/CAM software, etc. And a very, very advanced OS. You could inspect (step into) a function, then into the standard library, and then you could keep stepping into and into until you ended up looking at the operating system code.

The OS code, being dynamically linked, could be changed at runtime.