by mackwic 3875 days ago
I don't know if it brings anything new to the table. Also, I am not convinced that OCaml is state-of-the-art in terms of development productivity, release management, debugging, etc.: everything a non-trivial project could want. Language features are not the only thing we need.

Yes, the OCaml tooling has really improved recently, and the OCaml workflow has become smoother and smoother. Still, I wouldn't recommend it for a business.

So, if the original author reads this, could you answer these questions about how you ship products with OCaml:

- How do you do Quality Assurance? (Anything from unit testing, integration testing, functional testing, etc. I guess you have to do it a lot to check that Gmail didn't break its integration.) Testing in isolation has its share of challenges in OCaml.

- How do you manage your builds and releases? Private opam repositories? Shipped directly to Google? Do you have beta/staging channels?

- And last but not least, what have been the pain points so far, and have you been able to fix them or do you just work with them? (It happens with any tech stack, but it's good to know what the trade-offs are.)

I am very curious how this could work at "enterprise-scale" and I would be glad to have some real-world examples of OCaml in production.

6 comments

I'm always very amused when people say OCaml can't work at "enterprise scale" (let me snort a bit on that one, considering the scale of some open source projects) given the amount of evidence to the contrary[1]

[1]: http://ocaml.org/learn/companies.html

To answer one of your questions: yes, we have testing frameworks, both for unit[2] (inline[3]) testing and property testing[4]. As for the rest, it's not OCaml-specific at all. :)

[2]: http://ounit.forge.ocamlcore.org/

[3]: https://github.com/vincent-hugot/iTeML

[4]: https://github.com/c-cube/qcheck/

> I'm always very amused when people say OCaml can't work at "enterprise scale"

Yeah, you will notice the quotes, which show you shouldn't take it too literally. ;) (Also, I have no doubt that OCaml can work in a business. I just saw no one speak about it apart from Jane Street, so I am curious.)

What I meant was "a team with a hierarchical organization, not only geniuses, and more than 4 people". Few of the serious OCaml projects (if any) fall into this category, which is the common case for businesses.

About the testing tooling, I am well aware of the state of these technologies. I even contributed to some of them. Sorry, but it's light. It doesn't cover the whole spectrum of what you'll want to test in a product.

Let's take your (excellent) libs and see:

- OUnit: unit tests

- QCheck: unit tests

- iTeML: unit tests

Where are integration and functional testing? Unit testing only tests a very strict subset of the "does this work as intended" question. I recommend you look at how some projects do their testing, for example any classic Rails project. You could be surprised.

Some open questions I myself struggle to answer correctly:

- Mocking modules without having a build mess with oasis

- Managing the model in a sane way

- Integration testing is a horror

And many more that I can't recall now.

Not sure about the other issues, but I believe this one is easily solved:

> Mocking modules without having a build mess with oasis

OCaml has parametrized modules (aka "functors"), which provide a very clean solution, compared to mocking whole modules anyway.
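
A minimal sketch of that idea, not taken from any project in this thread: the names `ORDER_STORE`, `Make_billing` and `Stub_store` are made up for illustration. The code under test takes its dependency as a functor argument, so a test can instantiate it with a trivial stub instead of mocking a whole module at build time.

```ocaml
(* Hypothetical interface for a dependency we want to replace in tests. *)
module type ORDER_STORE = sig
  val orders_of_user : string -> int list
end

(* The code under test is a functor: it never names a concrete store. *)
module Make_billing (Store : ORDER_STORE) = struct
  let total user = List.fold_left ( + ) 0 (Store.orders_of_user user)
end

(* Trivial in-memory stub standing in for a real database-backed module. *)
module Stub_store : ORDER_STORE = struct
  let orders_of_user = function
    | "alice" -> [10; 20; 12]
    | _ -> []
end

(* Instantiate the code under test with the stub. *)
module Billing_under_test = Make_billing (Stub_store)

let () =
  assert (Billing_under_test.total "alice" = 42);
  assert (Billing_under_test.total "bob" = 0)
```

In production the same functor would be applied to the real store module; nothing in the build system has to change for the test.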

As a side note, I usually see mocking being used to test badly structured code. Refactoring that is usually a benefit for testing as well as a benefit for the code itself.

You can't reasonably use functors for all your code's dependencies (function arguments mostly).

I am amazed to see how the MirageOS team used this, but this method cannot be used for all the code.

> As a side note, I usually see mocking being used to test badly structured code.

You might want to mock just because you are inspecting particular stack frames in a specific call stack, trying to reproduce a bug. It happens.

I'll tell you how we do these things in the libguestfs/virt-tools project [1]. The project is written in a mix of C, Perl and OCaml. Mainly C is used for the low-level/library bits, and OCaml is used for the higher-level tools.

- QA: We have some unit tests, but mainly we use a huge test suite that does end to end testing of tools. It uses automake's test framework, so it works across all the languages in the project.

- Builds and releases: We use autotools and tarballs. It's automated using a thing called 'goaljobs' which is like a generalized make.

- The pain points for us all derive from autotools itself, which is both crap and better than all the other build systems[2]. It is at least well understood.

I'd also say the killer advantages of OCaml for me are: Easy calling into C, and compiles to a native binary.

[1] http://libguestfs.org https://github.com/libguestfs/libguestfs

[2] I use the term "build system" in a rather narrow sense of something that (a) runs on Linux (b) lets the end user download a tarball and (c) builds using ./configure && make

Thanks for your detailed answer and congrats on successfully maintaining libguestfs.

> I'd also say the killer advantages of OCaml for me are: Easy calling into C, and compiles to a native binary.

Very true. And ocamlbuild works wonders.

I'm not sure why these things require state-of-the-art technology.

Like, testing is basically just checking a bunch of if statements. You can use fancy frameworks that color your tests red and green, but other than that kind of thing, what's the actual problem? Why do you say testing in isolation is particularly challenging with OCaml?

It's been years since I used OCaml and I don't know anything about OPAM so I'm also interested in the answers.

Did you try to use OCaml and run into a bunch of difficult problems?

> Did you try to use OCaml and run into a bunch of difficult problems?

Yeah, I am a long-time contributor to and user of OCaml for little projects (mostly compilers and AIs). But not professionally.

> It's been years since I used OCaml and I don't know anything about OPAM so I'm also interested in the answers.

Opam is Bundler done right. It manages OCaml toolchains and package dependencies. It's easy to pin a specific version or publish your own.

A setup of oasis + ocamlbuild + $editor could go quite far. After that, it will depend on how you work and what you are developing.

> Like, testing is basically just checking a bunch of if statements. You can use fancy frameworks that color your tests red and green, but other than that kind of thing, what's the actual problem? Why do you say testing in isolation is particularly challenging with OCaml?

That is a very interesting question that deserves a blog post of its own. But I have neither the time nor the patience to write one, so let it be an HN comment.

> Like, testing is basically just checking a bunch of if statements.

Wrong! There are multiple kinds of "if statements to test":

-1- I want to test that `my_inner_fibo(0, 1) == 1`. This is a unit test: basically a transcript of the technical specifications into assertions. This is the easiest kind of test and the most verbose, but it's also the kind that tests the least.

-2- I want to test that `UserModule.retrieve_orders(user, command)` makes the right calls. This is white-box integration testing, which tests that the API contracts between your inner interfaces are respected (which is also part of the technical specifications). I don't find it very useful, but it's better than nothing when you can't do more.

-3- I want to test that, with a given state `user` and `command`, when I call `UserModule.retrieve_orders(user, command)` I get exactly this object. This is black-box integration testing. It is useful for finding errors in the inner logic of a group of modules (but you don't immediately know what went wrong). As the call stack could be very large (and so could the scope of what you are testing), you want to reduce the moving parts and replace some of the modules with trivial mocks. Black-box integration testing tests a lot of things and is very useful for finding bugs.

-4- I want to test that a call to `$./my-binary ctl command --flag=true` has a specific behavior. It could also be a network request to a server, a message to a daemon, a click on a GUI... anything external to the program. Here we test the behavior. This is what the user will use, which is why it's functional testing. It could break often (in the case of a GUI) or not (in the case of a CLI binary). You should always do functional testing if you respect your users a little.

All these kinds of testing have different requirements, and some need a perfectly controlled state to be created. The issue is not in testing the conditions, but in setting up this perfectly controlled state, which sometimes requires making the code believe it is using the real module when you have actually given it a stubbed one that does trivial work or signals you when something happens.

I don't know any simple way to spy on functions without using the MirageOS design, which relies heavily on functor injection. It's quite an academic way of doing it.

Just as an aside, white box testing (#2) shouldn't be used for testing contracts between your inner interfaces, but rather, your consumption of some external interface that is hard to set up.

As an example, I want to make sure that when I hit a REST API endpoint, I do all my logic and get a proper return value (black box testing), maybe that the state of some bit that I've already set up is correctly changed (say, the database, by then checking it as part of the test to ensure the thing I said should be written was written), but that I also send a server sent event (SSE) to connected clients indicating the change. I don't want to actually have to have opened up clients that conform to the SSE specification as part of my integ tests (because that's a pain), so instead, I'll just assert that the library call to send that event does indeed get called with the right thing. From there, I can test once, that the library call does indeed lead to an SSE being sent to the browser (and can even write a full environment integ test with Selenium or something), and from then on, my single system tests just assert that call is made when it's supposed to. I'm effectively integration testing without implementing/mocking a connected user to test SSEs.

Similar things can be useful when putting items onto external queues, making calls to foreign interfaces, etc. You should never make assertions about the path through your code a call takes in white box testing (because refactoring can change those, and you have increased your test burden for little reason), you should care about side effects you can't easily otherwise test.
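
To make that concrete, here is a rough OCaml sketch under stated assumptions: `handle_update`, `make_spy` and the message format are hypothetical, and the notifier is a plain function argument rather than a real SSE library call. The test checks the return value and the stored state (black box), and only asserts on the one side effect that is hard to observe otherwise.

```ocaml
(* Hypothetical handler: updates state, then notifies connected clients.
   The notifier is a function argument, so a test can pass a spy. *)
let handle_update ~(notify : string -> unit)
    ~(store : (string, string) Hashtbl.t) key value =
  Hashtbl.replace store key value;
  notify ("updated:" ^ key);
  `Ok

(* A spy records every message instead of sending a server-sent event.
   It returns the notifier and a function to read back the recorded calls. *)
let make_spy () =
  let calls = ref [] in
  let notify msg = calls := msg :: !calls in
  (notify, fun () -> List.rev !calls)

let () =
  let store = Hashtbl.create 8 in
  let notify, recorded = make_spy () in
  let result = handle_update ~notify ~store "color" "blue" in
  (* Black-box checks: return value and resulting state. *)
  assert (result = `Ok);
  assert (Hashtbl.find store "color" = "blue");
  (* White-box check limited to the external side effect. *)
  assert (recorded () = ["updated:color"])
```

Note that the spy says nothing about which internal functions were called along the way, so refactoring `handle_update` does not break the test as long as the notification is still sent.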

Excellent insight, thanks for your clarification.

Thanks for the detailed response. I've used mocking frameworks in Java and while they can be pretty helpful it seems like the basic functionality is pretty easy to get at, for example heavy modules like database layers could be module parameters, or functional parameters, or however you want to structure it. Surely language features and cutting-edge test frameworks can make all this stuff easier, but to me it seems like it can all be done easily with the normal ways of doing abstraction (and OCaml is pretty good at abstraction).

If you have a look at F#, you get all of the things you said plus OCaml syntax (without functors etc.).

F# is an excellent technology, and inspired some of the recent OCaml developments (we still look at active patterns with envy).

I saw some pieces of F# in FinTech here and there. I think Microsoft could do a better job of marketing it, because it really has great potential in the .NET ecosystem.

What do you mean by "without functors?" F# doesn't have a way to define a `map` function that is generalizable to arbitrary data types?

OCaml functors are not the same thing as Haskell functors: in OCaml they refer to mappings from modules to modules.

https://realworldocaml.org/v1/en/html/functors.html
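
For readers who have not met the OCaml meaning, a small sketch using the standard library's `Set.Make` functor may help. The module names `Int_order` and `Int_set` are just illustrative choices.

```ocaml
(* Input module: a type plus a total ordering on it. *)
module Int_order = struct
  type t = int
  let compare : int -> int -> int = compare
end

(* Set.Make is a functor from the standard library: module in, module out. *)
module Int_set = Set.Make (Int_order)

let () =
  let s = Int_set.of_list [3; 1; 3; 2] in
  (* Duplicates are dropped and elements come back in increasing order. *)
  assert (Int_set.elements s = [1; 2; 3])
```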

"Functor" might just be the most overloaded term in computer programming... Just off the top of my head it has totally different meanings in OCaml, Haskell and C++.

They're not totally different between OCaml and Haskell. They're based on the same concept from category theory: a mapping of objects and morphisms from one category to another. It's just that Haskell functors are at the type level and OCaml's functors are at the module level.

Apparently F# has support for neither style of functor--it doesn't have parametric modules and it also doesn't have typeclasses. So in F# `map` is defined independently for each type:

   Set.map : ('a -> 'b) -> 'a Set -> 'b Set 
   Seq.map : ('a -> 'b) -> 'a seq -> 'b seq 
   List.map : ('a -> 'b) -> 'a list -> 'b list 
   Array.map : ('a -> 'b) -> 'a [] -> 'b []
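
By way of contrast, here is a sketch of how the module-level style can abstract over "anything with a `map`" in OCaml. The signature `MAPPABLE` and the functor `Make_ops` are hypothetical names for illustration, not part of any standard library.

```ocaml
(* Hypothetical signature: any container that offers a map. *)
module type MAPPABLE = sig
  type 'a t
  val map : ('a -> 'b) -> 'a t -> 'b t
end

(* Written once against the signature, reusable for every container. *)
module Make_ops (M : MAPPABLE) = struct
  let double xs = M.map (fun x -> 2 * x) xs
  let lengths xs = M.map String.length xs
end

(* Two containers from the standard library, packaged to fit MAPPABLE. *)
module List_container = struct
  type 'a t = 'a list
  let map = List.map
end

module Array_container = struct
  type 'a t = 'a array
  let map = Array.map
end

module List_ops = Make_ops (List_container)
module Array_ops = Make_ops (Array_container)

let () =
  assert (List_ops.double [1; 2; 3] = [2; 4; 6]);
  assert (Array_ops.lengths [| "a"; "bcd" |] = [| 1; 3 |])
```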

What part of the tooling has improved recently? Is there a particular IDE that's now better to use, for example?

The IDE situation is now excellent (thanks to merlin[1]) and OPAM, the OCaml Package Manager, is the best package manager I have ever used.

The remaining pain point, as far as tooling goes, is the debugging situation, but steady progress has been made and it should receive very large improvements in the next OCaml version or so.

[1]: https://github.com/the-lambda-church/merlin

Well, I'm not sure I would go as far as "excellent". There is no "OCaml IDE", there is vim/emacs + tools. You can hack an IDE-like workflow by way of inotify, but if you want the "works out of the box" experience, that's not going to happen. As you mention, debugging is lacking, and more generally Merlin can't do much in the way of refactoring.

If you're fine with, e.g., hacking Python in Vim, you'll be pleasantly surprised by the OCaml situation. If you live in an integrated IDE, there will be some adaptation.

With respect to opam: is it usable under windows yet?

Not in a released version, but there's very active work in trunk to make it work natively:

  https://github.com/ocaml/opam/compare/master...dra27:windows-build

A recent demo I saw a few weeks ago had everything running under Cygwin and building native Windows executables...

Kind of. It's a bit awkward to get it running on Windows, but somehow (by accident) I managed to do a fully functioning installation on Cygwin. That said, many of the packages have still not been ported to Windows, so my recommendation is to set up a headless VM of some GNU/Linux distro and SSH into it.

Now you have a very smooth OCaml workflow:

- you create and manage the build of your project with Oasis, which calls ocamlbuild nicely

- Opam works great with ocamlfind to manage the dependencies and the toolchain you use

- OUnit has background workers for parallel testing

- Merlin and ocamlc annotations do wonders in terms of semantic completion

- utop is an excellent toplevel with colors, completion and integration with ocamlfind so that you can load dependencies inside

All these technologies have greatly improved in the past 2 years, which is a really short span compared to the age of OCaml.

Here is how I handle this at my company. LibreS3 is a product written in pure OCaml [1], using as a backend a cluster running Skylable SX (written in C)[2]:

- QA: unit tests written using oUnit, and integration tests by using linked Docker containers.

- builds/releases: opam + packages written for Debian and Fedora based distributions. The packages provided on our website are built inside Docker just like all the other packages. Internal beta packages are uploaded to a separate volume/bucket and served via LibreS3 itself.

- Pain points: building a package compliant with packaging policy is more complicated because I usually need newer versions, or OCaml libraries that are not yet packaged. I planned to use opam to generate templates for LibreS3+dependencies but was waiting for the C backend to get packaged upstream first. My slides from last year list a few more problems that I encountered during development but they aren't an issue currently [4].

The main advantages of OCaml for me are:

- the availability of high-level libraries that make implementing HTTP APIs simpler (Ocsigen, Cryptokit, atdgen, etc.)

- event-driven/non-blocking architecture to support large number of concurrent users (Lwt)

- native binaries and static type system

- not having to look for certain bugs in my code like uninitialized variables, NULL dereferences, memory leaks or memory corruption bugs, which (used to) take up a significant amount of time when developing applications in C (although tooling on the C side has improved with valgrind, clang -fsanitize= and Coverity).

That doesn't mean that code written in OCaml or the libraries that I use don't have bugs; I track them down and provide patches just as I would for a C library. Although perhaps writing code in OCaml has made me somewhat overconfident in my code, and I find bugs in other people's code more easily than in mine even if I'm not looking for them.

[1] http://gitweb.skylable.com/gitweb/?p=libres3.git;a=summary http://www.skylable.com/products/libres3

[2] http://gitweb.skylable.com/gitweb/?p=libres3.git;a=tree;f=li...

[3] http://www.skylable.com/products/sx

[4] https://ocaml.org/meetings/ocaml/2014/ocaml2014_13_slides.pd...

Thanks for your detailed answer. It's an interesting stack! Using Docker for integration testing seems like a good idea.

> Although perhaps writing code in OCaml has made me somewhat overconfident in my code

Many OCamlers could be caught being overconfident. It's hard to understate how much ML typing helps in seeing the data flow.

I think you mean "overstate" :)