by rptb1 4790 days ago
Why thank you. Although we've opened the coffin, we've yet to reanimate the corpse. Anyone who'd like to help out, please take a look at https://github.com/Ravenbrook/mlworks/wiki/Roadmap

I see a benchmarking suite. How does the codegen compare to MLton, SML/NJ, or, dare I say, O'Caml?
Let's get it working again and see. On the original target platforms (SPARC/Solaris, and then MIPS/Irix) it was quicker than that era's SML/NJ on our favourite benchmark (recompiling the whole of MLWorks itself), by about 25% I think. When we first did the x86 port the performance was not so good (the lack of usable registers was a problem for a design that had originally been focussed on RISC architectures), but I think that had improved by the late 90s (I stopped work on the project by about 1996). O'CaML didn't really exist then.
It will be a challenge to beat SML/NJ if the implementation has not progressed much since then, much less MLton. Since the 90s, SML/NJ has gotten a new backend, garbage collector, and has gone through a significant amount of performance tuning for the massive changes that happened to machines (e.g., more than 4MB of RAM, multi-level caches, etc.). Further, MLton is still 2-10x faster than SML/NJ, especially on programs that make heavy use of mutation (ref cells and arrays).

That said, I think it's still awesome to resurrect this project for folks to play around with on modern hardware! It appears to have some of the full development environment experience that none of the rest of us in the ML community have put into our products.

The implementation has not progressed at all since 1999, when Harlequin folded. It doesn't do defunctorization, so probably can't compete with MLton on performance (but it does have an IDE including a REPL, unlike MLton). It's much closer to the Definition than SML/NJ (of course): once we have the compiler running again I'll be interested to use it on the MLton corner cases. And certainly the SML/NJ garbage collector was occasionally a subject of light-hearted mockery among MLWorkers. But as I say, I don't expect x86 performance to be all that great.
Certainly! SML/NJ was a research platform first and a high-performance implementation second.

Defunctorization helps, but from talking with Matthew Fluet, most of the perf comes from the combination of monomorphization and whole-program compilation. You get to avoid all the hackery involved with trying to mix essential inlining (e.g., map, foldl) with separate compilation and the somewhat unpredictable performance that results when a user accidentally writes their own map function but puts it in a separate source file without magical incantations for the inliner.
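To make the monomorphization point concrete, here is a toy sketch of the idea, not how MLton actually does it. It assumes the compiler can see every polymorphic function and every call site in the program (the whole-program part) and emits one specialized copy per concrete type instantiation. The names PolyFn and monomorphize are invented for illustration, and only type signatures are specialized, not function bodies or transitive instantiations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolyFn:
    """Hypothetical record for a polymorphic function in a toy IR."""
    name: str
    type_params: tuple   # type variable names, e.g. ("a", "b")
    signature: str       # template using those names, e.g. "{a} list -> {b} list"


def monomorphize(functions, call_sites):
    """Return one specialized (name, signature) per distinct instantiation.

    call_sites is a list of (function name, tuple of concrete types).
    Seeing all call sites at once is what whole-program compilation provides.
    """
    by_name = {f.name: f for f in functions}
    specialized = {}
    for fn_name, type_args in call_sites:
        key = (fn_name, type_args)
        if key in specialized:
            continue  # this instantiation already has a specialized copy
        fn = by_name[fn_name]
        binding = dict(zip(fn.type_params, type_args))
        mono_name = fn_name + "_" + "_".join(type_args)
        specialized[key] = (mono_name, fn.signature.format(**binding))
    return specialized


# Illustrative program: 'map' is used at two distinct type instantiations.
functions = [PolyFn("map", ("a", "b"), "({a} -> {b}) -> {a} list -> {b} list")]
call_sites = [
    ("map", ("int", "string")),
    ("map", ("real", "real")),
    ("map", ("int", "string")),  # repeat use, no new copy needed
]
specialized = monomorphize(functions, call_sites)
```

Each specialized copy has fully known types, so the optimizer can treat a user-written map in another file the same way as the library one, with no inliner annotations needed.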

Also, we're working to get the rights to the Definition back from MIT Press so we can both push out a free PDF version and fix the bugs. There are several corner cases that Harper, Tofte, and MacQueen consider mistakes in the '97 version of the Definition, but they haven't really taken the time to push out an updated version after the ML2000 effort went nowhere. Follow up with me offline (contact info in my profile) if you're morbidly curious.

I'm sure we'll be in touch. I suggest you join the mailing list, then you'll be able to track our progress (and you'll be the first to know once we have a working compiler).

Do give my regards to John Reppy, who I haven't seen for about 15 years. As I recall, he was the author of the "new" GC in SML/NJ (in the mid 90s), which was a distinct improvement on the original semi-space collector.

It would be excellent to have the Definition available online. MLWorks originated with a contract to develop a very strict implementation of the Definition, and we were quite proud of being truly Standard ML.
if they don't have a bootstrap compiler yet i would guess they're not running benchmarks...