MIT Scheme on Apple Silicon (kennethfriedman.org)
145 points by kennethfriedman 1629 days ago
13 comments

I've been engaged in a 1.5 year long (so far) project to port the entirety of GJS's "scmutils" package over to Clojure, and the erratic behavior of MIT Scheme over Rosetta has been a pain I've consigned myself to for months. I keep an old machine handy when I need to test functions that can't work on the M1.

I am SO HAPPY to see this work! Major timesaver for me and anyone looking to run the executable versions of Functional Differential Geometry[1] and Structure and Interpretation of Classical Mechanics[2] in the original language.

[0] https://github.com/sicmutils/sicmutils [1] https://github.com/sicmutils/fdg-book [2] https://github.com/sicmutils/sicm-book

I have to ask, how far along is this project? Is it good enough to run all of the examples in the books? I have run into the issue where the provided compiled version of scmutils from GJS' website doesn't run on recent versions of MIT Scheme (version 11 and up) and there's not much info on compiling it yourself.
It is good enough! Almost all code forms from the book live in the tests (see the FDG directory[0], for example), and there are a few nice environments like Nextjournal[1] where everything from the books works in the browser.

The Clojure port is quite fast, faster than the original for all benchmarks GJS has sent me, and more fleshed out. (That will change, as I've been pushing bugfixes and performance improvements back upstream as I go, as a meager gift to GJS for making this huge, amazing library in the first place.)

I actually wrote to GJS this morning asking for instructions on how to compile the original "scmutils", since I have the same problem. He responded saying he'll get back to me this afternoon, so I'll post here once I have details.

If you are still interested in getting the books going with MIT-Scheme, I put a decent amount of work into the exercises using the original codebase here[2], including a dockerized version of mit-scheme[3] and the scmutils package[4] that might be useful.

- [0] https://github.com/sicmutils/sicmutils/tree/main/test/sicmut...

- [1] https://nextjournal.com/try/samritchie/sicmutils/

- [2] https://github.com/sicmutils/sicm-exercises

- [3] https://hub.docker.com/r/sritchie/mit-scheme

- [4] https://hub.docker.com/r/sritchie/mechanics

Awesome resources, thank you!
Both of the compilation errors identified in this article were just fixed in the master branch of MIT/GNU Scheme five hours ago: https://git.savannah.gnu.org/cgit/mit-scheme.git/commit/?id=..., https://git.savannah.gnu.org/cgit/mit-scheme.git/commit/?id=.... So, if you grab the current master branch, it should just build for x86 without any fixes needed.
Ok, but how do you compile it? When you pull the repo you need to run autoconf to create the configure script but that tells me "This script needs an existing MIT/GNU Scheme installation to function".
What's stopping people from just compiling Scheme for ARM? The website has a separate aarch64 download it seems, so why not patch that instead of relying on Rosetta2?

The vfork/fork issue and the compiler upgrade issue don't seem to be too problematic to work around, so there must be some kind of ARM limitation that's preventing Scheme from working, but what?

MacOS on the M1 processor is the first to use, and require, the W^X bit in memory, meaning that pages of memory are either writable, or can be executed from, but not both. MIT Scheme's front page says this is fundamentally incompatible with their design, and therefore it won't build. When running in the emulator, this requirement would be relaxed for compatibility reasons.

There is an escape hatch for writing JIT compilers (essentially what MIT Scheme is in this case), described here https://developer.apple.com/documentation/apple-silicon/port... although it's fairly cumbersome and would almost certainly require a lot of extra, MacOS specific code. I assume that's why no-one has bothered so far to port it.

From perusing the source of the MIT/GNU Scheme compiler, I suspect that “only” two changes are needed to support W^X:

- Compiled code needs to be allocated separately from Scheme objects. It can still be garbage collected and such - they will probably need to make a separate set of allocation functions for code vs. data. The closure/function objects can be made to point to the code, or, if they don’t need to be written often, simply allocated wholly from the “code” pages.

- Before modifying any of the code (e.g. to patch addresses after GC relocation), a system-specific hook function will need to be called to set the permissions to RW. They already call an I-cache flush function after each modification, so this shouldn’t be too bad.

Some of the necessary changes are already sketched out in cmpint.txt. And, sooner or later, they’re going to have to make these changes: OpenBSD already enforces W^X (but provides a workaround), and MIT/GNU Scheme already applies a paxctl workaround to gain W|X on NetBSD.
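
The two changes described above can be sketched roughly as follows. This is a minimal illustration in Python, assuming a POSIX system where `mprotect` is reachable through `ctypes`; it is not MIT/GNU Scheme's actual code. `CodeArena`, `emit`, `patch` and `writable_window` are hypothetical names, and the bytes written are placeholders that are never executed (which is also why no I-cache flush appears here).

```python
import ctypes
import mmap
import os
from contextlib import contextmanager

# POSIX only: reach mprotect(2) through the already-loaded C library.
_libc = ctypes.CDLL(None, use_errno=True)
_libc.mprotect.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int]
_libc.mprotect.restype = ctypes.c_int

RW = mmap.PROT_READ | mmap.PROT_WRITE
RX = mmap.PROT_READ | mmap.PROT_EXEC


class CodeArena:
    """Hypothetical arena for compiled code, kept apart from the data heap."""

    def __init__(self, size=mmap.PAGESIZE):
        self._map = mmap.mmap(-1, size,
                              flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
                              prot=RW)
        view = ctypes.c_char.from_buffer(self._map)
        self._addr = ctypes.addressof(view)
        del view  # release the buffer export so the map can be closed later
        self._size = size
        self._next = 0
        self.writable = True
        self._protect(RX)  # sealed by default: readable and executable only

    def _protect(self, prot):
        if _libc.mprotect(self._addr, self._size, prot) != 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))
        self.writable = bool(prot & mmap.PROT_WRITE)

    @contextmanager
    def writable_window(self):
        """The system-specific hook: RX -> RW, run the write, then RW -> RX."""
        self._protect(RW)
        try:
            yield
        finally:
            self._protect(RX)

    def emit(self, code):
        """Copy a block of bytes into the arena and return its offset."""
        offset = self._next
        if offset + len(code) > self._size:
            raise MemoryError("code arena is full")
        with self.writable_window():
            self._map[offset:offset + len(code)] = code
        self._next += len(code)
        return offset

    def patch(self, offset, data):
        """Overwrite bytes in place, e.g. an address fixed up after GC."""
        if offset < 0 or offset + len(data) > self._next:
            raise ValueError("patch outside emitted code")
        with self.writable_window():
            self._map[offset:offset + len(data)] = data

    def read(self, offset, length):
        return self._map[offset:offset + length]


if __name__ == "__main__":
    arena = CodeArena()
    start = arena.emit(b"\x00\x01\x02\x03")  # placeholder bytes, not real code
    arena.patch(start + 1, b"\xff")
    print(arena.read(start, 4), "writable:", arena.writable)
```

The point of the design is that every write goes through one hook, so the pages are never writable and executable at the same moment. On macOS the hook body would presumably be replaced by the platform's JIT write-protect calls rather than `mprotect`.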

Wow really? A common intro to security exercise (think CTFs and university courses) is to write increasingly complicated C programs that leverage W&X. Classic buffer overflow into the stack kind of stuff. On M1 it’s now impossible to exploit even a self-compiled toy in this way?
Basically, yeah. In addition, the usual way to bypass W^X memory, using ROP chains, is also mitigated by the pointer authentication the M1 implements. It's not bulletproof, but it prevents most of the old exploit methods from working at all. You'd need to throw up a VM on an M1 Mac to learn much this way (although that'd be ideal anyway, to get an environment without other protections like ASLR).

I know at least OpenBSD also enforces W^X protection universally, anyone else? I know Linux can with the right SELinux policies, but not sure any distro ships with those by default.

Windows has had this enabled by default for a long time: https://docs.microsoft.com/en-us/windows/win32/memory/data-e...

There's a per program exception list to handle legacy programs though.

Windows DEP only applies W^X (more accurately, !X) to the default stack and heap; programs can still freely allocate new memory as PAGE_EXECUTE_READWRITE if they want RWX memory.

macOS W^X on Apple Silicon, however, bans RWX memory outright, making it impossible to have a page in memory that is simultaneously writable and executable. Instead, if you want to be able to write instructions to a page and later execute them (e.g. for JIT compilation), you have to (1) have a special entitlement (or opt out of the Hardened Runtime), (2) map your memory with a special MAP_JIT flag, and (3) call special mprotect-like functions to toggle the protection between RW and RX every time you want to modify the code.

There does, however, seem to be a bit of a loophole: the JIT protection flags are applied per thread, meaning that in principle one thread could have the page RW while another has it RX.

On M1 CPUs you cannot ever have simultaneously writable and executable memory. Windows just makes default allocations non-executable; you have to explicitly request RWX, which is what every other OS has been doing basically since x86 actually added support for non-executable memory :)
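
For anyone wanting to check how their own system treats an explicit RWX request, a small probe along these lines might help. It is only a sketch, assuming a Unix-like platform where Python's `mmap` module exposes `PROT_EXEC` and `MAP_ANONYMOUS`; `rwx_mapping_allowed` is a made-up name, and the answer depends on the OS, its security policy, and how the Python binary itself was built and signed.

```python
import mmap


def rwx_mapping_allowed(size=mmap.PAGESIZE):
    """Return True if the OS grants a plain anonymous mapping that is
    readable, writable and executable all at once."""
    prot = mmap.PROT_READ | mmap.PROT_WRITE | mmap.PROT_EXEC
    try:
        region = mmap.mmap(-1, size,
                           flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
                           prot=prot)
    except OSError:
        # The kernel refused the request; W^X is being enforced here.
        return False
    region.close()
    return True


if __name__ == "__main__":
    print("plain RWX mapping allowed:", rwx_mapping_allowed())
```

This only tests the initial `mmap` request. It says nothing about whether a page can later be switched between RW and RX, which is the route JIT-style runtimes are expected to take.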
Pointer authentication isn’t in 3rd party processes though, only system ones. (or maybe it’s available but optional, I forget)
> Pointer authentication isn’t in 3rd party processes though

Still isn’t, because the arm64e ABI isn’t stable. As such, any binaries not bundled with the OS, including Apple applications, use the arm64 ABI without pointer authentication.

You can use -arm64e_preview_abi as a boot argument to enable arm64e support for non-OS bundled processes.

Note, however, that the arm64e binaries you compile might not work on future macOS releases.

Offtopic, but can you recommend such a course?
Hey sorry, didn't check back on this comment for a while. I can't recommend any _courses_ in particular (unless you're a Georgia Tech student, in which case "CS 6265: Information Security Lab" is absolutely incredible).

One really fun way to hone your skills is https://microcorruption.com/, a ctf-style simulated hacking game originally made by Square and Matasano.

You can do it in a Linux VM container; the restriction applies to macOS processes, not the hardware.
Ohh that makes more sense, thanks.
It is possible using Rosetta 2
Ah, thank you. That explains the problem quite well.

I suppose the wait is on for someone to rewrite the JIT engine to be compatible with Apple's implementation of ARM.

I mean if they pass the correct flags they get memory that can be toggled rapidly between X and W mode - or is MIT Scheme mixing data and code in the heap and so actually requiring RWX?
I find it a bit disingenuous to say that it runs on Apple Silicon if you need to modify the source code. Also, just because it compiles and starts doesn't mean it's fully functional.

The main MIT Scheme page says that it's not possible and would need significant effort, so I would be curious to get a description of why one source claims it's impossible while the other shows that it compiles and starts.

Are the original authors biased against M1/Apple and just justifying themselves? Or does compiling on M1 sort of work until you hit more complex features that crash or misbehave?

It's also a bit disingenuous to say it's "on Apple Silicon" when you're running it through a translation layer that won't exist in a few years' time. I'd wager the reason why the GNU folks say it doesn't run on ARM is because... it doesn't. Running it as an x86 program is mandatory, apparently.
Well they do say that they need rosetta just to compile, that once compiled it "works", though I agree it's still too shallow of an article.
Apple Silicon has a (mostly) hardware translation layer, which this software is running on.

There's a special aarch64 build of the software available, so it clearly runs on ARM. Perhaps there's some kind of issue specifically on macOS that makes the existing ARM port incompatible with Apple's ARM implementation?

> Apple Silicon has a (mostly) hardware translation layer…

I can’t imagine what you mean by this. Rosetta 2 is a binary translation system implemented in software, based on QuickTransit. There are a few features implemented in Apple Silicon to make translation easier and more efficient, such as supporting Intel memory ordering, but that’s about it.

I think it’s reasonable to worry about how long Rosetta 2 will be available. The first version, which allowed Intel Macs to run PowerPC binaries, was available for 5 years. Having said that, there’s no guarantee versions of MacOS beyond 5 years’ time will run on today’s M1 anyway (though M1-compatible versions will likely still get updates beyond then).

I can't say what Apple will do, but I'm really hoping they'll keep Rosetta 2 around for longer than Rosetta 1.

For starters, the Mac became a lot more popular in the Intel era than it ever was while on PPC, so there's a much larger quantity of legacy software that Apple would be cutting off. Secondly, the overall user experience of running apps via Rosetta 2 seems to be a lot better than Rosetta 1. And for Apple, Rosetta 2 was developed in-house and doesn't require continuous licensing fees to keep around (not that I'm particularly sympathetic to Apple's pocketbook.)

> And for Apple, Rosetta 2 was developed in-house and doesn't require continuous licensing fees to keep around (not that I'm particularly sympathetic to Apple's pocketbook.)

I don't think any of those things matter; Apple will stop supporting Rosetta 2 as quickly as they can. They announced the transition to Apple Silicon will be two years and unless something unforeseen happens, that's what it's going to be.

I suspect that Rosetta 2 won't be available for new Apple Silicon Macs running macOS five years from now.

Of course, no matter how many years in advance Apple warns that a particular technology is going to be deprecated, that never stops people from complaining vociferously when it happens.

A great example is 32-bit apps, where Apple gave something like an 8-year heads-up that 32-bit apps were going away, which happened a few years ago but it's not hard to find threads on HN where people are still complaining about it.

For Rosetta (1), QuickTransit was bought up by IBM. Rosetta disappeared not very long after that.
Rosetta 2 has nothing to do with Rosetta 1 (other than the name), nor any other company’s software.
Looks like QuickTransit was a JIT engine that was the base of Rosetta 1. Rosetta 2 is AOT translation.
They’re both based on QuickTransit, but Rosetta 2 has an AOT mode as well as JIT. I’m sure the R2 engine is more advanced than the original engine, Apple employed several engineers from the original team, but it still uses and is based on licensed tech.
So Java, Groovy, Scala, Kotlin and Clojure aren't running on x86, nor ARM, nor Apple Silicon?
In a lot of ways, yes. Their runtimes are so massive that saying they "run" on any of those architectures is a stretch of what is actually happening at a lower level.
In "a lot of ways", sure, but definitely not by the most common meaning of "program x runs on y architecture", and not the one being used by most people in this thread.

If you ask any random programmer if "Java runs on x86", 99% of them will say either "yes" or "I don't know what x86 is". Similarly, if you ask them "does Kotlin run on the SPARC architecture", they'll say "I don't know" and, if you give them some time to find [1], they'll amend to "no".

To be precise: the meaning being used by most programmers (and here) is "either the compiled binaries, the virtual machine, or the interpreter runs directly on the given architecture" - which clearly excludes MIT Scheme running on Rosetta, just as (to take a less controversial example) the fact that you might be able to run the JVM on qemu on SPARC doesn't mean that the JVM runs on SPARC.

Thinking about the various levels of abstraction of VM's and interpreters is a fun exercise in general, but I don't think it's constructive in this particular situation.

[1] https://openjdk.java.net/jeps/381

No, it's not a stretch at all. This is just nerd contrarianism.
Well, let's see. Is there a way for me to run a Java program as native machine code? Or is the code that I'm executing still a runtime that interprets a program?
Depending on the code, quite a large fraction of your Java code is run as x86 machine code at any given time. It hardly gets more native than that.
Is it not running on Apple Silicon?
I got MIT Scheme running on my M1 MacBook Pro about 6 months ago when I bought the book "Software Design for Flexibility" and although I can't find my notes for that, I think I remember building from source natively, not via Rosetta - but I may remember incorrectly.

I also remember it taking a while to get Gerbil Scheme running on M1.

How did you like the book?
I like it, but I have only worked about 1/3 of the way through it.
What would someone who has worked through SICP learn from it (based on the first third that you've read)?
I've also gotten through about the first 1/3. Based on this, and the table of contents, it goes into much more depth (both in terms of implementation and non-trivial illustrative examples) into a number of topics that are either only touched upon in SICP, or not discussed at all. These include combinators, generic functions, pattern matching, etc. There's a chapter on propagators, which didn't even exist when the last edition of SICP was written, though I think SICP does discuss related (but simpler) ideas on constraint propagation.
Thank you.
The Racket fork of Chez Scheme runs natively on Apple ARM (AFAIK these changes have not yet been merged into the main branch of Chez Scheme)

https://github.com/racket/ChezScheme/

Just fyi in dark mode on this site, the code snippets are almost unreadable
Thanks for the heads up. I blame our site's Chief CSS Officer (me). It has been fixed!
Rule of thumb: don't take advice from people who can't explain why they suggest you comment code out.
As the OP here, I could not agree more.
Very cool how much interest there is in mit-scheme and sicm. On the csail website I think the '.com' binary for scmutils only works with v10 and was released about a year ago. Does anyone know where there are instructions for finding/compiling a version that works with the latest version of mit scheme?
Nice. FYI, I've been using Fennel on Monterey (via brew), and it's also great for that extra LISPy feeling.
I wish I had known this before I recently installed Racket because I'm currently reading SICP.
That MIT Scheme includes an emacs clone that uses Scheme instead of elisp is a nice touch.
There’s a fork of Emacs itself that can run scheme, by replacing the internal lisp engine with Guile (which can run both scheme and elisp). It doesn’t seem to have gotten a lot of love in the last few years, but did mostly work at one point.
Just a UI comment, the white highlighting of white text on a black/grey background is pretty unreadable in my browser
Just turn off dark mode. So many sites have css that claims to support dark mode but doesn’t. The other direction seems less common.
We should really expect better from someone whose about page says they are an interface designer, though.
We should expect better, and you deserve better! Luckily, the issue has been fixed :)