
Great question, and I actually tried that! In fact, m2cgen is a project that does exactly that.

It works fine for simple models, but breaks down for production-sized tree ensembles. The JVM has a hard 64KB limit on the bytecode of a single method, and when you go through source, javac decides how your deeply nested if/else trees get laid out. m2cgen's own FAQ says to reduce the number of estimators when you hit recursion limits during generation. With direct bytecode emission I control the method structure precisely: I can split across methods exactly where needed and manage the constant pool directly. The bytecode I emit is also much more efficient than what you get from compiling m2cgen's equivalent source.
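
To make the "split across methods exactly where needed" idea concrete, here is a rough sketch of one way such a split could be planned. This is not petrify's actual algorithm. The Node class, the plan_methods and full_tree helpers, and the per-node byte costs are all made up for illustration, and a real emitter would measure the actual size of the instructions it writes. The only hard fact used is that a JVM method's code array must be smaller than 65536 bytes.

```python
from dataclasses import dataclass
from typing import List, Optional

# A single JVM method's code array must be shorter than 65536 bytes.
JVM_METHOD_LIMIT = 65535

# Hypothetical byte costs, chosen only for illustration.
BRANCH_COST = 12  # load feature, load threshold, compare, conditional jump
LEAF_COST = 4     # push leaf value, return
CALL_COST = 6     # load args, invokestatic into an outlined method, return


@dataclass
class Node:
    """A binary decision-tree node; a node with no children is a leaf."""
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def plan_methods(root: Node, budget: int = JVM_METHOD_LIMIT) -> List[int]:
    """Return the estimated byte size of each method needed for the tree.

    Walks the tree bottom-up. Whenever a branch plus both inlined children
    would exceed the budget, the larger child is moved into its own method
    and replaced by a small call stub.
    """
    methods: List[int] = []

    def size(node: Node) -> int:
        if node.left is None or node.right is None:
            return LEAF_COST
        left, right = size(node.left), size(node.right)
        while BRANCH_COST + left + right > budget:
            if left >= right and left > CALL_COST:
                methods.append(left)   # outline the left subtree
                left = CALL_COST
            elif right > CALL_COST:
                methods.append(right)  # outline the right subtree
                right = CALL_COST
            else:
                raise ValueError("budget too small for a single branch")
        return BRANCH_COST + left + right

    methods.append(size(root))  # the entry method holds what is left
    return methods


def full_tree(depth: int) -> Node:
    """Build a complete binary tree of the given depth (test helper)."""
    if depth == 0:
        return Node()
    return Node(full_tree(depth - 1), full_tree(depth - 1))
```

With these made-up costs, a complete tree of depth 12 still fits in one method, while depth 13 has to be split into three. The walk is recursive for brevity, so a very deep tree would need an explicit stack in Python. Going through source gives you no comparable knob, since javac decides the final layout.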

The source code is also a pretty useless intermediate step: it sets off all kinds of static analysis alarms in your stack, and I also worry about source code injection (not that it can't happen with petrify, it's just a lot harder).

Finally, I'm grateful for the sweat the authors of m2cgen have put in, but the project has gone without updates for 4 years. That doesn't mean it's useless (some mature software never sees updates), but it's not a positive sign either.