by sanderjd 1053 days ago
Generating code is fine, if the generated code strictly never evolves independently of what it is generated from. For instance generating libraries from .proto files (or other declarative schema definition solutions) works really well. If the schema changes, you throw away the old generated code and generate brand new code, no problem.

But if you want to make even a single tiny modification to one of the generated files, you're busted, you need a different solution.
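
A minimal sketch of that regenerate-and-discard workflow, assuming a toy schema format (a dict mapping class names to non-empty field lists) and hypothetical helper names `render` and `regenerate`; none of this is protobuf's actual API. The point is only that the output directory is wiped on every run, so a hand edit cannot survive.

```python
import shutil
import tempfile
from pathlib import Path


def render(schema: dict[str, list[str]]) -> dict[str, str]:
    """Pure function: schema in, {filename: source text} out.

    Assumes every class in the toy schema has at least one field.
    """
    files = {}
    for name, fields in schema.items():
        lines = [f"class {name}:"]
        lines.append(f"    def __init__(self, {', '.join(fields)}):")
        lines += [f"        self.{field} = {field}" for field in fields]
        files[f"{name.lower()}.py"] = "\n".join(lines) + "\n"
    return files


def regenerate(schema: dict[str, list[str]], out_dir: Path) -> None:
    """Throw away the old generated code and write brand new files."""
    if out_dir.exists():
        shutil.rmtree(out_dir)  # nothing from the previous run is kept
    out_dir.mkdir(parents=True)
    for filename, text in render(schema).items():
        (out_dir / filename).write_text(text)


if __name__ == "__main__":
    out = Path(tempfile.mkdtemp()) / "gen"
    regenerate({"User": ["id", "name"]}, out)
    print((out / "user.py").read_text())
```

Any modification made to `user.py` between two calls to `regenerate` is silently lost, which is exactly why a hand-edited file needs a different solution.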


Generated code is fine if it's newly generated on every build. If you're going to have to maintain the generated code, it's not generated code anymore, but duplicated code.
Sure. I consider this a restatement of what I said, and thus inarguably right :)
Seconded. How many in this thread have found generated code in source control? My trophy case includes artifacts produced by: flex, bison, gperf, swig, and one particularly nasty CORBA stub generator.
No Perl?
Yes, and the original article is about how duplicated code is OK. The discussion has finally come full circle.
The original article isn't very convincing though. I mean, I fully believe the single abstract super controller was a bad idea, but there are far better options than that and duplicate code. He's just comparing two of the worst ways to do it.
> But if you want to make even a single tiny modification to one of the generated files, you're busted, you need a different solution.

Not totally true: if you can robustly express your tiny change as a `sed` or `awk` script, you can just append it to the generator pipeline. Speaking from experience, do not condone, etc.

I think GP means "make a tiny change [after generation, outside of the generator, and persist that change independent of the generator code]", which is where all the demons are waiting.

Modifying the generator itself to do something different every time, and then doing GP's stated "regenerate and throw away the old stuff", is in line with what GP described.

It's not modifying the generator. The generator may be a proprietary black box. It's wrapping the generator in a bash script that pipes the result through AWK, etc.
As other commenters have noted, if the awk script is just a pure function of the output of the black-box generator to a new output, then I would consider this a modification to the generator, and no problemo.

However, if your awk script requires the current state of the generated code as input in addition to the output of the black-box generator, and tries to reconcile a diff between the two things, then yep, I consider that busted.
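
A rough sketch of the first, non-busted case, in Python rather than awk. `black_box_generator` is a hypothetical stand-in for a generator that cannot be modified, and the malformed version string it emits is invented for illustration. The post-processing step reads only the generator's fresh output and never the previously generated files.

```python
import re


def black_box_generator(schema_name: str) -> str:
    """Stand-in for a generator we cannot change (hypothetical output)."""
    return f'VERSION = "1.0..0"\nNAME = "{schema_name}"\n'


def fix_version(generated: str) -> str:
    """Pure post-processing: depends only on the generator's output."""
    return re.sub(r'"(\d+)\.(\d+)\.\.(\d+)"', r'"\1.\2.\3"', generated)


def pipeline(schema_name: str) -> str:
    """Wrapped generator: black box first, then the fix-up step."""
    return fix_version(black_box_generator(schema_name))


if __name__ == "__main__":
    print(pipeline("orders"))
```

Because `pipeline` is a pure function of the schema, its output can still be thrown away and regenerated at will. The busted variant would add a second argument holding the current on-disk code and try to merge the two.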

> It's wrapping the generator in a bash script that pipes the result through AWK, etc.

Which is itself a generator

Sure, that's orthogonal. If you wrap the generator in your build system and still always regenerate, it's effectively the same. And also, I think, not what GP was talking about
Pedantic. There's a world of difference between grokking a new code generation DSL+codebase and a shell one-liner that fixes a string that is obviously invalid.

Since the issue is the maintenance of such systems, it is absolutely relevant.

No thank you! I don't enjoy fighting dragons :)
> For instance generating libraries from .proto files (or other declarative schema definition solutions) works really well.

...does it? Generated libraries always feel mismatched with the language's paradigms. Maybe that's just my nightmares of dealing with the MS Graph generated vomit hose of a library...

Sure, that's true, I'm a heavy user of the standard protobuf library in python, and you certainly won't catch me singing its praises for its style.

But that's a different (and less important) kind of problem. It does not exhibit the huge issue with generated-and-then-modified code where you have to maintain all the generated code rather than just the source from which it was generated.

It's trading the time a few developers would spend manually writing a client for the time of tens of thousands of developers who then have to use a client that doesn't fit the language well.

It is IMO a very bad tradeoff.

As a Lisp guy I find this entire discussion weird
Ha, yeah, though I would say that the Lisp solution has a different downside: it's really nice to be able to see what the post-generation code all looks like. None of the Lisps I've used have made that as easy for their macro expansions as I would like.
There was an editor (for cmucl maybe?) that would macroexpand in a tooltip on hover and macroexpand-1 on right click (or maybe the opposite) on an s-expression. I'm surprised something like that didn't make it into slime, though I think you can macroexpand to the minibuffer. But, yeah, that's why it rewards doing macros in small pieces.
I absolutely prefer code generation over macros. It is a general solution that works for all languages, databases, protocols etc. And you can easily inspect the code generated.
You can wrap a code generator in a macro.
Why would you want to do that? That would be adding unnecessary compile time overhead. And (again) code generation works for any language/framework/OS/… Not just for Lisp.
You can handle any language with a read-time parser, then work with ASTs, pretty-print the result in another language. In between, it's just Lisp.
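
The parse, transform, pretty-print shape described here can be sketched outside Lisp too. The following is an illustration in Python, not the commenter's setup: it uses the standard `ast` module to parse a small subset of Python arithmetic and prints it as an s-expression. The names `to_sexpr` and `translate` are made up for this example, and only `+`, `-`, `*`, `/`, names, and constants are handled.

```python
import ast

# Mapping from Python AST operator nodes to target-language symbols.
_OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}


def to_sexpr(node: ast.AST) -> str:
    """Pretty-print a tiny subset of Python expressions as s-expressions."""
    if isinstance(node, ast.Expression):
        return to_sexpr(node.body)
    if isinstance(node, ast.BinOp):
        op = _OPS.get(type(node.op))
        if op is None:
            raise ValueError(f"unsupported operator: {type(node.op).__name__}")
        return f"({op} {to_sexpr(node.left)} {to_sexpr(node.right)})"
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Constant):
        return repr(node.value)
    raise ValueError(f"unsupported node: {type(node).__name__}")


def translate(source: str) -> str:
    """Parse source text into an AST, then emit it in another syntax."""
    return to_sexpr(ast.parse(source, mode="eval"))


if __name__ == "__main__":
    print(translate("a + b * 2"))  # prints (+ a (* b 2))
```
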
Ahhh, you mean using Lisp to write the code generator. Yep, makes sense.