|
|
|
|
|
by svalorzen
1563 days ago
|
|
I'm not sure I understand why dynamic programming wouldn't work (and the author explicitly mentioned Knuth). Tex's main job is literally doing line breaks, which is the exact same problem being tackled here. I would expect a similar approach (progressively build a graph of the most promising breaking points) to be effective. Why wouldn't it be the case here? |
|
With the code formatter, to format the part after a certain character, you need to keep track of the indentation depth of all the expressions that have not yet terminated at this point — because you presumably want parallel expressions to be formatted with the same indentation depth, for closing parentheses to match their corresponding opening parentheses, etc.
For example, in this example:
and, say (I'm making this up): — knowing that there's a break after the `&&` is not enough; you also need to know the indentation of the previous expressions, to decide how you're going to format the part after the `&&`.This is what the author alludes to in the post:
> A line break changes the indentation of the remainder of the statement, which in turn affects which other line breaks are needed. Sorry, Knuth. No dynamic programming this time. […] For most of the time, the formatter did use dynamic programming and memoization. […] It worked fairly well, but was a nightmare to debug.
> It was highly recursive, and ensuring that the keys to the memoization table were precise enough to not cause bugs but not so precise that the cache lookups always fail was a very delicate balancing act. Over time, the amount of data needed to uniquely identify the state of a subproblem grew, including things like the entire expression nesting stack at a point in the line, and the memoization table performed worse and worse.
In TeX, paragraphs have each line of the same width (simple case) or can have a \parshape (in general), but these are "global" constraints that don't depend on what breaks you choose.