Thanks, and ouch. I have some follow-up questions/comments, if you have the time:

> these are predefined as `\@firstoftwo` and `\@secondoftwo`

I do wish LaTeX kernel commands (which I'm assuming these are) were more widely documented. As it stands, it's pretty hard to keep track of what already exists. Is there a nice reference for those?

> Also the Unicode bytes are already active so setting their catcode is useless.

This is true for LaTeX and not TeX, correct? Originally, I'd `\expandafter\let\expandafter\@firstoct\@firstoftwoλ`, but I decided not to assume that that character was already active.

> Also redefining the first octet breaks LaTeX's UTF-8 handling...

How so? (If the else case wasn't broken)

> ...and the else case forms an infinite loop.

If `\Firstλ` was not an active character, would this still be true? Since I store `\Firstλ` in `\lambda@first@oct` before it's declared an active character.

> and it breaks other uses of `(` and `)` in the argument.

This is not a concern for the DSL, but...

> Changing the catcodes of `(` and `)` means that this command doesn't work in the arguments of other commands

...this is. Thanks.

> Instead you could do this as

Damn :) Thanks for the nice feedback. I suppose I should read up on xparse. In any case I feel like it's not moot to try to achieve the same results with primitives, to have some idea of what's breaking when a given program doesn't compile (usually at that point the primitives surface).
Not really; the traditional commands are rather messy. Of course you can read source2e, but that's annotated source code rather than real documentation. For new code it often makes sense to write the more programming-heavy parts in expl3, which is much better documented in interface3. (It provides these commands as `\use_i:nn` and `\use_ii:nn`.)
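To make the correspondence concrete, here is a small sketch comparing the two pairs of selectors. It assumes a LaTeX release from 2020-02 or later, where `\ExplSyntaxOn` is available without loading `expl3`; on older installations, add `\usepackage{expl3}` to the preamble.

```latex
\documentclass{article}
\begin{document}

% Traditional kernel names (need \makeatletter because of the @).
\makeatletter
\@firstoftwo{first}{second}   % typesets "first"
\@secondoftwo{first}{second}  % typesets "second"
\makeatother

% The expl3 equivalents documented in interface3.
% Spaces are ignored while expl3 syntax is on.
\ExplSyntaxOn
\use_i:nn  { first } { second }  % typesets "first"
\use_ii:nn { first } { second }  % typesets "second"
\ExplSyntaxOff

\end{document}
```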
> > Also the Unicode bytes are already active so setting their catcode is useless.
>
> This is true for LaTeX and not TeX, correct?
Right, this is LaTeX-specific.
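If you want to check this yourself, a quick sketch is to print the category code of the byte in question. This assumes pdfLaTeX with a kernel from 2018 or later (UTF-8 input enabled by default); with XeLaTeX or LuaLaTeX the input is not processed byte by byte, so the result means something different there.

```latex
\documentclass{article}
\begin{document}
% 0xCE is the first byte of the UTF-8 encoding of lambda (U+03BB = CE BB).
% Under pdfLaTeX this should print 13, the category code of an active character.
Category code of byte CE: \the\catcode"CE
\end{document}
```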
> > Also redefining the first octet breaks LaTeX's UTF-8 handling...
>
> How so? (If the else case wasn't broken)
LaTeX's definition of the first byte handles any valid UTF-8 bytes that follow it, either by using the corresponding definition or by raising a proper error. Even a replacement definition that didn't trigger the active character again would simply typeset the two bytes. That gives no useful error message, probably prints two random characters from the font, and completely ignores any definition made through LaTeX's mechanism for other code points starting with this byte.
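For comparison, this is roughly what going through LaTeX's own mechanism looks like: only the one code point is defined, and every other character sharing the same first byte keeps its normal behaviour. It is a sketch assuming pdfLaTeX; `\lambdaDSL` is a hypothetical stand-in for the DSL's entry point, not the command from the original code.

```latex
\documentclass{article}

% Hypothetical entry point: takes one argument and typesets "lambda <arg>.".
\newcommand\lambdaDSL[1]{\ensuremath{\lambda #1.}}

% Define only U+03BB; other code points starting with byte 0xCE
% keep their existing definitions or their normal error message.
\DeclareUnicodeCharacter{03BB}{\lambdaDSL}

\begin{document}
λ{x}
\end{document}
```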
> > ...and the else case forms an infinite loop.
>
> If `\Firstλ` was not an active character, would this still be true? Since I store `\Firstλ` in `\lambda@first@oct` before it's declared an active character.
You are correct: if the first byte weren't already an active character (e.g. in plain TeX), it wouldn't loop. It wouldn't expand to anything particularly useful, but that would be no worse than without the definition, so it would be "correct".
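The difference can be sketched with `~`, which is already active in LaTeX, as a stand-in for the first byte. This assumes the helper macro stores the character token itself rather than its meaning; `\savedtoken` and `\savedmeaning` are hypothetical names used only for this illustration.

```latex
\documentclass{article}
\begin{document}
\begingroup
% \def stores the active token ~ itself, not what ~ currently means.
\def\savedtoken{~}
% \let stores the current meaning of ~, which later redefinitions cannot change.
\let\savedmeaning=~
% Safe: the new ~ refers to the saved meaning.
\def~{[\savedmeaning]}
a~b
% Unsafe (left commented out): ~ -> \savedtoken -> ~ -> ... never terminates.
% \def~{\savedtoken}
\endgroup
\end{document}
```

Had `~` not been active when `\savedtoken` was defined, the stored token would be an ordinary character and the commented-out definition would not loop.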
> I suppose I should read up on xparse.
Normally `xparse` is preloaded and no longer a package, so its documentation has been moved into usrguide3. In this case you still need the package, though, because the `d` argument type has not been added to the kernel (and therefore not to usrguide3): delimited arguments are not recommended for LaTeX commands. It is still documented in the old `xparse` manual. Just in case you're wondering about the split.
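As a minimal sketch of what a `d`-type argument looks like, with the package loaded as described above: `\app` is a hypothetical command name, and the bracketed output is only there to make the captured argument visible.

```latex
\documentclass{article}
\usepackage{xparse}

% d() declares an optional argument delimited by ( and ).
% \IfValueTF tests whether that argument was actually given.
\NewDocumentCommand\app{d()}{%
  \IfValueTF{#1}{[#1]}{(none)}%
}

\begin{document}
\app(x y) and \app
\end{document}
```

Because the parentheses are matched by the argument parser and their category codes are never changed, a command defined this way also works inside the arguments of other commands.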