by jph00 2078 days ago
Be sure to read the last section - really deep work involved, including writing a Python parser! Here's a snippet:

"I used the built-in Python tokenizer, but built my own parser for the subset of statements (assignments, return) and operations supported by TensorSensor. There is a built-in Python parser, but it generates a different AST than I wanted; plus, filtering the built-in tree structure for just the parts I care about would be about the same amount of work as writing the parser. I built the parser using simple recursive descent, rather than adding another dependency (my friend antlr) to TensorSensor.

Next, we need to augment the AST with the values of all subexpressions so it looks like the AST from the previous section. This is a straightforward bottom-up walk of the AST calling eval() on each subexpression, saving the result in the associated node. In order to execute the various bits of the expression in the proper context after an exception, though, TensorSensor has to walk the call stack back down to the specific context that directly invoked the tensor library. The trick is not chasing the call stack too far, down into the tensor library.

Once the AST is augmented with partial values, we need to find the smallest subexpressions that evaluate to tensors for visualization purposes. That corresponds to the deepest subtrees that evaluate to tensors, which is the role of tsensor.analysis.smallest_matrix_subexpr()."
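
For anyone who wants a feel for the three steps in that quote, here is a minimal sketch of the same pipeline: tokenize with the standard library, parse with hand-written recursive descent, evaluate each subexpression bottom-up with eval(), then collect the deepest subtrees that evaluate to tensors. This is not TensorSensor's code. `Node`, `Parser`, `annotate`, `smallest_tensor_subexprs`, and the toy `Mat` class are all hypothetical names invented for illustration, and the grammar covers only `+ - * @`, names, numbers, and parentheses.

```python
import io
import tokenize
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical AST node: source text, children, and a value filled in later."""
    text: str
    kids: list = field(default_factory=list)
    value: object = None


SKIP = {tokenize.NEWLINE, tokenize.NL, tokenize.ENDMARKER, tokenize.COMMENT}


def lex(src):
    """Use the built-in tokenizer, keeping only the token strings we care about."""
    toks = tokenize.generate_tokens(io.StringIO(src).readline)
    return [t.string for t in toks if t.type not in SKIP]


class Parser:
    """Recursive descent for a tiny expression subset (an assumption of this sketch):
    expr := term (('+'|'-') term)*
    term := atom (('*'|'@') atom)*
    atom := NAME | NUMBER | '(' expr ')'
    """

    def __init__(self, src):
        self.toks = lex(src)
        self.i = 0

    def peek(self):
        return self.toks[self.i] if self.i < len(self.toks) else None

    def take(self):
        tok = self.peek()
        self.i += 1
        return tok

    def binary(self, operand, ops):
        # Left-associative chain: operand (op operand)*
        left = operand()
        while self.peek() in ops:
            op = self.take()
            right = operand()
            left = Node(f"{left.text} {op} {right.text}", [left, right])
        return left

    def expr(self):
        return self.binary(self.term, ("+", "-"))

    def term(self):
        return self.binary(self.atom, ("*", "@"))

    def atom(self):
        tok = self.take()
        if tok == "(":
            inner = self.expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return Node(f"({inner.text})", [inner])
        if tok is None or not (tok.isidentifier() or tok[0].isdigit()):
            raise SyntaxError(f"unexpected token {tok!r}")
        return Node(tok)


def annotate(node, env):
    """Bottom-up walk: evaluate children first, then this subexpression."""
    for kid in node.kids:
        annotate(kid, env)
    node.value = eval(node.text, {}, env)


def smallest_tensor_subexprs(node, is_tensor):
    """Return the deepest subtrees whose value is a tensor."""
    found = []
    for kid in node.kids:
        found += smallest_tensor_subexprs(kid, is_tensor)
    if not found and is_tensor(node.value):
        found.append(node)
    return found


class Mat:
    """Toy tensor stand-in that tracks only a shape (not a real tensor library)."""

    def __init__(self, *shape):
        self.shape = shape

    def __matmul__(self, other):
        return Mat(self.shape[0], other.shape[1])

    def __add__(self, other):
        return Mat(*self.shape)

    def __mul__(self, other):
        return Mat(*self.shape)


def is_tensor(value):
    return hasattr(value, "shape")


env = {"W": Mat(2, 3), "x": Mat(3, 1), "b": Mat(2, 1), "n": 4}
tree = Parser("W @ x + b * n").expr()
annotate(tree, env)
print([node.text for node in smallest_tensor_subexprs(tree, is_tensor)])
# prints ['W', 'x', 'b']
```

The sketch passes the variables in as a plain dict. It leaves out the part the quote calls the trick: after an exception, finding the right frame on the call stack to evaluate in, without descending into the tensor library itself.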

(Note: the author is also the creator of ANTLR.)

2 comments

Thanks, Jeremy. :) I didn't go into super huge detail in the article on the implementation part as most readers won't have interest in language nerd details like you and I do.
This kind of code transformation would be an everyday task in a Lisp. Would it make sense to use a Lisp FFI to interface with one of the tensor libraries and write the transformations there?