by narush 753 days ago
> To be clear, though, you don't necessarily have parity of outputs.

The cool thing is that the Excel file is both the programmatic specification of the process and the actual output data you want. We can check parity of outputs by comparing the data we create with Python to the data in Excel - in practice, Pyoneer generates test cases for tables that do exactly this, even when we can't translate every formula correctly!
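
To make the idea concrete, here is a rough sketch of what such a parity check could look like. It is not Pyoneer's actual generated test code: `find_mismatches`, `recompute_totals`, and the sample rows are all made up for illustration, and the "Excel" values are hard-coded instead of being read from a workbook.

```python
import math

def find_mismatches(expected_rows, actual_rows, rel_tol=1e-9):
    """Compare two tables (lists of dicts) cell by cell.

    Returns a list of (row_index, column, expected, actual) tuples.
    Numbers are compared with a tolerance, everything else exactly.
    """
    if len(expected_rows) != len(actual_rows):
        return [("row_count", None, len(expected_rows), len(actual_rows))]
    mismatches = []
    for i, (exp_row, act_row) in enumerate(zip(expected_rows, actual_rows)):
        for col, exp in exp_row.items():
            act = act_row.get(col)
            if isinstance(exp, (int, float)) and isinstance(act, (int, float)):
                same = math.isclose(exp, act, rel_tol=rel_tol)
            else:
                same = exp == act
            if not same:
                mismatches.append((i, col, exp, act))
    return mismatches

# Hypothetical values as they would appear in the spreadsheet's cells.
excel_rows = [
    {"qty": 2, "price": 1.5, "total": 3.0},
    {"qty": 4, "price": 0.25, "total": 1.0},
]

def recompute_totals(rows):
    """Hypothetical Python translation of the formula `=qty*price`."""
    return [{**r, "total": r["qty"] * r["price"]} for r in rows]

python_rows = recompute_totals(excel_rows)
mismatches = find_mismatches(excel_rows, python_rows)
```

An empty `mismatches` list means the Python output agrees with the spreadsheet for that table. A non-empty one points at the exact cells where a translated formula went wrong.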

> applying heuristics to say "hey, this enormous column of VLOOKUPs is actually a join", and so on.

We currently do this deterministically. The only non-deterministic part is formula translation, where we defer to an LLM. Structurally, though, everything is deterministic - and here we really do aim for readability (there's a lot more to do here though).
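
For anyone wondering what "a column of VLOOKUPs is actually a join" means in code, here is a minimal sketch. It assumes exact-match lookups and is not how Pyoneer detects or emits joins. `vlookup_column_as_join` and the sample tables are hypothetical.

```python
def vlookup_column_as_join(left_rows, right_rows, key, value_col):
    """Treat a whole column of exact-match VLOOKUPs as one left join.

    Like VLOOKUP, the first matching row in the lookup table wins.
    Keys with no match get None here (Excel would show #N/A).
    """
    lookup = {}
    for row in right_rows:
        # setdefault keeps the first value seen for each key
        lookup.setdefault(row[key], row[value_col])
    return [{**row, value_col: lookup.get(row[key])} for row in left_rows]

# Hypothetical sheets: an orders table and a price list.
orders = [
    {"sku": "A", "qty": 2},
    {"sku": "B", "qty": 1},
    {"sku": "Z", "qty": 5},
]
prices = [
    {"sku": "A", "price": 1.5},
    {"sku": "B", "price": 4.0},
    {"sku": "A", "price": 9.9},  # duplicate key, ignored like in VLOOKUP
]

joined = vlookup_column_as_join(orders, prices, key="sku", value_col="price")
```

The point is that the structure is recoverable without guessing: every cell in the column uses the same lookup range and the same key column, so the whole column collapses into a single join.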