by antimora 1170 days ago
I'm currently working on adding ONNX model import to Burn (https://burn-rs.github.io/); see https://github.com/burn-rs/burn/issues/204. This will let users generate model source code at build time and load weights at runtime.

In my opinion, ONNX is more complex than necessary. Therefore, I opted to convert it to an intermediate representation (IR) first, which is then used to generate source code. A key advantage of this approach is the ease of merging nodes into corresponding operations, since ONNX and Burn don't share the same set of operators.
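
To make the node-merging point concrete, here is a minimal sketch of the idea, not Burn's actual importer. `RawNode`, `IrOp`, and `lower` are hypothetical names invented for illustration, and the pass assumes the nodes are already in execution order. It folds a `MatMul` followed by an `Add` into a single `Linear` IR op, which is the kind of mismatch you hit when the two operator sets differ.

```rust
/// Hypothetical, heavily simplified stand-in for an ONNX graph node.
#[derive(Debug, Clone, PartialEq)]
pub struct RawNode {
    pub op_type: String,
    pub inputs: Vec<String>,
    pub output: String,
}

/// Hypothetical IR ops, closer to what a framework might expose as layers.
#[derive(Debug, Clone, PartialEq)]
pub enum IrOp {
    Linear {
        input: String,
        weight: String,
        bias: String,
        output: String,
    },
    /// Anything the sketch does not know how to merge is passed through.
    Other(RawNode),
}

/// Lower a list of raw nodes (assumed to be in execution order) into IR ops,
/// merging `MatMul` + `Add` pairs into one `Linear`.
///
/// Simplifications: only adjacent pairs are considered, the MatMul result
/// must be the first input of the Add, and nothing checks that the weight
/// and bias are constants.
pub fn lower(nodes: &[RawNode]) -> Vec<IrOp> {
    let mut ir = Vec::new();
    let mut i = 0;
    while i < nodes.len() {
        let node = &nodes[i];
        if let Some(next) = nodes.get(i + 1) {
            // The merge is only safe if the Add is the sole consumer of the
            // MatMul output; otherwise the intermediate value is still needed.
            let consumers = nodes
                .iter()
                .filter(|n| n.inputs.contains(&node.output))
                .count();
            let mergeable = node.op_type == "MatMul"
                && next.op_type == "Add"
                && node.inputs.len() == 2
                && next.inputs.len() == 2
                && next.inputs[0] == node.output
                && consumers == 1;
            if mergeable {
                ir.push(IrOp::Linear {
                    input: node.inputs[0].clone(),
                    weight: node.inputs[1].clone(),
                    bias: next.inputs[1].clone(),
                    output: next.output.clone(),
                });
                i += 2; // both raw nodes were consumed by the merge
                continue;
            }
        }
        ir.push(IrOp::Other(node.clone()));
        i += 1;
    }
    ir
}
```

A code generator would then walk the `Vec<IrOp>` rather than the raw graph, so each IR op can map to one construct in the generated source.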

2 comments

Actually WONNX also transforms to an IR first (early versions did not and simply translated the graph 1:1 to GPU shader invocations in topologically sorted order of the graph). In WONNX the IR nodes are (initially) simply (copy-on-write references to) the ONNX nodes. This IR is then optimized in various ways, including the fusion of ONNX ops (e.g. Conv+ReLU->ConvReLU). The newly inserted node still embeds an ONNX node structure to describe it but uses an internal operator.
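
A rough sketch of that copy-on-write scheme, using `std::borrow::Cow` from the standard library. This is not WONNX's actual code: `OnnxNode`, `IrNode`, `to_ir`, `fuse_conv_relu`, and the internal operator name `"ConvRelu"` are all made up for illustration, and the input is assumed to be topologically sorted already.

```rust
use std::borrow::Cow;

/// Hypothetical, simplified stand-in for an ONNX node.
#[derive(Debug, Clone, PartialEq)]
pub struct OnnxNode {
    pub op_type: String,
    pub inputs: Vec<String>,
    pub outputs: Vec<String>,
}

/// IR node that either borrows the original ONNX node or owns a rewritten one.
#[derive(Debug, Clone)]
pub struct IrNode<'a> {
    pub node: Cow<'a, OnnxNode>,
}

/// Initially every IR node is just a borrowed reference to an ONNX node.
pub fn to_ir(nodes: &[OnnxNode]) -> Vec<IrNode<'_>> {
    nodes
        .iter()
        .map(|n| IrNode { node: Cow::Borrowed(n) })
        .collect()
}

/// Fuse each Conv that is immediately followed by a Relu reading its output.
///
/// Simplifications: the pair must be adjacent in the sorted list, and the
/// sketch assumes the Relu is the only consumer of the Conv output.
pub fn fuse_conv_relu<'a>(ir: Vec<IrNode<'a>>) -> Vec<IrNode<'a>> {
    let mut out: Vec<IrNode<'a>> = Vec::with_capacity(ir.len());
    for cur in ir {
        let fusable = match out.last() {
            Some(prev) => {
                prev.node.op_type == "Conv"
                    && cur.node.op_type == "Relu"
                    && cur.node.inputs == prev.node.outputs
            }
            None => false,
        };
        if fusable {
            // Copy happens only here: the Conv node is cloned into an owned
            // node, renamed to the internal operator, and rewired to produce
            // the Relu's outputs.
            let prev = out.pop().expect("checked by `fusable`");
            let mut fused = prev.node.into_owned();
            fused.op_type = "ConvRelu".to_string();
            fused.outputs = cur.node.outputs.clone();
            out.push(IrNode { node: Cow::Owned(fused) });
        } else {
            out.push(cur);
        }
    }
    out
}
```

Nodes that no optimization touches stay `Cow::Borrowed`, so the original model is never copied wholesale. Only fused nodes become `Cow::Owned`, and they still carry an ONNX-shaped struct, just with a non-ONNX operator name.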
Looks great!

ONNX is 100% more complex than necessary. Another format of interest is NNEF: https://www.khronos.org/nnef

Also see the recently introduced StableHLO and its serialization format: https://github.com/openxla/stablehlo/blob/main/docs/bytecode...