Hacker News
by adgjlsfhk1 1729 days ago
Note that while BLAS and friends aren't getting rewritten in C, there is an effort underway to write replacements in Julia. The basic reason is that metaprogramming and better optimization frameworks are making it possible to write these at a higher level, where you basically specify a cost model and generate an optimal method from it. The big advantage is that this works for more than just standard matrix multiplies: the same framework can give you complex matrices, integer and boolean matrices, and matrices over different algebras (e.g. max-plus).
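To make the "different algebras" point concrete, here is a rough Python sketch of a matrix multiply that is parameterized by its add and multiply operations. It only illustrates the generic idea: it is a naive triple loop with no cost model or kernel generation, and the names `Semiring`, `matmul`, `STANDARD`, `MAX_PLUS` and `BOOLEAN` are made up for this example rather than taken from any of the Julia packages.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass(frozen=True)
class Semiring:
    """Hypothetical container for the algebra a matmul runs over."""
    add: Callable[[Any, Any], Any]   # combines partial products
    mul: Callable[[Any, Any], Any]   # combines one element of A with one of B
    zero: Any                        # identity element for `add`


def matmul(a: List[list], b: List[list], sr: Semiring) -> List[list]:
    """Naive (n x k) times (k x m) product using sr.add / sr.mul."""
    n, k, m = len(a), len(b), len(b[0])
    out = [[sr.zero] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            acc = sr.zero
            for p in range(k):
                acc = sr.add(acc, sr.mul(a[i][p], b[p][j]))
            out[i][j] = acc
    return out


# Ordinary arithmetic; also works for complex entries.
STANDARD = Semiring(add=operator.add, mul=operator.mul, zero=0)
# Max-plus algebra: "addition" is max, "multiplication" is +.
MAX_PLUS = Semiring(add=max, mul=operator.add, zero=float("-inf"))
# Boolean matrices: OR of ANDs.
BOOLEAN = Semiring(add=operator.or_, mul=operator.and_, zero=False)

if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    print(matmul(a, b, STANDARD))
    print(matmul(a, b, MAX_PLUS))
```

The same loop nest serves every algebra; only the `Semiring` passed in changes. The hard part, which this sketch does not attempt, is generating a fast kernel for each case.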

That's a cool idea. I don't know how realistic it is to achieve performance parity, but the "generic" functionality is definitely intriguing.
The initial results are that libraries like LoopVectorization can already generate optimal micro-kernels and are competitive with MKL (for square matrix-matrix multiplication) up to around size 512. With help on the macro-kernel side from Octavian, Julia is able to outperform MKL for sizes up to 1000 or so (and is about 20% slower for bigger sizes). https://github.com/JuliaLinearAlgebra/Octavian.jl