by chabons 919 days ago
Interesting point. That said, the AI model space is rapidly evolving, while bitcoin's hashing problem is static. This makes it significantly riskier to make a large capital investment in dedicated HW when it's unclear whether it will be able to run the next big model architecture. For instance, if this had been built and released a year ago, before SOTA models used MoE, it would rapidly have become obsolete.

Outside of hardware/implementation optimizations and position embedding choice, has the SOTA transformer architecture evolved that much?

The Llama-2 code appears to be about the same as GPT-2's.

You can look at https://github.com/ggerganov/llama.cpp/blob/master/llama.cpp... for examples of the different layers in a number of different models, and further down in the code for their implementations. TL;DR: yes, they are very similar. I can see lots of value in something that can just run these models. Even if it only supported Llama-2, there are tons of options available.
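
To make the "very similar" claim a bit more concrete, here is a rough sketch that writes down one decoder block for each model and diffs them. The `BlockSpec` type and `diff_specs` helper are made up for illustration and are not llama.cpp names. The field values are my summary of the published model descriptions (GPT-2 and the 7B Llama-2 variant), not something extracted from the linked file, so treat them as assumptions.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class BlockSpec:
    """Hypothetical summary of one decoder block; not a llama.cpp type."""
    norm: str
    norm_placement: str
    attention: str
    position: str
    mlp_activation: str


# Assumed from the published model descriptions, not read out of llama.cpp.
GPT2 = BlockSpec(
    norm="LayerNorm",
    norm_placement="pre-norm",
    attention="causal multi-head",
    position="learned absolute",
    mlp_activation="GELU",
)
LLAMA2_7B = BlockSpec(
    norm="RMSNorm",
    norm_placement="pre-norm",
    attention="causal multi-head",
    position="rotary (RoPE)",
    mlp_activation="SwiGLU",
)


def diff_specs(a: BlockSpec, b: BlockSpec) -> dict:
    """Return {field: (a_value, b_value)} for every field that differs."""
    return {
        f.name: (getattr(a, f.name), getattr(b, f.name))
        for f in fields(a)
        if getattr(a, f.name) != getattr(b, f.name)
    }


if __name__ == "__main__":
    for name, (old, new) in diff_specs(GPT2, LLAMA2_7B).items():
        print(f"{name}: {old} -> {new}")
```

At this level of detail the differences are the norm, the position encoding, and the MLP activation. The overall block layout (attention, then MLP, each with a residual connection) stays the same.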