by andy99 922 days ago
There is a lot going on in the LLM / AI chip space. Most of the big players, like Cerebras and Untether, are focusing on general-purpose AI chips. This, which I understand to be more like an ASIC, is an interesting market: they give up flexibility but can presumably make the chips more cheaply. There is also Positron AI in this space, mentioned here: https://news.ycombinator.com/item?id=38601761

I'm only peripherally aware of ASICs for bitcoin mining, and I have no idea about the economics or cycle times. It would be interesting to see a comparison between bitcoin mining chips and AI chips.

One thing I wonder about is that all of AI is very forward-looking, i.e. anticipating there will be applications to warrant building more infrastructure. It may be a tougher sell to convince someone they need to buy a transformer inference chip now, as opposed to something more flexible they'll use in an imagined future.

Only one certainty: HBM memory makers will be doing nicely in the current climate, as all these AI processing options are using it in larger and larger volumes. They will be the unnoticed winners in this rush.
In the cloud, these chips will compete head to head with GPUs. If they are able to pull off a 10x price/performance win without excessive porting work… it’ll take off in a heartbeat.
Like ASIC Bitcoin miners did. There are parallels here in how it might just pan out.
Interesting point. That said, the AI model space is rapidly evolving, while bitcoin's hashing problem is static. This makes it significantly more risky to make a large capital investment in dedicated HW when it's unclear if it will be able to run the next big model architecture. For instance, if this had been built and released a year ago, before SOTA models used MoE, then it would rapidly have become obsolete.
Outside of hardware/implementation optimizations, and position embedding choice - has the SOTA transformer architecture evolved that much?

Llama-2 code appears to be about the same as gpt-2.

You can look at https://github.com/ggerganov/llama.cpp/blob/master/llama.cpp... for examples of the different layers in a number of different models, and further down in the code for their implementations. tldr, yes they are very similar. I can see lots of value in something that can just run these models. Even if you just supported llama2 there are tons of options available.
Oh man, all those years back I made a choice between antminer and butterfly labs. I backed the wrong horse.

BFL mined with customer hardware and basically didn't ship units to customers until there was no profit in running one.

Crypto ASICs are a super weird edge case in chips IMHO. Strictly speaking, it's not rational to sell them if they are very profitable. It only makes sense if the customer has a different risk profile than you; or the customer can somehow get power more cheaply than you; or you have some kind of scam going on; or you couldn't get capital except by presales and are unusually honest.

Note that an additional profit-making option for crypto ASIC producers is to secretly over-produce and compete with your customers and you are unlikely to get caught doing this.

That isn't correct. Suppose the ASIC manufacturer produces them for X, could use them directly, and it only costs Y to operate them for Z profit. Then you can price them as Z-Y-profit_margin > X. The lower the operating cost, the higher your profit margin per chip if you sell it. Selling the chip might have a 50% profit margin while mining has a 10% profit margin. If you wanted to get into mining, you would build a holding company owning both types of companies.
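
To make the sell-vs-mine arithmetic concrete, here is a rough Python sketch. It rests on assumptions that are not in the thread: Z is read as lifetime mining revenue per unit, profit_margin as the margin left to the buyer, and the buyer is allowed a different operating cost than the manufacturer (the "cheaper power" case mentioned upthread). All function names, parameter names, and numbers are made up for illustration.

```python
def sale_price(revenue_z, buyer_operating_cost_y, buyer_margin):
    """Price ceiling per unit: Z - Y - profit_margin (buyer's view, assumed reading)."""
    return revenue_z - buyer_operating_cost_y - buyer_margin


def compare_sell_vs_mine(production_cost_x, revenue_z,
                         maker_operating_cost_y, buyer_operating_cost_y,
                         buyer_margin):
    """Compare the manufacturer's per-unit profit from selling vs. mining itself."""
    price = sale_price(revenue_z, buyer_operating_cost_y, buyer_margin)
    sell_profit = price - production_cost_x
    mine_profit = revenue_z - maker_operating_cost_y - production_cost_x
    return {
        "price": price,
        "sell_profit": sell_profit,
        "mine_profit": mine_profit,
        # Selling only works at all if the price clears production cost (price > X).
        "selling_viable": price > production_cost_x,
        "selling_beats_mining": sell_profit > mine_profit,
    }


# Hypothetical numbers: the buyer has cheaper power than the manufacturer.
result = compare_sell_vs_mine(
    production_cost_x=1000,
    revenue_z=5000,
    maker_operating_cost_y=3000,
    buyer_operating_cost_y=2000,
    buyer_margin=500,
)
print(result)
```

Under these assumptions, selling beats self-mining only when the buyer's operating-cost advantage is larger than the margin left to the buyer. If both parties have the same operating cost, the manufacturer gives up exactly that margin by selling, which is the case the parent comment describes.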