by amayne 992 days ago
Could you give us an example of what that means in practical terms?

The post says that 1000 neurons will give you 130 LLM responses - but of what length?

(LLMs are generally priced by input and output tokens. The more tokens, the longer the compute time. Without knowing what you mean by a response, it's hard to understand the pricing.)

Likewise: 1,250 embeddings – How big is the text size in the example?
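
To put rough numbers on the two figures quoted above, here is a back-of-the-envelope sketch. It only divides the numbers from the post. The token counts in the loop are my own guesses (hypothetical, not from the post), which is exactly the missing piece I'm asking about.

```python
# Figures quoted in the post: 1,000 neurons buy roughly 130 LLM responses
# or 1,250 embeddings. This is plain division, not an official pricing formula.
NEURONS = 1000
LLM_RESPONSES = 130
EMBEDDINGS = 1250

neurons_per_response = NEURONS / LLM_RESPONSES
neurons_per_embedding = NEURONS / EMBEDDINGS


def neurons_per_token(tokens_per_response: int) -> float:
    """Hypothetical helper: spread one response's cost over an assumed token count."""
    return neurons_per_response / tokens_per_response


print(f"{neurons_per_response:.2f} neurons per LLM response")
print(f"{neurons_per_embedding:.2f} neurons per embedding")

# Assumed response lengths (my guesses); the post does not state any.
for assumed_tokens in (100, 500, 1000):
    cost = neurons_per_token(assumed_tokens)
    print(f"{assumed_tokens} tokens/response -> {cost:.4f} neurons per token")
```

The per-token cost changes by 10x depending on which response length you assume, so the "130 responses" figure alone doesn't pin down the price.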

I'm VERY excited to see you doing this and understand it's early stages, but I can't wrap my head around the pricing without context.