by jbarrow 1091 days ago
The order of magnitude of the suggested pricing is really interesting: $0.001/word is significantly more expensive than, say, OpenAI's pricing for GPT-3.5-turbo ($0.002/1k tokens, or ~750 words, so ~$0.0000027/word, assuming I got my zeros correct). So this would increase the cost of running GPT-3.5-turbo by roughly 375x.
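For anyone who wants to check the zeros, here is a rough sketch of that arithmetic. It uses only the figures quoted above; the 750-words-per-1k-tokens conversion is a rule of thumb rather than a measured value, and the function and constant names are made up for illustration.

```python
# Figures quoted in the comment; none of these are authoritative.
PROPOSED_FEE_PER_WORD = 0.001   # USD per word, the suggested licensing fee
PRICE_PER_1K_TOKENS = 0.002     # USD per 1k tokens, GPT-3.5-turbo as quoted
WORDS_PER_1K_TOKENS = 750       # assumed rule of thumb: 1k tokens ~ 750 words


def inference_cost_per_word(price_per_1k_tokens, words_per_1k_tokens):
    """Convert a per-1k-token price into an approximate per-word price."""
    return price_per_1k_tokens / words_per_1k_tokens


def fee_to_inference_ratio(fee_per_word, cost_per_word):
    """How many times larger the per-word fee is than the inference cost."""
    return fee_per_word / cost_per_word


cost = inference_cost_per_word(PRICE_PER_1K_TOKENS, WORDS_PER_1K_TOKENS)
ratio = fee_to_inference_ratio(PROPOSED_FEE_PER_WORD, cost)

print(f"inference cost: ${cost:.7f} per word")
print(f"fee is about {ratio:.0f}x the inference cost")  # about 375x
```

Without rounding the per-word cost, the fee comes out near 375x the inference price, so a few hundred times more expensive either way.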

In terms of implementation, I wonder about a few things:

Do models trained on more data have to pay more? LLaMA was trained on up to 1.4T tokens; the original GPT-3 was trained on ~300B tokens. And this is only partially related to model quality: LLaMA 13B and LLaMA 65B were trained on the same data mix, but the 65B model is better. What's the incentive to ever use the 13B model if the licensing cost is 100x-1000x the model inference cost?

Who defines a word? Each model uses a different tokenizer. I'm personally amused by the idea of a government-mandated tokenizer.
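To make the "who defines a word" problem concrete, here is a small sketch with two plausible word counters that disagree on the same text. Both definitions are hypothetical (neither comes from the proposal), and the sample sentence is invented for illustration.

```python
import re

FEE_PER_WORD = 0.001  # USD per word, the figure discussed in this thread


def count_whitespace_words(text):
    """Definition A: a word is any run of non-whitespace characters."""
    return len(text.split())


def count_alpha_words(text):
    """Definition B: a word is a run of letters, allowing inner apostrophes."""
    return len(re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", text))


sample = "GPT-3.5-turbo isn't state-of-the-art -- e.g. it costs $0.002/1k tokens."

for name, counter in [("whitespace", count_whitespace_words),
                      ("alphabetic", count_alpha_words)]:
    n = counter(sample)
    print(f"{name}: {n} words, fee ${n * FEE_PER_WORD:.3f}")
```

Hyphenated terms, abbreviations, and prices are counted differently under the two rules, so the billed amount depends on which definition gets written into the rule, before model tokenizers even enter the picture.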

What about generations that never see human eyes? As an NLP researcher, I've generated millions of tokens for training and automatic evaluation purposes -- are those subject to licensing as well?

1 comment

Yeah, the idea is that it's much more expensive than current OpenAI pricing but much less expensive than what even a low-end marketing copywriter would charge per word. A side effect would be to push such tools towards more valuable uses.

The idea is to keep it simple, so it wouldn't be based upon the specifics of training, just whether or not it used public data. Anything else would require companies to divulge trade secrets and that won't fly. And words are defined here as, well, words -- English words. There'd be a separate fee per pixel/voxel, and then a catchall for non-language/non-image models.