|
|
|
|
|
by eternalban
1184 days ago
|
|
You are working under an assumption that this tech is an O(n) or better computational regime. Ask ChatGPT:
“Assume the perspective of an expert in CS and Deep Learning. What are the scaling characteristic (use LLMs and Transformer models if you need to be specific) of deep learning ? Expect answer in terms of Big O notation. Tabulate results in two rows, respectively “training” and “inference”. For columns, provide scaling characteristic for CPU, IO, Network, Disk Space, and time. ” This should get you big Os for n being the size of input (i.e. context size). You can then ask for follow up with n being the model size. Spoiler, the best scaling number in that entire estimate set is quadratic. Be “scared” when a breakthrough in model architecture and pipeline gets near linear. |
|