by ben_w
58 days ago
I'm not sure what DeepL uses, but Google researchers introduced the Transformer architecture (the T in GPT) for machine translation. IIRC, the original difference between that translation model and GPT was mostly about the attention mask, which is akin to how the Mandelbrot and Julia fractals use the same formula but the variables mean different things. So I'd argue they're basically still the same thing, and you can model what an LLM does as translating a prompt into a response.
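To make the attention-mask point concrete, here's a minimal sketch in plain Python. It's a toy illustration, not code from either model: the names `full_mask`, `causal_mask`, and `masked_softmax` are made up for this example, and real implementations work on tensors with learned scores. The attention formula is the same in both cases; only the mask changes which positions each token is allowed to look at.

```python
import math

def full_mask(n):
    # Encoder-style mask: every position may attend to every position.
    return [[True] * n for _ in range(n)]

def causal_mask(n):
    # Decoder-style (GPT-like) mask: position i may attend only to j <= i.
    return [[j <= i for j in range(n)] for i in range(n)]

def masked_softmax(scores, mask):
    # Row-wise softmax in which disallowed positions get zero weight.
    out = []
    for row, allowed in zip(scores, mask):
        peak = max(s for s, a in zip(row, allowed) if a)
        exps = [math.exp(s - peak) if a else 0.0 for s, a in zip(row, allowed)]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

# Toy attention scores: all equal, so the weights depend only on the mask.
scores = [[0.0] * 3 for _ in range(3)]
enc = masked_softmax(scores, full_mask(3))    # each token sees all tokens
dec = masked_softmax(scores, causal_mask(3))  # each token sees only the past
```

With the full mask every row spreads its weight over all three positions; with the causal mask the first token can only attend to itself, the second to the first two, and so on.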