by JohannaAlmeida
76 days ago
Yeah, autocomplete is an amazing use case. I needed a small transformer-based model that could fit on my weak consumer GPU, so I had to make fundamental architecture changes and do some KV cache tricks. Then I had to prove with benchmarks that the new architecture was faster and that its perplexity was still acceptable.