|
|
|
|
|
by TarqDirtyToMe
915 days ago
|
|
LLMLingua uses a well-trained small language model after alignment, such as GPT2-small or LLaMA-7B, to detect the unimportant tokens in the prompt and enable inference with the compressed prompt in black-box LLMs, achieving up to 20x compression with minimal performance loss. |
|