by lysecret
919 days ago
The good thing (some might say the bad thing) is that when it comes to fundamental technologies, only two are relevant:
1. Transformers
2. Diffusion
The benefit is that if you focus on understanding both really well, you are at the forefront of research ;) Also, why do you want to do this? If it is about building some kind of AI-enabled app, you don't have to read anything. Get an API key and go; the barrier has never been lower.
https://arxiv.org/abs/2301.10743
Another interesting research topic is the trusted generation of tasks for finetuning
https://arxiv.org/abs/2306.08568
Running these at the edge is terribly interesting too, I suppose. If you can find analyses of "quantization", this is a highly active research area, and the results are pretty incredible, since it cuts resource use by huge factors and no one quite knows why.
This is one that's easy to dive into with consumer hardware, but I don't know any great papers myself.
Run locally: https://github.com/ggerganov/llama.cpp
Quantized models: https://huggingface.co/TheBloke
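For anyone who wants a feel for what quantization does before reading papers, here is a minimal sketch of symmetric 8-bit quantization in plain Python. It is only a toy: the function names and sample weights are made up for illustration, and real tools such as llama.cpp use more elaborate block-wise schemes. The idea is the same, though: store small integers plus a scale factor instead of full-precision floats, and accept a bounded rounding error.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: floats -> (integers in [-127, 127], scale)."""
    max_abs = max(abs(w) for w in weights)
    # One scale for the whole list; fall back to 1.0 if every weight is zero.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map the stored integers back to approximate float weights."""
    return [v * scale for v in q]


# Hypothetical weights, purely for illustration.
weights = [0.12, -0.5, 0.33, 0.9, -0.07]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Rounding moves each value by at most half a quantization step.
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_err)
```

Each weight now fits in one byte instead of four (for float32), which is where the resource savings come from; lower bit widths trade more rounding error for more savings.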
Explainability is under research, though I haven't seen any good solutions.
This may arise from skeptics who are calling these things stochastic parrots, incapable of reasoning, without a world model, etc.