by meltyness 917 days ago
On that note, what exactly these models can express is an interesting question. For transformer encoders, see:

https://arxiv.org/abs/2301.10743

Another interesting research topic is the trusted generation of tasks for finetuning:

https://arxiv.org/abs/2306.08568

Running these at the edge is terribly interesting too. If you can find analyses of "quantization", it's a highly active research area, and the results are pretty incredible: it cuts resource requirements by huge factors, and no one quite knows why it works so well.

This is one that's easy to dive into with consumer hardware, but I don't know any great papers myself.

Run locally: https://github.com/ggerganov/llama.cpp

Quantized models: https://huggingface.co/TheBloke
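
To get a feel for what quantization does, here's a minimal sketch in plain Python. It's a toy symmetric "absmax" scheme with one shared scale, which I'm assuming purely for illustration; the function names and sample weights are made up, and real tools typically use more elaborate block-wise formats. The idea is just that storing 8-bit integers instead of 32-bit floats cuts memory by about 4x at the cost of a small rounding error.

```python
def quantize_absmax(weights, bits=8):
    """Toy symmetric quantization: map floats to signed ints with one shared scale."""
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level and clamp to the representable range.
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate floats from the integer codes."""
    return [v * scale for v in q]


# Hypothetical weights, chosen only to show the round trip.
weights = [0.12, -0.5, 0.33, 0.9, -0.07]
q, scale = quantize_absmax(weights)
restored = dequantize(q, scale)

# Worst-case round-trip error; it should be at most half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Dropping `bits` to 4 shows the error growing as the number of levels shrinks, which is roughly the trade-off the quantized model releases are playing with.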

Explainability is under research, though I haven't seen any good solutions.

This may arise from skeptics who call these models stochastic parrots, incapable of reasoning, lacking a world model, etc.