Hacker News
by lysecret 919 days ago
The good thing (and some might say the bad thing) is that, when it comes to fundamental technologies, only two are relevant:

1. Transformers
2. Diffusion

The benefit is that if you focus on understanding both of them reeaaalllyy well, you are at the forefront of research ;)

Also, what is the reason you want to do this? If it is about building some kind of AI-enabled app, you don't have to read anything. Get an API key and go; the barrier has never been lower.

2 comments

On that note, what exactly these models can express is an interesting question. For transformer encoders:

https://arxiv.org/abs/2301.10743

Another interesting research topic is the trusted generation of tasks for finetuning:

https://arxiv.org/abs/2306.08568

Running these at the edge is terribly interesting too. If you can find analyses of "quantization", it is a highly active research area, and the results are pretty incredible: it cuts resource use by huge factors, and no one quite knows why.

This is one that's easy to dive into with consumer hardware, but I don't know of any great papers myself.

Run locally: https://github.com/ggerganov/llama.cpp

Quantized models: https://huggingface.co/TheBloke
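
To make the quantization point a bit more concrete, here is a minimal sketch of the basic idea: store each weight as a small integer plus one shared scale factor. This is a toy symmetric "absmax" scheme I am assuming for illustration, not the format llama.cpp or the models linked above actually use (as far as I know those are block-wise and more elaborate). The function names and sample weights are made up.

```python
# Toy symmetric "absmax" quantization of a list of float weights.
# Illustrative only; real formats are block-wise and more elaborate.

def quantize(weights, bits=8):
    """Map floats to signed integers in [-qmax, qmax] with one shared scale."""
    qmax = 2 ** (bits - 1) - 1            # 127 for 8 bits, 7 for 4 bits
    absmax = max(abs(w) for w in weights)
    scale = absmax / qmax if absmax > 0 else 1.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from the integers and the scale."""
    return [v * scale for v in q]

def max_error(original, restored):
    """Largest absolute difference between original and restored values."""
    return max(abs(a - b) for a, b in zip(original, restored))

weights = [0.12, -0.5, 0.33, 0.9, -0.07]  # made-up sample values

q8, s8 = quantize(weights, bits=8)
q4, s4 = quantize(weights, bits=4)

err8 = max_error(weights, dequantize(q8, s8))
err4 = max_error(weights, dequantize(q4, s4))

print("8-bit ints:", q8, "max error:", err8)
print("4-bit ints:", q4, "max error:", err4)
```

Going from 32-bit floats to 4-bit integers is roughly an 8x cut in weight storage (ignoring the scale factors), at the cost of a rounding error of at most half the scale per weight in this scheme.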

Explainability is under research, though I haven't seen any good solutions.

This may arise from skeptics who call these things stochastic parrots, incapable of reasoning, lacking a world model, etc.

> there are only 2 that are relevant: 1. Transformers 2. Diffusion

I'd argue that there are plenty of less sexy, non-unicorn uses for AI/ML, particularly in industrial applications. SVMs, DNNs, etc. are still very relevant, as is GOFAI in some domains.