by srcrip
22 days ago
You seem to understand this stuff pretty well. Any recommendations on resources (blogs, YouTube channels, whatever) for software engineers who want to keep up with this at this kind of level? A lot of the AI content out there is produced for the lowest common denominator: basically a never-ending stream of get-rich-quick/passive-income content.
If you’re curious about what a particular switch does, clone the llama.cpp repository to your computer and try asking your favorite pet rock (your LLM of choice) something like “This is llama.cpp. Can you look at what the -ctk parameter does and explain it to me?” Giving Claude/Codex/whatever access to the actual code goes a long way, but treat its answer as just one opinion.
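Since the model's answer is only one opinion, it helps to see the lines it should be reading. Here is a minimal sketch of a search over a local checkout. It assumes the repo is cloned into ./llama.cpp and that flags show up as plain text in .cpp/.h/.md files (both are assumptions), and find_flag is a hypothetical helper, not something that ships with llama.cpp:

```python
import os
import re
from typing import Iterator, Tuple


def find_flag(
    repo_dir: str,
    flag: str,
    exts: Tuple[str, ...] = (".cpp", ".h", ".md"),  # assumed file types
) -> Iterator[Tuple[str, int, str]]:
    """Yield (path, line_number, line) for each line that mentions `flag`."""
    # Match the flag only as a standalone token, so "-ctk" does not
    # also match a longer option such as "-ctkd" or "--ctk".
    pattern = re.compile(r"(?<![\w-])" + re.escape(flag) + r"(?![\w-])")
    for root, _dirs, files in os.walk(repo_dir):
        for name in files:
            if not name.endswith(exts):
                continue
            path = os.path.join(root, name)
            with open(path, encoding="utf-8", errors="ignore") as fh:
                for lineno, line in enumerate(fh, start=1):
                    if pattern.search(line):
                        yield path, lineno, line.strip()


if __name__ == "__main__":
    # "llama.cpp" is the assumed name of the local clone directory.
    for path, lineno, line in find_flag("llama.cpp", "-ctk"):
        print(f"{path}:{lineno}: {line}")
```

Each hit gives you a file and line number to check the model's explanation against, or to paste into the prompt as context.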
If you’d like to learn how transformer-based language modeling works in detail, I suggest starting with chapter 0 or 1 of https://arena-chapter0-fundamentals.streamlit.app/, depending on your skill level, then using that to work your way up to reading research papers.
Graduate students who study these topics are generally as annoyed by the “get rich quick” style of advertising as you are, so the deeper you go toward academic research, the quieter those voices tend to get, mercifully. That said, top labs unfortunately have strong incentives to posture, so it can be hard to tell which preprints actually have good ideas, which are promoting their group’s tech rather than doing science out of curiosity, and which come from authors who’ve innocently deluded themselves into overfitting to their own pet projects. Read widely but adversarially, test everything but hold fast to the good stuff, etc.