user: ModelForge
created: 2024-12-18
karma: 333

submissions:

Claude Code's Real Secret Sauce Isn't the Model
6 points | 0 comments
The State of LLMs 2025: Progress, Problems, and Predictions
3 points | 0 comments
A Researcher's Field Guide to Non-Standard LLM Architectures
2 points | 0 comments
Explanation of Gated DeltaNet (Qwen3-Next and Kimi Linear)
3 points | 0 comments
The Core Components of Modern LLMs and the Models Beyond Transformers [video]
3 points | 0 comments
Popular Attention Alternatives: GQA, MLA, SWA
4 points | 0 comments
Multi-Head Latent Attention
4 points | 0 comments
Thinking Machines Lab Co-Founder Departs for Meta
7 points | 0 comments
OpenAI's internal Slack messages could cost it billions in copyright suit
8 points | 1 comment
LLM Evaluation from Scratch: Multiple Choice, Verifiers, Leaderboards, LLM Judge
4 points | 0 comments
Gemma 3 270M re-implemented in pure PyTorch for local tinkering
417 points | 57 comments
GPT-OSS vs. Qwen3 and a detailed look how things evolved since GPT-2
490 points | 97 comments
LLM Research Papers: The 2024 List
5 points | 0 comments
Scaling Test-Time Compute with Open LLM Models
3 points | 0 comments