User: ModelForge | HN Mirror

user: ModelForge
created: 2024-12-18
karma: 474

submissions:

Kimi K3 Architecture Overview and Notes

382 points | 66 comments

Inkling: A New Open-Weight 975B MoE with a Few Surprises

3 points | 0 comments

Claude Code's Real Secret Sauce Isn't the Model

6 points | 0 comments

The State of LLMs 2025: Progress, Problems, and Predictions

3 points | 0 comments

A Researcher's Field Guide to Non-Standard LLM Architectures

2 points | 0 comments

Explanation of Gated DeltaNet (Qwen3-Next and Kimi Linear)

3 points | 0 comments

The Core Components of Modern LLMs and the Models Beyond Transformers [video]

3 points | 0 comments

Popular Attention Alternatives: GQA, MLA, SWA

4 points | 0 comments

Multi-Head Latent Attention

4 points | 0 comments

Thinking Machines Lab Co-Founder Departs for Meta

7 points | 0 comments

OpenAI's internal Slack messages could cost it billions in copyright suit

8 points | 1 comment

LLM Evaluation from Scratch: Multiple Choice, Verifiers, Leaderboards, LLM Judge

4 points | 0 comments

Gemma 3 270M re-implemented in pure PyTorch for local tinkering

417 points | 57 comments

GPT-OSS vs. Qwen3 and a detailed look how things evolved since GPT-2

490 points | 97 comments

LLM Research Papers: The 2024 List

5 points | 0 comments

Scaling Test-Time Compute with Open LLM Models

3 points | 0 comments