Hacker News
user: gpjt
created: 2009-01-12
karma: 1718
https://www.gilesthomas.com/
submissions:
Jax Back Ends and Devices
2 points
|
0 comments
Using Safetensors with Flax
2 points
|
0 comments
First Looking into Jax
3 points
|
0 comments
10Gb/s Ethernet: using mini-heatsinks with a 10GBASE-T SFP+ module
3 points
|
0 comments
10Gb/s Ethernet: what I did to get it working in my home
232 points
|
177 comments
10Gb Ethernet: what I had to (re)learn
1 point
|
1 comment
LLM from scratch, part 33 – what I learned from the appendices
5 points
|
0 comments
LLM from scratch (32l) – Interventions: updated instruction fine-tuning results
1 point
|
0 comments
How an LLM becomes more coherent as we train it
3 points
|
0 comments
LLM from scratch, part 32k – Interventions: gradient accumulation
2 points
|
0 comments
Provision: LLM-powered server setup from Markdown
2 points
|
0 comments
LLM from scratch, part 32j – trying to train a better model in the cloud
2 points
|
0 comments
Writing an LLM from scratch, part 32i – Interventions: what is in the noise?
1 point
|
0 comments
Writing an LLM from scratch, part 32h – Interventions: full fat float32
7 points
|
0 comments
Writing an LLM from scratch, part 32g – Interventions: weight tying
2 points
|
0 comments
Writing an LLM from scratch, part 32f – Interventions: weight decay
6 points
|
0 comments