Hacker News
user: shreyansh26
created: 2017-08-18
karma: 11

submissions:

Understanding Multi-Head Latent Attention (From DeepSeek)
2 points | 1 comment
Deriving the gradient for the backward pass of Layer Normalization
3 points | 0 comments
GTC'25 Notes: CUDA Techniques to Maximize Memory Bandwidth – Part 1
1 point | 0 comments
FlashAttention in PyTorch
2 points | 1 comment
Understanding FlashAttention
2 points | 0 comments
Ask HN: What are some good resources on Recommender Systems?
14 points | 3 comments