Hacker News
user: kumama
created: 2017-02-19
karma: 5
submissions:
I post-trained a model to reliably roll a die
2 points
|
0 comments
Open-Weight Models Don't Need to Win
5 points
|
8 comments
Prompt caching but for RL – 7.5x speedup on long-prompt/short-response workloads
4 points
|
0 comments
Pokegents: Making multi-agent coding feel like a team
8 points
|
1 comment
GRPO explained: group relative policy optimization for LLM finetuning
1 point
|
0 comments
Do RL on a model with your vector db
1 point
|
0 comments
What is reinforcement learning finetuning
3 points
|
0 comments
RAG to riches: synthetic data for training RAG agents
2 points
|
0 comments
rag not lag: rl for fast agentic retrieval
3 points
|
0 comments
Show HN: Benchmax, a new open-source RL environment framework for LLM finetuning
1 point
|
0 comments
Beating o3/o4-mini with Codebase-specific Reinforcement Learning
3 points
|
0 comments
We might be overestimating coding agent performance on SWE-Bench
1 point
|
1 comment
How to Improve Code Completion LLMs with Repo-Specific Finetuning
3 points
|
1 comment
Show HN: Free AI Code Completion for Xcode with model choice/codebase context
2 points
|
0 comments