Hacker News
by orasis 525 days ago
It works best when rewards are immediate. If your rewards are delayed, there is a paper on sampling from the “delay distribution” that addresses this.