by wdabney 2347 days ago
Hi, thanks for posting the news story.

We can think about asymmetric regression more generally. If you take an error and apply some 'response' function f to it, you change the estimator you learn. For quantile regression f is the sign function; for expectile regression it is the identity.
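To make the role of f concrete, here is a minimal sketch in plain Python. It is only an illustration, not the code from the paper: the function names, the full-batch update, the learning rate, and the exponential toy data are all my assumptions. Positive errors are weighted by tau and negative errors by 1 - tau, and the only thing that changes between the two estimators is the response applied to the error.

```python
import random
import statistics


def sign(delta):
    """Response for quantile regression: only the sign of the error matters."""
    return (delta > 0) - (delta < 0)


def identity(delta):
    """Response for expectile regression: the error is used as-is."""
    return delta


def asymmetric_estimate(samples, tau, response, lr=0.1, steps=1000):
    """Full-batch asymmetric regression toward a single scalar estimate.

    Positive errors are weighted by tau and negative errors by (1 - tau);
    `response` is the function f applied to each error.
    (lr and steps are illustrative values, not tuned.)
    """
    v = sum(samples) / len(samples)  # start at the sample mean
    for _ in range(steps):
        total = 0.0
        for x in samples:
            delta = x - v
            weight = tau if delta > 0 else 1.0 - tau
            total += weight * response(delta)
        v += lr * total / len(samples)
    return v


rng = random.Random(0)
samples = [rng.expovariate(1.0) for _ in range(300)]  # skewed toy data

print("sample mean     ", statistics.mean(samples))
print("sample median   ", statistics.median(samples))
print("f=sign, tau=0.5 ", asymmetric_estimate(samples, 0.5, sign))
print("f=id,   tau=0.5 ", asymmetric_estimate(samples, 0.5, identity))
print("f=sign, tau=0.75", asymmetric_estimate(samples, 0.75, sign))
```

With tau = 0.5 the sign response settles near the sample median and the identity response stays at the sample mean; moving tau away from 0.5 gives the other quantiles and expectiles.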

This is entirely speculation, but I think that further experiments, studying the effect we found in our paper more completely, will show that the response function f in the brain is not linear but some kind of saturating function, like a smoothed-out sign function. We repeated the experiments in the paper using such a function, one that has been proposed for dopamine neuron responses before, and the analysis continues to hold, because the rewards are all quite small and likely sit in the linear region of a non-linear response function (we know firing rate saturates eventually, so this isn't much of a surprise).

Regarding quantiles being more commonly used, it's actually the other way around. The Huber quantiles, which we saw perform best in the QR-DQN paper and which are most often used in the follow-on RL work, are actually more like the type of saturating non-linearity you might expect in the brain (although the Huber loss is not as smooth as you would probably expect a neuron's response to be).
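As a rough illustration of what "saturating" means here, the sketch below compares two response functions. The first is the derivative of the Huber loss, which is the error clipped to [-kappa, kappa]: identity for small errors, sign-like for large ones. The second uses tanh as a smooth stand-in; that is purely my assumption for illustration and not the specific function used in the paper or proposed for dopamine neurons. The names and the default kappa = 1.0 are hypothetical.

```python
import math


def huber_response(delta, kappa=1.0):
    """Derivative of the Huber loss: linear for |delta| <= kappa, then flat."""
    return max(-kappa, min(kappa, delta))


def smooth_response(delta, kappa=1.0):
    """A smooth saturating stand-in (tanh); an illustrative choice only."""
    return kappa * math.tanh(delta / kappa)


# Small errors pass through almost unchanged (the linear region);
# large errors are capped at roughly +/- kappa (the sign-like region).
for delta in (0.01, 0.1, 1.0, 5.0):
    print(delta, huber_response(delta), round(smooth_response(delta), 4))
```

For errors well below kappa both responses are nearly the identity, so the estimator behaves like an expectile; for errors well above kappa both behave like a scaled sign function, so it behaves like a quantile.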