by wdabney
2347 days ago
Hi, thanks for posting the news story. We can think about asymmetric regression more generally: if you take an error and apply some 'response' function f to it, you change the estimator you learn. For quantile regression f is the sign function; for expectile regression it is the identity.

In my opinion, and this is entirely speculation, further experiments that study the effect we found in our paper more completely will show that the response function f in the brain is not linear but a saturating function, something like a smoothed-out sign function. We repeated the experiments in our paper using such a function, which has been proposed for dopamine neuron responses before, and the analysis continues to hold, because the rewards are all quite small and likely sit in the linear region of a non-linear response function (we know firing rate saturates eventually, so this isn't much of a surprise).

Regarding quantiles being more commonly used, it's actually the other way around. The Huber-quantiles that we saw perform best in the QR-DQN paper, and that are most often used in the follow-on RL work, are actually more like the type of saturating non-linearity you might expect in the brain (although the Huber loss is not as smooth as you would probably expect a neuron's response to be).
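To make the 'response function' idea concrete, here is a minimal sketch in Python of an asymmetric update with a pluggable f. It is only an illustration under my own assumptions, not the code from the paper: the function names, the toy data, the step size, and the clipping threshold `kappa` are all made up for the example. The clipped response is the derivative of the Huber loss, which is one simple way to get something between sign and identity.

```python
import numpy as np

# Three hypothetical response functions f applied to the error.
def sign_response(err):
    return np.sign(err)              # quantile regression

def identity_response(err):
    return err                       # expectile regression

def clipped_response(err, kappa=0.1):
    # Derivative of the Huber loss: linear near zero, saturating beyond kappa.
    # kappa=0.1 is an arbitrary choice for this toy example.
    return np.clip(err, -kappa, kappa)

def asymmetric_estimate(samples, tau, response, lr=0.05, steps=5000):
    """Full-batch asymmetric update: positive errors are weighted by tau,
    the rest by (1 - tau), after passing through the response function."""
    v = 0.0
    for _ in range(steps):
        err = samples - v
        weight = np.where(err > 0, tau, 1.0 - tau)
        v += lr * np.mean(weight * response(err))
    return v

# Toy, right-skewed "reward" samples (an assumption for illustration only).
samples = np.linspace(0.0, 1.0, 101) ** 2

est_median = asymmetric_estimate(samples, 0.5, sign_response)
est_mean = asymmetric_estimate(samples, 0.5, identity_response)
est_huber = asymmetric_estimate(samples, 0.5, clipped_response)
est_q90 = asymmetric_estimate(samples, 0.9, sign_response)

print("sign, tau=0.5:    ", est_median, "vs median", np.median(samples))
print("identity, tau=0.5:", est_mean, "vs mean", samples.mean())
print("clipped, tau=0.5: ", est_huber)
print("sign, tau=0.9:    ", est_q90, "vs 0.9-quantile", np.quantile(samples, 0.9))
```

With tau = 0.5 the sign response settles near the sample median and the identity response near the sample mean, while the clipped response lands in between on this skewed data. If all the errors were smaller than `kappa`, the clipped response would behave exactly like the identity, which is the 'linear region' point above.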