by danielmarkbruce 1062 days ago
Is he? A surface-level reading suggests he's asking "how would you know?", and the answer is: by looking at the parameters. People do that.

>> because no one has yet looked at whether the trick helps reducing outliers in very large models

Given that a softmax version doing exactly what the blog post says is baked into a Google library (see this thread), and that you can set it as a parameter in a PyTorch model (see this thread), this claim seems off. "Let's try X, oh, X doesn't do much, let's not write a paper about it" is extremely common for many X.


This would seem like a really good argument for why failures should be written up. Otherwise, where is the list of what has been tried before?

Yup, it is. But it isn't going to happen.