by danielmarkbruce
1062 days ago
Is he? A surface-level reading suggests he's asking "how would you know?", and the answer is: by looking at the parameters. People do that.

>> because no one has yet looked at whether the trick helps reducing outliers in very large models

Given that a softmax version doing exactly what the blog post describes is baked into a Google library (see this thread), and that you can set it as a parameter in a PyTorch model (see this thread), this claim seems off. "Let's try X; oh, X doesn't do much, let's not write a paper about it" is extremely common for many X.