|
|
|
|
|
by shoo
550 days ago
|
|
You can hand that GELU definition to a mathematician and they can interpret it as a function of a real number b. The definition does not depend upon b being a floating-point number with a particular bit representation. In contrast, the fast inverse square root really exploits the bit representation of a floating point input to cheaply compute an initial guess. |
|