by p51ngh 3385 days ago
I don't follow the intuition behind using an exponentiation of a as in e^a. You refer to this as a "context model" in your paper. Could you please elaborate? Thanks!
1 comment

The name "context model" is borrowed from one of the papers I cited. The context model is just "a(x_i,h_{i-1})". It decides what is important; you can think of it as a kind of filter or gate. When the context model returns a large value, the exponent is large and so is the weight e^a, which means the information encoded by "z(x_i,h_{i-1})" dominates the weighted average.
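
To make the gating idea concrete, here is a minimal sketch of a weighted average whose weights are e^a. It is only an illustration, not the code from the paper: the function name exp_weighted_average is made up, and plain numbers stand in for the outputs of "z(x_i,h_{i-1})" and "a(x_i,h_{i-1})". Exponentiating keeps every weight positive, and a single large a pushes almost all of the weight onto its z.

```python
import math

def exp_weighted_average(z_values, a_values):
    """Average z_values using weights e^a (hypothetical helper, for illustration)."""
    # Subtract the largest score before exponentiating to avoid overflow;
    # the shift cancels between the numerator and the denominator.
    shift = max(a_values)
    weights = [math.exp(a - shift) for a in a_values]
    total = sum(weights)
    return sum(w * z for w, z in zip(weights, z_values)) / total

# Made-up numbers: three steps, each with an encoded value z and a score a.
z = [1.0, -2.0, 5.0]
equal = exp_weighted_average(z, [0.0, 0.0, 0.0])   # equal scores: plain mean of z
gated = exp_weighted_average(z, [0.0, 0.0, 10.0])  # large third score: close to z[2]
print(equal, gated)
```

With equal scores the result is the ordinary mean of z. Raising one score to 10 moves the result to just under 5.0, the z value at that step, which is the "dominates the weighted average" behavior described above.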