by p1esk
2612 days ago
But the job of T1 is not to capture long-term patterns, it’s to extract useful short-scale features for T2 so that T2 can extract longer-term patterns. T3 would hopefully extract even longer-scale patterns from T2's output, and so on. That’s the point of having the LSTM hierarchy, right? Why would you try to manually duplicate this process by creating F1, F2, etc.? The idea of skip connections would be like feeding T1's output to T3, in addition to T2's. Again, I’m not sure what useful info the F sequences would supply in this scenario.
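To make the wiring concrete, here is a rough sketch of a three-layer stack where T3 reads T1's output alongside T2's. It is only an illustration: `ToyRecurrentLayer` is a hypothetical plain tanh recurrent cell standing in for a real LSTM, the weights are random and untrained, and the layer sizes are arbitrary.

```python
import math
import random


class ToyRecurrentLayer:
    """Hypothetical stand-in for an LSTM layer: a plain tanh recurrent cell."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        # Random, untrained weights; a real model would learn these.
        self.w_in = [[rng.uniform(-0.5, 0.5) for _ in range(input_size)]
                     for _ in range(hidden_size)]
        self.w_rec = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_size)]
                      for _ in range(hidden_size)]

    def run(self, sequence):
        """Return one hidden-state vector per input time step."""
        h = [0.0] * self.hidden_size
        outputs = []
        for x in sequence:
            h = [
                math.tanh(
                    sum(w * xi for w, xi in zip(self.w_in[j], x))
                    + sum(w * hi for w, hi in zip(self.w_rec[j], h))
                )
                for j in range(self.hidden_size)
            ]
            outputs.append(h)
        return outputs


def stacked_with_skip(sequence, t1, t2, t3):
    """T1 -> T2 -> T3, with T1's output also fed to T3 (the skip connection)."""
    out1 = t1.run(sequence)
    out2 = t2.run(out1)
    # Concatenate T2's and T1's features at each time step before T3.
    t3_input = [from_t2 + from_t1 for from_t2, from_t1 in zip(out2, out1)]
    return t3.run(t3_input)


rng = random.Random(0)
t1 = ToyRecurrentLayer(input_size=2, hidden_size=4, rng=rng)
t2 = ToyRecurrentLayer(input_size=4, hidden_size=3, rng=rng)
t3 = ToyRecurrentLayer(input_size=3 + 4, hidden_size=2, rng=rng)  # T2 + T1 features

sequence = [[math.sin(0.3 * t), math.cos(0.3 * t)] for t in range(10)]
final = stacked_with_skip(sequence, t1, t2, t3)
```

Nothing here hands T3 a precomputed long-range sequence; it only ever sees what the layers below it produce.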
Don't we already do this with text translation? Why not let one model read a printed text pixel by pixel and have the other model produce a translation, also pixel by pixel? Instead we choose to split printed text into small chunks (that we call words), give every chunk a "word vector" (those word2vec models), and produce the output text one word at a time as well.
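As a toy illustration of that chunking step, the sketch below splits a sentence into words and looks up one vector per word. The vectors are random placeholders, not trained word2vec embeddings, and `build_word_vectors` is a made-up helper name.

```python
import random


def build_word_vectors(words, dim, seed=0):
    """Assign each distinct word a fixed random vector (placeholder for word2vec)."""
    rng = random.Random(seed)
    return {w: [rng.uniform(-1.0, 1.0) for _ in range(dim)]
            for w in sorted(set(words))}


text = "the cat sat on the mat"
words = text.split()                    # hand-chosen chunks: words, not pixels
vectors = build_word_vectors(words, dim=4)
encoded = [vectors[w] for w in words]   # what a downstream model would consume
```

Both occurrences of "the" map to the same vector, so the model downstream works with six word-level steps instead of thousands of pixels.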