by marjimbel
2531 days ago
Memory blows up with the length of the encoder sequence. For that reason we truncate the email at ~300 tokens, which in the vast majority of cases is enough to capture the relevant info. Other than that, we don't get rid of any "garbage" lines; instead, we let the NN (e.g. the attention layer) figure out which lines are irrelevant.
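A rough sketch of the truncation step described above, for illustration only. The whitespace tokenizer and the names `MAX_TOKENS` and `truncate_email` are assumptions, not the actual pipeline; a real system would likely use its own tokenizer and vocabulary.

```python
# Hypothetical cutoff: the comment says "~300 tokens", so the exact value is assumed.
MAX_TOKENS = 300


def truncate_email(text: str, max_tokens: int = MAX_TOKENS) -> list[str]:
    """Tokenize an email and keep only the first `max_tokens` tokens.

    Whitespace splitting stands in for whatever tokenizer is really used.
    No "garbage" lines are filtered out here; the idea is that the model's
    attention layer learns to ignore irrelevant lines on its own.
    """
    tokens = text.split()
    return tokens[:max_tokens]


if __name__ == "__main__":
    email = "Hi team,\n\nCan we move the meeting to Friday?\n\n-- \nSent from my phone"
    print(truncate_email(email, max_tokens=8))
```

Cutting at a fixed length keeps the encoder input bounded, so memory use no longer depends on how long the original email was.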
|
Also: any comments on the output language and how you taught labelers this language?