by dharma1 3505 days ago
Really like what you're doing with spaCy and Explosion AI, good stuff :)

What do you think about dilated convolutional encoder/decoder networks [1]? Useful for NLP beyond machine translation?

[1] https://arxiv.org/abs/1610.10099, https://github.com/paarthneekhara/byteNet-tensorflow


Thanks!

I don't understand those models very well yet. I haven't implemented one, or sat down with the paper and really worked through it.

One of the main issues with character-level CNNs (irrespective of convolution type, IIRC) is that the model can't handle unknown words, which is something word-level models do well. So for NLP applications in domains where unknown-word handling needs to work well, you won't get much from purely character-level models, in my experience.