


	by biomodel 2730 days ago
	Not sure why anyone would use 2D CNNs for processing text when there is no spatial correlation in the embedding features. Recent work such as https://arxiv.org/abs/1803.01271 shows that for most tasks, 1D CNNs outperform recurrent architectures while being faster to train.

2 comments

madavidj 2730 days ago

Probably because the author followed this blog: http://www.wildml.com/2015/12/implementing-a-cnn-for-text-cl...

That blog used a 2D CNN because TensorFlow didn't have a 1D version at the time of writing, so he just created a dummy second dimension of length 1 and called it a day.
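
For anyone wondering why the dummy-dimension trick gives the same numbers, here is a minimal pure-Python sketch. It is not the blog's code. It assumes the sentence is treated as a single-channel "image" of shape seq_len x embed_dim and that the 2D filter spans the full embedding width, so it can only slide along the time axis. The function names `conv1d` and `conv2d_valid` are made up for illustration.

```python
def conv1d(seq, kernel):
    """Valid 1D cross-correlation over time.

    seq:    seq_len vectors of length embed_dim (embedding dims act as channels)
    kernel: k vectors of length embed_dim
    Returns seq_len - k + 1 scalars.
    """
    k, dim = len(kernel), len(kernel[0])
    return [
        sum(seq[t + i][c] * kernel[i][c] for i in range(k) for c in range(dim))
        for t in range(len(seq) - k + 1)
    ]


def conv2d_valid(image, kernel):
    """Valid 2D cross-correlation of a single-channel H x W image."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[y + i][x + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


# Toy "sentence": 4 tokens, 2-dimensional embeddings (made-up values).
seq = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0], [2.0, 2.0]]
# One filter covering 2 tokens and the full embedding width.
kernel = [[1.0, 0.0], [-1.0, 2.0]]

out_1d = conv1d(seq, kernel)        # one value per time step
out_2d = conv2d_valid(seq, kernel)  # same values, with a width axis of length 1

print(out_1d)
print(out_2d)
```

Because the filter is as wide as the embedding, the 2D output has a width of exactly 1, and squeezing that axis recovers the 1D result.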


soraki_soladead 2730 days ago

This is just a bug in their code. The paper they cite uses 1D convolutions. Though, I suppose having an unused dimension only really hurts efficiency.


gnulinux 2730 days ago

> Though, I suppose having an unused dimension only really hurts efficiency.

That might not be true: it might increase bias, and so might need more careful hyperparameter tuning to avoid overfitting.
