Hacker News new | ask | show | jobs
by fixermark 2932 days ago
Question: This is one of the pieces of neural nets that has always seemed completely opaque voodoo to me. What estimating are you doing to suggest a 512-cell LSTM could stand to be swapped out with a 256-cell bidirectional? What constraints are you optimizing for?
1 comments

Not a constraint per se, but having too big of a neural network (or any statistical model) can cause it to overfit and generalize poorly; of course, generalizing better is a good objective for text generation.

You can use 512-cell LSTMs if you have a lot of text, though.