That's true: according to the paper, the improvement over the MLP is negligible. The model appears to be very small, which might be one reason for the relatively low performance.
The multilayer perceptron and linear baselines perform well. The paper additionally allows the MLP and linear models to "cheat" by giving them heavily hand-engineered features that incorporate extensive prior knowledge. Even with this benefit, the LSTM still outperforms the baselines, but by a considerably smaller margin.