Try a different page segmentation mode (PSM) and see whether you get better results.
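As a rough sketch of what that comparison could look like, assuming you're running Tesseract through the `pytesseract` wrapper (the original question may use a different setup): the helper names `psm_config`, `try_psms`, and `tesseract_ocr` are made up for illustration, and the list of modes is just a starting point, not a recommendation.

```python
from typing import Callable, Dict, Iterable


def psm_config(psm: int) -> str:
    """Build the Tesseract config string that selects a page segmentation mode."""
    return f"--psm {psm}"


def tesseract_ocr(image, config: str) -> str:
    """Run Tesseract on an image (assumes pytesseract and Tesseract are installed)."""
    import pytesseract  # imported lazily so the helpers above work without it

    return pytesseract.image_to_string(image, config=config)


def try_psms(
    image,
    ocr: Callable[[object, str], str] = tesseract_ocr,
    # 3 = automatic (default), 4 = single column, 6 = single uniform block, 11 = sparse text
    psms: Iterable[int] = (3, 4, 6, 11),
) -> Dict[int, str]:
    """Run OCR once per PSM and return the text keyed by mode, for side-by-side comparison."""
    return {psm: ocr(image, psm_config(psm)) for psm in psms}
```

You could then call `try_psms(Image.open("page.png"))` with a Pillow image (the file name is a placeholder) and compare the outputs by eye or against a known-good transcription. Running `tesseract --help-psm` lists all available modes.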
If you're working with scientific texts specifically, look at GROBID.