If I need more detailed formatting information (font names, sizes, text positions), I run "pdftohtml -xml -fontfullname" and process the resulting XML.
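
As a rough illustration of the "process the resulting XML" step, here is a minimal Python sketch. It assumes the usual Poppler layout: `<page>` elements containing `<fontspec id size family color>` definitions and `<text top left width height font>` runs, with bold and italic marked by nested `<b>`/`<i>` tags. The function names `pdf_to_xml` and `extract_runs` are my own, the flag is spelled `-fontfullname` as in Poppler's pdftohtml, and the assumption that the tool writes `<out_base>.xml` should be checked against your installed version.

```python
import subprocess
import xml.etree.ElementTree as ET


def pdf_to_xml(pdf_path, out_base):
    """Run pdftohtml in XML mode (hypothetical helper).

    Assumption: pdftohtml writes its result to <out_base>.xml.
    """
    subprocess.run(
        ["pdftohtml", "-xml", "-fontfullname", pdf_path, out_base],
        check=True,
    )
    return out_base + ".xml"


def extract_runs(xml_data):
    """Return one dict per <text> element, joined with its font spec."""
    root = ET.fromstring(xml_data)
    fonts = {}  # fontspec id -> attribute dict; ids persist across pages
    runs = []
    for page in root.iter("page"):
        # Register the fonts declared on this page before reading its text.
        for spec in page.iter("fontspec"):
            fonts[spec.get("id")] = dict(spec.attrib)
        for text in page.iter("text"):
            font = fonts.get(text.get("font"), {})
            runs.append({
                "page": int(page.get("number")),
                "top": int(text.get("top")),
                "left": int(text.get("left")),
                "family": font.get("family"),
                "size": font.get("size"),
                # Bold/italic show up as nested tags inside <text>.
                "bold": text.find(".//b") is not None,
                "italic": text.find(".//i") is not None,
                # itertext() flattens any nested <b>, <i>, <a> markup.
                "text": "".join(text.itertext()),
            })
    return runs
```

Typical use would be something like `extract_runs(open(pdf_to_xml("in.pdf", "out"), "rb").read())`, after which the runs can be grouped by page and `top` to rebuild lines, or filtered by `family` and `size` to pick out headings.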