The format proposed by gus_massa can handle both of these tasks: you can generate HTML from it for the visual highlighting, and do whatever other processing you wish in a separate script.
This is just my 2 cents; I have no idea what your intended application is.
<person>Jhon doe</person> started the <vehicle>car</vehicle>.
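To make that concrete, here is a rough Python sketch of both uses on the sentence above. The function names (`extract_entities`, `to_html`) and the choice of `<mark>` with the label as a CSS class are my own assumptions, not part of the proposed format, and the regex only handles flat, non-nested tags.

```python
import html
import re

# Matches <label>text</label> where the label is a plain word.
# Assumption: tags are flat (no nesting, no attributes).
TAG_RE = re.compile(r"<(\w+)>(.*?)</\1>")


def extract_entities(tagged):
    """Return (label, text) pairs in order of appearance."""
    return [(m.group(1), m.group(2)) for m in TAG_RE.finditer(tagged)]


def to_html(tagged):
    """Render tagged text as HTML, wrapping each entity in a <mark> element."""
    parts = []
    pos = 0
    for m in TAG_RE.finditer(tagged):
        # Escape the untagged text between the previous entity and this one.
        parts.append(html.escape(tagged[pos:m.start()]))
        label, text = m.group(1), m.group(2)
        parts.append('<mark class="{}">{}</mark>'.format(label, html.escape(text)))
        pos = m.end()
    # Escape whatever follows the last entity.
    parts.append(html.escape(tagged[pos:]))
    return "".join(parts)


sentence = "<person>Jhon doe</person> started the <vehicle>car</vehicle>."
print(extract_entities(sentence))
print(to_html(sentence))
```

The same tagged string feeds both functions, so the highlighting and any other processing can live in separate scripts without needing a second copy of the annotations.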