Using compiled XSLT translets in Java (XSLTC) has a huge impact on performance. With transformations taking 0.5s, the problem seems to be in the way your XSLT is written, and whether it pushes or pulls the data, not in XSLT itself. See the XSLTC usage documentation:
http://xalan.apache.org/old/xalan-j/xsltc_usage.html
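As a rough illustration of the compile-once idea, here is a minimal sketch using the standard JAXP `Templates` API. It assumes the default `TransformerFactory` is XSLTC-based (which, as far as I know, is the case for the JDK's built-in implementation); with Apache Xalan on the classpath you would likely select `org.apache.xalan.xsltc.trax.TransformerFactoryImpl` instead. The class name `TransletDemo`, the stylesheet, and the sample input are made up for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class TransletDemo {

    // Hypothetical "push"-style stylesheet: templates are applied to each <item>.
    public static final String XSLT =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:template match=\"/items\">"
      + "<xsl:apply-templates select=\"item\"/>"
      + "</xsl:template>"
      + "<xsl:template match=\"item\">"
      + "<xsl:value-of select=\".\"/><xsl:text>;</xsl:text>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Compile the stylesheet once. The Templates object is thread-safe
    // and can be cached and shared across requests.
    public static Templates compile(String xslt) throws TransformerException {
        // Assumption: the default factory compiles to translets (XSLTC).
        TransformerFactory factory = TransformerFactory.newInstance();
        return factory.newTemplates(new StreamSource(new StringReader(xslt)));
    }

    // Create a cheap per-call Transformer from the compiled Templates.
    public static String transform(Templates templates, String xml)
            throws TransformerException {
        Transformer transformer = templates.newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Templates templates = compile(XSLT); // pay the compilation cost once
        String xml = "<items><item>a</item><item>b</item></items>";
        // Reuse the compiled stylesheet for every transformation.
        for (int i = 0; i < 3; i++) {
            System.out.println(transform(templates, xml));
        }
    }
}
```

The point of the sketch is only that the stylesheet is compiled a single time and the resulting `Templates` object is reused; creating a new `Transformer` from a source stylesheet on every call repeats the compilation and is usually where the time goes.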