Turns out, Niall was also involved in one of the winning ZPrize submissions for fast multi-scalar multiplication (closely related to batch modexp, although in an elliptic curve group rather than mod a prime); I assume that submission builds on his work on CGBN.
He gave a very nice talk about it last year at a Stanford crypto lunch, and it turns out the slides and recording are online!