In particular, I recommend "Cilk: An Efficient Multithreaded Runtime System" (http://supertech.csail.mit.edu/papers/PPoPP95.pdf) and "The Implementation of the Cilk-5 Multithreaded Language" (http://supertech.csail.mit.edu/papers/cilk5.pdf).