I would also suggest The Elements of Statistical Learning by Hastie, Tibshirani, and Friedman: http://www-stat.stanford.edu/~tibs/ElemStatLearn/
Duda, Hart, and Stork's Pattern Classification is also worth a look: http://rii.ricoh.com/~stork/DHS.html