Let me know how Recoll works for you, if you try it. It can search inside document contents, including XML-based formats like Word docs as well as PDFs, and it can even do OCR.