It's a combination of an LLM and some pre- and post-processing. Data extraction itself has been fairly accurate in my experience. The bigger challenge has been biomarker name normalization, because different labs often name the same biomarker quite differently.
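
To make the normalization problem concrete, here is a rough sketch of one way to do that post-processing step: a hand-maintained alias table with a fuzzy-match fallback. The alias entries, the function names (`normalize_biomarker`, `_key`), and the 0.85 cutoff are all illustrative assumptions, not the actual pipeline.

```python
import re
from difflib import get_close_matches

# Hypothetical alias table: canonical name -> variants seen on lab reports.
# A real table would be much larger and curated over time.
ALIASES = {
    "Hemoglobin A1c": ["hba1c", "a1c", "glycated hemoglobin", "hemoglobin a1c"],
    "LDL Cholesterol": ["ldl", "ldl-c", "ldl cholesterol", "ldl chol calc"],
    "Vitamin D, 25-Hydroxy": ["vitamin d", "25-oh vitamin d", "vitamin d 25 hydroxy"],
}


def _key(name: str) -> str:
    """Lowercase, turn punctuation into spaces, and collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", name.lower()).split())


# Flattened lookup: cleaned alias -> canonical name.
LOOKUP = {
    _key(alias): canonical
    for canonical, aliases in ALIASES.items()
    for alias in aliases
}


def normalize_biomarker(raw: str, cutoff: float = 0.85):
    """Return the canonical name for a raw label, or None if nothing matches."""
    key = _key(raw)
    if key in LOOKUP:  # exact match after cleanup
        return LOOKUP[key]
    # Fuzzy fallback for spelling variants; the cutoff is an assumed value.
    close = get_close_matches(key, list(LOOKUP), n=1, cutoff=cutoff)
    return LOOKUP[close[0]] if close else None


for label in ["HbA1c", "LDL-C", "Glycated Haemoglobin", "Ferritin"]:
    print(label, "->", normalize_biomarker(label))
```

Returning `None` for unknown labels, rather than guessing, leaves room to route those cases to manual review or to a second LLM pass.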