There's a fair amount of experimental work happening trying different parsing and resolution procedures such that the training data reflects an AST and or predicts nodes in an AST as an in-filling capability.
LLMs don't have memory, so they can't build anything. Insofar as they produce correct results, they have implicit structures corresponding to ASTs built into their networks during training time.