That question has been baffling product managers, scrum masters, and C-suite assholes for decades, along with the question of how you measure engineering productivity.
The folks at Stanford in this video have a somewhat similar dataset, and they account for "code churn", i.e. reworking AI output: https://www.youtube.com/watch?v=tbDDYKRFjhk -- I think they do so by tracking whether the same lines of code are changed again in subsequent commits. Maybe something to consider.
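To make the churn idea concrete, here's a rough sketch in Python. I don't know how the Stanford group actually computes it; this is just my guess at the general shape. The `Commit` class, the `churn_rate` function, and the `window` parameter are all made up for illustration, and it assumes you already have some stable identifier per line (raw line numbers shift between commits, so in practice you'd need something like blame data to get that).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    sha: str
    # (path, line_id) pairs touched by this commit. line_id is a hypothetical
    # stable identifier, not a raw line number.
    touched: frozenset


def churn_rate(commits, window=3):
    """Fraction of touched lines that are touched again within `window` later commits.

    `commits` must be ordered oldest to newest.
    """
    total = 0
    reworked = 0
    for i, commit in enumerate(commits):
        later = commits[i + 1 : i + 1 + window]
        # Everything touched by the next `window` commits.
        later_touched = set().union(*(c.touched for c in later))
        total += len(commit.touched)
        reworked += len(commit.touched & later_touched)
    return reworked / total if total else 0.0


history = [
    Commit("a", frozenset({("f.py", 1), ("f.py", 2)})),
    Commit("b", frozenset({("f.py", 2)})),  # reworks one line from "a"
    Commit("c", frozenset({("g.py", 1)})),
]
print(churn_rate(history))  # 0.25: one of four touched lines was reworked
```

The window is the knob that matters: too small and you miss slow rework, too large and ordinary maintenance starts counting as churn.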
I'm kind of baffled that "lines of code" seems to have come back; by the 1980s people were beginning to figure out that it didn't make any sense.