by parpfish
625 days ago
I feel like multiple imputation is fine when you have data missing at random. The problem is that data is never actually missing at random: there’s always some interesting variable that determines which pieces are missing.
More specifically, how do you determine if the pattern you seem to be identifying is actually related to the phenomenon being measured and not an error in the measurement tools themselves?
For example, suppose a significant share of the answers to "Yes / No: have you ever been assaulted?" are blank. This could be because (A) respondents who were assaulted are more likely to leave it blank out of shame, or (B) someone handling the spreadsheet accidentally dropped some rows in the data (because let's be serious here, it's all spreadsheets and emails...).
While you could say that (B) should theoretically be "more truly random", we can't assume that there isn't a pattern to the way those rows were dropped (e.g. a pattern imposed by some algorithm that bugged out and dropped those rows).
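
To make the (A) vs (B) distinction concrete, here is a rough simulation sketch. Everything in it is made up for illustration: the `simulate` helper is hypothetical, and the 30% true rate and the blank probabilities are assumptions, not numbers from any real survey. The two scenarios are tuned so that both leave roughly 22% of answers blank, yet only (A) biases the "yes" rate you would compute from the answers that remain.

```python
import random

def simulate(n, true_rate, p_blank_if_yes, p_blank_if_no, seed=0):
    """Return (fraction of blank answers, 'yes' rate among non-blank answers).

    Hypothetical helper: all probabilities are illustrative assumptions.
    """
    rng = random.Random(seed)
    blanks = 0
    observed = 0
    yes_observed = 0
    for _ in range(n):
        assaulted = rng.random() < true_rate
        # The chance of a blank may or may not depend on the true answer.
        p_blank = p_blank_if_yes if assaulted else p_blank_if_no
        if rng.random() < p_blank:
            blanks += 1
        else:
            observed += 1
            yes_observed += assaulted
    return blanks / n, yes_observed / observed

TRUE_RATE = 0.30  # assumed true share of "yes" answers

# (A) blanks depend on the answer itself: "yes" respondents skip more often.
frac_a, rate_a = simulate(200_000, TRUE_RATE, p_blank_if_yes=0.50, p_blank_if_no=0.10)

# (B) rows dropped with the same probability regardless of the answer.
frac_b, rate_b = simulate(200_000, TRUE_RATE, p_blank_if_yes=0.22, p_blank_if_no=0.22)

print(f"(A) blank fraction {frac_a:.3f}, observed yes rate {rate_a:.3f}")
print(f"(B) blank fraction {frac_b:.3f}, observed yes rate {rate_b:.3f}")
```

Under these assumptions (A) should report a "yes" rate of roughly 0.15 / 0.78 ≈ 0.19 while (B) stays near the true 0.30, even though the share of blanks is about the same in both. The column alone doesn't tell you which world you're in, and the sketch only covers the case where (B) really is uniform, which is exactly the assumption being questioned above.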