by bjtitus 1085 days ago
> De-identification

> Re-identification

Wouldn't these two features address your concern? ChatGPT gets a generated unique ID that is still a consistent value for each card, just not the number itself. Then when the results are returned, that generated ID is turned back into the real card number.
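To make the round trip concrete, here is a rough Python sketch of what I mean. Nothing here is the product's actual implementation: the `CardVault` class, the regex, and the sample prompt are all made up for illustration, and I borrowed the `[CREDIT_CARD_NUMBER_n]` placeholder style from the reply below.

```python
import re

# Hypothetical pattern: 13-19 digits, optionally separated by spaces or dashes.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")
TOKEN_RE = re.compile(r"\[CREDIT_CARD_NUMBER_\d+\]")


class CardVault:
    """Illustrative in-memory mapping between card numbers and placeholders."""

    def __init__(self):
        self._token_by_card = {}
        self._card_by_token = {}

    def tokenize(self, card):
        # Normalize so "4111 1111 ..." and "41111111..." share one placeholder.
        key = re.sub(r"[ -]", "", card)
        if key not in self._token_by_card:
            token = f"[CREDIT_CARD_NUMBER_{len(self._token_by_card) + 1}]"
            self._token_by_card[key] = token
            self._card_by_token[token] = key
        return self._token_by_card[key]

    def deidentify(self, text):
        # Replace every card number with its consistent placeholder.
        return CARD_RE.sub(lambda m: self.tokenize(m.group(0)), text)

    def reidentify(self, text):
        # Swap known placeholders back; unknown ones are left untouched.
        return TOKEN_RE.sub(
            lambda m: self._card_by_token.get(m.group(0), m.group(0)), text
        )


vault = CardVault()
prompt = (
    "Card 4111 1111 1111 1111 paid $20; card 5500005555555559 paid $35; "
    "card 4111111111111111 paid $5."
)
safe_prompt = vault.deidentify(prompt)  # this is what the model would see
restored = vault.reidentify(safe_prompt)  # applied to whatever comes back
```

The same card gets the same placeholder both times it appears, so the model can still group or count by card without ever seeing the number. Note that this sketch restores the normalized digits, not the original spacing.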

This only becomes a problem when the original values behind the de-identified data are needed to answer a question, e.g. "tell me how many Visa cards were used in these transactions by checking the card numbers."

1 comment

That's right. In the case of credit card numbers, we redact them as [CREDIT_CARD_NUMBER_1], [CREDIT_CARD_NUMBER_2], etc., so the LLM can still answer "how many" prompts, but it can't answer prompts like "sort" that depend on the actual values. For those, you can use OpenAI's function calling API to do the sort: your function re-identifies the values, sorts them, and then de-identifies them again before returning.
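A minimal sketch of the local side of that function, with the OpenAI request itself left out. The names `sort_cards` and `handle_tool_call`, the hard-coded mapping, and the JSON argument shape are assumptions for illustration, not our actual API.

```python
import json

# Hypothetical vault contents: placeholder -> real card number (test numbers).
card_by_token = {
    "[CREDIT_CARD_NUMBER_1]": "5500005555555559",
    "[CREDIT_CARD_NUMBER_2]": "4111111111111111",
    "[CREDIT_CARD_NUMBER_3]": "4012888888881881",
}
token_by_card = {card: token for token, card in card_by_token.items()}


def sort_cards(tokens):
    """Re-identify, sort on the real values, then de-identify again."""
    real = [card_by_token[t] for t in tokens]  # re-identify
    real.sort()  # string sort; matches numeric order for equal-length numbers
    return [token_by_card[c] for c in real]  # de-identify before returning


def handle_tool_call(name, arguments_json):
    """Dispatch a function call requested by the model (arguments arrive as JSON)."""
    args = json.loads(arguments_json)
    if name == "sort_cards":
        return json.dumps({"sorted": sort_cards(args["tokens"])})
    raise ValueError(f"unknown function: {name}")


result = handle_tool_call(
    "sort_cards",
    json.dumps({"tokens": list(card_by_token)}),
)
```

The model only ever sees placeholders going in and placeholders coming out; the real numbers exist only inside `sort_cards`. The same pattern would cover the "how many Visa cards" question from the parent comment, with the function checking the real number and returning just a count.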