by Arqu
2063 days ago
I'm super interested in this topic. I recently started (and am still) working out how to diff large datasets, and what that even means. I would love to understand how the HN crowd thinks diffing datasets should work (let's say >1GB in size). Are you more interested in a "patch"-quality diff of the data, tailored to machines? Or is a change report, summary, or set of highlights more interesting in that case? Currently I'm leaning toward the understanding/human-consumption perspective, which offers some interesting tradeoffs.