by warbiscuit
4065 days ago
Not to mention: having all the data and comprehending every row individually are two very different things. Doubly so if the data is irregular (I'm currently doing fuzzy matching on really mangled street address data. ICK). Once you hit millions of rows, it's not humanly possible to survey the data. All you can do is make assertions about the data's structure and the buckets it will fall into. You then try to disprove each assertion, or establish error bounds on it. You will never see all the data, only the results of the assumptions you've made about it.
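To make the "assert, then bound the error" loop concrete, here is a rough sketch of one way it could go. Everything named here is made up for illustration: the `LOOKS_REGULAR` pattern is a hypothetical assertion about address shape, and using a Wilson score interval on a random sample is just one reasonable choice of bound, not something from an actual pipeline.

```python
import math
import random
import re

# Hypothetical assertion: a "regular" address starts with a house number
# followed by at least one more token. Real address data needs far more.
LOOKS_REGULAR = re.compile(r"^\d+\s+\S+")


def wilson_interval(failures, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 is roughly 95%)."""
    if n == 0:
        return (0.0, 1.0)  # no evidence, so the bound is vacuous
    p = failures / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))


def estimate_violation_rate(rows, assertion, sample_size=1000, seed=0):
    """Check an assertion on a random sample and bound how often it fails.

    Returns (failures, sample_count, (lower, upper)).
    """
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    sample = rng.sample(rows, min(sample_size, len(rows)))
    failures = sum(1 for row in sample if not assertion(row))
    return failures, len(sample), wilson_interval(failures, len(sample))
```

You would call it with something like `estimate_violation_rate(addresses, lambda a: bool(LOOKS_REGULAR.match(a)))`, where `addresses` is a list of strings. The point is the one above: you only ever inspect the sample, and the interval tells you how far to trust the assertion for the rows you never looked at.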