by jaffee
2572 days ago
Good catch... that sounds pretty silly. It should probably read more like "converting relationships to be represented by single bits". As a concrete example, we took the NYC taxi ride data set, which is something like 300GB of CSV files, and when it was indexed in Pilosa, the total size of all the bitmap files was closer to 40GB.
What's not obvious is what we're associating with what in the NYC taxi ride data set.
A bitmap can also represent a set: each bit position stands for one element of an enumerated universe, and the bit's value indicates whether that element is present.
So we rearrange the NYC taxi ride data into a data structure based on graphs and sets, and make large bitmaps?
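To make the set-as-bitmap idea concrete, here is a minimal sketch of one way it could work. It assumes each ride gets an integer ID and each distinct attribute value gets its own bitmap, with bit i set when ride i has that value. The ride data, `build_bitmaps`, and `members` are all made up for illustration; this is not Pilosa's API or its on-disk format.

```python
# Hypothetical toy data: six rides with IDs 0..5 (not real NYC taxi records).
passenger_counts = [1, 2, 1, 3, 2, 1]
cab_types = ["yellow", "green", "yellow", "yellow", "green", "green"]


def build_bitmaps(values):
    """Map each distinct value to an int whose bit i is set if record i has that value."""
    bitmaps = {}
    for record_id, value in enumerate(values):
        bitmaps[value] = bitmaps.get(value, 0) | (1 << record_id)
    return bitmaps


def members(bitmap):
    """Return the record IDs whose bits are set, in ascending order."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


by_count = build_bitmaps(passenger_counts)
by_cab = build_bitmaps(cab_types)

# Set intersection is a bitwise AND: yellow cabs carrying exactly one passenger.
yellow_single = by_cab["yellow"] & by_count[1]
print(members(yellow_single))  # prints [0, 2]
```

Under this reading, "ride 3 was in a yellow cab" is a relationship stored as a single bit, and queries become AND/OR over bitmaps.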