by Yomguithereal
449 days ago
A good way to parallelize CSV processing is to split datasets into multiple files, kinda like manual sharding. xan has a parallel command able to perform a wide variety of map-reduce tasks on split files. https://github.com/medialab/xan
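
To make the idea concrete, here is a rough Python sketch of the same map-reduce pattern over sharded CSV files. It is not how xan does it, just an illustration: the function names (`count_column`, `parallel_frequency`) and the `executor_cls` parameter are made up for this example, and it assumes every shard has a header row containing the column of interest.

```python
import csv
from collections import Counter
from concurrent.futures import ProcessPoolExecutor


def count_column(path, column):
    """Map step: frequency table of one column in a single shard."""
    with open(path, newline="", encoding="utf-8") as f:
        return Counter(row[column] for row in csv.DictReader(f))


def parallel_frequency(paths, column, executor_cls=ProcessPoolExecutor, workers=None):
    """Run the map step on every shard in parallel, then merge the partial counts."""
    paths = list(paths)
    with executor_cls(max_workers=workers) as pool:
        # One task per shard; each returns its own Counter.
        partials = pool.map(count_column, paths, [column] * len(paths))
        # Reduce step: Counters add up key by key.
        return sum(partials, Counter())
```

Each shard is read independently, so the only shared work is the final merge. On platforms that start worker processes by spawning (Windows, macOS), the call to `parallel_frequency` should sit under an `if __name__ == "__main__":` guard.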