by idiotclock
2866 days ago
Spark isn't too tricky to dive into, though you can't really take advantage of it unless you have a big cluster to use :) If you want to practice data manipulation and the map-reduce type stuff you can do with Spark, I find pandas useful for small datasets (there's a lot of overlap in functionality as far as DataFrames are concerned). For pipeline stuff, definitely take a look at Luigi, but again, without a cluster it'll be less fun. Still, if you can try automating tasks with a mini Luigi scheduler on localhost, it would be good practice.
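To give a feel for the map-reduce style practice on a small dataset, here's a rough pandas sketch. The `events` table and its column names are made up for illustration; the idea is just a per-row "map" step followed by a per-key "reduce" via `groupby`, which is roughly what you'd write against a Spark DataFrame too.

```python
import pandas as pd

# Hypothetical toy data standing in for a small log file.
events = pd.DataFrame({
    "user": ["ann", "bob", "ann", "cat", "bob", "ann"],
    "bytes": [120, 300, 80, 50, 700, 200],
})

# "Map" step: derive a new value for every row.
events["is_large"] = events["bytes"] >= 200

# "Reduce" step: aggregate the rows that share a key.
totals = events.groupby("user").agg(
    total_bytes=("bytes", "sum"),
    requests=("bytes", "count"),
    large_requests=("is_large", "sum"),
)

print(totals.sort_values("total_bytes", ascending=False))
```

Once this feels comfortable, the same group-and-aggregate shape carries over to Spark when you do get access to a cluster.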