by dr_kiszonka
1872 days ago
I usually use Python and R for analysis. However, with larger datasets, e.g., 0.5 to 2 PB, I have to rely on SQL/BigQuery because I can't get Python and R to handle such workloads in a reasonable time. I tried Dask, but I couldn't resolve a few bugs it had at the time. If you were to find outliers in a 1 PB table, what tools would you use?