Hacker News
by santiagobasulto 1492 days ago
Usually aggregated first... then you can start looking at "subsets". For example, step 1 is to look at the whole dataset. Then you notice that a lot of rows have a particular kind of missing value, so you look at the statistical attributes of that subset (all the rows where column X is null).

From time to time you can do a `.head()`/`.tail()` or an `.iloc[X:Y]` to check some things visually. But just as a "refresher".
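
A rough sketch of that flow in pandas, in case it helps. The DataFrame, its column names, and its values are all made up for illustration; only the pandas calls themselves are real.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names and values are invented for this example.
df = pd.DataFrame({
    "region": ["north", "south", "south", "east", "north", "south"],
    "price": [10.0, np.nan, np.nan, 12.5, 11.0, 30.0],
    "units": [3, 40, 55, 4, 2, 5],
})

# Step 1: aggregate view of the whole dataset.
overall = df.describe()
missing_per_column = df.isna().sum()

# Step 2: narrow down to the rows where "price" is null and summarize that subset.
missing_price = df[df["price"].isna()]
subset_stats = missing_price.drop(columns="price").describe()
region_counts = missing_price["region"].value_counts()

# Step 3: occasional visual "refreshers" on the raw rows.
first_rows = df.head(3)
last_rows = df.tail(2)
middle_rows = df.iloc[2:4]
```

Here `region_counts` would show whether the missing prices cluster in one region, which is the kind of thing that is hard to spot by scrolling through raw rows.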

1 comment

This sort of bouncing back and forth between the aggregate and the raw data is something that Mito is really great at. To view aggregate info, users tend to look at either graphs or pivot tables of their data in Mito. They use that aggregate view to identify subsets that need further investigation, cleaning, or transforming. Then they filter down to that subset, make the correction, and use the aggregate view again to see the results.

Practically, this just looks like moving between two tabs in the spreadsheet!
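
For anyone who wants to picture the same loop outside a spreadsheet, here is a minimal sketch in plain pandas. This is not Mito's API, and the data, the column names, and the "misplaced decimal point" fix are assumptions made up for the example.

```python
import pandas as pd

# Hypothetical toy data; 9000.0 stands in for a data-entry error.
sales = pd.DataFrame({
    "region": ["north", "north", "south", "south"],
    "amount": [100.0, 120.0, 90.0, 9000.0],
})

# Aggregate view: pivot table of the mean amount per region.
before = sales.pivot_table(index="region", values="amount", aggfunc="mean")

# The "south" mean looks off, so filter down to that subset to inspect it.
south = sales[sales["region"] == "south"]

# Make the correction (assumed here to be a misplaced decimal point).
bad_rows = (sales["region"] == "south") & (sales["amount"] > 1000)
sales.loc[bad_rows, "amount"] = sales.loc[bad_rows, "amount"] / 100

# Back to the aggregate view to check the result.
after = sales.pivot_table(index="region", values="amount", aggfunc="mean")
```

Comparing `before` and `after` plays the role of switching back to the pivot table tab once the fix is made.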

Something that we don't support right now, but would love to support in the future, is cross-filtering. It would be a powerful and easy way of supporting that back-and-forth workflow.