by giu
1988 days ago
By real datasets, do you mean company-specific ones? Or do you happen to have some openly available examples that helped you a lot? I definitely concur with your first point, since I've had the same experience, specifically when working with company-specific datasets. In my experience, people also underestimate how much time cleaning up the data takes; there are quite a few steps you need to go through before you can really start to analyze a dataset.
|
I didn't come across any dataset (tabular, at least) that wasn't heavily curated.
Keep in mind that I studied sociology, so stuff that is a given for most HN people isn't for me. I had to learn a lot of CSS (for selectors), regex (still hate it), what OLAP is and how to take advantage of it (DuckDB), and a lot of stuff I'm not even aware of now.
But I remember taking courses with R and Python at my uni, and later on as well. It was interesting, but no matter how deep I went down the rabbit hole of weird models, it felt... IDK, shallow?
Imagine yourself pulling data out of a company ERP, with data entered by humans. It won't be a walk in the park where you just fit some logit models and call it a day. You'll spend a lot of time trying to understand what's going on. Only then do you fit the models or build a dashboard.