by ethbr0
2038 days ago
The problem is that real-world, physical business data is taken from what one can get, not what one would want. And usually that's "what the business was collecting for (unrelated purpose)." This means it has nearly infinite caveats and assumptions. A specifications doc or readout will never sufficiently express all of these, especially if humans were involved in generating the data. Consequently, the most useful data products are going to turn on whether or not you (did this small thing) to (correct for this obvious bias or flaw that anyone familiar with the industry knows).
Yeah, sorry, but part of your job in data sci is to collect the right data. Data doesn't magically exist, and we are not stuck with what's out there. A data sci job is to figure this stuff out. Tech has a weird culture of people not doing their job. Kind of like the ZipRecruiter ads: "Working as a hiring manager, hiring new people is the worst part of my job." Bitch, that IS your job. If you don't do that, what's the point in keeping you around? Beekeepers collect honey. Yeah, it's not exactly easy if you're not careful, but they don't bitch about it because they knew what they signed up for.
Data sci/analysis is about collecting and analyzing data in ways that aren't straightforward. Because if it were easy and didn't require any effort, why would data scientists be needed?