Hacker News
by dvt 1865 days ago
Congrats on the launch! I don't mean to hijack this thread, but as a day-to-day data engineer, I can't help but think that even though this explosion of ETL solutions is undeniably helpful, it doesn't really get to the root of the problem. These days, you've got every company, from small startups to large corps, warehousing data. But the real value proposition isn't just having access to that raw data; it's drawing insights out of it.

I'm not sure this is even doable without a dedicated data scientist, but a potential solution is a two-way marketplace that connects companies with data scientists to help make heads or tails of the data they're storing. Otherwise, it's just sitting in a data lake somewhere. (Not sure if something like this exists already, I'm just thinking out loud.)

3 comments

I strongly agree in part: for a company to get value out of its data, you want someone skilled at cleaning the data, cutting it properly, and teasing out the insights. Where I disagree is that this person can be a data scientist, but doesn’t need to be. I believe there is a growing population of data-savvy employees without that title. Many of them might not have "data" in their title at all (they are in business operations, marketing, finance, and sales), yet they write SQL and are very comfortable manipulating data in BI tools, R, Python, Excel, or GSheets.

I also believe that company context matters a lot. So much of getting started with extracting value from data is climbing the learning curve of understanding what the data means (which columns have the truth). One of the reasons we don’t have a lot of canned reports is that understanding these edge cases within a company often matters a lot, and not accounting for the nuance can easily lead to a wrong inference. With this in mind, the explosion of ETL solutions and products like Mozart Data means that others at the company can specialize in their business context, as opposed to needing someone who can do all aspects of data, including engineering, data science, analysis, and communicating/presenting it.

> connects companies with data scientists to help make heads or tails of the data they're storing

The consulting "data scientist" is likely to do a better job if they have experience with the idiosyncrasies of the individual company's operations. If you get a fresh data scientist every time, they need to repeat the ramp-up period before they are in a position to maybe add value.

This suggests a model where the company keeps the same consultant on retainer and brings them on board each time a situation pops up where the consultant may be able to assist.

(This isn't a particularly novel suggestion; the same suggestion is made in a 1960s/70s-era thesis investigating how applicable operations research is to small businesses.)

I'm curious what you've found works best for finding people (employees, contractors, other resources) capable of drawing insights (or making heads/tails) out of the data? We're always trying to offer customers a helpful perspective, and we also want to give them a great "push in the back" to get them going on that dimension.

I fully agree with your previous statements. I worked as a BI consultant for 5 years and in-house for 4 years. "Consultant" can be misleading: we really created stuff (not only hot air) ;-) ... We created visualizations (dashboards) and data models, mainly with Qlik (the complete road from data extraction to visualization/analytics).

I think the most efficient way is to have in-house staff for visualizing data and extracting insights. They should be able to cover 70-90% of the demand. The remaining part and possible peaks in demand should be covered by one or more contractors.

In the long run, this makes sure you have a reliable contractor who already knows you (the company) as well as the infrastructure and the meaning of your data. That helps a lot, for example, when an employee is sick or has left the company: you can bridge the gap with almost no delay.

Most employees don't want (and often don't have time) to learn additional tools for analytics & data visualization on top of their daily business. And the data models quickly become too complex for "casual users". Making "self-service BI" really possible takes a lot of work upfront (preparing data etc.). I don't think I have ever seen "self-service BI" working in the real world (except maybe for some "power users" in finance & controlling).

IMHO the best case would be a specialized BI team that works together with the domain experts to create insights. Normally the people from specific departments know their data very well, and they are very helpful in the process of finding insights. They have often already done reports/calculations etc. before, but the manual process is just too complicated, too slow, or whatever.