by alexatkeplar
4856 days ago
Exactly this. At SnowPlow (https://github.com/snowplow/snowplow) we would love to spend more time downstream on the analysis phase (doing ML etc.), but we still have to spend a ton of time working upstream on collection, storage, enrichment etc. A lot of this work is defining, testing and documenting standard protocols, data models etc. (see https://github.com/snowplow/snowplow/wiki/SnowPlow-technical... if you're interested). And this is just for eventstream analytics, working with our own data formats. Ingesting and mapping third-party formats (e.g. Omniture, MailChimp, MixPanel) is another lot of work that needs doing... So a solved problem? Not so much.