| Analytics 101: Choosing the right database is the wrong first step. I was excited when I saw 'choosing the right data model' as one of the rules, but they are talking about the data models the DB uses internally. The important data model is choosing how you model the data you have to analyze. I have my biases, but I'd argue that a dimensional model would be a good starting if we're really at a 101 level and extensions to the model are for future classes/development. Starting with the end in mind is very important. When I look at this, I think in terms of data culture. Who needs to be able to do what in your organization? What types of question will you need to answer most often? What types of questions will you need to support in an ad-hoc manner? To many organizations "analytics" means arithmetic, but with complex filtering logic and business logic, traditional BI, essentially. To others, "analytics" means R code monkeys. To others it may mean specifically visualizations and the presentation layer. There are many interpretations of the word. Regardless of the interpretation, process and culture are more important to understand before the technology. For a rough analogy, it's like saying "Software development 101: Choosing the right programming language". Sure it matters, but knowing what your software needs to support and what the primary use cases are are more important to understand. Ninja edit: Grammar. |
What this has taught me over time is that the best systems, whether for data modeling and data storage, or for software, are systems that go to great lengths to ensure it is extremely cheap and easy to reconfigure, redesign, scrap everything and try something new, and adapt your designs to the real life pain points you didn't anticipate.
I've been very frustrated in the last several years because you see so many people trying to shoehorn this sort of idea into so-called "Agile" methods, but those methods put the emphasis in the wrong place. They depict agility as a property of the team of humans beings, and the particular tasks and schedules of the humans involved. They don't do anything to improve the agility of the underlying software systems, and you can have the most Agile team ever but still get burnt by committing to a bad and hard-to-change design, even if you're generating gobs of story points or other nonsense.
One of the most important and powerful planning tools is to prototype your architecture. Begin doing actual work with it. "Beta test" it with a limited set of actual business users. Collect data, like you're profiling, and make these decisions with evidence about your actual use case, not trite analogies or generalizations of it.
And place a premium on tools that demonstrably make it easy to scrap an underperforming design and replace it with a different design.