by ariskk
3390 days ago
You mention the word "batch" when talking about models. Also "BI/Analytics". Since Django/Rails applications support neither of the two, another sort of system would be needed. This is the point where, having built everything on Django with no foresight whatsoever about future requirements, we would have ended up creating DataFrames from SQL tables in Spark. Our BI guys have no experience with Spark, so we would need to load data into a DW-like solution such as BigQuery/Redshift/Impala/Presto/you-name-it. Instead of adding another sink in Flink, we would need to implement and schedule ETL jobs.
Even at our current load, computing counters (e.g. likes) at read time would be slow and inefficient, which means we would need a way to pre-aggregate them. Maybe another service, possibly behind a queue?
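To make the counter point concrete, here is a minimal sketch of the two approaches. It is illustrative only and not our actual code: `likes_log`, `count_at_read_time`, and `LikeCounter` are made-up names, and an in-memory `deque` stands in for a real queue.

```python
from collections import Counter, deque

# Hypothetical event log: one (user_id, post_id) pair per like.
likes_log = [("u1", "p1"), ("u2", "p1"), ("u1", "p2")]


def count_at_read_time(post_id):
    """Scan every like on each read; cost grows with the total number of likes."""
    return sum(1 for _, pid in likes_log if pid == post_id)


class LikeCounter:
    """Pre-aggregated counts, updated as like events arrive from a queue."""

    def __init__(self):
        self.counts = Counter()

    def consume(self, queue):
        # Drain the queue, incrementing the counter for each liked post.
        while queue:
            _, post_id = queue.popleft()
            self.counts[post_id] += 1

    def read(self, post_id):
        # Single dictionary lookup, independent of how many likes exist.
        return self.counts[post_id]


queue = deque(likes_log)  # stand-in for a real message queue
counter = LikeCounter()
counter.consume(queue)
```

Both `count_at_read_time("p1")` and `counter.read("p1")` give the same number; the difference is that the first rescans the log on every request, while the second pays the cost once, when the event arrives.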
You can see where I am going. As requirements evolve, systems evolve, and with no planning beforehand, people end up with spaghetti architectures.
We knew we were funded well enough to run for a couple of years. We knew the site would have traffic. We were tasked with delivering an algorithmically driven product, and this is the solution we came up with. I really do not understand how such a strong set of conclusions can be drawn from so little information.