by ariskk
3391 days ago
Hi. I am the author of the article.
Thank you for spending time to read this.
The combined reach of the co-founders is very large, so being able to provably handle scale was an essential prerequisite. Additionally, the requirements of the platform extend well beyond a simple content server. Content performance is tracked in real time and fed into multiple ranking and recommendation models. Those models change frequently, so we need a way to retroactively process our data.
Flexibility is key when trying to build an intelligent platform. Thus, we decided early on to invest time in the ability to quickly iterate and experiment on algorithms, in real time over live data.
You are right that the API fleet could be implemented using the aforementioned technologies; we use Scala and thus decided to use Akka HTTP instead. The challenging part is how you manage state behind that.
|
In my experience, all these features sound nice on paper. But you quickly run into practical issues that are far easier to deal with when you know approximate information about the state.
E.g., developing a model? You might just want a subset or a batch of the data. Doing BI/analytics? Are you going to continuously tax your server to recompute? The argument about recommender systems is also honestly flimsy, and I say that having built and applied such systems to live traffic at very large scale (hundreds of millions of users or more). There is only a small advantage in being able to quickly reconfigure flows. In most cases you have a single baseline model, and you compare a challenger against it on a small fraction of the traffic. The real complexity and gains in recommender systems lie in the choice of algorithm, hyper-parameters, and features, not in continuous multi-armed bandits with 1000 different models applied simultaneously while waiting an infinite amount of time to produce any statistically meaningful answer. In fact, for a website like this one, recommender systems can only provide so much advantage.
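To make the baseline-versus-small-fraction setup concrete, here is a minimal Python sketch of a deterministic traffic split. It is only an illustration under assumptions: the function name `assign_model`, the 5% default fraction, and the `salt` value are hypothetical and do not come from any system discussed in this thread.

```python
import hashlib


def assign_model(user_id: str, challenger_fraction: float = 0.05,
                 salt: str = "exp-1") -> str:
    """Route a user to 'baseline' or 'challenger' deterministically.

    Hypothetical helper: hashing (salt, user_id) keeps each user's
    assignment stable across requests without storing any state.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a float in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "challenger" if bucket < challenger_fraction else "baseline"


if __name__ == "__main__":
    users = [f"user-{i}" for i in range(100_000)]
    share = sum(assign_model(u) == "challenger" for u in users) / len(users)
    print(f"challenger share: {share:.3f}")
```

Changing the salt reshuffles users into new buckets, which is one simple way to start a fresh comparison without carrying over the previous assignment.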
That said, there are several really good specialized use cases, e.g. Google secmon-tools uses a system like this one [1].
[1] https://web.stanford.edu/class/cs259d/lectures/Session11.pdf