by bertil 2620 days ago

1. What I’m working on at the moment is AB-testing, so no real models there; plenty of simulations and tests though.

2. There are several videos of Jan describing his work, including that one, so I’ll let him give examples of what he means by models: https://www.datasciencefestival.com/video/dsf-day-2-jan-teic...

3. At the big company, it’s an e-commerce website with many products along many dimensions, so the models cover which aspects of a product customers would be interested in, whether they are likely to commit to purchasing now or are just browsing, and price sensitivity against other factors. They typically have non-authenticated users, so they have to guess a lot about them from time of day, country of connection, type of device used and browsing rhythm. The inferences are not perfect, but they inform how the product is presented, and have a meaningful impact on conversion.

4. In the presentation at Trainline, they are not explicit about what models they have in mind, but it’s also an e-commerce company, so they face a lot of similar decisions.

One unique problem they had talked about openly before (UK train companies are not really reactive but British people love their festivals, championship matches, protests, horse- and dog-races and drinking during all of the above): they deal with the occasional crowded train, so they are trying to predict if a train is going to be swamped and if the person booking is going to the event in question. In the latter case, they’d rather avoid the loud fans or drunken top-hatted horse-owners.

All of the above models are trying to predict something for which ground truth is available (typically: buying behaviour), often based on data obtained minutes later. That means all of these teams monitor model accuracy, typically off-line. In most cases, they also monitor the impact of using the model: better recommendations should lead to better conversion, but also, say, a higher MRR.
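
To make the monitoring idea concrete, here is a rough sketch of what scoring a model off-line against late-arriving ground truth might look like. Everything in it is an assumption for illustration: the session ids, the dict-based storage and the function names are made up, and neither company has described their actual setup in these terms.

```python
def offline_accuracy(predictions, outcomes):
    """Share of predictions that matched the observed outcome.

    predictions: dict of session_id -> predicted "will buy" (bool), hypothetical
    outcomes:    dict of session_id -> observed purchase (bool), arriving later
    Sessions without ground truth yet are skipped.
    """
    scored = [(predictions[s], outcomes[s]) for s in predictions if s in outcomes]
    if not scored:
        return None  # nothing to evaluate yet
    return sum(p == o for p, o in scored) / len(scored)


def conversion_rate(outcomes, sessions):
    """Share of the given sessions that ended in a purchase (impact metric)."""
    known = [s for s in sessions if s in outcomes]
    if not known:
        return None
    return sum(outcomes[s] for s in known) / len(known)


# Made-up data: four predictions, ground truth available for three of them.
predictions = {"a": True, "b": False, "c": True, "d": False}
outcomes = {"a": True, "b": False, "c": False}

accuracy = offline_accuracy(predictions, outcomes)      # 2 of 3 scored sessions match
exposed_rate = conversion_rate(outcomes, ["a", "b"])    # sessions shown the model's output
```

The first function tracks accuracy; the second is the kind of number you would compare between sessions that saw the model's output and those that did not, to monitor impact rather than accuracy.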