
by shutty 1552 days ago

Right now it runs in dev mode on a single EC2 t3.large instance with a load average of ~0.30, but the inference load is quite tiny at the moment: around 3-4 reranking requests per second. And yes, as a typical open-source project, it still crashes from time to time :)

The training dataset is not that huge (see https://github.com/metarank/ranklens/ for details, it's open-source), so we do a full retraining directly on the node right after the deployment, and it takes around 1 minute to finish. We also run the same process inside the CI: https://github.com/metarank/metarank/blob/master/run_e2e.sh

There is an option to run this thing in a distributed mode:

* Training is done in a separate batch job running on Apache Flink (and on k8s using Flink's Kubernetes integration).

* Feature updates are done in a separate streaming Flink job, which writes everything to Redis.

* The API fetches the latest feature values from Redis and runs the ML model.

The dev mode I mentioned earlier is when all three of these things are bundled together in a single process to make it easier to play with the tool. But we haven't spent much time testing the distributed setup, as this is still a hobby side project and we have limited time to develop it.
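
To make the three roles more concrete, here is a rough Python sketch of that split. It is not Metarank's actual code: the function names, the event format, and the smoothed click-through-rate "model" are all made up for illustration, and a plain dict stands in for Redis.

```python
from collections import defaultdict

# Stand-in for Redis (assumption): item_id -> feature name -> value.
feature_store = defaultdict(lambda: defaultdict(float))


def update_features(event):
    """Streaming-job role: fold one interaction event into the stored features."""
    item = feature_store[event["item"]]
    if event["type"] == "impression":
        item["impressions"] += 1.0
    elif event["type"] == "click":
        item["clicks"] += 1.0


def train(store):
    """Batch-job role: a toy 'model' that is just a smoothed click-through rate."""
    def score(item_id):
        f = store[item_id]
        return (f["clicks"] + 1.0) / (f["impressions"] + 2.0)
    return score


def rerank(model, candidates):
    """API role: read the latest feature values and sort candidates by score."""
    return sorted(candidates, key=model, reverse=True)


# Hypothetical event stream, invented for this example.
events = [
    {"item": "a", "type": "impression"},
    {"item": "b", "type": "impression"},
    {"item": "b", "type": "click"},
    {"item": "c", "type": "impression"},
    {"item": "c", "type": "impression"},
]
for e in events:
    update_features(e)

model = train(feature_store)
ranking = rerank(model, ["a", "b", "c"])
```

In dev mode all three functions would live in one process, as they do here; in the distributed mode each role would be a separate job or service talking to the shared store.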

2 comments

jka 1552 days ago

From reading some of the repository and architecture overview, I think this is true, but: could you confirm that users of metarank can self-train their own models from scratch?


shutty 1552 days ago

This is actually part of our CI process: https://github.com/metarank/metarank/blob/master/run_e2e.sh . This script runs on every PR to retrain the model used in the demo and confirm that it's working fine.

So you can just download the jar file from the releases page and run ./run_e2e.sh <jar file> in the checked-out repository; it should do the job.


jka 1552 days ago

Thanks!


dannywarner 1552 days ago

What would be the approximate cloud infrastructure budget for an e-commerce website with 100K buyers per month and typical purchase habits? I am new to Flink. We use Redis in production.
