| HN Mirror

> Do you mean "time to solution" or "time to inference"?

I meant time to a real solution that works well enough to put into a product.

> There are also very public examples of this e.g. Google's data center cooling [0] and competitive sailing [1].

DeepMind really needed DRL wins on real problems.

McKinsy has a strong incentive to be able to say "we know all about the AI RL magic" (and all the better that it's in the context of an oligarchy's entry in a Rich Person Sport... such C-suite/investor class cred!)

In both cases, DRL was used because it was the right tool for the job. But, in both cases, proving DRL can be useful was the job! Go is a better example, but of course wasn't solving a real problem.

If you throw enough engineering time and compute at DRL, it can usually work well enough. (There is a real benefit to "just hack at it long enough" over "know the right bits of control theory".)