Having played StarCraft, I can say the idea doesn't work. StarCraft happens too quickly, demands too much precision, and has too low a tolerance for latency to wait for a human to absorb, understand, and execute instructions.
Having programmed simple NNs, I can also say the idea doesn't work. The amount of time you'd have to spend having humans execute instructions during training would be astronomical. They were training 16,000 games simultaneously for 44 days, most likely running at some multiple of normal game speed. Moreover, you lose out on supervised learning, because we don't have data sets of "the human told the other human to do this"; we only have data sets (almost a million games large) of "the human did this".
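To put a rough number on "astronomical", here is a back-of-the-envelope sketch using only the two figures quoted above (16,000 simultaneous games, 44 days). The `speed_multiplier` parameter is my own hypothetical knob, since the actual speed-up isn't known; at 1.0 the result is a lower bound, and it ignores the extra time a human would need to hear and carry out each instruction.

```python
# Figures quoted above: 16,000 simultaneous games, 44 days of training.
CONCURRENT_GAMES = 16_000
TRAINING_DAYS = 44


def human_play_years(speed_multiplier: float = 1.0) -> float:
    """Years of real-time play needed to cover the same amount of game time.

    speed_multiplier is a hypothetical factor for how much faster than
    real time the training games ran; 1.0 gives a lower bound.
    """
    game_days = CONCURRENT_GAMES * TRAINING_DAYS * speed_multiplier
    return game_days / 365.25


if __name__ == "__main__":
    print(f"{human_play_years():.0f} years at 1x speed")
    print(f"{human_play_years(10):.0f} years at a hypothetical 10x speed")
```

Even at 1x that is about 704,000 game-days, or roughly 1,900 years of one human pair playing nonstop, before accounting for any speed-up.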