| The first link provided seems to need a very detailed human-provided cost function for specific development needs. The second one is indeed interesting research and seems to be a combination of the prior learned motion mapping working in tandem with a generative model. I suppose you could say that the automation of the dataset counts as "augmentation", but the difference here is that the dataset is just pixels and inputs rather than all that animation info and simulation data. Yes, a simulation is running, but the GAN only gets the pixels and the input. There's a similarity there though; you're right. In either case, the explicit goal of the video you posted is to combat runtime constraints of generative models. I'm not certain it's a fair comparison. The latter video and sentdex's result both seem to generalize to unique scenarios not present in the training set. This may mean they are learning an efficient representation of the underlying data in order to predict future samples, rather than simply overfitting. The top level comment here is a shallow dismissal, and Randomoneh could have answered these questions themselves before throwing out a smug comment like "I fail to see novelty here" when it's at the very least the first large-scale GAN successfully trained on GTA V. |
>animation info and simulation data
But did your model learn any of that?
>explicit goal of the video you posted is to combat runtime constraints
The trick to motion mapping is feeding a lot of data with accompanying inputs to build an atlas you can reference during playback.
>first large-scale GAN successfully trained on GTA V
It's really cool. The problem I had is with the presentation. I immediately felt insincerity bordering on scamming the audience, because I assume someone working in this field would know how the sausage is made. From the YT clip: "the shadow and reflection works", "modeling of physics works". Do they? Or did your model build an atlas of video frames it can play back according to the fed input? I'm guessing weather/time of day was locked when recording training data: perfect shadow and constant sun position for a nice reflection. Searching for 1:1 matches of generated output in the training set would be interesting and pretty revealing.
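
For what it's worth, here is a rough sketch of what that 1:1 search could look like. It is not anything from the project itself: the function name `nearest_training_frames`, the array shapes, and the `threshold` value are all my own assumptions, and plain pixel MSE is a crude stand-in for a proper similarity measure (a perceptual hash or a feature-space distance would probably be more robust to small shifts and noise).

```python
import numpy as np

def nearest_training_frames(generated, training, threshold=1e-3):
    """For each generated frame, find the closest training frame by pixel MSE.

    generated: array of shape (G, H, W, C), values scaled to [0, 1] (assumed)
    training:  array of shape (T, H, W, C), same scaling (assumed)
    threshold: hypothetical MSE cutoff below which a frame counts as a near-copy
    """
    g = generated.reshape(len(generated), -1).astype(np.float64)
    t = training.reshape(len(training), -1).astype(np.float64)

    # Pairwise squared distances via ||g||^2 - 2 g.t + ||t||^2,
    # which avoids materialising a (G, T, pixels) array.
    d2 = (g ** 2).sum(axis=1)[:, None] - 2.0 * (g @ t.T) + (t ** 2).sum(axis=1)[None, :]
    d2 = np.maximum(d2, 0.0)  # clamp tiny negatives from floating-point rounding

    mse = d2 / g.shape[1]
    idx = mse.argmin(axis=1)                 # index of the closest training frame
    best = mse[np.arange(len(g)), idx]       # its MSE
    return idx, best, best <= threshold
```

If most generated frames came back flagged as near-copies, that would point toward atlas-style playback; if the nearest matches were consistently far away, that would be some evidence the model is producing frames it never saw. For a real training set you would have to run this in batches, since the full distance matrix won't fit in memory.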