by omtinez 3185 days ago
I haven't read the paper in full detail, but reading between the lines I'm guessing that there's a significant amount of manual processing and hand-waving involved. From the abstract, emphasis mine:

> the second stage uses a pixel-wise nearest neighbor method to map the smoothed output to multiple high-quality, high-frequency outputs in a controllable manner.

My interpretation is that they select training data by hand and generate a bunch of outputs, repeating the process until they like the final result. From the paper:

> we allow a user to have an arbitrarily-fine level of control through on-the-fly editing of the exemplar set (E.g., “resynthesize an image using the eye from this image and the nose from that one”).
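
To make the "pixel-wise nearest neighbor" idea concrete, here is a rough sketch in plain Python. It is my own guess at the general technique, not the paper's code. The names `patch`, `sq_dist` and `pixelwise_nn` are made up for illustration. Describing each pixel by its raw 3x3 neighborhood in the smoothed image and comparing with squared Euclidean distance are both assumptions, and the paper may well use a different descriptor.

```python
def patch(img, r, c, radius=1):
    """Flatten the neighborhood around (r, c), clamping coordinates at the borders."""
    h, w = len(img), len(img[0])
    return [
        img[min(max(r + dr, 0), h - 1)][min(max(c + dc, 0), w - 1)]
        for dr in range(-radius, radius + 1)
        for dc in range(-radius, radius + 1)
    ]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pixelwise_nn(smoothed, exemplars, radius=1):
    """For each pixel of `smoothed`, copy the sharp exemplar pixel whose
    smoothed neighborhood is closest.

    `exemplars` is a list of (smooth_img, sharp_img) pairs; images are
    lists of rows of floats (grayscale). ASSUMPTION: raw-pixel descriptors,
    brute-force search; a sketch only.
    """
    # Index every exemplar pixel by the descriptor of its smoothed neighborhood.
    index = [
        (patch(smooth, r, c, radius), sharp[r][c])
        for smooth, sharp in exemplars
        for r in range(len(smooth))
        for c in range(len(smooth[0]))
    ]
    out = []
    for r in range(len(smoothed)):
        row = []
        for c in range(len(smoothed[0])):
            query = patch(smoothed, r, c, radius)
            # Brute-force nearest neighbor; ties go to the first indexed pixel.
            _, value = min(index, key=lambda entry: sq_dist(entry[0], query))
            row.append(value)
        out.append(row)
    return out


# Same smoothed input, two different exemplar sets.
smoothed = [[0.5, 0.5], [0.5, 0.5]]
flat = [[0.5, 0.5], [0.5, 0.5]]
bright = (flat, [[1.0, 1.0], [1.0, 1.0]])
dark = (flat, [[0.0, 0.0], [0.0, 0.0]])

out_bright = pixelwise_nn(smoothed, [bright])
out_dark = pixelwise_nn(smoothed, [dark])
```

The two calls at the end take the same smoothed input but return different results depending on which exemplars are in the set, which is how I read "controllable" in the abstract: whoever picks the exemplar set picks the output.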

2 comments

There's nothing weak or negative about that, it's exactly what you'd expect. Obviously for a given input there will be multiple plausible outputs. With any such system it would make sense to allow some control in choosing among the outputs.
Could be pretty great for police sketch artists. (Although pretty misleading for juries too.)
Just train the model with the suspect's Facebook photostream and presto you have convincing evidence.