by citnaj 2788 days ago
Author here. That's actually what I find quite fascinating myself about the results- that they look almost perfect at first glance, yet you drill down a bit closer and you see another "zombie hand". The resolution issue you mention is definitely something I'm painfully aware of- it just comes down to lack of memory resources to support bigger renderings. That's going to be something I'm going to try to attack next.
2 comments

Hey, thanks for replying!

However, I feel like you glossed over the proposed workaround, which I think is appropriate and extremely easy to implement (though more complicated if you also want to implement "defade").

I took a couple of minutes to write an Octave script that implements the workaround [1]. It would have been even easier if both images had already been distinct files and perfectly aligned.

The basic idea here is the same as the one behind the YUV transform: our brains are much less sensitive to the chroma channels than the luma channel. So I separate those, and keep the original luma channel, while I use the reconstructed chroma, which is lower-resolution.
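For readers who want to try the idea, here is a minimal Python/NumPy sketch of the luma/chroma swap described above. It is not the linked Octave script: the function names, the BT.601 YUV coefficients, and the nearest-neighbour upscaling are all assumptions chosen for illustration, and it expects both images as float arrays in [0, 1] that are already aligned.

```python
import numpy as np

# Analog YUV (BT.601) coefficients; an assumed choice, other luma/chroma
# transforms such as YCbCr would work the same way.
RGB_TO_YUV = np.array([
    [0.299, 0.587, 0.114],
    [-0.14713, -0.28886, 0.436],
    [0.615, -0.51499, -0.10001],
])
YUV_TO_RGB = np.linalg.inv(RGB_TO_YUV)


def rgb_to_yuv(rgb):
    """Convert an (H, W, 3) RGB array to YUV."""
    return rgb @ RGB_TO_YUV.T


def yuv_to_rgb(yuv):
    """Convert an (H, W, 3) YUV array back to RGB."""
    return yuv @ YUV_TO_RGB.T


def upscale_nearest(img, height, width):
    """Nearest-neighbour resize of an (h, w, C) array to (height, width, C)."""
    rows = np.arange(height) * img.shape[0] // height
    cols = np.arange(width) * img.shape[1] // width
    return img[rows][:, cols]


def merge_luma_chroma(original_gray, colorized_rgb):
    """Keep the full-resolution luma, borrow chroma from the low-res colorized image.

    original_gray: (H, W) floats in [0, 1], the high-resolution source.
    colorized_rgb: (h, w, 3) floats in [0, 1], the lower-resolution model output.
    """
    height, width = original_gray.shape
    chroma = rgb_to_yuv(colorized_rgb)[..., 1:]        # drop the model's luma
    chroma = upscale_nearest(chroma, height, width)    # stretch U and V to full size
    yuv = np.concatenate([original_gray[..., None], chroma], axis=-1)
    return np.clip(yuv_to_rgb(yuv), 0.0, 1.0)
```

A smoother interpolation (bilinear or bicubic) for the chroma would likely look better than nearest-neighbour; it is used here only to keep the sketch dependency-free.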

Judge the results by yourself, but it seems to me that the end results are a whole lot better: https://imgur.com/a/n2sBYCi

And it could still be improved a lot more (by using the original high-resolution image, and by not having to hand-align the images).

Edit: also, ironically, the Indigo dye (thus blue clothes) didn't become common before the 1900s [2], so the bias might produce historically-inaccurate images!

[1] https://gist.github.com/MayeulC/626bafbaf925fb3a3c80fdba76b7...

[2] https://en.wikipedia.org/wiki/Indigo_dye#Synthetic_indigo

Oh shit yeah that really does look good! Amazing really. Ok...I'm going to put these notes on the project board.

Yes..I definitely glossed over the proposed workaround and I apologize. Thanks for this.

No problem :)

Although I would have made it a fully-fledged GitHub issue, with a link on your board, instead of a text entry, so that supplementary material can be added in the issue thread.

Bonus: if you are only interested in chrominance, you can train your network to use YUV as an input instead, and output only UV. I suspect this might lead to substantial gains in the training time and network complexity.

Update: I got this working, and dude- it's so awesome in every way. This is the most substantial improvement I've seen yet. Most importantly- it massively reduces memory requirements. Thank you so much. I'll commit within a day or so and make sure to mention you on Twitter.

Hey, thank you a lot, that's awesome! One more thing I recently thought about, but didn't get around to mentioning, is that you can probably reduce the input of your net to the Y (luminance) channel (with UV-only output), to trim it further ;)

But that might already be what you are doing, for all I know. I am just really glad I could be of any help! And this feels like a "free-lunch" improvement.
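To make the "Y in, UV out" suggestion concrete, here is a rough Python/NumPy sketch of the data preparation only. The network itself is not shown, and nothing here reflects how the project actually implements it: `make_training_pair`, `recombine`, and the BT.601 coefficients are hypothetical names and choices made for illustration.

```python
import numpy as np

# Analog YUV (BT.601) coefficients; an assumed choice of luma/chroma transform.
RGB_TO_YUV = np.array([
    [0.299, 0.587, 0.114],
    [-0.14713, -0.28886, 0.436],
    [0.615, -0.51499, -0.10001],
])
YUV_TO_RGB = np.linalg.inv(RGB_TO_YUV)


def make_training_pair(color_rgb):
    """Split a color photo into (network input, network target).

    color_rgb: (H, W, 3) floats in [0, 1].
    Returns y with shape (H, W, 1) and uv with shape (H, W, 2).
    """
    yuv = color_rgb @ RGB_TO_YUV.T
    return yuv[..., :1], yuv[..., 1:]


def recombine(y, predicted_uv):
    """Rebuild an RGB image from the untouched luma and the predicted chroma."""
    yuv = np.concatenate([y, predicted_uv], axis=-1)
    return np.clip(yuv @ YUV_TO_RGB.T, 0.0, 1.0)
```

With this split the model only has to predict two channels, and the luma shown to the viewer never passes through the network, so it cannot be blurred by it.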

Yeah the more I churn over this idea in my head the more excited I get about it. This really sounds like a big win.

I'm not sure what I want to do about the Kanban board versus issues tracker yet... I'm used to JIRA mostly. I'll figure it out but do know your contribution is very very much appreciated. I don't think I would have come up with that.

Thanks for being so present in the comments :)

I don't know much about ML, but would it be possible to use some kind of attention model to iteratively construct the final colouring? The memory limit of the GPU would then limit the attention region size, but not the maximum image size. Talkin' outta my rear here, though.

I was actually thinking along the same lines because yeah...if you could break this problem down into smaller pieces, it would probably be the most effective way to reduce memory requirements. But I do think that's easier said than done. This is where I think I'll have to rely on Ian Goodfellow and others to come up with another something brilliant for me to stick in the code lol
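As a very rough illustration of "breaking the problem into smaller pieces", here is a naive tiling sketch in Python/NumPy. To be clear, this is plain independent tiling, not an attention model, and `process_in_tiles` and `tile_fn` are hypothetical names; `tile_fn` stands in for whatever per-tile colorizer would be used and is assumed to return an array of the same shape as its input.

```python
import numpy as np


def process_in_tiles(image, tile_fn, tile_size=256):
    """Apply tile_fn to non-overlapping tiles so peak memory depends on tile_size.

    image: (H, W, C) array. tile_fn must return an array shaped like its input.
    Edge tiles are simply smaller when H or W is not a multiple of tile_size.
    """
    out = np.empty_like(image)
    height, width = image.shape[:2]
    for top in range(0, height, tile_size):
        for left in range(0, width, tile_size):
            tile = image[top:top + tile_size, left:left + tile_size]
            out[top:top + tile_size, left:left + tile_size] = tile_fn(tile)
    return out
```

The catch, and probably why this is easier said than done, is that each tile is processed without any global context, so colors may not agree across tile borders.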