by rdtsc 2799 days ago
> Google says that its machine learning detects what objects are in the frame, and the camera is smart enough to know what color they are supposed to have.

That is absolutely impressive.

The color and text on the fire extinguishers along with the texture detail seen in the headphones in the last picture are just stunning. Congratulations to anyone who worked on this project!

It's impressive, but it also means that your camera isn't always going to capture what's there -- it'll capture what it guesses was there. I wonder how easily it could be fooled into capturing something that's not there.
Soon these cameras will be able to take Milky Way photos in San Francisco. (Find a few stars, and it fills in the blanks.) If you want to simulate a long exposure, the cameras will add star trails.

It's too bad that the technology is proprietary. I'm curious what could be done with a larger-sensor camera, from compact cameras to DSLRs.

I think the article is factually correct but makes it sound a little more complicated or advanced than it probably is. I mean, depending on how you interpret it, you could think that it basically does "hey, this looks like a fire hydrant, let me paste a fire hydrant in there", which is obviously not exactly something AI can do reliably today, especially on phone hardware.

I'm guessing that it works similarly to low-budget astrophotography, but with the computer doing all the busywork for you. When you want to photograph stars or planets and you don't have a fancy tracking mount to compensate for Earth's rotation, you'll get very mediocre results with a long exposure. If you expose long enough to see the object clearly, you get motion blur. If you use a shorter exposure to reduce the blur, you don't have enough light to get a clear picture.

One solution is to take a bunch of low-exposure pictures in a row and then add them together (as in, sum the values of the non-gamma-corrected pixels) in post, while taking care to move or rotate each picture to line everything up. This way you simulate a long exposure while at the same time correcting for the displacement.
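
A minimal sketch of that shift-and-add idea in plain Python. It assumes the drift of each frame is already known as a whole-pixel (dy, dx) offset and that pixel values are linear rather than gamma-encoded. `shift_frame`, `stack_frames` and `star_frame` are made-up names for illustration; real stacking software also has to estimate the offsets and deal with rotation and sub-pixel shifts.

```python
def shift_frame(frame, dy, dx, fill=0.0):
    """Translate a 2D list by (dy, dx) pixels; pixels with no source get `fill`."""
    h, w = len(frame), len(frame[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y - dy, x - dx
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = frame[sy][sx]
    return out


def stack_frames(frames, offsets):
    """Undo each frame's drift, then sum the linear pixel values."""
    h, w = len(frames[0]), len(frames[0][0])
    total = [[0.0] * w for _ in range(h)]
    for frame, (dy, dx) in zip(frames, offsets):
        aligned = shift_frame(frame, -dy, -dx)
        for y in range(h):
            for x in range(w):
                total[y][x] += aligned[y][x]
    return total


def star_frame(col, h=3, w=5):
    """Hypothetical test frame: one faint star (value 1.0) on row 1."""
    frame = [[0.0] * w for _ in range(h)]
    frame[1][col] = 1.0
    return frame


# The star drifts one pixel to the right per frame.
frames = [star_frame(1), star_frame(2), star_frame(3)]
offsets = [(0, 0), (0, 1), (0, 2)]  # assumed known drift per frame
stacked = stack_frames(frames, offsets)
```

Here `stacked[1][1]` ends up at 3.0, whereas a single exposure of the same total length would have smeared the star across three pixels at 1.0 each.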

Another advantage is that you can effectively do "HDR". Suppose you're taking a panorama with the Milky Way in the sky and some city underneath it: with a long exposure, the lights of the city would saturate completely. With shorter exposures you can correct that in post by scaling the intensity of the lights in the city as you add pixels (or summing fewer pictures for these areas). This way you can effectively have several levels of exposure in the same shot, and you can tweak all that in post. In the city/Milky Way example you'll also need to compensate for the motion in the sky but obviously not on land, which is also something you can't really do "live".
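
To make the saturation point concrete, here is a rough sketch under simple assumptions: linear pixel values, a sensor that clips at 1.0, and a per-pixel gain mask chosen by hand. `long_exposure` and `stack_with_gain` are hypothetical helpers, not anything from the article.

```python
def long_exposure(frames, full_scale=1.0):
    """Simulate one long exposure: light accumulates, then the sensor clips."""
    h, w = len(frames[0]), len(frames[0][0])
    return [[min(sum(f[y][x] for f in frames), full_scale) for x in range(w)]
            for y in range(h)]


def stack_with_gain(frames, gain):
    """Sum short frames without clipping, scaling each pixel by gain[y][x]."""
    h, w = len(frames[0]), len(frames[0][0])
    return [[gain[y][x] * sum(f[y][x] for f in frames) for x in range(w)]
            for y in range(h)]


# One row: dim sky, then two city lights of different brightness
# (linear units collected per short frame).
short = [[0.015625, 0.25, 0.5]]
frames = [short] * 8
gain = [[1.0, 0.125, 0.125]]  # hypothetical hand-painted mask: turn the city down

clipped = long_exposure(frames)          # [[0.125, 1.0, 1.0]]
merged = stack_with_gain(frames, gain)   # [[0.125, 0.25, 0.5]]
```

The long exposure clips both lights to the same value, so the difference between them is gone. The stacked version keeps the sky at the same brightness and still shows that one light is twice as bright as the other.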

I have a strong suspicion that it's basically what this software is doing: take a bunch of pictures, do edge/object detection to realign everything (probably also using the phone's IMU data), fit the result on some sort of gamma curve to figure out the correct exposure, then add color correction based on a model of the sensor's performance under low light (since I'm sure by default under these conditions the sensor will start breaking down and favor some colors over others). Then maybe go through a subtle edge-enhancing filter to sharpen things a bit more and remove any leftover blurriness.
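
For the last step, one common edge-enhancing filter is an unsharp mask: blur the signal, then add back a fraction of the difference between the original and the blur. That choice is an assumption for illustration; nothing in the article says which filter, if any, is used. A 1D sketch with made-up helper names:

```python
def box_blur(signal):
    """3-tap box blur; the ends reuse the nearest sample."""
    n = len(signal)
    return [(signal[max(i - 1, 0)] + signal[i] + signal[min(i + 1, n - 1)]) / 3.0
            for i in range(n)]


def unsharp_mask(signal, amount=0.5):
    """Boost edges by adding back `amount` times (original - blurred)."""
    blurred = box_blur(signal)
    return [s + amount * (s - b) for s, b in zip(signal, blurred)]


edge = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]  # a soft step from dark to bright
sharpened = unsharp_mask(edge)
```

Flat areas come out unchanged, while the two samples next to the step overshoot slightly (a bit below 0 on the dark side, a bit above 1 on the bright side), which is what reads as extra sharpness.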

If I'm right then it's definitely a lot of very clever software but it's not like it's really "making up" anything.

> which is obviously not exactly something AI can do reliably today, especially on phone hardware

But do we know that it runs on phone hardware? If voice interfaces have taught us anything, it's that we can't ever make that assumption again.

The amount of data you'd have to send to run this off board would be enormous, but hey, anything for a jawdropping hype feature, right? It just works, those preview pictures literally made me check the price and size of the pixel 3, and I haven't been interested in anything but a Sony Compact since the z1c came out.

I think it's taking as many pictures as possible and using the very slightly different angles to get as good a resolution as possible with as little noise as possible (I have no idea if that's really what's happening here).
That's not what this article says. Do you have reason to believe it's incorrect? The quote about object detection in the parent post came from the article.
I'm sure the first step is taking many sharper short-exposure shots (as opposed to longer exposures, which blur), then doing some tensor magic to stitch them into a single image.

Object detection alone won't give you sharp text in low light. You need a minimum number of photons hitting pixels.
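
A quick simulation of why more collected light helps, as a sketch only. It models per-frame noise as Gaussian with a fixed sigma, which is a simplification of real photon (shot) noise, and the signal level, sigma and frame count are arbitrary. Averaging N independent frames should cut the noise by roughly sqrt(N).

```python
import random
import statistics


def noisy_reading(signal, noise_sigma, rng):
    """One short-exposure pixel reading: true signal plus Gaussian noise."""
    return rng.gauss(signal, noise_sigma)


def averaged_reading(signal, noise_sigma, n_frames, rng):
    """Average of n_frames independent readings of the same pixel."""
    total = sum(noisy_reading(signal, noise_sigma, rng) for _ in range(n_frames))
    return total / n_frames


rng = random.Random(0)  # fixed seed so the run is repeatable
trials = 2000
single = [noisy_reading(10.0, 4.0, rng) for _ in range(trials)]
stacked = [averaged_reading(10.0, 4.0, 16, rng) for _ in range(trials)]

noise_single = statistics.pstdev(single)
noise_stacked = statistics.pstdev(stacked)
ratio = noise_single / noise_stacked  # expected to be close to sqrt(16) = 4
```

With 16 frames the noise drops by about a factor of four, and no amount of object detection changes that budget: it can only decide what to do with the photons that actually arrived.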

From the article: > Although it’s not one single long exposure, Google’s night mode still gathers light over a period of a few seconds, and anything moving through the frame in that time will turn into a blur of motion.
Right, but it's up to them how to process the information coming in during these few seconds. Without a tripod, it would just be one big blur unless the data is separated into frames. I don't think optical stabilization would be enough for such long exposures.
Yeah, but in fact the eye does the same thing. I think a lot of people are to some degree aware of the biases in their photos, from color filters to AI filters, and will adjust their response internally, or not.