| HN Mirror



	by bayindirh 2066 days ago
	Since the losses in the analog conversion process cannot be determined exactly, the model is bound to add some noise to the converted audio. Video has more spatial data from which to guess color and motion, so it's easier in practice. The unconverted sound may be crisper and have more detail, but there's no guarantee that it's the original detail, so it won't be the original recording itself.


Buttons840 2066 days ago

Light pollution adds noise to telescopes too. From a sky where you can barely see 4 stars you can pull detailed colored images of deep space objects.

Perhaps you have to play the media a few dozen times and do the media equivalent of frame stacking to see through the noise.

It's also quite possible no ML would be needed. I don't think frame stacking uses ML.

I wouldn't be surprised if you could play a song on repeat from the other side of your house and extract a very good copy of it, so long as you knew exactly when the song began and looped. You might only need to know the length of the song, not even when it began.

It might not be practical, but it would be a cool blog post.
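A minimal sketch of the looping idea, using a synthetic signal rather than a real recording. It assumes the loop length is known exactly in samples, the playback and recording clocks do not drift, and the noise is zero-mean and independent from one loop to the next. The names `fold_and_average`, `song`, and `start_offset` are made up for illustration. Folding the recording by the loop length works without knowing where the song begins; the result is just a circularly shifted copy.

```python
import numpy as np

def fold_and_average(recording, loop_len):
    """Cut the recording into loop_len-sample segments and average them."""
    n_loops = len(recording) // loop_len
    segments = recording[: n_loops * loop_len].reshape(n_loops, loop_len)
    return segments.mean(axis=0)

rng = np.random.default_rng(0)
loop_len = 2000
t = np.arange(loop_len)
# Hypothetical "song": two sine components that fit the loop exactly.
song = np.sin(2 * np.pi * 5 * t / loop_len) + 0.5 * np.sin(2 * np.pi * 13 * t / loop_len)

n_loops = 400
start_offset = 731  # where in the song the recording started (unknown to the listener)
clean = np.roll(np.tile(song, n_loops), -start_offset)
noisy = clean + rng.normal(0.0, 2.0, size=clean.size)  # noise much louder than the song

estimate = fold_and_average(noisy, loop_len)

# Only for scoring: undo the circular shift so we can compare against the original.
aligned = np.roll(estimate, start_offset)
rms_single = np.sqrt(np.mean((noisy[:loop_len] - clean[:loop_len]) ** 2))
rms_stacked = np.sqrt(np.mean((aligned - song) ** 2))
print(f"RMS error, one loop: {rms_single:.2f}  after {n_loops} loops: {rms_stacked:.2f}")
```

Random noise should shrink roughly with the square root of the number of loops averaged. Anything that repeats identically on every loop, such as the coloration added by the speaker, the room, and the microphone, is not reduced by this at all.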


bayindirh 2066 days ago

It's not the same thing. A CMOS sensor, especially a cooled, astro-class CMOS sensor, is much more sensitive than the eye.

The random noise in photography is emitted from the sensor itself and has a distinct profile. This profile can be extracted with certain procedures and used as a single-pass NR process with very high quality results (Darktable's Profiled NR does this if the camera's profile is generated/bundled). Subtractive NR does something similar: after the exposure, a closed-curtain exposure is taken with the same shutter value, and that image is subtracted from the first image. Since it's a per-sensor process, its quality is very high too.
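A rough sketch of the subtractive (closed-curtain) step on synthetic data, not any camera's actual firmware routine. The sensor model here is an assumption: each pixel has a fixed offset that repeats in every exposure (`fixed_pattern`), plus a small random read noise that differs per exposure. All names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
shape = (64, 64)

scene = rng.uniform(100.0, 200.0, size=shape)       # hypothetical true signal
fixed_pattern = rng.exponential(20.0, size=shape)   # per-pixel offset, same in every exposure

def expose(signal, read_noise=2.0):
    """Simulate one exposure: signal + repeatable pattern + fresh random noise."""
    return signal + fixed_pattern + rng.normal(0.0, read_noise, size=shape)

light_frame = expose(scene)            # normal exposure
dark_frame = expose(np.zeros(shape))   # shutter closed, same exposure settings

corrected = light_frame - dark_frame   # the repeatable pattern cancels out

err_before = np.sqrt(np.mean((light_frame - scene) ** 2))
err_after = np.sqrt(np.mean((corrected - scene) ** 2))
print(f"RMS error before: {err_before:.1f}  after dark subtraction: {err_after:.1f}")
```

In this model the subtraction removes the repeatable pattern but slightly increases the random part, because the dark frame carries its own read noise. Averaging several dark frames before subtracting is one way to reduce that.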

Light pollution is also somewhat similar. It's a specific wavelength, emitted from ground to sky (so its gradient can be known) and can be filtered out relatively easily with stacking and other RAW processing (assuming your image has enough bit-depth and headroom in both highlights and shadows). There are also new filters which directly filter this kind of pollution IIRC.

Stacking does something similar. Pixels with high consistency across frames are kept and pixels with low consistency are discarded, so you get a clean image. Sorry, I don't know its exact math since I don't have a tracker and don't take many astro photos.
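One common way to express "keep the consistent pixels, discard the inconsistent ones" is sigma-clipped averaging. The sketch below is an illustration of that general idea, not the exact math of any particular stacking program. It assumes the frames are already aligned, and `sigma_clip_stack` and `kappa` are names chosen here.

```python
import numpy as np

def sigma_clip_stack(frames, kappa=3.0):
    """Per pixel, average only the samples that lie close to the median."""
    frames = np.asarray(frames, dtype=float)
    center = np.median(frames, axis=0)
    # Robust spread estimate: median absolute deviation, scaled to match
    # a standard deviation for Gaussian noise.
    mad = np.median(np.abs(frames - center), axis=0)
    sigma = 1.4826 * mad
    keep = np.abs(frames - center) <= kappa * sigma
    kept_sum = np.where(keep, frames, 0.0).sum(axis=0)
    kept_count = keep.sum(axis=0)
    return kept_sum / kept_count

rng = np.random.default_rng(2)
truth = rng.uniform(10.0, 50.0, size=(32, 32))
frames = truth + rng.normal(0.0, 5.0, size=(50, 32, 32))
frames[7, 10, :] += 500.0  # a bright streak (e.g. a satellite trail) in one frame only

stacked = sigma_clip_stack(frames)
plain_mean = frames.mean(axis=0)

print("streak row error, plain mean:", np.abs(plain_mean[10] - truth[10]).mean())
print("streak row error, clipped   :", np.abs(stacked[10] - truth[10]).mean())
```

The plain mean leaves a faint ghost of the streak in the result, while the clipped average rejects that one frame for the affected pixels and averages the rest.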

However, in an analog recording you have incomplete information and you want to put it back via ML, which is basically a very educated guess in this case. A well-tuned and trained ML model would probably put back sensible or semi-sensible details, but it cannot guess and regenerate the missing parts with 100% accuracy.

So at the end of the day, in photography you have the ability to get the complete information (via stacking, subtractive NR, or cooling the sensor a great deal). In an analog recording, however, you don't have the complete information, especially if you record it via a speaker-to-microphone path (since they're not ideal reproducers).

We could go on to the sound characteristics of analog audio pipelines and vinyl from there, but that's another rabbit hole I'd rather not dive into now.


Buttons840 2066 days ago

That is a thoughtful and useful reply. Thank you.

From what you say, stacking can remove both random noise from the sensor and predictable noise such as light pollution. What other kinds of noise are there? It sounds like our noise removal ability is pretty good.

I am not an expert, though. I do know we can't image the Apollo landing sites no matter how good our stacking software is. Our sensors aren't good (big) enough. I don't have an understanding of why that is, though.
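One likely reason is the diffraction limit: a telescope of aperture D cannot separate details finer than roughly 1.22 × wavelength / D in angle (the Rayleigh criterion). The back-of-the-envelope calculation below assumes green light at 550 nm, a mean Earth-Moon distance of 384,400 km, and a lander descent stage about 4 m across. Those figures are rounded assumptions, and the function names are made up here.

```python
WAVELENGTH_M = 550e-9       # green light, assumed
MOON_DISTANCE_M = 3.844e8   # mean Earth-Moon distance, approximate

def min_aperture_m(feature_size_m, wavelength_m=WAVELENGTH_M, distance_m=MOON_DISTANCE_M):
    """Aperture needed for the Rayleigh limit to match a feature of this size."""
    return 1.22 * wavelength_m * distance_m / feature_size_m

def smallest_feature_m(aperture_m, wavelength_m=WAVELENGTH_M, distance_m=MOON_DISTANCE_M):
    """Smallest feature on the Moon an ideal telescope of this aperture can separate."""
    return 1.22 * wavelength_m / aperture_m * distance_m

print(f"Aperture to resolve a ~4 m object: {min_aperture_m(4.0):.0f} m")
print(f"Finest detail for a 2.4 m mirror:  {smallest_feature_m(2.4):.0f} m")
print(f"Finest detail for a 10 m mirror:   {smallest_feature_m(10.0):.0f} m")
```

Under these assumptions, just separating a 4 m object from its surroundings needs a mirror tens of meters across, several times larger than today's roughly 10 m class optical telescopes, and that is before atmospheric blurring. Stacking reduces noise, but it does not recover detail that the optics never passed on.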

An analog loop would have hard limitations, just like a telescope. I'm not sure how much noise stacking could clean up. At this point I'm more curious than thinking it's a good solution.
