by agentofoblivion 1705 days ago
One of the critically useful properties of wavelets, as I understand it, is that they are localized in both the original domain and the frequency domain.

A Fourier transform (FT) of an audio file, for example, will tell you how much of each frequency is present in that file. Each component is localized in frequency, but not in time. A wavelet, however, is localized in both frequency and time (due to its decaying “hat” shape). When you scan the wavelet across the audio file and record the outputs, you therefore learn not only which frequencies are present, but also where they occur in the file.
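To make the "scan" concrete, here is a minimal sketch in plain Python, under some assumptions of mine: the wavelet is a Ricker ("Mexican hat") shape, the test signal is silence with one bump centered at sample 300, and the names `ricker` and `scan` are made up for illustration. At each offset the wavelet is multiplied with the signal sample by sample and the products are summed, so the output is largest where the signal looks most like the wavelet.

```python
import math

def ricker(num_points, width):
    """Sample a Ricker ("Mexican hat") wavelet centered in the window."""
    center = (num_points - 1) / 2.0
    samples = []
    for i in range(num_points):
        t = (i - center) / width
        samples.append((1.0 - t * t) * math.exp(-0.5 * t * t))
    return samples

def scan(signal, wavelet):
    """Slide the wavelet over the signal; at each offset, multiply and sum."""
    n = len(wavelet)
    return [
        sum(signal[start + k] * wavelet[k] for k in range(n))
        for start in range(len(signal) - n + 1)
    ]

# Hypothetical test signal: silence with one short bump centered at sample 300.
signal = [0.0] * 512
for k, value in enumerate(ricker(41, 5.0)):
    signal[280 + k] += value

response = scan(signal, ricker(41, 5.0))

# The strongest response marks where in the signal the bump sits.
peak_offset = max(range(len(response)), key=lambda i: abs(response[i]))
peak_center = peak_offset + 20  # window start plus half the window length
print(peak_center)
```

An FT of the same signal would show which frequencies the bump contains, but not that it sits around sample 300; the position of the peak in `response` is what carries the time information.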

1 comment

This is the stuff I want to know more about, because I understand my analogies/mental model, but I have only a crude/naive 'code mental model' of how these kinds of things are implemented.

When you say:

    When you scan this wavelet across the audio file
Is that literal in code?

Apologies for the naive code analogies -- your comment makes me think of something like an XLA 'reduceWindow' operation (edit: or an OpenCL kernel), where a wavelet-sized 'window' is slid over the input, the difference is computed for each data point -- and then (waves hands: probably something smarter than just a) 'cumulative sum' to see how close/different it is.

(a) is that on the right track/what you meant by 'scanning the wavelet across'?

(b) what about transforming the wavelet -- to different scales, etc?
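For (b), a rough sketch of what "different scales" could look like in code, with the same caveats: the Ricker shape, the list of widths, and the helper names `ricker` and `peak_response` are my assumptions, not anything from the parent comment. The same wavelet is stretched to several widths, each version is normalized to unit energy so the responses are comparable, and each one is scanned across the input. The width whose peak response is largest is the scale that best matches the feature.

```python
import math

def ricker(num_points, width):
    """Unit-energy Ricker wavelet; `width` stretches or squeezes the shape."""
    center = (num_points - 1) / 2.0
    samples = []
    for i in range(num_points):
        t = (i - center) / width
        samples.append((1.0 - t * t) * math.exp(-0.5 * t * t))
    norm = math.sqrt(sum(s * s for s in samples))
    return [s / norm for s in samples]

def peak_response(signal, wavelet):
    """Largest absolute multiply-and-sum over all offsets of the wavelet."""
    n = len(wavelet)
    return max(
        abs(sum(signal[start + k] * wavelet[k] for k in range(n)))
        for start in range(len(signal) - n + 1)
    )

# Hypothetical input: one bump of width 8 somewhere in an otherwise silent signal.
signal = [0.0] * 400
for k, value in enumerate(ricker(129, 8.0)):
    signal[100 + k] += value

# Scan the same wavelet shape at several scales and compare the peaks.
widths = [2.0, 4.0, 8.0, 16.0]
responses = {w: peak_response(signal, ricker(129, w)) for w in widths}
best_width = max(responses, key=responses.get)
print(best_width, responses)
```

This brute-force loop over offsets and widths is only for illustration. Real libraries typically avoid it with FFT-based convolution or, for the discrete wavelet transform, with a fixed filter bank that halves the data at each level.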

This seems like the kind of thing that's actually like a search problem, heh, where you'd need to slide it over the input many many times, and where it could conceivably benefit from (waves hands) machine learning techniques as a result, to come up with something that does the search for a good wavelet encoding more efficiently based on the input.

... which, I guess, means I should just look around GitHub for "wavelet compression" and see what code pops up, since that search for good encodings will be done in any of those.

--- Edit: this one seems simple enough to learn from, it's minimal, MIT licensed, audio-focused, written in Julia, and references the course material that it's based on:

https://github.com/nicholaskl97/wavelets

It, uh ... like lots of scientific code, it has a lot of single-letter variables. :D But if one was to google up a few acronyms ('DWT'), install Julia, get it to run, and then re-implement it in another language, my feeling is that you might understand wavelets pretty well, no? Or at least be well-equipped to understand more interesting uses of wavelets?
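For anyone else googling 'DWT' (discrete wavelet transform): the simplest version I know of is the Haar wavelet, and a from-scratch sketch fits in a few lines of Python. This is not taken from the linked repo; the function names are mine, and it assumes the input length is a power of two. Each level replaces pairs of samples with a scaled sum (the coarse "approximation") and a scaled difference (the "detail"), then repeats on the approximation.

```python
import math

SQRT_HALF = 1.0 / math.sqrt(2.0)

def haar_step(x):
    """One DWT level: pairwise scaled sums (approximation) and differences (detail)."""
    approx = [(x[i] + x[i + 1]) * SQRT_HALF for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) * SQRT_HALF for i in range(0, len(x), 2)]
    return approx, detail

def haar_inverse_step(approx, detail):
    """Undo one level by recovering each original pair of samples."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * SQRT_HALF)
        out.append((a - d) * SQRT_HALF)
    return out

def haar_dwt(x):
    """Full decomposition; assumes len(x) is a power of two."""
    approx, details = list(x), []
    while len(approx) > 1:
        approx, detail = haar_step(approx)
        details.append(detail)  # finest scale first, coarsest last
    return approx, details

def haar_idwt(approx, details):
    """Rebuild the signal from the coarsest level back to the finest."""
    for detail in reversed(details):
        approx = haar_inverse_step(approx, detail)
    return approx

samples = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, details = haar_dwt(samples)
restored = haar_idwt(approx, details)
print(details[0])  # finest-scale details; the flat pair (5, 5) gives 0
```

The detail coefficients are small wherever neighboring samples are similar, which is the property compression schemes exploit: small details can be rounded or dropped and the signal still reconstructs approximately.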

(I'm posting this because I might use some 'shop time' doing approximately that, so I would definitely appreciate any HN comments telling me 'consider learning from $some_other_example instead, for $some_reason.')