by robotresearcher 462 days ago
The problematic projection into vectors is done using a window size of a constant number of samples? Isn't that very weird?

Unless the samples are nicely periodic, that makes the contents of a vector, and the position of a value in a vector, dependent only on the window length and the order of time values, and not the time values themselves. Since the ordering in the vector, and thus the meaning of that dimension in vector space, is likely to be effectively random, why would we expect clustering in that vector space to mean anything? It's a weird thing to do on the face of it.
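To make the objection concrete, here is a minimal sketch, assuming made-up irregular timestamps (the gap range, `WINDOW`, and the variable names are all hypothetical, not from the article). It shows that a window of a fixed number of samples covers a varying stretch of time, and that a given slot in the vector sits at a different time offset from window to window.

```python
import random

random.seed(0)

# Hypothetical irregularly sampled series: gaps between samples vary from 0.1 to 2.0.
timestamps = []
t = 0.0
for _ in range(200):
    t += random.uniform(0.1, 2.0)
    timestamps.append(t)

WINDOW = 8  # constant number of samples per vector (assumed value)

spans = []          # time covered by each window
slot3_offsets = []  # time offset of vector slot 3 from the window start
for i in range(len(timestamps) - WINDOW + 1):
    w = timestamps[i:i + WINDOW]
    spans.append(w[-1] - w[0])
    slot3_offsets.append(w[3] - w[0])

print("window span: min %.2f, max %.2f" % (min(spans), max(spans)))
print("slot 3 offset: min %.2f, max %.2f" % (min(slot3_offsets), max(slot3_offsets)))
```

With regular sampling both ranges would collapse to a single value, which is the case where a dimension of the vector has a fixed meaning in time.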

3 comments

Imagine you know that there are some repeating patterns in your dataset, e.g. heartbeats in an EKG. If you take two windows that happen to contain a heartbeat at the same position, the corresponding vectors will be close, and if the positions are different, the vectors should become more dissimilar the more the alignment is off.

Then if you create a number of clusters equal to the window size, you might expect that each cluster will correspond to one of the possible positions of the heartbeat within the window. But somehow that's not what happens...
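A minimal sketch of the first half of that intuition, assuming a synthetic, perfectly periodic signal with a triangular "beat" (the pulse shape, `PERIOD`, `WINDOW`, and the helper names are made up for illustration). It only compares window vectors by Euclidean distance; it does not run any clustering.

```python
import math

def make_signal(n_samples, period, pulse):
    """Place a copy of `pulse` every `period` samples on a zero baseline."""
    signal = [0.0] * n_samples
    for start in range(0, n_samples - len(pulse) + 1, period):
        for i, value in enumerate(pulse):
            signal[start + i] = value
    return signal

def windows(signal, length):
    """All sliding windows of `length` consecutive samples (step 1)."""
    return [signal[i:i + length] for i in range(len(signal) - length + 1)]

def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

PERIOD = 32   # samples between beats (assumed)
WINDOW = 32   # window length in samples (assumed)
PULSE = [0.0, 0.25, 0.5, 0.75, 1.0, 0.75, 0.5, 0.25, 0.0]  # toy "heartbeat"

signal = make_signal(PERIOD * 8, PERIOD, PULSE)
vecs = windows(signal, WINDOW)

# Two windows with the beat at the same position inside the window.
same_phase = distance(vecs[0], vecs[PERIOD])

# Distance from the first window as the beat slides within the window.
by_shift = [distance(vecs[0], vecs[k]) for k in range(PERIOD)]

print("same phase:", same_phase)
for k in (1, 2, 4, 16):
    print("shift", k, "->", round(by_shift[k], 3))
```

In this toy setup the distance is 0 for same-phase windows and grows over the first few shifts, then plateaus once the two beats no longer overlap inside the window.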

My understanding is that people would mostly try this approach when the samples are periodic/regular.

> Unless the samples are nicely periodic

This is assumed from context.