| HN Mirror



	by meow_cat 825 days ago
	Maybe I'm missing something obvious, but what is the idea behind quantizing and tokenizing time series? We tokenize text because text isn't numbers. In the case of time series, we're... turning numbers into less precise numbers? The benefit of scaling and centering is trivial, and I guess all time-series ML does it, but I don't see why we need a token after that.

5 comments

matrix2596 825 days ago

I'm building upon insights from this paper (https://arxiv.org/pdf/2403.03950.pdf) and believe that classification can sometimes outperform regression, even when dealing with continuous output values. This is particularly true in scenarios where the output is noisy and may take on several distinct values (multimodal). By treating the problem as classification over discrete bins, we can obtain an approximate distribution over these bins, rather than settling for a single, averaged value as regression would yield. This approach not only facilitates sampling but may also lead to more favorable loss landscapes. The linked paper provides more details on this idea.
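
A minimal sketch of the bins-versus-average point, assuming a synthetic bimodal target and equal-width bins. The data, the bin range, and the `bin_index` helper are made up for illustration and are not taken from the linked paper; the empirical histogram stands in for what a well-fit classifier over bins might output.

```python
import random
from collections import Counter

def bin_index(y, lo, hi, n_bins):
    """Map y to one of n_bins equal-width bins on [lo, hi], clamping outliers."""
    frac = (y - lo) / (hi - lo)
    return min(n_bins - 1, max(0, int(frac * n_bins)))

random.seed(0)
# Hypothetical noisy target with two modes, near -1 and near +1.
ys = [random.gauss(-1.0, 0.1) for _ in range(500)]
ys += [random.gauss(1.0, 0.1) for _ in range(500)]

# The constant that minimizes squared error is the mean, which sits between the modes.
mean_pred = sum(ys) / len(ys)

# Treating the target as a class label per bin keeps both modes visible.
n_bins = 8
counts = Counter(bin_index(y, -2.0, 2.0, n_bins) for y in ys)
probs = [counts[b] / len(ys) for b in range(n_bins)]

mean_bin = bin_index(mean_pred, -2.0, 2.0, n_bins)
print(f"mean prediction: {mean_pred:.3f}, mass in its bin: {probs[mean_bin]:.3f}")
print("bin probabilities:", [round(p, 3) for p in probs])
```

The averaged prediction lands in a region where the data almost never falls, while the per-bin probabilities can be sampled from directly.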


lamename 825 days ago

Isn't it a given that classification would "outperform" regression, assuming n_classes < n_possible_continuous_labels? Turning a regression problem into a classification problem bins the data, offers more examples per label, simplifying the problem, with a tradeoff in what granularity you can predict.

(It depends on what you mean by "outperform" since metrics for classification and regression aren't always comparable, but I think I'm following the meaning of your comment overall)


dist-epoch 825 days ago

Tokenisation turns a continuous signal into a normalized discrete vocabulary: stock "went up a lot", "went up a little", "stayed flat". This smooths out noise and simplifies matching up similar but not identical signals.
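
A rough sketch of that kind of vocabulary, assuming step-to-step changes are scaled by their mean absolute size and then bucketed. The thresholds in `EDGES`, the labels in `VOCAB`, and the `tokenize` helper are all hypothetical choices for illustration, not a scheme from any particular paper.

```python
from bisect import bisect_right

# Hypothetical bucket edges (in units of the series' mean absolute change)
# and one label per bucket.
EDGES = [-2.0, -0.5, 0.5, 2.0]
VOCAB = ["down_a_lot", "down_a_little", "flat", "up_a_little", "up_a_lot"]

def tokenize(series):
    """Turn a numeric series into coarse tokens describing each step."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    if not diffs:
        return []
    # Normalize so the tokens do not depend on the series' overall scale.
    scale = sum(abs(d) for d in diffs) / len(diffs) or 1.0
    return [VOCAB[bisect_right(EDGES, d / scale)] for d in diffs]

a = [100, 101, 101.1, 104, 103]
b = [x * 10 for x in a]                 # same shape, different scale
c = [100, 101.05, 101.1, 103.9, 103]    # same shape, slightly perturbed

print(tokenize(a))
print(tokenize(a) == tokenize(b) == tokenize(c))
```

The three series differ numerically but map to the same token sequence, which is the "matching up similar but not identical signals" effect described above.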

> We tokenize text because text isn't numbers.

Text is actually numbers. People tried inputting UTF-8 directly into transformers, but it doesn't work that well. Karpathy explains why:

https://www.youtube.com/watch?v=zduSFxRajkE


prlin 825 days ago

> Text is actually numbers

Text can be represented by numbers, but they aren't the same datatype. They don't support the same operations (addition, subtraction, multiplication, etc.).


lamename 825 days ago

Interesting. Can you explain how this is superior and/or different from traditional DSP filters or other non-tokenization tricks in the signal processing field?


dist-epoch 825 days ago

Traditional DSP filters still output a continuous signal. And it's a well-explored domain, hard to imagine any low-hanging fruit there.

My intuition is the following: transformers work really well for text, so we could try turning a time series into a "story" (limited vocabulary) and see what happens.


lamename 825 days ago

Like this or something different?

https://github.com/gzerveas/mvts_transformer


spyder 824 days ago

I think it could also have a connection with symbolic AI: the discrete tokens could be the symbols that many believe are useful or necessary for reasoning. It is also useful for compression: quantization and small integer representations reduce memory requirements.

https://en.wikipedia.org/wiki/Neuro-symbolic_AI
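
On the memory point above, a small sketch assuming 8-bit uniform quantization over a known value range. The helper names `quantize_u8` and `dequantize_u8` and the range `[lo, hi]` are hypothetical; real tokenizers may pick bin edges differently.

```python
from array import array

def quantize_u8(values, lo, hi):
    """Map floats in [lo, hi] to integer codes 0..255 (one byte each)."""
    span = hi - lo
    return array("B", (min(255, max(0, round((v - lo) / span * 255))) for v in values))

def dequantize_u8(codes, lo, hi):
    """Recover approximate floats from the byte codes."""
    return [lo + c / 255 * (hi - lo) for c in codes]

values = [0.0, 0.25, 0.5, 1.0]
raw = array("d", values)               # 8 bytes per value
codes = quantize_u8(values, 0.0, 1.0)  # 1 byte per value
restored = dequantize_u8(codes, 0.0, 1.0)

print(raw.itemsize, codes.itemsize)
print(max(abs(v - r) for v, r in zip(values, restored)))
```

The cost is a bounded rounding error of at most half a quantization step per value.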


555watch 824 days ago

My primitive understanding is that we approximate a Markovian approach and indirectly model the transition probabilities just by working through tokens.
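
One way to make that concrete, as a sketch only: estimate first-order transition probabilities by counting token bigrams. The token names and the `transition_probs` helper are hypothetical, and a transformer conditions on a much longer context than the single previous token assumed here.

```python
from collections import Counter, defaultdict

def transition_probs(tokens):
    """Estimate P(next token | previous token) from bigram counts."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return {
        prev: {nxt: c / sum(ctr.values()) for nxt, c in ctr.items()}
        for prev, ctr in counts.items()
    }

tokens = ["up", "up", "flat", "up", "down", "up"]
probs = transition_probs(tokens)
for prev, row in probs.items():
    print(prev, row)
```

With continuous values, almost no two observations repeat exactly, so counts like these only become meaningful once the series has been mapped to a small set of tokens.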


intalentive 824 days ago

My guess is that it enforces a kind of sparsity constraint.
