


	by keithalewis 504 days ago
	This is incomplete, incorrect, and irrelevant. Standard notation already exists. I'm sure it is fun to draw squiggly lines and some people enjoy reinventing the wheel. Spend some time learning what others have taught us before striking out on your own lonely path.

2 comments

llm_trw 504 days ago

This is standard notation that's been used for decades.

https://arxiv.org/abs/2402.01790v1


chriskanan 504 days ago

This paper motivates and explains concepts much better than the Tensor Cookbook.


thomasahle 502 days ago

I'm hoping the Tensor Cookbook can become as engaging to read for others as Jordan Taylor's paper was to me. If you have any thoughts on where I lose people, please share!


llm_trw 504 days ago

The cookbook is a work in progress by the looks of it.


keithalewis 504 days ago

"This book aims to standardize the notation for tensor diagrams..." https://youtu.be/zELbzXAmcUA?t=73


thomasahle 502 days ago

Tensor diagrams are standard, but some notation is missing. My goal was to be able to handle the entire Matrix Cookbook.

For this I needed a good notation for functions applied to specific dimensions and broadcasting over the rest. Like softmax in a transformer.

The function chapter is still under development in the book though. So if you have any good references for how it's been done graphically in the past, that I might have missed, feel free to share them.
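To make "a function applied to a specific dimension and broadcast over the rest" concrete, here is a minimal sketch in plain Python. It is not taken from the Tensor Cookbook; the helper name `apply_along_last_axis` and the example scores are assumptions chosen for illustration. Softmax acts on the innermost (key) axis only, and every other axis is simply carried along.

```python
import math

def softmax(xs):
    """Numerically stable softmax of a flat list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def apply_along_last_axis(fn, tensor):
    """Apply fn to each innermost list of a nested-list tensor.

    The outer axes are left untouched, which is the "broadcast over
    the rest" part. (Hypothetical helper, for illustration only.)
    """
    if tensor and isinstance(tensor[0], list):
        return [apply_along_last_axis(fn, sub) for sub in tensor]
    return fn(tensor)

# Made-up attention scores with shape (batch=2, query=2, key=3).
scores = [
    [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]],
    [[5.0, 5.0, 1.0], [-1.0, 0.0, 1.0]],
]

# Softmax over the key axis, broadcast over batch and query.
weights = apply_along_last_axis(softmax, scores)
```

The result has the same shape as `scores`, and each innermost row sums to 1. In a diagram, this is the situation where one leg of the tensor enters the function and the remaining legs pass around it.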


absolutelastone 499 days ago

You can do broadcasting with a tensor, at least for products and sums. The product is multilinear, and a sum can be in two steps, first step using a tensor to implement fanout. Though I can see the value in representing structure that can be used more efficiently versus just another box for a tensor. Beyond that (softmax?) seems kind of awkward since you're outside the domain of your "domain specific language". I don't know why it's needed to extend the matrix cookbook to tensor diagrams.
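One possible reading of the "broadcasting with a tensor" idea, sketched in plain Python. The copy tensor `delta` and the all-ones vector `ones` are assumptions about what "a tensor to implement fanout" means here, not something stated in the comment.

```python
from itertools import product

A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]      # shape (2, 3), indices (i, j)
b = [10.0, 20.0, 30.0]     # shape (3,), index j
I, J = len(A), len(b)

# Copy ("fan-out") tensor: 1 when all three indices agree, else 0.
delta = [[[1.0 if k == l == j else 0.0 for j in range(J)]
          for l in range(J)] for k in range(J)]

# Broadcast product A[i][j] * b[j] written as a plain contraction:
# prod[i][j] = sum over k, l of A[i][k] * delta[k][l][j] * b[l]
prod = [[sum(A[i][k] * delta[k][l][j] * b[l]
             for k, l in product(range(J), repeat=2))
         for j in range(J)] for i in range(I)]

# Broadcast sum in two steps.
# Step 1: fan b out across the row index via an outer product with ones.
ones = [1.0] * I
b_fanned = [[ones[i] * b[j] for j in range(J)] for i in range(I)]
# Step 2: ordinary element-wise sum of two tensors of equal shape.
total = [[A[i][j] + b_fanned[i][j] for j in range(J)] for i in range(I)]
```

Both results match what array libraries produce when broadcasting a length-3 vector against a (2, 3) matrix. A function like softmax has no such rewrite, because it is not multilinear in its input.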


llm_trw 502 days ago

I come back to this every few months and do some work trying to make sense of how tensors are used in machine learning. Tensors, as used in physics and whose notation these tools inherit, are there for coordinate transforms and nothing else.

Tensors, as used in ML, are much closer to a key-value store with composite keys and scalar values, with most of the complexity coming from deciding how to filter on those composite keys.
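A rough sketch of that key-value view in plain Python, assuming a sparse dict representation with tuple keys. The key labels and the helper names `contract` and `select` are made up for illustration.

```python
from collections import defaultdict

# Hypothetical tensors stored as {composite key: scalar value}.
A = {("row0", "x"): 1.0, ("row0", "y"): 2.0, ("row1", "x"): 3.0}    # keys (i, k)
B = {("x", "out0"): 10.0, ("y", "out0"): 20.0, ("y", "out1"): 5.0}  # keys (k, j)

def contract(a, b):
    """Join on the shared key component k and sum it out.

    Computes C[i, j] = sum over k of a[i, k] * b[k, j].
    """
    out = defaultdict(float)
    for (i, k), a_val in a.items():
        for (k2, j), b_val in b.items():
            if k == k2:  # the "filter on composite keys" step
                out[(i, j)] += a_val * b_val
    return dict(out)

def select(t, axis, value):
    """Slice: keep the entries whose key has `value` at position `axis`."""
    return {key: v for key, v in t.items() if key[axis] == value}

C = contract(A, B)
row0 = select(C, 0, "row0")
```

Seen this way, a contraction is a join followed by a group-by sum, and slicing is a filter on one component of the key. The nested loop is quadratic and only meant to show the structure.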

Drop me a line if you're interested in a chat. This is something I've been thinking about for years now.


thomasahle 502 days ago

Highly recommend this note by Jordan Taylor.


HighlandSpring 504 days ago

Do point us at this standard notation


ok123456 504 days ago

https://en.wikipedia.org/wiki/Einstein_notation


keithalewis 504 days ago

The author also seems to be unaware of Fréchet derivatives.


gsf_emergency_2 503 days ago

I don't exactly know what you mean but from your hint I found the uh, clarifying bedtime story:

https://arxiv.org/abs/2302.09687

(On functions of 3rd-order "tensors")

((Whereas matrix-functions are of 2nd-order "tensors"))

Playground: https://gitlab.com/katlund/t-frechet

(MATLAB)


keithalewis 503 days ago

The Wikipedia page on this is sufficient. If F:X -> Y is a function between normed linear spaces then DF:X -> L(X,Y), where L(X,Y) is the vector space of linear operators from X to Y, satisfies F(x + h) = F(x) + DF(x)h + o(||h||) as h -> 0. A function is differentiable if it can be locally approximated by a linear operator.

Some confusion arises from the difference between f:R -> R and f':R -> R. The Fréchet derivative of f is Df:R -> L(R,R), where Df(x)h = f'(x)h. Row vectors and column vectors are just a clumsy way of thinking about this.
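A small numerical check of the definition, as a sketch in plain Python. The choice of F(X) = X·X on 2x2 matrices and the test matrices are assumptions made for illustration. For this F the derivative is the linear operator DF(X)H = XH + HX, and the remainder F(X + H) - F(X) - DF(X)H equals H·H, which shrinks faster than the norm of H.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def frob(a):
    """Frobenius norm, used here as the norm on 2x2 matrices."""
    return math.sqrt(sum(x * x for row in a for x in row))

def F(x):
    """F(X) = X @ X."""
    return matmul(x, x)

def DF(x):
    """Frechet derivative of F at x, returned as a linear operator on h."""
    return lambda h: add(matmul(x, h), matmul(h, x))

X = [[1.0, 2.0], [3.0, 4.0]]
H = [[0.5, -1.0], [2.0, 0.25]]

# ||F(X + h) - F(X) - DF(X)h|| / ||h|| for shrinking h = t * H.
ratios = []
for t in (1e-1, 1e-2, 1e-3):
    h = scale(t, H)
    remainder = sub(sub(F(add(X, h)), F(X)), DF(X)(h))
    ratios.append(frob(remainder) / frob(h))
```

The ratios shrink in proportion to `t`, which is the o(h) condition in action. Note that `DF(X)` is a function taking `h`, not a matrix of partial derivatives; the scalar case Df(x)h = f'(x)h is the same picture with 1x1 matrices.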

BTW, all you need in order to publish on arXiv.org is to know a FoF. There is no rigorous peer review. https://arxiv.org/abs/1912.01091, https://arxiv.org/abs/2009.10852.


thomasahle 502 days ago

What content about Fréchet derivatives do you think would be useful to include?
