by jayalammar 1346 days ago
I updated the post to say "multi-dimensional array".

In a context like this, we use "tensor" because it allows for any number of dimensions (while a vector/array has only one and a matrix has two). When you get into ML libraries, both popular packages, PyTorch and TensorFlow, use the "tensor" terminology.

It's a good point. Hope it's clearer for devs with "array" terminology.

2 comments

> we use tensor because it allows for any number of dimensions

"Vector" implies one dimension and "matrix" strongly implies two. But an array can have any number of dimensions, so "array" is the best word.

We don't need the word "tensor"; when the context is programming, "tensor" is only confusing and doesn't really add any useful meaning.

Tensor does imply a set of operations that are expected. Multiplying two arrays together is an ambiguous operation; multiplying two tensors together is well-defined.

And really, the context is math, not programming. The programming side of DL is approximately trivial; the interesting bits are all represented as mathematical operations on a set of tensors.

It's especially confusing as math, because the standard meaning of "tensor" in math includes a lot of baggage that's not used in deep learning (related to physics and differential geometry).
Exactly.
"Tensor does imply a set of operations that are expected. Multiplying two arrays together is an ambiguous operation; multiplying two tensors together is well-defined."

In programming too, multiplying two arrays of arbitrary types is often undefined, and many languages allow their programmers to specify what they want to happen when arrays of specific types are multiplied (or may have built-in defined behavior for certain specific types of arrays).
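As a rough illustration of that point, here is a minimal Python sketch. Multiplying two plain lists raises a `TypeError`, while a small wrapper class can define what `*` means for it. The class name `ElementwiseArray` is hypothetical and made up for this example, and element-wise is just one possible choice of behavior.

```python
class ElementwiseArray:
    """Hypothetical wrapper that defines `*` as element-wise multiplication."""

    def __init__(self, values):
        self.values = list(values)

    def __mul__(self, other):
        # The programmer decides what "multiply" means for this type.
        if len(self.values) != len(other.values):
            raise ValueError("length mismatch")
        return ElementwiseArray(a * b for a, b in zip(self.values, other.values))


left = [1, 2, 3]
right = [4, 5, 6]

# Plain Python lists have no list-by-list multiplication defined.
try:
    left * right
    plain_list_result = "no error"
except TypeError:
    plain_list_result = "TypeError"

product = ElementwiseArray(left) * ElementwiseArray(right)

print(plain_list_result)  # TypeError
print(product.values)     # [4, 10, 18]
```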

I'd love to learn if there are any actual differences between tensors and arrays commonly used in programming, but so far it doesn't sound like it from reading through this HN thread.

I think you misunderstood what I meant. If I say "multiply this 3x3 matrix by this 3x1 vector", then I know to expect the particular form of multiplication that multiplies each column of the matrix by its associated value in the vector, then adds them up to produce a 3x1 vector. Likewise, multiplying a 3x3x3 tensor by a 3x1x1 tensor will produce a 3x3x1 result.
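The matrix-times-vector rule described above can be sketched in a few lines of plain Python. The function name `matvec_by_columns` and the sample values are made up for illustration; this covers only the 3x3 by 3x1 case, not the higher-dimensional one.

```python
def matvec_by_columns(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector, column by column."""
    n_rows = len(matrix)
    result = [0] * n_rows
    for j, weight in enumerate(vector):
        # Scale column j by its associated vector entry and add it to the total.
        for i in range(n_rows):
            result[i] += matrix[i][j] * weight
    return result


A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
x = [1, 0, 2]

y = matvec_by_columns(A, x)
print(y)  # [7, 16, 25], i.e. 1 * column 0 + 0 * column 1 + 2 * column 2
```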

If I say "multiply this 3x3 array by this 3x3 array", then it's an ambiguous operation. There is not a canonical multiplication for arrays. One ecosystem might treat it as a matrix-matrix multiplication, another might do an element-wise multiplication, others might just throw an error. None is more correct than another, because array just implies the structure of the data.
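To make the ambiguity concrete, here is a small pure-Python sketch that applies two different "multiplications" to the same pair of 3x3 arrays. The function names and sample values are assumptions for illustration only. For comparison, NumPy spells the first as `*` and the second as `@` on its arrays.

```python
def elementwise_product(a, b):
    """Multiply matching entries: result[i][j] = a[i][j] * b[i][j]."""
    return [[x * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def matrix_product(a, b):
    """Row-by-column product: result[i][j] = sum over k of a[i][k] * b[k][j]."""
    cols_b = list(zip(*b))  # transpose b so its columns are easy to iterate
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
B = [[1, 0, 0],
     [0, 2, 0],
     [0, 0, 3]]

print(elementwise_product(A, B))  # [[1, 0, 0], [0, 10, 0], [0, 0, 27]]
print(matrix_product(A, B))       # [[1, 4, 9], [4, 10, 18], [7, 16, 27]]
```

Both results are valid 3x3 arrays, which is why the word "multiply" alone does not pin down the operation.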

Interestingly, the Wikipedia page "Tensor (disambiguation)" (https://en.wikipedia.org/wiki/Tensor_(disambiguation)) has an entry "Tensor type (computing)", which is just a redirection to "Array (data type)" (https://en.wikipedia.org/wiki/Array_(data_type)).

Doesn’t seem to support the relevance of “tensor” as something distinct from “array”.

"Array" implies one dimension to me. Maybe n-array?
Array may imply 1d to you, or 2d to others, but it is more general than that: https://en.wikipedia.org/wiki/Array_(data_type)
I think "tensor" is much shorter than "multi-dimensional array", and it is OK to use it. You could just mention that, in the context of DL, "tensor" is just a shorter way to say "multi-dimensional array", and everyone should be happy. On top of that, if they read another text about DL that uses the word "tensor", there is no need to be scared anymore.