| I was pretty confused by this for a while. I think the context I was missing is that this is about a function in numpy called ‘einsum’, which is somewhat related to Einstein summation notation.

To write a little more: there are two things people mean by ‘tensor’. One is a kind of geometric object that corresponds to a multilinear map between vector spaces (or modules, I suppose), and the other is an array indexed by k-tuples (i.e. what would be called a multidimensional array in C or Fortran programming). For a given choice of basis, one can represent a degree k tensor with a k-dimensional array of scalars.

Only certain operations make sense geometrically on tensors (in the sense that they do not depend on the choice of basis), and these can be broken down into:
- tensor products, which take degree n and m tensors and output a degree (n + m) tensor
- contractions, which take a degree n tensor and output a degree (n - 2) tensor
- generalized transpositions, which take a degree n tensor and output a degree n tensor
A matrix multiplication can be seen as a composition of a tensor product and a contraction; a matrix trace is just a contraction.

The Einstein summation convention is a notation which succinctly expresses these geometric operations by describing what one does with the ‘grid of numbers’ representation, combined with the convention that, if an index is repeated in a term twice (an odd number bigger than 1 is meaningless, an even number is equivalent to reapplying the ‘twice’ rule many times), one should implicitly sum the expression over each basis vector for that index. You get: tensor products by juxtaposition, contractions by repeated indexes, and transpositions by reordering indexes.

In numpy, einsum is for general computation rather than expressing something geometric, so one doesn’t need the restrictions on the number of times an index occurs.
Instead I guess the rule is something like:
- if an index is only on the lhs, sum over it
- if an index is on both the lhs and the rhs, don’t sum over it
- if an index is only on the rhs, or repeated on the rhs, it’s an error
And computationally I guess it’s something like (1) figure out the output shape and (2):
for output_index of output:
    total = 0
    for summed_index of (values of the lhs-only indexes):
        p = 1
        for (input, input_indexes) of (inputs, lhs):
            p = p * input[input_indexes(output_index, summed_index)]
        total = total + p
    output[output_index] = total
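
To make the guess above concrete, here is a minimal Python sketch of that loop. The function name `naive_einsum` is made up for illustration, it only handles the explicit ‘lhs->rhs’ form with single-letter indexes, and it is certainly not how numpy actually implements einsum (the real thing is far faster and supports more syntax). It just follows the three rules as stated.

```python
import itertools
import numpy as np


def naive_einsum(spec, *operands):
    """Loop-based sketch of explicit-mode einsum, e.g. 'ij,jk->ik'.

    Hypothetical helper for illustration only; not numpy's algorithm.
    """
    lhs, rhs = spec.split("->")
    input_labels = lhs.split(",")

    # Record the size of every index label, checking that operands agree.
    size = {}
    for labels, operand in zip(input_labels, operands):
        for label, n in zip(labels, operand.shape):
            if size.setdefault(label, n) != n:
                raise ValueError(f"inconsistent size for index {label!r}")

    # An index that is only on the rhs, or repeated on the rhs, is an error.
    if len(set(rhs)) != len(rhs) or not set(rhs) <= set(size):
        raise ValueError(f"bad output indexes {rhs!r}")

    # Indexes that appear only on the lhs are summed over.
    summed = sorted(set(size) - set(rhs))

    # (1) figure out the output shape, (2) fill in each output entry.
    output = np.zeros([size[label] for label in rhs])
    for out_idx in itertools.product(*(range(size[l]) for l in rhs)):
        total = 0
        for sum_idx in itertools.product(*(range(size[l]) for l in summed)):
            # Map every index label to its current value.
            value = dict(zip(rhs, out_idx))
            value.update(zip(summed, sum_idx))
            p = 1
            for labels, operand in zip(input_labels, operands):
                p = p * operand[tuple(value[l] for l in labels)]
            total += p
        output[out_idx] = total
    return output


a = np.arange(6.0).reshape(2, 3)
b = np.arange(12.0).reshape(3, 4)
m = np.arange(9.0).reshape(3, 3)

# Tensor product + contraction = matrix multiplication.
assert np.allclose(naive_einsum("ij,jk->ik", a, b), a @ b)
# A trace is just a contraction.
assert np.allclose(naive_einsum("ii->", m), np.trace(m))
# Tensor product by juxtaposition.
assert np.allclose(naive_einsum("i,j->ij", a[0], b[0]), np.outer(a[0], b[0]))
# Transposition by reordering indexes.
assert np.allclose(naive_einsum("ij->ji", a), a.T)
```

The four checks at the end correspond to the geometric operations mentioned earlier, and each one agrees with the usual numpy function for that operation.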
|