Einstein summation notation is a concise and powerful way to represent tensor operations, often used in physics and machine learning. It allows us to write complex calculations on tensors in a compact form. This article covers the basics of Einstein summation, shows how to use it in Python with NumPy and TensorFlow, and provides examples to illustrate its use.
Basics of Einstein Summation
Einstein summation notation (einsum) rests on the idea of summing over repeated indices in tensor expressions. It follows two rules:
1. Summation over repeated indices: If an index appears twice in a term, it is summed over
2. Free indices: Indices that appear only once are free indices and represent the axes of the output tensor
Let’s illustrate this with the example of multiplying two matrices A and B: the resulting matrix C is defined as
$$ C_{ik} = \sum\limits_{j}^{}A_{ij}B_{jk} $$
In Python, both the NumPy and TensorFlow libraries provide an einsum function.
NumPy
import numpy as np
# Define two matrices A and B
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
# Perform matrix multiplication using einsum
C = np.einsum('ij,jk->ik', A, B)
print(C)
# [[19 22]
# [43 50]]
In the example above, 'ij,jk->ik' is the einsum string:
- 'ij' represents the indices of matrix A
- 'jk' represents the indices of matrix B
- '->ik' specifies the indices of the output matrix C

The operation sums over the index j, since it appears in both inputs but not in the output.
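Since 'ij,jk->ik' encodes exactly the index pattern of matrix multiplication, the einsum result can be checked against NumPy's built-in matrix product (a quick sanity check reusing A, B, and C from above):

# The einsum result agrees with the @ operator (np.matmul)
assert np.allclose(C, A @ B)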
The same code in TensorFlow would look like this:
import tensorflow as tf
# Define two matrices A and B
A = tf.constant([[1, 2], [3, 4]], dtype=tf.float32)
B = tf.constant([[5, 6], [7, 8]], dtype=tf.float32)
# Perform matrix multiplication using einsum
C = tf.einsum('ij,jk->ik', A, B)
print(C)
# tf.Tensor(
# [[19. 22.]
# [43. 50.]], shape=(2, 2), dtype=float32)
More Examples
Inner Product of Vectors
The inner product (dot product) of two vectors a and b is defined as
$$ c = \sum\limits_{i}^{}a_{i}b_{i} $$
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
c = np.einsum('i,i->', a, b)
print(c) # Output: 32
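np.einsum also has an implicit mode: if the '->' and the output indices are omitted, every index that appears twice is summed over automatically. For the inner product this yields the same result:

# Implicit mode: the repeated index i is summed over automatically
c_implicit = np.einsum('i,i', a, b)
print(c_implicit) # Output: 32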
Outer Product of Vectors
The outer product of two vectors a and b is given by:
$$ C_{ij} = a_{i}b_{j} $$
C = np.einsum('i,j->ij', a, b)
print(C)
# Output
# [[ 4  5  6]
#  [ 8 10 12]
#  [12 15 18]]
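For comparison, the same outer product can also be written with np.outer or with broadcasting; einsum just makes the index bookkeeping explicit:

# Both alternatives produce the same matrix as the einsum call
assert np.allclose(C, np.outer(a, b))
assert np.allclose(C, a[:, None] * b[None, :])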
Transpose of a Matrix
The transpose of a matrix A can be obtained by swapping its indices:
# Define A as a NumPy array again (the previous A was a TensorFlow tensor)
A = np.array([[1, 2], [3, 4]])
A_transpose = np.einsum('ij->ji', A)
print(A_transpose)
# Output
# [[1 3]
#  [2 4]]
Trace of a Matrix
The trace of a matrix A is the sum of its diagonal elements:
$$ Tr(A) = \sum\limits_{i}^{}A_{ii} $$
trace = np.einsum('ii->', A)
print(trace)
# Output: 5
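A closely related pattern keeps the repeated index in the output instead of summing over it: 'ii->i' extracts the diagonal of A rather than its trace:

# Keeping the repeated index i in the output yields the diagonal
diagonal = np.einsum('ii->i', A)
print(diagonal) # Output: [1 4]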
Batch Matrix Multiplication
Einsum is particularly useful for batch operations. Suppose we have a batch of matrices A and B, and we want to multiply the corresponding matrices in the batch:
A = np.random.rand(3, 2, 2)
B = np.random.rand(3, 2, 2)
# Perform batch matrix multiplication
C = np.einsum('bij,bjk->bik', A, B)
print(C)
Here, b represents the batch dimension; because it appears in both inputs and in the output, it is not summed over.
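Because np.matmul broadcasts over leading batch dimensions, the result can be verified against it:

# Batch einsum agrees with np.matmul, which also iterates over the batch axis
assert np.allclose(C, np.matmul(A, B))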
Advantages of the Einsum Notation
1. Conciseness: The einsum notation is compact and can represent complex operations succinctly
2. Flexibility: It can handle a wide variety of tensor operations without explicitly reshaping or transposing arrays, as the sketch below illustrates
3. Efficiency: Many libraries optimize einsum operations internally, potentially leading to better performance.
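As a small sketch of points 2 and 3 (with made-up shapes): the following contracts each matrix in a batch with the transpose of its counterpart and sums over the batch, all in a single call, then repeats the computation with np.einsum's optimize flag, which lets NumPy search for a cheaper contraction order:

A = np.random.rand(10, 4, 5)
B = np.random.rand(10, 4, 5)
# 'bkj' indexes B as if transposed, so no explicit np.transpose is needed;
# b and j are repeated and therefore summed over
result = np.einsum('bij,bkj->ik', A, B)
# optimize=True can pay off for larger contractions
result_opt = np.einsum('bij,bkj->ik', A, B, optimize=True)
assert np.allclose(result, result_opt)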