Einstein Summation

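Einstein summation notation is a concise and powerful way to express tensor operations, commonly used in physics and machine learning. It lets us write complex tensor computations in a compact form. We will cover the basics of Einstein summation, show how to use it in Python through NumPy and TensorFlow, and work through examples that illustrate its use.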

Basics of Einstein Summation

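Einstein summation notation (einsum) is built on the idea of summing over repeated indices in a tensor expression. It rests on two rules:

1. Summation over repeated indices: if an index appears twice in a term, it is summed over

2. Free indices: an index that appears only once is a free index and corresponds to an axis of the output tensor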

Let us illustrate this with the product of two matrices A and B. The resulting matrix C is defined as

$C_{ik} = \sum_{j} A_{ij} B_{jk}$

In Python, both the NumPy and TensorFlow libraries provide an einsum function.

NumPy

import numpy as np

# Define two matrices A and B
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# Perform matrix multiplication using einsum
C = np.einsum('ij,jk->ik', A, B)

print(C)
# [[19 22]
#  [43 50]]

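In the example above, ij,jk->ik is the einsum string:

ij labels the indices of matrix A

jk labels the indices of matrix B

->ik specifies the indices of the output matrix C

The index j is summed over, because it is repeated across the inputs and absent from the output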
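To make the two rules concrete, the einsum string ij,jk->ik can be unrolled into explicit loops. The following is a minimal sketch for illustration only; the helper name `matmul_loops` is hypothetical and not part of NumPy, and the loops are far slower than einsum itself.

```python
import numpy as np

def matmul_loops(A, B):
    """Unroll 'ij,jk->ik': i and k are free indices, j is summed over."""
    n_i, n_j = A.shape
    n_j2, n_k = B.shape
    assert n_j == n_j2, "the shared index j must have the same size in A and B"
    C = np.zeros((n_i, n_k), dtype=np.result_type(A, B))
    for i in range(n_i):          # free index: first output axis
        for k in range(n_k):      # free index: second output axis
            for j in range(n_j):  # repeated index: summed over
                C[i, k] += A[i, j] * B[j, k]
    return C

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

C_loops = matmul_loops(A, B)
C_einsum = np.einsum('ij,jk->ik', A, B)
print(np.array_equal(C_loops, C_einsum))  # True
```

Each letter in the einsum string becomes one loop variable: letters that survive to the right of -> index the output, and the remaining letter is accumulated.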

The same code in TensorFlow looks like this:

import tensorflow as tf

# Define two matrices A and B
A = tf.constant([[1, 2], [3, 4]], dtype=tf.float32)
B = tf.constant([[5, 6], [7, 8]], dtype=tf.float32)

# Perform matrix multiplication using einsum
C = tf.einsum('ij,jk->ik', A, B)

print(C)
# tf.Tensor(
# [[19. 22.]
#  [43. 50.]], shape=(2, 2), dtype=float32)

More Examples

Inner product of vectors

The inner product (dot product) of two vectors a and b is defined as

$c = \sum_{i} a_{i} b_{i}$

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

c = np.einsum('i,i->', a, b)

print(c)  # Output: 32

Outer product of vectors

The outer product of two vectors a and b is given by:

$C_{ij} = a_{i} b_{j}$

C = np.einsum('i,j->ij', a, b)

print(C)
# Output
# [[ 4  5  6]
#  [ 8 10 12]
#  [12 15 18]]

Matrix transpose

The transpose of a matrix A is obtained by swapping its indices

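# Float version of the 2x2 matrix used earlier, so the printed output below matches
A = np.array([[1.0, 2.0], [3.0, 4.0]])
A_transpose = np.einsum('ij->ji', A)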

print(A_transpose)
# Output
# [[1. 3.]
#  [2. 4.]]

Trace of a matrix

The trace of a matrix A is the sum of its diagonal elements:

$Tr(A) = \sum_{i} A_{ii}$


trace = np.einsum('ii->', A)

print(trace)
# Output: 5.0
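As a sanity check, each einsum expression in this section can be compared against the dedicated NumPy function for the same operation. This is a small sketch; the dictionary name `checks` is just an illustrative choice, and A is defined here as a float matrix so the block runs on its own.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
A = np.array([[1.0, 2.0], [3.0, 4.0]])

# Compare every einsum string with its conventional NumPy counterpart
checks = {
    "inner": bool(np.array_equal(np.einsum('i,i->', a, b), np.dot(a, b))),
    "outer": bool(np.array_equal(np.einsum('i,j->ij', a, b), np.outer(a, b))),
    "transpose": bool(np.array_equal(np.einsum('ij->ji', A), A.T)),
    "trace": bool(np.isclose(np.einsum('ii->', A), np.trace(A))),
}

for name, ok in checks.items():
    print(name, ok)
```

For single operations like these, the dedicated functions are usually the more readable choice; einsum becomes more attractive once several of them are combined in one expression.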

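Batch matrix multiplication

Einsum is particularly useful for batched operations. Suppose we have a batch of matrices A and a batch of matrices B, and we want to multiply the corresponding matrices in each batch: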


A = np.random.rand(3, 2, 2)
B = np.random.rand(3, 2, 2)

# Perform batch matrix multiplication
C = np.einsum('bij,bjk->bik', A, B)

print(C)

Here, "b" denotes the batch dimension.
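Because the batch example uses random inputs, its output cannot be checked by eye. One way to gain confidence, sketched below, is to compare the einsum result with np.matmul and with a plain loop over the batch axis. The seeded generator `rng` is an addition made here for reproducibility and is not part of the original example.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded so repeated runs give the same data
A = rng.random((3, 2, 2))
B = rng.random((3, 2, 2))

C_einsum = np.einsum('bij,bjk->bik', A, B)
C_matmul = np.matmul(A, B)  # matmul treats leading axes as batch axes
C_loop = np.stack([A[b] @ B[b] for b in range(A.shape[0])])

print(C_einsum.shape)                   # (3, 2, 2)
print(np.allclose(C_einsum, C_matmul))  # True
print(np.allclose(C_einsum, C_loop))    # True
```

The index b appears in both inputs and in the output, so it is not summed; it simply pairs up the matrices at the same batch position.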

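Advantages of Einsum Notation

1. Conciseness: einsum notation is compact and can express complex operations briefly

2. Flexibility: it handles a wide range of tensor operations without explicitly reshaping or transposing arrays

3. Efficiency: many libraries optimize einsum operations internally, which may lead to better performance.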
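The flexibility and efficiency points can be illustrated with a short sketch. The shapes and variable names below are arbitrary choices for the example. The first part contracts over the last axis of both operands without an explicit transpose; the second passes `optimize=True`, which lets np.einsum choose a contraction order for a chain of operands. Whether that is actually faster depends on the shapes involved, so it is worth measuring rather than assuming.

```python
import numpy as np

rng = np.random.default_rng(0)

# Flexibility: both operands share their last axis j, so no transpose is written
A = rng.random((3, 2, 4))
B = rng.random((3, 5, 4))
C = np.einsum('bij,bkj->bik', A, B)
C_ref = np.matmul(A, np.swapaxes(B, 1, 2))  # same result with an explicit transpose
print(C.shape)                # (3, 2, 5)
print(np.allclose(C, C_ref))  # True

# Efficiency: optimize=True lets NumPy pick the order of pairwise contractions
X = rng.random((8, 16))
Y = rng.random((16, 32))
Z = rng.random((32, 4))
chain = np.einsum('ij,jk,kl->il', X, Y, Z, optimize=True)
print(np.allclose(chain, X @ Y @ Z))  # True
```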