Learning Python: Matrix Operations - 易微帮

Abstract: This post walks through matrix operations in Python (PyTorch and NumPy) using examples.

1. Matrix multiplication (mm and matmul)

Code:

    import torch

    input_mat1 = torch.tensor([[1, 2, 3, 4],
            [1, 2, 2, 3]])

    input_mat2 = torch.tensor([[1, 2, 3, 3],
            [2, 1, 2, 3],
            [3, 1, 2, 2],
            [2, 3, 2, 3]])
    print("input_mat1: ", input_mat1)
    print("input_mat2: ", input_mat2)

    output_mat1 = torch.mm(input_mat1, input_mat2)
    print("torch.mm() test, output_mat1 = ", output_mat1)

    output_mat2 = torch.matmul(input_mat1, input_mat2)
    print("torch.matmul() test, output_mat2 = ", output_mat2)

Output:

input_mat1:  tensor([[1, 2, 3, 4],
        [1, 2, 2, 3]])
input_mat2:  tensor([[1, 2, 3, 3],
        [2, 1, 2, 3],
        [3, 1, 2, 2],
        [2, 3, 2, 3]])
torch.mm() test, output_mat1 =  tensor([[22, 19, 21, 27],
        [17, 15, 17, 22]])
torch.matmul() test, output_mat2 =  tensor([[22, 19, 21, 27],
        [17, 15, 17, 22]])

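Analysis:

torch.tensor defines tensors (matrices included);
In this example torch.mm() and torch.matmul() do the same thing: both multiply an $m \times n$ matrix by an $n \times k$ matrix and produce an $m \times k$ matrix;
For inputs with 3 or more dimensions the two differ: torch.mm() accepts only 2-D matrices, while torch.matmul() treats the leading dimensions as batch dimensions (with broadcasting) and multiplies the last two dimensions as matrices.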
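The difference for 3-D inputs can be seen in a small sketch. The shapes and the names batch and single below are illustrative assumptions, not taken from the example above.

```python
import torch

# A batch of 5 matrices of shape (2, 4), and a single (4, 3) matrix.
batch = torch.ones(5, 2, 4)
single = torch.ones(4, 3)

# torch.matmul broadcasts the 2-D matrix across the batch dimension.
batched_product = torch.matmul(batch, single)
print(batched_product.shape)  # torch.Size([5, 2, 3])

# torch.mm accepts only 2-D inputs, so the same call is rejected.
try:
    torch.mm(batch, single)
    mm_failed = False
except RuntimeError:
    mm_failed = True
print(mm_failed)  # True
```

Each of the 5 slices of batched_product is the ordinary $2 \times 4$ by $4 \times 3$ matrix product.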

2. Element-wise multiplication (the * operator)

2.1 One-dimensional arrays

Code:

    print("---torch.tensor star product test---")
    input_array1 = torch.tensor([1, 2, 3, 4])
    input_array2 = torch.tensor([4, 3, 2, 1])
    star_product = input_array1 * input_array2
    print("star_product: ", star_product)

Output:

---torch.tensor star product test---
star_product:  tensor([4, 6, 6, 4])

Analysis:

The size of the vector is unchanged.

2.2 Two-dimensional matrices

Code:

    import numpy as np

    print("---element_wise_product  test---")
    input_matrix = np.array([[1, 2], [3, 4]])
    element_wise_product = input_matrix * input_matrix
    print("element_wise_product : ", element_wise_product)

Output:

---element_wise_product  test---
element_wise_product :  [[ 1  4]
 [ 9 16]]

Analysis:

The size of the matrix is unchanged.
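To make the contrast with Section 1 concrete, here is a minimal sketch that applies both the element-wise operator * and the matrix-product operator @ to the same $2 \times 2$ matrix. The variable names are illustrative.

```python
import numpy as np

matrix = np.array([[1, 2], [3, 4]])

# `*` multiplies matching entries: result[i, j] = matrix[i, j] * matrix[i, j].
element_wise = matrix * matrix

# `@` is the matrix product: result[i, j] = sum over t of matrix[i, t] * matrix[t, j].
matrix_product = matrix @ matrix

print(element_wise.tolist())    # [[1, 4], [9, 16]]
print(matrix_product.tolist())  # [[7, 10], [15, 22]]
```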

3. Dot product (dot)

3.1 One-dimensional arrays

Code:

    print("---torch.tensor dot_product test---")
    input_array1 = torch.tensor([1, 2, 3, 4])
    input_array2 = torch.tensor([4, 3, 2, 1])
    dot_product = torch.dot(input_array1, input_array2)
    print("array dot_product: ", dot_product)

    print("---np.array dot_product test---")
    input_array1 = np.array([1, 2, 3, 4])
    input_array2 = np.array([4, 3, 2, 1])
    dot_product = np.dot(input_array1, input_array2)
    print("array dot_product: ", dot_product)

Output:

---torch.tensor dot_product test---
array dot_product:  tensor(20)
---np.array dot_product test---
array dot_product:  20

Analysis:

This is the inner product;
Both torch.tensor and np.array support dot;
torch returns a 0-dimensional tensor (printed as tensor(20)), while np returns a scalar.
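A short sketch of what the torch result looks like in practice: the value is a tensor with no dimensions, and .item() extracts a plain Python number. The names a and b are illustrative.

```python
import torch

a = torch.tensor([1, 2, 3, 4])
b = torch.tensor([4, 3, 2, 1])

dot_product = torch.dot(a, b)

# A 0-dimensional tensor has an empty shape.
print(dot_product.dim())    # 0
print(dot_product.shape)    # torch.Size([])

# .item() converts it to a plain Python number.
print(dot_product.item())   # 20

# The dot product equals the sum of the element-wise product from Section 2.
print((a * b).sum().item())  # 20
```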

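3.2 Matrices

For 2-D arrays, np.dot is the matrix product, the same as matmul. torch.dot accepts only 1-D tensors, so for matrices use torch.mm or torch.matmul.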
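As a quick check of the 2-D case, the sketch below reuses the two matrices from Section 1 as NumPy arrays and compares np.dot with np.matmul. The variable names are illustrative.

```python
import numpy as np

mat1 = np.array([[1, 2, 3, 4],
                 [1, 2, 2, 3]])
mat2 = np.array([[1, 2, 3, 3],
                 [2, 1, 2, 3],
                 [3, 1, 2, 2],
                 [2, 3, 2, 3]])

dot_result = np.dot(mat1, mat2)
matmul_result = np.matmul(mat1, mat2)

print(dot_result.tolist())  # [[22, 19, 21, 27], [17, 15, 17, 22]]
print(np.array_equal(dot_result, matmul_result))  # True
```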

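4. Concatenation (cat)

cat does not change the kind of data: concatenating vectors gives a vector, matrices a matrix, and tensors a tensor (the number of dimensions stays the same).

4.1 Vectors

Code: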

    print("---array test---")
    input_mat1 = torch.tensor([1, 2, 3, 4])
    print("input: ", input_mat1)

    horizontal_stack = torch.cat((input_mat1, input_mat1), 0)
    #vertical_stack = torch.cat((input_mat1, input_mat1), 1)

    print("horizontal_cat = ", horizontal_stack)
    #print("vertical_cat = ", vertical_stack)

Output:

---array test---
input:  tensor([1, 2, 3, 4])
horizontal_cat =  tensor([1, 2, 3, 4, 1, 2, 3, 4])

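Analysis:

The second argument of cat selects the dimension to concatenate along: 0 is the first dimension and 1 is the second. This post calls dim 0 "horizontal" and dim 1 "vertical"; note that for a matrix dim 0 adds rows and dim 1 adds columns, so these labels are the reverse of NumPy's hstack/vstack naming;
A 1-D vector has only dim 0, so it can be concatenated along dim 0 but not along dim 1 (the commented-out lines raise an error). cat never adds a dimension, so it cannot turn vectors into a 2-D matrix.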
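If a 2-D result is wanted from a vector, one option is to give the vector a second dimension first. The sketch below assumes unsqueeze(0) is an acceptable way to do that; the names row, along_dim0 and along_dim1 are illustrative.

```python
import torch

vector = torch.tensor([1, 2, 3, 4])

# Turn the 1-D vector of shape (4,) into a 1 x 4 row matrix.
row = vector.unsqueeze(0)

along_dim0 = torch.cat((row, row), 0)  # one row below the other
along_dim1 = torch.cat((row, row), 1)  # rows joined end to end

print(along_dim0.shape)  # torch.Size([2, 4])
print(along_dim1.shape)  # torch.Size([1, 8])
```

Section 5 shows stack, which adds the new dimension itself.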

4.2 Matrices

Code:

    print("---matrix test---")
    input_mat1 = torch.tensor([[1, 2, 3, 4],
            [1, 2, 2, 3]])
    print("input: ", input_mat1)

    horizontal_cat = torch.cat((input_mat1, input_mat1), 0)
    vertical_cat = torch.cat((input_mat1, input_mat1), 1)

    print("horizontal_cat = ", horizontal_cat)
    print("vertical_cat = ", vertical_cat)
    print("shape: ", np.shape(input_mat1), np.shape(horizontal_cat), np.shape(vertical_cat))

Output:

---matrix test---
input:  tensor([[1, 2, 3, 4],
        [1, 2, 2, 3]])
horizontal_cat =  tensor([[1, 2, 3, 4],
        [1, 2, 2, 3],
        [1, 2, 3, 4],
        [1, 2, 2, 3]])
vertical_cat =  tensor([[1, 2, 3, 4, 1, 2, 3, 4],
        [1, 2, 2, 3, 1, 2, 2, 3]])
shape:  torch.Size([2, 4]) torch.Size([4, 4]) torch.Size([2, 8])

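Analysis:

Concatenating two $m \times n$ matrices along dim 0 ("horizontal" in this post) gives a $2m \times n$ matrix; concatenating two $m \times n$ matrices along dim 1 ("vertical") gives an $m \times 2n$ matrix.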
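The two inputs do not have to be the same size: cat only needs the sizes to agree in every dimension other than the one being concatenated. A minimal sketch with assumed shapes $2 \times 4$ and $3 \times 4$ (the names top and bottom are illustrative):

```python
import torch

top = torch.zeros(2, 4)
bottom = torch.ones(3, 4)

# Row counts differ (2 and 3) but both have 4 columns, so dim 0 works.
joined = torch.cat((top, bottom), 0)
print(joined.shape)  # torch.Size([5, 4])
```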

4.3 Tensors

Code:

    print("---tensor test---")
    input_tensor1 = torch.tensor([[[1, 2, 3, 4], [1, 2, 2, 3]],
                                  [[5, 6, 7, 8], [8, 7, 6, 5]]])
    print("input: ", input_tensor1)

    horizontal_cat = torch.cat((input_tensor1, input_tensor1), 0)
    vertical_cat = torch.cat((input_tensor1, input_tensor1), 1)

    print("horizontal_cat = ", horizontal_cat)
    print("vertical_cat = ", vertical_cat)
    print("shape: ", np.shape(input_tensor1), np.shape(horizontal_cat), np.shape(vertical_cat))

Output:

---tensor test---
input:  tensor([[[1, 2, 3, 4],
         [1, 2, 2, 3]],

        [[5, 6, 7, 8],
         [8, 7, 6, 5]]])
horizontal_cat =  tensor([[[1, 2, 3, 4],
         [1, 2, 2, 3]],

        [[5, 6, 7, 8],
         [8, 7, 6, 5]],

        [[1, 2, 3, 4],
         [1, 2, 2, 3]],

        [[5, 6, 7, 8],
         [8, 7, 6, 5]]])
vertical_cat =  tensor([[[1, 2, 3, 4],
         [1, 2, 2, 3],
         [1, 2, 3, 4],
         [1, 2, 2, 3]],

        [[5, 6, 7, 8],
         [8, 7, 6, 5],
         [5, 6, 7, 8],
         [8, 7, 6, 5]]])
shape:  torch.Size([2, 2, 4]) torch.Size([4, 2, 4]) torch.Size([2, 4, 4])

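Analysis:

Concatenating two $m \times n \times k$ tensors along dim 0 ("horizontal" in this post) gives a $2m \times n \times k$ tensor; concatenating two $m \times n \times k$ tensors along dim 1 ("vertical") gives an $m \times 2n \times k$ tensor.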
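A 3-D tensor also has a third dimension, dim 2, which the example above does not use. The sketch below applies the same rule to it; the name depth_cat is illustrative.

```python
import torch

input_tensor = torch.tensor([[[1, 2, 3, 4], [1, 2, 2, 3]],
                             [[5, 6, 7, 8], [8, 7, 6, 5]]])

# Concatenating along dim 2 doubles only the last dimension.
depth_cat = torch.cat((input_tensor, input_tensor), 2)

print(depth_cat.shape)           # torch.Size([2, 2, 8])
print(depth_cat[0, 0].tolist())  # [1, 2, 3, 4, 1, 2, 3, 4]
```

In general, concatenating two $m \times n \times k$ tensors along dim 2 gives an $m \times n \times 2k$ tensor.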

5. Stacking (stack)

5.1 Stacking vectors into a matrix

Code:

    print("---torch.tensor stack test---")
    input_array1 = torch.tensor([1, 2, 3, 4])
    input_array2 = torch.tensor([4, 3, 2, 1])
    array_stack_horizontal = np.stack([input_array1, input_array2], axis=0)
    print("horizontal stack: ", array_stack_horizontal)
    array_stack_vertical = np.stack([input_array1, input_array2], axis=1)
    print("vertical stack: ", array_stack_vertical)

Output:

---torch.tensor stack test---
horizontal stack:  [[1 2 3 4]
 [4 3 2 1]]
vertical stack:  [[1 4]
 [2 3]
 [3 2]
 [4 1]]

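Analysis: stacking $k$ vectors of dimension $n$ gives a $k \times n$ matrix.

With axis=0 each input vector becomes a row, so the result is read in $(y, x)$ coordinates;
With axis=1 each input vector becomes a column, so the result is read in $(x, y)$ coordinates, giving an $n \times k$ matrix.
Because np.stack is applied to torch tensors here, the result is a NumPy array, as the printed output shows; torch.stack (with the dim argument) keeps the result a tensor.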
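The relation between the two axes, and the contrast with concatenation from Section 4, can be checked with a short sketch. The names rows, columns and joined are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([4, 3, 2, 1])

rows = np.stack([a, b], axis=0)     # shape (2, 4): each vector is a row
columns = np.stack([a, b], axis=1)  # shape (4, 2): each vector is a column

# The axis=1 result is the transpose of the axis=0 result.
print(np.array_equal(columns, rows.T))  # True

# Concatenation, by contrast, keeps the data 1-D.
joined = np.concatenate([a, b])
print(joined.shape)  # (8,)
```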

5.2 Stacking matrices into a tensor

Code:

    print("---numpy stack test---")
    tensor1 = np.arange(1, 13).reshape((3, 4))
    tensor2 = np.arange(13, 25).reshape((3, 4))
    print("tensor1: ", tensor1)
    print("tensor2: ", tensor2)

    tensor_stack0 = np.stack([tensor1, tensor2], axis=0)
    print("\r\naxis 0 stack: ", tensor_stack0)
    print("shape: ", np.shape(tensor_stack0))

    tensor_stack1 = np.stack([tensor1, tensor2], axis=1)
    print("axis 1 stack: ", tensor_stack1)
    print("shape: ", np.shape(tensor_stack1))

    tensor_stack2 = np.stack([tensor1, tensor2], axis=2)
    print("axis 2 stack: ", tensor_stack2)
    print("shape: ", np.shape(tensor_stack2))

Output:

---numpy stack test---
tensor1:  [[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
tensor2:  [[13 14 15 16]
 [17 18 19 20]
 [21 22 23 24]]

axis 0 stack:  [[[ 1  2  3  4]
  [ 5  6  7  8]
  [ 9 10 11 12]]

 [[13 14 15 16]
  [17 18 19 20]
  [21 22 23 24]]]
shape:  (2, 3, 4)
axis 1 stack:  [[[ 1  2  3  4]
  [13 14 15 16]]

 [[ 5  6  7  8]
  [17 18 19 20]]

 [[ 9 10 11 12]
  [21 22 23 24]]]
shape:  (3, 2, 4)
axis 2 stack:  [[[ 1 13]
  [ 2 14]
  [ 3 15]
  [ 4 16]]

 [[ 5 17]
  [ 6 18]
  [ 7 19]
  [ 8 20]]

 [[ 9 21]
  [10 22]
  [11 23]
  [12 24]]]
shape:  (3, 4, 2)

Analysis: stacking $k$ matrices of size $m \times n$

Axis 0: a $k \times m \times n$ tensor;
Axis 1: an $m \times k \times n$ tensor;
Axis 2: an $m \times n \times k$ tensor.
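In the example above $k = 2$, $m = 3$, $n = 4$. To check the rule with three distinct sizes, the sketch below uses the assumed values $m = 2$, $n = 3$, $k = 4$ and zero-filled matrices.

```python
import numpy as np

m, n, k = 2, 3, 4
matrices = [np.zeros((m, n)) for _ in range(k)]

# The new dimension of size k is inserted at the position given by axis.
shapes = [np.stack(matrices, axis=axis).shape for axis in range(3)]

for axis, shape in enumerate(shapes):
    print(axis, shape)
# 0 (4, 2, 3)
# 1 (2, 4, 3)
# 2 (2, 3, 4)
```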

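Going further:
An $m \times n$ matrix can be described with $(x, y)$ coordinates. After stacking into a $k \times m \times n$ tensor (a cuboid), an element can be addressed in three orders, $(z, x, y)$, $(x, z, y)$, or $(x, y, z)$, depending on the stacking axis; within each dimension the index simply runs from small to large.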
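The three index orders can be verified on the matrices from the example above. Here z is the index of the matrix in the input list and (x, y) is the position inside that matrix; the particular values chosen for z, x, y are arbitrary.

```python
import numpy as np

tensor1 = np.arange(1, 13).reshape((3, 4))
tensor2 = np.arange(13, 25).reshape((3, 4))
matrices = [tensor1, tensor2]

stack0 = np.stack(matrices, axis=0)
stack1 = np.stack(matrices, axis=1)
stack2 = np.stack(matrices, axis=2)

z, x, y = 1, 2, 3  # matrix index, row, column

# The same element, addressed in each of the three layouts.
print(matrices[z][x, y])                                  # 24
print(stack0[z, x, y], stack1[x, z, y], stack2[x, y, z])  # 24 24 24
```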
