• lad
    link
    fedilink
    English
    24 months ago

    Yeah, in fact, I somehow calculated in assumption of n being the amount of elements in matrix, not n² (assuming square matrix)

    But I am impressed to know that there are serial algorithms that approach O(n²), thank you for sharing that info

    • @[email protected]
      link
      fedilink
      3
      edit-2
      4 months ago

      Yeah, they work by turning the problem into some crazy kind of group theory and attacking it that way. Every once in a while someone shaves the decimal down slightly, just by implementing the deep math in a more efficient way. A new approach will be needed if it is in fact possible to get down to O(n2), though. Strassen’s is a divide and conquer algorithm, and each step of the iteration looks like this:

          S[1] = B[1, 2] - B[2, 2]
          S[2] = A[1, 1] + A[1, 2]
          S[3] = A[2, 1] + A[2, 2]
          S[4] = B[2, 1] - B[1, 1]
          S[5] = A[1, 1] + A[2, 2]
          S[6] = B[1, 1] + B[2, 2]
          S[7] = A[1, 2] - A[2, 2]
          S[8] = B[2, 1] + B[2, 2]
          S[9] = A[1, 1] - A[2, 1]
          S[10] = B[1, 1] + B[1, 2]
          P[1] = STRASSEN(A[1, 1], S[1])
          P[2] = STRASSEN(S[2], B[2, 2])
          P[3] = STRASSEN(S[3], B[1, 1])
          P[4] = STRASSEN(A[2, 2], S[4])
          P[5] = STRASSEN(S[5], S[6])
          P[6] = STRASSEN(S[7], S[8])
          P[7] = STRASSEN(S[9], S[10])
          C[1..n / 2][1..n / 2] = P[5] + P[4] - P[2] + P[6]
          C[1..n / 2][n / 2 + 1..n] = P[1] + P[2]
          C[n / 2 + 1..n][1..n / 2] = P[3] + P[4]
          C[n / 2 + 1..n][n / 2 + 1..n] = P[5] + P[1] - P[3] - P[7]
          return C
      

      In my copy of Introduction to Algorithms, it says something like “this is the most bullshit algorithm in the book and it’s not close” underneath. You can make it a bit neater by representing the multiplication operation as a 3-dimensional tensor, but at the end of the day it’s still just a stupid arithmetic trick. One that’s built into your GPU.