From DGL Wiki

Revision as of 10 July 2018, 18:05

Matrix multiplication

For the theory of matrix multiplication, see the Matrix article.

Matrix multiplication with an operator

Matrix multiplication becomes even simpler when it can be written with the "*" operator.
This example shows how, using the 4x4 matrix that is customary in OpenGL.

// GLfloat comes from the OpenGL header unit in use (for example dglOpenGL).
type
  TVector4f = array[0..3] of GLfloat;
  Tmat4x4   = array[0..3] of TVector4f;


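// Array-index product: Res[i, j] is the sum over k of m2[i, k] * m1[k, j].
// With OpenGL's column-major layout (the first index selects the column),
// this is the mathematical product m1 * m2.
operator * (const m1, m2: Tmat4x4) Res: Tmat4x4;
var
  i, j, k: integer;
begin
  for i := 0 to 3 do begin
    for j := 0 to 3 do begin
      Res[i, j] := 0;
      for k := 0 to 3 do begin
        Res[i, j] := Res[i, j] + m2[i, k] * m1[k, j];
      end;
    end;
  end;
end;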

Example

var
  m, m0, m1: Tmat4x4;

begin
  m := m0 * m1; // Looks very simple.
end.
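
To make the index order in the operator explicit, its triple loop can be written as a single formula. This is only a sketch: it assumes the usual OpenGL convention, in which the first array index selects the column and the second the row, and the symbols simply mirror the parameter names m1 and m2.

```latex
\mathrm{Res}[i][j] \;=\; \sum_{k=0}^{3} m_2[i][k]\; m_1[k][j],
\qquad i, j \in \{0, 1, 2, 3\}
```

Read as plain two-dimensional arrays this is the product m2 · m1. Under the column-major assumption, Res[i][j] is the entry in row j and column i, so the same sum is the mathematical matrix product m1 · m2. This is why `m0 * m1` applies the transformations in the order OpenGL users expect.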

SSE-accelerated matrix multiplication

The following SSE-accelerated code can compute matrix multiplications up to 20x faster.
All common Intel and AMD CPUs support it: every CPU from the Intel Core generation onward, and some older ones. Because the code uses Pshufd, which is an SSE2 instruction, the CPU must support SSE2; every x86-64 CPU does.

type
  TVector4f = array[0..3] of GLfloat;
  Tmat4x4   = array[0..3] of TVector4f;


{$asmmode intel}
function MatrixMultiplySSE(const M0, M1: Tmat4x4): Tmat4x4; assembler; nostackframe; register;
asm
         Movups Xmm4, [M0 + $00]
         Movups Xmm5, [M0 + $10]
         Movups Xmm6, [M0 + $20]
         Movups Xmm7, [M0 + $30]

         // Column 0
         Movups Xmm2, [M1 + $00]

         Pshufd Xmm0, Xmm2, 00000000b
         Mulps  Xmm0, Xmm4

         Pshufd Xmm1, Xmm2, 01010101b
         Mulps  Xmm1, Xmm5
         Addps  Xmm0, Xmm1

         Pshufd Xmm1, Xmm2, 10101010b
         Mulps  Xmm1, Xmm6
         Addps  Xmm0, Xmm1

         Pshufd Xmm1, Xmm2, 11111111b
         Mulps  Xmm1, Xmm7
         Addps  Xmm0, Xmm1

         Movups [Result + $00], Xmm0

         // Column 1
         Movups Xmm2, [M1 + $10]

         Pshufd Xmm0, Xmm2, 00000000b
         Mulps  Xmm0, Xmm4

         Pshufd Xmm1, Xmm2, 01010101b
         Mulps  Xmm1, Xmm5
         Addps  Xmm0, Xmm1

         Pshufd Xmm1, Xmm2, 10101010b
         Mulps  Xmm1, Xmm6
         Addps  Xmm0, Xmm1

         Pshufd Xmm1, Xmm2, 11111111b
         Mulps  Xmm1, Xmm7
         Addps  Xmm0, Xmm1

         Movups [Result + $10], Xmm0

         // Column 2
         Movups Xmm2, [M1 + $20]

         Pshufd Xmm0, Xmm2, 00000000b
         Mulps  Xmm0, Xmm4

         Pshufd Xmm1, Xmm2, 01010101b
         Mulps  Xmm1, Xmm5
         Addps  Xmm0, Xmm1

         Pshufd Xmm1, Xmm2, 10101010b
         Mulps  Xmm1, Xmm6
         Addps  Xmm0, Xmm1

         Pshufd Xmm1, Xmm2, 11111111b
         Mulps  Xmm1, Xmm7
         Addps  Xmm0, Xmm1

         Movups [Result + $20], Xmm0

         // Column 3
         Movups Xmm2, [M1 + $30]

         Pshufd Xmm0, Xmm2, 00000000b
         Mulps  Xmm0, Xmm4

         Pshufd Xmm1, Xmm2, 01010101b
         Mulps  Xmm1, Xmm5
         Addps  Xmm0, Xmm1

         Pshufd Xmm1, Xmm2, 10101010b
         Mulps  Xmm1, Xmm6
         Addps  Xmm0, Xmm1

         Pshufd Xmm1, Xmm2, 11111111b
         Mulps  Xmm1, Xmm7
         Addps  Xmm0, Xmm1

         Movups [Result + $30], Xmm0
end;

Example

var
  m, m0, m1: Tmat4x4;

begin
  m := MatrixMultiplySSE(m0, m1);
end.
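
The SSE routine above appears to follow a column-wise scheme, which can be summarized in one formula. The notation is an assumption made for illustration: M_0[k] stands for the four floats stored at M0 + k·$10, and M_1[c][k] for a single float of M1.

```latex
\mathrm{Result}[c] \;=\; \sum_{k=0}^{3} M_1[c][k] \cdot M_0[k],
\qquad c = 0, 1, 2, 3
```

For each c, Pshufd copies the scalar M_1[c][k] into all four lanes of a register, Mulps scales the four floats of M_0[k] with it, and Addps accumulates the four partial results before Movups writes them to Result. Element by element this is the same sum that the "*" operator computes with m1 = M0 and m2 = M1.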

Matrix / vector multiplication with an operator

This also works with vectors.

type
  TVector4f = array[0..3] of GLfloat;
  Tmat4x4   = array[0..3] of TVector4f;

// Res[i] is row i of the column-major matrix m applied to v.
operator * (const m: Tmat4x4; const v: TVector4f) Res: TVector4f;
var
  i: integer;
begin
  for i := 0 to 3 do begin
    Res[i] := m[0, i] * v[0] + m[1, i] * v[1] + m[2, i] * v[2] + m[3, i] * v[3];
  end;
end;

Example

var
  v, r: TVector4f;
  m0  : Tmat4x4;

begin
  r := m0 * v; // Looks very simple; the result is a vector, not a matrix.
end.
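
As a quick plausibility check of the vector operator, the following sketch applies a translation matrix to a point. It is a hypothetical test program, not part of the original article: the program name, the constants Translate and P, and the local definition of GLfloat as Single (so that no OpenGL unit is needed) are all assumptions made for illustration.

```pascal
program MatVecDemo;
{$mode objfpc}

type
  GLfloat   = Single; // assumption: stands in for the type from the OpenGL header
  TVector4f = array[0..3] of GLfloat;
  Tmat4x4   = array[0..3] of TVector4f;

// Same operator as in the article: m is column-major, so m[c] is column c.
operator * (const m: Tmat4x4; const v: TVector4f) Res: TVector4f;
var
  i: Integer;
begin
  for i := 0 to 3 do
    Res[i] := m[0, i] * v[0] + m[1, i] * v[1] + m[2, i] * v[2] + m[3, i] * v[3];
end;

const
  // Translation by (2, 3, 4); the offset sits in m[3], the fourth column.
  Translate: Tmat4x4 = (
    (1, 0, 0, 0),
    (0, 1, 0, 0),
    (0, 0, 1, 0),
    (2, 3, 4, 1));
  // The point (1, 1, 1) in homogeneous coordinates.
  P: TVector4f = (1, 1, 1, 1);

var
  r: TVector4f;
begin
  r := Translate * P;
  // Prints: 3.0 4.0 5.0 1.0
  WriteLn(r[0]:0:1, ' ', r[1]:0:1, ' ', r[2]:0:1, ' ', r[3]:0:1);
end.
```

The point is moved by the offset stored in the fourth array row, which confirms that the operator treats m[0] to m[3] as the columns of the matrix.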

SSE-accelerated matrix / vector multiplication

SSE acceleration also works with vectors.

type
  TVector4f = array[0..3] of GLfloat;
  Tmat4x4   = array[0..3] of TVector4f;

{$asmmode intel}
function VectorMultiplySSE(const M: Tmat4x4; const V: TVector4f): TVector4f; assembler; nostackframe; register;
asm
         Movups  Xmm4, [M + $00]
         Movups  Xmm5, [M + $10]
         Movups  Xmm6, [M + $20]
         Movups  Xmm7, [M + $30]

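         // Array row 0: broadcast V[0] and multiply by M[0]
         Movss   Xmm0, [V + $00]
         Shufps  Xmm0, Xmm0, 00000000b
         Mulps   Xmm0, Xmm4

         // Array row 1: broadcast V[1], multiply by M[1], accumulate
         Movss   Xmm2, [V + $04]
         Shufps  Xmm2, Xmm2, 00000000b
         Mulps   Xmm2, Xmm5
         Addps   Xmm0, Xmm2

         // Array row 2: broadcast V[2], multiply by M[2], accumulate
         Movss   Xmm2, [V + $08]
         Shufps  Xmm2, Xmm2, 00000000b
         Mulps   Xmm2, Xmm6
         Addps   Xmm0, Xmm2

         // Array row 3: broadcast V[3], multiply by M[3], accumulate
         Movss   Xmm2, [V + $0C]
         Shufps  Xmm2, Xmm2, 00000000b
         Mulps   Xmm2, Xmm7
         Addps   Xmm0, Xmm2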

         Movups  [Result], Xmm0
end;

Example

var
  v, r: TVector4f;
  m0  : Tmat4x4;

begin
  r := VectorMultiplySSE(m0, v); // The result is a vector, not a matrix.
end.

Matrix multiplication with dynamic matrices (not OpenGL)

This approach is less suitable for OpenGL. For OpenGL it is better to use static 4x4 or 3x3 arrays, see above.

Emre posted this small code snippet for matrix multiplication in the forum:

Type
  TSMatrix = Array of Array of Single;

// sMatrix := sMatrix * Matrix
procedure pSMatrixMatrixProduct( var sMatrix: TSMatrix; const Matrix: TSMatrix );
var
  m, n, o: Integer;
  Res    : TSMatrix;
begin

  // Two matrices can only be multiplied if the column count of the first
  // matrix equals the row count of the second matrix:
  if High(sMatrix[0]) <> High(Matrix) then Exit;

  // If a k*l matrix is multiplied by an l*n matrix,
  // the result matrix has dimension k*n:
  SetLength(Res, Length(sMatrix), Length(Matrix[0]));
  // SetLength zero-initializes the new elements, so Res can be accumulated into directly.
  for m := 0 to High(Res) do
    for n := 0 to High(Res[m]) do
      for o := 0 to High(Matrix) do
        // Written out in place of the helper incS, which is not defined in this snippet.
        Res[m, n] := Res[m, n] + sMatrix[m, o] * Matrix[o, n];
  sMatrix := Res;
end;
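
Example

A short usage sketch for the dynamic version, multiplying a 2x3 matrix by a 3x2 matrix. The program name, the variables A and B, and the test values are assumptions chosen for illustration. The routine is repeated so that the program is self-contained, with the accumulation written out explicitly instead of calling incS.

```pascal
program DynMatDemo;
{$mode objfpc}

type
  TSMatrix = array of array of Single;

// sMatrix := sMatrix * Matrix (same logic as the routine above).
procedure pSMatrixMatrixProduct(var sMatrix: TSMatrix; const Matrix: TSMatrix);
var
  m, n, o: Integer;
  Res    : TSMatrix;
begin
  // Column count of the first matrix must equal the row count of the second.
  if High(sMatrix[0]) <> High(Matrix) then Exit;
  SetLength(Res, Length(sMatrix), Length(Matrix[0])); // elements start at zero
  for m := 0 to High(Res) do
    for n := 0 to High(Res[m]) do
      for o := 0 to High(Matrix) do
        Res[m, n] := Res[m, n] + sMatrix[m, o] * Matrix[o, n];
  sMatrix := Res;
end;

var
  A, B: TSMatrix;
  i, j: Integer;
begin
  SetLength(A, 2, 3);
  SetLength(B, 3, 2);
  // A = (1 2 3; 4 5 6)
  for i := 0 to 1 do
    for j := 0 to 2 do
      A[i, j] := i * 3 + j + 1;
  // B = (7 8; 9 10; 11 12)
  for i := 0 to 2 do
    for j := 0 to 1 do
      B[i, j] := i * 2 + j + 7;

  pSMatrixMatrixProduct(A, B); // A is now the 2x2 product

  // Prints the rows "58 64" and "139 154".
  for i := 0 to High(A) do
  begin
    for j := 0 to High(A[i]) do
      Write(A[i, j]:0:0, ' ');
    WriteLn;
  end;
end.
```

Because the procedure overwrites its first argument, A changes from a 2x3 matrix to the 2x2 result. If the dimensions do not match, the procedure exits silently and leaves A unchanged.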