We benchmarked the matrix-vector multiplication for several matrices. For a
complete discussion of the benchmarks we refer to the thesis; only the results
for the matrix finan512 are shown below. This matrix, of dimension
74752 x 74752 with 596992 nonzeros, was multiplied three thousand times while
the execution time was measured. The sequential multiplication delivered an
effective flop rate of 10.5 MFlop/s. The effective flop rate was calculated
as 2 * n * nnz(A) / t, where n is the number of multiplications, nnz(A) the
number of nonzero entries, and t the measured time.
All tests (including this one) were executed on the Cray T3E located at the
HPaC at the Delft University of Technology (NL).
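The flop-rate formula above can be sketched in a few lines of Python. This is only an illustration, not the code used for the benchmarks: the function name `effective_flop_rate` is hypothetical, and since the measured time is not quoted here, the value `ASSUMED_SECONDS` is back-calculated from the reported 10.5 MFlop/s rather than taken from a measurement.

```python
def effective_flop_rate(n_mults: int, nnz: int, seconds: float) -> float:
    """Effective rate in flop/s, using 2 * n * nnz(A) / t."""
    if seconds <= 0:
        raise ValueError("measured time must be positive")
    # Each stored nonzero costs one multiply and one add per product.
    return 2.0 * n_mults * nnz / seconds


NNZ_FINAN512 = 596992   # nonzeros of finan512, as quoted in the text
N_MULTS = 3000          # number of multiplications, as quoted in the text

# ASSUMPTION: not a measured value. This is the time implied by the
# reported rate of 10.5 MFlop/s, used here only to exercise the formula.
ASSUMED_SECONDS = 2.0 * N_MULTS * NNZ_FINAN512 / 10.5e6

rate = effective_flop_rate(N_MULTS, NNZ_FINAN512, ASSUMED_SECONDS)
print(f"{rate / 1e6:.1f} MFlop/s for t = {ASSUMED_SECONDS:.0f} s")
```

Under these assumptions the three thousand multiplications correspond to a total run of about 341 seconds on a single processor.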
We also ran the BiConjugate Gradient algorithm on this matrix. The table below
shows the resulting runtime (T_P) when executing the algorithm on P processors. We calculated the
parallel efficiency as E_P = T_1 / (P * T_P), i.e. the fraction of the ideal speedup that was achieved.
| P | residual | iterations | T_P (s) | E_P |
| 1 | 4.0618636512318970e-09 | 20 | 9.32494 | 1.00 |
| 2 | 4.0618636512318590e-09 | 20 | 4.919 | 0.95 |
| 4 | 4.0618636512319290e-09 | 20 | 2.39704 | 0.97 |
| 8 | 4.0618636512319830e-09 | 20 | 1.30845 | 0.89 |
| 16 | 4.0618636512319900e-09 | 20 | 0.72289 | 0.81 |
| 32 | 4.0618636512319820e-09 | 20 | 0.649064 | 0.45 |
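The efficiency column can be recomputed from the runtimes with the formula E_P = T_1 / (P * T_P). The sketch below is a minimal illustration using the timings from the table; the names `TIMINGS`, `parallel_efficiency`, `efficiency` and `speedup` are hypothetical and not part of the original code.

```python
# Runtimes T_P in seconds, copied from the table (P -> T_P).
TIMINGS = {
    1: 9.32494,
    2: 4.919,
    4: 2.39704,
    8: 1.30845,
    16: 0.72289,
    32: 0.649064,
}


def parallel_efficiency(t1: float, p: int, tp: float) -> float:
    """E_P = T_1 / (P * T_P): achieved fraction of the ideal speedup."""
    return t1 / (p * tp)


t1 = TIMINGS[1]
speedup = {p: t1 / tp for p, tp in TIMINGS.items()}
efficiency = {p: parallel_efficiency(t1, p, tp) for p, tp in TIMINGS.items()}

for p in sorted(TIMINGS):
    print(f"P={p:2d}  speedup={speedup[p]:6.2f}  E_P={efficiency[p]:.3f}")
```

The loss of efficiency at P = 32 is visible directly in the timings: going from 16 to 32 processors doubles the processor count, yet the runtime only drops from 0.72289 s to 0.649064 s.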