
formulate the benchmark descriptions more concisely, in benchmark-plan.md

master
Constantin Fürst 1 year ago
commit 151cadf3c7
benchmarks/benchmark-plan.md (21 lines changed)

@@ -1,12 +1,9 @@
-# peak performance
-- meassure ddr to ddr, intra-node
-- meassure ddr to hbm, intra-node
-- meassure ddr to ddr, inter-node
-- meassure ddr to hbm, inter-node
-- meassure ddr to ddr, inter-socket
-- meassure ddr to hbm, inter-socket
+# peak-perf
+- meassure ddr to ddr
+- meassure ddr to hbm
 All for 1KiB, 4KiB, 1MiB, 1GiB
 All for HW and also SW path
+All for intra-node, inter-node and inter-socket
 --> conclude how much overhead DSA engine has
 --> conclude size after which using HW makes sense
 this point is reached when submit overhead for
@@ -17,17 +14,19 @@ All for HW and also SW path
 - multi submit
 - batch submit
 All with both 1 and 4 engines per WQ
-All for 1KiB, 4KiB, 1MiB, 1GiB but only ddr-ddr intra node
+All for 1KiB, 4KiB, 1MiB, 1GiB
+All only on DDR and intra-node
 --> conclude which work submission strategy is best for which size
 --> conclude whether multiple engines significantly improve batch perf
-# MT submit
+# mtsubmit // done
 - multiple threads submit to the same WQ
 - use 1,2,4,8,12 threads
-All for 1KiB, 4KiB, 1MiB, 1GiB but only ddr-ddr intra node
+All using DDR and 1MiB
 All for 1 vs 4 engines
+All on DDR and intra-node
 --> conclude how bad mt submit hurts performance
 --> conclude whether multiple engines help mt submit
-# cross copy // done
+# cross-copy // done
 - compare which is faster: xcopy, copy from source node, copy from dst node
 All for both inter-node and inter-socket copy using DDR and 1MiB on 4E
 --> conclude where a copy thread should live
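
The reworded plan replaces the enumerated cases with "All for ..." lines, so each section now reads as a cross product of parameters. As a rough sketch of how the peak-perf and mtsubmit sections expand into individual runs, the Python below enumerates them. The parameter values are taken from the plan; every name (`SIZES`, `peak_perf_runs`, `mtsubmit_runs`, the dictionary keys) is hypothetical and not taken from the repository.

```python
from itertools import product

# Values copied from the plan; the names themselves are illustrative only.
SIZES = {"1KiB": 1 << 10, "4KiB": 4 << 10, "1MiB": 1 << 20, "1GiB": 1 << 30}
MEMORY_PAIRS = ("ddr-ddr", "ddr-hbm")
PATHS = ("hw", "sw")
LOCALITIES = ("intra-node", "inter-node", "inter-socket")
THREADS = (1, 2, 4, 8, 12)
ENGINES = (1, 4)


def peak_perf_runs():
    """Every memory pair at every size, path and locality (peak-perf section)."""
    return [
        {"memory": mem, "size": size, "bytes": SIZES[size],
         "path": path, "locality": loc}
        for mem, size, path, loc in product(MEMORY_PAIRS, SIZES, PATHS, LOCALITIES)
    ]


def mtsubmit_runs():
    """Thread counts against engine counts, fixed at DDR, 1MiB, intra-node."""
    return [
        {"threads": threads, "engines": engines, "memory": "ddr",
         "size": "1MiB", "bytes": SIZES["1MiB"], "locality": "intra-node"}
        for threads, engines in product(THREADS, ENGINES)
    ]


if __name__ == "__main__":
    print(len(peak_perf_runs()), "peak-perf runs")
    print(len(mtsubmit_runs()), "mtsubmit runs")
```

Under this reading, peak-perf covers the same memory, locality, size and path combinations as the old enumerated list, while mtsubmit shrinks from four sizes to 1MiB only.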
