Constantin Fürst
|
f905ee77eb
|
wait less for task launch and don't write iterations complete out
|
11 months ago |
Constantin Fürst
|
72fb3764fc
|
modify benchmarker script to require fewer parameters and sit in root dir
|
11 months ago |
Constantin Fürst
|
c8b4f3d624
|
fix issues with benchmark.hpp
|
11 months ago |
Constantin Fürst
|
1446d1575d
|
set O3 for release and g3 for debug as build flags in cmake
|
11 months ago |
Constantin Fürst
|
6e26739eb9
|
give unique name to brute cpu copy benchmark descriptors for identification in results
|
11 months ago |
Constantin Fürst
|
a4131a6b33
|
remove previous benchmark results as these will be redone
|
11 months ago |
Constantin Fürst
|
8068fc3437
|
rewrite descriptors to match the format of the rewrite of bench from previous commit
|
11 months ago |
Constantin Fürst
|
fccc255aae
|
rewrite the benchmark to measure timings for the entire run of all threads, doing multiple sync-steps with the launch barrier as done in the qdp bench
|
11 months ago |
Constantin Fürst
|
d20b6dad93
|
remove all unused descriptors and begin rewriting submission descriptors
|
11 months ago |
Constantin Fürst
|
d06f0d0c6c
|
re-evaluate peak perf benchs
|
11 months ago |
Constantin Fürst
|
1af027417b
|
re-run benchmark peakperf from n0
|
11 months ago |
Constantin Fürst
|
cf14cf34ac
|
subdivide the descriptors for peak perf allnodes into fromn0 and fromn1to15
|
11 months ago |
Constantin Fürst
|
9cd6a41205
|
add benchmark evaluations for software and new evaluations in pdf format
|
11 months ago |
Constantin Fürst
|
d73b1cea68
|
add brute sw path results
|
11 months ago |
Constantin Fürst
|
f2059d4d47
|
use 12 threads for each brute benchmark
|
11 months ago |
Constantin Fürst
|
439ec97c8e
|
remove smart sw copy, add brute sw copy bench
|
11 months ago |
Constantin Fürst
|
24f96247fc
|
re-run cpu peak perf for allnodes with software path correctly set
|
11 months ago |
Constantin Fürst
|
e9090fb482
|
redo benchmarks for allnodes cpu due to mistake in modifying the json files
|
11 months ago |
Constantin Fürst
|
317a74d164
|
benchmark software peak performance for smart and brute force from node 0 for thesis
|
11 months ago |
Constantin Fürst
|
ba9dae8bde
|
export plots from the benchmarks as pdf for scalability and integration with thesis
|
11 months ago |
Constantin Fürst
|
8af4f46d0b
|
add unit to y-axis-label for peakthroughput plots
|
11 months ago |
Constantin Fürst
|
718ce39693
|
plot the throughput benchmark for source node 0 and destination nodes {8,11,12,15} only and as a bar plot for use in thesis
|
11 months ago |
Constantin Fürst
|
1ec7d438b2
|
rework the multithread benchmark plot to be more compact for use in the thesis
|
11 months ago |
Constantin Fürst
|
d626c263ff
|
re-configure the submission method benchmark for displaying in the thesis
|
11 months ago |
Constantin Fürst
|
c0c75aa51b
|
use uniform colormap in the plots and use separate output folder for the plots
|
12 months ago |
Constantin Fürst
|
5f72404508
|
re-run changed smart peak throughput benchmarks
|
1 year ago |
Constantin Fürst
|
1fca956a0a
|
update the peak throughput plotter to show the difference of smart and allnodes too
|
1 year ago |
Constantin Fürst
|
791184ff10
|
fix mistake in benchmark descriptors for smart peak performance
|
1 year ago |
Constantin Fürst
|
f809eb5847
|
rerun copy benchmark with smart assignment
|
1 year ago |
Constantin Fürst
|
6b9581b8d1
|
add results for the new 512mib internode hbm copy peak perf test
|
1 year ago |
Constantin Fürst
|
0748826fcd
|
use transfer size of 512mib for HBM intranode copy and modify plotter accordingly
|
1 year ago |
Constantin Fürst
|
60a5ba5120
|
refactor the benchmark plotters and submit newly plotted graphs
|
1 year ago |
Constantin Fürst
|
c0f2aa2b64
|
re-plot the benchmarks with the new data
|
1 year ago |
Constantin Fürst
|
1c7369f20e
|
add results for smart and allnodes peak throughput benchmark
|
1 year ago |
Constantin Fürst
|
1e55565072
|
remove previous benchmark results for peak performance
|
1 year ago |
Constantin Fürst
|
475b2d5b5a
|
provide two benchmarks for peak performance, one that is brute force and one that uses smart node assignment and therefore lower utilization
|
1 year ago |
Constantin Fürst
|
9cef69c33f
|
add mdsa v3 benchmark results for peak performance
|
1 year ago |
Constantin Fürst
|
7ced0bce4c
|
fix an issue in the python script that led to references being modified which caused bad node settings
|
1 year ago |
Constantin Fürst
|
8787b441bc
|
add second type of multi dsa benchmark results
|
1 year ago |
Constantin Fürst
|
6d9002d1e7
|
use different engine configuration depending on whether intra socket (all 4 engines on the socket) or inter socket (src and destination engine, cross copy) is the copy type
|
1 year ago |
Constantin Fürst
|
68a838f0d1
|
add final multi-dsa results
|
1 year ago |
Constantin Fürst
|
c92bb28d9a
|
add intermediate multi-dsa results
|
1 year ago |
Constantin Fürst
|
f11bb710ae
|
add intermediate multi-dsa results
|
1 year ago |
Constantin Fürst
|
7f7230197c
|
don't submit multiple copies - test takes too long and this has almost no effect at work size of 1gib
|
1 year ago |
Constantin Fürst
|
3e102509a9
|
add intermediate multi-dsa results
|
1 year ago |
Constantin Fürst
|
584c5bdfc4
|
add intermediate multi-dsa results
|
1 year ago |
Constantin Fürst
|
e9807df09c
|
use all 8 engines for each copy task, as the engine location does not affect performance
|
1 year ago |
Constantin Fürst
|
5089936f30
|
prepare peak throughput plotter for multi-node results
|
1 year ago |
Constantin Fürst
|
db11eb60e6
|
add 4e results from copy peak perf bench
|
1 year ago |
Constantin Fürst
|
17264186a6
|
small changes to the plotter scripts for nicer display
|
1 year ago |