Constantin Fürst
|
8ab5eb4902
|
finish adapting plotters to new result style, add division by thread count to the throughput plotters, adjust figure sizes to be small (larger font when scaled up in latex)
|
11 months ago |
Constantin Fürst
|
f3e89405a5
|
publish benchmark results from vampir for the redone microbench
|
11 months ago |
Constantin Fürst
|
067a31e560
|
modify submission benchmark descriptors to have x10 internal repetitions
|
11 months ago |
Constantin Fürst
|
326cf92af3
|
update benchmark plotters for changes made to benchmark and result format
|
11 months ago |
Constantin Fürst
|
875098b258
|
remove plotters which are not in use anymore
|
11 months ago |
Constantin Fürst
|
ed77e57e5f
|
remove benchmark descriptors for unused engine location benchmark
|
11 months ago |
Constantin Fürst
|
a216d96003
|
remove old benchmark plots
|
11 months ago |
Constantin Fürst
|
8ac601fc07
|
add option for internal repetitions to benchmarks which allows the small copies of 1kib to run long enough for the timings to become usable (goal is about 1s runtime for each iteration)
|
11 months ago |
Constantin Fürst
|
8dc3827676
|
do not check in the debug benchmark result
|
11 months ago |
Constantin Fürst
|
25451fa26a
|
pretty-format the benchmarker script which got mangled from editing on vampir
|
11 months ago |
Constantin Fürst
|
3bfbeca21f
|
resize source and destination pointer holders properly before use and use path from template and not dml::software for cache flush in benchmark loop
|
11 months ago |
Constantin Fürst
|
ea978423c0
|
slightly modify the debug benchmark descriptor to contain multiple threads and also a batch
|
11 months ago |
Constantin Fürst
|
cae0fbb56e
|
remove benchmark descriptor handling helper scripts
|
11 months ago |
Constantin Fürst
|
24bdccd1e3
|
rewrite the benchmarker to not allocate the memory regions each iteration but before the test runs, also flush cache each iteration using dml-operation, also set dsa-device using the parameter to submit and not using libnuma assignment
|
11 months ago |
Constantin Fürst
|
e2c6fd8587
|
set iteration count for submission bench to 10 for the large and 100 for the small, also test 128mib which is over the size of the cache available on xeonmax (instead of 32mib)
|
11 months ago |
Constantin Fürst
|
4f9abc911f
|
make benchmark.hpp a cpp file to make it clear that it will have global variables
|
11 months ago |
Constantin Fürst
|
f905ee77eb
|
wait less for task launch and don't write iterations complete out
|
11 months ago |
Constantin Fürst
|
72fb3764fc
|
modify benchmarker script to require fewer parameters and sit in root dir
|
11 months ago |
Constantin Fürst
|
c8b4f3d624
|
fix issues with benchmark.hpp
|
11 months ago |
Constantin Fürst
|
1446d1575d
|
set O3 for release and g3 for debug as build flags in cmake
|
11 months ago |
Constantin Fürst
|
6e26739eb9
|
give unique name to brute cpu copy benchmark descriptors for identification in results
|
11 months ago |
Constantin Fürst
|
a4131a6b33
|
remove previous benchmark results as these will be redone
|
11 months ago |
Constantin Fürst
|
8068fc3437
|
rewrite descriptors to match the format of the rewrite of bench from previous commit
|
11 months ago |
Constantin Fürst
|
fccc255aae
|
rewrite the benchmark to measure timings for the entire run of all threads, doing multiple sync-steps with the launch barrier as done in the qdp bench
|
11 months ago |
Constantin Fürst
|
d20b6dad93
|
remove all unused descriptors and begin rewriting submission descriptors
|
11 months ago |
Constantin Fürst
|
d06f0d0c6c
|
re-evaluate peak perf benchmarks
|
11 months ago |
Constantin Fürst
|
1af027417b
|
re-run benchmark peakperf from n0
|
11 months ago |
Constantin Fürst
|
cf14cf34ac
|
subdivide the descriptors for peak perf allnodes into fromn0 and fromn1to15
|
11 months ago |
Constantin Fürst
|
9cd6a41205
|
add benchmark evaluations for software and new evaluations in pdf format
|
11 months ago |
Constantin Fürst
|
d73b1cea68
|
add brute sw path results
|
11 months ago |
Constantin Fürst
|
f2059d4d47
|
use 12 threads for each brute benchmark
|
11 months ago |
Constantin Fürst
|
439ec97c8e
|
remove smart sw copy, add brute sw copy bench
|
11 months ago |
Constantin Fürst
|
24f96247fc
|
re-run cpu peak perf for allnodes with software path correctly set
|
11 months ago |
Constantin Fürst
|
e9090fb482
|
redo benchmarks for allnodes cpu due to mistake in modifying the json files
|
11 months ago |
Constantin Fürst
|
317a74d164
|
benchmark software peak performance for smart and brute force from node 0 for thesis
|
11 months ago |
Constantin Fürst
|
ba9dae8bde
|
export plots from the benchmarks as pdf for scalability and integration with thesis
|
11 months ago |
Constantin Fürst
|
8af4f46d0b
|
add unit to y-axis-label for peakthroughput plots
|
11 months ago |
Constantin Fürst
|
718ce39693
|
plot the throughput benchmark for source node 0 and destination nodes {8,11,12,15} only and as a bar plot for use in thesis
|
11 months ago |
Constantin Fürst
|
1ec7d438b2
|
rework the multithread benchmark plot to be more compact for use in the thesis
|
11 months ago |
Constantin Fürst
|
d626c263ff
|
re-configure the submission method benchmark for displaying in the thesis
|
11 months ago |
Constantin Fürst
|
c0c75aa51b
|
use uniform colormap in the plots and use separate output folder for the plots
|
12 months ago |
Constantin Fürst
|
5f72404508
|
re-run changed smart peak throughput benchmarks
|
1 year ago |
Constantin Fürst
|
1fca956a0a
|
update the peak throughput plotter to show the difference of smart and allnodes too
|
1 year ago |
Constantin Fürst
|
791184ff10
|
fix mistake in benchmark descriptors for smart peak performance
|
1 year ago |
Constantin Fürst
|
f809eb5847
|
rerun copy benchmark with smart assignment
|
1 year ago |
Constantin Fürst
|
6b9581b8d1
|
add results for the new 512mib internode hbm copy peak perf test
|
1 year ago |
Constantin Fürst
|
0748826fcd
|
use transfer size of 512mib for HBM intranode copy and modify plotter accordingly
|
1 year ago |
Constantin Fürst
|
60a5ba5120
|
refactor the benchmark plotters and submit newly plotted graphs
|
1 year ago |
Constantin Fürst
|
c0f2aa2b64
|
re-plot the benchmarks with the new data
|
1 year ago |
Constantin Fürst
|
1c7369f20e
|
add results for smart and allnodes peak throughput benchmark
|
1 year ago |