Constantin Fürst
|
3e1d9a6445
|
set right ylim of scaling factor to 9 and not 10 so that both sides have the same amount of spacing
|
10 months ago |
Constantin Fürst
|
436ce28804
|
improve axis scaling and ticks for benchmark result plots
|
10 months ago |
Constantin Fürst
|
579494bc41
|
redo the plots for benchmarks of peak throughput and work division
|
11 months ago |
Constantin Fürst
|
db8751afc7
|
run the modified peak throughput benchmarks
|
11 months ago |
Constantin Fürst
|
1b6c60c49b
|
benchmark copy throughput for using 1,2,4,8 dsas and remove brute cpu bench (we steal it from andre)
|
11 months ago |
Constantin Fürst
|
b6f85ca202
|
redo benchmarks for pushpull
|
11 months ago |
Constantin Fürst
|
21bbf53e55
|
use local and one remote node for pushpull on intranode
|
11 months ago |
Constantin Fürst
|
c43bce3e13
|
add results for pushpull benchmark
|
11 months ago |
Constantin Fürst
|
321a4fb0bd
|
manually add data from communication with andre to the cpu benchmark plotters
|
11 months ago |
Constantin Fürst
|
9e329d39e4
|
add pushpull benchmark for peak throughput
|
11 months ago |
Constantin Fürst
|
bf79435ff0
|
add new benchmark plots from the rewritten microbench
|
11 months ago |
Constantin Fürst
|
8ab5eb4902
|
finish adapting plotters to new result style, add division by thread count to the throughput plotters, adjust figure sizes to be small (larger font when scaled up in latex)
|
11 months ago |
Constantin Fürst
|
f3e89405a5
|
publish benchmark results from vampir for the redone microbench
|
11 months ago |
Constantin Fürst
|
067a31e560
|
modify submission benchmark descriptors to have x10 internal repetitions
|
11 months ago |
Constantin Fürst
|
326cf92af3
|
update benchmark plotters for changes made to benchmark and result format
|
11 months ago |
Constantin Fürst
|
875098b258
|
remove plotters which are not in use anymore
|
11 months ago |
Constantin Fürst
|
ed77e57e5f
|
remove benchmark descriptors for unused engine location benchmark
|
11 months ago |
Constantin Fürst
|
a216d96003
|
remove old benchmark plots
|
11 months ago |
Constantin Fürst
|
8ac601fc07
|
add option for internal repetitions to benchmarks which allows the small copies of 1kib to run long enough for the timings to become usable (goal is about 1s runtime for each iteration)
|
11 months ago |
Constantin Fürst
|
8dc3827676
|
do not check in the debug benchmark result
|
11 months ago |
Constantin Fürst
|
25451fa26a
|
pretty-format the benchmarker script which got mangled from editing on vampir
|
11 months ago |
Constantin Fürst
|
3bfbeca21f
|
resize source and destination pointer holders properly before use and use path from template and not dml::software for cache flush in benchmark loop
|
11 months ago |
Constantin Fürst
|
ea978423c0
|
slightly modify the debug benchmark descriptor to contain multiple threads and also a batch
|
11 months ago |
Constantin Fürst
|
cae0fbb56e
|
remove benchmark descriptor handling helper scripts
|
11 months ago |
Constantin Fürst
|
24bdccd1e3
|
rewrite the benchmarker to not allocate the memory regions each iteration but before the test runs, also flush cache each iteration using dml-operation, also set dsa-device using the parameter to submit and not using libnuma assignment
|
11 months ago |
Constantin Fürst
|
e2c6fd8587
|
set iteration count for submission bench to 10 for the large and 100 for the small, also test 128mib which is over the size of the cache available on xeonmax (instead of 32mib)
|
11 months ago |
Constantin Fürst
|
4f9abc911f
|
make benchmark.hpp a cpp file to make it clear that it will have global variables
|
11 months ago |
Constantin Fürst
|
f905ee77eb
|
wait less for task launch and don't write iterations complete out
|
11 months ago |
Constantin Fürst
|
72fb3764fc
|
modify benchmarker script to require fewer parameters and sit in root dir
|
11 months ago |
Constantin Fürst
|
c8b4f3d624
|
fix issues with benchmark.hpp
|
11 months ago |
Constantin Fürst
|
1446d1575d
|
set O3 for release and g3 for debug as build flags in cmake
|
11 months ago |
Constantin Fürst
|
6e26739eb9
|
give unique name to brute cpu copy benchmark descriptors for identification in results
|
11 months ago |
Constantin Fürst
|
a4131a6b33
|
remove previous benchmark results as these will be redone
|
11 months ago |
Constantin Fürst
|
8068fc3437
|
rewrite descriptors to match the format of the rewrite of bench from previous commit
|
11 months ago |
Constantin Fürst
|
fccc255aae
|
rewrite the benchmark to measure timings for the entire run of all threads, doing multiple sync-steps with the launch barrier as done in the qdp bench
|
11 months ago |
Constantin Fürst
|
d20b6dad93
|
remove all unused descriptors and begin rewriting submission descriptors
|
11 months ago |
Constantin Fürst
|
d06f0d0c6c
|
re-evaluate peak perf benchmarks
|
11 months ago |
Constantin Fürst
|
1af027417b
|
re-run benchmark peakperf from n0
|
11 months ago |
Constantin Fürst
|
cf14cf34ac
|
subdivide the descriptors for peak perf allnodes into fromn0 and fromn1to15
|
11 months ago |
Constantin Fürst
|
9cd6a41205
|
add benchmark evaluations for software and new evaluations in pdf format
|
11 months ago |
Constantin Fürst
|
d73b1cea68
|
add brute sw path results
|
11 months ago |
Constantin Fürst
|
f2059d4d47
|
use 12 threads for each brute benchmark
|
11 months ago |
Constantin Fürst
|
439ec97c8e
|
remove smart sw copy, add brute sw copy bench
|
11 months ago |
Constantin Fürst
|
24f96247fc
|
re-run cpu peak perf for allnodes with software path correctly set
|
11 months ago |
Constantin Fürst
|
e9090fb482
|
redo benchmarks for allnodes cpu due to mistake in modifying the json files
|
11 months ago |
Constantin Fürst
|
317a74d164
|
benchmark software peak performance for smart and brute force from node 0 for thesis
|
11 months ago |
Constantin Fürst
|
ba9dae8bde
|
export plots from the benchmarks as pdf for scalability and integration with thesis
|
11 months ago |
Constantin Fürst
|
8af4f46d0b
|
add unit to y-axis-label for peakthroughput plots
|
11 months ago |
Constantin Fürst
|
718ce39693
|
plot the throughput benchmark for source node 0 and destination nodes {8,11,12,15} only and as a bar plot for use in thesis
|
11 months ago |
Constantin Fürst
|
1ec7d438b2
|
rework the multithread benchmark plot to be more compact for use in the thesis
|
11 months ago |