144 Commits (8ac601fc0792a67cbe43ceb56208f0b40723f5af)

Author SHA1 Message Date
Constantin Fürst 8ac601fc07 add option for internal repetitions to benchmarks which allows the small copies of 1kib to run long enough for the timings to become usable (goal is about 1s runtime for each iteration) 11 months ago
Constantin Fürst 8dc3827676 do not check in the debug benchmark result 11 months ago
Constantin Fürst 25451fa26a pretty-format the benchmarker script which got mangled from editing on vampir 11 months ago
Constantin Fürst 3bfbeca21f resize source and destination pointer holders properly before use and use path from template and not dml::software for cache flush in benchmark loop 11 months ago
Constantin Fürst ea978423c0 slightly modify the debug benchmark descriptor to contain multiple threads and also a batch 11 months ago
Constantin Fürst cae0fbb56e remove benchmark descriptor handling helper scripts 11 months ago
Constantin Fürst 24bdccd1e3 rewrite the benchmarker to not allocate the memory regions each iteration but before the test runs, also flush cache each iteration using dml-operation, also set dsa-device using the parameter to submit and not using libnuma assignment 11 months ago
Constantin Fürst e2c6fd8587 set iteration count for submission bench to 10 for the large and 100 for the small, also test 128mib which is over the size of the cache available on xeonmax (instead of 32mib) 11 months ago
Constantin Fürst 4f9abc911f make benchmark.hpp a cpp file to make it clear that it will have global variables 11 months ago
Constantin Fürst f905ee77eb wait less for task launch and dont write iterations complete out 11 months ago
Constantin Fürst 72fb3764fc modify benchmarker script to require less parameters and sit in root dir 11 months ago
Constantin Fürst c8b4f3d624 fix issues with benchmark.hpp 11 months ago
Constantin Fürst 1446d1575d set O3 for release and g3 for debug as build flags in cmake 11 months ago
Constantin Fürst 6e26739eb9 give unique name to brute cpu copy benchmark descriptors for identification in results 11 months ago
Constantin Fürst a4131a6b33 remove previous benchmark results as these will be redone 11 months ago
Constantin Fürst 8068fc3437 rewrite descriptors to match the format of the rewrite of bench from previous commit 11 months ago
Constantin Fürst fccc255aae rewrite the benchmark to measure timings for the entire run of all threads, doing multiple sync-steps with the launch barrier as done in the qdp bench 11 months ago
Constantin Fürst d20b6dad93 remove all unused descriptors and begin rewriting submission descriptors 11 months ago
Constantin Fürst d06f0d0c6c re-evaluate peak perf benchs 11 months ago
Constantin Fürst 1af027417b re-run benchmark peakperf from n0 11 months ago
Constantin Fürst cf14cf34ac subdivide the descriptors for peak perf allnodes into fromn0 and fromn1to15 11 months ago
Constantin Fürst 9cd6a41205 add benchmark evaluations for software and new evaluations in pdf format 11 months ago
Constantin Fürst d73b1cea68 add brute sw path results 11 months ago
Constantin Fürst f2059d4d47 use 12 threads for each brute benchmark 11 months ago
Constantin Fürst 439ec97c8e remove smart sw copy, add brute sw copy bench 11 months ago
Constantin Fürst 24f96247fc re-run cpu peak perf for allnodes with software path correctly set 11 months ago
Constantin Fürst e9090fb482 redo benchmarks for allnodes cpu due to mistake in modifying the json files 11 months ago
Constantin Fürst 317a74d164 benchmark software peak performance for smart and brute force from node 0 for thesis 11 months ago
Constantin Fürst ba9dae8bde export plots from the benchmarks as pdf for scalability and integration with thesis 11 months ago
Constantin Fürst 8af4f46d0b add unit to y-axis-label for peakthroughput plots 11 months ago
Constantin Fürst 718ce39693 plot the throughput benchmark for source node 0 and destination nodes {8,11,12,15} only and as a bar plot for use in thesis 11 months ago
Constantin Fürst 1ec7d438b2 rework the multithread benchmark plot to be more compact for use in the thesis 11 months ago
Constantin Fürst d626c263ff re-configure the submission method benchmark for displaying in the thesis 11 months ago
Constantin Fürst c0c75aa51b use uniform colormap in the plots and use separate output folder for the plots 12 months ago
Constantin Fürst 5f72404508 re run changed smart peak throughput benchmarks 1 year ago
Constantin Fürst 1fca956a0a update the peak throughput plotter to show the difference of smart and allnodes too 1 year ago
Constantin Fürst 791184ff10 fix mistake in benchmark descriptors for smart peak performance 1 year ago
Constantin Fürst f809eb5847 rerun copy benchmark with smart assignment 1 year ago
Constantin Fürst 6b9581b8d1 add results for the new 512mib internode hbm copy peak perf test 1 year ago
Constantin Fürst 0748826fcd use transfer size of 512mib for HBM intranode copy and modify plotter accordingly 1 year ago
Constantin Fürst 60a5ba5120 refactor the benchmark plotters and submit newly plotted graphs 1 year ago
Constantin Fürst c0f2aa2b64 re-plot the benchmarks with the new data 1 year ago
Constantin Fürst 1c7369f20e add results for smart and allnodes peak throughput benchmark 1 year ago
Constantin Fürst 1e55565072 remove previous benchmark results for peak performance 1 year ago
Constantin Fürst 475b2d5b5a provide two benchmarks for peak performance, one that is brute force and one that uses smart node assignment and therefore lower utilization 1 year ago
Constantin Fürst 9cef69c33f add mdsa v3 benchmark results for peak performance 1 year ago
Constantin Fürst 7ced0bce4c fix an issue in the python script that led to references being modified which caused bad node settings 1 year ago
Constantin Fürst 8787b441bc add second type of multi dsa benchmark results 1 year ago
Constantin Fürst 6d9002d1e7 use different engine configuration depending on whether intra socket (all 4 engines on the socket) or inter socket (src and destination engine, cross copy) is the copy type 1 year ago
Constantin Fürst 68a838f0d1 add final multi-dsa results 1 year ago