42 Commits (master)

Author SHA1 Message Date
Constantin Fürst 1b6c60c49b benchmark copy throughput for using 1,2,4,8 dsas and remove brute cpu bench (we steal it from andre) 11 months ago
Constantin Fürst 21bbf53e55 use local and one remote node for pushpull on intranode 11 months ago
Constantin Fürst 9e329d39e4 add pushpull benchmark for peak throughput 11 months ago
Constantin Fürst 067a31e560 modify submission benchmark descriptors to have x10 internal repetitions 11 months ago
Constantin Fürst ed77e57e5f remove benchmark descriptors for unused engine location benchmark 11 months ago
Constantin Fürst 8ac601fc07 add option for internal repetitions to benchmarks which allows the small copies of 1kib to run long enough for the timings to become usable (goal is about 1s runtime for each iteration) 11 months ago
Constantin Fürst ea978423c0 slightly modify the debug benchmark descriptor to contain multiple threads and also a batch 11 months ago
Constantin Fürst cae0fbb56e remove benchmark descriptor handling helper scripts 11 months ago
Constantin Fürst e2c6fd8587 set iteration count for submission bench to 10 for the large and 100 for the small, also test 128mib which is over the size of the cache available on xeonmax (instead of 32mib) 11 months ago
Constantin Fürst 72fb3764fc modify benchmarker script to require less parameters and sit in root dir 11 months ago
Constantin Fürst c8b4f3d624 fix issues with benchmark.hpp 11 months ago
Constantin Fürst 6e26739eb9 give unique name to brute cpu copy benchmark descriptors for identification in results 11 months ago
Constantin Fürst 8068fc3437 rewrite descriptors to match the format of the rewrite of bench from previous commit 11 months ago
Constantin Fürst d20b6dad93 remove all unused descriptors and begin rewriting submission descriptors 11 months ago
Constantin Fürst cf14cf34ac subdivide the descriptors for peak perf allnodes into fromn0 and fromn1to15 11 months ago
Constantin Fürst f2059d4d47 use 12 threads for each brute benchmark 11 months ago
Constantin Fürst 439ec97c8e remove smart sw copy, add brute sw copy bench 11 months ago
Constantin Fürst e9090fb482 redo benchmarks for allnodes cpu due to mistake in modifying the json files 11 months ago
Constantin Fürst 317a74d164 benchmark software peak performance for smart and brute force from node 0 for thesis 11 months ago
Constantin Fürst 791184ff10 fix mistake in benchmark descriptors for smart peak performance 1 year ago
Constantin Fürst 0748826fcd use transfer size of 512mib for HBM intranode copy and modify plotter accordingly 1 year ago
Constantin Fürst 475b2d5b5a provide two benchmarks for peak performance, one that is brute force and one that uses smart node assignment and therefore lower utilization 1 year ago
Constantin Fürst 7ced0bce4c fix an issue in the python script that led to references being modified which caused bad node settings 1 year ago
Constantin Fürst 6d9002d1e7 use different engine configuration depending on whether intra socket (all 4 engines on the socket) or inter socket (src and destination engine, cross copy) is the copy type 1 year ago
Constantin Fürst 7f7230197c don't submit multiple copies - test takes too long and this has almost no effect at work size of 1gib 1 year ago
Constantin Fürst e9807df09c use all 8 engines for each copy task, as the engine location does not affect performance 1 year ago
Constantin Fürst 5089936f30 prepare peak throughput plotter for multi-node results 1 year ago
Constantin Fürst 1682b84fb4 use a batch size of 8 to check whether multiple engines can increase throughput 1 year ago
Constantin Fürst 575ff8cf82 turn node -1 into node 7 again - this got messed up by a hastily written modification script 1 year ago
Constantin Fürst cf675e37f5 don't pin the thread to hbm nodes but to the hbm src node minus 8 in peak perf benchmark 1 year ago
Constantin Fürst 808c8f3ae7 run the engine location benchmarks with size of 1gib only 10 times 1 year ago
Constantin Fürst b37968dd3f rename engine location benchmarks, modify plotter to support missing configuration files as the new cases are not universally applicable to all configurations 1 year ago
Constantin Fürst a548d9afe5 restructure mtsubmit benchmarks 1 year ago
Constantin Fürst b7cae18b6d restructure the engine location bench, correct and update the plotter to use new total time 1 year ago
Constantin Fürst 405166cbe8 add peak perf benchmark descriptors 1 year ago
Constantin Fürst 9886b20112 remove the multiple tests for mtsubmit and ensure that the wq will always receive the same elements to make the test fair 1 year ago
Constantin Fürst 3964da0d7a use ms10 instead of ms50 as ms50 with > 2 threads will overfill the wq of size 128 (max) and cause an error not yet handled in the benchmark 1 year ago
Constantin Fürst f9b00a5b32 modify mtsubmit to measure both ssaw and ms50 submit methods 1 year ago
Constantin Fürst d72e28adb8 make mtsubmit a multisubmit 1 year ago
Constantin Fürst 151ddd79d5 set repetition count of tests to 1000 again and replace the 1 GiB tests with 32 MiB 1 year ago
Constantin Fürst 21194b7fde fix the benchmarker script by testing for the correct amount of parameters and obtaining the test name correctly 1 year ago
Constantin Fürst d3a09181ce move the existing results into a new folder and modify them to be only benchmark descriptors 1 year ago