Constantin Fürst
|
eabba98972
|
include the new benchmarks for peak throughput and work division and modify their analysis
|
11 months ago |
Constantin Fürst
|
8bdc3e1d76
|
start formulation of thesis introduction
|
11 months ago |
Constantin Fürst
|
9d5fbe085b
|
formulate abstract
|
11 months ago |
Constantin Fürst
|
579494bc41
|
redo the plots for benchmarks of peak throughput and work division
|
11 months ago |
Constantin Fürst
|
db8751afc7
|
run the modified peak throughput benchmarks
|
11 months ago |
Constantin Fürst
|
1b6c60c49b
|
benchmark copy throughput for using 1,2,4,8 dsas and remove brute cpu bench (we steal it from andre)
|
11 months ago |
Constantin Fürst
|
b6f85ca202
|
redo benchmarks for pushpull
|
11 months ago |
Constantin Fürst
|
21bbf53e55
|
use local and one remote node for pushpull on intranode
|
11 months ago |
Constantin Fürst
|
c43bce3e13
|
add results for pushpull benchmark
|
11 months ago |
Constantin Fürst
|
850aebc6b9
|
update bachelor.pdf with the bullet points for abstract and intro
|
11 months ago |
Constantin Fürst
|
6b500c3396
|
cite figure source at end of caption and not mid-sentence for chapter2
|
11 months ago |
Constantin Fürst
|
25f55adcb2
|
formulate bullet points for abstract and introduction to the thesis
|
11 months ago |
Constantin Fürst
|
18ec95e201
|
adapt chapter on performance to the corrected measurements
|
11 months ago |
Constantin Fürst
|
23daabbd73
|
fix bad section references
|
11 months ago |
Constantin Fürst
|
321a4fb0bd
|
manually add data from communication with andre to the cpu benchmark plotters
|
11 months ago |
Constantin Fürst
|
1021197009
|
do not use the float-force parameter H but move to a more dynamic one (h!tb), also don't use Subsection anywhere, instead use Section, add a small paragraph to the implementation chapter stating how we used the cache in qdp
|
11 months ago |
Constantin Fürst
|
27c57aa4ce
|
add node for content of evaluation chapter
|
11 months ago |
Constantin Fürst
|
9e329d39e4
|
add pushpull benchmark for peak throughput
|
11 months ago |
Constantin Fürst
|
33730c4a99
|
add all images which until now were committed only with -f
|
11 months ago |
Constantin Fürst
|
a580939d29
|
rewrite chapter 3 with the corrected benchmark results, add these corrected results, re-add structograms with different names
|
11 months ago |
Constantin Fürst
|
89d2a6c71f
|
redo the structograms for the benchmark code
|
11 months ago |
Constantin Fürst
|
94818536cd
|
improve the section on qdp in state chapter and use (redone) graphic for simple query
|
11 months ago |
Constantin Fürst
|
4f27e9c9c0
|
remove pdf files from gitignore
|
11 months ago |
Constantin Fürst
|
7572350b28
|
apply recommendations of andre to chapter 5
|
11 months ago |
Constantin Fürst
|
c6495b8b02
|
write introductory paragraph to design section
|
11 months ago |
Constantin Fürst
|
d1cc3e3b0c
|
modification to qdp benchmark, returns to per-chunk barrier wait, uses userspace semaphore for one-way barrier from scan_b to aggr_j as scan_b should submit asap but aggr_j should wait on submission from scan_b, contains TODO for modifying code to support chunkcount not divisible by 2
|
11 months ago |
Constantin Fürst
|
bf79435ff0
|
add new benchmark plots from the rewritten microbench
|
11 months ago |
Constantin Fürst
|
8ab5eb4902
|
finish adapting plotters to new result style, add division by thread count to the throughput plotters, adjust figure sizes to be small (larger font when scaled up in latex)
|
11 months ago |
Constantin Fürst
|
f3e89405a5
|
publish benchmark results from vampir for the redone microbench
|
11 months ago |
Constantin Fürst
|
067a31e560
|
modify submission benchmark descriptors to have x10 internal repetitions
|
11 months ago |
Constantin Fürst
|
326cf92af3
|
update benchmark plotters for changes made to benchmark and result format
|
11 months ago |
Constantin Fürst
|
875098b258
|
remove plotters which are not in use anymore
|
11 months ago |
Constantin Fürst
|
ed77e57e5f
|
remove benchmark descriptors for unused engine location benchmark
|
11 months ago |
Constantin Fürst
|
a216d96003
|
remove old benchmark plots
|
11 months ago |
Constantin Fürst
|
8ac601fc07
|
add option for internal repetitions to benchmarks which allows the small copies of 1kib to run long enough for the timings to become usable (goal is about 1s runtime for each iteration)
|
11 months ago |
Constantin Fürst
|
a963406f7c
|
move mode selection to Configuration.hpp, adapt the CopyMethodPolicy-Function to return only src_node for task sizes under 16MiB which is now required to not cause high submission count which slows down small copies
|
11 months ago |
Constantin Fürst
|
ef8286da17
|
unset the weak-wait-flag on deallocation for CacheData which is where completion guarantee is required, also extend some comments in CacheData
|
11 months ago |
Constantin Fürst
|
8dc3827676
|
do not check in the debug benchmark result
|
11 months ago |
Constantin Fürst
|
25451fa26a
|
pretty-format the benchmarker script which got mangled from editing on vampir
|
11 months ago |
Constantin Fürst
|
3bfbeca21f
|
resize source and destination pointer holders properly before use and use path from template and not dml::software for cache flush in benchmark loop
|
11 months ago |
Constantin Fürst
|
ef805244ac
|
use 4gib as size and again 1 aggrj thread for qdp bench
|
11 months ago |
Constantin Fürst
|
3c90e24bc1
|
revert the cacher to allow load balancing control through the copy placement policy function which now selects on how many nodes the task is split again, and not just which nodes the task MAY run on (which was done experimentally)
|
11 months ago |
Constantin Fürst
|
ea978423c0
|
slightly modify the debug benchmark descriptor to contain multiple threads and also a batch
|
11 months ago |
Constantin Fürst
|
cae0fbb56e
|
remove benchmark descriptor handling helper scripts
|
11 months ago |
Constantin Fürst
|
24bdccd1e3
|
rewrite the benchmarker to not allocate the memory regions each iteration but before the test runs, also flush cache each iteration using dml-operation, also set dsa-device using the parameter to submit and not using libnuma assignment
|
11 months ago |
Constantin Fürst
|
e2c6fd8587
|
set iteration count for submission bench to 10 for the large and 100 for the small, also test 128mib which is over the size of the cache available on xeonmax (instead of 32mib)
|
11 months ago |
Constantin Fürst
|
4f9abc911f
|
make benchmark.hpp a cpp file to make it clear that it will have global variables
|
11 months ago |
Constantin Fürst
|
f905ee77eb
|
wait less for task launch and don't write iterations complete out
|
11 months ago |
Constantin Fürst
|
72fb3764fc
|
modify benchmarker script to require fewer parameters and sit in root dir
|
11 months ago |
Constantin Fürst
|
c8b4f3d624
|
fix issues with benchmark.hpp
|
11 months ago |