98 Commits (c7b91e400f3353a2b83a40e8a53455d3776fb9e1)

Author SHA1 Message Date
Constantin Fürst 4a587a36e2 remove overlap-execution barriers and run for the entire block 11 months ago
Constantin Fürst c393b8eb88 improve load balancing node assignment 11 months ago
Constantin Fürst 21702d5309 remove sub and overchunking for scanb caching, use the per-iteration barriers again 11 months ago
Constantin Fürst 7d614769db remove forgotten access to load timer 11 months ago
Constantin Fürst 6a4eec37ca remove vector-load timing as it's too expensive 11 months ago
Constantin Fürst 93a281fa26 improve debug output for relwithdebinfo in qdp, fix filename for record perf script, add perf.svg with better debug info 11 months ago
Constantin Fürst 624e8b55ea add script to record perf and make the flame graph 11 months ago
Constantin Fürst 942d7be7e9 redo benchmarks for qdp 11 months ago
Constantin Fürst a83f208cd2 fix time evaluation for qdp bench 11 months ago
Constantin Fürst cc8d203771 redo benchmarks for qdp, move previous results to old (folder) 11 months ago
Constantin Fürst 94669924c8 implement cache in aggrj for qdp 11 months ago
Constantin Fürst c7877ecdf6 remove skeleton of now defunct function in qdp 11 months ago
Constantin Fürst 20c6e54df7 remove broken implementation for non-divisible chunk-group-thread-counts 11 months ago
Constantin Fürst a3a8dff1aa reset some changes to the aggregation and filter functions not quite needed 11 months ago
Constantin Fürst 69aec6fa48 add plotter for the results of qdp which turns them into a donut-graph 11 months ago
Constantin Fürst 122eab35b7 modify benchmarking code to measure time spent loading vectors too 11 months ago
Constantin Fürst d1cc3e3b0c modification to qdp benchmark, returns to per-chunk barrier wait, uses userspace semaphore for one-way barrier from scan_b to aggr_j as scan_b should submit asap but aggr_j should wait on submission from scan_b, contains TODO for modifying code to support chunkcount not divisible by 2 11 months ago
Constantin Fürst a963406f7c move mode selection to Configuration.hpp, adapt the CopyMethodPolicy-Function to return only src_node for task sizes under 16MiB which is now required to not cause high submission count which slows down small copies 11 months ago
Constantin Fürst ef805244ac use 4gib as size and again 1 aggrj thread for qdp bench 11 months ago
Constantin Fürst 81527fdb6b commit current vampir config 11 months ago
Constantin Fürst b35f9978ae again, redo the perf-eval with reduced data size and load to prevent missing frames, the second 11 months ago
Constantin Fürst 18d5e62b80 again, redo the perf-eval with reduced data size and load to prevent missing frames 11 months ago
Constantin Fürst d63d8ac547 add redone flame graph 11 months ago
Constantin Fürst 69a3d2cef4 experimental implementation for tc-scanb > tc-aggrj, the second 11 months ago
Constantin Fürst 07fba8a5f0 experimental implementation for tc-scanb > tc-aggrj 11 months ago
Constantin Fürst d4122ba25a add updated config for prefetch from vampir 11 months ago
Constantin Fürst e4a0030049 fix prefetching subchunk indexing and adapt the weak access flag for join 11 months ago
Constantin Fürst f978d6b9b4 redo tests for prefetching 11 months ago
Constantin Fürst 972440d19f repair flags implementation 11 months ago
Constantin Fürst b3607329a6 add a flags-concept to cacher, add the option to select whether to handle pagefaults or not 11 months ago
Constantin Fürst 4b0770fc8e add result for try with strong waiting 11 months ago
Constantin Fürst 6dd7f80500 again, redo the perf flame graph 11 months ago
Constantin Fürst 29c49ca5b4 redo flame graph with correct stack information 11 months ago
Constantin Fürst 4cbe649601 generate flame graph for runtime of prefetch 11 months ago
Constantin Fürst 57e696297c provide new results for simpleq 11 months ago
Constantin Fürst bb1d20924a fix index clash for thread-and-group unique indexing 11 months ago
Constantin Fürst 0eca180e53 fix destination indexing in aggrj for happly 11 months ago
Constantin Fürst 5e8f3e05e3 fix chunk indexing in scanb and refactor result calculation 11 months ago
Constantin Fürst c2b9e6656d fix chunk selection in scanb, use the dataptr in aggrj complex mode, export some functions to src/utils/BenchmarkHelpers.cpp 11 months ago
Constantin Fürst 845e812ca7 set the correct sum check which was inverted by query type 11 months ago
Constantin Fürst abcb9a4b2e extend modestring to contain query type 11 months ago
Constantin Fürst e4ed4ac5b9 correct and minimize subchunking implementation which now is only allowed in scanb 11 months ago
Constantin Fürst 50560606a3 add complex query as benchmarking option and evaluate results 11 months ago
Constantin Fürst 79a7dcead8 re-run bench with actually working query 11 months ago
Constantin Fürst 3c1606da51 init datab correctly as well to fix the benchmark 11 months ago
Constantin Fürst 10a791dea1 remove the experimental code branches that turned out not to yield any benefit (sched-yield has too high delay and with the new load balancer, subchunking for aggrj is also not needed anymore) 11 months ago
Constantin Fürst a6771287e9 add result with the new load balancer 11 months ago
Constantin Fürst 881047068c rerun benchmarks for dram baseline and hbm peak 11 months ago
Constantin Fürst a72a26dbee remove cout/cerr output from cache and benchmark to not falsify results 11 months ago
Constantin Fürst 19c2b35218 add results for thesis evaluation 11 months ago