81 Commits (05ffea0bdb1c73518e116ac6bac1df43446f7158)

Author SHA1 Message Date
Constantin Fürst e99bf619c2 handle memory allocation outside of the cache, pre-allocate in benchmark and memset to hopefully guarantee no pagefaults will be encountered 11 months ago
Constantin Fürst 8999fe4ca3 share current config for qdp bench from vampir 11 months ago
Constantin Fürst 79a7e9637c fix benchmark by waiting and not dropping barrier in aggrj 11 months ago
Constantin Fürst bc1c3d0096 fix block size for access by cacher in scanb 11 months ago
Constantin Fürst 99552b3de4 add option to force mapping of pages by touching each one with a write at its beginning; required because behaviour somehow changed: the cache was experiencing page fault errors and handling them by dsa is simply too slow 11 months ago
Constantin Fürst 5044b4419c make load balancing thread-local to reduce atomic cost 11 months ago
Constantin Fürst f9d47d3a45 add scanb back to the barrier, now other threads will wait for finish of work submission 11 months ago
Constantin Fürst 006b856c44 resolve issues from the recent reset of qdp benchmark 11 months ago
Constantin Fürst de1de9134b reset benchmark 11 months ago
Constantin Fürst 4a587a36e2 remove overlap-execution barriers and run for the entire block 11 months ago
Constantin Fürst c393b8eb88 improve load balancing node assignment 11 months ago
Constantin Fürst 21702d5309 remove sub and overchunking for scanb caching, use the per-iteration barriers again 11 months ago
Constantin Fürst 7d614769db remove forgotten access to load timer 11 months ago
Constantin Fürst 6a4eec37ca remove vector-load timing as it's too expensive 11 months ago
Constantin Fürst a83f208cd2 fix time evaluation for qdp bench 11 months ago
Constantin Fürst 94669924c8 implement cache in aggrj for qdp 11 months ago
Constantin Fürst c7877ecdf6 remove skeleton of now defunct function in qdp 11 months ago
Constantin Fürst 20c6e54df7 remove broken implementation for non-divisible chunk-group-thread-counts 11 months ago
Constantin Fürst a3a8dff1aa reset some changes to the aggregation and filter functions not quite needed 11 months ago
Constantin Fürst 122eab35b7 modify benchmarking code to measure time spent loading vectors too 11 months ago
Constantin Fürst d1cc3e3b0c modification to qdp benchmark, returns to per-chunk barrier wait, uses userspace semaphore for one-way barrier from scan_b to aggr_j as scan_b should submit asap but aggr_j should wait on submission from scan_b, contains TODO for modifying code to support chunkcount not divisible by 2 11 months ago
Constantin Fürst a963406f7c move mode selection to Configuration.hpp, adapt the CopyMethodPolicy-Function to return only src_node for task sizes under 16MiB which is now required to not cause high submission count which slows down small copies 11 months ago
Constantin Fürst ef805244ac use 4gib as size and again 1 aggrj thread for qdp bench 11 months ago
Constantin Fürst 81527fdb6b commit current vampir config 11 months ago
Constantin Fürst 69a3d2cef4 experimental implementation for tc-scanb > tc-aggrj, the second 11 months ago
Constantin Fürst 07fba8a5f0 experimental implementation for tc-scanb > tc-aggrj 11 months ago
Constantin Fürst d4122ba25a add updated config for prefetch from vampir 11 months ago
Constantin Fürst e4a0030049 fix prefetching subchunk indexing and adapt the weak access flag for join 11 months ago
Constantin Fürst 972440d19f repair flags implementation 11 months ago
Constantin Fürst b3607329a6 add a flags-concept to cacher, add the option to select whether to handle pagefaults or not 11 months ago
Constantin Fürst bb1d20924a fix index clash for thread-and-group unique indexing 11 months ago
Constantin Fürst 0eca180e53 fix destination indexing in aggrj for happly 11 months ago
Constantin Fürst 5e8f3e05e3 fix chunk indexing in scanb and refactor result calculation 11 months ago
Constantin Fürst c2b9e6656d fix chunk selection in scanb, use the dataptr in aggrj complex mode, export some functions to src/utils/BenchmarkHelpers.cpp 11 months ago
Constantin Fürst 845e812ca7 set the correct sum check which was inverted by query type 11 months ago
Constantin Fürst abcb9a4b2e extend modestring to contain query type 11 months ago
Constantin Fürst e4ed4ac5b9 correct and minimize subchunking implementation which now is only allowed in scanb 11 months ago
Constantin Fürst 50560606a3 add complex query as benchmarking option and evaluate results 11 months ago
Constantin Fürst 3c1606da51 init datab correctly as well to fix the benchmark 11 months ago
Constantin Fürst 10a791dea1 remove the experimental code branches that turned out not to yield any benefit (sched-yield has too high delay and with the new load balancer, subchunking for aggrj is also not needed anymore) 11 months ago
Constantin Fürst 881047068c rerun benchmarks for dram baseline and hbm peak 11 months ago
Constantin Fürst a72a26dbee remove cout/cerr output from cache and benchmark to not falsify results 11 months ago
Constantin Fürst 0856d58855 properly drop barrier when using iterative aggregation 11 months ago
Constantin Fürst 5c08313830 properly set barrier when using iterative aggregation 11 months ago
Constantin Fürst 178d45fafa use weak wait, add options to tweak for caching mode 11 months ago
Constantin Fürst 34f7aca50a correct bad thread timing storage size set which should have been 1 from the start and not 0 11 months ago
Constantin Fürst 100774f495 remove the step-by-step barrier sync and let scana and scanb run to completion before starting with aggrj 11 months ago
Constantin Fürst 52aaab3c09 prevent illegal instruction exception when no measurements have been conducted 11 months ago
Constantin Fürst 391e6ca273 use proper timing indices for documenting thread runtime 11 months ago
Constantin Fürst e429e8fd40 adapt barrier waiting points, add timings to thread execution 11 months ago