128 Commits (eae4e5846d217e69e9bf9ad5a97281ac6aaff74c)

Author SHA1 Message Date
Constantin Fürst eb9960924e extend plotter to also provide rawtime for the base configurations of qdp, forgot to add it to a previous commit that should have included it 11 months ago
Constantin Fürst f653c22f88 add rawtiming table for the comparison-benchmarks (dram,hbm) in the qdp plotter, update the plots 11 months ago
Constantin Fürst 44c220fd5a add table output to qdp result plotter which displays speedup compared to dram as baseline, redo the timing plots with the latest test results, add the speedup table 11 months ago
Constantin Fürst 366cd84a1f remove the single-group results due to them not making the results easier to view 11 months ago
Constantin Fürst 6c39d79610 add results with only a single group (1/32nd) and 128 MiB task size (1/32nd) for cleaner results for plotting 11 months ago
Constantin Fürst a4dac61730 change config to allow 64 threads for stage 1 and 32 for stage 2 in all benchmarks 11 months ago
Constantin Fürst fb2a8e445c increase font size of timing donuts from qdp bench for better readability 11 months ago
Constantin Fürst bd23ae138e finalize plotter script and add timing results 11 months ago
Constantin Fürst 567a24f8c0 add hbm baseline and reorganize folders again 11 months ago
Constantin Fürst f6c43a6659 restructure evaluation results folder again 11 months ago
Constantin Fürst 6070f320f5 add results for distributed locations prefetching 11 months ago
Constantin Fürst 3c7c7852a5 remeasure performance for out of cache allocation 11 months ago
Constantin Fürst b710aec5fe restructure evaluation results, add new results with out of cache allocation 11 months ago
Constantin Fürst c5022105cb publish current configuration for testing 11 months ago
Constantin Fürst 16e47a862f allocate the correct number of chunks for caching (the run count was missing) and add them to the queue for each run 11 months ago
Constantin Fürst e99bf619c2 handle memory allocation outside of the cache, pre-allocate in benchmark and memset to hopefully guarantee no page faults will be encountered 11 months ago
Constantin Fürst 19ef2df856 update perf profile with manually disabled huge pages 11 months ago
Constantin Fürst d4677b3c59 measure performance without huge pages on 11 months ago
Constantin Fürst 7afcffbefa set correct node in perf recording script 11 months ago
Constantin Fürst c86d517444 fix the published results for prefetching 11 months ago
Constantin Fürst 94b3576d5a publish measurements from benchmark 11 months ago
Constantin Fürst 8999fe4ca3 share current config for qdp bench from vampir 11 months ago
Constantin Fürst 79a7e9637c fix benchmark by waiting and not dropping barrier in aggrj 11 months ago
Constantin Fürst bc1c3d0096 fix block size for access by cacher in scanb 11 months ago
Constantin Fürst 99552b3de4 add option to force mapping of pages by touching each one with a write at its beginning, required because behaviour somehow changed: the cache was experiencing page faults, and handling them via dsa is simply too slow 11 months ago
Constantin Fürst 5044b4419c make load balancing thread-local to reduce atomic cost 11 months ago
Constantin Fürst f9d47d3a45 add scanb back to the barrier, now other threads will wait for finish of work submission 11 months ago
Constantin Fürst 006b856c44 resolve issues from the recent reset of qdp benchmark 11 months ago
Constantin Fürst de1de9134b reset benchmark 11 months ago
Constantin Fürst 4a587a36e2 remove overlap-execution barriers and run for the entire block 11 months ago
Constantin Fürst c393b8eb88 improve load balancing node assignment 11 months ago
Constantin Fürst 21702d5309 remove sub and overchunking for scanb caching, use the per-iteration barriers again 11 months ago
Constantin Fürst 7d614769db remove forgotten access to load timer 11 months ago
Constantin Fürst 6a4eec37ca remove vector-load timing as it's too expensive 11 months ago
Constantin Fürst 93a281fa26 improve debug output for relwithdebinfo in qdp, fix filename for record perf script, add perf.svg with better debug info 11 months ago
Constantin Fürst 624e8b55ea add script to record perf and make the flame graph 11 months ago
Constantin Fürst 942d7be7e9 redo benchmarks for qdp 11 months ago
Constantin Fürst a83f208cd2 fix time evaluation for qdp bench 11 months ago
Constantin Fürst cc8d203771 redo benchmarks for qdp, move previous results to old (folder) 11 months ago
Constantin Fürst 94669924c8 implement cache in aggrj for qdp 11 months ago
Constantin Fürst c7877ecdf6 remove skeleton of now defunct function in qdp 11 months ago
Constantin Fürst 20c6e54df7 remove broken implementation for non-divisible chunk-group-thread-counts 11 months ago
Constantin Fürst a3a8dff1aa reset some changes to the aggregation and filter functions not quite needed 11 months ago
Constantin Fürst 69aec6fa48 add plotter for the results of qdp which turns them into a donut-graph 11 months ago
Constantin Fürst 122eab35b7 modify benchmarking code to measure time spent loading vectors too 11 months ago
Constantin Fürst d1cc3e3b0c modification to qdp benchmark, returns to per-chunk barrier wait, uses userspace semaphore for one-way barrier from scan_b to aggr_j as scan_b should submit asap but aggr_j should wait on submission from scan_b, contains TODO for modifying code to support chunkcount not divisible by 2 11 months ago
Constantin Fürst a963406f7c move mode selection to Configuration.hpp, adapt the CopyMethodPolicy-Function to return only src_node for task sizes under 16MiB which is now required to not cause high submission count which slows down small copies 11 months ago
Constantin Fürst ef805244ac use 4gib as size and again 1 aggrj thread for qdp bench 11 months ago
Constantin Fürst 81527fdb6b commit current vampir config 11 months ago
Constantin Fürst b35f9978ae again, redo the perf-eval with reduced data size and load to prevent missing frames, the second 11 months ago
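
Commit 99552b3de4 above describes forcing pages to be mapped by writing once to the start of each page, so that no page faults occur later during the copy. The following is a minimal sketch of that idea only, not the repository's code: the function name `touch_pages`, its parameters, and its return value are hypothetical, and it assumes a freshly allocated, page-aligned buffer whose contents may be overwritten.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not taken from the repository): writes one byte at the
// start of every page-sized stride of the buffer so the OS maps each page
// before the buffer is used. Assumes `data` is page-aligned and that its
// contents may be overwritten. Returns the number of writes performed.
inline std::size_t touch_pages(std::uint8_t* data, std::size_t size,
                               std::size_t page_size) {
    std::size_t touched = 0;
    for (std::size_t offset = 0; offset < size; offset += page_size) {
        // volatile keeps the compiler from eliding the otherwise unused store
        volatile std::uint8_t* p = data + offset;
        *p = 0;
        ++touched;
    }
    return touched;
}
```

On Linux the page size could be queried with `sysconf(_SC_PAGESIZE)` and passed as `page_size`. With huge pages enabled, a larger stride would be enough.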