diff --git a/thesis/content/60_evaluation.tex b/thesis/content/60_evaluation.tex
index 2967ca4..5802248 100644
--- a/thesis/content/60_evaluation.tex
+++ b/thesis/content/60_evaluation.tex
@@ -27,6 +27,8 @@ With this difficult scenario, we expect to spend time analysing runtime behaviou
 
 Consider using parts of flamegraph. Same speed as dram, even though allocation is performed in the timed region and not before. Mention dml performs busy waiting (cite dsa-paper 4.4 for use of interrupts mentioned in arch), optimization with weak wait. Mention optimization weak access for prefetching scenario.
 
+Scan A is memory bound, therefore copying B from DRAM to HBM directly cuts into available bandwidth to A when both are located on the same node. Better performance when A and B are stored on different nodes.
+
 \section{Observation and Discussion}
 
 %%% Local Variables: