diff --git a/thesis/content/30_performance.tex b/thesis/content/30_performance.tex index 706a317..8679ba9 100644 --- a/thesis/content/30_performance.tex +++ b/thesis/content/30_performance.tex @@ -60,7 +60,7 @@ We anticipate that single submissions will consistently yield poorer performance \label{fig:perf-submitmethod} \end{figure} -In Figure \ref{fig:perf-submitmethod} we conclude that with transfers of 1 MiB and upwards, the submission method makes no noticeable difference. For smaller transfers the performance varies greatly, with batch operations leading in throughput. This finding is aligned with the observation that \enquote{SWQ observes lower throughput between 1-8 KB [transfer size]} \cite[p. 6 and 7]{intel:analysis} for normal submission method. \par +In Figure \ref{fig:perf-submitmethod} we conclude that with transfers of 1 MiB and upwards, the submission method makes no noticeable difference. For smaller transfers the performance varies greatly, with batch operations leading in throughput. Reese Kuper et al. observed that \enquote{SWQ observes lower throughput between 1-8 KB [transfer size]} \cite[pp. 6]{intel:analysis}. We however observe a much higher point of equalization, pointing to additional delays introduced by programming the \gls{dsa} through \gls{intel:dml}. \par Another limitation may be observed in this result, namely the inherent throughput limit per \gls{dsa} chip of close to 30 GiB/s. This is apparently caused by I/O fabric limitations \cite[p. 5]{intel:analysis}. \par