From 142f9391bf39ce870740792b7af894c72120b572 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Constantin=20F=C3=BCrst?=
Date: Mon, 19 Feb 2024 17:44:02 +0100
Subject: [PATCH] address calculations the way that the align-environment numbers them (leading with chapter number)

---
 thesis/content/30_performance.tex | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/thesis/content/30_performance.tex b/thesis/content/30_performance.tex
index cfe658a..7aff4b6 100644
--- a/thesis/content/30_performance.tex
+++ b/thesis/content/30_performance.tex
@@ -85,7 +85,7 @@ In Figure \ref{fig:perf-mtsubmit}, we note that threading has no discernible neg
 8800\ MT/s \times 8B/T = 70400 \times 10^6 B/s &= 65.56\ GiB/s
 \end{align}
 
-Moving data from \glsentryshort{dram} to \gls{hbm} is most relevant to the rest of this work, as it is the target application. With \gls{hbm} offering higher bandwidth than the \glsentryshort{dram} of our system, we will be restricted by the available bandwidth of the source. To determine the upper limit achievable, we must calculate the available peak bandwidth. For each \gls{numa:node}, the test system is configured with two DIMMs of DDR5-4800. The naming scheme contains the data rate in Megatransfers (MT) per second, however the processor specification notes that for dual channel operation, the maximum supported speed drops to \(4400\ MT/s\) \cite{intel:xeonmax-ark}. We calculate the transfers performed per second for one \gls{numa:node} (1), followed by the bytes per transfer \cite{kingston:ddr5-spec-overview} in calculation (2), and at last combine these two for the theoretical peak bandwidth per \gls{numa:node} on the system (3). \par
+Moving data from \glsentryshort{dram} to \gls{hbm} is most relevant to the rest of this work, as it is the target application. With \gls{hbm} offering higher bandwidth than the \glsentryshort{dram} of our system, we will be restricted by the available bandwidth of the source. To determine the upper limit achievable, we must calculate the available peak bandwidth. For each \gls{numa:node}, the test system is configured with two DIMMs of DDR5-4800. The naming scheme contains the data rate in Megatransfers (MT) per second, however the processor specification notes that for dual channel operation, the maximum supported speed drops to \(4400\ MT/s\) \cite{intel:xeonmax-ark}. We calculate the transfers performed per second for one \gls{numa:node} (3.1), followed by the bytes per transfer \cite{kingston:ddr5-spec-overview} in calculation (3.2), and at last combine these two for the theoretical peak bandwidth per \gls{numa:node} on the system (3.3). \par
 
 \begin{figure}[t]
 \centering