\item Provide an overview of the challenges posed by Non-Uniform Memory Architectures (NUMA) with diverse non-functional properties, such as varying throughput, in modern hardware design.
\item Introduce the Intel Data Streaming Accelerator (Intel DSA) and its role in optimizing streaming data movement operations for various applications.
\item Explain the concept of Query Driven Prefetching (\gls{qdp}) and its significance in optimizing database performance through intelligent prefetching.
\item Clearly state the main objectives of the thesis, emphasizing the analysis and characterization of the Intel DSA architecture and the application of Intel DSA to \gls{qdp}.
\item Outline the primary contributions of the work, including the implementation of a cache within \gls{qdp} and the execution and analysis of microbenchmarks.
\item Provide an overview of the structure, outlining the main sections and their respective content.
\end{itemize}
In response to these challenges, this thesis examines how data movement operations can be optimized. Central to this examination is the Intel Data Streaming Accelerator (Intel DSA), a technology for accelerating streaming data movement operations across various applications. Understanding the architecture and functionality of Intel DSA is a prerequisite for addressing the challenges presented by NUMA. \par
Alongside Intel DSA, Query Driven Prefetching (\gls{qdp}) is a methodology for optimizing database performance through intelligent prefetching. This thesis elaborates on the principles of \gls{qdp} and its significance for database query execution, particularly in heterogeneous memory systems. \par
The main objectives of this thesis are twofold. First, it analyzes and characterizes the Intel DSA architecture, examining its behavior and its role in optimizing data movement operations. Second, it applies Intel DSA to \gls{qdp}, exploring how the accelerator can be used to address the challenges posed by heterogeneous memory systems and how data streaming acceleration can be integrated with intelligent prefetching. \par
This work makes two main contributions. The first is the implementation of a cache within the \gls{qdp} framework. Consistent with the generic nature of caching, this implementation also addresses the specific requirements of \gls{qdp} and thereby targets the challenges of prefetching in heterogeneous memory systems. The second is a detailed examination and analysis of microbenchmarks, which serve as a practical guideline for applying Intel DSA in diverse scenarios. \par
The thesis is structured as follows. The Technical Background section covers the essential elements: High Bandwidth Memory (HBM), \gls{qdp}, Intel DSA, the programming interface, and the system setup and configuration. The Performance Microbenchmarks section then describes the benchmarking methodology and the specific benchmarks employed, followed by a detailed analysis of the results. \par
The subsequent Design and Implementation sections cover the practical aspects of the work: detailed task descriptions, the cache design within \gls{qdp}, locking mechanisms, and accelerator usage. The Evaluation section then assesses the implemented solutions and benchmarks. \par
Finally, the Conclusion and Outlook section summarizes the findings, draws conclusions, and outlines potential future work and developments in the field. \par