Browse Source

slight modification of the intro, making it fit on one page

master
Constantin Fürst 11 months ago
parent
commit
89887b6cad
  1. BIN
      thesis/bachelor.pdf
  2. 4
      thesis/content/10_introduction.tex

BIN
thesis/bachelor.pdf

4
thesis/content/10_introduction.tex

@ -14,9 +14,9 @@
The proliferation of various technologies, such as Non-Volatile RAM (NVRAM), High Bandwidth Memory (HBM), and Remote Memory, has ushered in a diverse landscape of systems characterized by varying tiers of main memory. Within these systems, the movement of data across memory classes becomes imperative to leverage the distinct properties offered by the available technologies. Traditionally tasked with managing data locality, the CPU faces an added burden in heterogeneous memory environments, thereby diminishing available processing cycles. To mitigate this strain on the CPU, certain Intel Server Processors now feature the \glsentryfirst{dsa}, to which certain data operations may be offloaded \cite{intel:xeonbrief}. With it, this thesis undertakes the challenge of optimizing data locality on \glsentrylong{numa}s. \par
The primary objectives of this thesis are twofold. Firstly, it involves a comprehensive analysis and characterization of the Intel \gls{dsa} architecture. Secondly, the focus extends to the application of \gls{dsa} in the domain-specific context of \glsentryfirst{qdp} to accelerate database queries \cite{dimes-prefetching}. This thesis seeks to explore how \gls{dsa} can be strategically utilized to tackle the challenges posed by heterogeneous memory systems, offering insights into the integration of data streaming acceleration with intelligent prefetching. \par
The primary objectives of this thesis are twofold. Firstly, it involves a comprehensive analysis and characterization of the architecture of the Intel \gls{dsa}. Secondly, the focus extends to the application of \gls{dsa} in the domain-specific context of \glsentryfirst{qdp} to accelerate database queries \cite{dimes-prefetching}. \par
This work introduces significant contributions to the field. Notably, the design and implementation of an offloading cache represent a key highlight, providing an interface for leveraging the strengths of tiered storage with minimal integration efforts. The code for this is made available in the accompanying repository \cite{thesis-repo} under 'offloading-cacher'. Additionally, the thesis includes a detailed examination and analysis of the strengths and weaknesses of the \gls{dsa} through microbenchmarks. These benchmarks serve as practical guidelines, offering insights for the optimal application of \gls{dsa} in various scenarios. As of the time of writing, this thesis stands as the first scientific work to extensively evaluate the \gls{dsa} in a multi-socket system and provide benchmarks for programming through the \glsentryfirst{dml}. Furthermore, performance for data movement from \glsentryshort{dram} to \glsentryfirst{hbm} using \gls{dsa} has not yet been evaluated by the scientific community. \par
This work introduces significant contributions to the field. Notably, the design and implementation of an offloading cache represent a key highlight, providing an interface for leveraging the strengths of tiered storage with minimal integration efforts. Additionally, the thesis includes a detailed examination and analysis of the strengths and weaknesses of the \gls{dsa} through microbenchmarks. These benchmarks serve as practical guidelines, offering insights for the optimal application of \gls{dsa} in various scenarios. As of the time of writing, this thesis stands as the first scientific work to extensively evaluate the \gls{dsa} in a multi-socket system and provide benchmarks for programming through the \glsentryfirst{intel:dml}. Furthermore, performance for data movement from \glsentryshort{dram} to \glsentryfirst{hbm} using \gls{dsa} has not yet been evaluated by the scientific community. \par
The Technical Background chapter furnishes the reader with pertinent background information necessary for understanding the subsequent sections of this work, encompassing \gls{hbm}, \gls{qdp}, and \gls{dsa} along with its programming interface \cite{intel:dmldoc}. Additionally, guidance on system setup and configuration is provided. Subsequently, the Performance Microbenchmarks section analyzes the strengths and weaknesses of the \gls{dsa}. Methodologies are presented, each benchmark is elaborated upon in detail, and usage guidance is drawn from the results. The following sections, Design and Implementation, elucidate the practical aspects of the work, including the development of the interface and implementation for an offloading cache, shedding light on specific design considerations and implementation challenges. The Evaluation section offers a comprehensive assessment of the implemented solution. In Conclusion, insights gained are reflected upon, and the contributions and results of the preceding chapters are reviewed. \par

Loading…
Cancel
Save