\section{Background}
\label{sec:background}
\subsection{Darshan}
Darshan is a lightweight I/O characterization tool that captures I/O access pattern information from HPC applications~\cite{Darshan}.
It is used to tune applications for improved scientific productivity or performance, and it is suitable for continuous deployment to characterize the workloads of large systems~\cite{darshan-webpage}. Darshan provides detailed statistics about the various types of file accesses made by MPI and non-MPI applications, each of which can be enabled or disabled as desired. These types include POSIX, STDIO, Lustre, and MDHIM for non-MPI applications, and MPI-IO, HDF5, and PnetCDF for MPI applications~\cite{darshan-runtime}. This functionality gives users a post-run summary of an application's I/O behavior and patterns. It does not provide insight into runtime I/O behavior or concurrent system conditions, which may limit the ability to identify the root cause(s) of I/O variability and when during an application run it occurs.
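As a concrete illustration of this post-run workflow, the following minimal sketch
reads a Darshan log with the pydarshan Python package; the log file name is
hypothetical, and the exact pydarshan calls are our reading of its documented API,
which may differ between releases.
\begin{verbatim}
# Sketch: post-run inspection of a Darshan log using the
# pydarshan package (pip install darshan).
# "app.darshan" is a hypothetical log file name.
import darshan

report = darshan.DarshanReport("app.darshan", read_all=True)

# Modules (e.g., POSIX, STDIO, MPI-IO) captured in this run.
print(list(report.modules.keys()))

# POSIX records as pandas DataFrames of counters.
posix = report.records["POSIX"].to_df()
print(posix["counters"].head())
\end{verbatim}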
\subsection{LDMS}
The Lightweight Distributed Metric Service (LDMS) is a low-overhead production
monitoring system that can run on HPC machines with thousands of nodes, collecting
system data via LDMS daemons that run data-sampling plugins~\cite{ovisweb}.
Data is typically transported from these \emph{sampler} daemons to \emph{aggregator}
daemons over a Remote Direct Memory Access (RDMA) transport to minimize CPU
overhead on compute nodes. The final aggregators in a transport chain store
the data in a database.
System state insights are achieved by LDMS's synchronous sampling with
\emph{absolute timestamps}, which provides a snapshot-like view of system conditions.
Transport is performed in a tree structure in which the \emph{samplers} (the leaves)
determine the kind of data sampled, intermediate aggregators handle data
transport, and last-level aggregators handle storage. A variety of sampler
plugins can be used to collect different system metrics
(e.g., memory, CPU, network).
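For concreteness, a minimal \texttt{ldmsd} sampler configuration might resemble the
sketch below; \texttt{meminfo} is a real example plugin, but the host name, interval,
and exact directives are assumptions that depend on the LDMS version and deployment.
\begin{verbatim}
# Hypothetical ldmsd sampler configuration (v4-style syntax).
# Loads the meminfo sampler and samples once per second
# (the interval is given in microseconds).
load name=meminfo
config name=meminfo producer=node01 instance=node01/meminfo
start name=meminfo interval=1000000 offset=0
\end{verbatim}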
LDMS offers additional functionality, such as \emph{Streams}, a publish-subscribe
interface that enables the aggregation of event-based application data. Our framework
uses this publish-subscribe functionality and the \emph{LDMS Streams API} to
transport Darshan's I/O event data, collected from applications during
execution, and to store it, along with \emph{absolute timestamps}, in a database.
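The records carried over Streams are structured messages, typically JSON. The event
below is purely hypothetical and is meant only to suggest the shape of a timestamped
Darshan I/O event; the actual schema is defined by the Darshan--LDMS connector.
\begin{verbatim}
{"job_id": 1234, "rank": 0, "module": "POSIX",
 "operation": "write", "length": 1048576,
 "start_time": 1693000000.12, "end_time": 1693000000.34}
\end{verbatim}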
\subsection{DSOS}
The Distributed Scalable Object Store (DSOS) is a database designed to efficiently
manage large volumes of HPC data~\cite{sosgithub}. It supports high data injection
rates, enhanced query performance, and flexible storage management, and it serves
as a storage backend for LDMS.
DSOS provides a command-line interface for data interaction and APIs for
languages such as Python, C, and C++. A DSOS cluster consists of multiple
DSOS daemons, \emph{dsosd}, running across the storage servers of a single
cluster. The DSOS Client API can issue parallel queries to every
\emph{dsosd} in a DSOS cluster; the results are returned in parallel and
sorted by the index selected by the user. This database and
its Python API are used in our framework to store and query the I/O event data.
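As a sketch of this query path, the Python fragment below opens a SOS container and
iterates one schema by an index; the container path, schema, and attribute names are
hypothetical, and the class and method names reflect our reading of the SOS Python
(\texttt{sosdb}) API, which may differ by version.
\begin{verbatim}
# Sketch: reading I/O event objects from a SOS container via
# the sosdb Python module. All names here are hypothetical.
from sosdb import Sos

cont = Sos.Container()
cont.open("/storage/sos/darshan_data")

# Locate the event schema and iterate by its timestamp index.
schema = cont.schema_by_name("darshan_events")
attr = schema.attr_by_name("timestamp")
it = attr.attr_iter()
ok = it.begin()
while ok:
    obj = it.item()
    # ... process one timestamped I/O event object ...
    ok = it.next()
\end{verbatim}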
\subsection{HPC Web Services}
HPC Web Services is an analysis and visualization infrastructure for
LDMS~\cite{ClusterAV} that integrates an open-source web application,
Grafana~\cite{grafana-website}, with a custom back end built on the Django web
framework, which calls Python modules to analyze and visualize HPC data. Grafana
is an open-source visualization tool tailored to time-series data from
various database sources. It provides charts, graphs, tables, and other views
for analyzing queried data in real time. Using a custom DSOS-Grafana API,
users can specify in a Grafana query which Python analysis modules to apply.
Once specified, the Python analysis transforms any data queried from the dashboard
before the data is returned to Grafana. Grafana enables a wide variety of
visualization options and allows users to save and share visualizations with others.
Our framework leverages HPC Web Services for runtime analyses and visualizations
of the I/O event data collected by Darshan, as sketched below.
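To suggest how such an analysis module could be structured, the sketch below shows a
hypothetical Python module that transforms queried data before it is returned to
Grafana; the class name, method signature, and column names are illustrative
assumptions, not the actual HPC Web Services interface.
\begin{verbatim}
# Hypothetical analysis module: takes data queried from DSOS
# as a pandas DataFrame and returns a transformed DataFrame
# for Grafana to plot. Names and columns are illustrative.
import pandas as pd

class IORateAnalysis:
    def get_data(self, df: pd.DataFrame) -> pd.DataFrame:
        # Derive an I/O rate from assumed byte-count and
        # duration columns.
        out = df.copy()
        out["io_rate_MBps"] = out["bytes"] / 1e6 / out["duration_sec"]
        return out[["timestamp", "rank", "io_rate_MBps"]]
\end{verbatim}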