ddurden edited this page Sep 24, 2018 · 28 revisions

What is eddy4R?

eddy4R Background

eddy4R is a family of open-source packages for eddy-covariance (EC) raw data processing, analysis, and modeling in the R Language for Statistical Computing (R Core Team, 2016). As described in Metzger et al. (2017), eddy4R is being developed by NEON scientists with wide input from the scientific community (e.g., De Roo et al., 2014; Kohnert et al., 2015; Lee et al., 2015; Metzger et al., 2012; Metzger et al., 2013; Metzger et al., 2016; Sachs et al., 2014; Salmon et al., 2015; Serafimovich et al., 2013; Starkenburg et al., 2016; Vaughan et al., 2015; Xu et al., 2017). eddy4R currently consists of two public packages, eddy4R.base and eddy4R.qaqc, with several additional packages in preparation, including eddy4R.stor, eddy4R.turb, eddy4R.ucrt and eddy4R.erf. eddy4R.base and eddy4R.qaqc are published here in conjunction with NEON’s release of EC Level 1 data products, and are now available in the eddy4R public repo.

Eddy-covariance Calculations

A current challenge for EC tower networks in informing regional and continental scale processes is instrument and computational compatibility. The computations involved in EC processing are complex and developmentally dynamic, making code portability, extensibility, and documentation paramount. The suite of eddy4R packages is used to process eddy-covariance data, i.e. to calculate the net surface-atmosphere exchange (animated depiction below) by combining turbulent exchange and storage exchange estimates.
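As a rough illustration of the calculation described above, the following sketch estimates a turbulent flux as the covariance of vertical wind and a scalar, estimates a storage flux from the change in a concentration profile over the averaging period, and adds the two. It is written in Python rather than R and does not use any eddy4R function. All function names, argument names, and units are hypothetical, and corrections applied in real processing (coordinate rotation, lag correction, density effects, quality control) are omitted.

```python
from statistics import fmean


def turbulent_flux(w, c):
    """Covariance of vertical wind w and scalar c over one averaging period."""
    if len(w) != len(c) or not w:
        raise ValueError("w and c must be non-empty and of equal length")
    w_mean, c_mean = fmean(w), fmean(c)
    # Mean product of the fluctuations around the period means.
    return fmean((wi - w_mean) * (ci - c_mean) for wi, ci in zip(w, c))


def storage_flux(profile_start, profile_end, layer_thickness, dt):
    """Rate of change of the scalar, summed over the profile layers.

    profile_start / profile_end: scalar value per layer at the start and
    end of the averaging period; layer_thickness: thickness of each layer;
    dt: length of the averaging period.
    """
    if not (len(profile_start) == len(profile_end) == len(layer_thickness)):
        raise ValueError("profile inputs must have equal length")
    return sum(
        (c1 - c0) / dt * dz
        for c0, c1, dz in zip(profile_start, profile_end, layer_thickness)
    )


def net_exchange(w, c, profile_start, profile_end, layer_thickness, dt):
    """Net surface-atmosphere exchange: turbulent flux plus storage flux."""
    return turbulent_flux(w, c) + storage_flux(
        profile_start, profile_end, layer_thickness, dt
    )
```

A call such as `net_exchange(w, c, start, end, dz, 1800.0)` would return the sum of the two terms for one 30-minute period, provided the inputs are given in mutually consistent units.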

DevOps Approach

The NEON DevOps approach is fully documented in Metzger et al. (2017). The following excerpt from the paper gives a brief description:

NEON’s DevOps framework consists of a periodic sequence (Figure 2) that incorporates these workflow steps. For this purpose we define NEON Science as personnel working directly on the NEON project, and the Science Community, regardless of whether they also work on the NEON project, as anyone producing or using data, algorithms, or research products related to the NEON data themes (Atmosphere; Biogeochemistry; Ecohydrology; Land Cover and Processes; Organisms, Populations, and Communities): The science community contributes algorithms and best practices (1). Implicitly or explicitly, this embodies the DevOps: Plan stage – the algorithms most valued by the community are being incorporated. Together with NEON Science (2), these algorithms are coded in the open-source R computational environment (DevOps: Create stage). DevOps: Verify (testing) and Package (packaging) are performed as the code is compiled into eddy4R packages via the GitHub distributed version control system (3). NEON Science releases an eddy4R version from GitHub, which automatically builds an eddy4R-Docker image on DockerHub as specified in a “Dockerfile” (4; DevOps: Release stage). The eddy4R-Docker image is immediately available for deployment by NEON CI (5; DevOps: Configure & Monitor stages), the Science Community (1) and NEON Science (2) alike. Here the DevOps: Configure (computational resource allocation) & Monitor stages occur. Monitoring of end-user experience is also performed in GitHub (3) via issue-tracking.

How this DevOps cycle is realized in GitHub is described in the figure below.

FAQ

This is a short FAQ that will grow as we receive feedback from the user community.

  • Q: Why is NEON using eddy4R to process eddy-covariance data?

    • A: eddy4R (Metzger et al., 2017)…
      • …is an existing off-the-shelf solution like EddyPro (Fratini et al.), TK3 (Mauder et al.), EddyUH (Mammarella et al.), EdiRe (Clement et al.) and others. The main difference is that eddy4R is not a pre-compiled "black box", but consists of open-source modules that can be modified and combined into user-defined workflows as needed. Mauder and Clement are contributing to eddy4R, among others.
      • …provides the necessary flexibility to efficiently integrate eddy-covariance software into the NEON Cyberinfrastructure, Data Products and Problem Tracking and Resolution framework.
      • …calculates storage flux from tower profiles, which is not handled by other flux processors.
      • …produces fluxes that fulfil the requirements for addressing critical NEON science questions. Standard configurations of "black box" processors essentially produce engineering-grade fluxes.
      • …has been verified to within 1% against TK3, and EddyPro has expressed interest in several eddy4R modules.
  • Q: What are the advantages of NEON deploying eddy4R in Docker containers compared to other flux processing schemes, such as those of ICOS or AmeriFlux?

    • A: eddy4R in Docker containers…
      • …is fully automated: It is designed to require human interaction only upon notification, mostly in case of hardware issues. This is made possible through NEON’s tight integration of hardware and software development, which is not currently available at ICOS or AmeriFlux.
      • …is computationally efficient: It uses a fully adaptive single-pass workflow, whereas other flux processing schemes require double or triple passes. This approach permits near-real-time processing, in which AmeriFlux and NASA have already expressed interest.
      • …is scalable across large compute facilities: It is fully parallelized.
      • …is actively developed and used by the research community, with contributions consolidated at NEON.
      • …maximizes uptime and data coverage through tight integration with NEON Problem Tracking and Resolution.
      • …combines existing off-the-shelf solutions in a highly modular way, permitting straightforward adjustments and versioning as science and/or hardware progresses.
      • …provides a modular blueprint of the necessary framework for deploying other community-developed algorithms as part of the NEON data processing pipeline. Examples of expected synergies and efficiencies are the generation of higher-level data products including gap-filling, flux maps etc.
  • Q: What are some of the advanced features of eddy4R in Docker containers not currently available in other flux processors?

    • A: eddy4R in Docker containers…
      • …performs alignment and motion compensation for wind measurements.
      • …equally processes eddy-covariance data from stationary towers or moving platforms (aircraft, buoys etc.) while keeping track of position and changing source areas.
      • …provides time-frequency decomposed fluxes. This allows, for example, an order-of-magnitude improvement in temporal and spatial resolution, the unveiling of organized turbulent transport modes and more.
      • …allows multi-dimensional environmental response functions to be inferred. These permit calculating land-cover specific fluxes, rectification of spatial representativeness and more.
      • …performs regionalization at fundamentally improved space/time resolution compared to other approaches, allowing the diel cycle and even sub-hourly changes to be investigated.
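
The FAQ above mentions a single-pass workflow but does not describe how it is implemented. As a generic illustration only, and not a description of eddy4R's actual algorithm, the Python sketch below shows one well-known way to accumulate a covariance in a single pass over the data (a Welford-style online update), so that the series does not have to be read once for the means and again for the fluctuations. The class and method names are hypothetical.

```python
class OnlineCovariance:
    """Accumulates the covariance of paired samples in a single pass."""

    def __init__(self):
        self.n = 0            # number of samples seen so far
        self.mean_x = 0.0     # running mean of x
        self.mean_y = 0.0     # running mean of y
        self.comoment = 0.0   # running sum of co-deviations

    def update(self, x, y):
        """Fold one (x, y) sample into the running statistics."""
        self.n += 1
        dx = x - self.mean_x  # deviation from the previous mean of x
        self.mean_x += dx / self.n
        self.mean_y += (y - self.mean_y) / self.n
        # Pair the old-mean deviation of x with the new-mean deviation of y.
        self.comoment += dx * (y - self.mean_y)

    def covariance(self):
        """Population covariance of all samples seen so far."""
        if self.n == 0:
            raise ValueError("no samples have been added")
        return self.comoment / self.n


def covariance_single_pass(pairs):
    """Convenience wrapper: consume an iterable of (x, y) pairs once."""
    acc = OnlineCovariance()
    for x, y in pairs:
        acc.update(x, y)
    return acc.covariance()
```

Because each sample is folded in as it arrives, an accumulator of this kind can be updated while data are still streaming in, which is one reason single-pass schemes suit near-real-time processing.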