ACES data QA guide

Ashley Barnes edited this page Jun 21, 2023 · 21 revisions

Contents

Getting started

Weblog review

Manual image inspection

Updating the GitHub issues

Updating clean parameter JSON files

Uploading updated data

Getting started

  • All ACES Execution Blocks (EBs) can be found here: https://github.com/ACES-CMZ/reduction_ACES/projects/1
  • EBs that are in the Delivered Execution Blocks column are ready for QA.
  • Ideally, at least two people should perform independent QA on each EB. If you see an EB with fewer than two user icons in the lower right, please consider assigning yourself to it: click on the EB, find the Assignees section, click the gear icon next to it, and select yourself. If you do not see this option, request to join the team at https://github.com/orgs/ACES-CMZ/teams/data-reduction-wp1. Once approved, you will be able to assign yourself (or others).

Weblog review

Our datasets are automatically processed and imaged by the ALMA pipeline, which produces a detailed weblog that can be reviewed to check what the pipeline did, and to look for any potential issues that may or may not have been flagged by the pipeline.

Each EB issue has a check-list for tracking purposes. On this list you should see a hyperlink that will take you to the weblog for that EB.

Task warnings

The easiest place to begin the weblog review is in the "By Task" tab at the top of the page. This will show you a list of the tasks executed by the pipeline, along with a QA score and any relevant warnings for each task. See the example below:

The main things to look out for here are warnings and low QA scores. In the above example, there are two amber warnings and one red. Amber warnings aren't always serious, but you should look into the task output to check. In this particular example, the two amber warnings are both related to an issue with the phase calibrator that is being discussed here: https://github.com/ACES-CMZ/reduction_ACES/issues/6#issuecomment-1007434917.

The red warning in the above is a severe one: QA tclean stopped to prevent divergence (stop code 5). Field: Sgr_A_star SPW: 24. This is telling us that the cleaning started to diverge during the imaging process for the data cube for spectral window 24, and it will therefore need to be re-imaged. The peak intensity map below shows some weird structure that suggests divergence.

For a good example of how to diagnose and address divergence, see the discussion and figures in this issue: https://github.com/ACES-CMZ/reduction_ACES/issues/8#issuecomment-969666330.

Pipeline images

There are many figures and plots in the weblog, and it's good practice to at least skim through most of them. For our purposes, the most important ones to review are:

  • The calibrator images, which should look well cleaned with a shape similar to that of the synthesised beam [21. hif_makeimages (cals)]
  • The find-continuum results. Continuum-only channel selection is shown in the "Average spectrum" plots. Look out for any line emission that may be contaminating these selected channels. [31. hif_findcont]
  • The full continuum image. Click "View other QA images" for more images. The continuum should look well cleaned, with no signs of divergence. Check that the residual and clean mask look acceptable.
  • The image cubes for each spectral window (SPW). For each SPW, click "View other QA images" for more images. [38. hif_makeimages (cube)]
    • The image should look well cleaned, with no signs of divergence.
    • The residual should not have much significant structure/signal.
    • The line-free moment 0 & 8 images should look fairly uniform and free of structure/signal. If the moments do have signal, this is probably a sign that the continuum channel selection contains line emission and should be revised. This will typically be accompanied by the warning: ! QA MOM8 FC image for field Sgr_A_star spw X with a peak SNR of Y indicates that there may be residual line emission in the findcont channels.
    • Check the "Spectra" plots for each SPW. Look out for spurious signals, and check whether any emission extends beyond the edge of the SPW. The baseline should be flat and close to zero.
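When an EB has many warnings, it can help to collect the affected SPWs before writing up the review. The sketch below pulls the field and SPW out of the two warning messages quoted in this guide. The regular expressions are assumptions based only on those two examples, and the helper name summarise_warnings is hypothetical; the pipeline's wording may differ between versions, so check the result against the weblog itself.

```python
import re

# Patterns modelled on the two warnings quoted in this guide (assumed wording).
DIVERGENCE_RE = re.compile(
    r"tclean stopped to prevent divergence \(stop code (?P<code>\d+)\)\.?\s*"
    r"Field:\s*(?P<field>\S+)\s+SPW:\s*(?P<spw>\d+)"
)
MOM8_RE = re.compile(
    r"MOM8 FC image for field (?P<field>\S+) spw (?P<spw>\d+) "
    r"with a peak SNR of (?P<snr>[0-9.]+)"
)


def summarise_warnings(text):
    """Return (kind, field, spw) tuples for warnings found in pasted weblog text."""
    found = []
    for match in DIVERGENCE_RE.finditer(text):
        found.append(("divergence", match["field"], int(match["spw"])))
    for match in MOM8_RE.finditer(text):
        found.append(("findcont line contamination", match["field"], int(match["spw"])))
    return found
```

Paste the warning text from the "By Task" tab into a string and pass it to summarise_warnings; each tuple then corresponds to one item worth noting in the EB issue.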

Manual image inspection

While the weblog images are informative, it can also be useful to download the full images to inspect them in your viewer of choice. This is particularly useful for the SPW cubes, as the weblog only displays integrated and averaged maps.

This step is crucial if there are any warnings or apparent issues with the images. For example, if cleaning started to diverge for a SPW, you should step through the cube and make a note of the affected channel(s).
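As a rough aid for finding affected channels, the sketch below shortlists channels whose peak intensity is an outlier compared with the rest of the cube. It assumes you have already extracted one peak value per channel (for example from your viewer or a statistics task); the function name flag_suspect_channels and the 5-sigma default are hypothetical choices, not part of the ACES workflow. Bright genuine line emission will also be flagged, so this only narrows down which channels to inspect by eye.

```python
from statistics import median


def flag_suspect_channels(peaks, nsigma=5.0):
    """Return indices of channels whose peak is an outlier among all channels.

    `peaks` is a list of per-channel peak intensities. The scatter is estimated
    from the median absolute deviation (MAD), which is robust to a few bad channels.
    """
    centre = median(peaks)
    mad = median(abs(p - centre) for p in peaks)
    if mad == 0:
        return []  # no measurable scatter, so nothing can be called an outlier
    sigma = 1.4826 * mad  # converts a MAD to a Gaussian-equivalent sigma
    return [i for i, p in enumerate(peaks) if abs(p - centre) > nsigma * sigma]
```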

To download the pipeline image products locally, follow the steps outlined here: https://github.com/ACES-CMZ/reduction_ACES/wiki/Using-Globus-to-transfer-ACES-data.

For example, if you want to download the image products for the execution block uid://A001/X15a0/Xae, you would go to the following directory on ACES-HPG and select the relevant files to transfer to your local endpoint:

/rawdata/2021.1.00172.L/science_goal.uid___A001_X1590_X30a8/group.uid___A001_X1590_X30a9/member.uid___A001_X15a0_Xae/product/
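The directory names above are the ALMA uids with ":" and "/" replaced by underscores. A minimal sketch of that mapping follows; the helper names uid_to_dirname and product_dir are hypothetical, and the root directory is taken from the example path above.

```python
import posixpath


def uid_to_dirname(uid):
    """Convert an ALMA uid such as 'uid://A001/X15a0/Xae' to its on-disk form."""
    return uid.replace(":", "_").replace("/", "_")


def product_dir(root, science_goal_uid, group_uid, member_uid):
    """Build the product/ directory path for one member (execution block)."""
    return posixpath.join(
        root,
        "science_goal." + uid_to_dirname(science_goal_uid),
        "group." + uid_to_dirname(group_uid),
        "member." + uid_to_dirname(member_uid),
        "product",
    ) + "/"


print(product_dir(
    "/rawdata/2021.1.00172.L",
    "uid://A001/X1590/X30a8",
    "uid://A001/X1590/X30a9",
    "uid://A001/X15a0/Xae",
))
```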

Updating the GitHub issues

Each execution block has its own issue here: https://github.com/ACES-CMZ/reduction_ACES/projects/1

Please make sure to update the relevant issue with your summary of the weblog/image review. There is a checklist at the top of each issue, so please check any tasks that have been completed, and feel free to add any new tasks, e.g. "SPW X needs to be re-imaged to avoid divergence".

You should add comments to the issue, making note of any warnings and issues that you have spotted, and include any images/screenshots that may be relevant. If you think that some of the data needs to be re-calibrated or re-imaged, please make a note of this and add a task to the checklist at the top of the issue.

Manual reclean

Please use CASA version casa-6.4.1-12-pipeline-2022.2.0.68 when re-imaging in order to maintain consistency.

Where to find the files on Globus

Every 12m and 7m scheduling block (SB) has tclean scripts for each SPW, for example:

/data/2021.1.00172.L/science_goal.uid___A001_X1590_X30a8/group.uid___A001_X1590_X30a9/member.uid___A001_X15a0_X130/calibrated/working/tclean_cube_pars_Sgr_A_st_y_03_TM1_spw33.py

You can simply re-run this script, changing parameters as necessary.

The measurement sets needed for cleaning can be found in the /calibrated/working/ directory for each SB, e.g.:

/data/2021.1.00172.L/science_goal.uid___A001_X1590_X30a8/group.uid___A001_X1590_X30a9/member.uid___A001_X15a0_X130/calibrated/working/uid___A002_Xf53eeb_X3071_target.ms

Be sure to grab the *_target.ms files, as these contain the continuum-subtracted data in the CORRECTED data column. The file naming depends on the pipeline version, as described in the note below.

Note:

• A calibrated MS for each ASDM with a name like uid___A00X_XXXX_XXX.ms. This ms includes both calibrator and science data and all spectral windows, with the raw data in the DATA column, and the calibrated continuum + line data in the CORRECTED column.

• The science-target only MS (uid___A00X_XXXX_XXX_target.ms), now with the calibrated continuum + line data in the DATA column, and the calibrated continuum subtracted data in the CORRECTED column.

From Section 4.3.3 of the ALMA Science Pipeline User's Guide: https://almascience.nrao.edu/processing/documents-and-tools/alma-science-pipeline-users-guide-casa-4-7.2

This is correct for pipeline (PL) version 6.2.1-7. For 6.4.1-12, however, the naming conventions are different:

• In preparation for self-calibration capability in a future release, hif_mstransform creates a "*_targets.ms" that includes the science target data with no continuum subtracted, and now also a "*_targets_line.ms" measurement set that has the continuum subtracted.
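Because the file name depends on the pipeline version, a small check of the working directory can confirm which convention a given SB follows. The sketch below looks for the two suffixes named above; the function name find_line_ms is hypothetical. It does not tell you which data column holds the continuum-subtracted data, so confirm that before imaging.

```python
from pathlib import Path

# Suffixes named in this guide: newer convention first, then the older one.
LINE_MS_SUFFIXES = ("_targets_line.ms", "_target.ms")


def find_line_ms(working_dir):
    """Return (suffix, paths) for the continuum-subtracted MSs in `working_dir`.

    Returns (None, []) when neither naming convention is present.
    """
    working_dir = Path(working_dir)
    for suffix in LINE_MS_SUFFIXES:
        matches = sorted(working_dir.glob("*" + suffix))
        if matches:
            return suffix, matches
    return None, []
```

Call find_line_ms on your local copy of the calibrated/working/ directory; the returned suffix shows which of the two conventions applies.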

Updating clean parameter JSON files

How to update JSON files on GitHub via pull request

When you re-image data, the clean commands you used must be recorded in the override_tclean_commands.json file (aces/pipeline_scripts/override_tclean_commands.json in the repository) so that they are preserved.

Edits to this file should be submitted as a pull request, following the steps below:

  1. Fork the ACES-CMZ/reduction_ACES repository by clicking on the "Fork" button at the top right corner of the repository page. This creates a copy of the repository in your GitHub account.

  2. Clone your forked repository to your local machine using Git. Open a terminal or Git Bash and run the following command, replacing your-username with your GitHub username:

     git clone https://github.com/your-username/reduction_ACES.git

  3. Navigate to the cloned repository:

     cd reduction_ACES

  4. Create a new branch to make your changes, replacing update-clean-parameters with a descriptive branch name:

     git checkout -b update-clean-parameters

  5. Open the override_tclean_commands.json file in a text editor of your choice.

  6. Make the necessary updates to the clean commands, such as undoing size mitigation, adjusting cyclefactor, or modifying channel spacing.

  7. Save the changes to the file.

  8. Add the modified file to the staging area:

     git add aces/pipeline_scripts/override_tclean_commands.json

  9. Commit the changes with a descriptive commit message:

     git commit -m "Update clean parameters for re-imaging"

  10. Push the changes to your forked repository:

     git push origin update-clean-parameters

  11. Visit your forked repository on GitHub and switch to the update-clean-parameters branch.

  12. Click on the "New pull request" button next to the branch name.

  13. Review the changes in the pull request and provide any additional information or context if needed.

  14. Click on the "Create pull request" button to submit the pull request.

  15. Wait for the repository maintainers (Adam) to review your changes. They may ask for further modifications or provide feedback.

  16. Once the pull request is approved and merged, the clean parameter JSON files will be updated with your changes.

Note: It's always a good practice to keep your forked repository in sync with the original repository to avoid conflicts. You can refer to the GitHub documentation on how to sync a forked repository for more information.

Note: Run a syntax check on this file before submitting. From the aces/pipeline_scripts/ directory you can use the command-line tool jq, e.g. jq . override_tclean_commands.json, or load the file in Python with

import json
with open('override_tclean_commands.json', 'r') as fh:
    data = json.load(fh)
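One thing a plain json.load will not catch is a repeated key: Python's json module silently keeps only the last value, so an accidentally duplicated entry would hide the earlier one. A slightly stricter variant of the check above, using the standard object_pairs_hook argument, might look like this (the function names are hypothetical):

```python
import json


def reject_duplicate_keys(pairs):
    """object_pairs_hook that raises if a JSON object repeats a key."""
    result = {}
    for key, value in pairs:
        if key in result:
            raise ValueError(f"duplicate key in JSON object: {key!r}")
        result[key] = value
    return result


def load_strict(path):
    """Load a JSON file, failing on syntax errors and on duplicated keys."""
    with open(path, "r") as fh:
        return json.load(fh, object_pairs_hook=reject_duplicate_keys)
```

Calling load_strict('override_tclean_commands.json') raises json.JSONDecodeError for a syntax error and ValueError for a duplicated key; if it returns without error, the file passed both checks.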

Common updates needed include:

  • Undoing size mitigation by adding spw33 to the list - this should be handled automatically, but sometimes updates may be needed
  • Increasing cyclefactor or making other small tweaks to fix divergence issues. Usually you can increase it from 1 to 1.5, but sometimes higher values (~2-4) are required. Deciding on the right parameter is more art than science.
  • Undoing size mitigation by making the channel spacing 1x instead of downsampled-by-2x. This requires modifying the nchan and width parameters, and the threshold should probably be increased by a factor of ~sqrt(2), since halving the channel width raises the per-channel noise by about that factor.
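The arithmetic for the last item can be sketched as follows. This assumes the mitigated cube was binned by exactly a factor of 2, that width and threshold are handled as plain numbers (re-attach the units used in the JSON file yourself), and that the noise scales as one over the square root of the channel width. The function name and the input values are hypothetical.

```python
import math


def undo_channel_binning(nchan, width, threshold, factor=2):
    """Return (nchan, width, threshold) for native channel spacing.

    Narrower channels mean more of them over the same frequency range, and a
    per-channel noise that is higher by sqrt(factor), so the clean threshold
    is scaled up by the same amount.
    """
    new_nchan = nchan * factor
    new_width = width / factor
    new_threshold = threshold * math.sqrt(factor)
    return new_nchan, new_width, new_threshold


# Hypothetical mitigated values: 960 channels, width 2 (in channel units), threshold 0.01
print(undo_channel_binning(960, 2.0, 0.01))
```

Treat the scaled threshold as a starting point only; compare it against the noise you measure in the re-imaged cube.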

Uploading updated data

If you have a completed clean cube that you've imaged locally, you can upload it to Globus. If you do, note where you uploaded it in the appropriate GitHub issue.

Then, a UF person (Adam or Nazar) will rename and/or move the file to the reclean/ directory under the member.../ directory and will symlink the files using this command:

for fn in ../../reclean/*; do ln -s "$fn" "$(basename "${fn/.reclean/}")"; done

The idea is that the reclean data should never be deleted (since they were human-made) even if the rest of the folders need to be removed for pipeline re-running in the future.
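For reference, the symlink loop above does the following: each entry in reclean/ gets a link whose name is the original name with the first ".reclean" removed. A Python sketch of the same logic is given below. It assumes the links are created in a directory two levels below member.../ (implied by the ../../reclean/ path, but not stated explicitly), and the function name link_recleans is hypothetical.

```python
import os
from pathlib import Path


def link_recleans(reclean_dir, dest_dir):
    """Symlink every entry of `reclean_dir` into `dest_dir`, dropping '.reclean'."""
    reclean_dir, dest_dir = Path(reclean_dir), Path(dest_dir)
    created = []
    for source in sorted(reclean_dir.iterdir()):
        if source.name.startswith("."):
            continue  # the shell glob * also skips hidden entries
        # Mirror bash's ${fn/.reclean/}: remove only the first occurrence.
        link = dest_dir / source.name.replace(".reclean", "", 1)
        if link.exists() or link.is_symlink():
            continue  # never overwrite an existing product or link
        # Relative target, so the links survive moving the whole member directory.
        target = os.path.relpath(source, start=dest_dir)
        os.symlink(target, link)
        created.append(link)
    return created
```

Because only links are created in the destination directory, deleting that directory for a pipeline re-run leaves the human-made files in reclean/ untouched.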