
error - registration:moving spots #29

Closed
FangmingXie opened this issue Apr 12, 2023 · 9 comments
Labels
bug Something isn't working

Comments

@FangmingXie (Contributor) commented Apr 12, 2023

Bug report

Description of the problem

I am trying to run the pipeline on EASI-FISH data we collected ourselves, using AWS S3. It errored out at registration:moving_spots. Do you have a clue where it went wrong?

In addition to the parameters and execution log below, I can also provide the full log file (nf-xxxx.log); it is too large to paste here, but please let me know if you need it!

Log file(s)

params {
   lsf_opts = ''
   runtime_opts = ''
   singularity_cache_dir = '/root/.singularity_cache'
   shared_work_dir = '/fusion/s3/lt185/work'
   data_dir = '/fusion/s3/easi-fish-test1/lt185/'
   segmentation_model_dir = '/fusion/s3/easi-fish-test1/lt185/model/starfinity'
   publish_dir = '/fusion/s3/easi-fish-test1/lt185_r1-r5_outputs/'
   acq_names = 'lt185_r1,lt185_r2,lt185_r3,lt185_r4,lt185_r5,'
   ref_acq = 'lt185_r2'
   channels = 'c0,c1,c2,c3,c4'
   dapi_channel = 'c3'
   bleed_channel = 'c1'
   deform_memory = '50 G'
   registration_stitch_memory = '50 G'
   spots_memory = '20 G'
   coarse_spots_memory = '20 G'
   aff_scale_transform_memory = '45 G'
   def_scale = 's2'
   def_scale_transform_memory = '270 G'
   registration_transform_memory = '120 G'
   segmentation_cpus = 9
   segmentation_memory = '135 G'
   segmentation_scale = 's3'
   spot_extraction_scale = 's1'
   use_rsfish = true
   rsfish_min = 0
   rsfish_max = 1500
   rsfish_anisotropy = 0.6
   rsfish_sigma = 1.16
   rsfish_threshold = 0.0017
   rsfish_workers = 16
   rsfish_worker_cores = 8
   rsfish_gb_per_core = 15
   skip = 'stitching'
   singularity_user = 'ec2_user'
   envMap = [PATH:'/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/google-cloud-sdk/bin', AWS_BATCH_JQ_NAME:'[secret]', TOWER_ACCESS_TOKEN:'[secret]', JAVA_HOME:'/usr/lib/jvm/java-17-amazon-corretto', AWS_EXECUTION_ENV:'[secret]', NXF_OUT_FILE:'nf-44Ua9qeYLuxFw0.txt', ECS_CONTAINER_METADATA_URI_V4:'http://169.254.170.2/v4/7fe768c6-0f95-4a58-89bc-1b503ca91735', ECS_CONTAINER_METADATA_URI:'http://169.254.170.2/v3/7fe768c6-0f95-4a58-89bc-1b503ca91735', NXF_UUID:'8fff0ebf-849e-4247-a210-99576c5e0703', NXF_TML_FILE:'timeline-44Ua9qeYLuxFw0.html', LANG:'C.UTF-8', NXF_HOME:'/.nextflow', NXF_DEFAULT_DSL:'1', ECS_AGENT_URI:'http://169.254.170.2/api/7fe768c6-0f95-4a58-89bc-1b503ca91735', NXF_ORG:'nextflow-io', NXF_ANSI_LOG:'false', NXF_PLUGINS_DEFAULT:'nf-tower,nf-amazon,xpack-amzn', NXF_VER:'23.04.0', CAPSULE_CACHE_DIR:'/.nextflow/capsule', NXF_SCM_FILE:'https://api.tower.nf/ephemeral/gpub96CJKjG6Hj_zy6oAhQ', NXF_JVM_ARGS:'-XX:InitialRAMPercentage=40 -XX:MaxRAMPercentage=75', PWD:'/', NXF_IGNORE_RESUME_HISTORY:'true', AWS_BATCH_JOB_ID:'[secret]', NXF_WORK:'s3://lt185/work', TOWER_WORKSPACE_ID:'215439446535112', AWS_BATCH_JOB_ATTEMPT:'[secret]', NXF_CLI:'/usr/local/bin/nextflow run https://github.com/JaneliaSciComp/multifish -name lt185_r1-r5_2 -params-file https://api.tower.nf/ephemeral/7h-1d2nnNwsD06Xh2QXT-A.json -with-tower -profile tower', NXF_PACK:'one', TOWER_WORKFLOW_ID:'44Ua9qeYLuxFw0', JAVA_CMD:'/usr/lib/jvm/java-17-amazon-corretto/bin/java', NXF_XPACK_LICENSE:'eyJ2ZXIiOjF9LnsiaWQiOiI0cUhvZ2d6ZktqaFNNb3FXRzJUTkdYIiwicHJvZCI6InhwYWNrLWdvb2dsZSx4cGFjay1hbXpuIiwiYWN0IjoiMjAyMS0wNy0yOVQxNToxOTo0MloiLCJleHAiOiIyMDIzLTExLTAxVDAwOjAwOjAwWiJ9LjExMDQ1N2RlMjMwNWEzYWI1YWRkZWQ5MGNlOTM4Mzc3OTEzYzY3Mzg=', TOWER_REFRESH_TOKEN:'[secret]', HOSTNAME:'ip-172-31-24-95.us-west-1.compute.internal', 
NXF_PRERUN_BASE64:'ZXhwb3J0IFRPV0VSX0FDQ0VTU19UT0tFTj1leUpoYkdjaU9pSklVekkxTmlKOS5leUp6ZFdJaU9pSXlPVFkwSWl3aWJtSm1Jam94TmpneE1qYzJNemszTENKeWIyeGxjeUk2V3lKMWMyVnlJbDBzSW1semN5STZJblJ2ZDJWeUxXRndjQ0lzSW1WNGNDSTZNVFk0TVRJM09UazVOeXdpYVdGMElqb3hOamd4TWpjMk16azNmUS5tdkI3Q2poS3BHSGViTmxpRlEtY3JKQWd5aU5uXzV1dVNmc0xiVUF3Q25NCmV4cG9ydCBUT1dFUl9SRUZSRVNIX1RPS0VOPWV5SmhiR2NpT2lKSVV6STFOaUo5Lk5EWTNORFU1WVRNdFlUZG1aQzAwT1RFNExXSTFPVEV0TldFM1pUQmtaVGM0TW1JeS5ENGF4bDRHblVodktKbkt6eXBsa0loREJ5aHR6R01kT1pKVmNqeUdMTzBVCmV4cG9ydCBOWEZfU0NNX0ZJTEU9aHR0cHM6Ly9hcGkudG93ZXIubmYvZXBoZW1lcmFsL2dwdWI5NkNKS2pHNkhqX3p5Nm9BaFEKZXhwb3J0IE5YRl9YUEFDS19MSUNFTlNFPSdodHRwczovL2FwaS50b3dlci5uZi9lcGhlbWVyYWwvdGp2US1UVENMcHVUOFVvRG5nRk1BdycK', AWS_BATCH_CE_NAME:'[secret]', NXF_ENABLE_SECRETS:'[secret]', NXF_FUSION_BUCKETS:'s3://lt185,s3://lt185/work,s3://easi-fish-test1', NXF_LOG_FILE:'nf-44Ua9qeYLuxFw0.log', SHLVL:'1', HOME:'/root']
}

docker {
   temp = 'auto'
   runOptions = ''
   enabled = true
}

singularity {
   autoMounts = true
   cacheDir = '/root/.singularity_cache'
   runOptions = '--nv -e --env PROCESS_DIR=$PROCESS_DIR --env USER=ec2_user '
   enabled = false
}

process {
   ext {
      sparkLocation = '/spark'
   }
   beforeScript = 'export PROCESS_DIR=`pwd`'
   withLabel:small {
      cpus = 1
      memory = '1 GB'
   }
   withLabel:withGPU {
      containerOptions = ''
   }
   executor = 'awsbatch'
   queue = 'TowerForge-2ilZprh8QNHueQEpuVTtz5'
}

manifest {
   name = 'JaneliaSciComp/multifish'
   author = 'Janelia MultiFISH Team Project'
   homePage = 'https://github.com/JaneliaSciComp/multifish'
   description = 'Analysis pipeline for EASI-FISH (Expansion-Assisted Iterative Fluorescence In Situ Hybridization)'
   mainScript = 'main.nf'
   nextflowVersion = '>=20.10.0'
   version = '1.1.0'
}

timeline {
   enabled = true
   file = 'timeline-44Ua9qeYLuxFw0.html'
}

aws {
   region = 'us-west-1'
   client {
      uploadChunkSize = 10485760
   }
   batch {
      volumes = '/fusion/s3/lt185,/fusion/s3/easi-fish-test1,/fusion/s3/lt185/work'
      cliPath = '/home/ec2-user/miniconda/bin/aws'
      executionRole = 'arn:aws:iam::688180692228:role/TowerForge-2ilZprh8QNHueQEpuVTtz5-ExecutionRole'
   }
}

workDir = 's3://lt185/work'
runName = 'lt185_r1-r5_2'

tower {
   enabled = true
   endpoint = 'https://api.tower.nf'
}

[77/c77d33] Submitted process > registration:moving_spots (1433)
[69/0b313f] Submitted process > registration:moving_spots (1434)
ERROR ~ Error executing process > 'registration:moving_spots (481)'
Caused by:
  Essential container in task exited
Command executed:
  /app/scripts/waitforpaths.sh /fusion/s3/lt185/work/outputs/lt185_r5/registration/lt185_r5-to-lt185_r2/aff/ransac_affine/c3/s3
  /entrypoint.sh spots /fusion/s3/lt185/work/outputs/lt185_r5/registration/lt185_r5-to-lt185_r2/tiles/0/coords.txt /fusion/s3/lt185/work/outputs/lt185_r5/registration/lt185_r5-to-lt185_r2/aff/ransac_affine /c3/s3 /fusion/s3/lt185/work/outputs/lt185_r5/registration/lt185_r5-to-lt185_r2/tiles/0/moving_spots.pkl 8 2000
Command exit status:
  1
Command output:
  Checking for /fusion/s3/lt185/work/outputs/lt185_r5/registration/lt185_r5-to-lt185_r2/aff/ransac_affine/c3/s3
Command error:
  Checking for /fusion/s3/lt185/work/outputs/lt185_r5/registration/lt185_r5-to-lt185_r2/aff/ransac_affine/c3/s3
  Traceback (most recent call last):
    File "/app/bigstream/spots.py", line 148, in
      pruned_spots = prune_blobs(sortedSpots, overlap, min_distance)[:,:-2].astype(np.int)
  IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed
Work dir:
  /fusion/s3/lt185/work/c2/c65770aa401bddd955275fcbabbf6d
Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`
-- Check 'nf-44Ua9qeYLuxFw0.log' file for details
[40/e14ee5] Submitted process > registration:moving_spots (1435)
WARN: Killing running tasks (1000)
WARN: Tower request field `workflow.errorMessage` exceeds expected size | offending value: `Checking for /fusion/s3/lt185/work/outputs/lt185_r5/registration/lt185_r5-to-lt185_r2/aff/ransac_affine/c3/s3
Traceback (most recent call last):
  File "/app/bigstream/spots.py", line 148, in
    pruned_spots = prune_blobs(sortedSpots, overlap, min_distance)[:,:-2].astype(np.int)
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed`, size: 372 (max: 255)
[AWS BATCH] Waiting jobs reaper to complete (613 jobs to be terminated)
Saving cache: .nextflow/cache/8fff0ebf-849e-4247-a210-99576c5e0703 => /fusion/s3/lt185/work/.nextflow/cache/8fff0ebf-849e-4247-a210-99576c5e0703

@FangmingXie added the bug label on Apr 12, 2023
@krokicki (Member)

Hi @FangmingXie, I think this is the same issue that Liming ran into. Unfortunately, the software we're using here (BigStream1) is no longer being supported. We're in the process of switching to BigStream2, and will update you when it's ready.

@FangmingXie (Contributor, Author)

Thanks!

@wangyuhan01 (Collaborator) commented May 15, 2023

Same issue as reported here; we will report back once BigStream2 is implemented.

@FangmingXie (Contributor, Author)

Thanks @wangyuhan01 for your comment! Could you elaborate a bit on where you think the issue is? Is it an issue of big sample size? If so, how large of a sample is the limit for the current BigStream? We would like to plan our experiments accordingly. Thank you so much for your help!

@wangyuhan01 (Collaborator) commented May 15, 2023

This is caused by a lack of overlap in some image "tiles" between the fixed and moving imaging rounds. In those tiles, no moving-spots .pkl file is returned, which causes the error "IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed" in the tile-wise affine registration step. I created a pull request here to address this, but I am hopeful that BigStream2 will be the ultimate solution.
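The IndexError in the traceback can be reproduced with a minimal sketch: when a tile yields no detected spots, the spots array is empty and therefore 1-D, and the 2-D slice used on it fails. (The variable name `sorted_spots` mirrors the traceback; the array contents here are made up for illustration.)

```python
import numpy as np

# An empty detection result: np.array([]) has shape (0,), i.e. it is 1-D.
sorted_spots = np.array([])

try:
    # A 2-D slice like the one in bigstream/spots.py assumes an (N, k) array.
    pruned = sorted_spots[:, :-2]
except IndexError as e:
    print(e)  # too many indices for array: array is 1-dimensional, but 2 were indexed
```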

@FangmingXie (Contributor, Author)

Thanks so much @wangyuhan01! This is very informative. Here is my understanding of your comments:

  • The problem isn't really just due to the size or scale of the image volume, although the larger the tissue size, the more likely it is to have a situation like you described. Rather, it is quite specific to the particular dataset.
  • You have solved this issue on your end, by tweaking the code yourself.

Did I get it right? Thank you so much!

@wangyuhan01 (Collaborator)

- The problem isn't really just due to the size or scale of the image volume, although the larger the tissue size, the more likely it is to have a situation like you described. Rather, it is quite specific to the particular dataset.
It depends on the amount of overlap between fixed and moving image volumes. For now, if you can get the non-overlapping region between the two image rounds smaller than ~100µm in each dimension, it should work.

- You have solved this issue on your end, by tweaking the code yourself.
The tweak is to write an identity matrix when the 'moving_spots.pkl' file does not exist. However, this has not been incorporated into the pipeline yet because we want to upgrade to BigStream2, which will be more powerful and will likely solve a few other issues we have had as well.
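A guard along the lines described might look like the sketch below. This is not the actual pull request; the function name, the file paths, and the assumption of a 4x4 affine are all illustrative.

```python
import os
import numpy as np

def load_tile_affine(moving_spots_path, transform_path):
    """Fall back to a 4x4 identity affine when a tile produced no
    moving-spots file (e.g. no overlap between fixed and moving rounds)."""
    if not os.path.exists(moving_spots_path):
        # No spots were detected for this tile, so no affine can be
        # estimated; write an identity transform to leave the tile unmoved.
        identity = np.eye(4)
        np.savetxt(transform_path, identity)
        return identity
    # ... otherwise run the normal spot-based affine estimation (omitted) ...
```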

@FangmingXie (Contributor, Author)

Thanks @wangyuhan01! This is very helpful. We will try your changes first and see. Meanwhile, @krokicki, we will be excited to use BigStream2 when it is ready.

@krokicki (Member)

BigStream2 has been integrated (see here) so I'm closing this issue.
