Clodius aggregate with custom assembly: TypeError: Can't broadcast #96

Closed
liz-is opened this issue Jun 24, 2019 · 3 comments

Comments

@liz-is
Contributor

liz-is commented Jun 24, 2019

Hi there,

I'm working with Drosophila data aligned to the dm6 assembly from FlyBase, which uses Ensembl-style chromosome names (i.e., no 'chr' prefix). Because negspy only includes UCSC-style assemblies, running with --assembly dm6 gives me errors like KeyError: 'X'.

So instead I'm using --chromsizes-filename to point to a file with the chrom sizes for my genome version, restricted to the main chromosomes, since my bedgraph has already been filtered to contain only those. Here's the command I'm running and the output:

clodius aggregate bedgraph test_Rep1_10kb_corrected_pc.eigenvector.bed \
--output-file test_Rep1_10kb_corrected_pc.eigenvector.hitile \
--chromosome-col 1 --from-pos-col 2 --to-pos-col 3 --value-col 5 \
--chromsizes-filename dm6_chrom_sizes_sanitized.txt  --nan-value nan --no-header
output file: test_Rep1_10kb_corrected_pc.eigenvector.hitile
assembly_size: 137547960
assembly: hg19
assembly size (max-length) 137547960
max-width 268435456
max_zoom: 18
chunk-size: 16777216
chrom-order [b'2L' b'2R' b'3L' b'3R' b'4' b'X' b'Y']
len(values): 110458336 16777216
line: X	1	120000	A	0.0	.

position: 1 progress: 0.00 elapsed: 8.87 remaining: 1220465716.46
len(data_buffers[curr_zoom]) 16777216
positions[curr_zoom]: 0
len(values): 93681120 16777216
line: X	1	120000	A	0.0	.

[some output removed]

Traceback (most recent call last):
  File "/home/research/vaquerizas/liz/test/env/bin/clodius", line 11, in <module>
    load_entry_point('clodius==0.10.8', 'console_scripts', 'clodius')()
  File "/home/research/vaquerizas/liz/test/env/lib/python3.7/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/home/research/vaquerizas/liz/test/env/lib/python3.7/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/research/vaquerizas/liz/test/env/lib/python3.7/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/research/vaquerizas/liz/test/env/lib/python3.7/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/research/vaquerizas/liz/test/env/lib/python3.7/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/research/vaquerizas/liz/test/env/lib/python3.7/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/research/vaquerizas/liz/test/env/lib/python3.7/site-packages/clodius/cli/aggregate.py", line 1322, in bedgraph
    chromsizes_filename, zoom_step)
  File "/home/research/vaquerizas/liz/test/env/lib/python3.7/site-packages/clodius/cli/aggregate.py", line 938, in _bedgraph
    values[:chunk_size], nan_values[:chunk_size]
  File "/home/research/vaquerizas/liz/test/env/lib/python3.7/site-packages/clodius/cli/aggregate.py", line 842, in add_values_to_data_buffers
    dsets[curr_zoom][curr_pos:curr_pos+chunk_size] = curr_chunk
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/home/research/vaquerizas/liz/test/env/lib/python3.7/site-packages/h5py/_hl/dataset.py", line 707, in __setitem__
    for fspace in selection.broadcast(mshape):
  File "/home/research/vaquerizas/liz/test/env/lib/python3.7/site-packages/h5py/_hl/selections.py", line 299, in broadcast
    raise TypeError("Can't broadcast %s -> %s" % (target_shape, self.mshape))
TypeError: Can't broadcast (16777216,) -> (3330232,)

Any suggestions would be appreciated! I was wondering whether this is also related to #87?
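
One way to read the shapes in the final error, offered as an observation about the numbers in the log rather than a confirmed diagnosis: the reported assembly size is not a multiple of the chunk size, and the remainder matches the smaller shape in the TypeError. The variable names below are illustrative and are not taken from the clodius source.

```python
# Values copied from the log output above.
assembly_size = 137547960
chunk_size = 16777216

# Number of complete chunks that fit, and how many positions are left over.
full_chunks, remainder = divmod(assembly_size, chunk_size)

print(full_chunks)  # 8
print(remainder)    # 3330232, the smaller shape in the TypeError
```

If the destination dataset is only as long as the assembly (an assumption here), the slice starting at the ninth chunk boundary would have room for 3330232 values, which would be consistent with a full 16777216-value chunk failing to fit.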

@pkerpedjiev
Member

Hey, it sounds like you're doing everything right. Would you mind trying to convert the file to a bigWig and ingesting that instead?

https://docs.higlass.io/data_preparation.html#creating-bigwig-files

We need to either deprecate the clodius aggregate bedgraph functionality or change it to just output bigWig files.
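
For the bigWig route suggested above, the input first has to be reduced to a four-column bedGraph (chromosome, start, end, value) sorted by chromosome and start. A minimal sketch of that step follows, assuming tab-separated input with the value in column 5 as in the original command. The function names are hypothetical, every example row except the first is made up, and dropping NaN rows is an assumption of this sketch rather than a documented requirement.

```python
import math


def to_bedgraph_rows(lines, value_col=5):
    """Reduce BED-like lines to sorted (chrom, start, end, value) rows.

    value_col is 1-based, matching the --value-col option used above.
    Rows whose value is NaN are dropped (an assumption of this sketch).
    """
    rows = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < value_col:
            continue  # skip blank or truncated lines
        value = float(fields[value_col - 1])
        if math.isnan(value):
            continue
        rows.append((fields[0], int(fields[1]), int(fields[2]), value))
    # Sort by chromosome name (code-point order), then by start position.
    rows.sort(key=lambda r: (r[0], r[1]))
    return rows


def format_bedgraph(rows):
    """Render rows as tab-separated bedGraph text."""
    return "".join(f"{c}\t{s}\t{e}\t{v}\n" for c, s, e, v in rows)


# First row is taken from the log above; the other two are invented.
example = [
    "X\t1\t120000\tA\t0.0\t.\n",
    "2L\t0\t10000\tA\tnan\t.\n",
    "2L\t10000\t20000\tB\t0.5\t.\n",
]
bedgraph_text = format_bedgraph(to_bedgraph_rows(example))
print(bedgraph_text, end="")
```

The resulting text could be written to a file and passed, together with the same chrom sizes file, to a bedGraph-to-bigWig converter as described in the linked documentation.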

@liz-is
Contributor Author

liz-is commented Jun 25, 2019

Oh, I didn't realise it was possible to ingest bigWig files directly! Is that new, or did I just completely miss it? I'll give that a try then. It's also nice not to have to create the extra file :)

@liz-is
Contributor Author

liz-is commented Jun 25, 2019

Ingesting bigWig files directly works well as long as I also provide an appropriate chrom.sizes file, so I'll close this.
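
What counts as an "appropriate" chrom.sizes file here is at least one whose chromosome names match the data, since the KeyError: 'X' earlier in the thread came from a naming mismatch. A small sketch of such a check, with hypothetical helper names and made-up sizes:

```python
def read_chromsizes(lines):
    """Parse two-column 'name<TAB>size' lines into a dict."""
    sizes = {}
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:
            sizes[fields[0]] = int(fields[1])
    return sizes


def missing_chromosomes(data_chroms, chromsizes):
    """Return names used by the data that the chrom sizes file lacks."""
    return sorted(set(data_chroms) - set(chromsizes))


# Made-up sizes for illustration; these are not real dm6 chromosome lengths.
ensembl_style = read_chromsizes(["2L\t1000\n", "X\t2000\n"])
ucsc_style = read_chromsizes(["chr2L\t1000\n", "chrX\t2000\n"])

data_chroms = ["2L", "X", "X"]
print(missing_chromosomes(data_chroms, ensembl_style))  # []
print(missing_chromosomes(data_chroms, ucsc_style))     # ['2L', 'X']
```

An empty result means every chromosome in the data has an entry in the chrom sizes file; any names listed would need to be renamed or added before ingesting.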
