training reproduce error #10

Open
mzc2113391 opened this issue Jan 25, 2024 · 4 comments
Comments

@mzc2113391

Hello, I tried to reproduce the Orca training but ran into some errors.

First, when I run python misc/make_genome_memmap.py, I get this error: AttributeError: 'MemmapGenome' object has no attribute 'initialized'. When I manually add self.initialized to the MemmapGenome class, this step works.

Then, when I run python train_h1esc_a.py, I get a new error:

File "/backup1/orca/orca-main/train/.. /selene_utils2.py", line 1135, in sample
targets = np.zeros((batch_size, *self.target.shape))
AttributeError: 'GenomicFeatures' object has no attribute 'shape'

Could you help me with this? Thanks.

@jzthree
Contributor

jzthree commented Jan 25, 2024

You should install Selene with the provided commands:

git clone https://github.com/kathyxchen/selene.git
cd selene
git checkout custom_target_support
python setup.py build_ext --inplace
python setup.py install 
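
After installing, it can help to confirm which copy of Selene Python will actually import, since an older install elsewhere on the path could shadow the custom_target_support build. A minimal sketch, assuming the package is importable as selene_sdk (that import name is an assumption here; check it against your checkout). The helper module_origin is hypothetical and not part of Selene or Orca.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# "selene_sdk" is an assumed import name; adjust if your install differs.
# The printed path should point into the cloned selene directory or the
# site-packages of the environment you installed into.
print(module_origin("selene_sdk"))
```

If this prints None, the install did not land in the active environment.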

@mzc2113391
Author

Thanks a lot! I found that in the SamplerDataLoader step, memory usage was so high that it exhausted my 128 GB of RAM and the run crashed. How much memory does Orca need? Is there a better way to manage memory? Could I reduce the batch size to lower memory usage? I would rather not, though, since a smaller batch size may not be a good choice 😭😭

@jzthree
Contributor

jzthree commented Jan 26, 2024

Other places may contribute too, but I think the main fix is to reduce num_workers in the data_loader.
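
As a rough way to reason about this suggestion: with PyTorch-style data loading, each worker is a separate process that holds its own copy of the sampler, so peak memory tends to grow roughly linearly with num_workers. The sketch below picks a worker count for a given RAM budget. The function max_workers_for_budget and all of the figures are hypothetical, not measurements from Orca; measure the per-worker footprint on your own machine first.

```python
def max_workers_for_budget(total_gb, main_process_gb, per_worker_gb, reserve_gb=8.0):
    """Estimate the largest worker count whose combined footprint fits in RAM.

    total_gb:        physical memory on the machine
    main_process_gb: memory used by the training process itself
    per_worker_gb:   memory used by one data-loading worker (measure this)
    reserve_gb:      headroom left for the OS and other processes
    """
    if per_worker_gb <= 0:
        raise ValueError("per_worker_gb must be positive")
    available = total_gb - reserve_gb - main_process_gb
    if available <= 0:
        return 0  # no room for extra workers; load data in the main process
    return int(available // per_worker_gb)


# Made-up figures: 128 GB machine, 20 GB main process, 12 GB per worker.
suggested = max_workers_for_budget(128, 20, 12)
print(suggested)  # 8 with these made-up figures
```

The result would then be passed as num_workers when constructing the data loader in the training script.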

@jimmylihui

> Thanks a lot! I found that in the SamplerDataLoader step, memory usage was so high that it exhausted my 128 GB of RAM and the run crashed. How much memory does Orca need? Is there a better way to manage memory? Could I reduce the batch size to lower memory usage? I would rather not, though, since a smaller batch size may not be a good choice 😭😭

I think you haven't run make_genome_memmap.py.
