Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace bytes_xor with much faster numpy based version #15

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

reticulatedpines
Copy link

Hi, I was using your code on some fairly large files and writing out the ELF was quite slow. Most of this was in bytes_xor(), so I found a faster implementation using numpy.

If you don't want the numpy dependency, that's perfectly understandable. Perhaps this could be refactored so it uses the fast version if numpy is already available.

I did very little testing - just checking that for the files I was working with, SHA1 was the same with old vs fast implementation of bytes_xor().

Comment on lines +5 to +9
# Old code was quite slow on large files (e.g., several seconds on a 32MB ELF).
# Most of this was xor. Large speedup from this SO answer:
#https://stackoverflow.com/questions/2119761/simple-python-challenge-fastest-bitwise-xor-on-data-buffers
#
# Although, this does add numpy as a dependency, so perhaps this should be checked at runtime?
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is perfect information for commit message, but serves no purpose inline, as whoever reads this does not see previous version.

@v3l0c1r4pt0r
Copy link
Owner

For me, adding dependency, like numpy is acceptable. This is fairly common library, so should not hurt. But before merging this, definitely dependencies should be fixed. As you can see, at the moment tests are failing because of that.

@reticulatedpines
Copy link
Author

Yes, sorry about the broken build. I didn't know what tests, if any, would run. I don't know the process you want to follow for your repo, and I'm unfamiliar with the Github interface. What's the best approach? Make changes in Github? Make changes locally and then a new PR?

@v3l0c1r4pt0r
Copy link
Owner

Don't worry about the builds. They are here to help you. You can push to this PR as many times as you like. So,please do not create another PR. To push updates to this PR simply push to your branch again. Regarding the problem, tests are just to avoid errors and bugs. I'm pushing new versions to PyPI, from time to time. To not break that, all dependencies has to be specified explicitly. Otherwise, once someone installs makeelf on fresh system, he would have it broken due to a lack of numpy. That's what definitely has to be fixed. To be honest, I didn't add any dependencies in the past, so if you still want to continue the topic, you have to do some research on your own. But, I don't believe it is a rocket science :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants