makeIdentifiers: Iterating through blocks in parallel #12
This is a possible solution for #11
`__main__.py`

Now has a `-n`/`--n_jobs` parameter (the same name used in the `joblib` and `sklearn` packages). By default this is 1; setting a higher number will use that many cores, and setting `-1` will use all available cores.

Example command using two cores:

    $ cd rnlp
    $ python setup.py develop
    $ python -m rnlp -n 2 -d example_files/
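For reviewers who want to see the shape of the new flag, here is a minimal `argparse` sketch of how a `-n`/`--n_jobs` option with a default of 1 could be declared. It is not copied from the commits: the `build_parser` helper, the `--directory` long name for `-d`, and the help strings are all assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser resembling the CLI described above (illustrative only)."""
    parser = argparse.ArgumentParser(prog="rnlp")
    # "-d" appears in the example command; the long name is an assumption.
    parser.add_argument("-d", "--directory",
                        help="directory of text files to parse (assumed meaning)")
    # Default of 1 keeps the previous single-core behaviour; -1 means "all cores".
    parser.add_argument("-n", "--n_jobs", type=int, default=1,
                        help="number of cores to use; -1 uses all available")
    return parser


# Parse the same arguments as the example command, without touching sys.argv.
args = build_parser().parse_args(["-n", "2", "-d", "example_files/"])
print(args.n_jobs, args.directory)
```

Keeping the default at 1 means existing invocations without `-n` behave as before.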
`rnlp.parse.makeIdentifiers()`

`rnlp.parse.makeIdentifiers` still takes `blocks` as its only required parameter, but the optional parameter `n_jobs=1` can be overridden to use more cores.

Potential problems with this request:
`_writeBlock(block, blockID)` and `_writeSentenceInBlock(sentence, blockID, sentenceID)` are currently removed in these commits. This may not be a problem, given that rethinking how to deal with positive and negative examples is already on the to-do list.
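The `n_jobs` behaviour described for `makeIdentifiers` can be sketched roughly as follows. This is not the code in these commits: it uses the standard library's `ThreadPoolExecutor` as a stand-in so the example runs without extra dependencies, whereas real multi-core speedups for CPU-bound Python work would need a process-based backend (for example `ProcessPoolExecutor` or `joblib.Parallel`). The names `resolve_n_jobs`, `process_block`, and `make_identifiers_sketch`, and the naive sentence splitting, are hypothetical.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def resolve_n_jobs(n_jobs):
    """Translate the n_jobs convention (-1 = all cores) into a worker count."""
    if n_jobs == -1:
        return os.cpu_count() or 1
    if n_jobs < 1:
        raise ValueError("n_jobs must be -1 or a positive integer")
    return n_jobs


def process_block(block_id, block):
    """Hypothetical per-block work: split on periods and number the sentences."""
    sentences = [s.strip() for s in block.split(".") if s.strip()]
    return [(block_id, sentence_id, sentence)
            for sentence_id, sentence in enumerate(sentences, start=1)]


def make_identifiers_sketch(blocks, n_jobs=1):
    """Process each block independently; blocks is the only required argument."""
    workers = resolve_n_jobs(n_jobs)
    block_ids = range(1, len(blocks) + 1)
    if workers == 1:
        # Serial path: same behaviour as before the parallel option existed.
        return [process_block(i, b) for i, b in zip(block_ids, blocks)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order, so block IDs stay aligned.
        return list(pool.map(process_block, block_ids, blocks))


example_blocks = ["First sentence. Second sentence.", "Only sentence."]
print(make_identifiers_sketch(example_blocks, n_jobs=2))
```

Because each block is handled independently and results come back in input order, the serial and parallel paths should return the same identifiers.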