
Dev #32

Closed
wants to merge 4 commits

Conversation

jalajthanaki

I wanted to resolve issue #31.

@codecov-io

codecov-io commented Feb 17, 2019

Codecov Report

Merging #32 into master will decrease coverage by 0.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #32      +/-   ##
==========================================
- Coverage   97.95%   97.94%   -0.02%     
==========================================
  Files           5        5              
  Lines         537      534       -3     
  Branches      131      130       -1     
==========================================
- Hits          526      523       -3     
  Misses          4        4              
  Partials        7        7
Impacted Files             Coverage Δ
symspellpy/symspellpy.py   97.15% <100%> (-0.03%) ⬇️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9b9ab9c...fc6c704. Read the comment docs.

@jalajthanaki
Author

Save the delete combinations as a pickle:

import gzip
import pickle

def save_pickle(self, filename, compressed=True):
    # Save the generated delete combinations (and word counts) so they
    # can be restored later without re-running load_dictionary.
    pickle_data = {"deletes": self._deletes, "words": self._words,
                   "max_length": self._max_length}
    with (gzip.open if compressed else open)(filename, "wb") as f:
        pickle.dump(pickle_data, f)

# Load the delete combinations from a pickle. This reduces the loading time.
def load_pickle(self, filename, compressed=True):
    with (gzip.open if compressed else open)(filename, "rb") as f:
        pickle_data = pickle.load(f)
    self._deletes = pickle_data["deletes"]
    self._words = pickle_data["words"]
    self._max_length = pickle_data["max_length"]
    return True
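
A minimal usage sketch of the two methods above, assuming a SymSpell instance from symspellpy and the bundled frequency_dictionary_en_82_765.txt; the pickle file name is hypothetical:

from symspellpy.symspellpy import SymSpell

sym_spell = SymSpell(initial_capacity=16, max_dictionary_edit_distance=2,
                     prefix_length=7, count_threshold=1)
sym_spell.load_dictionary("frequency_dictionary_en_82_765.txt",
                          term_index=0, count_index=1)

# Cache the generated delete combinations (gzip-compressed by default).
sym_spell.save_pickle("symspell.pickle.gz")

# Later, or in another project that uses the same dictionary file,
# restore the cached combinations instead of rebuilding them.
sym_spell.load_pickle("symspell.pickle.gz")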

@mammothb
Owner

Just to confirm how this works (not too familiar with pickle): load_dictionary will still be used to generate the initial dictionary. The generated dictionary will be saved as a pickle for quicker loading on subsequent use, i.e., in other projects that use the same dictionary file.

So this will not solve the long run time of load_dictionary (on the first run)?

@jalajthanaki
Author

> Just to confirm how this works (not too familiar with pickle): load_dictionary will still be used to generate the initial dictionary. The generated dictionary will be saved as a pickle for quicker loading on subsequent use, i.e., in other projects that use the same dictionary file.
>
> So this will not solve the long run time of load_dictionary (on the first run)?

Yes, load_dictionary has to be there, and it will generate the initial delete combinations. I just wanted to save these initial delete combinations in pickle format so that users can reuse them afterwards.

Why do we need load_dictionary?

  • Users can customize its parameters based on their needs and dictionary size: initial_capacity=16, max_dictionary_edit_distance=2, prefix_length=7, count_threshold=1.
  • For a relatively big dictionary, saving the delete combinations as a pickle makes a lot more sense, since users can then load them comparatively quickly (see the sketch after this list).
  • If the dictionary is not updated frequently by the user, then I think this is a good option to have.
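
As a sketch of that flow (the cache path and the pickle-or-rebuild branching are illustrative, not part of this PR): load_dictionary runs only when no pickle exists yet, so only the first run pays its cost.

import os

from symspellpy.symspellpy import SymSpell

PICKLE_PATH = "symspell.pickle.gz"  # hypothetical cache location

sym_spell = SymSpell(initial_capacity=16, max_dictionary_edit_distance=2,
                     prefix_length=7, count_threshold=1)

if os.path.exists(PICKLE_PATH):
    # Fast path: the delete combinations were cached by an earlier run.
    sym_spell.load_pickle(PICKLE_PATH)
else:
    # Slow path (first run only): build the combinations, then cache them.
    sym_spell.load_dictionary("frequency_dictionary_en_82_765.txt",
                              term_index=0, count_index=1)
    sym_spell.save_pickle(PICKLE_PATH)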

I will update you once I have a solid idea of how to integrate an in-memory database with the existing flow.

Do let me know if you have any more concerns.

@mammothb
Copy link
Owner

mammothb commented Feb 18, 2019

Implemented in fc6c704.

Please submit another PR when you manage to integrate the in-memory database.

@mammothb mammothb closed this Feb 18, 2019