process_data.py fails with a MemoryError when run. I'm on Windows 10 with 8 GB of RAM — how can I solve this? #15
Comments
```python
import re
import numpy as np

def build_data_cv(datafile, cv=10, clean_string=True):
    ...

def get_W(word_vecs, k=300):
    ...

def load_bin_vec(fname, vocab):
    ...

def add_unknown_words(word_vecs, vocab, min_df=1, k=300):
    ...

def clean_str(string, TREC=False):
    string = re.sub(r"[a-zA-Z]{4,}", "", string)
    ...

def clean_str_sst(string):
    ...

def get_mairesse_features(file_name):
    ...

if __name__ == "__main__":
    ...
```
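The "encoding issue" mentioned in the next comment is not shown in the pasted snippet. As a sketch of the usual fix for this dataset's CSV files: read them with an explicit single-byte codec instead of the platform default. The function name `read_essays` and the choice of `latin-1` are my assumptions, not the repo's actual patch (`cp1252` also works in practice):

```python
import csv

def read_essays(path):
    # essays.csv in this dataset is not valid UTF-8; decoding it as
    # latin-1 (an assumption -- any single-byte codec that covers the
    # stray bytes will do) avoids a UnicodeDecodeError without
    # dropping any characters.
    with open(path, encoding="latin-1", newline="") as f:
        return list(csv.reader(f))
```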
Copy and paste that code — it optimizes the script and also resolves the encoding issue :)
Thank you for solving this problem. However, the indentation of the code you posted seems to have been lost when it was uploaded. Could you please repost the code with its indentation intact? Thanks.
Could anyone please submit a PR? |
@soujanyaporia |
Solving the error of preprocessing not working SenticNet#15
I have made some changes and the process_data.py file is now working:
```
python process_data.py ./GoogleNews-vectors-negative300.bin ./essays.csv ./mairesse.csv
loading data... data loaded!
number of status: 2467
vocab size: 30391
max sentence length: 149
loading word2vec vectors...
Traceback (most recent call last):
  File "process_data.py", line 171, in <module>
    w2v = load_bin_vec(w2v_file, vocab)
  File "process_data.py", line 104, in load_bin_vec
    word.append(ch)
MemoryError
```
I suspect this happens because the binary file being read, ./GoogleNews-vectors-negative300.bin, is too large. How can I solve this? How does everyone else run it?
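The traceback points at `load_bin_vec` accumulating the file character by character, which blows up on the 3+ GB GoogleNews file. A sketch of one way to keep peak memory small — read each entry with buffered binary I/O and seek past vectors whose words are not in the vocabulary. The function signature matches the traceback, but the skipping logic below is my suggestion, not the repo's actual fix:

```python
import numpy as np

def load_bin_vec(fname, vocab):
    """Load only the vectors for words in `vocab` from a word2vec binary file.

    Skipping unneeded vectors with seek() keeps memory usage close to the
    size of the vectors actually kept, instead of the whole 3+ GB file.
    """
    word_vecs = {}
    with open(fname, "rb") as f:
        header = f.readline()
        vocab_size, layer_size = map(int, header.split())
        vec_bytes = layer_size * 4  # each vector is layer_size float32 values
        for _ in range(vocab_size):
            # A word is the byte run up to the next space; stray newlines
            # between entries are skipped.
            word = b""
            while True:
                ch = f.read(1)
                if ch == b" ":
                    break
                if ch != b"\n":
                    word += ch
            word = word.decode("utf-8", errors="ignore")
            if word in vocab:
                word_vecs[word] = np.frombuffer(f.read(vec_bytes), dtype=np.float32)
            else:
                f.seek(vec_bytes, 1)  # skip this vector without loading it
    return word_vecs
```

Alternatively, gensim's `KeyedVectors.load_word2vec_format(fname, binary=True, limit=N)` can cap how many vectors are loaded at all.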