TF/IDF and Okapi BM25 word frequency retrieval #22
Replies: 2 comments 6 replies
-
We already had plans to add BM25 search capabilities to winkNLP, and this is indeed a good idea. We can split the work into two parts: first computing word weights using BM25, and later adding search capabilities. How would you like to see the output: as a sparse matrix of weights, or as a bag-of-words-like structure, e.g.
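To make the two output shapes concrete, here is a minimal sketch in plain JavaScript. It is not winkNLP's implementation: the function names `bm25Weights` and `toSparse`, the pre-tokenized input, and the defaults `k1 = 1.2` and `b = 0.75` are all assumptions chosen for illustration.

```javascript
// Minimal BM25 term weighting in plain JavaScript (illustrative, not winkNLP's code).
// k1 and b are the usual BM25 tuning parameters; the defaults here are assumed values.
function bm25Weights(docs, k1 = 1.2, b = 0.75) {
  const N = docs.length;
  const avgLen = docs.reduce((sum, d) => sum + d.length, 0) / N;

  // Document frequency: the number of documents each term appears in.
  const df = new Map();
  for (const doc of docs) {
    for (const term of new Set(doc)) df.set(term, (df.get(term) || 0) + 1);
  }

  // Bag-of-words style output: one { term: weight } object per document.
  return docs.map((doc) => {
    const tf = new Map();
    for (const term of doc) tf.set(term, (tf.get(term) || 0) + 1);
    const bow = {};
    for (const [term, f] of tf) {
      const n = df.get(term);
      const idf = Math.log(1 + (N - n + 0.5) / (n + 0.5));
      const norm = f + k1 * (1 - b + b * (doc.length / avgLen));
      bow[term] = (idf * f * (k1 + 1)) / norm;
    }
    return bow;
  });
}

// Sparse-matrix style output: a shared vocabulary plus (row, col, value) triplets.
function toSparse(bows) {
  const terms = [...new Set(bows.flatMap((bow) => Object.keys(bow)))].sort();
  const index = new Map(terms.map((t, i) => [t, i]));
  const entries = [];
  bows.forEach((bow, row) => {
    for (const [term, value] of Object.entries(bow)) {
      entries.push({ row, col: index.get(term), value });
    }
  });
  return { terms, entries };
}

// Two tiny pre-tokenized documents, invented for this example.
const docs = [
  ['bm25', 'ranks', 'documents'],
  ['tf', 'idf', 'weights', 'documents'],
];
const bows = bm25Weights(docs);
const sparse = toSparse(bows);
```

The bag-of-words form is easier to inspect per document, while the triplet form maps more directly onto a matrix if the weights are fed into other numeric code.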
-
I was using the new TF/IDF implementation in my React project and ran into a problem. In React, components are often re-rendered, so the learn() function gets called again, and when that happens the following error occurs: `Error: wink-nlp: learn can not be used after a call to out() API in BM25 Vectorizer`. The trace points to `AnalysePageWrapper`, at the line `const corpus = postArr;`.
It looks as if some state persists in memory after calling out(). Is there any option to reset what learn() has accumulated so that we can relearn with the new data (or any other workaround)? Any help in getting around this problem would be appreciated.
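One possible workaround, sketched under the assumption that each vectorizer instance keeps its own state, is to build a fresh instance for every corpus rather than sharing one across renders. The `createVectorizer` below is a hypothetical stand-in that only mimics the lock described in the error; it is not winkNLP's code, and `vectorizeCorpus` is an illustrative helper name.

```javascript
// Hypothetical stand-in for a vectorizer that locks learn() after out(),
// mirroring the behaviour reported in the error. It is NOT winkNLP's code.
function createVectorizer() {
  const docs = [];
  let locked = false;
  return {
    learn(tokens) {
      if (locked) throw new Error('learn can not be used after a call to out()');
      docs.push(tokens);
    },
    out() {
      locked = true;
      return docs.length; // placeholder result: number of learned documents
    },
  };
}

// Build a new instance for every corpus instead of reusing a shared one.
function vectorizeCorpus(corpus) {
  const vectorizer = createVectorizer();
  corpus.forEach((tokens) => vectorizer.learn(tokens));
  return vectorizer.out();
}

const first = vectorizeCorpus([['a', 'b'], ['c']]);
const second = vectorizeCorpus([['d']]); // a later call with new data does not throw
```

In a React component, wrapping a helper like this in `useMemo` keyed on the corpus should limit relearning to the renders where the data actually changes.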
-
It would be great to have a word frequency retrieval utility using TF/IDF and BM25 in winkJS. Even though the ecosystem already has many features, such as tokenizers and wink BM25 search, this one is really missing. Since TF/IDF and BM25 have a very small footprint and don't require any heavy models to rely on, this would be very useful in front-end implementations built with modern libraries like React.