-
@xenova has implemented features in a transformers.js PR that will make it possible to cache models in a browser extension. I'll be working from the PR to turn Semantic Finder into an extension. This will be a bit of work but I hope to make it as elegant as the MkDocs homepage that @do-me referenced.
-
Here's an update on the browser extension: https://github.com/VarunNSrivastava/transformers.js. It's a very rudimentary proof of concept. Feel free to take a look. For bugfixing / improvements:
In terms of next steps, I was considering
-
Looking at the error message, it might be that the relative import
in extension/src/package.json is causing a problem. To fix it, when in src, you can try
Let me know if these don't work, but the point is that you want to be using the local version of transformers.js, and there are lots of hacky ways to get that working.
-
Got it running by manually copying the files from the transformers dist folder to

I really like the simple UI: not intrusive but very powerful! I tried it on the Wikipedia page for Albert Einstein, but I guess the splitting function has some issues:

What are your defaults? If you want to keep it simple, I guess it's very hard to come up with settings that work well universally, considering different text lengths and the tradeoff between indexing time and chunk length / accuracy (whether you're interested in words or rather paragraphs). Maybe it would be an option to:

a) define a waiting time that is still acceptable for users, like 5-10 sec (even though this heavily depends on the user's machine...)

Anyway, it's awesome to see this working in a browser and on any page!

As a side note or thought experiment: just like in the web version, it would also be cool if a web page could already provide the indexed file to the extension so the search would be almost instant. E.g. I was thinking that this browser plugin could fuel mkdocs semantic search too and maybe even attract more developers to work on it. Mkdocs-material also has a good tutorial section about their approach to tokenization with some regex examples.
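
To make the waiting-time idea from option a) a bit more concrete, here is a rough sketch of how a chunk size could be derived from a time budget. Everything in it is hypothetical: `pickChunkSize`, `splitIntoChunks`, the candidate sizes, and the `msPerChunk` estimate are illustration only and not part of Semantic Finder or transformers.js. It also assumes the embedding time per chunk is roughly constant, which is a simplification since longer chunks take longer to embed.

```typescript
// Hypothetical helper, not part of Semantic Finder or transformers.js.
// Returns the smallest candidate chunk size (in words) whose estimated
// indexing time fits the budget, so results stay as fine-grained as possible.
function pickChunkSize(
  wordCount: number,
  msPerChunk: number, // assumption: measured once on the user's machine
  budgetMs: number, // acceptable waiting time, e.g. 5000 to 10000
  candidates: number[] = [10, 25, 50, 100, 200, 400],
): number {
  const sorted = [...candidates].sort((a, b) => a - b);
  for (const size of sorted) {
    const chunkCount = Math.ceil(wordCount / size);
    if (chunkCount * msPerChunk <= budgetMs) return size;
  }
  // Nothing fits the budget: fall back to the coarsest candidate.
  return sorted[sorted.length - 1];
}

// Splits text on whitespace into chunks of at most `wordsPerChunk` words.
function splitIntoChunks(text: string, wordsPerChunk: number): string[] {
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];
  for (let i = 0; i < words.length; i += wordsPerChunk) {
    chunks.push(words.slice(i, i + wordsPerChunk).join(" "));
  }
  return chunks;
}
```

With this approach a short page gets word-level chunks while a long article falls back to paragraph-sized chunks, at the cost of needing a per-machine timing estimate up front.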
-
For anyone reading through this, there are some comments in the PR #33. FYI: Mozilla announced an open ecosystem for browser plugins on mobile.
-
After the most recent PR #34, my biggest priority would be adding iframe support (including, and especially, PDFs). I have a hacky solution for PDFs in place right now, but it's not working consistently. After that, and after some bug testing, we could probably do an initial ship on the extensions store.
-
If anyone has any tips, tricks, or resources for getting page content, that would be great. The current solution (using Mozilla's Readability.js library) is great, but it doesn't access iframes and some other content. I've tried some hacky solutions but nothing has been very satisfactory so far.
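
One possible direction for the iframe part, as a minimal sketch only: walk same-origin iframes recursively and collect their text alongside the top-level page. The names `collectText`, `DocLike`, and `FrameLike` are hypothetical and not taken from the extension. The structural types are there so the sketch does not depend on DOM typings; a real `Document` should satisfy `DocLike`. It relies on `contentDocument` being `null` for cross-origin frames, so those (and embedded PDFs) would still need a different approach, and `textContent` is much cruder than what Readability.js extracts.

```typescript
// Minimal structural types (hypothetical) standing in for Document and
// HTMLIFrameElement, so the sketch also runs outside a browser.
interface DocLike {
  body: { textContent: string | null } | null;
  querySelectorAll(selector: string): ArrayLike<unknown>;
}

interface FrameLike {
  contentDocument: DocLike | null;
}

// Collects the text of a document plus the text of any same-origin iframes,
// descending at most `maxDepth` levels of nested frames.
function collectText(doc: DocLike, maxDepth = 3): string[] {
  const texts: string[] = [];
  const own = doc.body?.textContent?.trim();
  if (own) texts.push(own);
  if (maxDepth <= 0) return texts;

  const frames = Array.from(doc.querySelectorAll("iframe")) as FrameLike[];
  for (const frame of frames) {
    // contentDocument is null for cross-origin frames, which are skipped.
    const inner = frame.contentDocument;
    if (inner) texts.push(...collectText(inner, maxDepth - 1));
  }
  return texts;
}
```

In a content script this could be called as `collectText(document)`, with each returned string then handed to whatever extraction or chunking step the extension already uses.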
-
As partially discussed in #11, there are more concrete plans to create a browser plugin (maybe even with WebGPU support). @lizozom what are your exact plans?