GOOGLE OCR: current situation #685
Comments
Even with the delay set to 30, I still sometimes get a 303 error.
Here is my library. Prepare a batch of images (about 40) and run the folder through this library. If everything goes well, I will add updated cookies. A 303 error literally means that Google is redirecting you somewhere, and there are two reasons
You can check with this command: lens_scan <folder> full_text_default --debug=debug
I just want to point out that there is a simple way to work around the 303 problem: run OCR on the same picture again (and once more if you get a 303 the second time). I tried doing this manually and it works.
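The manual workaround above (re-running the same image when a 303 comes back) could be automated roughly like this. It is only a sketch: `ocr_with_retry`, `Redirect303Error`, and the `recognize` callable are hypothetical names, not part of the plugin, and the attempt count and delay are assumptions.

```python
import time
from typing import Callable, Optional


class Redirect303Error(Exception):
    """Hypothetical exception: the OCR request was answered with HTTP 303."""


def ocr_with_retry(
    recognize: Callable[[str], str],
    image_path: str,
    max_attempts: int = 3,
    delay: float = 1.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call recognize(image_path), retrying when it raises Redirect303Error."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error: Optional[Redirect303Error] = None
    for attempt in range(1, max_attempts + 1):
        try:
            return recognize(image_path)
        except Redirect303Error as err:
            last_error = err
            if attempt < max_attempts:
                sleep(delay)  # pause before the next attempt
    # Every attempt ended in a 303: re-raise the last one.
    raise last_error
```

The `sleep` parameter is only there so the pause can be replaced in tests. Any error other than the hypothetical `Redirect303Error` is passed through without a retry.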
Run the test as I described above. If it goes smoothly, I'll rewrite the plugin so that it works correctly. I don't yet understand what causes this error.
Sorry, but it's too late in my country (KZ). I'll test it tomorrow.
We could have spoken Russian.
On 30 test pages from 30 comics, it produced results for all 30 pages. Whether it recognized everything or skipped some of the balloons, I can't check, because I don't have that much time. But at least every one of the 30 pages has a result in the form of text.
Off-topic:
Off-topic. Two questions.
In theory, that will also help.
Neither solution will help.
Send me the pages with anomalies on Telegram. I want to see what Google is sending there. Also send a screenshot of a page with an anomaly, on Telegram as well. @bropines
I fixed the 303 error (in theory). Once DmMaze wakes up, he will push the change. Right now I'm sweating over a fix for the "wrong" language.
The 303 problem appears to be solved; it no longer shows up in my tests. The problem with text being partially recognized as another language remains. I can give you test pages if you still need them, but I see this on almost any page of French comics (I mainly translate from French to English). It's strange that no one else has reported it. Do you still need me to send test pages on Telegram?
Yes, let's. I'll see what's there.
22.12.24
Based on my tests, it seems the paid version of Cloud Vision is significantly worse compared to what Google Lens provides. My assumption is that the paid version is trained on much more specialized data, such as documents. While it handles English fairly well, it struggles with other languages. For example:
I initially thought the issues might be caused by specific data included in the API requests. However, the problem remains unresolved because the official documentation at Google Cloud Vision API Reference has been down for several days. This has prevented me from experimenting further, although my current implementation follows Google’s official guidelines. When it comes to detecting text and text blocks (such as speech bubbles), Google Vision performs on par with CTD, and in some minor cases, even exceeds it. My plan is to integrate Google Vision as a text detection option, but only experimentally. The primary issue is the cost: both detection and recognition are charged under the same pricing model. This means the free monthly limit of 1,000 units will be exhausted quickly. Unfortunately, I’m not yet sure if the plugin can be designed to handle detection and recognition in a single request while outputting the identified text blocks. @dmMaze – is this possible, or would this require modifications to the plugin?
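On the single-request question, at least on the API side: as far as I know, a Cloud Vision `TEXT_DETECTION` or `DOCUMENT_TEXT_DETECTION` response already contains a `fullTextAnnotation` with pages, blocks, paragraphs, words, and symbols, and every block has a `boundingBox`. So one billed request might be enough for both detection and recognition. Below is a rough sketch that takes such a response, already parsed from JSON into a dict, and returns the block boxes together with their text. `extract_blocks` is a hypothetical helper, not plugin code, and the break handling is simplified (hyphen breaks are ignored).

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)

# detectedBreak types that are treated as a space or as a line end.
SPACE_BREAKS = {"SPACE", "SURE_SPACE"}
LINE_BREAKS = {"EOL_SURE_SPACE", "LINE_BREAK"}


def extract_blocks(response: Dict) -> List[Tuple[Box, str]]:
    """Return (bounding box, text) for every block in a Vision response dict."""
    results: List[Tuple[Box, str]] = []
    annotation = response.get("fullTextAnnotation", {})
    for page in annotation.get("pages", []):
        for block in page.get("blocks", []):
            vertices = block.get("boundingBox", {}).get("vertices", [])
            if not vertices:
                continue  # no geometry, nothing to use as a text block
            # Coordinates equal to 0 can be missing from the JSON.
            xs = [v.get("x", 0) for v in vertices]
            ys = [v.get("y", 0) for v in vertices]
            box = (min(xs), min(ys), max(xs), max(ys))

            parts: List[str] = []
            for paragraph in block.get("paragraphs", []):
                for word in paragraph.get("words", []):
                    for symbol in word.get("symbols", []):
                        parts.append(symbol.get("text", ""))
                        brk = (
                            symbol.get("property", {})
                            .get("detectedBreak", {})
                            .get("type")
                        )
                        if brk in SPACE_BREAKS:
                            parts.append(" ")
                        elif brk in LINE_BREAKS:
                            parts.append("\n")
            results.append((box, "".join(parts).strip()))
    return results
```

Whether the plugin itself can pass these boxes from the OCR step back to the text detection step is a separate question, and this sketch does not answer it.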
From what I’ve seen on forums, Google seems to have rolled back the recent changes, possibly due to user complaints or because they broke something. After the update, Google browsers suddenly stopped using Lens features entirely. I don’t know how long the current method will remain functional, but I’ll post updates here as soon as something changes.
I (@bropines) wrote a plugin for Google OCR.
Lately it may not work, producing “text not found” or 303 errors.
If you get the “text not found” error, try opening google.com and submitting an image in the search box. If you are redirected to this page (screenshot), then the problem is with the EU region. Use proxies from CIS countries or from any other non-EU region.
ScreenShot
If your page differs significantly from my example, then there was an update in your region and Google changed the endpoints. This has happened 3 times. I will find a solution if the update turns out to be final.
If you get a 303 error, increase the delay to 1.5. Google has recently introduced protection against plugins like ours (they all work the same way) and may ignore some requests that are sent too quickly. I also do not recommend using Google Translate during recognition. It is better to recognize the text first and then start the translation.
Temporarily Fixed
I'm writing a plugin that connects to Google's paid API. Unfortunately, in many regions it is not even possible to register for it. The plugin will be for those who have somehow obtained access and a key for Google Vision.